About Specs: Provenance Tag

Status: Human-readable overview.
Normative definition: See the JSON Schema at
spec/schemas/provenance_tag.schema.json
($id: https://civic-interconnect.github.io/civic-transparency-spec/en/spec/schemas/provenance_tag.schema.json)

A provenance tag is added to each public post. Tags are stored server-side for aggregate queries and are not shown per-user unless a platform chooses to do so.

Fields (bucketed / coarse):

- acct_age_bucket: e.g., 0-7d, 8-30d, 1-6m, 6-24m, 24m+
- acct_type: person, org, media, public_official, unverified (optional: declared_automation)
- automation_flag: manual, scheduled, api_client, declared_bot (self-labeled)
- post_kind: original, reshare, quote, reply
- client_family: web, mobile, third_party_api (no specific app IDs)
- media_provenance: c2pa_present, hash_only, none
- origin_hint: coarse locale bucket (country/region), only where lawful
- dedup_hash: rolling/normalized hash for identical/near-identical text/media
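As a rough illustration of how coarse these categories are, the sketch below pre-checks the enumerated fields of a tag against the value sets listed in this overview. It is not a substitute for validating against the JSON Schema: the value sets are copied by hand from the list above and may lag behind the schema, the helper name `find_enum_problems` is hypothetical, and fields that are absent are simply skipped because this overview does not state which fields are required.

```python
# Non-normative sketch. Value sets are copied from the overview text;
# the JSON Schema remains the authority if the two ever disagree.
ENUM_FIELDS = {
    "acct_type": ("person", "org", "media", "public_official", "unverified"),
    "automation_flag": ("manual", "scheduled", "api_client", "declared_bot"),
    "post_kind": ("original", "reshare", "quote", "reply"),
    "client_family": ("web", "mobile", "third_party_api"),
    "media_provenance": ("c2pa_present", "hash_only", "none"),
}


def find_enum_problems(tag: dict) -> list[str]:
    """Return one message per enumerated field whose value is not listed above.

    Absent fields are ignored (assumption: requiredness is left to the schema).
    """
    problems = []
    for field, allowed in ENUM_FIELDS.items():
        if field in tag and tag[field] not in allowed:
            problems.append(f"{field}: unexpected value {tag[field]!r}")
    return problems
```

For instance, `find_enum_problems({"post_kind": "reshare", "automation_flag": "robot"})` would report only the `automation_flag` entry.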

Privacy & neutrality

- Bucketed categories only; no handles, PII, or message content.
- Used to compute aggregate transparency metrics (not to score individuals).
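To make "aggregate, not individual" concrete, here is a minimal sketch of one possible metric: the share of posts per bucket for a single field, computed over a whole batch of tags. The function name `aggregate_share` and the choice of metric are assumptions for illustration only; the spec does not prescribe this calculation.

```python
from collections import Counter


def aggregate_share(tags: list[dict], field: str) -> dict[str, float]:
    """Share of posts per bucket for one field, across the whole batch.

    Only bucket counts are used; nothing here identifies an account or post.
    Tags that lack the field are left out of the total.
    """
    counts = Counter(tag[field] for tag in tags if field in tag)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: n / total for bucket, n in counts.items()}


# Hypothetical batch of tags, reduced to the one field being summarized.
batch = [
    {"automation_flag": "manual"},
    {"automation_flag": "manual"},
    {"automation_flag": "scheduled"},
    {"automation_flag": "declared_bot"},
]
shares = aggregate_share(batch, "automation_flag")
```

Because the input is already bucketed, the output is a small distribution over categories rather than anything tied to a specific user.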

Conformance & stability

- Implementations MUST validate against the JSON Schema.
- Changes follow semver; breaking changes bump the schema’s MAJOR.
- Platforms SHOULD publish method notes describing how they populate each field.

Non-normative example

{
  "acct_age_bucket": "1-6m",
  "acct_type": "person",
  "automation_flag": "scheduled",
  "post_kind": "reshare",
  "client_family": "mobile",
  "media_provenance": "c2pa_present",
  "origin_hint": "EU",
  "dedup_hash": "rhh_6f4c…"
}