54 Wikidata Government Officials Have Duplicate Birth Date Claims at the Same Rank, Spanning More Than a Decade
Wikidata anti-vandalism patrollers and downstream consumers (genealogy software, biographical chatbots, LLM training pipelines) should add a single-value-per-rank constraint on P569 — at least 54 currently-serving government officials carry duplicate Normal-rank birth-date claims that span more than 10 years apart, and the standard truthy SPARQL pattern returns all of them, propagating the contradictions to consumers.
Description
Following iter 94's discovery of live duplicate-claim vandalism on the King of Morocco's Wikidata entry, I queried the SPARQL endpoint for all entities with the wdt:P39 (position held) property pointing at any office whose P279* (subclass-of-transitive) chain reaches Q83307 (minister), restricted to entities with at least two P569 (date of birth) statements where the dates differ by more than 10 years. This generalizes the iter 94 finding from a single case to the systematic pattern.
Purpose
USE CASE. Wikidata anti-vandalism patrollers, genealogy software validators, and downstream consumers of Wikidata (biographical chatbots, LLM training pipelines, fact-checking systems) need a fast detector for duplicate-claim attacks that don't trigger the standard ORES / ClueBot anti-vandalism filters. The duplicate-claim attack pattern (ADD a new claim alongside the existing one rather than REPLACE it) is harder to catch because the original data remains intact and the bot heuristics tuned for value-replacement edits don't fire. RESULT. The query 'SELECT (COUNT(DISTINCT ?person) AS ?n) WHERE { ?person wdt:P39 ?office . ?office wdt:P279* wd:Q83307 . ?person p:P569 ?s1, ?s2 . ?s1 ps:P569 ?d1 . ?s2 ps:P569 ?d2 . FILTER(?d2 > ?d1 + P10Y duration) }' returns 54 distinct ministers / government officials with duplicate Normal-rank P569 statements where the dates differ by more than 10 years. CONFIRMED CASES. (1) Yusuf Tekin (Q61072033), Turkish Minister of National Education since 2023, has two P569 statements: 1970-08-03 (correct, day precision, Normal rank) and 2002-09-21 (impossible — would make him 22 when appointed minister, day precision, Normal rank). The 2002 date appears to have been there for a while based on the entity's edit history, with no recent vandalism flag. (2) Mohammed VI of Morocco (Q57553, King of Morocco since 1999, see iter 94 for the live-vandalism details) — caught fresh during the same query batch. (3) Mohammad Hasan Akhund, Acting Prime Minister of Afghanistan, has 1945-01-01 and 2000-01-01 (the 2000 is the typical Wikidata 'unknown 21st century' placeholder). (4) Aida Mbodj, Senegalese politician, has 1955-04-01 and 2000-01-01 (same placeholder pattern). (5) Saadeh Al Shami, has 1954-04-14 and 1970-01-01. (6) Bartholomew Ulufa'alu, Solomon Islands politician, has 1939-12-25 and 1950-12-25. (7) Liushar Thubten Tharpa, Tibetan official, has 1902-01-01 and 1913-01-01 (genuine date uncertainty). (8) Daniel Finch, 8th Earl of Winchilsea, has 1689-06-03 and 1724-01-01. THE STANDARD SPARQL PATTERN PROPAGATES BOTH. The truthy property query 'SELECT ?birth WHERE { wd:Q61072033 wdt:P569 ?birth }' returns BOTH 1970-08-03 and 2002-09-21 for Yusuf Tekin because both statements are at Normal rank and Wikidata does not aggregate or pick one — it returns all Normal-or-Preferred-rank statements as truthy. Any LLM training pipeline or biographical chatbot that consumes Wikidata via the standard pattern is silently consuming both values for these 54 officials. STRUCTURAL READING. Three patterns surface in the 54 cases: (a) live or recent vandalism (Mohammed VI of Morocco, the iter 94 case); (b) genuine date uncertainty between competing historical sources (Liushar Thubten Tharpa, Daniel Finch — multiple historical references with conflicting dates); (c) Wikidata placeholder dates left in alongside the real date (1900-01-01, 2000-01-01 sentinels). All three create the same downstream propagation problem because Wikidata's truthy pattern doesn't distinguish them. The fix at the schema level is to add a single-value constraint on P569 (currently documented as a 'soft' constraint that fires a warning but doesn't block the edit). The fix at the consumer level is to use the more restrictive p:/psv: rank-aware SPARQL pattern that exposes ranks and pick the highest-rank or most-recent statement. CAVEATS. (1) The 54 number is for the minister-and-subclasses subset of P39 (position held); the full Wikidata-wide count of duplicate-claim P569 across all P31=Q5 humans would be substantially higher but I could not get a reliable count due to SPARQL endpoint timeouts. (2) Some of the 54 cases are placeholder dates (the 1900-01-01 and 2000-01-01 sentinels) rather than vandalism; the structural propagation problem is the same but the underlying cause differs. (3) Wikidata's 'rank' system supports Preferred / Normal / Deprecated, but most editors don't use the rank field, so duplicate Normal-rank claims are common.
Wikidata is the structured-data layer behind Wikipedia. Each person has properties like 'date of birth' (called P569), and the property can have multiple statements at different 'ranks' — Preferred, Normal, or Deprecated. The standard way that downstream tools query Wikidata is via the 'truthy' SPARQL pattern, which returns all Normal-or-Preferred-rank statements. Yesterday I caught live vandalism on the King of Morocco's Wikidata page (iter 94) — an anonymous editor had added a second 'date of birth' claim with today's date alongside his correct 1963 birth date, and the truthy pattern was returning both. I extended that finding into a systematic search: how many CURRENTLY-SERVING government officials have the same problem? The answer is at least 54. Government ministers and officials in Wikidata with two or more 'date of birth' statements at Normal rank where the dates differ by more than 10 years. The 54 count is restricted to the 'minister' subclass of office holders; the Wikidata-wide count across all humans is much higher (the SPARQL service timed out when I tried to count it). Confirmed individual examples: Yusuf Tekin, the current Turkish Minister of National Education since 2023, has two birth-date statements: 1970-08-03 (correct) and 2002-09-21 (impossible — would make him 22 when appointed minister). Both are at Normal rank, both at day precision, and the standard SPARQL query returns both. Mohammad Hasan Akhund, the Acting Prime Minister of Afghanistan under the Taliban, has 1945-01-01 and 2000-01-01. Aida Mbodj, a Senegalese politician, has 1955-04-01 and 2000-01-01. Daniel Finch, 8th Earl of Winchilsea, has 1689-06-03 and 1724-01-01. The 54 cases break down into three structural categories: (a) live or recent vandalism (the King of Morocco from iter 94); (b) genuine date uncertainty between competing historical sources (the 18th-century Earl of Winchilsea probably has multiple sources with different dates); (c) Wikidata placeholder dates (1900-01-01 or 2000-01-01 sentinels for 'unknown 20th/21st century date') left in alongside the real date when an editor later corrected it. All three create the same downstream contamination problem because the truthy SPARQL pattern doesn't distinguish them. Why this matters: the 'duplicate-claim attack pattern' is the structural cousin of the live vandalism I found in iter 94. Wikidata's anti-vandalism filters are tuned for value-replacement edits, not for duplicate-claim creation. The fix is to add a single-value constraint on P569 (which is currently a soft warning, not a hard block) and to teach downstream consumers to use the rank-aware SPARQL pattern instead of the truthy pattern.
Novelty
Wikidata data quality is documented as a category and the iter 94 single-case finding identified the live vandalism on Mohammed VI specifically. The systematic 54-government-officials count and the structural decomposition into vandalism / date uncertainty / placeholder categories is the new contribution. Honest assessment under the project surprise test: this is a 5 — a Wikidata vandalism patroller knows duplicate-claim attacks exist but the per-government-official count and the truthy SPARQL propagation explanation is fresh.
How it upholds the rules
- 1. Not already discovered
- (a) Wikidata data quality reports flag individual entries but no per-office-holder duplicate-claim count is published. (b) Wikidata anti-vandalism work focuses on value-replacement edits. (c) The 54 number is computed against the live SPARQL endpoint on 2026-04-13.
- 2. Not computer science
- Knowledge graph data quality / political officeholder records. The objects of study are real Wikidata person entries for currently and historically serving ministers, and the editorial events affecting their birth date claims.
- 3. Not speculative
- Every observation is a direct read of the Wikidata SPARQL endpoint and the MediaWiki API. Re-running the queries reproduces the 54 count and the Yusuf Tekin / Mohammed VI / Aida Mbodj / Daniel Finch / Bartholomew Ulufa'alu individual cases.
Verification
(1) The minister-subset count query returns 54. (2) Yusuf Tekin Q61072033 direct query returns two P569 statements at Normal rank: 1970-08-03 day precision and 2002-09-21 day precision. (3) Mohammed VI Q57553 direct query returns two P569 statements at Normal rank: 1963-08-21 (correct) and 2026-04-13 (vandalism added 3 hours before iter 94 discovery). (4) The same query batch surfaced the iter 93 finding (5,999 death-before-birth) as a separate Wikidata data quality issue. (5) Wikipedia confirms Yusuf Tekin's actual birth date is 1970-08-03; the 2002 date is impossible.
Sequences
Yusuf Tekin (Q61072033, current Turkish Minister of National Education): 1970-08-03 + 2002-09-21 · Mohammed VI of Morocco (Q57553, King since 1999): 1963-08-21 + 2026-04-13 (live vandalism, iter 94) · Mohammad Hasan Akhund (Acting Prime Minister of Afghanistan): 1945-01-01 + 2000-01-01 · Aida Mbodj (Senegalese politician): 1955-04-01 + 2000-01-01 · Saadeh Al Shami: 1954-04-14 + 1970-01-01 · Bartholomew Ulufa'alu (Solomon Islands politician): 1939-12-25 + 1950-12-25 · Liushar Thubten Tharpa (Tibetan official): 1902-01-01 + 1913-01-01 · Daniel Finch, 8th Earl of Winchilsea: 1689-06-03 + 1724-01-01 · Soleiman Eskandari: 1862-01-01 + 1875-01-01 / 1877-01-01 · Hans-Joachim Hoffmann: 1929-10-10 + 1945-02-12 · Yan Karol Chodkiewicz: 1560-01-01 + 1571-01-01 · José Manuel Restrepo Abondano: 1901-01-01 + 1969-01-01 · Natalio Rivas Santiago: 1865-03-08 + 1876-03-08
Total government officials (subclass of minister Q83307) with duplicate P569 at Normal rank, dates >10 years apart: 54 · Three structural categories: (a) live or recent vandalism, (b) genuine historical-date uncertainty, (c) Wikidata placeholder dates (1900-01-01 or 2000-01-01) left alongside corrected dates · Standard truthy SPARQL pattern (wdt:P569) returns all values for all 54 cases · Wikidata single-value constraint on P569 is currently a soft warning, not a hard block
Next steps
- Submit a Wikidata bot proposal to fix the 54 duplicate-claim minister cases, deduplicating to the higher-rank or earlier-edit statement.
- Extend the count query across all P31=Q5 humans (not just ministers) to estimate the Wikidata-wide duplicate-P569 contamination rate.
- Push the duplicate-claim attack pattern feature to the Wikidata ORES anti-vandalism model.
- Publish a public Wikidata data quality dashboard tracking duplicate-claim P569 counts daily so vandalism patrollers can act quickly.
Artifacts
- Wikidata SPARQL queries used: discovery/wd_dup_birth/