Community Wishlist Survey 2023/Wikidata/Prevent data duplication
Prevent data duplication
- Problem: Wikidata allows duplicate data to be added in several ways. This makes items harder to maintain and needlessly increases the size of Wikidata (which is bad for the SPARQL query service, storage space, etc.). The cases I consider are: duplication between labels and aliases in the same language; duplication of date statements (due to different encodings of the same date); duplication of references (two exactly identical references, or two references that are identical except for the retrieval date).
- Proposed solution: make it impossible, on the server side, to save exact duplicates; make it more difficult (e.g. by showing a message before the edit is saved) to add near duplicates (this concerns references that are identical except for the retrieval date).
- Who would benefit: Wikidata community (the maintenance of items becomes a slightly smaller task); SPARQL query service (the size of the items becomes slightly smaller)
- More comments:
- Phabricator tickets:
- T157774: Make it impossible to set the same content in the same language for label and alias
- T310981: Duplication of dates due to different encode
- T224333: It's possible to save a statement with duplicate references
- T270375: Saving identical references with different retrieval dates should be more difficult
- T44325: Prevent creation of items having the same sitelinks (duplicates)
- Proposer: Epìdosis 09:45, 29 January 2023 (UTC)
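The two-tier rule in the proposed solution (reject exact duplicates, warn on near duplicates) can be illustrated with a minimal sketch. It assumes a simplified data model in which a reference is a mapping from property ID to a tuple of values; this is not Wikibase's actual data model or API, and the function names `classify_new_reference` and `alias_duplicates_label` are hypothetical. P813 is the Wikidata property "retrieved" and P854 is "reference URL".

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a reference is modelled as {property ID: tuple of values}.
Reference = Dict[str, Tuple[str, ...]]

RETRIEVED = "P813"  # Wikidata property "retrieved"


def _key(ref: Reference, ignore: FrozenSet[str] = frozenset()) -> frozenset:
    """Build an order-independent, hashable key, skipping ignored properties."""
    return frozenset(
        (prop, tuple(sorted(values)))
        for prop, values in ref.items()
        if prop not in ignore
    )


def classify_new_reference(existing: List[Reference], new: Reference) -> str:
    """Return 'reject' for an exact duplicate, 'warn' for a reference that
    differs only in its retrieval date, and 'accept' otherwise."""
    exact = _key(new)
    if any(_key(ref) == exact for ref in existing):
        return "reject"
    loose = _key(new, frozenset({RETRIEVED}))
    if any(_key(ref, frozenset({RETRIEVED})) == loose for ref in existing):
        return "warn"
    return "accept"


def alias_duplicates_label(label: str, aliases: List[str]) -> List[str]:
    """Return the aliases that exactly repeat the label in the same language."""
    return [alias for alias in aliases if alias == label]


if __name__ == "__main__":
    existing = [{"P854": ("https://example.org/a",), "P813": ("2023-01-01",)}]
    same = {"P854": ("https://example.org/a",), "P813": ("2023-01-01",)}
    later = {"P854": ("https://example.org/a",), "P813": ("2023-02-01",)}
    print(classify_new_reference(existing, same))
    print(classify_new_reference(existing, later))
    print(alias_duplicates_label("Rome", ["Roma", "Rome"]))
```

Comparing exact string values is a deliberate simplification: the date-encoding case (T310981) would additionally need dates to be normalized before comparison, which this sketch does not attempt.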
Discussion
- There are also duplicated items on Wikidata that cannot be merged. https://www.wikidata.org/wiki/Property:P2959 C933103 (talk) 05:44, 7 February 2023 (UTC)
- Also see https://phabricator.wikimedia.org/T44325 M2k~dewiki (talk) 08:54, 7 February 2023 (UTC)
- Permanent duplicates are a slightly different problem, caused by the fact that an item cannot have two sitelinks to the same Wikimedia project; truly unpleasant, but I would leave it out of this task. The same sitelink appearing in two items is relevant, though, so I will add it above! --Epìdosis 10:41, 9 February 2023 (UTC) P.S. I would have added it, but it seems it's too late. Anyway, it is another clear case of possible data duplication. --Epìdosis 10:44, 9 February 2023 (UTC)
- Note: you can also very easily add a new Wikidata item via Wikishootme. It is a great tool, but when there are no geocoordinates in an already existing Wikidata item about the same subject, then you do not see it in Wikishootme. So you always have to check first before you use the Wikishootme tool for making a new Wikidata item; but who does that? So in that tool the automatic check should also be implemented. --JopkeB (talk) 12:36, 11 February 2023 (UTC)
Voting
- Support --M2k~dewiki (talk) 18:45, 10 February 2023 (UTC)
- Support —2dk (talk) 19:22, 10 February 2023 (UTC)
- Support Strainu (talk) 20:22, 10 February 2023 (UTC)
- Support MHM (talk) 21:59, 10 February 2023 (UTC)
- Support Geert Van Pamel (WMBE) (talk) 22:14, 10 February 2023 (UTC)
- Support NMaia (talk) 00:00, 11 February 2023 (UTC)
- Support ··· 🌸 Rachmat04 · ☕ 02:42, 11 February 2023 (UTC)
- Support * Pppery * it has begun 04:03, 11 February 2023 (UTC)
- Support EpicPupper (talk) 05:34, 11 February 2023 (UTC)
- Support I have seen many data duplication problems so good suggestion. Goliv04053 (talk) 06:43, 11 February 2023 (UTC)
- Support Gohan 07:17, 11 February 2023 (UTC)
- Support HHill (talk) 08:30, 11 February 2023 (UTC)
- Support Muted Red Tulip (talk) 09:17, 11 February 2023 (UTC)
- Support Plaga med (talk) 11:19, 11 February 2023 (UTC)
- Support JopkeB (talk) 12:29, 11 February 2023 (UTC)
- Support Nw520 (talk) 12:32, 11 February 2023 (UTC)
- Support RVA2869 (talk) 14:54, 11 February 2023 (UTC)
- Support FinixFighter (talk) 15:17, 11 February 2023 (UTC)
- Support Betseg (talk) 03:45, 12 February 2023 (UTC)
- Support HLFan (talk) 08:57, 12 February 2023 (UTC)
- Support Ameisenigel (talk) 09:15, 12 February 2023 (UTC)
- Support Bencemac (talk) 20:23, 12 February 2023 (UTC)
- Support Elena moz (talk) 10:35, 13 February 2023 (UTC)
- Support Behnam N (talk) 13:12, 13 February 2023 (UTC)
- Support Syunsyunminmin 🗨️talk 14:07, 13 February 2023 (UTC)
- Support Libcub (talk) 02:42, 14 February 2023 (UTC)
- Support --Alexmar983 (talk) 10:37, 14 February 2023 (UTC)
- Support Ottawajin (talk) 09:24, 15 February 2023 (UTC)
- Support Jaider Msg 22:11, 15 February 2023 (UTC)
- Support Aishik Rehman (talk) 08:46, 16 February 2023 (UTC)
- Support Steam Flow (talk) 22:10, 16 February 2023 (UTC)
- Support DobryBrat (talk) 10:20, 17 February 2023 (UTC)
- Support DoublePendulumAttractor (talk) 05:59, 18 February 2023 (UTC)
- Support Kpjas (talk) 07:59, 18 February 2023 (UTC)
- Support --Yining Chen (Talk) 10:05, 18 February 2023 (UTC)
- Support Wikisaar (talk) 11:36, 18 February 2023 (UTC)
- Support We should prevent the duplication problem. Thingofme (talk) 16:09, 18 February 2023 (UTC)
- Support Gdafs (talk) 16:26, 18 February 2023 (UTC)
- Support Albinfo (talk) 22:13, 18 February 2023 (UTC)
- Support Elucches (talk) 23:10, 18 February 2023 (UTC)
- Support Scewing (talk) 21:06, 19 February 2023 (UTC)
- Support Czupirek (talk) 21:36, 19 February 2023 (UTC)
- Support Niskka2 (talk) 22:00, 19 February 2023 (UTC)
- Support — Omegatron (talk) 16:25, 20 February 2023 (UTC)
- Support Watty62 (talk) 17:50, 20 February 2023 (UTC)
- Support Nashona (talk) 18:18, 20 February 2023 (UTC)
- Support T. Wirbitzki (talk) 18:46, 20 February 2023 (UTC)
- Support cyrfaw (talk) 18:49, 20 February 2023 (UTC)
- Support +s MartinPoulter (talk) 13:36, 21 February 2023 (UTC)
- Support Serieminou (talk) 23:06, 21 February 2023 (UTC)
- Support Alessandra.Moi (talk) 16:00, 23 February 2023 (UTC)
- Support Bargioni (talk) 19:10, 23 February 2023 (UTC)
- Support --Deinocheirus (talk) 14:46, 24 February 2023 (UTC)
- Support Mutawakkilite (talk) 17:45, 24 February 2023 (UTC)