Community Wishlist Survey 2023/Wikidata/Prevent data duplication/Proposal
- Problem: Wikidata allows duplicate data to be added in several ways. This makes items harder to maintain and needlessly increases the size of Wikidata (which is bad for the SPARQL query service, storage space etc.). The cases I consider are: duplication between labels and aliases in the same language; duplication of date statements (due to different encodings of the same date); duplication of references (two exactly identical references, or two references identical except for a different retrieval date)
- Proposed solution: make it impossible, on the server side, to save exactly duplicate data; make it more difficult (e.g. by showing a message before the edit is saved) to add nearly duplicate data (this applies to references that are identical except for a different retrieval date)
- Who would benefit: Wikidata community (the maintenance of items becomes a slightly smaller task); SPARQL query service (the size of the items becomes slightly smaller)
- More comments:
- Phabricator tickets:
- T157774: Make it impossible to set the same content in the same language for label and alias
- T310981: Duplication of dates due to different encode
- T224333: It's possible to save a statement with duplicate references
- T270375: Saving identical references with different retrieval dates should be more difficult
- T44325: Prevent creation of items having the same sitelinks (duplicates)
- Proposer: Epìdosis 09:45, 29 January 2023 (UTC)
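
The reference checks described under "Proposed solution" could be sketched roughly as follows. This is only an illustration, not Wikibase code: the function names, the simplified representation of a reference as a mapping from property ID to a list of string values, and the `"reject"`/`"warn"`/`"ok"` verdicts are all assumptions made for the example. The only real identifiers are P813 ("retrieved") and P854 ("reference URL").

```python
from typing import Dict, Iterable, List, Tuple

# P813 is the Wikidata property "retrieved" (the retrieval date of a reference).
RETRIEVED = "P813"

# Hypothetical, simplified model: a reference maps property IDs to string values.
Reference = Dict[str, List[str]]


def _canonical(ref: Reference, ignore: Iterable[str] = ()) -> Tuple:
    """Order-independent form of a reference, optionally skipping some properties."""
    skipped = set(ignore)
    return tuple(
        sorted((prop, tuple(sorted(values)))
               for prop, values in ref.items()
               if prop not in skipped)
    )


def classify_reference(new_ref: Reference, existing: List[Reference]) -> str:
    """Return 'reject' for an exact duplicate, 'warn' for a near duplicate
    (identical except for the retrieval date), and 'ok' otherwise."""
    new_exact = _canonical(new_ref)
    new_loose = _canonical(new_ref, ignore=[RETRIEVED])
    verdict = "ok"
    for ref in existing:
        if _canonical(ref) == new_exact:
            return "reject"  # exact duplicate: the save would be refused
        if _canonical(ref, ignore=[RETRIEVED]) == new_loose:
            verdict = "warn"  # near duplicate: show a message before saving
    return verdict


def alias_duplicates_label(label: str, aliases: List[str]) -> List[str]:
    """Aliases that exactly repeat the label in the same language."""
    return [alias for alias in aliases if alias == label]
```

With one existing reference `{"P854": ["https://example.org/a"], "P813": ["2023-01-01"]}`, saving the same reference again would be classified as `reject`, while the same URL with a different retrieval date would be classified as `warn`.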