Community Wishlist Survey 2023/Wikidata/Popup to link to or create a new Wikidata item after creating an article
Popup to link to or create a new Wikidata item after creating an article
- Problem: How to connect newly created articles to existing objects, or to create new objects for unconnected pages (when, how, by whom, ...), for the hundreds of articles created per day in different language versions, and how to avoid duplicates among the currently 105 million objects, has been discussed again and again for years without a real solution, for example at d:Wikidata:Requests for permissions/Bot/RegularBot 2
- Proposed solution: At d:Wikidata:Contact the development team/Archive/2020/09#Connecting newly created articles to existing objects resp. creating new object - additional step when creating articles, categories, etc. a possible solution has been discussed:
An additional step after saving a newly created article etc. would present the user with a list of possibly matching Wikidata objects (e.g. a list of persons with the same name; this could use an algorithm similar to the duplicate check / suggestion list in PetScan, see for example this duplicity example), or with the option to create a new object if none matches (depending on the type of the object, some values could already be pre-filled from the article, e.g. from categories or infoboxes).
From my point of view, one current problem is that many creators of articles, categories, navigational items, templates, disambiguations, lists, commonscats, etc. are either not aware of the existence of Wikidata or forget to connect a newly created page to an existing object, or to create a new one if none exists yet. If this connection or creation is then done by a bot instead of manually, it can lead to (more) duplicates, which have to be merged manually afterwards.
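The suggestion step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm used by PetScan or duplicity: the function name `suggest_items`, the similarity threshold and the candidate list are hypothetical, and a real implementation would query Wikidata for candidates instead of using a hard-coded list.

```python
from difflib import SequenceMatcher


def suggest_items(title, candidates, threshold=0.8):
    """Return (item_id, label, score) tuples whose label resembles the title, best first.

    `candidates` is a list of (item_id, label) pairs. The threshold of 0.8 is an
    arbitrary assumption chosen for this sketch.
    """
    def normalise(text):
        # Compare case-insensitively and ignore repeated whitespace.
        return " ".join(text.lower().split())

    wanted = normalise(title)
    scored = []
    for item_id, label in candidates:
        score = SequenceMatcher(None, wanted, normalise(label)).ratio()
        if score >= threshold:
            scored.append((item_id, label, round(score, 2)))
    return sorted(scored, key=lambda row: row[2], reverse=True)


# Hypothetical candidates; the identifiers are placeholders, not real item IDs.
candidates = [
    ("item-A", "Vasco Cordeiro"),
    ("item-B", "Vasco Cordeira"),
    ("item-C", "Mount Everest"),
]
suggestions = suggest_items("Vasco Cordeiro", candidates)
for item_id, label, score in suggestions:
    print(item_id, label, score)
```

If the returned list is empty, the pop-up would fall back to offering the creation of a new object.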
This pop-up should be presented for:
- newly created articles in the article namespace
- articles that have been moved from user/draft namespace to article namespace
- articles that have been expanded from a redirect
- newly created categories
- In addition, there could be specialized bots (depending on the type of object, e.g. one bot for humans, one for films, one for buildings, etc.) that check various IDs (like GND, VIAF, LCCN, IMDb, monument IDs, ...) in order to avoid duplicates, and that create new items or connect matching items based on those IDs.
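The ID-based check mentioned in the last point could look roughly like the sketch below. All names (`match_by_identifiers`, `decide`), the data layout and the outcome labels are assumptions made for illustration. In particular, sending conflicting matches to human review and asking the author when no ID is available are design choices of this sketch, not part of the proposal.

```python
def match_by_identifiers(article_ids, items):
    """Return the sorted IDs of items sharing at least one (scheme, value) pair with the article."""
    wanted = set(article_ids.items())
    return sorted(
        item_id
        for item_id, item_ids in items.items()
        if wanted & set(item_ids.items())
    )


def decide(article_ids, items):
    """Decide what a bot could do for one unconnected article."""
    if not article_ids:
        # Without any external ID there is nothing to match on.
        return ("ask_author", None)
    matches = match_by_identifiers(article_ids, items)
    if len(matches) == 1:
        return ("connect", matches[0])
    if not matches:
        return ("create", None)
    # Several items share an ID with the article: possible duplicates.
    return ("review", matches)


# Hypothetical existing items with made-up identifier values.
items = {
    "item-A": {"VIAF": "111", "GND": "222"},
    "item-B": {"IMDb": "tt0000001"},
}
print(decide({"VIAF": "111"}, items))
print(decide({"VIAF": "999"}, items))
```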
Also, if someone uses the "translation function" to create a translated article in another language version, the new translated article could be connected automatically to the object of the original article. In addition, after a version import (following a translation), the link to the Wikidata object currently often gets lost, and the article has to be reconnected manually a second time.
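For the translation case, the automatic connection could amount to adding one sitelink to the item of the original article. The sketch below only builds the parameters for the Wikibase `wbsetsitelink` API module and sends nothing. The helper name `build_sitelink_request`, the item ID and the page title are placeholders, and how the translation tool would obtain the source item is left open.

```python
def build_sitelink_request(source_item_id, target_site, target_title):
    """Build parameters that would add the translated page as a sitelink of the source item.

    A real call would be a POST request to the Wikidata API and would also
    need a CSRF token; both are omitted in this sketch.
    """
    if not source_item_id.startswith("Q"):
        raise ValueError("expected a Wikidata item ID starting with 'Q'")
    return {
        "action": "wbsetsitelink",
        "id": source_item_id,       # item of the original article
        "linksite": target_site,    # e.g. "dewiki" for German Wikipedia
        "linktitle": target_title,  # title of the newly translated page
        "format": "json",
    }


# Placeholder values for illustration only.
params = build_sitelink_request("Q4115189", "dewiki", "Beispielartikel")
print(params)
```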
Also see:
- Community Wishlist Survey 2021/Wikidata/Creation of new objects resp. connecting to existing objects while avoiding duplicates
- Community Wishlist Survey 2022/Wikidata/Creation of new objects resp. connecting to existing objects while avoiding duplicates
- Community Wishlist Survey 2022/Wikidata/Autosuggest linking Wikidata item after creating an article
- d:Wikidata:Project_chat/Archive/2022/11#Reducing_the_backlog_of_unconnected_pages_on_a_regular_base
- de:Benutzer:M2k~dewiki/FAQ#Technische_Wünsche_/_Wunschliste_/_Wishlist
- de:Benutzer:M2k~dewiki/FAQ#Wie_finde_ich_ein_bestehendes_Wikidata-Objekt_zu_einem_Artikel?
- de:Benutzer:M2k~dewiki/FAQ#Warum_fehlen_die_von_mir_gewünschten_Informationen_im_neu_erstellten_Wikidata-Objekt?
Statistics of unconnected articles:
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=enwiki
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=itwiki
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=plwiki
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=nowiki
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=trwiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=jawiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=ukwiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=ruwiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?
- Who would benefit: Improved data quality, i.e. fewer duplicates; users are able to switch between languages and projects
- More comments:
- Phabricator tickets: T308059, T178249, T312927
- Proposer: M2k~dewiki (talk) 18:26, 23 January 2023 (UTC)
This wish now has a project page. Please consult it for development updates. –– STei (WMF) (talk) 10:50, 24 April 2024 (UTC)
Discussion
- See also another idea on linking new pages to Wikidata items: phab:T178249. — putnik 22:05, 23 January 2023 (UTC)
- Doesn't a bot already do this for unattached articles? Also if this pops up after a new article creation, it's unlikely to have any of the identifiers to avoid creating duplicate entries (speaking from experience). I generally create Wikidata entries but I always have to search first as it's unpredictable what's already in the system. czar 22:27, 29 January 2023 (UTC)
- Hello @Czar:, for a bot it is hard to decide whether two articles in different languages actually describe the same entity (the same movie, building, sport event/sport season, organisation, mountain, river, city, taxon, chemical substance, flight incident, ...). A bot might use some IDs for this (e.g. CAS-ID for chemicals, ASN-ID for flight incidents, VIAF for persons, various country-dependent monument IDs for monuments, SANDRE-ID for rivers in France, IMDb for movies, ...) to match articles in different languages (if it is able to extract these IDs from the articles in the various languages at all, e.g. if the IDs are used in templates). Even for Wikidata objects for persons it is hard to decide whether two articles describe the same person, due to different spellings of the name even within the same language (especially for people who lived centuries ago), as well as different languages/alphabets such as Russian, Japanese, Korean, Greek, ... So, if an author does research for writing an article, he/she might already have found out that there is an article in other languages, and some information and sources from these articles might have been used for the newly created article. When publishing such an article, the author only has to connect it to the existing object/articles he/she found while researching. In my opinion, if done manually, the probability of creating duplicates is far lower than if this is done by a bot, although there will always be some duplicates with both methods. From my point of view, a lot of (new) authors do not know about the existence of Wikidata and the possibility to link/connect a new article to other languages. The pop-up might be a way to inform authors automatically about this possibility. M2k~dewiki (talk) 23:38, 29 January 2023 (UTC)
- The absence of a pop-up, a bot for connecting to existing objects, or some other mechanism regularly leads to backlogs of several thousand unconnected articles:
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=enwiki
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=itwiki
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=plwiki
- https://wikidata-todo.toolforge.org/duplicity.php?mode=stats&wiki=nowiki
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=trwiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=jawiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=ukwiki&mode=stats
- https://wikidata-todo.toolforge.org/duplicity.php?wiki=ruwiki&mode=stats
- A bot can mainly create new objects for unconnected articles (independent of their existence in other languages), but can hardly decide whether an existing object in one language describes the same entity as an article in another language. Therefore, objects created by a bot might have to be merged manually afterwards more often than objects where a human decided whether the subject might already exist in other languages. M2k~dewiki (talk) 23:49, 29 January 2023 (UTC)
- How would this work for non-Wikipedia projects? Wikisource items are editions of specific works, that have their own publication data, and they usually should not be added to any existing Wikidata item because they have their own publication data (specific to that edition) that needs to be recorded in a data item for that edition. Would this pop-up recognize the difference between needing a Wikidata item for an edition (specific to Wikidata and to the scan of that edition on Commons) versus a Wikidata item for the work itself, which would link to Wikipedia articles, Commons Categories (not scans), and Wikiquote? Or will the pop-up look for closely-matching items regardless of these issues? The very fact that this proposal is about "articles" shows that it doesn't consider the issues inherent with non-Wikipedia projects. --EncycloPetey (talk) 20:16, 23 January 2023 (UTC)
- This functionality could be made available per project and per language, so the community of each project and language version could decide whether this feature should be (de)activated. --M2k~dewiki (talk) 20:42, 23 January 2023 (UTC)
- That response does not address the problem I raised. If Wikipedia has it active, but they are adding links for works to editions items, that's a problem, because Wikipedia is then adding information incorrectly to data items. Likewise, if a Wikisource has this active, but it puts links for an edition into a work data item, then it's added the information incorrectly. If English Wikisource has this active, and it causes English edition links for an English edition to be added to a data item for a French edition, then the information has been added incorrectly. Your response does not address these issues. --EncycloPetey (talk) 16:08, 25 January 2023 (UTC)
- Editions could be filtered out in the result sets presented by for example duplicity: https://wikidata-todo.toolforge.org/duplicity.php?wiki=dewiki&norand=1&page=Vasco%5FCordeiro --M2k~dewiki (talk) 16:12, 25 January 2023 (UTC)
- Then how would links be added if the desired target is to be an edition? If it is information for a specific publication, then it's an edition. 99% of data items for Wikisource projects are editions. When a published reference is to be cited, it's a specific edition. And the example link you provided makes no sense in terms of the context of this discussion. Do you understand the difference between a work and an edition? --EncycloPetey (talk) 16:20, 25 January 2023 (UTC)
- I've merged "Add interlanguage links" interface should let users input sister wiki pages into this proposal, because it looks like both can be solved with a system of presenting possible existing Wikidata items (and optionally creating a new one with the sitelink already added). These tasks are what the AutosuggestSitelink gadget is aiming to solve, so I'd suggest that we'd look at adding any missing functionality there. For example, when creating a new Wikisource edition page, often there is no item for the edition but we could prompt to find the appropriate work item and if one exists create the edition item with all links as appropriate. (Details t.b.d. of course! Hopefully I'm not handwaving away something impossible there.) SWilson (WMF) (talk) 07:17, 1 February 2023 (UTC)
- Hello @SWilson (WMF): thanks for the information about the AutosuggestSitelink gadget, which I just have activated:
- and tried to connect
- to an object ( https://www.wikidata.org/wiki/Q75243783 already existed), but got the error message:
- Something went wrong: Unable to parse the messages page meta:MediaWiki:Gadget-AutosuggestSitelink-messages/de. There may have been a recent change that contains invalid JSON.
- Also, in my opinion there should be no (error) message at all when creating a redirect.
- Are there any plans to activate the AutosuggestSitelink gadget by default for a broader range of users after a test phase, or does every single user have to activate the gadget explicitly? M2k~dewiki (talk) 19:52, 1 February 2023 (UTC)
- @M2k~dewiki Thank you for taking a look at this new AutosuggestSitelink gadget. I just posted on the proposal's talk page from the Wishlist Survey 2022. We are currently taking feedback. We'll take a look at the error you experienced; please let us know on the talk page if you find any further issues. Do you mind updating this proposal so that the problem and solution are about the improvements you'd like to see in this new AutosuggestSitelink gadget? Thank you! HMonroy (WMF) (talk) 21:10, 1 February 2023 (UTC)
Voting
- Support Exilexi (talk) 09:13, 11 February 2023 (UTC)
- Support Muted Red Tulip (talk) 09:22, 11 February 2023 (UTC)
- Support Shizhao (talk) 13:55, 11 February 2023 (UTC)
- Support Кирилл С1 (talk) 14:35, 11 February 2023 (UTC)
- Support Bluerasberry (talk) 15:14, 11 February 2023 (UTC)
- Support OwenBlacker (Talk) 15:14, 11 February 2023 (UTC)
- Support CROIX (talk) 15:18, 11 February 2023 (UTC)
- Support --NGC 54 (talk|contribs) 01:08, 12 February 2023 (UTC)
- Support Mauricio V. Genta (talk) 08:02, 12 February 2023 (UTC)
- Support Bencemac (talk) 20:27, 12 February 2023 (UTC)
- Support Husky (talk) 21:13, 12 February 2023 (UTC)
- Support Libcub (talk) 02:59, 14 February 2023 (UTC)
- Support Ottawajin (talk) 09:22, 15 February 2023 (UTC)
- Support Lupe (talk) 21:20, 15 February 2023 (UTC)
- Support Fuchs B (talk) 20:11, 17 February 2023 (UTC)
- Support ArthurPSmith (talk) 20:59, 17 February 2023 (UTC)
- Support Zblace (talk) 07:36, 18 February 2023 (UTC)
- Support This is extremely needed!!! Thingofme (talk) 16:17, 18 February 2023 (UTC)
- Support Albinfo (talk) 22:11, 18 February 2023 (UTC)
- Support Jklamo (talk) 12:17, 19 February 2023 (UTC)
- Support — Draceane talkcontrib. 12:44, 20 February 2023 (UTC)
- Support Nashona (talk) 18:13, 20 February 2023 (UTC)
- Support cyrfaw (talk) 18:43, 20 February 2023 (UTC)
- Support Serieminou (talk) 23:13, 21 February 2023 (UTC)
- Support Kalendar (talk) 06:37, 22 February 2023 (UTC)
- Support Carsrac (talk) 12:52, 22 February 2023 (UTC)
- Support Dajasj (talk) 13:54, 22 February 2023 (UTC)
- Support Rots61 (talk) 14:11, 22 February 2023 (UTC)
- Support Gusfriend (talk) 00:33, 23 February 2023 (UTC)
- Support JiriMatejicek (talk) 09:39, 24 February 2023 (UTC)
- Support Matěj Suchánek (talk) 17:05, 24 February 2023 (UTC)