Community Wishlist Survey 2021/Categories/Find similar images
Find similar images
- Problem: Many of the new images on Wikimedia Commons have insufficient metadata, but are similar to ones we already have.
- Who would benefit: Users of Wikimedia Commons
- Proposed solution: A "find similar" routine that enables categorisers on Wikimedia Commons to link a new image that lacks a good description to other images of the same object
- More comments:
- Phabricator tickets:
- Proposer: WereSpielChequers (talk) 22:12, 29 November 2020 (UTC)
Discussion
How would this work? By comparing metadata or by comparing the files? The former seems to be the issue ("insufficient metadata"), and the latter seems difficult because of the number of files and the fact that, even if the files are similar enough to be detected, they would get removed for being too similar. Opalzukor (talk) 19:36, 8 December 2020 (UTC)
Often new images lacking categories and/or detailed description are used in a specific Wikipedia article by the owner of the image. This kind of indirect metadata could be used automatically by adding the commonscat category of the article (if it exists) to the image. --HeinrichStuerzl (talk) 20:43, 8 December 2020 (UTC)
Perhaps use the same technique as in Google Images (click on the camera icon). Would that be possible? JopkeB (talk) 04:37, 9 December 2020 (UTC)
A better search would allow partial names or words rather than relying on complex AI, e.g. 'suffrag' finding Suffrage, Suffragette, Suffragist. Apologies if this already exists in Commons. Kaybeesquared (talk) 13:42, 9 December 2020 (UTC)
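The partial-word search suggested above can be illustrated with a minimal sketch. It is not how Commons search works; the function name `prefix_search` and the sample title list are hypothetical, and the sketch only assumes that category or file titles are available as a list sorted case-insensitively.

```python
from bisect import bisect_left

def prefix_search(sorted_titles, prefix):
    """Return every title that starts with `prefix`, ignoring case.

    `sorted_titles` must already be sorted by str.casefold (an assumption
    of this sketch), so a binary search can jump to the first candidate.
    """
    key = prefix.casefold()
    keys = [t.casefold() for t in sorted_titles]
    start = bisect_left(keys, key)  # first title that sorts at or after the prefix
    matches = []
    for title, folded in zip(sorted_titles[start:], keys[start:]):
        if not folded.startswith(key):
            break  # sorted order: once one title fails, all later ones fail too
        matches.append(title)
    return matches

# Hypothetical sample data for illustration only.
titles = sorted(
    ["Sugar", "Suffragist", "Suffolk", "Suffragette", "Suffrage"],
    key=str.casefold,
)
print(prefix_search(titles, "suffrag"))
```

With the sample list, the query 'suffrag' returns Suffrage, Suffragette and Suffragist, and skips Suffolk and Sugar.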
Voting
- Support Movses (talk) 19:04, 8 December 2020 (UTC)
- Support --HeinrichStuerzl (talk) 20:43, 8 December 2020 (UTC)
- Support BugWarp (talk) 01:40, 9 December 2020 (UTC)
- Support JopkeB (talk) 04:33, 9 December 2020 (UTC)
- Oppose Per Opalzukor's concerns, I'm not convinced this is feasible. AIs still have a lot of trouble identifying objects within images, and we have limited other data to work with for images whose concern is, well, limited other data. Sdkb talk 04:45, 9 December 2020 (UTC)
- While AI might not be good enough to fully automate this, I think that with a little help from AI, a human Wikipedia editor could be 10x or 100x more productive at finding similar images while maintaining high precision through human review. It might not be feasible in other use cases, but with the help of passionate Wikipedians I think it is feasible here. Xinbenlv (talk) 05:07, 9 December 2020 (UTC)
- Support Xinbenlv (talk) 04:59, 9 December 2020 (UTC)
- Support Avron (talk) 07:53, 9 December 2020 (UTC)
- Support -- Triple C 85 |talk| 10:39, 9 December 2020 (UTC)
- Support It should help to deal with duplicates, so I will support. MarioSuperstar77 (talk) 11:14, 9 December 2020 (UTC)
- Support Zoozaz1 (talk) 17:16, 9 December 2020 (UTC)
- Support Pechristener (talk) 17:30, 9 December 2020 (UTC)
- Support dwf² (talk) 22:48, 9 December 2020 (UTC)
- Support Arielllaura (talk) 20:59, 10 December 2020 (UTC)
- Support Categories from Wikipedia could also be proposed if there is no similar one in Commons. BoldLuis (talk) 11:01, 11 December 2020 (UTC)
- Support Anaxial (talk) 18:50, 11 December 2020 (UTC)
- Support --Matlin (talk) 19:23, 11 December 2020 (UTC)
- Support Facilitates the search for similar images. WikiFer msg 19:56, 11 December 2020 (UTC)
- Oppose Categorization is a human endeavour. Guesswork and artificial “intelligence” have no place here. It’s bad enough to have to correct categorization errors done by humans, thanks very much. Tuvalkin (talk) 21:23, 11 December 2020 (UTC)
- Support Fixer88 (talk) 09:13, 12 December 2020 (UTC)
- Support Conny (talk) 15:43, 12 December 2020 (UTC)
- Support Vincent Simar (talk) 22:27, 12 December 2020 (UTC)
- Support Lalviarez (talk) 22:46, 12 December 2020 (UTC)
- Support Wikibenchris (talk) 08:49, 13 December 2020 (UTC)
- Support ThomasLendt (talk) 15:22, 13 December 2020 (UTC)
- Support Useful for beginners with little experience in categories. Moleskine (talk) 18:02, 13 December 2020 (UTC)
- Support Trang Oul (talk) 11:49, 16 December 2020 (UTC)
- Oppose Existing images may be incorrectly identified, e.g. wildlife. Charlesjsharp (talk) 16:55, 16 December 2020 (UTC)
- Support Risk Engineer (talk) 15:50, 17 December 2020 (UTC)
- Support David1010 (talk) 13:01, 21 December 2020 (UTC)
- Support I would like to find similar images using image comparison software and also text extracted by OCR. Someone proposed filename search instead, but that's not a solution because typical filenames represent only one very narrow point of view, e.g. "Imperial Museum collection Number 457.jpg". Wotheina (talk) 13:45, 21 December 2020 (UTC)
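Several comments above refer to comparing the files themselves rather than their metadata. One common way to do that is a perceptual hash; the sketch below uses a simple "average hash" purely as an illustration, not as the method the proposal has in mind. It assumes each image has already been decoded, converted to grayscale and downscaled to a small grid (for example 8x8) by some other tool; the function names and the tiny 2x2 sample grids are hypothetical.

```python
def average_hash(pixels):
    """Compute an average hash from a 2D list of grayscale values.

    Each pixel contributes one bit: 1 if it is brighter than the mean
    of the grid, otherwise 0. The bits are packed into an integer.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")

# Hypothetical 2x2 grids standing in for downscaled images.
original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]   # same picture, uniformly brightened
different = [[200, 10], [30, 220]]  # light and dark areas swapped

print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(different)))
```

A small distance suggests visually similar images and a large one suggests different images. Any threshold would have to be tuned, and, as the human-review comment above argues, the result would be a list of candidates for an editor to check rather than an automatic categorisation.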