Community Wishlist Survey 2015/Bots and gadgets

From Meta, a Wikimedia project coordination wiki
This page is a translated version of the page Community Wishlist Survey 2015/Bots and gadgets and the translation is 100% complete.

Article assessment extension/gadget

Tracked in Phabricator:
Task T116092

Create an easy-to-use interface for adding WikiProject assessment templates to articles. For this to work, it would probably need some kind of JSON-format configuration page listing the available templates. It would work like the WikiLove extension (which adds awards to user pages). Kaldari (talk) 17:30, 19 May 2015 (UTC)[reply]
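As a rough illustration of the JSON configuration page idea, the sketch below parses a hypothetical configuration and builds the wikitext for one assessment banner. The JSON layout, the key names (`templates`, `classes`, `importances`) and the functions `build_banner` and `add_banner` are assumptions made for this example; they are not taken from WikiLove or from any existing gadget.

```python
import json

# Hypothetical content of the configuration page; the schema is an assumption.
CONFIG_JSON = """
{
  "templates": {
    "WikiProject Medicine": {
      "classes": ["Stub", "Start", "C", "B", "GA", "FA"],
      "importances": ["Low", "Mid", "High", "Top"]
    }
  }
}
"""


def build_banner(config, template, quality, importance):
    """Return banner wikitext after validating the ratings against the config."""
    spec = config["templates"].get(template)
    if spec is None:
        raise ValueError(f"unknown template: {template}")
    if quality not in spec["classes"]:
        raise ValueError(f"unknown class: {quality}")
    if importance not in spec["importances"]:
        raise ValueError(f"unknown importance: {importance}")
    return "{{" + template + "|class=" + quality + "|importance=" + importance + "}}"


def add_banner(talk_wikitext, banner, template):
    """Prepend the banner unless the talk page already uses the template."""
    if "{{" + template.lower() in talk_wikitext.lower():
        return talk_wikitext
    return banner + "\n" + talk_wikitext


config = json.loads(CONFIG_JSON)
banner = build_banner(config, "WikiProject Medicine", "B", "Mid")
print(banner)
```

Keeping the list of valid ratings in the configuration rather than in the code would let each wiki adapt the tool without touching the script, which seems to be the point of the proposal.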

Earlier discussion and endorsements
Gadget for en.wiki proposed here: en:Wikipedia:Gadget/proposals#AssessmentHelper. Kaldari (talk) 17:58, 7 July 2015 (UTC)[reply]
See also: Grants:IEG/Revision scoring as a service/Renewal#Scope. Helder 13:42, 9 July 2015 (UTC)[reply]
See also #Make quality/reliability of an article more clear to the reader I proposed above. --Piotrus (talk) 04:33, 12 November 2015 (UTC)[reply]
Comment: This is usually done with a bot that checks if all articles in a category contain a certain template and adds it if necessary (and in some cases the bot can prefill certain arguments of the template). That is far more efficient than using something similar to WikiLove. The Quixotic Potato (talk) 14:41, 12 November 2015 (UTC)[reply]
Endorsed Endorsed I use en:User:Kephir/gadgets/rater.js regularly, and it makes a world of difference for Assessment. I think having something like this baked in as a default gadget on WikiProjects would make a world of difference for maintaining Assessment. Sadads (talk) 15:05, 12 November 2015 (UTC)[reply]
Oppose Oppose. Wikipedia specific and well within the capability of volunteer editors. There is no need for this to be an extension, the gadgets above should be sufficient. This proposal would also be made even more unnecessary by having a global repository of gadgets. MER-C (talk) 16:22, 14 November 2015 (UTC)[reply]

Votes

  1. Oppose Oppose WMF specific and well within the capability of volunteer editors. There is no need for this to be an extension, the gadgets above should be sufficient. This proposal would also be made even more unnecessary by having a global repository of gadgets. MER-C (talk) 09:52, 30 November 2015 (UTC)[reply]
  2. Support Support anything that makes the adding of assessment templates easier is a Good Thing. Allows us to look at overviews of the development of subjects. Casliber (talk) 05:04, 1 December 2015 (UTC)[reply]
  3. Support Support--Shizhao (talk) 09:32, 1 December 2015 (UTC)[reply]
  4. Support Support although I wouldn't use the Rater script as it currently exists as it snags the load of some pages. Something more advanced without bugs, yes. Stevie is the man! TalkWork 13:52, 1 December 2015 (UTC)[reply]
  5. Support Support Sadads (talk) 15:41, 1 December 2015 (UTC)[reply]
  6. Support Support Goombiis (talk) 16:17, 1 December 2015 (UTC)[reply]
  7. Support Support Sethtalk 16:51, 1 December 2015 (UTC)[reply]
  8. Support Support – I've just discovered Rater on en.wp and am finding it very useful. Smaller projects like cy.wp miss out on scripts like this as there are fewer editors with the coding expertise, which is why I'm supporting this proposal. Ham II (sgwrs / talk) 18:45, 2 December 2015 (UTC)[reply]
  9. Neutral Neutral I believe it would be easy to implement this if each wiki would have local Wikibase repository --AS (talk) 09:23, 3 December 2015 (UTC)[reply]
  10. Support Support as Casliber said, it would make adding assessment templates easier, it would be pretty useful as a second more general point. - SantiLak (talk) 10:24, 4 December 2015 (UTC)[reply]
  11. Support Support - ƬheStrikeΣagle 16:11, 6 December 2015 (UTC)[reply]
  12. Support Support - WikiProjects aren't so active these days, but assessments are still valuable, so I would support attempts to make engaging in such activity easier/accessible — Rhododendrites talk \\ 17:16, 6 December 2015 (UTC)[reply]
  13. Oppose Oppose - while I'm not opposed to the concept, it seems that this is sufficiently covered by other efforts, so isn't the kind of thing I would want to tie up limited paid staff resources to work on. Wbm1058 (talk) 15:13, 7 December 2015 (UTC)[reply]
  14. Support Support. This could be useful, and I can see myself using it. NinjaRobotPirate (talk) 10:49, 14 December 2015 (UTC)[reply]
  15. Support Support --Rahmanuddin (talk) 14:54, 14 December 2015 (UTC)[reply]

Improve the copy and paste detection bot

At present we have a bot that analyzes "all" new edits to the English Wikipedia for copyright problems. The output is here. There is also the potential for it to work in a number of other languages.

The problem is that it is not as reliable as it should be. The way the problems are presented could also be improved. I would love to see the final output turned into an extension and formatted similarly to en:Special:NewPagesFeed.

Currently the output can be sorted by WikiProject. It would be nice to create WikiProject specific modules to go on individual project pages. Doc James (talk · contribs · email) 03:45, 4 November 2015 (UTC)[reply]

Earlier discussion and endorsements

Votes

  1. Support Support 4nn1l2 (talk) 03:00, 30 November 2015 (UTC)[reply]
  2. Support Support --Tobias1984 (talk) 11:17, 30 November 2015 (UTC)[reply]
  3. Support Support Lugnuts (talk) 12:00, 30 November 2015 (UTC)[reply]
  4. Support Support This is one of the most amazing and Wikipedia-changing automated tools to come to editor attention in some years. Having automated copyright detection should be a priority because of the time that it saves experienced editors and the credibility that it gives to Wikimedia projects. Blue Rasberry (talk) 16:34, 30 November 2015 (UTC)[reply]
  5. Support Support Great idea which is going to save a lot of time. Bharatiya29 (talk) 17:37, 30 November 2015 (UTC)[reply]
  6. Support Support This is very important. Tryptofish (talk) 18:15, 30 November 2015 (UTC)[reply]
  7. Support Support Armbrust (talk) 22:29, 30 November 2015 (UTC)[reply]
  8. Support Support --Isacdaavid (talk) 02:06, 1 December 2015 (UTC)[reply]
  9. Support Support Risker (talk) 04:21, 1 December 2015 (UTC)[reply]
  10. Support Support Casliber (talk) 05:03, 1 December 2015 (UTC)[reply]
  11. Support Support Doc James (talk · contribs · email) 09:23, 1 December 2015 (UTC)[reply]
  12. Support Support other languages--Shizhao (talk) 09:33, 1 December 2015 (UTC)[reply]
  13. Support Support, especially "WikiProject specific modules to go on individual project pages". Perhaps this could also coordinate with the bot that creates cleanup listings for WikiProjects. Stevie is the man! TalkWork 14:05, 1 December 2015 (UTC)[reply]
  14. Support Support --Arnd (talk) 14:41, 1 December 2015 (UTC)[reply]
  15. Support Support Mbch331 (talk) 14:48, 1 December 2015 (UTC)[reply]
  16. Support Support --Natkeeran (talk) 14:50, 1 December 2015 (UTC)[reply]
  17. Support Support as an extension that can be easily enabled on other wikis/projects. -- Dave Braunschweig (talk) 15:09, 1 December 2015 (UTC)[reply]
  18. Support Support it would be a great save of time! --Nastoshka (talk) 15:34, 1 December 2015 (UTC)[reply]
  19. Support Support Cavamos (talk) 15:34, 1 December 2015 (UTC)[reply]
  20. Support Support - SantiLak (talk) 10:25, 4 December 2015 (UTC)[reply]
  21. Support Support Goombiis (talk) 16:18, 1 December 2015 (UTC)[reply]
  22. Support Support --Jarekt (talk) 17:11, 1 December 2015 (UTC)[reply]
  23. Support Support This is an important issue. --Frmorrison (talk) 17:13, 1 December 2015 (UTC)[reply]
  24. Support Support --SucreRouge (talk) 17:40, 1 December 2015 (UTC)[reply]
  25. Support Support --Wesalius (talk) 18:49, 1 December 2015 (UTC)[reply]
  26. Support Support StevenJ81 (talk) 21:49, 1 December 2015 (UTC)[reply]
  27. Support Support--Jey (talk) 22:03, 1 December 2015 (UTC)[reply]
  28. Support Support, it would be extremely useful, and it is not just a Wikipedia tool, as it should cover all languages and all wikis (copyvio is also a big problem for Wikibooks, Wikinews, Wikiversity) — NickK (talk) 23:37, 1 December 2015 (UTC)[reply]
  29. Support Support Spencer (talk) 01:05, 2 December 2015 (UTC)[reply]
  30. Support Support Good idea. Beyond My Ken (talk) 02:09, 2 December 2015 (UTC)[reply]
  31. Support Support --Chaoborus (talk) 02:18, 2 December 2015 (UTC)[reply]
  32. Support Support --Rosiestep (talk) 02:34, 2 December 2015 (UTC)[reply]
  33. Support Support --Shubha (talk) 04:41, 2 December 2015 (UTC)[reply]
  34. Support Support --Jasonzhuocn (talk) 06:58, 2 December 2015 (UTC)[reply]
  35. Support Support Litlok (talk) 08:10, 2 December 2015 (UTC)[reply]
  36. Support Support Anything that might help reduce the rampant copy-pasting on India/Pakistan-related articles has to be A Good Thing. - Sitush (talk) 08:44, 2 December 2015 (UTC)[reply]
  37. Support Support Amen Sitush, amen. Bgwhite (talk) 09:40, 2 December 2015 (UTC)[reply]
  38. Support Support Especially "the potential for it to work in a number of other languages" bit. ...Aurora... (talk) 10:29, 2 December 2015 (UTC)[reply]
  39. Support Support --β16 - (talk) 11:37, 2 December 2015 (UTC)[reply]
  40. Support Support It's surprising this hasn't been done yet.  DiscantX 12:01, 2 December 2015 (UTC)[reply]
  41. Support Support Matěj Suchánek (talk) 15:26, 2 December 2015 (UTC)[reply]
  42. Support Support Fluffernutter (talk) 16:33, 2 December 2015 (UTC)[reply]
  43. Support Support Gap9551 (talk) 20:11, 2 December 2015 (UTC)[reply]
  44. Support Support Absolutely. Logical Fuzz (talk) 20:47, 2 December 2015 (UTC)[reply]
  45. Support Support - tucoxn\talk 14:02, 3 December 2015 (UTC)[reply]
  46. Support Support Tremendously important. --Dweller (talk) 15:25, 3 December 2015 (UTC)[reply]
  47. Neutral Neutral I don't understand the idea. If it's about detecting copy-paste moving, Support Support, otherwise Neutral Neutral Krett12 (talk) 16:20, 3 December 2015 (UTC)[reply]
  48. Support Support - Sarahj2107 (talk) 21:34, 3 December 2015 (UTC)[reply]
  49. Support Support Nikkimaria (talk) 00:49, 4 December 2015 (UTC)[reply]
  50. Neutral Neutral What's about quotes ? A human control after bot detection is essential ! Lionel Scheepmans Contact French native speaker, désolé pour ma dysorthographie 23:00, 4 December 2015 (UTC)[reply]
  51. Support Support --Yeza (talk) 16:28, 5 December 2015 (UTC)[reply]
  52. Support Support לסטר (talk) 18:13, 5 December 2015 (UTC)[reply]
  53. Support Support J36miles (talk) 00:33, 6 December 2015 (UTC)[reply]
  54. Support Support - ƬheStrikeΣagle 16:11, 6 December 2015 (UTC)[reply]
  55. Support Support Jim Carter (talk) 07:50, 7 December 2015 (UTC)[reply]
  56. Support Support Wbm1058 (talk) 15:19, 7 December 2015 (UTC)[reply]
  57. Support Support Mpn (talk) 18:13, 7 December 2015 (UTC)[reply]
  58. Support Support Daniel Case (talk) 19:14, 8 December 2015 (UTC)[reply]
  59. Support Support AlbinoFerret (talk) 18:24, 10 December 2015 (UTC)[reply]
  60. Support Support As more people around the world gain internet access, we'll see a lot more editors unfamiliar with Wikipedia policy, and who don't see a problem with copy/pasting large amounts of text. This is evident in India articles, and the problem will only grow. Nocowardsoulismine (talk) 16:08, 12 December 2015 (UTC)[reply]
  61. Support Support Besides improving the GUI-and-features of the EranBot, additionally I would also like to see the GUI-and-features of the tool on which EranBot depends improved, the CopyViosTool at toolserver. In particular, the en:WP:DIFF-like functionality leaves something to be desired.[1][2] See also, Editing#Improved_diff_compare_screen proposal, which is also DIFF-related technology. 75.108.94.227 16:57, 13 December 2015 (UTC)[reply]
  62. Support Support I had no contact with this bot before, but it appears really really useful.--MisterSanderson (talk) 03:00, 14 December 2015 (UTC)[reply]
  63. Support Support. This should be a priority. NinjaRobotPirate (talk) 10:51, 14 December 2015 (UTC)[reply]
  64. Support Support --Davidpar (talk) 14:22, 14 December 2015 (UTC)[reply]
  65. Support Support --Rahmanuddin (talk) 14:56, 14 December 2015 (UTC)[reply]

Machine learning tool to reduce harmful talk page interactions

Proposal

Create an artificial intelligence tool to spot cases of blatant talk page abuse on the English Wikipedia in real time, building on existing en:WP features such as tags and edit filters.

Expected benefits
  1. An edit filter could warn users that their comment probably needs rewording to be considered appropriate.
    1. Fewer abusive messages posted on talk pages.
  2. Editors could check tagged edits in recent changes:
    1. This would usefully bring third parties to pages where an editor is facing sexual harassment or other kinds of abuse.
    2. Response times would improve, and victims would be relieved of the burden of having to make a request to administrators.
  3. Prevention of the escalation of talk page conflicts.
  4. Improved talk page conduct.
  5. Fewer editors leaving the project.

This has previously been discussed at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(proposals)#Proposed:_Tag_.2F_edit_filter_for_talk_page_abuse

As User:Denny pointed out yesterday on the Wikimedia mailing list, a similar project has reportedly been run in the League of Legends online gaming community to improve the quality of social interactions, with considerable success: verbal abuse in that community reportedly fell by more than 40%.

Another finding of that project was that "87% of online toxicity comes from neutral or positive citizens who have just had a bad day." This seems highly applicable to Wikipedia: "We had to change the way people thought about online society and change their expectations of what was acceptable."

The game designers and scientists working on that project began by compiling a large dataset of interactions between community members that were judged counterproductive (harmful behaviour, harassment, abuse). They then applied machine learning to that dataset so that they could give players near-real-time feedback on the quality of their interactions. (They also worked on examining positive and collaborative behaviour.)

I would like to see the Wikimedia Foundation study whether this approach could be adapted to address similar problems in the Wikipedia community. The body of revision-deleted and oversighted talk page posts on the English Wikipedia could, for example, provide an initial dataset; like the League of Legends community, the Foundation could invite external labs and academic institutions to help analyze that dataset.

There are significant difficulties involved in building a system sophisticated enough to avoid flagging an unacceptably large number of false positives. But it is a challenge similar to programming ClueBot, and one of the League of Legends teams appears to have solved the problem: "Classifying words was easy, but how do you solve advanced linguistic problems, such as telling the difference between a humorous remark and a genuinely aggressive one? What about more positive concepts, such as phrases that supported conflict resolution? To tackle the hardest problems, we wanted to collaborate with world-class labs. We offered them the chance to work on these datasets and solve these problems with us. The scientists jumped at the chance to make a difference, and the breakthroughs followed. We began to better understand collaboration between strangers, then how language evolves over time, and the relationship between age and toxicity in discussions. Surprisingly, there is no link between age and toxicity in discussions in online societies."

A successful project of this type could then be offered to other Wikimedia projects. It would address a long-standing and much-discussed problem on the English Wikipedia, and would put the Foundation at the leading edge of internet culture. Andreas JN466 20:03, 14 November 2015 (UTC)[reply]
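To make the "compile a labelled dataset, then apply machine learning" idea concrete, here is a minimal sketch of one possible approach: a bag-of-words naive Bayes classifier written with the standard library only. The proposal does not name any algorithm, and nothing here reflects what the League of Legends team actually used; the class name, the labels `abusive` and `ok`, and the six training sentences are all invented for illustration. A real system would need far more data and would still face the false-positive problem described above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Toy two-class naive Bayes over word counts (illustrative only)."""

    def __init__(self):
        self.word_counts = {"abusive": Counter(), "ok": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def abusive_probability(self, text):
        vocab = set(self.word_counts["abusive"]) | set(self.word_counts["ok"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                # Laplace smoothing so unseen words do not zero the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["abusive"] / sum(weights.values())


# Invented training examples; a real dataset would be human-labelled.
model = NaiveBayes()
for sentence in ["you are an idiot", "stupid idiot go away", "you are a stupid troll"]:
    model.train(sentence, "abusive")
for sentence in ["thanks for the helpful edit", "I think the source is reliable",
                 "please see the talk page archive"]:
    model.train(sentence, "ok")

print(model.abusive_probability("you stupid idiot") > 0.5)
print(model.abusive_probability("thanks for the source") > 0.5)
```

A filter built on such a score would presumably only warn or tag above a chosen threshold, leaving the decision to a human reviewer.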

Earlier discussion and endorsements
  • Neither an endorsement nor a rejection at this point (I could see this being workable and helpful. I could also see it going down in flames. It would all depend on the implementation, legalities, and subsequent use of the tool), but here are some initial and somewhat rambling thoughts that might help you expand on this, Jayen466:
    • The dataset of "revdeleted and oversighted edits" contains within it a large quantity of personally identifying information about editors, article subjects, etc (as well as things like allegations of criminal wrongdoing). Even within the WMF, I don't think the majority of staff are under an NDA that covers this type of personal information (and appropriately so - that kind of stuff needs to be as compartmentalized and protected as possible), and certainly the majority of community edit filter managers have not been vetted by the community for handling this type of information. A tool running on the OS dataset would thus need some kind of...is "backstopping" the right word? It would need to be either developed and tested only by people who are under an NDA covering this type of information, or the dataset would need to be pre-sanitized of that kind of information (replacing it with placeholder text along the lines of "[home address removed]"? or something?) before being turned over to whoever will develop the tool. That's a lot of work and a lot of person-hours from NDA'd folk; would this machine learning work be valuable enough to prioritize over other tasks?
    • A tool of this type could only ever catch the lowest-hanging fruit. Like, certainly it could catch "you're an asshole" or "[username] is a shit editor", but the kind of machine learning that would catch, say, "as for my esteemed opponent, I do question whether he is entirely qualified to make such judgments, given his affinity for barnyard animals"...afaik that doesn't exist and maybe never will. That's not to say a tool that only catches the low-hanging fruit is a tool that's useless, either, though - the League of Legends data you cite about the bulk of the abuse coming from non-repeat-offenders makes me think that the bulk of the abuse was probably not of the exquisitely-phrased-to-evade-censors type, either. If a filter could catch even 25% of the abuse that flies, particularly from the people who are acting in the heat of the moment and could benefit from a tap on the shoulder and a "hey, think about that", I think I'd consider it a success.
    • You're not wrong to call the revdel + OS dataset a corpus of "unacceptable speech", but I think we need to keep something in mind about that data compared to LoL's crowdsourced voting: the vast, vast majority of OS/revdel decisions-to-hide-text were made by a single person on the basis of a) a much more general policy bullet point and b) that person's own judgment about whether that edit fit within that policy point. For things like the "big" racial slurs, that's probably a judgment that can be generalized to the community at large, but for a lot of things, it's more debatable than that, both because people's personal opinions vary and because the culture's understanding of what is and isn't grievously unacceptable changes over time (for the former, think of something like the British-American divide on the c-word; for the latter, think of how internet culture would have reacted to someone saying "that's so gay" or the gay-slur f-word ten years ago compared to today). When the eventual tool is developed to filter/flag these unacceptable-on-wikimedia utterances, whose standard will we use? A possible solution that occurs to me would be to merge LoL-style voting with machine learning and use an iterative strategy: develop machine learning tool, run on test data to generate a set of potential flags, and then have the community vote/RfC on the machine learning tool's flags' accuracy (in the sense of "yes this thing that was identified is indeed a problem we would handle as humans if we saw it"). That's another huge time investment, both for developers and community, though. Fluffernutter (talk) 16:26, 15 November 2015 (UTC)[reply]
See also Research:Revision scoring as a service and ORES. Helder 20:22, 15 November 2015 (UTC)[reply]
Endorsed Endorsed This is a good idea. While there are always challenges in anything this complex, I suspect that toxic interactions are a major contributor to burnout of editors. Reducing the potential for such interactions will have long-term benefits that greatly outweigh the short-term costs of development of this tool. Etamni (talk) 17:19, 16 November 2015 (UTC)[reply]
Support Support Also a good tool to fight vandals. Yosri (talk) 14:53, 19 November 2015 (UTC)[reply]
Endorsed Endorsed This is likely to take a long time and a lot of experimentation to get right, but the possible benefits are enough to make it well worth trying. JohnCD (talk) 21:57, 20 November 2015 (UTC)[reply]

Votes

  1. Comment: This is something I mentioned to an OS during Wikimania actually. Basically, imagine a system that automatically hides suspicious edits pending OS review. Think of it like FlaggedRevs but with a more strict set of rules. Certain OSable content is fairly trivial to spot. Emails and phone numbers, social security numbers etc. follow a certain pattern detectable with a simple regex for example. AI would learn such rules. We often see same content be reposted as well. I would propose a OSed content campaign where clustered content would be hand labelled by humans (perhaps oversight users to avoid privacy concerns). Humans would go through the unsupervised clusters and label them (this one looks like a phone number so it should be hidden, that one looks ok so it doesn't need to be hidden) and we can then use this to train AI (clustering or multi class classification). AI would develop a level of confidence of "bad content" such that if the post is above a certain threshold it would be hidden until reviewed by AI. -- とある白い猫 chi? 19:47, 30 November 2015 (UTC)[reply]
  2. Support Support Anthonyhcole (talk) 09:18, 1 December 2015 (UTC)[reply]
  3. Support Support · · · Peter (Southwood) (talk): 13:50, 1 December 2015 (UTC)[reply]
  4. Neutral Neutral I wouldn't mind seeing a pilot done on particular parts of the English Wikipedia where negative talk page interactions tend to be a big issue, but I will withhold support pending the results of said pilot. This idea sounds like "easier said than done", but I'm willing to be open-minded and see what can be done. At any rate, I would oppose any full rollout until any serious kinks are worked out. Stevie is the man! TalkWork 14:14, 1 December 2015 (UTC)[reply]
  5. Oppose Oppose ORES is a better way to do this. After all, edit filters on enwiki (the database name for en.wikipedia) have for a long time pretty much used up the limits they have.--Snaevar (talk) 16:30, 1 December 2015 (UTC)[reply]
  6. Oppose Oppose difficult in Chinese.--Temp3600 (talk) 16:38, 1 December 2015 (UTC)[reply]
  7. Neutral Neutral I did not notice this to be a big problem. --Jarekt (talk) 17:12, 1 December 2015 (UTC)[reply]
  8. Support Support on an experimental basis. We'll never know if this will work unless we try it. Eman235/talk 21:01, 1 December 2015 (UTC)[reply]
  9. Oppose Oppose - We've got too much of the Friendly Space baloney going on already without adding a highly fallible bot layer. These things don't work elsewhere and won't work here. Scunthorpe, anyone? - Sitush (talk) 08:41, 2 December 2015 (UTC)[reply]
  10. Oppose Oppose any project for English Wikipedia only. A Community Tech project must be a global one, but making a reasonable machine learning tool for detecting problematic talk page comments is a huge challenge even for one language (as our words may be offensive or not depending on the context), and would take an unreasonable number of resources for all wikis — NickK (talk) 10:03, 2 December 2015 (UTC)[reply]
  11. Oppose Oppose Seems that it would be too difficult to do effectively and with enough precision for it to be worth the man hours.  DiscantX 12:03, 2 December 2015 (UTC)[reply]
  12. Support Support Keep its purpose to a manageable task (fending off gross abuse) and this bot could make WP a much more attractive place. Currently, Talkpages are only lightly guarded, and a surprising amount of unhelpful remarks (i.e. obvious vandalism in the form of profane non sequiturs and other gratuitous scribbling) is kept forever, much to the detriment of WP's public image. SteveStrummer (talk) 05:16, 3 December 2015 (UTC)[reply]
  13. Oppose Oppose As even humans misunderstand each other all the time, I have no confidence a machine will cope with this task. --Dweller (talk) 15:26, 3 December 2015 (UTC)[reply]
  14. Oppose Oppose per Dweller. ƬheStrikeΣagle 16:11, 6 December 2015 (UTC)[reply]
  15. Neutral Neutral Searching for strings that could be phone numbers, social security numbers, and other sensitive information sounds like a great project, but this seems too nebulous/open-ended, with too many areas that can go wrong beyond those numeric strings which seems like a much more doable project that enwp could do on its own, since it's less applicable elsewhere. — Rhododendrites talk \\ 17:13, 6 December 2015 (UTC)[reply]
  16. Neutral Neutral I like the idea, but I'm not sure we have sufficient actual-human resources to positively analyze and act upon the talk-abuse occurrences such a bot would detect, nor do I have confidence in the Foundation staff's technical abilities to develop such a bot. Unless maybe such tools developed elsewhere can successfully be leveraged by the Foundation? Wbm1058 (talk) 15:29, 7 December 2015 (UTC)[reply]
  17. Oppose Oppose Mpn (talk) 18:14, 7 December 2015 (UTC)[reply]
  18. Oppose Oppose While I endorse any effort to improve civility on Wikipedia, I'm not comfortable with using bots to accomplish this. However, if a pilot project such as described by Stevietheman were performed first & that showed promise, I'll reconsider my opposition. -- Llywrch (talk) 18:59, 11 December 2015 (UTC)[reply]
  19. Support Support --Tgr (talk) 22:29, 13 December 2015 (UTC)[reply]
  20. Support Support. This is worth exploring in a limited trial. I agree that a full rollout is potentially a bad idea. NinjaRobotPirate (talk) 11:05, 14 December 2015 (UTC)[reply]
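Votes 1 and 15 above both note that some sensitive strings (email addresses, phone numbers, social security numbers) follow patterns a simple regex can detect. The sketch below shows what that narrower task might look like. The three patterns are assumptions written for this example (US-style formats only) and the function name `find_sensitive` is hypothetical. Patterns like these will produce false positives and misses, so the output is best treated as a list of candidates for human review rather than as a decision.

```python
import re

# Illustrative patterns only; real-world formats vary far more than this.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(
        r"(?<!\d)(?:\+?\d{1,3}[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    ),
}


def find_sensitive(text):
    """Return (kind, matched_text) pairs for every pattern hit in the text."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group(0)))
    return hits


sample = "Call 555-123-4567 or mail jane@example.org"
for kind, value in find_sensitive(sample):
    print(kind, value)
```

Hits could feed a tag or a review queue of the kind discussed above; whether anything is hidden automatically is a policy question the code does not answer.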

Migrate dead links to the Wayback Machine

See also: mw:Archived Pages.

Most external links have an average life expectancy of 7 years before they disappear. As Wikipedia ages, the problem of dead external links is growing exponentially. There is a partnership between the Internet Archive and Wikipedia which ensures that all new external links are cached by the Wayback Machine. However, no formal process has been put in place for adding Wayback Machine links to Wikipedia (for example via the cite web template's |archiveurl=.. parameter). There have been several attempts to automate this process with various bots (see en:WP:Link rot), but the programming is non-trivial and, despite the efforts of multiple volunteers, the situation is at a standstill. What is probably needed is a team of developers working full time, which is beyond the reach of a few volunteers giving their spare time. This is the kind of development work that MediaWiki could support, and it would make a big difference to content quality, with a visible effect on all articles. -- Green Cardamom (talk) 19:27, 7 November 2015 (UTC)[reply]

Earlier discussion and endorsements
Endorsed Endorsed Much needed and certainly very important for smaller Wikipedias as well. Jopparn (talk) 09:12, 8 November 2015 (UTC)[reply]
Endorsed Endorsed We have made significant progress with en:User:Cyberbot II adding links to archiveurls, but there needs to be a good technical way to store. Talked with @Jdforrester (WMF): about building it into citoid at WikiCon USA. Internet archive was there, and expressed an interest in pushing their API's to the limit, to fix the 404 and other errors on Wikipedia, Sadads (talk) 10:11, 8 November 2015 (UTC)[reply]
Endorsed Endorsed Agree that this is very much needed. ONUnicorn (talk) 13:47, 8 November 2015 (UTC)[reply]
Endorsed Endorsed Much globally needed, volunteer efforts shouldn't be the only way for an important feature like this. --AlessioMela (talk) 21:06, 8 November 2015 (UTC)[reply]
Endorsed Endorsed and also migrate to WebCite--Shizhao (talk) 01:57, 10 November 2015 (UTC)[reply]
Endorsed Endorsed Useful. --Piotrus (talk) 04:58, 10 November 2015 (UTC)[reply]
Endorsed Endorsed Useful. Aphaia (talk) 05:05, 10 November 2015 (UTC)[reply]
Endorsed Endorsed Very good idea, it could be very useful! Restu20 07:33, 10 November 2015 (UTC)[reply]
Endorsed Endorsed Esquilo (talk) 08:07, 10 November 2015 (UTC)[reply]
Endorsed Endorsed For all Wikipedias including RTL wikis. 4nn1l2 (talk) 09:09, 10 November 2015 (UTC)[reply]
Endorsed Endorsed Danmichaelo (talk) 21:26, 10 November 2015 (UTC)[reply]
Endorsed Endorsed Kropotkine 113 (talk) 21:32, 10 November 2015 (UTC)[reply]
Endorsed Endorsed I do this all the time by hand, it would be great if there was an automated procedure to take care of it. --Waldir (talk) 13:40, 14 November 2015 (UTC)[reply]
Endorsed Endorsed Useful. Afernand74 (talk) 17:28, 14 November 2015 (UTC)[reply]
Endorsed Endorsed particularly with newly-added links. If there is an API-equivalent to the "save page now" button at https://archive.org/web/ , it should be useful for this task. Davidwr/talk 05:41, 16 November 2015 (UTC)[reply]
Endorsed Endorsed Libcub (talk) 07:00, 20 November 2015 (UTC)[reply]
As a programmer I'll note that this shouldn't be that difficult since the release of MW 1.22. The key point is to use el_id from the database to make incremental dumps of all external links. Feed those to archive.org to archive, have a bot throw in those links. In reality the hardest part is figuring out which snapshot should be used. Otherwise the rest is easy to do via bot. https://tools.wmflabs.org/betacommand-dev/cgi-bin/sandbox?page=Redeemer_Presbyterian_Church_(New_York_City) is an example of what my tools have been doing for years. Δ (talk) 19:32, 20 November 2015 (UTC)[reply]
Endorsed Endorsed I did by hand, it is a terrible waste of time...--Alexmar983 (talk) 22:54, 21 November 2015 (UTC)[reply]
Comment: I was pointed here as someone already involved in this project. I am indeed already working on a bot, approved on the English Wikipedia, that does just that. It's pretty far along in development, and I am working with the WMF and IA to coordinate on getting a fully functional bot. Our aim is to get this running on the top 30 Wikipedias. —cyberpower ChatHello! 05:12, 10 December 2015 (UTC)[reply]
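Δ's comment above says the hardest part is deciding which snapshot to use. One possible heuristic, shown here purely as an assumption and not as what any of the bots mentioned actually do, is to pick the snapshot closest to the date the citation was accessed. The sketch also builds a query URL for the Wayback Machine availability endpoint without sending any request, and formats the result as cite web parameters. The function names are hypothetical, and `|archivedate=` is used alongside the `|archiveurl=` parameter named in the proposal.

```python
from datetime import datetime
from urllib.parse import urlencode


def availability_query(url, access_date):
    """Build a Wayback Machine availability query URL (no request is sent)."""
    params = {"url": url, "timestamp": access_date.strftime("%Y%m%d")}
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(timestamps, access_date):
    """Pick the 14-digit snapshot timestamp nearest to the access date."""
    if not timestamps:
        return None

    def distance(ts):
        return abs(datetime.strptime(ts, "%Y%m%d%H%M%S") - access_date)

    return min(timestamps, key=distance)


def archive_params(url, timestamp):
    """Format cite web archive parameters for the chosen snapshot."""
    archive_url = f"https://web.archive.org/web/{timestamp}/{url}"
    archive_date = datetime.strptime(timestamp, "%Y%m%d%H%M%S").strftime("%Y-%m-%d")
    return f"|archiveurl={archive_url} |archivedate={archive_date}"


# Invented example data: three snapshots and a citation accessed in March 2012.
link = "http://example.com/page"
accessed = datetime(2012, 3, 1)
snapshots = ["20100101000000", "20120310120000", "20150101000000"]

chosen = closest_snapshot(snapshots, accessed)
print(availability_query(link, accessed))
print(archive_params(link, chosen))
```

Choosing by access date tries to recover the page as the citing editor saw it, but a human check is still advisable because the nearest snapshot may itself be an error page.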

Votes

  1. Support Support 4nn1l2 (talk) 03:01, 30 November 2015 (UTC)[reply]
  2. Support Support Jenks24 (talk) 10:16, 30 November 2015 (UTC)[reply]
  3. Support Support Lugnuts (talk) 12:01, 30 November 2015 (UTC)[reply]
  4. Support Support Debresser (talk) 12:55, 30 November 2015 (UTC)[reply]
  5. Support Support Wildthing61476 (talk) 13:22, 30 November 2015 (UTC)[reply]
  6. Support Support MrX (talk) 15:08, 30 November 2015 (UTC)[reply]
  7. Support Support TeriEmbrey (talk) 15:55, 30 November 2015 (UTC)[reply]
  8. Support Support Internet Archive should be Wikimedia's best friend! Blue Rasberry (talk) 16:35, 30 November 2015 (UTC)[reply]
  9. Support Support Daniel Case (talk) 17:17, 30 November 2015 (UTC)[reply]
  10. Support Support Bharatiya29 (talk) 17:37, 30 November 2015 (UTC)[reply]
  11. Support Support BethNaught (talk) 17:43, 30 November 2015 (UTC)[reply]
  12. Support Support PresN (talk) 17:48, 30 November 2015 (UTC)[reply]
  13. Oppose Oppose Bots notifying/helping with a migration to archive.org links where appropriate are already available. A fully automated process, which is imho likely to cause automated errors as well, is imho a rather bad idea. Money for professional development is better spent elsewhere on many other pressing issues. For the link management a half-automated approach is the way to go imho.--Kmhkmh (talk) 19:38, 30 November 2015 (UTC)[reply]
  14. Oppose Oppose Isn't it ironic that a community that uses the most flexible content management system you could possibly think of longs for becoming a museum of sorts, offering its readers the state that used to be seven or eight years ago? Wikipedia was successful against its competitors because it provided up to date content in any way you want. Now, it's becoming a museum. At best, an interface to the internet archive's wayback machine. But that's already available next door. Go and get moving, look for what's up now, instead. The future does not lie in the world we had yesterday.--Aschmidt (talk) 20:10, 30 November 2015 (UTC)[reply]
    Nothing stops editors from adding fresher references. Stevie is the man! TalkWork 14:38, 1 December 2015 (UTC)[reply]
  15. Support Support --YodinT 02:00, 1 December 2015 (UTC)[reply]
  16. Support Support --Isacdaavid (talk) 02:06, 1 December 2015 (UTC)[reply]
  17. Support Support. This is a basic function for keeping the links usable. DGG (talk) 02:07, 1 December 2015 (UTC)[reply]
  18. Support Support - Not all links last forever and plus I've already been converting dead links to Archived versions and so It'd be better if a bot could do it. –Davey2010Talk 02:43, 1 December 2015 (UTC)[reply]
  19. Support Support some good stuff gets archived and disappears every day. This is a good idea. Casliber (talk) 05:05, 1 December 2015 (UTC)[reply]
  20. Support Support--Kippelboy (talk) 05:30, 1 December 2015 (UTC)[reply]
  21. Support Support--Gbeckmann (talk) 09:14, 1 December 2015 (UTC)[reply]
  22. Neutral Neutral-- Has been done. Is a great idea. Should be running in the next few weeks from what I understand. Doc James (talk · contribs · email) 09:24, 1 December 2015 (UTC)[reply]
  23. Support Support--Shizhao (talk) 09:34, 1 December 2015 (UTC)[reply]
  24. Support Support--Purodha Blissenbach (talk) 10:19, 1 December 2015 (UTC)[reply]
  25. Support Support · · · Peter (Southwood) (talk): 13:52, 1 December 2015 (UTC)[reply]
  26. Support Support. When sources are cited and then vanish, credibility suffers. LLarson (talk) 14:00, 1 December 2015 (UTC)[reply]
  27. Support Support as long as this is well-tested enough to not cause new headaches for editors. Also, I don't want to see perfectly good original links replaced with archive links from the reader's point of view -- this replacement should only occur if the original link is dead. Another concern is that sometimes webpages migrate without proper redirects instead of truly going dead -- might we also have a tool that hunts around for where the webpage moved to, so we can maintain a fresh original link? Stevie is the man! TalkWork 14:35, 1 December 2015 (UTC)[reply]
  28. Support Support --Arnd (talk) 14:42, 1 December 2015 (UTC)[reply]
  29. Support Support --Natkeeran (talk) 14:50, 1 December 2015 (UTC)[reply]
  30. Neutral Neutral It's a little hard to support this as long as IA maintains their policy of retroactively applying robots.txt. In the worst case, some domain is archived for years and the archive links are used, then the registration expires, the domain is scooped up by a squatter or some other company, they put up a robots.txt denying access, and boom, the existing archives for the former site under that domain are inaccessible. Or a site is bought out by some other company, and the new company redirects every URL from the old site to their existing homepage and throws up a robots.txt denying access to everything on the old site so as to prevent Google potentially penalizing it as a SEO trick, and boom, the existing archives for the old site are inaccessible. Or a site just reorganizes everything and puts up a robots.txt blocking access to the old URLs to get them out of Google searches, and boom, the archives for the old pages are inaccessible. Anomie (talk) 14:53, 1 December 2015 (UTC)[reply]
    The situations you describe happen rather rarely. More commonly seen are: 1) rare cases where robots.txt is changed for "censorship" purposes (this usually means the matter is controversial and there are probably secondary sources); 2) widespread cases of domain parkers which buy hundreds of thousands of domains and block everything in their robots.txt (these are usually smaller websites, but not always; it would be interesting to see how many are used as sources on Wikimedia wikis). Nemo 07:17, 7 December 2015 (UTC)[reply]
  31. Support Support--KRLS (talk) 15:12, 1 December 2015 (UTC)[reply]
  32. Support Support --Andyrom75 (talk) 15:12, 1 December 2015 (UTC)[reply]
  33. Support Support For me this is a very important issue. People should always be able to check the source and to learn more about the topics mentioned. DanGong (talk) 15:14, 1 December 2015 (UTC)[reply]
  34. Support Support Wittylama (talk) 15:15, 1 December 2015 (UTC) While the Internet Archive is an excellent 'catch all' source, I'd also like to see this solution be able to address the fact that many national libraries already perform web-archiving of their national domain. For example, the Pandora service of the National Library of Australia has a far more professional and consistent archive of Australian content than IA does, but it is done on a 'permission' basis due to local law. It would be good if any tool built to solve this request could be made to search other notable web-archives too. Wittylama (talk) 15:15, 1 December 2015 (UTC)[reply]
    Like Internet Archive, Pandora is member of the IIPC (International Internet Preservation Consortium), so they're supposedly already working together. Perhaps Wikimedia Foundation should join the consortium? If I understand correctly, your point is that we should look for more sources: that should be rather easy as long as they are IIPC members and use compatible tooling. In the long term, if WMF joined the IIPC, then it could push for some sort of federated openwayback (or just send patches upstream?).
    For now, however, the main goal is probably to somehow store the archived URLs MediaWiki-side so that some sort of parsing can happen to fix links without running bots on hundreds of wikis. Otherwise, bot owners are in a better position to deal with the issue. Nemo 07:34, 7 December 2015 (UTC)[reply]
  35. Support Support--Bramfab (talk) 15:26, 1 December 2015 (UTC)[reply]
  36. Support Support -- but work is already happening with The Wikipedia Library, Internet Archive, and the Citoid team, to support work on en:w:User:Cyberbot II implementation of archiveurls. If implemented, we need to build on existing work and conversations with these teams. Sadads (talk) 15:43, 1 December 2015 (UTC)[reply]
  37. Support Support Goombiis (talk) 16:19, 1 December 2015 (UTC)[reply]
  38. Support Support JohanahoJ (talk) 16:42, 1 December 2015 (UTC)[reply]
  39. Support Support This is a serious issue which definitely needs to be worked on, and is, in the end, going to need a much more permanent and inventive fix. I don't know if a full-on editing team is required -- maybe just an improved bot, or a section on some kind of common page (like the Wikipedia Community Portal) listing pages tagged with an "improve dead links" template (which would need to be created, I believe). I also definitely agree with user Sadads above, that any work done should build off of editors' current efforts, rather than starting completely from scratch. -- 2ReinreB2 (talk) 17:11, 1 December 2015 (UTC)[reply]
  40. Support Support This is a growing issue that needs to be worked on. --Frmorrison (talk) 17:12, 1 December 2015 (UTC)[reply]
  41. Support Support --Jarekt (talk) 17:14, 1 December 2015 (UTC)[reply]
  42. Support Support I think that automatically migrating dead links to the Wayback Machine will improve external links in Wikipedia. --Urbanecm (talk) 17:34, 1 December 2015 (UTC)[reply]
  43. Support Support --SucreRouge (talk) 17:38, 1 December 2015 (UTC)[reply]
  44. Support Support --Coentor (talk) 18:18, 1 December 2015 (UTC)[reply]
  45. Support Support --Wesalius (talk) 18:50, 1 December 2015 (UTC)[reply]
  46. Support Support --Usien6 (talk) 18:56, 1 December 2015 (UTC)[reply]
  47. Support Support --Hkoala (talk) 20:23, 1 December 2015 (UTC)[reply]
  48. Support Support --Akela (talk) 20:56, 1 December 2015 (UTC)[reply]
  49. Support Support Something really, really needs to be done about the linkrot problem. Eman235/talk 21:03, 1 December 2015 (UTC)[reply]
  50. Support Support It is very important for the verifiability now and in the future. Regards, Kertraon (talk) 21:33, 1 December 2015 (UTC)[reply]
  51. Support Support StevenJ81 (talk) 21:51, 1 December 2015 (UTC)[reply]
  52. Support Support Emptywords (talk) 00:01, 2 December 2015 (UTC) I was thinking about that for a long time.[reply]
  53. Support Support Hondo77 (talk)
  54. Comment Comment Often, there is a better ("live") replacement for a dead link than the Wayback Machine's archived version. An automated process could discourage people from actively looking for such a replacement. Using an archived version should always be seen as a last-resort option; I'm not convinced that a blindly acting bot is what is needed here. Gestumblindi (talk) 01:11, 2 December 2015 (UTC)[reply]
  55. Support Support --Rosiestep (talk) 02:36, 2 December 2015 (UTC)[reply]
  56. Support Support but not without hesitation. Basically I agree with the arguments used by Gestumblindi (discouraging people from actively looking for "live" replacements). On the other hand, however, an old link is better than none. Pawel Niemczuk (talk) 02:48, 2 December 2015 (UTC)[reply]
  57. Support Support RoodyAlien (talk) 02:51, 2 December 2015 (UTC)[reply]
  58. Support Support Syced (talk) 03:52, 2 December 2015 (UTC)[reply]
  59. Support Support - Shubha (talk) 04:44, 2 December 2015 (UTC)[reply]
  60. Support Support - WillemienH (talk) 05:15, 2 December 2015 (UTC)[reply]
  61. Support Support --Moroboshi (talk) 06:57, 2 December 2015 (UTC)[reply]
  62. Support Support --Jasonzhuocn (talk) 07:00, 2 December 2015 (UTC)[reply]
  63. Support Support Litlok (talk) 08:10, 2 December 2015 (UTC)[reply]
  64. Support Support - Sitush (talk) 08:38, 2 December 2015 (UTC)[reply]
  65. Support Support, or use any other archive solution if Internet Archive is inappropriate, such as Wikiwix used by French Wikipedia — NickK (talk) 10:04, 2 December 2015 (UTC)[reply]
  66. Support Support Graham87 (talk) 10:20, 2 December 2015 (UTC)[reply]
  67. Support Support --β16 - (talk) 11:39, 2 December 2015 (UTC)[reply]
  68. Support Support--Barcelona (talk) 11:49, 2 December 2015 (UTC)[reply]
  69. Support Support Addition to multiple archives would be preferable though.  DiscantX 12:09, 2 December 2015 (UTC)[reply]
  70. Support Support Bazj (talk) 12:13, 2 December 2015 (UTC)[reply]
  71. Support Support--Manlleus (talk) 15:01, 2 December 2015 (UTC)[reply]
  72. Support Support --Nux (talk) 19:42, 2 December 2015 (UTC)[reply]
  73. Support Support WeeJeeVee (talk) 20:56, 2 December 2015 (UTC)[reply]
  74. Support Support As the encyclopedia ages we will have more and more problems with dead links - this proposal sounds admirable. PamD (talk) 21:28, 2 December 2015 (UTC)[reply]
  75. Support Support Thémistocle (talk) 21:55, 2 December 2015 (UTC)[reply]
  76. Support Support Gap9551 (talk) 00:32, 3 December 2015 (UTC)[reply]
  77. Support Support – This solution is better than nothing. (And I agree with Stevie is the man's point – I myself update links from Wayback Machine links relatively often...) IJBall (talk) 03:56, 3 December 2015 (UTC)[reply]
  78. Support Support: Too many dead links. There are even no such bots running on Chinese Wikipedia now.- Earth Saver(talk)Peace, strive, save the Earth! at 05:54, 3 December 2015 (UTC)[reply]
  79. Support Support Pbm (talk) 12:13, 3 December 2015 (UTC)[reply]
  80. Support Support: It's necessary.--Bowleerin (talk) 13:20, 3 December 2015 (UTC)[reply]
  81. Support Support - tucoxn\talk 14:02, 3 December 2015 (UTC)[reply]
  82. Support Support Yes, yes, yes, yes please. --Dweller (talk) 15:27, 3 December 2015 (UTC)[reply]
  83. Support Support Theredmonkey (talk) 19:35, 3 December 2015 (UTC)[reply]
  84. Support Support - Sarahj2107 (talk) 21:35, 3 December 2015 (UTC)[reply]
  85. Support Support Nikkimaria (talk) 00:49, 4 December 2015 (UTC)[reply]
  86. Support Support - SantiLak (talk) 10:30, 4 December 2015 (UTC)[reply]
  87. Support Support --Jane023 (talk) 16:19, 4 December 2015 (UTC)[reply]
  88. Support Support - Wieralee (talk) 17:07, 4 December 2015 (UTC)[reply]
  89. Support Support --The Polish (talk) 17:33, 4 December 2015 (UTC)[reply]
  90. Support Support Bináris tell me 18:22, 4 December 2015 (UTC)[reply]
  91. Support Support Lionel Scheepmans Contact French native speaker, désolé pour ma dysorthographie 22:59, 4 December 2015 (UTC)[reply]
  92. Support Support - Shiftchange (talk) 03:24, 5 December 2015 (UTC)[reply]
  93. Support Support --Yeza (talk) 16:29, 5 December 2015 (UTC)[reply]
  94. Support Support J36miles (talk) 00:36, 6 December 2015 (UTC)[reply]
  95. Support Support -- Gts-tg (talk) 01:54, 6 December 2015 (UTC)[reply]
  96. Support Support -- Sir Gawain (talk) 14:18, 6 December 2015 (UTC)[reply]
  97. Support Support - ƬheStrikeΣagle 16:11, 6 December 2015 (UTC)[reply]
  98. Support Support - valuable for article sourcing, but also takes steps against a common form of spam (looking for deadlinks, copying content from an archive to a personal ad-filled website, and linking to it) — Rhododendrites talk \\ 17:15, 6 December 2015 (UTC)[reply]
  99. Support Support --Waldir (talk) 12:55, 7 December 2015 (UTC)[reply]
  100. Support Support --100 Wbm1058 (talk) 15:44, 7 December 2015 (UTC)[reply]
  101. Support Support --Bender235 (talk) 01:23, 8 December 2015 (UTC)[reply]
  102. Support Support Anyone working on an old half-finished article has had the experience of spending entirely too much time chasing down references that have rotted. Courcelles 08:15, 8 December 2015 (UTC)[reply]
  103. Support Support - Bcharles (talk) 23:14, 8 December 2015 (UTC)[reply]
  104. Support Support - Valuable proposal. The possibility of extending it to WebCite should also be investigated. - Pointillist (talk) 14:00, 9 December 2015 (UTC)[reply]
  105. Neutral Neutral - WebCite and other ones must be added, too, as one engine only is not reliable. Zezen (talk) 08:58, 10 December 2015 (UTC)[reply]
  106. Support Support Therud (talk) 09:16, 10 December 2015 (UTC)[reply]
  107. Support Support AlbinoFerret 18:26, 10 December 2015 (UTC)[reply]
  108. Support Support --João Carvalho (talk) 16:43, 11 December 2015 (UTC)[reply]
  109. Support Support --Edgars2007 (talk) 08:54, 12 December 2015 (UTC)[reply]
  110. Support Support Beagel (talk) 15:15, 12 December 2015 (UTC)[reply]
  111. Support Support --R. S. Shaw (talk) 03:01, 13 December 2015 (UTC)[reply]
  112. Support Support --Piramidion 13:10, 13 December 2015 (UTC)[reply]
  113. Support Support --ESM (talk) 16:01, 13 December 2015 (UTC)[reply]
  114. Support Support GREAT! Much needed!! --MisterSanderson (talk) 03:00, 14 December 2015 (UTC)[reply]
  115. Support Support --Davidpar (talk) 14:22, 14 December 2015 (UTC)[reply]
  116. Support Support We are at tewiki doing something similar. How to deal with caching already dead pages? --Rahmanuddin (talk) 15:00, 14 December 2015 (UTC)[reply]
  117. Support Support -- AshLin (talk) 18:43, 14 December 2015 (UTC)[reply]
  118. Support Support Armbrust (talk) 22:46, 10 January 2016 (UTC)[reply]
  119. Neutral Neutral-- I am Pascal Martin, creator of Wikiwix, the French Wikipedia link archiver since 2008. We host 80 million links from fr.wikipedia.org on our archive. We also archive the English (not sure all), Hungarian and Romanian link sources, although we do not link to these. I think that a better solution than having all links archived by IA would be to allow users a choice of which archive version to use. This already works very well on dead links on fr.wikipedia.org, as for example in references 4 and 5 on this article: https://fr.wikipedia.org/wiki/Front_de_gauche_(France). And if the Wikimedia Foundation is thinking of sponsoring someone for archives, Wikiwix would like to be considered for the job! Partner of the WMF since 2008 [3]

Pmartin (talk) 19:33, 12 February 2016 (UTC)[reply]