Research:Disinformation, Wikimedia and Alternative Content Moderation Models: Possibilities and Challenges
This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.
Wikipedia is a global platform that generates nonproprietary knowledge with joint ownership of aggregated user-generated content (Greenstein & Zhu, 2012). The model turns the traditional delivery of information by website providers into the active co-construction of resources by communities of contributors (Okoli, 2009). From this standpoint, Wikipedia is a true Internet-based community that, like all communities, is ruled by a set of social norms peculiar to its own composition (Lessig, 2006, pp. 83-86). In its case, the community is committed to a set of values that stem from the fact that it is an encyclopedic project in dialogue with previous encyclopedic efforts but built in an innovative and communitarian fashion (De Laat, 2012). This is what made Wikipedia exciting at first. Over time, it has become one of the most important sources of reliable information on the Internet: by October 2022, Wikipedia had over 44.4 million registered editors (Serbanuta & Constantinescu, 2022). The way the community manages disagreements, especially over the *trustworthiness* of sources, is a worthy research question with potential impact on several fields, including disinformation research, content moderation, and polarization.
Methods
Within the scope of this research, the narrow focus we selected increases the likelihood of success of the research method we propose: trace ethnography[1]. Through this method we will look at how these articles evolved and at the kinds of discussions they generated among the Wikipedians who proposed edits. Furthermore, we want to understand how editors see their role, how they counter the polarization that affects their communities, and what strategies they develop to deal with the tensions that naturally ensue.
This calls for a brief account of the methodology. Stuart Geiger and David Ribes define trace ethnography as a form of institutional ethnography that takes advantage of the rich documentation produced in "highly technologically-mediated systems"[2].
"Analysis of these detailed and heterogeneous data---which include transaction logs, version histories, institutional records, conversation transcripts, and source code---can provide rich qualitative insight into the interactions of users, allowing us to retroactively reconstruct specific actions at a fine level of granularity. Once decoded, sets of such documentary traces can then be assembled into rich narratives of interaction, allowing researchers to carefully follow coordination practices, information flows, situated routines, and other social and organizational phenomena across a variety of scales".[3]
In that sense, our research will start from those documents, especially the 'Discussion' tab of the three entries we have identified, as a first object of analysis. We expect to follow in detail the discussions most relevant to our research question, and to interview the editors and contributors who took part in them. (We depend to some extent on the collaboration of local Wikipedia communities in this effort and have already contacted some of them.) Towards the final stage of our data gathering, we will likely also interview senior members of Wikipedia's bureaucracy as well as Wikimedia Foundation employees supporting these efforts from different sectors (engineering, policy, legal, trust and safety, etc.).
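In practice, the documentary traces Geiger and Ribes describe (version histories, user names, edit summaries) can be collected programmatically. Below is a minimal Python sketch of how a page's revision history could be requested from the MediaWiki API; the endpoint and page title are illustrative, and the actual HTTP request is left out:

```python
# Sketch: building a MediaWiki API query for a page's revision history
# (timestamp, user, edit summary, size), oldest revision first.
# The endpoint and title below are illustrative examples.
from urllib.parse import urlencode

API = "https://es.wikipedia.org/w/api.php"  # Spanish Wikipedia endpoint

def revision_query_params(title, limit=500):
    """Parameters for action=query&prop=revisions on a given page."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment|size",
        "rvlimit": limit,
        "rvdir": "newer",        # chronological (oldest first)
        "format": "json",
        "formatversion": 2,
    }

params = revision_query_params("Alberto Nisman")
url = API + "?" + urlencode(params)
```

The same query against a title prefixed with the talk namespace (e.g. "Discusión:…" on Spanish Wikipedia) retrieves the revision trace of the discussion page itself.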
Results
Only the Nisman page carries a warning about disagreement over the Neutral Point of View (NPOV) that Wikipedia guidelines require. It is the page with the most page views and the most single-edit editors (Table 1). The page on the Argentinian prosecutor has had regular activity over the four years we examined, in line with the salience of biography pages. The user Rosarino made 158 edits between 2015-01-19 and 2017-12-08, with an Average Time Between Edits (ATBE) of 6.7 days.
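Assuming ATBE is defined as the elapsed time between a user's first and last edit divided by the number of intervals between edits, the reported figure can be reproduced with a short Python sketch:

```python
# ATBE = days between first and last edit / number of intervals (edits - 1).
# Reproduces the 6.7-day figure reported for the user Rosarino.
from datetime import date

def atbe(first, last, n_edits):
    """Average Time Between Edits, in days."""
    return (last - first).days / (n_edits - 1)

rosarino = atbe(date(2015, 1, 19), date(2017, 12, 8), 158)
print(round(rosarino, 1))  # 6.7
```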
| Name of the page (original language) | Unique editors | Edits by top editor (%) | Edits made by the top 6 editors |
|---|---|---|---|
| “Crisis política en Bolivia de 2019” (Spanish) | 353 | 6% | 556 |
| “Alberto Nisman” (Spanish) | 857 | 9% | 503 |
| “Operação Lava Jato” (Portuguese) | 465 | 26% | 1666 |
Considering the editors, the Brazilian page had the most active users (Table 1). The user Istambul made 712 edits between 2015-07-14 and 2021-02-11, with an ATBE of 2.5 days; the account has been inactive since February 2022. The second most active user was Leandrod, with 552 edits. In contrast, single-edit users accounted for 58% of the edits. Single-edit users were the majority on all three pages (Table 1). The pages differed, however, in the share of edits made by the top 6 editors: the two Spanish-language pages showed a similar trend, while editing on the Brazilian page was concentrated among the top editors.
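The concentration measures reported in Table 1 can all be derived from a revision log with one username per edit. The following Python sketch shows the computation over a synthetic log, not the actual revision data:

```python
# Sketch: concentration measures of Table 1 from a revision log
# (one username per edit). The sample log below is synthetic.
from collections import Counter

def concentration(edit_log, top_k=6):
    """Unique editors, top-editor share, top-k edit count, single-edit users."""
    counts = Counter(edit_log)
    total = len(edit_log)
    ranked = [n for _, n in counts.most_common()]
    return {
        "unique_editors": len(counts),
        "top_editor_share": ranked[0] / total,
        "top_k_edits": sum(ranked[:top_k]),
        "single_edit_users": sum(1 for n in ranked if n == 1),
    }

log = ["A"] * 5 + ["B"] * 3 + ["C", "D"]   # 10 edits by 4 editors
stats = concentration(log)
print(stats)
```

Applied to the full revision history of each page, `top_editor_share` yields the percentages in the third column of Table 1 and `top_k_edits` the fourth.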
Prosecutor death (Argentina)
There were three editing peaks: January 2015 (249 modifications), December 2015 (107 modifications), and December 2019 (47 modifications). The first came in January 2015, after the death of prosecutor Alberto Nisman on January 18. During this month, the page underwent numerous transformations because of two opposing hypotheses about the prosecutor's death: suicide, the official version, or murder (Fig. 2).
The second peak, in December 2015, was related to Judge Fabiana Palmaghini's decision to remove the prosecutor Viviana Fein. Another critical event was the decision of the Mauricio Macri administration (2015-2019), on December 14th, to suspend the agreement with Iran signed in 2013 between former president Cristina Fernández de Kirchner (2007-2015) and Iranian president Mahmoud Ahmadinejad. This marked a turning point in the judicial investigation of the AMIA bombing, and it ignited public debate, mainly around the responsibility of the former administrations and the performance of prosecutor Alberto Nisman.
The third peak of activity occurred in December 2019 and was related to two news events. In the judicial arena, the tribunal exempted former president Cristina Fernández from the preventive detention imposed on her over the agreement with Iran; the decision came after she took office as vice-president under Alberto Fernández. The same month, Netflix released the trailer for the documentary series “Nisman: The Prosecutor, the President and the Spy”, which included testimonies of previously unheard witnesses such as Jaime Stiusso, a former agent of the Intelligence Secretariat.
Almost one in ten edits came from a single editor (Rosarino). The six most active editors accounted for one-third of the edits (Fig. 3).
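The monthly peaks described above can be recovered by binning revision timestamps (in ISO 8601 form, as the MediaWiki API returns them) by month. A minimal sketch with synthetic timestamps:

```python
# Sketch: locating monthly editing peaks from revision timestamps.
# Timestamps are ISO 8601 strings; the sample data below is synthetic.
from collections import Counter

def monthly_peaks(timestamps, k=3):
    """Return the k months with the most edits as (YYYY-MM, count) pairs."""
    by_month = Counter(ts[:7] for ts in timestamps)  # 'YYYY-MM' prefix
    return by_month.most_common(k)

sample = (["2015-01-19T10:00:00Z"] * 4
          + ["2015-12-14T09:30:00Z"] * 2
          + ["2019-12-01T18:45:00Z"])
print(monthly_peaks(sample))  # [('2015-01', 4), ('2015-12', 2), ('2019-12', 1)]
```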
Electoral crisis (Bolivia)
The editing peaks of this page were in November 2019 (496 edits), July 2020 (64 edits), and July 2021 (62 edits) (Fig. 3). As with Alberto Nisman's page, the month of the page's creation (it was created on November 10) showed the highest activity. That day, the Organization of American States (OAS) published a report alleging a "clear manipulation" of the October 20, 2019 elections by the government of Evo Morales, against Morales' version that he and his vice president, Álvaro García Linera, were victims of a *coup d'état*. Major political events occurred in that first month. On November 12, Jeanine Áñez, the second vice-president of the Senate, proclaimed herself president of Bolivia. On November 15, eight protesters were killed in a clash between the Bolivian National Police and Morales' supporters.
The second peak occurred in July 2020, when the Supreme Electoral Tribunal (TSE) postponed the presidential elections because of the COVID-19 crisis. Former president Evo Morales, still in exile, strongly rejected the postponement, as did the militants of his party (Movimiento al Socialismo, MAS). The third peak occurred in July 2021, when the Prosecutor's Office closed the case for alleged fraud, ruling that there had only been "negligence", based on an expert report by the University of Salamanca in Spain. By then, Evo Morales' former Minister of Economy, Luis Arce, had won the elections and intended to change the narrative of the 2019 crisis.
In this case, one in four edits came from the six most active editors (Fig. 5). No single editor accounted for a majority of the edits.
Corruption case (Brazil)
The three peaks of the *Operação Lava Jato* page were in March 2016 (212 modifications), January 2017 (137 modifications), and April 2017 (128 modifications) (Fig. 6). Here, the highest point of editing coincided with the detention of former president Lula da Silva for questioning: on March 4, 2016, a raid took place at the former president's home, and the request for his preventive detention by the Public Ministry of São Paulo intensified the public debate on Wikipedia.
The second peak of editing came in January 2017, when Teori Zavascki, the Supreme Federal Court justice supervising the cases involving the state-owned company Petrobras, died in a plane accident. The public debate concerned his replacement and the court's decisions after the fatal event. The third peak occurred when the Supreme Court opened investigations into eight ministers of President Michel Temer's cabinet, 29 senators, 40 deputies, and three governors, drawing other political forces into what had been framed as a Workers' Party (PT) issue.
Compared with the other cases, nearly half of the edits were concentrated in two editors: Istambul (26%) and Leandrod (20%) (Fig. 7). Only these accounts could be called super editors, as together they led almost half of the contributions.
Discussion and Conclusion
These findings informed the hypotheses for the preliminary analysis of public data. We examined whether the wisdom of the crowd prevails on controversial issues or whether only a few people produce most of the content, sustaining a power-concentration model. The preliminary results show that a distributed content model prevails: even on controversial issues, many contributors participated.
Wikipedia’s community-led model may offer insight into a core issue within the disinformation dilemma: authoritative sources versus crowdsourced editors. Our preliminary conclusions showed the prevalence of contributors with one or two contributions. Contrary to previous conclusions, the production of most of the content is distributed. Further analysis could examine the contributions of each editor in depth.
The chronology of political events stimulated discussion of the Wikipedia articles. This suggests the hypothesis that Wikipedia is a place where society sustains public debate in controversial situations. In polarized societies such as Argentina, Brazil, and Bolivia, the Wikipedia model keeps the dialogue under common rules accepted by the community. The number of comments is similar across the three cases, so daily activity seems to have a natural ceiling. The transparency of the process also appears to contribute to an open debate: accessible data about each contribution and its author facilitate community oversight and the cross-checking of content. Similarly, the tools that expose data on platform activity allow users to monitor one another.
Some studies have argued that Wikipedia is more reliable than traditional encyclopedias (Atapour et al., 2023; Giles, 2005). Even compared to alternative sources on the Web, Wikipedia is a reliable source of knowledge. Wikipedia is considered a model of user-generated content: “A striking feature is that Wikipedians are not only invited to edit articles, but are also allowed to, at least to a certain extent, participate in financial decision-making, software development, and the formulation and enforcement of strategies, policies, and guidelines” (Rijshouwer et al., 2023, p. 1290).
The user-generated content of Wikipedia espouses a neutral point of view (NPOV) policy, according to which every entry should be represented fairly, proportionately, and as far as possible without bias (Callahan & Herring, 2011). The question of how Wikipedians resolve controversial articles is connected to a fundamental challenge of political polarization. In an environment of affective polarization, denial and rejection depend on identity (Törnberg et al., 2021). Reliable, fact-based knowledge is a prerequisite for meaningful democratic deliberation, and in politically polarized environments, drafting Wikipedia entries on controversial topics is bound to be challenging: Wikipedia editors are subject to the polarization dynamics that affect their larger communities. Information and communication in these communities are accessible in real time to every user (Rijshouwer et al., 2023), which enables participants to hold each other accountable.
Extensive literature on collective intelligence and the “wisdom of the crowd” suggests that crowdsourcing is more reliable for fighting misinformation than professional fact-checkers (Pennycook & Rand, 2019). Previous studies found that the percentage of people who produce most of the content is tiny (Baeza-Yates & Saez-Trumper, 2015). A similar conclusion could be proposed from our observations.
Wikipedia’s rules and processes of engagement and deliberation invite extreme positions to reach some degree of consensus on the sources that can be used to write and source an article. Public data expose users' activity, so their online reputation is as valuable as their point of view. Wikipedia’s community-led model may offer insight into a core issue within the disinformation dilemma: whether distributed sources can be more reliable than authoritative power. In polarized societies, user-generated content could be a civic tool for settling differences.
Dissemination
[Revised; partially postponed to Feb 2025 due to RightsCon]
- Derivative materials: active, 2024-04-01 to 2024-05-01
- Discussion in selected seminars: active, 2024-04-01 to 2024-06-30
- Presentation at RightsCon 2024: active, 2024-06-01 to 2024-06-30
Policy, Ethics and Human Subjects Research
Because our research is based on observation and review of documented data, we do not expect any kind of disruption in the process of our research.
References
Atapour, H., Khalilzadeh, S., & Zavaraqi, R. (2023). Comparison of Stanford Encyclopedia of Philosophy and Wikipedia Articles’ References: In Search of Evidence for Wikipedia Credibility. Journal of Scientometric Research, 12(2), 469–479. https://doi.org/10.5530/jscires.12.2.043
Baeza-Yates, R., & Saez-Trumper, D. (2015). Wisdom of the crowd or wisdom of a few? An analysis of users’ content generation. HT 2015 – Proceedings of the 26th ACM Conference on Hypertext and Social Media, 69–74. https://doi.org/10.1145/2700171.2791056
Callahan, E. S., & Herring, S. C. (2011). Cultural bias in Wikipedia content on famous persons. Journal of the American Society for Information Science and Technology, 62(10), 1899–1915. https://doi.org/10.1002/asi.21577
Geiger, R. S., & Ribes, D. (2011). Trace ethnography: Following coordination through documentary practices. Proceedings of the 44th Hawaii International Conference on System Sciences. https://doi.org/10.1109/HICSS.2011.455
Giles, J. (2005). Internet encyclopaedias go head to head. Nature, 438(7070), 900–901. https://doi.org/10.1038/438900a
Greenstein, S., & Zhu, F. (2012). Is Wikipedia biased? American Economic Review, 102(3), 343–348. https://doi.org/10.1257/aer.102.3.343
Lessig, L. (2006). Code: Version 2.0. New York: Basic Books.
Okoli, C. (2009). A brief review of studies of Wikipedia in peer-reviewed journals. Proceedings of the 3rd International Conference on Digital Society, ICDS 2009, 155–160. https://doi.org/10.1109/ICDS.2009.28
Pennycook, G., & Rand, D. G. (2019). Fighting misinformation on social media using crowdsourced judgments of news source quality. Proceedings of the National Academy of Sciences of the United States of America, 116(7), 2521–2526. https://doi.org/10.1073/PNAS.1806781116
Rijshouwer, E., Uitermark, J., & de Koster, W. (2023). Wikipedia: A self-organizing bureaucracy. Information, Communication & Society, 26(7), 1285–1302. https://doi.org/10.1080/1369118X.2021.1994633
Serbanuta, C., & Constantinescu, M. (2022). Learning by Wikipedia’s NPOV principle: An online dynamic experience. Proceedings of the International Conference on Virtual Learning, 17, 205–216. https://doi.org/10.58503/icvl-v17y202217
Törnberg, P., Andersson, C., Lindgren, K., & Banisch, S. (2021). Modeling the emergence of affective polarization in the social media society. PLoS ONE, 16(10), e0258259.
- ↑ Emiel Rijshouwer, Organizing Democracy: Power concentration and self-organizing bureaucratization in the evolution of Wikipedia (Erasmus University Rotterdam, Jan. 2019), p. 40; R. Stuart Geiger & David Ribes, Trace Ethnography: Following Coordination through Documentary Practices, 2011 44th Hawaii International Conference on System Sciences 1 (IEEE, Jan. 2011).
- ↑ R. Stuart Geiger & David Ribes, Trace Ethnography: Following Coordination through Documentary Practices, 2011 44th Hawaii International Conference on System Sciences 1 (IEEE, Jan. 2011), p. 1.
- ↑ R. Stuart Geiger & David Ribes, Trace Ethnography: Following Coordination through Documentary Practices, 2011 44th Hawaii International Conference on System Sciences 1 (IEEE, Jan. 2011), p. 1.