Research:Develop a working definition for moderation activity and moderators
Moderation is focused on the social and governance work needed to sustain an online community. This entails the creation and revision of community values, rules, and norms, the social work required to support them (e.g., guiding discussion, modeling norms), and the technical work of enforcing the space’s boundaries (by removing content or users that fall outside those boundaries). We define moderators as the human actors responsible for the social, technical, and governance work needed to sustain an online community, including the creation, revision, and enforcement of community values, rules, and norms.
Based on a comprehensive literature review that includes academic and product-related work, we operationalize these definitions as a detailed list of moderation actions, which we categorize along six dimensions:
- Process Category: We group moderation actions by process type, for example Governance Work, Patrolling, or User Management.
- Significance for moderation: How clearly is the action moderation? Some activities can unambiguously be tagged as moderation, while others count as moderation only depending on context. This dimension classifies actions as Very, Somewhat, Tenuous, or Nonhuman.
- How common is it?: A classification of tasks according to how frequently they occur.
- User Groups: The relationship between the action and the need for extended user rights.
- Expertise: A categorization of the expertise needed to perform the action.
- Measurement: Based on an extensive data analysis, we categorize actions according to how easy or difficult they are to measure. For example, actions reflected in metadata (e.g., blocking a user) are simple to measure, while others, such as protecting articles from spam, are very difficult to measure.
Based on this approach, and focusing on measurable actions at the article level, we gain a deeper understanding of moderation practices across Wikipedia language editions. Specifically, we analyze moderation activity across 12 Wikipedia editions by examining all edits made in October 2024. Our analysis reveals considerable variation between editions: the percentage of edits categorized as moderation ranges from less than 1% in some editions, such as German (0.09%) and Polish (0.53%), to nearly 10% in others, like Russian (9.6%), while the English Wikipedia recorded 3.7% of edits as moderation actions.
In this report, we also discuss the challenges of measuring and interpreting certain moderation activities. For instance, a decrease in reverts or blocks could indicate either a reduction in moderation, suggesting that issues are being left unaddressed (negative), or it could reflect proactive intervention, where problems are being resolved before corrective actions are necessary (positive). Despite these complexities, the methodologies outlined in this work enable us to effectively monitor key moderation activities. This, in turn, opens up opportunities to provide support to moderators and identify significant changes in their practices.
Goal
WMF has been investing in understanding and supporting moderators through tool development over the past couple of years. The WMF product and feature teams want to become more specific with interventions, and for that they need to know who moderators are. The goal of this hypothesis is to arrive at a working definition of moderators and moderation activities that meets essential metric criteria.
State of Research
We have conducted extensive prior research on the following processes, which we wish to call “moderator work”:
- Patrolling. This is the act of reviewing incoming new edits, judging their quality, and then allowing them to persist or reverting them. Key publicly available pieces of research include the Patroller Work Habits Survey and the Patrolling on Wikipedia report.
- Administrator actions. This is concurrently being researched as part of SDS 1.2.2 Wikipedia Administrator Recruitment, Retention, and Attrition, and can be summarized as a subset of actions that frequently require administrator rights, and therefore tend to be carried out by administrators. These are:
- Blocking and unblocking users,
- Deleting pages,
- Changing page protection settings,
- Changing user rights, especially granting elevated user rights such as admin or other functionary roles.
- Checkuser workflows. This concerns the activities carried out by users with the checkuser right, which allows them to use the Checkuser tool, which can reveal the IP addresses and user agents of logged-in users. Because this tool deals heavily with personally identifying information, its use is highly restricted. The tool is largely used for sockpuppet investigations by the volunteer community, and in certain cases by WMF T&S in the course of investigations.
- Other administrative concerns. This catch-all category describes other work focused on administrators that is not concerned with the four core administrator actions. Examples of this work include the Content Moderation on Medium Sized Projects report.
External research. External academic research on volunteer moderation on Wikipedia has tended to focus almost exclusively on the role of administrators as a proxy for “moderator”, and even so, it rarely commits to defining the full scope of what it would consider “moderator work”. However, external research does exist, helpfully gesturing at some expanded definitions of what we ought to consider crucial to moderator work. Butler et al. (2008) describe how distributing the work of writing and updating policy allows English Wikipedia to sustain very complex policies. Karczewska (2024) discusses knowledge-sharing between Eastern European Wikipedia administrators as part of “organizing and creating knowledge processes”, and Schwitter (2024) discusses the role of offline social ties in influencing online voting behaviour in German Wikipedia administrator elections.
However, the core definitional problem in the existing literature on volunteer moderators is that there is no serious attempt to define the totality of volunteer moderator work; instead, the literature focuses on ways to appreciate and expand our understanding of the sprawling nature of that work. Commercial content moderation is more clearly defined and bounded (Roberts 2019, Gillespie 2018). In contrast, where research does attempt to describe the types of work that volunteer moderators engage in, it is expansive: proactive responses (Lo 2018), social modelling (Seering et al. 2017, Matias 2019), emotional labor (Dosono and Semaan 2019), volunteer governance (Matias 2019). Studies of volunteer moderation in non-text-based spaces indicate that moderators are also responsible for the rapid development, deployment, and iteration of novel moderation strategies, imaginaries, and processes (Jiang et al. 2019). In research on the divide between visible and invisible volunteer moderation work, a common refrain is the sheer heterogeneity of moderator activities (Li et al. 2022, Thach et al. 2024) as well as the fuzzy boundaries between work and pleasure (Wohn 2019).
Therefore, for this project, the existing literature is useful as a provocation and a reminder that what we may wish to define as “moderator work” is a very expansive category, one that may be better suited to different heuristic contexts and frameworks than to a more traditional categorization exercise. It also reminds us that much of moderation work is invisible and not capturable by conventional metrics, and that obtainable measures may only be proxies for moderator activity.
Qualitative Definitions
Having considered existing research in this field, we propose the following working definitions.
Moderation is focused on the social and governance work needed to sustain an online community. This entails the creation and revision of community values, rules, and norms, the social work required to support them (e.g., guiding discussion, modeling norms), and the technical work of enforcing the space’s boundaries (by removing content or users that fall outside those boundaries). Moderators are the human actors responsible for the social, technical, and governance work needed to sustain an online community, including the creation, revision, and enforcement of community values, rules, and norms. Note that non-human actors (such as bots) can carry out moderation actions, but we restrict the definition of "moderator" to human actors, since moderation relies on the subjective intention of the human taking the action, or of the human creating, directing, and modifying the non-human actor that takes the action.
Maintenance is the technical activity that allows the community space to exist in its current or desired form, focused on the creation and ongoing maintenance of the infrastructure that facilitates regular activity in the community. Good examples of maintenance work that is also moderation work include: the creation of templates or bots that facilitate a policy on a wiki (e.g. archival bots, Articles-for-Deletion templates, creation of maintenance categories). Non-moderator maintenance work might include things like renaming pages in accordance with the Manual of Style, gadget maintenance, contributions to MediaWiki, and so on.
From research work done in related areas, most notably SDS 1.2.2 Wikipedia Administrator Recruitment, Retention and Attrition, we can be reasonably certain that the populations of people involved in creating policy and in enforcing it are the same, or at least overlap significantly.
From definitions to measurements
Starting from a qualitative (purposefully comprehensive) spreadsheet on the variety of actions that editors take on-wiki that might be considered moderation, we discussed each set of processes and what aspects were measurable. Summary below:
Largely measurable
- Article maintenance (messageboxes, in-line cleanup tags): this is how editors flag content integrity issues within articles. We extract instances from HTML diffs and discuss findings in the data/statistics section.
- User blocks: largely covered by the logging table (superset dashboard). The one exception here is bans that aren't easily enforceable via blocks (e.g., telling a user to stop editing about politics). A sketch of how block actions can be counted from the logging table follows below.
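As an illustration of how block activity surfaces in metadata, here is a minimal sketch, assuming read access to a MediaWiki database replica (the host, schema, and credentials file are placeholders); it counts block, reblock, and unblock entries in the logging table for October 2024.

```python
import os
import pymysql

# Placeholder connection details; a real run would point at a wiki replica
# (e.g., from Toolforge or a stats host) with appropriate credentials.
conn = pymysql.connect(host="enwiki.analytics.db.svc.wikimedia.cloud",
                       database="enwiki_p",
                       read_default_file=os.path.expanduser("~/.my.cnf"))

# Block-related actions are stored in the logging table with log_type = 'block';
# log_action distinguishes block, reblock, and unblock events.
query = """
SELECT log_action, COUNT(*) AS n
FROM logging
WHERE log_type = 'block'
  AND log_timestamp BETWEEN '20241001000000' AND '20241031235959'
GROUP BY log_action
"""

with conn.cursor() as cur:
    cur.execute(query)
    for action, n in cur.fetchall():
        # Replica columns are binary; decode for readable output.
        action = action.decode() if isinstance(action, bytes) else action
        print(f"{action}: {n}")
```

Bans enforced only socially (e.g., topic bans communicated on talk pages) leave no such log entries, which is why they remain the exception noted above.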
We can measure parts
- Anti-spam (AbuseFilter; SpamBlacklist; TitleBlacklist; AutoModerator): we have pretty good insight into curation and how often these tools are triggered via logging. On the flip side, there are a number of automated bots in the community that do similar things and into which we have very little insight (e.g., COIBot, NSFW detection).
- Patrolling: in some wikis, this work is quite explicit (marking revisions as patrolled), but in many we only really see what is reverted (which could be a small proportion of the revisions that are actually checked). Where explicit patrol logs exist, they can be queried via the API, as sketched after this list.
- Page deletion and protection: outcomes are easy to detect but it's harder to measure the process/discussion aspect.
- New article review: varies by wiki but English Wikipedia's is relatively legible via PageTriage extension.
- User rights management: relatively easy to track when a user's rights have changed but harder to measure the requests/discussions/etc. that go into these decisions.
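For the processes above that leave public log entries (patrolling, deletions, protections, rights changes), the MediaWiki Action API exposes them through list=logevents. The snippet below is a minimal sketch (the wiki, log type, and limit are illustrative choices, not the project's actual pipeline) that pulls recent patrol log events from English Wikipedia.

```python
import requests

# Fetch recent patrol log events from the MediaWiki Action API.
# letype can be swapped for 'protect', 'delete', 'rights', etc.
resp = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={
        "action": "query",
        "list": "logevents",
        "letype": "patrol",  # log type to inspect (illustrative choice)
        "lelimit": 50,
        "format": "json",
    },
    headers={"User-Agent": "moderation-measurement-sketch/0.1"},
)
events = resp.json()["query"]["logevents"]

# Each entry includes the acting user, timestamp, and target page.
for ev in events[:5]:
    print(ev["timestamp"], ev.get("user"), ev["title"])
```

The same endpoint only shows what was logged; revisions that were reviewed but left untouched remain invisible, which is why these processes are only partly measurable.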
Largely not measurable
- Governance
- Committees (ArbCom, U4C, etc.): very little insight into how much work is happening around these.
- Revising policies and how they are implemented is another big space. In some rare cases like the SpamBlacklist, we can easily measure as URLs are added/removed but few areas of governance are this structured.
- Communicating: the big one here is giving feedback to other users about their actions (e.g., via User page messages, talk page comments, edit summaries, etc.). We know this is important but it's very diffuse and unstructured. Same with mentorship.
- Reporting or requesting moderation support: again these processes tend to be quite diffuse and unstructured so we have very little ability to measure what's happening in these spaces. A brighter spot is around Checkuser, which is a specific process that's pretty well-structured and we have more insight into the volume of activity.
How easy/difficult it is to measure | Moderation action category |
---|---|
Largely measurable | Article maintenance, user blocks |
We can measure parts | Anti-spam, patrolling, page deletion and protection, new article review, user rights management |
Largely not measurable | Governance, communication, reporting or requesting moderation support |
Data & Statistics
In order to understand moderators, we work with two different data sources. For “edit actions”, we deeply analyze each revision using the mwedittypes library, which allows us to understand, among other things, whether certain templates or infoboxes were added in a revision. This approach lets us capture complex moderation actions and distinguish them with high granularity. Its limitations are that it requires substantial computational resources to process all text, and that it is language-dependent, requiring adjustments per language. Moreover, it requires an HTML version of each Wikipedia article, which is not currently available and must be built in an extra step. For that reason, we build a second set of statistics based on the logging table, which is part of MediaWiki and contains metadata related to moderation (e.g., deleting a page). This approach misses relevant moderation actions, such as adding moderation-related templates, but doesn’t require additional computational resources given that it is based on existing data.
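To make the edit-actions approach concrete, here is a minimal sketch using the wikitext interface of the mwedittypes library (the pipeline described above works over HTML diffs, so this is an approximation; the toy wikitext and the rule for flagging moderation are illustrative assumptions, not the project's actual classification logic).

```python
from mwedittypes import SimpleEditTypes

# Toy before/after wikitext: the edit appends an inline cleanup template,
# which the edit-actions pipeline would count as a moderation action.
prev_wikitext = "London is the capital of England."
curr_wikitext = "London is the capital of England.{{Citation needed}}"

# SimpleEditTypes summarizes which node types were inserted/removed/changed.
diff = SimpleEditTypes(prev_wikitext, curr_wikitext, lang="en").get_diff()
print(diff)  # expected to include something like {'Template': {'insert': 1}, ...}

# Illustrative rule for this sketch only: treat any template insertion or
# removal as a moderation candidate. The real pipeline matches specific
# maintenance templates (messageboxes, inline cleanup tags) via HTML diffs.
template_changes = diff.get("Template", {})
is_moderation_candidate = bool(template_changes.get("insert") or template_changes.get("remove"))
print(is_moderation_candidate)
```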
Edit Actions
Based on the edit-actions approach, we derive the following insights and data outcomes:
- Percentage of edits in the wikis under study that are moderation-related (adding/removing messageboxes or inline cleanup tags), with an aggregation sketch after the table:
Wiki | % Moderation (ignoring revert-related) | % Moderation (considering revert-related) |
---|---|---|
dewiki | 0.06% | 7.42% |
arzwiki | 0.53% | 3.49% |
plwiki | 0.59% | 9.49% |
nlwiki | 0.98% | 9.80% |
itwiki | 1.30% | 13.06% |
frwiki | 1.46% | 9.91% |
eswiki | 1.59% | 35.71% |
svwiki | 3.14% | 7.68% |
zhwiki | 3.68% | 8.45% |
enwiki | 3.74% | 16.09% |
jawiki | 4.08% | 9.43% |
ruwiki | 9.59% | 16.52% |
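The per-wiki shares above can be reproduced from per-revision classification output. Here is a minimal aggregation sketch, assuming a hypothetical CSV with one row per revision and columns wiki, is_moderation, and is_revert_related (the file name and the treatment of revert-related edits are assumptions for illustration).

```python
import pandas as pd

# Hypothetical per-revision classification output for October 2024:
# columns: wiki (str), is_moderation (bool), is_revert_related (bool).
revs = pd.read_csv("revision_classification_2024_10.csv")

summary = revs.groupby("wiki").apply(
    lambda g: pd.Series({
        # Share of edits flagged as moderation, excluding revert-related edits.
        "% moderation (ignoring revert-related)": 100 * g["is_moderation"].mean(),
        # Share when revert-related edits are also counted as moderation.
        "% moderation (considering revert-related)": 100 * (g["is_moderation"] | g["is_revert_related"]).mean(),
    })
)
print(summary.sort_values("% moderation (ignoring revert-related)"))
```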
- Top templates being added / removed on English Wikipedia. Note that 134,907 edits on enwiki in that period had at least one moderation action (one edit can include multiple actions, giving 268,439 separate actions across those edits). You'll notice that most moderation messages are added slightly more often than they are resolved (removed):
Type of Moderation Action | # occurrences | % of Total Moderation Actions |
---|---|---|
inline:Wikipedia:Citation needed-add | 52331 | 19.50% |
inline:Wikipedia:Citation needed-remove | 42357 | 15.80% |
inline:Wikipedia:Link rot-add | 11050 | 4.10% |
inline:Wikipedia:Link rot-remove | 8745 | 3.30% |
mbox:Wikipedia:Citing sources-remove | 6607 | 2.50% |
mbox:Wikipedia:Citing sources-add | 6347 | 2.40% |
mbox:Wikipedia:Verifiability-add | 6336 | 2.40% |
mbox:Special:EditPage-add | 4875 | 1.80% |
mbox:Special:EditPage-remove | 4865 | 1.80% |
mbox:Wikipedia:Verifiability-remove | 4848 | 1.80% |
The code for extracting moderation actions can be found here, and the code for analyzing the data can be found here.
Logged Actions
In addition to “edit actions”, we have built a second set of statistics based on “logged actions”. These are actions that we capture from metadata generated by the MediaWiki software. They are simpler and more limited in scope than the edit actions described above, but also easier to capture and track over time.
We have begun by creating a dataset with a census of actions recorded in the MediaWiki logging table. The spreadsheet not only lists these actions but also includes our evaluation of whether each action is related to moderation activities. We have also categorized moderation actions, making a distinction between:
- Content moderation: abusefilter, delete, hide, lock, managetags, merge, move, pagetriage-copyvio, pagetriage-curation, patrol, protect, review, stable, suppress, tag, upload.
- User moderation: block, delete, gblblock, globalauth, rights.
To give a sense of the scale, the spreadsheet includes the number of these actions on multiple language editions of Wikipedia (enwiki, eswiki, frwiki, idwiki, ruwiki, arwiki) over the period from January to October 2024.
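Here is a minimal sketch of how such counts can be produced from the logging table, again assuming replica database access (host, schema, and credentials file are placeholders). The content/user groupings mirror the lists above; note that 'delete' appears in both groups in our categorization, and a real pipeline would disambiguate it via log_action or namespace, which this sketch omits.

```python
import os
import pymysql

# Log types treated as moderation, grouped as in the lists above.
# 'delete' is listed under both content and user moderation in the report;
# disambiguating it requires log_action/namespace and is omitted here.
CONTENT_MODERATION = {"abusefilter", "delete", "hide", "lock", "managetags",
                      "merge", "move", "pagetriage-copyvio", "pagetriage-curation",
                      "patrol", "protect", "review", "stable", "suppress",
                      "tag", "upload"}
USER_MODERATION = {"block", "gblblock", "globalauth", "rights"}

conn = pymysql.connect(host="enwiki.analytics.db.svc.wikimedia.cloud",
                       database="enwiki_p",
                       read_default_file=os.path.expanduser("~/.my.cnf"))

# Count logged actions per log_type for January-October 2024.
query = """
SELECT log_type, COUNT(*) AS n
FROM logging
WHERE log_timestamp BETWEEN '20240101000000' AND '20241031235959'
GROUP BY log_type
"""

content_total, user_total = 0, 0
with conn.cursor() as cur:
    cur.execute(query)
    for log_type, n in cur.fetchall():
        # Replica columns are binary; decode before comparing.
        log_type = log_type.decode() if isinstance(log_type, bytes) else log_type
        if log_type in CONTENT_MODERATION:
            content_total += n
        elif log_type in USER_MODERATION:
            user_total += n

print("content moderation actions:", content_total)
print("user moderation actions:", user_total)
```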
A preliminary exploration of the dashboard has already revealed some patterns that should be taken into account for a characterization of content moderation activities:
- The distribution of actions over time shows continuous activity, with noticeable peaks that presumably occur in response to incidents within the community.
- The moderation workload is unevenly distributed among users, with some bots functioning as super-moderators.
- Some moderation actions may be very popular in some wikis but marginal or even non-existent in others, e.g., review in ruwiki, or patrol in enwiki and frwiki.
Conclusion and Recommendations
- We often have good insight into explicit outcomes of moderation (page is deleted, edit is reverted, user is blocked, etc.) but much less insight into the processes that lead to those outcomes, and all of the content/users that are reviewed but for whom no follow-up actions are taken.
- Where we have built centralized tooling for a process, we generally have reasonably good data about usage. When we have left a process largely to the community, it often is much harder to measure. There are some exceptions where e.g., standardized templates for sockpuppet investigations make that process a bit more legible.
- The levels of difficulty we have defined when measuring content moderation reveal a major limitation of any essential metric: some actions by Wikipedians may occur off-wiki or fail to leave traces (i.e., work that doesn't fit metrics).
- It's hard to know how much of "moderation" we can measure at this point. We can certainly measure outcomes for a number of important processes but it's harder to know how to interpret differences in these numbers. For example, a drop in reverts or blocks could either mean that there's less moderation and issues are not being addressed (bad) or it could mean that issues are being addressed before they require corrective actions (good).
- While it will be hard to measure the volume of moderation on a wiki in a useful way, hopefully we can still extract useful trends around what types of users are taking corrective actions in a given wiki, and identify notable gaps (e.g., minimal automated moderation, newer editors not getting involved, etc.).
- The methodology based on edit types gives us more detailed information about moderation actions than the data available in the logging table. However, the lack of historical HTML dumps limits the sustainability of this approach.
- Potential Paths Forward:
- Productionize: Choose a key moderation process where we intend to build product interventions and refine our metrics in that space.
- Generalize: extend edit actions measurement to not just include moderation but also attempt to differentiate between other forms of work (e.g., maintenance, generation, etc.). Focus on productionizing an HTML-based diff pipeline for this.
- Understand: qualitatively explore the differences we see in the quantitative data to understand what is causing them – i.e. why do different language editions engage in these actions at different rates?
- Refine: work with current data but spend time adding more facets/metadata about the editors and context under which these moderation actions are occurring.
Related studies and references
- Butler, B., Joyce, E., & Pike, J. (2008). Don’t look now, but we’ve created a bureaucracy: The nature and roles of policies and rules in Wikipedia. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1101–1110. [1](https://doi.org/10.1145/1357054.1357227)
- Dosono, B., & Semaan, B. (2019). Moderation Practices as Emotional Labor in Sustaining Online Communities: The Case of AAPI Identity Work on Reddit. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–13. [2](https://doi.org/10.1145/3290605.3300372)
- Gillespie, T. (2018). Custodians of the Internet: Platforms, content moderation, and the hidden decisions that shape social media. Yale University Press.
- Jiang, J. A., Kiene, C., Middler, S., Brubaker, J. R., & Fiesler, C. (2019). Moderation Challenges in Voice-based Online Communities on Discord. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1–23. [3](https://doi.org/10.1145/3359157)
- Karczewska, A. (2024). Knowledge Sharing in the Adminship Communities of the Eastern European Versions of Wikipedia Pages. European Conference on Knowledge Management, 25(1), 351–360. [4](https://doi.org/10.34190/eckm.25.1.2761)
- Li, H., Hecht, B., & Chancellor, S. (2022). All That’s Happening behind the Scenes: Putting the Spotlight on Volunteer Moderator Labor in Reddit. Proceedings of the International AAAI Conference on Web and Social Media, 16, 584–595. [5](https://doi.org/10.1609/icwsm.v16i1.19317)
- Lo, C. (2018). When All You Have is a Banhammer: The Social and Communicative Work of Volunteer Moderators [Master’s thesis, MIT]. [6](https://cmsw.mit.edu/wp/wp-content/uploads/2018/05/Claudia-Lo-When-All-You-Have-Is-a-Banhammer.pdf)
- Matias, J. N. (2019a). Preventing harassment and increasing group participation through social norms in 2,190 online science discussions. Proceedings of the National Academy of Sciences, 116(20), 9785–9789. [7](https://doi.org/10.1073/pnas.1813486116)
- Matias, J. N. (2019b). The Civic Labor of Volunteer Moderators Online. Social Media + Society, 5(2), 2056305119836778. [8](https://doi.org/10.1177/2056305119836778)
- Roberts, S. T. (2019). Behind the Screen: Content Moderation in the Shadows of Social Media. Yale University Press. [9](https://doi.org/10.2307/j.ctvhrcz0v)
- Schwitter, N. (2024). Offline connections, online votes: The role of offline ties in an online public election. New Media & Society, 14614448241274456. [10](https://doi.org/10.1177/14614448241274456)
- Seering, J., Kraut, R., & Dabbish, L. (2017). Shaping Pro and Anti-Social Behavior on Twitch Through Moderation and Example-Setting. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 111–125. [11](https://doi.org/10.1145/2998181.2998277)
- Thach, H., Mayworm, S., Delmonaco, D., & Haimson, O. (2024). (In)visible moderation: A digital ethnography of marginalized users and content moderation on Twitch and Reddit. New Media & Society, 26(7), 4034–4055. [12](https://doi.org/10.1177/14614448221109804)
- Wohn, D. Y. (2019). Volunteer Moderators in Twitch Micro Communities: How They Get Involved, the Roles They Play, and the Emotional Labor They Experience. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–13. [13](https://doi.org/10.1145/3290605.3300390)