Research:Automated classification of edit types/Taxonomy
This page documents a complete and inclusive taxonomy of edit types. The goal is to capture every type of change that can describe editing activity on Wikipedia. The automated classification system will use a practical subset of these types; identifying that subset is left to a separate discussion.
Syntactic
These classes describe "what" was done during an edit, as opposed to "why".
Mechanical operations
These types of changes can be detected with simple regular expressions.
- wiki links
  - insert/delete
  - modify
  - disambiguate
- inter-wiki links
  - insert/modify/delete
- external links
  - insert/modify/delete
- category
  - insert/modify/delete
- headers
  - insert/modify/delete
- table
  - insert/modify/delete
- image
  - insert/modify/delete
- references
  - insert/modify/delete
- content move / refactor
- redirect
- cleanup
- punctuation
  - insert/delete
- whitespace
  - insert/delete
- formatting -- css/style/bold/italics
- punctuation
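As an illustration of the regular-expression approach, here is a minimal Python sketch that labels insert/modify/delete operations for a few of the element types listed above. The patterns, the `PATTERNS` table, and the `classify_mechanical` function are assumptions made for this example and are not part of the taxonomy. Real wikitext has many edge cases (templates, nested links in image captions, `<nowiki>` blocks) that these simple patterns do not handle.

```python
import re
from collections import Counter

# Hypothetical, deliberately simple patterns for a few element types.
PATTERNS = {
    "category": re.compile(r"\[\[\s*Category\s*:[^\]|]+(?:\|[^\]]*)?\]\]", re.IGNORECASE),
    "image": re.compile(r"\[\[\s*(?:File|Image)\s*:[^\]]+\]\]", re.IGNORECASE),
    "wiki link": re.compile(r"\[\[(?!\s*(?:Category|File|Image)\s*:)[^\]]+\]\]", re.IGNORECASE),
    "external link": re.compile(r"(?<!\[)\[https?://[^\s\]]+[^\]]*\]"),
    "header": re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE),
    "reference": re.compile(r"<ref\b[^>/]*>.*?</ref>|<ref\b[^>]*/>", re.IGNORECASE | re.DOTALL),
}


def extract(wikitext):
    """Return, for each element type, a multiset of the matched strings."""
    return {
        name: Counter(m.group(0) for m in pattern.finditer(wikitext))
        for name, pattern in PATTERNS.items()
    }


def classify_mechanical(before, after):
    """Compare two revisions and return (element type, operation) labels.

    An element that disappears while another of the same type appears is
    counted as a "modify"; this pairing is a heuristic, not an exact diff.
    """
    old, new = extract(before), extract(after)
    labels = []
    for name in PATTERNS:
        added = sum((new[name] - old[name]).values())
        removed = sum((old[name] - new[name]).values())
        modified = min(added, removed)
        if modified:
            labels.append((name, "modify"))
        if added > modified:
            labels.append((name, "insert"))
        if removed > modified:
            labels.append((name, "delete"))
    return labels


before = "== History ==\nThe [[cat]] is a [[mammal]].\n"
after = (
    "== History ==\n"
    "The [[cat]] is a [[mammal|small mammal]].<ref>Smith 2010</ref>\n"
    "[[Category:Animals]]\n"
)
print(classify_mechanical(before, after))
```

Comparing multisets of matched strings, rather than raw counts, lets the sketch tell a modified link apart from an unrelated insert plus delete only in the crudest way. A production classifier would more likely work on a token-level diff of the two revisions.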
Abstract/probabilistic operations
These classes cannot be detected trivially with regular expressions; they would require some form of machine prediction.
- Grammar (word-level)
  - punctuation, whitespace
  - spelling error, typo
  - capitalization
  - tense change
- Rephrase (word-level)
  - synonym
  - remove redundant words
- Sentence (sentence-level)
  - insert/modify/delete (substantive)
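The word-level classes above would need a trained model, but such a model still needs input features. The following Python sketch shows one way such features might be computed from a token-level diff using the standard-library `difflib` module. The function name `word_level_features`, the feature names, and the 0.8 similarity threshold are all assumptions chosen for illustration; they produce rough signals for a classifier, not final labels.

```python
import difflib
import re

TOKEN = re.compile(r"\w+|[^\w\s]")  # words, or single punctuation marks


def word_level_features(before, after, typo_threshold=0.8):
    """Count rough word-level change signals between two revisions."""
    old_tokens = TOKEN.findall(before)
    new_tokens = TOKEN.findall(after)
    features = {"capitalization": 0, "possible_typo_fix": 0, "punctuation": 0, "other": 0}

    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old_chunk, new_chunk = old_tokens[i1:i2], new_tokens[j1:j2]
        if tag == "replace" and len(old_chunk) == len(new_chunk):
            # One-for-one replacements: compare each pair of tokens.
            for old, new in zip(old_chunk, new_chunk):
                if old.lower() == new.lower():
                    features["capitalization"] += 1
                elif difflib.SequenceMatcher(a=old, b=new).ratio() >= typo_threshold:
                    # Very similar spelling: may be a typo fix (assumed threshold).
                    features["possible_typo_fix"] += 1
                else:
                    features["other"] += 1
        elif all(re.fullmatch(r"[^\w\s]", t) for t in old_chunk + new_chunk):
            # Only punctuation marks were inserted or deleted.
            features["punctuation"] += 1
        else:
            features["other"] += 1
    return features


print(word_level_features("the cat recieved a treat", "The cat received a treat."))
```

Classes such as tense change, synonym substitution, or substantive sentence edits fall into "other" here, which is where machine prediction would have to do the real work.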
Semantic
These classes describe "why" an edit was made. They usually amount to subjective applications of policy.
- NPOV
- Vandalism
- Notable?
- External link policy
- Manual of style
- New topic (article creation)
Complex operations
These classes describe changes that are part of a multi-edit operation.
- Merge
- Archiving
Discussion
These classes describe actions relevant to a discussion.
- New topic
- Reply
- !Vote (Support/oppose)
- Comment signing
- Suggestion
- WP tagging/assessment