Research talk:Understanding Wikidata's Value/Work log/2017-09-02
Saturday, September 2, 2017
The following are our hypotheses about how Wikidata's predicted entity quality and page views (in client pages) relate to edit types (bot, semi-automated, logged-in user, and anon). We use ORES to predict quality on a scale from E to A, where A is the best quality. We use the Perfect Alignment Hypothesis to derive 5 classes of page view counts.
Hypothesis 1: When the entity quality class is fairly low (D or C) and the entity's views are in a lower class than the quality, bot and semi-automated edits make up a higher proportion of edits than they do for aligned entities of the same quality. Logged-in user and anon edits are correspondingly lower in proportion.
Rationale for Hypothesis 1: Bots create misalignment by improving poor-quality, low-viewed entities to a somewhat better quality class.
Hypothesis 2: Aligned entities in a high quality class (B or A) have a higher percentage of human (logged-in user and anon) edits than aligned entities in lower quality classes. Bot and semi-automated edits are correspondingly lower in proportion.
Hypothesis 3: Misaligned entities in a high quality class (B or A) have a higher percentage of human (logged-in user and anon) edits than misaligned entities in lower quality classes. Bot and semi-automated edits are correspondingly lower in proportion.
Rationale for Hypotheses 2 and 3: It takes human (logged-in user and anon) edits to bring an entity up to really high quality.
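The comparisons in these hypotheses come down to edit-type proportions within groups defined by quality class and alignment. The following is a minimal sketch of how such proportions might be tabulated, not the analysis code used for this project. It assumes that the 5 view classes are numbered 1 (lowest) to 5 (highest) and that an entity is "aligned" when its view class equals the rank of its quality class (E = 1 up to A = 5). The function names, the tuple layout, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: quality classes map onto the same 1-5 scale as the view classes.
QUALITY_RANK = {"E": 1, "D": 2, "C": 3, "B": 4, "A": 5}
EDIT_TYPES = ("bot", "semi-automated", "logged-in", "anon")


def alignment(quality, view_class):
    """Compare an entity's view class (1-5) with the rank of its quality class."""
    rank = QUALITY_RANK[quality]
    if view_class == rank:
        return "aligned"
    return "views_lower" if view_class < rank else "views_higher"


def edit_type_proportions(edits):
    """Tabulate edit-type proportions per (quality, alignment) group.

    edits: iterable of (quality, view_class, edit_type) tuples, one per edit.
    Returns {(quality, alignment): {edit_type: proportion}}.
    """
    counts = defaultdict(Counter)
    for quality, view_class, edit_type in edits:
        counts[(quality, alignment(quality, view_class))][edit_type] += 1

    result = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        result[group] = {t: counter[t] / total for t in EDIT_TYPES}
    return result


# Made-up toy data for illustration only; these are not real Wikidata edits.
sample = [
    ("C", 3, "bot"), ("C", 3, "logged-in"), ("C", 3, "logged-in"), ("C", 3, "anon"),
    ("C", 1, "bot"), ("C", 1, "bot"), ("C", 1, "semi-automated"), ("C", 1, "logged-in"),
]
proportions = edit_type_proportions(sample)
```

Under these assumptions, Hypothesis 1 would be checked by comparing the bot and semi-automated shares in a group such as `("C", "views_lower")` against the same shares in `("C", "aligned")`. Hypotheses 2 and 3 would compare the human shares across quality classes within the aligned and misaligned groups respectively.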