
Wikidata For Wikimedia Projects/Projects/Wikidata entity usage count

From Meta, a Wikimedia project coordination wiki
Tracked in Phabricator:
Task T279069

Fix Wikidata entity usage tracking and access count


Background


Parser functions are an integral component of MediaWiki and Wikimedia pages, allowing editors to dynamically generate content on wiki pages.
Because each parser function draws a little more processing or computational power, adding many functions to a single wiki page can lead to slow rendering times, inconsistent page caching, and incremental load on the servers. To help Wikimedians analyse which resources are accessed and used when loading a page, the New Preprocessor (NewPP) parser report was created. It lists a series of statistics about what is called and tracked when rendering the page.
One of the statistics tracked is the number of Wikibase entities being accessed, with a per-page limit of 400.

The Problem


When the number of Wikibase entities accessed on an individual wiki page exceeds the threshold (400), the NewPP parser report resets the count to 0 and resumes counting at the next Wikibase entity, resulting in a false count in the report. Additionally, any succeeding entities are accessed and rendered as normal, which means the limit is not currently enforced. The limit has a purpose: to ensure individual wiki pages do not take too long to render or place too much computational load on Wikidata's servers.
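The counting bug can be sketched as follows. This is an illustrative Python model, not the actual Wikibase code: `buggy_count` reproduces the reported reset-and-resume behaviour, while `fixed_count` shows the intended saturating counter.

```python
LIMIT = 400  # per-page Wikibase entity limit reported by the NewPP parser report


def buggy_count(accesses: int) -> int:
    """Model of the reported bug: once the limit is exceeded, the counter
    resets and resumes at the next entity, so 401 accesses report as 1/400."""
    count = 0
    for _ in range(accesses):
        count += 1
        if count > LIMIT:
            count = 1  # reset and resume counting at the next entity
    return count


def fixed_count(accesses: int) -> int:
    """Intended behaviour: the counter saturates at the limit (400/400)."""
    return min(accesses, LIMIT)
```

Under this model, a page with 401 entity accesses would report a count of 1/400 before the fix, and 400/400 after it.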

The Solution


We would like to make improvements in two ways:

  • Accurate Count in the NewPP Parser Report

When the threshold of Wikibase entity accesses is reached, the count will stop at 400/400. It will no longer reset to 0 and begin anew.

  • Enforce entity limits with improved, context-aware error messaging

If an error message is displayed because the page contains more entity access requests than the limit allows, we want to ensure the error message is clear and provides context on why it was generated (and what actions the editor can take to avoid it). Two errors are currently generated:

  1. Too many Wikidata entities accessed. Number of entities loaded: 401/400.
displayed for the 401st entity accessed on the page.

  2. Failed to render *P(n) property: Too many entities loaded, must not load more than 400 entries
displayed for any further entity access requests.

*where P(n) refers to the specific property number of the Wikidata page being accessed
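The intended enforcement can be sketched as a lookup that refuses further accesses once the limit is reached. This is a hedged Python sketch; `RestrictedLookup` and `EntityLimitError` are illustrative names that only mirror the behaviour described above, not the actual Wikibase classes.

```python
ENTITY_LIMIT = 400  # per-page limit on Wikibase entity accesses


class EntityLimitError(Exception):
    """Raised when a page tries to load more entities than the limit allows."""


class RestrictedLookup:
    """Counts entity accesses for one page and enforces the limit."""

    def __init__(self, limit: int = ENTITY_LIMIT):
        self.limit = limit
        self.count = 0

    def get_entity(self, entity_id: str) -> str:
        if self.count >= self.limit:
            # Mirrors the second error message shown to editors.
            raise EntityLimitError(
                f"Failed to render {entity_id} property: Too many entities "
                f"loaded, must not load more than {self.limit} entries"
            )
        self.count += 1
        return f"entity:{entity_id}"
```

With enforcement in place, the first 400 lookups succeed and every later lookup fails with the error instead of rendering.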

Status and next steps

Below is a timeline of the progress of this task:

1.0 Yes Researching the issue

  • Investigated the full scope of the error

1.1 Yes Build tracking component

  • To better understand how many pages are affected by this error (note: tracking requires an editor or bot to visit the page).
  • Deploy tracking component on Train
  • Collect information on affected pages.

1.2 Yes Build solution

  • Created RestrictedEntityLookupFactory as a service in place of RestrictedEntityLookup. Each parser will have its own RestrictedEntityLookup, preventing an interruption to one from affecting the others.
  • Yes Proposed solution passes code review (18.11.2024)
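The factory approach described above can be sketched as follows. This Python sketch only mirrors the PHP service names; the point it illustrates is that each parser gets its own lookup, so one page hitting the limit cannot affect the count on another.

```python
class RestrictedEntityLookup:
    """Tracks entity accesses for a single parser, up to a fixed limit."""

    def __init__(self, limit: int = 400):
        self.limit = limit
        self.count = 0

    def track_access(self) -> bool:
        """Record one entity access; return False once the limit is reached."""
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


class RestrictedEntityLookupFactory:
    """Hands each parser its own RestrictedEntityLookup instead of sharing one."""

    def __init__(self):
        self._lookups = {}

    def get_lookup(self, parser_id: str) -> RestrictedEntityLookup:
        # One lookup per parser: repeated calls with the same id return
        # the same instance; different parsers never share a counter.
        return self._lookups.setdefault(parser_id, RestrictedEntityLookup())
```

With a shared lookup, one page exhausting the limit would have blocked entity access for every other page being parsed; the per-parser factory isolates the counters.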

1.3 Yes Global rollout to all Wikis

  • Yes Scheduled for train deployment MW-1.43-notes (1.44.0-wmf.4; 2024-11-19)
  • Yes Enforce the limit on entities accessed when the threshold is reached. Successive entity accesses will not render and will instead display the secondary error message
    Failed to render *P(n) property: Too many entities loaded, must not load more than 400 entries

1.4 In progress… Investigate whether the entity usage limit can be optimised for Wikipedia pages.

  • In progress… Provide more context or clarity in the error message so that editors better understand what they can change to avoid triggering the error.
  • In progress… Introduce improvements or efficiencies to the Lua module to decrease the frequency of the error being generated.

Affiliated Tasks & Sub-Tickets
