Wikidata For Wikimedia Projects/Projects/cache createSchemaElement
Improve article loading time by caching SkinAfterBottomScriptsHandler::createSchemaElement
Background
Every Wikipedia article page contains a JSON-LD block in its HTML source. JSON-LD structures the linked data in article pages so that search engines and third-party consumers can use it more easily.
Currently, when building the JSON-LD schema, two database queries are executed:
- TermLookup::getDescription: Fetches the description of the associated entity (possibly from Wikidata).
- RevisionLookup::getFirstRevision: Retrieves the first revision of the article.
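As a rough illustration of where the results of these two lookups end up, the sketch below assembles a JSON-LD block in Python. The property names `description` and `datePublished` follow schema.org conventions; the function name `build_schema_element` and all sample values are hypothetical and are not taken from the Wikibase code.

```python
import json


def build_schema_element(title, url, description, first_revision_timestamp):
    """Return a <script> element holding a JSON-LD block for one article.

    Hypothetical helper: `description` stands in for the result of
    TermLookup::getDescription, and `first_revision_timestamp` for the
    timestamp of the revision found by RevisionLookup::getFirstRevision.
    """
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "name": title,
        "url": url,
    }
    # Only emit the optional fields when the lookups returned something.
    if description is not None:
        schema["description"] = description
    if first_revision_timestamp is not None:
        schema["datePublished"] = first_revision_timestamp
    # Escape "</" so the payload cannot close the script element early.
    payload = json.dumps(schema).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


element = build_schema_element(
    title="Example",
    url="https://example.org/wiki/Example",
    description="sample description",
    first_revision_timestamp="2001-01-15T00:00:00Z",
)
print(element)
```

Both optional fields depend on a lookup, which is why each page view without caching pays for two queries.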
The Problem
These function calls contribute significantly to the request time, with building the JSON-LD schema taking ~5% of the total request time when loading an article page.
The Solution
Caching the results of TermLookup::getDescription and RevisionLookup::getFirstRevision will improve article loading time by avoiding the re-execution of these queries on every request.
Function 1: Cache getFirstRevision value for Linked Data Schema
The Wikibase hook handler SkinAfterBottomScriptsHandler::createSchema runs an expensive function, getFirstRevision, every time a wiki article is loaded. This accounts for approximately 2.5% of a page's total load time on every page view.
The function determines the first (oldest) revision of the article by retrieving all revisions, sorting them by date and ID, and selecting the oldest.
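The selection step can be sketched as follows. This is a simplified in-memory model, not the MediaWiki implementation: the `Revision` tuple and the `first_revision` function are hypothetical names, and the real lookup runs against the database.

```python
from typing import NamedTuple, Optional, Sequence


class Revision(NamedTuple):
    rev_id: int
    timestamp: str  # fixed-width 14-digit timestamp, e.g. "20250116120000"


def first_revision(revisions: Sequence[Revision]) -> Optional[Revision]:
    """Return the oldest revision, breaking timestamp ties by lowest ID."""
    if not revisions:
        return None
    # Sorting by (timestamp, rev_id) puts the oldest revision first; the
    # fixed-width timestamp strings sort chronologically as plain text.
    return sorted(revisions, key=lambda r: (r.timestamp, r.rev_id))[0]


# Sample history (made-up values) with two revisions sharing a timestamp.
history = [
    Revision(rev_id=30, timestamp="20240301000000"),
    Revision(rev_id=12, timestamp="20230101000000"),
    Revision(rev_id=11, timestamp="20230101000000"),
]
oldest = first_revision(history)
```

In this model the work grows with the length of the page history, which gives a sense of why repeating it on every page view is costly.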
Caching this value reduces how often the expensive function is called and therefore reduces server load. The cached value is stored in the parser cache output.
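One way to picture the effect is the sketch below, in which a plain dictionary keyed by page ID stands in for the data kept with the parser cache output. All names (`FirstRevisionCache`, `slow_first_revision_timestamp`) and the sample timestamp are hypothetical; the point is only that the expensive lookup runs once per cached page rather than once per page view.

```python
from typing import Callable, Dict


class FirstRevisionCache:
    """Memoize the first-revision timestamp per page (illustrative only)."""

    def __init__(self, expensive_lookup: Callable[[int], str]) -> None:
        self._lookup = expensive_lookup
        self._cached: Dict[int, str] = {}  # stand-in for parser cache output
        self.lookups = 0  # counts how often the expensive path ran

    def get(self, page_id: int) -> str:
        if page_id not in self._cached:
            self.lookups += 1
            self._cached[page_id] = self._lookup(page_id)
        return self._cached[page_id]

    def purge(self, page_id: int) -> None:
        """Drop the entry so the next read recomputes it."""
        self._cached.pop(page_id, None)


def slow_first_revision_timestamp(page_id: int) -> str:
    # Placeholder for the database-backed lookup; returns a made-up value.
    return "20010115000000"


cache = FirstRevisionCache(slow_first_revision_timestamp)
for _ in range(3):  # three page views of the same article
    cache.get(42)
```

After the three simulated page views, `cache.lookups` is 1: only the first view reached the expensive function.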
Development
Development on this task started December 2024.
After deployment, some edge cases were discovered: a page move, a page preview, or the creation of a new page could lead to an error because the first revision timestamp had not been stored yet. A patch is being prepared in task T383657.
Deployment
The patch was added to MediaWiki version MW-1.44.0-wmf.12 and was deployed to all wiki groups on January 16, 2025.
The patch for the edge cases described in task T383657 will be merged into MW-1.44.0-wmf.18 and will be deployed to all wikis by February 27, 2025.
Function 2: Cache getDescription value for Linked Data Schema
The function TermLookup::getDescription() is invoked whenever a wiki page is read. It queries and returns the Wikibase entity (Wikidata item) ID, the description, and the language code. These values rarely change, so requesting them every time a wiki page is loaded is usually unnecessary.
Caching this information significantly reduces how often the function is invoked, and with it the load on the Wikimedia servers.
The cache will be invalidated under the following conditions:
- An edit is made to the Client page (the page/article where the function is being invoked from)
- An edit is made to the Wikidata item description in the user language or the fallback language (English)
- Once every 30 days, if neither of the above happens.
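The three conditions above can be expressed as a single staleness check. The sketch below is a model under stated assumptions, not the deployed logic: `CachedDescription`, `is_stale`, and the timestamp parameters are hypothetical names, and `description_edited_at` is assumed to reflect only edits in the user language or the fallback language. Only the 30-day limit and the three triggers come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=30)  # the "once every 30 days" rule


@dataclass
class CachedDescription:
    entity_id: str
    description: str
    language: str
    cached_at: datetime


def is_stale(entry, page_edited_at, description_edited_at, now):
    """Return True if any of the three invalidation conditions holds."""
    if page_edited_at > entry.cached_at:         # client page was edited
        return True
    if description_edited_at > entry.cached_at:  # description was edited
        return True
    return now - entry.cached_at >= MAX_AGE      # entry reached 30 days


# Sample values, made up for illustration.
entry = CachedDescription("Q42", "sample description", "en", datetime(2025, 2, 6))
before_caching = datetime(2025, 1, 1)

fresh = is_stale(entry, before_caching, before_caching, datetime(2025, 2, 10))
expired = is_stale(entry, before_caching, before_caching, datetime(2025, 3, 10))
```

Here `fresh` is `False` because nothing changed and only four days have passed, while `expired` is `True` because the entry is 32 days old.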
Development
Development on this task started December 2024.
Deployment
The patch was merged to MediaWiki version MW-1.44.0-wmf.15 and will be deployed to all wiki groups on February 6, 2025.