Wikidata For Wikimedia Projects/Projects/cache createSchemaElement
Improve article loading time by caching SkinAfterBottomScriptsHandler::createSchemaElement
Background
Every Wikipedia article page contains a JSON-LD block in its HTML source. JSON-LD structures the linked data in article pages so that search engines and third-party consumers can read it more easily.
Currently, when building the JSON-LD schema, two database queries are executed:
- TermLookup::getDescription: Fetches the description of the associated entity (possibly from Wikidata).
- RevisionLookup::getFirstRevision: Retrieves the first revision of the article.
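To make the role of the two lookups concrete, here is a minimal Python sketch of how a description and a first-revision timestamp might end up in a JSON-LD block. It is illustrative only: the function names `build_schema` and `to_script_tag` are hypothetical, and the choice of schema.org fields (`headline`, `datePublished`) is an assumption, not a statement of what the Wikibase handler actually emits.

```python
import json


def build_schema(title, url, description, first_revision_timestamp):
    """Assemble a schema.org-style dictionary for an article page.

    All field choices here are assumptions made for illustration.
    """
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "name": title,
        "url": url,
    }
    # These two values are the ones that would each need a database lookup.
    if description is not None:
        schema["headline"] = description
    if first_revision_timestamp is not None:
        schema["datePublished"] = first_revision_timestamp
    return schema


def to_script_tag(schema):
    """Serialize the schema into a JSON-LD script element."""
    return '<script type="application/ld+json">' + json.dumps(schema) + "</script>"


example_schema = build_schema(
    title="Example article",
    url="https://example.org/wiki/Example_article",
    description="placeholder description",
    first_revision_timestamp="2019-01-15T12:30:00Z",
)
example_tag = to_script_tag(example_schema)
```

In this sketch, everything except `headline` and `datePublished` is available without extra queries, which is why those two lookups are the natural candidates for optimization.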
The Problem
These queries contribute significantly to the request time, with building the JSON-LD schema taking ~5% of the total request time when loading an article page.
The Solution
- We are investigating whether caching the results of TermLookup::getDescription and RevisionLookup::getFirstRevision will improve article loading time by avoiding the re-execution of these queries on every request.
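The caching idea can be sketched in Python as a small time-to-live cache wrapped around a lookup. This is not MediaWiki's caching API: `TtlCache`, `fetch_description`, the cache key format, and the one-hour expiry are all hypothetical choices made for illustration.

```python
import time


class TtlCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get_with_set_callback(self, key, compute):
        """Return the cached value, calling compute() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value


# Stand-in for the database-backed lookup; counts how often it really runs.
query_count = {"description": 0}


def fetch_description(entity_id):
    query_count["description"] += 1
    return "placeholder description for " + entity_id


cache = TtlCache(ttl_seconds=3600)
for _ in range(3):
    description = cache.get_with_set_callback(
        "description:Q42", lambda: fetch_description("Q42")
    )
```

Three simulated requests trigger a single underlying lookup. A real deployment would also need an invalidation strategy for when a description changes, which this sketch leaves out.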
Next Steps
We are currently investigating the following sub-tasks to improve performance for this task:
- T379169: Cleanup service wiring (declined)
- T379170: Create getFirstRevisionTimestamp function in Revision Store.
Create getFirstRevisionTimestamp function in Revision Store
In SkinAfterBottomScriptsHandler::createSchema, a resource-intensive function, getFirstRevision, is invoked to retrieve all revisions, sort them by date and ID, and select the oldest.
However, createSchema only requires the creation timestamp, not the full revision data.
This can be optimized by directly querying the minimum timestamp (MIN(rev_timestamp)) from the revisions table.
This approach would simplify the process and avoid unnecessary selection and sorting operations, and it would eliminate the getTimestamp helper function in createSchema.
With a new function, getFirstRevisionTimestamp, the earliest revision timestamp can be retrieved more efficiently.
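The difference between the two approaches can be sketched in Python with an in-memory SQLite database. This illustrates the query shape only, not the Revision Store implementation: the simplified table layout, the sample rows, and both function names as written here are assumptions. Timestamps are stored as 14-digit strings so that they sort lexicographically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (10, 1, "20200301000000"),
        (11, 1, "20190115123000"),
        (12, 1, "20210704080000"),
        (13, 2, "20180101000000"),
    ],
)


def first_revision_timestamp_sorted(conn, page_id):
    """Fetch every revision of the page, sort by date and ID, take the oldest."""
    rows = conn.execute(
        "SELECT rev_id, rev_timestamp FROM revision WHERE rev_page = ?",
        (page_id,),
    ).fetchall()
    rows.sort(key=lambda row: (row[1], row[0]))
    return rows[0][1] if rows else None


def get_first_revision_timestamp(conn, page_id):
    """Ask the database for the minimum timestamp only."""
    row = conn.execute(
        "SELECT MIN(rev_timestamp) FROM revision WHERE rev_page = ?",
        (page_id,),
    ).fetchone()
    return row[0]  # None when the page has no revisions


oldest_sorted = first_revision_timestamp_sorted(conn, 1)
oldest_min = get_first_revision_timestamp(conn, 1)
```

Both functions return the same timestamp, but the second transfers a single value instead of every revision row and leaves the aggregation to the database.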