Toolhub/Progress reports/2022-04-29
Production release made
A number of feature enhancements and bug fixes were deployed to the production https://toolhub.wikimedia.org site on 2022-04-26. This was the first deployment of the production service since 2022-03-15.
The major features added in this release are the new recent changes feed for patrolling and annotation editing. Annotation editing allows the community to backfill some missing information for tools that are indexed by the crawler and publish Hay's Directory compatible toolinfo.json records.
Features added:
- Add parent_id to revision API output
- UI for annotations
- Audit log should clearly indicate unpatrolled edits for patrollers
- There should be a way to add a tool to a list from any of the tool views
- Experiment with adding more context to list and tool revision history
- Display extended author information on tool detail screen
- Make editing and deleting lists easier
- [Spike] Figure out how to compute facets across multiple fields
- Refactor the edit-tool form to make it more user friendly
Bugs fixed:
- Different list API endpoints order the tools contained in the list differently
- List not removed from search index when un-published
- LogEvent incorrectly assumes that all 'version' (revision) operations are on tools
- Calls to api/getRequestSchema vuex action can race with each other
- Recent changes does not handle deleted entities well
- Removing a tool from a list using the menu is not reactive everywhere
- `search_index --rebuild` exits with non-zero status
- Exclude /metrics from TLS enforcement
- Lists for some users being incorrectly indexed as public lists
Technical debt:
- Add context helper for creating reversion revisions
- Refactor all "list of tools" lists to use the same basic "Lists" component
- Upgrade from Django 2.2 to Django 3.2
Production database config error corrected
Following the deployment of the annotation editing UI to production, Bryan made some edits to exercise the feature. This testing revealed an unexpected error condition where string values like annotations.icon were being returned by the API as binary values rather than the expected Unicode strings. Further investigation confirmed Bryan's initial hunch that the Toolhub production database had a default character set of 'binary' and that new tables like toolinfo_annotations, which had been created using Django migrations, had inherited this character set.
This misconfiguration has been fixed by correcting the character set of all text columns in the toolinfo_annotations table and by updating the default character set for the database.
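As a rough sketch of what such a correction can look like, assuming a MySQL or MariaDB backend, the helper below only builds SQL statements and does not connect to any database. The database name `toolhub`, the target character set `utf8mb4`, and the column type used in the example are illustrative assumptions, not the statements that were actually run in production.

```python
def charset_fix_statements(database, table, columns, charset="utf8mb4"):
    """Build SQL that re-declares text columns and the database default charset.

    `columns` maps a column name to its type, for example "VARCHAR(2047)".
    All names passed in are hypothetical examples. A real migration would
    also need to restate each column's NULL and DEFAULT attributes, because
    MODIFY replaces the whole column definition.
    """
    statements = []
    for name, col_type in columns.items():
        # Re-declare each text column with an explicit character set.
        statements.append(
            f"ALTER TABLE `{table}` MODIFY `{name}` {col_type} CHARACTER SET {charset};"
        )
    # Change the default so that tables created later inherit the new charset.
    statements.append(f"ALTER DATABASE `{database}` CHARACTER SET {charset};")
    return statements


if __name__ == "__main__":
    for sql in charset_fix_statements(
        "toolhub", "toolinfo_annotations", {"icon": "VARCHAR(2047)"}
    ):
        print(sql)
```

Changing the database default alone would not repair existing tables, which is why the sketch emits per-column statements as well.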
Service monitoring dashboard setup
Bryan got quite a bit of help from Alexandros Kosiaris in creating a service monitoring dashboard for Toolhub. This dashboard follows the "four golden signals" monitoring practice which has been popularized by Google's SRE book. The four aspects of the application that are monitored are:
- Traffic: How many requests are being processed per unit time.
- Errors: How many system errors are occurring per unit time.
- Latency: The amount of time it takes to answer a request.
- Saturation: The ratio of resource utilization to resource availability per unit time.
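The four definitions above can be illustrated with a small calculation over one observation window. This is only a sketch: the `Window` fields, the choice of a median for latency, and the sample numbers are assumptions made for illustration, and the real dashboard derives its values from Prometheus metrics rather than from code like this.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    """Hypothetical raw counts collected over one observation window."""
    seconds: float                 # length of the window
    requests: int                  # requests handled in the window
    errors: int                    # requests that ended in a system error
    latencies: list = field(default_factory=list)  # response times in seconds
    used: float = 0.0              # resource units in use (e.g. CPU cores)
    available: float = 1.0         # resource units available


def golden_signals(w):
    """Compute traffic, errors, latency, and saturation for a window."""
    ordered = sorted(w.latencies)
    # Upper median of the response times; None when nothing was measured.
    median = ordered[len(ordered) // 2] if ordered else None
    return {
        "traffic": w.requests / w.seconds,     # requests per second
        "errors": w.errors / w.seconds,        # errors per second
        "latency": median,                     # seconds per request
        "saturation": w.used / w.available,    # fraction of capacity in use
    }


if __name__ == "__main__":
    sample = Window(seconds=60, requests=120, errors=3,
                    latencies=[0.1, 0.2, 0.3], used=1.0, available=4.0)
    print(golden_signals(sample))
```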
While building out the dashboard, Bryan noticed that the traffic numbers were much larger than he expected. Alexandros did some investigation and found that the /metrics endpoint, which produces data for the traffic, errors, and latency signals, was returning a 301 redirect to the Prometheus scraper. This misconfiguration was fixed in this week's release. Following the fix, the measured traffic numbers fell to values which seem much more reasonable. We do not currently have an explanation for how exactly the redirect caused the monitoring anomaly.
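The report does not say how Toolhub enforces TLS. If the enforcement goes through Django's `SECURE_SSL_REDIRECT` setting, an exemption for /metrics could be expressed with `SECURE_REDIRECT_EXEMPT`. The sketch below mimics that matching logic in plain Python so it runs without Django; the pattern and the `needs_tls_redirect` helper are hypothetical and are not copied from Toolhub's configuration.

```python
import re

# Hypothetical values mirroring Django's SECURE_SSL_REDIRECT and
# SECURE_REDIRECT_EXEMPT settings; not taken from Toolhub itself.
SECURE_SSL_REDIRECT = True
SECURE_REDIRECT_EXEMPT = [r"^metrics/?$"]


def needs_tls_redirect(path, is_secure):
    """Return True when a plain-HTTP request would get a 301 to HTTPS."""
    if not SECURE_SSL_REDIRECT or is_secure:
        return False
    # Django strips the leading slash before testing the exempt patterns.
    stripped = path.lstrip("/")
    return not any(re.search(p, stripped) for p in SECURE_REDIRECT_EXEMPT)


if __name__ == "__main__":
    for path in ("/metrics", "/api/tools/"):
        print(path, needs_tls_redirect(path, is_secure=False))
```

With an exemption like this in place, a scraper that requests /metrics over plain HTTP receives the metrics directly instead of a redirect.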
Wrap up
This week's deployment includes some really powerful new features that can be used by the community to improve the catalog. Annotations allow anyone to populate a number of informative properties about each tool no matter how the main toolinfo record made it into the catalog. These new attributes include:
- API URL
- Translate URL
- Bug tracker URL
- User docs URL
- Developer docs URL
- Feedback URL
- Privacy policy URL
- Icon
- Tool type
- Available UI languages
- For wikis
- Deprecated
- Experimental
- Replaced by
These are all new properties added to the toolinfo.json schema by Toolhub. The majority of records in Toolhub are derived from toolinfo.json data originally published for Hay's Directory and its 1.0.0 toolinfo.json schema. This means that there are a lot of gaps that the community can start working to fill in about tools that they use.
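To make the idea of "gaps" concrete, the sketch below lists which of these properties are still unset on a tool record. The snake_case field names are assumed renderings of the labels in the list above, and the sample record is invented; check the actual toolinfo.json schema and Toolhub API output before relying on any of them.

```python
# Assumed snake_case field names for the properties listed above.
ANNOTATION_FIELDS = [
    "api_url", "translate_url", "bugtracker_url", "user_docs_url",
    "developer_docs_url", "feedback_url", "privacy_policy_url", "icon",
    "tool_type", "available_ui_languages", "for_wikis", "deprecated",
    "experimental", "replaced_by",
]


def missing_annotations(record):
    """List annotation fields that are absent or empty in a tool record."""
    missing = []
    for name in ANNOTATION_FIELDS:
        value = record.get(name)
        # False is a meaningful answer for flags such as `deprecated`,
        # so only None, empty strings, and empty lists count as gaps.
        if value is None or value == "" or value == []:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Invented record for illustration only.
    tool = {
        "name": "example-tool",
        "icon": "https://example.org/icon.svg",
        "deprecated": False,
        "for_wikis": [],
    }
    print(missing_annotations(tool))
```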
Expect to see more information about these features and how to use them as the team prepares for the upcoming Hackathon on May 20-22, 2022. We will be holding some sessions during the event to explain Toolhub and to invite participation in editing Toolhub records.