User:Sj/!/struct

From Meta, a Wikimedia project coordination wiki

StrepHit: ?

HTML dumps

Enterprise maintains HTML dumps for 6 Wikipedias (as of 2/25)
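
A minimal sketch of how a reuser might stream such a dump, assuming it is delivered as newline-delimited JSON with one article object per line. The field names `name` and `html` are hypothetical placeholders for illustration, not a confirmed schema:

```python
import json
from typing import Iterable, Iterator


def iter_articles(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed article object per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical sample standing in for lines read from a dump file.
sample = [
    '{"name": "Example A", "html": "<p>First</p>"}',
    "",
    '{"name": "Example B", "html": "<p>Second</p>"}',
]
names = [article["name"] for article in iter_articles(sample)]
```

Reading line by line keeps memory flat; in practice `lines` could be an open file handle instead of a list.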

WD integration

Hoping to merge into Enterprise

What about DBpedia?

Coordinated project: Structured Wikipedia

Repository

Currently used by Ecosia and Pleias

Structure pages for external reuse: do the parsing that reusers already do or need

  • HuggingFace (talk to Poli) -- detailed drop templates incl numerical conversions
  • Other embeddings: often use bespoke parsing (wikitext, not HTML)
  • Note the high template/infobox count on some wikis
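
To get a rough feel for template/infobox counts, a sketch along these lines could tally template names in raw wikitext. The regex is a deliberately simple approximation (it skips `{{{parameters}}}` and `{{#parser functions}}` but is not a full wikitext parser), and `template_counts` is a hypothetical helper name:

```python
import re
from collections import Counter

# Match "{{Name" where Name runs up to "|", "}}" or a newline.
# The lookarounds exclude triple-brace parameters like {{{1}}}.
TEMPLATE_OPEN = re.compile(r"(?<!\{)\{\{(?!\{)\s*([^{}|#\n]+?)\s*(?=\||\}\}|\n)")


def template_counts(wikitext: str) -> Counter:
    """Count template invocations by name (first letter normalized to upper case)."""
    names = (m.group(1) for m in TEMPLATE_OPEN.finditer(wikitext))
    return Counter(n[:1].upper() + n[1:] for n in names)


sample = (
    "{{Infobox person\n| name = Ada\n| height = {{convert|170|cm}}\n}}\n"
    "Text {{cite web|url=x}} and {{convert|5|km}} with {{{1}}}."
)
counts = template_counts(sample)
infobox_total = sum(c for name, c in counts.items() if name.startswith("Infobox"))
```

Running this per page and aggregating per wiki would show where template-heavy pages concentrate; a real parser would be needed for exact numbers.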

abstract / entity / sections / infobox / image / ORES scores / revert risk / redirects
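
As an illustration only, the fields listed above could be gathered into one record per page. The class name, the `title` field, and all types below are assumptions for the sketch, not the repository's actual schema:

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class StructuredPage:
    """Hypothetical per-page record mirroring the fields in the notes."""
    title: str                                   # assumed identifier, not in the list
    abstract: str = ""
    entity: Optional[str] = None                 # assumed: linked entity ID
    sections: List[Dict[str, str]] = field(default_factory=list)
    infobox: Dict[str, str] = field(default_factory=dict)
    image: Optional[str] = None                  # assumed: main image file name
    ores_scores: Dict[str, float] = field(default_factory=dict)
    revert_risk: Optional[float] = None
    redirects: List[str] = field(default_factory=list)


page = StructuredPage(
    title="Example page",
    abstract="A short lead summary.",
    sections=[{"heading": "History", "text": "Placeholder text."}],
    infobox={"type": "example"},
    redirects=["Example pg"],
)
record = asdict(page)  # plain dict, ready for JSON serialization
```

The Todo items (talk page activity, references, what links here, tables) would slot in as further optional fields.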

Todo: talk page activity, references, what links here, tables
Investigation into annotation as upgrade to talk page sections

Commons metadata

Releasing among free snapshots.

Other

Classification pages:

Images: extraction and listing

Infoboxes:

References:

Sections:

Pageviews (HalT): by geo, privacy-preserving