Grants:Project/DBpedia/GlobalFactSyncRE/Timeline/Tina
Next GFS Call
Friday August 2, 11am
Tasks
Getting ready
- move Python code from https://git.informatik.uni-leipzig.de/kwecel/infoboxes-refs to GitHub https://github.com/dbpedia/ (Sebastian)
- DONE (Johannes/Sebastian) enable more/all extractors - provide list of possible values for extractors
- web-server (Marvin)
First Release
- publish reference dump and deploy a micro-service for the current Python extraction as is: `?article=http://en.wikipedia.org/wiki/Arthur_Schopenhauer` outputs CSV as is (Wlodzimierz)
- deploy DIEF (extraction framework) micro-service on the GFS server (Johannes)
- MongoDB prefusion - example queries (Marvin)
- DONE Study and Categorization (Tina)
- Wikidata non-adoption report (count of properties extracted by generic extraction: 580 million) - (Sebastian)
- measures all values of infobox parameters -> this infobox doesn't use Wikidata for this parameter
- add template counterexample here (@Lewoniewski:)
- Previous email: https://lists.wikimedia.org/pipermail/wikidata/2018-December/012681.html
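The micro-service task above takes an `?article=` query parameter and returns CSV. A minimal client-side sketch of that request pattern follows. The host, path, and CSV column names are assumptions made for illustration only; the notes give just the query parameter and the fact that the output is CSV. The sample CSV text is invented to show parsing and is not real service output.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical endpoint: the real host and path are not stated in these notes.
BASE_URL = "http://localhost:8080/extract"


def build_request_url(article_url, base_url=BASE_URL):
    """Build the request URL, percent-encoding the article URL as a query value."""
    return base_url + "?" + urlencode({"article": article_url})


def parse_csv_response(text):
    """Parse CSV text (first row = header) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


url = build_request_url("http://en.wikipedia.org/wiki/Arthur_Schopenhauer")

# Invented sample response; the column names are assumptions.
sample = "parameter,value,reference\nbirth_date,1788-02-22,http://example.org/source\n"
rows = parse_csv_response(sample)
```

Fetching the URL itself (for example with `urllib.request.urlopen`) is left out so the sketch runs without a deployed service.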
Second Release
- (Johannes) Mapping package/snapshot/prototype
- 1. problem analysis
infobox param -> DBpedia property <->/-> Wikidata property
--------------------------------------------------------
infobox param <-> Wikidata property (publish with release)
- 2. (later) inclusion of DBpedia into Wikidata (sameAs and owl:equivalent(P|C), i.e. owl:equivalentProperty / owl:equivalentClass)
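The problem analysis above derives a direct infobox param <-> Wikidata property mapping by chaining the two existing mappings. A minimal sketch of that composition, assuming both mappings are available as plain dictionaries: the entries are illustrative examples rather than the project's actual mapping data, and `ex:unmappedProperty` is a made-up placeholder for a property without a Wikidata counterpart.

```python
# Illustrative sample data only, not the project's real mappings.
# infobox parameter -> DBpedia property
INFOBOX_TO_DBPEDIA = {
    "birth_date": "dbo:birthDate",
    "birth_place": "dbo:birthPlace",
    "some_param": "ex:unmappedProperty",  # hypothetical, has no Wikidata link
}

# DBpedia property -> Wikidata property
DBPEDIA_TO_WIKIDATA = {
    "dbo:birthDate": "P569",   # date of birth
    "dbo:birthPlace": "P19",   # place of birth
}


def compose(first, second):
    """Chain two mappings, keeping only keys whose intermediate value is mapped."""
    return {key: second[mid] for key, mid in first.items() if mid in second}


def unmapped(first, second):
    """List keys of the first mapping that cannot be chained through the second."""
    return sorted(key for key, mid in first.items() if mid not in second)


direct = compose(INFOBOX_TO_DBPEDIA, DBPEDIA_TO_WIKIDATA)
missing = unmapped(INFOBOX_TO_DBPEDIA, DBPEDIA_TO_WIKIDATA)
```

Here `direct` holds the derived infobox param -> Wikidata property pairs, and `missing` lists the parameters that would still need a manual mapping.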
Study / Scouting for good examples
- DONE preliminary study of sync targets
- integration of MusicBrainz:
- check how well it is mapped (Johannes)
- mapping of 5 properties (Johannes)
- contact user Jc86035 (Johannes)
- integration of MusicBrainz into FlexiFusion (Marvin)
Exploitation
- Wikimania 16-18 August | Stockholm, Sweden (Johannes will go)
- WikidataCon 25 – 26 October 2019 | Berlin, Germany (open, not Johannes)
- DONE draft release note
Other
Back-end:
- check out Scala (Johannes, Wlodzimierz)
- Can template extractions in the extraction framework be used with Python code?
- new Wikidata release (Marvin)
- find best structure of the references
Front-end:
- Factual Consensus Finder:
- development of better statistical tool (Marvin/Jan?)
- tool/query to find the most likely errors (Marvin)
Misc.:
- DONE write project announcement (Sebastian, Tina)
- post GFS challenge (Tina)
Completed Tasks
Getting ready:
- DONE accounts (Tina)
- DONE make GFS server ready, @JohannesFre: any news on this? (Sebastian)
- DONE Wikimania presentation format specification (Johannes)
First Release:
- DONE deployment of MongoDB prefusion (Marvin)
Study / Scouting for good examples:
- DONE - see preliminary study of sync targets
- problem: four layers of complexity: Subject variation / fixed vs. varying property / reference (inferred from 1 and 2) / normalisation of values (currency, inch/cm, ...)
- NBA Players and Cloud types (Tina)
- Videogames (easy disambiguations)
- Films (100k): budget is fixed, while the revenue parameter varies across languages
- Cars & Products (complex)
- organisations (page for a group)
- Sports
- Cities (easy disambiguation)
- Difficult examples:
- subjects/articles are of a different granularity
- city & population: core, close area and county
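The fourth complexity layer named above, normalisation of values (inch/cm, ...), can be illustrated with a small sketch: convert both values to a common unit before comparing them, and allow a tolerance for rounding. The unit table, the function names, and the 1% tolerance are assumptions chosen for this example, not part of the project's code.

```python
import math

# Conversion factors to centimetres (1 in = 2.54 cm exactly).
CM_PER_UNIT = {"cm": 1.0, "in": 2.54, "m": 100.0}


def to_cm(value, unit):
    """Convert a length to centimetres; raises KeyError for unknown units."""
    return value * CM_PER_UNIT[unit]


def same_length(a, b, rel_tol=0.01):
    """Compare two (value, unit) pairs after normalisation.

    rel_tol=0.01 (1%) is an assumed tolerance to absorb rounding
    differences between language editions.
    """
    return math.isclose(to_cm(*a), to_cm(*b), rel_tol=rel_tol)


# A height given as 78 in in one infobox and 198 cm in another.
agree = same_length((78, "in"), (198, "cm"))
disagree = same_length((78, "in"), (180, "cm"))
```

With these inputs `agree` is true (78 in is 198.12 cm) and `disagree` is false. Currency normalisation would additionally need a date-dependent exchange rate, which this sketch does not cover.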
Exploitation:
- ...
Other:
Back-end:
- ...
Front-end:
- Factual Consensus Finder:
- DONE UI needs of average user (Tina)
Misc.:
- DONE check your profile and edit if necessary (everyone)