Wikidata:WikidataCon 2017/Submissions/Data completeness: How to know what Wikidata knows?

This is an Open submission for WikidataCon 2017 that has not yet been reviewed by the members of the Program Committee.

Submission no. 15
Title of the submission
Data completeness: How to know what Wikidata knows?

Author(s) of the submission
E-mail address
Country of origin
Affiliation, if any (organisation, company etc.)
Free University of Bozen-Bolzano

Type of session
Talk + Discussion
Length of session
45 minutes
Ideal number of attendees
Unlimited (talk), 5-20 (discussion)


Wikidata is a great project for mapping structured information about the world, and it exhibits a high degree of correctness. Its degree of completeness, by contrast, is much less understood. Anecdotal evidence suggests that it covers many popular topics quite well, but there are few standard means of assessing this: at present, editors and consumers have to judge largely on a case-by-case basis whether a given piece of information is complete. This session concerns the automated assessment of the completeness of Wikidata and consists of two parts:

In the first 30 minutes, I will survey techniques for assessing the completeness of parts of Wikidata, in particular (i) mandatory properties (such as P1963), (ii) explicit assertions of completeness via COOL-WD, (iii) comparative completeness using Recoin, and (iv) tabular views such as those discussed here. I will briefly introduce each technique and discuss its advantages and limitations.
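As a rough illustration of idea (i), checking an item against a list of properties expected for its class could look like the sketch below. It is not how COOL-WD or Recoin are implemented. The function names are hypothetical, and the list of expected properties is an assumption made for the example rather than the actual P1963 values of any class.

```python
from typing import Iterable, List


def missing_properties(present: Iterable[str], expected: Iterable[str]) -> List[str]:
    """Return the expected property IDs that the item does not use, in order."""
    present_set = set(present)
    return [pid for pid in expected if pid not in present_set]


def completeness_ratio(present: Iterable[str], expected: Iterable[str]) -> float:
    """Fraction of expected properties that the item has (1.0 if none are expected)."""
    expected_list = list(dict.fromkeys(expected))  # de-duplicate, keep order
    if not expected_list:
        return 1.0
    missing = missing_properties(present, expected_list)
    return 1.0 - len(missing) / len(expected_list)


# Illustrative data only: this list is an assumption, not the actual
# P1963 values of any Wikidata class.
EXPECTED_FOR_HUMAN = ["P21", "P569", "P19", "P106"]
# Properties used on a hypothetical item.
item_props = ["P31", "P21", "P569"]

print(missing_properties(item_props, EXPECTED_FOR_HUMAN))  # ['P19', 'P106']
print(completeness_ratio(item_props, EXPECTED_FOR_HUMAN))  # 0.5
```

A check of this kind only says whether a property is present at all. It cannot tell whether the values listed under a property are complete, which is one reason the other techniques are worth considering.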

The second part of the session (15 minutes) will be an open discussion, guided by the following questions:

  • What kind of (anecdotal) knowledge about completeness of parts of Wikidata do participants have?
  • What kind of structured knowledge about completeness would participants like to obtain?
  • What tools could help towards this?
What will attendees take away from this session?
  • Awareness of completeness as a multidimensional concept
  • Knowledge of the existing tools for assessing the completeness of Wikidata, their potential and their limitations
  • Hopefully, a roadmap for how the completeness of Wikidata will be made more transparent in the future
Slides or further information
Special requests

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest.

  1. ArthurPSmith (talk) 13:33, 26 July 2017 (UTC)
  2. --YULdigitalpreservation (talk) 12:06, 27 July 2017 (UTC)
  3. -- JakobVoss (talk) 21:25, 28 July 2017 (UTC)
  4. Andreasm háblame / just talk to me 04:59, 30 July 2017 (UTC)
  5. Would love to see these tools applied systematically to some example areas (e.g. the Zika corpus), and over time. Daniel Mietchen (talk) 07:19, 31 July 2017 (UTC)
  6. Jklamo (talk) 00:02, 1 August 2017 (UTC)
  7. Maxlath (talk)
  8. Lucyfediachambers (talk) 16:36, 2 August 2017 (UTC)
  9. Criscod (talk) 15:00, 8 August 2017 (UTC)
  10. Gikü (talk) 15:15, 21 August 2017 (UTC)
  11. --Sannita - not just another sysop 16:45, 1 September 2017 (UTC)
  12. ...