Wikidata:WikidataCon 2017/Submissions/WikiCite: Wikidata as a structured repository of bibliographic data

Watch the video recording
Submission no. 78
Title of the submission

WikiCite: Wikidata as a structured repository of bibliographic data

Part of the citation network for a scholarly article
Overview of journals an author has published in
Citation statistics for a publisher's journals
Topics co-occurring with Zika virus in scholarly publications
Author(s) of the submission
E-mail address
  • dario at
Country of origin
  • USA, Germany
Affiliation, if any (organisation, company etc.)
  • Wikimedia Foundation
  • University of Virginia
  • Wikimedia Deutschland

Type of session
  • Talk (the usual conference format. 45min, large audience)
Length of session
  • 45 min
Ideal number of attendees
  • 25-30
EtherPad for documentation


WikiCite is an ongoing effort to build an extensive bibliographic database in Wikidata to serve all Wikimedia projects. The idea has been around for over a decade, but the technology needed to sustain the effort is only now maturing. The initiative's immediate goal is taking shape: a well-curated, high-quality structured dataset in Wikidata of all sources cited across Wikimedia projects.

The current WikiCite project originated at Wikimania London and was supported by a first dedicated event in Berlin in the spring of 2016. In May 2017, a second, larger event took place in Vienna, co-located with the Wikimedia Hackathon, to further advance this work. The aims of the second event were to showcase progress made so far, identify technical gaps and needs, and strengthen ties with key partners (such as Zotero, the Internet Archive, OCLC, Crossref, and ORCID) as well as funders (the Alfred P. Sloan Foundation, the Gordon and Betty Moore Foundation, and the Simons Foundation).

Significant progress has been made to date:

  • the modeling of a schema to represent scholarly article metadata is nearly complete: WikiCite participants and other Wikidata users have catalogued over 2 million scientific articles on Wikidata and described over 3.3 million citations between scientific articles. These articles are starting to be used as references for Wikidata statements and could be used to improve the handling of references on other Wikimedia projects.
  • tools and platforms to allow a rapid integration of references (such as SourceMD or PMIDTool) and to visualize the relations between scholarly knowledge and the rest of Wikidata (such as Scholia) have seen significant growth and adoption.
  • multiple corpora have also been created as part of WikiCite, showcasing the value of linking knowledge to its sources in a machine-readable way. Beyond efforts around the Zika Corpus, collaborations with scientific open data communities, such as the Gene Wiki project, WikiPathways, CIViC, and Reactome, are continuing.
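
To make the citation data mentioned above more concrete, here is a minimal sketch of how one might build a query for the articles that cite a given work. It is an illustration under stated assumptions, not part of the submission: the property "cites work" (P2860) and the public query service endpoint are our assumptions about how these citations are exposed, the function names are hypothetical, and the item ID in the usage note is only a placeholder.

```python
from urllib.parse import urlencode

# Assumptions (not stated in the submission): citations are stored with the
# "cites work" property P2860 and can be queried through the public
# Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
CITES_WORK = "P2860"


def citing_articles_query(qid: str, limit: int = 10) -> str:
    """Build a SPARQL query listing items that cite the work `qid`."""
    # Accept only well-formed item IDs such as "Q123".
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return (
        "SELECT ?citing ?citingLabel WHERE {\n"
        f"  ?citing wdt:{CITES_WORK} wd:{qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(qid: str, limit: int = 10) -> str:
    """Return a GET URL for the query; this sketch does not send the request."""
    params = {"query": citing_articles_query(qid, limit), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)
```

Calling `query_url("Q42")` (a placeholder ID; substitute the item of an actual scholarly article) returns a URL that can be opened in a browser or fetched with any HTTP client. Tools such as Scholia present this kind of data in richer ways; the sketch only shows the shape of a single citation lookup.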
What will attendees take away from this session?

This will be the first comprehensive presentation to the Wikidata community about the project, its challenges and prospects. While many of the contributors are active on Wikidata, the project will benefit from broader input and buy-in from the community.

Slides or further information
Special requests

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest.

  1. Andreasm háblame / just talk to me 05:02, 30 July 2017 (UTC)
  2. Taniamaio (talk) 08:49, 30 July 2017 (UTC)
  3. Maxlath (talk) 15:49, 30 July 2017 (UTC)
  4. seav (talk) 18:20, 30 July 2017 (UTC)
  5. Alessandro Piscopo (talk) 14:25, 31 July 2017 (UTC)
  6. ArthurPSmith (talk) 15:29, 31 July 2017 (UTC)
  7. Criscod (talk) 14:55, 8 August 2017 (UTC)
  8. Carlojoseph14 (talk) 15:55, 8 August 2017 (UTC)
  9. Micru (talk) 14:31, 11 August 2017 (UTC)
  10. Sic19 (talk) 20:57, 20 August 2017 (UTC)
  11. Yayamamo (talk) 01:31, 29 August 2017 (UTC)
  12. Sannita - not just another sysop 19:49, 2 September 2017 (UTC)