Wikidata:Bibliography of Life/Argument for an invitation

In two articles,[1][2] Rod Page argues that Wikidata should be the knowledge graph for a "Bibliography of Life"[3] because of three of its key properties:

  1. graph structure: a node (a Q-item) is joined to another node (a Q-item or a literal such as text) by an edge (a property), creating a network or "knowledge graph"
  2. this graph structure allows it to be queried with SPARQL, an open-standard query language
  3. crowdsourced content, built up through a combination of bots, tools and individual edits
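
As a rough illustration of points 1 and 2, the sketch below builds a SPARQL request for the Wikidata Query Service and flattens the standard SPARQL JSON result format into plain dictionaries. The endpoint URL and the property P356 (DOI) are Wikidata's own; the function names, the User-Agent string and the sample data are made up for illustration, and none of this is taken from Page's code.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def doi_lookup_query(doi):
    """One triple pattern: ?work (node) --wdt:P356 (edge)--> DOI literal (node)."""
    # Wikidata records DOIs in upper case, so normalise before matching.
    literal = doi.upper().replace("\\", "\\\\").replace('"', '\\"')
    return 'SELECT ?work WHERE { ?work wdt:P356 "%s" . }' % literal


def build_request(query, endpoint=WDQS_ENDPOINT):
    """Prepare (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={
        "Accept": "application/sparql-results+json",
        # Hypothetical identifier; use one that describes your own tool.
        "User-Agent": "bibliography-of-life-sketch/0.1 (example)",
    })


def flatten_bindings(results):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    return [{var: cell["value"] for var, cell in row.items()}
            for row in results["results"]["bindings"]]


def run_query(query, endpoint=WDQS_ENDPOINT):
    """Send the query over the network and return flattened rows."""
    with urlopen(build_request(query, endpoint), timeout=60) as response:
        return flatten_bindings(json.load(response))
```

Calling `run_query(doi_lookup_query("10.7717/peerj.6739"))` would ask Wikidata which item carries that DOI. The `endpoint` parameter is there because the request shape is defined by the SPARQL protocol rather than by Wikidata, so the same functions should work against another SPARQL endpoint, provided the query declares its own `PREFIX` lines (the `wdt:` shorthand is only predefined on the Wikidata service).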

"Ozymandias" (joint winner of the GBIF Ebbe Nielsen Challenge in 2018) is a biodiversity knowledge graph for taxa, created from the Australian Faunal Directory (AFD) and Atlas of Living Australia (ALA) databases.[1] To build Ozymandias, Page generated over 9 million triples.[1] Adding data to an existing database such as Wikidata without creating duplicates is harder than building a database ab initio, but some of the techniques he used could surely help Wikidata editors with the mass uploading needed to make Wikidata a "Bibliography of Life".

In the pursuit of turning "strings into things" in Wikidata, I believe that Professor Page has knowledge and skills, many of which may be useful to the Wikimedian community. Hence I am hoping that we could invite him to be a guest speaker at ESEAP 2022, or on some other occasion that may suit him better. I also believe that there is sufficient interest and knowledge within both the NZ and Australian Wikidata communities to take advantage of what he offers. (Note, for example, that among the top 50 contributors to a random selection of 10,000 publications in Wikidata,[2] at least five were Australian or New Zealand Wikidata editors; see doi:10.7717/peerj.13712/fig-5.)

Things I want to know

  1. How to harvest author names from ZooBank (17,000 of 87,000 are in Wikidata). If we can harvest them, we can then reconcile them and add to Wikidata those not already there.
  2. How to use the IPNI API to download authors and reconcile them with Wikidata authors (used by Page for Blazegraph).[4] The Mix'n'Match catalogues for both IPNI and BHL currently deal with this very well, with 35,000 Wikidata authors matched from a possible 43,000.
  3. How to use the ORCID API.
  4. Professor Page gives code[2] at https://github.com/rdmpage/wikidata-bibliographic-data for some of his bibliographic approaches. Is it possible to show how to use these resources, as well as ancillary resources and Ozymandias itself, for populating Wikidata?
  5. SPARQL queries for databases other than Wikidata.
  6. ORCID API, ALA API, BHL API, etc.
  7. Can we harvest the Ozymandias database for its corrected publications, together with DOIs? And for its authors? Could we create, for example, Mix'n'Match catalogues from such harvests?
  8. How does one use SPARQL on other databases, as Page did when creating Ozymandias?
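
For items 1, 2 and 7, the reconciliation step itself is mostly set arithmetic once both sides are keyed by the same external identifier. The sketch below assumes that a harvest has already produced (identifier, name) pairs and that the identifiers already present in Wikidata have been fetched separately. The record layout, function names and sample data are hypothetical, identifiers are assumed to be case-insensitive, and the three-column tab-separated output (ID, name, description) is only assumed to be a layout the Mix'n'Match import accepts, so it should be checked against the tool before use.

```python
import csv
import io


def split_by_presence(harvested, ids_in_wikidata):
    """Split harvested (external_id, name) pairs into (matched, missing).

    matched: identifier already on a Wikidata item.
    missing: candidates for a catalogue or for new items.
    """
    # Assumption: identifiers compare case-insensitively.
    known = {ext_id.strip().lower() for ext_id in ids_in_wikidata}
    matched, missing, seen = [], [], set()
    for ext_id, name in harvested:
        key = ext_id.strip().lower()
        if key in seen:
            # Skip duplicates inside the harvest itself.
            continue
        seen.add(key)
        target = matched if key in known else missing
        target.append((ext_id.strip(), name.strip()))
    return matched, missing


def to_catalogue_tsv(records, description):
    """Serialise records as tab-separated rows: ID, name, description."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for ext_id, name in records:
        writer.writerow([ext_id, name, description])
    return buffer.getvalue()


# Made-up sample data: "A1" is already in Wikidata, "a1" is a duplicate row.
harvest = [("A1", "Smith, J."), ("B2", "Jones, K."), ("a1", "Smith, J.")]
matched, missing = split_by_presence(harvest, {"A1"})
print(to_catalogue_tsv(missing, "taxonomic author"), end="")
```

Matching on an identifier sidesteps the harder problem of comparing name strings, which is where duplicates tend to creep in; records without any shared identifier would still need name-based matching of the kind Mix'n'Match supports.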

Ozymandias

Ozymandias can be seen at https://ozymandias-demo.herokuapp.com, and its source code is found on GitHub (https://github.com/rdmpage/ozymandias-demo).

Some examples of queries:

  1. Telenomus images
  2. Names linked to publications
  3. Taxonomists associated with ALA/AFD databases (beginning with K)

References

  1. Roderic D. M. Page (2019). Hilmar Lapp (ed.). "Ozymandias: a biodiversity knowledge graph". PeerJ. 7: e6739. doi:10.7717/PEERJ.6739. ISSN 2167-8359. PMC 6459178. PMID 30993051. Wikidata Q63687022.
  2. Roderic D. M. Page (7 July 2022). "Wikidata and the bibliography of life". PeerJ. 10: e13712. bioRxiv 10.1101/442638. doi:10.7717/PEERJ.13712. ISSN 2167-8359. PMC 9271275. PMID 35821898. Wikidata Q112959127.
  3. David King; David R Morse; Alistair Willis; Anton Dil (28 November 2011). "Towards the bibliography of life". ZooKeys. 150 (150): 151–66. doi:10.3897/ZOOKEYS.150.2167. ISSN 1313-2989. PMC 3234436. PMID 22207811. Wikidata Q22679575.
  4. Roderic D. M. Page (10 December 2019). "Reconciling Author Names in Taxonomic and Publication Databases". Proceedings of the 12th International Conference on Semantic Web Applications and Tools for Health Care and Life Sciences (SWAT4HCLS). CEUR Workshop Proceedings. 2849: 36–43. doi:10.1101/870170. Wikidata Q113391182. Archived from the original (PDF) on 13 April 2021.