Wikidata:WikiCite Satellite Cologne 2020/Submission/DieDatenlaube


This is a submission for WikiCite Satellite Cologne 2020.


Title

#DieDatenlaube – Linking Wikisource with Wikidata and Commons

Abstract

Wikisource, as the primary Wikimedia portal for openly accessible full-text sources, could use Wikidata as its fully structured and linked storage for bibliographic records. The German Wikisource focuses on historical sources in the public domain. One of its biggest projects is the transcription of the first German illustrated mass-circulation magazine, “Die Gartenlaube”. At the moment about 13,000 of the approx. 19,000 articles published in the 19th century are available on Wikisource. With the citizen science project “#DieDatenlaube” (diedatenlaube.github.io) we want to encourage the Wikisource community to use Wikidata and its capabilities for describing and analyzing this article corpus.

In this talk we will give an overview of the following activities in this project:

  • ingesting data from Wikisource infoboxes using Python scripts and QuickStatements,
  • identifying subjects, either through intellectual work or through title heuristics,
  • developing models for semantic subject cataloging that clarify how the different entities in a main subject statement relate to one another (e.g. using qualifiers for location, timestamps or relations),
  • increasing the quality of the metadata:
    • identifying the genre of each article (Die Gartenlaube contains both fictional and non-fictional articles)
    • disambiguating authors and illustrators
    • clarifying page and issue information for articles split across several issues
    • identifying references
  • dealing with illustrations:
    • linking to illustrators or photographers
    • adding depicts statements
    • using structured data on Commons to link each file explicitly to its bibliographic record.
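
The ingest step listed above can be illustrated with a minimal sketch. It is not the project's actual script: the `Article` record, the function name and the placeholder journal ID are hypothetical, and the sketch assumes the tab-separated QuickStatements V1 command syntax, the `dewikisource` site ID for sitelinks, and Q191067 as the class for articles (worth verifying before any real upload).

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record, as it might be parsed from a Wikisource infobox."""
    title: str
    year: int
    pages: str
    wikisource_title: str  # page title on de.wikisource


def to_quickstatements(article: Article, journal_qid: str,
                       article_class_qid: str = "Q191067") -> str:
    """Build QuickStatements V1 commands that create one article item.

    Q191067 is assumed to be the Wikidata class for articles; check it
    before use. Properties: P31 instance of, P1433 published in,
    P1476 title, P577 publication date, P304 page(s).
    """
    # Double quotes delimit strings in QuickStatements, so replace them.
    title = article.title.replace('"', "'")
    lines = [
        "CREATE",
        f'LAST\tLde\t"{title}"',                    # German label
        f"LAST\tP31\t{article_class_qid}",
        f"LAST\tP1433\t{journal_qid}",
        f'LAST\tP1476\tde:"{title}"',               # monolingual text
        f"LAST\tP577\t+{article.year:04d}-00-00T00:00:00Z/9",  # year precision
        f'LAST\tP304\t"{article.pages}"',
        f'LAST\tSdewikisource\t"{article.wikisource_title}"',  # sitelink
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # "Q000000" is a placeholder, not the real item ID of Die Gartenlaube.
    example = Article(title="Beispielartikel", year=1853, pages="1-3",
                      wikisource_title="Beispielartikel (Die Gartenlaube 1853)")
    print(to_quickstatements(example, journal_qid="Q000000"))
```

The printed block can be pasted into the QuickStatements tool; generating plain text first keeps the batch reviewable before anything is written to Wikidata.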

An example: Carl Ernst Bock referring to himself in Die Gartenlaube, w.wiki/HCr, as #Bocknetz; or, extended by articles that he (Bock) did not cite himself, #BockHatQuery > w.wiki/HdS.
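
A self-citation network of this kind could be queried roughly as follows. This is a sketch under assumptions, not the query behind the short links above: the function name is hypothetical, the item IDs must be supplied by the caller, and it assumes that citations between articles are modeled with P2860 (cites work), authorship with P50 and the magazine with P1433.

```python
import re


def build_self_citation_query(author_qid: str, journal_qid: str) -> str:
    """Return a SPARQL query for articles by one author in one journal
    that cite other articles by the same author in the same journal.

    Both arguments must be Wikidata item IDs such as "Q42".
    """
    for qid in (author_qid, journal_qid):
        if not re.fullmatch(r"Q\d+", qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"""
SELECT ?citing ?citingLabel ?cited ?citedLabel WHERE {{
  ?citing wdt:P50 wd:{author_qid} ;
          wdt:P1433 wd:{journal_qid} ;
          wdt:P2860 ?cited .
  ?cited wdt:P50 wd:{author_qid} ;
         wdt:P1433 wd:{journal_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "de,en". }}
}}
""".strip()


if __name__ == "__main__":
    # Placeholder IDs; replace them with the real items for the author
    # and the magazine before running the query.
    print(build_self_citation_query("Q000001", "Q000002"))
```

The resulting text can be run in the Wikidata Query Service, where the citing/cited pairs can be displayed as a graph.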

Type of contribution and estimated time

Talk, 50 minutes (30 min + discussion; German and English)

Contact information

Christian Erlinger (Q67173261), Scholia,
Vienna Public Libraries (Q1020347)
christian.erlinger-schiedlbauer@wien.gv.at
Twitter: @LibrErli

Jens Bemme (Q56880673), Scholia
(SLUB Dresden, Fellows Freies Wissen)
jens.bemme@slub-dresden.de
Twitter: @Jeb_140

We will participate on site with two people.