Wikidata:WikidataCon 2019/Program/Sessions/Shape Expressions workshop
|ID : SUB-113||Shape Expressions workshop|
|Speaker(s): EricP, Lucas Werkmeister (WMDE), Andra Waagmeester (Micelio), Jose Labra (Oviedo), Tom Baker (DCMI)||Timeblock: tb-saturday||Start: 16:00||Slides:|
|Room: Darwin||Duration: 55min|
Experience from the Linked Open Data Cloud shows that data quality is the biggest predictor of whether a collaborative database will be widely adopted. Well-structured RDF data such as UniProt tended to be backed by relational stores, which ensured structural integrity. ShEx brings this kind of integrity validation to RDF, allowing graph databases to meet the same data quality standards and encouraging more widespread use of the data by industry, academia and governments.
ShEx is a concise, formal modeling and validation language for knowledge graphs. It can be used to define shapes within the graph. In the case of Wikidata, these would be sets of properties, qualifiers and references that describe the domain being modeled. Validation tools can then test whether subsets of the Wikidata graph conform to a specific shape. In this workshop, we will demonstrate the utility of ShEx. Participants will learn how to write Shape Expressions to model and disseminate structural expectations, and how to use existing tools to test conformance with those expectations. We will discuss the infrastructure requirements for a healthy, interlinked data ecosystem and how to maintain a level of data quality that will attract institutional investment and dependence. These requirements have to meet the needs of the contributor community, who frankly don't always agree on a single structure. In short, ShEx can be used to signal structural issues, insert sensible information, document schemas, and check Wikidata items for conformance.
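To give a rough feel for what "testing conformance to a shape" means, here is a minimal sketch in plain Python. It is not ShEx and does not use any ShEx tool: the `HUMAN_SHAPE` dictionary, the `check_conformance` function and the sample items are all hypothetical, made up for illustration. Only the identifiers P31 (instance of), P569 (date of birth) and Q5 (human) are real Wikidata IDs. A real shape would be written in ShEx and checked with an existing validator.

```python
# Toy illustration only: a "shape" here is a dict mapping a property ID
# to cardinality limits and, optionally, a set of allowed values.
# This is an assumed, simplified stand-in for a real ShEx schema.
HUMAN_SHAPE = {
    "P31": {"min": 1, "allowed": {"Q5"}},  # instance of: must include only human
    "P569": {"min": 0, "max": 1},          # date of birth: at most one value
}


def check_conformance(item, shape):
    """Return a list of violation messages; an empty list means the item conforms."""
    problems = []
    for prop, rule in shape.items():
        values = item.get(prop, [])
        minimum = rule.get("min", 0)
        if len(values) < minimum:
            problems.append(
                f"{prop}: expected at least {minimum} value(s), found {len(values)}"
            )
        if "max" in rule and len(values) > rule["max"]:
            problems.append(
                f"{prop}: expected at most {rule['max']} value(s), found {len(values)}"
            )
        allowed = rule.get("allowed")
        if allowed is not None:
            for value in values:
                if value not in allowed:
                    problems.append(f"{prop}: unexpected value {value}")
    return problems


# Hypothetical sample items, written as property -> list of values.
good_item = {"P31": ["Q5"], "P569": ["1952-03-11"]}
bad_item = {"P569": ["1952-03-11", "1952-03-12"]}

for name, item in [("good_item", good_item), ("bad_item", bad_item)]:
    problems = check_conformance(item, HUMAN_SHAPE)
    print(name, "conforms" if not problems else problems)
```

The sketch reports every violation rather than stopping at the first one, which matches the idea of using shapes to signal structural issues to contributors rather than to reject data outright.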
|Keywords: data quality, tool|
|People planning to attend:|