Wikidata talk:WikiProject ShEx


Meeting Notes

14 Feb 2018

ericP: What is our mission?

Andra: To raise awareness and grow a community ... we can grow the WikiProject page over time ... use cases for ShEx ... namespace for ShEx: Lucas' idea from WikidataCon, we should host shapes on their own URIs; what would that URI be?

ericP: conflicting interests: one is to give ownership to the WD community; the conflicting one is to keep the shapes visible so people can copy and steal them even if they are outside the WD community ... maybe move schemas over to WD and then mirror them to the ShEx schemas space when you want

Lucas: Creating a new namespace is probably not trivial. Even if no one argues, the technical side might be complicated.

Andra: maybe postpone this until we have more use cases. ... Kat's http://wikidp.org/ demo: can we use this for additional domains by driving it with ShEx, i.e. a property checklist driven by ShEx? ... We could create a generic version of the portal, containerize it, and then people could slot in their own shape expressions to create their own property checklists ... need shapes available through a URL to reuse Harold Solbrig's PyShEx, so that is why I need a namespace for shape URIs
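
The checklist idea in the note above can be sketched roughly as follows. This is a minimal Python illustration, assuming a shape has already been reduced to a table of property IDs with a required flag. The `SHAPE` table, the `checklist` helper and the sample item are hypothetical; this is not the wikidp.org code or the PyShEx API, and a real portal would parse an actual ShEx schema instead.

```python
# Hypothetical, simplified stand-in for a parsed shape:
# property ID -> label and whether the shape requires it.
SHAPE = {
    "P31": {"label": "instance of", "required": True},
    "P348": {"label": "software version identifier", "required": True},
    "P856": {"label": "official website", "required": False},
}


def checklist(shape, claims):
    """Return one (property, label, status) row per property in the shape."""
    rows = []
    for pid, spec in shape.items():
        if claims.get(pid):
            status = "ok"          # the item has at least one value
        elif spec["required"]:
            status = "missing"     # required by the shape, absent on the item
        else:
            status = "optional"    # allowed by the shape, absent on the item
        rows.append((pid, spec["label"], status))
    return rows


# Made-up item data: property ID -> list of values.
ITEM_CLAIMS = {"P31": ["Q7397"], "P856": []}

for pid, label, status in checklist(SHAPE, ITEM_CLAIMS):
    print(f"{pid:5} {label:30} {status}")
```

Swapping in a different `SHAPE` table is all it would take to get a checklist for another domain, which is the point of driving the portal from shapes rather than hard-coded property lists.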

ericP: demo manifests to run in Eric's or Jose's implementation, like the primer's "try it" links ... 1. create manifests so ... good queries and validation tests that are either picked up remotely or use static data; create the schemas that will be shared, demo data, and manifests in a picklist ... demos show why validation is useful, give hints on how it is used in different domains, give people ideas ... wiki page with "try it" links: if we have a data structure, we can express it like this, which catches errors like this; help people

Andra: create a page similar to the example queries


TODOs for next meeting:

   Lucas: ask around WMDE about how to request a new namespace
   Kat: create an example on the WikiProject page
   Andra: create an example on the WikiProject page
   Kat: paste notes in the talk page of the WikiProject
   ? Create a Phabricator ticket for a new namespace?

Examples and tools

Andra Waagmeester Andrawaag (talk) 19:33, 30 January 2018 (UTC) YULdigitalpreservation (talk) 13:32, 6 February 2018 (UTC) Daniel Mietchen (talk) 01:52, 7 February 2018 (UTC) Finn Årup Nielsen (fnielsen) (talk) 13:55, 13 February 2018 (UTC) Lucas Werkmeister (talk) 12:34, 14 February 2018 (UTC) John Samuel 20:31, 26 February 2018 (UTC) Dhx1 (talk) 02:39, 8 March 2018 (UTC) Jneubert (talk) 13:35, 19 June 2018 (UTC) User:Malore Malore (talk) 15:59, 24 August 2018 (UTC) Vladimir Alexiev (talk) 06:33, 10 September 2018 (UTC)


Notified participants of WikiProject ShEx. Could you please first provide examples of ShEx shapes that check particular data models in Wikidata, and guidelines on how to check Wikidata against these shapes? I'd prefer

  • a web form tailored to Wikidata to edit and check shape expressions with syntax highlighting and typeahead, like the Wikidata Query Service
  • a bot that regularly runs ShEx schemas given on wiki pages and posts the results, like User:ListeriaBot

-- JakobVoss (talk) 07:15, 22 February 2018 (UTC)



Notified participants of WikiProject ShEx. I've just added a "Tutorials and examples" section on the project homepage, with a very basic example of how to get started with ShEx2. Please help improve it! (Thanks to Eric for fixing two minor issues in ShEx2!) Jneubert (talk) 12:06, 25 June 2018 (UTC)

Wikidata ShEx Inference tool



Notified participants of WikiProject ShEx

Hi folks! I’ve been working on a tool to automatically infer ShEx schemas from Wikidata items, and a first version of the tool is now available at toolforge:wd-shex-infer (documentation). I would be very thankful if you could try it out and let me know how it works for you, preferably within the next two weeks (the tool will stay available after that, but eventually I’ll have to write and hand in my thesis). Let me know if you have any questions! --Lucas Werkmeister (talk) 12:33, 16 August 2018 (UTC)

Some initial observations: This tool is a great idea and could potentially become very useful — thanks! It's understandable that only a small number of jobs can be run at any time, but it would be nice to be able to submit jobs into a queue if they cannot be run immediately. The tool tips when exploring the ShEx results are helpful. I haven't seen references covered in the ShEx output, but it would be handy to be able to run some jobs specifically to explore the data model used for references on items of particular types. --Daniel Mietchen (talk) 02:29, 18 August 2018 (UTC)
@Daniel Mietchen: thanks! I’ll think about adding a job queue, depending on how many people use the tool. And currently, qualifiers and references are ignored, yes – I’m afraid that the way RDF2Graph works doesn’t really work well with them (it heavily relies on “instance of” and “subclass of” relations, so it would see all statement and reference nodes as equivalent, since they all have the type wikibase:Statement/wikibase:Reference). It might be possible to fix that, but I don’t think I’ll have time for that before my thesis is done. --Lucas Werkmeister (talk) 12:14, 22 August 2018 (UTC)
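
To illustrate the typing problem described in the reply above: a minimal Python sketch, assuming a class-based grouping step that only looks at type triples. This is not how RDF2Graph is implemented; the triples, node names and the `group_by_type` helper are made up for illustration.

```python
from collections import defaultdict

# Made-up triples (subject, predicate, object) in Wikidata RDF style.
TRIPLES = [
    ("wd:Q42", "wdt:P31", "wd:Q5"),
    ("wd:Q1339", "wdt:P31", "wd:Q5"),
    ("wds:stmt-award", "rdf:type", "wikibase:Statement"),
    ("wds:stmt-citizenship", "rdf:type", "wikibase:Statement"),
    ("wdref:ref-1", "rdf:type", "wikibase:Reference"),
]

# Assumption for this sketch: both predicates are treated as typing relations.
TYPE_PREDICATES = {"wdt:P31", "rdf:type"}


def group_by_type(triples):
    """Group subjects by the class they are typed with."""
    groups = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in TYPE_PREDICATES:
            groups[obj].add(subject)
    return groups


groups = group_by_type(TRIPLES)
for cls, members in sorted(groups.items()):
    print(cls, sorted(members))
```

Both statement nodes end up in the single `wikibase:Statement` group, regardless of which property they belong to, so a purely type-driven approach has nothing to tell them apart.
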
Friendly reminder that the next few days would be an especially helpful time for feedback :) it should also be possible to run two jobs at once now. Please let me know if there are any problems! --Lucas Werkmeister (talk) 17:56, 28 August 2018 (UTC)
I’ve also updated the tool to fix several problems with the simplification step, so now the schemas should look much nicer. For example, compare the shape for human (Q5) between job #11 and job #29 (both for “films that won ten or more Oscars”): five target classes for nominated for (P1411) were merged into one (award (Q618779)), as were nine target classes for award received (P166); eight target classes for country of citizenship (P27) were merged into two (political territorial entity (Q1048835) and political system (Q28108) – that second one is probably a bug in the data); and so on. You might even see completely new predicates be mentioned, because the tool drops any predicate with more than ten possible target classes (rationale: that’s pointless noise), so predicates which would previously have been dropped might now be included due to the target classes being merged. If you were dissatisfied with the schemas before, perhaps take another look? :) --Lucas Werkmeister (talk) 15:49, 6 September 2018 (UTC)
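
The interaction between merging and the ten-class cutoff described above can be sketched as follows. This is a minimal Python illustration, not the tool's actual simplification algorithm: the `simplify` helper, the one-level `parent_of` lookup and all `ex:` names are assumptions made for the example. Only the threshold of ten comes from the post.

```python
MAX_TARGET_CLASSES = 10  # cutoff mentioned in the post above


def simplify(targets, parent_of):
    """Merge target classes into their parent class, then drop noisy predicates.

    targets:   predicate -> set of target classes
    parent_of: class -> parent class (hypothetical one-level lookup)
    """
    kept = {}
    for predicate, classes in targets.items():
        merged = {parent_of.get(cls, cls) for cls in classes}
        if len(merged) <= MAX_TARGET_CLASSES:
            kept[predicate] = merged
    return kept


# Made-up data: twelve kinds of award that share one parent class,
# and eleven unrelated classes that cannot be merged.
award_kinds = {f"ex:AwardKind{i}" for i in range(12)}
unrelated = {f"ex:Unrelated{i}" for i in range(11)}
targets = {"wdt:P166": award_kinds, "ex:noisy": unrelated}
parent_of = {cls: "wd:Q618779" for cls in award_kinds}

without_merge = simplify(targets, {})
with_merge = simplify(targets, parent_of)

print(without_merge)  # {} : both predicates exceed the cutoff
print(with_merge)     # {'wdt:P166': {'wd:Q618779'}}
```

Without merging, both predicates are dropped; with merging, `wdt:P166` collapses to one target class and survives, which is why previously hidden predicates can appear after the update.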