Wikidata talk:WikiProject Schemas

Meeting Notes

14 Feb 2018

ericP: what is our mission?

Andra: to raise awareness and grow a community ... we can grow the WikiProject page over time ... use cases for ShEx ... namespace for ShEx: Lucas' idea from WikidataCon, should host shapes on their own URI, what would that URI be?

ericP: conflicting interests: one is to give ownership to the WD community; the conflicting one is to keep the shapes visible so people can copy and steal them even if they are outside the WD community ... maybe move schemas over to WD and then mirror to the shex schemas space when you want

Lucas: Creating a new namespace is probably not trivial. Even if no one argues, the technical side might be complicated.

Andra: maybe postpone this until we have more use cases. ... Kat's http://wikidp.org/ demo: can we use this for additional domains by driving it with ShEx (a property checklist driven by ShEx)? ... We could create a generic version of the portal, containerize it, and then people could slot in their own shape expressions to create their own property checklists ... need shapes available through a URL to reuse Harold Solbrig's pyshex, so that is why I need a namespace for shape URIs

ericP: demo manifests to run in Eric's or Jose's implementation, like the primer's "try it" links. 1. create manifests so ... good queries and validation tests that are either picked up remotely or run on static data; create the schemas that will be shared, demo data, and manifests in a picklist ... demos show why validation is useful, give hints on how it is used in different domains, give people ideas ... wiki page with "try it" links: if we have a data structure, we can express it like this, which catches errors like this; help people

Andra: create a page similar to the example queries


TODOs for next meeting:

   Lucas- ask around WMDE about how to request a new namespace
   Kat- create an example on the WikiProject page
   Andra- create an example on the WikiProject page
   Kat- paste notes in the talk page of the WikiProject
  ? Create phabricator ticket for a new namespace?

Examples and tools

Andra Waagmeester Andrawaag (talk) 19:33, 30 January 2018 (UTC) YULdigitalpreservation (talk) 13:32, 6 February 2018 (UTC) Daniel Mietchen (talk) 01:52, 7 February 2018 (UTC) Finn Årup Nielsen (fnielsen) (talk) 13:55, 13 February 2018 (UTC) Lucas Werkmeister (talk) 12:34, 14 February 2018 (UTC) John Samuel 20:31, 26 February 2018 (UTC) Dhx1 (talk) 02:39, 8 March 2018 (UTC) Jneubert (talk) 13:35, 19 June 2018 (UTC) Malore (talk) 15:59, 24 August 2018 (UTC) Vladimir Alexiev (talk) 06:33, 10 September 2018 (UTC) Jose Emilio Labra Gayo (talk) 19:34, 21 November 2018 (UTC) Spinster 💬 08:45, 18 December 2018 (UTC) Egon Willighagen (talk) 07:43, 5 March 2019 (UTC) EricP (talk) 10:44, 14 March 2019 (UTC) Tombakerii (talk) 15:03, 17 May 2019 (UTC) Maxlath (talk) 13:26, 19 May 2019 (UTC) Jumtist (talk) 13:29, 19 May 2019 (UTC) SilentSpike (talk) 13:48, 19 May 2019 (UTC) MisterSynergy (talk) 19:17, 19 May 2019 (UTC) Harmonia Amanda (talk) 06:32, 20 May 2019 (UTC) Salgo60 (talk) 09:07, 20 May 2019 (UTC) Ivanhercaz (Talk) 15:38, 20 May 2019 (UTC) Andrew Su (talk) 15:50, 20 May 2019 (UTC) Mlemusrojas (talk) 16:50, 21 May 2019 (UTC) Dani Fernandez 14:11, 23 May 2019 (UTC) PKM (talk) 02:43, 29 May 2019 (UTC) Sannita - not just another it.wiki sysop 09:47, 2 June 2019 (UTC) Infomuse (talk) 22:37, 3 June 2019 (UTC) Buccalon (talk) 17:42, 18 June 2019 (UTC) author  TomT0m / talk page 11:52, 30 June 2019 (UTC) Ecritures (talk) 20:08, 15 July 2019 (UTC) Fuzheado (talk) 17:03, 10 July 2019 (UTC) Iovka Boneva (Iovka) Csisc (talk) 20:43, 24 August 2019 (UTC) Fuzheado (talk) 18:01, 23 October 2019 (UTC) Ash Crow (talk) Pdehaye (talk) 22:13, 27 October 2019 (UTC) Tinker Bell 20:18, 1 November 2019 (UTC) So9q (talk) 06:26, 13 November 2019 (UTC) ElanHR (talk) 21:29, 14 November 2019 (UTC) Arybolab (talk) Blue Rasberry (talk) 14:21, 24 November 2019 (UTC) Susanna Ånäs (Susannaanas) (talk) BlaueBlüte (talk) 22:20, 8 December 2019 (UTC) Arcadialib (talk) 21:37, 19 February 2020 (UTC) Andy 
Mabbett (Pigsonthewing); Talk to Andy; Andy's edits TiagoLubiana (talk) 18:31, 23 March 2020 (UTC) VIGNERON (talk) Iwan.Aucamp (talk) 11:39, 5 May 2020 (UTC) —M@sssly 15:52, 30 April 2020 (UTC) Moebeus Moebeus (talk) 11:41, 27 May 2020 (UTC) CamelCaseNick (talk) 17:13, 28 May 2020 (UTC) Jvcavv (talk) 21:38, 23 September 2020 (UTC) Bodhisattwa (talk) 15:23, 6 November 2020 (UTC) DeniseSl (talk) 09:49, 11 November 2020 (UTC)

Notified participants of WikiProject ShEx

Could you please first provide examples of ShEx shapes that check particular data models in Wikidata, and guidelines on how to check Wikidata against these shapes? I'd prefer

  • a web form tailored to Wikidata to edit and check shape expressions with syntax highlighting and typeahead, like the Wikidata Query Service
  • a bot that regularly runs ShEx given on wiki pages and posts the results, like User:ListeriaBot

-- JakobVoss (talk) 07:15, 22 February 2018 (UTC)

I've just added a "Tutorials and examples" section on the project homepage, with a very basic example on how to get started with ShEx2. Please help improve it! (Thanks to Eric for fixing two minor issues in ShEx2!) Jneubert (talk) 12:06, 25 June 2018 (UTC)
Updated version of How to get started with ShEx on Wikidata? Please help improve it. --Jneubert (talk) 14:35, 25 July 2019 (UTC)

Wikidata ShEx Inference tool

Notified participants of WikiProject ShEx

Hi folks! I’ve been working on a tool to automatically infer ShEx schemas from Wikidata items, and a first version of the tool is now available at toolforge:wd-shex-infer (documentation). I would be very thankful if you could try it out and let me know how it works for you, preferably within the next two weeks (the tool will stay available after that, but eventually I’ll have to write and hand in my thesis). Let me know if you have any questions! --Lucas Werkmeister (talk) 12:33, 16 August 2018 (UTC)

Some initial observations: This tool is a great idea and could potentially become very useful — thanks! It's understandable that only a small number of jobs can be run at any time, but it would be nice to be able to submit jobs into a queue if they cannot be run immediately. The tool tips when exploring the ShEx results are helpful. I haven't seen references covered in the ShEx output, but it would be handy to be able to run some jobs specifically to explore the data model used for references on items of particular types. --Daniel Mietchen (talk) 02:29, 18 August 2018 (UTC)
@Daniel Mietchen: thanks! I’ll think about adding a job queue, depending on how many people use the tool. And currently, qualifiers and references are ignored, yes – I’m afraid that the way RDF2Graph works doesn’t really work well with them (it heavily relies on “instance of” and “subclass of” relations, so it would see all statement and reference nodes as equivalent, since they all have the type wikibase:Statement/wikibase:Reference). It might be possible to fix that, but I don’t think I’ll have time for that before my thesis is done. --Lucas Werkmeister (talk) 12:14, 22 August 2018 (UTC)
Friendly reminder that the next few days would be an especially helpful time for feedback :) it should also be possible to run two jobs at once now. Please let me know if there are any problems! --Lucas Werkmeister (talk) 17:56, 28 August 2018 (UTC)
I’ve also updated the tool to fix several problems with the simplification step, so now the schemas should look much nicer. For example, compare the shape for human (Q5) between job #11 and job #29 (both for “films that won ten or more Oscars”): five target classes for nominated for (P1411) were merged into one (award (Q618779)), as were nine target classes for award received (P166); eight target classes for country of citizenship (P27) were merged into two (political territorial entity (Q1048835) and political system (Q28108) – that second one is probably a bug in the data); and so on. You might even see completely new predicates be mentioned, because the tool drops any predicate with more than ten possible target classes (rationale: that’s pointless noise), so predicates which would previously have been dropped might now be included due to the target classes being merged. If you were dissatisfied with the schemas before, perhaps take another look? :) --Lucas Werkmeister (talk) 15:49, 6 September 2018 (UTC)
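
The simplification described in the post above (merge target classes, then drop any predicate that still has more than ten) can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the tool's actual code: the function names, the `parent_of` mapping and the sample class IDs are hypothetical; only the threshold of ten and the example IDs P166 and Q618779 come from the post.

```python
from typing import Dict, Set

MAX_TARGET_CLASSES = 10  # threshold quoted in the post above


def merge_target_classes(targets: Dict[str, Set[str]],
                         parent_of: Dict[str, str]) -> Dict[str, Set[str]]:
    """Replace each target class by its mapped ancestor, if one is known.

    `parent_of` is a hypothetical child -> ancestor mapping; how the real
    tool picks the class to merge into is not described in the thread.
    """
    return {
        predicate: {parent_of.get(cls, cls) for cls in classes}
        for predicate, classes in targets.items()
    }


def drop_noisy_predicates(targets: Dict[str, Set[str]],
                          limit: int = MAX_TARGET_CLASSES) -> Dict[str, Set[str]]:
    """Keep only predicates with at most `limit` possible target classes."""
    return {p: c for p, c in targets.items() if len(c) <= limit}


# Hypothetical input: P166 points at twelve made-up award subclasses.
raw_targets = {"P166": {f"award_subclass_{i}" for i in range(12)}}
award_parent = {cls: "Q618779" for cls in raw_targets["P166"]}

before = drop_noisy_predicates(raw_targets)
after = drop_noisy_predicates(merge_target_classes(raw_targets, award_parent))
```

With this input, P166 is absent from `before` but present in `after`, which matches the remark that predicates previously dropped can reappear once their target classes are merged.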

You can now try Shape Expressions on a test system

Notified participants of WikiProject ShEx

Hello all,

The Wikidata team started working on support for Schemas, specifically Shape Expressions, to integrate a new extension into Wikidata, in order to store and reuse Schemas.

It’s still in development, but we wanted to share the first results with you, so you can give us early feedback.

On the test system, one can create and edit Schemas. You can see an example Schema here.

Please note that the multilingual labels, descriptions and aliases are not enabled for now; this is the next step we will work on. After that we will work on linking to a tool that allows you to check the Schema against a list of Items.

If you have any questions or remarks at that stage, please let me know by replying to this section :) If you want to create Phabricator tickets, you can use the tag Shape Expressions.

Cheers, Lea Lacroix (WMDE) (talk) 14:13, 26 February 2019 (UTC)

  • Thanks for letting us know. I just tried it out and created O10. YULdigitalpreservation (talk) 19:29, 26 February 2019 (UTC)
  • Sorry for the delayed reply. This is cool! I finally got around to it, but will put my (few) ShEx there. --Egon Willighagen (talk) 09:50, 22 April 2019 (UTC)

Improvements on ShEx test system

Notified participants of WikiProject ShEx

Hello all,

Our developers keep working hard on Shape Expressions, and we would love to have your feedback on the current version :)

Here's what has been improved recently:

  • the "termbox" area of the page now displays several languages
  • if you switch your interface from English to a language that has a label filled in, the title of the page will change accordingly
  • if you want to add a label/description in a new language, two options are possible: you can switch your interface to this new language, and an editable line will appear in the table, or you can directly edit the URL to access the special page, e.g. https://wikidata-shex.wmflabs.org/wiki/Special:SetSchemaLabelDescriptionAliases/O2/fr
  • there is no longer an edit button at the top of the page, but the different sections are independently editable
  • A new special page, Special:SchemaText, provides the raw text of the Schema in an external file. Example: https://wikidata-shex.wmflabs.org/wiki/Special:SchemaText/O2

And here's what is coming next:

  • the "edit" buttons will be translated into the language of your interface
  • we will add a button to check the schema in the validator tool

Feel free to try the interface on the test system, create new schemas, play around. If you find any issue, or if there is a feature/improvement that you would like to add, please let me know :)

Cheers, Lea Lacroix (WMDE) (talk) 09:14, 14 March 2019 (UTC)
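
For anyone scripting against the test system, the two URL patterns quoted in the announcement above can be assembled with small helpers. This is a minimal sketch: the function names are made up for illustration, and the patterns are taken only from the two example links in the post, so they may not hold for other installations.

```python
TEST_SYSTEM = "https://wikidata-shex.wmflabs.org/wiki"


def schema_text_url(schema_id: str) -> str:
    """URL of the raw Schema text, following the Special:SchemaText example."""
    return f"{TEST_SYSTEM}/Special:SchemaText/{schema_id}"


def set_terms_url(schema_id: str, language_code: str) -> str:
    """URL of the special page for editing label/description/aliases in one language."""
    return (f"{TEST_SYSTEM}/Special:SetSchemaLabelDescriptionAliases/"
            f"{schema_id}/{language_code}")
```

Calling `set_terms_url("O2", "fr")` reproduces the link given in the announcement.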

  • One thing that comes to mind is to be able to indicate what items the expression should run on. At this moment I am not entirely sure how to 'run' my ShEx on Wikidata. --Egon Willighagen (talk) 09:51, 22 April 2019 (UTC)

Please explain

What is the purpose and how will it affect the existing structure that is opaque. That can not be explained to me (I have asked repeatedly). What are the material benefits of this approach? Thanks, GerardM (talk) 16:45, 16 March 2019 (UTC)

Community requirements for data integrity

@GerardM: Hi, for completeness and to make sure we're addressing your issues, could you link to your previous requests for explanation?

While not all Wikidata communities require or even desire validation, it is essential for some of the more complex ones, e.g. GeneWiki (cf. GeneWiki grant proposal). Such validation can be hand-rolled, but having a standard schema language offers obvious advantages in terms of tooling, completeness and ease of maintenance. Compiling even a simple ShEx schema to SPARQL produces a 10-100x explosion in line noise, and scripting something with a conjunction of JSON path expressions would require tooling investment as well as maintenance of a corpus of rules to enforce cardinality, data type consistency and structural coherence. It would be possible to invent a Wikidata-specific schema language, but it would lack the tooling support that ShEx offers (validators in five languages, form generation, import from UML/XMI, etc.).

I've witnessed many publicly-curated databases lose relevance as their data rotted over time or changed structure so that potential users gave up trying to track it. Open PHACTS was founded specifically to provide integrity and consistency to Linked Data. Domain-specific databases typically have greater institutional investment because they offer integrity and consistency backed by schemas (e.g. UniProt, whose RDF structure reflects a conventional SQL (DDL) schema for genes and proteins). General knowledge stores have to add schema validation because their native schema is not domain-specific but instead one of generalized assertions, which can express incoherent data structures as easily as coherent ones.

Of course not all communities demand validation, but I believe that the offer of testable contracts to ensure the longevity and institutional investment in Wikidata more than justifies this effort.

--EricP (talk) 07:00, 18 March 2019 (UTC)

When technology is introduced that enforces particular behaviour, it is all too easy to use the same technology elsewhere when at first glance a similar situation exists. So you have been abstract in your answer and it does not satisfy. I am familiar with SwissProt/UniProt from my Wikiprotein days. I know that Wikidata is not as good as Wikiprotein used to be. The quality of the data is not the issue; the issue is that a schema enforces. It follows that a certain "completeness" will be enforced and that is not necessarily a good thing. What I learned at Wikiprotein is how vital it is that people include information that is valid but not necessarily complete.
In conclusion, what is it EXACTLY that you aim to achieve/enforce? Thanks, GerardM (talk) 11:11, 18 March 2019 (UTC)
ShEx, like any schema language, is not about enforcing; it is an instrument for checking conformance. As a data consumer I want to be able to check data consistency according to relevant data models. Relevant to me, not necessarily to you. There are many cases where even within a single application multiple schemas could apply, depending on the use case. As you say, it is crucial that people include data that is valid, not necessarily complete. There is no intention to enforce, only to be able to check the validity. --Andrawaag (talk) 11:40, 18 March 2019 (UTC)
You asked EXACTLY what we aim to enforce. It would be tedious to enumerate everything, but as an example, in Gene Wiki we want to check that an item on a protein doesn't have properties related to genes (e.g. chromosomal location), and to know when a genomic build is missing as a qualifier on a gene-location statement, which makes the statement nonsensical. When these inconsistencies occur, having flags for them as part of the workflow helps tremendously in curating protein and gene information. Early prototypes of this system have already helped me fix errors. --Andrawaag (talk) 12:10, 18 March 2019 (UTC)
That makes perfect sense. So in conclusion the intention is to signal structural issues in order to help people insert sensible information and to use it as a template to query those records that fail a "sanity"check. Thanks, GerardM (talk) 15:16, 18 March 2019 (UTC)
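
To make the "flag, don't enforce" idea from this exchange concrete, here is a minimal Python sketch of the two Gene Wiki checks as they are read here: a protein item should not carry gene properties, and a gene-location statement should have a genomic build qualifier. It is not the Gene Wiki implementation. The item layout and function name are invented for illustration, and the property and item IDs are assumptions that should be verified against Wikidata before use.

```python
from typing import Dict, List

# Assumed IDs (verify on Wikidata before relying on them):
PROTEIN = "Q8054"                                 # believed to be "protein"
GENE_ONLY_PROPERTIES = {"P1057", "P644", "P645"}  # believed: chromosome, genomic start/end
LOCATION_PROPERTIES = {"P644", "P645"}            # believed: genomic start/end
GENOMIC_BUILD = "P659"                            # believed: genomic assembly qualifier


def curation_flags(item: Dict) -> List[str]:
    """Return human-readable flags; nothing is rejected or changed."""
    flags = []
    statements = item.get("statements", {})
    is_protein = any(s["value"] == PROTEIN for s in statements.get("P31", []))
    for prop, stmts in statements.items():
        if is_protein and prop in GENE_ONLY_PROPERTIES:
            flags.append(f"{item['id']}: protein item carries gene property {prop}")
        if prop in LOCATION_PROPERTIES:
            for stmt in stmts:
                if GENOMIC_BUILD not in stmt.get("qualifiers", {}):
                    flags.append(f"{item['id']}: {prop} statement lacks a genomic build qualifier")
    return flags


# Hypothetical items in a simplified, made-up layout.
protein_with_chromosome = {
    "id": "protein-1",
    "statements": {"P31": [{"value": PROTEIN}], "P1057": [{"value": "chromosome-7"}]},
}
gene_without_build = {
    "id": "gene-1",
    "statements": {"P644": [{"value": "117120016", "qualifiers": {}}]},
}
gene_with_build = {
    "id": "gene-2",
    "statements": {"P644": [{"value": "117120016",
                             "qualifiers": {GENOMIC_BUILD: ["some-assembly"]}}]},
}
```

The function only reports; incomplete but valid items still pass through untouched, which is the distinction drawn in the replies above.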

Update documentation

Hello dear ShEx enthusiasts!

Because we will release Schemas on Wikidata very soon, I'm currently reviewing the existing documentation. When I announce it, I expect a lot of people in the Wikidata community to wonder "what is it exactly? how can I write my own?"

The main links I'll redirect people to are your WikiProject page and Wikidata:WikiProject ShEx/How to get started?. Is this second page still up to date from your point of view?

I think that now would be a good time to give a bit of polish to the presentation of shape expressions. From the development team side, we will add technical documentation about the new extension and data type.

If you have any question or wish, feel free to ping me. Cheers, Lea Lacroix (WMDE) (talk) 15:11, 23 April 2019 (UTC)

Shape Expressions arrive on Wikidata on May 28th

See full announcement on the Project Chat :)

Thanks a lot to all of you who have been involved in discussing, suggesting improvements, testing the feature! Lea Lacroix (WMDE) (talk) 13:30, 19 May 2019 (UTC)

Notified participants of WikiProject ShEx

Hello all,
As announced here, we just released shape expressions on Wikidata. You can for example have a look at E10, the shape for human, or create a new EntitySchema.
A few useful links:
If you have any question or encounter issues, feel free to ping me. Cheers, Lea Lacroix (WMDE) (talk) 16:07, 28 May 2019 (UTC)
Indeed it's CC0. Thanks for the reminder! I created a ticket. Lea Lacroix (WMDE) (talk) 07:18, 29 May 2019 (UTC)

Are the following validations possible?

1. Ensure that at least one statement for a given property (where multiple statements exist) has a value in a specified value set. If other statements exist for the property, ignore them. For example, validate that an item has at least the statement instance of (P31) sovereign state (Q3624078), but may also have other instance of (P31) statements that should be ignored. --Dhx1 (talk) 18:35, 28 May 2019 (UTC)

  • 1. Yes, the keyword EXTRA says that other values of the property may appear. This is common for P31. This example shows a schema with a simple value set [<Qx> <Qy>]. (In many schemas, that's a value set of 1 element.) <Q2> fails <WithoutExtra> because it has an extra P31 (outside the value set), but it passes <WithExtra>. I added a <Q3> which has two P31s within the value set. There you don't need an EXTRA; instead you need to increase the number of expected P31s matching the value set. I added +, which is shorthand for {1,}, i.e. a minimum of 1 and no maximum. --EricP (talk)
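
The interplay of EXTRA and cardinality in the answer above can be mimicked in plain Python for a single triple constraint. This is a sketch of the semantics as described there, not a ShEx engine; the function name and the `Qz` value are made up, while `Qx`, `Qy`, `Q2` and `Q3` follow the naming in the answer. In ShEx terms it corresponds roughly to `EXTRA wdt:P31 { wdt:P31 [<Qx> <Qy>] }`, with or without the EXTRA and the `+`.

```python
from typing import Iterable, Optional, Set


def p31_conforms(values: Iterable[str], value_set: Set[str], extra: bool = False,
                 min_count: int = 1, max_count: Optional[int] = 1) -> bool:
    """Check one property's values against a value set.

    Without `extra`, any value outside the value set is a failure.
    With `extra`, outside values are ignored. In both cases the number of
    values inside the value set must respect the cardinality
    (default exactly one; max_count=None means unbounded, like `+`).
    """
    values = list(values)
    matching = [v for v in values if v in value_set]
    outside = [v for v in values if v not in value_set]
    if outside and not extra:
        return False
    if len(matching) < min_count:
        return False
    return max_count is None or len(matching) <= max_count


VALUE_SET = {"Qx", "Qy"}
Q2 = ["Qx", "Qz"]  # one P31 inside the value set, one outside
Q3 = ["Qx", "Qy"]  # two P31s, both inside the value set
```

Here `Q2` only conforms when `extra=True`, whereas `Q3` needs `max_count=None` (the `+` cardinality) rather than EXTRA.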

2. Extract data on linked Wikidata items using EXTERNAL (?) or some other technique, allowing a country (P17) statement to be validated to ensure the linked item has a statement instance of (P31) sovereign state (Q3624078).

  • 2. Yes, but you don't need EXTERNAL. If I understand the question, you just want your constraints to link to another resource in the wikidata world. I created a shape for national flags as an example. It has the constraint below (which 90% of flags fail, but...) to say that the NationalFlag must have a country with a given type. --EricP (talk)
   wdt:P17 { wdt:P31 [wd:Q3624078] }
    • @EricP: is it possible to go a step further than that, and say that the linked country from a given flag is not only of a certain type, but also itself has a flag (P163) of this same item? --Oravrattas (talk) 06:35, 8 July 2020 (UTC)
      • at present, no, though there is a proposal to directly compare the value of some TripleConstraint against a property path, which is relatively simple to implement, and another to add more generic functions (example), which is more powerful but more complex. Aside from picking between the two, we also have to decide if we want to break the locality features of ShEx to add either one of them. --EricP (talk)
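
Since the reciprocal check asked about above is not expressible in ShEx at present, one workaround is to run it outside the schema. The sketch below mimics both checks over a tiny in-memory graph: the nested constraint `wdt:P17 { wdt:P31 [wd:Q3624078] }` with default cardinalities and no EXTRA, and the "country points back to this flag" test via P163. The graph layout, function names and item identifiers are hypothetical; only P17, P31, P163 and Q3624078 come from the thread.

```python
from typing import Dict, List

Graph = Dict[str, Dict[str, List[str]]]  # item -> property -> values (made-up layout)
SOVEREIGN_STATE = "Q3624078"


def country_is_sovereign_state(flag: str, graph: Graph) -> bool:
    """Exactly one P17, whose only P31 is sovereign state (no EXTRA, default cardinality)."""
    countries = graph.get(flag, {}).get("P17", [])
    if len(countries) != 1:
        return False
    return graph.get(countries[0], {}).get("P31", []) == [SOVEREIGN_STATE]


def country_points_back(flag: str, graph: Graph) -> bool:
    """Every linked country lists this same flag item under P163."""
    countries = graph.get(flag, {}).get("P17", [])
    return bool(countries) and all(
        flag in graph.get(country, {}).get("P163", []) for country in countries
    )


# Hypothetical data: flagA is consistent, flagB's country has a second P31
# and points to a different flag item.
GRAPH: Graph = {
    "flagA": {"P17": ["countryA"]},
    "countryA": {"P31": [SOVEREIGN_STATE], "P163": ["flagA"]},
    "flagB": {"P17": ["countryB"]},
    "countryB": {"P31": [SOVEREIGN_STATE, "otherClass"], "P163": ["flagC"]},
}
```

The strict P31 comparison also suggests one plausible reason why many flags could fail the schema as written: countries usually carry more than one P31, which the constraint rejects unless EXTRA is added to the nested shape.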

Validate in Blazegraph/query server ?

It would be interesting if these schemas could be used directly on the query server, i.e. filter for items that match, check if items match, list errors. --- Jura 10:38, 29 May 2019 (UTC)

Running validation with API access (i.e. getStatements()) would greatly accelerate validation and reduce parsing and serialization effort on the query server. ---EricP (talk)

Structure e-entities ?

There are a few essential, but secondary elements sometimes included on entities:

  • queries of items that could be validated
  • lists of prefixes

I think the first could easily go into the long announced "query"-namespace. The second could probably be assumed in the configuration of whatever tool one uses, at least if they are WD prefixes. --- Jura 10:38, 29 May 2019 (UTC)

  • Associating an item with each shape could help link the queries. --- Jura 09:36, 30 May 2019 (UTC)

links between entities schema

I couldn't figure out how to refer from one entity schema to another: for instance, I would like to be able to write, from the E36 entry point, something like wdt:P629 @<someprefix:E35>. Is that possible? Is it the right pattern to have several EntitySchemas to describe different shapes of a schema? Pinging the ShExperts ;) @Andrawaag, YULdigitalpreservation, Jelabra, Tombakerii: -- Maxlath (talk) 15:35, 29 May 2019 (UTC)

I am definitely a ShEx beginner as well, but I have found the import command as described in [1] and [2] which looks promising. You can access the raw ShEx schema code via Special:EntitySchemaText (e.g. Special:EntitySchemaText/E10).
Unfortunately, I don't get it to work in the shex-simple tool, and I am not sure whether this is due to my poor ShEx skills, or some bug in the tool (error message is: "failed to create validator: loadImports@https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/browser/shex-webapp-webpack.js:53845:9 …"). —MisterSynergy (talk) 09:31, 30 May 2019 (UTC)
I'll dive into this. @MisterSynergy, can you pass me an experiment that failed and I'll see if I can tweak it to make it succeed? (One requirement of IMPORT <XXX> is that XXX returns the schema without any HTML around it; another is that we don't get defeated by CORS issues, which require administration beyond my fingertips.) --EricP (talk)
For instance this one (sorry for the non-clickable link, there are several unmasked characters which I don't want to change in order not to break the link):
  • https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html?schema=PREFIX%20rdf%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20%3A%20%3Chttps%3A%2F%2Fwww.example.org%2F%23%3E%0A%0Aimport%20%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FSpecial%3AEntitySchemaText%2FE48%3E%0Astart%20%3D%20%40%3Asportsperson%0A%0A%3Asportsperson%20EXTRA%20wdt%3AP106%20{%0A%20%20wdt%3AP106%20[%20wd%3AQ2066131%20]%3B%0A%23%20wdt%3AP22%20%40%3Chuman%3E%0A}&data=Endpoint%3A%20https%3A%2F%2Fquery.wikidata.org%2Fsparql&shape-map=SPARQL%20%27%27%27SELECT%20DISTINCT%20%3Fid%20WHERE%20{%20%3Fid%20wdt%3AP106%20wd%3AQ2066131%3B%20wdt%3AP22%20[]%20}%20LIMIT%2010%27%27%27@START&interface=human&regexpEngine=threaded-val-nerr
It uses EntitySchema:E48 via Special:EntitySchemaText/E48 (raw ShEx without any HTML around it; just click on it). I already tried several things, including this older version of E48 with prefixes. Note that E48 does not have a "start" command, as required for imported shape expressions. In the shex-simple tool, you'll see that the line that would actually make use of the imported ShEx is commented out because it does not work anyway.
The error message displayed in Google Chrome is failed to create validator TypeError: Cannot read property 'keepImports' of undefined at loadImports (https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/browser/shex-webapp-webpack.js:53845:23). Sounds like a Javascript issue, but I am not very experienced with that… Thanks for investigating, —MisterSynergy (talk) 15:28, 30 May 2019 (UTC)
One engineering decision is whether that import would be just textual, like C's #include, or whether the prefixes (and inclusion thereof) should appear in the JSON (ShExJ) and RDF (ShExR) versions of the schema. You may want to raise a language issue with the tag "enhancement". — EricP (talk)
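To make the "textual import" option concrete, here is a minimal Python sketch. It is not a description of how shex.js or any other validator resolves imports. It assumes that an import directive sits on a line of its own in the form import <URL> (upper or lower case) and that the caller supplies a function returning the raw schema text for a URL. The names entity_schema_text_url, expand_imports_textually and fetch are hypothetical.

```python
import re

# A line that consists only of an import directive, e.g. "import <https://...>".
# Case-insensitive matching is an assumption made for this sketch.
IMPORT_LINE = re.compile(
    r"^[ \t]*import[ \t]*<([^>]+)>[ \t]*$", re.IGNORECASE | re.MULTILINE
)


def entity_schema_text_url(schema_id):
    """Raw-text URL of an EntitySchema, following the Special:EntitySchemaText pattern."""
    return "https://www.wikidata.org/wiki/Special:EntitySchemaText/" + schema_id


def expand_imports_textually(schema_text, fetch):
    """Replace each import line with the text returned by fetch(url).

    Only one level is expanded; imports inside the fetched text are left alone.
    """
    return IMPORT_LINE.sub(
        lambda match: fetch(match.group(1)).rstrip("\n"), schema_text
    )


# Usage with an in-memory stand-in for the network (no request is made).
fake_store = {entity_schema_text_url("E48"): "<human> { wdt:P31 [wd:Q5] }\n"}
main_schema = (
    "import <" + entity_schema_text_url("E48") + ">\n"
    "start = @<sportsperson>\n"
)
expanded = expand_imports_textually(main_schema, fake_store.__getitem__)
```

By construction, a purely textual expansion like this copies everything in the imported file, including its PREFIX lines and any start declaration, which is part of why the choice between textual and structured import matters.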

What to do with duplicate schemas?

Andra Waagmeester Andrawaag (talk) 19:33, 30 January 2018 (UTC) YULdigitalpreservation (talk) 13:32, 6 February 2018 (UTC) Daniel Mietchen (talk) 01:52, 7 February 2018 (UTC) Finn Årup Nielsen (fnielsen) (talk) 13:55, 13 February 2018 (UTC) Lucas Werkmeister (talk) 12:34, 14 February 2018 (UTC) John Samuel 20:31, 26 February 2018 (UTC) Dhx1 (talk) 02:39, 8 March 2018 (UTC) Jneubert (talk) 13:35, 19 June 2018 (UTC) Malore (talk) 15:59, 24 August 2018 (UTC) Vladimir Alexiev (talk) 06:33, 10 September 2018 (UTC) Jose Emilio Labra Gayo (talk) 19:34, 21 November 2018 (UTC) Spinster 💬 08:45, 18 December 2018 (UTC) Egon Willighagen (talk) 07:43, 5 March 2019 (UTC) EricP (talk) 10:44, 14 March 2019 (UTC) Tombakerii (talk) 15:03, 17 May 2019 (UTC) Maxlath (talk) 13:26, 19 May 2019 (UTC) Jumtist (talk) 13:29, 19 May 2019 (UTC) SilentSpike (talk) 13:48, 19 May 2019 (UTC) MisterSynergy (talk) 19:17, 19 May 2019 (UTC) Harmonia Amanda (talk) 06:32, 20 May 2019 (UTC) Salgo60 (talk) 09:07, 20 May 2019 (UTC) Ivanhercaz (Talk) 15:38, 20 May 2019 (UTC) Andrew Su (talk) 15:50, 20 May 2019 (UTC) Mlemusrojas (talk) 16:50, 21 May 2019 (UTC) Dani Fernandez 14:11, 23 May 2019 (UTC) PKM (talk) 02:43, 29 May 2019 (UTC) Sannita - not just another it.wiki sysop 09:47, 2 June 2019 (UTC) Infomuse (talk) 22:37, 3 June 2019 (UTC) Buccalon (talk) 17:42, 18 June 2019 (UTC) author  TomT0m / talk page 11:52, 30 June 2019 (UTC) Ecritures (talk) 20:08, 15 July 2019 (UTC) Fuzheado (talk) 17:03, 10 July 2019 (UTC) Iovka Boneva (Iovka) Csisc (talk) 20:43, 24 August 2019 (UTC) Fuzheado (talk) 18:01, 23 October 2019 (UTC) Ash Crow (talk) Pdehaye (talk) 22:13, 27 October 2019 (UTC) Tinker Bell 20:18, 1 November 2019 (UTC) So9q (talk) 06:26, 13 November 2019 (UTC) ElanHR (talk) 21:29, 14 November 2019 (UTC) Arybolab (talk) Blue Rasberry (talk) 14:21, 24 November 2019 (UTC) Susanna Ånäs (Susannaanas) (talk) BlaueBlüte (talk) 22:20, 8 December 2019 (UTC) Arcadialib (talk) 21:37, 19 February 2020 (UTC) Andy 
Mabbett (Pigsonthewing); Talk to Andy; Andy's edits TiagoLubiana (talk) 18:31, 23 March 2020 (UTC) VIGNERON (talk) Iwan.Aucamp (talk) 11:39, 5 May 2020 (UTC) —M@sssly 15:52, 30 April 2020 (UTC) Moebeus Moebeus (talk) 11:41, 27 May 2020 (UTC) CamelCaseNick (talk) 17:13, 28 May 2020 (UTC) Jvcavv (talk) 21:38, 23 September 2020 (UTC) Bodhisattwa (talk) 15:23, 6 November 2020 (UTC) DeniseSl (talk) 09:49, 11 November 2020 (UTC)

Notified participants of WikiProject ShEx

Hi all, since people are already writing their own schemas, and since we still haven't set up a list of all existing ones, there are already a couple that are basically the same thing, like E10, E14 and E48. What do we do in this case? Do we delete them or "reuse" them? --Sannita - not just another it.wiki sysop 14:32, 11 June 2019 (UTC)

Hello,
Not directly answering your question, I just wanted to point to a few tickets - we will continue improving the software in the future.
Cheers, Lea Lacroix (WMDE) (talk) 08:58, 13 June 2019 (UTC)
Some more input: I do not think that we should be concerned about duplicates at this point. ShEx is a relatively new functionality, there is still quite a lot of dev work going on, and the community needs to become familiar with it. According to [3], not that many EntitySchemas have been created so far. Later, we will probably want to either merge duplicates (i.e. redirect the E-numbers) or simply allow "duplicated" EntitySchemas. Reuse does not seem to be a good idea, though. --MisterSynergy (talk) 09:24, 13 June 2019 (UTC)

CheckShex UserScript

I thought this project might be interested in a new userscript named CheckShex. It adds a field to items, properties and lexemes where you can enter an EntitySchema, and it will return whether the entity passes or fails. It also adds a field to EntitySchemas, where you can do the reverse. The userscript can be installed to your common.js from User:Teester/CheckShex.js.

The userscript is backed by an api based on PyShEx (Q51672520). The api is located at https://tools.wmflabs.org/pyshexy/api and details about its use are at https://tools-static.wmflabs.org/pyshexy/. Teester (talk) 11:56, 22 June 2019 (UTC)

Thanks for this great tool! Sometimes however, I get strange results: Checking Antifaschistisches Pressearchiv und Bildungszentrum Berlin (Q575202) against "E94", I get "Pass Fail" as message. When I hit "Check" again, I get "Fail". This behaviour seems not to be reproducible, but I encountered it once for 20th Century Press Archives (Q36948990), too. A hint may be that hitting "Check" again on an item page after "Pass" always results in "Fail". From E94, both items are validated consistently as passing. Jneubert (talk) 09:55, 20 July 2019 (UTC)
Thanks. There was a bug in the userscript: when you hit "Check" more than once, the schema would be checked against itself rather than the item being checked against the schema. I wonder if the "Pass Fail" behaviour comes from clicking "Check" a second time before the first check is complete and running into the bug?
Looking at the items, Antifaschistisches Pressearchiv und Bildungszentrum Berlin (Q575202) currently fails against E94 because of a missing parent organization (P749), while 20th Century Press Archives (Q36948990) currently passes. I get this result when using both the user script and the ShEx2 Validator. For the ShEx2 Validator, a query like this gets you just that item to validate:
SELECT ?item WHERE {BIND(wd:Q36948990 as ?item)} LIMIT 1
A simpler way:
SELECT (wd:Q36948990 as ?item) WHERE {}
--Vladimir Alexiev (talk) 09:00, 9 January 2020 (UTC)
Let me know if there are any other bugs or problems. Teester (talk) 14:21, 20 July 2019 (UTC)
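For anyone scripting this, the shorter single-item query above can be generated from an item ID. This is only a sketch: the helper name single_item_query is made up, and the ID check is a convenience added here, not something either validator requires.

```python
import re


def single_item_query(qid):
    """Return a SPARQL query that yields one row binding ?item to the given item."""
    # Reject anything that is not an item ID, so that a schema ID such as "E94"
    # does not end up inside the query by mistake.
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError("expected an item ID such as 'Q36948990', got %r" % qid)
    return "SELECT (wd:%s AS ?item) WHERE {}" % qid


query = single_item_query("Q36948990")
```

The resulting string can be pasted into the query field of the validator in the same way as the hand-written queries.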
A big sorry - I'm currently figuring out possible workflows, and indeed have made E94 more strict, which causes it to fail with Antifaschistisches Pressearchiv und Bildungszentrum Berlin (Q575202), while it passes the new relaxed E95. This messed up the test case - sorry again!
Now in multiple tests with some arbitrary clicking, I was not able to reproduce a case with "Pass Fail", so I suppose this is gone together with your bug fix, which also worked consistently well. Thank you for the quick fix! --Jneubert (talk) 08:01, 21 July 2019 (UTC)
May I suggest a possible extension of the script? The API already returns the reason for failing (e.g., [4]). So it should be possible to show it to the user on request (with a popup/mouse-over perhaps, because the messages do not look nice, but are helpful nonetheless). --Jneubert (talk) 08:10, 21 July 2019 (UTC)
Great idea. I've updated the user script so that now it shows some error information on failure. Now, if there's a missing or incorrect property in the response, the property number is shown beside the Fail message. Additionally, the raw error response is available on mouse over of the fail message. Teester (talk) 11:03, 23 July 2019 (UTC)
This is fantastic - thank you so much. --Jneubert (talk) 14:37, 23 July 2019 (UTC)
While adding the tool to the How to get started ... page, another possible improvement came to mind: On the item page, a tiny "schema" link, right of the validating result, would make it super-easy to navigate to the selected schema. --Jneubert (talk) 17:57, 23 July 2019 (UTC)
I added more suggestions at User_talk:Teester/CheckShex.js#Usability --~~

Add saved queries to EntitySchema entries?

The "check entities against this Schema" link on the schema pages is a great thing. However, it requires newbies and experts alike to write a query from scratch, which is tedious. Some Schema authors are working around this by embedding example query code in the schema text as comment - which helps, but looks a bit messy, and still needs manual copy+paste for transfer to the query field.

So it would be great if we could save a query, or even better multiple named queries, with the schema. The code to load queries and allow for user selection is already in place (see ShEx2 on Toolforge) with the "dataLabel" and "queryMap" parameters in the manifest file (though perhaps not yet as HTTP request query parameters).

On the Wikidata/Wikibase side, I wonder if setting the Wikidata SPARQL query equivalent (P3921) property could be enabled for EntitySchema entries. Together with named as (P1810) qualifiers, that would allow for multiple queries to be saved with each schema. --Jneubert (talk) 09:31, 21 July 2019 (UTC)

My idea was to re-use the property definition at EntitySchema:E123 in order to use it there directly (formatted as a link to ShEx2), not to add the property to an item about the schema. --Jneubert (talk) 09:46, 21 July 2019 (UTC)
This is currently not supported. Once an item is associated with a schema, you should be able to load its content on the schema page with LUA. --- Jura 10:09, 21 July 2019 (UTC)

Have there been improvements in this regard? Having a formal association of shape with query is essential for example for a reporting bot. --SCIdude (talk) 14:28, 10 November 2020 (UTC)

Comparison between ShEx and constraints

What is possible with ShEx that is not possible with constraints, and vice versa? What I have so far:

  • Constraints are tied to a property; shapes are « free » to be checked against any item and reused
  • Constraints are somewhat easier to edit textually and more efficient
  • Constraints are automatically checked by MediaWiki.
  • Shapes are more powerful; for example, it’s possible to express something like "any property that is not explicitly authorized is forbidden"

Anything else, or anything wrong? It’s unclear to me how type shape constraints can be dealt with on Wikidata, as « rdf:type » is irrelevant in Wikidata items. Wikibase has domain and range constraints; I’m not sure these can be handled with shape expressions, as there seems to be no notion in ShEx analogous to SPARQL property paths. author  TomT0m / talk page 19:42, 21 July 2019 (UTC)

Correction: it’s definitely possible to express paths, my bad (this is used in the example shape for file formats and shown, for example, on the 13th slide of this comparison between ShEx and SHACL). author  TomT0m / talk page

Could we generate constraints from shapes and vice versa?

author  TomT0m / talk page 19:42, 21 July 2019 (UTC)

Rename this project

The project should be about entity schemas, not about ShEx. The latter is just the technical language to write entity schemas in. Anybody against renaming the project? -- JakobVoss (talk) 08:42, 25 October 2019 (UTC)

  •  Support; maybe we should call it just "WikiProject Schemas"? —MisterSynergy (talk) 09:07, 25 October 2019 (UTC)
Ok, I was bold, renamed it and created some additional redirects such as Help:Schemas to avoid creation of too many separated pages. -- JakobVoss (talk) 12:44, 26 October 2019 (UTC)

Lack of help

There is also a lack of help. No mention of schemas in the Help namespace so far. There should be Help:Schemas just like Help:Constraints. -- JakobVoss (talk) 08:48, 25 October 2019 (UTC)

  • There is Wikidata:WikiProject ShEx/How to get started? as the only subpage of this project at this point, and it clearly is "work in progress". We definitely need to collect some more experience with Schemas in Wikidata in order to come up with a helpful help page. —MisterSynergy (talk) 09:10, 25 October 2019 (UTC)
In my opinion, the technical references and the links to standards and implementations should be removed. For an overview of ShEx in general there is en:ShEx. This page, in contrast, should focus on the use of ShEx in and for Wikidata. -- JakobVoss (talk) 12:46, 26 October 2019 (UTC)

Request a Schema page

Schemas are still hard for people for various reasons. We had the same problem with queries and one thing that beautifully helped was the Request a Query page. There anyone who doesn't know how to write sparql can ask for help from people who can. I think a similar Request a Schema page could be super helpful to get more wiki projects to adopt Schemas. Thoughts? --Lydia Pintscher (WMDE) (talk) 13:27, 31 October 2019 (UTC)

 Support Currently, there are only around 140 entity schemas. This number could grow with the creation of a dedicated page for schema-related questions. John Samuel (talk) 13:32, 31 October 2019 (UTC)
 Support Fabulous idea. Do we have people willing to build schemas on request? - PKM (talk) 03:13, 15 November 2019 (UTC)
 Support Request a query is super-helpful, and an equivalent for schemas would be great. --Oravrattas (talk) 06:43, 8 July 2020 (UTC)

Human readable schemas

One of the biggest problems with schemas right now is that they are difficult to understand without sufficient technical knowledge. But it seems to me that it should be possible to translate a schema into human readable language without too much difficulty, for the most part.

For example, if my understanding of shex is correct, currently E10 could be translated as follows:

Schema:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

start = @<human>

<human> EXTRA wdt:P31 {
  wdt:P31 [wd:Q5];
  wdt:P21 [wd:Q6581097 wd:Q6581072 wd:Q1097630 wd:Q1052281 wd:Q2449503 wd:Q48270]?;   # gender
  wdt:P19 .;                     # place of birth
  wdt:P569 . + ;                 # date of birth
  wdt:P735 . * ;                 # given name
  wdt:P734 . * ;                 # family name
  wdt:P106 . * ;                 # occupation
  wdt:P27 @<country> *;  # country of citizenship
  rdfs:label rdf:langString+;
}

<country> EXTRA wdt:P31 {
  wdt:P31 [wd:Q6256 wd:Q3024240 wd:Q3624078] +;
}

Translation:

  • start with <human>

I could see this sort of thing being useful as part of a schema's talk page, similar to how property's talk pages contain a template containing useful information about a property and its constraints. Does anyone know of a service which will translate a schema into human readable language or vice versa? Teester (talk) 13:46, 16 November 2019 (UTC)

Since there seems to be nothing that can translate schemas into human readable language, I've put something together at https://tools-static.wmflabs.org/shextranslator/ Any feedback would be appreciated. Teester (talk) 12:23, 23 November 2019 (UTC)
  • Schemas have great potential to become a good tool, but, in their present implementation, I don't think we can or should expect users to rely on them as a primary means of understanding which properties to add or what statements to fix.
  • A human readable version should always be outlined on a WikiProject page or with property constraints. --- Jura 12:42, 23 November 2019 (UTC)
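As a rough illustration of the kind of translation discussed in this section, here is a small Python sketch. It is not the code behind the shextranslator tool mentioned above; it only handles single-line wdt: constraints written like those in E10, and it uses the trailing comment as the label. All names in it (CONSTRAINT, describe_value, translate_line) are made up for the example.

```python
import re

# ShEx cardinality markers and a plain-language reading of each.
CARDINALITY = {
    "": "exactly one",
    "?": "zero or one",
    "+": "one or more",
    "*": "zero or more",
}

CONSTRAINT = re.compile(
    r"^\s*wdt:(P\d+)\s+"    # property, e.g. wdt:P569
    r"(.+?)\s*"             # value expression: ".", "[ ... ]" or "@<shape>"
    r"([?+*]?)\s*;"         # optional cardinality, then the terminating ";"
    r"\s*(?:#\s*(.*))?$"    # optional trailing comment, used as a label
)


def describe_value(value):
    """Describe the three value forms that appear in E10; pass others through."""
    if value == ".":
        return "any value"
    if value.startswith("@<") and value.endswith(">"):
        return "a value matching the <%s> shape" % value[2:-1]
    if value.startswith("[") and value.endswith("]"):
        return "one of " + ", ".join(value[1:-1].split())
    return value


def translate_line(line):
    """Translate one constraint line, or return None if the line is not one."""
    match = CONSTRAINT.match(line)
    if match is None:
        return None
    prop, value, cardinality, comment = match.groups()
    label = "%s (%s)" % (comment.strip(), prop) if comment else prop
    return "%s: %s, %s" % (label, CARDINALITY[cardinality], describe_value(value))


print(translate_line("  wdt:P569 . + ;                 # date of birth"))
```

Lines that are not wdt: constraints (prefixes, start, shape headers, rdfs:label) are skipped by returning None. The sketch also ignores EXTRA, so "exactly one" for wdt:P31 means exactly one value matching the constraint, with further P31 values still allowed by the shape.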

PyShexy and sparql query

https://tools-static.wmflabs.org/pyshexy/ Has anyone figured out a way to get it to work with a SPARQL query? I tried hard but failed; I get an HTTP 500 error. Example: query, pyshexy url --So9q (talk) 23:33, 25 November 2019 (UTC)

Troubles


Notified participants of WikiProject Schemas

I'm a big fan of shapes, extensively reviewed the "Validating RDF" book, try to use them in my work, and we at Onto are helping with the rdf4j effort (though that's SHACL, not ShEx). I'm quite enthusiastic about the Wikidata ShEx project and see a lot of good things.

But I tried to validate a realistic list, eg BG painters (this selects 100 of 310 on WD) against E10:

select ?item {?item wdt:P106 wd:Q1028181; wdt:P27 wd:Q219} limit 100

I think the results are not quite usable yet.

PyShexy

PyShexy just gave up on me, even with limit 1 it returns "The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."

shex.js

ShEx.js behaves better (paste the query in the box).

But there are still usability problems:

  • Some of the errors are reported many times, e.g. (I cut to only the first few; cc @EricP):
wd:Q284264@!START
validating http://www.wikidata.org/entity/Q284264 as //www.wikidata.org/wiki/Special:EntitySchemaText/human:
    validating http://www.wikidata.org/entity/Q12287013:
        Missing property: http://www.wikidata.org/prop/direct/P19
        Missing property: http://www.wikidata.org/prop/direct/P19
        Missing property: http://www.wikidata.org/prop/direct/P19

wd:Q6957611@!START
validating http://www.wikidata.org/entity/Q6957611 as //www.wikidata.org/wiki/Special:EntitySchemaText/human:
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P569
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P569
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19

wd:Q11317581@!START
validating http://www.wikidata.org/entity/Q11317581 as //www.wikidata.org/wiki/Special:EntitySchemaText/human:
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19

wd:Q12283051@!START
validating http://www.wikidata.org/entity/Q12283051 as //www.wikidata.org/wiki/Special:EntitySchemaText/human:
    validating http://www.wikidata.org/entity/Q12299788:
      validating http://www.wikidata.org/entity/Q12283051:
        validating http://www.wikidata.org/entity/Q28194288:
            Missing property: http://www.wikidata.org/prop/direct/P19
            Missing property: http://www.wikidata.org/prop/direct/P19
            Missing property: http://www.wikidata.org/prop/direct/P19
            Missing property: http://www.wikidata.org/prop/direct/P19
            Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  validating http://www.wikidata.org/entity/Q28194288:
      Missing property: http://www.wikidata.org/prop/direct/P19
      Missing property: http://www.wikidata.org/prop/direct/P19
      Missing property: http://www.wikidata.org/prop/direct/P19
      Missing property: http://www.wikidata.org/prop/direct/P19
      Missing property: http://www.wikidata.org/prop/direct/P19
      Missing property: http://www.wikidata.org/prop/direct/P19
  • It takes over 10s for some of the more difficult items. This isn't scalable.
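Until the duplicates are fixed in the tool itself, a report like the one above can be condensed after the fact. The following Python sketch is a workaround written for this thread, not part of shex.js: it assumes the plain-text layout shown above (a wd:Q…@ line starting each node, then "Missing property:" lines) and lists each missing property once per top-level node. The function name missing_properties is made up.

```python
import re

# A top-level result starts with a line such as "wd:Q6957611@!START".
NODE = re.compile(r"^(wd:Q\d+)@")
# Error lines name the missing property by its full URI.
MISSING = re.compile(
    r"Missing property: http://www\.wikidata\.org/prop/direct/(P\d+)"
)


def missing_properties(report):
    """Map each top-level node to its missing properties, without repeats."""
    result = {}
    current = None
    for line in report.splitlines():
        node = NODE.match(line)
        if node:
            current = node.group(1)
            result.setdefault(current, [])
            continue
        missing = MISSING.search(line)
        if missing and current is not None:
            prop = missing.group(1)
            if prop not in result[current]:
                result[current].append(prop)
    return result


report = """wd:Q6957611@!START
validating http://www.wikidata.org/entity/Q6957611 as //www.wikidata.org/wiki/Special:EntitySchemaText/human:
  Missing property: http://www.wikidata.org/prop/direct/P19
  OR
  Missing property: http://www.wikidata.org/prop/direct/P569
  OR
  Missing property: http://www.wikidata.org/prop/direct/P19
"""
summary = missing_properties(report)
```

Note that this flattening throws away which nested entity an error belongs to and the OR structure between alternatives, so it is only useful as a quick overview.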

WikiShape

http://wikishape.weso.es/ by @Jelabra: runs validations in parallel, so even though the hard items (e.g. 1, 2, 4; the count is zero-based) have been spinning for 10 min already, I can inspect the easier items.

  • I think the hard items have relatives, so they cause recursive validation (see next section) and I'm doubtful their validation will ever finish.
  • Parallel threads reuse validations of subsidiary entries, which is great: after 100, it added 27 "country", "language" and "human", and each is checked only once.
  • The error messages are quite hard to grok, see below for 6 wd:Q3650675. It'd take me probably 20 min to understand what's wrong.
Error: None of the candidates matched. Attempt: Attempt: node: wd:Q3650675, shape: <internal://base/human>
Bag: C0,C1?,C2,C3+,C4*,C5*,C6*,C7*,C8*,C9*,C10*,C11*,C12*,C13*,C14*,C15*,C16*,C17+,C18*
Candidate lines:
CandidateLine: ((<http://www.wikidata.org/prop/direct/P31>,<http://www.wikidata.org/entity/Q5>),C0)
((<http://www.wikidata.org/prop/direct/P21>,<http://www.wikidata.org/entity/Q6581097>),C1)
((<http://www.wikidata.org/prop/direct/P569>,"1827-01-01T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>),C3)
((<http://www.wikidata.org/prop/direct/P735>,<http://www.wikidata.org/entity/Q15501913>),C4)
((<http://www.wikidata.org/prop/direct/P106>,<http://www.wikidata.org/entity/Q1028181>),C6)
((<http://www.wikidata.org/prop/direct/P27>,<http://www.wikidata.org/entity/Q219>),C7)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@de),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Никола Образописов"@bg),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikola Obrazopisov"@sq),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@nl),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@es),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikola Obrazopisov"@en),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikola Obrazopisov"@ga),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@fr),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@pt),C17)
CandidateLine: ((<http://www.wikidata.org/prop/direct/P21>,<http://www.wikidata.org/entity/Q6581097>),C1)
((<http://www.wikidata.org/prop/direct/P569>,"1827-01-01T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>),C3)
((<http://www.wikidata.org/prop/direct/P735>,<http://www.wikidata.org/entity/Q15501913>),C4)
((<http://www.wikidata.org/prop/direct/P106>,<http://www.wikidata.org/entity/Q1028181>),C6)
((<http://www.wikidata.org/prop/direct/P27>,<http://www.wikidata.org/entity/Q219>),C7)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@de),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Никола Образописов"@bg),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikola Obrazopisov"@sq),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@nl),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@es),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikola Obrazopisov"@en),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikola Obrazopisov"@ga),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@fr),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Nikolai Obrazopisov"@pt),C17)
((<http://www.wikidata.org/prop/direct/P31>,<http://www.wikidata.org/entity/Q5>),C18)

Or look at 10 wd:Q3804651:

Error: None of the candidates matched. Attempt: Attempt: node: wd:Q3804651, shape: <internal://base/human>
Bag: C0,C1?,C2,C3+,C4*,C5*,C6*,C7*,C8*,C9*,C10*,C11*,C12*,C13*,C14*,C15*,C16*,C17+,C18*
Candidate lines:
CandidateLine: ((<http://www.wikidata.org/prop/direct/P31>,<http://www.wikidata.org/entity/Q5>),C0)
((<http://www.wikidata.org/prop/direct/P21>,<http://www.wikidata.org/entity/Q6581097>),C1)
((<http://www.wikidata.org/prop/direct/P569>,"1864-05-18T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>),C3)
((<http://www.wikidata.org/prop/direct/P735>,<http://www.wikidata.org/entity/Q21104340>),C4)
((<http://www.wikidata.org/prop/direct/P106>,<http://www.wikidata.org/entity/Q1028181>),C6)
((<http://www.wikidata.org/prop/direct/P27>,<http://www.wikidata.org/entity/Q219>),C7)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@ga),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@en),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@ast),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@nl),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@de),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@it),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@sq),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@fr),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@es),C17)
CandidateLine: ((<http://www.wikidata.org/prop/direct/P21>,<http://www.wikidata.org/entity/Q6581097>),C1)
((<http://www.wikidata.org/prop/direct/P569>,"1864-05-18T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>),C3)
((<http://www.wikidata.org/prop/direct/P735>,<http://www.wikidata.org/entity/Q21104340>),C4)
((<http://www.wikidata.org/prop/direct/P106>,<http://www.wikidata.org/entity/Q1028181>),C6)
((<http://www.wikidata.org/prop/direct/P27>,<http://www.wikidata.org/entity/Q219>),C7)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@ga),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@en),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@ast),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@nl),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@de),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@it),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@sq),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@fr),C17)
((<http://www.w3.org/2000/01/rdf-schema#label>,"Ivan Angelov"@es),C17)
((<http://www.wikidata.org/prop/direct/P31>,<http://www.wikidata.org/entity/Q5>),C18)

Recursive Shapes

E10 includes recursive shape refs.

<human> EXTRA wdt:P31 {
  wdt:P22 @<human> *;           # father
  wdt:P25 @<human> *;           # mother
  wdt:P3373 @<human> *;         # sibling
  wdt:P26 @<human> *;           # husband/wife
  wdt:P40 @<human> *;           # children
  wdt:P1083 @<human> *;         # relatives
  ...
}

But some British politician ancestries have been tracked back to Adam (through some uncertain/fictional rulers). So if you follow all these links recursively, back and forth, you may pick up a majority of Humans on WD (5-6M). So following such recursion faithfully is suicide, and shex.js does seem to recurse faithfully:

wd:Q2989196@!START
validating http://www.wikidata.org/entity/Q2989196 as //www.wikidata.org/wiki/Special:EntitySchemaText/human:
    validating http://www.wikidata.org/entity/Q3657670:
      validating http://www.wikidata.org/entity/Q2989196:
        validating http://www.wikidata.org/entity/Q4162892:
          validating http://www.wikidata.org/entity/Q35228:
            Missing property: http://www.wikidata.org/prop/direct/P31
  OR
  validating http://www.wikidata.org/entity/Q4162892:
    validating http://www.wikidata.org/entity/Q35228:
      Missing property: http://www.wikidata.org/prop/direct/P3

What we need instead is something like:

<human> EXTRA wdt:P31 {
  wdt:P31 [wd:Q5];
  wdt:P22 @<mini_human> *;           # father
  wdt:P25 @<mini_human> *;           # mother
  wdt:P3373 @<mini_human> *;         # sibling
  wdt:P26 @<mini_human> *;           # husband/wife
  wdt:P40 @<mini_human> *;           # children
  wdt:P1083 @<mini_human> *;         # relatives
  ...
}
<mini_human> EXTRA wdt:P31 {
  wdt:P31 [wd:Q5];
}

So it's really easy for a schema writer to shoot himself in the foot.

Discussion

I've thought a lot about shape validation performance and scalability, and I think that fetching entities ad nauseam (esp. through numerous SPARQL queries) can never scale. What we need is for ShEx engines to strictly enforce limits on what's checked about referenced WD entities: basically we need an "existence check" but not full recursive checking.

@EricP, Jelabra: what do you think? --Vladimir Alexiev (talk) 09:41, 9 January 2020 (UTC)
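To show the difference in how many entities get touched, here is a toy Python sketch over an in-memory graph. It does not use any ShEx engine: the graph is a plain dict invented for the example, is_human stands in for what the <mini_human> shape checks, and the function names are hypothetical.

```python
# Properties from E10 that point at other humans.
RELATIVE_PROPS = ("P22", "P25", "P3373", "P26", "P40", "P1083")


def is_human(entity_id, graph):
    """Shallow check: the entity exists and is an instance of Q5."""
    return "Q5" in graph.get(entity_id, {}).get("P31", [])


def validate_shallow(entity_id, graph):
    """Check the focus entity and only do the shallow check on its relatives.

    Returns (passed, set of entities that had to be fetched).
    """
    fetched = {entity_id}
    passed = is_human(entity_id, graph)
    for prop in RELATIVE_PROPS:
        for relative in graph.get(entity_id, {}).get(prop, []):
            fetched.add(relative)
            passed = passed and is_human(relative, graph)
    return passed, fetched


def reachable_by_full_recursion(entity_id, graph):
    """Entities a faithful recursive validation would have to visit."""
    seen, stack = set(), [entity_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for prop in RELATIVE_PROPS:
            stack.extend(graph.get(current, {}).get(prop, []))
    return seen


# A four-generation chain of fathers: A -> B -> C -> D.
graph = {
    "A": {"P31": ["Q5"], "P22": ["B"]},
    "B": {"P31": ["Q5"], "P22": ["C"]},
    "C": {"P31": ["Q5"], "P22": ["D"]},
    "D": {"P31": ["Q5"]},
}
shallow_result = validate_shallow("A", graph)
full_visit = reachable_by_full_recursion("A", graph)
```

With the shallow check, the number of fetched entities is bounded by the focus item's direct relatives, whereas the recursive visit grows with the size of the connected family graph.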

On pyshex: this one works. You have given incorrect input in the sparql= parameter. —MisterSynergy (talk) 11:27, 9 January 2020 (UTC)
On Blaze: @Vladimir Alexiev: we've raised an issue (https://phabricator.wikimedia.org/T243595) to move validation to a Blaze instance so we're not spending 99% of our time waiting for SPARQL scheduling. – The preceding unsigned comment was added by EricP (talk • contribs) at 07:50, January 24, 2020 (UTC).


shex-simple

For the tool at

https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html

Sample link:

https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html?data=Endpoint:%20https://query.wikidata.org/sparql&hideData&manifest=[]&textMapIsSparqlQuery&schemaURL=%2F%2Fwww.wikidata.org%2Fwiki%2FSpecial%3AEntitySchemaText%2FE10

Is there a way to include the SPARQL query in the URL (to avoid having to paste it into the query field)? --- Jura 01:11, 10 February 2020 (UTC)

I think a URL parameter "shape-map" does the work here: https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html?data=Endpoint:%20https://query.wikidata.org/sparql&hideData&manifest=[]&textMapIsSparqlQuery&schemaURL=%2F%2Fwww.wikidata.org%2Fwiki%2FSpecial%3AEntitySchemaText%2FE10&shape-map=SELECT%20?item%20WHERE%20{%20?item%20wdt:P31%20wd:Q5%20}%20LIMIT%205 —MisterSynergy (talk) 07:11, 10 February 2020 (UTC)
Thanks @MisterSynergy:! Is there a way to autorun it? --- Jura 08:17, 10 February 2020 (UTC)
No idea. I no longer use the shex-simple tool, as its functionality seems very basic and other tools are available. At the moment I prefer the pyshexy API that is linked about two sections above this one. --MisterSynergy (talk) 08:39, 10 February 2020 (UTC)
Interesting. I will try to add both to Talk:Q4925477. --- Jura 08:58, 10 February 2020 (UTC)
No idea what your exact plan is, but for your convenience, User:Teester/CheckShex.js might also be worth a try... It adds an input field on item pages where you just need to provide an E-number for an evaluation of that item. --MisterSynergy (talk) 09:08, 10 February 2020 (UTC)
The idea is to provide a link to check the item against a schema. There are a few other approaches at "This item:" in the list. --- Jura 09:27, 10 February 2020 (UTC)
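
For anyone who wants to generate such links in bulk, here is a minimal Python sketch that rebuilds the shape-map link posted in this thread. The parameter names are copied from that link; the function name is made up, and the choice of which characters to leave unescaped is only there to reproduce the posted URL exactly. Whether the tool accepts other encodings, or has an autorun option, is not something this sketch claims.

```python
from urllib.parse import quote

BASE = ("https://tools.wmflabs.org/shex-simple/wikidata/packages/"
        "shex-webapp/doc/shex-simple.html")


def shex_simple_link(schema_id, sparql):
    """Build a shex-simple link for an EntitySchema ID and a SPARQL query.

    Hypothetical helper: it mirrors the query string of the link in this
    thread and nothing more.
    """
    schema_url = "//www.wikidata.org/wiki/Special:EntitySchemaText/" + schema_id
    return (
        BASE
        + "?data=" + quote("Endpoint: https://query.wikidata.org/sparql", safe=":/")
        + "&hideData&manifest=[]&textMapIsSparqlQuery"
        + "&schemaURL=" + quote(schema_url, safe="")
        + "&shape-map=" + quote(sparql, safe="?{}:")
    )


print(shex_simple_link("E10", "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 5"))
```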

ShExStatements

During Wiki Techstorm 2019 [1], we started exploring ways to simplify the creation of shape expressions. One possibility we explored was a tool similar to QuickStatements that takes CSV/tabular input and generates shape expressions.

ShExStatements is now released: https://github.com/johnsamuelwrites/ShExStatements

The main goal is to help newcomers write shape expressions. Users write a CSV file, and ShExStatements translates it into a ShEx file.

For example, a CSV file describing a language (with prefixes), https://github.com/johnsamuelwrites/ShExStatements/blob/master/examples/language.csv, is translated to a shape expression [2].

There are five columns. Column 1 specifies the node name, column 2 the property, column 3 one or more possible values, column 4 the cardinality (+, *), and column 5 comments.

Columns 3, 4 and 5 are empty for prefixes. For all other rows, columns 1, 2 and 3 are mandatory. Column 3 can be . (meaning any value).
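
To illustrate the five-column layout, here is a rough Python sketch of how rows of this kind could be turned into ShEx text. This is not the ShExStatements implementation and does not use its API or its exact input syntax: the function name, the use of already-split tuples instead of a CSV file, the comma as separator inside column 3, and the output layout are all assumptions made for the example.

```python
def rows_to_shex(rows):
    """Turn (node, property, values, cardinality, comment) rows into ShEx text.

    Hypothetical helper. Rows whose third column is empty are treated as
    prefix declarations: column 1 is the prefix name, column 2 the IRI.
    """
    prefixes = []
    shapes = {}
    for node, prop, values, cardinality, comment in rows:
        if not values:  # prefix row: columns 3, 4 and 5 are empty
            prefixes.append(f"PREFIX {node}: <{prop}>")
            continue
        if values == ".":  # "." means any value
            constraint = "."
        else:  # assumption: several allowed values are comma-separated
            constraint = "[ " + " ".join(v.strip() for v in values.split(",")) + " ]"
        line = f"  {prop} {constraint} {cardinality}".rstrip() + " ;"
        if comment:
            line += f" # {comment}"
        shapes.setdefault(node, []).append(line)

    out = list(prefixes)
    for node, lines in shapes.items():
        out.append(f"<{node}> {{")
        out.extend(lines)
        out.append("}")
    return "\n".join(out)


rows = [
    ("wd", "http://www.wikidata.org/entity/", "", "", ""),
    ("wdt", "http://www.wikidata.org/prop/direct/", "", "", ""),
    ("language", "wdt:P31", "wd:Q34770", "", "instance of a language"),
    ("language", "wdt:P1705", ".", "+", "native label"),
]
print(rows_to_shex(rows))
```

For the real input format and options, the project documentation in [5] is the authority.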

Examples related to Wikidata that were used to create the entity schemas E177, E178 and E179 can be found in [3], with some additional examples in [4].

For detailed documentation, please see [5].

Please let me know if you have any questions/remarks.


  1. https://medium.com/@jsamwrites/wiki-techstorm-2019-a996d69c60a5
  2. https://github.com/johnsamuelwrites/ShExStatements#quick-start
  3. https://github.com/johnsamuelwrites/ShExStatements/tree/master/examples/wikidata
  4. https://github.com/johnsamuelwrites/ShExStatements/tree/master/examples
  5. https://github.com/johnsamuelwrites/ShExStatements/blob/master/docs.md

Experimenting with Bioschemas at Scholia

For Scholia, we have begun to explore how to annotate entities using Bioschemas (Q93995803). You can see this in action at taxon profiles like toolforge:scholia/taxon/Q12024, whose HTML now includes the following:

/* BioSchemas annotation */
if (item.claims.P225) {
   try { /* Taxon */
      var taxonName = item.claims.P225[0].mainsnak.datavalue.value;
      // bioschemasAnnotation is not declared in this excerpt
      bioschemasAnnotation = {
         "@context" : "https://schema.org",
         "@type" : "Taxon",
         "name" : taxonName,
         "url" : "http://www.wikidata.org/entity/Q12024"
      };
      if (item.claims.P105) { // taxon rank
         var taxonRank = item.claims.P105[0].mainsnak.datavalue.value.id;
         bioschemasAnnotation.taxonRank = "http://www.wikidata.org/entity/" + taxonRank;
      }
      if (item.claims.P171) { // parent taxon
         var parent = item.claims.P171[0].mainsnak.datavalue.value.id;
         bioschemasAnnotation.parentTaxon = "http://www.wikidata.org/entity/" + parent;
      }
      $( '#bioschemas' ).append( JSON.stringify(bioschemasAnnotation) );
      // console.log(JSON.stringify(bioschemasAnnotation, "", 2))
   } catch(e) {} // errors are silently ignored
}

In the process, we have been wondering to what extent such Wikidata-generic annotations could be represented on Wikidata rather than hardcoded on the Scholia end. We invite your comments, here or via a currently open pull request for similar annotation of molecular entities. --Daniel Mietchen (talk) 20:24, 11 May 2020 (UTC)
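
As one possible starting point for that discussion, here is a minimal Python sketch of the same annotation driven by a small mapping table instead of hardcoded branches. It is not Scholia code: the function name, the mapping table and the sample claims are hypothetical, and the table only covers the two properties used in the snippet above. A table like this is the kind of thing that could, in principle, be stored on Wikidata rather than in the application.

```python
ENTITY_URI = "http://www.wikidata.org/entity/"

# Hypothetical mapping: item-valued Wikidata property -> Bioschemas key.
ITEM_VALUED_PROPERTIES = {
    "P105": "taxonRank",    # taxon rank
    "P171": "parentTaxon",  # parent taxon
}


def taxon_annotation(qid, claims):
    """Build a Taxon annotation from a Wikidata-style claims dictionary.

    Returns None when the item has no taxon name (P225).
    """
    if "P225" not in claims:
        return None
    annotation = {
        "@context": "https://schema.org",
        "@type": "Taxon",
        "name": claims["P225"][0]["mainsnak"]["datavalue"]["value"],
        "url": ENTITY_URI + qid,
    }
    for pid, key in ITEM_VALUED_PROPERTIES.items():
        if pid in claims:
            target = claims[pid][0]["mainsnak"]["datavalue"]["value"]["id"]
            annotation[key] = ENTITY_URI + target
    return annotation


# Made-up sample data in the shape of the Wikidata JSON claims structure.
sample_claims = {
    "P225": [{"mainsnak": {"datavalue": {"value": "Example taxon"}}}],
    "P105": [{"mainsnak": {"datavalue": {"value": {"id": "Q7432"}}}}],
}
print(taxon_annotation("Q12024", sample_claims))
```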

New subpage to document and explore subsets of Wikidata

as per Wikidata:WikiProject Schemas/Subsetting. --Daniel Mietchen (talk) 09:51, 4 June 2020 (UTC)

Date-conditional checks

Notified participants of WikiProject ShEx

Is it possible to have a check for election items that says something like "There should be a successful candidate (P991) unless there's a point in time (P585) that's in the future"? Or would we need to have two separate schemas for "future elections" and "past elections"? --Oravrattas (talk) 13:23, 8 July 2020 (UTC)

@Oravrattas: So far as I understand, the checking process will always report failure for missing fields. I think the solution would be to only have one schema, no checks for special conditions, but there could be a second process which runs either before or after the schema check to exclude items which have certain characteristics. Blue Rasberry (talk) 22:29, 1 August 2020 (UTC)
@Bluerasberry: I might be missing something about your suggestion, but I'm not sure how that would help. I don't want to skip all checks on future elections: I still want to ensure that those have a date, a jurisdiction, an office contested, candidates, the previous election, etc, but there are properties that will only have values once the election has passed, such as the number of votes cast, and the winning candidate. How can I validate that those do not have any value yet if the date of the election is in the future, but should if it's in the past? --Oravrattas (talk) 08:08, 2 August 2020 (UTC)
@Oravrattas: As far as I know, there is no easy option to compare dates. You could maybe use regular expressions for it. If you do so, you do not need multiple shape expressions, but you will need multiple shapes. You can combine shapes via "or" and "and", as in <#election> { ... } and (@<#election-past> or @<#election-future>). The shape for future elections would use a regular expression matching future dates, while the one for past elections would have successful candidate (P991) and point in time (P585), and might also have an expression for past dates so that no future election can already have an outcome. --CamelCaseNick (talk) 20:45, 2 August 2020 (UTC)
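
If the date comparison ends up in a separate process that runs alongside the schema check, as suggested earlier in this thread, the rule itself is short. Here is a minimal Python sketch; the function name and arguments are hypothetical, and it assumes the election date and the presence of successful candidate (P991) have already been extracted from the item.

```python
from datetime import datetime, timezone


def election_outcome_ok(point_in_time, has_successful_candidate, now=None):
    """Check the date-conditional rule for one election item.

    Future elections must not have a successful candidate yet;
    past elections must have one.
    """
    now = now or datetime.now(timezone.utc)
    if point_in_time > now:
        return not has_successful_candidate
    return has_successful_candidate


# Fixed reference date so the example is reproducible.
reference = datetime(2020, 8, 2, tzinfo=timezone.utc)
future_election = datetime(2020, 11, 3, tzinfo=timezone.utc)
past_election = datetime(2016, 11, 8, tzinfo=timezone.utc)

print(election_outcome_ok(future_election, False, reference))
print(election_outcome_ok(past_election, False, reference))
```

All other requirements (date, jurisdiction, office contested, candidates and so on) would still be checked by the schema itself.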

EntitySchema labels

EntitySchema labels don't seem to be working for me - do they need to be updated to a new format? Sj (talk) 18:56, 30 July 2020 (UTC)

Early test schemas - still in use? What happens when the incrementer gets to 733?

There are a few schemas with odd numbers: E734 (family name), E735 (given name), E999 (Borked), E11424 (film). Are these actually in use? What was the mechanism for generating those EIDs, do we want to keep it, and will the ID incrementing pass over them smoothly? Sj (talk) 18:59, 30 July 2020 (UTC)

@Sj: I see - it looks like at some point there were 2+ processes for assigning item numbers, and now hopefully there should only be the automated one. You are asking what happens when the current canonical system counts up to the item numbers which earlier systems assigned.
So far as I know, the current practice is keeping all schemas, even test schemas, regardless of whether anyone uses them. I expect that the current desired practice is for the number assigning process to skip any existing numbers, not write over them. Eventually I suppose we should have inclusion or notability criteria for schemas, or otherwise, anyone could automatically generate countless schemas never to be used. Blue Rasberry (talk) 22:26, 1 August 2020 (UTC)
Thank you kindly. Sj (talk) 15:24, 17 August 2020 (UTC)

Creating schemas for basic concepts

(Reposting here:)

The entity schemas E3 (Wikidata Item), E5 (Statement), E6 (Language mappings), E7 (Citation), E8 (External RDF), and E9 (Wikidata-Wikibase) are all blank.


  • There should be something there, even if it is all comments and optional elements.
  • Is there a quick way to find other blank entity schemas?
  • Was there discussion about this when schemas were being created for the first time?

Sj (talk) 15:28, 17 August 2020 (UTC)

Help in creating Schema for Pokémon species

Hi, I don't know how to create a correct Schema for a Pokémon species. Could someone help me? Item QYYY which defines a Pokémon species must have:

Thank you very much to anyone who can help me! --★ → Airon 90 13:01, 5 October 2020 (UTC)

@Airon90: I see that you are active at Wikidata:WikiProject Pokémon, and presumably if you learned this then you would bring the practice back to that WikiProject. I am still learning this myself and I do not know how to help, but I wanted to thank you both for asking the question and doing documentation at that other WikiProject. Pokémon are very important for the history of Wikipedia as the origin of English Wikipedia's notability policy. There is a lot of interest in good Pokémon content everywhere, so I think we should get this right. Blue Rasberry (talk) 14:28, 18 November 2020 (UTC)

Best way to browse schemas?

Excuse me if I am missing this. How can I browse schemas?

I want to see schemas for instance of (P31) = human (Q5). Among other things, I am hoping to identify the most common properties among schemas, but I also would like to be able to browse individual schemas. I do not see how to search for schemas around a given theme. Thanks. Blue Rasberry (talk) 14:25, 18 November 2020 (UTC)

The answer is User:HakanIST/EntitySchemaList
Right now there are only 264 schemas. The reason I could not find many is that hardly any exist. All of this is still new. Blue Rasberry (talk) 21:14, 18 November 2020 (UTC)
@Bluerasberry:
Yes, it is still quite new and a bit too technical to be widely adopted yet.
That said, you can just use Special:Search and ask for results in the EntitySchema namespace only.
Cdlt, VIGNERON (talk) 14:33, 22 November 2020 (UTC)
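
The same namespace-restricted search can also be scripted through the MediaWiki search API. Here is a minimal Python sketch that only builds the request URL. The function name is made up, and the namespace number 640 for EntitySchema is an assumption that should be checked against the namespace list on Wikidata before relying on it.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"
ENTITY_SCHEMA_NAMESPACE = 640  # assumption: EntitySchema namespace number


def schema_search_url(term, limit=10):
    """Build a MediaWiki API URL that searches only the EntitySchema namespace."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srnamespace": ENTITY_SCHEMA_NAMESPACE,
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(schema_search_url("human"))
```

Fetching that URL should return a JSON list of matching pages, which could then be opened one by one for browsing.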