Wikidata:Requests for comment/Wikidata and schema.org are incompatible

An editor has requested the community to provide input on "Wikidata and schema.org are incompatible" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.

If you have an opinion regarding this issue, feel free to comment below. Thank you!

I recently stumbled upon an issue which I want to explain here. I am looking for consensus on how it should be solved, whether it is solvable, and whether it should be solved at all.

I chose the above title to catch your attention: the more precise title would be:

The way the internet uses schema.org is incompatible with the community consensus on Wikidata

I am working on a tool that tries to suggest statements for an entity. It does that based on structured or machine-readable data found elsewhere on the internet. A while ago somebody suggested to me that it should parse JSON-LD, which is a great idea!

So one of my approaches was the following: we can determine the class of an object by

  1. looking at its class in schema.org
  2. finding a Wikidata item with an equivalent class (P1709)
  3. setting instance of (P31) to that equivalent class.
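The three steps above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the `EQUIVALENT_CLASS` table and the `suggest_instance_of` function are hypothetical names, and the table stands in for a real lookup of equivalent class (P1709) values on Wikidata.

```python
# Hypothetical lookup table standing in for a query on equivalent class (P1709).
# The single entry mirrors the example discussed in this RFC; it is an assumption.
EQUIVALENT_CLASS = {
    "http://schema.org/Person": "Q215627",  # person
}


def suggest_instance_of(subject_qid, schema_class_iri, table=EQUIVALENT_CLASS):
    """Return a suggested (subject, property, value) triple, or None."""
    # Step 1 is assumed done: schema_class_iri is the class found in schema.org.
    # Step 2: find the Wikidata item whose P1709 value matches that IRI.
    class_qid = table.get(schema_class_iri)
    if class_qid is None:
        return None  # no known equivalent class, so suggest nothing
    # Step 3: propose instance of (P31) with the equivalent class.
    return (subject_qid, "P31", class_qid)


print(suggest_instance_of("Q76", "http://schema.org/Person"))
```

The function only returns a suggestion; whether the statement is actually added remains the user's decision.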

For instance:

The page podchaser.com/creators/barack-obama-107Zzj88OJ includes the following code:

{
    "@context": "http://schema.org",
    "@type": "Person",
    "additionalName": "Barack",
    "birthDate": "1961-08-04 00:00:00",
    "name": "Barack Obama",
    "description": "Host on Renegades: Born in the USA, Guest on The Breakfast Club, Fresh Air, and The Michelle Obama Podcast, and Archive Recording on Making. Barack Hussein Obama II is an American politician who served as the 44th President of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African American to be elected to the presidency and previously served as a United S…",
    "url": "https://www.podchaser.com/creators/barack-obama-107Zzj88OJ"
}

So, parsing the first two lines, I come to the conclusion that Barack Obama (Q76) should have instance of (P31) set to the item whose equivalent class (P1709) is "http://schema.org/Person".
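Reading the class from the `@context` and `@type` keys could look something like the sketch below. It makes a simplifying assumption: the context is a plain string and the class IRI is formed by joining it with the type name. A full JSON-LD processor resolves contexts in a more general way, and the function name `schema_class_iris` is made up for this example.

```python
import json


def schema_class_iris(jsonld_text):
    """Return the class IRIs named by @context and @type in a JSON-LD object."""
    doc = json.loads(jsonld_text)
    context = doc.get("@context")
    types = doc.get("@type", [])
    if not isinstance(context, str):
        return []  # object or list contexts are not handled in this sketch
    if isinstance(types, str):
        types = [types]  # @type may be a single string or a list of strings
    base = context.rstrip("/")
    return [f"{base}/{t}" for t in types]


example = '{"@context": "http://schema.org", "@type": "Person", "name": "Barack Obama"}'
print(schema_class_iris(example))
```

The returned IRI is what would then be compared against equivalent class (P1709) values.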

The extension – of course – does nothing automatically without the user performing an action. So I didn't look into it but the statement…

Barack Obama (Q76) instance of (P31) person (Q215627)

…intuitively made sense to me. I think it is actually correct. At least podchaser.com states that it is the case.

I was kind of aware that instance of (P31) human (Q5) is more common. What I didn't know is that The community™ decided somewhere, ages ago, that existing non-fictional people should be instance of (P31) human (Q5) rather than instance of (P31) person (Q215627), even if both statements are correct. I am not challenging this decision.

To solve this issue I moved the equivalent class statement to human (Q5).

Now thinking about it, I think this move was wrong and it should be reverted because:

A person (alive, dead, undead, or fictional).

https://schema.org/Person

This sounds more like person (Q215627) than human (Q5). Unfortunately, these two classes aren't even related; they just weirdly overlap.

Venn diagram Q5 and Q215627

Solutions

Use both classes in Wikidata

The vast majority of instances of person (Q215627) are also instances of human (Q5). Adding both classes to all instances reflects the situation best, but is highly impractical.

https://schema.org/Human

We could propose to add human (Q5) as a schema.org schema and convince the rest of the internet to use it where appropriate. It would be a subclass of https://schema.org/Thing, not https://schema.org/Person. This is unrealistic.

Screw compatibility!

Being compatible is more trouble than it is worth. Contributors have to review every statement added to Wikidata anyway. Third-party tools like this one should only make suggestions and assume the user knows the consensus reached by the community.

Thoughts?