User:Peter F. Patel-Schneider/disjoint

From Wikidata

Improving Support for Disjointness in Wikidata

I am proposing a new way of providing disjointness in Wikidata. Please provide comments on whether there should be more support for disjointness in Wikidata and whether this proposal is reasonable.

Motivation

Disjointness can help discover incorrect information in Wikidata. For example, meat scientist (Q6804279) was a subclass of meat (Q10990) for quite some time (see https://www.wikidata.org/w/index.php?title=Q6804279&oldid=2022326135 ). This could have been flagged had meat (Q10990) and scientist (Q901) been stated to be disjoint, but nothing currently in Wikidata supports stating this disjointness.

Background

There have been two proposals to add a disjointness relationship directly between two classes: https://www.wikidata.org/wiki/Wikidata:Property_proposal/Archive/20#disjoint_with and https://www.wikidata.org/wiki/Wikidata:Property_proposal/disjoint_with. Neither proposal was adopted, the later one partly because disjoint union of (P2738) was considered superior. A major problem is that "disjoint with" would be a binary relationship, and there are many situations with large groupings of mutually disjoint classes, where a group of n classes would require O(n^2) "disjoint with" statements.
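
To make the quadratic growth concrete, here is a small sketch that enumerates the statements a binary "disjoint with" property would need for a group of classes. It assumes one statement per unordered pair; the function name is made up for illustration and the data is hard-coded, not fetched from Wikidata.

```python
from itertools import combinations

def pairwise_disjoint_statements(classes):
    """Return every unordered pair a binary 'disjoint with' property would need."""
    return list(combinations(sorted(classes), 2))

# human, dog, house cat
group = ["Q5", "Q144", "Q146"]
pairs = pairwise_disjoint_statements(group)
print(len(pairs))  # n * (n - 1) / 2 pairs for n classes

# The count grows quadratically: compare a few group sizes.
for n in (10, 100, 1000):
    print(n, n * (n - 1) // 2)
```

A single grouping of the same classes would instead need only n membership statements.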

But being able to use disjoint union of (P2738) depends on the existence of a class that is in fact a disjoint union, and there may not be one, as is the case for human (Q5) and dog (Q144).

Proposals

One way of providing disjointness groupings is to provide a facility for stating that the direct subclasses of a class are all mutually disjoint. This method is very problematic, however, because there is no good way of preventing new subclasses from being created, for example adding a subclass for North American mammals to a class whose subclasses were all previously mammal species.

Another way is to allow stating directly that a group of classes are all mutually disjoint, as can be done in OWL with DisjointClasses (see https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#Disjoint_Classes ). Doing this directly would require a way of flagging a Wikidata item as a disjointness grouping, perhaps by making it an instance of a special class, with the grouped classes given as the values of one of its properties, perhaps called "disjoint classes". For example, one could create such an item with "disjoint classes" values of human (Q5), dog (Q144), house cat (Q146), and other classes of mammals. mammal (Q110551885) itself could be in another grouping stating disjointness between different classes of animals, which keeps the individual disjointness groupings small.
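
As a rough illustration of how such groupings could be used for checking, the sketch below flags any class that falls under two or more members of the same grouping via subclass of. Everything in it is hypothetical: the data is in-memory rather than read from Wikidata, the grouping name "grouping-1" is invented, and it assumes for the sake of the example that meat scientist (Q6804279) was a subclass of both meat (Q10990) and scientist (Q901).

```python
# Hypothetical in-memory data; nothing here is fetched from Wikidata.
subclass_of = {
    "Q6804279": {"Q10990", "Q901"},  # meat scientist -> meat, scientist (assumed erroneous state)
}

# A hypothetical grouping item and the values of its "disjoint classes" property.
disjoint_groupings = {
    "grouping-1": {"Q10990", "Q901"},
}

def ancestors(cls, subclass_of):
    """All classes reachable from cls via subclass of, including cls itself."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def violations(subclass_of, disjoint_groupings):
    """List (class, grouping, members) where a class falls under two or more members of one grouping."""
    found = []
    for cls in sorted(subclass_of):
        up = ancestors(cls, subclass_of)
        for name, members in sorted(disjoint_groupings.items()):
            hit = sorted(up & members)
            if len(hit) >= 2:
                found.append((cls, name, hit))
    return found

for cls, name, hit in violations(subclass_of, disjoint_groupings):
    print(f"{cls} is under disjoint classes {hit} (grouping {name})")
```

A hit here means the class could have no instances if the grouping is correct, so it is a candidate for review rather than an automatic fix.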

A refinement of this method would state that the instances of a class are all mutually disjoint classes. With this refinement disjointness groupings can be created with the addition of only one fact for any class whose instances are in fact mutually disjoint.

Is it reasonable to add this new way of stating disjointness of classes? It does add a class with special characteristics, but there are already several of these classes in Wikidata, including entity (Q35120), class (Q16889133), metaclass (Q19478619), and first-order class (Q104086571).

Example of final refinement of the proposal

disjoint mammalian classes (QC1) instance of (P31) mutually disjoint instances (QM1)
human (Q5) instance of (P31) disjoint mammalian classes (QC1) 
dog (Q144) instance of (P31) disjoint mammalian classes (QC1) 
house cat (Q146) instance of (P31) disjoint mammalian classes (QC1)
...
disjoint animal classes (QC2) instance of (P31) mutually disjoint instances (QM1)
mammal (Q110551885) instance of (P31) disjoint animal classes (QC2)
Reptilia (Q10811) instance of (P31) disjoint animal classes (QC2)
...
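
The statements above could be turned into disjointness groupings along the following lines. This is only a sketch: it works on a hard-coded list of instance of (P31) statements, uses the placeholder identifiers QC1, QC2 and QM1 from the example, and the function name is invented for illustration.

```python
# (subject, object) pairs for instance of (P31).
# QC1, QC2 and QM1 are the placeholder IDs from the example, not real items.
instance_of = [
    ("QC1", "QM1"),
    ("Q5", "QC1"),
    ("Q144", "QC1"),
    ("Q146", "QC1"),
    ("QC2", "QM1"),
    ("Q110551885", "QC2"),
    ("Q10811", "QC2"),
]

def disjointness_groupings(instance_of, marker="QM1"):
    """Map each class that is an instance of the marker class to the set of its own instances."""
    groupings = {s: set() for s, o in instance_of if o == marker}
    for s, o in instance_of:
        if o in groupings:
            groupings[o].add(s)
    return groupings

groupings = disjointness_groupings(instance_of)
for name in sorted(groupings):
    print(name, sorted(groupings[name]))
```

Each grouping costs one extra statement (the instance of link to the marker class) beyond the membership statements themselves, which matches the "only one fact" point made for this refinement.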