Wikidata:Property proposal/property for implied values

value hierarchy property


   Under discussion
Description: property which specifies less precise items than the indicated value for which statements using the subject property would still be true
Data type: Property
Domain: Wikidata property (Q18616576)
Allowed values: transitive property (Q18647515)
Example 1: instance of (P31) → subclass of (P279)
Example 2: headquarters location (P159) → located in the administrative territorial entity (P131)
Example 3: occupation (P106) → subclass of (P279)
Example 4: afflicts (P689) → anatomical location (P927)
Planned use: add to a bunch of properties and use in tools, see below

Note: we have considered various labels for this property: "property for implied values", "transitive over", "refinement hierarchy" and currently "value hierarchy property". All these variations are used in the discussion below. − Pintoch (talk) 21:40, 15 January 2019 (UTC)

Motivation

I would be interested in expressing that a given property (such as subclass of (P279)) is used to determine how values of another property (instance of (P31)) can be refined. This is a pattern that we use at various places in Wikidata:

Potentially this property could be multi-valued if there are multiple refinement hierarchies in place (but I can't think of an example right now). I am not very happy with the label so I hope others have better ideas.

The idea behind introducing such a property would be to make it easier for data import tools (such as OpenRefine) to respect these hierarchies natively (for instance, when adding headquarters location (P159)  Brussels (Q239) to an item, any unsourced headquarters location (P159)  Brussels-Capital Region (Q240) statement could be safely deleted, I think).
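The import scenario described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `VALUE_HIERARCHY` mapping stands in for the proposed property, `EDGES` is toy data built from the Brussels example (the extra step up to Belgium (Q31) is added for illustration), and the function names are hypothetical rather than part of OpenRefine or any existing tool.

```python
from collections import deque

# Hypothetical stand-in for the proposed property:
# property -> its value hierarchy property.
VALUE_HIERARCHY = {"P159": "P131"}

# Toy graph: (item, hierarchy property) -> directly broader items.
EDGES = {
    ("Q239", "P131"): ["Q240"],  # Brussels -> Brussels-Capital Region
    ("Q240", "P131"): ["Q31"],   # Brussels-Capital Region -> Belgium (illustrative)
}


def broader_values(item, hierarchy_prop, edges):
    """Return every item reachable from `item` by following `hierarchy_prop`."""
    seen, queue = set(), deque([item])
    while queue:
        current = queue.popleft()
        for parent in edges.get((current, hierarchy_prop), []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def redundant_statements(prop, new_value, existing, edges=EDGES):
    """Return the unsourced existing values that `new_value` already implies.

    `existing` is a list of (value, has_reference) pairs for `prop`.
    """
    hierarchy_prop = VALUE_HIERARCHY.get(prop)
    if hierarchy_prop is None:
        return []  # no hierarchy declared: nothing can be inferred
    implied = broader_values(new_value, hierarchy_prop, edges)
    return [value for value, sourced in existing
            if value in implied and not sourced]
```

Sourced statements are deliberately left alone, matching the "unsourced" condition in the paragraph above; whether deletion is actually safe remains a judgment call for the import tool.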

This could also potentially improve the constraint system: for instance, if we have an item requires statement constraint (Q21503247) which requires headquarters location (P159)  Paris (Q90), then the constraint system could also accept headquarters location (P159)  7th arrondissement of Paris (Q259463) (because it is more precise). As far as I am aware this behaviour is only available for instance of (P31)/subclass of (P279) via type constraint (Q21503250) currently. @Lucas Werkmeister (WMDE): I have no idea if it is feasible (and in which case you could prefer another syntax for this information)? − Pintoch (talk) 15:09, 8 January 2019 (UTC)
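As a rough sketch of the constraint behaviour suggested here, the check below accepts a value when it equals the required one or refines it through the hierarchy. It is not how the Wikibase constraint system works; `satisfies_required_value` and `P131_PARENTS` are hypothetical names, and the single edge is taken from the Paris example.

```python
def satisfies_required_value(actual, required, parents):
    """Return True if `actual` equals `required` or is more precise than it.

    `parents` maps an item to its directly broader items along the
    value hierarchy property (here assumed to be P131).
    """
    seen = set()
    frontier = [actual]
    while frontier:
        current = frontier.pop()
        if current == required:
            return True
        if current in seen:
            continue  # guard against cycles in the data
        seen.add(current)
        frontier.extend(parents.get(current, []))
    return False


# Toy data: 7th arrondissement of Paris (Q259463) -> Paris (Q90).
P131_PARENTS = {"Q259463": ["Q90"]}
```

With this data, a requirement of Paris (Q90) is met by 7th arrondissement of Paris (Q259463), but not the other way round.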

--Micru (talk) 21:46, 24 August 2014 (UTC) Tobias1984 (talk) TomT0m (talk) Genewiki123 (talk) Emw (talk) 03:09, 9 September 2014 (UTC) —Ruud 16:15, 9 December 2014 (UTC) Emitraka (talk) 14:32, 14 October 2015 (UTC) Bovlb (talk) 19:10, 21 October 2015 (UTC) Peter F. Patel-Schneider (talk) 22:21, 23 October 2015 (UTC) ArthurPSmith (talk) 15:51, 5 November 2015 (UTC) --Daniel Mietchen (talk) 20:53, 3 January 2016 (UTC) --Harmonia Amanda (talk) 22:00, 27 February 2016 (UTC) --Lechatpito (talk) --Andrawaag (talk) 14:42, 13 April 2016 (UTC) --ChristianKl (talk) 16:22, 6 July 2016 (UTC) --Cmungall Cmungall (talk) 13:49, 8 July 2016 (UTC) Cord Wiljes (talk) 16:53, 28 September 2016 (UTC) DavRosen (talk) 23:07, 15 February 2017 (UTC) Vladimir Alexiev (talk) 07:01, 24 February 2017 (UTC) Pintoch (talk) 22:42, 5 March 2017 (UTC) Fuzheado (talk) 14:43, 15 May 2017 (UTC) YULdigitalpreservation (talk) 14:37, 14 June 2017 (UTC) PKM (talk) 00:24, 17 June 2017 (UTC) Fractaler (talk) 14:42, 17 June 2017 (UTC) Andreasmperu Diana de la Iglesia Jsamwrites (talk) Finn Årup Nielsen (fnielsen) (talk) 12:39, 24 August 2017 (UTC) Alessandro Piscopo (talk) 17:02, 4 September 2017 (UTC) Ptolusque (.-- .. -.- ..) 01:47, 14 September 2017 (UTC) Gamaliel (talk) --Horcrux92 (talk) 11:19, 12 November 2017 (UTC) MartinPoulter (talk) Bamyers99 (talk) 16:47, 18 March 2018 (UTC) Malore (talk) Wurstbruch (talk) 22:59, 4 April 2018 (UTC) Dcflyer (talk) 07:50, 9 September 2018 (UTC) Ettorerizza (talk) 11:00, 26 September 2018 (UTC) Ninokeys (talk) 00:05, 5 October 2018 (UTC) Buccalon (talk) 14:08, 10 October 2018 (UTC) Jneubert (talk) 06:02, 21 October 2018 (UTC) Yair rand (talk) 00:16, 24 October 2018 (UTC) Tris T7 (talk) ElanHR (talk) 22:05, 26 December 2018 (UTC) linuxo

Notified participants of WikiProject Ontology

@Jean-Frédéric, Olivier LPB: you have requested this feature at Help_talk:Property_constraints_portal/Item − Pintoch (talk) 15:14, 8 January 2019 (UTC)

Discussion

  • Support. Would a label something like "property over which this is transitive" make sense? But yes, this seems like a very useful abstraction to provide a structured mechanism for. ArthurPSmith (talk) 19:36, 8 January 2019 (UTC)
  • This can get complicated. occupation (P106) is only transitive over subclass of (P279) as long as the chain stays inside the occupation class tree. --Yair rand (talk) 23:45, 8 January 2019 (UTC)
    • Good point. But values outside the occupation class tree are already forbidden by a value type constraint (Q21510865), so maybe we can say that the implication only holds if the broader value is acceptable with respect to other constraints? Or we could indicate the containing class with a qualifier? Anyway, for both applications mentioned (data import and constraint violations), this should not be an issue. − Pintoch (talk) 06:57, 9 January 2019 (UTC)
  • Support David (talk) 12:53, 9 January 2019 (UTC)
  • Comment New label ("transitive over") works for me! ArthurPSmith (talk) 16:28, 10 January 2019 (UTC)
    • @ArthurPSmith: given MisterSynergy's reaction below, I have the feeling that this label is too close to "transitive" and might cause confusion for others. Maybe something like "refinement hierarchy" would be better - it would also rule out values like sibling (P3373) which are symmetric properties. − Pintoch (talk)
  • Wait How do other ontologies handle this? Where is the prior art, especially when it comes to naming this relationship? ChristianKl 18:23, 10 January 2019 (UTC)
  • @ChristianKl: yes indeed surely this has been formalized in other places. I will investigate. − Pintoch (talk) 20:12, 10 January 2019 (UTC)
  • @ChristianKl: I have asked around and it does not seem to have a particular name, it is just a particular case of an OWL construct: property chains. People in graph theory or other fields of math might have names for that particular relation, but I am not sure how to look for it (and it is likely to be as obscure as any other name we come up with, TBH). − Pintoch (talk)

ChristianKl
ArthurPSmith
d1g
JakobVoss
Jura
Jsamwrites
MisterSynergy
Salgo60
Micru
Pintoch
Harshrathod50


Notified participants of WikiProject Properties

John C. Calhoun (Q207191) position held (P39)  Vice President of the United States (Q11699)
Vice President of the United States (Q11699) part of (P361)  Federal Government of the United States (Q48525)
part of (P361) instance of (P31)  transitive property (Q18647515)
However, it would be wrong to say that John C. Calhoun (Q207191) position held (P39)  Federal Government of the United States (Q48525).
So, position held (P39) is transitive over subclass of (P279) but not over part of (P361) (and both subclass of (P279) and part of (P361) are transitive).
So this notion of "X transitive over Y" is distinct from "Y transitive". Do you agree? Maybe the new label is causing confusion? − Pintoch (talk) 20:12, 10 January 2019 (UTC)
If you prefer symbolic statements over prose:
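  • Q is transitive when $\forall x, y, z:\ (x\ Q\ y \land y\ Q\ z) \Rightarrow x\ Q\ z$
  • P is transitive over Q when $\forall x, y, z:\ (x\ P\ y \land y\ Q\ z) \Rightarrow x\ P\ z$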
Pintoch (talk) 20:58, 10 January 2019 (UTC)
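The two notions can also be checked mechanically on a finite set of statements. The sketch below assumes statements are plain (subject, property, value) triples and tests the rule "x P y and y Q z implies x P z"; ordinary transitivity is the special case P = Q. The function names are hypothetical, and the check only tells whether the stored triples are closed under the rule, not whether the rule is true in the real world.

```python
def is_transitive_over(triples, p, q):
    """True if, within `triples`, x p y and y q z always come with x p z."""
    facts = set(triples)
    for (x, prop1, y) in facts:
        if prop1 != p:
            continue
        for (y2, prop2, z) in facts:
            if prop2 == q and y2 == y and (x, p, z) not in facts:
                return False  # found x p y, y q z without x p z
    return True


def is_transitive(triples, q):
    """Ordinary transitivity is the special case p == q."""
    return is_transitive_over(triples, q, q)


# The John C. Calhoun example from above, as toy data.
TRIPLES = {
    ("Q207191", "P39", "Q11699"),   # Calhoun, position held, Vice President
    ("Q11699", "P361", "Q48525"),   # Vice President, part of, Federal Government
}
```

On this data `is_transitive(TRIPLES, "P361")` holds, while `is_transitive_over(TRIPLES, "P39", "P361")` does not, which mirrors the distinction between "Y transitive" and "X transitive over Y".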
Another counterexample found by sparqling around: sibling (P3373) is transitive. child (P40) is transitive over sibling (P3373) (if X has child Y and Y has sibling Z, then X has child Z), but mother (P25) is not transitive over sibling (P3373) (if X has mother Y and Y has sibling Z, then surely Z is not X's mother!). − Pintoch (talk) 23:47, 10 January 2019 (UTC)
@Pintoch: I don't think so. Halfsiblings frequently are siblings. ChristianKl 07:16, 11 January 2019 (UTC)
@ChristianKl: yes that is true − this example is not something we would want to have anyway for the two applications mentioned above (because sibling (P3373) is symmetric, so it cannot be seen as a refinement hierarchy like the other examples). I just hope it helps MisterSynergy grasp the difference with transitivity a bit better (in particular, the fact that mother (P25) is not transitive over sibling (P3373) is still a counterexample of his claim, I think). − Pintoch (talk) 07:52, 11 January 2019 (UTC)
Yeah, I understand. However, this is the same problem that User:Yair_rand mentioned above and which you acknowledged, just in a much more obvious sense. Transitivity of a property Y (e.g. part of (P361)) used in value items of property X (e.g. position held (P39)) holds only as long as you stay within the range of X (what we call a value type constraint (Q21510865) at Wikidata). It works for a couple of steps for occupations and then breaks down, but barely for parthood relations, which are typically very messy at Wikidata. It works practically always for the proposed headquarters location (P159)/located in the administrative territorial entity (P131), as the range of the latter is completely contained within the range of the former; the problem does not really matter for the proposed instance of (P31)/subclass of (P279), as instance of (P31) does not have a defined range.
There is another, independent problem with parthood relations in particular, in contrast to other transitive properties (mereology (Q1194916) is the research field that studies parthood relations). For parthood relations transitivity is typically assumed within mereology, but there is research in the community about whether this is generally valid or not (see here). Consider for instance these (fictional) claims:
Although both claims would be acceptable for Wikidata and part of (P361) is transitive, you clearly would not infer that < Pintoch's big toe > part of (P361) < Wikidata admin corps >. The problem is that parthood relations are only transitive under certain circumstances, for example in a local meronomy (such as “Pintoch's body parts” or “Wikimedia project entities”). However, parthood relations are not generally transitive.
I still do not think that this proposed property should be created. The range problem raised by Yair_rand would have to be addressed with extra efforts based on existing value type constraint (Q21510865) claims on properties anyway (so the proposed property does not add new value), and the parthood problem would rather be solved by removing the transitivity from part of (P361)/has part (P527). (I still have to have a look at the sibling (P3373) problem, though.) —MisterSynergy (talk) 10:30, 11 January 2019 (UTC)
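The range restriction discussed in this comment (and raised earlier by Yair rand) could be sketched as a walk up the hierarchy that stops as soon as a value leaves the allowed range. Everything here is an assumption for illustration: `allowed` stands in for whatever a value type constraint (Q21510865) would permit, and the labels in `PARENTS` are toy data, not the actual Wikidata modelling of occupations.

```python
def broader_values_in_range(value, parents, allowed):
    """Collect broader values of `value`, never stepping outside `allowed`."""
    result, stack = [], [value]
    seen = {value}
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            # Skip anything outside the range; its ancestors are not visited.
            if parent in allowed and parent not in seen:
                seen.add(parent)
                result.append(parent)
                stack.append(parent)
    return result


# Toy subclass chain; only the first three entries count as occupations.
PARENTS = {
    "physicist": ["scientist"],
    "scientist": ["researcher"],
    "researcher": ["person"],
    "person": ["organism"],
}
ALLOWED = {"physicist", "scientist", "researcher"}
```

Here the walk from "physicist" yields "scientist" and "researcher" and then stops, because "person" lies outside the assumed range.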
@MisterSynergy: you rightly point out that transitivity itself is often messy. I would argue that this is not just true for part of (P361): very often, long chains of subclass of (P279) also give nonsensical results when we follow them high up into very abstract concepts. That's inevitable: we are building a knowledge graph, not a math textbook.
Let's step back and look at the motivation of the proposal. What I mean with this proposal is that many properties tend to have one designated property for their refinement hierarchy. It seems to me that this pattern is pretty pervasive, but we tend to only acknowledge this for the instance of (P31)/subclass of (P279) pair. Why?
It would be massively useful if the constraint system could handle other such hierarchies. Typically when working with human (Q5), the fact that we discourage subclasses of human (Q5) makes the type system useless there. Have you ever wanted to enforce things like "every item with an ORCID iD (P496) should have occupation (P106) researcher (Q1650915), or any subclass of it"? Now, I could propose to add a parameter to the constraints definition so that we could provide subclass of (P279) there, but I think this would be misguided as this information is really determined by the property used for the required statement. So I think this should be stored in a generic way, outside the constraint system. Now I would very much welcome any other suggestion about how to represent this. − Pintoch (talk) 10:54, 11 January 2019 (UTC)
Also (sorry for the long reply!), it seems to me that there might still be a conceptual misunderstanding here: in the abstract, the fact that a property Q is transitive really does not mean that "x P y Q z" implies "x P z" for any property P. This is wrong both in mathematics and in ontology design. So there is nothing to "fix" about part of (P361) being transitive in this regard. It turns out that most of the transitive properties we have in Wikidata are containment relations, where this tends to hold often, but that does not have to be the case. For instance, imagine we had a property called "hates". The fact that "MisterSynergy hates cauliflower (Q23900272)" really does not imply that "MisterSynergy hates vegetable (Q11004)", even if the chain of subclass of (P279) is totally correct. − Pintoch (talk) 11:41, 11 January 2019 (UTC)
Now Neutral, as I do not want to stand in the way too much. I changed my mind after a second read of this proposal and its comments. Transitivity no longer seems to be the critical aspect of this property, and I really hope that this term does not appear in the potential property label. It is also worth mentioning that it may be useful for refinements, i.e. looking *down* along a hierarchy for potentially better/more specific values, which pretty much avoids the discussed problems that arise when going *up* a hierarchy towards more general values. —MisterSynergy (talk) 11:53, 13 January 2019 (UTC)
Yes, it could be interesting to have something for implications which are going down a hierarchy. I cannot think of an example though (beyond my "hates" example above, maybe). − Pintoch (talk) 15:21, 14 January 2019 (UTC)
I'm not sure if the above can of worms is what ChristianKl intended to open, but the question of how other ontologies deal with this again seems relevant. I remember some peculiarities with transitivity in SKOS - there are both the relations "broader" and "broaderTransitive", and "broaderTransitive" is declared as a superproperty of "broader" (and note this does NOT make "broader" transitive, despite the fact that "superproperty" like "superclass" is itself transitive). There's a discussion of some of this (in a more general case) here. It all kind of hurts the brain a bit... ArthurPSmith (talk) 14:06, 11 January 2019 (UTC)
@ArthurPSmith: I don't think that properties like this should be trivially created. We should generally follow existing standards and only invent our own if we have good reasons to reinvent the wheel. ChristianKl 10:46, 13 January 2019 (UTC)
@ChristianKl: I totally agree. That being said, it seems to me that there is no established RDF predicate for this in the usual namespaces. The reason for that is that this information is best expressed in OWL, not in RDF directly. Wikidata has not engaged much with OWL so far - the project has followed a different route by storing ontological information in the same data model as the data itself. For instance we have inverse of (P1696), and we use statements such as instance of (P31)  transitive property (Q18647515), which intrinsically don't do anything (Wikibase does not give them any special meaning, the query service does not infer any triples from them, and so on). Storing property constraints as statements on the property is another example. So this is why I proposed this property: it seems to be in line with the current customs of storing ontological information as statements on properties. That being said, it would obviously be much better if we had generic support for constructs like property chains, which would be much more expressive. But that is a much larger project, and we might want to think twice before construing these concepts in the Wikibase data model (maybe this is something we want to store elsewhere?) − Pintoch (talk) 11:03, 13 January 2019 (UTC)
  • Comment Ok, it sounds like we need to settle on a non-confusing label (and if we can find a label somebody else has used, all the better). I was looking at a sample of the item-valued properties we have; for many of them this whole discussion simply does not apply (there's no "refinement hierarchy" or transitivity that would be associated with person-valued properties like head of government (P6) or director (P57), and I can't think of one that would apply for unitary entity values like country (P17) either.) For the ones associated with a physical location, for example place of birth (P19), then we're clearly interested in the location-based hierarchy that Wikidata manages via located in the administrative territorial entity (P131). For some others, subclass of (P279) seems to apply (for example for language-valued properties like official language (P37)). For employer (P108) I am not sure if we have a consistent refinement hierarchy - parent organization (P749) perhaps, but also part of (P361), owned by (P127)? Fundamentally I think what we're looking for is the way in which the domain of the property value (class, place, language, organization, etc.) is hierarchically modeled in Wikidata. So would a label like "domain hierarchy property" make sense? ArthurPSmith (talk) 14:20, 14 January 2019 (UTC)
Yes, I also expect that subclass of (P279) and located in the administrative territorial entity (P131) would be the most common values for this property. Concerning the label, does "domain" generally refer to the target value? Intuitively, for me it would refer to the subject item. (Maybe by a vague mathematical analogy with the domain of a function, although statements aren't functions) So it would be more intuitive for me to have something else like "range", "value" or "target". − Pintoch (talk) 15:21, 14 January 2019 (UTC)
Oh! Yes, I was thinking "domain" in the generic sense of type of item. Maybe "value hierarchy property"? ArthurPSmith (talk) 17:10, 14 January 2019 (UTC)
I think that would be quite clear indeed! − Pintoch (talk) 17:21, 14 January 2019 (UTC)
I attempted to update the label and description in English in the proposal - ok now? ArthurPSmith (talk) 20:33, 15 January 2019 (UTC)
Yes, thanks! I am adding a note to explain the variations of labels in the discussion. − Pintoch (talk) 21:40, 15 January 2019 (UTC)
  • Comment as far as constraint checks are concerned: any kind of transitivity hurts performance (“type” and “value type” are among the most expensive constraint checks by far) and caching (we can’t invalidate all cached results if an item buried somewhere in the chain is edited, that’s why you’ll see “this result may be outdated” on some type / value type check results). --Lucas Werkmeister (WMDE) (talk) 17:27, 14 January 2019 (UTC)
Thanks! Yes I can imagine this would be expensive to run. So it makes all the more sense to store this information outside the constraint system. I would still be interested in this to ease data imports. − Pintoch (talk) 20:07, 14 January 2019 (UTC)