Wikidata:Requests for comment/Merging relationship properties


Background

At the moment, we have two kinds of family relationship properties. Some (like spouse (P26)) are gender-neutral: they can be used for both men and women. Others (like brother (P7) and sister (P9)) are gendered: they depend on the gender of the target item.

This approach has several problems.

  • It is inconsistent - we can currently say that A is father to B, but not that B is son of A.
  • It is complicated - people querying the database have to write more complex queries to identify relationships between entities.
  • It mostly duplicates information already recorded - 94% of entries on people already have sex or gender (P21) data, rising to roughly 99% or more for those with father (P22) or mother (P25) values.
  • It forces a gender binary - we support values other than "male" or "female" in sex or gender (P21), but are left with a choice between referring to such people as "sister" or "brother" even if we want to avoid this.

It does, however, have some benefits:

  • Not all languages have an easy gender-neutral term for "sibling" or "parent"
  • Requiring one and only one father (P22) & mother (P25) for a given item makes for easy constraint checking
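
The constraint-checking benefit can be illustrated with a minimal sketch. It assumes statements are available as plain (subject, property, value) triples; the person item ids are made up for illustration, and the helper names are hypothetical rather than part of any Wikidata tool.

```python
from collections import defaultdict

# Hypothetical statements as (subject, property, value) triples.
# P22 = father, P25 = mother; all Q-ids below are invented.
STATEMENTS = [
    ("Q1001", "P22", "Q2001"),
    ("Q1001", "P25", "Q2002"),
    ("Q1002", "P22", "Q2001"),
    ("Q1002", "P22", "Q2003"),  # second father value: a violation
    ("Q1003", "P25", "Q2002"),
]


def single_value_violations(statements, prop):
    """Return subjects that carry more than one distinct value for prop."""
    values = defaultdict(set)
    for subject, p, value in statements:
        if p == prop:
            values[subject].add(value)
    return sorted(s for s, vs in values.items() if len(vs) > 1)


def missing_property(statements, prop):
    """Return subjects that have no statement at all for prop."""
    subjects = {s for s, _, _ in statements}
    have = {s for s, p, _ in statements if p == prop}
    return sorted(subjects - have)
```

With separate properties, both checks are a single pass over one property. With a merged "parent" property, the same checks would also need the sex or gender (P21) value of every target item.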

Proposal

I suggest we merge the following pairs of properties:

This issue was previously discussed in mid-2013, focussing on P7/P9. The main problem at the time was that some languages (e.g. Dutch) do not have a single clear term for "sibling", making it difficult to reuse this data on Wikipedia; it was suggested to revisit this in Phase III.

It is now two years later, and we have got arbitrary access rolled out to most wikis. It should (hopefully!) be possible for projects which wish to display a gendered relationship to query the target item and determine its gender, rather than requiring the relationship property to embed the gender information.
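
As a rough sketch of the idea, a client could look up the target item's sex or gender (P21) value and pick the gendered term itself, falling back to a neutral term otherwise. The data below is a toy in-memory stand-in for an arbitrary-access lookup; the person ids and the function name are hypothetical, while Q6581097 (male) and Q6581072 (female) are the usual P21 values.

```python
MALE, FEMALE = "Q6581097", "Q6581072"

# Toy stand-in for looking up P21 on the target item.
# The person ids are invented; None means no P21 value is recorded.
GENDER_OF = {
    "Q3001": MALE,
    "Q3002": FEMALE,
    "Q3003": None,
}

# Display terms a client wiki might want for a merged sibling relationship.
SIBLING_LABELS = {MALE: "brother", FEMALE: "sister"}


def gendered_label(target, gender_of, labels, neutral):
    """Pick a gendered display term from the target's P21 value.

    Falls back to the neutral term when the target has no P21 value
    or a value with no gendered term in labels.
    """
    return labels.get(gender_of.get(target), neutral)
```

A wiki whose language lacks a neutral word could pass a phrase such as "brother or sister" as the fallback. Each lookup loads another item, so the cost of arbitrary access still applies.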

proposer, Andrew Gray (talk) 21:07, 24 August 2015 (UTC)

Discussion

  • I would maybe do the opposite and divide child (P40) into two and create other properties for grandparents, etc. Thierry Caro (talk) 21:24, 24 August 2015 (UTC)
  • Problem 1: is it a real problem? Not all properties need to be fully symmetric. Problem 2: this depends on the concrete goal. The suggested structure requires more complex queries for simple requests such as a person's father or mother. Problem 3: duplication has positive sides too, for example simpler queries, vandalism checks, etc. Also, the current structure is well known and standard. So I prefer to keep the current structure. -- Ivan A. Krestinin (talk) 21:39, 24 August 2015 (UTC)
  • I would suggest simply deleting brother (P7) and sister (P9), since they can be derived from other statements, along with stepfather (P43) and stepmother (P44), for the same reasons. father (P22) and mother (P25) should remain separate, since biologically all humans in the database have precisely one mother and one father, and the statements don't mean identical things. I don't see how forcing a gender binary could be an issue; a biological mother/father doesn't need to be strictly female/male. --Yair rand (talk) 22:27, 24 August 2015 (UTC)
  • Agree with others above to preserve separate properties for father and mother, as this allows unique patrilineal and matrilineal ancestries to be extracted easily, with a simple TREE query in WDQ. Merging these into a combined parent property would make these lineages far harder to extract. Jheald (talk) 23:46, 24 August 2015 (UTC)
  • We do not need brother or sister when we have a father or mother. They are brother and sister by inference. We already have many qualifiers that are redundant. Why make a fuss about this? Thanks, GerardM (talk) 07:36, 25 August 2015 (UTC)
    • That assumes that we (a) know who the father was, and (b) have an item for him. But often we may have items for the brothers but no item for the father; and in some cases, e.g. some medieval painters, we might have no idea who the father was, just that the brothers were brothers. So we shouldn't just assume that an item for the father will necessarily exist. Jheald (talk) 19:06, 25 August 2015 (UTC)
  • Father/mother: With current tools, it's easy to determine if both father and mother are defined as we have two separate properties. It's probably impossible with a single "parent" property. --- Jura 10:12, 25 August 2015 (UTC)
  • Question: although in principle I'm for a merger, I'm concerned that relying too much on inference will become a problem given the current limitations of arbitrary access. Is the limit on item loading (the function is "expensive") something we should take into account? TomT0m / talk page 10:27, 25 August 2015 (UTC)
  • Father/mother should definitely remain separate. In most cases, they are single-value properties, which is an enormous advantage when it comes to running queries, such as finding people with missing father/mother statements. Having a unified parent property will also mean we will have to infer who someone's father or mother is in unnecessarily complex ways. They are also key if you want to infer other relationships, such as paternal grandfather (Q19682162) or maternal uncle (Q6041134) (again, technically possible with a unified parent property, but unnecessarily complex and dependent on the quality of related data). Overall, this seems like something that will make Wikidata harder to use with no real benefit.
    Children, siblings and spouses are a different matter, as there is no limit to how many values you can have and their relationship to gender differs in many ways from that of parent relationships. I'm therefore not against merging the brother/sister properties, as they are unnecessarily gender-binary. Väsk (talk) 14:14, 27 August 2015 (UTC)
  • I propose to merge only brother (P7) and sister (P9). These properties can have multiple values, and if somebody has changed sex, filling them in becomes complicated. The other properties should have only one value (or at least one preferred value), and are therefore easier to work with separately. --putnik 17:07, 31 August 2015 (UTC)
  • I agree with Putnik: OK for brother and sister, but the others work better separately. --Epìdosis 19:01, 31 August 2015 (UTC)
  • I remember us having this same discussion in the past. In the meantime, Dutch hasn't developed a word for sibling (Q31184), so I still strongly oppose these proposed mergers. Multichill (talk) 19:41, 2 September 2015 (UTC)
    @Multichill: I don't see how Dutch not having a word for it is relevant. The label could be a translation of "brother or sister". --Yair rand (talk) 22:09, 2 September 2015 (UTC)
    That's a big step backwards. Multichill (talk) 05:57, 3 September 2015 (UTC)
    How so? "brother" and "sister" could still be aliases, so it wouldn't be more difficult for editors to access. Are you worried about the presentation of the data? The labels aren't particularly user-facing... --Yair rand (talk) 07:27, 3 September 2015 (UTC)
  • Oppose merger for brother/sister: as per above, it seems to be bound to having a single property for parents. --- Jura 07:30, 3 September 2015 (UTC)
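
Two inferences raised in this discussion, following a patrilineal line through father (P22) and deriving siblings from shared parent statements, can be sketched as follows. This is an illustration on toy data, not the WDQ TREE query itself; the item ids and function names are hypothetical.

```python
# Toy parent maps keyed by child item; all ids are invented.
FATHER_OF = {"Q4001": "Q4002", "Q4002": "Q4003", "Q4004": "Q4002"}
MOTHER_OF = {"Q4001": "Q4005"}


def patrilineal_line(person, father_of):
    """Follow father statements upwards and return the ancestors in order."""
    line = []
    seen = {person}  # guards against cycles in bad data
    current = father_of.get(person)
    while current is not None and current not in seen:
        line.append(current)
        seen.add(current)
        current = father_of.get(current)
    return line


def inferred_siblings(person, father_of, mother_of):
    """Return items sharing at least one recorded parent with person.

    Half-siblings are included. Nothing can be inferred when no parent
    is recorded for person, which is the gap noted for cases such as
    medieval painters known only to be brothers.
    """
    parents = {father_of.get(person), mother_of.get(person)} - {None}
    siblings = set()
    for other in set(father_of) | set(mother_of):
        if other == person:
            continue
        other_parents = {father_of.get(other), mother_of.get(other)} - {None}
        if parents & other_parents:
            siblings.add(other)
    return sorted(siblings)
```

With a single merged parent property, `patrilineal_line` would additionally need the sex or gender (P21) value of each parent to decide which branch to follow.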

Summary of discussion

This was widely opposed and the proposal is retracted. Andrew Gray (talk) 13:55, 11 October 2015 (UTC)