Wikidata:WikiProject Reasoning

Purpose

This WikiProject aims to explore the possibility of drawing inferences from Wikidata content. How can the community specify what should be inferred? How can these inferences be computed by tools? How should the computed inferences be used?


Participants

The participants listed below can be notified using the following template in discussions:
{{Ping project|Reasoning}}

Motivation

The spouse (P26) of Douglas Adams (Q42) was Jane Belson (Q14623681). Clearly, this means that, conversely, the spouse (P26) of Jane Belson (Q14623681) was Douglas Adams (Q42). This is a simple example of a case where one statement (about Jane Belson (Q14623681)) can be inferred from another statement (about Douglas Adams (Q42)). It would be great if we could build tools that drew such inferences, for example, to alert us when information is missing or contradictory.

This can only work if we (the Wikidata community) document somewhere which inferences should be drawn. We know that spouse (P26) should give rise to the example inference above, but where is this actually written down on Wikidata? For spouse (P26), the information that it is symmetric is found on Property:P26, in the statement <instance of (P31): symmetric property (Q18647518)>. The same information is on its talk page in the form of a constraint template that expresses symmetry. Unfortunately, this template does not tell us anything about the qualifiers. For example, the spouse (P26) statement of Douglas Adams (Q42) has the qualifiers start time (P580) and end time (P582). Obviously, the spouse (P26) statement of Jane Belson (Q14623681) should use the same qualifiers with the same values, but this is not stated anywhere. Moreover, there are symmetric relationships where some qualifiers are not symmetric (and should not be copied to the inferred statement). One example is diplomatic relation (P530), which uses the non-symmetric qualifier diplomatic mission sent (P531) to specify the embassy of the source item in the country of the target item. Clearly, just copying all qualifiers for symmetric properties would not work either.
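To make the qualifier problem concrete, here is a minimal Python sketch of symmetric inference with a per-property list of qualifiers that must not be copied. The Statement class, the SYMMETRIC_RULES table, and the infer_symmetric function are hypothetical names introduced for illustration only; they are not an existing Wikidata API or an agreed rule format. The qualifier values and the item IDs in the P530 example are placeholders, not real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A simplified statement: subject, property, value, and qualifiers."""
    subject: str
    prop: str
    value: str
    qualifiers: tuple = ()  # tuple of (qualifier property, value) pairs


# Hypothetical rule table (an assumption, not something stored on Wikidata):
# symmetric property -> qualifiers that must NOT be copied to the inverse.
SYMMETRIC_RULES = {
    "P26": frozenset(),            # spouse: all qualifiers carry over
    "P530": frozenset({"P531"}),   # diplomatic relation: drop "diplomatic mission sent"
}


def infer_symmetric(stmt):
    """Return the inverse statement, or None if no symmetry rule applies."""
    excluded = SYMMETRIC_RULES.get(stmt.prop)
    if excluded is None:
        return None
    kept = tuple((q, v) for q, v in stmt.qualifiers if q not in excluded)
    return Statement(stmt.value, stmt.prop, stmt.subject, kept)


# Spouse: start time (P580) and end time (P582) are copied unchanged.
# "START" and "END" stand in for the real qualifier values.
adams = Statement("Q42", "P26", "Q14623681",
                  (("P580", "START"), ("P582", "END")))
belson = infer_symmetric(adams)

# Diplomatic relation: the embassy qualifier (P531) is not copied.
# "Q_A", "Q_B" and "Q_EMBASSY" are placeholder IDs.
relation = Statement("Q_A", "P530", "Q_B", (("P531", "Q_EMBASSY"),))
inverse_relation = infer_symmetric(relation)
```

In this sketch the exclusion list is attached to the property, which is only one possible design; a rule could equally list the qualifiers that should be copied instead.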

Therefore, we need a way to specify more precisely which inferences are valid. This will not only benefit external use ("by some machine") but will also help to document our own assumptions about the use of our properties.

Proposed approach

This is a complex problem that cannot be solved in one step. Proposed solutions will need to be refined over several rounds until they work as expected. However, a general idea can be given as follows:

  1. The community should specify rules of inference on the wiki.
  2. Each rule should be on one page, with a dedicated explanation and a discussion.
  3. The rules themselves should be given in a fixed format, e.g., using templates, so that tools can extract and use them. (It will not be possible to specify all relevant rules with statements on property pages, so it is better to keep them in a unified form on some other wiki page; there can always be links from property pages.)
  4. External tools will read the rules from the wiki, compute inferences, and make use of them depending on their purpose. Some rules might be for quality control only, others might be used to enrich external applications like Reasonator, others might be used to compute inferences that should be added back to Wikidata by bots.
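The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the "Inference rule" template, its parameter names, and the functions parse_rules and missing_inferences are all hypothetical, since no rule format has been agreed on yet. The sketch reads rules from page text, then uses them for quality control by listing inverse statements that are missing.

```python
import re

# Hypothetical wikitext of a rule page. The template name and its
# parameters are assumptions made for this sketch only.
RULE_PAGE_TEXT = """
{{Inference rule|type=symmetric|property=P26}}
{{Inference rule|type=symmetric|property=P530|exclude=P531}}
"""


def parse_rules(wikitext):
    """Extract each template call as a dict of its named parameters."""
    rules = []
    for body in re.findall(r"\{\{Inference rule\|([^{}]*)\}\}", wikitext):
        params = dict(part.split("=", 1) for part in body.split("|") if "=" in part)
        rules.append(params)
    return rules


def missing_inferences(rules, statements):
    """List inverse (subject, property, value) triples that are not present."""
    symmetric = {r["property"] for r in rules if r.get("type") == "symmetric"}
    known = set(statements)
    return sorted(
        (v, p, s)
        for (s, p, v) in known
        if p in symmetric and (v, p, s) not in known
    )


rules = parse_rules(RULE_PAGE_TEXT)

# Only one direction of the spouse statement is recorded here,
# so the check reports the inverse as missing.
report = missing_inferences(rules, [("Q42", "P26", "Q14623681")])
```

A quality-control tool might only display such a report, while a bot might add the missing statements back to Wikidata; the rule text itself would be the same in both cases.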

What "rules of inference"?

The first big question is how best to write rules of inference that can serve many basic use cases of Wikidata. We start by collecting use cases:

  • Use cases: examples of rules we might want to express

How to express/store/discuss/manage rules of inference?

We need to work with rules on the wiki. There are many possible ideas on how to best do this. Some of this can already be discussed without knowing the exact capabilities of the intended rules.