Wikidata:Property proposal/unusualness
unusualness
Originally proposed at Wikidata:Property proposal/Generic
Description | how unusual is this item compared to other items of this type |
---|---|
Represents | any |
Data type | Number (not available yet) |
Domain | item |
Allowed values | 0-number of instances |
Allowed units | none or "items" |
Example 1 | 200 |
Example 2 | 1000 |
Example 3 | 10000 |
Example 4 | 100000 |
Planned use | filtering out unusual/exotic instances whose extra complexity is not worthwhile to handle |
Expected completeness | complete for values over 1000 that hurt |
Motivation
https://www.wikidata.org/wiki/Q7395156 is an example of an unusual instance. While most instances of scientific conference series (https://www.wikidata.org/wiki/Q47258130) relate to a single conference series, this one is a "pair". This is an attempt to model a real-world situation at the cost of extra complexity. To avoid that complexity, a means of filtering out this unusual instance might be needed.
In my use case today (https://confident.dbis.rwth-aachen.de/dblpconf/wikidata), with the query reproduced at the end of this section, the above conference series shows up multiple times and has multiple entries for some properties. This increases the complexity of the query and doesn't add enough value for my use case, so I need a means to filter out this unusual record. I could do this by filtering the item by its Wikidata Q identifier, but then I'd have to add every further unusual/exotic case as it comes up.
Having an unusualness property with "220" as its value in this case, indicating that this is the only such case out of some 220, would make it possible to filter by "unusualness": if only the most usual records are wanted, keep those that either have no unusualness property or whose value is below a certain threshold.
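As a rough illustration of the threshold filter described above, the following sketch keeps conference series that either carry no unusualness value or have one below a cut-off. The property identifier wdt:P_UNUSUALNESS is a placeholder invented for this example (no such property exists on Wikidata), and the threshold of 100 is likewise an assumption.

```sparql
# Sketch only: wdt:P_UNUSUALNESS is a hypothetical placeholder, not a real Wikidata property.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?confSeries ?unusualness
WHERE
{
  # scientific conference series (Q47258130)
  ?confSeries wdt:P31 wd:Q47258130.
  # hypothetical unusualness value; ordinary items would simply lack it
  OPTIONAL { ?confSeries wdt:P_UNUSUALNESS ?unusualness. }
  # keep items with no value, or with a value below the assumed threshold of 100
  FILTER (!BOUND(?unusualness) || ?unusualness < 100)
}
```

Under these assumptions, an item tagged with the value 220 would be dropped, while untagged items would pass through unchanged.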
There are many other "long-tail" situations where a similar strategy based on this property might be applicable. Think of the Pareto principle, where in quite a few scenarios only the most usual 20% of cases deliver 80% of the value. A proper unusualness indicator could serve as the filter here.
This raises the question of whether decile/n-tile properties are already available for setting values like the unusualness...
# Conference Series wikidata query
# see https://confident.dbis.rwth-aachen.de/dblpconf/wikidata
# WF 2021-01-30
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?confSeries ?display_name ?confSeriesLabel ?official_website ?DBLP_pid ?WikiCFP_pid ?GND_pid
WHERE
{
# scientific conference series (Q47258130)
?confSeries wdt:P31 wd:Q47258130.
# English label; matched before the BIND so that COALESCE can fall back to it
?confSeries rdfs:label ?confSeriesLabel FILTER (LANG(?confSeriesLabel) = "en").
# short name (P1813)
OPTIONAL { ?confSeries wdt:P1813 ?short_name . }
# use the short name if there is one, otherwise the English label
BIND (COALESCE(?short_name,?confSeriesLabel) AS ?display_name).
# official website (P856)
OPTIONAL {
?confSeries wdt:P856 ?official_website
}
# DBLP venue ID (P8926)
OPTIONAL {
?confSeries wdt:P8926 ?DBLP_pid.
}
# WikiCFP conference series ID (P5127)
OPTIONAL {
?confSeries wdt:P5127 ?WikiCFP_pid.
}
# GND ID (P227)
OPTIONAL {
?confSeries wdt:P227 ?GND_pid.
}
}
ORDER BY (?display_name)
– The preceding unsigned comment was added by Seppl2013 (talk • contribs) at 13:27, January 30, 2021 (UTC).
Discussion
- Comment about the sample SWAT and WADS conferences (Q7395156) mentioned above, not the proposal itself. I think the item incorrectly combines what should be on two separate items. For pairs of people, we generally have three items: one for the pair, and one for each part. @Uzume: --- Jura 14:04, 30 January 2021 (UTC)
- @Jura1: I am certainly not opposed to anyone splitting and creating separate items for these. —Uzume (talk) 12:59, 31 January 2021 (UTC)
- @Uzume: It seems the item gets worse with every edit. --- Jura 13:00, 14 February 2021 (UTC)
- @Jura1: I do not really see that. However, I do believe your suggestion is the right way to go. —Uzume (talk) 00:17, 15 February 2021 (UTC)
- Comment SWAT and WADS conferences (Q7395156) is exceptional and therefore unusual. While it is being discussed whether it should stay or go, an exception/unusual property might be a helpful vehicle to make it filterable for those who don't want to wait for the discussion result (which might never come ...). I agree that the pair here consists of two biennial conference series which, as a pair, make an annual conference series. ---Seppl2013 (talk) 16:42, 30 January 2021 (UTC)
- Comment The proposal may be interesting, but it's too unclear right now; at least the examples should be correct. Plus, the datatype "number" is not available, so the proposal cannot be accepted as it is. Cheers, VIGNERON (talk) 15:09, 31 January 2021 (UTC)
- Comment I edited the description as you seem to be talking about items, not properties. ArthurPSmith (talk) 19:23, 1 February 2021 (UTC)
- Oppose this makes no sense to me at the moment. BrokenSegue (talk) 21:07, 1 February 2021 (UTC)
- Comment I don't understand what these numbers in the examples are meant to represent. UWashPrincipalCataloger (talk) 01:10, 2 February 2021 (UTC)
- Oppose I don't get the proposal at all. Tetizeraz (talk) 01:42, 2 February 2021 (UTC)
- Oppose This proposal is extremely confusing. It seems that this property is meant to help you filter properties for your querying needs and not actually act as a definable property of an element. Therefore, I oppose. --Lectrician1 (talk) 04:33, 6 February 2021 (UTC)
- Not done, no consensus to create --DannyS712 (talk) 18:02, 17 February 2021 (UTC)