Wikidata:Property proposal/Best result in tournament
Originally proposed at Wikidata:Property proposal/Sports
Not done
Description | best achievement in a sports competition |
---|---|
Represents | tennis player (Q10833314) |
Data type | Item |
Template parameter | en:Template:Infobox_tennis_biography *AustralianOpenresult *FrenchOpenresult *Wimbledonresult *USOpenresult |
Domain | human (Q5) |
Allowed values | Australian Open, Wimbledon Championships, French Open, US Open |
Example 1 | Jeļena Ostapenko (Q5551564) → Australian Open (Q60874) |
Example 2 | Jarkko Nieminen (Q10270) → US Open (Q123577) |
Example 3 | Marta Kostyuk (Q28535267) → Australian Open (Q60874) |
Planned use | add it to the articles of all tennis players who have competed in Grand Slam tournaments |
Motivation
It would be really useful to store information about success in Grand Slam tennis tournaments on Wikidata instead of updating it on every Wikipedia. Stryn (talk) 13:49, 16 October 2018 (UTC)
Discussion
From my experience in the field of sports results, this does not look like a good idea for several reasons:
- results (P2501) and sport (P641) are not supposed to be used like that; use stage reached (P2443) and preferably competition class (P2094) (alternatively sports discipline competed in (P2416)) instead.
- The meaning of point in time (P585) is not clear either; does it apply to the point when the result was achieved (your intention, as far as I can tell), or when it is valid?
- It really looks difficult to maintain in good shape
- Furthermore, superseded data would have to be deprecated, removed, or updated, but preferably we should not need to do this at all; at least not at such a scale.
- Alternative suggestion: add all participation information with participant in (P1344) and event items (like 2018 Wimbledon Championships – women's singles (Q30098268)), and add appropriate qualifiers (stage reached (P2443) and/or ranking (P1352)). Most other data, like point in time (P585), competition class (P2094) (which you meant with “sport” above), and the connection to 2018 Wimbledon Championships (Q30085309) via part of (P361), belong to the event item.
- Some more theoretical background: the main question here is how and where to aggregate data. This proposal aims to store already aggregated data in Wikidata, so that the users do not have to (and cannot) aggregate by themselves. However, it is much more efficient to store un-aggregated original data (as suggested by me), and let the user do the aggregation. This allows users to retrieve different aggregations from the same data set; here, for instance, one could ask for "number of wins per Grand Slam tournament", "list of Grand Slam wins", or even things like "number of Grand Slam semi-final appearances" and so on, all from the same set of claims. If one stores already aggregated data in Wikidata, each of those would have to have its own property, which isn’t really practical of course. For that reason, it is in most situations preferable not to store aggregated data in Wikidata; things like "number of X" (equivalent to COUNT() in SPARQL), "total prize money won" (SUM()), "best of Y" (MAX()) or "list of Z" (GROUP_CONCAT()), and so on should be done by the users, if at all possible. That is definitely the case here, as the number of Grand Slam participations per player (or even tennis tournament participations) is relatively low even for longer professional careers.
- A Wikipedia template would then have to loop over all participation data to figure out the best results. If SPARQL were available from Lua, this would be easier to do, but right now we don’t have it.
—MisterSynergy (talk) 18:33, 16 October 2018 (UTC)
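As a rough illustration of the aggregation argument above, the following Python sketch derives both a "best result per tournament" (the MAX() case) and a "number of semi-final appearances" (the COUNT() case) from the same un-aggregated participation records. Everything in it is an assumption made for illustration: the player names, the records, the stage labels and their ordering, and the helper names `best_results` and `count_stage` are hypothetical and do not correspond to actual Wikidata items or to any existing template code.

```python
# Hypothetical ordering of stages, worst to best. These labels are
# illustrative only and are not actual Wikidata item labels.
STAGE_ORDER = [
    "first round", "second round", "third round", "fourth round",
    "quarter-final", "semi-final", "final", "winner",
]
STAGE_RANK = {stage: rank for rank, stage in enumerate(STAGE_ORDER)}

# Un-aggregated participation records, one per event entered:
# (player, tournament, year, stage reached). The data are made up.
PARTICIPATIONS = [
    ("Player A", "Wimbledon", 2016, "second round"),
    ("Player A", "Wimbledon", 2017, "semi-final"),
    ("Player A", "Wimbledon", 2018, "quarter-final"),
    ("Player A", "US Open", 2017, "third round"),
    ("Player A", "US Open", 2018, "winner"),
    ("Player B", "Wimbledon", 2018, "semi-final"),
]


def best_results(records, player):
    """Return the best stage per tournament for one player (a MAX() aggregation)."""
    best = {}
    for name, tournament, _year, stage in records:
        if name != player:
            continue
        current = best.get(tournament)
        if current is None or STAGE_RANK[stage] > STAGE_RANK[current]:
            best[tournament] = stage
    return best


def count_stage(records, player, stage):
    """Count how often a player reached at least the given stage (a COUNT() aggregation)."""
    threshold = STAGE_RANK[stage]
    return sum(
        1
        for name, _tournament, _year, reached in records
        if name == player and STAGE_RANK[reached] >= threshold
    )


if __name__ == "__main__":
    print(best_results(PARTICIPATIONS, "Player A"))
    print(count_stage(PARTICIPATIONS, "Player A", "semi-final"))
```

Both questions are answered from the same list of records, so neither needs a dedicated stored value; a template looping over participation claims would do essentially what `best_results` does here.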
- Support David (talk) 07:04, 17 October 2018 (UTC)
- I agree with User:MisterSynergy that this looks like something that should be derived by querying other data, rather than a property in its own right. I lean towards opposing, but would like to see some replies to MisterSynergy's points. MartinPoulter (talk) 14:47, 22 October 2018 (UTC)
- Not done, it looks like what is needed is more powerful querying capabilities. − Pintoch (talk) 20:43, 6 March 2019 (UTC)