Talk:Q112826905

From Wikidata

Autodescription — class of anatomical entity (Q112826905)

description: anatomical entity as a first-order metaclass. To be used as P31 values for all anatomical structure classes. Its instances are classes (e.g. heart)


It seems to me that this class duplicates anatomical structure class type (Q103914748). @TiagoLubiana:

Currently the second-order classes for anatomy are all named after "type". Naming them "(class)" instead seems to me more confusing than helpful, especially if only some are renamed that way. ChristianKl 19:01, 13 July 2022 (UTC)

Looking more at it, class of anatomical entity (Q112826905) seems even more confused. tubercle of bone (Q29015039) is not the same kind of thing as heart (Q1072). ChristianKl 19:14, 13 July 2022 (UTC)

@ChristianKl: Hey, Christian. The idea was to avoid touching the current items. The labels of many biological items are confusing to non-ontologists, which is why the description details its usage. Also, both "tubercle" and "heart" are considered anatomical structures by the UBERON ontology, which is the core source of IDs for anatomical structures, so it is reasonable to represent them as such. TiagoLubiana (talk) 19:52, 13 July 2022 (UTC)


Oh, and regarding why a new item: the old ones were used in a variety of ways, causing conceptual disarray (see Type or Individual? Evidence of Large-Scale Conceptual Disarray in Wikidata (Q109990743)). There were (as of Jun 30, 2022) more than 500 items that are at the same time instance and subclass of anatomical structure (Q4936952); see https://w.wiki/5Ney. Wikidata has an item for anatomical structure type (Q103813670) and one for anatomical structure class type (Q103914748), and the usage of both is unclear. TiagoLubiana (talk) 19:52, 13 July 2022 (UTC)
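To make the "instance and subclass at the same time" check concrete, here is a minimal sketch in Python that builds a SPARQL query of that shape. The contents of the short link above are not reproduced here, so the exact query is an assumption; the helper name `both_instance_and_subclass_query` is hypothetical, and only the property and item IDs come from the thread.

```python
def both_instance_and_subclass_query(class_qid: str) -> str:
    """Build a SPARQL query for items that are both P31 and P279 of one class.

    Assumption: this mirrors the kind of query behind the short link in the
    comment above; the linked query itself may differ.
    """
    return (
        "SELECT ?item WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"   # instance of the class
        f"  ?item wdt:P279 wd:{class_qid} .\n"  # and also a subclass of it
        "}"
    )


# anatomical structure (Q4936952), as named in the comment above
query = both_instance_and_subclass_query("Q4936952")
print(query)
```

The resulting string could be pasted into the Wikidata Query Service; the sketch itself does not send any request.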
@TiagoLubiana: Uberon does not have a separation of subclass and instance of. We do have that distinction. There's https://www.wikidata.org/wiki/Wikidata:WikiProject_Anatomy/Ontology_of_Anatomy/draft suggesting a way to use classes for instance_of.
A class that's currently instance of (P31) anatomical structure (Q4936952) and also subclass of (P279) organ (Q712378) shouldn't get instance of (P31) class of anatomical entity (Q112826905) but instance of (P31) organ type (Q103812529).
Adding a lot of new concepts to reduce conceptual disarray seems to me like a bad idea. Adding new concepts in addition to the existing ontology increases the conceptual complexity and the likelihood of differences in usage. Wikidata has the general principle that if X subclass of (P279) Y and Y subclass of (P279) Z, we don't list X subclass of (P279) Z. Your particular anatomical entity (Q112826975) class breaks that principle at heart (Q1072). The way you name the two and distinguish them by the bracketed content also violates https://www.wikidata.org/wiki/Help:Label .
Breaking existing modeling principles in the hope of increasing clarity is likely to backfire.
I do think that the list you linked to should be worked through, and I created https://www.wikidata.org/wiki/User%3AChristianKl%2Fanatomical_structures_conflict for that purpose. ChristianKl 20:52, 13 July 2022 (UTC)
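The normalisation principle mentioned above (if X is a subclass of Y and Y is a subclass of Z, do not also state X subclass of Z) can be sketched as a small check over a list of direct subclass claims. This is only an illustration under stated assumptions: the function name `redundant_subclass_claims` and the placeholder items X, Y, Z are hypothetical, and the claims are a hand-written list rather than data fetched from Wikidata.

```python
from collections import defaultdict


def redundant_subclass_claims(claims):
    """Return direct (child, parent) claims that are also implied by a longer path."""
    parents = defaultdict(set)
    for child, parent in claims:
        parents[child].add(parent)

    def reachable_via_longer_path(child, target):
        # Start from the other direct parents of `child` and walk upward,
        # looking for `target` among their ancestors.
        stack = [p for p in parents[child] if p != target]
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if target in parents[node]:
                return True
            stack.extend(parents[node])
        return False

    return sorted((c, p) for c, p in claims if reachable_via_longer_path(c, p))


# Placeholder hierarchy matching the wording of the principle above.
example_claims = [("X", "Y"), ("Y", "Z"), ("X", "Z")]
redundant = redundant_subclass_claims(example_claims)
print(redundant)
```

In this toy input, the direct claim from X to Z is the one flagged, because it already follows from X to Y and Y to Z.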
@ChristianKl: No modelling is broken, as I am not removing anything. In UBERON everything is a subclass of anatomical structure, thus an instance of "anatomical structure class". The Ontology page is interesting, and definitely took a lot of work, thanks for that! Both modelings can live together, though. Many applications need to look for any anatomical structure, and with a single value for P31, instead of 37 different types, queries become much easier. This is the same rationale used for humans as instance of (P31) human (Q5), by the way, which is widely agreed upon. TiagoLubiana (talk) 21:34, 13 July 2022 (UTC)
Also, the current usage of anatomical structure (Q4936952) is where the issue lies. It is, in fact, used as an example of punning (https://www.w3.org/2007/OWL/wiki/Punning). Most (if not all) items in the list above are not, rigorously speaking, instance of (P31) anatomical structure (Q4936952). People get mad if you change it, though; that is why a new item was created. TiagoLubiana (talk) 21:34, 13 July 2022 (UTC)
When it comes to human (Q5) we have ideas of treating all people equally. Instead of having subclasses for males and females, we have the gender property to distinguish those.
SPARQL has * as an operator for those applications where you really care about all anatomical structures. In practice I think we often do care about specific types. When we want a list of all organs, for example, we just want "heart" and not "female heart". "Heart" and "female heart" are both anatomical structures, but most applications don't want to get back both of those for a query. ChristianKl 21:50, 13 July 2022 (UTC)
I agree that queries using the * operator are useful, and could in theory do the job, but in my experience it often imposes a heavy load on querying, as the engine has to traverse many paths. I think both applications exist. For example, see this prototype of a Visual SPARQL Query builder: https://lubianat.github.io/sparnatural_wikidata_prototype/ . For this kind of thing to work efficiently, entities need a common "P31" to be pulled. But I am totally ok with multiple types. It bloats the database a little bit, but I'd reason the time saved in the SPARQL queries by not using "*" leads to an overall plus. TiagoLubiana (talk) 15:58, 14 July 2022 (UTC)
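The two query styles under discussion can be put side by side. The sketch below only builds the query strings in Python; the helper names are hypothetical, and the queries are assumptions about the general shape being debated, not the exact queries used by anyone in this thread.

```python
def subclass_path_query(root_qid: str) -> str:
    """Query using the * property path: every item reachable via P279 from the root."""
    return f"SELECT ?item WHERE {{ ?item wdt:P279* wd:{root_qid} . }}"


def direct_instance_query(metaclass_qid: str) -> str:
    """Query using a single direct P31 value shared by all the items of interest."""
    return f"SELECT ?item WHERE {{ ?item wdt:P31 wd:{metaclass_qid} . }}"


# anatomical structure (Q4936952) as the root of the subclass tree
path_query = subclass_path_query("Q4936952")
# class of anatomical entity (Q112826905) as the common P31 value
direct_query = direct_instance_query("Q112826905")

print(path_query)
print(direct_query)
```

The first form needs no extra statements but asks the engine to follow the subclass path; the second depends on every relevant item carrying the shared P31 statement, so the two need not return the same set of items.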
@TiagoLubiana: To the extent that the visual query builder currently doesn't do the job, the fix is to improve the query builder. If you violate the normalisation principle in one space of Wikidata, you encourage people to also violate it in other areas of Wikidata in an unstructured way. It's the opposite of making Wikidata's data-modeling more clear. You haven't listed a single use case where one would even want to run the queries where one doesn't care for all anatomical entities but does care for all anatomical structures, including left/right and male/female distinctions. It's very unclear to me that this special case warrants a massive amount of claims that bloat all items of anatomical structures. Even if it were worth it, that doesn't explain having both P279 and P31 claims.
Having the extra claims means that users who ask for a truthy P279 or P31 are less likely to get what they want and will have to deal with inconsistency.
It's a general mistake to do things for performance improvement without really having a good idea of the performance implications.
It's worth noting here that "anatomical structure" isn't even the root of the Uberon ontology. ChristianKl 15:52, 17 July 2022 (UTC)
Ok, I'll change the name to "anatomical entity", as it is stated in the root of UBERON; I'm really not making a distinction between "entity" and "structure". Would that work for you? TiagoLubiana (talk) 12:30, 20 July 2022 (UTC)
The query using P279* took 44 seconds (https://w.wiki/5UfL); the query using P31 took 4 (https://w.wiki/5UfP). That is roughly a 10x performance improvement, and it is why this kind of thing is useful. TiagoLubiana (talk) 12:35, 20 July 2022 (UTC)
@TiagoLubiana: Comparing a query that returns 84079 items with one that returns 3007 items and finding that it takes 10x less time to return over 20x fewer items does not show that there's any use. It just shows that what you are doing makes the status quo confusing enough that you don't understand what you are doing.
Not making a distinction between the existing concepts is actively muddling the existing ontology and making things worse.
Apart from that, it's still a bunch of statements that make my work in cleaning up the ontology harder (if it were consistently used, I could no longer easily query for items that currently have no P31), and you haven't shown any use case where it produces gains. You make a performance assumption based on ideas about Wikibase not caching certain queries, which I think is more likely false than true.
Without a bot that consistently keeps it up to date, using P279* is going to give more complete results. That bot would in turn cause noise that takes attention. Doing bot work without a bot approval seems bad to me. ChristianKl 12:57, 20 July 2022 (UTC)
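The objection to the timing comparison can be checked with simple arithmetic on the figures quoted in this thread (44 seconds and 84079 results for the P279* query, 4 seconds and 3007 results for the P31 query). The sketch below normalises time by result count. That normalisation is an assumption made only for illustration, since query time is not generally proportional to the number of results, and the variable names are hypothetical.

```python
# Figures as quoted in the comments above.
path_seconds, path_results = 44, 84079      # query using P279*
direct_seconds, direct_results = 4, 3007    # query using P31

time_ratio = path_seconds / direct_seconds      # how much longer the P279* query ran
count_ratio = path_results / direct_results     # how many more items it returned

# Crude per-result cost in milliseconds (assumes cost scales with result size).
path_ms_per_result = 1000 * path_seconds / path_results
direct_ms_per_result = 1000 * direct_seconds / direct_results

print(f"time ratio: {time_ratio:.1f}x, result-count ratio: {count_ratio:.1f}x")
print(f"P279*: {path_ms_per_result:.2f} ms/result, P31: {direct_ms_per_result:.2f} ms/result")
```

Under this rough normalisation the P279* query is not slower per returned item, which is the reason the two runs are not a like-for-like benchmark.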
@ChristianKl: Thanks for the comments. I do my best to study and know what I am doing; that is why I am a part of the current UBERON team (http://obophenotype.github.io/uberon/team/). Property paths like "*" are generally expensive, e.g. see Learning SPARQL: Querying and Updating with SPARQL 1.1 (Q61718117), page 225. The previous example was not good, I agree. A better one is for cell types: P279* takes 4.3 seconds (https://w.wiki/5Ug6) and P31 takes 1 second (https://w.wiki/5Ug7), and the number of results is about the same. For more complex queries, these savings can be really important (e.g. see https://www.youtube.com/watch?v=--Ktu1Qrbgk). TiagoLubiana (talk) 13:38, 20 July 2022 (UTC)