User talk:Wostr


About this board

I'm quite busy in my real life, so I may not respond swiftly to comments on this page.

Archived discussion (from before March 10, 2018) is available here.

2001:7D0:81F7:B580:3CC9:F2D3:BAD:9AA7 (talkcontribs)

Hi! Regarding Special:Diff/1033545034: I have to admit that it's hard to tell with sufficient certainty which is the correct relation, as the entire classification of chemical substances is currently rather messy. I may have just hidden parts of the confusion. Generally it would make sense for chemical compound (Q11173) to be a subclass of substance, the latter in turn being a subclass of concrete object, whose instances are not substance classes like germanium dioxide (Q419133) (as opposed to, say, a particular chunk of this substance displayed in a photo in that item). However, for some reason chemical compound (Q11173) is currently also set as a metaclass (through the "chemical component" item), as if it were a group or class of chemical substances (Q17339814). This is certainly wrong, as it should not be a class of concrete objects and a class of classes at the same time.

Indeed, there are lots of items (about 164k) directly set as instances of chemical compound. Many of them seem to have been set by bots. This might have been wrong in the first place, as it's unclear whether "chemical compound" was ever supposed to be a metaclass rather than a class of concrete objects.

However, there are also lots of class items that are set as subclasses of (subclasses of) the "chemical compound" item. I'm having trouble getting exact counts without the query timing out, but for instance there are currently about 54k items that are a subclass of a subclass of a subclass of chemical compound (Q11173). Probably many items are also incorrectly set as instance of and subclass of chemical compound at the same time. So I doubt that using chemical compound (Q11173) in either relation is really established.
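As a rough illustration of the kind of count mentioned above, here is a minimal sketch that only builds the SPARQL text for a fixed-depth subclass count. The function name `subclass_depth_count_query` is hypothetical, and the sketch assumes the `wd:` and `wdt:` prefixes predefined by the Wikidata Query Service. Whether such a query finishes before the timeout is not something this sketch can promise.

```python
def subclass_depth_count_query(root_qid: str, depth: int) -> str:
    """Build a SPARQL query counting items reachable from `root_qid`
    by a chain of exactly `depth` subclass of (P279) statements."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # A fixed-length property path, e.g. wdt:P279/wdt:P279/wdt:P279 for depth 3.
    path = "/".join(["wdt:P279"] * depth)
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {\n"
        f"  ?item {path} wd:{root_qid} .\n"
        "}"
    )


# Subclass of subclass of subclass of chemical compound (Q11173).
query = subclass_depth_count_query("Q11173", 3)
print(query)
```

A fixed-length path is used here instead of `wdt:P279*` because counting one level at a time is usually a smaller job than walking the whole tree in a single query.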

For the sake of practicality and simplicity, I'd suggest correcting root items like chemical compound (Q11173) so that they are explicitly set as classes of concrete objects. Otherwise, if it were a metaclass, what would true instances of concrete objects with a particular chemical composition (substances) be instances of? (In the long term it might be reasonable to have separate items for substances and the relevant molecular entities.)

Wostr (talkcontribs)

The problem we have in WD with chemical compounds classification is mainly on the specific chemical compounds level. We have a growing classification tree starting from chemical compound (Q11173) using subclass of (P279) relation. These classes have either instance of (P31) structural class of chemical compounds (Q47154513) or instance of (P31) group of chemical compounds (Q56256086) (or its subclasses), but still many classes or groups of chemical compounds are not properly classified.

On the chemical compounds level there are many problems, partially because all items imported by bots have instance of (P31) chemical compound (Q11173), also because we still lack a consensus whether chemical compounds should be instances or subclasses of chemical compound (Q11173). What's more:

  1. there are many items describing chemical compounds with undefined stereochemistry; these items we agreed to treat as group of compounds (i.e. usually instance of (P31) group of stereoisomers (Q59199015)) and should be placed in the classification tree using subclass of (P279)
  2. there are many items describing ions that should be placed under ion (Q36496) and not under chemical compound (Q11173)
  3. in some items there is instance of (P31) chemical compound (Q11173) and subclass of (P279) with a class being a subclass of chemical compound (Q11173)
  4. it is temporarily adopted that all chemical compounds should have instance of (P31) chemical compound (Q11173), so that all chemical compounds can be retrieved using SPARQL.
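The mixed typing described in point 3 can be sketched on in-memory data. This is only an illustration: the item ids (`item-a` and so on) and the function name `find_mixed_typing` are hypothetical, and the set of compound classes is assumed to be supplied by the caller rather than computed from the real classification tree.

```python
CHEMICAL_COMPOUND = "Q11173"


def find_mixed_typing(items, compound_classes):
    """Return ids of items that are instance of (P31) chemical compound
    and also subclass of (P279) one of the given compound classes."""
    flagged = []
    for item_id, statements in items.items():
        is_instance = CHEMICAL_COMPOUND in statements.get("P31", [])
        in_tree = any(
            parent in compound_classes for parent in statements.get("P279", [])
        )
        if is_instance and in_tree:
            flagged.append(item_id)
    return sorted(flagged)


# Classes named in this thread: amine, pyridines, salt.
compound_classes = {"Q167198", "Q47317020", "Q12370"}

# Hypothetical items, keyed by made-up ids.
items = {
    "item-a": {"P31": ["Q11173"], "P279": ["Q167198"]},  # both relations: flagged
    "item-b": {"P31": ["Q11173"]},                       # instance only
    "item-c": {"P279": ["Q12370"]},                      # subclass only
}

print(find_mixed_typing(items, compound_classes))
```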

However, the main problem IMO is that there are not many people involved in Wikidata:WikiProject Chemistry discussions and we couldn't come to an agreement whether chemical compounds are instances or subclasses of classes like chemical compound (Q11173), amine (Q167198), pyridines (Q47317020) or salt (Q12370).

SCIdude (talkcontribs)

Coming from molbio, where the inconsistencies are similar: e.g. specific proteins are mostly instance of protein, but many are also subclass-of protein. Then there are protein families, of which the specific proteins are part-of, but some are also subclass-of. Fortunately UniProt is the de facto database for proteins (although it's not complete), so the best way to query all available proteins is to query for the UniProt ID.


Obviously there is a hierarchy of things, expressed as a DAG (directed acyclic graph) where there are end nodes (substances, proteins) from which no subclasses or instances are made, and there are between-nodes or parents (substance groups, protein families) that are subclassed and instantiated. This is well known to object-oriented programmers, and I'm surprised there is no standard way of handling this across Wikidata. So, at least molbio and chem should aim for a consistent approach. Please ping me on pages where this is discussed.
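The end-node versus parent distinction can be sketched in a few lines. The edge list below is made up for illustration, and its choice of P31 or P279 for any given pair is arbitrary, since that choice is exactly what is under discussion. The function name `split_nodes` is hypothetical.

```python
from collections import defaultdict

# (child, relation, parent) edges; labels and relations are illustrative only.
EDGES = [
    ("water", "P31", "chemical compound"),
    ("pyridines", "P279", "chemical compound"),
    ("amine", "P279", "chemical compound"),
    ("methylamine", "P31", "amine"),
]


def split_nodes(edges):
    """Split the nodes of a DAG into end nodes (nothing points at them)
    and parents (at least one instance or subclass points at them)."""
    children = defaultdict(set)
    nodes = set()
    for child, _relation, parent in edges:
        children[parent].add(child)
        nodes.update((child, parent))
    parents = {node for node in nodes if children[node]}
    leaves = nodes - parents
    return leaves, parents


leaves, parents = split_nodes(EDGES)
print(sorted(leaves))
print(sorted(parents))
```

In this sample, "pyridines" comes out as an end node only because no member of that class is listed in the edges. On real data, a class with no members yet would look the same as a specific substance, which is one reason the relation type still matters.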

Wostr (talkcontribs)

Classification of chemical compounds was discussed on many WikiProject Chemistry subpages. You may want to check Wikidata:WikiProject Chemistry/Proposal:Models and its discussion page, maybe also Wikidata_talk:WikiProject_Chemistry/Archive/2018#Documenting_how_to_model_chemical_concepts_in_Wikidata, Wikidata_talk:WikiProject_Chemistry/Archive/2018#'Is_a'_=_'chemical_compound' and other archived discussion of Wikidata_talk:WikiProject_Chemistry.

What we were able to agree on is the classification of 'compounds without fully defined isomerism or isotopic composition', see Wikidata:WikiProject Chemistry/Guidelines. The classification that uses structural class of chemical compounds (Q47154513) and group of chemical compounds (Q56256086) (or its subclasses like group of stereoisomers (Q59199015)) is similar to the concept of open and closed classes/groups in ChEBI.

SCIdude (talkcontribs)

If the Wikiproject cannot agree on the design, I think we should try to agree here. The problem is not that ions have their separate tree, or that compounds with undefined stereoisomerism, cis/trans isomerism or tautomerism are actually groups/sets. Of course these ions and sets need to be identified and changed to the specific instances. The real design problems are, as you say, if a compound class gets a has-part statement, does this statement apply to the class, or does it apply to each of its instances? If we want it to apply to every instance then indeed the compound class must be subclass of a class of all objects where this statement holds:

Solution 2 would resolve the dilemma you have with has-part/part-of by moving the has-part one level up. We just would need 100 molecular entity classes. If you agree this could be proposed on one of the project pages. If you disagree we can find another solution, but please do comment.

Wostr (talkcontribs)

We cannot force a solution. If there is no agreement, the problem won't be fixed. The problem with solution 2 is that there would be too many P361-statements in some classes – and it was rejected once, when P361 was used as an obligatory inverse property for P527 (has part), because in items about carbon or oxygen there would be thousands of compounds listed and because P361/P527 is used for many other things (too broad properties). The problem with items like oxygen molecular entity (Q76272846) is also that in WD compounds are more substances than molecular entities (most physical properties, uses etc. describe the substance, not a single molecule).

By the way, there's even no agreement that we should classify compounds using classes ;)

For me, solution 1 would be the best and the simplest, but the main reason for not doing so was that it would be hard to retrieve all the compounds via SPARQL (cf. Wikidata:WikiProject Chemistry/Tools). That's why all chemical compounds have instance of (P31) chemical compound (Q11173) (despite also being an instance of its subclass), and that's why I'm now trying to clean up all items that are subclasses of chemical compound but are incorrectly described using instance of (P31) chemical compound (Q11173) (ions, groups of compounds, sometimes minerals, etc.), and also trying to expand and fix the existing classification tree for compounds.

I was also thinking about creating some sort of metaclass that could be used instead of instance of (P31) chemical compound (Q11173), so that every chemical compound could be queried via SPARQL without affecting the classification or the part of/has part relations. Then all existing instance of (P31) chemical compound (Q11173) statements could be switched to P279, or deleted if there is already a subclass of chemical compound (Q11173).

Wostr (talkcontribs)

BTW there may be another problem with instance of (P31) chemical compound (Q11173) in the future – the situation in which we have an item about a specific substance, i.e. a sample of a substance (in a museum or wherever). I think we already have a few items about minerals that are real instances (a rock in a museum with its number and everything). It is at least theoretically possible with compounds, I think.

SCIdude (talkcontribs)

I don't understand why you say that there would be too many P361-statements in some classes. The only item that would have as part oxygen would be oxygen molecule, all other compound classes subclass from it. That is the reason why I support the idea, instead of having has-part oxygen in every class and instance. Don't you agree?

You say if there is no agreement, the problem won't be fixed. I think that if there is agreement between the active people then this can be documented, and comments requested. This is one thing and worthwhile. And secondly, as you have seen, I have no problems doing mass edits via QuickStatements, once no valid objections have been raised. Just talk to me.

Wostr (talkcontribs)

There are more active people than just the two of us ;) I think this matter should be brought to the Wikiproject (again), because maybe there will be consensus now to do something about it. Both solutions may be described there, with a note that, if needed, there may exist a metaclass for all chemical compounds to make them easy to query (like structural class of chemical compounds (Q47154513) but for specific chemical compounds like water (Q283); there is chemical species (Q899336), but it doesn't seem entirely correct for a metaclass).

Reply to "Instance of chemical compound"
Charles Matthews (talkcontribs)

Let me explain about the "inorganic" definition that was used there. It comes from https://www.ncbi.nlm.nih.gov/mesh/68007287, i.e. from the MeSH descriptor ID (P486) system. The definition may not be standard in relation to its treatment of carbon compounds; but MeSH is a major system for searching the medical literature. I have used that to add several thousand links to the item: see for example https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Q73543107&limit=2000, where links to Q73543107 persist. I suppose those links will be updated soon.

The information in the corresponding main subject (P921) statements has an exact scope as defined by the MeSH page, and so with the given description. Because this information has value, I would like to undo the merge you did, and update Wikidata:Do not merge. I think where Help:Merge#Check to be sure talks about "subtle differences", that advice applies here.

Wostr (talkcontribs)

I see no differences between these two items and in fact there is no difference. Inorganic compounds do not have only one definition, and this class is usually defined as "any compound that is not organic" = "any compound that does not contain carbon, with the exception of carbonates, ..., ...". Keeping these two items separate is a mistake, because this is the same concept.

You may expand the definition in Q190065, because both definitions can be used interchangeably.

Charles Matthews (talkcontribs)

I think we are not understanding each other.

It does seem, from the English Wikipedia article "inorganic compound", that chemists are not very interested in the inorganic/organic boundary. They probably aren't concerned with it, in a practical way.

MeSH, Medical Subject Headings, is a "comprehensive controlled vocabulary for the purpose of indexing journal articles and books in the life sciences". The vocabulary is controlled by scope notes that define how it is properly used. In other words the index terms from the vocabulary carry the value of fairly precise definitions.

Here there are indeed multiple concepts. I work with a bot that imports these terms from the PubMed repository, where they are attributed by the MEDLINE indexing system. That indexation is carried out by human experts.

I certainly wish that in giving main subjects to biomedical papers, referencing PubMed, I'm able to attribute exact meanings.


Wostr (talkcontribs)

I'm not interested at all what is written in any language version of Wikipedia. Sitelinks to other Wikimedia projects are only an addition to Wikidata and do not define the Wikidata content.

The definition of inorganic compound in chemistry may be written as follows: "any chemical compound that is not organic, i.e. does not contain carbon, with the exception of some compound classes traditionally defined as inorganic, e.g. carbon monoxide, carbon dioxide, carbon disulfide, carbon diselenide, carbides, hydrogen cyanide, carbonic acid, cyanic acid, isocyanic acid, fulminic acid and its salts like carbonates, hydrogen carbonates, cyanides, cyanates". This definition covers both the English Wikipedia definition and the MeSH definition.

Charles Matthews (talkcontribs)

OK, thank you for the description, which I shall add to the item. We shall have to agree to disagree on other matters.

Wostr (talkcontribs)

Maybe there is some difference between these two concepts that I just can't see, but it must be something other than the definition from https://www.ncbi.nlm.nih.gov/mesh/68007287. MeSH definition is one that is widely used for inorganic compounds.

BTW I'm not sure about items describing compounds of a specific element. I've just noticed that all entries in MeSH like this one: https://meshb.nlm.nih.gov/#/record/ui?ui=D017610 are for inorganic compounds only. However, in WD we don't have items for inorganic compounds of a specific element yet. So Q12548019 has MeSH id (inorganic) matched to 'calcium compound' (both organic and inorganic), Q74819737 has description about inorganic compounds, but is matched to category for both organic and inorganic compounds.

Is it intentional (and narrow match (Q39893967) as a qualifier should be used) or maybe I should create items for 'inorganic compound of ...' and move MeSH id to such items?

Charles Matthews (talkcontribs)

For a very accurate analysis of the MeSH identifiers for compounds, one can look at the MeSH tree codes (P672). E.g. https://www.ncbi.nlm.nih.gov/mesh/68058085 for iron compounds has two, D01.490 and D02.691.550, where D01 means inorganic, and D02 organic. https://www.ncbi.nlm.nih.gov/mesh/68017612 for gold compounds has only D01.379, because the scope is inorganic only. So, yes, it would be possible to make these distinctions, and check them with queries.
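A minimal sketch of the check described above, taking a list of tree codes and reporting the scope by their top-level branch. The function name `mesh_scope` and the wording of the returned labels are hypothetical; the D01/D02 meaning and the sample codes are the ones quoted in this comment.

```python
def mesh_scope(tree_codes):
    """Classify a MeSH descriptor by the top-level branches of its tree codes:
    D01 is read as inorganic and D02 as organic."""
    branches = {code.split(".")[0] for code in tree_codes}
    inorganic = "D01" in branches
    organic = "D02" in branches
    if inorganic and organic:
        return "organic and inorganic"
    if inorganic:
        return "inorganic only"
    if organic:
        return "organic only"
    return "neither"


# Tree codes quoted above for iron compounds and gold compounds.
print(mesh_scope(["D01.490", "D02.691.550"]))
print(mesh_scope(["D01.379"]))
```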

At present I'm working to complete the P486 dataset, and this is not my major concern. There are some thousands of statements still to add.

Reply to "{{Q|73543107}}"

Properties for Polish dictionaries

KaMan (talkcontribs)
Wostr (talkcontribs)

Sure. I came across SGJP by chance from Wiktionary (from the Bar discussion page). It's strange, though, that this property was created after a single vote; that usually doesn't happen ;)

KaMan (talkcontribs)

Thank you. I also find it hard to understand that a single vote decides the creation of a property, but maybe that's because lexicographical data doesn't have many fans yet. I look through all the lexemes being created and there are only a few regular creators. As thanks for your vote, I'd gladly create a chemical lexeme for you :) Is there perhaps an interesting chemical word? I'm already creating entries for the chemical elements. Recently I built the entire etymological tree of "beryl" (beryllium): https://lucaswerkmeister.github.io/wikidata-lexeme-graph-builder/?subjects=L6291&predicates=P5191

Wostr (talkcontribs)

No problem. The elements are probably the most important chemical words ;)

They don't have many fans, just as Wikidata as a whole didn't at the beginning; only for some time now has traffic grown significantly, and steps are being taken to include data from WD in other projects. I'm very curious how lexicographical data will develop, because it looks interesting. If needed (for similar property proposals), you can write to me or just ping me.

KaMan (talkcontribs)
KaMan (talkcontribs)
KaMan (talkcontribs)
Wostr (talkcontribs)

No problem at all. Thanks to this I learn about sites I had no idea existed, and I often look things up in dictionaries ;)

KaMan (talkcontribs)
KaMan (talkcontribs)

This time I come to ask whether you would support Wikidata:Property proposal/usage example. It isn't another dictionary, though, but a property (plus 2 qualifiers) useful for transferring usage examples from Wiktionary, so you may have doubts. No pressure; if you don't support this proposal, that's fine :)

Wostr (talkcontribs)

Thanks, I'll try to take a look when I have a free moment ;)

KaMan (talkcontribs)
Wostr (talkcontribs)

Very interesting dictionaries, by the way :)

Fuzheado (talkcontribs)

Thanks, yeah I realized just after the merge there was probably a problem. But we need to resolve this, as I'm pretty sure there are erroneous claims that don't apply correctly, like the identifiers that don't seem to map correctly.

Wostr (talkcontribs)

Yes, you're probably right, but this is a problem with many, many items about chemical compounds – mass bot imports of data caused this, along with the fact that in Wikipedias two or more concepts are often described in a single article. In this case four items are probably needed to map all the ids: two about the stereoisomers, one about the pair of stereoisomers (something that is also called 'compound without defined stereochemistry'), and one about the racemic mixture of the two stereoisomers. I'll try to clean this up this week.

Fuzheado (talkcontribs)
149.210.234.143 (talkcontribs)

nicotinamide adenine dinucleotide

Jmarchn (talkcontribs)

I do not understand why you have reverted my edit. Obviously, a thing is not an instance of itself. In this case, it could be an instance of Cofactor and/or Nucleotide.

Wostr (talkcontribs)

You've merged items about different concepts. Check the ChEBI entries in both items. One is about a specific molecule, the other is about a group of molecules. Everything is fine as it is now: a well-defined molecule is an instance of a group of molecules.

Jmarchn (talkcontribs)

I do not agree with your response.

Now:

  • CHEBI: 13389 (NAD) -> "nicotinamide adenine dinucleotide" (Q61962141)
  • CHEBI: 44215 (NAD zwitterion) -> "nicotinamide adenine dinucleotide zwitterion", but tagged in wikidata also as "nicotinamide adenine dinucleotide" (Q12499775)

I refer to "Property: P31" = "Instance of" in Q12499775. It makes no sense for it to point to an item with an identical name, which therefore looks like the same thing.

To resolve this error, which you maintain is not one, I propose, perhaps, renaming Q12499775.

But that is not my war nor my field.

Wostr (talkcontribs)

I don't exactly know what is wrong. If the English label is incorrect, you may change it; but, the names in ChEBI are not determinants to how WD items should be named. What's important are the statements in both WD and ChEBI and the correlations between the concepts.

One is an instance of the other (correct); both items in WD have the same label in English, but this is not a problem; descriptions should be different and descriptive enough to distinguish the concepts, not the labels. There are many situations in chemistry where two items linked via P31/P279 have the same labels, but descriptions and statements in each item are different. What may be wrong here is that in the nicotinamide adenine dinucleotide (Q12499775) there may be interwiki links or other statements that should be moved to different WD items.

AnBuKu (talkcontribs)

Hello Wostr


My apologies for the trouble I caused on acryl. It was due to my attempt to add acryl to "material used" for Q63978964 (Balancing Bear). The only source I have mentions the materials as "Polyester, Acryllaminat, bemalt" (German). Thus I tried to add this to Q63978964, but then I got:

value type constraint

Values of material used statements should be instances or subclasses of one of the following classes (or of one of their subclasses), but acryl currently isn't:

Anyhow, maybe I leave "material used" as it is now (with exclamation mark), as I don't know how to do better.


Again, sorry for the trouble. Best regards ~~~~

Wostr (talkcontribs)

Q623834 is not what you're looking for, it's not material and it shouldn't be used in 'material used' property anywhere. It has the same name in German/English, but Q623834 is not about any polymeric material. You should search for a proper item, maybe some item being a subclass of Q423145?

AnBuKu (talkcontribs)

Thx, I have replaced the existing stuff with acrylic paint (Q207849), which seems to me the most appropriate. - Unfortunately, quite often certain expressions/names in one language don't really have an equivalent in another language, and it gets worse if these expressions/names, e.g. "Acryllaminat", are not proper scientific names/expressions but rather trade or marketing names. - Anyhow, thx again for your support :-)

Cz ja (talkcontribs)

Hello. I disagree with the change. The main article (the most polished, the most extensive, with the most links) is in English, see: https://en.wikipedia.org/wiki/Saturated_fat Q970537 (the meaty part of the article is further down).

The list of saturated fatty acids is here: https://en.wikipedia.org/wiki/List_of_saturated_fatty_acids (Wikidata Q5487901).

Practically none of this has been translated into Polish. The whole group, across many languages, should be merged into one article, "Nienasycone kwasy tłuszczowe" (unsaturated fatty acids), and likewise the subpage with the list of those acids.

If there are any problems, I'll gladly help. Regards.


Cz ja (talkcontribs)
Wostr (talkcontribs)

Infoboxes of this kind are not very common on the Polish Wikipedia. Our equivalent would simply be a navigation template at the bottom of the page. There are detailed instructions in pl:Szablon:Navbox; see also how it is done in, e.g., pl:Szablon:Karotenoidy. If you run into problems, write at pl:WP:CHEM, because I don't have much time right now; I'm away from home and can't make this template at the moment.

Cz ja (talkcontribs)

OK. Thanks for the reply.

Merging the entries saturated fatty acids and saturated fats (synonyms)

Cz ja (talkcontribs)

I'm requesting a merge of the entries on saturated fatty acids (the correct form) and saturated fats (the colloquial form), since they are synonyms: Wikidata Q11789887 and Q970537.

If there are any problems, I'll gladly help. Regards.

Wostr (talkcontribs)

They are not, at least not in chemical terms (a fat = an ester of a fatty acid and an alcohol (glycerol), so a fat cannot be a fatty acid). Q970537 has a definition covering both saturated fats and saturated fatty acids (so it is a definition closer to nutrition science), whereas Q11789887 contains a strictly chemical definition, which is necessary for the correct classification of chemical compounds. Besides, a merge is impossible because articles on both definitions exist in Wikipedia language editions (and there is no way to add two pages from the same project to a single item).

This reply also applies to the thread below. A merge is not possible, for substantive and classification reasons as well as technical ones.

Cz ja (talkcontribs)

OK. Thank you for the explanation.