Wikidata talk:WikiProject Chemistry


Organization question

Is it a problem if we keep English as the working language for this task force? Snipre (talk) 11:27, 14 February 2013 (UTC)

I think it might be a problem if we don't. Thanks again for starting this. Looks very good so far.--Saehrimnir (talk) 15:50, 14 February 2013 (UTC)

Base for one Data Record

Hi all, to be honest, up until now I was not involved with WikiData at all and I am not planning to change this very much. I just want to bring up one very basic but important question: what should a data record in Wikidata be based on?

To give you an example: de:Isoleucin - fr:Isoleucine - en:Isoleucine has in all language versions one article (and therefore just one WikiData interwiki data set). This article covers in all languages at least the two enantiomeric compounds: L-Isoleucine and D-Isoleucine. Both of them do have different CAS-Numbers, PubChem-Entries and physical properties like specific torsion angles and melting points. In deWP also the stereoisomeric compounds L-allo-Isoleucine and D-allo-Isoleucine are subject of the same article, in enWP in addition the two Isoleucines are part of the data table.

So what I want to point out is: To be very precise with the physical properties of stereoisomers, it will not be sufficient to have one data set per lemma; there must be one data set per stereoisomer (and, furthermore, if different isotopes are involved, one per isotope).--Mabschaaf (talk) 16:14, 14 February 2013 (UTC)

You are right about stereoisomers: data will have to be entered on the item for the correct compound. For isotopes, I don't think that for large molecules it is possible to detect a large difference in physical properties. Snipre (talk) 19:48, 14 February 2013 (UTC)
Isotope labelled compounds are rare in WP, but we should keep them in mind.
Far more important are compounds which are usually present as salts, like de:Ephedrin/fr:Ephedrine/en:Ephedrine, where some data relate to the hydrochloride, some to the sulphate and some to the hemihydrate. In deWP we try to handle this by adding a proper description to each value in the box; enWP does not display any further details in the box; frWP only takes this into account in the CAS/EINECS entries. In my opinion WikiData should have a data set for each of these different compounds. In other words: There should be a distinct data set for each isomer and each salt of each isomer. So one data set is clearly connected to a full substance name including stereochemical descriptors and counter ions/salts.--Mabschaaf (talk) 10:13, 15 February 2013 (UTC)


Proposition about general policy for data about chemicals and elements:

  • Data have to refer to the exact chemical/element defined by the item description.
    • For chemicals, a distinction has to be made between stereoisomers and mixtures of stereoisomers, i.e. data for a specific stereoisomer can't be added as a statement on the item describing a mixture of isomers
    • The same rule applies to salts, which have to be separated from the neutral form of the compound
    • If no item exists for the specific component please refer to the general policy of Wikidata to create the item
  • Data has to be referenced with the help of available structure. Referencing includes addition of conditions in which the data are measured according to available structure (qualifier(s)). The Chemistry task force defines mandatory references justifying the conservation of the statement.
Please comment on this proposal. Thanks, Snipre (talk) 20:18, 20 February 2013 (UTC)


A classification has to be organised to describe the chemicals. The first division can be organic/inorganic. Then the question is how we can classify the compounds: by functional group? Does anyone know a classification for chemical compounds?

  • Organic chemical
    • Hydrocarbon
      • Alkane
      • Alkene
      • Alkyne
    • Carbonyl
      • Ketone
      • Aldehyde
      • ...
    • ...
  • Inorganic chemical
    • ...
  • ...

Snipre (talk) 21:16, 25 February 2013 (UTC)

Organic/Inorganic is really not very useful for classification. In deWP substances are classified by functional groups (as you proposed above) with chemical elements on the top level (Hydrocarbons are part of Hydrogen containing compounds and carbon containing compounds). Just take a look at de:Kategorie:Chemische Verbindung nach Element (should be easy to understand even for non-German speakers).--Mabschaaf (talk) 19:01, 1 March 2013 (UTC)
I am not clear whether this discussion is about the "description field" or a property "compound class". In the first case "chemical compound" should be sufficient; in the latter I agree that we should classify by functional group.--Saehrimnir (talk) 19:47, 1 March 2013 (UTC)

Is it possible to come up with a classification that is close to the enwp category tree?

en:Category:Chemical compounds


Mange01 (talk) 22:23, 1 March 2013 (UTC)

Have you read what I wrote? A discrimination between organic and inorganic is just historical, not systematical. Methane is inorganic (by definition!), Ethane organic? Seems not very logical.
My proposal would be: Back to the roots! No complicated decision whether a compound is organic/inorganic or aromatic/aliphatic. It should be very easy, even for people who are not highly sophisticated chemists. Start with the chemical formula: C containing compound, H containing compound, Na containing compound, etc. Maybe we could discuss using this in the order of the Hill notation (top level: does the compound contain carbon atoms; second level: does it contain hydrogen; etc.). This would make the decision of how to classify pretty straightforward. --Mabschaaf (talk) 10:21, 12 March 2013 (UTC)
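The element-based classification proposed above could be sketched roughly as follows in Python. This is only an illustration, assuming flat formulas without brackets or charges; the function names `parse_formula`, `hill_order` and `element_classes` are hypothetical and not part of any Wikidata tool.

```python
import re

def parse_formula(formula):
    """Count atoms in a flat formula such as 'C2H7N' (assumption: no brackets, no charges)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def hill_order(symbols):
    """Order element symbols as in Hill notation: C first, then H, then the rest
    alphabetically; without carbon, everything is alphabetical."""
    symbols = set(symbols)
    if "C" in symbols:
        rest = sorted(symbols - {"C", "H"})
        return ["C"] + (["H"] if "H" in symbols else []) + rest
    return sorted(symbols)

def element_classes(formula):
    """Return the 'X containing compound' classes for a formula, top level first."""
    return [f"{symbol} containing compound" for symbol in hill_order(parse_formula(formula))]

print(element_classes("C2H7N"))
print(element_classes("NaCl"))
```

With this scheme a contributor only needs the formula to place a compound, which is the point of the proposal; whether the levels should really follow Hill order is still open in the discussion.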
So before entering the details we have to focus on the basics: I propose to fix the property "instance of" with value "chemical compound" for all pure chemical and property "instance of" with value "chemical substance" for mixture of pure chemicals. Snipre (talk) 22:49, 21 March 2013 (UTC)
It seems that, unlike in German, en:Chemical substance also applies only to pure chemicals; at least IUPAC has defined it so. So it would be better to have chemical compound and chemical mixture.--Saehrimnir (talk) 16:36, 23 March 2013 (UTC)

For the classification according to functional groups we need two things: a property and a list of functional groups. For the property we can use "instance of" (Property:P31) again, use another existing property, or create a new property specific to chemical classification (like chemical family or chemical class). For the functional groups, we need to define the list in order to give contributors an easy way to classify compounds themselves. Snipre (talk) 02:09, 29 March 2013 (UTC)


See Wikidata:Chemistry_task_force/Tools#Classification_trees Snipre (talk) 02:18, 29 March 2013 (UTC)

Sounds good.--Saehrimnir (talk) 15:04, 29 March 2013 (UTC)


Can the Chembox infobox parameters be used as property names? What parameters should be prioritized? See en:Template:Chembox. -- Mange01 (talk) 22:24, 1 March 2013 (UTC)

Look at Wikidata:Chemistry task force/Properties: we have already compared the en, de, and fr chemboxes in order to extract the main properties. Snipre (talk) 07:52, 2 March 2013 (UTC)

How do we write chemical formula in wikidata db ?

please give your opinion there. 11:53, 16 March 2013 (UTC)

Classifying chemicals with 'instance of' (P31) is incorrect

A few weeks ago, a bot added instance of (P31) claims to items about chemicals. This is problematic, since those items are not about instances. As explained in Help:Basic membership properties, P31 only applies to subjects that represent single, concrete things. For example, a particular molecule of ethylamine in a container on a lab bench would be an instance. Of course, Wikidata is not concerned with any one particular instance of ethylamine; it is interested in the class of thing called ethylamine.

This could be corrected by replacing those 'instance of' claims with subclass of (P279) claims. This notion is supported not only by the straightforward logic above, but also by the fact that ChEBI, the largest database of small chemical compounds that uses Semantic Web properties, uses 'subclass of' and not 'instance of' to classify compounds like ethylamine. (If you're interested and your computer can handle opening a ~137 MB file in a browser, then you can see for yourself in

This should be fixable by a routine bot request. What are others' thoughts? Emw (talk) 02:09, 10 April 2013 (UTC)

Instance of definitely is the wrong property here. We might say that the decay events of some radioactive elements are instances of radioactivity, but I agree subclass of is the better property in general.--Jasper Deng (talk) 02:13, 10 April 2013 (UTC)
You assume that a molecule of ethylamine is different from the general concept of ethylamine. That is not right, because the properties of an amount of ethylamine are not different from those of a molecule of ethylamine. Instead of proposing a modification, it would be better to propose definitions of subclass and instance that are applicable to all cases, especially to countable and uncountable elements. For me a subclass has to have the properties of a class, and a class can contain classes and instances. The question now is whether a class which can contain only one type of instance is still a class. For me, making a difference between one molecule of ethylamine and the concept of ethylamine is just brain mess. And the comparison with the ChEBI ontology is not correct because, from what I know, they have no concept below subclass. If you find a property similar to "instance of" in the ChEBI ontology I will agree with you; if not, Wikidata and ChEBI are not the same and the comparison cannot always hold. Snipre (talk) 11:39, 10 April 2013 (UTC)
@Emw The present definition of instance of is good for countable elements, but for uncountable elements you are creating an arbitrary distinction between one unique element and several identical elements. And if you want to push the details to the end: if you look at the property definitions, the item ethylamine is defined by specific properties, so there is no difference between one and several molecules. Look at the chemical formula and you will find C2H7N, which is the atomic composition of ONE molecule of ethylamine, and not C2nH7nNn, which is the atomic composition of n molecules of ethylamine. Snipre (talk) 12:17, 10 April 2013 (UTC)
The ChEBI ontology doesn't concern instances, it concerns only classes; thus one would not expect ChEBI to classify chemicals as instances. The source for the 'instance of' (P31) and 'subclass of' (P279) properties are rdf:type and rdfs:subClassOf, which are both W3C recommendations for the Semantic Web. If you look in the (huge) chebi.owl file, you'll see that ChEBI describes all chemicals with rdfs:subClassOf, which corresponds to 'subclass of' (P279). Given that both ChEBI and P31/P279 represent structured data using the same W3C recommendation, I think ChEBI's decision to use 'subclass of' instead of 'instance of' is relevant to Wikidata.
More importantly, though, the argument for using 'subclass of' instead of 'instance of' to classify chemicals is a straightforward appeal to the meaning of those two ontological terms. The distinction between them is explained in Help:Basic membership properties; the example closest to this case of chemicals is the one for 'quark'. It's the type-token distinction, which is the basis for differentiating classes (types) and instances (tokens).
I don't see how I'm making an arbitrary distinction between countable items and uncountable items -- Wikipedia has already done that. The Wikipedia article on ethylamine is clearly not about an individual molecule of ethylamine -- that is, the article is not about a single ethylamine molecule with a unique location in space and time. If it were, then the article would be about ethylamine as an instance. But the article is clearly about ethylamine as a class.
You entertain the question of whether a class that contains only instances that are identical except for their location in space and time is still a class. The answer is "yes". That's because an instance is fundamentally a thing with a unique location in space and time. While all instances of ethylamine might be exact copies of each other, they all occupy a different space and time. These molecules are each instances of a kind of thing (i.e. a class) called 'ethylamine'. This distinction can admittedly be a bit of a brain mess for certain subjects like chemicals. However, once the idea of an instance as a "spatiotemporal particular" is clear, cases like this become much easier to think about. Emw (talk) 04:03, 11 April 2013 (UTC)
An item is an instance of a class if it cannot be subdivided further without breaking its relation to the class. For example: USS Vincennes is an instance of Ticonderoga-class cruiser, but it is not an instance of Ship class, although Ticonderoga-class cruiser is an instance, not a subclass, of Ship class. Delta class submarine, however, is a subclass of Ship class since it subdivides into four different classes. The same principle applies to chemical substances: ethanol is an instance, not a subclass, of alcohol. It is a subclass of molecule, but since each ethanol molecule is indistinguishable from another, subdividing them is quite pointless. /Esquilo (talk) 08:29, 16 April 2013 (UTC)
  • Just a small addition to what was said above: it is possible to create an item for each molecule of ethanol, but then the properties of a molecule item will be the same as for the substance item. Time and place properties are not relevant because even if you label a molecule with a name and put it back into a large amount of other molecules, you can't find it again. The substance item is the lowest subdivision you can make in chemistry in terms of identification. As the property "instance of" is the lowest classification level, we have to match it with the lowest chemical subdivision. Snipre (talk) 08:47, 16 April 2013 (UTC)
  • Esquilo, have you read Help:Basic_membership_properties? Whether 'instance of' or 'subclass of' is applicable for a given Wikidata item is determined by whether that item is an instance or a class. An instance is a token and a class is a type; please see type-token distinction if the distinction between 'instance' and 'class' is unclear. If you're still not convinced that classifying ethanol and other chemical compounds with 'instance of' is incorrect, please see my more detailed reply at the Help:Basic_membership_properties talk page. Emw (talk) 01:40, 17 April 2013 (UTC)
Actually I have not (finding guidelines on Wikidata is more difficult than on other Wikimedia projects), but the example of USS Nimitz and Nimitz-class aircraft carrier matches my description exactly. It is simply applied inheritance and polymorphism of the same kind that is used in object-oriented programming. The sentence from the talk page, "homo sapiens is an instance of species, individual homo sapiens are not", is an even better example. /Esquilo (talk) 08:39, 17 April 2013 (UTC)
+1 for the programming concept. The classification relies on properties, not on conceptual distinctions: as position and time are not properties of the element, we cannot use them to make a distinction, even if it is possible to do so. A classification relies only on the properties you have in your classification, even if other classifications can do things differently. If you now create an item for an individual molecule of ethanol and we don't specify its position at a certain time by adding new properties, there will be no difference from the property set of the item ethanol, so how do you differentiate a molecule from the concept? In terms of Wikidata classification you can't, so the conceptual distinction between a molecule and its concept is wrong (again, according to the classification used in Wikidata right now). Snipre (talk) 10:28, 17 April 2013 (UTC)
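Since the thread appeals to object-oriented programming, the two positions can be illustrated with a small Python sketch. It is only an analogy and does not settle which Wikidata property is right; the class names are made up for the example.

```python
class ChemicalCompound:
    """Stands for the class 'chemical compound'."""
    formula = None

class Ethylamine(ChemicalCompound):
    """Stands for the kind of thing called ethylamine (a type)."""
    formula = "C2H7N"

# Two individual molecules (tokens): same properties, but distinct objects.
molecule_a = Ethylamine()
molecule_b = Ethylamine()

# Type-token view: the kind 'ethylamine' is a subclass, a single molecule is an instance.
print(issubclass(Ethylamine, ChemicalCompound))   # relation between two classes
print(isinstance(molecule_a, Ethylamine))         # relation between a token and its type
print(isinstance(Ethylamine, ChemicalCompound))   # the class itself is not an instance

# The objection raised above: nothing stored on the tokens tells them apart.
print(molecule_a.formula == molecule_b.formula, molecule_a is molecule_b)
```

In this analogy the tokens differ only by identity (the counterpart of a location in space and time), which is exactly the property that the objection says Wikidata does not record.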

Chemical formula

Hi. Alunite (Q338106) has an end-member formula of KAl3(SO4)2(OH)6. I'm comfortable with this; the formula is similar to what I learned at school. De.wikipedia uses a different notation: KAl₃[(OH)₆|(SO₄)₂]. Is this ok? --Chris.urs-o (talk) 13:49, 15 May 2013 (UTC)

Normally there is a rule for chemical writing but right now I can't say for inorganic compounds. 12:10, 16 May 2013 (UTC)
Are things moving? Is there a controversy? Or a new consensus building up? --Chris.urs-o (talk) 14:13, 16 May 2013 (UTC)
Square brackets of "anion complex" for minerals is nice to have, but not really essential. The formatting is anyway so limited in the current "chemical formula string" that we might as well leave them away. Once more advanced math-typesetting-datatypes become available we can reintroduce this concept. (It is also not always straight forward what should go into the anion complex: --Tobias1984 (talk) 14:25, 16 May 2013 (UTC)
Thanks, so de.wikipedia is right according to IUPAC rules. 15:49, 17 May 2013 (UTC)
They follow, but I think that is sometimes more up to date. --Chris.urs-o (talk) 15:25, 18 May 2013 (UTC)
I think we need to add a qualifier for the chemical formula to give the method used to write the formula.
Right now we have
  • Hill formula for organic component
  • complex rules for complex component
  • inorganic rules for salts and inorganic acids
If you know other rules please add them. Snipre (talk) 16:17, 18 May 2013 (UTC)
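A formula statement carrying such a qualifier could be sketched like this in Python. The `FormulaStatement` type and its `notation` field are hypothetical names for the proposed qualifier, and the check only handles the Hill case for flat formulas.

```python
import re
from dataclasses import dataclass

@dataclass
class FormulaStatement:
    value: str      # the formula string as stored
    notation: str   # hypothetical qualifier: "Hill", "complex" or "inorganic"

def is_hill_ordered(formula):
    """True if each element appears once and in Hill order:
    C, then H, then alphabetical; all alphabetical when there is no carbon."""
    symbols = re.findall(r"[A-Z][a-z]?", formula)
    if "C" in symbols:
        rest = sorted(s for s in symbols if s not in ("C", "H"))
        expected = ["C"] + (["H"] if "H" in symbols else []) + rest
    else:
        expected = sorted(symbols)
    return symbols == expected

def qualifier_is_consistent(statement):
    """Only the 'Hill' qualifier is checked in this sketch; other notations pass."""
    if statement.notation == "Hill":
        return is_hill_ordered(statement.value)
    return True

print(qualifier_is_consistent(FormulaStatement("C2H7N", "Hill")))
print(qualifier_is_consistent(FormulaStatement("KAl3(SO4)2(OH)6", "Hill")))
print(qualifier_is_consistent(FormulaStatement("KAl3(SO4)2(OH)6", "inorganic")))
```

The alunite formula from the start of this section fails the Hill check, which shows why a qualifier naming the notation would help a bot decide which rule to validate against.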

Just thinking...

...that you may be interested in this. --Ricordisamoa 05:41, 30 May 2013 (UTC)

Classification ... again

I am trying to match the Wikidata items for chemicals (around 4500) with their PubChem IDs in order to extract different data from the PubChem database. But I have some problems defining some chemical entities. To list the chemicals present in Wikidata I use instance of (P31) = chemical compound (Q11173) or a subclass of chemical compound (Q11173). By doing that I found some radicals and some mixtures of chemicals (isomer mixtures or substance mixtures) defined as instance of (P31) = chemical compound (Q11173). So I propose to reserve the use of instance of (P31) = chemical compound (Q11173) for a unique molecule (no mixture of different compounds) and for a unique isomer (no mixture of different isomers). Radicals and ions are not considered to be chemical compound (Q11173).

Chemical entity | Example | instance of (P31) | subclass of (P279) | Properties
Isomer mixture | butanol (Q663902) | - | chemical compound (Q11173) | ?
Simple isomer | 1-butanol (Q16391) | chemical compound (Q11173); butanol (Q663902) | - | ?
Simple isotope | dideuterium (Q6419441) | ? | ? | ?
Radical | methyl (Q4407) | radical (Q185056) | - | ?
Anion | carbonate (Q181699) | anion (Q107968) | - | ?
Cation | ammonium (Q190901) | cation (Q326277) | - | ?
Allotrope | diamond (Q5283) | chemical compound (Q11173) | - | ?

To solve the problem of allotropes and isotopes, we need to create an intermediate item between element/chemical compound and unique isotope/allotrope:

Can I1 and I2 be the same item ? Snipre (talk) 11:39, 28 September 2013 (UTC)
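The selection rule described at the top of this section (instance of = chemical compound, or of any subclass of it) can be sketched in Python on a toy data set. The dictionaries below are hand-written from the proposal in this section and are not a dump of real Wikidata claims; the function names are hypothetical.

```python
# Toy claims following the proposal in this section (not real Wikidata data).
SUBCLASS_OF = {                      # item -> values of subclass of (P279)
    "Q663902": ["Q11173"],           # butanol (isomer mixture) -> chemical compound
}
INSTANCE_OF = {                      # item -> values of instance of (P31)
    "Q16391": ["Q11173", "Q663902"], # 1-butanol -> chemical compound, butanol
    "Q4407": ["Q185056"],            # methyl -> radical
}

def subclasses_of(root, subclass_of):
    """Return root plus every class that reaches it through subclass-of links."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for child, parents in subclass_of.items():
            if child not in found and found.intersection(parents):
                found.add(child)
                changed = True
    return found

def select_instances(root, instance_of, subclass_of):
    """List items whose instance-of value is root or one of its subclasses."""
    classes = subclasses_of(root, subclass_of)
    return sorted(item for item, values in instance_of.items()
                  if classes.intersection(values))

print(select_instances("Q11173", INSTANCE_OF, SUBCLASS_OF))
```

With the claims modelled as proposed, the radical methyl (Q4407) is no longer picked up, and the isomer mixture butanol (Q663902) appears only as a class, not as a result.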

Source definition

Please look at this proposition to source use of ATC code (P267). Snipre (talk) 18:56, 13 October 2013 (UTC)

Collaboration with PubChem

While visiting NCBI recently to discuss ways in which they could collaborate with the Wikimedia community (see my notes), the idea came up to explore specifically how their database PubChem might fit with Wikidata. This has been discussed in an initial meeting with PubChem yesterday, in which they did indeed express an interest in finding out what Wikidata might offer to them, what kind of information we might be wishing to get from their site, and possibly in how well the information in their database matches with what we have (including on Wikipedia). They are working on exposing their data via RDF (scheduled release is in January; preliminary site is here) and open to inquiries, suggestions or other forms of feedback from the Wikidata community, including on the vocabulary they used and why. For a start, I'd suggest to collect such feedback here. I have also posted to the Wikidata mailing list. --Daniel Mietchen (talk) 06:18, 5 December 2013 (UTC)

@Daniel Mietchen: Thank you for your proposal. I was just thinking about an initiative to import the PubChem data into Wikidata, see ChemID initiative. The main purpose is to collect data from the different free databases and to match the corresponding chemicals between the databases in order to create a unique list of all data available from those databases.
Right now I am afraid we can't propose something to PubChem: we have first to match the Q items of our chemicals with PubChem ID. Then we can propose this list to Pubchem in order to allow them to create a link from their chemical pages to the corresponding item in Wikidata: this will give them access to the future data for each chemical in Wikidata. Snipre (talk) 18:34, 5 December 2013 (UTC)
@Snipre: Sorry if this is a dumb question, I am new to Wikidata. How do you get the Q-numbers from the Wikipedia articles with chembox templates? I just viewed the source of w:Methane, for example, and don't see any cross-reference from there to wikidata. Klortho (talk) 04:23, 9 December 2013 (UTC)
@Klortho: There is no direct way to find all those articles. There are 9348 transclusions ( Once we gather all the identifiers from those infoboxes, a query (e.g.[662]) would show all the q-items. --Tobias1984 (talk) 08:53, 9 December 2013 (UTC)
@Klortho:@Snipre:@Tobias1984: Wait, there _must_ be a way to get the Wikipedia-to-Wikidata mappings, right? (Embarrassed, I should know the answer from our similar effort on human genes and proteins...) But from w:Methane, the left-hand nav bar --> Languages --> "Edit links" clearly links to methane (Q37129), right? Despite my incredulity, in my experience Tobias1984 usually ends up being right about these things... Andrew Su (talk) 02:05, 10 December 2013 (UTC)
redirects to methane (Q37129). Klortho (talk) 07:15, 23 December 2013 (UTC)
@Daniel Mietchen: I saw that you have a bot. perhaps can you have a look at that request which is the first step to collaborate with other databases. Snipre (talk) 14:06, 6 December 2013 (UTC)
This would be a great idea. I know some of the PubChem people personally, and although we've talked about working together I've usually had other things keeping me away. If we have a group of people committed to working on this, we should seize this opportunity now! I'm very busy with final exams right now, but in 10 days or so I'll be able to commit some serious time to it - let me know how I can best help. Thanks for taking the initiative! Walkerma (talk) 01:25, 8 December 2013 (UTC)
@Snipre: I would be interested in helping with the bot, but since we do not have a Wikidata Toolkit yet, someone else would have to take the technical lead. --Daniel Mietchen (talk) 01:38, 10 December 2013 (UTC)
@Daniel Mietchen: Hi, you don't need to do that directly in Wikidata; just extract the data like here and we will work from that. By the way, if you have contact with the PubChem people, perhaps you can ask them how they got agreement from the ChEBI, ChEMBL and KEGG databases to import some of their data into the PubChem database. There are some incompatibilities between the licences. Snipre (talk) 17:58, 23 December 2013 (UTC)
@Daniel Mietchen: I enthusiastically support this idea. Scanning the PubChem record on methane, I think the identifier and descriptor mappings are no-brainers, as are the physiochemical properties. If we can figure out the links to other Wikidata entries based on the "Biomolecular Interactions and Pathways" section, I think that would be awesome. However, we should _not_ attempt to import all of the data in the "Biological Test Results" section. That is beyond the scope of what I think Wikidata should be (but obviously that's up to the community to decide). More generally, I think the rate-limiting factor in getting this done is developer time. There's probably some relevant code in our WikiDataGeneBot repository, but we're still looking for someone to maintain/develop it full time as well... Cheers, Andrew Su (talk) 02:05, 10 December 2013 (UTC)
@Andrew Su: Due to licence compatibility we can't import third-party data from PubChem. Right now we can only import data like SMILES, InChI, InChIKey, formula and CID. Snipre (talk) 18:01, 23 December 2013 (UTC)
I just had a good Skype discussion with someone from PubChem about working together, in a similar way to how w:WP:CHEM worked with CAS and ChemSpider to check IDs, and then to look at what data can be shared. I agree that the licence compatibility is an issue, but it seems PubChem keeps good track of provenance so we could perhaps select from sources that share data openly, or perhaps use data as part of a validation program. His concern was that PubChem just has so much data - changes run to terabytes per week - so we need to be able to be selective for just the data we need.
I think we also need to feed data INTO PubChem - the data should flow both ways. We've discussed how this might be done, and I'll be sending over a template Excel-type file for him to look at. We'll start with comparing identifiers, and maybe we can grow things from there to include physical properties. During this transition period it's going to involve people from both Wikidata and Please let me know your thoughts. I'll also cross post on WP:CHEM on the English Wikipedia. Thanks, Walkerma (talk) 17:04, 25 March 2014 (UTC)
@Walkerma: I already started something like that: see Wikidata:WikiProject_Chemistry/ChemID and for the excel file see that. The list of chemicals in the excel files correspond to the chemicals in the WP:fr. Perhaps you can do the same for WP:en. I know that WP:en has an Excel file with identifiers. The only thing to do is to add to each chemical on that list the Q number of wikidata. Snipre (talk) 19:49, 25 March 2014 (UTC)
And what we can import from PubChem are the InChI, InChIKey, SMILES, PubChem CID and IUPAC name. If you already provide that data for all chemicals in WD, we will have reached a good objective. Snipre (talk) 19:53, 25 March 2014 (UTC)
Thanks! I was following the ChemID project, and was hoping it would form part of that, but I hadn't seen your Excel sheet! That's perfect! What I think we need to do is to combine the English WP data with this one, then we can share it with PubChem. On the English WP we had a validation project to ensure that the above were correct (all except PubChemID and maybe SMILES), and you can see it is patrolled by bot (if someone vandalises data we indicate it with a red X). Many thanks! Walkerma (talk) 04:35, 26 March 2014 (UTC)
The best thing would be to have a third list of chemicals with PubChem CID and Q number from a third WP (typically the German one); then we can perform a comparison analysis to obtain a final list.
If you can get the English list of chemicals and put it on a public server, please put the address on the ChemID initiative page under the "Progress" paragraph. Snipre (talk) 10:26, 26 March 2014 (UTC)
I got in touch with the German WP to obtain the list of PubChem IDs with the Q numbers for chemicals (a bot request was created). I got in touch with Beetstra on WP:en to see if it is possible to get the English list of identifiers. Snipre (talk) 06:52, 27 March 2014 (UTC)
@Walkerma: I got the list of articles with CAS number, PubChem CID and Q number for WP:de and WP:en, see fr:Utilisateur:Snipre/Infobox Chimie/en and fr:Utilisateur:Snipre/Infobox Chimie/de. The French list is here. I have no time to start the analysis now, but if someone wants to work on them, feel free. Snipre (talk) 19:24, 9 April 2014 (UTC)
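The comparison analysis of the three lists could start from something like the following Python sketch. The input layout (one mapping of Q number to PubChem CID per Wikipedia) and the rule "at least two lists, all agreeing" are assumptions for illustration, and the sample values are invented placeholders, not real identifiers.

```python
from collections import defaultdict

def compare_sources(sources):
    """sources: {wiki: {qid: cid}}.
    Returns (agreed, needs_review): items where at least two lists give the
    same CID, and items with conflicting values or only a single source."""
    by_item = defaultdict(dict)
    for wiki, mapping in sources.items():
        for qid, cid in mapping.items():
            by_item[qid][wiki] = cid

    agreed, needs_review = {}, {}
    for qid, values in by_item.items():
        distinct = set(values.values())
        if len(values) >= 2 and len(distinct) == 1:
            agreed[qid] = distinct.pop()
        else:
            needs_review[qid] = values
    return agreed, needs_review

# Invented placeholder data: "Qa", "Qb", "Qc" are not real items.
sample = {
    "en": {"Qa": 1, "Qb": 2},
    "de": {"Qa": 1, "Qb": 3},
    "fr": {"Qa": 1, "Qc": 5},
}
agreed, needs_review = compare_sources(sample)
print(agreed)
print(needs_review)
```

Items in `needs_review` would then be checked by hand before the final list is shared with PubChem.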

Constraint violations

Finally we have some good constraint violations for chemicals. I already went through part of the list:

Hi, Tobias1984, can you help me figure out what the table of "unique value" constraint violations means? I would think that "unique value" means that at most one item on Wikidata is allowed to have a particular value for the PubChem ID (CID) (P662) property. So, that would mean that for any of the items on this list, the value for this property must be duplicated somewhere, right? But consider, for example, Lavendamycin (Q1808882), which is given a value of "100585". If my understanding is correct, then there must be another (at least one) Wikidata item with this same value for this property? But, I'd expect that other item to also show up on this list, but searching for "100585", it only shows up once. So, I am confused. What am I missing? Thanks! Klortho (talk) 20:28, 25 January 2014 (UTC)
Hi @Klortho:. You are right. Unique value means only one item can have the same string. The reason why your example doesn't have a second item is because I already merged that pair. But I didn't merge this pair yet: benzoyl peroxide (Q411424) and no label (Q15633266). The important thing is that we merge into the lower Q-number and list the other item for deletion. See Help:Merge. --Tobias1984 (talk) 23:26, 25 January 2014 (UTC)
Ah, I missed this in the header, "Some may already be fixed since the last update". Thanks! Klortho (talk) 03:16, 26 January 2014 (UTC)
Wikidata:Database_reports/Constraint_violations/P231 I went through the CAS-ID. Lots of Russian pages that are not connected to the rest of the wiki-world. Some duplicates are also from copy-pasted infoboxes where one of the IDs wasn't updated. We should make a habit of it to try to also correct the value on the respective Wikipedia. At least until a bot can do that on a regular basis. --Tobias1984 (talk) 09:59, 27 January 2014 (UTC)
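The "unique value" check and the merge convention discussed in this section (merge into the lower Q-number) can be sketched in Python as follows. This is not the code behind the constraint reports; the function names are hypothetical and "CID-A" is a placeholder because the shared value of the benzoyl peroxide pair is not given here.

```python
from collections import defaultdict

def q_number(qid):
    """Numeric part of an item ID, e.g. 'Q411424' -> 411424."""
    return int(qid[1:])

def unique_value_violations(claims):
    """claims: {qid: value}. Return {value: [qids]} for values used by more
    than one item, with the items sorted by ascending Q-number."""
    items_by_value = defaultdict(list)
    for qid, value in claims.items():
        items_by_value[value].append(qid)
    return {value: sorted(qids, key=q_number)
            for value, qids in items_by_value.items() if len(qids) > 1}

def merge_plan(violations):
    """Map each merge target (lowest Q-number) to the items to merge into it."""
    return {qids[0]: qids[1:] for qids in violations.values()}

claims = {
    "Q411424": "CID-A",     # benzoyl peroxide (placeholder value)
    "Q15633266": "CID-A",   # unlabelled duplicate (placeholder value)
    "Q1808882": "100585",   # Lavendamycin, value quoted above
}
violations = unique_value_violations(claims)
print(violations)
print(merge_plan(violations))
```

An item whose duplicate has already been merged, like Lavendamycin here, no longer shows up as a violation once the report is regenerated.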

Germanium subclass tree

I was working on the classification of Germanium compounds and isotopes a bit. What do you think of this structure:

Most of the subdivisions are also present in the Wikipedia categories. If you find this satisfactory we could model the rest of the chemical compounds in a similar way. --Tobias1984 (talk) 22:50, 14 February 2014 (UTC)

I'm hesitant to agree that compounds are subclasses of a substance. It seems to me that a compound would better be modeled as having the components (relation: has part) of that element.
Maybe I'm crazy though. :) --Izno (talk) 00:50, 15 February 2014 (UTC)
You're not crazy :) I'm still thinking if I made the right choice with the isotopes being a subclass of the element. - The tree also splits in germanium compound (Q15727447) and goes to Germanium and to "chemical compound". We could also remove the link to Germanium. The tree for "chemical compound" looks like this:

Currently it is 99 % incomplete though, because it only has the minerals and the germanium compounds. --Tobias1984 (talk) 10:26, 15 February 2014 (UTC)

I would say no: if this structure cannot be applied to all chemicals, it is not interesting from a classification point of view. Instead of this imported classification, we should use new properties to describe element composition. And we can do the same for functions. All other classifications will be too complex to be used by contributors without a deep knowledge of them.
But for isotopes I agree. Snipre (talk) 10:46, 15 February 2014 (UTC)
The subdivisions of germanium compounds into germanes, organogermanium-compounds and germanates are pretty standard. There might be some more obscure classes for germanium-compounds which we can still debate. - Why do you think that we can't apply this to all compounds? --Tobias1984 (talk) 11:36, 15 February 2014 (UTC)


< isotope of germanium (Q2288723) > subclass of (P279) < germanium (Q867) >

Not sure I agree; I would rather see germanium (Q867) as a class of classes. germanium-73 (Q2437511) is for sure a subclass of germanium (Q867), as a germanium-73 atom is for sure a germanium atom; it's more dubious that it is (all alone) a germanium isotope. It makes more sense to mark germanium (Q867) as

< isotope of germanium (Q2288723) (View with Reasonator) > subclass of (P279) miga < germanium (Q867) (View with Reasonator) >

 : <germanium isotopes> is the class of all classes which regroups isotopes atoms which have the same numbers of neutrons. TomT0m (talk) 12:14, 15 February 2014 (UTC)

Sounds good. Please make the changes. We should try to find a good tree for this element and model the other elements accordingly. --Tobias1984 (talk) 12:23, 15 February 2014 (UTC)
OK, here is my attempt :
  • Now I am a little confused. What is no label (Q15730548) and is it used in chemical literature? And for the subclass of (P279)-tree I think we should stick to text-book subclasses (A typical chapter on the chemistry of germanium will have chapters for germanes, germanates etc...) - These are subdivisions natural to the science of chemistry. And I also don't understand why you want to skip isotope of germanium (Q2288723). Why leave it out of the subclass tree if we even have Wikipedia-articles for it. --Tobias1984 (talk) 15:54, 15 February 2014 (UTC)
  • My bad, I got mixed up in those <censorded>*–$"</censorded> Q numbers. So, I leave it (
I would also add
  • The subdivisions in a textbook are not aligned with the meaning of the subclass property. The no label (Q15730548) item is maybe a little overkill, but I created it for element to be an instance of. We have several levels: the individual atom level (0). We regroup atoms in classes like hydrogen (1): the hydrogen class is, like the germanium class, a class of atoms with the same atomic number, which we call elements (3). We also regroup atoms in other kinds of classes that we call isotopes. Finally, isotopes and elements are the units we use to classify interesting sets of atoms (4)
plus as obviously the isotopes of germanium are a special kind of isotopes that share a property. TomT0m (talk) 16:34, 15 February 2014 (UTC)
We have to exclude "isotope of XXX" from the classification tree: it is a useless concept. Better to create a general item for each element and consider all isotopes of the element as subclasses. We have to be more systematic than Wikipedia articles, and even if there are articles for some isotopes you don't need to use this classification. Each isotope item has to be defined as "subclass of": "isotope" and all general element items as "subclass of": "element". I don't understand the need for no label (Q15730548): we can define "chemical element" as "subclass of": "atom". As we are speaking about concepts and never about a specific atom, "instance of" is not relevant. Snipre (talk) 17:09, 15 February 2014 (UTC)
Your tree is included in my suggested ontology. Apart from that, I don't see how expressing a little more is useless: we're in a project whose purpose is to structure data, so let's structure data. Plus it's difficult to make something more systematic than this. The "isotope of X" items are, for example, useful with simple queries. I'm not aware of other kinds of atom subclasses, but we can't exclude that there are some; let's build something robust, a little bit of redundancy does no harm. For the most abstract item, I think it's not a bad habit to try to put an instance of property on every item. Wikidata, in general, will have to classify things according to several points of view; this item can help with that, as it's an entry point for querying the standard classes of atoms in chemistry. But I went a little far, we're still in experiment time :) Concept classification is imho a very important feature in a project which aims to represent the sum of all knowledge (no less than that … :) ) TomT0m (talk) 17:39, 15 February 2014 (UTC)
For "And each isotope item has to be defined as"
< isotope item > subclass of (P279) < "isotope" >
No, this would mean that an atom item (an instance of an isotope item) is also an isotope item (hence a class of atoms), which does not make sense.
< isotope item > instance of (P31) < isotope >
, which means it is a member of the set of all isotopes. TomT0m (talk) 18:07, 15 February 2014 (UTC)
@TomT0m: The "isotope of X" can be expressed as a double query: item defined as "subclass of": "isotope" and as "subclass of": "X". So, as the query can do the job, why do we want to make our classification tree more complex? For me the "isotope of X" item is the same as the "X" item because all atoms of an element are isotopes. So your proposition mixes classification and query. And why do we need extra levels in the classification when no need is defined? For me no label (Q15730548) and isotope of germanium (Q2288723) are good examples of useless levels in classification: we have to create levels and branches when needed, not because we don't know. And experiments are not a good idea, because an experiment means someone will have to clean up, and cleaning is not always well done. If you really want to experiment, use the test server. Snipre (talk) 18:25, 15 February 2014 (UTC)
For the debate instance of vs. subclass of, I don't care, but if you really want a clear definition of the difference, speak with user:Emw about the semantic standards: according to Emw, instance of should be used only for a specific isotope you can trace at position x at time t, so one atom that you follow and for which you can give the position at any time. A labelled atom, like your aunt's dog, which has a name and is clearly identified among many other dogs of the same species. Snipre (talk) 18:32, 15 February 2014 (UTC)
(edit conflict) Mixing classification and query does not make sense. As said in another place, if a class has the same instances (claims) that the result of a query is supposed to return, it's an opportunity to add a consistency check, so not necessarily a bad thing. For the definition, I'll quote French Wikipedia: Un élément chimique désigne l'ensemble des atomes caractérisés par un nombre défini de protons dans leur noyau atomique. (In English: "A chemical element denotes the set of atoms characterized by a defined number of protons in their atomic nucleus.") It appears to exactly match the definition we have. We can define isotopes the same way, and we should, to keep things consistent and rigorous. And sourceable, otherwise this is a POV. TomT0m (talk) 18:41, 15 February 2014 (UTC)
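The consistency check mentioned in this comment can be sketched in a few lines. This is only an illustration on made-up toy data: the item labels and the SUBCLASS_OF dictionary are assumptions, not actual Wikidata content. It compares the result of the "double query" with the members of an explicit "isotope of germanium" class and reports any item that appears in one but not in the other.

```python
# Hypothetical toy data: item label -> set of declared "subclass of" targets.
SUBCLASS_OF = {
    "germanium-70": {"isotope", "germanium", "isotope of germanium"},
    "germanium-73": {"isotope", "germanium", "isotope of germanium"},
    "oxygen-18": {"isotope", "oxygen"},
    "germane": {"germanium compound"},
}


def double_query(data, first, second):
    """Items declared as a subclass of both `first` and `second`."""
    return {item for item, parents in data.items()
            if first in parents and second in parents}


def members(data, cls):
    """Items declared as a subclass of the explicit class `cls`."""
    return {item for item, parents in data.items() if cls in parents}


by_query = double_query(SUBCLASS_OF, "isotope", "germanium")
by_class = members(SUBCLASS_OF, "isotope of germanium")

# Symmetric difference: items found by only one of the two approaches.
mismatch = by_query ^ by_class
print(sorted(by_query), sorted(mismatch))
```

An empty mismatch means the explicit class and the query agree; a non-empty one would point at an item to review.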
Emw changed his mind: OWL2 allows a class item to be an instance of another class through punning. He was referring to an old version of the standard in which this was possible but would have made queries undecidable. This allows classes to be classified cleanly, which is fortunate. TomT0m (talk) 18:47, 15 February 2014 (UTC)
So in summary the two classifications according to instance of/subclass relations between germanium-73 and atom are:
1) germanium-73 -> germanium -> chemical element -> atom
2) germanium-73 -> isotope of germanium -> germanium -> chemical element -> atom kind class -> atom
For simplicity it is clear which version is the best; redundancy and checking are not necessary if the data import is well organized, with a bot doing the classification in one step and in a very short time. For me, including some checking structure already now is stupid, because who is doing that checking and according to which format? If we have to do something, it should be according to the current state of the tools and of the Wikidata organization, because nobody can say how a check system will work in the future. Perhaps all the things proposed here won't match the future specifications, so this is again useless at that point. Snipre (talk) 19:23, 15 February 2014 (UTC)
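To compare the two chains side by side, here is a minimal sketch. It treats every arrow as a generic "parent" link, since the summary does not say which arrows are instance of and which are subclass of, and it uses plain labels rather than real item identifiers. The function name and data layout are assumptions made for illustration only.

```python
# Each chain as a parent map: label -> list of direct parents.
CHAIN_1 = {
    "germanium-73": ["germanium"],
    "germanium": ["chemical element"],
    "chemical element": ["atom"],
}
CHAIN_2 = {
    "germanium-73": ["isotope of germanium"],
    "isotope of germanium": ["germanium"],
    "germanium": ["chemical element"],
    "chemical element": ["atom kind class"],
    "atom kind class": ["atom"],
}


def ancestors(graph, start):
    """Return every label reachable from `start`, in discovery order."""
    seen = []
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(graph.get(node, []))
    return seen


a1 = ancestors(CHAIN_1, "germanium-73")
a2 = ancestors(CHAIN_2, "germanium-73")

# Levels that exist only in the second proposal.
extra_levels = [label for label in a2 if label not in a1]
print(extra_levels)
```

Both chains end at "atom"; the second one only adds the two intermediate levels under debate.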
I don't understand your arrows. The relations make sense independently from the need of redundancies or not and are really no big deal in this case. Robustness is important far after the initial import as it can help to spot errors or vandalism in editions. I don't understand which specifications you are talking about. TomT0m (talk) 19:53, 15 February 2014 (UTC)
One example of what the isotope item is useful for right now, and I did not plan this: Reasonator on this item
Personally, I would reject use of anything related to something like "atom kind class". We need to have a centralized discussion about things like that, as the only person I've seen pushing the point of view that that would be useful is TomTom. (For better or worse.) It still seems evident to me that it duplicates information already implicit in the P279 claim that something is a subclass of an element/atom. Additionally, I find it highly unlikely that we would find literature to use to specify those claims. I really think we should hold off on doing anything like that for now where it doesn't make sense, and right now, it doesn't feel like it makes sense here because it simply sounds wrong. --Izno (talk) 03:43, 16 February 2014 (UTC)
(For better or worse.) :) I did not look much in that direction, but I would not be surprised if we are touching some kind of upper ontology concept here. There were already some discussions about that on project chat; I'll dig into this a little. TomT0m (talk) 10:19, 16 February 2014 (UTC)


Hi, I'd like to suggest that you add some more information to the Participants section, outlining the way the project works and how someone becomes a participant. I can guess that I could add my name to the list of participants, but that in itself would really only change the length of the list and doesn't practically make me a participant. I'm happy to dive in and start discussions but others may not be. --The chemistds (talk) 16:47, 4 April 2014 (UTC)

@The chemistds: You are more than welcome to start any discussions about the chemistry data here. Adding your name to the list has the advantage that we can ping all the participants to alert them about discussions which are not on anybody's watchlist. Tobias1984 (talk) 17:29, 4 April 2014 (UTC)
@The chemistds: Done Snipre (talk) 18:10, 4 April 2014 (UTC)

Atomic composition[edit]

The description of the atomic composition of a molecule can be done using has part (P527). See ethanol (Q153) as example. Snipre (talk) 08:55, 17 April 2014 (UTC)
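As a rough illustration of what such has part statements could carry, the sketch below models the composition of ethanol as a list of parts with an atom count and derives the molecular formula from it. The HasPart class and the count field are assumptions made for this example (the comment above does not say how counts would be stored), and element symbols are used instead of item identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HasPart:
    """One hypothetical 'has part' statement with an atom-count qualifier."""
    element_symbol: str
    count: int


# Ethanol, C2H6O: two carbon, six hydrogen and one oxygen atom.
ETHANOL_PARTS = [HasPart("C", 2), HasPart("H", 6), HasPart("O", 1)]


def hill_formula(parts):
    """Build a formula string in Hill order from the parts.

    Hill order: carbon first, then hydrogen, then the rest alphabetically;
    without carbon, all symbols are simply sorted alphabetically.
    """
    counts = {p.element_symbol: p.count for p in parts}
    order = [s for s in ("C", "H") if s in counts] if "C" in counts else []
    order += sorted(s for s in counts if s not in order)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in order)


print(hill_formula(ETHANOL_PARTS))
```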

Salt classification[edit]

Jasper Deng
the chemistds
Egon Willighagen
Daniel Mietchen
Andy Mabbett
Emily Temple-Wood
Pablo Busatto (Almondega)
Notified participants of WikiProject Chemistry: How can we classify salts?

Two PubChem CIDs in cobalt(II) cyanide (Q2620039)[edit]

Which is correct?--GZWDer (talk) 09:41, 6 July 2014 (UTC)

For cobalt cyanide, both are correct in the PubChem database. The best option is to contact the database to find out what the difference is. Snipre (talk) 14:29, 7 July 2014 (UTC)

To-Do for elements[edit]

toollabs:ricordisamoa/period has been hugely updated, now it shows periods and groups correctly, with labels in the user's language. But, for many items, the software couldn't find the correct position: they are reported on the top of the page. Some of them are missing atomic number (P1086), some others subclass of (P279) with a valid item for a group. Please add correct and sourced statements. Thanks in advance, --Ricordisamoa 02:06, 12 August 2014 (UTC)

Thanks for the report: it helps to see the missing data. Snipre (talk) 08:04, 13 August 2014 (UTC)
Basic support for lanthanides etc. has been also implemented. --Ricordisamoa 20:14, 18 August 2014 (UTC)
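The kind of report described in this section can be approximated with a small check. The snippet below is a sketch over invented toy data: the ITEMS dictionary does not reflect the actual state of any Wikidata item, and the "group 14" value is a placeholder label. It only shows the idea of listing, per item, which of the two required properties are absent.

```python
# Properties the tool needs: atomic number (P1086) and subclass of (P279).
REQUIRED_PROPERTIES = ("P1086", "P279")

# Hypothetical snapshot: item label -> claims present on the item.
ITEMS = {
    "germanium (Q867)": {"P1086": 32, "P279": "group 14"},
    "oxygen (Q629)": {"P1086": 8},
    "example element without data": {},
}


def missing_properties(items):
    """Map each incomplete item to the list of required properties it lacks."""
    report = {}
    for label, claims in items.items():
        missing = [p for p in REQUIRED_PROPERTIES if p not in claims]
        if missing:
            report[label] = missing
    return report


report = missing_properties(ITEMS)
for label, missing in report.items():
    print(label, "is missing", ", ".join(missing))
```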

Royal Society of Chemistry - Wikimedian in Residence[edit]

Hi folks,

I've just started work as w:Wikimedian in Residence at the w:Royal Society of Chemistry. Over the coming year, I'll be working with RSC staff and members, to help them to improve the coverage of chemistry-related topics in Wikipedia and sister projects.

You can keep track of progress at w:Wikipedia:GLAM/Royal Society of Chemistry, and use the talk page if you have any questions or suggestions.

How can I and the RSC support your work to improve Wikipedia? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:17, 24 September 2014 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Pigsonthewing: Hello, thanks for your proposition. Some ideas:

  1. create the links between the ChemSpider IDs and the Wikidata Q numbers. I think it would be a good thing for ChemSpider not to focus on WP:en only but to offer a link to all WPs through the WD Q number instead.
  2. offer support to create an ontology about chemistry based on the instance of, subclass, part of properties in order to generate a structure between the chemical definitions.
  3. share some data about chemical properties (when numeric datatype with unit will be available)
  4. take part to the matching of all different IDs used to identify chemicals.
  5. for WP articles, propose a generic template for chemical article.
  6. help to develop fundamental articles like chemistry, iupac nomenclature,...
  7. export biographic data from the RSC database to WD.

We need the expertise of people working in the chemistry field who can provide data or information that is beyond the access or knowledge of Mister X. I know that the WP system is not the best one for experts, because you can see your contributions modified by any user without being able to assert your expertise. Perhaps start by creating a global account for the RSC staff in order to allow better recognition by other contributors. Snipre (talk) 16:57, 25 September 2014 (UTC)

Thanks, Snipre - I've numbered your points for ease of response. #1 I've already suggested, and #4 is under discussion. #5 is en:Template:Chembox, surely? Or have I misunderstood? #7 would probably not be allowed, under UK data protection law. I'll discuss the rest with my new colleagues. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:24, 25 September 2014 (UTC)
For #5 I am thinking more about an evaluation of the current structure of the article, and of the chembox too, from a data visualization point of view: can a scientist use the data presented in WP articles, or is some important information missing (conditions of measurement, references,...)?
From my perspective #2 is the most challenging and interesting. Some scientists are working in the field of ontology, and I think this will help us to find examples in the literature or to develop them through discussions. Snipre (talk) 08:52, 26 September 2014 (UTC)

2. offer support to create an ontology about chemistry based on the instance of, subclass, part of properties in order to generate a structure between the chemical definitions.
Andy, are you familiar with ChEBI? It is the most widely used chemistry ontology. Background papers are here, here and here. ChEBI is developed by the European Bioinformatics Institute and, to my knowledge, endorsed and used by The Royal Society of Chemistry. I think any proper Wikidata chemistry ontology development should have as a requirement straightforward interoperability with ChEBI.
Colin Batchelor at RSC does a lot of ontology development and comments on ChEBI mailing lists, so I imagine he would be helpful to consult on developing a Wikidata chemistry ontology that's interoperable with major existing ontologies in this domain. Emw (talk) 12:38, 27 September 2014 (UTC)


@Pigsonthewing, Emw: The ChEBI ontology is a specialized one: we have to keep the relations as simple as possible to ensure the use of a common set of properties for the whole of Wikidata. Currently we have only "instance of", "subclass of" and "part of" relations, and even if we can create more relations/properties, a structure that is too complex will be difficult to maintain and to use for new/occasional contributors. Snipre (talk) 12:00, 29 September 2014 (UTC)

I have filled in the table with some ChEBI-Wikidata mappings. I'll elaborate tomorrow. Emw (talk) 04:03, 30 September 2014 (UTC)
ChEBI relation | Wikidata property | Example | In ChEBI | Note
- | instance of (P31) | | None in ChEBI | ChEBI follows the practice of BFO- and RO-based ontologies and does not include instances, e.g. "this molecule of ethanol in a bottle" -- just like Wikidata w.r.t. chemical entities. See 'Background' in Relations in Biomedical Ontologies (RO), Smith et al. 2005.
is_a | subclass of (P279) | oxygen-18 (Q662269) subclass of oxygen (Q629) | | "A has_subclass B = [definition] B is_a A.", 'Discussion' in RO. Note how all includes of 'is_a' are replaced by rdfs:subClassOf in chebi.owl (warning, big). P279 is mapped to rdfs:subClassOf on Wikidata.
part_of | part of (P361) | water (Q283) part of hydrate (Q462174) | | See 'Part_of' section in RO
has_part | has part (P527) | Inverse of above. | See above. | 'Discussion' in RO
has_role | None yet, see Wikidata:Property_proposal/Archive/25#has_role | Example | |
is_conjugate_acid_of (and is_conjugate_base_of) | ? | Example | |
has_functional_parent | ? | Example | |
is_substituent_group_from | part of | Example | |
Example | Example | Example | |
Regarding the notion that ChEBI is "specialized", it's worth noting that ChEBI's domain is chemical entities of biological interest. This includes many types of subatomic particles, minerals, mixtures, pharmaceutical compounds, biological macromolecules, and more. So it is a domain ontology, but that domain is fairly general. ChEBI also has far fewer properties (i.e. relations) than Wikidata, even if we consider only chemistry properties. Notably, ChEBI's main properties -- is_a, part_of and has_part -- already exist on Wikidata as subclass of (P279), part of (P361) and has part (P527).
So I don't think extra complexity will be a major issue in the effort to ensure straightforward compatibility with ChEBI. The issue will be adjusting usage of instance of (P31) and subclass of (P279) on Wikidata to be compatible with that in ChEBI and other scientific Semantic Web ontologies, e.g. by replacing virtually all statements like "instance of (P31) chemical compound" with statements like "subclass of (P279) chemical compound" (or some subclass of chemical compound). Emw (talk) 01:12, 2 October 2014 (UTC)
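The three mappings named in the table above are simple enough to express as a lookup. The sketch below is an illustration only: the function name and the triple layout are assumptions, and relations without an agreed Wikidata property (such as has_role) are deliberately left unmapped rather than guessed.

```python
# ChEBI relation -> Wikidata property, limited to the mappings stated above.
CHEBI_TO_WIKIDATA = {
    "is_a": "P279",      # subclass of
    "part_of": "P361",   # part of
    "has_part": "P527",  # has part
}


def translate(triple):
    """Translate a (subject, ChEBI relation, object) triple.

    Returns None when no Wikidata property is known for the relation.
    """
    subject, relation, obj = triple
    prop = CHEBI_TO_WIKIDATA.get(relation)
    if prop is None:
        return None
    return (subject, prop, obj)


print(translate(("oxygen-18", "is_a", "oxygen")))
print(translate(("ethanol", "has_role", "solvent")))
```

Returning None for unmapped relations keeps an import from silently inventing a property.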
@Emw: By specialized I meant that ChEBI works for chemistry but not for persons, plants,... Wikidata is larger than ChEBI, so the ontology of WD should be applicable to the different fields of WD in order to avoid particular classifications according to fields. Contributors should be able to work in any field without having to learn different systems. So the classification of chemicals and other chemistry subjects in WD should be related more to the other classification schemes used on WD than to ChEBI. WD is not a mirror of ChEBI, so we have no interest in just copying the ChEBI structure, even if it is an authority in its field. WD is more a result of the popularization of chemistry than an advanced classification system in a specific science. So ChEBI is not a reference, just an example. Snipre (talk) 11:42, 10 October 2014 (UTC)
Snipre, I encourage you to read Relations in Biomedical Ontologies. The ontological relations described there -- instance_of, is_a and part_of -- align with generic properties widely used on Wikidata as outlined in the table above. They apply to persons, plants and beyond -- and also to chemical compounds. Thus, while ChEBI is specialized to the domain of chemistry, the foundational properties (i.e. relations) it uses are applicable to all domains of knowledge.
The problem here is that WikiProject Chemistry is using an idiosyncratic definition of "instance" which makes this project incompatible with not only the world's most widely used chemistry ontology, but also other reference domain ontologies based on the Relation Ontology -- e.g. Gene Ontology, Disease Ontology, Plant Ontology, etc. Emw (talk) 22:18, 27 October 2014 (UTC)

Relevant discussion on wikidata-l[edit]

Please see the discussion at and It will likely affect how chemical compounds are classified on Wikidata. Thanks, Emw (talk) 12:54, 8 October 2014 (UTC)

Again I find no reason to compare a Porsche 356 and ethanol: you can always define an instance of a Porsche 356 by some unique characteristics (chassis number, events where the car was used, famous owners or not,...) but you can't do the same for a molecule of ethanol. You can describe some energy states of a molecule, or its position, but if you stop following it for a moment and then try to find it again, you won't be able to, because a molecule can't have its own specific properties; that's a scientific rule. So again, comparing a car model with a chemical is not correct. That is why we don't have a "chemical model" or "chemical compound type": this concept is not necessary in chemistry, and making a distinction between ethanol as a chemical and ethanol as a molecule is nonsense from a properties point of view.
Here we reach the heart of the problem: what is the definition of an "instance"? For me an instance is an entity which has its own characteristics over time. A Porsche 356 can have some unique characteristics over time, like its chassis number. But not a molecule of ethanol: you can specify its position or its energy level, but this is not characteristic of this molecule (other molecules can have the same energy level and can have the same position at a different time). It is the same as creating different items for a country because its population is changing over time. The specific population is not a characteristic of the country, just as the position or the energy level is not a characteristic of a molecule of ethanol. So even if it is possible to create an item for one molecule, this does not make sense because you can't provide unique characteristics for this molecule. Snipre (talk) 13:12, 10 October 2014 (UTC)
ChEBI subscribes to the Relation Ontology, which uses the definition of instance from the Basic Formal Ontology:
Instances are individuals (particulars, tokens) of special sorts. Thus each is a simply located entity, bound to a specific (normally topologically connected) location in space and time.
Thus, even if they are identical in every respect except location, two molecules of ethanol are indeed instances. The fact that they are merely spatiotemporally distinguishable is sufficient to make them instances of the class (aka universal, type) ethanol. This is an established practice in not only ChEBI, the world's most popular chemistry ontology, but also ontologies in many other domains of knowledge. You make rather bold claims ("that's a scientific rule") without citing any scientific or ontological sources. Please provide some relevant literature or, even better, existing Semantic Web ontologies that support your idiosyncratic definition of instance.
Even though spatiotemporal distinguishability alone is enough to make something an instance per ChEBI, RO, BFO and a wide raft of established philosophy, there are other properties that can make two entities of ethanol distinguishable. For example, one ethanol molecule could have a different isotopic composition -- one molecule could have its hydrogens in the form of deuterium (hydrogen-2, i.e. be deuterated ethanol (Q1101193)) and the other could have its hydrogen in the more familiar form of protium (hydrogen-1). This is a prima facie argument that ethanol is a subclass of, not an instance of, chemical compound. Emw (talk) 12:51, 27 October 2014 (UTC)

Two new properties[edit]

@Pigsonthewing: Gmelin Number (P1578) and Beilstein Registry Number (P1579) are ready. -Tobias1984 (talk) 17:00, 26 October 2014 (UTC)

Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:54, 26 October 2014 (UTC)

Launch of WikiProject Wikidata for research[edit]

Hi, this is to let you know that we've launched WikiProject Wikidata for research in order to stimulate a closer interaction between Wikidata and research, both on a technical and a community level. As a first activity, we are drafting a research proposal on the matter (cf. blog post). It would be great if you would see room for interaction! Thanks, --Daniel Mietchen (talk) 01:39, 9 December 2014 (UTC)

Precision in identifier mapping[edit]

The following is from an email discussion with Gang Fu of PubChem: "The challenge would be that different data sources have different chemical structural representation for the same drug ingredient. For instance, the following PubChem compounds were mapped to drug ingredient 'ETHAMBUTOL (NDFRT: N0000147838)':

My question now is how we should map that. To get things going, I have added all of these PubChem IDs to Ethambutol (Q412318). I think most of these should be split off into separate items eventually, so I'd like to invite comments as to the granularity and thus precision we should be aiming at. --Daniel Mietchen (talk) 08:56, 17 December 2014 (UTC)

@Daniel Mietchen: The project is not active enough to have reached a high level of definition of the data structure. But in my opinion we have to create one item for each stereoisomer and each salt. We even have to create specific items for mixtures of stereoisomers. So for dichloroethene we will have 3 items: one for the mixture, one for the Z and one for the E form. This represents a huge number of items, but we don't need to import all possible molecules at once and we have to concentrate on the most important molecules first. But in the case of stereoisomers, once one molecule is added, we should add the whole family in order to be sure that data can be added at the right place. Snipre (talk) 09:19, 17 December 2014 (UTC)

Free 'RSC Gold' accounts[edit]

I am pleased to announce, as Wikimedian in Residence at Royal Society of Chemistry (Q905549), the donation of 100 "RSC Gold" accounts, for use by editors wishing to use RSC journal content to expand articles/ items on chemistry-related topics. Please visit en:Wikipedia:RSC Gold for details, to check your eligibility, and to request an account. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:28, 18 December 2014 (UTC)


IP changed a chemical formula: [1]
Regards --Chris.urs-o (talk) 06:39, 24 January 2015 (UTC)

Chemical element in french and english[edit]

Hi, I started a discussion on enwiki WikiProject about this. See en:Wikipedia talk:WikiProject Chemistry#Chemical element : french and english definition. TomT0m (talk) 13:24, 7 June 2015 (UTC)

Chemical mixtures[edit]

@Snipre: Should chemical mixtures be tagged with material used (P186) or with has part (P527)?--Kopiersperre (talk) 16:04, 12 June 2015 (UTC)

@Kopiersperre: I would say has part (P527). It's exactly the kind of relationship it's supposed to express, while material used (P186) seems to be linked not to the parts themselves but to what the parts are made of: a wooden ship is made of wood, and has a hull (a wooden hull :) ). But has part would also work for
< wooden ship > has part < wooden hull >
. TomT0m (talk) 16:33, 12 June 2015 (UTC)
@TomT0m: But has part (P527) is already used for the elemental composition (as seen in ethanol (Q153)). I would like to state Aerozine 50 (Q16831) consists of 1,1-Dimethylhydrazine (Q161296) (50 %) and hydrazine (Q58447) (50 %).--Kopiersperre (talk) 16:51, 12 June 2015 (UTC)
@Kopiersperre: A misread ? I say has part (P527) is good :) TomT0m (talk) 16:59, 12 June 2015 (UTC)
@Kopiersperre, TomT0m: has part (P527) is a good start but we need to specify the amount or percentage. Perhaps a specific property can help us avoid using has part (P527) for everything. But in any case we have to think about a qualifier to specify the percentage. Snipre (talk) 19:33, 12 June 2015 (UTC)
@Kopiersperre, Snipre: Can we convert the percentage into raw numbers? There is already a qualifier for this: quantity (P1114). I guess a percentage qualifier is needed anyway for classes of variable-size molecules with constant ratios of components, if some of those are of interest (I'm not a chemist :) ) ... TomT0m (talk) 09:05, 13 June 2015 (UTC)
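To make the Aerozine 50 case concrete, here is a small sketch of has part statements that carry a share of the mixture. The "share" field stands in for a percentage qualifier that did not exist at the time of this discussion, so it is purely hypothetical, as are the helper names. The check simply verifies that the stated shares add up to the whole mixture.

```python
from fractions import Fraction

# (component label, share of the mixture); the share qualifier is hypothetical.
AEROZINE_50_PARTS = [
    ("1,1-Dimethylhydrazine (Q161296)", Fraction(50, 100)),
    ("hydrazine (Q58447)", Fraction(50, 100)),
]


def shares_are_complete(parts):
    """True when the stated shares add up to exactly the whole mixture."""
    return sum(share for _, share in parts) == 1


def as_percent(parts):
    """Render each share as a percentage for display."""
    return {label: float(share * 100) for label, share in parts}


print(shares_are_complete(AEROZINE_50_PARTS))
print(as_percent(AEROZINE_50_PARTS))
```

Exact fractions are used instead of floats so that the completeness check is not affected by rounding.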

I found this thesis which describes (page 13) the "has part" relationship in ChEBI exactly like this, although I couldn't find a real example of this in ChEBI itself in a quick search. The example given in the ChEBI tutorial (page 9, search potassium) is an example of an ionic bond, if I'm correct ... TomT0m (talk) 09:34, 13 June 2015 (UTC)

property SMILES: canonical and isomeric[edit]

Currently we have only one property for the SMILES notation: canonical SMILES (P233). But SMILES notation can have several notations: one with the isomer description and one without. Do we need to have two properties or to add a qualifier to distinguish both notations ? Snipre (talk) 11:24, 17 June 2015 (UTC)
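For readers less familiar with the two notations, the sketch below shows hand-written SMILES strings for 1,2-dichloroethene with and without the double-bond configuration, together with a naive test for stereo information. The strings are valid SMILES but are not claimed to be the canonical output of any particular toolkit, and the marker test is a simplification assumed for this example: it only looks for the characters `/`, `\` and `@`.

```python
# Characters that carry stereo information in SMILES:
# '/' and '\' for double-bond configuration, '@' for tetrahedral centres.
STEREO_MARKERS = ("/", "\\", "@")

# Hand-written SMILES for 1,2-dichloroethene (not toolkit-canonical).
DICHLOROETHENE = {
    "unspecified or mixture": "ClC=CCl",
    "E": "Cl/C=C/Cl",
    "Z": "Cl/C=C\\Cl",
}


def has_stereo_information(smiles):
    """Naive check: does the string contain any stereo marker?"""
    return any(marker in smiles for marker in STEREO_MARKERS)


flags = {form: has_stereo_information(s) for form, s in DICHLOROETHENE.items()}
print(flags)
```

Whichever option is chosen (two properties or one property with a qualifier), a check of this kind could help decide where an existing value belongs.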

State & phase[edit]

Should state of matter (Q11430) and phase (Q104837) be merged? The former has "phase" as an alias, and they both point to commons:Category:States of aggregation, which suggest so. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:48, 18 June 2015 (UTC)

@Pigsonthewing: Phase is not equal to state of matter. State of matter is limited to solid, liquid, gas and plasma (I know there are some other, more exotic ones) and phase is just "An entity of a material system which is uniform in chemical composition and physical state" according to IUPAC. You can have 2 liquid phases in a bottle (think about a mixture of water and oil). Both are liquid, but as there is a separation into 2 phases this can't be considered as one entity. So phase used as state of matter is not correct and you can't merge both. Snipre (talk) 15:50, 18 June 2015 (UTC)
Fair enough. In that case, how should we disentangle the two items? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:03, 18 June 2015 (UTC)
@Pigsonthewing: By using clear definitions:
State of matter: one distinct form of the matter
Phase: An entity of a material system which is uniform in chemical composition and physical state. Snipre (talk) 08:35, 19 June 2015 (UTC)
But note the issues in my first post in this section: aliases, and a shared commons category. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:55, 19 June 2015 (UTC)
For the alias, we have to delete phase as alias of state, and the commons category we have to choose according to the documents we have in the category. Snipre (talk) 12:30, 19 June 2015 (UTC)


There are two ways to define elements

  1. Types of atoms with the same atomic number (aka. the set of all atoms for some atomic number)
  2. Types of substances with only one type of atoms (in definition 1)

It seems that we have an interwiki conflict here, because en:Chemical elements uses 1. as its main definition while, for example, fr:élément chimique uses the second. I'm afraid I failed to gain a consensus, even just for this pair of languages, to align the definitions, so I guess we won't avoid the splitting of items; this will give work to WD:XLINK. For each element … The good news is that it will clarify some classification issues (assuming we solve the same problem for chemical substance, molecular entity and all):

< Hydrogen (en) > subclass of (P279) < pure chemical substance >
< Hydrogène (fr) > subclass of (P279) < atom >
< Hydrogène (fr) > part of (P361) < Hydrogen (en) >
Hydrogen (en) = hydrogène élémentaire/hydrogène pur (fr)
Hydrogène (fr) = Hydrogen atom (en)
< Hydrogène (fr) > instance of (P31) < élément chimique (fr) >
élément chimique (fr) = "type of atom occurring in some element" (en)
< USS Akron (Q1456109) > has part (P527) < Helium (en) >

Does that seem correct ? @Emw, Snipre: of course.


Aren't diamond (Q5283) and graphite (Q5309) instance of (P31) chemical substance (Q79529)?

No, they are . Each diamond is a substance and there are many diamonds. author  TomT0m / talk page 11:19, 18 August 2015 (UTC)
@TomT0m: It seems to me an instance of (P31) of chemical substance (Q79529). See the registry at NIST WebBook. It can be a class of gemstone (Q83437), but they are chemically equivalent. Almondega (talk) 12:22, 18 August 2015 (UTC)
@Almondega: see In ChEBI, is_a is the equivalent of subclass of, if my understanding is correct. author  TomT0m / talk page 12:31, 18 August 2015 (UTC)
@TomT0m: diamond (Q5283) also figures in The Merk Index 13th Ed. (Q20819290) at inventory number (P217) 3007. But there aren't any chemical substances (Q79529) that are instances of diamond (Q5283). I honestly do not see how the ChEBI ontology corroborates your thinking. Almondega (talk) 12:54, 18 August 2015 (UTC)
I honestly do not understand what you mean. Are you aware of the basic classification principles explained in Help:Classification? author  TomT0m / talk page 13:10, 18 August 2015 (UTC)
@TomT0m: Ok, but why exactly would diamond (Q5283) be a subclass of (P279) chemical substance (Q79529)? Almondega (talk) 13:33, 18 August 2015 (UTC)
@Almondega: Because each diamond is a pure substance, like each bottle of helium we use to fill balloons at children's parties. One bottle and one diamond are tokens in the sense of the type/token distinction principle: real objects of the real world. Diamond is a class of real-world objects, a type in the type/token distinction sense, with a lot of instances. Similarly, pure substance is a class of real-world objects: the class of all real-world objects that are chemically pure. Then the set of all diamonds is a subset of all pure chemical substances, and this is precisely what
< Diamond > subclass of (P279) < pure substance >
 means. If we want to say that "diamond" is an instance of something, then the something is a metaclass. This could be labelled "chemical substance type" for example. author  TomT0m / talk page 14:15, 18 August 2015 (UTC)
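To make the type/token point above concrete, here is a minimal Python sketch. It is only an illustration, not how Wikidata stores anything: classes are modelled as plain sets of tokens, and all the token names (`diamond_1`, `helium_bottle`, ...) and helper functions are hypothetical.

```python
# Tokens: individual real-world objects (hypothetical examples).
diamond_1 = "one particular diamond"
diamond_2 = "another particular diamond"
helium_bottle = "one bottle of helium for party balloons"

# Types (classes) modelled as sets of tokens.
diamond = frozenset({diamond_1, diamond_2})
pure_substance = frozenset({diamond_1, diamond_2, helium_bottle})

# A metaclass is a set whose members are themselves classes.
# The label "chemical substance type" is the one suggested in the thread.
chemical_substance_type = {diamond}


def instance_of(item, cls):
    """instance of (P31), read as set membership."""
    return item in cls


def subclass_of(sub, sup):
    """subclass of (P279), read as the subset relation."""
    return sub <= sup


print(subclass_of(diamond, pure_substance))           # every diamond is a pure substance
print(instance_of(diamond_1, pure_substance))         # so each single diamond is an instance
print(instance_of(diamond, pure_substance))           # the class itself is not a token
print(instance_of(diamond, chemical_substance_type))  # but it is an instance of the metaclass
```

Under this reading, "diamond" as a whole can only be an instance of a metaclass, while each individual diamond is an instance of both "diamond" and "pure substance".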
@TomT0m: It is not yet clear to me. By this reasoning silicon dioxide (Q116269) should be a subclass of chemical substance, since it describes several instances of that material in the world. Almondega (talk) 14:53, 18 August 2015 (UTC)
@Almondega: There is an ambiguity that recurs all over chemistry, between substances and the molecules (or similar entities) they are made of, and it reaches even basic definitions like that of chemical element. For example, if you search for "silicon dioxide" in ChEBI, you'll find (more precisely) that it's a molecular entity: a type of molecule, in other words. Then there may be some substances made of "silicon dioxide molecule" ... we could call them "pure silicon dioxide substances". "silicon dioxide molecule" would be a subclass of molecule, and "pure silicon dioxide substance" would be a subclass of chemical substance, with
< pure silicon dioxide substance > has part (P527) < "silicon dioxide molecule" >
As I said, this distinction is often ambiguous in everyday chemistry, as the same word is often used for the two concepts. For example, a "chemical element" may be a substance or a type of atom depending on the definition you choose (frwiki and others chose the second, enwiki chose the first, so we have a beautiful interwiki conflict to manage ...). author  TomT0m / talk page 15:21, 18 August 2015 (UTC)
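The substance/molecule split described above can be sketched as a tiny set of statements. This is a rough illustration under stated assumptions: the item labels are plain strings rather than real Wikidata items, and the `superclasses` helper is hypothetical; only the property IDs P279 and P527 come from the discussion.

```python
SUBCLASS_OF = "P279"
HAS_PART = "P527"

# Hypothetical items, named after the labels proposed in the thread.
statements = {
    ("pure silicon dioxide substance", SUBCLASS_OF, "chemical substance"),
    ("silicon dioxide molecule", SUBCLASS_OF, "molecule"),
    ("pure silicon dioxide substance", HAS_PART, "silicon dioxide molecule"),
}


def superclasses(item, stmts):
    """Return every class reachable from item by following P279 links."""
    found = set()
    frontier = [item]
    while frontier:
        current = frontier.pop()
        for subject, prop, obj in stmts:
            if subject == current and prop == SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


print(superclasses("pure silicon dioxide substance", statements))
print(superclasses("silicon dioxide molecule", statements))
```

Keeping two separate items means the substance is classified only under "chemical substance" and the molecule only under "molecule"; the "has part" statement links them without merging the two hierarchies.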
@TomT0m: I'm increasingly confused :( Almondega (talk) 15:33, 18 August 2015 (UTC)
It's just that your question raised another ambiguity :) but essentially yes, you were right: silicon dioxide substance is a subclass of chemical substance according to the type/token distinction principle. author  TomT0m / talk page 15:44, 18 August 2015 (UTC)