Wikidata:WikiProject Chemistry/Structure-related issues with partially ionic entities

Background

An ionic bond is a bond between atoms with sharply different electronegativities. A chemical entity held together by such bonds is neutral overall but consists of positively charged cations and negatively charged anions. In practice, however, chemical compounds are neither purely ionic nor purely covalent: each bond between atoms exhibits some ionic character and some covalent character.

Such partial bonding character is difficult to capture in structural formulas and therefore cannot be reproduced correctly by identifiers derived from them, such as the Simplified Molecular Input Line Entry System (SMILES) or the International Chemical Identifier (InChI). As a result, a chemical entity is represented as either ionic or covalent, and chemical databases can contain several records for the same chemical entity, generated from different structure-based identifiers.

Affected properties

In Wikidata, structure-related properties like InChI (P234), InChIKey (P235) and canonical SMILES (P233) are affected by this, as are identifiers from large-scale chemical databases like ChemSpider ID (P661) or PubChem CID (P662). Entries in such databases are usually not considered incorrect, even when an entry depicts a predominantly ionic chemical entity with a covalent structure.

In some cases, as with ChemSpider entries, the standard InChI and InChIKey may be identical for the ionic and covalent structures, and the difference lies in the structural formula, the SMILES or the non-standard InChI/InChIKey (visible only to logged-in users after clicking the ‘Wikibox’ button). This results from ChemSpider generating entries for organometallic entities and salts from a non-standard InChI created with the RecMet option of the InChI Software, which adds a metal-reconnected layer (‘/r’). In other cases, entries for the ionic and covalent entities have different standard InChI and InChIKey values.
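The effect of the reconnected layer can be illustrated with the two non-standard InChI strings recorded for cobalt(II) sulfate in the list of items on this page. The following is a minimal Python sketch, not part of any InChI tooling; the helper name `split_reconnected` is hypothetical, and the sketch assumes that the first occurrence of `/r` in the string marks the start of the reconnected layer.

```python
from typing import Optional, Tuple


def split_reconnected(inchi: str) -> Tuple[str, Optional[str]]:
    """Split a non-standard InChI into the part before '/r' and the '/r' layer.

    Assumption: the first '/r' in the string starts the reconnected layer.
    Returns (disconnected_part, reconnected_layer or None).
    """
    head, separator, tail = inchi.partition("/r")
    return (head, tail) if separator else (head, None)


# Non-standard InChI strings taken from the list of items on this page
# (cobalt(II) sulfate, ChemSpider IDs 23338 and 28822501).
IONIC = "InChI=1/Co.H2O4S/c;1-5(2,3)4/h;(H2,1,2,3,4)/q+2;/p-2"
COVALENT = (
    "InChI=1/Co.H2O4S/c;1-5(2,3)4/h;(H2,1,2,3,4)/q+2;/p-2"
    "/rCoO4S/c2-6(3)4-1-5-6"
)

ionic_head, ionic_layer = split_reconnected(IONIC)
covalent_head, covalent_layer = split_reconnected(COVALENT)

# Both records share the same disconnected part; only the covalent
# record carries an additional reconnected layer.
print(ionic_head == covalent_head)
print(ionic_layer, covalent_layer)
```

In this sketch the two records differ only in the presence of the reconnected layer, which is consistent with both entries sharing one standard InChI.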

This page contains examples of solutions that will work in most cases. In unclear situations, individual cases can be discussed on the WikiProject Chemistry talk page. For evaluation purposes, all cases of the above-mentioned issues with ionic entities should be listed below.

Applicable qualifiers

To counter this issue, a set of qualifiers and criteria is used:

criterion used (P1013)
should be added to every external identifier value when an item has more than one valid value.
Possible values:
  • entry in a database describing the character of a chemical entity as ionic (Q121136291)
  • entry in a database describing the character of a chemical entity as covalent (Q121136454)
reason for preferred rank (P7452)
should be added to external identifier and structure-related property values that represent the chemical entity more appropriately, as ionic.
Possible value:
  • proper structure of a molecular entity with a predominantly ionic character (Q118215656)
reason for deprecated rank (P2241)
should be added to structure-related properties, but not to external identifiers, for values that show a covalent structure for a predominantly ionic compound.
Possible values:
  • covalent structure for a chemical entity with a predominantly ionic character (Q107345922)
  • incorrect structure of molecular entity (Q52679949), which may occasionally be useful for both structure-related properties and external identifiers when the structure generated in the database, and hence the statements imported from it to Wikidata, is obviously incorrect. This concerns errors such as a wrong charge or stoichiometry, not the nature of the bonds.
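The rules above can be sketched as a small decision function. This is only an illustrative Python sketch for items with a predominantly ionic character: the function `suggest_annotation` and its arguments are hypothetical, it follows the cobalt(II) sulfate example on this page (where a correct ionic SMILES simply stays at normal rank), and it does not cover the incorrect structure of molecular entity (Q52679949) case.

```python
from typing import Dict, Tuple

# Item and property IDs as given on this page.
IONIC_ENTRY = "Q121136291"         # database entry describes entity as ionic
COVALENT_ENTRY = "Q121136454"      # database entry describes entity as covalent
PROPER_IONIC = "Q118215656"        # reason for preferred rank
COVALENT_FOR_IONIC = "Q107345922"  # reason for deprecated rank

STRUCTURE_PROPERTIES = {"P233", "P234", "P235"}  # SMILES, InChI, InChIKey
EXTERNAL_IDENTIFIERS = {"P661", "P662"}          # ChemSpider ID, PubChem CID


def suggest_annotation(
    prop: str, character: str, several_values: bool
) -> Tuple[str, Dict[str, str]]:
    """Suggest a rank and qualifiers for one value on a predominantly ionic item.

    `character` is "ionic" or "covalent"; `several_values` says whether the
    item holds more than one valid value for `prop`.
    """
    if character not in ("ionic", "covalent"):
        raise ValueError(f"unknown character: {character!r}")
    rank = "normal"
    qualifiers: Dict[str, str] = {}
    if prop in EXTERNAL_IDENTIFIERS:
        if several_values:
            # criterion used (P1013) goes on every competing value.
            qualifiers["P1013"] = (
                IONIC_ENTRY if character == "ionic" else COVALENT_ENTRY
            )
            if character == "ionic":
                rank = "preferred"
                qualifiers["P7452"] = PROPER_IONIC
    elif prop in STRUCTURE_PROPERTIES:
        if character == "covalent":
            # External identifiers are never deprecated for this reason.
            rank = "deprecated"
            qualifiers["P2241"] = COVALENT_FOR_IONIC
    else:
        raise ValueError(f"property not covered by this sketch: {prop!r}")
    return rank, qualifiers


print(suggest_annotation("P661", "ionic", True))
print(suggest_annotation("P233", "covalent", True))
```

The returned rank and qualifier mapping are suggestions only; unclear cases still belong on the WikiProject Chemistry talk page.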

Examples

cobalt(II) sulfate (Q411214)
The ChemSpider database contains two entries for this salt: CSID:23338 for an entity consisting of a cobalt(2+) ion and a sulfate anion, and CSID:28822501 for an entity with covalent bonds between a cobalt atom and two oxygen atoms. The standard InChI and InChIKey are identical, whereas the structural formula, SMILES and non-standard InChI and InChIKey differ.
ChemSpider ID
Preferred rank 23338
reason for preferred rank proper structure of a molecular entity with a predominantly ionic character
criterion used entry in a database describing the character of a chemical entity as ionic
Normal rank 28822501
criterion used entry in a database describing the character of a chemical entity as covalent
Both identifiers have a specific criterion used (P1013) qualifier added regarding the ionic/covalent character. As the chemical entity has a predominantly ionic character (salt), one value is set to preferred with a proper reason for preferred rank (P7452) added.
canonical SMILES
Normal rank [O-]S(=O)(=O)[O-].[Co+2]
Deprecated rank O=S1(=O)O[Co]O1
reason for deprecated rank covalent structure for a chemical entity with a predominantly ionic character
The first value shows a correct representation of an ionic compound, so there is no need to set its rank to preferred. The second value shows a covalent representation of a predominantly ionic compound, so it is set to deprecated with the appropriate reason stated.


potassium sulfide (Q408920)
For this item there are four entries in PubChem and two entries in ChemSpider. Also, there are many SMILES, InChI and InChIKey values. This example shows a complex situation in which there is not only a problem of different representation of ionic and covalent bonding, but also issues with generating records in chemical databases with improbable formula or charge, which are then imported to Wikidata based on some identifiers or names.
canonical SMILES
Normal rank [S-2].[K+].[K+]
Deprecated rank [SH-].[K+].[K+]
reason for deprecated rank incorrect structure of molecular entity
Deprecated rank S([K])[K]
reason for deprecated rank covalent structure for a chemical entity with a predominantly ionic character
The first value shows a correct representation of an ionic compound, so there is no need to set its rank to preferred. The second value was imported from ChemSpider and shows an incorrect representation of the entity (wrong formula and charge), while the third value shows a covalent representation of a predominantly ionic compound; both are set to deprecated with the appropriate reason stated.
InChI
Normal rank InChI=1S/2K.S/q2*+1;-2
Deprecated rank InChI=1S/2K.H2S/h;;1H2/q2*+1;/p-1
reason for deprecated rank incorrect structure of molecular entity
Deprecated rank InChI=1S/2K.S
reason for deprecated rank covalent structure for a chemical entity with a predominantly ionic character
The first value shows a correct representation of an ionic compound, so there is no need to set its rank to preferred. The second value was imported from ChemSpider and shows an incorrect representation of the entity (wrong formula and charge), while the third value shows a covalent representation of a predominantly ionic compound; both are set to deprecated with the appropriate reason stated.
InChIKey
Normal rank DPLVEEXVKBWGHE-UHFFFAOYSA-N
Deprecated rank FANSKVBLGRZAQA-UHFFFAOYSA-M
reason for deprecated rank incorrect structure of molecular entity
Deprecated rank NDZFQAFUFZHMFU-UHFFFAOYSA-N
reason for deprecated rank covalent structure for a chemical entity with a predominantly ionic character
The first value shows a correct representation of an ionic compound, so there is no need to set its rank to preferred. The second value was imported from ChemSpider and shows an incorrect representation of the entity (wrong formula and charge), while the third value shows a covalent representation of a predominantly ionic compound; both are set to deprecated with the appropriate reason stated.
ChemSpider ID
Normal rank 142491
Deprecated rank 14116
reason for deprecated rank incorrect structure of molecular entity
The first value shows a correct representation of an ionic compound. The second value was imported into this item based on the content of the ChemSpider record, but because that record describes an incorrect structure, it is set to deprecated.
PubChem CID
Preferred rank 162263
reason for preferred rank editorial choice
Normal rank 139047412
Deprecated rank 139047332
reason for deprecated rank deprecated identifier value
Deprecated rank 14800
reason for deprecated rank incorrect structure of molecular entity
The reason why the first two records both exist in PubChem is unknown, and the choice between them is purely editorial (the first one seems to be more complete), hence the stated reason for preferred rank. The third value has been withdrawn from PubChem (non-live status), while the fourth value shows an invalid representation of the chemical entity (wrong formula and charge).
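The "wrong charge" remarks in the examples above can be checked mechanically for the canonical SMILES values listed for cobalt(II) sulfate and potassium sulfide. The following Python sketch sums the formal charges written inside bracket atoms; the helper `net_charge` is hypothetical and deliberately simple (it assumes the charge is the last component inside the brackets and ignores atom classes), so it is a plausibility check rather than a SMILES parser.

```python
import re

# Content of every bracket atom, e.g. "Co+2" from "[Co+2]".
BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")
# Trailing charge such as "+", "-", "+2", "-2" or "++".
CHARGE = re.compile(r"([+-]+)(\d*)$")


def net_charge(smiles: str) -> int:
    """Sum the formal charges of all bracket atoms in a SMILES string."""
    total = 0
    for atom in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(atom)
        if match is None:
            continue  # uncharged bracket atom such as [K] or [Co]
        signs, digits = match.groups()
        sign = 1 if signs[0] == "+" else -1
        magnitude = int(digits) if digits else len(signs)
        total += sign * magnitude
    return total


# Canonical SMILES values taken from the examples on this page.
for smiles in (
    "[O-]S(=O)(=O)[O-].[Co+2]",  # cobalt(II) sulfate, ionic
    "O=S1(=O)O[Co]O1",           # cobalt(II) sulfate, covalent
    "[S-2].[K+].[K+]",           # potassium sulfide, ionic
    "[SH-].[K+].[K+]",           # deprecated: incorrect structure
    "S([K])[K]",                 # potassium sulfide, covalent
):
    print(smiles, net_charge(smiles))
```

Under these assumptions, only the value deprecated as an incorrect structure carries a non-zero net charge; the ionic and covalent representations are both neutral overall and differ only in how the bonding is drawn.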

List of items

No Item Description
1 cobalt(II) sulfate (Q411214)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
  • 23338: InChI=1/Co.H2O4S/c;1-5(2,3)4/h;(H2,1,2,3,4)/q+2;/p-2
  • 28822501: InChI=1/Co.H2O4S/c;1-5(2,3)4/h;(H2,1,2,3,4)/q+2;/p-2/rCoO4S/c2-6(3)4-1-5-6
2 calcium hydroxide (Q182849)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
  • 14094: InChI=1/Ca.2H2O/h;2*1H2/q+2;;/p-2
  • 21170965: InChI=1/Ca.2H2O/h;2*1H2/q+2;;/p-2/rCaH2O2/c2-1-3/h2-3H
3 iron(II) nitrate (Q10337052)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
  • 7991154: InChI=1/Fe.2NO3/c;2*2-1(3)4/q+2;2*-1
  • 32867051: InChI=1/Fe.2NO3/c;2*2-1(3)4/q+2;2*-1/rFeN2O6/c4-2(5)8-1-9-3(6)7
4 copper(I) oxide (Q407709)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
5 potassium sulfide (Q408920)
canonical SMILES (P233), InChI (P234), InChIKey (P235) and PubChem CID (P662) affected as a result of an import from PubChem (the entry in question is now set as non-live due to an invalid structure). The remaining issue is the presence of two valid PubChem CID (P662) values, both representing an ionic structure.
6 aluminum citrate (Q18213342)
PubChem CID (P662) affected, with two IDs added for ionic and covalent representation.
7 caesium auride (Q368965)
canonical SMILES (P233) affected, with two notations added for ionic and covalent representation.
8 ferric sulfate (Q409021)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
  • 23211: InChI=1/2Fe.3H2O4S/c;;3*1-5(2,3)4/h;;3*(H2,1,2,3,4)/q2*+3;;;/p-6
  • 21493902: InChI=1/2Fe.3H2O4S/c;;3*1-5(2,3)4/h;;3*(H2,1,2,3,4)/q2*+3;;;/p-6/rFe2O12S3/c3-15(4)9-1(10-15)13-17(7,8)14-2-11-16(5,6)12-2
9 cesium hydroxide (Q296363)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
  • 56494: InChI=1/Cs.H2O/h;1H2/q+1;/p-1
  • 21247404: InChI=1/Cs.H2O/h;1H2/q+1;/p-1/rCsHO/c1-2/h2H
10 lithium hydroxide (Q407613)
ChemSpider ID (P661) affected, with two IDs added for ionic and covalent representation:
  • 3802: InChI=1/Li.H2O/h;1H2/q+1;/p-1
  • 21170131: InChI=1/Li.H2O/h;1H2/q+1;/p-1/rHLiO/c1-2/h2H
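Further candidates for this list could be looked up with a SPARQL query on the Wikidata Query Service. The Python sketch below only builds the query text and sends nothing over the network; the function `build_query` is hypothetical, and the query assumes that affected statements already carry a criterion used (P1013) qualifier with one of the two values described under the applicable qualifiers.

```python
def build_query(property_id: str) -> str:
    """Build a SPARQL query listing statements of `property_id` that carry
    a criterion used (P1013) qualifier marking them as ionic or covalent."""
    if not (property_id.startswith("P") and property_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata property ID: {property_id!r}")
    return f"""
SELECT ?item ?value ?criterion WHERE {{
  ?item p:{property_id} ?statement .
  ?statement ps:{property_id} ?value ;
             pq:P1013 ?criterion .
  VALUES ?criterion {{ wd:Q121136291 wd:Q121136454 }}
}}
ORDER BY ?item
""".strip()


# Example: query text for ChemSpider ID (P661).
print(build_query("P661"))
```

The printed text can be pasted into the query service; items without the qualifier, such as those where only structure-related properties are affected, would not be found this way.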