There is also one problem regarding the classification of cyclic compounds that has to be addressed. We have classes like tricyclic compound (Q3539074), and sources define such classes in two different ways:
- n-cyclic compound = every compound has exactly n rings in the whole structure, no more, no less (e.g. exactly three for a tricyclic compound)
- n-cyclic compound = every compound has no fewer than n rings, but may have more (e.g. three or more for a tricyclic compound)
Either choice has serious consequences for the entire classification and may make our classification inconsistent with classifications from other sources.
The first option seems more logical and consistent, as every compound is classified according to the number of rings in its structure. However, classes like phenothiazine (Q16023748) or dibenzazepine (Q33416403) then cannot be subclasses of tricyclic compound (Q3539074), only of polycyclic compound (Q426145), as there is no certainty that every compound belonging to phenothiazine (Q16023748) or dibenzazepine (Q33416403) has exactly three rings. The first option is also not consistent with ChEBI, e.g. the pentacyclic LSM-20934 is classified under organic tricyclic compound. On the other hand, choosing the second option leaves us with a weird classification tree: tetracyclic compound (Q7706284) (four or more rings) would have to be a subclass of tricyclic compound (Q3539074) (three or more rings).
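The consequences of both options can be illustrated with a minimal sketch that models each class as the set of ring counts its members may have, so that "subclass of" becomes "subset of". Everything here is an assumption made for illustration: the helper names, the cap of 10 rings, treating polycyclic as "two or more rings", and modelling a scaffold-based class such as phenothiazine as "three or more rings".

```python
MAX_RINGS = 10  # arbitrary cap so the sets stay finite (illustration only)


def exact(n):
    """Option 1: members have exactly n rings."""
    return frozenset({n})


def at_least(n):
    """Option 2: members have n or more rings."""
    return frozenset(range(n, MAX_RINGS + 1))


def is_subclass(child, parent):
    """A class is a subclass if every allowed ring count is also allowed by the parent."""
    return child <= parent


# Assumption: polycyclic means two or more rings.
polycyclic = at_least(2)
# Assumption: a scaffold-based class has a three-ring core, but its members
# may carry additional rings, so the ring count is three or more.
scaffold_class = at_least(3)

# Option 1: tetracyclic is not a subclass of tricyclic, and the scaffold-based
# class can only be placed under polycyclic.
print(is_subclass(exact(4), exact(3)))
print(is_subclass(scaffold_class, exact(3)))
print(is_subclass(scaffold_class, polycyclic))

# Option 2: the scaffold-based class fits under tricyclic, but tetracyclic
# becomes a subclass of tricyclic as well.
print(is_subclass(scaffold_class, at_least(3)))
print(is_subclass(at_least(4), at_least(3)))
```

Under these assumptions the first check returns False and all the others except the second return True, which matches the trade-off described above.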
I have no good solution to this. I'd personally choose the first option, even if it means a lot of inconsistencies between databases and the need for carefully checking that each class and chemical compound is assigned to the appropriate n-cyclic compounds class.