Wikidata:WikiProject Chemistry/Natural products
Scope and related projects
Scope
This WikiProject covers items that are a type of chemical entity (Q113145171) or a group of stereoisomers (Q59199015) and that carry a found in taxon (P703) statement pointing to a taxon (Q16521), together with related information, in particular the bibliographic reference (Q10358455). Such chemical entities are often called natural products (natural product (Q901227)).
Related projects
History
The project was initiated by Adriano Rutz and Pierre-Marie Allard of the University of Geneva (Q503473) and supported by Jonathan Bisson.
The original goal was to build an open database that brings together natural products, their chemical structures, the organisms that produce them, and a bibliographic reference documenting each such compound. To this end, we compiled and standardized taxonomic, chemical, and bibliographic data from existing resources.
Wikidata, with its WikiProject Chemistry, WikiProject Taxonomy, and WikiProject Source MetaData, fits the purpose of this database: to be available to everyone and linked to other resources.
Publications
The works published in connection with the project are listed below:
- Adriano Rutz; Maria Sorokina; Jakub Galgonek; et al. (1 March 2021). "The LOTUS Initiative for Open Natural Products Research: Knowledge Management through Wikidata". bioRxiv. bioRxiv 10.1101/2021.02.28.433265. doi:10.1101/2021.02.28.433265. S2CID 235262250. Wikidata Q105742243.
- Adriano Rutz; Maria Sorokina; Jakub Galgonek; et al. (26 May 2022). "The LOTUS initiative for open knowledge management in natural products research". eLife. 11. doi:10.7554/ELIFE.70780. ISSN 2050-084X. PMC 9135406. PMID 35616633. S2CID 249064853. Wikidata Q112143478.
Participants
People
The participants listed below can be notified using the following template in discussions: {{Ping project|Chemistry Natural products}}
Bots
The bot (written in Kotlin) can take our file, process it, and add its contents to the test Wikidata instance:
See some example entries:
[1]: Example of a compound (linked to a species and with a reference)
[2]: Example of a species
Because we have no SPARQL endpoint for this Wikidata instance, we cannot easily check there whether an entity already exists. The bot does, however, support SPARQL queries to resolve entities and avoid duplicates.
It works well and is quite fast despite the API rate limit.
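The duplicate check mentioned above can be sketched as follows. This is a minimal Python illustration, not the bot's actual Kotlin code: the function name `build_inchikey_lookup` and the choice to resolve compounds by InChIKey (P235) are assumptions made for the example.

```python
import re

# A standard InChIKey has three hyphen-separated blocks of 14, 10 and 1 capital letters.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def build_inchikey_lookup(inchikey: str) -> str:
    """Return a SPARQL query that looks for an existing item with this InChIKey.

    Hypothetical helper: a bot could run the query before creating an item
    and reuse the returned item instead of adding a duplicate.
    """
    if not INCHIKEY_RE.match(inchikey):
        # Rejecting malformed keys also keeps quotes out of the query string.
        raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")
    return (
        "SELECT ?item WHERE {\n"
        f'  ?item wdt:P235 "{inchikey}" .\n'
        "}\n"
        "LIMIT 1"
    )


if __name__ == "__main__":
    print(build_inchikey_lookup("VFLDPWHFBUODDF-FCXRPNKRSA-N"))
```

An empty result would mean that no item with that InChIKey exists yet on the queried instance.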
Repositories
The organization that currently groups the repositories for the project is: https://github.com/lotusnprod
Structure of the original data
organism_name | organism_db | organism_id | structure_inchikey | structure_inchi | structure_smiles | reference_title | reference_doi |
---|---|---|---|---|---|---|---|
Curcuma longa | NCBI | 136217 | VFLDPWHFBUODDF-FCXRPNKRSA-N | InChI=1S/C21H20O6/c1-26-20-11-14(5-9-18(20)24)3-7-16(22)13-17(23)8-4-15-6-10-19(25)21(12-15)27-2/h3-12,24-25H,13H2,1-2H3/b7-3+,8-4+ | COc1cc(/C=C/C(=O)CC(=O)/C=C/c2ccc(O)c(OC)c2)ccc1O | Characterization of powdered turmeric by liquid chromatography-mass spectrometry and gas chromatography-mass spectrometry | 10.1016/0021-9673(96)00103-3 |
We added about 900,000 entries that look like the one shown above.
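As a rough illustration of this record layout, the sketch below reads one pipe-separated row into a typed record. The class name `SourceEntry` and the function `parse_row` are hypothetical, and the parsing assumes that no cell contains a `|` character.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceEntry:
    """One structure-organism-reference record, named after the table columns."""
    organism_name: str
    organism_db: str
    organism_id: str
    structure_inchikey: str
    structure_inchi: str
    structure_smiles: str
    reference_title: str
    reference_doi: str


def parse_row(line: str) -> SourceEntry:
    """Split a pipe-separated row into its eight cells (assumes no '|' inside a cell)."""
    cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
    if len(cells) != 8:
        raise ValueError(f"expected 8 cells, got {len(cells)}")
    return SourceEntry(*cells)


if __name__ == "__main__":
    row = (
        "Curcuma longa | NCBI | 136217 | VFLDPWHFBUODDF-FCXRPNKRSA-N | "
        "InChI=1S/C21H20O6/c1-26-20-11-14(5-9-18(20)24)3-7-16(22)13-17(23)"
        "8-4-15-6-10-19(25)21(12-15)27-2/h3-12,24-25H,13H2,1-2H3/b7-3+,8-4+ | "
        "COc1cc(/C=C/C(=O)CC(=O)/C=C/c2ccc(O)c(OC)c2)ccc1O | "
        "Characterization of powdered turmeric by liquid chromatography-mass "
        "spectrometry and gas chromatography-mass spectrometry | "
        "10.1016/0021-9673(96)00103-3 |"
    )
    print(parse_row(row))
```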
User:SCIdude created a test entry as a demonstration: [3].
It shows that the property also requires the inverse statement.
Queries
Content
What was already there?
On 2021-05-15, this query returned about 50 natural product of taxon (P1582) statements and more than 1,200 found in taxon (P703) statements, the latter being mainly human metabolites (we exclude crude drugs, oils, etc.). More than 100 statements of all kinds had no reference. Given the volume of data we have since added, the query no longer runs to completion.
The following query uses these:
- Properties: instance of (P31) , natural product of taxon (P1582) , stated in (P248)
SELECT ?item ?itemLabel ?taxonLabel ?artLabel WHERE {
  VALUES ?classes {
    wd:Q113145171 # type of a chemical entity
    wd:Q59199015 # group of stereoisomers
  }
  ?item wdt:P31 ?classes. # instance of
  {
    ?item p:P1582 ?stmt. # natural product of taxon
    ?stmt ps:P1582 ?taxon. # natural product of taxon
    OPTIONAL {
      ?stmt prov:wasDerivedFrom ?ref.
      ?ref pr:P248 ?art. # stated in
    }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 10000
Which referenced structure-organism pairs are available on Wikidata? (example limited to 1 million)
The query now returns about 700,000 referenced structure-organism pairs.
The following query uses these:
- Properties: InChIKey (P235) , found in taxon (P703) , stated in (P248)
#title: Which are the available referenced structure-organism pairs on Wikidata? (limited to 1mio)
SELECT DISTINCT ?structure ?taxon ?reference WHERE {
  ?structure p:P235 [];
    p:P703 [
      ps:P703 ?taxon;
      (prov:wasDerivedFrom/pr:P248) ?reference;
    ] .
  hint:Prior hint:rangeSafe true.
}
LIMIT 1000000
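To use such a query outside the web interface, one possible approach is sketched below in Python. It only builds the request URL and flattens a response in the standard SPARQL JSON results layout, and it sends nothing over the network. The function names are hypothetical, and the sample response is made up for illustration (placeholder QIDs, not real results).

```python
import json
from typing import Dict, List
from urllib.parse import urlencode

# Public endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def flatten_bindings(response_text: str) -> List[Dict[str, str]]:
    """Turn SPARQL JSON results into one plain {variable: value} dict per row."""
    data = json.loads(response_text)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


if __name__ == "__main__":
    print(build_request_url("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"))
    # Illustrative response only; the QIDs are placeholders.
    sample = json.dumps({
        "head": {"vars": ["structure", "taxon"]},
        "results": {"bindings": [{
            "structure": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"},
            "taxon": {"type": "uri", "value": "http://www.wikidata.org/entity/Q2"},
        }]},
    })
    print(flatten_bindings(sample))
```

For a result set approaching the one-million limit, paging or a dump-based workflow may be more practical than a single request.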
Which compounds are found in mouse-ear cress Arabidopsis thaliana (Q158695) or its child taxa?
The following query uses these:
- Properties: parent taxon (P171) , found in taxon (P703) , InChI (P234)
#title: What are the compounds found in Mouse-ear cress Arabidopsis thaliana (Q158695) or children taxa?
SELECT DISTINCT ?structure ?structureLabel ?structure_inchi WHERE {
  VALUES ?taxon {
    wd:Q158695 # You can remove the Qxxxxxx and hit Ctrl+space, type the first letters and it should autocomplete
  }
  ?children (wdt:P171*) ?taxon. # Include children taxa
  ?structure wdt:P703 ?children; # Get the taxon of the structure
    wdt:P234 ?structure_inchi. # Get the InChI
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10000
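The property path `wdt:P171*` in this query follows parent taxon (P171) links zero or more times, so it matches the taxon itself and every taxon below it. The Python sketch below mimics that behaviour on a small in-memory map. The taxon identifiers are invented placeholders, and `self_and_descendants` is a hypothetical helper, not part of any Wikidata tooling.

```python
from typing import Dict, Set


def self_and_descendants(root: str, parent_of: Dict[str, str]) -> Set[str]:
    """Return root plus every taxon whose chain of parents leads to root.

    Mirrors the zero-or-more semantics of `?children wdt:P171* ?taxon`.
    """
    result = {root}
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for child, parent in parent_of.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


if __name__ == "__main__":
    # Toy hierarchy with placeholder identifiers (child -> parent).
    parent_of = {
        "species_a": "genus_x",
        "subspecies_a1": "species_a",
        "species_b": "genus_x",
        "species_c": "genus_y",
    }
    print(sorted(self_and_descendants("species_a", parent_of)))
```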
Which organisms are known to contain beta-sitosterol (Q121802)?
The following query uses these:
- Items: beta-sitosterol (Q121802)
- Properties: found in taxon (P703) , taxon name (P225)
#title: Which organisms are known to contain Beta-Sitosterol (Q121802)?
SELECT DISTINCT ?taxon ?taxon_name WHERE {
  VALUES ?compound {
    wd:Q121802 # You can remove the Qxxxxxx and hit Ctrl+space, type the first letters and it should autocomplete
  }
  ?compound wdt:P703 ?taxon. # Found in taxon
  ?taxon wdt:P225 ?taxon_name. # Get scientific name of the taxon
}
LIMIT 10000
Which organisms are known to contain stereoisomers of beta-sitosterol (Q121802)?
The following query uses these:
- Properties: InChIKey (P235) , found in taxon (P703) , taxon name (P225)
#title: Which organisms are known to contain stereoisomers of Beta-Sitosterol (Q121802)?
SELECT ?taxon_name ?compound ?InChIKey
WITH {
  SELECT ?queryKey ?srsearch ?filter WHERE {
    VALUES ?queryKey {
      "KZJWDPNRJALLNS-VJSFXXLFSA-N" # beta-sitosterol
    }
    BIND (CONCAT(substr($queryKey,1,14), " haswbstatement:P235") AS ?srsearch)
    BIND (CONCAT("^", substr($queryKey,1,14)) AS ?filter)
  }
} AS %comps
WITH {
  SELECT ?compound ?InChIKey WHERE {
    INCLUDE %comps
    SERVICE wikibase:mwapi {
      bd:serviceParam wikibase:endpoint "www.wikidata.org";
        wikibase:api "Search";
        mwapi:srsearch ?srsearch;
        mwapi:srlimit "max".
      ?compound wikibase:apiOutputItem mwapi:title.
    }
    ?compound wdt:P235 ?InChIKey .
    FILTER (REGEX(STR(?InChIKey), ?filter))
    FILTER (?InChIKey != ?queryKey)
  }
} AS %compounds
WHERE {
  INCLUDE %compounds
  ?compound (wdt:P703/wdt:P225) ?taxon_name.
}
LIMIT 10000
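The query keeps compounds whose InChIKey starts with the same 14 characters as the query key but is not identical to it. In a standard InChIKey that first block is generally understood to hash the connectivity, so keys that share it differ only in layers such as stereochemistry. The same test can be sketched in Python as below. The function names are hypothetical, and the second key in the demo is an illustrative key with the same first block, assumed here to stand for a stereo-less form.

```python
def connectivity_block(inchikey: str) -> str:
    """Return the first 14 characters of an InChIKey (cf. substr(?queryKey, 1, 14))."""
    return inchikey[:14]


def is_stereoisomer_candidate(query_key: str, other_key: str) -> bool:
    """True if other_key shares the first block with query_key but is a different key.

    Combines the two FILTER conditions of the query: prefix match and inequality.
    """
    return (
        connectivity_block(other_key) == connectivity_block(query_key)
        and other_key != query_key
    )


if __name__ == "__main__":
    beta_sitosterol = "KZJWDPNRJALLNS-VJSFXXLFSA-N"
    # Illustrative key sharing the first block (assumption, see lead-in).
    same_skeleton = "KZJWDPNRJALLNS-UHFFFAOYSA-N"
    curcumin = "VFLDPWHFBUODDF-FCXRPNKRSA-N"
    print(is_stereoisomer_candidate(beta_sitosterol, same_skeleton))
    print(is_stereoisomer_candidate(beta_sitosterol, curcumin))
```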
Which compounds matching the given chemical formula(s) are found in which organism(s)?
The following query uses these:
- Properties: chemical formula (P274) , canonical SMILES (P233) , found in taxon (P703) , taxon name (P225)
#title: Which taxa contain structures corresponding to the following chemical formula?
SELECT DISTINCT ?structure ?smiles_canonical ?formula ?taxon ?taxon_name WHERE {
  VALUES ?formula {
    "C₉H₁₁Cl₂FN₂O₂S₂" # Use subscript digits ₁₂₃₄₅₆₇₈₉₀
    "C₂₂H₂₂O₉"
  }
  ?structure wdt:P274 ?formula;
    wdt:P233 ?smiles_canonical;
    wdt:P703 ?taxon.
  ?taxon wdt:P225 ?taxon_name.
}
LIMIT 10000
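Because the formula values in this query are written with Unicode subscript digits, a formula typed with ordinary digits has to be converted before it can match. A minimal Python sketch follows; the helper name `to_subscript_formula` is hypothetical, and it assumes simple formulas without charges or isotope labels.

```python
# Map ASCII digits to the Unicode subscript digits U+2080..U+2089.
_SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_subscript_formula(formula: str) -> str:
    """Rewrite every ASCII digit of a formula as its subscript form."""
    return formula.translate(_SUBSCRIPTS)


if __name__ == "__main__":
    print(to_subscript_formula("C22H22O9"))         # C₂₂H₂₂O₉
    print(to_subscript_formula("C9H11Cl2FN2O2S2"))  # C₉H₁₁Cl₂FN₂O₂S₂
```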
Which compounds matching a given mass ± 10 ppm are found in which organism(s)?
The following query uses these:
- Properties: mass (P2067) , found in taxon (P703) , taxon name (P225) , InChI (P234) , chemical formula (P274)
#title: Which compounds corresponding to a given mass ± 10ppm are found in which organism(s)?
SELECT DISTINCT ?compound ?mf ?inchi (GROUP_CONCAT(?taxon_name; SEPARATOR = ", ") AS ?organism)
WITH {
  SELECT ?compound WHERE {
    VALUES ?query { "524.1765"^^xsd:decimal }
    VALUES ?ppm { "10"^^xsd:decimal }
    ?compound wdt:P2067 ?mass.
    FILTER((?mass > (?query - ((?ppm * "0.000001"^^xsd:decimal) * ?query))) && (?mass < (?query + ((?ppm * "0.000001"^^xsd:decimal) * ?query))))
  }
} AS %compounds
WHERE {
  INCLUDE %compounds
  ?compound (wdt:P703/wdt:P225) ?taxon_name;
    wdt:P234 ?inchi;
    wdt:P274 ?mf.
}
GROUP BY ?compound ?mf ?inchi
LIMIT 10000
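The FILTER in this query keeps masses strictly between query - ppm × 10⁻⁶ × query and query + ppm × 10⁻⁶ × query. The Python sketch below works through the same arithmetic with `decimal.Decimal`, by analogy with the `xsd:decimal` literals. The function names are hypothetical.

```python
from decimal import Decimal
from typing import Tuple


def ppm_window(query_mass: Decimal, ppm: Decimal) -> Tuple[Decimal, Decimal]:
    """Return the (lower, upper) bounds of a mass window of +/- ppm parts per million."""
    tolerance = ppm * Decimal("0.000001") * query_mass
    return query_mass - tolerance, query_mass + tolerance


def within_window(mass: Decimal, query_mass: Decimal, ppm: Decimal) -> bool:
    """Strict comparison on both sides, as in the query's FILTER."""
    lower, upper = ppm_window(query_mass, ppm)
    return lower < mass < upper


if __name__ == "__main__":
    lower, upper = ppm_window(Decimal("524.1765"), Decimal("10"))
    print(lower, upper)
```

For the values used in the query (524.1765 and 10 ppm), the tolerance is 0.005241765, so the window runs from 524.171258235 to 524.181741765.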
Which pigments are found in which taxa, according to which reference?
The following query uses these:
- Items: pigment (Q161179)
- Properties: instance of (P31) , subclass of (P279) , DOI (P356) , taxon name (P225) , found in taxon (P703) , stated in (P248)
#title: Which pigments are found in which taxa, according to which reference?
# special thanks goes to User:Lmichan for updating this information!
SELECT DISTINCT ?compound ?compoundLabel ?taxon ?taxonname ?DOI
WITH {
  SELECT ?compound WHERE {
    ?compound (wdt:P31*/wdt:P279*) wd:Q161179. # get pigments
  }
} AS %compounds
WITH {
  SELECT ?compound ?P703statement WHERE {
    INCLUDE %compounds
    ?compound p:P703 ?P703statement. # check for "found in taxon" statements
  }
} AS %P703statement
WITH {
  SELECT ?compound ?taxon ?DOI WHERE {
    INCLUDE %P703statement
    ?P703statement ps:P703 ?taxon ; # get the respective taxa
      prov:wasDerivedFrom / pr:P248 [ # get the reference supporting that statement
        wdt:P356 ?DOI # get the DOI for the reference
      ] .
  }
} AS %taxa
WHERE {
  {
    INCLUDE %taxa
    ?taxon wdt:P225 ?taxonname . # get the taxon name
  }
  ?compound rdfs:label ?compoundLabel . # get compound labels
  FILTER (LANG(?compoundLabel) = "en") . # filter for English
}
ORDER BY ASC(?compoundLabel)
LIMIT 10000
What are examples of organisms for which a compound was found in an organism sharing the same parent taxon, but not in the organism itself?
The following query uses these:
- Properties: InChIKey (P235) , found in taxon (P703) , parent taxon (P171) , taxon name (P225)
#title: What are examples of organisms where compounds were found in an organism sharing the same parent taxon, but not the organism itself?
SELECT DISTINCT ?compound ?compoundLabel ?taxonname_with_compound ?taxonname_without_compound ?parent_taxon
WITH {
  SELECT DISTINCT ?compound ?taxon_with_compound ?parent_taxon WHERE {
    ?compound wdt:P235 ?inchikey.
    SERVICE bd:sample {
      ?compound wdt:P703 ?taxon_with_compound.
      bd:serviceParam bd:sample.limit 1000
    }
    ?taxon_with_compound wdt:P171 ?parent_taxon.
  }
} AS %taxon_with_compound
WITH {
  SELECT DISTINCT ?taxon_without_compound ?parent_taxon ?compound WHERE {
    INCLUDE %taxon_with_compound
    ?taxon_without_compound wdt:P171 ?parent_taxon.
    FILTER (?taxon_with_compound != ?taxon_without_compound)
  }
} AS %taxon2
WHERE {
  INCLUDE %taxon_with_compound
  INCLUDE %taxon2
  FILTER NOT EXISTS {?compound wdt:P703 ?taxon_without_compound.}
  ?taxon_with_compound wdt:P225 ?taxonname_with_compound.
  ?taxon_without_compound wdt:P225 ?taxonname_without_compound.
  ?compound rdfs:label ?compoundLabel.
  FILTER(LANG(?compoundLabel) = "en").
}
LIMIT 10000
Which Zephyranthes (Q191364) spp. lack compounds known from at least two species in the genus?
The following query uses these:
- Properties: found in taxon (P703) , parent taxon (P171) , taxon name (P225) , instance of (P31) , subclass of (P279)
#title: Which Zephyranthes (Q191364) spp. lack compounds known from at least two species in the genus?
PREFIX target: <http://www.wikidata.org/entity/Q191364> # Zephyranthes
SELECT DISTINCT ?compound ?compoundLabel ?taxon_with_compound ?another_taxon_with_compound ?taxon_without_compound
WITH {
  SELECT DISTINCT ?compound ?taxon_YES_1 ?taxon_YES_2 WHERE {
    ?compound wdt:P703 ?taxon_YES_1 .
    ?compound wdt:P703 ?taxon_YES_2 .
    ?taxon_YES_1 wdt:P171 target: .
    ?taxon_YES_2 wdt:P171 target: .
    FILTER (?taxon_YES_2 != ?taxon_YES_1)
  }
} AS %taxa_with_compound
WITH {
  SELECT DISTINCT ?taxon_NO ?compound WHERE {
    INCLUDE %taxa_with_compound
    ?taxon_NO wdt:P171 target: .
    FILTER (?taxon_YES_1 != ?taxon_NO)
  }
} AS %taxon_without_compond
WHERE {
  INCLUDE %taxa_with_compound
  INCLUDE %taxon_without_compond
  FILTER NOT EXISTS { ?compound wdt:P703 ?taxon_NO .}
  VALUES ?classes {
    wd:Q113145171
    wd:Q59199015
  }
  ?taxon_YES_1 wdt:P225 ?taxon_with_compound .
  ?taxon_YES_2 wdt:P225 ?another_taxon_with_compound .
  ?taxon_NO wdt:P225 ?taxon_without_compound .
  ?compound (wdt:P31*/wdt:P279*) ?classes .
  ?compound rdfs:label ?compoundLabel.
  FILTER(LANG(?compoundLabel) = "en").
}
LIMIT 10000
How many compounds are structurally similar to compounds labeled as antibiotics? The results are grouped by the parent taxon of the organism in which they were found.
The following query uses these:
- Properties: subclass of (P279) , subject has role (P2868) , MeSH descriptor ID (P486) , canonical SMILES (P233) , found in taxon (P703) , parent taxon (P171) , taxon name (P225)
#title: How many compounds are structurally similar to compounds labeled as antibiotics? Results are grouped by the parent taxon of the organism they were found in.
PREFIX sachem: <http://bioinfo.uochb.cas.cz/rdf/v1.0/sachem#> # prefixes needed for structural similarity search
PREFIX idsm: <https://idsm.elixir-czech.cz/sparql/endpoint/>
SELECT ?parent_taxon ?parent_taxon_name (COUNT(DISTINCT ?compound) AS ?count) WHERE {
  SERVICE idsm:wikidata {
    VALUES ?CUTOFF { "0.9"^^xsd:double }
    SERVICE <https://query.wikidata.org/bigdata/namespace/wdq/sparql> {
      VALUES ?MESH { "D000900" }
      ?antibiotic ((wdt:P279*)/wdt:P2868/wdt:P486) ?MESH;
        wdt:P233 ?smiles.
    }
    ?compound sachem:similarCompoundSearch _:b40.
    _:b40 sachem:query ?smiles;
      sachem:cutoff ?CUTOFF.
  }
  hint:Prior hint:runFirst "true"^^xsd:boolean.
  ?compound wdt:P703/wdt:P171 ?parent_taxon.
  ?parent_taxon wdt:P225 ?parent_taxon_name.
}
GROUP BY ?parent_taxon ?parent_taxon_name
ORDER BY DESC (?count)
Which organisms contain compounds with an indolic scaffold? Count the occurrences, then group and order the results by parent taxon.
The following query uses these:
- Properties: parent taxon (P171) , taxon name (P225) , found in taxon (P703)
#title: Which organisms contain at least 100 indolic scaffolds? Results ordered by parent taxon.
PREFIX sachem: <http://bioinfo.uochb.cas.cz/rdf/v1.0/sachem#> # prefixes needed for structural similarity search
PREFIX idsm: <https://idsm.elixir-czech.cz/sparql/endpoint/>
SELECT ?parent_taxon ?parent_taxon_name (COUNT(DISTINCT ?compound) AS ?count) WHERE {
  SERVICE idsm:wikidata {
    VALUES ?SUBSTRUCTURE {
      "C12=C(C=CC=C2)C=CN1" # indolic scaffold
    }
    ?compound sachem:substructureSearch [
      sachem:query ?SUBSTRUCTURE
    ].
  }
  hint:Prior hint:runFirst "true"^^xsd:boolean.
  ?compound p:P703 ?statement.
  ?statement ps:P703/wdt:P171 ?parent_taxon.
  ?parent_taxon wdt:P225 ?parent_taxon_name.
}
GROUP BY ?parent_taxon ?parent_taxon_name
HAVING (?count > 100)
ORDER BY DESC (?count)
Which compounds with known bioactivities were isolated from Actinomycetes (Q62606918) between 2014 and 2019, with related organisms and references?
The following query uses these:
- Items: Actinomycetes (Q62606918)
- Properties: parent taxon (P171) , taxon name (P225) , InChI (P234) , subject has role (P2868) , MeSH descriptor ID (P486) , title (P1476) , publication date (P577) , found in taxon (P703) , stated in (P248)
#title: Which compounds with known bioactivities were isolated from Actinomycetes (Q62606918), between 2014 and 2019, with related organisms and references?
SELECT ?organism ?organism_name ?compound ?compound_inchi (GROUP_CONCAT(DISTINCT ?meshLabel; SEPARATOR = "|") AS ?bioactivities) ?isolation_reference ?reference_title WHERE {
  ?organism (wdt:P171*) wd:Q62606918;
    wdt:P225 ?organism_name.
  ?compound wdt:P234 ?compound_inchi;
    p:P703 ?statement;
    (wdt:P2868/wdt:P486) ?meshId.
  ?mesh wdt:P486 ?meshId;
    rdfs:label ?meshLabel.
  FILTER(LANGMATCHES(LANG(?meshLabel), "EN"))
  ?statement ps:P703 ?organism;
    prov:wasDerivedFrom ?ref.
  ?ref pr:P248 ?isolation_reference.
  ?isolation_reference wdt:P1476 ?reference_title;
    wdt:P577 ?reference_date.
  FILTER(((YEAR(?reference_date)) >= 2014 ) && ((YEAR(?reference_date)) <= 2019 ))
}
GROUP BY ?organism ?organism_name ?compound ?compound_inchi ?isolation_reference ?reference_title
LIMIT 100000
Which compounds labelled as terpenoids (Q426694) were found in Aspergillus (Q335130) spp. between 2010 and 2020, with related references?
The following query uses these:
- Properties: InChIKey (P235) , InChI (P234) , instance of (P31) , subclass of (P279) , parent taxon (P171) , title (P1476) , publication date (P577) , found in taxon (P703) , stated in (P248)
#title: Which compounds labelled as terpenoid (Q426694) were found in Aspergillus (Q335130) spp., between 2010 and 2020, with related references?
SELECT ?compound ?compound_inchi (GROUP_CONCAT(DISTINCT ?isolation_reference; SEPARATOR = "|") AS ?isolation_references) (GROUP_CONCAT(DISTINCT ?reference_title; SEPARATOR = "|") AS ?references_titles) WHERE {
  VALUES ?taxon { wd:Q335130 }
  VALUES ?chemical_class { wd:Q426694 }
  ?compound wdt:P235 ?compound_id;
    wdt:P234 ?compound_inchi;
    ((wdt:P31|wdt:P279)/(wdt:P279*)) ?compound_class;
    p:P703 ?statement.
  ?statement (ps:P703/(wdt:P171*)) ?taxon;
    (prov:wasDerivedFrom/pr:P248) ?isolation_reference.
  ?isolation_reference wdt:P1476 ?reference_title;
    wdt:P577 ?reference_date.
  FILTER(((YEAR(?reference_date)) >= 2010 ) && ((YEAR(?reference_date)) <= 2020 ))
  FILTER(?compound_class = ?chemical_class)
}
GROUP BY ?compound ?compound_inchi
How many structure-organism pairs have been referenced by particular authors? (Here, two senior natural products chemists are compared with the late Ferdinand Bohlmann.)
The following query uses these:
- Properties: author (P50) , found in taxon (P703) , stated in (P248)
#title: How many structures found in taxon have been referenced by certain authors? (Here, two senior natural products chemists are compared to the late Ferdinand Bohlmann)
#defaultView:BarChart
SELECT ?authors_namesLabel (COUNT(DISTINCT(?compound)) AS ?count) WHERE {
  ?compound p:P703/prov:wasDerivedFrom/pr:P248 ?art. # Get the references
  VALUES ?authors_names {
    wd:Q56084663 # JLW
    wd:Q40259636 # GFP
    wd:Q1405133 # A german chemist of the 20th century ... Ferdinand Bohlmann
  }
  ?art wdt:P50 ?authors_names. # Limit to references containing the author
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?authors_namesLabel
ORDER BY DESC (?count)
Which referenced structure-organism pairs available on Wikidata have a PDB structure ID?
The following query uses these:
- Properties: PDB structure ID (P638) , found in taxon (P703) , InChIKey (P235)
#title: Which are the available structures found in taxon on Wikidata, for which a PDB structure ID is available?
SELECT DISTINCT ?structure (COUNT(DISTINCT ?pdb_id) AS ?count) (GROUP_CONCAT(DISTINCT ?pdb_id; SEPARATOR = ", ") AS ?pdb) WHERE {
  ?structure p:P703 [];
    p:P235 []; # To exclude proteins
    wdt:P638 ?pdb_id.
}
GROUP BY ?structure
ORDER BY DESC (?count)
Which referenced structure-organism pairs available on Wikidata have a CSD Refcode?
The following query uses these:
- Properties: CSD Refcode (P11375) , found in taxon (P703)
#title: Which are the available structures found in taxon on Wikidata, for which a CSD Refcode is available?
SELECT DISTINCT ?structure (COUNT(DISTINCT ?csd_id) AS ?count) (GROUP_CONCAT(DISTINCT ?csd_id; SEPARATOR = ", ") AS ?csd) WHERE {
  ?structure p:P703 [];
    wdt:P11375 ?csd_id.
}
GROUP BY ?structure
ORDER BY DESC (?count)
Which referenced structure-organism pairs available on Wikidata have a CAS Registry Number? (limited to 1 million)
If you use this query, please cite Andrea Jacobs; Dustin Williams; Katherine Hickey; et al. (13 May 2022). "CAS Common Chemistry in 2021: Expanding Access to Trusted Chemical Information for the Scientific Community". Journal of Chemical Information and Modeling. doi:10.1021/ACS.JCIM.2C00268. ISSN 1549-9596. Wikidata Q111987319.
The following query uses these:
- Properties: found in taxon (P703) , CAS Registry Number (P231) , stated in (P248) , retrieved (P813)
#title: Which are the available structure found in taxon on Wikidata, for which a CAS Registry Number is available? (limited to 1mio)
SELECT DISTINCT ?structure ?cas ?date WHERE {
  {
    ?structure p:P703 []; # Found in taxon
      p:P231 ?casStatement . # Get the CAS Registry Number
  }
  {
    ?casStatement ps:P231 ?cas .
    OPTIONAL {
      ?casStatement prov:wasDerivedFrom ?casReference .
      { ?casReference pr:P248 wd:Q18907859 . }
      UNION
      { ?casReference pr:P248 wd:Q911173 . }
      OPTIONAL { ?casReference pr:P813 ?date . }
    }
  }
}
LIMIT 1000000
Which chemical structures found in taxon were reassigned? List of deprecated and current SMILES
The following query uses these:
- Properties: canonical SMILES (P233) , stated in (P248) , reason for deprecated rank (P2241) , reason for preferred rank (P7452)
#title: Which chemical structures found in taxon were reassigned (Q116482192)? List deprecated and actual SMILES
SELECT ?item ?valueDeprecated ?valueNew ?referenceOld ?referenceNew WHERE {
  ?item p:P233 ?st, ?st2.
  ?st ps:P233 ?valueDeprecated;
    wikibase:rank wikibase:DeprecatedRank;
    pq:P2241 wd:Q116482192;
    (prov:wasDerivedFrom/pr:P248) ?referenceOld.
  ?st2 ps:P233 ?valueNew;
    wikibase:rank wikibase:PreferredRank;
    pq:P7452 wd:Q116482192;
    (prov:wasDerivedFrom/pr:P248) ?referenceNew.
  FILTER(?referenceOld != ?referenceNew)
}
LIMIT 100000
Maintenance
Which referenced structure-organism pairs are available on Wikidata? (using P1582 rather than P703, which is the property we use)
This query returned 21 results on 2022-02-24.
The following query uses these:
- Properties: instance of (P31) , natural product of taxon (P1582) , stated in (P248)
#title: Which are the available referenced structure-organism pairs on Wikidata? (with P1582 and not P703 we are using)
SELECT DISTINCT (REPLACE(STR(?item), ".*Q", "Q") AS ?qid) (REPLACE(STR(?taxon), ".*Q", "Q") AS ?P703) (REPLACE(STR(?art), ".*Q", "Q") AS ?S248) WHERE {
  VALUES ?classes {
    wd:Q113145171
    wd:Q59199015
  }
  ?item wdt:P31 ?classes.
  {
    ?item p:P1582 ?stmt.
    ?stmt ps:P1582 ?taxon;
      prov:wasDerivedFrom ?ref.
    ?ref pr:P248 ?art.
  }
}
LIMIT 1000
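The `REPLACE(STR(...), ".*Q", "Q")` calls strip the entity URI down to the bare QID, and the output columns are named `qid`, `P703` and `S248`. Those names look as if they were chosen so that the result can be fed to a batch-editing tool that re-adds each pair as a found in taxon (P703) statement with a stated in (P248) source. That reading is an assumption, as are the function names in this Python sketch, which reproduces the trimming and writes the three columns as CSV. The entity URIs in the demo are placeholders.

```python
import csv
import io
import re
from typing import Iterable, Tuple


def to_qid(uri: str) -> str:
    """Reduce an entity URI to its QID, like REPLACE(STR(?x), ".*Q", "Q")."""
    return re.sub(r".*Q", "Q", uri)


def rows_to_csv(rows: Iterable[Tuple[str, str, str]]) -> str:
    """Write (item, taxon, reference) URI triples as CSV with the query's column names."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["qid", "P703", "S248"])
    for item, taxon, reference in rows:
        writer.writerow([to_qid(item), to_qid(taxon), to_qid(reference)])
    return buffer.getvalue()


if __name__ == "__main__":
    base = "http://www.wikidata.org/entity/"
    # Placeholder QIDs for illustration only.
    print(rows_to_csv([(base + "Q1", base + "Q2", base + "Q3")]))
```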
Which non-referenced structure-organism pairs are available on Wikidata? (limited to 10)
The following query uses these:
- Properties: InChIKey (P235) , found in taxon (P703)
#title: Which are the available non-referenced structure-organism pairs on Wikidata? (limited to 10)
SELECT ?statement WHERE {
  [
    p:P235 [];
    p:P703 ?statement;
  ]
  MINUS { ?statement prov:wasDerivedFrom []. }
}
LIMIT 10
Discussions
Exchange and discussion take place here, for example on the use of appropriate mappings or specific properties: https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Chemistry/Natural_products