User:ProteinBoxBot/SPARQL Examples

Query Wikidata with SPARQL

SPARQL queries can be submitted through the web form at http://query.wikidata.org/ or, for programmatic access, sent to https://query.wikidata.org/sparql. The latter endpoint makes it possible to combine Wikidata items with external SPARQL endpoints through federated queries, and to use Wikidata from data analysis environments such as R or any other platform with a SPARQL plugin.
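As a minimal sketch of programmatic access, the Python snippet below builds a GET URL for the endpoint and flattens a SPARQL JSON result into plain dictionaries. It assumes the endpoint accepts `query` and `format=json` parameters and answers in the standard SPARQL 1.1 JSON results layout. The helper names `build_query_url` and `rows_from_json` and the sample response are invented for this illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"  # programmatic endpoint named above


def build_query_url(sparql, endpoint=ENDPOINT):
    """Return a GET URL that submits `sparql` and asks for JSON results."""
    return endpoint + "?" + urlencode({"query": sparql, "format": "json"})


def rows_from_json(payload):
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    bindings = json.loads(payload)["results"]["bindings"]
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]


# Hand-written sample response in the SPARQL JSON layout (placeholder values, not real data).
SAMPLE = json.dumps({
    "head": {"vars": ["item", "doid"]},
    "results": {"bindings": [
        {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0"},
         "doid": {"type": "literal", "value": "DOID:0000"}},
    ]},
})

url = build_query_url("SELECT ?item ?doid WHERE { ?item wdt:P699 ?doid . } LIMIT 1")
rows = rows_from_json(SAMPLE)
```

The URL in `url` could be fetched with any HTTP client; the response body would then be passed to `rows_from_json` in place of `SAMPLE`.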

This page collects biomedical example queries to the SPARQL endpoint of Wikidata. For more general example queries, see here.

Drugs & Diseases

Find drugs that treat a disease and show a link for each supporting reference

The following query uses these:

  • Properties: NCI Thesaurus ID (P1748), formatter URL (P1630), drug used for treatment (P2176), ChEMBL ID (P592), NDF-RT ID (P2115)

    #Find drugs that treat a disease and show a link for each supporting reference
    SELECT ?disease ?diseaseLabel ?diseaseDescription ?drug ?drugLabel ?drugDescription ?link
    WHERE {
      ?disease wdt:P1748 'C3243' .                    # multiple sclerosis
      ?disease p:P2176 ?disease_drug .                # statement about drug used for treatment
      ?disease_drug ps:P2176 ?drug .                  # which drug was it in that statement
      ?disease_drug prov:wasDerivedFrom ?reference .  # reference node; may carry a ChEMBL ID (pr:P592) or an NDF-RT ID (pr:P2115)
      OPTIONAL {
        ?reference pr:P592 ?chemblid .
        wd:P592 wdt:P1630 ?url .                      # formatter URL of the ChEMBL ID property
        BIND (replace(?url, "\\$1", ?chemblid) AS ?link)
      }
      OPTIONAL {
        ?reference pr:P2115 ?NDF_RT_ID .
        wd:P2115 wdt:P1630 ?url .                     # formatter URL of the NDF-RT ID property
        BIND (replace(?url, "\\$1", ?NDF_RT_ID) AS ?link)
      }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
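The `BIND (replace(?url, "\\$1", ?chemblid) AS ?link)` step can be pictured as plain string substitution: the formatter URL (P1630) holds a `$1` placeholder that is swapped for the external identifier. A small Python sketch of the same idea follows. The formatter URL used here is a made-up placeholder, not the value stored on Wikidata, and `build_link` is a hypothetical helper name.

```python
def build_link(formatter_url, external_id):
    """Replace the literal "$1" placeholder with the identifier."""
    # str.replace treats "$1" literally, so no escaping is needed here
    # (the SPARQL version escapes "$" because REPLACE takes a regex pattern).
    return formatter_url.replace("$1", external_id)


# Hypothetical formatter URL, for illustration only.
link = build_link("https://example.org/compound/$1", "CHEMBL651")
print(link)
```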
    

Get all the drug-drug interactions for Methadone

The following query uses these:

  • Properties: ChEMBL ID (P592), significant drug interaction (P769)

    #Get all the drug-drug interactions for Methadone
    SELECT ?compound ?chembl ?label WHERE {
      ?item wdt:P592 'CHEMBL651' .   # methadone, looked up by its ChEMBL ID
      ?item wdt:P769 ?compound .     # compounds with a significant drug interaction
      ?compound wdt:P592 ?chembl .   # ChEMBL ID of each interacting compound
      OPTIONAL { ?compound rdfs:label ?label . FILTER (lang(?label) = "en") }
    }
    

Drug Repurposing

Drug interacts with protein encoded by gene with association to disease. Showing Metformin

The following query uses these:

  • Properties: physically interacts with (P129), encoded by (P702), genetic association (P2293)

    #Drug interacts with protein encoded by gene with association to disease. Showing Metformin
    SELECT ?gene ?geneLabel ?disease ?diseaseLabel WHERE {
      wd:Q19484 wdt:P129 ?gene_product .  # drug (metformin) interacts with a gene_product
      ?gene_product wdt:P702 ?gene .      # gene_product is encoded by a gene
      ?gene wdt:P2293 ?disease .          # gene is genetically associated with a disease
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Find drugs for cancers that target genes related to cell proliferation, where a drug physically interacts with the product of a gene known to be genetically associated with a disease

These cases may reveal opportunities to repurpose a drug for a new disease; see [1] and [2]. A recently validated example is a new link between metformin (Q19484) and cancer survival [3]. The query is currently set up to find drugs for cancers that target genes related to cell proliferation. Adapt it by changing the constraints (e.g. to 'heart disease', Q190805) or by removing them.

The following query uses these:

  • Properties: physically interacts with (P129), encodes (P688), genetic association (P2293), subclass of (P279), biological process (P682), part of (P361), drug used for treatment (P2176)

    #Find drugs for cancers that target genes related to cell proliferation, where a drug physically interacts with the product of a gene known to be genetically associated with a disease
    SELECT ?drugLabel ?geneLabel ?biological_processLabel ?diseaseLabel WHERE {
      ?drug wdt:P129 ?gene_product .   # drug interacts with a gene_product
      ?gene wdt:P688 ?gene_product .   # gene (a region of DNA) encodes the gene_product (usually a protein)
      ?disease wdt:P2293 ?gene .       # genetic association between disease and gene
      ?disease wdt:P279* wd:Q12078 .   # limit to cancers (wd:Q12078); the * operator follows 'subclass of' transitively
      ?gene_product wdt:P682 ?biological_process .  # GO biological processes that the gene product is involved in
      # limit to genes related to certain biological processes (and their sub-processes):
      #   apoptosis          wd:Q14599311
      #   cell proliferation wd:Q14818032
      { ?biological_process wdt:P279* wd:Q14818032 }  # follow the 'subclass of' chain
      UNION
      { ?biological_process wdt:P361* wd:Q14818032 }  # follow the 'part of' chain
      # uncomment the next line to find a subset of the known true positives (there are not a lot of them in here yet)
      # ?disease wdt:P2176 ?drug .     # disease is treated by a drug
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
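The `wdt:P279*` path used in the query above matches an item itself and everything reachable by following 'subclass of' zero or more times. A rough Python sketch of that traversal on a toy hierarchy is shown below. The class names and the `SUBCLASS_OF` table are invented for illustration and do not reflect actual Wikidata content.

```python
# Toy "subclass of" edges: child -> list of direct parents (invented data).
SUBCLASS_OF = {
    "lung cancer": ["cancer"],
    "small cell lung cancer": ["lung cancer"],
    "cancer": ["disease"],
    "asthma": ["respiratory disease"],
    "respiratory disease": ["disease"],
}


def ancestors_or_self(item):
    """Return `item` plus every class reachable via zero or more parent hops."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        stack.extend(SUBCLASS_OF.get(current, []))
    return seen


def is_subclass_star(item, target):
    """Mimic `?item wdt:P279* target`: true when target is item or an ancestor."""
    return target in ancestors_or_self(item)


cancers = sorted(c for c in SUBCLASS_OF if is_subclass_star(c, "cancer"))
```

Because the path allows zero hops, "cancer" itself is part of the result, just as `wd:Q12078` itself satisfies `?disease wdt:P279* wd:Q12078`.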
    

Variants

Which variant of which gene predicts a positive prognosis in colorectal cancer

The following query uses these:

  • Properties: positive prognostic predictor (P3358), biological variant of (P3433)

    #Which variant of which gene predicts a positive prognosis in colorectal cancer
    SELECT ?geneLabel ?variantLabel WHERE {
      VALUES ?disease { wd:Q188874 }  # colorectal cancer
      ?variant wdt:P3358 ?disease ;   # P3358 positive prognostic predictor
               wdt:P3433 ?gene .      # P3433 biological variant of
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

This list is periodically updated by a bot. Manual changes to the list will be removed on the next update!

Query: SELECT ?item WHERE { ?item wdt:P3358 wd:Q188874 ; wdt:P3433 ?gene . }

label               | description     | biological variant of
CDX2 EXPRESSION     | genetic variant | CDX2
MIR218-1 EXPRESSION | genetic variant | MIR218-1
DCC EXPRESSION      | genetic variant | DCC
POLE P286R          | genetic variant | POLE
POLE V411L          | genetic variant | POLE
POLE S459F          | genetic variant | POLE
BRAF Non-V600       | genetic variant | BRAF

Microbial Queries

Request all taxa that have at least one gene and are child of the genus Chlamydia

The following query uses these:

  • Properties: instance of (P31), found in taxon (P703), parent taxon (P171)

    #Request all taxa that have at least one gene and are child of the genus Chlamydia
    SELECT DISTINCT ?taxa ?taxaLabel WHERE {
      ?gene wdt:P31 wd:Q7187 .       # instance of gene
      ?gene wdt:P703 ?taxa .         # found in taxon
      ?taxa wdt:P171* wd:Q846309 .   # taxon descends from the genus Chlamydia via parent taxon
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Request all operons and their genes for Listeria monocytogenes EGD-e

This query is meant to show the structured version of data published in "The Listeria transcriptional landscape from saprophytism to virulence" (Q28131414), specifically the table starting on page 17 of the supplementary file.

The following query uses these:

  • Properties: NCBI Taxonomy ID (P685), found in taxon (P703), strand orientation (P2548), instance of (P31), has part (P527), NCBI Locus tag (P2393), Entrez Gene ID (P351)

    #Request all operons and their genes for Listeria monocytogenes EGD-e
    SELECT ?operon ?operonLabel (if(?label = "Forward Strand"@en, '+', '-') as ?strand)
     (group_concat(distinct ?locustag; separator=" ") as ?locustagG) WHERE {
      ?strain wdt:P685 "169963" .   # NCBI Taxonomy ID for Listeria monocytogenes EGD-e
      ?operon wdt:P703 ?strain ;    # get operon that is found in that genome
              wdt:P2548 ?strand ;   # get strand orientation
              wdt:P31 wd:Q139677 ;  # instance of operon
              wdt:P527 ?gene .      # has part gene (gets all genes in operon)
      ?gene wdt:P2393 ?locustag ;   # get NCBI locus tag for genes in operon
            wdt:P351 ?entrez .      # get NCBI Entrez Gene ID for genes in operon
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
      ?strand rdfs:label ?label . FILTER (lang(?label) = 'en')
    } GROUP BY ?operonLabel ?strandLabel ?label ?operon
    ORDER BY ?operonLabel
    

Request all operons, their regulators, and their products

The following query uses these:

  • Properties: subclass of (P279), has part (P527), product or material produced (P1056), regulates (molecular biology) (P128), Gene Ontology ID (P686), found in taxon (P703)

    #Request all operons, their regulators, and their products
    SELECT ?taxa_name ?regulator_name ?operon_name ?go_name ?product_name
    WHERE {
      ?operon wdt:P279 wd:Q139677 ;       # subclass of operon
              rdfs:label ?operon_name ;
              wdt:P527 ?gene ;            # has part gene
              wdt:P1056 ?protein .        # product of the operon
      ?regulator wdt:P128 ?operon ;       # regulator of the operon
                 rdfs:label ?regulator_name .
      ?protein ?function_type ?go_term ;  # any property linking the protein to a GO term
               wdt:P1056 ?product .
      ?go_term wdt:P686 ?go_id ;          # restrict to items with a Gene Ontology ID
               rdfs:label ?go_name .
      ?product rdfs:label ?product_name .
      ?gene wdt:P703 ?taxa .              # found in taxon
      ?taxa rdfs:label ?taxa_name .
      FILTER (LANG(?taxa_name) = "en") .
      FILTER (LANG(?regulator_name) = "en") .
      FILTER (LANG(?operon_name) = "en") .
      FILTER (LANG(?go_name) = "en") .
      FILTER (LANG(?product_name) = "en") .
    }
    

Request all organisms that are located in the female urogenital tract and that have a gene with product indole

The following query uses these:

  • Properties: habitat (P2974), found in taxon (P703), product or material produced (P1056)

    #Request all organisms that are located in the female urogenital tract and that have a gene with product indole
    SELECT ?organism_name ?organism_item WHERE {
      ?organism_item wdt:P2974 wd:Q5880 ;   # habitat (P2974)
                     rdfs:label ?organism_name .
      ?gene wdt:P703 ?organism_item ;       # gene found in that organism
            wdt:P1056 wd:Q319541 .          # product or material produced (P1056)
      FILTER (LANG(?organism_name) = "en") .
    }
    

Return gene counts for each bacterial genome in Wikidata

The following query uses these:

  • Properties: instance of (P31), found in taxon (P703), parent taxon (P171)

    #Return gene counts for each bacterial genome in Wikidata
    SELECT ?species ?label (COUNT(DISTINCT ?gene) AS ?gene_counts) WHERE {
      ?gene wdt:P31 wd:Q7187 .         # instance of gene
      ?gene wdt:P703 ?species .        # found in taxon
      ?species wdt:P171* wd:Q10876 .   # follow parent taxon transitively up to the bacterial root
      ?species rdfs:label ?label . FILTER (lang(?label) = "en")
    } GROUP BY ?species ?label ORDER BY DESC(?gene_counts)
    

Return protein counts for each bacterial genome in Wikidata

The following query uses these:

  • Properties: instance of (P31), found in taxon (P703), parent taxon (P171)

    #Return protein counts for each bacterial genome in Wikidata
    SELECT ?species ?label (COUNT(DISTINCT ?protein) AS ?protein_counts) WHERE {
      ?protein wdt:P31 wd:Q8054 ;      # instance of protein
               wdt:P703 ?species .     # found in taxon
      ?species wdt:P171* wd:Q10876 .   # follow parent taxon transitively up to the bacterial root
      ?species rdfs:label ?label . FILTER (lang(?label) = "en")
    } GROUP BY ?species ?label ORDER BY DESC(?protein_counts)
    

Queries for microRNAs and extracellular RNAs

Get all mature miRNAs in Wikidata

    #Get all mature miRNAs in Wikidata
    SELECT ?mirna ?mirnaLabel WHERE {
      ?mirna wdt:P31 wd:Q23838648 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }

Retrieve all mature miRNAs and their targets (names and NCBI Entrez Gene ID)

The following query uses these:

  • Properties: instance of (P31), regulates (molecular biology) (P128), Entrez Gene ID (P351)

    #Retrieve all mature miRNAs and their targets (names and NCBI Entrez Gene ID)
    SELECT DISTINCT ?mirna ?mirnaLabel ?gene ?geneLabel ?entrez WHERE {
      ?mirna wdt:P31 wd:Q23838648 ;   # instance of mature miRNA
             wdt:P128 ?gene .         # miRNA regulates a gene
      ?gene wdt:P351 ?entrez .        # Entrez Gene ID of the target gene
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Retrieve all genes and mature miRNAs which are involved in the 'regulation of immune response' (GO:0050776)

The following query uses these:

  • Properties: biological process (P682), Gene Ontology ID (P686), encoded by (P702), regulates (molecular biology) (P128)

    #Retrieve all genes and mature miRNAs which are involved in the 'regulation of immune response' (GO:0050776)
    SELECT DISTINCT ?gene ?geneLabel ?mir ?mirLabel WHERE {
      ?protein wdt:P682 [wdt:P686 'GO:0050776'] .  # regulation of immune response
      ?protein wdt:P702 ?gene .                    # protein is encoded by a gene
      ?mir wdt:P128 ?gene .                        # miRNA regulates that gene
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Retrieve all genes that are involved in "regulation of immune response" AND that are regulated by an miRNA expressed in blood plasma or saliva

The following query uses these:

  • Properties: biological process (P682), Gene Ontology ID (P686), encoded by (P702), regulates (molecular biology) (P128), part of (P361)

    #Retrieve all genes that are involved in "regulation of immune response" AND that are regulated by an miRNA expressed in blood plasma or saliva
    SELECT DISTINCT ?gene ?geneLabel ?mir ?mirLabel ?fluidLabel WHERE {
      ?protein wdt:P682 [wdt:P686 'GO:0050776'] .  # regulation of immune response
      ?protein wdt:P702 ?gene .                    # protein is encoded by a gene
      ?mir wdt:P128 ?gene .                        # miRNA regulates that gene
      ?mir wdt:P361 ?fluid .                       # miRNA is part of a body fluid
      VALUES ?fluid { wd:Q79749 wd:Q155925 }       # restrict to blood plasma or saliva
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

For these genes involved in regulation of immune response, are there drugs which modulate the immune response by targeting one of these genes?

Biological processes impacted by hsa-miR-211-5p

The following query uses these:

  • Properties: regulates (molecular biology) (P128), encodes (P688), biological process (P682)

    #Biological processes impacted by hsa-miR-211-5p
    SELECT DISTINCT ?bioProcess ?bioProcessLabel WHERE {
      ?mirna rdfs:label 'hsa-miR-211-5p'@en .  # look up the miRNA by its English label
      ?mirna wdt:P128 ?gene .                  # miRNA regulates a gene
      ?gene wdt:P688 ?protein .                # gene encodes a protein
      ?protein wdt:P682 ?bioProcess .          # protein is involved in a biological process
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Retrieve all human genes which are involved in any immune response (as annotated by GO) AND are regulated by hsa-miR-211-5p

The following query uses these:

  • Properties: subclass of (P279), Gene Ontology ID (P686), encoded by (P702), regulates (molecular biology) (P128)

    #Retrieve all human genes which are involved in any immune response (as annotated by GO) AND are regulated by hsa-miR-211-5p
    SELECT DISTINCT ?geneLabel ?goLabel ?mirLabel WHERE {
      ?go wdt:P279* [wdt:P686 'GO:0006955'] .  # immune response and its subclasses
      ?p ?pr ?go .                             # any item linked to such a GO term
      ?p wdt:P702 ?gene .                      # item is encoded by a gene
      ?mir wdt:P128 ?gene .                    # miRNA regulates that gene
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    GROUP BY ?goLabel ?mirLabel ?geneLabel
    HAVING(?mirLabel = 'hsa-miR-211-5p'@en)
    

Same query as the previous, but also retrieving small molecules interacting with these gene products

The following query uses these:

  • Properties: subclass of (P279), Gene Ontology ID (P686), encoded by (P702), physically interacts with (P129), regulates (molecular biology) (P128)

    #Same query as the previous, but also retrieving small molecules interacting with these gene products
    SELECT DISTINCT ?gene ?geneLabel ?mir ?mirLabel ?drug ?drugLabel WHERE {
      ?x wdt:P279* [wdt:P686 'GO:0006955'] .  # immune response and its subclasses
      ?p ?pr ?x .                             # any item linked to such a GO term
      ?p wdt:P702 ?gene .                     # item is encoded by a gene
      OPTIONAL { ?p wdt:P129 ?drug . }        # small molecule physically interacting with the gene product
      ?mir wdt:P128 ?gene .                   # miRNA regulates that gene
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    GROUP BY ?gene ?mir ?mirLabel ?geneLabel ?drug ?drugLabel
    HAVING(?mirLabel = 'hsa-miR-211-5p'@en)
    

BOSC 2017 examples

Get genes with GWAS association with asthma

Demonstrates basic retrieval of GWAS Catalog data. Result (as of 2017-07-22): 39 genes.

The following query uses these:

  • Properties: genetic association (P2293), instance of (P31)

    #Get genes with GWAS association with asthma
    SELECT DISTINCT ?gene ?geneLabel WHERE {
      ?gene wdt:P2293 wd:Q35869 .  # gene has genetic association to "asthma"
      ?gene wdt:P31 wd:Q7187 .     # gene is an instance of "gene"
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

... and whose gene product is localized to membrane

Demonstrates basic integration with Gene Ontology. Result (as of 2017-07-22): 22 genes.

The following query uses these:

  • Properties: genetic association (P2293), instance of (P31), encodes (P688), cell component (P681), subclass of (P279), part of (P361)

    #... and whose gene product is localized to membrane
    SELECT DISTINCT ?gene ?geneLabel WHERE {
      ?gene wdt:P2293 wd:Q35869 .  # gene has genetic association to "asthma"

      ?gene wdt:P31 wd:Q7187 .     # gene is an instance of "gene"

      ?gene wdt:P688 ?protein .                # gene encodes a protein
      ?protein wdt:P681 ?cc .                  # protein has a cellular component
      ?cc wdt:P279*|wdt:P361* wd:Q14349455 .   # cell component is 'part of' or 'subclass of' membrane

      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

... where the GO localization is based on a non-IEA evidence code

Demonstrates computing on provenance. Result (as of 2017-07-22): 15 genes.

The following query uses these:

  • Properties: genetic association (P2293), instance of (P31), encodes (P688), subclass of (P279), part of (P361), cell component (P681), determination method (P459)

    #... where the GO localization is based on a non-IEA evidence code
    SELECT DISTINCT ?gene ?geneLabel WHERE {
      ?gene wdt:P2293 wd:Q35869 .  # gene has genetic association to "asthma"

      ?gene wdt:P31 wd:Q7187 .     # gene is an instance of "gene"

      ?gene wdt:P688 ?protein .                        # gene encodes a protein
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane

      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
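What `FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}` does can be sketched in Python: a statement is kept unless one of its 'determination method' qualifiers is IEA. The statement records below are invented, the identifier `Q0000001` is a placeholder, and the dictionary layout is only an assumption made for this illustration.

```python
IEA = "Q23190881"  # IEA evidence code item used in the query above

# Invented cell-component statements, each with a list of determination-method qualifiers.
statements = [
    {"protein": "protein A", "component": "membrane", "methods": ["Q23190881"]},
    {"protein": "protein B", "component": "membrane", "methods": ["Q0000001"]},
    {"protein": "protein C", "component": "membrane", "methods": []},
]


def non_iea(stmts):
    """Keep statements that carry no IEA determination-method qualifier."""
    return [s for s in stmts if IEA not in s["methods"]]


kept = [s["protein"] for s in non_iea(statements)]
print(kept)
```

Note that a statement with no determination method at all (protein C here) also passes, which mirrors how `FILTER NOT EXISTS` only rejects statements where the IEA qualifier is actually present.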
    

... with GWAS association with any respiratory disease

Demonstrates leveraging ontology structure. Result (as of 2017-07-22): 31 genes over 8 diseases.

The following query uses these:

  • Properties: genetic association (P2293), subclass of (P279), instance of (P31), encodes (P688), part of (P361), cell component (P681), determination method (P459)

    #... with GWAS association with any respiratory disease
    SELECT ?diseaseGALabel (COUNT(DISTINCT ?gene) AS ?gene_counts)
    (GROUP_CONCAT(DISTINCT ?geneLabel; separator=", ") AS ?geneList) WHERE {
      ?gene wdt:P2293 ?diseaseGA .        # gene has genetic association
      ?diseaseGA wdt:P279* wd:Q3286546 .  # to a type of respiratory system disease

      ?gene wdt:P31 wd:Q7187 ; wdt:P688 ?protein ;    # gene is an instance of "gene" and encodes protein
            rdfs:label ?geneLabel .
      FILTER (lang(?geneLabel) = "en")
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane

      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    GROUP BY ?diseaseGALabel ORDER BY DESC(?gene_counts)
    

... show associated chemical exposures

Demonstrates opportunistic integration with independent community contribution. Result (as of 2017-07-22): 4 diseases / 6 chemical hazards.

The following query uses these:

  • Properties: genetic association (P2293), subclass of (P279), instance of (P31), encodes (P688), part of (P361), has effect (P1542), cell component (P681), determination method (P459)

    #... show associated chemical exposures
    SELECT DISTINCT ?diseaseGA ?diseaseGALabel ?exposure ?exposureLabel WHERE {
      ?gene wdt:P2293 ?diseaseGA .        # gene has genetic association
      ?diseaseGA wdt:P279* wd:Q3286546 .  # to a type of respiratory system disease

      ?gene wdt:P31 wd:Q7187 .     # gene is an instance of "gene"

      ?gene wdt:P688 ?protein .                        # gene encodes a protein
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane

      ?exposure wdt:P1542 ?diseaseGA .  # something causes disease
      ?exposure wdt:P279 wd:Q21167512 . # and that something is a chemical hazard

      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

... and show associated pathways

Demonstrates opportunistic integration with other community contributions. Result (as of 2017-07-22): 59 gene-pathway combos.

The following query uses these:

  • Properties: genetic association (P2293), subclass of (P279), instance of (P31), encodes (P688), part of (P361), has part (P527), cell component (P681), determination method (P459)

    #... and show associated pathways
    SELECT DISTINCT ?gene ?geneLabel ?pathwayLabel WHERE {
      ?gene wdt:P2293 ?diseaseGA .        # gene has genetic association
      ?diseaseGA wdt:P279* wd:Q3286546 .  # to a type of respiratory system disease

      ?gene wdt:P31 wd:Q7187 .     # gene is an instance of "gene"

      ?gene wdt:P688 ?protein .                        # gene encodes a protein
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane

      ?pathway wdt:P31 wd:Q4915012 ;                   # instance of a biological pathway
               wdt:P527 ?gene .                        # pathway has the gene as a part

      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

NCATS Translator Queries

What compensatory mutations in FA core genes confer resistance to chemotherapeutic drugs (e.g. cisplatin)?

See here for more information.

The following query uses these:

  • Properties: HGNC gene symbol (P353), biological variant of (P3433), PubChem CID (P662), subclass of (P279), negative therapeutic predictor (P3355), reference URL (P854), medical condition treated (P2175)

    #What compensatory mutations in FA core genes confer resistance to chemotherapeutic drugs (e.g. cisplatin)?
    SELECT ?geneLabel ?variantLabel ?variant ?drugLabel ?cid ?diseaseLabel ?ref WHERE {
      VALUES ?hgnc {"FANCA" "FANCB" "FANCC" "FANCE" "FANCF" "FANCG" "FANCL" "FANCM" "FANCD2" "FANCI" "UBE2T" "BRCA2" "BRIP1" "PALB2" "RAD51C" "SLX4" "ERCC4" "RAD51" "BRCA1" "MAD2L2" "XRCC2" "RFWD3"}
      ?gene wdt:P353 ?hgnc .                  # get gene items with these HGNC symbols
      ?variant wdt:P3433 ?gene .              # variant of that gene
      ?variant p:P3355 ?s .                   # negative therapeutic predictor statement
      ?s ps:P3355 ?drug .                     # drug named in that statement
      ?s prov:wasDerivedFrom/pr:P854 ?ref .   # reference URL supporting the statement
      ?drug wdt:P662 ?cid .                   # PubChem CID of the drug
      ?s pq:P2175 ?disease .                  # qualifier: medical condition treated
      ?disease wdt:P279* wd:Q12078 .          # limit to cancers
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Get GO Annotations for FA genes

The following query uses these:

  • Properties: HGNC gene symbol (P353), encodes (P688), molecular function (P680), cell component (P681), biological process (P682), Gene Ontology ID (P686)

    #Get GO Annotations for FA genes
    SELECT ?hgnc ?protein ?go ?goLabel ?goId WHERE {
      VALUES ?hgnc {"FANCA" "FANCB" "FANCC" "FANCE" "FANCF" "FANCG" "FANCL" "FANCM" "FANCD2" "FANCI" "UBE2T" "BRCA2" "BRIP1" "PALB2" "RAD51C" "SLX4" "ERCC4" "RAD51" "BRCA1" "MAD2L2" "XRCC2" "RFWD3"}
      ?gene wdt:P353 ?hgnc .                     # get gene items with these HGNC symbols
      ?gene wdt:P688 ?protein .                  # get the protein
      ?protein wdt:P680|wdt:P681|wdt:P682 ?go .  # get GO terms
      ?go wdt:P686 ?goId .                       # get the Gene Ontology ID
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Retrieve orthologs of FA-core genes

Used for part of the competency question here.

The following query uses these:

  • Properties: HGNC gene symbol (P353), ortholog (P684), found in taxon (P703)

    #Retrieve orthologs of FA-core genes
    SELECT ?hgnc ?gene ?geneLabel ?ortho ?orthoLabel ?taxonLabel WHERE
    {
      VALUES ?hgnc {"FANCA" "FANCB" "FANCC" "FANCE" "FANCF" "FANCG" "FANCL" "FANCM" "FANCD2" "FANCI" "UBE2T" "BRCA2" "BRIP1" "PALB2" "RAD51C" "SLX4" "ERCC4" "RAD51" "BRCA1" "MAD2L2" "XRCC2" "RFWD3"}
      ?gene wdt:P353 ?hgnc .    # get gene items with these HGNC symbols
      ?gene wdt:P684 ?ortho .   # get their orthologs
      ?ortho wdt:P703 ?taxon .  # taxon in which each ortholog is found
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Other Example Queries

Retrieve an item by its external ID

Get the item that has the Disease Ontology ID "DOID:8577"

The following query uses these:

  • Properties: Disease Ontology ID (P699)

    #Retrieve an item by its external ID
    SELECT ?item ?itemLabel WHERE {
      ?item wdt:P699 "DOID:8577" .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Retrieve all items with an external ID

Get all items with a Disease Ontology ID

The following query uses these:

  • Properties: Disease Ontology ID (P699)

    #Get all items with a Disease Ontology ID
    SELECT ?item ?doid WHERE {
      ?item wdt:P699 ?doid .
    }
    

Example demonstrating GROUP BY and COUNT

Count the number of genes in each taxon by NCBI Tax ID

The following query uses these:

  • Properties: NCBI Taxonomy ID (P685), found in taxon (P703), Entrez Gene ID (P351)
    #Count the number of genes in each taxon by NCBI Tax ID

    SELECT (COUNT(?gene) as ?count) ?taxon ?taxonLabel ?taxids WHERE {
      values ?taxids {"559292" "6239" "7227" "7955" "10090" "10116" "9606"}
      ?taxon wdt:P685 ?taxids .
      ?gene wdt:P703 ?taxon .
      ?gene wdt:P351 ?en
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    } GROUP BY ?taxon ?taxonLabel ?taxids
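For readers less familiar with aggregation, the effect of GROUP BY with COUNT can be imitated on a handful of rows. This is only an illustrative Python sketch: the gene names in `rows` are made-up placeholders, and `count_by_taxon` is a hypothetical helper. Only the taxon IDs come from the query above.

```python
from collections import Counter

# Hypothetical (gene, NCBI taxon ID) rows standing in for the ungrouped
# matches of the WHERE clause; the gene names are placeholders.
rows = [
    ("geneA", "9606"),
    ("geneB", "9606"),
    ("geneC", "10090"),
]


def count_by_taxon(rows):
    """Count rows per taxon, like GROUP BY ?taxids with COUNT(?gene)."""
    return Counter(taxon for _gene, taxon in rows)


counts = count_by_taxon(rows)
for taxon, count in counts.items():
    print(taxon, count)
```

Each distinct taxon ID becomes one output row, and the count is the number of gene rows that fell into that group.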
    

Example demonstrating how to retrieve qualifiers and references for statements

Get Gene Ontology subcellular localization information, with evidence codes, and references for Reelin

The following query uses these:

  • Properties: UniProt protein ID (P352), cell component (P681), stated in (P248), retrieved (P813), reference URL (P854), determination method (P459)
    #Get Gene Ontology subcellular localization information, with evidence codes, and references for Reelin
    SELECT distinct ?proteinLabel ?value ?valueLabel ?determination ?determinationLabel ?reference_stated_inLabel ?reference_retrieved ?reference_URL WHERE {
      ?protein wdt:P352 "P78509" . # get a protein by uniprot id
      ?protein p:P681 ?statement . # get the cell component statements
      ?statement ps:P681 ?value .  # get the value associated with the statement
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      # change ?determination to wd:Q23175558 for ISS (Inferred from Sequence or structural Similarity)
      # or e.g. wd:Q23190881 for IEA (Inferred from Electronic Annotation)
      #add reference links
      ?statement prov:wasDerivedFrom/pr:P248 ?reference_stated_in . #where stated
      ?statement prov:wasDerivedFrom/pr:P813 ?reference_retrieved . #when retrieved
      ?statement prov:wasDerivedFrom/pr:P854 ?reference_URL
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    } ORDER BY ?value
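The query above goes from the item to a statement node (p:), then to the statement value (ps:), its qualifiers (pq:), and its reference properties (prov:wasDerivedFrom followed by pr:). The Python sketch below expands these prefixed names to full IRIs. The helper `expand` is hypothetical; the namespace IRIs are the standard Wikidata RDF namespaces.

```python
# Standard Wikidata RDF namespaces used by the statement-level queries.
WIKIDATA_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
    "pq": "http://www.wikidata.org/prop/qualifier/",
    "pr": "http://www.wikidata.org/prop/reference/",
    "prov": "http://www.w3.org/ns/prov#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'ps:P681' to its full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in WIKIDATA_PREFIXES or not local:
        raise ValueError(f"cannot expand {prefixed_name!r}")
    return WIKIDATA_PREFIXES[prefix] + local


# The hops used in the query above, in order.
for name in ("p:P681", "ps:P681", "pq:P459", "prov:wasDerivedFrom", "pr:P248"):
    print(name, "->", expand(name))
```

Seeing the full IRIs also explains the `FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement"))` idiom used elsewhere on this page: it selects predicates in the ps: namespace.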
    

Example demonstrating how to retrieve determination method qualifiers

Get all genetic association claims linking gene to disease, along with the determination method (GWAS, TAS, etc).

The following query uses these:

  • Properties: genetic association (P2293), determination method (P459)
    #Get all genetic association claims linking gene to disease, along with the determination method (GWAS, TAS, etc).
    SELECT distinct ?geneLabel ?gene ?diseaseLabel ?disease ?determinationLabel WHERE {
      ?gene p:P2293 ?statement . # all gene disease genetic associations
      ?statement ps:P2293 ?disease .  # get the value associated with the statement
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Example demonstrating how to retrieve references for statements that cite journal articles

Get Gene Ontology subcellular localization information and references for proteins

The following query uses these:

  • Properties: UniProt protein ID (P352), instance of (P31), PubMed ID (P698), PMCID (P932), cell component (P681), stated in (P248)
    #Get Gene Ontology subcellular localization information and references for proteins

    SELECT ?proteinLabel ?uniprot ?valueLabel ?paperLabel ?PMID ?PMCID WHERE {
      ?protein wdt:P352 ?uniprot .
      ?protein p:P681 ?statement .
      ?statement ps:P681 ?value .
      ?statement prov:wasDerivedFrom/pr:P248 ?paper .
      ?paper wdt:P31 wd:Q13442814 .
      OPTIONAL { ?paper wdt:P698 ?PMID . }
      OPTIONAL { ?paper wdt:P932 ?PMCID . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } limit 10
    

Example demonstrating how to retrieve information about references for statements

Get all Gene Ontology information and references for proteins, but only show statements where the reference is a journal article published in Nature, Cell, or Science

The following query uses these:

  • Properties: UniProt protein ID (P352), instance of (P31), PubMed ID (P698), published in (P1433), molecular function (P680), cell component (P681), biological process (P682), stated in (P248)
    #Get all Gene Ontology information and references for proteins, but only show statements where the reference is a journal article published in Nature, Cell, or Science
    SELECT ?proteinLabel ?uniprot ?valueLabel ?goTypeLabel ?paperLabel ?PMID WHERE {
      ?protein wdt:P352 ?uniprot .
      ?protein p:P680|p:P681|p:P682 ?statement .
      ?statement ps:P680|ps:P681|ps:P682 ?value .
      ?value wdt:P31 ?goType .
      ?statement prov:wasDerivedFrom/pr:P248 ?paper .
      ?paper wdt:P698 ?PMID .
      ?paper wdt:P1433 ?journal .
      values ?journal {wd:Q192864 wd:Q180445 wd:Q655814}
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } limit 1000
    

Example demonstrating how to filter out unreferenced claims

Get all "drug used for treatment of disease" claims with a column that indicates whether a reference exists for the claim.

The following query uses these:

  • Properties: instance of (P31), drug used for treatment (P2176), stated in (P248)
    SELECT DISTINCT ?disease ?diseaseLabel ?drug ?drugLabel ?hasRef ?stated_inLabel WHERE {
      ?disease wdt:P31 wd:Q12136 ;  # find items that are an instance of disease
            p:P2176 ?id .  # get "drug used for treatment" statements
      ?id ?b ?drug .  # get the object used in these statements
      FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement" ))
      # FILTER NOT EXISTS { ?id prov:wasDerivedFrom ?provenance }  # uncomment to keep only statements with no references
      # ?id prov:wasDerivedFrom ?provenance  # uncomment to keep only statements with a reference
      BIND(EXISTS {?id prov:wasDerivedFrom ?provenance } as ?hasRef) # tag statements with whether or not a ref exists
      OPTIONAL {?id prov:wasDerivedFrom ?prov .
                ?prov pr:P248 ?stated_in }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
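Because the query tags each row with ?hasRef instead of dropping rows, the split into referenced and unreferenced claims can also be done after download. The Python sketch below works on bindings in the SPARQL JSON results format. The `sample` data uses placeholder example.org IRIs and is purely hypothetical, as are the helper names.

```python
XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"


def has_reference(binding: dict) -> bool:
    """Read the ?hasRef boolean literal from one result binding."""
    cell = binding.get("hasRef")
    return cell is not None and cell["value"] in ("true", "1")


def split_by_reference(bindings: list) -> tuple:
    """Return (referenced, unreferenced) lists of bindings."""
    referenced, unreferenced = [], []
    for binding in bindings:
        (referenced if has_reference(binding) else unreferenced).append(binding)
    return referenced, unreferenced


# Hypothetical bindings with placeholder IRIs, shaped like SPARQL JSON results.
sample = [
    {
        "drug": {"type": "uri", "value": "http://example.org/drugA"},
        "hasRef": {"type": "literal", "datatype": XSD_BOOLEAN, "value": "true"},
    },
    {
        "drug": {"type": "uri", "value": "http://example.org/drugB"},
        "hasRef": {"type": "literal", "datatype": XSD_BOOLEAN, "value": "false"},
    },
]

referenced, unreferenced = split_by_reference(sample)
print(len(referenced), "referenced;", len(unreferenced), "unreferenced")
```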
    

Example demonstrating how to filter out contradictory claims

Get "positive therapeutic predictor" claims and tag each one with whether it is "disputed by" anything. Uncomment the FILTER NOT EXISTS line to drop disputed claims instead.

The following query uses these:

  • Properties: CIViC variant ID (P3329), positive therapeutic predictor (P3354), statement disputed by (P1310)
    SELECT DISTINCT ?item ?itemLabel ?civic ?value ?valueLabel ?disputed_by WHERE {
      ?item wdt:P3329 ?civic ;  # find items that have a civic id
            p:P3354 ?id .  # get "positive therapeutic predictor" statements
      ?id ?b ?value .  # get the object used in these statements
      FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement" ))
      # FILTER NOT EXISTS {?id pq:P1310 ?disputed_by } # filter out statements that have a disputing qualifier
      BIND(EXISTS {?id pq:P1310 [] } as ?disputed_by)
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Miscellaneous Queries

Get mapping of Wikipedia articles to Wikidata items to Entrez Gene IDs

The following query uses these:

  • Properties: Entrez Gene ID (P351), found in taxon (P703)
    SELECT ?entrez_id ?cid ?article ?label WHERE {
        ?cid wdt:P351 ?entrez_id .
        ?cid wdt:P703 wd:Q15978631 .
        OPTIONAL { ?cid rdfs:label ?label filter (lang(?label) = "en") . }
        ?article schema:about ?cid .
        ?article schema:inLanguage "en" .
        FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/") .
        FILTER (SUBSTR(str(?article), 1, 38) != "https://en.wikipedia.org/wiki/Template")
    } limit 10
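The two SUBSTR filters are prefix checks, and the lengths 25 and 38 are the character counts of the two URL prefixes. The Python sketch below mirrors that logic so the numbers can be verified. `sparql_substr` and `is_article` are hypothetical helpers, and the slice-based emulation is only meant for positive integer arguments.

```python
WIKIPEDIA_PREFIX = "https://en.wikipedia.org/"
TEMPLATE_PREFIX = "https://en.wikipedia.org/wiki/Template"


def sparql_substr(text: str, start: int, length: int) -> str:
    """Mimic SPARQL SUBSTR, which counts characters starting from 1."""
    return text[start - 1 : start - 1 + length]


def is_article(url: str) -> bool:
    """Apply the same two prefix tests as the FILTER lines above."""
    on_enwiki = sparql_substr(url, 1, len(WIKIPEDIA_PREFIX)) == WIKIPEDIA_PREFIX
    is_template = sparql_substr(url, 1, len(TEMPLATE_PREFIX)) == TEMPLATE_PREFIX
    return on_enwiki and not is_template


print(len(WIKIPEDIA_PREFIX), len(TEMPLATE_PREFIX))
print(is_article("https://en.wikipedia.org/wiki/Reelin"))
```

If the site or namespace in the filter is changed, the two length arguments must be updated to match the new prefix strings.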
    

Count of number of GO annotations on yeast grouped by curator

The following query uses these:

Retrieve reverse/"what links here" statements (statements where the item is the object)

The following query uses these:

  • Items: night blindness (Q7758678), color blindness (Q133696)
    SELECT ?item ?itemLabel ?property ?propertyLabel ?value ?valueLabel ?id
    WHERE {
      values ?value {wd:Q7758678 wd:Q133696}
      ?item ?propertyclaim ?id .
      ?property wikibase:propertyType wikibase:WikibaseItem .
      ?property wikibase:claim ?propertyclaim .
      ?id ?b ?value .
      FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement" ))
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Get all proteins with GO annotations that are a subclass of "signaling receptor activity", where the determination method is a type of manual assertion, and the evidence is a scientific article that was published since 2014

The following query uses these:

  • Properties: Gene Ontology ID (P686), subclass of (P279), found in taxon (P703), UniProt protein ID (P352), molecular function (P680), instance of (P31), publication date (P577), PubMed ID (P698), stated in (P248), curator (P1640), determination method (P459)
    SELECT distinct ?uniprot ?determinationLabel ?curatorLabel ?reference_stated_inLabel ?pmid ?publication_date ?go_id ?sig_rec_goLabel WHERE  {

      ?sig_rec_go wdt:P686 ?go_id . # get GO IDs
      ?sig_rec_go wdt:P279* wd:Q21109843 . # that are subclasses of "signaling receptor activity"

      ?protein wdt:P703 wd:Q15978631 . # get items that are "found in taxon" human
      ?protein wdt:P352 ?uniprot . # and have a uniprot ID

      ?protein wdt:P680 ?sig_rec_go . # proteins where the MF is a signaling receptor activity subclass
      ?protein p:P680 ?statement . # get statements
      ?statement ps:P680 ?sig_rec_go . # keep only the statements whose value is that GO term

      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      ?determination wdt:P31 wd:Q28955254 . # filter where the determination method is a "manual assertion"

      ?statement prov:wasDerivedFrom/pr:P248 ?reference_stated_in . # get the "stated in" reference
      ?reference_stated_in wdt:P31 wd:Q13442814 . # stated in a "scientific article"
      ?reference_stated_in wdt:P577 ?publication_date . # get the publication date
      ?reference_stated_in wdt:P698 ?pmid . # get the pubmed id
      FILTER (?publication_date >= "2014-01-01T00:00:00Z"^^xsd:dateTime) . # filter where publication date is on or after 2014-01-01

      ?statement prov:wasDerivedFrom/pr:P1640 ?curator . # get the curator

      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }

    } limit 50
    

Get labels and description for Wikidata items

Get the label, synonyms, and description for a selected list of diseases, given as Disease Ontology IDs (DOIDs)

The following query uses these:

  • Properties: Disease Ontology ID (P699)
    SELECT DISTINCT ?item ?doid ?itemLabel (group_concat(distinct ?itemaltLabel; separator="|") as ?altLabel) ?itemDesc WHERE {
      ?item wdt:P699 ?doid .
      values ?doid {"DOID:0050602" "DOID:0060308" "DOID:0060728" "DOID:10595" "DOID:11589" "DOID:2476" "DOID:5212"}
      # separate OPTIONAL blocks, so an item without synonyms still gets its description
      OPTIONAL {
        ?item skos:altLabel ?itemaltLabel .
        FILTER(LANG(?itemaltLabel) = "en")
      }
      OPTIONAL {
        ?item schema:description ?itemDesc .
        FILTER(LANG(?itemDesc) = "en")
      }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    group by ?item ?doid ?itemLabel ?itemDesc
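The query returns all synonyms of an item in a single ?altLabel cell joined with "|", so a client has to split that cell again. A minimal Python sketch follows. `split_alt_labels` is a hypothetical helper, the example labels are placeholders, and a synonym that itself contains "|" would be split incorrectly.

```python
def split_alt_labels(cell: str, separator: str = "|") -> list:
    """Split a group_concat cell back into individual synonyms.

    An empty cell (an item without synonyms) gives an empty list.
    """
    if not cell:
        return []
    return [label for label in cell.split(separator) if label]


# Placeholder cell value, shaped like the ?altLabel column of the query.
labels = split_alt_labels("first synonym|second synonym")
print(labels)
```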
    

Count language labels for diseases

The following query uses these:

  • Properties: Disease Ontology ID (P699)
    SELECT ?disease ?doid ?enLabel (count(?language) as ?languages) WHERE {
        ?disease wdt:P699 ?doid ;
                 rdfs:label ?label ;
                 rdfs:label ?enLabel .
        FILTER (lang(?enLabel) = "en")
        BIND (lang(?label) AS ?language)
    } group by ?disease ?doid ?enLabel
    order by desc(?languages)
    

Example counting statements by their determination method

Count the "genetic association" (P2293) claims linking gene to disease by determination method, to show how many come from GWAS versus other methods.

The following query uses these:

  • Properties: genetic association (P2293), determination method (P459)
    SELECT distinct (COUNT(*) as ?c) ?determinationLabel WHERE {
      ?gene p:P2293 ?statement . # all gene disease genetic associations
      ?statement ps:P2293 ?disease .  # get the value associated with the statement
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    } GROUP BY ?determinationLabel
    

Get all properties and their usage counts on items that are diseases

The following query uses these:

  • Properties: instance of (P31)
    SELECT ?propertyLabel ?propertyDescription ?pt ?count WHERE {
      {
        SELECT ?propertyclaim (COUNT(*) AS ?count) WHERE {
          ?item wdt:P31 wd:Q12136 .
          ?item ?propertyclaim [] .
        } GROUP BY ?propertyclaim
      }
      ?property wikibase:propertyType ?pt .
      ?property wikibase:claim ?propertyclaim .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    } ORDER BY DESC (?count)
    

Get all properties and their usage counts for statements where the values are diseases

i.e. these properties have values that are diseases

The following query uses these:

  • Properties: instance of (P31)
    SELECT ?propertyLabel ?propertyDescription ?pt ?count WHERE {
      {
        SELECT ?propertyclaim (COUNT(*) AS ?count) WHERE {
          ?id ?b ?item .
          ?item wdt:P31 wd:Q12136 .
          [] ?propertyclaim ?id .
        } GROUP BY ?propertyclaim
      }
      ?property wikibase:propertyType ?pt .
      ?property wikibase:claim ?propertyclaim .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    } ORDER BY DESC (?count)
    

Quality Control (QC) Queries

See here for more info.

Query Wikidata using SPARQL Through R

Example Query

library(SPARQL)
sparql <- "https://query.wikidata.org/bigdata/namespace/wdq/sparql"
query <- "PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P279 wd:Q1049021 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'en' . }
}
"
results <- SPARQL(sparql, query)
View(as.matrix(results$results))

Query Wikidata using SPARQL Through Python

Example Query

from wikidataintegrator.wdi_core import WDItemEngine
import pandas as pd
r = WDItemEngine.execute_sparql_query("""SELECT ?item ?itemLabel WHERE {
  ?item wdt:P279 wd:Q1049021 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}""")['results']['bindings']
df = pd.DataFrame([{k:v['value'] for k,v in item.items()} for item in r])
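
Where installing wikidataintegrator is not an option, the same request can be sketched with the Python standard library alone. This is an illustrative alternative, not the method used by this page. The helper names and the User-Agent string are hypothetical, and `run_query` needs network access, so it is defined but not called here.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query: str,
                  user_agent: str = "sparql-examples-sketch/0.1 (hypothetical)"):
    """Build a GET request asking for SPARQL JSON results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, headers=headers)


def bindings_to_rows(payload: dict) -> list:
    """Flatten SPARQL JSON results to a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_query(query: str) -> list:
    """Send the query to the endpoint and return flattened rows."""
    with urllib.request.urlopen(build_request(query)) as response:
        return bindings_to_rows(json.load(response))
```

The list returned by `run_query` has the same shape as the list passed to `pd.DataFrame` in the example above, so `pd.DataFrame(run_query(query))` would give a comparable table.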

Federated Queries

Wikidata -> WikiPathways

WikiPathways: Get drugs that act as channel blockers from Wikidata, get the pathways that these drugs are part of from WikiPathways

The following query uses these:

  • Properties: subclass of (P279), part of (P361), instance of (P31), physically interacts with (P129), P794
    PREFIX bd: <http://www.bigdata.com/rdf#>
    PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
    PREFIX dcterms:  <http://purl.org/dc/terms/>
    PREFIX dc:      <http://purl.org/dc/elements/1.1/>

    SELECT DISTINCT ?metabolite ?wikidatadrug ?wikidatadrugLabel ?title ?wpIdentifier WHERE {
      ?protein wdt:P279*|wdt:P361* wd:Q422500 .
      ?protein wdt:P31 wd:Q8054 .
      ?wikidatadrug wdt:P129 ?protein .
      ?wikidatadrug p:P129/pq:P794 wd:Q389934 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
      SERVICE <http://sparql.wikipathways.org/> {
        ?metabolite a wp:Metabolite ;
          wp:bdbWikidata ?wikidatadrug ;
          dcterms:isPartOf ?pathway .
        ?pathway a wp:Pathway .
        ?pathway dc:title ?title .
        ?pathway dc:identifier ?wpIdentifier .
      }
    } LIMIT 100
    

Get all interactions between metabolites in pathways from WikiPathways and their individual pKa values

The following query uses these:

  • Properties: pKa (P1117)
    #defaultView:Dimensions
    PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
    PREFIX dcterms:  <http://purl.org/dc/terms/>
    PREFIX dc:      <http://purl.org/dc/elements/1.1/>
    SELECT DISTINCT ?pwTitle ?metabolite1Label ?pKa1 ?pKa2 ?metabolite2Label WHERE {
      ?metabolite2 wdt:P1117 ?pKa2 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
      {SELECT * WHERE {
        ?metabolite1 wdt:P1117 ?pKa1 .
        {SELECT * WHERE {
           SERVICE <http://sparql.wikipathways.org/> {
             ?pathway dc:identifier ?pw ;
                      dc:title ?pwTitle ;
                      wp:organismName "Homo sapiens"^^xsd:string .
             ?interaction rdf:type wp:Interaction ;
                          wp:participants ?wpmb1, ?wpmb2 ;
                          dcterms:isPartOf ?pathway .
             ?wpmb1 wp:bdbWikidata ?metabolite1 .
             ?wpmb2 wp:bdbWikidata ?metabolite2 .
             FILTER (?wpmb1 != ?wpmb2)}
         }
        }
       }
      }
    }
    

Find genes regulated by an miRNA of interest from Wikidata and retrieve pathways this gene is active in from WikiPathways

Submit through the WikiPathways endpoint

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?mirna ?gene ?pathway ?pLabel WHERE {
  ?target dc:identifier ?exact .
    ?target dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway .
    ?pathway <http://purl.org/dc/elements/1.1/title> ?pLabel .
  SERVICE <https://query.wikidata.org/sparql> {
    ?mirna rdfs:label 'hsa-miR-211-5p'@en .
    ?mirna wdt:P128 ?gene .
    ?gene wdt:P2888 ?exact filter (?exact = <http://identifiers.org/ncbigene/1234>)
  }
} LIMIT 10


UniProt -> Wikidata

Retrieve all human membrane proteins annotated for a role in colorectal cancer

Submit through the UniProt endpoint

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?gene ?geneLabel ?wdncbi ?disease_text ?disease_annotation ?sa ?mesh_iri WHERE {    
  SERVICE <https://query.wikidata.org/sparql> {    
    ?gene wdt:P351 ?wdncbi ;
          wdt:P703 wd:Q15978631;
          rdfs:label ?geneLabel ;
          wdt:P688 ?wd_protein .
    ?wd_protein wdt:P352 ?uniprot_id ;
                wdt:P681 ?go_term .
    ?go_term wdt:P686 "GO:0016020" .
    FILTER (LANG(?geneLabel) = "en") .
    ?disease wdt:P31 wd:Q12136 .
    ?disease wdt:P486 ?mesh .
    ?disease wdt:P279* wd:Q188874 .
  }
  BIND(IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?uniprot_id)) as ?protein)
  BIND(IRI(CONCAT("https://id.nlm.nih.gov/mesh/", ?mesh)) as ?mesh_iri)
  ?protein up:annotation ?annotation .
  ?annotation a up:Disease_Annotation .
  ?annotation up:disease ?disease_annotation .
  ?disease_annotation <http://www.w3.org/2004/02/skos/core#prefLabel> ?disease_text .
  ?disease_annotation rdfs:seeAlso ?mesh_iri
}
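
The two BIND lines in the query above turn plain identifiers coming from Wikidata into the IRIs that the UniProt endpoint uses, which is what makes the join across the two sources possible. A small Python sketch of the same string construction is shown below. `uniprot_iri` and `mesh_iri` are hypothetical helper names, "D000000" is a placeholder MeSH ID, and the base IRIs are copied from the query.

```python
UNIPROT_BASE = "http://purl.uniprot.org/uniprot/"
MESH_BASE = "https://id.nlm.nih.gov/mesh/"


def uniprot_iri(accession: str) -> str:
    """Mirror BIND(IRI(CONCAT(<UniProt base>, ?uniprot_id)) as ?protein)."""
    return UNIPROT_BASE + accession


def mesh_iri(descriptor_id: str) -> str:
    """Mirror BIND(IRI(CONCAT(<MeSH base>, ?mesh)) as ?mesh_iri)."""
    return MESH_BASE + descriptor_id


# P78509 is the UniProt ID used for Reelin earlier on this page;
# "D000000" is a placeholder, not a real MeSH descriptor.
print(uniprot_iri("P78509"))
print(mesh_iri("D000000"))
```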


Select all human UniProt entries with a sequence variant that leads to a 'loss of function' and also physically interact with (P129) a drug with a qualifier of "use" (P366) of "enzyme inhibitor" (Q427492)

Submit through the UniProt endpoint

PREFIX keywords:<http://purl.uniprot.org/keywords/> 
PREFIX uniprotkb:<http://purl.uniprot.org/uniprot/> 
PREFIX ec:<http://purl.uniprot.org/enzyme/> 
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX skos:<http://www.w3.org/2004/02/skos/core#> 
PREFIX owl:<http://www.w3.org/2002/07/owl#> 
PREFIX bibo:<http://purl.org/ontology/bibo/> 
PREFIX dc:<http://purl.org/dc/terms/> 
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#> 
PREFIX faldo:<http://biohackathon.org/resource/faldo#> 
PREFIX up:<http://purl.uniprot.org/core/> 
PREFIX taxon:<http://purl.uniprot.org/taxonomy/> 
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX p: <http://www.wikidata.org/prop/>

SELECT DISTINCT ?wd_item ?physically_interacts_with ?interactswithLabel ?typeLabel ?iri ?uniprot ?text WHERE {
  {
    SELECT * WHERE {
      ?iri a up:Protein ;
           up:organism taxon:9606 ;
           up:annotation ?annotation .
      ?annotation a up:Natural_Variant_Annotation ;
                  rdfs:comment ?text .
      FILTER (CONTAINS(?text, 'loss of function'))
    }
  }
  SERVICE <https://query.wikidata.org/bigdata/namespace/wdq/sparql> {
    VALUES ?use { wd:Q427492 }
    ?wd_item wdt:P352 ?uniprot ;
             p:P129 ?physically_interacts_with_node ;
             wdt:P2888 ?iri ;
             wdt:P703 wd:Q15978631 .
    # same variable as above, so the qualifier belongs to this item's statement
    ?physically_interacts_with_node ps:P129 ?physically_interacts_with ;
                                    pq:P366 ?use .
    ?physically_interacts_with wdt:P31 ?type ;
                               rdfs:label ?interactswithLabel .
    ?type rdfs:label ?typeLabel .
    FILTER (lang(?interactswithLabel) = "en")
    FILTER (lang(?typeLabel) = "en")
  }
}

Maintenance Queries

See here: Maintenance Queries