Wikidata:WikiProject Molecular biology/Properties

Goals

  • This page aims to organize a consensus view of the properties that describe molecular biology concepts. Please be bold and add your suggestions below! (For example, what property should we create to capture connections between genes and the categories defined by the Gene Ontology or the Disease Ontology?)

Rules of this page:

  • Feel free to add new properties for discussion in the tables. Set the "Creation level" to Proposal.
  • Please use the talk page to discuss the creation or use of properties. To discuss a specific property, create a new section on the talk page, set the "Creation level" to Discussion, and link the property in the table to that section.

Other relevant pages

Understanding properties: each property is tied to a particular datatype, which determines the kind of value it can hold. See http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values

See examples on the (currently much more complete) Wikidata:Chemistry task force/Properties.

General properties for genes and proteins

See the properties that the ProteinBoxBot understands.

Application of data

Identifier Properties

Human genes

Title | ID | Data type | Description | Examples | Inverse
Entrez Gene ID | P351 | External identifier | Entrez: identifier for a gene per the NCBI Entrez database | Cyclin dependent kinase 2 <Entrez Gene ID> 1017 | -
HGNC gene symbol | P353 | External identifier | Human Genome Organisation: identifier for a human gene | RELN <HGNC gene symbol> RELN | -
HGNC ID | P354 | External identifier | HUGO Gene Nomenclature Committee: identifier for a gene from the HGNC database | RELN <HGNC ID> 9957 | -
OMIM ID | P492 | External identifier | disease, gene and phenotype: Online "Mendelian Inheritance in Man" catalogue codes for diseases, genes, or phenotypes | Huntington disease <OMIM ID> 143100 | -
Ensembl Gene ID | P594 | External identifier | gene: identifier for a gene as per the Ensembl (European Bioinformatics Institute and the Wellcome Trust Sanger Institute) database | MB <Ensembl Gene ID> ENSG00000198125 | -
genomic start | P644 | String | genomic starting coordinate of the biological sequence (e.g. a gene) | RELN <genomic start> 103112231 | -
genomic end | P645 | String | genomic ending coordinate of the biological sequence (e.g. a gene) | RELN <genomic end> 103629963 | -
genomic assembly | P659 | Item | specify the genome assembly on which the feature is placed | RELN <genomic assembly> Genome assembly GRCh38 | -
HomoloGene ID | P593 | String | HomoloGene: identifier in the HomoloGene database | Rhodopsin <HomoloGene ID> 68068 | -
Refseq Genome ID | P2249 | External identifier | National Center for Biotechnology Information: ID in the RefSeq Genome database | Chlamydia trachomatis D/UW-3/CX <Refseq Genome ID> NC_000117.1 | -
  • proposed: Alias (other gene symbols, e.g. retired ones, used to name this gene). Note that item labels also have aliases outside the property structure.
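
Because genomic start (P644) and genomic end (P645) use the String datatype, a consumer has to convert the values before doing any arithmetic with them. The following is a minimal sketch using the RELN example values from the table. It assumes 1-based, inclusive coordinates on the same assembly, which this page does not state, and the function name `feature_length` is hypothetical.

```python
def feature_length(genomic_start: str, genomic_end: str) -> int:
    """Return the span in base pairs between two coordinate strings.

    Assumes 1-based, inclusive coordinates on the same assembly and
    chromosome; this convention is an assumption, not stated on the page.
    """
    start = int(genomic_start)
    end = int(genomic_end)
    if start > end:
        # Tolerate swapped values, e.g. for features on the reverse strand.
        start, end = end, start
    return end - start + 1


# RELN example values for genomic start (P644) and genomic end (P645).
reln_length = feature_length("103112231", "103629963")
print(reln_length)
```

A Number datatype would make the conversion unnecessary, but the sketch follows the String datatype listed for both properties.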

Human proteins

Title | ID | Data type | Description | Examples | Inverse
UniProt protein ID | P352 | External identifier | UniProt: identifier for a protein per the UniProt database. | RELN <UniProt protein ID> P78509 | -
PDB structure ID | P638 | External identifier | Protein Data Bank: identifier for 3D structural data as per the PDB (Protein Data Bank) database | Hydroxysteroid 11-beta dehydrogenase 1 <PDB structure ID> 4P38 and 1XU7 | -
EC number | P591 | String | Enzyme Commission number: classification scheme for enzymes | Triacylglycerol lipase <EC number> 2.7.3.2 | -
RefSeq Protein ID | P637 | External identifier | RefSeq: identifier for a protein | Reelin <RefSeq Protein ID> NP_005036.2 | -
Ensembl Protein ID | P705 | External identifier | protein: identifier for a protein issued by Ensembl database | Reelin <Ensembl Protein ID> ENSP00000392423 and ENSP00000345694 | -

Mouse genes

Title | ID | Data type | Description | Examples | Inverse
Mouse Genome Informatics ID | P671 | External identifier | Mouse Genome Informatics: identifier for a gene in the Mouse Genome Informatics database | Myoglobin <Mouse Genome Informatics ID> MGI:96922 | -

Mouse proteins

Unsorted

Title | ID | Data type | Description | Examples | Inverse
RefSeq RNA ID | P639 | External identifier | RefSeq: RNA Identifier | RELN <RefSeq RNA ID> NM_005045 | -
chromosome | P1057 | Item | chromosome: chromosome on which an entity is localized | RELN <chromosome> human chromosome 7 | -

Proposed Media Properties

Title | ID | Data type | Description | Examples | Inverse
chemical structure | P117 | Commons media file | chemical structure: image of a representation of the structure for a chemical compound | methane <chemical structure> Methan Keilstrich.svg | -
Gene Atlas Image | P692 | Commons media file | image showing the GeneAtlas expression pattern | RELN <Gene Atlas Image> PBB GE RELN 205923 at tn.png | -

Proposed properties linking genes to other biological concepts (cell components, processes, etc.)

Title | ID | Data type | Description | Examples | Inverse
found in taxon | P703 | Item | the taxon in which the item can be found | RELN <found in taxon> human | -
cell component | P681 | Item | cellular component: component of the cell in which this item is present | Reelin <cell component> cytoplasm | -
biological process | P682 | Item | biological process: is involved in the biological process | Neurotrophin 3 <biological process> activation of MAPK activity | -
molecular function | P680 | Item | molecular function: represents gene ontology function annotations | RELN <molecular function> metal ion binding | -
regulates (molecular biology) | P128 | Item | process regulated by a protein or RNA in molecular biology | Reelin <regulates (molecular biology)> Neural development | -
encodes | P688 | Item | the product of a gene (protein or RNA) | RELN <encodes> Reelin | encoded by
encoded by | P702 | Item | the gene that encodes some gene product | Reelin <encoded by> RELN | encodes
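
The encodes (P688) and encoded by (P702) rows name each other as inverses, so a statement in one direction implies a matching statement in the other. The sketch below shows one way to find missing inverse statements. It assumes statements are held as plain (subject, property, object) tuples and uses item labels instead of item IDs for readability. The names `INVERSE` and `missing_inverses` are hypothetical.

```python
# Item-to-item statements as (subject, property, object) triples.
# Labels stand in for item IDs here purely for readability.
statements = {
    ("RELN", "P688", "Reelin"),   # RELN <encodes> Reelin
    ("Reelin", "P702", "RELN"),   # Reelin <encoded by> RELN
}

# Inverse pairs taken from the "Inverse" column of the table.
INVERSE = {"P688": "P702", "P702": "P688"}


def missing_inverses(triples):
    """Return the inverse triples that are implied but not present."""
    missing = set()
    for subject, prop, obj in triples:
        inverse_prop = INVERSE.get(prop)
        if inverse_prop is None:
            continue  # property has no declared inverse
        expected = (obj, inverse_prop, subject)
        if expected not in triples:
            missing.add(expected)
    return missing


print(missing_inverses(statements))
```

With both directions present, as in the RELN and Reelin example, the function returns an empty set.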

Notes:

  • P682 (biological process): as in, Reelin is involved in the process of neuron migration. Use it to represent Gene Ontology process annotations, where a biological process is defined as "operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms"; see Gene Ontology. The biological process (Q2996394) property is a predicate that links a gene or protein subject such as BRCA1 (Q227339) with a specific biological process object such as DNA repair (Q210538). A typical reference for the statement is a link to the subject's entry on the Gene Ontology website. For the BRCA1-biological process-DNA repair example, the reference would be http://amigo.geneontology.org/cgi-bin/amigo/gp-assoc.cgi?gp=UniProtKB:C6YB45.
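
The subject, predicate, object, and reference pattern described in this note can be written down as a small data structure. This is only a sketch: the `Statement` class and its field names are hypothetical and do not mirror the Wikidata data model or any client library. The item IDs and the reference URL are the ones given in the note.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Statement:
    """Hypothetical container for one referenced statement."""
    subject: str                           # item ID of the gene or protein
    prop: str                              # property ID used as the predicate
    obj: str                               # item ID of the biological process
    reference_urls: Tuple[str, ...] = ()   # sources backing the statement

    def is_referenced(self) -> bool:
        """True if at least one reference URL is attached."""
        return len(self.reference_urls) > 0


# BRCA1 (Q227339) <biological process (P682)> DNA repair (Q210538)
brca1_dna_repair = Statement(
    subject="Q227339",
    prop="P682",
    obj="Q210538",
    reference_urls=(
        "http://amigo.geneontology.org/cgi-bin/amigo/gp-assoc.cgi?gp=UniProtKB:C6YB45",
    ),
)
print(brca1_dna_repair.is_referenced())
```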
Property | Datatype | Creation level | Description | Links | Comments
Taxon | Item | Proposal | Taxon or species in which the gene/protein is encoded | |
contains_domain | Item | Proposal | As in, Reelin contains the domains "Reeler domain" and "BNR/Asp-box repeat" | |

Proposed Properties linking genes to genes

Title | ID | Data type | Description | Examples | Inverse
physically interacts with | P129 | Item | physical entity that the subject interacts with | track chain <physically interacts with> soil | -
ortholog | P684 | Item | orthology: orthologous gene in another species (use with 'species' qualifier) | RELN <ortholog> Reln | -

Property | Datatype | Creation level | Description | Links | Comments
Activates | Item | Proposal | The product of this gene activates the function of the target gene | |
Inhibits | Item | Proposal | The product of this gene inhibits the function of the target gene | |
Binds to | Item | Proposal | The product of this gene binds to the product of the target gene | |
Phenotype | Item | Proposal | See use in http://string-db.org | |
Catalysis | Item | Proposal | See use in String database | |
Post-translationally-modifies | Item | Proposal | See use in String database | |
Reaction | Item | Proposal | See use in String database | |
Expression | Item | Proposal | See use in String database | |

General properties for genomics

Property | Datatype | Creation level | Description | Links | Comments
Genome size (or Genome length) | Number | Proposal | The size (or length) of the genome for a given species | wikipedia:Genome_size | Currently being discussed here: Wikidata:Property_proposal/Natural_science#Genome_size
Number of genes | Number | Proposal | The number of genes for a given species | |
Nucleic acid type | String | Proposal | Is it: ssDNA / dsDNA / ssRNA / dsRNA | |
Number of chromosomes | Number | Proposal | The number of chromosomes in a genome | |
  • proposed: Genome assembly database identifiers. See [1]
  • proposed: ENA Sequence identifier.

General properties for pathways

Proposed identifier properties

Title | ID | Data type | Description | Examples | Inverse
KEGG ID | P665 | External identifier | Kyoto Encyclopedia of Genes and Genomes: identifier from databases dealing with genomes, enzymatic pathways, and biological chemicals | ascorbic acid <KEGG ID> D00018 | -

Property | Datatype | Creation level | Description | Links | Comments
Wikipathways ID | String | Proposal | WikiPathways Identifier. | http://www.wikipathways.org |

Drugs

Identifiers

Title | ID | Data type | Description | Examples | Inverse
Guide to Pharmacology Ligand ID | P595 | External identifier | International Union of Basic and Clinical Pharmacology and IUPHAR/BPS Guide to PHARMACOLOGY: ligand identifier of the Guide to Pharmacology database | cocaine <Guide to Pharmacology Ligand ID> 2286 | -
ChEMBL ID | P592 | External identifier | ChEMBL: identifier from a chemical database of bioactive molecules with drug-like properties | tropicamide <ChEMBL ID> CHEMBL1200604 | -
HomoloGene ID | P593 | String | HomoloGene: identifier in the HomoloGene database | Rhodopsin <HomoloGene ID> 68068 | -
Drugbank ID | P715 | External identifier | DrugBank: identifier in the bioinformatics and cheminformatics database from the University of Alberta | Vitamin C <Drugbank ID> 00126 | -
ChemSpider ID | P661 | External identifier | ChemSpider: identifier in a free chemical database, owned by the Royal Society of Chemistry | methadone <ChemSpider ID> 3953 | -

Interactions

Title | ID | Data type | Description | Examples | Inverse
significant drug interaction | P769 | Item | drug interaction: clinically significant interaction between two pharmacologically active substances (i.e., drugs and/or active metabolites) where concomitant intake can lead to altered effectiveness or adverse drug events. | warfarin <significant drug interaction> Lovastatin | -


Notes: The following drug objects should serve as the unifying examples for drugs in Wikidata. To cover all major identifiers, several new properties will be requested shortly (e.g. WHO INN, USAN).

Taxa

Identifiers

Title | ID | Data type | Description | Examples | Inverse
NCBI Taxonomy ID | P685 | External identifier | Taxonomy database of the U.S. National Center for Biotechnology Information: identifier for a taxon in the Taxonomy Database by the National Center for Biotechnology Information | human <NCBI Taxonomy ID> 9606 | -