Wikidata:WikiProject Molecular biology/Properties

Goals

  • This page aims to organize a consensus view of the properties that describe molecular biology concepts. Please be bold and add your suggestions below! (For example, what property should we create to capture connections between genes and the categories defined by the Gene Ontology or the Disease Ontology?)

Rules of this page:

  • Feel free to add a new property for discussion in the tables. Set its "Creation level" to "Proposal".
  • Please use the talk page to discuss the creation or use of properties. To discuss a single property, create a new section on the talk page, set the property's "Creation level" to "Discussion", and link the property's table entry to that section.

Other relevant pages

Understanding properties: Properties link to particular datatypes. http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values

See examples on the (currently much more complete) Wikidata:Chemistry task force/Properties.

General properties for genes and proteins

See the properties that the ProteinBoxBot understands.

Application of data

Identifier Properties

Human genes


Title | ID | Data type | Description | Examples | Inverse
Entrez Gene ID | P351 | External identifier | Entrez: identifier for a gene per the NCBI Entrez database | Cyclin dependent kinase 2 <Entrez Gene ID> 1017 | -
HGNC gene symbol | P353 | External identifier | identifier for a human gene | RELN <HGNC gene symbol> RELN | -
HGNC ID | P354 | External identifier | HUGO Gene Nomenclature Committee: identifier for a gene from the HGNC database | RELN <HGNC ID> 9957 | -
OMIM ID | P492 | External identifier | disease: Online "Mendelian Inheritance in Man" catalogue codes for diseases | Huntington's disease <OMIM ID> 143100 | -
Ensembl Gene ID | P594 | External identifier | gene: identifier for a gene as per the Ensembl (European Bioinformatics Institute and the Wellcome Trust Sanger Institute) database | MB <Ensembl Gene ID> ENSG00000198125 | -
genomic start | P644 | String | genomic starting coordinate of the biological sequence (e.g. a gene) | RELN <genomic start> 103112231 | -
genomic end | P645 | String | genomic ending coordinate of the biological sequence (e.g. a gene) | RELN <genomic end> 103629963 | -
genomic assembly | P659 | Item | specify the genome assembly on which the feature is placed | RELN <genomic assembly> Genome assembly GRCh38 | -
HomoloGene ID | P593 | String | HomoloGene: identifier in the HomoloGene database | Rhodopsin <HomoloGene ID> 68068 | -
Refseq Genome ID | P2249 | External identifier | ID in the RefSeq Genome database | Chlamydia trachomatis D/UW-3/CX <Refseq Genome ID> NC_000117.1 | -
  • proposed: Alias (other gene symbols, e.g. retired ones, used to name this gene). Note that item labels also have aliases outside the property structure.
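
Because genomic start (P644) and genomic end (P645) have the String datatype, a consumer has to convert the values before doing any arithmetic on them. The following is a minimal sketch using the RELN values from the table. It assumes 1-based, inclusive coordinates that both refer to the same genomic assembly (P659); the function name `feature_length` is hypothetical and not part of any Wikidata tool.

```python
def feature_length(genomic_start: str, genomic_end: str) -> int:
    """Length in base pairs of a feature given String-typed coordinates.

    Assumption: coordinates are 1-based and inclusive, and both values
    refer to the same genome assembly (P659).
    """
    start, end = int(genomic_start), int(genomic_end)
    if start > end:
        # Assumption: a feature may be recorded end-first; normalise the order.
        start, end = end, start
    return end - start + 1


# RELN coordinates as listed in the table (stored as strings).
reln_length = feature_length("103112231", "103629963")
print(reln_length)
```

If the coordinates were stored with a numeric datatype the conversion step would be unnecessary, which may be worth weighing when proposing similar properties.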

Human proteins

Title | ID | Data type | Description | Examples | Inverse
UniProt ID | P352 | External identifier | UniProt: identifier for a protein per the UniProt database | RELN <UniProt ID> P78509 | -
PDB ID | P638 | External identifier | Protein Data Bank: gene entry for 3D structural data as per the PDB (Protein Data Bank) database | Property talk:P638 | -
EC number | P591 | String | Enzyme Commission number: classification scheme for enzymes | Triacylglycerol lipase <EC number> 3.1.1.3 | -
RefSeq Protein ID | P637 | External identifier | RefSeq: identifier for a protein | Reelin <RefSeq Protein ID> NP_005036.2 | -
Ensembl Protein ID | P705 | External identifier | protein: identifier for a protein issued by Ensembl database | Reelin <Ensembl Protein ID> ENSP00000392423 and ENSP00000345694 | -
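
External identifiers such as these are easy to mistype, so a simple format check before adding a statement can catch obvious errors. The sketch below is illustrative only: the regular expressions are assumptions inferred from the example values listed for these properties, not the official format constraints of UniProt, RefSeq, or Ensembl, and the names `ID_PATTERNS` and `looks_valid` are hypothetical.

```python
import re

# Illustrative patterns inferred from the example values on this page.
# They are assumptions, not the databases' official format definitions.
ID_PATTERNS = {
    "P352": re.compile(r"[OPQ][0-9][A-Z0-9]{3}[0-9]"),  # UniProt ID, e.g. P78509
    "P637": re.compile(r"NP_\d+(\.\d+)?"),              # RefSeq Protein ID, e.g. NP_005036.2
    "P705": re.compile(r"ENSP\d{11}"),                  # Ensembl Protein ID, e.g. ENSP00000392423
}


def looks_valid(property_id: str, value: str) -> bool:
    """Return True if the whole value matches the sketch pattern for the property."""
    pattern = ID_PATTERNS.get(property_id)
    if pattern is None:
        raise KeyError(f"no pattern registered for {property_id}")
    return pattern.fullmatch(value) is not None


print(looks_valid("P352", "P78509"))
print(looks_valid("P705", "ENSG00000198125"))  # a gene ID, not a protein ID
```

A real check would need broader patterns (for example, UniProt accessions that do not start with O, P, or Q), so treat a failed match as a prompt for review rather than proof of an error.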

Mouse genes

Title | ID | Data type | Description | Examples | Inverse
Mouse Genome Informatics ID | P671 | External identifier | Mouse Genome Informatics: identifier for a gene in the Mouse Genome Informatics database | Myoglobin <Mouse Genome Informatics ID> MGI:96922 | -

Mouse proteins

Unsorted

Title | ID | Data type | Description | Examples | Inverse
RefSeq RNA ID | P639 | External identifier | RNA Identifier | RELN <RefSeq RNA ID> NM_005045 | -
chromosome | P1057 | Item | chromosome: chromosome on which an entity is localized | RELN <chromosome> Homo sapiens chromosome 7 | -

Proposed Media Properties

Title | ID | Data type | Description | Examples | Inverse
chemical structure | P117 | Commons media file | image of a representation of the structure for a chemical compound | methane <chemical structure> Methan Keilstrich.svg | -
Gene Atlas Image | P692 | Commons media file | image showing the GeneAtlas expression pattern | RELN <Gene Atlas Image> PBB GE RELN 205923 at tn.png | -

Proposed properties linking genes to other biological concepts (cell components, processes, etc.)


Title | ID | Data type | Description | Examples | Inverse
found in taxon | P703 | Item | the taxon in which the item can be found | RELN <found in taxon> human | -
cell component | P681 | Item | cellular component: component of the cell in which this item is present | Reelin <cell component> cytoplasm | -
biological process | P682 | Item | biological process: is involved in the biological process | Neurotrophin 3 <biological process> activation of MAPK activity | -
molecular function | P680 | Item | represents gene ontology function annotations | RELN <molecular function> metal ion binding | -
regulates (molecular biology) | P128 | Item | process regulated by a protein or RNA in molecular biology | Reelin <regulates (molecular biology)> Neural development | -

Notes:

  • P682 (biological process): As in, Reelin is involved in the process of neuron migration. Use it to represent Gene Ontology process annotations. Gene Ontology defines a biological process as "operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms". The biological process (Q2996394) property is a predicate that links a gene or protein subject such as BRCA1 (Q227339) to a specific biological process object such as DNA repair (Q210538). A typical reference for the statement is a link to the subject's entry on the Gene Ontology website. For the BRCA1, biological process, DNA repair example, the reference would be http://amigo.geneontology.org/cgi-bin/amigo/gp-assoc.cgi?gp=UniProtKB:C6YB45
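
The subject, predicate, object, and reference structure described in this note can be pictured with a small data structure. This is a sketch for discussion only: the `Statement` class and `describe` function are hypothetical and do not reflect the Wikibase data model or any bot's API. The identifiers and the reference URL are the ones given in the note.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Statement:
    """One subject-property-value claim plus its supporting reference URLs."""
    subject: str                      # item ID of the gene or protein
    prop: str                         # property ID
    value: str                        # item ID of the object
    references: Tuple[str, ...] = ()  # reference URLs backing the claim


def describe(statement: Statement) -> str:
    """Render the claim in the 'subject <property> value' style used in the tables."""
    return f"{statement.subject} <{statement.prop}> {statement.value}"


brca1_dna_repair = Statement(
    subject="Q227339",  # BRCA1
    prop="P682",        # biological process
    value="Q210538",    # DNA repair
    references=(
        "http://amigo.geneontology.org/cgi-bin/amigo/gp-assoc.cgi?gp=UniProtKB:C6YB45",
    ),
)

print(describe(brca1_dna_repair))
```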
Property | Datatype | Creation level | Description | Links | Comments
Taxon | Item | Proposal | Taxon / species in which the gene/protein is encoded | |
contains_domain | Item | Proposal | As in, Reelin contains the domains "Reeler domain" and "BNR/Asp-box repeat" | |

Proposed Properties linking genes to genes

Title | ID | Data type | Description | Examples | Inverse
physically interacts with | P129 | Item | physical entity that the subject interacts with | track chain <physically interacts with> soil | -
ortholog | P684 | Item | orthology: orthologous gene in another species (use with 'species' qualifier) | RELN <ortholog> Reln | -

Property | Datatype | Creation level | Description | Links | Comments
Activates | Item | Proposal | The product of this gene activates the function of the target gene | |
Inhibits | Item | Proposal | The product of this gene inhibits the function of the target gene | |
Binds to | Item | Proposal | The product of this gene binds to the product of the target gene | |
Phenotype | Item | Proposal | See use in http://string-db.org | |
Catalysis | Item | Proposal | See use in String database | |
Post-translationally-modifies | Item | Proposal | See use in String database | |
Reaction | Item | Proposal | See use in String database | |
Expression | Item | Proposal | See use in String database | |

General properties for genomics

Property | Datatype | Creation level | Description | Links | Comments
Genome size (or Genome length) | Number | Proposal | The size (or length) of the genome for a given species | wikipedia:Genome_size | Currently being discussed here: Wikidata:Property_proposal/Natural_science#Genome_size
Number of genes | Number | Proposal | The number of genes for a given species | |
Nucleic acid type | String | Proposal | One of: ssDNA / dsDNA / ssRNA / dsRNA | |
Number of chromosomes | Number | Proposal | The number of chromosomes in a genome | |
  • proposed: Genome assembly database identifiers. See [1]
  • proposed: ENA Sequence identifier.

General properties for pathways

Proposed identifier properties

Title | ID | Data type | Description | Examples | Inverse
KEGG ID | P665 | External identifier | Kyoto Encyclopedia of Genes and Genomes: identifier from databases dealing with genomes, enzymatic pathways, and biological chemicals | ascorbic acid <KEGG ID> D00018 | -

Property | Datatype | Creation level | Description | Links | Comments
Wikipathways ID | String | Proposal | WikiPathways Identifier. | http://www.wikipathways.org |

Drugs

Identifiers


Title | ID | Data type | Description | Examples | Inverse
Drugbank ID | P715 | External identifier | DrugBank: identifier in the bioinformatics and cheminformatics database from the University of Alberta | Vitamin C <Drugbank ID> 00126 | -
ChemSpider ID | P661 | External identifier | ChemSpider: identifier in a free chemical database, owned by the Royal Society of Chemistry | methadone <ChemSpider ID> 3953 | -

Interactions

Title | ID | Data type | Description | Examples | Inverse
significant drug interaction | P769 | Item | drug interaction: clinically significant interaction between two pharmacologically active substances (i.e., drugs and/or active metabolites) where concomitant intake can lead to altered effectiveness or adverse drug events | warfarin <significant drug interaction> lovastatin | -



Notes: The following drug objects should serve as the unifying examples for drugs in Wikidata. To cover all major identifiers, several new properties will be requested shortly (e.g. WHO INN, USAN).

Taxa

Identifiers

Title | ID | Data type | Description | Examples | Inverse
NCBI Taxonomy ID | P685 | External identifier | Taxonomy database of the U.S. National Center for Biotechnology Information: identifier for a taxon in the Taxonomy Database by the National Center for Biotechnology Information | human <NCBI Taxonomy ID> 9606 | -
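
To illustrate how the identifier and linking properties on this page combine, the sketch below builds (but does not run) a SPARQL query that finds genes in a taxon via NCBI Taxonomy ID (P685), found in taxon (P703), and Entrez Gene ID (P351). The `wdt:` prefix is the Wikidata Query Service convention for direct claims; the function name `genes_in_taxon_query` and the query shape are assumptions made for illustration.

```python
def genes_in_taxon_query(ncbi_taxonomy_id: str, limit: int = 10) -> str:
    """Build a SPARQL query string; nothing is sent over the network."""
    if not ncbi_taxonomy_id.isdigit():
        # Assumption: NCBI Taxonomy IDs are numeric strings, like 9606 for human.
        raise ValueError("NCBI Taxonomy IDs are expected to be numeric strings")
    return (
        "SELECT ?gene ?entrez WHERE {\n"
        f'  ?taxon wdt:P685 "{ncbi_taxonomy_id}" .\n'  # taxon with this NCBI Taxonomy ID
        "  ?gene wdt:P703 ?taxon ;\n"                   # item found in that taxon
        "        wdt:P351 ?entrez .\n"                  # and carrying an Entrez Gene ID
        "}\n"
        f"LIMIT {limit}"
    )


print(genes_in_taxon_query("9606"))
```

The resulting string could be pasted into a SPARQL endpoint that exposes these properties; whether it returns results depends on how many items actually carry the statements.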