User:ProteinBoxBot

#REDIRECTen:User:ProteinBoxBot
Soft redirect page


This user account is a bot with a bot flag. It is operated by Andrawaag and Sebotic.
  • Block this bot, if it is malfunctioning.
  • Check its work.
  • Contact the operator about mistakes.
  • See all Requests for Permissions related to this bot: 1 2 3 4

Purpose

The objective of this bot is to provide Wikidata with up-to-date, high-quality information about genes, diseases, and drugs from authoritative sources. These concepts will form the backbone on which many biomedical applications of Wikidata will be based. Specifically, this will make it possible to answer important biomedical questions using the Wikidata query service. We are working to establish a common set of standards for representing the evidence and provenance of this kind of information in Wikidata, and we will apply these standards to all of the work described below.
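
As a rough illustration of the kind of question the query service can answer, the sketch below builds a request URL for a SPARQL query that counts human gene items. The endpoint URL and the modeling used in the query (P31 "instance of", Q7187 "gene", P703 "found in taxon", Q15978631 "Homo sapiens") are assumptions supplied here for illustration and are not taken from this page; `build_query_url` is a hypothetical helper. The code only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed public SPARQL endpoint of the Wikidata query service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed modeling: gene items are "instance of" (P31) gene (Q7187)
# and "found in taxon" (P703) Homo sapiens (Q15978631).
HUMAN_GENE_COUNT_QUERY = """
SELECT (COUNT(?gene) AS ?count) WHERE {
  ?gene wdt:P31 wd:Q7187 ;
        wdt:P703 wd:Q15978631 .
}
""".strip()


def build_query_url(sparql: str, endpoint: str = WDQS_ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": sparql, "format": "json"})


url = build_query_url(HUMAN_GENE_COUNT_QUERY)
```

The resulting URL could be fetched with any HTTP client; if the items are modeled differently, only the query text would need to change.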

Sister Bots

To better divide the many tasks we are undertaking, our team also runs these bot accounts:

Bot tasks and state

The bots read from and write to Wikidata using a Python module called WikidataIntegrator. The open-source bot code is divided into a collection of tasks. The initial tasks establish sets of entities corresponding to the three main classes (genes, diseases, drugs) and create a stable cycle of updates. The next level of tasks establishes relationships between these entities. All bot edits are based on content from trusted, manually curated scientific resources. For additional information about each bot task, follow the links in the status table below.

Bot task Discussion started Coding and testing Production ready Is approved Is running update frequency last full cycle
Gene and protein items x x x x x monthly human genes (2017-01-30); human proteins (2017-01-30); mouse/rat/yeast genes (2017-02-06); mouse/rat/yeast proteins (2017-01-30)

Gene Ontology x x x x x monthly 2017-02-01
Disease items x x x x x new releases Disease Ontology release 2017-01-27 (Q28556593)(log)
Drug items x x x x x manually
Gene-drug links x x x x x manually 2016-08-17
Gene-disease links x x x x x manually
Drug-disease links x x x x x manually 2016-08-17
Microbial gene and protein items x x x x x monthly genes; proteins

Protein Families x new releases InterPro Release 61.0 (Q28543953)(log)
GO Protein Annotations x monthly 2017-01-04
Clinical trials
...
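
The idea of a stable update cycle can be sketched in plain Python, independent of WikidataIntegrator: compare the records in a source release with what is already recorded, then decide which items to create, update, or leave alone. Everything here (the `plan_update_cycle` function, the record layout keyed by an external identifier, and the sample values) is hypothetical and only illustrates the comparison step; it is not the bot's actual code.

```python
from typing import Dict, List


def plan_update_cycle(
    source: Dict[str, dict], existing: Dict[str, dict]
) -> Dict[str, List[str]]:
    """Sort external IDs into create / update / unchanged / missing buckets.

    Both arguments map an external identifier to that record's data.
    """
    plan: Dict[str, List[str]] = {
        "create": [], "update": [], "unchanged": [], "missing_from_source": []
    }
    for ext_id, record in sorted(source.items()):
        if ext_id not in existing:
            plan["create"].append(ext_id)
        elif existing[ext_id] != record:
            plan["update"].append(ext_id)
        else:
            plan["unchanged"].append(ext_id)
    # Records that vanished from the source are flagged for review, not deleted.
    plan["missing_from_source"] = sorted(set(existing) - set(source))
    return plan


# Made-up sample data for illustration only.
source_release = {"1": {"symbol": "A"}, "2": {"symbol": "B2"}, "3": {"symbol": "C"}}
already_recorded = {"2": {"symbol": "B"}, "3": {"symbol": "C"}, "4": {"symbol": "D"}}
plan = plan_update_cycle(source_release, already_recorded)
```

Running the same plan on every source release keeps repeated cycles from rewriting items whose data has not changed.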

Legalities

Much of the work done by this bot involves importing, synchronizing, and maintaining information brought in from other sources. Where those sources are not entirely in the public domain, specific agreements need to be reached about which content can be brought into Wikidata and hence released under CC0. We will track these agreements on the legal subpage.

The team

Past participants / operators

Task permission requests

Discussions

Sprints

Bot development cycle

  1. Manually model one or two example entries.
  2. Develop the bot on 10 entries.
  3. Do a test run on 100 entries.
  4. Wait for possible constraint violations to surface.
  5. Perform a full run.
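
The growing batch sizes in this cycle can be sketched as a small generator. This is a minimal illustration under stated assumptions: the stage names, the `staged_batches` function, and the entry identifiers are hypothetical, and only the sizes (1 or 2, 10, 100, everything) come from the list above.

```python
from typing import Iterator, List, Optional, Sequence, Tuple

# Stage sizes taken from the development cycle; None stands for the full run.
STAGE_SIZES: List[Tuple[str, Optional[int]]] = [
    ("manual modeling", 2),
    ("development", 10),
    ("test run", 100),
    ("full run", None),
]


def staged_batches(entries: Sequence[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (stage name, entries to process) for each stage in order."""
    for name, size in STAGE_SIZES:
        batch = list(entries) if size is None else list(entries[:size])
        yield name, batch


# Hypothetical entry identifiers for illustration.
entries = [f"entry-{i}" for i in range(250)]
sizes = [(name, len(batch)) for name, batch in staged_batches(entries)]
```

In practice the loop would pause after the 100-entry test run so that constraint violations can be reviewed before the full run starts.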

Useful Links

Publications, presentations

See also: Presentations on the WikiProject Molecular and Cellular Biology

Type Title / link Date
Presentation Opportunities and challenges presented by Wikidata in the context of biocuration 2016-08-01
Poster Wikidata: a central hub of linked open life science data 2015-04-22
Poster Wikidata: a central hub of linked open life science data 2015-04-23
Presentation Crowd Sourcing Methods to Annotate Biological Processes 2015-05-11
Presentation Lets eat soup together - RD Connect workshop on data linkage and ontologies in rare diseases Rome 2015-09-24
Presentation Open Biomedical Knowledge: Wikipedia, Wikidata and Beyond - WikiConferenceUSA 2015 2015-10-12
Publication Wikidata: A platform for data integration and dissemination for the life sciences and beyond 2015-11-16
Publication Wikidata as a semantic framework for the Gene Wiki initiative (Q23712646) Link 2016-03-17
Publication Centralizing content and distributing labor: a community model for curating the very long tail of microbial genomes (Q21503281) Link 2016-03-28
Publication WikiGenomes: an open Web application for community consumption and curation of gene annotation data in Wikidata. 2017-01-24

Network View

Network of the current status of the ProteinBoxBot Wikidata project