Wikidata:Property proposal/PharmGKB ID

PharmGKB ID

Originally proposed at Wikidata:Property proposal/Generic

   Done: PharmGKB ID (P7001) (Talk and documentation)
Description: A unique identifier for an entity in the PharmGKB knowledgebase.
Represents: PharmGKB (Q18392556)
Data type: External identifier
Domain: gene (Q7187), chemical compound (Q11173), disease (Q12136), biological pathway (Q4915012), sequence variant (Q15304597), scholarly article (Q13442814), medication (Q12140), research (Q42240), drug (Q8386)
Allowed values: PA\d+
Example 1: TPMT (Q14880759) → PA356
Example 2: tamoxifen (Q412178) → PA451581
Example 3: warfarin (Q407431) → PA451906
Example 4: breast neoplasm (Q58833934) → PA443560
Example 5: NUDT16 (Q18050218) → PA134955224
Number of IDs in source: < 20000 (ref)
Expected completeness: always incomplete (Q21873886)
Formatter URL: $1
Robot and gadget jobs: Adding PharmGKB IDs to Wikidata and checking cross-references.
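
The allowed-values pattern can be checked against the five example IDs listed in this proposal. The following is a minimal Python sketch, assuming the pattern is meant to match the whole value; the helper name `is_valid_pharmgkb_id` is hypothetical and not part of any Wikidata or PharmGKB tooling.

```python
import re

# Allowed-values pattern from the proposal. re.ASCII restricts \d to 0-9;
# fullmatch() requires the entire string to match, not just a prefix.
PHARMGKB_ID = re.compile(r"PA\d+", re.ASCII)


def is_valid_pharmgkb_id(value: str) -> bool:
    """Return True if value is 'PA' followed by one or more digits."""
    return PHARMGKB_ID.fullmatch(value) is not None


# Example IDs taken from the proposal table.
examples = ["PA356", "PA451581", "PA451906", "PA443560", "PA134955224"]

if __name__ == "__main__":
    for accession in examples:
        print(accession, is_valid_pharmgkb_id(accession))
    # A typed form such as "gene/PA356" does not satisfy the bare pattern.
    print("gene/PA356", is_valid_pharmgkb_id("gene/PA356"))
```

A check like this could run before a bot adds statements, so that values carrying a type prefix are caught early.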


We would like to start adding PharmGKB IDs for genes and drugs/chemicals in Wikidata. PharmGKB 20190701

Jasper Deng
Egon Willighagen
Denise Slenter
Daniel Mietchen
Andy Mabbett
Emily Temple-Wood
Pablo Busatto (Almondega)
Antony Williams (EPA)
Devon Fyson
Samuel Clark
Tris T7
Notified participants of WikiProject Chemistry. Eihel (talk) Last user of Chemistry Project: Pablo Busatto (Almondega)


  • Support David (talk) 06:32, 2 July 2019 (UTC)
  • Support Germartin1 (talk) 19:07, 4 July 2019 (UTC)
  • Support --DannyS712 (talk) 19:12, 4 July 2019 (UTC)
  • Support --Tinker Bell 23:42, 5 July 2019 (UTC)
  • Support --Trade (talk) 21:53, 7 July 2019 (UTC)
  • Support - Premeditated (talk) 11:23, 9 July 2019 (UTC)
  • Support @PharmGKB: shouldn't the IDs be gene/PA356 or chemical/PA451581 for use in a formatter URL?
  • Neutral I maintain the enwiki en:Template:Infobox drug, and I have never heard of this database or its name. Nobody has ever asked to add this ID to the template. (Dozens of such databases exist. Would some general guideline be advisable?) -DePiep (talk) 23:06, 11 July 2019 (UTC)
  • Support --Leiem (talk) 02:26, 12 July 2019 (UTC)
  • @PharmGKB: How many chemicals do you have in your database? Snipre (talk) 06:12, 12 July 2019 (UTC)
  • @Snipre: Currently at 3965 chemicals -- PharmGKB
  • @DePiep: We're focused on curated pharmacogenetic knowledge, so we only add chemicals when we have something of pharmacogenetic interest about it. Getting into the Infobox would be fantastic, but right now our primary goal is to make it easier for people who want to use our API to figure out how to cross-reference from other databases to ours. -- PharmGKB
  • We only guarantee that our Accession IDs (PAxxx) are constant. So for literature, while in practice it's effectively permanent, we're not guaranteeing it. I've added a formatter url ($1) for this. -- PharmGKB
  • Support --Egon Willighagen (talk) 12:25, 12 July 2019 (UTC)
  • Wait @PharmGKB: I added a lot of missing parameters to create this property. You changed some settings and left others:
    1. What is/are the domain(s)?
    2. What is the number of IDs (if possible with a reference)?
    3. What is the maximum number after PA to have a finished RegEx?
    4. What name will this Property take if part of the site is taken into account? pharmacogenetic id of PharmGKB? or accession? Would you be kind enough to propose something coherent, please? —Eihel (talk) 11:41, 13 July 2019 (UTC)
  • @Eihel: Sorry, I thought I _was_ proposing something coherent. I'm a newbie to Wikidata, and if you can point me to what counts as coherent to you, I would be more than happy to comply. All my changes make sense to me.
    1. We are a pharmacogenetics knowledge base. We cover genes, drugs/chemicals, variants, haplotypes, pathways. We reference a lot of scientific literature. There are diseases and phenotypes as well, but that's there in more of a supporting role.
    2. The current largest Accession ID is PA166183124. No reference, I'm just looking at the current sequence in the db. I'm not sure how this is relevant.
    3. We have what we call PharmGKB Accession IDs (these start with PAxxx, where xxx is some number). We don't restrict its length; it's an ever-increasing sequence. If we have to put a number, how about 20? I don't want to have to come back and change that regexp every so often (e.g. DrugBank and MeSH). -- PharmGKB
  • @Eihel: I'm open to adding typing info to the IDs as it was before I changed it (i.e. `gene/PA67` instead of just `PA267`). I believe that's the change you're objecting to? If you were to ask me what "aspirin" (Q18216) was in PharmGKB, I'd say `PA448497`, not `chemical/PA448497`. We don't think of our IDs that way, but if that's the requirement, then that's what I'll provide to Wikidata. -- PharmGKB
  • ✓ Done @PharmGKB, ديفيد عادل وهبة خليل 2, Germartin1, DannyS712, Trade, Premeditated: created as PharmGKB ID (P7001). Thanks for proposing - especially if you're willing to help with the matching. Enjoy! --99of9 (talk) 01:47, 15 July 2019 (UTC)
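
Two points from the discussion above lend themselves to a short illustration: capping the digit count in the regular expression, and handling typed values such as `chemical/PA448497` when only the bare accession is wanted. The Python sketch below is hedged throughout. The 20-digit cap is the figure floated in the discussion and not a confirmed constraint, the `<type>/<accession>` shape is inferred from the examples given by commenters, and `bare_accession`, `expand_formatter`, and the `https://example.org/$1` template are hypothetical names used only for illustration. The one grounded detail is that a Wikidata formatter URL substitutes the identifier for `$1`.

```python
import re
from typing import Optional

# Assumption: at most 20 digits, the number suggested in the discussion.
BOUNDED_ID = re.compile(r"PA\d{1,20}", re.ASCII)

# Assumption: a typed value looks like "<type>/<accession>", e.g. "gene/PA356".
TYPED_ID = re.compile(r"(?:[a-z]+/)?(?P<accession>PA\d{1,20})", re.ASCII)


def bare_accession(value: str) -> Optional[str]:
    """Strip an optional type prefix and return the accession, or None."""
    match = TYPED_ID.fullmatch(value)
    return match.group("accession") if match else None


def expand_formatter(formatter_url: str, accession: str) -> str:
    """Substitute the accession for the $1 placeholder of a formatter URL."""
    return formatter_url.replace("$1", accession)


if __name__ == "__main__":
    print(bare_accession("chemical/PA448497"))
    print(bare_accession("PA166183124"))
    # Placeholder template only; not the real PharmGKB formatter URL.
    print(expand_formatter("https://example.org/$1", "PA356"))
```

With a bounded quantifier the pattern rejects runaway input while leaving room for the sequence to grow well beyond the current largest ID mentioned above.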