Wikidata:Property proposal/GoodRx


GoodRx ID

Return to Wikidata:Property proposal/Natural science

   Under discussion
Represents: GoodRx (Q30640316)
Data type: External identifier
Template parameter: Proposed as "GoodRx" in en:template:infobox drug and en:template:Drugbox external links (the intent is to pull GoodRx URLs from Wikidata to supply the corresponding fields in these templates)
Domain: pharmaceutical product (Q28885102), prescription drug (Q1643563), medication (Q12140), drug (Q8386), chemical compound (Q11173)
Example 1: amphetamine (Q179452) → Adderall
Example 2: amphetamine (Q179452) → Adderall-XR
Example 3: amphetamine (Q179452) → Evekeo
Example 4: amphetamine (Q179452) → Adzenys-XR-ODT
Example 5: Adderall (Q935761) → Adderall
Example 6: Adderall (Q935761) → Adderall-XR
Example 7: methylphenidate (Q422112) → Ritalin
Example 8: methylphenidate (Q422112) → Ritalin-LA
Example 9: methylphenidate (Q422112) → Concerta
Example 10: Insulin Lispro (Q3492616) → Humalog
External links: Use in sister projects:
Planned use: Add the corresponding IDs to each Wikidata item from GoodRx
Number of IDs in source: ~6000
Expected completeness: eventually complete (Q21873974)
Formatter URL: $1
Robot and gadget jobs: Adding GoodRx IDs to Wikidata to permit use on Wikipedia
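
For readers unfamiliar with formatter URLs: the "$1" placeholder is replaced by the identifier value to build a link. The following is a minimal sketch of that substitution, not the code Wikidata itself runs. The function name `expand_formatter` and the `https://example.org/$1` pattern are illustrative assumptions; the proposal only gives "$1" as the formatter.

```python
def expand_formatter(formatter: str, identifier: str) -> str:
    """Substitute an identifier into a formatter URL pattern.

    Hypothetical helper for illustration; it does no URL encoding.
    """
    if "$1" not in formatter:
        raise ValueError("formatter URL must contain the $1 placeholder")
    return formatter.replace("$1", identifier)


# With the formatter exactly as given in the proposal ("$1"),
# the result is just the identifier itself.
as_proposed = expand_formatter("$1", "Adderall-XR")

# With a made-up pattern (example.org is a placeholder, not the GoodRx site).
with_pattern = expand_formatter("https://example.org/$1", "Ritalin-LA")
```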


Doc James
Daniel Mietchen
Andrew Su
Projekt ANA
Pavel Dušek
Was a bee
Chris Mungall
Dr. Abhijeet Safai
Sami Mlouhi
Netha Hussain
Abhijeet Safai
Shani Evenstein
ZI Jony
Notified participants of WikiProject Medicine

I'd like to add GoodRx IDs to Wikidata to permit linking to GoodRx webpages through templates on Wikipedia. I intend to code a bot script to do this using pywikibot; I'm aware that I need to get the bot approved. GoodRx provides pharmacy price data and coupons for prescription drugs in the US. Seppi333 (Insert ) 11:43, 16 August 2019 (UTC)
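
As a rough illustration of the planning step such a bot might run before writing anything, here is a minimal sketch in plain Python. The item/ID pairs are the ten examples listed in the proposal; the function name `plan_claims` and its `existing` parameter are assumptions made for this sketch, and the actual write to Wikidata (for example through pywikibot, after bot approval) is deliberately left out.

```python
from collections import defaultdict

# (Wikidata item, GoodRx ID) pairs copied from the proposal's examples.
EXAMPLES = [
    ("Q179452", "Adderall"),
    ("Q179452", "Adderall-XR"),
    ("Q179452", "Evekeo"),
    ("Q179452", "Adzenys-XR-ODT"),
    ("Q935761", "Adderall"),
    ("Q935761", "Adderall-XR"),
    ("Q422112", "Ritalin"),
    ("Q422112", "Ritalin-LA"),
    ("Q422112", "Concerta"),
    ("Q3492616", "Humalog"),
]


def plan_claims(pairs, existing=None):
    """Return {item: [IDs to add]}, skipping IDs the item already carries.

    `existing` is a hypothetical {item: set of IDs} snapshot of what is
    already on Wikidata; it defaults to "nothing present yet".
    """
    existing = existing or {}
    plan = defaultdict(list)
    for qid, goodrx_id in pairs:
        already_present = goodrx_id in existing.get(qid, ())
        already_planned = goodrx_id in plan[qid]
        if not (already_present or already_planned):
            plan[qid].append(goodrx_id)
    # Drop items that ended up with nothing to add.
    return {qid: ids for qid, ids in plan.items() if ids}


# Pretend Humalog is already recorded on Insulin Lispro (Q3492616).
plan = plan_claims(EXAMPLES, existing={"Q3492616": {"Humalog"}})
```

Running the planning step separately from the write step makes it possible to review the proposed statements, or rerun the bot safely, without producing duplicate claims.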


  • Support What is proposed here seems like routine integration of the identifiers from a popular medical database into Wikidata.
    Wikidata collects many names for what should be the same drug. We are still relying on all of these products by various names resolving as equivalent to one name. GoodRx is another layer on that, depending on the foundational assumption that all our data is correct. I question whether the information we have is correct. The pharma industry is playing many anti-consumer games by marketing all these various names into the marketplace. Such as things are, this mapping plan with GoodRx matches Wikidata's current quality and the best quality data that NIH and similar databases present for import and reconciliation with Wikidata. If we ever have separate Wikidata items for various product names, then we could easily split this GoodRx cataloging system into more specific name articles. Blue Rasberry (talk) 12:24, 16 August 2019 (UTC)
    The fact that two drugs share the same active ingredient doesn't mean that they are the same product. The way a drug is manufactured often has clinical effects.
    I don't believe that items about chemical compounds should link to items about individual product names. I would want items for the individual named products to be created if you want to add external ids of individual named products to items. Otherwise, I Support having the property. There should be a single value constraint. ChristianKl❫ 12:43, 16 August 2019 (UTC)
    The main reason I linked 2 items to "Adderall" and "Adderall-XR" is that on en-wiki, en:Adderall and en:Amphetamine both exist; the Adderall article is about a specific mixture of amphetamine enantiomers (1:3 levoamphetamine to dextroamphetamine) in clinical use, whereas the Amphetamine article is about the compound in general (i.e., 1:1 racemic and any enantiomeric mixtures of levo- and dextro-amphetamine). But, for what it's worth, amphetamine (Q179452) already lists Adderall, Evekeo, and several other brands under a different property (active ingredient in (P3780)) pertaining to brands in which amphetamine is or was previously an active ingredient.
    Also, a single value constraint would almost entirely preclude the use I had in mind for this property (i.e., pulling the URLs from Wikidata and linking to the corresponding GoodRx pages in Wikipedia templates). Seppi333 (Insert ) 13:32, 16 August 2019 (UTC)
    may help clarify why there's so much confusion. Multiple regulators. Multiple database systems. Multiple producers. Multiple labels. Multiple formulations. Multiple dosages. Seems almost like it was designed to confound multi-national studies. Anywho, different is different: we should be careful that we do not conflate referents through sloppy handling of identifiers. This could cause serious harm. If it is made glaringly clear that formulations may differ, there might be value in identifying "related" referents. LeadSongDog (talk) 19:02, 16 August 2019 (UTC)
    When it comes to adding external ids to Wikidata, the first priority is to keep things in order on Wikidata. Just because Wikipedia versions mix different concepts on the same page doesn't mean that we should do so as well.
    In those cases it might make sense to sooner or later mark in the templates on Wikipedia which concepts are actually covered by the pages.
    There's potential drama involved here by drawing links to thousands of pages of a for-profit unicorn startup and it seems to me like till now you haven't got a clear consensus from EnWiki that those links are considered welcome on EnWiki. Going through a bot request on EnWiki leaves less potential for drama afterwards.
    @Doc_James: What do you think here? ChristianKl❫ 09:05, 20 August 2019 (UTC)
  • Support David (talk) 05:34, 17 August 2019 (UTC)
  • Can we also import some pricing data as well? More useful than just a link though of course more complicated to do. Doc James (talk · contribs · email) 09:44, 20 August 2019 (UTC)
    • @Doc James: When it comes to pricing data it seems to me even more important that it's the price for a specific drug and not the general compound. Importing pricing data might be copyright sensitive. It would make sense to ask GoodRx what they think about such an import.
      I think you have a good understanding of the opinions of the medical Wikipedians on EnWiki. How much potential for conflict do you see in linking to a for-profit website like this in infoboxes? How do you think the relevant consensus should be established over there? ChristianKl❫ 13:00, 20 August 2019 (UTC)
    • @Seppi333: did you have any contact with GoodRx about this import project and how they stand on it? ChristianKl❫ 13:00, 20 August 2019 (UTC)
      • What is the "specific drug" versus "general compound"? We specifically label medications by the INN and redirect all brands to generics (except when a brand is used for more than one separate medication)
        The price is a data point and not copyrightable. Doc James (talk · contribs · email) 13:37, 20 August 2019 (UTC)
        • When you buy a drug in a pharmacy you are not only getting the compound but a lot of other things in the same pill. Some compounds can be taken orally as well as intravenously and are sold in different drug formulations.
          To the extent that we have links to generics, we could simply link to the generics in GoodRx. What benefit would we get from also linking to brand names on it? ChristianKl❫ 14:58, 20 August 2019 (UTC)
        • "We [...] redirect all brands to generics" No, we do not: Concerta (Q10868995); Adderall (Q935761); Ritalin (Q47521826). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:00, 23 August 2019 (UTC)
          • @Pigsonthewing: I don't see why a many-to-one linking is a problem. This will, in fact, be an m-to-n association between our database and theirs because we have additional IDs on some medications but not others (it's going to be a painstaking process for me to resolve all of the identifiers between the two databases, but I'm hoping I can get USANs (very likely) or INNs (probably) for the drug names, as I'd only have maybe 5-10% remaining from the union of both sets to sort out (e.g., pages on obscure brand and/or generic pharmaceuticals that are poorly linked on WD/WP, of which there are quite a few)). The whole point of this proposal is not to establish a 1-to-1 bijective association between two databases; it's about facilitating data utilization on other projects. In any event, the only reason that I am proposing the brand name is that (1), (2) the prototype brand of a particular chemical, and particularly a drug class, is at least as recognizable to the public - if not more so - than its generic name, (3) doctors often prescribe a brand name, with generic substitution being intended, because the drug product (i.e., dosage form, active ingredient, and its excipients) is inherently more clear to a pharmacist than prescribing the dosage form and the generic drug name, as that is not a drug product. Case in point, there is a qualitative but rather technical chemical difference in the dosage forms of Adderall XR and Mydayis (see table; the difference in cost between them is several hundred US dollars because one is off-patent and the other is not), and I have no idea how a doctor, much less a patient, would be able to distinguish between the two if a generic term were used. Seppi333 (Insert ) 06:09, 8 September 2019 (UTC)
      • It seems a bit premature for me to ask for their data without having established any form of consensus on WP to merit access to potentially proprietary information. That said, there isn't one fixed way of doing what I'm proposing, so I'm open to hearing what others have to say and, if there's consensus to do this only in a certain manner, I figured I'd broach the issue by making my request and stating those conditions upfront. If they're fine with that, great. If not, I could probably still get the data I need from a cloud-based NLP AI, but I probably wouldn't get all the relevant data (e.g., GoodRx/retail price, dosage form, dose, brand name, generic name, and/or other data items) on every drug in their database if that ends up being the only alternative. It would be a bit of a pain in the ass to go that route because they have brand name drugs and - assuming they're no longer patented - the corresponding generic(s) redirected to the same uri, so I'd be generating duplicate data on alternate identifiers through redirects. I'd probably have to delete the redundancies and save the corresponding identifiers from a web scrape after-the-fact rather than preclude writing that data because I don't think a web-scraping AI would be programmed with niche functionality like that. Seppi333 (Insert ) 06:09, 8 September 2019 (UTC)
  • Oppose "Ritalin", for example, is an identifier for Ritalin (Q47521826), not methylphenidate (Q422112). (Also, FYI, I get the error "GoodRx is not available outside of the United States." when trying to access the site.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:05, 23 August 2019 (UTC)
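
To make the disagreement over a single value constraint and the m-to-n mapping concrete, the sketch below checks the proposal's own ten examples for two things: items that would carry more than one GoodRx ID, and IDs that would be attached to more than one item. It is only an illustration of the bookkeeping involved; the function name `constraint_report` is an assumption of this sketch, and no Wikidata or GoodRx API is called.

```python
from collections import defaultdict

# (Wikidata item, GoodRx ID) pairs copied from the proposal's examples.
PAIRS = [
    ("Q179452", "Adderall"),
    ("Q179452", "Adderall-XR"),
    ("Q179452", "Evekeo"),
    ("Q179452", "Adzenys-XR-ODT"),
    ("Q935761", "Adderall"),
    ("Q935761", "Adderall-XR"),
    ("Q422112", "Ritalin"),
    ("Q422112", "Ritalin-LA"),
    ("Q422112", "Concerta"),
    ("Q3492616", "Humalog"),
]


def constraint_report(pairs):
    """Return (multi_valued, shared).

    multi_valued: items with more than one ID, which a single value
                  constraint would flag.
    shared:       IDs attached to more than one item, i.e. the
                  many-to-one linking raised in the discussion.
    """
    ids_by_item = defaultdict(set)
    items_by_id = defaultdict(set)
    for qid, goodrx_id in pairs:
        ids_by_item[qid].add(goodrx_id)
        items_by_id[goodrx_id].add(qid)
    multi_valued = {q: sorted(v) for q, v in ids_by_item.items() if len(v) > 1}
    shared = {g: sorted(v) for g, v in items_by_id.items() if len(v) > 1}
    return multi_valued, shared


multi_valued, shared = constraint_report(PAIRS)
```

On these examples, three of the four items are multi-valued and both "Adderall" and "Adderall-XR" are shared between amphetamine (Q179452) and Adderall (Q935761), which is the situation the comments above are arguing about.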