Wikidata talk:WikiProject Medicine


Languages

We should run this update on a few languages. The EMA drug if appears to be in 21 languages. Doc James (talk · contribs · email) (if I write on your page reply on mine) 00:56, 2 June 2013 (UTC)

We basically always have two options: create an item that doesn't exist on Wikipedia but is noteworthy on its own, or wait for the datatype Multilingual text to be rolled out. Both can be translated into all languages. The item has the advantage that it can hold additional statements, e.g. about a drug. I think you already said that we should use non-proprietary names. We could create them as items and put all the copyrighted sales names into a multilingual text field. --Tobias1984 (talk) 08:40, 2 June 2013 (UTC)
So the INN name is fairly consistent across languages. There are many brand names, sometimes within a single language, and these are determined by country and manufacturer.
We should have multilingual links for the EMA. I am still not 100% clear how Wikidata works. Will need to spend some time. Doc James (talk · contribs · email) (if I write on your page reply on mine) 13:43, 4 June 2013 (UTC)
I think everybody has a steep learning curve with Wikidata ;). I added the babel-template to your page which will enable you to view more languages and edit them (You can also switch between languages in the drop-down in the top-right of your screen). Each of the items (Q with number) can have labels and descriptions in each language. If you switch languages then items, statements and multi-lingual strings will be translated in your viewing language. Strings stay the same and numbers are displayed in the local number formatting.
We can make a language-dependent link to EMA as soon as multi-lingual string becomes available as a datatype. Until that time we can just continue mapping the diseases and drug infobox. Once those are finished, that is going to take a huge workload off all language Wikipedias, because we will generate the infobox centrally in every language and there will be no outdated numbers anymore :) --Tobias1984 (talk) 14:03, 4 June 2013 (UTC)
That sounds amazing. Blue Rasberry (talk) 15:53, 4 June 2013 (UTC)

Anatomy

Anatomy will be a harder subject (especially for me as a non-physician). It will definitely require much more human input than gathering strings from a database. We can for example say that the hypothalamus (Q164386) is a subclass of (P279) Diencephalon (Q192419), which is a subclass of (P279) brain (Q1073). The brain could be defined as a subclass of (P279) the nervous system (Q9404) and an instance of (P31) organ (Q712378). It is up to us to find a good (and well sourced) classification for all of these things. Ideally we should create as few properties as possible. Maybe we can tackle one anatomical subject first and find a good approach on how to handle it. --Tobias1984 (talk) 16:24, 4 June 2013 (UTC)

For the examples you give above, part of (P361) would seem to be more appropriate to me than 'subclass of'. The Terminologia Anatomica might be a good starting point, which I think is what the anatomical navigation templates on the English Wikipedia are based on. Arcadian did a lot of work on that, so he might be able to help further. --Wouterstomp (talk) 10:51, 5 June 2013 (UTC)
Sounds good. If the data is structured enough in the infobox we could make a bot request to transfer the data. If we can get an expert on board that would be great too. --Tobias1984 (talk) 11:15, 5 June 2013 (UTC)

Reviewing properties

We currently have a lot of properties proposed that need about 3 support votes so they can be created. I know it is a little tedious especially for string properties that are anyway included in the infobox, but this process makes sure that we don't create too many or poorly defined properties. So if you have some time follow the links on the project page and look at the different proposals. --Tobias1984 (talk) 09:42, 6 June 2013 (UTC)

Added support for all the no-brainers. I was wondering about how properties such as route of administration would be handled: with certain allowed values, will it give some sort of drop-down menu that you can choose from? Or will it result in a mix of various descriptions of the same thing, e.g. intravenous, intravenous infusion, IV, intravenous catheter, venous, infusion, catheter, injection. And should it list the most usual routes or 'official' routes, or all possible routes (e.g. you could administer a drug such as morphine in lots of different ways). --Wouterstomp (talk) 10:11, 6 June 2013 (UTC)
A drop-down menu is planned. Currently you can enter anything, but we can set up a bot to report weird entries (it's really easy to do: Template:Constraint). Probably the most common ways are better than "any-thinkable-way". But we can also create a qualifier which distinguishes between "most often" and "sometimes". We can add the things we work out to the documentation of the property, so people will know how to use it. And thank you for reviewing ;) --Tobias1984 (talk) 10:34, 6 June 2013 (UTC)
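
A minimal sketch of the kind of check such a report could run on route-of-administration values. This is not how Template:Constraint is implemented; the allowed values, the synonym map, and the function name `check_routes` are all assumptions made for illustration.

```python
# Hypothetical list of accepted values; the real list would be agreed on
# in the property documentation.
ALLOWED_ROUTES = {"oral", "intravenous", "intramuscular", "subcutaneous", "topical"}

# Hypothetical synonym map that folds spelling variants into one accepted value.
SYNONYMS = {
    "iv": "intravenous",
    "intravenous infusion": "intravenous",
}


def check_routes(claims):
    """Return the (item, value) pairs whose value is not an accepted route."""
    violations = []
    for item, value in claims:
        key = value.strip().lower()
        normalized = SYNONYMS.get(key, key)
        if normalized not in ALLOWED_ROUTES:
            violations.append((item, value))
    return violations


claims = [("morphine", "IV"), ("morphine", "oral"), ("morphine", "venous")]
print(check_routes(claims))
```

With this sample input only the pair with the value "venous" is reported, because "IV" is folded into "intravenous" before the comparison.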

Question regarding organisation

Given that malaria (Q12156) is an instance of tropical disease (Q1345671) which is a subclass of disease (Q12136), should malaria also be an instance of disease? Also instance of and subclass of are often used interchangeably, some diseases are listed as instances and others as subclasses, what is correct here? --Wouterstomp (talk) 09:57, 6 June 2013 (UTC)

I think it is enough to add one instance of statement, because malaria being a disease is already implied by the statements that malaria is a tropical disease and that tropical disease is a subclass of disease. I think that instance of is more correct. Either way we should make a guideline on the project page so we do everything consistently. --Tobias1984 (talk) 10:18, 6 June 2013 (UTC)
Some guidelines would definitely be helpful. Especially for borderline cases such as skin cancer (Q192102), which would be regarded by laymen as a disease (instance of) and by doctors as a (sub)class of diseases. Perhaps it could even be both in this case. --WS (talk) 10:47, 6 June 2013 (UTC)
I think we should definitely go with the professional opinion of the field. Ideally we can find some good sources and just follow what they have worked out. How about skin cancer (Q192102) subclass of disease and instance of cancer? --Tobias1984 (talk) 10:55, 6 June 2013 (UTC)
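
The implication discussed in this thread (malaria is an instance of tropical disease, and tropical disease is a subclass of disease) can be sketched as a small transitive lookup. This is only an illustration over a toy in-memory store, not a Wikidata API; the dictionaries and function names are assumptions, and only the item IDs come from the discussion.

```python
# Toy statement store using the item IDs quoted in the discussion.
# P31 = instance of, P279 = subclass of.
INSTANCE_OF = {"Q12156": {"Q1345671"}}   # malaria -> tropical disease
SUBCLASS_OF = {"Q1345671": {"Q12136"}}   # tropical disease -> disease


def superclasses(cls):
    """Return every class reachable from cls by following subclass-of links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(item, cls):
    """True if item is a direct instance of cls or of any subclass of cls."""
    for direct in INSTANCE_OF.get(item, ()):
        if direct == cls or cls in superclasses(direct):
            return True
    return False


print(is_instance_of("Q12156", "Q12136"))  # malaria, disease
```

Under these two statements the check succeeds without a separate "malaria instance of disease" statement, which is the reason one instance of statement is considered enough.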

There is some discussion about this in a more general context over here: Wikidata:Requests for comment/How to classify items: lots of specific type properties or a few generic ones? --WS (talk) 13:10, 6 June 2013 (UTC)

I think that the organisation of diseases is not done well. As indicated above, there are diseases that are both instances of disease (Q12136) and subclasses of disease (Q12136), according to the accepted definitions of instance and subclass. Although this is not a modelling error per se, it does complicate the use of diseases. I was looking around and was unable to find even a simple statement of principles on what a disease is and how they should be organized. Is there such a statement, and I'm just unable to find it? Peter F. Patel-Schneider (talk) 13:11, 19 October 2015 (UTC)
@Peter F. Patel-Schneider: You have certainly found the right page to talk about the issue. The problem is that many people use instance of (P31) the wrong way and even remove correct subclass of (P279) statements. Some items have a history of unintentional back and forth editing of these 2 properties. In general the ontology should be built using subclass of (P279). But we are still far away from having good coverage of the existing disease ontologies. --Tobias1984 (talk) 15:59, 19 October 2015 (UTC)
@Tobias1984: What is the right way? I was unable to find a clear description of just how diseases are to be modelled. Peter F. Patel-Schneider (talk) 16:09, 19 October 2015 (UTC)
@Peter F. Patel-Schneider: We don't have any hard guidelines yet, but these two things should help (but feel free to ask more). Wikidata does not build 1 ontology but multiple ontologies. Different sources might classify diseases differently, so we need to add multiple statements to subclass of (P279). One branch of this ontology tree could be supported by 10 sources another one just by one. In total our ontology branches are built using millions of sources, because no single source could ever cover every piece of knowledge in the universe. - As a rough guideline you can look at http://disease-ontology.org/ which is a good first source to add if you add a p279 statement. You can also look at the disease items in the Reasonator. That tool pulls in information from related items and shows the ontology in the box called "Classification": https://tools.wmflabs.org/reasonator/?q=Q51993&lang=en --Tobias1984 (talk) 16:15, 19 October 2015 (UTC)
@Tobias1984: That's not what I was asking for. What I was trying to find out is how are diseases to be modelled in Wikidata? How is a particular disease to be related to malaria (Q12156)? What extra information is required for diseases? Peter F. Patel-Schneider (talk) 17:16, 19 October 2015 (UTC)
@Peter F. Patel-Schneider: According to the Disease Ontology, malaria (Q12156) is a subclass of (P279) protozoal disease (Q18555201). That statement is already included in the item. But you might find sources that say something different. The other properties you can use in a similar way. For example, if you press the "add" button below the last statement, it will suggest properties that don't have statements yet. There is an algorithm that compares items with similar statements and knows which are missing (for example an item about a person missing a birthdate). When I press that button for malaria (Q12156) I see that, for example, a statement with the ICD-9 code is missing. Feel free to keep asking until I say something that makes sense :) --Tobias1984 (talk) 17:40, 19 October 2015 (UTC)
@Tobias1984: I guess I was not specific enough. What I was looking for was how to set up instance and subclass links for diseases. For example, malaria has three such links. Are all three needed? What implications do these links have? What such links are needed for new diseases? (The reason that I ask is that I am interested in how Wikidata does metamodelling in general, and diseases appear to be a good exemplar. However, I am having trouble finding out how metamodelling is supposed to work for diseases.) Peter F. Patel-Schneider (talk) 17:53, 19 October 2015 (UTC)
@Peter F. Patel-Schneider: You could also start a discussion at Wikidata:WikiProject Ontology for some more expert advice on the ontology. - I don't think that malaria should have instance of (P31) statements, because a disease is not an instance in the sense of a notable person being an instance of persons in general. Multiple statements in p279 are fine because they are different branches according to different sources. And that is a core concept of Wikidata. It is a multi-ontology project. --Tobias1984 (talk) 19:30, 19 October 2015 (UTC)
@Tobias1984: There are two ways to be multi-ontologistic. One way is to have multiple domains, like diseases and colors. The other way is to have several ontologies of diseases. The first generally does not cause problems, although it is useful to have a common modelling methodology. The second can easily cause problems if the two ontologies are not correctly inter-related (or maybe correctly not inter-related). It appears to me that there are two ontologies of disease in Wikidata, one imported and one not, and they do not interact well at all, causing in particular the concept disease to have as instances both disease categories (populated) and particular diseases of individual entities (generally or completely unpopulated). To add to the confusion, there does not appear to be any discussion of the situation. Peter F. Patel-Schneider (talk) 13:41, 20 October 2015 (UTC)

@Peter F. Patel-Schneider: Wikidata is still a pretty young project, so not everything has been discussed in detail. There is so much to sort out that some areas have gotten little attention. - There are certainly problems with having several ontologies for diseases. But that is the way Wikidata is built. The community is already developing a lot of tools, and will need to build more tools that take this approach into account. - If you want to work on the whole disease ontology, be prepared that it will take a lot of work and discussions. Currently there are 9591 items in the subclass-tree of disease (http://tools.wmflabs.org/autolist/autolist1.html?q=tree[12136][][279]). --Tobias1984 (talk) 18:37, 20 October 2015 (UTC)

@Tobias1984: Sure, it is going to be work to come up with a better organization of diseases. However, how can I start the work without knowing why the current organization is how it is? Who can tell me how higher-level classes are supposed to work in general in Wikidata? I was hoping that this was recorded somewhere, and then hoping that someone who knew would respond. Peter F. Patel-Schneider (talk) 20:22, 20 October 2015 (UTC)
@Peter F. Patel-Schneider: As Tobias said, the current organization is how it is because semi-anonymous volunteers designed it to be so with the same context which you are finding. I encourage you to continue to ask questions and seek answers. One might say that project knowledge is transmitted through the Internet social custom called "lurk moar".
If you want to ask theoretical questions about the project, consider posting to Wikidata:Project chat. I see in the Wikidata-l mailing list that you already met some of the leading thinkers in the project and that you have read through An Ambitious Wikidata Tutorial. You seem like an insightful person - I expect that your intuition about the development of Wikidata is probably correct, whatever you might be thinking. In the mailing list you seem to have personal introductions to individuals with whom you might talk. I am not sure what else to provide you. What more would you request? Blue Rasberry (talk) 14:10, 21 October 2015 (UTC)
@Blue Rasberry: OK, I'll try the chat. I guess that I'm not finding a basic statement of modelling principles because there just isn't one. Peter F. Patel-Schneider (talk) 14:55, 21 October 2015 (UTC)

Route of administration

Is there a website that we can use as a source for most of the route of administration claims? Maybe even structured enough to do a bot-import? --Tobias1984 (talk) 07:37, 18 June 2013 (UTC)

Possibly Drugbank? It seems pretty comprehensive, even listing a soap for topical administration of morphine. --WS (talk) 15:32, 18 June 2013 (UTC)
You could try Drugs at FDA. See [2] for an example. Remember (talk) 16:56, 16 July 2013 (UTC)

Generating data about medicine across languages

I want to know how many medicine related articles are in all languages of Wikipedia. We have information for English here [3] which comes from the WP:MED templates on talk pages. Can we use the interlanguage links of these articles to determine the number in other languages? I realize that this would only be a rough estimate.

Additionally I am interested in having WPMED tags added to the talk pages of medical related articles in other languages. I see this as a first step to generating sums of page views for medicine like we have in En in other languages. Thoughts? Doc James (talk · contribs · email) (if I write on your page reply on mine) 10:43, 5 July 2013 (UTC)

Those sound like some interesting statistics. The current problem is that we need properties in the items that we can query. In the case of the diseases-list, Byrial queried all the items that have a MeSH code assigned. So our first priority is getting the properties of the different medical infoboxes ready (diseases infobox is almost done). --Tobias1984 (talk) 21:33, 5 July 2013 (UTC)

Copied from the project chat: --Tobias1984 (talk) 16:22, 8 July 2013 (UTC)

Analytics (lightning Kraken demo) - Andrew Otto (remote) / Evan Rosen - 3 minutes. WMF Metrics and activities meetings/2012-12-06

There is a project for better page view statistics, with the ominous name "Kraken" (Datenkraken?). See:

  • meta:Glossary#K, Kraken: the upcoming data services platform that the Wikimedia Foundation's Analytics team is working on. It will allow interested persons to query data to answer their questions about Wikimedia projects and users.
  • mw:Analytics/Kraken and mw:Analytics/Kraken/Blurbs
  • commons:File:Report_on_requirements_for_usage_and_reuse_statistics_for_GLAM_content.pdf (June 2013) "there is clearly a need for GLAM-related statistics, for research purposes, but mainly to enable GLAMs to use Wikimedia Projects as a mature distribution channel for their collections. (...) The next steps are up to the analytics team of the Wikimedia Foundation to start gathering the information required (using Kraken) and work on a page (using Limn) to display that information. The GLAM toolset project will continually and regularly liaise with the Wikimedia Foundation to ensure this development is prioritised".

I am quite worried about this Kraken data. When German Wikipedia prominently linked to page view statistics on every Wikipedia page, this resulted in SEO spammers using this tool to determine popular target articles for spamming their links on WP. Apparently now the GLAM community (composed of well-funded institutions with PR departments) is pushing for these analytics to be built by WMF (funded by donations for Wikipedia). Before that, the ISP partners for Wikipedia Zero (and WMF marketing) were asking for special page views statistics, including Saudi Telecom ("No, we never talked about censorship"). And while the toolserver had a privacy policy, that is not the case with Kraken, afaik. Pandora's box, anyone? --Atlasowa (talk) 13:02, 8 July 2013 (UTC)

There is also this. But currently limited to 500 results. http://208.80.153.172/wdq/?q=claim[486] --Tobias1984 (talk) 07:45, 9 July 2013 (UTC)
We already list the 1500 most viewed medical articles here and it is updated monthly. [4] We have been doing this for years. The more viewed pages are also the more watched. So I do not understand the issue. Doc James (talk · contribs · email) (if I write on your page reply on mine) 19:36, 9 July 2013 (UTC)

bot maintenance of drug and disease items

Hello all, I would like to start plans to write a bot that will maintain all drug and disease items, keeping in sync with source databases. Realizing there are many other people here (including bot owners) who are interested in that data, I want to be sure we're not stepping on any toes. By way of brief introduction, I run a biology research group and one of our main projects has been to maintain the ~10,000 gene/protein infoboxes on Wikipedia. These infoboxes are bot-updated in near-real-time with source databases so that all the data shown stays current. (See w:Portal:Gene Wiki and w:User:ProteinBoxBot for more info.) From that existing effort, we're trying to move in two directions. First, we want to move all that gene/protein data from Wikipedia templates to Wikidata -- that effort is being coordinated over at WD:MBTF. Second, we want to expand to disease and drug infoboxes, and we've done some very early prototyping at w:User:ProteinBoxBot/Phase_3. Anyway, this is another natural community to interact with, so we want to make sure we're coordinating with everyone else in this space.

Just in terms of first actions, I will add a column on the property tables shown on Wikidata:Medicine task force for "Data source". The goal is to identify as few resources as possible that contain the listed mappings and annotations in some structured format. We'd certainly welcome help compiling these data sources. And obviously, any other comments are welcome too! Cheers, Andrew Su (talk) 18:59, 16 July 2013 (UTC)

Hey Andrew! Thank you for your update. I don't think this project has a dedicated bot-operator yet, so your bot would be more than welcome. You could add the bot to the participants list so that people know with whom to coordinate their bot-task operations. --Tobias1984 (talk) 20:44, 16 July 2013 (UTC)
Done! Cheers, Andrew Su (talk) 22:22, 17 July 2013 (UTC)

Just a quick update on bot tasks. Kompakt is going to import the ICD-9 and ICD-10 codes for us. Does anyone know which source we should use for ICD-9? Please post at Wikidata:Bot_requests#ICD_9_.26_ICD_10 --Tobias1984 (talk) 08:45, 17 July 2013 (UTC)

Cool! Though I think long-term, we want to use content directly from some authoritative source (like Human Disease Ontology) rather than importing from Wikipedia infoboxes. (Longer term, that's something that our bot can do.) As for a source for ICD-9 codes, you could use one of these files (linked from [5]). Cheers, Andrew Su (talk) 22:22, 17 July 2013 (UTC)

links between genes/proteins, diseases, and drugs

I just started a discussion at WT:MBTF on how to add links between genes/proteins, diseases, and drugs. Input from this community would obviously be welcome and appreciated! Cheers, Andrew Su (talk)

French infobox

Hi everyone.

I don't really understand what this project involves, and if our work will have an impact on it.

To sum up the situation, on the French Wikipedia we are currently discussing whether to change, add or remove some parameters of the infobox disease. One of the contributors alerted us to this project on Wikidata, which we hadn't heard of.

So now we're wondering if our current work will have an impact on this project, and if we have to take special precautions (we didn't really understand what the point of this project was).

Thanks for your answers. --Woozz un problème? 09:28, 25 July 2013 (UTC)

Hi Woozz! I read the discussion you linked. My French is only good enough for reading, so I'm writing in English. Basically this project just tries to store and centralize different pieces of information. Those pieces can be used by all Wikipedias using special templates. It is important to mention that this doesn't mean that all Wikipedias have to have the same infobox layout. We should try to gather the data for the new infobox here, so it can be used by other languages too. I looked at your example and we just have to propose the properties to store that information. I already proposed some of them here: Wikidata:Property_proposal/Term#medical_discipline. It would be helpful if a few people from the French project would look at how Wikidata works. I can help out with all the questions you have, so just ask them here. --Tobias1984 (talk) 14:19, 25 July 2013 (UTC)
Hi Woozz, coincidentally, almost the same discussion is going on on the English Wikipedia at wikipedia:en:Wikipedia talk:WikiProject Medicine#Infoboxes_-_any_consensus_for_changes.3F. If any parameters added are common to both language Wikipedias, all data added can be shared between the two languages with the help of Wikidata. --WS (talk) 14:39, 25 July 2013 (UTC)

ICD 9 and 10

User:Kompakt has imported a few thousand ICD-9 and ICD-10 codes for us. The constraint violations are also updated and can be checked. --Tobias1984 (talk) 09:38, 4 August 2013 (UTC)

drug-drug interaction

We need to have a quick discussion on how we will use qualifiers for the property significant drug interaction (P769). We have to find a few qualifiers that should fulfill the following criteria:

  • Semantic
  • General enough to have wide applicability (e.g. a property "color" is better than "bird color" or "car color")
  • Specific enough to describe detailed interactions
  • Work for the other interactions discussed here: link between genes, proteins and drugs
  • Potentially work all over Wikidata

My rough idea for Warfarin (Q407431) would look something like this:

  • increases chance of = bleeding (item datatype)
  • decreases chance of = absorption (item datatype)
  • occurs in number of cases = 30 % (this is an example of a numeric qualifier that could state statistical data)

But it is obviously more complicated than that. Best way as usual is to think of what kind of queries we would like to have answered. Somebody could query Wikidata on the above example: "What are drugs that influence Warfarin (Q407431) and increase the likelihood of bleeding" and the database would return Ticlopidine (Q420571). --Tobias1984 (talk) 19:38, 6 August 2013 (UTC)
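
The example query above can be sketched over a toy in-memory store. This is only an illustration of the idea, not a Wikidata query interface; the record layout and the function name `drugs_affecting` are assumptions, the qualifier names are the rough proposals from this comment rather than existing properties, and "Hypothetical drug X" is a placeholder, not real interaction data.

```python
# Each record stands for one significant drug interaction (P769) statement
# on the Warfarin item, with qualifiers stored as (qualifier, value) pairs.
INTERACTIONS = {
    "Warfarin (Q407431)": [
        {
            "interacts_with": "Ticlopidine (Q420571)",
            "qualifiers": [("increases chance of", "bleeding")],
        },
        {
            # Placeholder entry so the filter has something to exclude.
            "interacts_with": "Hypothetical drug X",
            "qualifiers": [("decreases chance of", "absorption")],
        },
    ],
}


def drugs_affecting(drug, qualifier, value):
    """Return the interacting drugs whose statement carries the given qualifier."""
    return [
        statement["interacts_with"]
        for statement in INTERACTIONS.get(drug, [])
        if (qualifier, value) in statement["qualifiers"]
    ]


print(drugs_affecting("Warfarin (Q407431)", "increases chance of", "bleeding"))
```

Because the qualifier value is an item rather than free text, the same lookup works for any effect that has its own item.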

Re: Tobias1984 I agree and think that the work we have done on a semantic model for drug safety statements can apply here. Specifically, I think the following qualifiers may be useful:
  • PharmacodynamicImpact - Information on the pharmacodynamic impact of a drug-drug interaction.
      • drug-toxicity-risk-increased - The drug-drug interaction is associated with an increased risk of toxicity.
      • drug-toxicity-risk-decreased - The drug-drug interaction is associated with a decreased risk of toxicity.
      • drug-efficacy-increased-from-baseline - The drug-drug interaction is associated with an increase in the efficacy of the drug.
      • drug-efficacy-decreased-from-baseline - The drug-drug interaction is associated with a decrease in the efficacy of the drug.
      • influences-drug-response - The drug-drug interaction influences drug response.
      • not-important - The drug-drug interaction is not associated with a clinically relevant pharmacodynamic effect.
  • PharmacokineticImpact - Information on the pharmacokinetic impact of a drug-drug interaction.
      • absorption-increase - The drug-drug interaction is associated with an increase in absorption of the drug.
      • absorption-decrease - The drug-drug interaction is associated with a decrease in absorption of the drug.
      • distribution-increase - The drug-drug interaction is associated with an increase in distribution of the drug.
      • distribution-decrease - The drug-drug interaction is associated with a decrease in distribution of the drug.
      • metabolism-increase - The drug-drug interaction is associated with an increase in metabolism of the drug.
      • metabolism-decrease - The drug-drug interaction is associated with a decrease in metabolism of the drug.
      • excretion-increase - The drug-drug interaction is associated with an increase in excretion of the drug.
      • excretion-decrease - The drug-drug interaction is associated with a decrease in excretion of the drug.
      • not-important - The drug-drug interaction is not associated with any clinically relevant pharmacokinetic effect with respect to the drug.


So, for Warfarin (Q407431) it would look something like this:

  • PharmacokineticImpact = absorption-decrease

Other qualifiers to begin with might come from the simple model found at [Meds] and could include mechanism, related drugs, and options (i.e., therapeutic options). We would need to discuss evidence grading separately because there are two kinds of evidence: the evidence that the interaction exists, and the evidence for patient harm/benefit. The definition of significant drug interaction (P769) assumes that the evidence for the existence of the interaction is sufficient to report publicly. The latter evidence axis is much more challenging. Boycer (talk) 17:44, 9 August 2013 (UTC)

I like your approach too. Just a small concern is that we should put increase and decrease in the qualifiers. Otherwise we have to create two items for every item we would like to link to. In your example we have an item for "absorption" but no items for "absorption-increase" and "absorption-decrease". I would favor (your example):
  • PharmacokineticImpact decreases = absorption
What is your opinion on that? --Tobias1984 (talk) 15:10, 11 August 2013 (UTC)
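
The difference between the two proposals can be sketched as a small conversion: a combined value such as "absorption-decrease" is split into a direction (which would go into the qualifier) and a target that already has an item. This is only an illustration; the function name `split_direction` and the qualifier labels "increases" and "decreases" are assumptions, and only value names ending in "-increase" or "-decrease" are handled.

```python
def split_direction(value):
    """Split e.g. 'absorption-decrease' into ('decreases', 'absorption').

    Returns None for values without an '-increase' or '-decrease' suffix.
    """
    for suffix, qualifier in (("-increase", "increases"), ("-decrease", "decreases")):
        if value.endswith(suffix):
            return qualifier, value[: -len(suffix)]
    return None


for name in ("absorption-decrease", "metabolism-increase", "not-important"):
    print(name, "->", split_direction(name))
```

With the direction moved into the qualifier, one item per target ("absorption", "metabolism", and so on) is enough, instead of two items per target.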

Parasites and diseases they cause

We have a few parasites that have ICD and other codes assigned to them. Wuchereria bancrofti (Q311109), for example, causes 90 % of the Lymphatic filariasis (Q14514796) cases (http://www.who.int/mediacentre/factsheets/fs102/en/). Should we assign the ICD code to the parasite, to both, or just to the disease? --Tobias1984 (talk) 15:04, 7 August 2013 (UTC)

I would think just the disease. In this specific case, it has been imported from the English Wikipedia, where it is listed in the same article because there is no separate article for the disease itself. --WS (talk) 07:03, 9 August 2013 (UTC)

Hong Kong flu vs H3N2

Could you look at the page Hong Kong flu (Q1069785) please? In most of the languages it is the Hong Kong flu (1968-1969), and there are two languages with the virus H3N2 (causing the Hong Kong flu). I don't want to delete them as I have no idea about the content of these Japanese and Armenian articles; for those the correct place would be influenza A virus subtype H3N2 (Q13399926). I already corrected the Hungarian and the French one. --Hkoala (talk) 20:13, 5 January 2014 (UTC)

Thanks for noticing. I moved the two H3N2 articles to influenza A virus subtype H3N2 (Q13399926). From the machine translation they both looked like they are about the virus. --Tobias1984 (talk) 22:51, 5 January 2014 (UTC)

Ebola

I just checked Ebola virus disease (Q51993) and it still has many missing statements. It is actually a good item for testing whether we can already fully describe a disease with our existing properties. Also, our properties are barely used by the community. Any ideas how this could be improved? -Tobias1984 (talk) 19:32, 2 August 2014 (UTC)

Medical Wikidata, Citation MetaData

There is now Wikidata:WikiProject Source MetaData, and Daniel Mietchen is starting off with malaria-related scientific articles. Is good source metadata of broad interest to the medical data community on Wikipedia? If so, is there a suitable place to ask them for input (for example, what metadata fields would medical editors like?)?

I've also been talking to the Cochrane Collaboration about sharing their database with Wikidata, and they are interested but understandably worried about labour costs. I'm told that the IdeaLab and Wikimedia Deutschland are both possible sources of funding for a data interoperability internship or some such; do you know of any other good routes? HLHJ (talk) 17:12, 11 August 2014 (UTC)

@HLHJ: What kind of metadata would Cochrane provide? I think that at the moment Wikidata and d:Wikidata:WikiProject Medicine should concentrate on relieving medical editors of maintenance activities so they can concentrate on content. Primarily that includes interwiki-links and identifiers to medical catalogues. Possible next steps would be the centralization of categories (an ontology that is currently built in 270 different languages, wasting a lot of time). We are slowly moving towards this goal: The Drugbank-ID stored on Wikidata is now used in some Russian-Wiki templates (d:Property talk:P715). So if Cochrane can do one thing, they should try to make all medical identifiers freely available in a machine-readable form. --Tobias1984 (talk) 17:02, 14 August 2014 (UTC)
Since they seem to have databases, I can't imagine much is in non-machine-readable form. Risk of bias assessments of articles were mentioned, and other raw review data. With such metadata, when I found a relevant article, I could ask the database whether this article had been incorporated into any systematic reviews. I could ask it for a list of all the articles which had been incorporated into systematic reviews that also incorporated this article. I could ask for the proportion of those studies that were double-blinded, or had trial registrations listed. I could ask for the links to the articles and their registrations. I could plot how long the studies lasted, or how many people they involved... etc., automatically. Would this be useful enough to medical editors to justify the effort? HLHJ (talk) 19:03, 14 August 2014 (UTC)
@HLHJ: I copied the discussion here so other data contributors can look at it. I do think that this data would be useful. Queries for studies with e.g. "more than 500 participants" would be a powerful tool for finding studies that a text search might not surface. -Tobias1984 (talk) 08:40, 16 August 2014 (UTC)
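
A minimal sketch of the kind of structured query discussed above, run over a toy list of study records. The record fields, the sample studies, and the function names are all made up for illustration; they do not describe the Cochrane database or any existing Wikidata property.

```python
from dataclasses import dataclass


@dataclass
class Study:
    title: str
    participants: int
    double_blinded: bool


# Invented sample records, used only to exercise the functions below.
STUDIES = [
    Study("Study A", 120, True),
    Study("Study B", 850, False),
    Study("Study C", 2300, True),
]


def large_studies(studies, minimum=500):
    """Return the titles of studies with more than `minimum` participants."""
    return [s.title for s in studies if s.participants > minimum]


def share_double_blinded(studies):
    """Return the proportion of studies that were double-blinded."""
    return sum(s.double_blinded for s in studies) / len(studies)


print(large_studies(STUDIES))
print(round(share_double_blinded(STUDIES), 2))
```

Neither question can be answered by searching article text, since the participant count and blinding status only become comparable once they are stored as typed values.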

Intra language links

We are creating medical content for K'ichi that only exists in incubator. Is there a way to add this content to the intra language links? We have 4 translated articles as listed at the bottom here [6] Doc James (talk · contribs · email) (if I write on your page reply on mine) 10:09, 26 August 2014 (UTC)

@Jmh649: Incubator site-links have not been implemented yet, but are planned I think. It is probably not high on the priority-list because of the small amount of pages. Tobias1984 (talk) 10:34, 26 August 2014 (UTC)

Data acquisition disease infobox

Just putting this here, so it can go into the talk page archive. This was the initial acquisition of the diseases infobox. These identifiers are now stored on Wikidata. -Tobias1984 (talk) 12:15, 27 August 2014 (UTC)

Task | Bot(s) | # of claims | Progress
Template:Infobox disease (Q6436840) - 5842 transclusions | | | ?
MeSH ID (P486) | User:KLBot2, User:SamoaBot Task 25 | 922 | 15%?
ICD-9 (P493) | User:Kompakt-bot | 3650 | 99%? ✓ Done
ICD-10 (P494) | User:KLBot2, User:Kompakt-bot | 3914 | 99%? ✓ Done
DiseasesDB (P557) | User:KLBot2, User:Kompakt-bot | 2679 | 99%? ✓ Done
ICD-O (P563) | User:Kompakt-bot | 321 | 99% ✓ Done
OMIM ID (P492) | User:KLBot2, User:Kompakt-bot | 1456 | 99%? ✓ Done
MedlinePlus ID (P604) | User:KLBot2, User:Kompakt-bot | 1395 | 99%? ✓ Done
eMedicine (P673) | User:Kompakt-bot | 1897 | 99? ✓ Done
GeneReviews ID (P668) | User:Kompakt-bot | 147 | 99? ✓ Done

Disease Ontology import

As a member of the ongoing Gene Wiki project, I am contemplating how to bring structured gene-disease relationships into Wikidata. The thinking right now is to a) bring in the human gene concepts (see ProteinBoxBot's work), b) bring in the human disease ontology http://disease-ontology.org/ , and c) add claims that connect them based on repositories such as OMIM. Does anyone here in project medicine have thoughts about that? --Genewiki123 (talk) 17:35, 10 September 2014 (UTC)

Genewiki123 (talk · contribs · logs): It is a perennial proposal. It is my opinion that if you drew up a proposal then it would be likely to get community support. User:Klortho might be able to tell you something about past proposals, as could User:Emw. My advice would be that if you did this, because of the magnitude of the project, consider making a proposal which includes a layman's explanation of what you are doing, what sources you have to back your claims, and why the outcome matters. I would not recommend doing this unless you get consensus that the sources you use to back your claims are excellent, and that consensus should probably include people who have no idea what this project means, because I predict that this kind of project would get more scrutiny from odd observers than any typical gene database. Blue Rasberry (talk) 20:11, 11 September 2014 (UTC)
Genewiki123, Bluerasberry, I recommend reading The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration and Relations in Biomedical Ontologies by Barry Smith et al., which are major papers in biomedical ontology and describe key principles of OBO ontologies (of which Disease Ontology is one).
One of the main purposes of the Disease Ontology (DO) is as a class hierarchy for disease. When DO makes claims like "breast cancer is_a thoracic cancer" (as it does at http://disease-ontology.org/), it is stating "breast cancer subclass of (P279) thoracic cancer". You can see this by opening http://purl.obolibrary.org/obo/doid.owl (warning: big) and seeing how all uses of "is_a" are replaced by rdfs:subClassOf. P279 is mapped to rdfs:subClassOf and exists as such in the Wikidata OWL exports (see wikidata-taxonomy.nt.gz, which you can explore in Protege).
A hierarchy for disease already exists on Wikidata: http://tools.wmflabs.org/wikidata-todo/tree.html?q=Q12136&rp=279&lang=en. As you're likely aware, there are many overlapping disease hierarchies from different authorities. Wikidata can theoretically support multiple hierarchies in a given domain like "disease", but we should designate one as preferred to make things tractable, at least for humans.
I'll follow up here this weekend with more comments. But I think I broadly support Genewiki123's proposal. Emw (talk) 13:00, 12 September 2014 (UTC)
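
To make the is_a / subclass of (P279) correspondence described above concrete, here is a minimal sketch that reads OBO-format text and emits one P279 statement per is_a line. The identifiers `EX:0001` and `EX:0002` are placeholders rather than real Disease Ontology IDs, and the parser only handles the few tags shown; it is an illustration, not an import tool.

```python
def subclass_statements(obo_text):
    """Yield (child_id, 'P279', parent_id) for every is_a line in OBO-format text."""
    current_id = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            # A new term stanza starts; forget the previous term's ID.
            current_id = None
        elif line.startswith("id:"):
            current_id = line[len("id:"):].strip()
        elif line.startswith("is_a:") and current_id:
            # OBO files put a human-readable label after "!"; keep only the ID.
            parent = line[len("is_a:"):].split("!")[0].strip()
            yield (current_id, "P279", parent)


# Placeholder stanza (the IDs are invented for illustration).
sample = """
[Term]
id: EX:0002
name: breast cancer
is_a: EX:0001 ! thoracic cancer
"""

statements = list(subclass_statements(sample))
```

Each tuple states exactly what the ontology states, which matters for provenance when the statement is later referenced as "stated in" the ontology.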
Thanks for the advice, will definitely try to get something formal written out and posted here before taking any further steps. In terms of the existing disease hierarchy, my thinking is to leave it as it is, map DO terms to it where possible and then create new entities when there is something missing. A key modeling question for me is whether to use the subclass property to assemble a polyhierarchy (since it would contain the existing structure, the DO structure, and potentially many others, e.g. MeSH). I think, with the help of B.S. disciples like @EMW here, we could do this in a way that is ontologically correct and useful, but that would be a lot of challenging modeling work and I worry about how it would impact generic tools for viewing class hierarchies. Might it actually be better to leave a single Wikidata-specific disease subclass hierarchy in place and then enhance the data with the unchanged hierarchies from external sources, modeled with source-specific properties (DO_subclass_of, MeSH_narrower_than, etc.)? --Genewiki123 (talk) 18:10, 15 September 2014 (UTC)


Looking at the hierarchy as it is right now we would like to propose a streamlined set of disease categories, with the idea that a disease could be a cancer as well as a disease of a certain body system. This upper level would include disease of each body system and e.g. cancer, infectious disease, metabolism disease, mental health disease, genetic disease and syndrome. And we would like to discuss the possibility of a separate category for physical disorders. emitraka lschriml

I just tried to find out what the licensing of the DO is. According to doi:10.1093/nar/gkr972, it is "the [sic!] Creative Commons license", which is not precise enough for most purposes, but sufficient to inform us that importing into Wikidata is not possible. --Daniel Mietchen (talk) 23:39, 31 October 2014 (UTC)
emitraka, lschriml, Daniel raises a salient point. Could you make the official licensing of Disease Ontology more precise? Unless Disease Ontology is in the public domain -- i.e. under CC0 -- it technically cannot be imported into Wikidata. More on CC0 here: https://wiki.creativecommons.org/CC0_FAQ. Emw (talk) 16:50, 29 November 2014 (UTC)
Daniel Mietchen, Emw, sorry about us not being clearer on the issue beforehand. The official licensing of DO is under Creative Commons Attribution 3.0 Unported. Emitraka (talk) 22:01, 1 December 2014 (UTC)

subclass of disease: lo-fi DO import problematic

I recently noticed a batch addition of 'subclass of (P279) disease (Q12136)' statements to items about diseases. For example, see this revision of Alzheimer's disease: https://www.wikidata.org/w/index.php?title=Q11081&oldid=177834864. It has a few problems:

  • Low fidelity provenance. The statements have references like "stated in: Disease Ontology release 2014-11-14", but the corresponding Disease Ontology (DO) item for the disease does not directly state "subclass of disease". The Disease Ontology item for Alzheimer's disease states "subclass of tauopathy" and "subclass of dementia".
If we want to import an ontology, then we should import it. Do a Wikidata keyword search on the object of the ontology's subclass of (/ is a) statement, get its Q number, and make the precise statement the ontology makes. Compare DO's 'Xrefs' to Wikidata's identifier properties if disambiguation becomes an issue. Create items as needed, though this should be rare. We should not use references like "stated in: $ontology" to support claims that are only distant transitive entailments of what the imported ontology actually states.
  • Redundancy with a large set of existing claims. Most of these diseases had already been classified with more granular claims via subclass of (P279), and many of those claims are precisely what is stated in Disease Ontology. For example, per links above, DO states "Alzheimer's disease subclass of dementia", and this statement was already in the Wikidata item on Alzheimer's. The appropriate thing to do in such a case is to add a reference to the pre-existing claim when it matches the imported statement -- rather than adding a new, redundant claim.
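
The two points above can be sketched as a small decision rule: for each statement the ontology makes directly, either attach a reference to a matching existing claim or add a new claim. The function name and the use of plain labels instead of Q numbers are assumptions made for readability; this is not how any particular bot is implemented.

```python
def plan_import(existing_parents, imported_parent):
    """Decide how to record one imported 'subclass of' statement.

    existing_parents: set of parents already stated via P279 on the item.
    imported_parent:  a parent the ontology states directly for this item.
    """
    if imported_parent in existing_parents:
        # The claim is already there: add provenance, not a duplicate claim.
        return ("add_reference", imported_parent)
    return ("add_claim", imported_parent)


# Alzheimer's disease, per the example in this thread: Wikidata already had
# "subclass of dementia"; DO states "subclass of tauopathy" and "subclass of dementia".
existing = {"dementia"}
do_parents = ["tauopathy", "dementia"]

plan = [plan_import(existing, parent) for parent in do_parents]
```

Because only the ontology's direct parents are passed in, a distant ancestor such as "disease" is never emitted with a "stated in" reference.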

Can we revert these problematic "subclass of disease" statements added by ProteinBoxBot? https://tools.wmflabs.org/autolist/autolist1.html?q=claim%5B279%3A12136%5D currently reports 4450 "subclass of disease"; that number should ideally be much lower. I support faithfully importing subclass of statements from Disease Ontology, but the batch edits made around November 24, 2014 are problematic and need to be fixed. Genewiki123, Bluerasberry, Daniel Mietchen, what do you think? Emw (talk) 16:37, 29 November 2014 (UTC)

I reverted these "subclass of" claims. Andrawaag (talk) 22:59, 29 November 2014 (UTC)
Thank you Andra (and for all your work). http://tools.wmflabs.org/autolist/autolist1.html?q=claim%5B279%3A12136%5D now lists 62 direct "subclass of disease" claims, which seems about right. Emw (talk) 00:59, 30 November 2014 (UTC)
Andrawaag (talk · contribs · logs), thanks for responding to the criticism of your bot. Emw (talk · contribs · logs), thanks for raising the issue. I agree with Emw when he says "If we want to import an ontology, then we should import it", and as I understand, in this case there was some partial import of some ontology which was not well established. If that was the case, and there was dispute about its validity, then I am glad to see the uploader withdraw the claims due to lack of explanation of the need for these edits to be in Wikidata. Blue Rasberry (talk) 16:05, 2 December 2014 (UTC)
Bluerasberry, it turns out that Andra was just being consistent with how genes were very broadly classified (e.g. APOE subclass of gene). General genes, specific diseases examines those different approaches to classification. Emw (talk) 01:16, 3 December 2014 (UTC)
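
For readers unfamiliar with the autolist links quoted in this thread, the percent-encoded query can be decoded with the Python standard library. The reading of `claim[279:12136]` as "items with a P279 claim whose value is Q12136" follows from how the links are used above; the regular expression is only an illustrative assumption about that one query form.

```python
import re
from urllib.parse import parse_qs, urlsplit

url = "https://tools.wmflabs.org/autolist/autolist1.html?q=claim%5B279%3A12136%5D"

# parse_qs decodes the percent-encoding (%5B -> "[", %3A -> ":", %5D -> "]").
query = parse_qs(urlsplit(url).query)["q"][0]

# Split the decoded query into a property number and an item number.
match = re.fullmatch(r"claim\[(\d+):(\d+)\]", query)
prop, item = f"P{match.group(1)}", f"Q{match.group(2)}"
```

Here `query` is `claim[279:12136]`, i.e. subclass of (P279) with the value disease (Q12136).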

Adding disease properties

We're developing a Disease Box in the same vein as the Protein Box. So here we would like to propose, and initiate discussion on, some properties we think are needed.

Possible properties:

→ to add to Disease:
symptoms: add list of symptoms from SYMP
symptom: https://www.wikidata.org/wiki/Property:P780#top [page is currently not populated]
pathogen transmission process: add terms from TRANS
add anatomical location, use UBERON; link out to specific anatomy terms or as an alternative pull UBERON into Wikidata
Add UBERON ID as property of Anatomy
add property Phenotype: populate from HPO
add Inheritance: genetic, monogenic, polygenic, autosomal, recessive, dominant, X-linked
add Orphanet ID
add prevalence
add ICD10 codes
add disease categories/upper level term; suggest upper level term e.g. cancer, genetic disease, metabolic disease, mental health disease etc.
add synonyms

For each of the ontologies mentioned above we propose to pull them into Wikidata. emitraka lschriml

Are any of these already being covered in Wikidata? Blue Rasberry (talk) 19:49, 19 September 2014 (UTC)
@emitraka, lschriml: Orphanet ID (P1550) is done. Did you have time to familiarize yourself with Wikidata? Do you need help with the property proposals? -Tobias1984 (talk) 17:02, 6 October 2014 (UTC)
Thank you! I'm finding my way around Wikidata, slowly but surely. Emitraka (talk) 17:17, 6 October 2014 (UTC)
@emitraka, lschriml, Emw, Andrew Su, Genewiki123: Just created UBERON ID (P1554). -Tobias1984 (talk) 06:35, 7 October 2014 (UTC)

Summary of relevant existing properties

Emitraka, Lschriml, welcome to Wikidata! It's exciting to have you both here -- I'm a fan of your work with the Disease Ontology. Several of the properties you propose exist:

Property Talk page All usages Creation discussion
symptoms (P780) Property_talk:P780 http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B780%5D Wikidata:Property_proposal/Archive/12#P780
pathogen transmission process (P1060) Property_talk:P1060 http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B1060%5D Wikidata:Property_proposal/Archive/18#P1060
prevalence (P1193) Property_talk:P1193 http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B1193%5D Wikidata:Property_proposal/Archive/20#P1193
mode of inheritance (P1199) Property_talk:P1199 http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B1199%5D Wikidata:Property_proposal/Archive/19#P1199
ICD-10 (P494) Property_talk:P494 http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B494%5D Wikidata:Property_proposal/Archive/7#P494 (and #P493)

As you can see, most of the above properties are rarely used at the moment, simply because they're little known.

Many other properties of interest:

We do not currently have a property for Uberon ID or Orphanet ID. Identifier properties like that virtually always pass property proposal without issue. I'd certainly be interested in hearing more about importing Uberon (presumably via subclass of); it seems like it could be a good candidate for a reference domain ontology for anatomy on Wikidata.

Note that you can search properties by prepending "P:" to your query in the search box at upper right in the Wikidata UI (e.g. P:icd).

Disease categories can be added with subclass of (P279). That property is mapped to rdfs:subClassOf and thus also has the semantics of BFO's is_a. Help:Modeling_causes#Malaria also seems pertinent. Synonyms can be added as aliases. Aliases have a nice side-effect of getting picked up by Wikidata's search. The Gene Wiki project cleverly enabled searching by Entrez Gene IDs and UniProt IDs by setting those values as aliases in genes like RELN (Q414043) and proteins like Reelin (Q13569356).

Hope this helps. You can ping me or other users by mentioning them like [[User:Emw|Emw]]. Cheers, Emw (talk) 06:07, 20 September 2014 (UTC)

Adding disease properties - amended

Emw thank you very much for the information. It looks like most of what we need already exists. What we still need, and would like to move forward with proposals for them, are the following properties:
UBERON IDs
ORPHANET IDs
Phenotype

Any comments will be greatly appreciated Emitraka (talk) 19:33, 30 September 2014 (UTC)

@Emitraka: I requested one of the properties here. You can request more properties in the same way: Wikidata:Property_proposal/Natural_science#Orpha.net_ID. -Tobias1984 (talk) 06:29, 1 October 2014 (UTC)
You can also check and make changes to the project's list of properties: Wikidata:WikiProject Medicine/Properties. Tobias1984 (talk) 06:32, 1 October 2014 (UTC)
Emitraka, Tobias1984, others, has phenotype and phenotype of are interesting potential properties. I think some explanation of the formal differences and relations between disease, phenotype and symptom would be helpful.
From what I can tell from Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies, Integrating phenotype ontologies across multiple species, The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data and browsing around HPO's Phenomizer, HPO seeks to link what it calls "phenotypic abnormalities" (synonym: "organ abnormalities", also called "clinical features" in aforementioned papers) with diseases -- more precisely, human genetic disorders that have a Mendelian mode of inheritance. For example, one might say "22q13 deletion syndrome (Q1926345) has phenotype 2-3 toe syndactyly".
One possible issue I see with HPO is that it classifies some things that do not have a genetic basis as phenotypic abnormalities. For example, "Abdominal pain" (HP:0002027) is transitively set to satisfy the statement "abdominal pain subclass of phenotypic abnormality". By the standard reading of subclass of, that would mean every instance of abdominal pain is also an instance of phenotypic abnormality. Clearly, that statement is not consistent with how the term "phenotypic" is used throughout the literature to mean an observable trait (aka characteristic, property or quality) that arises because of the genotype of the bearer (i.e. the organism) the quality inheres in. An example of abdominal pain that is not a phenotypic abnormality is abdominal pain caused by extremely spicy food.
Emw, though there is clearly a focus on disorders with obvious genetic components, I don't think the HPO folks or the larger community would consider phenotype to be solely related to genetics. Phenotype is the combination of genetics and environment. It makes sense to include things like abdominal pain because they are indeed such a combination. Even if we only cared about the genetics of disease, we would still want to include observables like abdominal pain or short stature etc. in our records. The determination of the relationships between environmental factors and genetic factors could and often would come after the recording of the phenotype. --12.69.234.201 18:57, 2 October 2014 (UTC)
When would has phenotype be preferable to symptoms (P780) (i.e. has symptom), and vice versa?
Great question - and very hard to answer. I would like to hear Peter Robinson answer it. One clear case would be phenotypes that are not necessarily related to diseases, for example blue eyes, freckles, etc. Has Symptom seems like a sub-property of Has Phenotype that constrains the relation such that the items in its Domain are classified as diseases. --12.69.234.201 18:57, 2 October 2014 (UTC)
It would also be good to discuss why a potential has phenotype property would be needed, distinct from the generic causation property has cause (P828). (Note also related properties has immediate cause (P1478), has contributing factor (P1479) and their inverses cause of (P1542), immediate cause of (P1536), contributing factor of (P1537), and Help:Modeling causes.) Consider:
Option A: has effect (i.e. cause of)
2-3 toe syndactyly
22q13 deletion syndrome (Q1926345)
22q13 deletion (Qx)
Option B: has symptom
2-3 toe syndactyly
22q13 deletion syndrome (Q1926345)
22q13 deletion (Qx)
Option C: has phenotype
2-3 toe syndactyly
22q13 deletion syndrome (Q1926345)
22q13 deletion (Qx)
Which of the above options is preferable?
It seems to depend on what items you want to use the property on. If you are linking specific genetic variations to a phenotype that we know they cause (or play a role in causing), then option A makes sense. If you are linking a genetic event to a phenotype that seems to co-occur, then option C. If you are linking a disease to a symptom/phenotype, then B.
None of these questions are show-stoppers or blocking issues in my opinion, but they are probably worth contemplating to avoid confusion down the road. Emw (talk) 13:00, 1 October 2014 (UTC)

WDWP:Medicine mentioned in WM blog

http://blog.wikimedia.de/2014/10/22/establishing-wikidata-as-the-central-hub-for-linked-open-life-science-data/ -Tobias1984 (talk) 15:11, 24 October 2014 (UTC)

Add UMLS concept id property?

What do you think of adding a property to link items to UMLS concept identifiers? The UMLS (Unified Medical Language System) provides a large-scale 'metathesaurus' that attempts to organize and inter-relate hundreds of biomedical terminologies. http://www.nlm.nih.gov/research/umls/ . To integrate ontologies, they produce a single 'concept identifier' and then link it up to the equivalent concepts in each of the ontologies that they import. This is very useful for large-scale data integration projects that often need to map from one vocabulary to another. Having umls concept ids could really help disambiguate wikidata items as we go forward. --Genewiki123 (talk) 19:34, 13 November 2014 (UTC)

Genewiki123, sounds good to me. Emw (talk) 15:04, 29 November 2014 (UTC)
Sounds good to me too. --Daniel Mietchen (talk) 03:38, 30 November 2014 (UTC)

Hyphens instead of dashes -- an appeal for simplicity

Many medical subjects have compound names, like "Creutzfeldt-Jakob disease" or "case-control study". Newspaper style guides often recommend using en dashes (–) to connect the first part of such disease names, as does English Wikipedia per MOS:DASH. However, hyphens (-) are used instead of dashes by almost all medical journals, medical institutions, non-English Wikipedias, editors, developers, and readers.

Dashes in labels cause more problems than they solve. If a user is trying to search a page for a disease by name, then pasting "Creutzfeldt-Jakob disease" into a web browser's Find bar (Control F or Command F) won't work if "Creutzfeldt" and "Jakob" are connected by a dash. If an editor searches for "Charcot-Marie-Tooth disease" they will currently be brought to Q18553896 instead of Q1052687, the former of which was created by mistake because the developer (understandably!) did not account for dashes. Search or label-dependent functionality in third-party software will also likely encounter problems if we persist in using dashes.

To address those problems, I propose we make it policy to always use hyphens and never dashes in labels that have compound words or ranges. This would simplify matters and reduce bugs by aligning our labels to those overwhelmingly used by medical and scientific journals and institutions, non-English Wikipedias, and our various types of users. What do you think? Emw (talk) 18:43, 6 December 2014 (UTC)
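
The search problem described above, and the proposed fix, can be shown in a few lines of Python. This is a minimal sketch: the helper name `normalize_label` is hypothetical, and treating only the en dash (U+2013) and em dash (U+2014) as "dashes" is an assumption.

```python
# Characters to replace with a plain hyphen-minus (assumed set: en dash, em dash).
DASHES = "\u2013\u2014"
_TO_HYPHEN = str.maketrans({ch: "-" for ch in DASHES})


def normalize_label(label):
    """Return `label` with en and em dashes replaced by hyphens."""
    return label.translate(_TO_HYPHEN)


dashed = "Creutzfeldt\u2013Jakob disease"  # label written with an en dash
typed = "Creutzfeldt-Jakob disease"        # what a user types or pastes

found_before = typed in dashed                  # substring search on the raw label fails
found_after = typed in normalize_label(dashed)  # succeeds once the label is normalized
```

The same normalization applied before label matching would also avoid creating a duplicate item when the only difference between two labels is the dash character.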


Notified participants of WikiProject Medicine: Tobias1984, Doc James, Bluerasberry, Wouterstomp, Gambo7, Daniel Mietchen, Andrew Su, Peter.C, Klortho, Remember, Matthiassamwald, Projekt ANA, Andrux, Pavel Dušek, Was a bee, Alepfu, FloNight, Genewiki123, Emw, emitraka, Lschriml, Mvolz, Franciaio, sebotic

Notified participants of WikiProject Molecular biology: Andrew Su, Marc Robinson-Rechavi, Pierre Lindenbaum, Michael Kuhn, Boghog, Emw, Chandres, Dan Bolser, Pradyumna, Chinmay, Timo Willemsen, Salvatore Loguercio, Tobias1984, Daniel Mietchen, Optimale, Mcnabber091, Ben Moore, Klortho, Hypothalamus, Vojtěch Dostál, Gtsulab, Andra Waagmeester, Sebotic, Mvolz, Toniher, Elvira Mitraka, David Bikard, Dan Lawson, Francesco Sirocco, Konrad U. Förstner, Chris Mungall, Kristina Hettne, Hardwigg, i9606, Putmantime, Tinm, Karima Rafes, Finn Årup Nielsen, Jasper Koehorst, Till Sauerwein, Crowegian

I'm not a native speaker, but I don't think that replacing dashes with hyphens could cause any problems. The label is anyway just a way for us to handle the database. If somebody sees the need for a typographic correct output, we should probably create a property for that. --Tobias1984 (talk) 10:02, 8 December 2014 (UTC)

Launch of WikiProject Wikidata for research

Hi, this is to let you know that we've launched WikiProject Wikidata for research in order to stimulate a closer interaction between Wikidata and research, both on a technical and a community level. As a first activity, we are drafting a research proposal on the matter (cf. blog post). It would be great if you would see room for interaction! Thanks, --Daniel Mietchen (talk) 01:34, 9 December 2014 (UTC)

Classifying surgical procedures

There is a discussion at Talk:Q15636253 about how to organize surgical procedures into a concept hierarchy, and how to label it. It touches on ICD-10-PCS and what constitutes reasonable medical terminology. Input from members of this WikiProject would be very welcome! Emw (talk) 17:27, 11 January 2015 (UTC)


Notified participants of WikiProject Medicine: Tobias1984, Doc James, Bluerasberry, Wouterstomp, Gambo7, Daniel Mietchen, Andrew Su, Peter.C, Klortho, Remember, Matthiassamwald, Projekt ANA, Andrux, Pavel Dušek, Was a bee, Alepfu, FloNight, Genewiki123, Emw, emitraka, Lschriml, Mvolz, Franciaio, sebotic

ICD-10-PCS copyright

Is ICD-10-PCS in the public domain? This question arises from an informative comment:

If the ICD can be used then obviously that is a classification system with a huge amount of international support and I would like to use it. At Wikimania 2012 the WHO sent two representatives to talk to Wikipedians. As I understood at the time, they only allowed the newer ICD systems to be used with licensing agreements, as these are non-free coding systems. We considered becoming more organized to ask that the ICD-11 be freely licensed, as described at en:Wikipedia:WikiProject Medicine/ICD11. What do you know about the circumstances under which these coding systems can be used freely?

While ICD-10 is made available by the World Health Organization (WHO), ICD-10-PCS is made available by the Centers for Medicare and Medicaid Services (CMS), part of the Department of Health and Human Services (HHS) of the US federal government. Per the page 1 footnote in 2015 Development of the ICD-10 Procedure Coding System, the system is developed through funding from federal government contracts (Nos. 90-1138, 91-22300, 500-95-0005, HHSM-500-2004-00011C and HHSM-500-2009-000555-C) by 3M Health Information Systems.

I have found no copyright notice in the resources at 2015 ICD-10 PCS and GEMs. Works by the US federal government are generally not copyrighted and thus in the public domain, but the situation is murkier for works produced by government contractors. The HHS Grants Policy Statement seems to be an authoritative statement on copyrightability of US government works funded by HHS contracts, but I see no particular mention of this work there.

The copyright status of ICD-10-PCS is unclear. If it is in the public domain, then it seems like a good option to consider using as the preferred subclass of (P279) hierarchy for medical and surgical procedures. Such use might also demonstrate to medical organizations like WHO, etc. that putting works like ICD-10 in the public domain would be practically useful and in the public interest. Emw (talk) 23:48, 17 January 2015 (UTC)

Archiving

Could we set up archiving on this page? I'm not sure how that's done on Wikidata but it'd be quite useful to keep only the recent and relevant threads here. --LT910001 (talk) 03:09, 29 March 2015 (UTC)

Moving ICD codes and ATC codes from navboxes to Wikidata

We recently had a similar process with Anatomy articles - identifiers for templates were moved to Wikidata. This significantly enhanced their readability. Could we do something similar for medicine templates?

An example template with the codes is here: en:Template:Respiratory pathology.

As stated previously the reason for doing this is that we can preserve the identifiers but increase the navigational value. Users do not use the numbers to navigate within the navboxes, and they visually clutter the title.

Thoughts? --LT910001 (talk) 03:11, 29 March 2015 (UTC)

I have previously proposed this here: Wikidata:Bot_requests#Move_all_template_ICD9_and_ICD10_references_to_wikidata but do not understand the reply, which I think is referring to articles rather than templates with duplicated ICD9 values (?). --LT910001 (talk) 03:14, 29 March 2015 (UTC)
@LT910001: I also do not understand the reply.
Can you more explicitly point out the codes in the template you presented as an example? Looking at that page I do not see what was changed. Blue Rasberry (talk) 18:43, 31 March 2015 (UTC)
Thanks for the ping, Bluerasberry (talk · contribs · logs), I often forget to check Wikidata. That is my point. The codes are still there ("J 460..."). They make the template name harder to read and don't add navigational value. We can move all those codes to Wikidata and then remove them from the template headings. We recently did this with anatomy templates and I feel this improved their readability about 100% as now they are a lot less intimidating to casual readers. --LT910001 (talk) 22:10, 2 April 2015 (UTC)
@LT910001: I am still not sure what is being proposed. The reader-facing head of that template is currently "Pathology of respiratory system (J, 460–519), respiratory diseases". If you are proposing to change it to "Pathology of respiratory system, respiratory diseases" and to move the links to "J, 460–519" somewhere other than the title line then that could be an improvement, as obviously those codes are not intended for most readers, even though they do back structure of the template and ought not be removed entirely. Can you show an example anatomy template that was like this before, but in which someone has moved the links to these coding systems from the title to elsewhere and incorporated some connection with Wikidata? Blue Rasberry (talk) 13:36, 6 April 2015 (UTC)
That is exactly what I am proposing ("move the links to "J, 460–519" somewhere other than the title "). Readers will benefit by having a clearer title.
You ask for an example page. As stated at the beginning of this thread, en:Template:Respiratory pathology is an example of a template where this has happened. You can use the 'history' ability to see the difference. You can click 'wikidata' to see the related data. This has in fact already happened on every anatomy template on Wikipedia. --LT910001 (talk) 06:07, 12 April 2015 (UTC)
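
As a small illustration of the title change discussed in this thread, the sketch below removes a parenthesised code group from a navbox title. The regular expression is an assumption: it drops any parenthetical that contains a digit, so it would also remove a non-code parenthetical that happens to contain a number, and it says nothing about where the removed codes should be stored.

```python
import re

# Matches optional leading whitespace plus a "(...)" group containing at least one digit.
CODE_GROUP = re.compile(r"\s*\([^)]*\d[^)]*\)")


def strip_codes(title):
    """Return `title` without parenthesised classification codes."""
    return CODE_GROUP.sub("", title)


title = "Pathology of respiratory system (J, 460\u2013519), respiratory diseases"
clean = strip_codes(title)
```

Here `clean` is the reader-facing title proposed above, "Pathology of respiratory system, respiratory diseases".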

Infobox disease - update required...

There were a number of changes from DSM-IV to DSM-5 in the names of diseases and in their definitions, and new diseases emerged... at the same time Infobox disease remained unchanged. I think the Infobox requires modifications, i.e. adding DSM-5 codes... --Pwlps (talk) 06:36, 3 May 2015 (UTC)

@Pwlps: Hi! Including the newest classifications would be a goal of this WikiProject. Currently we have quite a high workload curating the existing data and trying to catch up on all the data that is spread across the different language editions of Wikipedia. If you would like to work on DSM-5, I could help you get started. --Tobias1984 (talk) 10:04, 3 May 2015 (UTC)
@Tobias1984: Thank you for your offer, but I have zero knowledge of modifying templates and I'm far more effective as an editor of Polish WikiMedicine Project articles. However, I think that due to the DSM-5 changes, updating Infobox disease is rather necessary... --Pwlps (talk) 06:26, 4 May 2015 (UTC)
@Pwlps: I am ignorant of how things work here also but I often look at infoboxes and want to know how I could add information here that would update content in Wikipedias of different languages. The best thing that I can say is that even though I know almost nothing, I would look at any proposal here with others, but I also am learning slowly. Blue Rasberry (talk) 19:35, 4 May 2015 (UTC)
@Pwlps, Bluerasberry: I requested the inclusion here: Wikidata:Property_proposal/Natural_science#DMS_V_.28DSM_5.29 --Tobias1984 (talk) 08:04, 6 May 2015 (UTC)

No MeSH term property?

Correspondence between MeSH data and Wikidata properties (from [1]).

We have already two MeSH (Medical Subject Headings) related properties.

But it seems that we don't have a "MeSH term" property yet (described as "MeSH Heading" in the figure at right). In English Wikipedia, the "MeSH term" is generally referred to as "MeshName". If we search at PubMed, this "MeSH term" data is needed (instruction video, search result example). So I suppose we need to propose and create a property for MeSH terms. What do you think? Thanks. --Was a bee (talk) 04:56, 7 June 2015 (UTC)

@Was a bee: That search-video seems really practical. A few of my questions: Do we want to rebuild a system that already works well and is public? Can we get to comparable search results using the existing propoerties? Are we even allowed to copy the MeSH terms (It seems to go beyond storing identifiers). --Tobias1984 (talk) 14:01, 7 June 2015 (UTC)
@Tobias1984: Thank you for questions. I think my explanation was not good :p MeSH terms are already widely used. So there are no actual changes for this. For example, article en:Dengue fever has the line"MeshName = Dengue" in infobox. This parameter "Dengue" is MeSH term. MeSH term is short nouns like "Dengue", not a definition text about Dengue. Difference is, as same as other similar data (e.g. MeSH Code (P672), MeSH ID (P486)), simply saved in various wikis in distributed manner (old style) or saved at one place (at Wikidata). As far as I searched, PubMed (and most of MeSH related website) is not afford another style of input. --Was a bee (talk) 15:47, 7 June 2015 (UTC)
I worry about copyright infringement. How much of the MeSH system can we copy without encroaching on the parts which are not allowed to be copied? If we can import MeSH terms then I think that would be useful, because they give a simple human-readable explanation of what the other identifiers are. If it is allowed to copy these terms then I support the creation of whatever is necessary to include the information here. Blue Rasberry (talk) 13:34, 10 June 2015 (UTC)
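For illustration only, the "old style" distributed storage mentioned in this thread means each wiki carries a line such as "MeshName = Dengue" in its infobox wikitext. A minimal Python sketch of reading that value out of a page is given below. The function name `extract_mesh_name`, the regular expression, and the sample infobox text are assumptions made for this example; only the parameter name MeshName and the value Dengue come from the discussion above.

```python
import re
from typing import Optional

# Hypothetical pattern: one template parameter line such as "| MeshName = Dengue".
# Only spaces and tabs are allowed around the "=", so an empty value cannot
# accidentally swallow the following line.
_MESH_NAME_RE = re.compile(
    r"^[ \t]*\|?[ \t]*MeshName[ \t]*=[ \t]*(.*?)[ \t]*$",
    re.MULTILINE,
)


def extract_mesh_name(wikitext: str) -> Optional[str]:
    """Return the MeshName value found in infobox wikitext, or None if absent or empty."""
    match = _MESH_NAME_RE.search(wikitext)
    if match is None:
        return None
    value = match.group(1)
    return value or None


# Illustrative wikitext; the template layout here is an assumption, not a copy
# of the real en:Dengue fever infobox.
sample = "{{Infobox disease\n| Name = Dengue fever\n| MeshName = Dengue\n}}"
mesh_term = extract_mesh_name(sample)
```

A value extracted this way is the kind of short string that a central "MeSH term" property would hold once, instead of once per wiki.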

Meta-Data

This is a list of things we would like to save as meta-data for medical publications. Suggestions for mappings to Wikidata are welcome (just add them in the same row). You can also add more suggestions for meta-data.--Tobias1984 (talk) 15:23, 4 October 2015 (UTC)

* Addresses
* Autobiography
* Bibliography
* Biography
* Books and Documents
* Case Reports
* Classical Article
* Clinical Conference
* Clinical Trial
* Clinical Trial, Phase I
* Clinical Trial, Phase II
* Clinical Trial, Phase III
* Clinical Trial, Phase IV
* Comment
* Comparative Study
* Congresses
* Consensus Development Conference
* Consensus Development Conference, NIH
* Controlled Clinical Trial
* Corrected and Republished Article
* Dataset
* Dictionary
* Directory
* Duplicate Publication
* Editorial
* Electronic Supplementary Materials
* English Abstract
* Evaluation Studies
* Festschrift
* Government Publications
* Guideline
* Historical Article
* Interactive Tutorial
* Interview
* Introductory Journal Article
* Journal Article
* Lectures
* Legal Cases
* Legislation
* Letter
* Meta-Analysis
* Multicenter Study
* News
* Newspaper Article
* Observational Study
* Overall
* Patient Education Handout
* Periodical Index
* Personal Narratives
* Portraits
* Practice Guideline
* Pragmatic Clinical Trial
* Published Erratum
* Randomized Controlled Trial
* Research Support, American Recovery and Reinvestment Act
* Research Support, N.I.H., Extramural
* Research Support, N.I.H., Intramural
* Research Support, Non-U.S. Gov't
* Research Support, U.S. Gov't, Non-P.H.S.
* Research Support, U.S. Gov't, P.H.S.
* Research Support, U.S. Government
* Retracted Publication
* Retraction of Publication
* Review
* Scientific Integrity Review
* Systematic Reviews
* Technical Report
* Twin Study
* Validation Studies
* Video-Audio Media
* Webcasts

Drug prices

I did some tests on test.wikidata to see how we could store information about drug prices. I think that we should create separate items for drug packagings. For the example of Isoniazid we would have this structure:

  • Item for the pharmaceutical substance (the molecule, or mixture)
  • Item for 100 mg tablets
  • Item for 100 mg bottles
  • I am not sure whether separate items for different manufacturers would be a good idea.
  • Item for 500 mg tablets
  • ...

An example for a packaging is here: https://test.wikidata.org/wiki/Q1674

The statements include the defined daily dose and the dosage (amount of active ingredient). The property price then takes statements qualified with year, bulk-packaging size, and the country or region where the price was quoted. For bottles or injections we would need another qualifier for the volume of the liquid.
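As a rough sketch of the model described above (one item per packaging, with price statements qualified by year, bulk package size, and region), the structure could look like this in Python. All class and field names are hypothetical and chosen for this example; they are not Wikidata property names, and the numbers are placeholders, not real price quotes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class PriceStatement:
    """One price claim plus the qualifiers proposed in the discussion."""
    amount: float                      # quoted price for the whole bulk package
    currency: str
    year: int                          # qualifier: year of the quote
    package_size: int                  # qualifier: units per bulk package
    region: str                        # qualifier: country or region of the quote
    volume_ml: Optional[float] = None  # extra qualifier for bottles or injections


@dataclass
class PackagingItem:
    """One item per packaging, e.g. 100 mg tablets of a substance."""
    label: str
    substance: str                     # would link to the substance item
    dosage_mg: float                   # amount of active ingredient per unit
    defined_daily_dose_mg: Optional[float] = None
    prices: List[PriceStatement] = field(default_factory=list)

    def prices_for(self, region: str, year: int) -> List[PriceStatement]:
        """Return the price statements matching a region and a year."""
        return [p for p in self.prices if p.region == region and p.year == year]

    @staticmethod
    def unit_price(statement: PriceStatement) -> float:
        """Price of a single unit, derived from the bulk package price."""
        return statement.amount / statement.package_size


# Placeholder values only, for illustration.
tablets = PackagingItem(label="Isoniazid 100 mg tablets", substance="Isoniazid", dosage_mg=100.0)
tablets.prices.append(
    PriceStatement(amount=10.0, currency="USD", year=2014, package_size=1000, region="Example region")
)
matching = tablets.prices_for("Example region", 2014)
```

Keeping the qualifiers on each price statement, instead of on the packaging item, is what would allow several quotes per packaging across years and regions.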

We should spend some time thinking about how to model drug prices. I would also still like to include descriptions of tablets, but those might depend on the manufacturer, and that means creating subitems for the dosage items listed above. --Tobias1984 (talk) 09:04, 5 October 2015 (UTC)

@Tobias1984: I created en:WP:Prices to collect discussion on this. If we had the kind of data you describe, then I would like to see it in Wikidata. Blue Rasberry (talk) 17:08, 5 October 2015 (UTC)
@Bluerasberry: Thanks for gathering all those discussions and for the very eloquent summary of the problem. User:Doc James is interested in having this data on Wikidata. There would be a lot of potential for querying, time-series data and putting the data on maps (e.g. https://maps.wikimedia.org). The source would be the reports linked on this page (http://erc.msh.org/mainpage.cfm?file=1.0.htm&module=DMP&language=English). I don't think WP:PRIMARY is a problem in this case, because the reports are one step removed from the pharmaceutical companies. Let's see if we can get this through the review: Wikidata:Property_proposal/Generic#price. --Tobias1984 (talk) 17:45, 5 October 2015 (UTC)
I have emailed the ERC to see if they will release it under an open license. Doc James (talk · contribs · email) (if I write on your page reply on mine) 13:13, 6 October 2015 (UTC)

Item needing checking thread

related items

I have created a sub-page for cleaning up items that seem to be related to the same subject. Ske (talk) 13:41, 28 October 2015 (UTC):

/related items

Wikimania 2016

Only this week left for comments: Wikidata:Wikimania 2016 (Thank you for translating this message). --Tobias1984 (talk) 11:58, 25 November 2015 (UTC)

How significant is significant?

Regarding significant drug interaction (P769), what is the level of significance required for a drug interaction to be considered significant? That the interaction would be significantly hazardous for the patient's health, or that it would just lead to side effects? I ask because Fluvoxamine (Q409236) is known to interact with Caffeine (Q60235) by slowing the rate at which it's metabolized. This interaction isn't recorded in Wikidata, even though I am pretty sure you could find it in the appropriate medical literature (it's by no means an obscure interaction). Was it excluded because the relatively benign effects of the interaction do not rise to the level of "significant," or is it just missing from Wikidata? Harej (talk) 05:26, 5 December 2015 (UTC)
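Whether such a statement is present could be checked with a SPARQL ASK query. The sketch below only builds the query text; the function name `build_interaction_ask_query` is hypothetical, and it assumes the `wd:` and `wdt:` prefixes that the Wikidata Query Service predefines. It does not send anything over the network.

```python
def build_interaction_ask_query(drug_qid: str, other_qid: str, property_id: str = "P769") -> str:
    """Build an ASK query testing whether drug_qid has a property_id statement pointing at other_qid."""
    # Basic shape check so that arbitrary text cannot end up inside the query.
    for ident, prefix in ((drug_qid, "Q"), (other_qid, "Q"), (property_id, "P")):
        if not (ident.startswith(prefix) and ident[1:].isdigit()):
            raise ValueError(f"not a valid {prefix}-identifier: {ident!r}")
    return f"ASK {{ wd:{drug_qid} wdt:{property_id} wd:{other_qid} . }}"


# Fluvoxamine (Q409236) and Caffeine (Q60235), as named in the question above.
query = build_interaction_ask_query("Q409236", "Q60235")
```

Because an interaction may have been entered on either item, running the query a second time with the two identifiers swapped would cover both directions.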

Merge

How do I merge Q21797739 and Q1499629? Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:46, 21 December 2015 (UTC)

@Doc James: Do they both have the same ICD code? The merge button was moved to the 'More' tab next to the search field. --Tobias1984 (talk) 18:25, 26 April 2016 (UTC)
Will need to look. Travelling right now. Doc James (talk · contribs · email) (if I write on your page reply on mine) 18:26, 26 April 2016 (UTC)

Storing ICD9 and 10 codes from EN medical templates

I've made a proposal for a bot to do this here: [7], with a view to storing them here and removing them ultimately from template titles, which only serves to make them less readable to readers. The current proposal is to store the related ICD codes in Wikidata; once this is done, a separate proposal will be made on Wikipedia itself. Please comment. --LT910001 (talk) 22:15, 10 February 2016 (UTC)

Please comment on my Individual Engagement Grant talk page about my proposal for Guided Checklist for Health Topic Experts

Hello everyone,

I created a new Individual Engagement grant to try and fix a problem. m:Grants:IdeaLab/Effective Engagement with Health Topic Experts using Guided Checklists

From my work with Cochrane as a Wikipedian in Residence and my observations of other attempts to engage health topic experts in editing, I've come to the conclusion that the quality of the contributions of new health topic expert recruits does not match their level of expertise or the effort that we as Wikimedians put into training new contributors. So, I decided to create a new project to develop a Guided Checklist that would assist a health topic expert in assessing the quality of health articles on Wikipedia, and then guide their contributions toward making edits to correct the lack of quality.

My individual engagement grant would involve interviewing health topic experts and active medical editors, as well as a community consultation on Wikipedia English WikiProject Med. Additionally, because health topics are interrelated on Wikipedias, Commons, and Wikidata, I'm inviting people who are active in WikiProject Medicine on Wikidata to comment and participate. Please add yourself as a volunteer if you would like to participate. Or leave suggestions on the talk page on Meta. Or endorse if you support the idea. Going forward, I'll keep this project updated on the proposal. Sydney Poore/FloNight♥♥♥♥ 23:25, 15 April 2016 (UTC)