Wikidata:Requests for comment/Unifying GO activities and enzyme articles

An editor has requested the community to provide input on "Unifying GO activities and enzyme articles" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.

If you have an opinion regarding this issue, feel free to comment below. Thank you!

Most (but not all) pairwise duplicates on EC enzyme number (P591) arise because two different sources created separate items for the same enzyme activity: one comes from the various Wikipedia bots that generate flat articles (such as en:Thymidine-triphosphatase, EC 3.6.1.39, Thymidine-triphosphatase (Q7799624)), and the other from a huge import from Gene Ontology (GO) at some point in the history of Wikidata (thymidine-triphosphatase activity (Q22320779)). Both items describe the same thing and should be merged across the site. In fake bot-speak that would be:

For items with Gene Ontology ID (P686) and P591, and known to have a P591 duplicate:
If the P591 value ends with .-, leave it for now.
If the item is a subclass of an item with the same P591, leave it.
Find the duplicate with instance of (P31) enzyme (Q8047) and merge with it.
If the merge fails, pop an instance of (P31) Wikimedia duplicated page (Q17362920) on it.
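
The rules above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the Item class, plan_action, run and the try_merge callback are hypothetical names working on in-memory data, not any real bot framework or Wikidata API, and P279 is assumed to be the property meant by "subclass of".

```python
from dataclasses import dataclass, field

ENZYME = "Q8047"          # enzyme
DUPLICATE = "Q17362920"   # Wikimedia duplicated page


@dataclass
class Item:
    """Hypothetical in-memory stand-in for a Wikidata item."""
    qid: str
    ec_numbers: set = field(default_factory=set)    # P591 values
    go_ids: set = field(default_factory=set)        # P686 values
    instance_of: set = field(default_factory=set)   # P31 targets (QIDs)
    subclass_of: set = field(default_factory=set)   # P279 targets (QIDs), assumed


def plan_action(item, items_by_qid):
    """Return ("skip", reason), ("merge", target_qid) or ("no-target", None)."""
    if not item.go_ids or not item.ec_numbers:
        return ("skip", "not a GO item with an EC number")
    # Rule 1: partial EC numbers such as "3.6.1.-" are left alone for now.
    if any(ec.endswith(".-") for ec in item.ec_numbers):
        return ("skip", "partial EC number")
    # Rule 2: leave items that are a subclass of an item sharing an EC number.
    for parent_qid in item.subclass_of:
        parent = items_by_qid.get(parent_qid)
        if parent is not None and parent.ec_numbers & item.ec_numbers:
            return ("skip", "subclass of item with same EC number")
    # Rule 3: look for the duplicate that is an instance of enzyme.
    for other in items_by_qid.values():
        if (other.qid != item.qid
                and ENZYME in other.instance_of
                and other.ec_numbers & item.ec_numbers):
            return ("merge", other.qid)
    return ("no-target", None)


def run(item, items_by_qid, try_merge):
    """Apply the plan; try_merge(source_qid, target_qid) is a hypothetical
    callback that returns True when the merge succeeded."""
    action, target = plan_action(item, items_by_qid)
    if action == "merge" and not try_merge(item.qid, target):
        # Rule 4: the merge failed, so flag the item as a duplicate instead.
        item.instance_of.add(DUPLICATE)
        return "flagged"
    return action
```

With the thymidine-triphosphatase pair, an Item for Q22320779 carrying a GO ID and EC 3.6.1.39, next to an Item for Q7799624 with the same EC number and instance of enzyme, would yield ("merge", "Q7799624"). Keeping the decision (plan_action) separate from the side effect (run) means the plan can be reviewed as a dry run before any bot edits are made.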

Before moving on to the bot request, however, I figured that it is a good idea to RFC this since this is expected to be a huge move. And bot requests need justification anyway. --Artoria2e5 (talk) 15:01, 12 June 2019 (UTC)

Anandhisuresh
Tobias1984
Doc James
User:Bluerasberry
Wouterstomp
Gambo7
Daniel Mietchen
Andrew Su
Peter.C
Klortho
Remember
Matthiassamwald
Projekt ANA
Andrux
Pavel Dušek
Was a bee
Alepfu
FloNight
Genewiki123
Emw
emitraka
Lschriml
Mvolz
Franciaio
User:Lucas559
User:Jtuom
Chris Mungall
ChristianKl
Gstupp
Geoide
Sintakso
علاء
Dr. Abhijeet Safai
Adert
CFCF
Jtuom
Lucas559
Drchriswilliams
Okkn
CAPTAIN RAJU
LeadSongDog
Ozzie10aaaa
Sami Mlouhi
Marsupium
Netha Hussain
Abhijeet Safai
ShelleyAdams
Fractaler
Seppi333
Shani Evenstein
Csisc
linuxo
Arash
Notified participants of WikiProject Medicine. ChristianKl 07:43, 19 June 2019 (UTC)

Comments

Support. The same problem already exists for Wikidata items about drugs and drug classes. I am working on a project to adjust the Wikidata biomedical taxonomy. All contributors are invited to join the project. --Csisc (talk) 13:23, 19 June 2019 (UTC)
  • Comment, Support. Conceptually I'm very supportive of this plan, but I would like to see what an example merged item would look like in practice. I just want to make sure I understand how this would impact User:ProteinBoxBot, which I help manage. @Artoria2e5: can you perform one merge as an example? (I'd do it myself, but I'm not confident I wouldn't mess it up or do it in a different way than you envision.) Best, Andrew Su (talk) 18:59, 20 June 2019 (UTC)
    • I have done... enough of them to make me feel annoyed about this issue. D-alanine-D-alanine ligase activity (Q21199314) is an example, although admittedly I am not yet sure about the ontology/subclass issue. --Artoria2e5 (talk) 18:52, 22 June 2019 (UTC)
      • Technically, what is classified is usually concrete objects of the real world: for example, my car or my stomach are examples of the « car » or « stomach » concepts. « My stomach » is then unambiguously an instance of « stomach », and could never be a subclass. The « instance of » statements should be read as « is an example of », and « subclass of » statements as « are examples of ». For example, « human stomach(s) » are examples of « stomach(s) », hence « subclass of » is the right property. Here, enzymes are examples of proteins, so the right property is « subclass of ». There is an exception to this rule: the so-called « meta-classes ». I tried to summarize these in User:TomT0m/Classification. See also en:Is-a (on the ambiguity of the « is a » relationship, which is why there is both instance of and subclass of) and/or en:type-token distinction for a philosophical perspective, and en:metaclass (semantic web) for metaclasses. author TomT0m / talk page 21:09, 22 June 2019 (UTC)
      • Thanks for the example, Artoria2e5. Yup, that looks fine from the ProteinBoxBot perspective. I think the reference to "imported from Wikipedia" is not necessary when we have a more authoritative source already there, but not a big deal either. Best, Andrew Su (talk) 19:10, 24 June 2019 (UTC)