Property talk:P6694

From Wikidata

Documentation

MeSH concept ID
identifier of a Medical Subject Headings concept
Represents: Medical Subject Headings (Q199897)
Data type: External identifier
Domain: medical subjects, diseases... (note: this should be moved to the property statements)
Allowed values: ^M\w{7}(\w{2}|)$ (M followed by 7 or 9 alphanumeric characters)
Example: amoxapine (Q58356) → M0001008 (RDF)
         acetaminophen (Q57055) → M0000115 (RDF)
         neoplasm (Q1216998) → M0014585 (RDF)
Format and edit filter validation
Source: https://id.nlm.nih.gov/mesh/
Formatter URL: http://id.nlm.nih.gov/mesh/$1
Robot and gadget jobs: MeSH RDF is available for download from NLM
See also: MeSH term ID (P6680), MeSH code (P672), MedlinePlus ID (P604), MeSH descriptor ID (P486)
Lists
  • <items with the most statements of this property>
  • Count of items by number of statements (chart)
  • Count of items by number of sitelinks (chart)
  • Items with the most identifier properties
  • Items with no other external identifier
  • Items with no other statements
  • <most recently created items>
  • Items with novalue claims
  • Items with unknown value claims
  • Usage history
  • Chart by item creation date
  • Database reports/Constraint violations/P6694
  • <random list>
  • Proposal discussion
    Current uses: 677
    Search for values
    Format “M\w{7}(\w{2}|)”: value must be formatted using this pattern (PCRE syntax). (Help)
    Exceptions are possible as rare values may exist.
    List of this constraint violations: Database reports/Constraint violations/P6694#Format, SPARQL, SPARQL (new)
    Scope is as references (Q54828450), as main value (Q54828448): the property must be used by specified way only (Help)
    List of this constraint violations: Database reports/Constraint violations/P6694#scope, hourly updated report, SPARQL, SPARQL (new)
    Single value: this property generally contains a single value. (Help)
    Exceptions are possible as rare values may exist.
    List of this constraint violations: Database reports/Constraint violations/P6694#Single value, SPARQL, SPARQL (new)
    Conflicts with “instance of (P31)”: Wikimedia disambiguation page (Q4167410), Wikimedia category (Q4167836), human (Q5), airport (Q1248784): this property must not be used with the listed properties and values. (Help)
    List of this constraint violations: Database reports/Constraint violations/P6694#Conflicts with P31, hourly updated report, SPARQL, SPARQL (new)

    regexp should use digits not alphanumerics

    @DePiep, Eihel: https://id.nlm.nih.gov/mesh/describe?uri=http%3A%2F%2Fid.nlm.nih.gov%2Fmesh%2Fvocab%23identifier says "8 or 10 alphanumeric starting with the letter M". But only digits ever follow the "M" (please find a counter-example if you disagree). I think their description says "alphanumeric" because, once you count the "M", the identifier as a whole is alphanumeric. The same holds for Descriptor (D/C) and Term (T) identifiers; have you changed those? --Vladimir Alexiev (talk) 09:58, 20 April 2019 (UTC)

    I'd support this. Unfortunately I could not easily find confirmation at the NIH site (indeed, it only says "alphanumeric beginning with 'M'"). - DePiep (talk) 11:18, 20 April 2019 (UTC)
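    To make the difference concrete, here is a minimal Python sketch comparing the pattern currently listed under "Allowed values" with the digits-only alternative discussed above. It is only an illustration: Python's `re` module stands in for the PCRE engine used by the constraint system, the name `check` is hypothetical, and every sample value except M0001008 is invented for the test.

```python
import re

# Pattern currently listed under "Allowed values" for this property.
# re.ASCII keeps \w and \d limited to ASCII, which is an assumption
# about how the real constraint engine behaves.
CURRENT = re.compile(r"^M\w{7}(\w{2}|)$", re.ASCII)

# Digits-only alternative discussed in this thread.
DIGITS_ONLY = re.compile(r"^M\d{7}(\d{2}|)$", re.ASCII)

samples = [
    "M0001008",    # example value from the documentation box
    "M000100812",  # invented: M followed by 9 digits
    "MABCDEFG",    # invented: letters after the M
    "M00_1008",    # invented: \w also admits an underscore
    "M001008",     # invented: one character too short
]


def check(value):
    """Return (matches_current, matches_digits_only) for one value."""
    return (bool(CURRENT.match(value)), bool(DIGITS_ONLY.match(value)))


results = {value: check(value) for value in samples}

for value, (cur, dig) in results.items():
    print(f"{value:12} current={cur!s:5} digits_only={dig}")
```

    The two patterns agree on all-digit values and differ only where letters or an underscore follow the "M", so tightening the regex would flag no existing value unless such a counter-example actually exists.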

    Early and different creation; definition checks

    So the proposal was closed very early with a rather unpleasant note [1] by Eihel.

    Some questions remain:

    • As far as I can deduce, there are three identifiers, distinguished by their initial letter (C/D, M, T), each with its own property:
    C/D: MeSH descriptor ID (P486)
    M: MeSH concept ID (P6694)
    T: MeSH term ID (P6680)
    Unfortunately, going by the source, the initial letter does not match the property name neatly and is even confusing. We should prevent confusion, for example by cross-referencing: list the other two properties as "see also" on each one.
    • The source defines these identifiers:
    A property of Descriptors, Qualifiers, SupplementaryConceptRecords, Concepts and Terms. 
    Descriptor identifier is a 7 or 10 alphanumeric starting with the letter D. 
    Qualifier identifier is a 7 or 10 alphanumeric starting with the letter Q. 
    SupplementaryConceptRecord identifier is a 7 or 10 alphanumeric starting with the letter C. 
    Concept identifier is an 8 or 10 alphanumeric starting with the letter M. 
    Term identifier is a 7 or 10 alphanumeric starting with the letter T. 
    The 10 alphanumeric format was implemented for new identifiers created on or after about May 19, 2014.
    So Wikidata:
    merges C and D (Descriptor, SupplementaryConceptRecord), and
    Q (Qualifier) is not a property.
    (Just noting, I have no opinion on this e.g. re property correctness & need).
    • A moot note: I proposed "If the pattern is "... 6 or 9 digits", shouldn't the regex be ^M\d{6}(\d{3}|)$?" [2]. This was not implemented as such [3]. In the created property, the regex is different again. (moot now)
    (current regex → proposed regex)
    C/D: MeSH descriptor ID (P486) REGEX: [CD]\d{9}|[CD]\d{6} → ^[CD]\d{6}(\d{3}|)$
    M: MeSH concept ID (P6694) REGEX: ^M[A-Za-z0-9]{7}([A-Za-z0-9]{2}|)$ → ^M\d{7}(\d{2}|)$
    T: MeSH term ID (P6680) REGEX: ^T[A-Za-z0-9]{6}([A-Za-z0-9]{3}|)$ → ^T\d{6}(\d{3}|)$


    -DePiep (talk) 11:51, 20 April 2019 (UTC)
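    The three digits-only patterns listed above can be exercised together in a short Python sketch. It is only an illustration under stated assumptions: Python's `re` stands in for PCRE, the names `PROPOSED` and `classify` are hypothetical, and the sample IDs in the usage note (other than M0001008) are invented to show the formats.

```python
import re

# Digits-only patterns from the list above, keyed by Wikidata property.
# re.ASCII restricts \d to 0-9, an assumption about the intended behaviour.
PROPOSED = {
    "P486 (MeSH descriptor ID)": re.compile(r"^[CD]\d{6}(\d{3}|)$", re.ASCII),
    "P6694 (MeSH concept ID)": re.compile(r"^M\d{7}(\d{2}|)$", re.ASCII),
    "P6680 (MeSH term ID)": re.compile(r"^T\d{6}(\d{3}|)$", re.ASCII),
}


def classify(identifier):
    """Return the property whose pattern matches the identifier, or None."""
    for prop, pattern in PROPOSED.items():
        if pattern.match(identifier):
            return prop
    return None
```

    With these patterns, `classify("M0001008")` returns the P6694 entry, an invented `"D000001"` or `"C000000001"` falls under P486, an invented `"T000001"` under P6680, and a Qualifier-style `"Q000001"` returns `None`, since no property covers Q identifiers here.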

    Hey @DePiep: you sound a bit bitter, cheer up man! Prop metadata can be corrected after creation, don't worry about it. Your points are not moot. Some comments on them:

    • Your proposed regexes are right.
    • could not easily find confirmation at the NIH site: but all IDs I've ever seen are numeric, so let's be reasonable.
    • WD merges C and D: These are merged by MeSH as the class Descriptor: C is Supplementary, i.e. chemicals; D is TopicalDescriptor or GeographicDescriptor. The important point is that each relevant WD entity will have only one MeSH Descriptor external ID, so we've done the right thing.
    • Qualifier is not created because the discussion is ongoing: Wikidata:Property proposal/Mesh Qualifier ID, please contribute --Vladimir Alexiev (talk) 13:12, 20 April 2019 (UTC)
    thx ;-). So NIH has merged C and D? Is our (English) label still correct? I am not familiar with this topic, so I'll leave it here. -DePiep (talk) 13:18, 20 April 2019 (UTC)
    Hello @DePiep: [4] I tried to be kind, including a hint of humor and a polite phrase, but apparently it did not come across. Since Deltabot was introduced, a proposal is archived once it has gone 3 days without changes, and a proposal is closed when the property is created. We can still give opinions, of course, but voting for or against creation is obsolete at that point (Tris T7 and Leiem). From then on the discussion is no longer about the proposal but about the property, hence the debate on this page, QED. If the proposal is constantly modified, it will never be archived. To get back to the regex, I tried to include POSIX code for testing. Having searched all D descriptors (here), it is true that there are only digits after the first letter. But based on the explanations, the regex must be:
    Best regards --Eihel (talk) 12:01, 13 May 2019 (UTC)