Wikidata:Property proposal/UCUM code

UCUM code

Originally proposed at Wikidata:Property proposal/Natural science

   Done: UCUM code (P7825) (Talk and documentation)
Description: Case-sensitive code from the Unified Code for Units of Measure specification to identify a unit of measurement
Represents: Unified Code for Units of Measure (Q2494286)
Data type: External identifier
Domain: unit of measurement (Q47574)
Example 1: metre (Q11573) → "m"
Example 2: ångström (Q81454) → "Ao"
Example 3: metre per second squared (Q1051665) → "m/s2"
Example 4: cubic yard (Q2165290) → "[cyd_i]"
Source: UCUM specification (http://unitsofmeasure.org/ucum.html) and UCUM specification in XML (http://unitsofmeasure.org/ucum-essence.xml)
Planned use: Link as many units from the UCUM specification into Wikidata as possible, and identify units missing from Wikidata
Number of IDs in source: 334
Expected completeness: eventually complete (Q21873974)
See also: https://www.wikidata.org/wiki/Wikidata:Project_chat#Link_UCUM_unit_symbols

Motivation

It would be useful to link each item for a unit of measurement with the corresponding unit symbol in the UCUM specification, which defines a set of codes to unambiguously represent all unit symbols currently used internationally in science, engineering, and business. It is focused on machine-to-machine communication, so it's a good candidate to be linked within Wikidata :-)

The UCUM specification defines all codes, and there is also a UCUM specification in XML. Several attributes are defined for each unit of measurement, such as:

  • full unit name (e.g. "meter")
  • quantity kind (e.g. "length")
  • different (clean 7-bit US-ASCII) symbol codes:
    • case-sensitive code (e.g. m for metre, Ao for Ångström, or [ft_i] for foot)
    • case-insensitive code (e.g. M for metre, AO for Ångström, or [FT_I] for foot), for machines that cannot work with lower-case letters
    • human-readable printable unit (e.g. "m" for metre, "Å" for Ångström, or "ft" for foot)
  • definition factor (e.g. "0.1 nm" for Ångström)
  • whether SI prefixes can be used with this unit (e.g. "km" means 1000 meters, but "kmin" cannot be used to indicate 1000 minutes)

Therefore, if each Wikidata item about a unit of measurement is linked with its UCUM code, it would be possible to automatically extract or check information from the UCUM specification, such as the definition factor for unit conversions. It would also make it possible to detect missing items in Wikidata. —surueña 11:03, 28 December 2019 (UTC)
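
As a rough illustration of the extraction and missing-item check described above, here is a minimal Python sketch. It parses a small inline sample rather than the real file. The element and attribute names (`base-unit`, `unit`, `Code`, `CODE`, `isMetric`, `printSymbol`, `property`, `value`) are assumptions modelled on ucum-essence.xml and should be verified against the actual file, which may also declare an XML namespace that this sketch ignores. The helper names `parse_units` and `missing_codes` and the set of codes "already in Wikidata" are hypothetical.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample; the structure is ASSUMED to resemble
# ucum-essence.xml and is not copied from the real file.
SAMPLE_XML = """<root>
  <base-unit Code="m" CODE="M" dim="L">
    <name>meter</name>
    <printSymbol>m</printSymbol>
    <property>length</property>
  </base-unit>
  <unit Code="Ao" CODE="AO" isMetric="no">
    <name>Ångström</name>
    <printSymbol>Å</printSymbol>
    <property>length</property>
    <value Unit="nm" UNIT="NM" value="0.1">0.1</value>
  </unit>
  <unit Code="[ft_i]" CODE="[FT_I]" isMetric="no">
    <name>foot</name>
    <printSymbol>ft</printSymbol>
    <property>length</property>
    <value Unit="[in_i]" UNIT="[IN_I]" value="12">12</value>
  </unit>
</root>"""


def parse_units(xml_text):
    """Return a dict keyed by case-sensitive UCUM code (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    units = {}
    for elem in root:
        if elem.tag not in ("base-unit", "unit"):
            continue
        value = elem.find("value")  # absent for base units in this sample
        units[elem.get("Code")] = {
            "name": elem.findtext("name"),
            "ci_code": elem.get("CODE"),
            "print_symbol": elem.findtext("printSymbol"),
            "quantity_kind": elem.findtext("property"),
            # (factor, unit) pair kept as strings, or None if not defined
            "factor": (value.get("value"), value.get("Unit"))
            if value is not None else None,
            # ASSUMPTION: a missing isMetric attribute means prefixes are allowed
            "is_metric": elem.get("isMetric", "yes") == "yes",
        }
    return units


def missing_codes(ucum_units, wikidata_codes):
    """UCUM codes that no Wikidata item carries yet (hypothetical helper)."""
    return sorted(set(ucum_units) - set(wikidata_codes))


units = parse_units(SAMPLE_XML)
# Pretend only metre and ångström have a UCUM code statement so far.
print(missing_codes(units, {"m", "Ao"}))
print(units["Ao"]["factor"])
```

Keying the result by the case-sensitive code mirrors the proposal: that code is the single identifier stored on the item, and the other fields are looked up from the specification when needed.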

Discussion

  • Support David (talk) 05:29, 29 December 2019 (UTC)
  • Question: According to the proposal, the UCUM specification for a given unit consists of seven different pieces of information. How are those seven pieces of information going to fit in a single Wikidata property? --Pere prlpz (talk) 17:14, 29 December 2019 (UTC)
    • @Pere prlpz: The goal is to use this new property "UCUM code" to indicate just one of the identifiers, specifically the case-sensitive code (e.g. Ao for Ångström). The other fields should already be present in Wikidata: the unit name is the name of the item; the quantity kind is specified by measured physical quantity (P111); the human-readable printable unit should be unit symbol (P5061) for English; and the definition factor should be in conversion to SI unit (P2370) or conversion to standard unit (P2442). As far as I know, the only missing information may be whether SI prefixes can be used with each unit, but in my opinion this should be specified with separate properties. In any case, the major goals of specifying one of the UCUM codes with the proposed property are, first, to detect units missing from Wikidata or UCUM and, second, to automatically retrieve the associated fields for a unit from the UCUM specification in XML, so that a script can check that the related Wikidata properties listed above are correct, for example by checking the definition factor against the value in conversion to SI unit (P2370). —surueña 20:43, 29 December 2019 (UTC)
  • Support I've studied the linked standard a bit. This looks like a useful identifier; the given motivation is convincing. Similar in spirit to QUDT unit ID (P2968) and Wolfram Language unit code (P7007). Toni 001 (talk) 19:33, 29 December 2019 (UTC)
  • Support Vahurzpu (talk) 16:10, 1 January 2020 (UTC)
  • @Suruena: I tried to create this property but an automatic check failed: description Code to identify a specific unit of measurement (Q47574) as defined in the Unified Code for Units of Measure (Q2494286) specification (case-sensitive code) contains wiki markup. Please fix your proposal and your property will be created shortly. This is an automated message but do not hesitate to ping me if you need any help. − Pintoch (talk) 11:49, 14 January 2020 (UTC)
    • @Pintoch: Thanks, I've modified the proposal accordingly. Cheers —surueña 12:13, 14 January 2020 (UTC)

@ديفيد عادل وهبة خليل 2, Toni 001, Pere prlpz, Vahurzpu, Suruena: ✓ Done: UCUM code (P7825). − Pintoch (talk) 14:19, 14 January 2020 (UTC)