Wikidata:Property proposal/UKÄ standard classification of Swedish science topics 2016

UKÄ standard classification of Swedish science topics 2016

Originally proposed at Wikidata:Property proposal/Natural science

Description: Swedish Higher Education Authority standard for classification of scientific articles from scientists in the higher education system of Sweden
Data type: External identifier
Example 1: biology (Q420) → 106
Example 2: structural biology (Q908902) → 10601
Example 3: ethology (Q7155) → 10613
Example 4: chemistry (Q2329) → 104
Example 5: physics (Q413) → 103 (named as Physical Sciences), 10399 (named as Other Physics Topics)
Number of IDs in source: ~250
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://bibliometri.swepub.kb.se/bibliometrics?subject=$1
See also: Wikidata:Property proposal/UKÄ standard classification of Swedish science topics 2011, ANZSRC 2020 FoR ID (P8529), ANZSRC 2008 FoR ID (P5922), All-Science Journal Classification Codes (P10203)
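
A formatter URL is expanded by substituting the identifier value for `$1`. A minimal sketch of that substitution, using the formatter URL and one of the example IDs listed above; the helper name `format_url` is hypothetical:

```python
from urllib.parse import quote

FORMATTER_URL = "https://bibliometri.swepub.kb.se/bibliometrics?subject=$1"


def format_url(identifier: str, formatter: str = FORMATTER_URL) -> str:
    """Substitute the identifier for the $1 placeholder (hypothetical helper)."""
    # Percent-encode the identifier so it is safe inside a query string.
    return formatter.replace("$1", quote(identifier, safe=""))


if __name__ == "__main__":
    print(format_url("10601"))
```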

Motivation

This is valuable because it is used as an identifier in, e.g., Swepub data from the National Library of Sweden, and perhaps elsewhere. --So9q (talk) 23:43, 2 January 2022 (UTC)

Discussion

Yes and no. I have only seen it used at https://bibliometri.swepub.kb.se/. They have a 1.5 GB dump file that I just fetched into PAWS, and I'm about to extract all the topics and UKÄ codes into pandas and do some statistics on them :). See https://public.paws.wmcloud.org/User:So9q/WikidataMLSuggester/extract-swepub-topics.ipynb, where I'm working as we speak. So there is no formatter URL, but the data seems well formatted. Example:
{
  "@id": "https://id.kb.se/term/uka/30205",
  "@type": "Topic",
  "code": "30205",
  "prefLabel": "Endocrinology and Diabetes",
  "language": {
    "@type": "Language",
    "@id": "https://id.kb.se/language/eng",
    "code": "eng"
  },
  "inScheme": {
    "@id": "https://id.kb.se/term/uka/",
    "@type": "ConceptScheme",
    "code": "uka.se"
  },
  "broader": {
    "prefLabel": "Clinical Medicine",
    "broader": {
      "prefLabel": "Medical and Health Sciences"
    }
  }
}
--So9q (talk) 08:48, 3 January 2022 (UTC)
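
As a rough illustration of the extraction step described above, the following sketch collects UKÄ codes and labels from topic records shaped like the example. It assumes each topic is a dict with `inScheme`, `code`, `prefLabel` and `language` keys as shown; the function name `uka_topics` and the list-of-dicts input are assumptions, not the actual layout of the Swepub dump.

```python
from collections import Counter

UKA_SCHEME = "https://id.kb.se/term/uka/"


def uka_topics(topics):
    """Yield (code, language, label) for topics that belong to the UKÄ scheme."""
    for topic in topics:
        scheme = topic.get("inScheme", {}).get("@id")
        if scheme != UKA_SCHEME:
            continue  # skip topics from other schemes or without a scheme
        lang = topic.get("language", {}).get("code")
        yield topic.get("code"), lang, topic.get("prefLabel")


sample = [
    {
        "@id": "https://id.kb.se/term/uka/30205",
        "@type": "Topic",
        "code": "30205",
        "prefLabel": "Endocrinology and Diabetes",
        "language": {"@type": "Language", "@id": "https://id.kb.se/language/eng", "code": "eng"},
        "inScheme": {"@id": "https://id.kb.se/term/uka/", "@type": "ConceptScheme", "code": "uka.se"},
    },
    # Hypothetical non-UKÄ topic, included only to show that it is skipped.
    {"@type": "Topic", "prefLabel": "Diabetes"},
]

rows = list(uka_topics(sample))
# Count how often each code occurs, as a starting point for statistics.
code_counts = Counter(code for code, _lang, _label in rows)
```

The resulting tuples could be loaded into a pandas DataFrame for the statistics mentioned above; the standard library is used here only to keep the sketch self-contained.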
Isn't it better to ask kb.se to publish "uka" as they do for some other classifications, e.g. https://id.kb.se/term/sao and one of its terms, https://libris.kb.se/rp354vn9510f7x9? That way we can have a proper formatter URL with both HTML and RDF representations. Vladimir Alexiev (talk) 10:13, 3 January 2022 (UTC)
Wrote to libris@kb.se: Hi! We want to connect UKA to Wikidata, see https://www.wikidata.org/wiki/Wikidata:Property_proposal/UKÄ_standard_classification_of_Swedish_science_topics_2016 .
For that it would be best if you published it as LOD, with individual pages for the concept scheme and for each concept,
just as you do for some other classifications, e.g. https://id.kb.se/term/sao and one of its terms, https://libris.kb.se/rp354vn9510f7x9 .
A Wikidata user is currently processing https://bibliometri.swepub.kb.se/ and extracting UKA terms, and we see that you have it represented in RDF (JSON-LD),
so we're just asking whether you can make per-entity HTML pages and JSON-LD files. Vladimir Alexiev (talk) 12:42, 3 January 2022 (UTC)

Further to libris@kb.se:

I noticed a JSON-LD mistake in e.g. https://bibliometri.swepub.kb.se/api/v1/bibliometrics/publications/oai:DiVA.org:umu-187284 (in the excerpt below, JSON keys are shown without their surrounding quotes):

{
  @id: "https://id.kb.se/term/uka/10302",
  @type: "Topic",
  broader: {
    prefLabel: "Physical Sciences"},
  code: "10302",
  inScheme: {
    @id: "https://id.kb.se/term/uka/",
    @type: "ConceptScheme",
    code: "uka.se"},
  language: {
    @id: "https://id.kb.se/language/eng",
    @type: "Language",
    code: "eng"},
  prefLabel: "Atom and Molecular Physics and Optics"
},
{
  @id: "https://id.kb.se/term/uka/10302",
  @type: "Topic",
  broader: {
    prefLabel: "Fysik"},
  code: "10302",
  inScheme: {
    @id: "https://id.kb.se/term/uka/",
    @type: "ConceptScheme",
    code: "uka.se"},
  language: {
    @id: "https://id.kb.se/language/swe",
    @type: "Language",
    code: "swe"},
  prefLabel: "Atom- och molekylfysik och optik"
},

You make two groups of statements about https://id.kb.se/term/uka/10302. Most of them are duplicates, except the following, which accumulate on the same concept:

  • "language", which means this concept has two languages??
  • "prefLabel", but the language of those labels is not indicated

In Turtle, this looks like:

<https://id.kb.se/term/uka/10302>
        rdf:type                      bf2:Topic ;
        bf2:code                      "10302" ;
        bf2:language                  <https://id.kb.se/language/swe> , <https://id.kb.se/language/eng> ;
        madsrdf:authoritativeLabel    "Atom- och molekylfysik och optik" , "Atom and Molecular Physics and Optics" ;
        madsrdf:hasBroaderAuthority   [ madsrdf:authoritativeLabel "Fysik" ] ;
        madsrdf:hasBroaderAuthority   [ madsrdf:authoritativeLabel "Physical Sciences" ] ;
        madsrdf:isMemberOfMADSScheme  <https://id.kb.se/term/uka/> .

You can use this to convert JSON-LD to Turtle:

curl -s "https://bibliometri.swepub.kb.se/api/v1/bibliometrics/publications/oai:DiVA.org:umu-187284" | riot -syntax jsonld -formatted ttl
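
The accumulation described above can also be illustrated without an RDF toolkit. The sketch below is a simplification, not a JSON-LD processor: it merges objects that share an `@id` and collects the distinct values of each key, which is roughly what happens when both objects are read as statements about one resource. The function name `merge_by_id` is hypothetical, and the two records are trimmed versions of the excerpt.

```python
def merge_by_id(nodes):
    """Merge nodes with the same @id, collecting the distinct values of each key."""
    merged = {}
    for node in nodes:
        target = merged.setdefault(node["@id"], {})
        for key, value in node.items():
            if key == "@id":
                continue
            values = target.setdefault(key, [])
            if value not in values:  # duplicates collapse, differences accumulate
                values.append(value)
    return merged


eng = {
    "@id": "https://id.kb.se/term/uka/10302",
    "@type": "Topic",
    "code": "10302",
    "language": {"@id": "https://id.kb.se/language/eng"},
    "prefLabel": "Atom and Molecular Physics and Optics",
}
swe = {
    "@id": "https://id.kb.se/term/uka/10302",
    "@type": "Topic",
    "code": "10302",
    "language": {"@id": "https://id.kb.se/language/swe"},
    "prefLabel": "Atom- och molekylfysik och optik",
}

topic = merge_by_id([eng, swe])["https://id.kb.se/term/uka/10302"]
```

After merging, `code` and `@type` each hold a single value, while `language` and `prefLabel` each hold two, with nothing tying a label to its language.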

Another problem is that rather than linking to the parent concept, you point to its label:

  madsrdf:hasBroaderAuthority   [ madsrdf:authoritativeLabel "Fysik" ]  

You should replace this with

  madsrdf:hasBroaderAuthority <https://id.kb.se/term/uka/103>  

and optionally include a brief description of that concept.
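
A minimal sketch of deriving that parent link, assuming the parent of a five-digit code is its first three digits (as with 10302 and 103 above) and that three-digit codes in turn sit under a one-digit top level. The second assumption is unverified, and the function names `parent_code` and `broader_link` are hypothetical:

```python
UKA_BASE = "https://id.kb.se/term/uka/"


def parent_code(code: str):
    """Return the assumed parent code, or None for a top-level code."""
    if len(code) == 5:
        return code[:3]
    if len(code) == 3:
        return code[:1]  # assumption: one-digit top level
    return None


def broader_link(code: str):
    """Build a JSON-LD style reference to the parent concept, if any."""
    parent = parent_code(code)
    return {"@id": UKA_BASE + parent} if parent else None


if __name__ == "__main__":
    print(broader_link("10302"))
```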

IMHO, you should fix the JSON-LD to the following:

{
  @id: "https://id.kb.se/term/uka/10302",
  @type: "Topic",
  broader: {@id: "https://id.kb.se/term/uka/103"},
  code: "10302",
  inScheme: {
    @id: "https://id.kb.se/term/uka/",
    @type: "ConceptScheme",
    code: "uka.se"},
  prefLabel: [{@value:"Atom and Molecular Physics and Optics", @language:"en"},
              {@value:"Atom- och molekylfysik och optik",      @language:"sv"}]
},

--Vladimir Alexiev (talk) 15:48, 3 January 2022 (UTC)

@So9q, Egon Willighagen, Ainali, Vladimir Alexiev, ArthurPSmith, MasterRus21thCentury: ✓ Done UKÄ classification of science topics 2016 (P10361) Pamputt (talk) 11:38, 13 February 2022 (UTC)