Wikidata:Property proposal/KBpedia ID
KBpedia ID
Originally proposed at Wikidata:Property proposal/Authority control
Description | Identifier for the KBpedia knowledge graph, which provides consistent mappings across seven large-scale public knowledge bases including Wikidata, and is used to promote data interoperability and extraction of training sets for machine learning. Aliases: KBpedia, KKO, KBpedia knowledge graph, KBpedia ontology, kbpedia, Kbpedia |
---|---|
Represents | KBpedia (Q64139102) |
Data type | External identifier |
Domain | items |
Allowed values | ([A-Z\d])([(A-Za-z\d-_)]+)+ (syntax clarification: PascalCase format, with digits allowed as the first character and a hyphen or underscore allowed in the remaining positions except the last) |
Example 1 | Afghan cuisine (Q383096) → AfghanCuisine |
Example 2 | bat (Q28425) → Bat-Mammal |
Example 3 | C-C motif chemokine ligand 16 (Q21113921) → CCL16 |
Example 4 | Dixie Highway (Q818896) → DixieHighway |
Example 5 | ecosystem (Q37813) → Ecosystem |
Example 6 | fire station (Q1195942) → FireStation |
Source | https://kbpedia.org, https://github.com/Cognonto/kbpedia |
Planned use | Knowledge graph-based subset creation and item (Qid) retrieval from Wikidata; much more going forward |
Number of IDs in source | 44,368 |
Expected completeness | always incomplete (Q21873886) |
Formatter URL | https://kbpedia.org/kko/rc/$1 |
Robot and gadget jobs | Likely to use OpenRefine for import; maybe later validity checks |
See also | DBpedia, Wikidata, Wikipedia, GeoNames, schema.org, UNSPSC, Cyc |
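The "Allowed values" and "Formatter URL" rows above can be combined into a quick check before import. The following is a minimal sketch, not the proposal's own tooling: the function names are hypothetical, and the pattern is a simplified reading of the prose syntax clarification rather than the proposal's expression verbatim (Python's `re` module rejects the `\d-_` sequence in that expression as an invalid character range).

```python
import re
from urllib.parse import quote

# Formatter URL as given in the proposal; "$1" stands for the identifier.
FORMATTER_URL = "https://kbpedia.org/kko/rc/$1"

# Assumption: simplified pattern derived from the syntax clarification.
# First character: uppercase letter or digit. Remaining characters: letters,
# digits, hyphen, or underscore, with no hyphen or underscore at the end.
KBPEDIA_ID = re.compile(r"[A-Z0-9](?:[A-Za-z0-9_-]*[A-Za-z0-9])?")


def is_valid_kbpedia_id(value: str) -> bool:
    """Return True if the whole string matches the assumed ID pattern."""
    return KBPEDIA_ID.fullmatch(value) is not None


def kbpedia_url(value: str) -> str:
    """Expand the formatter URL for a valid ID; raise ValueError otherwise."""
    if not is_valid_kbpedia_id(value):
        raise ValueError(f"not a valid KBpedia ID: {value!r}")
    return FORMATTER_URL.replace("$1", quote(value, safe=""))


if __name__ == "__main__":
    # IDs taken from the examples in the table above.
    for kbpedia_id in ["AfghanCuisine", "Bat-Mammal", "CCL16", "FireStation"]:
        print(kbpedia_id, "->", kbpedia_url(kbpedia_id))
```

Rejecting malformed values before building the URL keeps an import batch (for instance one prepared in OpenRefine) from producing dead links. The pattern would need to be loosened if KBpedia uses identifiers outside the six examples shown here.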
Motivation
- KBpedia is a logically consistent knowledge graph -- written in RDF, SKOS, and OWL -- with 58 K concepts that integrate 200 K key items from Wikidata, Wikipedia, DBpedia, GeoNames, schema.org, Cyc, and UNSPSC products and services. The KBpedia ontology is a computable framework for reasoning and for making fine-grained entity subset selections from these sources to aid data interoperability and machine learning (AI). The 44 K concepts mapped to Wikidata are one means for Wikidata users to combine, select, and access entities across Wikidata. Mkbergman (talk) 15:36, 17 June 2020 (UTC)
Discussion
- @Mkbergman: Can you add information about what KBpedia is to the description of this property? ChristianKl ❪✉❫ 07:46, 21 June 2020 (UTC)
- Is this the kind of expanded description you were seeking? If not, I can revise. Mkbergman (talk) 14:10, 22 June 2020 (UTC)
- Support --Jneubert (talk) 19:22, 22 June 2020 (UTC)
- Support (changed from Comment): It seems like this is a consumer of Wikidata (they claim "KBpedia has 98% coverage of Wikidata"), and this would lead us to add such an identifier to every single item in Wikidata, which I am not sure is such a good idea -- or at least it warrants a larger discussion before we start doing this. I generally see the point in adding crosslinks to orthogonal / upstream databases, but I am not sure about automatically generated downstream databases. Or maybe I misunderstood what you plan to do? --Hannes Röst (talk) 19:28, 24 June 2020 (UTC)
- @Hannes Röst: You are correct that KBpedia is a 'consumer' of Wikidata information, and Wikidata is likely the most important contributor to KBpedia's seven major knowledge bases. But, no, it is not likely that the number of mappings to Wikidata will increase much. KBpedia principally links to types and classes, and is itself unlikely to grow much beyond its current 58 K reference concepts (44 K of which now map to Wikidata). KBpedia acts more like a table of contents than a comprehensive compilation of instances. Wikidata and the other constituent knowledge bases are the proper location for that specific content. KBpedia maps to Wikidata instances (Q items) via their parent classes or types, and generally not directly unless the instance is quite prominent, like Rome or John F Kennedy. Let me know if I can offer additional commentary. Mkbergman (talk) 22:50, 24 June 2020 (UTC)
- I see, so the idea is that we would import all the 1:1 mappings and then eventually complete the other 12k concepts so that we have a complete mapping from WD to the other seven knowledge bases? If that is the plan, then I am in favor (changed my position). --Hannes Röst (talk) 18:20, 1 July 2020 (UTC)
- @Hannes Röst: Exactly. I'm not sure whether all of the currently missing 12 K concepts would find a match, since some are needed to maintain the integrity of the knowledge graph, but your understanding is correct. For example, one of the seven knowledge bases, UNSPSC (Q1361569), currently maps (via UNSPSC Code (P2167)) to about 1000 WD Q entities. That would immediately increase to about 6500 with the KBpedia linkage. Mkbergman (talk) 19:11, 1 July 2020 (UTC)
- I guess that is a discussion for later, but it may make sense to represent them here as well even if they are internal to KBpedia. It seems worthwhile to have the additional 6500 items with UNSPSC Code (P2167), but how did KBpedia do the mapping, and what is the quality of the mapping? Was this done by hand with high quality, or by some automated process with a high error rate? --Hannes Röst (talk) 15:03, 2 July 2020 (UTC)
- @Hannes Röst: All mappings have been manually vetted, though to differing degrees of scrutiny. Checks for types and class relationships are the most stringent, followed by mappings, and then annotations, with alternative labels the least scrutinized. During builds, disjointness checks and logical inconsistency and satisfiability checks are applied. Builds cannot be accepted with such errors. All new versions go through multiple builds until they build without error. The overall process, then, is semi-automatic, with manual inspection as the final step. That does not guarantee there are no errors; we correct those in subsequent releases as they are identified. We think we have an F1 score (Q6975395) as high as or higher than other 'gold standard' knowledge bases, but that remains to be independently checked. Mkbergman (talk) 18:11, 2 July 2020 (UTC)