Wikidata:Property proposal/Kotobank Japanese ID
Kotobank Japanese ID
Under discussion
| Description | The identifier in the Japanese article URL on Kotobank |
|---|---|
| Represents | Kotobank (Q11302260) |
| Data type | External identifier |
| Domain | item, lexeme |
| Example 1 | Japan (Q17)→109930,3212087 |
| Example 2 | 日本/にほん (L5108)→109930 |
| Example 3 | Iah (Q946657)→23561 |
| Example 4 | 絵 (Q54552713)→442803 |
| Example 5 | JUNGLIA OKINAWA (Q133878783)→3242580 |
| Example 6 | Digitalis purpurea (Q157555)→3155343,3148923,474830 |
| Allowed values | \d{5,7} |
| Source | https://kotobank.jp |
| Number of IDs in source | 1,198,939 integer values (at least); approximately 1,933,425 integer values plus fragments (at least) |
| Expected completeness | eventually complete (Q21873974) |
| Implied notability | Wikidata property for an identifier that suggests notability (Q62589316) |
| Formatter URL | https://kotobank.jp/word/ID-$1 |
| Country | Japan (Q17) |
| Applicable "stated in"-value | Kotobank (Q11302260) |
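As a rough illustration of how the "Allowed values" pattern and the "Formatter URL" in the table fit together, here is a minimal Python sketch. The function names (`is_valid_id`, `format_url`, `split_values`) are hypothetical helpers invented for this example, and the sketch only substitutes `$1` into the formatter string exactly as listed above; it does not check that the resulting URL resolves on Kotobank.

```python
import re

# Pattern and formatter copied from the proposal table above.
# re.ASCII restricts \d to 0-9 (an assumption about the intended values).
ALLOWED_VALUES = re.compile(r"\d{5,7}", re.ASCII)
FORMATTER_URL = "https://kotobank.jp/word/ID-$1"


def is_valid_id(value: str) -> bool:
    """Return True if the whole string matches the allowed-values pattern."""
    return ALLOWED_VALUES.fullmatch(value) is not None


def format_url(value: str) -> str:
    """Substitute a valid ID for $1 in the formatter URL."""
    if not is_valid_id(value):
        raise ValueError(f"not a valid ID under the proposed pattern: {value!r}")
    return FORMATTER_URL.replace("$1", value)


def split_values(cell: str) -> list[str]:
    """Split a comma-separated example cell (e.g. Example 6) into single IDs."""
    return [part.strip() for part in cell.split(",")]


if __name__ == "__main__":
    for value in split_values("3155343,3148923,474830"):
        print(value, is_valid_id(value), format_url(value))
```

Each comma-separated value in the examples is treated as a separate statement value here, which is why the pattern is applied per ID rather than to the whole cell.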
Motivation
Kotobank (Q11302260) is an integrated online Japanese dictionary/encyclopedia. It aggregates many dictionaries and encyclopedias, as listed. It is well known and widely used, as its usage on Japanese Wikipedia shows. It also very often appears at the top of Google search results. See also Wikidata:Property proposal/Kotobank ID. --TAKA647 (talk) 09:40, 8 February 2026 (UTC)
Discussion
Comment @TAKA647: What do you make of the concern raised by Midleading (and the reply to that concern by Laftp0) in the previous proposal for this property? Mahir256 (talk) 16:11, 8 February 2026 (UTC)
- Kotobank mixes dictionaries and encyclopedias on the same page, so the scope can't be limited to just lexemes or just items. I believe this property gives useful information in Japanese. TAKA647 (talk) 05:29, 9 February 2026 (UTC)
Oppose External IDs for items should not be mixed with those for lexemes, and it seems as if it's not meant for lexemes. 109930 shows two different lexemes (にほん【日本】and にっぽん【日本】). I see you merged them in Wikidata? 😔 –Shisma (talk) 19:09, 9 February 2026 (UTC)
- I don't see any disadvantage in using the property for both items and lexemes (see also Lur Encyclopedic Dictionary ID (P10242)). にほん【日本】and にっぽん【日本】share the same origin, and there is no difference in meaning between them. So they are just different forms. TAKA647 (talk) 03:05, 10 February 2026 (UTC)
- Okay, the 日本 example is debatable. Here's a different one: 143996 refers to both 山/やま (L5121) and 山/さん (L1313770). As far as I understand, they aren't related. This could actually be solved by changing the proposal slightly (include a hash):
- I am not familiar with Lur Encyclopedic Dictionary ID (P10242). Is there a way to distinguish which Kotobank articles are clearly lexicographical, and which aren't? –Shisma (talk) 18:49, 10 February 2026 (UTC)
- The reason I don't use "#w-" is that a proper noun item would need many identifiers, such as:
- However, I think using "#w-" is one way to go. As for 山, 143996#w-512769 is about just the character, not the suffix. So the correct link is this:
- If the word isn't a proper noun, the way to judge whether it fits an item or a lexeme better is the source book. Using identifier shared with (P4070) and identifier shared with lexeme (P9531) might be a good idea. TAKA647 (talk) 07:54, 11 February 2026 (UTC)
- I'd refrain from using identifier shared with lexeme (P9531) when it can be avoided. If you update the proposal to include the hash for homograph lexemes, I would remove my oppose vote. If only we could solve the issue of this being a mixed item/lexeme property. My initial concern was that a particular piece of software expects an ID property to be either for lexemes or for items, but maybe that is an issue with this software rather than with this proposal. Maybe it makes writing constraints more complicated? I wonder what @Laftp0's concerns were when withdrawing their proposal? – Shisma (talk) 16:36, 14 February 2026 (UTC)
- My current thinking is to change the allowed values to \d{5,7}#w-\d{5,7} and to use property constraint (P2302) distinct-values constraint (Q21502410) exception to constraint (P2303) identifier shared with (P4070) identifier shared with lexeme (P9531). That is a better way to use and maintain the property. TAKA647 (talk) 16:18, 16 February 2026 (UTC)
- Sorry, I mistook separator (P4155) for exception to constraint (P2303). TAKA647 (talk) 06:21, 18 February 2026 (UTC)
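To make the revised pattern from the thread concrete, the following is a minimal Python sketch of how a value of the form \d{5,7}#w-\d{5,7} could be split into its page part and its "#w-" entry part. The names `KotobankRef` and `parse_ref` are hypothetical and exist only for this illustration; the sample value 143996#w-512769 is the one quoted in the discussion.

```python
import re
from typing import NamedTuple, Optional

# Revised allowed-values pattern as suggested in the discussion,
# with capture groups added here for illustration.
PROPOSED_VALUES = re.compile(r"(\d{5,7})#w-(\d{5,7})", re.ASCII)


class KotobankRef(NamedTuple):
    """Hypothetical container: page number plus the '#w-' entry number."""
    page_id: str
    entry_id: str


def parse_ref(value: str) -> Optional[KotobankRef]:
    """Return the two parts of a value, or None if it does not match."""
    match = PROPOSED_VALUES.fullmatch(value)
    if match is None:
        return None
    return KotobankRef(page_id=match.group(1), entry_id=match.group(2))


if __name__ == "__main__":
    print(parse_ref("143996#w-512769"))
    print(parse_ref("143996"))
```

One consequence visible in the sketch: under the revised pattern a bare page number such as 143996 no longer matches, so every value would have to carry an entry fragment.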
Comment I am sympathetic to Laftp0's argument that different properties could be created for the individual dictionaries/encyclopedias included on Kotobank. While I have opposed properties for source aggregators on multiple occasions, I am more inclined to support sourcing from this one since, unlike those aggregators, the individual resources on each page may be directly linked to (at least, this was not the case in 2022 according to the Internet Archive, so this development is new and welcome). Yet I am not prepared to outright oppose this until I can get a better idea of how many subentries across dictionaries there actually are; the counts I provided above come from just looking at the frequencies of IDs in the various entry lists provided on the site, and I am now running an indexing process to count the actual subentries. Mahir256 (talk) 20:13, 16 February 2026 (UTC)