# User talk:JakobVoss

Previous discussion was archived at User talk:JakobVoss/Archive 1 on 2016-03-01.

## Redirect change


https://www.wikidata.org/w/index.php?title=Help:Classification&oldid=585513463

Just saw this edit. May I ask whether you discussed it with anybody?

I was just tidying up the messy diversity of Help and Wikidata guideline pages. I don't mind which title redirects where, but there had better not be too many pages with overlapping content, so that it's clear where to look up a topic. I'd prefer to merge the following pages:

I am trying to start a discussion at Wikidata talk:Item classification#Metaclassification, union of and so on to update things a little. I'll notify WikiProject Ontology.

We have actually failed to reach any consensus whatsoever, so it's not a hard task. I tried launching a couple of RfCs to help, but none was really a success. So whatever happens remains informal discussion about the best way to proceed. It feels like trying to dry the ocean with a spoon. However, I think there is interesting material in my classification subpage that is worth keeping, especially the classification of classes, as it's a powerful tool for dealing with several external classifications and representing them accurately in Wikidata.

## repurposing items


Hi JakobVoss,

I undid your edit on an item as it changes its scope. If you want to add more statements, it's preferable that a new item is made. Otherwise users who refer to a past version of the item can get confused. I also undid some of your edits as I think they met the definition.

You might want to add "has quality" to describe aspects of identifiers. When reading some of your thesis, I had thought of applying some of its criteria to these.

I could create a new item, and delete or merge the old one but this would not be better. There is no easy answer to semantic change (Q1939117), it always depends on the particular case, so what's wrong with my actual edits?

Merging wouldn't be possible: Help:Merge#Check_to_be_sure.

You removed the info that some of these are actually multi-source.

The current description (there is no definition!) of the item makes no sense. I answered at Talk:Q21264328. Would you prefer to delete the item?

## Creating the property "usage discussed on page"


I think it would be great if you would now create Wikidata:Property proposal/usage discussed on page. The proposal has had its 8 days of wait time and has good support. Given that I created the proposal, I can't create the property.

We are at property number 4498 so the next two property numbers are structured in a way that's easy to remember and I hope "usage discussed on page" will be frequently used manually, so it would be nice for it to have a nice number.

There are several issues with this property creation.

• Minimum is 7 days. So 8 days isn't really urgent.
• Creations should be based on consensus, not a mere count of votes.
• There seem to be several users raising concerns about its scope (none of which have been addressed)
• You are a participant in the discussion favoring its creation
• The sample you add isn't the one given in the discussion
• If you think the property should have a different scope, you should bring this up in the discussion, not create the property differently from the way it was proposed.

I understand that you just received property creation rights, so you might not be fully aware of the process.

Thanks and sorry. Can you delete the property and its talk page?

Wikidata:Administrators'_noticeboard#Accidentally_created_property. I'll clean all current use of the property.

This comment was hidden by JakobVoss (history)

Shall I ask for deletion at Wikidata:Administrators' noticeboard or is there a better way to quickly contact an Admin? This is clearly a mistake from my side so there should be no formal process needed for deletion, should it?

I'm sorry. At the time I wrote the above post, the property proposal had four support votes and one against, which would generally be a consensus; this changed after I wrote the post.

Bad luck. However, I'm sure we'll come to a better property and consensus with some additional discussion.


## don’t understand those edits


Can you explain edits like this one? https://www.wikidata.org/w/index.php?title=Q23056371&diff=prev&oldid=546207667

It seems that a French postal code like « 12345 » is a postal code in the sense that « subclass of » means. Any French postal code is a postal code.

So it seems to me that « French postal code » instance of « postal code » does not make any sense, but « subclass of » does.

Something like « French postal code » instance of « national postal code system », on the other hand, would make sense.

The question of when to use instance of (P31) and when to use subclass of (P279) cannot be answered by ontological arguments alone; it also depends on the current state of Wikidata as a whole. As long as Wikidata does not contain an item for each individual French postal code (such as « 12345 »), the ontological class of French postal codes will never have instances. Instead of having an empty Wikidata class, it is more useful to model each individual system of postal codes (such as Postal codes in France (Q1105640)) as a Wikidata instance of the Wikidata class of all postal code systems.

Building an ontological mess is building a mess. Just create that damn class; this is way simpler than having to ask yourself each time « are there instances on Wikidata? », which is a non-trivial question whose answer can change over time. If that happens, will you change all the statements? Better to be right the first time.

There is no "right" in Wikidata; it's not about facts but about statements. Furthermore, reality is non-trivial and can change over time, and so does Wikidata.

Maybe Wikidata:Identifiers helps to explain. What an identifier actually is, is not obvious. I'd like to stress that there is no identifier without an identifier system (unlike names, which can also stand alone). It would be very complicated to have Wikidata items for both "System of French postal codes" and "French postal code character strings", so we had better treat them as one item.

I don't think Wikidata could ever contain items about instances of postal codes, at least as long as its well-formed formulas consist of subject-property-object triples:

Suppose Wikidata stores items about instances of postal codes; then why couldn't it contain items for instances of any UID? Why couldn't it contain items for instances of Wikidata item identifiers (i.e. Q123)?

Of course we could do this, but soon we would start encountering many recursion problems that Wikidata language can't even express.

Anyway, elements (instances) of the class of postal codes are postal codes, Postal codes in France (Q1105640) is the set (class) of postal codes issued by France, thus it is a subset of postal codes and so we should use subclass of (P279) and not instance of (P31).

Nevertheless I see your point here: how could we retrieve French postal codes if this class is empty by Wikidata means?

I think this is a greater problem; I think we need tools to populate the class without creating new elements, maybe connecting the answer of the query subclass of French postal codes to that of the query postal codes of towns located in France. Tools like that could also solve the problem of expressing the same concepts through different properties. I would support a proposal in this direction.

PS: I think you should define ontological class.

>Furthermore, reality is non-trivial and can change over time, and so does Wikidata.

Wikidata can't express all of reality because its language is not even a first order language and it does not admit recursion, so on Wikidata you can't define a lot of mathematical concepts.

You can feel the limitations with this definition:

In ZF theory, an ordered n-tuple $(a_1, \ldots, a_n)$ is an unordered couple (a set composed of two elements) consisting of the ordered (n-1)-tuple $(a_1, \ldots, a_{n-1})$ and the n-th element $a_n$.

Try to express this definition in tuple (Q600590) without using recursion.
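The recursion in that definition can be made concrete outside Wikidata. The following is only a sketch: it reads the "couple" as the Kuratowski set encoding of an ordered pair, {{a}, {a, b}}, and treats a 1-tuple as the element itself. Both readings are assumptions that the comment does not state, and the names `pair` and `ntuple` are hypothetical.

```python
def pair(a, b):
    """Kuratowski encoding of the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def ntuple(*elements):
    """Build an n-tuple recursively: (a1, ..., an) := pair((a1, ..., a_{n-1}), an).

    Assumption: a 1-tuple is represented by its single element.
    """
    if not elements:
        raise ValueError("at least one element is required")
    if len(elements) == 1:
        return elements[0]
    # The recursive step: the definition refers to itself for n - 1 elements.
    return pair(ntuple(*elements[:-1]), elements[-1])


# The encoding keeps the order of the elements:
print(ntuple(1, 2) == ntuple(2, 1))        # False
print(ntuple(1, 2, 3) == pair(pair(1, 2), 3))  # True
```

The function body calls itself, which is exactly the step that a flat list of subject-property-object statements has no direct way to express.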

@TomT0m: you are right, with a strict definition it should be

< subject > subclass of (P279) <postal code>
< subject > instance of (P31) <identifier system>

How do we differentiate (if we actually need to, maybe not!)

• classes of concrete identifier systems (postal codes, identifiers for people...)
• classes of identifier systems (e.g. postal code identifier systems, identifier systems for people...)

My problem is I see no way to define "identifier" without "identifier system" because identifiers are always part of (sic!) an identifier system.

You're right, we shouldn't mix them.

More items.

We don't need Wikipedia article(s) about every system in order to create structure.

But it would help if we referred to authoritative organizations, e.g. via described at URL (P973) or external data available at (P1325).

If a Wikimedia project decides to make a page for a particular postcode, we would be obliged to have an item for it. Even if we expect (and would prefer) no individual instances, we should still model the data in a way which allows for instances. For example, we already have 10048 (Q4546087) (which a query for instances of subclasses of postal code (the usual way to find individual instances) doesn't find).

It seems to me that you're using "postal code" to mean "a postal code system" whereas the other people here understand it as "an individual postal code". Based on the English description and English Wikipedia page for postal code (Q37447), I would also interpret it as meaning an individual postal code.

Perhaps a solution that would work for everyone would be to have a new item for "postal code system"? Then things like ZIP code (Q136208) could be an instance of a postal code system (ZIP codes are a specific system) and a subclass of postal code (all ZIP codes are postal codes).
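To illustrate the difference this makes for the "instances of subclasses" query mentioned above, here is a minimal sketch on toy triples. The triples are assumptions made for illustration and not a dump of actual Wikidata statements; `POSTAL_CODE_SYSTEM` stands in for the new item, which has no QID in this discussion, and the function names are hypothetical.

```python
def subclass_closure(triples, cls):
    """Return cls plus every class that reaches it through a chain of P279."""
    found, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if p == "P279" and o == current and s not in found:
                found.add(s)
                todo.append(s)
    return found


def instances_of_subclasses(triples, cls):
    """Emulate the pattern `?x wdt:P31/wdt:P279* cls` on a list of triples."""
    classes = subclass_closure(triples, cls)
    return {s for s, p, o in triples if p == "P31" and o in classes}


POSTAL_CODE_SYSTEM = "postal code system"  # hypothetical new item, no QID given

# Toy model 1: ZIP code (Q136208) is only an instance of postal code (Q37447).
INSTANCE_ONLY = [
    ("Q4546087", "P31", "Q136208"),   # 10048 instance of ZIP code
    ("Q136208", "P31", "Q37447"),     # ZIP code instance of postal code
]

# Toy model 2: the suggestion above.
PROPOSED = [
    ("Q4546087", "P31", "Q136208"),           # 10048 instance of ZIP code
    ("Q136208", "P279", "Q37447"),            # ZIP code subclass of postal code
    ("Q136208", "P31", POSTAL_CODE_SYSTEM),   # ZIP code instance of the new item
]

print(sorted(instances_of_subclasses(INSTANCE_ONLY, "Q37447")))  # ['Q136208']
print(sorted(instances_of_subclasses(PROPOSED, "Q37447")))       # ['Q4546087']
```

In the first toy model the query returns the system itself and misses 10048; in the second it returns the individual code, while the systems remain reachable as instances of the placeholder item.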

Dividing "postal code system" and "individual post code" would be impractical nitpicking. Neither Wikipedia articles nor normal language make such a distinction. There is no right way to model things in Wikidata, but several possibilities. Good solutions must be judged on how well they support data reuse (e.g. queries) and how well they can be applied in practice (so not too difficult to understand).

Thanks for giving the example 10048 (Q4546087)! These individual identifiers are an exception but they exist. I'd like to be able to answer queries like the following:

Individual identifiers can exist as values (e.g. values of property IATA airport code (P238)) and - less so - as items (e.g. 10048 (Q4546087)). Concrete identifier systems make up most of the identifier-related Wikidata items, and types of identifiers should only be used to organize the former or if Wikipedia articles about general types of identifiers exist.

What do you suggest to separate these three cases?

not "postal code system" but "identifier" = unique identifier (Q6545185)

It might be confusing after long discussions, but these statements aren't ambiguous:

Thanks, looks good. Could you add how geographic identifier (Q36214810) and 10048 (Q4546087) would fit in here?

can be true

10048 (Q4546087) can't be anything but P31... of ZIP code (Q136208)

ZIP code potentially can be separated into several items (old format) and (new format)

But this is premature for many postcodes, as historic data is a more complex question than current solutions.

Some degree of fuzziness cannot be avoided because concepts change over time and have slightly different meanings in different contexts and Wikipedia editions. To summarize the example, we have three kinds of items plus the most common superclass of all identifiers:

How to query all individual identifiers?

`?individualID wdt:P31 ?idSystem`

How to query all postal code systems?

`?idSystem wdt:P279 wd:Q37447 ; wdt:P31 "identifier system"`

How to query all identifier systems?

`?idSystem wdt:P279* wd:Q6545185 ; wdt:P31 "identifier system"`

How to query all types of identifiers?

`?idType wdt:P279* wd:Q6545185 . FILTER NOT EXISTS { ?idType wdt:P31 "identifier system" }`

I find this solution more difficult to apply and to make use of. As far as I know, the `FILTER NOT EXISTS` clause cannot be used in Lua templates. With my current approach this is easier:

How to query all individual identifiers?

`?individualID wdt:P361 ?idSystem` (part-of)

How to query all postal code systems?

`?idSystem wdt:P31 wd:Q37447` (instance)

How to query all identifier systems?

`?idSystem wdt:P31/wd:P279* wd:Q6545185`

How to query all types of identifiers?

`?idType wdt:P279* wd:Q6545185`
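The two sets of query fragments can be compared on a small toy graph. This is a sketch under stated assumptions: the triples are invented for illustration from the items named in this thread, `IDENTIFIER_SYSTEM` is a placeholder for an item without a known QID, the property path is read as `wdt:P31/wdt:P279*`, and the first query of the first model is narrowed to objects that are identifier systems, because the fragment as written would match every P31 statement.

```python
IDENTIFIER_SYSTEM = "identifier system"  # placeholder item, no QID in this thread
ROOT = "Q6545185"                        # unique identifier
POSTAL_CODE = "Q37447"


def subjects(triples, prop, obj):
    """All subjects s with a statement (s, prop, obj)."""
    return {s for s, p, o in triples if p == prop and o == obj}


def subclass_closure(triples, cls):
    """cls plus everything below it via P279 (the `wdt:P279*` path)."""
    found, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for sub in subjects(triples, "P279", current) - found:
            found.add(sub)
            todo.append(sub)
    return found


# Model A: systems are subclasses of their type and instances of "identifier system".
MODEL_A = [
    ("Q4546087", "P31", "Q136208"),
    ("Q136208", "P279", POSTAL_CODE), ("Q136208", "P31", IDENTIFIER_SYSTEM),
    ("Q1105640", "P279", POSTAL_CODE), ("Q1105640", "P31", IDENTIFIER_SYSTEM),
    (POSTAL_CODE, "P279", ROOT),
]

# Model B: systems are instances of their type, individual codes are part of a system.
MODEL_B = [
    ("Q4546087", "P361", "Q136208"),
    ("Q136208", "P31", POSTAL_CODE),
    ("Q1105640", "P31", POSTAL_CODE),
    (POSTAL_CODE, "P279", ROOT),
]


def queries_a(triples):
    systems = subjects(triples, "P31", IDENTIFIER_SYSTEM)
    below_root = subclass_closure(triples, ROOT)
    return {
        "individual": {s for s, p, o in triples if p == "P31" and o in systems},
        "postal_systems": subjects(triples, "P279", POSTAL_CODE) & systems,
        "systems": below_root & systems,
        "types": below_root - systems,  # the FILTER NOT EXISTS step
    }


def queries_b(triples):
    below_root = subclass_closure(triples, ROOT)
    return {
        "individual": {s for s, p, o in triples if p == "P361"},
        "postal_systems": subjects(triples, "P31", POSTAL_CODE),
        "systems": {s for s, p, o in triples if p == "P31" and o in below_root},
        "types": below_root,
    }


print(queries_a(MODEL_A) == queries_b(MODEL_B))  # True
```

On this toy data both models answer all four questions identically; the difference is that model A needs the extra membership test (the set difference standing in for `FILTER NOT EXISTS`), while model B gets the same split from the choice of property alone.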

Maybe adding another property can help? Checking for the (non-)existence of two statements is too fragile to get reliable results. We could create a sub-property of P31 or P279 to use; this is also done for taxon data (taxon rank (P105), parent taxon (P171)...).

tl;dr: we cannot have both of the following statements; one must be changed, or there is no easy way to tell that ZIP code (Q136208) is a concrete identifier system instead of a general class of multiple systems with possibly overlapping identifier values:

> How to query all identifier systems?

It should be

?idSystem wdt:P31/wdt:P279* wd:Q6545185

> we cannot have both of the following statements

should remove

then

ZIP code (Q136208) is a concrete system; other statements should account for this


## Merging non-identical items


https://www.wikidata.org/w/index.php?title=Q19795563&action=history shows that you merged the item for the postcode 74321 into the item for the much broader topic of all postcodes in Germany. Please only merge items if they are actually the same, otherwise it changes the meaning of the ID. If you think something isn't notable, you can ask for it to be deleted at Wikidata:Requests for deletions.

## Wikidata:Identifier - identifier classification


Thanks for your reply in Project Chat, and the area code (Q36205316). I changed it to administrative territorial entity identifier (ATE ID) and attached some more ID systems via P279. For person related IDs I created "person identifier". So there are as subclasses: ATE, Person, Languoid. See tree https://www.wikidata.org/w/index.php?title=Wikidata%3AProject_chat&type=revision&diff=536634593&oldid=536584405

More ID items could be created based on WD properties - there are ~1945 properties of data type "external-id". Also more subclasses, e.g. for chemical compound IDs, building IDs etc. could be created. Maybe you can help to copy more info from the property name space to the item name space. 213.39.164.36 22:55, 12 August 2017 (UTC)

Thanks. We should organize unique identifier with subclasses such as administrative territorial entity identifier (Q36205316), language identifier (Q2092812), person identifier (Q36218176) ... but I would not go into too much detail. I created wdtaxonomy to examine Wikidata hierarchies. There is always some mess; in particular, the distinction between instance of (P31) and subclass of (P279) is not as easy as it seems, so better not to aim at a perfect system in the first place but to improve Wikidata step by step.

I doubt that we need a WD identifier item for each WD identifier property. In many cases it's enough to have an item for the classification, catalog etc. which the identifiers are part of. Nevertheless thanks for the effort, I am looking forward to improving information about identifiers in Wikidata together! -- JakobVoss (talk) 07:47, 13 August 2017 (UTC)

Positive text :-) But P31 and P279 need more talk. Some people think that something is only P31 if they cannot imagine something "below". That is a tree view, but not very good for ontologies. Some people think the only things that have no P279 and have only P31 are physical objects. Then physical objects are never classes. All things that are not physical objects are classes. For these one can always come up with subclasses. E.g. subclasses of ISO 3166-1 alpha-2 code could be "reserved codes", "codes that have never been assigned", "codes that have been re-assigned", "codes that start with Q", or "codes assigned before 2012". That these don't exist in Wikidata does not matter. Creation of new items should not change what existing items are.

And if one looks at physical objects, then the item AFG (Q12626453), the Afghan vehicle country code, is not an instance of an ID either. It is a class. The instances are on the cars.

"I doubt that we need a WD identifier item for each WD identifier property" - Why would it matter "what we need"? Is Wikidata for you and me, or for the general public? Would you doubt that anyone would make use of it? Maybe someone wants to create a list of all languoid identifier classes, or all person ID classes. How many are there? Etc. Since in the property name space new ID properties are only created if deemed "needed", it will always lack some systems. Would they create German Personalausweisnummer? But it can exist in the item name space.

Data modelling in Wikidata is artificial, so there is no right way. I prefer editing Wikidata from bottom to top based on items that have an actual use case instead of adding items based on theoretical assumptions. For this reason creation of new items might as well change existing items, and one item for every identifier system can be needless duplication.

Never mind, we had better discuss practical examples: I'd make AFG (Q12626453) part-of international vehicle registration code (Q154015) and the latter an instance of country code (Q906278). -- JakobVoss (talk) 18:37, 13 August 2017 (UTC)

There are many right ways and many wrong ways. A P31 B, B P31 C, A P31 C is wrong.
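The three-statement pattern named in that comment can be searched for mechanically. The following is a minimal sketch that assumes statements are available as plain (subject, property, object) tuples; the function name `p31_triangles` and the sample data are hypothetical.

```python
def p31_triangles(triples):
    """Find (A, B, C) where A P31 B, B P31 C and A P31 C all hold."""
    p31 = {(s, o) for s, p, o in triples if p == "P31"}
    return sorted(
        (a, b, c)
        for a, b in p31
        for b2, c in p31
        if b == b2 and (a, c) in p31
    )


statements = [
    ("A", "P31", "B"),
    ("B", "P31", "C"),
    ("A", "P31", "C"),   # the statement that closes the triangle
    ("D", "P31", "B"),   # D is only an instance of B, so no triangle
]

print(p31_triangles(statements))  # [('A', 'B', 'C')]
```

A report like this only flags candidates; whether a flagged statement is actually wrong remains a modelling question, as the rest of this thread shows.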

"I prefer editing Wikidata from bottom to top based on items that have an actual use case instead of adding items based on theoretical assumptions." - What?

AFG as a part. Then D, GB, also parts? And to what class do these parts belong? Could it be that they are identifiers? 77.179.130.58 21:16, 13 August 2017 (UTC)

If AFG is partOf international vehicle registration code (Q154015) that means that all instances of any such code have a part AFG. This is false. Why make false claims?

Questions like "Why make false claims" are self-defeating. As said above there are multiple ways to model reality in Wikidata so I don't buy theoretical arguments on right or wrong just to prove a point. To stay at the practical example: AFG (Q12626453) part-of international vehicle registration code (Q154015) instance of country code (Q906278)


## Languoid id instances


There are no items for the instances. Please revert https://www.wikidata.org/w/index.php?title=Q1059900&action=history

Sorry, I don't understand the problem.

If "There are no items for the instances" is true, then the problem could be that the claim P31 language identifier is a false claim. https://www.wikidata.org/w/index.php?title=Q1059900&diff=536847981&oldid=536847936

Do you think it is not a problem to insert false claims into Wikidata?

The instances are printed on books, exist on computer screens, are written somewhere.


Hello!

I saw that you are interested in libraries! Perhaps you would also like to take part in WikiProject Universities? At the moment it is unfortunately only in English, but we could also start a German-language version.

Regards.

## So good to see you here :)


How are things in the old country?


## Please take part in the Flow satisfaction survey



Hello!

Like some other community members, you are using Flow.

An increasing number of communities now use Flow or are considering it. Although Flow itself is not scheduled for major development during 2016 fiscal year, the Collaboration Team remains interested in the project and in providing an improved system for structured discussions.

You can help us make decisions about the way forward in this area by sharing your thoughts about Flow — what works, doesn't work or should be improved?

Please fill out this survey (available in multiple languages), which is administered by a third-party service. It will not require an email or your username. See our privacy statement.