Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}}, the first time you mention an item, or property, respectively.
Also see status updates to keep up-to-date on important things around Wikidata.
Requests for deletions can be made here.
Merging instructions can be found here.

IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2017/08.

How to look up a person's nationality?

Commons Creator pages (Q24731821) use the nationality and occupation fields to create phrases like "Italian painter" that describe people. Nationality on Commons is encoded as a 2-letter ISO 3166-1 code or as one of several "nationalities" that do not have an ISO code (like "English" or "Flemish", etc.). The nationality field roughly overlaps with the country of citizenship (P27) property, and my current approach was to look up the ISO 3166-1 alpha-2 code (P297) of each country in country of citizenship (P27). That works for about 50% of people, but unfortunately many country of citizenship (P27) statements link to items built for a very specific period in the history of a country, and those items do not have an ISO code. So how do I look up that Albrecht Dürer (Q5580) and Adolf Hitler (Q352) are "German" or "DE"? --Jarekt (talk) 12:25, 2 August 2017 (UTC)
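Editorial sketch: the lookup described in this post (country of citizenship (P27), then ISO 3166-1 alpha-2 code (P297)) can be written as a SPARQL query. The Python helper below only builds the query text; the function name is hypothetical, and the `wd:`/`wdt:` prefixes are assumed to be the ones predefined by the Wikidata Query Service. The `OPTIONAL` clause keeps countries without a code in the result, which is exactly the case that fails for historical states.

```python
def build_citizenship_iso_query(person_qid: str) -> str:
    """Return a SPARQL query listing each P27 value of a person and its P297 code, if any."""
    # Reject anything that does not look like an item id such as "Q5580".
    if not (person_qid.startswith("Q") and person_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {person_qid!r}")
    return (
        "SELECT ?country ?iso WHERE {\n"
        f"  wd:{person_qid} wdt:P27 ?country .\n"
        # OPTIONAL: rows for countries without an ISO code are kept, with ?iso unbound.
        "  OPTIONAL { ?country wdt:P297 ?iso . }\n"
        "}"
    )


# Build the query for Albrecht Dürer (Q5580), the example used in this thread.
query = build_citizenship_iso_query("Q5580")
print(query)
```

Rows where `?iso` comes back unbound are the ones that would need another way to reach "German" or "DE".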

I doubt you can. The most direct path between Durer and Germany seems to be

But you cannot really use followed by (P156) in every case.

Besides, the handling of country of citizenship (P27) for older times in Wikidata does not seem to make much sense. I have never heard nor read that the country of citizenship of Durer was the Holy Roman Empire. --Zolo (talk) 13:49, 2 August 2017 (UTC)

I just looked, and many of the identifier databases list the nationality of Albrecht Dürer (Q5580) as "German", see RKD or ULAN. Maybe we need a new "nationality" property to capture such metadata? But what kind of items should such a property point to? We could have a set of items for the nations (instance of (P31) nation (Q6266)), like China (Q29520) or Korea (Q18097), which are not the same as the current country. Unfortunately the Wikipedia articles do not differentiate between nations and countries. For example, Russia (Q159)'s inception (P571) is 862, 1917 and 1991, and the en:Russia article about the "Russian Federation" talks about a thousand-year history. --Jarekt (talk) 15:35, 2 August 2017 (UTC)
@Jarekt:
So it is enough to fill correct city
All "country" information would accumulate at city pages
d1g (talk) 15:46, 2 August 2017 (UTC)
Actually there are two different issues:
  • which values should be used in country of citizenship (P27). There is no consensus even for seemingly simple cases like France in the 20th century, and it's much more complex for 15th century Germany. (I thought Nuremberg was a free city rather than part of Bavaria ? Either way, both "Nuremberg" and "Bavaria" sound better values than "Holy Roman Empire".)
  • It is commonly said (in respected sources) that Durer is German and Botticelli Italian. How do we indicate that? A new property? Perhaps, but I am not sure about the label. Actually, we would not even have any item to use as a value, as Germany (Q183) is supposed to be about an entity created in 1949. --Zolo (talk) 16:05, 2 August 2017 (UTC)
  • "Holy Roman Empire" is just wrong.
  • 19283 values in P27 correspond to wd:Q34266. Some countries (USA) don't need splits at all. France and Germany had a lot of wars, but we need to sort values eventually. d1g (talk) 16:44, 2 August 2017 (UTC)
(edit conflict) d1g, the problem is that country boundaries, political systems and government types are in constant flux. Some items for countries seem to be only for the current iteration, some for all iterations and some for both. Yesterday I had to revert a whole bunch of edits I made to P27 statements based on nationalities imported from Commons (I will not do that again). P27 does not seem to be a good property for determining nationality. So I am trying to better understand how such a concept is being modeled and how it can be improved. The final goal would be for me to be able to look up that Durer and Hitler were "German". --Jarekt (talk) 16:09, 2 August 2017 (UTC)
your questions are territorial, two mentioned persons have nothing common in "country" sense d1g (talk) 16:43, 2 August 2017 (UTC)
I agree that the countries are different, but we are talking about the concept of "nationality", and most sources describe both as having "German nationality" (see for example here and here). That is what I am trying to look up, and although there is a high correlation with P27 and P103, those are different concepts. --Jarekt (talk) 16:57, 2 August 2017 (UTC)
getty.edu is not authoritative for values in P27:
"the object is a country that recognizes the subject as its citizen"
Proper source would be a list registry of citizens in some country.
So a government should recognize its citizens, not a research institution, and especially not an art-centric institution.
Historical records cover 1000 years more or less after Genghis Khan (Q720), that's why everything is explained in Wikipedia. d1g (talk) 17:10, 2 August 2017 (UTC)
d1g, I agree with all your statements about P27. However, P27 is not the same as "nationality", Which I was asking about. So lets not talk about P27 but about where and how to store "nationality". I do not think we store it and I think we should. --Jarekt (talk) 17:42, 2 August 2017 (UTC)
Zolo, yes, if we create a new property then we might also have to create new items for some nations/nationalities to separate them from the items for the current countries. However, we already have a few such items, like China (Q29520) or Korea (Q18097). For some others a single item might do for both current country and nation, for example United States of America (Q30). --Jarekt (talk) 16:28, 2 August 2017 (UTC)
@Zolo: followed by (P156) is used incorrectly in that item, if anything it should be replaced by (P1366), but with several intervening items.
For pre-modern states, the equivalent relationship is typically "subject of", which is already an alias of the P27.
@Jarekt: The nation for that country would presumably be the American People.
Nations, as opposed to legal sovereign states, are extremely complicated. Making a usable system for identifying people around nations might not be doable. Some examples of relationships between states and nations:
  • The Egyptian constitution defines the Egyptian people as being part of both the "Arab nation" and the "Islamic nation", while many individuals may consider themselves to be part of the Egyptian people may not consider themselves to be part of the Arab and/or Islamic nations. Similar situations exist in numerous other countries.
  • The Irish and Jewish peoples are generally considered to be nations despite majorities of both groups residing outside their affiliated nation-state/country/territory and periods of time with those states/countries/territories not being under their sovereignty.
  • The peoples within the Plurinational State of Bolivia and the Home Nations of the United Kingdom are each considered to be nations of their own, while being part of the greater nation associated with the state/country. The First Nations (Q392316) in Canada and other aboriginal nations in other countries are considered to be nations despite not having associated sovereign states. This adds up to probably thousands of non-sovereign nations, some with or without autonomy or national government.
  • Virtually every item on w:List of ethnic groups is at least occasionally considered a nation, despite many not having ever had a sovereign state. According to w:Ethnic group, the term "ethnic group" is often used synonymously with "nation", and we already have a property for ethnic group (P172).
We currently have 2,292,412 country of citizenship (P27) statements. These statements are fairly clear-cut, since the topic is an unambiguous direct legal status. Dealing with fuzzier concepts of nations and such is much more difficult to find sources for, and is often controversial. --Yair rand (talk) 00:49, 3 August 2017 (UTC)
I think the term "nation" is misleading here, because it carries the connotations so aptly enumerated by Yair rand above. I think the use Jarekt is referring to above, the (compound) "Description" field of Commons Creator templates such as c:Creator:Pieter van der Borcht (I), is distinct from, but partially derived from, the concept of a "nation" or the common uses of "nationality" (at least in the sense "Belonging to the nation X"). The pseudo-nationality of painters (and by extension also of printmakers, photographers, etc.) is one effectively assigned by art critics and historians, and is one that resembles and partially overlaps the concept of a "school" (style) of painting. For instance, a "Flemish" painter need not have any particular nationality or country of citizenship: any number of factors may have caused them to be considered "Flemish". It could be that the artist was active in the "Flemish" region (a geographical region with numerous nations, changing hands several times in an artist's lifetime), or that they learned from a Flemish master, or (and) painted in the style of the Flemish school. Or, when this pseudo-nationality corresponds with actual de jure citizenships, a photographer may be considered a Spanish–Portuguese artist, even though they lack citizenship in Portugal but do have both Spanish and British passports. It depends on which countries or schools the sources consider the artist to have strong affinities with. The same can apply for historical eras (e.g. "Mediaeval"), but that's usually treated as a separate property.
Thus, I think the concept Jarekt is looking for is one distinct from the concept of "Nationality", but with a sort of fuzzy derivation from it (multiple inheritance from citizenship, nation, school of art, work location, and style/genre). The good news, though, is that the number of variants that are not straight up 1:1 with a country of citizenship is limited (by art historians' ability to categorize). Judging by some random sampling, adding some way to say that a painter is considered "Flemish" or "Chinese" independently of their citizenships, or "German" or "Russian" despite their formal citizenships being in former (historical) states, would cover a huge part of the immediate problem. That is, a descriptive term not specifically tied to any de jure property (such as citizenship). And a lot of the needed data could, I think, be imported from Commons (and possibly also infoboxes on the Wikipedias), so long as we treat the concepts as distinct, since the immediate interested party (user) is Commons and artist biographies / artwork articles on the Wikipedias.
PS. As an example of why I think the concepts are distinct, consider an Austro-German painter of the Romantic school, with one formal citizenship in the Weimar Republic (among others), who also happens to be better known, and thus more notable as, a German dictator (nobody talks about him as an Austrian dictator). We need the concepts to be able to express all these facets depending on the context of the use: i.e Commons' perspective (the painting and writing) vs. Wikipedia's perspective (everything else).
PPS. I think I may have just ruined Godwin's law. :) --Xover (talk) 06:21, 3 August 2017 (UTC)
Yair rand, yes nationality can be confusing and complicated and controversial but for the great majority of people it is not. Commons creator templates (from which I am importing data at the moment) use Nationality field, which is also defined by template:nationality (used on Commons and Wikidata). In most of the cases the nationality can be encoded by language independent ISO 3166 codes with only a few dozen extra nationalities added for distinct nonexisting countries or regions. On commons we use hundreds of nationalities not thousands. Assigning "nationality" is also not that complicated as most of the identifier databases I checked specify the "Nationality" of a person (see for example RKD). Also, as Xover mentioned, it can be imported from other projects as most Wikipedia articles start with a phrase like "German poet", "Russian painter", etc. Also categories on Commons and Wikipedias are usually grouped by nationality. Nationality is also specified in En-Wiki en:template:Infobox artist and possibly other templates. I will be away from the internet for a few days and not be able to continue this discussion. --Jarekt (talk) 12:27, 3 August 2017 (UTC)
Hello Jarekt, long time no see ;)
on frwikisource's Authors, that uses the same nationality concept as Commons, we simply use a reference list of p27 which the lua template links to the corresponding gentile, like this, this way, we can decide to use "German" for "German Empire", "Germany", and all other names used through times.... or not (for specific places). I don't know if you could use the same system on Commons (that needs to be pluri-lingual), but in fact, we were inspired by the system that Commons is still using for converting languages... :)
when we find a new country (through Categories that list unmatched values), we just add it to the list...
what do you think ?
--Hsarrazin (talk) 16:50, 7 August 2017 (UTC)
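Editorial sketch: the reference-list approach described in this comment (a table that maps P27 values to a nationality label, plus a list of unmatched values to review) might look like the following in Python. This is not the frwikisource Lua module; the table entries, the English labels and the function name are illustrative assumptions, and only item ids that appear elsewhere in this thread are used.

```python
# Illustrative entries only; a real table would be curated by the community.
DEMONYM_BY_COUNTRY = {
    "Q183": "German",     # Germany
    "Q34266": "Russian",  # Russian Empire
    "Q159": "Russian",    # Russia
    "Q30": "American",    # United States of America
}


def demonyms_for(p27_values):
    """Split P27 item ids into known demonyms and unmatched ids needing review."""
    matched, unmatched = [], []
    for qid in p27_values:
        label = DEMONYM_BY_COUNTRY.get(qid)
        if label is None:
            # Unknown country item: report it so it can be added to the table.
            unmatched.append(qid)
        elif label not in matched:
            # Several historical states may map to the same label; keep it once.
            matched.append(label)
    return matched, unmatched


# P27 values given in this thread for Mikhail Bulgakov (Q835).
matched, unmatched = demonyms_for(["Q34266", "Q15180"])
print(matched, unmatched)
```

The unmatched list plays the role of the tracking categories mentioned above: each id in it is a candidate for a new table entry, or for a deliberate decision not to map it.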
Thanks for the suggestion, Hsarrazin. I was thinking about generating my own version of a lookup table like your s:fr:Module:Auteur2/nationalités, but then I figured that if I need it on Commons, and Wikipedia's Infobox might need it (because it is using such a field), then others might need the nationality of a person as well. Instead of all of us creating our own lookup tables, it is better if we store them on Wikidata. That is why I started this discussion. --Jarekt (talk) 03:04, 8 August 2017 (UTC)
@Jarekt: the subject is so vague and controversial that I sometimes don't understand. Can you define this property: "nationality"? How should it be separated from ethnic group (Q41710)? Is it nationality (Q231002)? --Infovarius (talk) 15:07, 8 August 2017 (UTC)
Infovarius, "nationality" is pretty clearly defined for people living in modern times and nationality (Q231002) is a good definition. It is closely related to, and sometimes synonymous with, ethnic group (Q41710), citizenship (Q42138), and first language (Q36870). For people living in earlier centuries it is more tricky. I am not an expert on the subject, but it seems to be determined by first language (Q36870) and place of birth. Most biographies, in books and online, specify nationality, and so do databases like ULAN or RKD, even if it is vaguely defined. We refer to Mikhail Bulgakov (Q835) as "Russian Writer" even though country of citizenship (P27) is Russian Empire (Q34266) and Soviet Union (Q15180). You might have Czech or Slovakian nationality but rarely Czechoslovakian. Some people are referred to as "Yugoslavian" and some might be referred to as "Serbian" or "Croatian". Some people might be Canadians (Q1196645) and some French Canadian American (Q5501705). If we create a property to capture "nationality" it will be important to use references in case of controversial nationalities. --Jarekt (talk) 15:54, 8 August 2017 (UTC)
You are not clearing anything. Mikhail Bulgakov (Q835) is "Russian writer" because he was writing in Russian language - this is simpler than "nationality". Another example, in Russia we have two adjectives: "российский" (demonym (Q217438) for Russia (Q159), includes Tatar, Chechen, Udmurt and other "nationalities") and "русский" (refers to Russian (Q7737) and Russians (Q49542), includes Russians in Ukraine (Q311762), Russian Americans (Q1140588) and other "nationalities"). P.S. If you are referring to a place of living then you probably can use demonym (P1549) of such place. --Infovarius (talk) 12:44, 10 August 2017 (UTC)

Wikidata Mapping

Hello, I'm currently connecting the geographic descriptors of the STW Thesaurus for Economics (http://zbw.eu/stw) and I have a question about the term "Bodenseeraum" @de "Lake Constance region" @en. I would like to connect the STW descriptor "Bodenseeraum/Lake Constance region" with Lake Constance (Q4127) and add "Bodenseeraum/Lake Constance region" in "Also known as". I find that the Wikipedia page covers not only the lake but the region too, for example: tourism, leisure and sports, towns and cities at the lake, climate, flora and fauna, etc.

What do you think?

Thanks a lot. Jdeillon (talk) 10:55, 3 August 2017 (UTC)

I think it makes more sense to create a new item for "Bodenseeraum". Lake Constance (Q4127) is instance of (P31) lake (Q23397). ChristianKl (talk) 14:17, 3 August 2017 (UTC)
@ChristianKl: From a practical aspect, creating a new - and isolated - item would undermine one purpose of the mapping Jeanne mentioned, namely connecting STW descriptors via Wikidata to the relevant Wikipedia pages (in different languages). But I'm afraid it touches a more general problem, too: Often, the Wikipedia pages cover different things - in that case, the lake itself and the region, which belong to different and categorically disjoint classes. If we tried to separate these aspects in a clean fashion in WD, we would have to attach the Wikipedia pages to two separate items, which would be hardly desirable. The other option could be making Lake Constance (Q4127) instance of (P31) landscape (Q107425) (or some better fitting class), perhaps with lake (Q23397) as a preferred class. I've seen that approach in different places in WD, but I'm not sure if it is considered good practice. Or would facet of (P1269), mentioned in instance of (P31)/talk, be a better solution? Jneubert (talk) 11:05, 4 August 2017 (UTC)
Wikidata items aren't equivalent to the Wikipedia page. Wikipedia pages often cover more information. A new item might begin isolated but there's a good chance that it gets more statements over time. It might be very worthwhile to describe the meaning of Bodenseeraum with Wikidata statements. http://www.fachdokumente.lubw.baden-wuerttemberg.de/servlet/is/117804/399-Bodenseeraum.pdf?command=downloadContent&filename=399-Bodenseeraum.pdf provides good data for describing it. ChristianKl (talk) 22:51, 4 August 2017 (UTC)
I see your point, but I'm not sure if I can agree completely (because somewhere the differentiation may be too fine-grained to be helpful). Anyway, extending the meaning of Lake Constance (Q4127) is not an option, because there is no consensus, and on the other hand creating a new item would not be very useful in the concrete context of the mapping. A solution could be to introduce a new "close match" property (in analogy to skos:closeMatch). Since similar situations - where the thing some external identifier designates is close, but not completely the same as a Wikidata item - occur in other places too, I'll try to come up with an according property proposal. Jneubert (talk) 13:55, 10 August 2017 (UTC)

Import ISNI from VIAF

Can someone select items in http://viaf.org/viaf/data/viaf-20170705-links.txt.gz that have a link to Wikidata and an ISNI, and check if the Wikidata item has the ISNI and add it to Wikidata if not present? 78.51.128.16 18:06, 4 August 2017 (UTC)
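Editorial sketch: the selection step requested here could be done by grouping the lines of the links file by VIAF cluster and keeping clusters that carry both a `WKP` and an `ISNI` entry. The layout assumed below (one `<VIAF URI><TAB><SOURCE>|<id>` pair per line) is an assumption about the dump, not something stated in this thread, and the function name is hypothetical. Checking the result against Wikidata and adding missing values is left out.

```python
from collections import defaultdict


def wikidata_isni_pairs(lines):
    """Yield (qid, isni) for VIAF clusters that link to both Wikidata and ISNI."""
    clusters = defaultdict(lambda: {"WKP": set(), "ISNI": set()})
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or "|" not in parts[1]:
            continue  # skip lines that do not match the assumed layout
        source, ident = parts[1].split("|", 1)
        if source in ("WKP", "ISNI"):
            clusters[parts[0]][source].add(ident)
    # Emit every Wikidata/ISNI combination found in the same cluster.
    for ids in clusters.values():
        for qid in sorted(ids["WKP"]):
            for isni in sorted(ids["ISNI"]):
                yield qid, isni


# Sample lines built from the Q611314 example quoted later in this thread.
sample = [
    "http://viaf.org/viaf/26736049\tLC|no2002074020",
    "http://viaf.org/viaf/26736049\tISNI|0000000076892915",
    "http://viaf.org/viaf/26736049\tWKP|Q611314",
]
pairs = list(wikidata_isni_pairs(sample))
print(pairs)
```

For the real file, the same function could be fed the text stream from `gzip.open(path, "rt", encoding="utf-8")`. Note that this version keeps all matching clusters in memory, which may or may not be acceptable for the full dump.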

I think we don't do this anymore, because VIAF mismatched a lot of people. I would love a Wikidata game for this or some import to the primary sources tool. Sjoerd de Bruin (talk) 12:10, 5 August 2017 (UTC)
We recently exported from Wikidata a list of people with Open Library identifiers and their VIAF identifiers as we had it. We received a file with the Open Library redirects and duplicates. All the duplicates have been removed. I do not know the status of the redirects. As a result Open Library now includes links to both VIAF and Wikidata. Andy Mabbett did a trial run of an import from the Biological Heritage Library. They have links to VIAF and to Open Library. When we receive these values as well, we will enrich Wikidata and we will find issues. This is how and where we make a positive difference for us and for our partners. This is how we improve quality.
The primary sources data is a dead end. We cannot even query the data that is in there. It is a travesty that it is proposed for new data. It is not functional. Thanks, GerardM (talk) 12:44, 5 August 2017 (UTC)
There is active work going on to "revive" the primary sources tool. Sjoerd de Bruin (talk) 08:17, 7 August 2017 (UTC)
I recently learned that both VIAF and ISNI are managed by the OCLC. So when both are available, when they match a current record with either a VIAF or an ISNI identifier we enrich our content. So there are no downsides to this proposal. Thanks, GerardM (talk) 09:02, 7 August 2017 (UTC)
"Primary Sources" seems to be suitable for this type of data. Links between VIAF ids and ISNI aren't stable. If needed, one could always do some federated query based on VIAF id.
--- Jura 15:54, 8 August 2017 (UTC)

GerardM, User:Sjoerddebruin, User:Jura1: There are large gaps in the ISNI coverage. Just one example: Q268840 (Sally Potter) has VIAF, in VIAF has many connected IDs, including ISNI, and in Wikidata also has many other IDs. Why does the ISNI not show up in Wikidata? Yes, one can manually fix this, but the tools seem to be broken. The VIAF record history shows

SELIBR|276046      add     2009-03-03T12:03:42+00:00 // first entry
ISNI|0000000081584466   add     2013-09-16T16:15:07+00:00 // link to ISNI
WKP|Q268840     add     2015-04-14T18:56:54+00:00  // link to Wikidata

More than two years have passed. What is wrong with the toolchain? User:Magnus_Manske, any idea? 77.179.20.183 14:00, 9 August 2017 (UTC)

Another example, Q611314, https://viaf.org/viaf/26736049/

LC|no2002074020    add     2009-03-03T12:03:40+00:00 // first entry
ISNI|0000000076892915   add     2013-09-16T16:15:07+00:00 // link to ISNI
WKP|Q611314     add     2015-04-14T18:56:54+00:00 // link to Wikidata

Why are the ISNI not automatically added? It could have been obtained automatically since 2015-04-14T18:56:54+00:00. How many more are missing? Thousands? A million? User:D1gggg and User:GerardM any idea? 77.179.20.183 23:47, 9 August 2017 (UTC)

Proposing the "primary sources tool" in this particular instance, when both VIAF and ISNI numbers are known, shows a misunderstanding of both VIAF and ISNI. First, ISNI does not accept imports that are below 95% accuracy. This is better than what we have at Wikidata as a general error rate. Second, we have already identified the people et al. involved because WE associated them with Wikidata identifiers. As a consequence, all arguments that there are issues with VIAF are issues with our own content as well. OCLC administers both VIAF and ISNI, so the compounded error rate is really small, and it is a travesty to have people do manual work and introduce an error rate of 6% for no good reason. Thanks, GerardM (talk) 06:44, 10 August 2017 (UTC)
I don't understand your argument. ISNI has an error rate of 5%? At least now it's clear we shouldn't include it at all.
How did you reach the conclusion about a supposed error rate of Wikidata? Has this something to do with your edits with CatScan?
--- Jura 06:51, 10 August 2017 (UTC)
Obvious. ISNI does quality checks before they import new data. When they find an error rate of 5% or more, they do not include the new data. When data is entered manually, any tool or method, you introduce around 6% of new errors. Consequently when ISNI and VIAF share the same item identified at our end, the quality of that data is superior to anything we can do. Including the "primary sources tool" because its use introduces the same level of errors. Thanks, GerardM (talk) 09:38, 10 August 2017 (UTC)
I would be interested to see a reference for your claim that "anything involving a human in the loop has 6% error rate". − Pintoch (talk) 09:58, 10 August 2017 (UTC)
That is not relevant to the discussion. Suffice to say that any manual action introduces errors .. We are discussing the quality of importing combined data from an export by the OCLC where VIAF and ISNI are combined. With a proper script there are no new errors that will be introduced. When we run the export from the OCLC repeatedly the quality we get at our end will improve with time. Otherwise we remain unaware of their improvements. Thanks, GerardM (talk) 11:20, 10 August 2017 (UTC)
Why did you bring up this claim if it is irrelevant? Manual action can introduce errors, but it can also fix errors. That's why people curate databases! So it is totally possible to have a lower error rate with a manual import (by excluding the erroneous records, or marking them as deprecated in the case of duplicates). Again, if you had a source for your claims, I think many people here would be genuinely interested to see some quantitative analysis. − Pintoch (talk) 15:46, 10 August 2017 (UTC)
It is not a claim, it is a question. The point to the argument is that manual changes using any tool introduce errors. Running the same script without modifications will only introduce errors that are in the base material, something a manual process introduces as well. When you are questioning this basic assumption / observed fact, there is little point in arguing further. Thanks, GerardM (talk) 04:47, 11 August 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── WD links to VIAF and VIAF makes a claim about ISNI. Adding ISNI with claim "stated in VIAF", "retrieved 2017-08-10" - how is that NOT better than the addition of an ISNI by an IP user or logged-in editor? And if one runs no script to check if the claim is still supported, how will one find errors that have already been found by others, namely ISNI-IA and VIAF? 92.231.163.153 18:30, 10 August 2017 (UTC)
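Editorial sketch: the sourced statement described in this comment (an ISNI value with "stated in VIAF" and a retrieval date) can be pictured as the small data structure below. It is a simplified illustration, not the Wikibase JSON format that an actual bot would send. P213 (ISNI), P248 (stated in) and P813 (retrieved) are the usual Wikidata properties for this; the item id used for VIAF is an assumption that should be checked before use, and the function name is hypothetical.

```python
from datetime import date


def isni_statement(isni: str, retrieved: date) -> dict:
    """Describe an ISNI claim whose reference says where and when it was found."""
    return {
        "property": "P213",  # ISNI
        "value": isni,
        "references": [
            {
                "P248": "Q54919",  # stated in: VIAF (item id assumed, verify first)
                "P813": retrieved.isoformat(),  # retrieved
            }
        ],
    }


# ISNI quoted in this thread for Q268840, with the retrieval date from the comment.
stmt = isni_statement("0000000081584466", date(2017, 8, 10))
print(stmt)
```

Keeping the retrieval date in the reference is what would let a later script re-check the claim against a newer VIAF export.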

Stub categories

Hi! Today I discovered Wikimedia category of stubs (Q24046192), created in 2016 by @Pasleim:, and I started substituting instance of (P31)  Wikimedia category (Q4167836) with instance of (P31)  Wikimedia category of stubs (Q24046192). However, @Jura1: told me that "we don't want to reproduce Wikipedia's category system with subclasses at Wikidata", so I stopped the substitution. In my opinion instance of (P31)  Wikimedia category of stubs (Q24046192) is better than instance of (P31)  Wikimedia category (Q4167836) + category combines topics (P971)  Wikipedia:Stub (Q4663261). What's your opinion? --Epìdosis 14:00, 6 August 2017 (UTC)

I agree that there's no reason to duplicate Wikipedia's category system. Adding Wikimedia category of stubs (Q24046192) adds no data that isn't already accessible from the parent categories. --Yair rand (talk) 17:11, 6 August 2017 (UTC)
I agree as well that we do not want to replicate the category system (doesn’t work anyway, since category systems are somewhat different in each project). Category items are about a Wikimedia page outside the main knowledge tree (Q17379835), thus we should not put too much effort into them—they are basically here because Wikidata manages their sitelinks. However, if there is a use case that requires special subclasses of Wikimedia category (Q4167836), I wouldn’t object Epìdosis’ efforts. But I don’t know of such a use case yet… —MisterSynergy (talk) 18:42, 6 August 2017 (UTC)
Since almost the beginning of Wikidata till 2016 people were using subclass of (P279) Category:Stubs (Q2944440). P279 is clearly not the right property to establish the relation between two Wikimedia categories. I mass removed the relations but multiple users protested. As a compromise I added instance of (P31) Wikimedia category of stubs (Q24046192) which follows the idea of Wikimedia templates category (Q23894233), Wikimedia administration category (Q15647814) and Wikimedia disambiguation category page (Q15407973) --Pasleim (talk) 09:52, 7 August 2017 (UTC)
That's an improvement indeed. Personally, I mostly rely on "category combines topics" (P971) and "topic's main category" so (if these are not impacted) I should be indifferent to one or the other being used.
--- Jura 10:53, 8 August 2017 (UTC)
No a category should not refer to another category because the same structure does not exist in all Wikipedias. So I do object and imho there should only be an indication that a category is a category. Thanks, GerardM (talk) 12:58, 7 August 2017 (UTC)

@Yair rand, MisterSynergy, Pasleim, Jura1, GerardM: So, what should we do? Stub categories should have:

  1. instance of (P31)  Wikimedia category (Q4167836) + category combines topics (P971)  Wikipedia:Stub (Q4663261);
  2. instance of (P31)  Wikimedia category of stubs (Q24046192);
  3. instance of (P31)  Wikimedia category (Q4167836) + instance of (P31)  Wikimedia category of stubs (Q24046192) + category combines topics (P971)  Wikipedia:Stub (Q4663261)? --Epìdosis 17:26, 11 August 2017 (UTC)

We should not have stub categories at all. We have categories and that is it. Apart from that, I use "is a list of" "whatever" and I use qualifiers like "award received" "award name". It works so well because it allows Reasonator to compose a query with the results. This is one example. Thanks, GerardM (talk) 20:29, 11 August 2017 (UTC)

Nonsense imported from Geonames

Thanks to the bot filling the Cebuano Wikipedia (Q837615) with all the items in Geonames, and the bot importing all the pages from ceb to here, we now have a lot of nonsense items here. Just my latest picks from looking around Thailand items: Mae Hong Son (Q35440991) and Mae Hong Son (Q35441002) claim to be about a stream in Thailand. But for both the coordinates are in Laos, and there is nothing but jungle there, not even a rivulet to be seen in Google Earth. There is, however, a small river with that name in Thailand, which I have now added manually (Mae Hong Son River (Q35513098)). Another one: Khuan Khaeng (Q31686472) and Khuan Khaeng (Q31372043) seem to be two peaks of the same hill; in this case the duplication was already done by the GNS-UFI database from which Geonames imported them. But the problem is that, due to the inaccurate coordinates, the bot gave the hill an elevation of 94 m, while looking at Google Maps it seems to be about 330 m. To make it worse, the bot that imported that statement did not even add an imported from (P143) reference, so it is impossible to see how the bogus heights came here. I am sure this is only the tip of the iceberg, so I worry our database has been screwed up with bogus data now :-( Ahoerstemeier (talk) 14:43, 8 August 2017 (UTC)

There is a dedicated project on svwiki: sv:Wikipediadiskussion:Projekt alla platser-städning to handle this. Users like Kitayama and Taxelson have made thousands of deletions to clean up the mess. It looks far worse for cebwiki: very little cleanup is done there, and the bot run has edited that project much longer. -- Innocent bystander (talk) 15:28, 8 August 2017 (UTC)
If it's already known that this bot creates so many nonsense articles, why do we still import them here before giving them at least a basic check for validity? Ahoerstemeier (talk) 15:34, 8 August 2017 (UTC)
At a minimum, articles that get deleted at svwiki could have their equivalent deleted from here and, if possible, from cebwiki as well.
--- Jura 16:27, 8 August 2017 (UTC)
The deleted pages are easy to find, but many articles on svwiki are instead merged and redirected, without us noticing where and why. Sometimes the item could stay as fulfilling notability here. Other times, they should be deleted since we do not want items about glitches in the GeoNames database. -- Innocent bystander (talk) 19:01, 8 August 2017 (UTC)
It is better to have deprecated value(s) to avoid re-entry of fake objects.
Question is how many of such items we could allow.
Geonames should remove items at their side, then we would remove external ids. d1g (talk) 20:36, 8 August 2017 (UTC)
Geonames is kind of a Wiki as well - I was able to delete those two wrong river entries there. Ahoerstemeier (talk) 21:00, 8 August 2017 (UTC)
@Ahoerstemeier: And in a wiki you can be reverted, as I have been, so watch out! -- Innocent bystander (talk) 03:48, 9 August 2017 (UTC)

The problem mentioned here is really a disaster, eventually. I am losing hours every month trying to fix a tiny part of the damage this bot has created and keeps creating. It is almost impossible to work on mountains in Canada and Australia for instance, because of all the Wikimedia duplicated page (Q17362920) items it has generated along internal borders: summits lying in two provinces, territories or states at a time have, for some reason, been created twice. And again this is a very specific part of the issue, which in fact keeps growing right now. I have noticed this week that hell has recently come to American protected areas. I am currently merging hundreds of items on refuges because in Cebuano it was so important for them to read machine-written stuff on migratory birds of Virginia and the like. Thank you if you can stop this for good. Thierry Caro (talk) 04:39, 9 August 2017 (UTC)

Maybe a new P31 value for anything that hadn't been confirmed with other sources would be a solution. It can be set with preferred rank and avoids that people keep getting these items. Otherwise, it's indeed hard to avoid that these are re-created.
--- Jura 04:59, 9 August 2017 (UTC)
A mountain can have several summits and we can have one item for each, but GeoNames is not a good enough source for that kind of information. -- Innocent bystander (talk) 06:13, 9 August 2017 (UTC)

These items exist in ceb: and sv: and I guess they are created manually in other Wikipedias at different rates, as geographical article coverage improves. So, I don't think we should delete them here. I think we need to set a guideline about how to maintain all these Geonames items. We need a way to state in items which ones were reviewed and are OK and which ones are wrong (removing wrong statements or setting them to "unknown value"). Some bot-generated statistics pages about done/removed/todo status would help too, to see the progress. There are different subclasses of incorrect info (height, coordinates, place names, country, etc). All the applied corrections should be moved to Geonames, so their database is improved and we don't get wrong data again. Any needed changes in ceb: and sv: articles should be applied there too, and if we delete entirely wrong items, ceb: and sv: should delete them too. It is a hard task, but we as a community must do it sooner or later. We could use the visualization capabilities of Wikidata Query Service (maps) to make it easier to work on this task. About the allegedly wrong river info, we should check that they aren't subterranean or historical rivers, which would not be shown in maps or satellite images (two possibilities I just imagined). Emijrp (talk) 10:39, 9 August 2017 (UTC)

One reason I can see that causes "duplicate" rivers, but also lakes, islands, groups of islands and many more, is that the name of the river has been written in several places on one or several maps. It looks like GeoNames has scraped names from maps and put them in their database, not always checking for duplicates or identifying what exactly the name is describing. I have many times complained about the quality of the "populated place"-items in GeoNames. GeoNames is not a good enough source to tell if a populated place exists in real life or not. Many times these populated places are instead administrative units. Many populated places share the name of an administrative unit, but far from always. -- Innocent bystander (talk) 16:32, 9 August 2017 (UTC)
  • I don't think we should rely on cebwiki/svwiki/Geonames to delete items before we remove items from what appears to be valid ones at Wikidata. Given the errors found with these entries in general, I don't think we can assimilate anything that hasn't been otherwise referenced as valid.
    --- Jura 07:23, 10 August 2017 (UTC)
A big part of the geonames items are in fact from the GEOnet Names Server (Q1194038) database, which by itself is far more reliable - but still has some duplicates because what they did for many countries was scraping all names from the maps. At least for rivers I haven't noticed any duplicates in there so far, they were all introduced by geonames themselves; for mountain ranges I have seen a few dupes in GNS as well. Ahoerstemeier (talk) 08:13, 10 August 2017 (UTC)
Maybe we just need to move to a model where we add these as names for a feature and not as features with that name.
--- Jura 08:22, 10 August 2017 (UTC)

To demonstrate how much work is demanded: User:Kitayama has spent 36 days on svwiki cleaning up bot-created articles related to Sweden alone. Lsjbot never came to Sweden on svwiki. The bot stopped before it got there. But there were still thousands of pages with links to things related to the small country of Sweden. Cleaning up only those took one user 36 days, and that on svwiki alone. How much time it would take here and on cebwiki, I do not know. -- Innocent bystander (talk) 06:52, 14 August 2017 (UTC)

  • Personally, I'm much in favor of adding P31=Q35779580 with preferred rank to any Lsjbot-entry that doesn't have any references. As items get completed, we can remove it.
    --- Jura 07:31, 14 August 2017 (UTC)

Can we stop this nonsense

I am fed up with the continuous idea that other parties like GeoNames are problematic while in reality it is our own intransigent refusal to cooperate that is largely to blame. Take GeoNames: they are interested in cooperating with us. At the time we closed the door on them "because their data was not good enough". Now we have a situation that is substantially worse. Because of this intransigence, the data was imported anyway in several Wikipedias. There is no cooperation and the current proposals make things worse.

So what we should do instead is link to the GeoNames database and keep track of the changes that are happening there as well. Once we start collaborating, we can track the changes at Wikidata and compare them with what exists at the Wikipedias. We should work with LSJBOT and not actively discourage him from cooperating with us. We should do better because we are supposed to be a community that cares about data and knows about data. Thanks, GerardM (talk) 09:09, 14 August 2017 (UTC)

Plan moving forward

How should we go about improving the situation? Beyond big words, I think we need specific steps that allow us to identify the quality of each item.
--- Jura 08:17, 15 August 2017 (UTC)

Well, I had a plan to download the Q34-part of GeoNames and compare what I found there with how it is used on Wikidata/pedia. I started doing it as a part of evaluating the quality of GeoNames before the Lsjbot-project started. I quit at some point, probably because I only have 24 hours a day, and some of that time is used for sleeping, family, job and other things in life. I guess it would be a good idea to start doing that again. -- Innocent bystander (talk) 09:32, 15 August 2017 (UTC)
No big words, just some common sense. First, quality of each item is often binary so that approach does not help us. It is not the only way to approach both quality and cooperation. The first thing we do is to agree on what we share that is the same. Then we follow up with the shared information that differs in detail. This is where we want to find out either way what is correct. Systemic differences are identified in this way as well.
All the time results are shared with our partner in cooperation.
In the case of GeoNames, we have a situation that is not in the best interest of any of Wikidata, GeoNames and Wikimedia. The first thing to accomplish is that we have GeoNames identifiers for all the items that are associated through Wikipedias with GeoNames. Then we compare the values with GeoNames and what we are informed about from the Wikipedias. In the cooperation with GeoNames, the information in the Wikipedias is secondary as there are no methods for the continuous verification and validation of differences.
This brings me to my main point. The LSJbot used static information from GeoNames to produce texts in several Wikipedias. When we import the data in Wikidata and allow for the caching of generated texts instead of the saving of generated texts, we allow for the import of data "unverified" at our end and still provide the best information.
The notion that every item has to be correct is a fallacy and it has brought us where we are. No cooperation, no improvement and a deficient service to the Wikipedias. Thanks, GerardM (talk) 09:32, 15 August 2017 (UTC)

Approach : comparison by country

To quote Innocent bystander: "Well, I had a plan to download the Q34-part of GeoNames and compare what I found there with how it is used on Wikidata/pedia."

I think that could work. At some point you will end up with things that are (a) unmatched, (b) Lsj-only texts, (c) things that match other references (preferably on items), and (d) things that integrate with other Wikipedia articles. (c) and (d) shouldn't be an issue.
--- Jura 09:45, 15 August 2017 (UTC)

One problem is that I have a good reputation here and know how to convince others here. But I do not have what is needed to convince GeoNames that this does not exist in real life. From the maps they have used as a source, it looks like it exists, but I know that it doesn't. There exists something with this name here, but it is not what GeoNames says it is. My changes to the GeoNames database have thus far been reverted. -- Innocent bystander (talk) 09:55, 15 August 2017 (UTC)
  • I don't think that we need to be perfectly in sync with Geonames nor that users at Wikidata should be asked to edit Geonames directly. Otherwise one could just build a dynamic link to their database. If there is status here that identifies an item as (b) or maybe (former b) that could be sufficient.
    --- Jura 10:04, 15 August 2017 (UTC)

Another nonsense spam by ceb wikipedia is the duplication of most human settlement place names in Germany (and in many other countries). At de.wikipedia and in most other wikipedias there is good reason to have the municipality and the main (often only) settlement of the same name in that municipality in one article. Any intelligent human would always do this. Not the ceb GeoNames bots. Hence, we have thousands and thousands of useless Wikidata items that clutter our database. Personally, I'd prefer to have all Wikidata items deleted if there is nothing other than bot-generated articles behind them AND no "human" interaction in the version history. --Anvilaquarius (talk) 10:08, 16 August 2017 (UTC)

You do not get it. Nonsense .. any intelligent human .. Conflating the consequence of our actions with the project the data comes from. I prefer us to work together and not insult others. Thanks, GerardM (talk) 15:09, 16 August 2017 (UTC)

SPARQL ORDER BY item - not by numeric nor by string value

"ORDER BY ?item" - what does that do?

SELECT ?item ?isni
WHERE 
{
  ?item wdt:P213 ?isni
}
ORDER BY ?item



yields

114,0000000123530985
...
192,0000000072462811
20,0000000122989524
206,0000000078496895
207,000000012102267X
21,0000000122934507
212,0000000123587973
...
101942,0000000039979739
24427278,0000000116205025
991,0000000121462392

77.179.20.183 15:42, 9 August 2017 (UTC)

That's not what it yields; it yields a list of URIs (?item is a URI) which are ordered in normal string ordering. If you want numeric ordering you need to bind another variable that is just the numeric part of the id, and sort on that. ArthurPSmith (talk) 15:52, 9 August 2017 (UTC)
ArthurPSmith, how is order Q192 , Q20, Q101942 normal string ordering? 77.179.20.183 16:01, 9 August 2017 (UTC)
Actually, the ordering does not work and the order is completely random since you are ordering bounds with "wd:" predicates which is undefined. Matěj Suchánek (talk) 16:17, 9 August 2017 (UTC)
Matěj Suchánek, so one can add statements to Wikidata SPARQL service that have no effect without getting a warning or an error? Users might trust that their ORDER BY statements work by string sort, since the CSV download gives strings. Apart from that, how can one numerically sort by QID? 77.179.20.183 16:31, 9 August 2017 (UTC)
How do you know it has no effect? If a sorting algorithm (Q181593) is unstable, the order is arbitrary but different. Anyway, use BIND( xsd:integer( STRAFTER( STR( ?item ), STR( wd:Q ) ) ) AS ?qid ). Matěj Suchánek (talk) 16:40, 9 August 2017 (UTC)
Matěj Suchánek, thanks a lot. I tested, and ORDER BY has an effect, and for the first items displayed it even had a reproducible order. Thanks for the BIND statement, works fine
SELECT ?qid ?isni
WHERE 
{
  ?item wdt:P213 ?isni .
  # Placed after the triple pattern so ?item is bound: numeric part of the QID, used for sorting
  BIND( xsd:integer( STRAFTER( STR( ?item ), STR( wd:Q ) ) ) AS ?qid )
}
ORDER BY ?qid


I also tried to remove the Wikidata-specific whitespaces from the ISNI, but that results in an error if not limited. Limited to 100 works fine for me:
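SELECT ?qid ?isni
WHERE 
{
  ?item wdt:P213 ?wd_isni .
  # Numeric part of the QID, used for sorting
  BIND( xsd:integer( STRAFTER( STR( ?item ), STR( wd:Q ) ) ) AS ?qid )
  # Strip the spaces from the stored ISNI to get the compact 16-character form
  BIND( REPLACE( ?wd_isni, " ", "" ) AS ?isni )
}
ORDER BY ?qid
LIMIT 100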


77.179.20.183 18:20, 9 August 2017 (UTC)
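
For anyone post-processing a CSV download instead of sorting in the query, the difference between string ordering and numeric QID ordering can be reproduced offline. The following is only a minimal Python sketch; the helper name qid_number is made up for this illustration, and the three URIs are the example items mentioned in this thread.

```python
import re

def qid_number(uri: str) -> int:
    """Return the numeric part of a Wikidata entity URI or bare QID (hypothetical helper)."""
    match = re.search(r"Q(\d+)$", uri)
    if match is None:
        raise ValueError(f"not a QID: {uri!r}")
    return int(match.group(1))

uris = [
    "http://www.wikidata.org/entity/Q192",
    "http://www.wikidata.org/entity/Q20",
    "http://www.wikidata.org/entity/Q101942",
]

# Plain string comparison goes character by character, so "Q101942" < "Q192" < "Q20".
by_string = sorted(uris)

# Sorting on the extracted integer gives the order a human expects.
by_number = sorted(uris, key=qid_number)

print([qid_number(u) for u in by_string])
print([qid_number(u) for u in by_number])
```

This mirrors what the BIND( xsd:integer( STRAFTER( ... ) ) ) expression does inside the query: it derives a separate integer sort key rather than relying on the ordering of the URI itself.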

Comment:

wd:Q192 wd:Q20 wd:Q101942

aren't 3 strings, but 3 IRIs.

  • Without limit it would process 402507 statements, which would take too much time, so it would timeout.

Question: why do you need to process items in QID order? Is it something necessary to answer some question? d1g (talk) 19:51, 9 August 2017 (UTC)

d1g, privjet, the STRAFTER runs through without limit, and gave 409603 (not 402507). It is maybe not specifically necessary, I just prefer to have data ordered. I may like to compare two CSV downloads, and simple diff tools might be better at highlighting changes if the order does not change. Compare that with ListeriaBot, which randomly rearranges lists, so it is hard to see with the MediaWiki diff function what has actually been changed. And to have data more compact, I want the ISNI in compact format. With ca. 9 million ISNIs issued, that is 27 million spaces that Wikidata adds :-(. 77.179.20.183 20:23, 9 August 2017 (UTC)
we have schema:dateModified
maybe we would have schema:dateCreated to sort for humans d1g (talk) 20:43, 9 August 2017 (UTC)
SELECT ?p ?v WHERE 
{
  wd:Q192 ?p ?v
}

ISNI format

(from IP talk)

Hi - your ISNI entries are not following the 4 groups of 4 digits separated by spaces that the ISNI property expects - for example here: in this edit. I think we have a bot that fixes these, but you should probably try to enter them correctly in the first place to limit the extra work needed. See the "format as a regular expression" statement for P213. ArthurPSmith (talk) 15:50, 9 August 2017 (UTC)

ArthurPSmith - ISNI is defined in ISO 27729:2012 as a 16-digit identifier. DeltaBot and KrBot changed my statements to the Wikidata-specific ISNIs. It is a pain that editors have to add the values manually, so adding an extra burden on editors by requiring Wikidata-specific formatting is just hilarious. 77.179.20.183 15:58, 9 August 2017 (UTC)
Note: I will stop adding any ISNI for now. Have a good day. 77.179.20.183 16:04, 9 August 2017 (UTC)

I also find "4 groups of 4 digits separated by spaces" format for ISNI quite annoying and it was a pain to add those spaces each time I copy it from a source that does not use spaces. Maybe we can display it with spaces but store it without? Or allow both formats? --Jarekt (talk) 16:16, 9 August 2017 (UTC)

You can enter it in any format. Eventually a bot normalizes it.
--- Jura 16:48, 9 August 2017 (UTC)

(from IP talk) It's not a Wikidata-specific thing, it's the standard display format. See the discussion on the talk page for the ISNI property - Property_talk:P213. If you go to isni.org and look at any entry, it is displayed there with spaces. You don't have to fix the formatting yourself - as I mentioned, there's a bot that does it - but if you have a long list of them there are plenty of bits of software you can use to add the spaces to the id's in your list (for example I've used OpenRefine and the expression "join(splitByLengths(value,4,4,4,4),\" \")" does the trick there). ArthurPSmith (talk) 19:22, 9 August 2017 (UTC)
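
For lists that are not going through OpenRefine, the same conversion between the compact 16-character form and the spaced display form can be sketched in a few lines of Python. This is only an illustration: the function names compact_isni and display_isni are invented here, and the check on the final character assumes it is either a digit or X, as in the example values quoted in this thread.

```python
import re

def compact_isni(value: str) -> str:
    """Remove all whitespace and validate the 16-character form (hypothetical helper)."""
    compact = re.sub(r"\s+", "", value).upper()
    # Assumption: 15 digits followed by a check character that is a digit or X.
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        raise ValueError(f"not a 16-character ISNI: {value!r}")
    return compact

def display_isni(value: str) -> str:
    """Format an ISNI as 4 groups of 4 characters separated by single spaces."""
    compact = compact_isni(value)
    return " ".join(compact[i:i + 4] for i in range(0, 16, 4))

print(display_isni("000000012281955X"))
print(compact_isni("0000 0001 2281 955X"))
```

Both directions are lossless, so values can be entered or stored in either form and converted as needed.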

ArthurPSmith: You say it is not Wikidata-specific, and then for reference the first source you point to is a Wikidata property page. The other reference is isni.org and "look at any entry" - but if I go to http://www.isni.org/isni/000000012281955X then the first ISNI I see, namely the one in the URL has no spaces. Yes, for display it has spaces. That is similar to
ISO standard identifiers
Standard | Name | Abbr. | Format | Format for display
ISO 2108 | International Standard Book Number | ISBN | 9783161484100 | 978-3-16-148410-0, 9 783161 484100
ISO 3297 | International Standard Serial Number | ISSN | 20493630, 9772049363002 | 2049-3630, 9 772049 363002
ISO 3901 | International Standard Recording Code | ISRC | USRC17607839 | US-RC1-76-07839
ISO 10957 | International Standard Music Number | ISMN | 9790260000438 | 979-0-2600-0043-8, 9 790260 000438
ISO 13616 | International Bank Account Number | IBAN | GB29NWBK60161331926819 | GB29 NWBK 6016 1331 9268 19
ISO 15706 | International Standard Audiovisual Number | ISAN | 0000000016FF0000Y000000009 | 0000-0000-16FF-0000-Y-0000-0000-9
ISO 15707 | International Standard Musical Work Code | ISWC | T0000000010 | T-000000001-0, T-000.000.001-0
ISO 17442 | Legal Entity Identifier | LEI | 5493000IBP32UQZ0KL24 | 5493 00 0IBP32UQZ0KL 24
ISO 21047 | International Standard Text Code | ISTC | 0A9200212B4A1057 | 0A9 2002 12B4A105 7, 0A9-2002-12B4A105-7
ISO 27729 | International Standard Name Identifier | ISNI | 000000012281955X | 0000 0001 2281 955X (two spaces as in VIAF), 0000 0001 2281 955X (one space in ISNI, Wikidata)
77.179.20.183 20:11, 9 August 2017 (UTC)
I'm sorry, but why are you copying my comments to you to Project Chat? You have copied this one incorrectly - the OpenRefine expression is invalid, but it is correct on your talk page. This is hardly an important issue - as I said, if you don't want to edit to the format this property expects, a bot will fix it within a day or two I believe. If you think the format should be changed, the property discussion page is the place to raise your concerns - and I pointed you there because this very concern had been raised several years ago, and there was a community decision to stick with the display format. If you think we should change, make your case there. ArthurPSmith (talk) 21:00, 9 August 2017 (UTC)
ArthurPSmith, no idea why the copying was broken. I now try to copy "join(splitByLengths(value,4,4,4,4),\" \")" here again, preview shows it as on the talk. Maybe yet another MediaWiki-Feature/Bug? 77.179.20.183 00:17, 10 August 2017 (UTC)
ISBN need far more knowledge about publishers and countries to enter correct tabulation by hand:
978-3-16-148410-0
978-316-148410-0
978-316-1-48410-0
978-316-1-484-10-0
3161484100 is by far easier to enter manually d1g (talk) 21:07, 9 August 2017 (UTC)
d1g, maybe sometimes people want to store the ISBN as found on the actual product. So, for ISBN one may like additional options besides the compact stored format. But an ISNI is rarely printed on humans. Input methods are a separate matter. The ISNI issue is mainly about data storage. 77.179.20.183 00:17, 10 August 2017 (UTC)
It is not necessary to know tabulated form of ISBN. Many authors don't care about such things when they are used in citations.
We shouldn't ask users to format/enter identifiers against complex presentation formats. d1g (talk) 23:19, 10 August 2017 (UTC)
We need to follow standards of course, so spaces in ISNI shouldn't be meaningful. d1g (talk) 20:58, 9 August 2017 (UTC)
We also need more properties: "display format" "value format" d1g (talk) 20:58, 9 August 2017 (UTC)
The spaces break the resolver. Broken: https://tools.wmflabs.org/wikidata-todo/resolver.php?prop=213&value=0000000059363552 Works only with the Wikidata-specific ISNI: https://tools.wmflabs.org/wikidata-todo/resolver.php?prop=213&value=0000%200000%205936%203552 77.179.20.183 00:17, 10 August 2017 (UTC)

Import CSV: QID,ISNI

@ArthurPSmith, Jarekt, GerardM, Pasleim, Ivan A. Krestinin: Here are more ISNI for import:

77.179.20.183 17:10, 9 August 2017 (UTC)

  • What is the source for that data? ChristianKl (talk) 17:47, 9 August 2017 (UTC)
ChristianKl - Me. As when I added the data manually the source was me. But it is really tiresome. See my past additions. One can verify the accuracy by checking with the content on the ISNI-link target. 77.179.20.183 18:25, 9 August 2017 (UTC)
Isn't it easier to just use the authority control adding tool per individual, and let it populate all missing data, whether it be ISNI or whatever?  -- billinghurst sDrewth 00:19, 10 August 2017 (UTC)

billinghurst - no idea how, but here is a list with links to the individuals. Can you test some?

77.179.20.183 02:22, 10 August 2017 (UTC)

billinghurst, ArthurPSmith, GerardM - the data still is not there. What can be done? 77.179.8.56 12:42, 10 August 2017 (UTC)
With the authority control tool, I went through and updated the second batch to have the available links as shown at VIAF. Still recommend that users get an account, and utilise the available tools to do these edits.  -- billinghurst sDrewth 06:24, 11 August 2017 (UTC)

All done manually. 92.227.36.147 13:48, 10 August 2017 (UTC)

Harvest Templates Issues

Trying to use Harvest Templates to pull dates of birth from members of the 14th National Assembly of Pakistan on English Wikipedia.

We have run into some issues which we can’t explain:

  1. The items we are interested in aren’t all in a usable Category within Wikipedia, but we are able to get them via a Wikidata SPARQL query.
  2. This list contains 342 people, which we believe we can input into the “Manual list” section of Harvest Templates.
  3. But when we hit “get pages” - we only get 25 pages returned.
  4. Of those - an even smaller set look like they actually get updated using the tool (we had to untick “don’t load items with the property set” - as we got an error with it set).
  5. Using Petscan with a SPARQL query, and template filter, we have verified that there are 120 of the previously selected items with a “Birth date and age” template, but no date of birth yet in Wikidata.

Any ideas of where we can find additional help with this? --Lucyfediachambers (talk) 08:54, 10 August 2017 (UTC)

"We" is that EveryPolitician employees?
--- Jura 07:32, 10 August 2017 (UTC)
Jura1. Apologies for not being clear on that: we = me, Oravrattas (both EveryPolitician) and Saqib in Pakistan, who reached out about this, and we've been attempting to puzzle this out together. --Lucyfediachambers (talk) 08:54, 10 August 2017 (UTC)
When you click "get pages", the tool will find all pages with template, up to the limit (10,000 by default). The "manual list" filter is only applied after all the pages (up to the limit) have been sent back. So if the pages you're looking for are beyond that limit (very likely given that you had to uncheck already set), you may never reach them. I would still try to find a category to filter. Matěj Suchánek (talk) 10:11, 10 August 2017 (UTC)
Ah, so the tool first finds all the pages with the specified template (with a limit), and then applies the filters? I had assumed that when you supply a list of pages, it starts with those, and then looks for the ones in that list which have the template. What is the ordering when a Category is supplied? Why does the same problem not occur — i.e where it first finds only the first 10,000 pages with the DOB template, again missing lots of other people when it filters those to ones in the specified category? --Oravrattas (talk) 04:51, 11 August 2017 (UTC)
All filters but the manual list are applied server-side, including category. (Note that you can also input a list of Qids but this will behave the same.) Matěj Suchánek (talk) 09:55, 11 August 2017 (UTC)
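
The effect of applying the manual list only after the page limit can be shown with a toy model. This is a Python sketch of the order of operations as described in the replies above, not the tool's actual code; the function names, the page titles and the limit of 10 are all made up for the illustration.

```python
def limit_then_filter(template_pages, manual_list, limit):
    """Cap the template results first, then keep only pages on the manual list."""
    wanted = set(manual_list)
    return [page for page in template_pages[:limit] if page in wanted]

def filter_then_limit(template_pages, manual_list, limit):
    """Keep only pages on the manual list first, then cap the result."""
    wanted = set(manual_list)
    return [page for page in template_pages if page in wanted][:limit]

# 100 hypothetical pages using the template; 3 of them are on the manual list.
pages = [f"Page {i}" for i in range(100)]
manual = ["Page 5", "Page 50", "Page 95"]

print(limit_then_filter(pages, manual, 10))
print(filter_then_limit(pages, manual, 10))
```

In the first variant any listed page that falls beyond the limit is silently lost, which would match the symptom of getting far fewer pages back than were supplied. A category filter applied before the limit behaves like the second variant.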

Delicate unmerge required Q30849 / Q14206373

Last December someone merged the project item page Q14206373 to blog (Q30849). It is going to need some delicate unpicking from someone who has a little time; or just revert back to the December 2016 versions of both pages, and let the rebuild go at its own pace. Thanks if someone has the time to resolve.  — billinghurst sDrewth 00:15, 10 August 2017 (UTC)

billinghurst - reverted, most of the edits to the blog-item were clean-up of the merge anyway. 77.179.20.183 02:19, 10 August 2017 (UTC)

autores.uy database id (P2558) - URL connection fails for some users

re P:P2558, http://autores.uy/autor/13103 (Chrome)

This site can’t be reached
autores.uy took too long to respond.
Search Google for autores uy autor 13103
ERR_CONNECTION_TIMED_OUT

I noticed that for several days. How can one add a value if the website is down? Original research? User:Zeroth https://www.wikidata.org/w/index.php?title=Q27786371&diff=534730052&oldid=534719933 77.179.20.183 02:14, 10 August 2017 (UTC)

It works for me. Strakhov (talk) 07:20, 10 August 2017 (UTC)

Still broken for IP of Telefonica Deutschland GmbH, tested in Chrome, Edge, Firefox. Here is the Firefox message:

The connection has timed out
The server at autores.uy is taking too long to respond.
 The site could be temporarily unavailable or too busy. Try again in a few moments.
 If you are unable to load any pages, check your computer’s network connection.
 If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.

77.179.8.56 12:26, 10 August 2017 (UTC)

autores.uy connection tests

OK for

Not OK for

That's strange. Can you provide any proxy to test the connection errors from other countries? Thanks.--Zeroth (talk) 14:00, 10 August 2017 (UTC)

I sent a mail to the admin of autores.uy and it seems that they had DDoS attacks from German servers and temporarily blocked access from an IP range of that country. At my request, they lifted it. Could you try again please? @Jared Preston:. Regards, --Zeroth (talk) 16:35, 10 August 2017 (UTC)
@Zeroth: yes, I can confirm the link is now working again. Jared Preston (talk) 16:47, 10 August 2017 (UTC)

1 character job for a bot

  • remove P856="example.org" when P856="example.org/" is present

Such a bot can run at 10-minute intervals after humans or other bots d1g (talk) 06:04, 10 August 2017 (UTC)

@Ivan A. Krestinin: same reason, all patterns could allow optional "/" in P856 d1g (talk) 06:14, 10 August 2017 (UTC)
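
The check such a bot would need is small. The following Python sketch only covers the comparison step on a plain list of URL strings; the function name redundant_urls is invented here, and reading the P856 values from an item and removing the redundant statement are left out.

```python
def redundant_urls(urls):
    """Return URLs that differ from another listed URL only by a missing trailing slash."""
    present = set(urls)
    return [url for url in urls if not url.endswith("/") and url + "/" in present]

# Hypothetical P856 values on one item.
values = ["http://example.org", "http://example.org/"]
print(redundant_urls(values))
```

A value is only reported when the variant with the trailing slash is also present on the same item, so an item with a single URL is never touched.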

Change "Nearby" link to list items nearby the item?

Currently "Nearby" links to Special:Nearby, but it could just as well point to list pages nearby the item one is on.
e.g.

What do you think?
--- Jura 09:19, 10 August 2017 (UTC)

It should be so when URL contains "wiki/Q", otherwise it should use geolocation. d1g (talk) 09:25, 10 August 2017 (UTC)
  • Yeah, I don't think it would work beyond namespace 0
    --- Jura 07:34, 14 August 2017 (UTC)

Consider "imported from" as a qualifier, not a source

(See Wikidata:Project_chat/Archive/2013/07#Not counting "imported from" as a source for a previous discussion)

Help:Sources#Different_types_of_sources says

statements that are only supported by "imported from (P143)" are not considered sourced statements. If you encounter one of these statements, please remove "imported from" and add a more reliable source.

Why not just show those imported from (P143) as qualifiers, like proportion (P1107)? It would make it easier (i.e. without having to click on each "1 source") to check if a statement is sourced or not. The RedBurn (ϕ) 09:32, 10 August 2017 (UTC)

Oppose This would violate the data model. Qualifiers say under what conditions the statement is valid; references say how we found the statement, whatever the reliability of the sources is. Matěj Suchánek (talk) 09:55, 10 August 2017 (UTC)
Properties for references should be in references. d1g (talk) 10:24, 10 August 2017 (UTC)

What to do with items that have multiple articles in same language wiki?

I'm sure this is answered somewhere but I can't find it. So I stumbled across the English-language article St. Cuthbert's beads (Q7587693), which in the USA we call Indian bead (Q3055342). Separate articles make sense as they have separate histories of being used as currency, etc. Scientifically, these are fossilized columns of crinoidea. In French they are known as entroque. The English Indian bead article and the French entroque article share the same Wikidata item. I've just created a category on Commons called Category:Crinoid beads that I have linked to Q3055342 (Indian beads/Entroque) since it contained two items. I've manually added a Wikicommons link to the St Cuthbert's beads article, but is there any way to merge these on Wikidata so the Wikicommons category can be linked to both here? It seems like there should be a way for Wikidata to understand that Q3055342 and Q7587693 are the same thing. Or is the only option to engage in the mess of suggesting a merge of the two English-language articles to some neutral title (which even I don't support)? Wikimandia (talk) 21:37, 10 August 2017 (UTC)

First of all the items should be mutually related to each other by permanent duplicated item (P2959) (or, if there is doubt, said to be the same as (P460)). The interwiki links are not affected by this, they need to add them manually in the old style in Wikipedias or at Commons. —MisterSynergy (talk) 05:01, 11 August 2017 (UTC)

Restrict merging rights to autoconfirmed users

Hi, here at Wikimania several contributors were talking about merging issues. Merging is a process that requires some knowledge of how Wikidata works, and as the help page states in bold, users must be absolutely certain that the two items are the same. In practice, merging errors still occur, most of them made by newcomers. As a merge can be tricky to revert, especially if one or both of the merged items were heavily used as values in statements in other items, I think it would be better to restrict merging rights to autoconfirmed users. What do you think of it? @Lea Lacroix (WMDE): would a vote on this page be enough to activate this restriction or is a full RfC needed? -Ash Crow (talk) 23:08, 10 August 2017 (UTC)

Maybe at some point we won't need to make as many merges. So it would make sense that merging would be less necessary. But this also depends on how often things outside Wikidata change.
On the other hand, merge is hidden already... Maybe we should hide it even more to avoid confusion for newcomers. Weak support
We have a bot to replace indirect items with direct links, but not an unmerge bot right now. d1g (talk) 23:56, 10 August 2017 (UTC)

Comment: The gadget could be set up so that only autoconfirmed users can use it. Getting autoconfirmed is a low bar, so gives us about a week, and a low number of edits, so seems a reasonable approach.  -- billinghurst sDrewth 02:45, 11 August 2017 (UTC)

Support Looks fair enough to me. Pamputt (talk) 11:56, 11 August 2017 (UTC)
Support I simply agree. I made a similar proposal in the past (Wikidata:Project_chat/Archive/2016/05#Mergers_limitation) and nothing substantial has changed since then.--Jklamo (talk) 12:32, 11 August 2017 (UTC)
Support I also support this restriction. ArthurPSmith (talk) 12:44, 11 August 2017 (UTC)
Support -- Maxlath (talk) 12:54, 11 August 2017 (UTC)
Symbol oppose vote.svg Oppose it might be good to restrict, but there is no demonstration of what does it help. What is the rate of false VS correct mergers by autoconfirmed users, vs the same rate by not autoconfirmed users? The simply answer of those inside the club, seems to be restrict, restrict, restrict. But are those that are inside the club, the editors that commit significantly less errors? Does the rate change after one becomes and autoconfirmed user? Are there other options on reducing wrong mergers? How many wrong mergers are there ? Is there any policy on when something should be restricted? 77.179.188.123 13:00, 11 August 2017 (UTC)
I can understand that statistics would be useful, though I would like to reflect upon the proposal in a different way. We know that editing WD requires some knowledge of WD and the wikis and it is particularly the case with merges. Having a bad merge is highly problematic where it goes unnoticed, so emphasising the complexity and a slightly raised level of competence is not a bad thing.  — billinghurst sDrewth 13:21, 11 August 2017 (UTC)
  • Support, users should have some basic understanding of Wikidata before doing such actions. Stryn (talk) 17:45, 11 August 2017 (UTC)
    • Stryn - how is that ensured by allowing autoconfirmed users still to do it and blocking only not autoconfirmed users? Does the proposal achieve what you wish for? And why should "some basic understanding of Wikidata" be of any help to prevent wrong merges? 77.180.179.245 17:47, 11 August 2017 (UTC)
      • Of course it does. I think a user with 50 edits knows more about Wikidata than a user who makes their first edit here and it's merging. And anyone can make mistakes, it's not always about the edit count, but I believe newbies (non-autoconfirmed here) can make more mistakes. Stryn (talk) 17:52, 11 August 2017 (UTC)
  • Support Seems reasonable. --Fralambert (talk) 17:58, 11 August 2017 (UTC)
  • Support Strakhov (talk) 03:59, 12 August 2017 (UTC)
  • I Support binding this right to the (auto)confirmed group, as well as the item-redirect right. We should have a dedicated page where newbies can report possible merges. An update to Help:Merge and MediaWiki:Wikibase-error-sitelink-already-used is also essential. Note that users who lose the rights won't be able to enable and use Merge.js. Matěj Suchánek (talk) 08:16, 12 August 2017 (UTC)
  • Support I would additionally like the function to be included in the default Wikidata package, without a user having to go to the gadget page to enable the feature. ChristianKl (talk) 19:45, 12 August 2017 (UTC)
  • Comment The merge tool shouldn't be as hidden as it is - I recently had to explain to an experienced Wikimedian (but a relatively new Wikidataian) how to find it! Thanks. Mike Peel (talk) 00:15, 13 August 2017 (UTC)
    We are not talking about hiding or unhiding the tool, just restricting its usage during the first few days after a user's account creation. It would still sit as a gadget. If you believe that the tool is hidden, what is your expectation of where it could be better seen? It already shows as the top gadget, and it shows on Help:Merge.  — billinghurst sDrewth 09:25, 13 August 2017 (UTC)
    I was replying to @D1gggg's comment above saying "On the other hand, merge is hidden already... Maybe we should hide it even more to avoid confusion for newcomers." Personally, I think the gadget should be turned on by default. Thanks. Mike Peel (talk) 23:11, 13 August 2017 (UTC)
  • I personally don't support this, but if this is needed, please firstly restrict it to registered users.--GZWDer (talk) 12:27, 13 August 2017 (UTC)
    Isn't that a given? Anonymous users are unable to run scripts as best as I know it; they don't have gadgets — billinghurst sDrewth 14:53, 13 August 2017 (UTC)
    Addendum: Oppose unless a message that introduces them to the dedicated page for merge requests is shown when a merge by adding a sitelink through the client add-link widget is blocked.--GZWDer (talk) 09:26, 14 August 2017 (UTC)
  • I just saw this anonymous user moving svwiki sitelinks from some items to others, when they should have merged them (if not all, then most of them; I didn't check all items), and I have a feeling that they will ignore any invitation to register and use a gadget/tool to perform merges. --XXN, 01:04, 14 August 2017 (UTC)
Not many edits TBH. If there were hundreds or thousands, then the gadget would save time. d1g (talk) 02:17, 14 August 2017 (UTC)
Support this can be (de facto the only?) exception to Wikipedia:IPs are human too (Q13164301). @GZWDer: I'd love to say that anyone who hates logging in, or is unable to log in citation needed (Q3544030), should request IPBE rather than using puppet IPs. --Liuxinyu970226 (talk) 15:36, 14 August 2017 (UTC)
Support Sensitive issue with the potential for damage that requires a lot of cleanup. --Hedwig in Washington (talk) 03:03, 15 August 2017 (UTC)

Wikidata:Identifier

I created two lists based on unique identifier (Q6545185)

and edited several items that had an instanceOf claim, although the items are instances of ID systems, not instances of IDs. There are some items about IDs (instanceOf UID). Also, several national identity cards were claimed to be instanceOf UID, but they should be classes and should not have a regex format attached, because they are ID cards, not strings. Whether individual strings, e.g. AFG (Q12626453), are instanceOf UID is not my concern here; my main concern was that items like ISBN should be subclassOf UID and not instanceOf UID.

A further observation, there are several lists for properties

Would it be possible that a bot creates main namespace items for each "external-id"-property? That would grow Wikidata:List of classes of identifiers.

Last but not least, there is a bug with Listeria, if one wants to list Wikidata property (P1687) then the Q-item with the same number as the P-item is shown, e.g. for Alberta Register of Historic Places identifier (Q19832913) it shows as Wikidata property (P1687) New Hampshire (Q759) instead of Alberta Register of Historic Places ID (P759). 78.55.161.157 03:44, 11 August 2017 (UTC)

  • So what's the point to create these lists of identifiers? What's the point to create ID items?
  • We need to talk about Wikidata:Identifier. I see a need for a page with that title, but in its current condition it should better be deleted.
  • Listeria is a tool by User:Magnus Manske. There is a public repository where you can file bug reports if you think something is broken. Magnus is busy, however, so sometimes it takes a while until he shows a reaction.
MisterSynergy (talk) 05:11, 11 August 2017 (UTC)
MisterSynergy "So what's the point to create these lists of identifiers?" - There is no "the point", but there are purposes of the lists,
  1. sort out the current mis-classifications of identifiers in Wikidata (e.g. subclassOf vs instanceOf)
  2. track inconsistent property usage, e.g. country (P17) vs applies to territorial jurisdiction (P1001), issued by (P2378) vs operator (P137)
  3. demonstrate what information is already stored in Wikidata main namespace - or how much is missing, when compared with the 1945 "external-id" properties
"What's the point to create ID items?" - There is no "the point", but there are several purposes,
  1. connect Wikipedia articles, like at International Standard Name Identifier (Q423048)
  2. to correctly link from a property (e.g. ISNI (P213)) using subject item of this property (P1629) to the main name space - if there is no ID item, one cannot link correctly, and currently P213 stores a lot of false claims
  3. to make the data that Wikidata already stores in properties available for SPARQL retrieval via the item namespace
"We need to talk about Wikidata:Identifier" - that page has a talk page, use it! "I see a need for a page with that title" - Good, it is there, why not say "Thanks" to the creator? "but in its current condition it should better be deleted." - Explain why, especially using WD policies and considering that you just said "I see a need for a page with that title" - Why delete instead of amend? What do you have in mind to be there? 77.179.188.123 12:44, 11 August 2017 (UTC)
Frankly, I do not understand what you are doing or planning to do, and this is reflected by your contributions, your comments on this page, and by your new pages in Wikidata namespace as well. My best assumption is that you are a pretty inexperienced user who is not used to the habits here and not aware of the consequences of their own activity. Your apparent reluctance to create and use an account does not permit you to develop any track record. I’d like to ask you to slow down a lot, and discuss your plans before you change anything on a large scale on property or item pages. Beginners seldom manage to work with an acceptably low error rate if they immediately try to be a major player.
The identifier page does not contain anything that appears useful, thus a deletion would probably be the best for a fresh start by experts. Regards, MisterSynergy (talk) 13:32, 11 August 2017 (UTC)
  1. "Frankly, I do not understand what you are doing" - That is your limitation! Try to fix!
  2. "Beginners" - I am not a beginner, I have worked with identifiers for more than a decade and, as demonstrated, have some knowledge about P31/P279 that registered user ArthurPSmith does not have
  3. "seldom manage to work with an acceptably low error rate" - show the errors, the error rate and say what is "acceptably low" to you
  4. "if they immediately try to be a major player." - I don't try to be a major player - maybe I am, but I don't try to. 77.179.36.254 14:02, 11 August 2017 (UTC)

IP user modifying unique identifier (Q6545185) related items

This IP user seems to have made a large number of changes to P31/P279 statements regarding our unique identifiers, including some modifications of properties. I don't believe they make sense, and don't understand why an unidentified (new?) user of this sort thinks they can override decisions on our subclass hierarchy made by many previous users here. I believe all their changes should be reverted as they don't make sense to me. unique identifier (Q6545185) has all along been a class whose instances are specific identifier systems, NOT the actual id's of objects within those systems - this can easily be seen from the "Examples" listing in the enwiki sitelink for Q6545185. Those "examples" are instances, not subclasses, of "unique identifier", and I think that's been clear from the way those relationships have generally been applied within wikidata. ArthurPSmith (talk) 13:06, 11 August 2017 (UTC)

ArthurPSmith - "I don't believe they make sense" - then show, using reasoning, and not using attack on the user.
Semantics are broken and you should probably be blocked from editing P31/P279 until you can demonstrate your knowledge improved. BTW: "This IP user seems to have made a large number of changes to P31/P279 statements regarding our unique identifiers" - Do you really play OWN here? Do you regard the identifiers to belong to you and your friends? Why do you have ID lists in user space: User:ArthurPSmith/Identifiers. Why not in a project? You want to assert some kind of control? 77.179.36.254 13:22, 11 August 2017 (UTC)
Did you just edit my comment and your response here? Here's what I wrote to respond to you before your edit interfered: You are confused about the meaning of unique identifier (Q6545185). Given where wikidata is, there was naturally some confusion here as well (the country codes case is a good example). Nevertheless, thinking that something that is an instance of "unique identifier" is the actual code for identifying a specific entity is completely wrongheaded. What could "uniqueness" possibly mean if we are talking about a single entity? It is inherently nonsensical. If you look at en:Unique identifier (and its reference en:Identifier) you will see that all the examples listed (*instances*) are things like ISBN, ISSN, ISNI, ORCID, etc, which are systems for uniquely identifying entities with specific codes. ISO 3166-1 alpha-2 code (Q1140221) should have been instance of (P31) country code (Q906278), not a subclass. As to my expertise on the issue of class relationships, perhaps some others here could note the work I've done on this in the past. Just as a sample, I did some considerable editing of Help:BMP and wrote a number of the pages under Wikidata:WikiProject Ontology. Whereas, given your continued insistence on using an IP address, it's not clear what background you have at all. ArthurPSmith (talk) 17:21, 11 August 2017 (UTC)
as to what was going on at User:ArthurPSmith/Identifiers - this was at an early point in the external-id conversion project, to decide which string properties qualified as legitimate unique identifiers that would be useful as external id's. See Wikidata:Identifier migration for the actual main work on this, and thanks to Lydia and the developers for making the migrations happen. My page was just intended as a temporary reference point on what was going on with various identifiers. There are lots of similar user pages with occasionally gathered stats of this sort. I'm leaving this one up just for historical purposes. At least it does demonstrate that I've been involved with identifiers on wikidata for some time now. ArthurPSmith (talk) 17:34, 11 August 2017 (UTC)
ArthurPSmith If I changed any of your edits, then it must have been in an edit conflict, without a notice for such a conflict seen here. Please stop your personal attacks "You are confused about the meaning of unique identifier (Q6545185)" - I can claim the same about you. That does not lead us anywhere. Can you please say how you would like the claims to be, including for individual code elements like 10048 (Q4546087), cre (Q12656548), AFG (Q12626453). Is "AFG" a UID? If not, what is it? I didn't change all IDs that are now subclassOf UID, there were already some, which was the reason I started working on it. SPARQL failed to show all when using P31 and failed when using P279. It should be consistent.
"If you look at en:Unique identifier (and its reference en:Identifier you will see that all the examples listed (*instances*) are things like ISBN, ISSN, ISNI, ORCID, etc, which are systems for uniquely identifying entities with specific codes." - Exactly, they are systems, maybe instances of systems. But they are classes of UIDs. en:Identifier does not claim they are instances. 77.180.179.245 17:39, 11 August 2017 (UTC)
AFG (Q12626453) is an instance of international vehicle registration code (Q154015) that in turn should be an instance of country code, which is a subclass of unique identifier (Q6545185). Class relationships can be hard to think about with respect to any abstract concept, but I think here it is relatively clear. The postal code case is interesting however - many things have the same postal code, and the code within a given country is not a "unique identifier" of anything without also specifying the country. I don't think it belongs under "unique identifier" at all, that is, all that a zip code (for example) identifies is the area identified by the zip code, which is somewhat tautological - and which may also change over time. So I think 10048 (Q4546087) can be instance of (P31) ZIP code (Q136208), sure, but what relationships ZIP code (Q136208) should have otherwise are not clear to me - in the end it should be an instance of identifier, but not necessarily of unique identifier (Q6545185). ArthurPSmith (talk) 17:53, 11 August 2017 (UTC)
ArthurPSmith, thanks for your reply. (US) ZIP code is a UID to identify a (US) ZIP code area (geographic region, but the extension can change). An ISO country code identifies a country (legal entity) which claims to have certain rights over a geographic region. There are different kinds of postal IDs in some countries, e.g. 01 identifies a Postleitregion, (https://commons.wikimedia.org/wiki/File:German_postcode_information.png), no idea if there is a name for the area where codes start with "0". There is always the thing that is identified and the identifier. There are IDs for physical objects (humans, cars, ...) and IDs for non-physical objects. "GB", "DE", "FR" are identifiers. Maybe they are instances or they are classes of ISO 3166-1 alpha-2 code (Q1140221). What you said "AFG (Q12626453) is an instance of international vehicle registration code (Q154015) that in turn should be an instance of country code" would mean that AFG (Q12626453) is not an instance of a country code, because that would only be inherited from international vehicle registration code (Q154015) if international vehicle registration code (Q154015) would be a subclass of country code. Then AFG (Q12626453) is not a UID. "GB", "DE", "FR" would not be UIDs. So there are these things:
  1. UID
  2. specific UID systems (e.g. ISO 3166-1 alpha-2 code, ISO 639-3 language code)
  3. the codes of a system (DE, FR, GB; deu, eng, fra) : 1) subclass or instance of UID ("FR is a UID for the country named 'France'"), 2) subclass or instance of ISO 3166-1 alpha-2 code ("FR is a UID for the country named 'France'"), 3) "DE, FR, GB are UIDs, country codes, 2-letter country codes, ISO 3166-1 alpha-2 country codes"
  4. some classes with no specific UIDs, e.g. "country code", or "article identifier", or "language identifier"
On a page about "FR" it should be sufficient to state the relation to "ISO 3166-1 alpha-2 country code" and then to inherit all the rest that is attached via subclassOf, so that "FR" would be a UID, ID, 2-letter string. If one adds "UID system", e.g. "ISO 3166-1 alpha-2 code is an instance of a UID system", then much is solved. "country code" and "2-letter country code" would not be an "instance of a UID system" but just some classes to group the systems. I would even go one step further and say that "FR" just is a class, and when I write it down on a paper then it is instance of the country code 'FR'. To have items about instances of UIDs (like on my paper), should be rare in Wikidata. 77.180.179.245 19:40, 11 August 2017 (UTC)
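To make the two competing models in this thread concrete, here is a minimal Python sketch. It is only an illustration: the triples are a toy in-memory stand-in for Wikidata statements, labels are used in place of Q-ids, and the helper names `superclasses` and `classes_of` are invented for this example. It checks whether "FR" ends up as a unique identifier when class membership is derived by one instance of (P31) step followed by any number of subclass of (P279) steps.

```python
# Toy statement store: (subject, property, object). Labels stand in for Q-ids.
P31, P279 = "P31", "P279"


def superclasses(cls, triples):
    """Return cls plus everything reachable from it via P279 (zero or more steps)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in triples if s == current and p == P279)
    return seen


def classes_of(item, triples):
    """Classes an item belongs to: exactly one P31 step, then any number of P279 steps."""
    result = set()
    for s, p, o in triples:
        if s == item and p == P31:
            result |= superclasses(o, triples)
    return result


# Model A: the ID system is a subclass of "country code".
subclass_model = [
    ("FR", P31, "ISO 3166-1 alpha-2 code"),
    ("ISO 3166-1 alpha-2 code", P279, "country code"),
    ("country code", P279, "unique identifier"),
]

# Model B: the ID system is an instance of "country code".
instance_model = [
    ("FR", P31, "ISO 3166-1 alpha-2 code"),
    ("ISO 3166-1 alpha-2 code", P31, "country code"),
    ("country code", P279, "unique identifier"),
]

fr_is_uid_in_subclass_model = "unique identifier" in classes_of("FR", subclass_model)
fr_is_uid_in_instance_model = "unique identifier" in classes_of("FR", instance_model)
print(fr_is_uid_in_subclass_model, fr_is_uid_in_instance_model)
```

Under model A the membership is inherited along the P279 chain; under model B the chain stops at the ID system, because P31 is not followed twice. Which model is right for Wikidata is exactly what is being debated here; the sketch only shows the consequence of each choice.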

County letter (Q10571932)

User:Innocent bystander claimed [1] that County letter (Q10571932) is an instance of country code (Q906278). The description says "alphabetic or numeric geographical codes that represent countries and dependent areas". But County letter (Q10571932) has nothing to do with identifying a country or dependent area. It is only meant to identify members of a set of country subdivisions, namely county of Sweden (Q200547).

And if it is an instanceOf UID, then what are "AB", "C", "D"? InstanceOf an instanceOf UID? Semantics broken. 77.180.179.245 18:04, 11 August 2017 (UTC)

The first user who made that claim was User:JakobVoss (Jakob Voß (Q15303972) - all the education, the degrees and the work at VGZ didn't prevent that claim from being made by him). But why did User:Innocent bystander re-insert the claim? 77.180.179.245 18:33, 11 August 2017 (UTC)

It depends on how you define "dependent areas". If a "subdivision of Sweden" is such "dependent area", the claim is right. Otherwise County letter (Q10571932) and country code (Q906278) only share a common superclass. -- JakobVoss (talk) 19:44, 11 August 2017 (UTC)
JakobVoss, wouldn't Wikidata follow what reliable sources define a "dependent area" to be, in lieu of me doing so? For a start: en:Dependent territory. If one just goes by the words "dependent" and "area" then any area may qualify because any area depends on something. And "country code" would be called "area code". But there is a reason why "country" is inside the term. Re "Otherwise County letter (Q10571932) and country code (Q906278) only share a common superclass" - that is not correct. First, all things share a common superclass, here it is: entity (Q35120). Second, the two do not "only share a common superclass", but share several. 92.227.218.95 00:56, 12 August 2017 (UTC)
The Swedish article says it is for car plates.
Not relevant to ATE or countries. d1g (talk) 01:17, 12 August 2017 (UTC)
d1g, there are different ways to define ATE, you may look at it in Soviet/Russian context. A country could also be an ATE - it is an entity, it is defined for administration, and it has to do with territory, thus administrative territorial entity. European Union, Russia, Central European Time Zone, Kaliningrad Region, Moscow City (the federal subject), could all be ATEs. Country is just one ATE class. For some ATEs it is debated whether they are countries, e.g. Kosovo, Transnistria, South Ossetia, Abkhazia, Nakhichevan, Palestine. And then there are ATEs that are not countries, but sometimes seen as outside countries, that's what usually is meant by "dependent area" (American Samoa, Hong Kong, Macao, Gibraltar), the term that JakobVoss wanted to redefine. I see no reliable source for the edits by User:Innocent bystander and JakobVoss that define the counties of Sweden as "countries" or "dependent areas".
User:JakobVoss did not yet comment on 'And if it [Sweden county code] is an instanceOf UID, then what are "AB", "C", "D"?' 77.179.79.12 12:49, 12 August 2017 (UTC)
No problem, please assume good faith! I just created administrative territorial entity identifier (Q36205316) for special classes of unique identifier (Q6545185) which identify any kind of area (countries, counties, etc.). -- JakobVoss (talk) 19:46, 12 August 2017 (UTC)

Postal code

ZIP code

    • How can 10048 (Q4546087) be P31 ZIP code (Q136208) and therefore P31 unique identifier (Q6545185)
    • P31 is not a transitive property, "and therefore" is wrong.
    • Q4546087 can have P279 with value of "sequence of characters" or similar. d1g (talk) 06:03, 12 August 2017 (UTC)
d1g, ZIP code (Q136208) is a subclass of unique identifier (Q6545185); subclass of is transitive, and therefore the "therefore" was correct. 77.179.79.12 13:18, 12 August 2017 (UTC)
A unique identifier has values; 10048 (Q4546087) has no values/instances and is not an identifier at all, but a ZIP code
Undo your changes to Q136208! d1g (talk) 13:29, 12 August 2017 (UTC)
d1g, of course ZIP codes have instances. Have you never seen Soviet/Russian postal codes printed on letters? 77.179.79.12 14:03, 12 August 2017 (UTC)
Apparently, we had such oversight for a long time. d1g (talk) 13:31, 12 August 2017 (UTC)

@Mbch331: US ZIP codes are now human readable data and no longer strings, or UIDs, thanks to D1gggg's edit. 77.179.79.12 14:12, 12 August 2017 (UTC)

Eircode

Eircode (Q6070781) is now human readable data and no longer a string, or UID, thanks to User:D1gggg's [2] 77.179.79.12 14:19, 12 August 2017 (UTC)

Youtube channel ID

@ArthurPSmith: some thoughts on YouTube channel ID (Q35907496):
Instances of Q35907496 are unlikely to be modeled as separate items.
It must be P31 "identifier" (YouTube has only one version of channel identifiers)
It could be P279 "sequence of letters"
Changes by @77.180.179.245: don't capture this. d1g (talk) 01:44, 12 August 2017 (UTC)
Each single channel id should be instance of (P31) YouTube channel ID (Q35907496), even though we will never have an item about a specific channel id (we might have an item about a channel, but not about the id). So YouTube channel ID (Q35907496) subclass of (P279) identifier. And subclass of (P279) sequence of letters is totally wrong as it's not just letters, also numbers and some special characters. Mbch331 (talk) 06:03, 12 August 2017 (UTC)
Individual identifiers aren't modeled using items.
Why would one need to create an item "UCcOkA2Xmk1valTOWSyKyp4g"? What should it state?
"is totally wrong as it's not just letters, also numbers and some special characters"
I wasn't in the mood to type every trivial detail. d1g (talk) 06:27, 12 August 2017 (UTC)
Of course there won't be an item for an individual channel id, but that still doesn't make YouTube channel ID (Q35907496) instance of (P31) identifier true. It's about concepts. Just because something is the lowest level possible on Wikidata does not mean it automatically requires a P31 statement. Sometimes the lowest level only needs a P279 statement. Mbch331 (talk) 12:11, 12 August 2017 (UTC)
I don't understand your comments: "Of course there won't be an item for an individual channel id" is the opposite of "Each single channel id should be instance of (P31) YouTube channel ID (Q35907496)"
Your statements seem to have nothing to do with what I said so far... d1g (talk) 12:32, 12 August 2017 (UTC)
d1g, why is that opposite? Only because there are no items about YT channel IDs does not mean the YT channel IDs are not IDs. 77.179.79.12 13:16, 12 August 2017 (UTC)
P31 URL
P31 youtubechannelID
Who needs to make such statements? d1g (talk) 13:23, 12 August 2017 (UTC)
d1g, "Who needs to make such statements?" - what do you mean? 77.179.79.12 14:01, 12 August 2017 (UTC)
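The "not just letters" point raised earlier in this thread can be checked directly on the one channel ID quoted here. The sketch below is only an illustration: the character set in `ASSUMED_ID_PATTERN` (ASCII letters, digits, hyphen, underscore) is an assumption made for this example, not an official specification of the ID format.

```python
import re

# The example channel ID quoted in this thread.
channel_id = "UCcOkA2Xmk1valTOWSyKyp4g"

# Assumed character set for illustration only (not an official specification):
# ASCII letters, digits, hyphen and underscore.
ASSUMED_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

# True only if every character is a letter.
is_letters_only = channel_id.isalpha()

# True if the whole string fits the assumed character set.
fits_assumed_pattern = ASSUMED_ID_PATTERN.fullmatch(channel_id) is not None

# The characters that break "sequence of letters".
non_letters = sorted({ch for ch in channel_id if not ch.isalpha()})

print(is_letters_only, fits_assumed_pattern, non_letters)
```

This says nothing about whether the class should be linked with P31 or P279; it only shows that "sequence of letters" would be too narrow a superclass for this example value.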

List of issues with identifiers

Wikidata:List of classes of unique identifiers

{ {Wikidata list
|sparql=SELECT ?item WHERE {  ?item wdt:P279* wd:Q6545185 } 
|columns=item,label,P279,P31,P17,P1001,P2378,P137,P1793,P1630,description,P1687
|sort=label
|summary=itemnumber
} }

returns 251 items. Included are several specific code elements from ISO 639-3, e.g. "aze". How can IDs be separated from ID systems? Should the systems get an additional P31? 77.179.36.254 14:40, 11 August 2017 (UTC)
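As a rough illustration of why single code elements such as "aze" show up in that list, here is a small Python sketch of what the `wdt:P279*` path matches: the root itself plus everything that reaches it through zero or more subclass of (P279) links. The data is a toy example with labels instead of Q-ids, the function name `subclass_closure` is invented here, and the "identifier system" marker class is purely hypothetical; it only shows how an additional P31 on the systems could be used to filter the result.

```python
from collections import defaultdict


def subclass_closure(root, p279_edges):
    """Items matched by `?item wdt:P279* root`: the root plus all direct and indirect subclasses."""
    children = defaultdict(list)
    for child, parent in p279_edges:
        children[parent].append(child)
    found, queue = {root}, [root]
    while queue:
        parent = queue.pop()
        for child in children[parent]:
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


# Toy P279 statements as (child, parent) pairs.
p279_edges = [
    ("language identifier", "unique identifier"),
    ("ISO 639-3 code", "language identifier"),
    ("aze", "ISO 639-3 code"),  # a single code element modelled with P279
]

# Hypothetical additional P31 statements that mark ID systems as (item, class) pairs.
p31_edges = [("ISO 639-3 code", "identifier system")]

matched = subclass_closure("unique identifier", p279_edges)
systems = {item for item, cls in p31_edges if cls == "identifier system"} & matched

print(sorted(matched))
print(sorted(systems))
```

With such a marker, the systems could be listed separately from code elements, while the plain P279* query keeps returning both.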

Voting on removal of inconsistencies related to identifiers

Shall the inconsistencies in the usage of subclass of (P279) VS instance of (P31), country (P17) VS applies to territorial jurisdiction (P1001), issued by (P2378) VS operator (P137), and format as a regular expression (P1793) be removed?

ISO 639-1 code (P218) and ISO 639-3 code (P220) should be used at Arabic (Q13955)
I don't know if we should do anything about items like ISO 639-3 kodai/ara (Q12656547) d1g (talk) 02:28, 12 August 2017 (UTC)
d1g 1) the items exist because there are connected pages in Wikipedia. 2) languoids are not physical objects, they are defined by different people at different points in time differently. How many things in a language have to change to constitute a new language? ISO came around and defined some languoids, 'ara' is a macrolanguage consisting of several individual languages. Other languoid ID systems may not have something like 'ara'. Wikidata should only state what is stated in reliable sources. Maybe the item for 'ara' could be redefined as "the languoid identified by ISO 639 'ara'". So, the 'ara'-item would not be instanceOf/subclassOf UID, but subclassOf ISO 639 defined languoid. d1g, User:ArthurPSmith, what do you think? 77.179.79.12 13:06, 12 August 2017 (UTC)
Strong oppose insane suggestions to create items for "tt0120338" "tt0088846" and use P31/P279 below. d1g (talk) 15:20, 13 August 2017 (UTC)
d1g where was it suggested to create items for "tt0120338" "tt0088846"? The item identified by tt0120338 exists: "Titanic (Q44578)", and so does the item for "tt0088846": Brazil (Q25057). 85.179.160.30 16:49, 13 August 2017 (UTC)

human-readable data (Q28777989) OR machine-readable data (Q6723621) on UID items

@Innocent bystander, GerardM, Mbch331, JakobVoss, ArthurPSmith: User:D1gggg now inserts randomly(?) subclassOf human-readable data (Q28777989) OR subclassOf machine-readable data (Q6723621) on UID items [3]. Maybe the statements are correct, but wouldn't it make more sense to classify all the UIDs with something more specific like unique identifier (Q6545185) or a subclass of the latter? human-readable data OR machine-readable data is very unspecific. 77.179.55.131 14:40, 12 August 2017 (UTC)

@Ogoorcs: User:D1gggg turned your specific classification as "Internet Movie Database title ID" subclassOf "UID" into only "Internet Movie Database title ID" subclassOf "string" [4] . Yes, an IMDb title ID is a string, but a specific type of string, namely a UID. What do you think? 78.51.208.125 15:16, 12 August 2017 (UTC)

Internet Movie Database title ID (Q28777282) is rather an instance of unique identifier (Q6545185) (or some of its subclasses). Individual parts of Internet Movie Database title ID (Q28777282) are strings but the whole is an identifier system. In most cases it makes no sense to make statements about individual identifiers, at least before lexical label items are supported in Wikidata. -- JakobVoss (talk) 07:52, 13 August 2017 (UTC)
Comment We shouldn't put too much effort into explaining that "unique identifier" is a system of values.
They were given the explanation multiple times above, but they continue to drag everyone in the community along rather than taking into account what was suggested. d1g (talk) 08:10, 13 August 2017 (UTC)
JakobVoss - "Internet Movie Database title ID (Q28777282) is rather an instance of unique identifier (Q6545185)" - how that? "tt0120338" and "tt0088846" are Internet Movie Database title ID (Q28777282), there are many of these, that belong to the class IMDB id, subclass of creative work ID, subclass of ID. "DE" is a UID for a country, namely a country code. "FR", "GB" too. What is the class they belong to? ISO 3166-1 alpha-2 code, subclass of country code, subclass of UID. 85.179.160.30 13:54, 13 August 2017 (UTC)
@78.51.208.125: I think it is correct to say that ImDB title IDs are instances of "UID"s and of course of strings. I don't see conflicts in these statements. To me, best choice is "subclass of" "UID" -> qualifier: "consists of" "string".--Ogoorcs (talk) 23:45, 13 August 2017 (UTC)
@Ogoorcs: anyone with 1 day of experience with databases wouldn't define unique identifiers as strings: "a specific type of string, namely a UID" - such a statement shows all the profanity
unique identifiers aren't about strings but
  1. about uniqueness of values
  2. about identification system associated with values (ara is meaningless without coding system)
d1g (talk) 02:31, 14 August 2017 (UTC)
Any individual identifier such as "ara" is meaningless without a coding system.
Individual identifiers such as country code "DE" or IMDb "tt0120338" should not be stored as Wikidata items but used as values with the corresponding Wikidata properties. Exceptions should be treated as exceptions, instead of argument to make them the norm. -- JakobVoss (talk) 09:17, 14 August 2017 (UTC)

Identifier classification tree

213.39.164.36 22:29, 12 August 2017 (UTC)

P155 and P156 qualifier constraint

Hello. Does the community agree with these changes? [5], [6] (@Lockal:)

Are we allowed to use those properties as statements and not as qualifiers (I know there are exceptions in rare cases)? At present we mostly use them as statements, not as qualifiers.

Previous discussions:

I don't have an opinion or a suggestion. I just want to use them correctly.

Xaris333 (talk) 11:55, 11 August 2017 (UTC)

I guess nobody is interested in this subject either... Although there were discussions that decided to use them as qualifiers, a user removed the constraint based only on his own opinion, thousands of users add them as statements, and nobody cares about that. So that's Wikidata. Since in most items the properties are used as statements, we just continue with that... We have a big problem with Wikidata in general. And as the wiki grows, the mess and the problems grow too. Anarchy... Well, nothing is going to change, so I will just continue to add them as I see fit. No one takes the decisions made on Wikidata seriously enough, so everyone does whatever he wants... Xaris333 (talk) 18:10, 11 August 2017 (UTC)

sorry, I was going to say something (I wasn't even aware of the qualifier constraint, and I've used these properties before) but I wasn't sure what to add. I don't think there's reason to despair here, or even be discouraged - sometimes a single property can be useful in different contexts and I think this is a case of that. Probably we should focus on relatively narrow areas and do what makes sense to us, and then try to reach consensus where the areas overlap, more than trying to enforce something top-down? ArthurPSmith (talk) 18:17, 11 August 2017 (UTC)
There were discussions and decisions about using them as qualifiers. What's the point if anyone can change those decisions without asking the community? What's the point if we don't have the will to correct things? I know that every decision can change, but through a discussion. Now I am just seeing a property that the community decided to use as a qualifier, most users use it as a statement, a user just removed the constraint, and life goes on... And I think that this applies to many other properties... Do we have rules? And if we have, does anyone follow the rules? Xaris333 (talk) 18:30, 11 August 2017 (UTC)
well we may see quite a bit of this sort of thing going forward now that constraints actually have a UI effect (before the last month or so, any constraint violations were hidden on report pages). Most constraints make sense. But a constraint that has been widely violated probably doesn't. However, perhaps we should try to get people to discuss on the property talk page before removing a constraint like this at least... ArthurPSmith (talk) 18:49, 11 August 2017 (UTC)
Another discussion: Topic:Tv00hex7b7bwpd4t. TLDR: introducing this constraint generated 266060 violations without any algorithmic solution. --Lockal (talk) 19:30, 11 August 2017 (UTC)
Comment: The first 2 changes make sense, because
1. sometimes next/previous links are not ambiguous and
2. they refer to whole items, rather than individual statements
So usage as qualifiers isn't necessary. d1g (talk) 00:27, 12 August 2017 (UTC)
How can that be? Previous and following are relative concepts that depend on a point or item of relevance. If something precedes or follows something there always needs to be a context, it cannot happen in isolation. Philip > Anne > Andrew > Edward ... children of QE2; Philip > Andrew > Edward ... male children of QE2. Album releases of a band, etc. Years 1996 follows 1995 (previous year) and 1992 (previous leap year). Book published might follow the author's previous work; the author's previous non-fiction work; the publisher's previous work; the illustrator's previous work; and so on. There has to be context, so I am interested in your examples where you see that P155/P156 do not have a context.  — billinghurst sDrewth 05:33, 12 August 2017 (UTC)
Anything. Models of cars. d1g (talk) 05:52, 12 August 2017 (UTC)
A model of car has context, it doesn't exist in isolation, the models are of a brand. Australian Holden Commodore is the brand, and it has had multiple models since the 1980s. Holden has also built other cars and models. Also a certain model could be built in one country and then change its place of manufacture, so if you were looking to track the models built in a country or at a plant, and you put in an overarching P155/156, aren't you confusing things if you wish to track other components?  — billinghurst sDrewth 06:00, 12 August 2017 (UTC)
We don't need to use any qualifiers in most cases: Ford Model A (Q1167651) Ford Model A (Q515001) d1g (talk) 06:10, 12 August 2017 (UTC)
Looks like something that can be qualified, and doesn't run contrary to guidance.  — billinghurst sDrewth 15:11, 12 August 2017 (UTC)
@billinghurst: South Pole Telescope (Q1513315) follows (P155) Antarctic Submillimeter Telescope and Remote Observatory (Q4771004) - I'm not sure how that would be qualified? Thanks. Mike Peel (talk) 00:11, 13 August 2017 (UTC)
On what basis does it follow?  — billinghurst sDrewth 06:50, 13 August 2017 (UTC)
For the telescopes I think it makes sense to use replaces (P1365) and replaced by (P1366). One item replaces or is replaced by the other. It is not a series. Xaris333 (talk) 06:59, 13 August 2017 (UTC)
Aah, replaced by makes more sense. I didn't know we had those properties, thanks! Mike Peel (talk) 22:48, 13 August 2017 (UTC)
Anything. TV seasons. d1g (talk) 06:56, 13 August 2017 (UTC)
Anything. Ceremonies and events. d1g (talk) 06:58, 13 August 2017 (UTC)

So, are we going to use them as statements as well? Not all users agree to use them as qualifiers only. Xaris333 (talk) 09:23, 12 August 2017 (UTC)

I disagree quite strongly with the decision to remove the constraints without discussion. I think the constraints should be re-added, and only removed if a discussion on the property talk page results in consensus to remove. --Yair rand (talk) 20:57, 14 August 2017 (UTC)
I have re-added them 3 days ago. I agree with you. Xaris333 (talk) 21:43, 14 August 2017 (UTC)
In the event the qualifier constraint remains, we should determine which properties P155 and P156 should qualify. part of (P361) and series (P179) are the most obvious two that come to mind. (I don't want to bother Maarten by suggesting instance of (P31)). Mahir256 (talk) 05:56, 15 August 2017 (UTC)
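
To see how an item currently uses these properties before settling the constraint question, a check along the following lines might help. It is only a sketch: it assumes an entity JSON layout with a `claims` map whose statements may carry a `qualifiers` map, and both `count_usage` and the sample item are made up for illustration.

```python
from collections import Counter

# follows (P155) and followed by (P156)
ORDER_PROPS = ("P155", "P156")


def count_usage(entity):
    """Count how often P155/P156 appear as main statements and as qualifiers."""
    counts = Counter()
    for prop, statements in entity.get("claims", {}).items():
        if prop in ORDER_PROPS:
            counts[(prop, "statement")] += len(statements)
        for statement in statements:
            for qprop, snaks in statement.get("qualifiers", {}).items():
                if qprop in ORDER_PROPS:
                    counts[(qprop, "qualifier")] += len(snaks)
    return counts


# Hypothetical item: P156 used as a statement, P155 as a qualifier on series (P179).
sample = {
    "claims": {
        "P156": [{"mainsnak": {"property": "P156"}}],
        "P179": [{
            "mainsnak": {"property": "P179"},
            "qualifiers": {"P155": [{"property": "P155"}]},
        }],
    }
}

usage = count_usage(sample)
```

Running something like this over a sample of items would show which main properties P155/P156 actually qualify in practice, which could inform the choice between part of (P361) and series (P179).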

Single category items?

Maybe I am mistaken, though I thought that current practice was not to create single category items, like Category:Champagnac-la-Rivière (Q32377107). What is our current practice?  — billinghurst sDrewth 13:54, 11 August 2017 (UTC)

I never heard of such current practice and WD:N does not claim it either. The Query Service just told me that we have ~2.7M items about categories with only one sitelink. —MisterSynergy (talk) 14:06, 11 August 2017 (UTC)
It's just that it shouldn't be a link to Commons.
--- Jura 14:18, 11 August 2017 (UTC)
Okay, that must have been my confusion.  — billinghurst sDrewth 22:08, 11 August 2017 (UTC)

PetScan (wikidata-labels)

Hello. Can anyone explain to me how PetScan - Wikidata - Labels etc. works? Xaris333 (talk) 14:37, 11 August 2017 (UTC)

See PetScan on meta. Pamputt (talk) 16:12, 11 August 2017 (UTC)
Nothing about that section. Xaris333 (talk) 18:14, 11 August 2017 (UTC)
Hey Xaris333, your request is a bit vague. Can you please let us know a little more in detail what you want to do with petscan? —MisterSynergy (talk) 18:37, 11 August 2017 (UTC)
Nothing specific. I know how to use it, most of the options. But I haven't understood this part: "Labels etc. Note: The options below will be used as a generator, if and only if no other generator is used!... Has all of these labels..." How does it work? Maybe that option will help me in future searches, if I know what it is. Xaris333 (talk) 18:44, 11 August 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Okay, some comments about PetScan in bullet points. Maybe some of the aspects are known to you, but here they are:

  • PetScan is the successor of CatScan, which was a tool by Magnus Manske as well, designed to crawl Wikipedias according to (primarily) categorization, but also for existence of templates, links, etc. PetScan also includes this functionality, but it also got a lot of connectivity to Wikidata which no other tool provides.
  • Many users are unhappy with the amount of functionality that this tool offers, but there is in fact a need for most of it. Even experienced users use “trial-and-error” method until the outcome fits their expectations.
  • Regarding the label function you asked for: as far as I understand, it only has an effect if no other input is provided in any of the tabs. But then you can quickly look up items which have certain combinations of labels, descriptions, or aliases. Try some inputs without filling any other field and learn from the results… I doubt that there is a useful description around for that functionality.

MisterSynergy (talk) 19:11, 11 August 2017 (UTC)

Ok. Thanks! Xaris333 (talk) 19:14, 11 August 2017 (UTC)

  • Search by label is based on raw SQL database access, see source code. You can achieve the same result with Quarry, but in many cases SPARQL/PetScan is more than enough. --Lockal (talk) 21:18, 12 August 2017 (UTC)
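
For comparison, an exact-label lookup on the query service might be assembled as below. This is only an illustrative sketch and not what PetScan runs internally: `label_query` is a made-up helper, and it assumes the `rdfs:` prefix is predefined by the endpoint.

```python
def label_query(label, lang="en", limit=10):
    """Build a SPARQL query for items whose label in `lang` equals `label` exactly."""
    # Escape backslashes and double quotes so the label stays a valid string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item WHERE { "
        f'?item rdfs:label "{escaped}"@{lang} . '
        f"}} LIMIT {int(limit)}"
    )


query = label_query("elementary charge")
```

The resulting string can be pasted into the query service; unlike the PetScan option, this form only matches labels, not descriptions or aliases.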

College Football Data Warehouse ID

What do we do when a website shuts down? "College Football Data Warehouse ID" for w:College Football Data Warehouse is no longer active since February 2017. --Richard Arthur Norton (1958- ) (talk) 03:12, 12 August 2017 (UTC)

Deprecate the formatter URL (P1630) of College Football Data Warehouse ID (P3560) (already done) and keep all identifiers. The links will not be generated any longer, but we can still somewhat identify the items that used this identifier. —MisterSynergy (talk) 05:35, 12 August 2017 (UTC)
Can we reformat the descriptor to point to the Wayback Machine so we can have a useable link again? We can go to "https://web.archive.org/web/20160805191715/http://cfbdatawarehouse.com/data/coaching/alltime_coach_year_by_year.php?coachid=3401" from "http://www.cfbdatawarehouse.com/data/coaching/alltime_coach_year_by_year.php?coachid=3401" and take us from dead link to working link. --Richard Arthur Norton (1958- ) (talk) 15:16, 12 August 2017 (UTC)
I doubt that this would work, but I had the idea this morning as well. The archive URLs contain the archiving timestamp (as above: “20160805191715” for “5 Aug 2016, 19:17:15”) which is identifier-dependent and thus not stable. If an Internet Archive expert could point to a possibility to link the archived profiles without any varying parts, it might be an option to add a new (non-deprecated) formatter URL next to the old one. —MisterSynergy (talk) 16:05, 12 August 2017 (UTC)

Here are some random entries: https://web.archive.org/web/20160805191715/http://cfbdatawarehouse.com/data/coaching/alltime_coach_year_by_year.php?coachid=3435 https://web.archive.org/web/20160805191715/http://cfbdatawarehouse.com/data/coaching/alltime_coach_year_by_year.php?coachid=3115

Note that an archive.org timestamp is down to second level, as noted above, so even if the whole site was crawled in one go, it might have subtly different timestamps for different sections. One approach you could use is to link to https://web.archive.org/web/*/http://cfbdatawarehouse.com/data/coaching/alltime_coach_year_by_year.php?coachid=3435 - this doesn't go direct to the target page, but instead to a landing page with all the archive versions listed. It's not perfect, but it's probably good enough for a backup. Andrew Gray (talk) 11:06, 13 August 2017 (UTC)
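
The landing-page approach can be sketched as a simple string substitution. This assumes the usual `$1` placeholder convention for formatter URLs; the formatter value below is reconstructed from the links quoted in this thread, not copied from the property, and `wayback_listing_url` is a made-up name.

```python
# Assumed formatter for the dead site, with "$1" standing for the identifier.
FORMATTER = (
    "http://cfbdatawarehouse.com/data/coaching/"
    "alltime_coach_year_by_year.php?coachid=$1"
)

# "*" in place of a timestamp leads to the list of all archived versions.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def wayback_listing_url(identifier, formatter=FORMATTER):
    """Return the Wayback Machine landing page for one identifier."""
    return WAYBACK_PREFIX + formatter.replace("$1", identifier)


url = wayback_listing_url("3435")
```

Because the timestamp is replaced by `*`, the same pattern works for every identifier, which is what a formatter URL needs.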

Official website

Hi, I have noticed that the property "official website" is used a lot for people from the entertainment industry, for firms and companies and so on, but not at all for people from academia.

Basically, even if scientists have group webpages, we rarely add such information to our items. I found none in my random sampling.

Is it on purpose? Is it a gap to fix? Any previous discussion about the topic?

In any case I agree that the name of the institution is enough to find the website, but the website can also be used as a source per se. Even if it is not a "third party source", the academic website is usually considered reliable on many platforms as a source for basic academic information (thesis supervisor, year of graduation, main awards...).--Alexmar983 (talk) 11:07, 12 August 2017 (UTC)

official website (P856) is also suitable for people from academia; for them, an official website (P856) statement linking to their profile on their institution's webpage is appropriate.--Jklamo (talk) 12:05, 12 August 2017 (UTC)
ok.--Alexmar983 (talk) 12:49, 12 August 2017 (UTC)

Label changed to nil

At enwiki I am seeing a message from a module that the label for an entity has been changed to nil. It is the first of the following, and sure enough, it is not displaying a label. The second entity is just an example of something that works, where the label is shown.

elementary charge (Q2101) (update: this used to display "no label" instead of "elementary charge")
femtometre (Q208788)

However, the entity shows "elementary charge (Q2101)" even after I purged it. What's going on? Johnuniq (talk) 04:43, 13 August 2017 (UTC)

What's the exact error message you're getting? Mbch331 (talk) 07:06, 13 August 2017 (UTC)
I think this is still unresolved: #P1027 does not show its label, is not found as "conferred by". Matěj Suchánek (talk) 07:45, 13 August 2017 (UTC)
How irritating. I tried {{Q|Q2101}} a few times, both here and at enwiki. Here, it was displaying "no label" instead of "elementary charge", while {{Q}} at enwiki displayed nothing for the label. I see that it has corrected itself. Conceivably erratic behavior like this could be related to T170039 concerning pages that display an error in mw.wikibase.entity.lua (see Tech/News/2017/31). Johnuniq (talk) 08:07, 13 August 2017 (UTC)
Well elementary charge (Q2101) didn't resolve itself. I did a hard purge on the item (purge with forcelinkupdate). Mbch331 (talk) 08:19, 13 August 2017 (UTC)
OK, but that is weird. I often need to use forcelinkupdate to purge error tracking categories at enwiki in order to find real errors, but I have never had a case where previewing a template would show an error, unless there was a real error. In this case, previewing my opening post showed "no label", and I'm sure I would have noticed if it did not show that problem after saving. In fact, you probably saw it showing "no label", then purged the item (which I had done, although without forcelinkupdate). I guess there is a caching layer between the underlying database and clients. Johnuniq (talk) 09:39, 13 August 2017 (UTC)
A likely explanation for the fact that a standard purge did nothing while a hard purge fixed the problem is that a standard purge merely flushes the HTML cache for the rendered page, while a hard purge presumably also flushes the database caching. Johnuniq (talk) 22:13, 13 August 2017 (UTC)
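
For reference, the difference between the two kinds of purge can be sketched as API parameters. The snippet only builds the parameters for `action=purge` and sends nothing; the endpoint constant and the `purge_params` helper are illustrative assumptions, and a real call would have to be sent as a POST request.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"  # assumed target wiki


def purge_params(title, hard=False):
    """Return the API parameters for a standard or hard purge of one page."""
    params = {"action": "purge", "titles": title, "format": "json"}
    if hard:
        # A hard purge additionally asks for the links tables to be updated.
        params["forcelinkupdate"] = "1"
    return params


standard_body = urlencode(purge_params("Q2101"))
hard_body = urlencode(purge_params("Q2101", hard=True))
```

The only difference between the two request bodies is the extra `forcelinkupdate` parameter.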

Countries, places etc for sports

I will ask again because it is so confusing. In many sports competitions and seasons there is a problem with places. I am writing down all possible cases.

Previous discussions:

1) National club competitions and seasons: competitions and seasons in which teams from the whole country (one country) can take part. The competition is held in only one country, nowhere else. For example, Serie A (Q15804). Any team from Italy, and only from Italy, can be in that championship, and the competition is held only in Italy. It's not a local championship and it's not about more than 1 country. In that case we are using ONLY country (P17) --> Italy (Q38). No need for other properties like operating area (P2541).

  • Problem 1a: Special case: Teams of countries of the United Kingdom: In association football the UK has 4 associations. And we agree (?) that we will use country (P17) --> United Kingdom (Q145) with operating area (P2541) --> England (Q21) (for competitions or seasons that are held only in England). The problem is that the P2541 description says that it is only for organizations, not for events.

2) National club competitions and seasons at local level: competitions and seasons in which teams from a specific part (or parts) of a country (one country) can take part. The competition is held only in one country, in those specific parts of the country, nowhere else. For example, Eccellenza Abruzzo 2010–11 (Q5332172). Only teams from Abruzzo (Q1284) are taking part. In that case we are using country (P17) --> Italy (Q38) with operating area (P2541) --> Abruzzo (Q1284). Is that ok?

country (P17) --> United Kingdom (Q145) with operating area (P2541) --> Dorset (Q23159) (England is not shown in that case.)

or

country (P17) --> United Kingdom (Q145) with operating area (P2541) --> England (Q21) and another property for Dorset (Q23159)?


  • Problem 2b (for seasons): Same as 1b.

3) International club competitions and seasons. Competitions and seasons in which teams from more than one country can take part. The competition is held in many countries. For example, UEFA Champions League (Q18756). (Some competitions like American Hockey League (Q464995) that take place in 2 countries are not a problem. We are using country (P17) for both countries and treat them as a national championship - case 1 above).

  • Problem 3c: If we are going to list all countries with P17, what are we going to do for the UK, if only England took part? And not Scotland, Northern Ireland and Wales, for example. Should we use England at P17?
  • Problem 3d (for seasons): Same as 1b.

Comment (for seasons): For stadiums in club competitions, we can use home venue (P115) as a qualifier for each team with participating teams (P1923). Example: 2016–17 Cypriot First Division (Q23756432).

4) International national team competitions. Competitions and seasons in which national teams take part. Like UEFA Euro 2016 (Q189571).

  • Problem 4a (only for seasons): We are using country (P17) for the host country, not for all countries that took part. I think that is logical. But why not use operating area (P2541)? Or both? For example, France was the host country for Euro 2016. But the competition was about other 15 countries as well (ok, maybe that is a wrong thought; place is different from participating teams. That situation applies more to the qualification phase, see problem 4d).
  • Problem 4b (only for seasons): We are using location (P276) for the stadiums. In these kinds of competitions it is easy because only some stadiums are used.
  • Problem 4c: UEFA Euro 2020 (Q373501) is a national team competition that is not taking place in one country. Of course we can use P276 for stadiums. Are we going to use P17 for all countries that are going to host a game?
  • Problem 4f: Maybe for international competitions of national teams we can use P2541 to list all host countries using applies to part (P518) (or something else) and not the continent. For example:
< FIFA World Cup (Q19317) > operating area (P2541) < Brazil (Q155) >
applies to part (P518) < 2014 FIFA World Cup (Q79859) >
< UEFA European Football Championship (Q260858) > operating area (P2541) < Portugal (Q45) >
applies to part (P518) < UEFA Euro 2004 (Q102920) >

Xaris333 (talk) 08:16, 13 August 2017 (UTC)

Maybe we should create a new property for 1a if P1532 and P2541 aren't applicable. d1g (talk) 11:40, 13 August 2017 (UTC)
Something special for UK? Xaris333 (talk) 13:41, 13 August 2017 (UTC)
Not necessary, simply state why it was created; it may be in other sports too. d1g (talk) 14:26, 13 August 2017 (UTC)

authority_control.js gives problem for me getting VIAF records

See VIAF problem. Can anyone skilled in VIAF and JavaScript check whether they get the same problems... - Salgo60 (talk) 10:12, 13 August 2017 (UTC)

Works for me.  — billinghurst sDrewth 14:46, 13 August 2017 (UTC)
Works for me too. Mbch331 (talk) 15:51, 13 August 2017 (UTC)
Installed Firefox and it worked for me...hm... thanks for testing - Salgo60 (talk) 06:32, 14 August 2017 (UTC)
Same problem here, see my comment there. --Nono314 (talk) 21:39, 13 August 2017 (UTC)

Property for the starting point of a linear item

I admit being a bit lost, given the number of similar properties (destination point (P1444), start point (P1427), terminus location (P609), terminus (P559)). Is there a specific property (start point or whatever?) equivalent to the terminus (P559) property, in order to give both the start and end points of a linear feature? For example: a particular street "starts" (??) at its intersection with Street X and "ends" (terminus (P559)) at its intersection with Street Y.--Asqueladd (talk) 15:08, 13 August 2017 (UTC)

origin (Q40735) for vector? Does a street really have a start point? It would have two ends, but wouldn't a start point indicate a required direction of travel?  — billinghurst sDrewth 00:25, 14 August 2017 (UTC)
Hi, billinghurst. Yes, in my geographical context streets have a conventionally defined "start point" (where numeration begins) and "end point" (where numeration ends). See for example here "comienza en/termina en" (starts/ends).--Asqueladd (talk) 14:29, 14 August 2017 (UTC)

subclass of (P279) AND uses (P2283) having the same class as value

datatype (Q31385480) subclass of (P279) Internationalized Resource Identifier (Q424583) AND datatype (Q31385480) uses (P2283) Internationalized Resource Identifier (Q424583) [7] looks wrong. One more misclassification by User:D1gggg? 77.179.188.46 16:57, 13 August 2017 (UTC)

It's ok to say that something "looks wrong" but please stop blaming contributors for continuously adding wrong statements. -- JakobVoss (talk) 09:22, 14 August 2017 (UTC)

anti-war activist is not an occupation

@GerardM: has added occupation (P106) anti-war activist (Q36193099) to a whole series of articles, claiming that Wikipedia says these individuals' occupation was "anti-war activist". Aside from the fact that "anti-war activist" is not an occupation, the text he claims is in the articles appears to only be a category, and appears nowhere within the text of the articles examined. Either the information must be removed as being incorrect, or it must be corrected in some way. Is there a way to correct the information? If so, does someone have a bot that could run through all instances using the correct property for this qualifier? Certainly occupation (P106) is wrong, but GerardM has refused to agree. --EncycloPetey (talk) 20:29, 13 August 2017 (UTC)

"'anti-war activist' is not an occupation" For some value of "occupation" (occupation (Q12737077) is defined, in English, as "any activity of a person (hobby, work, pastime...)")). I think it is very easy to occupy one's time with anti-war activism. Perhaps you meant "'anti-war activist' is not a paid job". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:41, 13 August 2017 (UTC)
Did you look at the various ways occupation (P106) is defined and explained in all languages, or just seize upon a single one in English? occupation (P106) is for paid employment. --EncycloPetey (talk) 20:59, 13 August 2017 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

Yes, but since this discussion is in English, that's what I quoted. Here you go:

  • ast = cualquier actividá d'una persona (pasatiempu, trabayu, deporte, etc.)
  • bg = всяка продължителна дейност на човек (хоби, свободно време, работа, професионален спорт и др.)
  • bn = একজন ব্যক্তির কোনো কার্যকলাপ (শখ, কাজ, আহ্লাদ, পেশাদারী খেলা...)
  • ca = qualsevol activitat d'una persona (afició, treball, passatemps, esport professional etc.)
  • da = persons aktivitet, f.eks. hobby, arbejde, sport, ...
  • de = jede dauerhafte Aktivität eines Menschen (Hobby, Freizeit, Beruf etc.)
  • es = cualquier actividad de una persona (hobby, trabajo, deporte profesional, etc.)
  • et = igasugune inimese tegevus, nt hobi, töö, meelelahutus jne.
  • eu = pertsona baten edozein jarduera (zaletasuna, lana, denbora-pasa, lanbidezko kirola...)
  • fr = toute occupation d'une personne: passe-temps, travail, sport professionnel...

and that's just the first screen in the Labelister gadget, other than English. I also note from the item's early history that an attempt to merge it with job (Q192581) was reverted. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:31, 14 August 2017 (UTC)

+1. "Occupation" =/= "paid work" by our definition. - PKM (talk) 20:46, 13 August 2017 (UTC)
  • Comment: In English the generally acknowledged modern interpretation of "occupation" is paid employment, or at least a solid part of a career. It would be useful to be able to have better granularity. We have many British clergymen, and gentry and peers of various realms, who did the highest quality amateur research outside of their financial income. Similarly, daughters and wives of politicians, peers, etc. who did much social work funded by family or inheritance. Do we wish to differentiate between how people supported themselves in the world and their claim to renown?  — billinghurst sDrewth 00:15, 14 August 2017 (UTC)
I recently made edits to modern paid workers; all modern professions should be subclasses of employee (Q703534).
E.g. barber surgeon (Q781850) is not linked to Q703534 d1g (talk) 02:00, 14 August 2017 (UTC)
@billinghurst: That would be a vocation (Q829183) or avocation (Q1267055), depending on how we choose to define them. I do not know whether we have a property tied to either of these. --EncycloPetey (talk) 02:10, 14 August 2017 (UTC)
We have approved by (P790) for example d1g (talk) 02:13, 14 August 2017 (UTC)
Employee as a superclass for modern occupations? Really so everybody who has such an occupation is not self employed.. REALLY? Thanks, GerardM (talk) 03:59, 14 August 2017 (UTC)
  1. I don't think we should stress about "real one-man company" or "contractor" nuances.
  2. Self-employed don't get money from the air.
  3. It must be legal form (P1454), not primitive instance of (P31) subclass of (P279) d1g (talk) 04:18, 14 August 2017 (UTC)
This we "should not" is exactly why the whole class system is a quagmire. It cannot be explained it is absolutely helpful and unhelpful at the same time. GerardM (talk) 04:24, 14 August 2017 (UTC)
4 classes should be enough to cover all profit-related activities, shadow economy and illegal activities, if there is interest in having them together.
service worker (Q33394442), farm worker (Q33394254), manual worker (Q33394058), white-collar worker (Q368758)
Maybe we should include Q781850, but then we should agree to state "end date"=unknown to a profession or similar. d1g (talk) 05:09, 14 August 2017 (UTC)
@GerardM: pay attention to d1g (talk) 05:20, 14 August 2017 (UTC)
and that tells me what? You did not make your point. Thanks, GerardM (talk) 06:56, 14 August 2017 (UTC)
@GerardM: we should use affiliation (P1416) (and P1454) to identify whether they are an organization or a member of an organization, and in other ways
You said "so everybody who has such an occupation is not self employed" but we shouldn't make such conclusions based on P31/P279.
P31 should answer very fundamental questions, not what you raise d1g (talk) 08:05, 14 August 2017 (UTC)

We could use movement (P135) (though anti-war is more a political movement than philosophical; should we correct this property definition to explicitly allow political movements?). For example, I added it to Heather Heyer (Q36338039) with value anti-racism (Q582965). Anyway, if one person has been an anti-war activist, I think we should add "activist"/"political activist" as occupation. Emijrp (talk) 08:47, 14 August 2017 (UTC)

There are plenty of occupations that are no occupation at all. Poet for instance.. There are plenty of sportspeople who are known for their sport but do not make any money out of it. Affiliation is another non starter imho because poets have no affiliation either. Occupation is used as an indicator for the activities someone is known for, what occupies their attention not really their profession. Thanks, GerardM (talk) 09:14, 14 August 2017 (UTC)
Fully agree with GerardM, and that was my point with reference to renown. How we mention and remember people is not necessarily their occupation, eg. Octavia Hill (Q437462) or John Monash (Q2731333) or Francis Ledwidge (Q1387970). How do we intend to capture that?  — billinghurst sDrewth 12:38, 14 August 2017 (UTC)
I say again, that what we may need is a property for avocation (Q1267055), which is a "calling" that is not usually a paid position, such as missionary, sportsperson, artist, activist, etc. It is true that Olympic athletes and state poets are not usually given a salary for their position, but the position is still supported financially by the state. In a sense, they are paid, just not salaried. --EncycloPetey (talk) 13:17, 14 August 2017 (UTC)
@GerardM, billinghurst: The problem with your approach is that as of now occupation (P106) is not defined as an activity someone is known for, but as any activity of the person, which is far too broad and can include waking up in the morning, brushing one's teeth, putting on shoes, using swear-words or paying income taxes. The only restriction as of now is that there has to be a corresponding item for the activity (which can necessarily be created) and that the statement should provide the source of the information (which meanwhile nobody takes seriously). Such a broad definition renders the data virtually useless. The fact that Wikidata editors (mostly) act reasonably and don't add statements in the whole range allowed by the definition doesn't solve the problem; it just covers it temporarily, so that it surfaces later after growing to an unmanageable amount. And meanwhile it brings other problems, like one editor restricting himself arbitrarily to "activities one is known for", another one to "activities one makes his living with", and yet another one to "long-term or regularly repeated activities", "activities of one's own accord" and so on, who will engage in endless and hopeless discussions about whether a particular activity should be included or not.--Shlomo (talk) 09:28, 15 August 2017 (UTC)

We also have the problem that "Anti-war activist" implies the person was against all war of any kind, which is not always the case. Some anti-war activists oppose a particular war or cause, but support other wars or causes. Aristophanes, for example, very vocally opposed the Peloponnesian War in his plays, while praising the wars that had been fought against Persia. He only opposed war with neighbor Sparta, not war against foreign invaders. So @GerardM: how would we indicate that? --EncycloPetey (talk) 13:17, 14 August 2017 (UTC)

I do not indicate that, current category structures do indicate that. Thanks, GerardM (talk) 13:38, 14 August 2017 (UTC)
@EncycloPetey: This has nothing to do with the discussed problem of occupation (P106); this is just the lack of a sufficiently particularised definition of anti-war activist (Q36193099). As of now I can't see any statement, label, description or even discussion that implies that this item should be used for "anti-any-war activist" only and not for "anti-some-particular-war(s) activist". As soon as this is clarified (which should be done on the item's talk page), we can look for a solution for Aristophanes or other ones (using qualifiers, splitting items, whatever). But we'll still be facing the question of whether "anti/pro-anything-activism" is considered an occupation as understood in occupation (P106), or whether we should better use political ideology (P1142), political alignment (P1387) or anything else.--Shlomo (talk) 07:15, 15 August 2017 (UTC)

any good candidates as subject item of this property (P1629) for uses (P2283)?

d1g (talk) 07:58, 14 August 2017 (UTC)

Limit edits to English language label of properties to logged-in users?

Looking at edits by anons of the last 30 days, it seems that most edits are either vandalism, vandalism reversal or people mistaking the property label for the value to be entered. The few edits that could actually be considered appropriate didn't persist either.

I'd suggest that we try to set up an edit filter that blocks anonymous edits to English labels on properties.
--- Jura 10:48, 14 August 2017 (UTC)

We could at least try this for 2 weeks or a month. Weak support d1g (talk) 11:19, 14 August 2017 (UTC)
Oppose for a single language (support if applied generally). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:40, 14 August 2017 (UTC)
Why is that?
--- Jura 13:30, 14 August 2017 (UTC)
@Pigsonthewing: I have not seen random IPs editing any language other than English by mistake, so I cannot see why we should protect other languages. New logged in users, yes, but not IPs. - -- Innocent bystander (talk) 05:25, 16 August 2017 (UTC)
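The rule proposed in this thread can be written down as a simple predicate. What follows is only a sketch in Python, not actual edit filter syntax: the `LabelEdit` record and `should_block` function are hypothetical names, and the namespace number 120 for property pages is an assumption that should be checked against the wiki's configuration.

```python
from dataclasses import dataclass

# Assumption: property pages live in namespace 120 on Wikidata.
PROPERTY_NAMESPACE = 120


@dataclass
class LabelEdit:
    """Hypothetical, simplified description of one label edit."""
    namespace: int      # namespace of the edited page
    language: str       # language code of the label being changed
    is_anonymous: bool  # True if the editor is not logged in


def should_block(edit: LabelEdit) -> bool:
    """Return True only for anonymous edits to English labels on properties."""
    return (
        edit.is_anonymous
        and edit.namespace == PROPERTY_NAMESPACE
        and edit.language == "en"
    )
```

Dropping the `edit.language == "en"` condition would give the general variant that the oppose vote asks for.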

Merge request

I am not too familiar with Wikidata and not at all with its tools, so I hesitate to possibly do harm by messing with the tools. But could someone possibly merge Category:Hamilton, New York (Q20088954) and Hamilton (Q3460721)? They appear to be identical. Thank you so much! --Stilfehler (talk) 11:52, 14 August 2017 (UTC)

We don't merge wikimedia categories with their corresponding topics, they are considered different.  — billinghurst sDrewth 12:06, 14 August 2017 (UTC)
Thank you, I wasn't aware that there was a difference. --Stilfehler (talk) 12:25, 14 August 2017 (UTC)
@Stilfehler: I have added a topic's main category (P910) to the latter item, using the former item as a value. This is how we link Wikimedia categories to their subjects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:39, 14 August 2017 (UTC)
I created a page [8] and am failing to connect this page to [9]. If both shared a Wikidata ID, the link to Commons would appear automatically - this is my primary goal. --Stilfehler (talk) 13:00, 14 August 2017 (UTC)
Use Commons category (P373). Matěj Suchánek (talk) 15:32, 15 August 2017 (UTC)
P373 doesn't create interwiki links on Commons, it's better to use the 'Other sites' sitelink to Commons instead. Thanks. Mike Peel (talk) 17:48, 15 August 2017 (UTC)
But that won't work in this case Mike Peel as the link is used in the category, so need to use an active means to pull link to the CommonsCat using WikiBase.  — billinghurst sDrewth 06:01, 16 August 2017 (UTC)
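For readers who want to check the link between a topic item and its category item programmatically, here is a minimal Python sketch. It reads topic's main category (P910) from a dictionary shaped like Wikidata's entity JSON. The `main_category` helper is a hypothetical name, and the sample data is hand-written from the statement described in this thread, not fetched from the live site.

```python
from typing import Optional


def main_category(entity: dict) -> Optional[str]:
    """Return the item ID of the first P910 statement with a value, or None."""
    for statement in entity.get("claims", {}).get("P910", []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]["id"]
    return None


# Hand-written sample in the shape of Wikidata entity JSON (illustrative only).
hamilton = {
    "id": "Q3460721",
    "claims": {
        "P910": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P910",
                    "datavalue": {
                        "value": {"entity-type": "item", "id": "Q20088954"},
                        "type": "wikibase-entityid",
                    },
                }
            }
        ]
    },
}

print(main_category(hamilton))
```

The topic item and the category item stay separate; only the statement connects them.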

Label conflict - how to solve?

I can't add a Czech label to Template:t (Q30769953) because of a conflict with no label (Q10809692). However, these are totally different templates. What is the procedure in such cases? (Feel free to fix it directly if possible.)

Danny B. 12:47, 14 August 2017 (UTC)

Specification of one of the descriptions seems to be a workaround. --Sintakso (talk) 13:31, 14 August 2017 (UTC)
  • ✓ Done
    --- Jura 13:32, 14 August 2017 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 14:15, 16 August 2017 (UTC)

Sources

Hello, I was thinking of centralizing the sources on various wikis so that they can be accessed by various bots and be rated on various parameters like reliability, quality, etc. First logical step in this direction I could think of is the role of wikidata in same. Is there any existing feature on wikidata through which we can list sources? It would be a good option if we could use source on wikipedia with just unique identifier and page no, and fetch all other details of source through wikidata. Would like to know views of others on this proposal. Capankajsmilyo (talk) 13:44, 14 August 2017 (UTC)

That sounds like a simplification! I often use Q25711315 as source for villages of Sweden. It is a good source, maybe the only reliable source in the subject, but not for the names of the villages. So, is it reliable? YNeos! -- Innocent bystander (talk) 14:38, 14 August 2017 (UTC)
Reliability is a subjective term and I am not denying that. But this subjectivity can be defined for certain cases, through which bots can get basis for assessing the sources. But that comes at a later stage. As I have already mentioned, first step would be to give definition to sources itself. They are just plain texts with some being bound in templates like Citation on ENWIKI, etc. If they are sourced from a central location like wikidata or wikisource, its utility can be defined by editors in time. For example, this could also help in getting the statistics of usage of a particular source or help in preparing a blacklist of those which are certain cases of NON-RS. Capankajsmilyo (talk) 15:16, 14 August 2017 (UTC)
@Capankajsmilyo: are you familiar with meta:WikiCite? I think that's along the lines of what you're talking about. ArthurPSmith (talk) 15:32, 14 August 2017 (UTC)
That is similar to what I am proposing here. On going through the related Wikiproject though, it seems limited to subjects of science. To illustrate on the idea above, we have plenty of sources on enwiki:Jainism, none of which I could find on Wikidata. If I could get a tool to replace them with wikidata information, that would be great opportunity to define those sources as being related to Jainism. They can be made available to editors who want to add citations to other articles related to Jainism and to those who want to add sources on wikidata items related to Jainism. Capankajsmilyo (talk) 17:07, 14 August 2017 (UTC)
What's been implemented so far is probably mostly just in science, but the intent with Wikicite is to (in the long run) have a wikidata item representing every cited reference in all Wikipedias. How to do this is still somewhat up in the air. I think we don't do this very well even within wikidata at the moment. ArthurPSmith (talk) 18:19, 14 August 2017 (UTC)
How about importing all the Google books in Wikidata? Capankajsmilyo (talk) 18:23, 14 August 2017 (UTC)

Next step for lexicographical data: demo system

Lexicographical data on Wikidata, Lydia Pintscher, Wikimania 2017

Hello all,

During Wikimania, Lydia presented the status of the lexicographical data on Wikidata. You can find the slides here.

We're also happy to announce that there is now a demo system ready, where you can try structured lexicographical data as it will appear on Wikidata. Please note the following:

  • The system is not persistent for now: the information is not stored and will disappear if you reload the pages
  • The structure of the pages is based on the data model, but the content and the properties will be decided by the community in the future. We created a few for the demo, feel free to create others.
  • The design of the page is also expected to change, this is not the final version

Feel free to try it, give us feedback or ask questions. See also the Phabricator board. Thanks for your support! Lea Lacroix (WMDE) (talk) 14:09, 14 August 2017 (UTC)

Wikidata weekly summary #273

Wikidata IDs in Openstreetmap

Openstreetmap has had a "Wikidata" key for years, but still most items use a wikipedia key rather than the Wikidata key. This is much less useful. A Wikidata key would be so much more useful.

I recall reading somewhere that mappers were reluctant to bot-fix this, because sometimes Wikipedia articles do not exactly match the linked Wikidata item. Unless I am missing something huge, this is essentially bullshit.

Can anyone with OSM connections push for the simple move of converting Wikipedia links to Wikidata links ?

Ok sorry for the probably fruitless rant, but it's frustrating to see that things that should be so simple are going along so slowly. --Zolo (talk) 19:03, 14 August 2017 (UTC)

well, having some experience trying to convert wikipedia links from another dataset into wikidata links, the actual problem is most likely that the wikipedia link is to a page that *contains* information about the linked item, but isn't exactly about the linked item. For example an OSM entry for a particular building might point to a wikipedia link for the company that operates out of the building, not to the building itself. ArthurPSmith (talk) 19:34, 14 August 2017 (UTC)
Thanks for the input. It does not seem to apply to OpenStreetMap, or at least, it's not supposed to. Per doc [10]
You may tag secondary attributes of the feature by preceding the wikipedia key with the name of the attribute, separated by a colon (:). The value of such a key would be the same as the normal wikipedia key, but referring to the appropriate wikipedia page. For example, operator:wikipedia=en:McDonald's on a McDonald's fast food unit (but don't forget to also tag operator=McDonald's, because the former tag doesn't replace the latter). 
and
only provide links to articles which are 'about the feature'. A link from St Paul's Cathedral in London to an article about St Paul's Cathedral on Wikipedia is fine. A link from a bus depot to the company that operates it is not.
I should admit it cites the following (that 'applies to almost no case'), that more or less contradicts the above requirement:
One example where it is appropriate to provide additional explicit links to articles in secondary languages is where the subject is included in an article on a broader subject in the secondary language, for example to the English article which the particular museum in France while French wikipedia has only wikipedia:fr=Monuments et sites de Paris. In another example the structure of subjects in articles cannot be matched 1:1 with interlanguage links (or maybe there are several articles for the same object). In these circumstances use the format wikipedia:lang=page title for the secondary languages.
In any case, that seems like a secondary concern. Would seem much more productive to just upload all Wikidata ids and clean up the few problematic cases afterwards. --Zolo (talk) 20:24, 14 August 2017 (UTC)
Wouldn't it be possible to match OSM node types to Wikidata classes, so that a bot could convert Wikipedia links to Wikidata ids only in the case where the node type and the item type match? We already have OpenStreetMap tag or key (P1282) that could be used for that. Intuitively, that would already cover a decent number of cases, and would be pretty safe. But I can imagine there are cases where the type mismatch is spurious and the Wikidata id should be added anyway. − Pintoch (talk) 09:25, 15 August 2017 (UTC)
My knowledge of OSM is rather shallow, but I think wikipedia|wikidata links should rather go to shapes or ways than nodes. That's also where I have usually seen them (I don't know how to get real stats about that).
I suppose we could try a filter by class, but what I see is that OpenStreetMap tag or key (P1282) values are not subclasses of geographic entities, so that seems rather hard to use [11]. Anyway, if a Wikidata link in OSM is wrong, chances are the Wikipedia link is wrong as well and should be deleted anyway. --Zolo (talk) 10:15, 15 August 2017 (UTC)
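The type-matching idea raised in this thread could look roughly like the Python sketch below. Everything in it is hypothetical: the tag-to-class table, the class names (placeholders, not real item IDs) and the `wikidata_id_if_types_match` helper are invented for illustration. In practice the table might be derived from OpenStreetMap tag or key (P1282) statements.

```python
from typing import Iterable, Optional

# Hypothetical mapping from an OSM tag to the Wikidata class expected for it.
OSM_TAG_TO_CLASS = {
    "amenity=place_of_worship": "CLASS_PLACE_OF_WORSHIP",
    "railway=station": "CLASS_RAILWAY_STATION",
}


def wikidata_id_if_types_match(
    osm_tags: Iterable[str],
    candidate_qid: str,
    item_classes: Iterable[str],
) -> Optional[str]:
    """Return candidate_qid only if an OSM tag maps to one of the item's classes."""
    expected = {OSM_TAG_TO_CLASS[tag] for tag in osm_tags if tag in OSM_TAG_TO_CLASS}
    if expected & set(item_classes):
        return candidate_qid
    # No overlap (or no known tag): leave the object for manual review.
    return None
```

A bot built this way would skip objects whose types disagree instead of guessing, which matches the "pretty safe" subset described above while leaving the spurious mismatches for humans.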

Removal of labels by bots

Over the last few days I observed the deletion of all labels of an id by bots. For instance it was done for Q14201291 by PLbot. Q14201291 is now a redirect to the id Q11904043 which represents a disambiguation page to different geographical sites all named أبو صير (variant أبوصير but the same). Unfortunately most of the users cannot speak Arabic; that's why we are using transcribed Latin lemmas. But they differ by language, using language-dependent writings like Abusir, Abu Sir, Abū Ṣīr, Abousir, Абусир and so on. Normally these labels should have both the same meaning and spelling.

These disambiguation ids will help to find the same lemma in languages with non-Latin letters and different Latin writings. It makes no sense to have a separate id for all writing variants.

I do not know/understand why bots delete these labels solely because of slight differences in writing. --RolandUnger (talk) 11:59, 15 August 2017 (UTC)

The Wikidata way to find these locations would be https://tools.wmflabs.org/reasonator/?find=Abusir rather than visit a disambiguation page elsewhere.
--- Jura 12:14, 15 August 2017 (UTC)
@RolandUnger: If you don't want to have separate items for all writing variants, what do you suggest to do with no label (Q29390469), no label (Q28965736) and Abusir (Q11904043)? Merge is not possible since each item can only have one sitelink per project. PS. I would appreciate to get notified if my work is discussed here --Pasleim (talk) 12:22, 15 August 2017 (UTC)
Of course it is a problem if you have a wiki with identical articles (independently of standard or disambiguation pages). This is not a problem of Wikidata but a problem of quality control and maintenance at these wikis. With my knowledge of these ids or geographical sites I could merge them all. If a merging is not yet possible because of comprehensive merging of the wiki articles then the property P460 ("said to be the same as") should be set for both ids to find all variants for maintenance. In the case of Abu Sir the problem arose from bots which could not think about different lemma writing for the same thing.
The Reasonator tool is a nice one but not known to the general public and it should help to merge identical Wikidata ids. But if a bot will delete all labels then Reasonator cannot find a duplicate.
Maybe as a proposal: If we have different writings of (geographical) objects then we should have a list of aliases which is valid for all languages to save time to enter them for all languages. --RolandUnger (talk) 15:18, 15 August 2017 (UTC)

Here, I'm an idiot (Yet Another Whinge)

First time I've needed to update something cached away in Wikidata, and I find I'm an idiot! Nothing about the process, the editing, was at all obvious. How strange that an experienced editor (since 2004 Global Contribs) would have such problems? I succeeded, but it took more than a couple attempts... Why such a mismatch with expectations?


I noticed that the repository info on page en:Blink_(web_engine) was wrong. I even first searched for the maillist conversation that pointed me to the correct information for that factoid. Then got flummoxed at no link in the Infobox text. I don't remember what tidbit finally made me click on the "shadow source" for info being Wikidata.

(Hmm, possibly reading the page source, then using incantations to get to en:Template:Infobox, then realizing I needed instead en:Template:Infobox_software, and then finding the description of 'repo' there, and then seeing the mention of "attempts to acquire the repository link from Wikidata." "Beware of the Leopard" time...)

But that is problem one, hinting/reminding people that text may be sourced from somewhere mysterious.

(Hey, the blaring hooting notices hidden at top of the template page that "This template uses the Wikidata property: official website (P856)"
{*{Uses Wikidata|P856}}{{Tracks Wikidata|P856}}
would have told anyone something's afoot with 'repository' (P1324), right? Or perhaps... since that notice has a direct link to here Property:P856, the same kind of shortcut should be placed inline in the template help everyplace it mentions data 'might' come from Wikidata? I mention this because there is no obvious link in that help to Wikidata. Dumb, no?)


I don't know how I finally got to the page here Blink because, again, there is no obvious way to get from "this page may reference wikidata" to the mirroring page here to check whether the wrong information comes from here.

But that is problem two, no obvious link to the wikidata 'shadow' page here given a page at :foo: wiki.


So I scroll down to "source code repository" and wonder what to do. Click 'Edit'. Why is 'Save' already checked?!? How do I save the changed data? The popup help says "Enter a value". How? The save link is grayed out! I still don't know - I think I experimentally clicked "Add Qualifier" or "Add Reference" and it just happened.

BTW: what's a 'Qualifier'? What's a 'Reference'? I wandered around and found Wikidata:Glossary#Reference and then scrolled up to see qualifier. But my question was really, how and where do I enter my justification for the change? How do I point someone to the 'proof' the change was reasonable? I still don't know if I did the right/best thing!

But that is problem three, how does one edit this pineapple correctly without everything blowing up?

(What is the best thing to do? How does one say "I know this item value is true because of the information I found at <blah>, be it web, book, newspaper, etc." ?)


A key problem described here is, how do you help strangers to quickly fix that one fact they found wrong in another wiki. They don't *want* to be here. They shouldn't have to jump through fiery hoops. Y'all have reams and reams of help, but that is actually a hindrance for the greatest number of editors, those from somewhere else. (Sorry, the important editors for Wikidata are not the most frequent people here, but rather the infrequent or one-time visitors.) Asking people at Wikidata:Introduction#Where_to_get_started to take a tour on elephant back is really discouraging.

And isn't it strange that a wiki, a set of wikis, somehow makes it quite difficult to navigate to information? Is this perhaps a missed facet for implementation of Wikidata at the "system level"? Poor Ms. U. N. Owen has near no chance to bridge all the gaps in order to remedy the smallest data problem. This is an implementation problem hampering a good idea.

Please take the time I've spent describing my experiences as an indication of how far from 'good' that experience was. There are far too many hurdles to jump now that data 'might' come from Wikidata. Without lowering those hurdles you leave the implementation unfinished, incomplete, and far from practical for your average editor. Shenme (talk) 00:19, 16 August 2017 (UTC)

Hey, look, at en:Blink_(web_engine) in the left sidebar under 'Tools', "Wikidata item" ! How helpfully mysterious to the occasional editor. Shenme (talk)
@Shenme: First: Coming here and complaining that templates on Wikipedia are poorly designed is not going to earn you any sympathy here.
Some helpful soul over there has pointed out a gooder template, how it has listed a more complete set of interrelationships to Wikidata, and how uses of the template even include shortcuts (the pencil icons) next to each imported value in the infobox. All of these go very far towards giving the confused editor things to click on and hope to be unconfused. Shenme (talk) 22:22, 16 August 2017 (UTC)


Secondly: Your experience, as you describe it above, summarizes well what I feel about VisualEditor or Flow. It takes some time to learn, and it does not help if you think that the idea behind them was bad from the beginning. (Believe me, WYSIWYG (Q170542) is pure evil!!!) -- Innocent bystander (talk) 06:04, 16 August 2017 (UTC)
(ec) I don't have the time to deal with all your whinges, so I'll deal with just one. There is a project to allow the editing of data from the wikis more directly in WD with a user interface, it just isn't here yet. The thing about fixing a data point in WD is that it fixes the same data point in each wiki, and that becomes a whole lot quicker and easier to do once. Similarly with centrally stored data, it allows the creation of pages in multiple languages that can at least show data in a contextual language where a wiki is yet to create a specific article. Rome wasn't built in a day.  — billinghurst sDrewth 06:10, 16 August 2017 (UTC)
I very much understood the great promise of centralizing data that should be uniform across the multiple wikis. That is why I went to the trouble of correcting something, here, rather than just jamming it in locally at enwiki.
When Rome conquered, they didn't allow uncertainty about how things would be done from then on. They made sure the rules were painfully obvious. No one ever complained "but we didn't know how to pay our taxes!" It remains the case here that how to submit taxes (or data) is unclear to the average plebeian. Having to travel to Rome, learn passable Latin and obtain favorable omens just to correct your middle name on the last census... Shenme (talk) 22:22, 16 August 2017 (UTC)


@Shenme: thanks for taking the time to point out your challenges in fixing something here. As billinghurst mentioned there is development work ongoing to make it easier to edit wikidata entries directly from wikipedia. However, I think the main point of your complaint was uncertainty in how to source the change you wanted to make which will surely still be a problem for more direct wikidata editing from the language wikipedias. I know Lea Lacroix (WMDE) has been working to get documentation here improved - Lea, should we perhaps focus on a one-pager simple outline for people familiar with regular wikipedia editing, on how to fix one claim with appropriate referencing? ArthurPSmith (talk) 14:17, 16 August 2017 (UTC)
As mentioned above, there are certainly improvements possible at the originating wikis. (I love the pencil icon shortcuts in the mentioned template.) Is it possible a "first visit" here can kick off a popup or page-top link to the quick cookbook recipe "how to make an omelette"? If they've just been dropped here from a wiki page pencil link, and it all looks so different and strange, a friendly smile and quick recipe will be very enticing. Shenme (talk) 22:22, 16 August 2017 (UTC)

How to edit the spam filter?

Q36799744 needs www.bestwestern.de/hotels/Weingarten/BEST-WESTERN-Parkhotel-Weingarten as its official URL. But I cannot enter it because the spam filter won't let me. What is the point of this in Wikidata? --Anvilaquarius (talk) 10:11, 16 August 2017 (UTC)

You can't. Only admins can. It's not even on the local blacklist, but on the global blacklist. I've added the link to the local whitelist. And added the link to the item. Mbch331 (talk) 10:34, 16 August 2017 (UTC)
Thank you. --Anvilaquarius (talk) 13:13, 16 August 2017 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 14:15, 16 August 2017 (UTC)

merging doline (fr) and sinkhole (en)

Hello ! Could you merge doline (fr) Q30142256, sinkhole (en) Q188734 and Q10939257 ? Thanks.--Cquoi (talk) 11:42, 16 August 2017 (UTC)

I deleted the third one, there was just a link to a redirect. I can't see how the first and second one could be the same thing. Matěj Suchánek (talk) 14:21, 16 August 2017 (UTC)
You probably meant "Q30148256". These two items cannot be merged because there are several wikis with articles about both items. You may change individual sitelinks if you think they are incorrect. Matěj Suchánek (talk) 14:23, 16 August 2017 (UTC)
Thanks a lot Matěj. Regards --Cquoi (talk) 16:38, 16 August 2017 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 18:02, 16 August 2017 (UTC)
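The reason given in this thread for refusing the merge (several wikis have articles about both items, and an item can hold only one sitelink per project) can be checked mechanically. The Python sketch below is illustrative only: `conflicting_wikis` is a hypothetical helper and the sitelink titles are placeholders, not the real sitelinks of these items.

```python
from typing import Dict, List


def conflicting_wikis(sitelinks_a: Dict[str, str], sitelinks_b: Dict[str, str]) -> List[str]:
    """List the wikis where both items link to different pages, which blocks a merge."""
    shared = sitelinks_a.keys() & sitelinks_b.keys()
    return sorted(wiki for wiki in shared if sitelinks_a[wiki] != sitelinks_b[wiki])


# Placeholder sitelinks, invented for the example.
item_a = {"frwiki": "Page A (fr)", "dewiki": "Page A (de)"}
item_b = {"frwiki": "Page B (fr)", "dewiki": "Page B (de)", "enwiki": "Page B (en)"}

print(conflicting_wikis(item_a, item_b))
```

An empty result means no sitelink stands in the way; a non-empty one means individual sitelinks have to be moved or the items kept apart.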

perceptible object (Q337060)

Any possible difference between perceptible object (Q337060) and physical object (Q223557) ? d1g (talk) 17:33, 16 August 2017 (UTC)

Yes, one is a philosophical concept (something perceived), and the other is a physical concept (something that has measurable properties). There's lots of overlap, but we seem to have 5 distinct language wikipedias with entries for both, so they are definitely deserving of being distinguished. The German terms "Gegenstand" vs "Körper (Physik)" seem quite distinct. ArthurPSmith (talk) 17:54, 16 August 2017 (UTC)
perceptible object (Q337060) is very similar to object (Q488383) IMO d1g (talk) 18:49, 16 August 2017 (UTC)
Both are different views/concepts about entity (Q35120). --Succu (talk) 20:53, 16 August 2017 (UTC)
Comment: If this item exists because of Aristotle's works (0199326002, p 78), then I suggest using has quality (P1552) for items about senses and whatever else is meant by "perceptible".
I do exactly this for food products. E.g. a taste can only occur with specific chemical elements.
Aristotle had no information about what was where, so his explanations are almost always without examples (5 senses etc.) d1g (talk) 21:35, 16 August 2017 (UTC)

Categories for Cities?

A general question, but a specific example.

Cebu City Q1467 only had the category Q104157 "City in the Philippines". I added Q515 "City" for a top-level description of what it is. Is this the intended approach? Should I have added Q1549591 (big city) instead? Power~enwiki (talk) 19:37, 16 August 2017 (UTC)

For instance of (P31) (it's better to avoid the term category, this can get confused with Wikipedia categories), you should generally use the most specific item available - so city of the Philippines (Q104157) rather than city (Q515). The Philippines one is already a subclass of "city", so there is no need to add both - a properly constructed search will find both items marked with city (Q515) and items marked with any subclass of it.
big city (Q1549591) covers a different aspect of the item to city of the Philippines (Q104157), so it would be reasonable to include both (assuming it is indeed a sufficiently large city).
The one major exception to this is people, who should always simply be instance of (P31):human (Q5), not "woman", "doctor", "Danish person", or any other more specific group. Andrew Gray (talk) 20:29, 16 August 2017 (UTC)
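To illustrate why the most specific class is enough, here is a small Python sketch of a search that follows subclass links. The two dictionaries are a tiny hand-made graph containing only the statements mentioned in this thread, and `superclasses` and `is_instance_of` are hypothetical helper names. A real query would run against the full data rather than an in-memory table.

```python
from typing import Dict, Set

# Tiny hand-made graph with only the statements discussed above.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "Q104157": {"Q515"},   # city of the Philippines is a subclass of city
}
INSTANCE_OF: Dict[str, Set[str]] = {
    "Q1467": {"Q104157"},  # Cebu City is an instance of city of the Philippines
}


def superclasses(cls: str) -> Set[str]:
    """Return cls together with every class reachable through subclass links."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def is_instance_of(item: str, target: str) -> bool:
    """True if the item is an instance of target or of any subclass of target."""
    return any(target in superclasses(cls) for cls in INSTANCE_OF.get(item, ()))
```

With this graph, `is_instance_of("Q1467", "Q515")` holds without a direct city (Q515) statement on the item, while big city (Q1549591) is not implied and would need its own statement.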

"date of birth" (P569)

The date of birth (P569) property has recently been edited to change the Italian label from "data di nascita" to "26 maggio 1938". I can't fix it because that property is semi-protected and I am not a confirmed/autoconfirmed editor on Wikidata. Could someone fix it for me? Thanks. —RP88 (talk) 20:44, 16 August 2017 (UTC)

Fixed by Jura1. --Denny (talk) 22:33, 16 August 2017 (UTC)