Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}}, the first time you mention an item, or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata
Wikidata Telegram group
On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2020/02.

Why can't I edit interwiki links?

At https://www.wikidata.org/wiki/Q785653 the link for the English Wikipedia should be https://en.wikipedia.org/wiki/Archetype, but the page says: "Could not save due to an error. The save has failed." Also, https://en.wikipedia.org/wiki/Archetype has no link to a Russian article. --Oleh B (talk) 13:30, 22 February 2020 (UTC)

https://en.wikipedia.org/wiki/Archetype is already connected to Q131714, and https://en.wikipedia.org/wiki/Jungian_archetypes connects to Q785653; that is why the save fails. Also, there is no Russian page to connect to Archetype: there is a disambiguation page at Q346973 and one for Jungian archetype or archetype in psychology at Q785653, connecting to the Russian Wikipedia. An actual article for the generic archetype concept does not seem to exist in Russian. Pending a bigger refactoring, the current situation looks correct. Or is there a specific issue with the mapping? --Denny (talk) 21:48, 22 February 2020 (UTC)

Which country is Puerto Rico in?

.. and then, which values to use for country (P17) and located in the administrative territorial entity (P131)?

There are some lengthy comments by User:The Eloquent Peasant on various talk pages of items about places in Puerto Rico and a few other non-US states, such as United States Virgin Islands (Q11703), Guam (Q16635): Talk:Q44547, Talk:Q11703, Talk:Q16635.

The arguments for not using Q30 seem to be that:

  • It's not a state of the US,
  • it's not part of the continental US,
  • some infobox at some Wikipedia displays it in a way not liked by some users,
  • the statements were added by an IP in 2013,
  • and/or, some US politicians think it's not in the US.

I don't quite see how any of this is relevant. --- Jura 07:36, 3 February 2020 (UTC)

  • Neither do I. It's neither independent nor part of any other country: it's part of the United States. Ghouston (talk) 08:17, 3 February 2020 (UTC)
The common approach is to use US for country (P17) and nothing for located in the administrative territorial entity (P131), because it's not in the U.S. What is located in the U.S. is the 50 states. --The Eloquent Peasant (talk) 11:54, 3 February 2020 (UTC)
  • I think the "common" approach since 2013 was to use Q30 in P131. What else do you mean with "common"? --- Jura 12:02, 3 February 2020 (UTC)
However, in the case of the Arecibo Observatory, if we place United States in the country field, then the infobox incorrectly displays that Arecibo Observatory is located in the U.S., which it is not. Sorry, I should not have said common. What does Q30 mean? I'm not very familiar with Wikidata terms, but I do know what is in and not in the U.S. Thank you. --The Eloquent Peasant (talk) 12:05, 3 February 2020 (UTC)
As mentioned above, I don't think argument #3 is relevant to Wikidata. Q30 is linked further up. --- Jura 12:07, 3 February 2020 (UTC)
Regarding located in the administrative territorial entity (P131), "Located in administrative...entity": please show me a map, an official map of the U.S., that shows that Guam, P.R., the US Virgin Islands, etc. are in the U.S. Something is either in the U.S. or it is not. This is not a controversial topic. I fail to see how this is difficult to understand. --The Eloquent Peasant (talk) 12:11, 3 February 2020 (UTC)
Can you clarify what you mean with "U.S."? Argument #1 isn't a reason not to use Q30 in Wikidata. --- Jura 12:15, 3 February 2020 (UTC)
Argument #1 is a reason not to use Q30 in located in the administrative territorial entity (P131). If you are in your house, you are in your house. If your left foot is in your house and your right foot is outside your house, you are both in and out. However, these territories are not in the U.S., so P131 should not be populated with Q30. The Eloquent Peasant (talk) 12:20, 3 February 2020 (UTC)
Regarding your comment "I think the "common" approach since 2013 was to use Q30 in P131": this was done by someone who used an algorithm without giving much thought to what he was doing with it, and many editors called him out in 2013 for introducing errors into Wikidata. The U.S. territories slipped through the cracks at that time. That it has been there since 2013 doesn't make it right. Many errors exist in Wikidata and Wikipedia because editors don't notice them until later. So every Wikipedia article in other languages created after that assumed these territories are in the U.S. The Eloquent Peasant (talk) 12:25, 3 February 2020 (UTC)
So your argument is that "US territories" should not use P131=Q30. --- Jura 12:31, 3 February 2020 (UTC)
Washington DC is not inside one of the 50 states of the US either. The models we use here at Wikidata can never fully describe all the fine details in all relations. We have to accept that it is sometimes a little rough and cannot fully describe the truth. I can accept both that Puerto Rico is described as located in the US and that it is not. But we cannot have one model for some parts of Puerto Rico and another model for other parts. We have many territories like this in the world. We maybe cannot even have the same model for Puerto Rico as for Greenland or New Caledonia. But inside each of these territories we have to accept a common model. 62 etc (talk) 13:01, 3 February 2020 (UTC)
  • And that shows that the term "country" is poorly defined. In my language we do not use the same word (i.e. country) to describe Wales and the UK. The country of Wales and the country of the UK are not the same kind of entity. We do not even translate a "county" in England the same way as a "county" in the US. 62 etc (talk) 17:07, 3 February 2020 (UTC)
So we should acknowledge that Wikidata has this flaw, and maybe attempt to correct it. I don't know how much coding is involved, but because the entire world looks to Wikipedia / Wikidata for accurate information, I think we should try to get a fact such as whether a place is in a country or not correct. --The Eloquent Peasant (talk) 17:42, 3 February 2020 (UTC)
What is a country? "A country can be part of a larger state" according to our very own wikipedia here, P.R. would be a country, part of (but not in) a larger state, the U.S. --The Eloquent Peasant (talk) 17:48, 3 February 2020 (UTC)
located in the administrative territorial entity (P131) is itself a bit of a strange concept, combining geographical location with administrative control. We have Puerto Rico at some level controlled by the United States (government), even if it may or may not be part of the United States geographically, depending on how we are defining "United States". Ghouston (talk) 22:23, 3 February 2020 (UTC)
@Jura1: I was confused and wondering why you made this change. Isn't this exactly what we are discussing here. Is Guam in the United States? What does in mean? What does located in the administrative territorial entity (P131) mean? --The Eloquent Peasant (talk) 20:43, 4 February 2020 (UTC)
It seems your deletion isn't supported, but, if the conclusion ends up being that it should be removed, we will do so. If you need a reference for Guam being a territory of the US, I can add one. --- Jura 20:46, 4 February 2020 (UTC)
located in the administrative territorial entity (P131) is about "administrative territorial entities", which I suppose are arbitrary areas administered by a government body. They won't necessarily have any connection to geographic entities. Puerto Rico doesn't have much geographical connection to Guam, but in international politics they are still considered possessions of the same state. Ghouston (talk) 23:07, 4 February 2020 (UTC)
@Jura1: I believe adding located in the administrative territorial entity (P131) = (U.S.) to Guam and PR. and other territories is wrong but I also see that from the beginning you don't care and will do whatever you want. I don't need a ref to say they're territories. I've never disputed that fact. The fact that they are territories is covered in other wikidata properties. So you're basically saying they're in the U.S. with located in the administrative territorial entity (P131) and that is incorrect--The Eloquent Peasant (talk) 23:57, 4 February 2020 (UTC)
Because you are defining U.S. to mean something other than all of the territory coming under the sovereignty of the U.S. government. It will also make a difference when calculating things like population and land area. en:United States says "The United States of America (USA), commonly known as the United States (U.S. or US) or America, is a country comprising 50 states, a federal district, five major self-governing territories, and various possessions." Ghouston (talk) 00:47, 5 February 2020 (UTC)
The territories are "of" but not "in" the U.S. Adding located in the administrative territorial entity (P131) = U.S. to the Wikidata items of the territories will state they are in the U.S. when they are not. The property is "Wikidata property to indicate a location". --The Eloquent Peasant (talk) 02:33, 5 February 2020 (UTC)
Personally, I don't care either way, but I think things should be consistent, i.e., how Wikipedia defines it, the population counts and area for the US on Wikipedia and Wikidata, the P17 and P131 statements, should all match. In principle, you can create two or more items on Wikidata for the US, with different definitions, but it would be incredibly confusing. We do have contiguous United States (Q578170), but this is something else. Ghouston (talk) 04:34, 5 February 2020 (UTC)
It would seem germane that anyone born in Puerto Rico is a U.S. citizen. Not sort of a U.S. citizen, with some weird document. They are exactly as much a U.S. citizen as if they had been born in Boston or Chicago. - Jmabel (talk) 05:12, 5 February 2020 (UTC)
Do not look too deeply into such words as "in", "of", "on" and "at". Their meaning seldom survives a translation. In fact, it is one of the most difficult things to manage when you are learning a new language, at least when the languages are closely related. 62 etc (talk) 07:13, 5 February 2020 (UTC)
First of all, some background reading: w:Dependent territory.
This stuff is complicated. (IIRC, I brought it up during the discussions leading up to the ongoing "Countries, subdivisions, and disputed territories" RFC, but left it out of the RFC itself because it added complications that could be left for later.) Some countries claim certain territories as their own while saying that the territory is not "part of" the country, making a distinction between all areas governed by the country and the area of the country "proper". This distinction is often applied to various legal things, and is inconsistent between countries. Similar complicated things: What are protectorates, tributary states, dominions, associated states, vassal states, puppet states, colonies, etc? There are many different levels of association an entity can have with a country in power over it. We need a broad and consistent solution for how to represent the levels of association. Current uses of country (P17) and located in the administrative territorial entity (P131) are ambiguous on this. --Yair rand (talk) 22:45, 5 February 2020 (UTC)
In this instance, we only have to solve the problem for the US territories, not come up with a general theory for every country at every point in history. There can be three answers: 1) they are part of the US (as we define it here) 2) they are not part of the US 3) they are variously part of or not part of, on different items or at different times depending on the whim of whoever edited it last. Option 3) will apply by default unless otherwise decided. Ghouston (talk) 00:50, 6 February 2020 (UTC)
There are just three involved (Puerto Rico, Guam, USVI). --- Jura 05:29, 6 February 2020 (UTC)
Also American Samoa and Northern Mariana. And some others like Guantanamo Bay, or in the past Panama Canal, in a completely uncertain status.--Ymblanter (talk) 13:09, 6 February 2020 (UTC)
Guantanamo Bay is interesting in that Cuba retains sovereignty, and should probably have country (P17), but maybe the located in the administrative territorial entity (P131) chain wouldn't lead to Cuba. At present though, it's set as part of a Cuban province: Guantanamo Bay Naval Base (Q762570). Should the population of Guantanamo Bay be counted under the US or Cuba? Presumably it's so low that it wouldn't make much difference. Ghouston (talk) 22:15, 6 February 2020 (UTC)
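To make the "P131 chain" point concrete, here is a minimal sketch in Python of what following located in the administrative territorial entity (P131) upward looks like under the two options being argued. The dictionaries are hypothetical in-memory stand-ins keyed by label, not data pulled from Wikidata, and `p131_chain` is a name made up for this illustration.

```python
from typing import Dict, List

# Hypothetical stand-ins for P131 statements (item label -> parent label).
# Option A: the territory itself carries P131 = United States (Q30).
P131_WITH_Q30: Dict[str, str] = {
    "Arecibo": "Puerto Rico",
    "Puerto Rico": "United States",
}
# Option B: the territory has no P131 statement, so the chain stops there.
P131_WITHOUT_Q30: Dict[str, str] = {
    "Arecibo": "Puerto Rico",
}


def p131_chain(item: str, parents: Dict[str, str]) -> List[str]:
    """Follow P131 upward from `item` until no parent is recorded."""
    chain = [item]
    seen = {item}
    while chain[-1] in parents:
        parent = parents[chain[-1]]
        if parent in seen:  # guard against accidental loops in the data
            break
        chain.append(parent)
        seen.add(parent)
    return chain


if __name__ == "__main__":
    print(" > ".join(p131_chain("Arecibo", P131_WITH_Q30)))
    print(" > ".join(p131_chain("Arecibo", P131_WITHOUT_Q30)))
```

Under option B a consumer that wants a sovereign state has to fall back on country (P17), which is why the two properties need to be decided together.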

The US Census does not include PR in its tally of US population.[1] The April 2010 US population was 308,745,538, not including Puerto Rico's 3,725,789 inhabitants. The US Census counts American personnel at US foreign bases (Guantanamo, etc.) toward the total US population via the tally for each individual's state of residence (they are residents of their individual states, not of the US foreign base).

@The Eloquent Peasant: what were you suggesting above with "The property is 'Wikidata property to indicate a location.'"? Were you saying that manipulating that parameter holds the answer to satisfactorily addressing the "in"-versus-"of" concern? Mercy11 (talk) 23:36, 11 February 2020 (UTC)

@Mercy11: Thanks for your comments on US population and for clarifying that. The only thing I was saying was this (see https://www.wikidata.org/wiki/Talk:Q16635): an editor, in 2013, felt that every location needs to be "in" a country (as a matter of hierarchy), but I know that is not necessarily so for P.R., Guam, the Virgin Islands, etc., because, as we all know, there are 50 states in the U.S. and they don't include those territories. I explained that the editor was mistaken when he made that change in 2013 with a bot. I explained, on the talk page, that I removed the parameter from those US territories' Wikidata items, because if we add it then addresses might also say something like "Arecibo, Puerto Rico, US", and that is incorrect per pretty much everyone's knowledge. Also, I warned that because of that change made in 2013, other-language Wikipedias created geographic location articles that stated that P.R. is in the U.S. This is the parameter that I feel should not be set to US on US territories: https://www.wikidata.org/wiki/Property:P131. P131 is a property that indicates a location. So it's clear that this property should not be added to US territories, because they are not "in" the US. Finally, to answer your specific question: I believe the definition of the parameter located in the administrative territorial entity (P131) is to indicate a location, thus again it should not be added to US territories. (Sorry to sound like a broken record.) One editor mentioned not to get too hung up on "in" or "on" or "by" or "of" (prepositions), but I think that something as simple as whether or not a place is in another place should be accurate on Wikipedia so that we don't perpetuate wrong information. I've had people say to me "Oh, Puerto Rico is not in the U.S.?" So that's all. (Obviously the relationship is complicated. It reminds me of the question "Are you married? Divorced?", where the answer is "it's complicated".)
But I think the P131 parameter is clearly talking about a location (not the relationship between the US and PR), and the location of PR is not complicated, and whether a location is in another location is not complicated. That's all. Bottom line: my argument is that "US territories" should not use P131=Q30 (and Q30 means US). --The Eloquent Peasant (talk) 23:56, 11 February 2020 (UTC)
@Jmabel: When you said "anyone born in Puerto Rico is a U.S. citizen...exactly as much a U.S. citizen as if they had been born on Boston or Chicago", how do you see citizenship having a bearing on the issue here? You seemed to be implying that citizenship can be a way to determine the "in" relationship, but I don't think I can agree with that, because w:Puerto Rican citizenship explains how people born in Puerto Rico have both "Puerto Rican citizenship" and "US citizenship." Reviewing what happened after PR became a territory "of" the US, people born in PR were only "Citizens of Puerto Rico" and did not automatically become US citizens until an Act of Congress some 20 years later (1917). There is no record that the 1917 grant of citizenship incorporated PR "in"to the United States. The lesson is that citizenship is no more a determinant of PR being "in" the US than, say, Congress passing a law granting earthquake relief moneys would be a determinant that PR is suddenly "in" the US. It seems to me that citizenship, like population, can't be used to determine whether or not PR is "in" the US, or was I missing something in your rationale? Mercy11 (talk) 02:25, 12 February 2020 (UTC)
It still seems like you can argue it either way. en:United States seems to be taking the position that Puerto Rico is part of the USA, for example: "United States (U.S. or US) or America, is a country comprising 50 states, a federal district, five major self-governing territories, and various possessions." And under geography: "the entire United States is approximately 3,800,000 square miles (9,841,955 km²), with the contiguous United States making up 2,959,064 square miles (7,663,940.6 km²) of that. Alaska, separated from the contiguous United States by Canada, is the largest state at 663,268 square miles (1,717,856.2 km²). Hawaii, occupying an archipelago in the central Pacific, southwest of North America, is 10,931 square miles (28,311 km²) in area. The populated territories of Puerto Rico, American Samoa, Guam, Northern Mariana Islands, and U.S. Virgin Islands together cover 9,185 square miles (23,789 km²)". Ghouston (talk) 03:06, 12 February 2020 (UTC)
No one here disagrees that PR is part "of" the USA, and the article you cited also states it is part "of" it. The disagreement is whether or not PR is "in" the USA. The links offered by the various editors above show the problem: sometimes articles imply or openly state that PR is part "of" the US, and sometimes articles imply or openly state (with the help of Wikidata) that PR is "in" the US (e.g., the former version of Arecibo Observatory). The confusion seems to stem from a tendency to equate the two: that being "part of the USA" implies being "in the USA" and vice versa. PR is part "of" the US while at the same time being not "in" the US. The lead paragraph of the English WP US article you cited above is consistent with this, but says nothing about the "in" part. Are these Wikidata constructs (Q30, P17, P131, etc.) supposed to facilitate the work of the WP sister projects, or are the sister projects supposed to perform workarounds to facilitate Wikidata's work? Mercy11 (talk) 04:09, 13 February 2020 (UTC)
Doesn't the "administrative territorial entity" of the USA consist of all the territories that are part "of" the USA, or administrated by the USA at some level? Wouldn't every geographical area that's part of the USA then be "in" this territorial entity? Ghouston (talk) 04:55, 13 February 2020 (UTC)
I don't think so. I took a snapshot of the "also known as" entries (multiple descriptions; please see the attached image) of the parameter in question. I also went ahead and summed the population figures in the source provided by Mercy11. The total US population in 1970, 203,211,926 (shown at the top of the source), did not include Puerto Rico's 1970 population of 2,712,033 (listed on the last line of the same source). I just added them to check. PR's population is listed but not included in the US total. --The Eloquent Peasant (talk) 11:42, 13 February 2020 (UTC)
Parameter 131 in question. I see the parameter as defining a place that is "in" another place not "of" or "belonging to"
So the Spanish national census agency is still doing the census in PR? --- Jura 12:02, 13 February 2020 (UTC)
Whoa- don't go getting salty --The Eloquent Peasant (talk) 12:37, 13 February 2020 (UTC)
@Ghouston: As I believe I read someone state above, the term country seems to be poorly defined. Like beauty, this too seems to be in the eyes of the beholder. Mercy11 (talk) 03:13, 15 February 2020 (UTC)
Sure. I don't know how this can be resolved: maybe somebody can be appointed to toss a coin, or we could have a vote? Otherwise, we will never know one way or the other. Ghouston (talk) 03:18, 15 February 2020 (UTC)
  • Another page where Wikipedia says it's in: en:Unincorporated territories of the United States: All modern inhabited territories under the control of the federal government can be considered as part of the "United States" for purposes of law as defined in specific legislation (refs). Ghouston (talk) 09:26, 15 February 2020 (UTC)
It's not a matter of a coin toss. @Jura1: Do you think the US territories are in the US? If you do, please share sources that say the US territories are in the US. Have you ever seen a map of the US? We are talking about Parameter 131, a parameter that I did not create but that was created and defined by someone on the Wikidata project. The definition, or "also known as", states "in" more than a dozen times, and this is the most ridiculous thing at this point because you're not listening. You just want to win an argument, and this was your position from the beginning, from the moment I updated the Wikidata items for US territories and added my "lengthy explanation" that you defined as "not relevant". Well, it is relevant, and it is important to get it right here. --The Eloquent Peasant (talk) 12:39, 15 February 2020 (UTC)
@Mercy11: When you say the territories are of the US, I think you mean belong to the US. I can have a dog or a kitty cat that belong to me but they are unINcorporated into my household, so they stay outside all the time. The territories belong to the US but again are not in the US, not INcorporated. Scholars from Yale talk about the issue here, using the term belong to.[2] --The Eloquent Peasant (talk) 12:00, 16 February 2020 (UTC)
The "in" in "incorporated" is the Latin prefix, not the English word. "please show me a map, an official map of the U.S. that shows that Guam, P.R. The US Virgin Islands, etc. are in the U.S.": Here is one by the NOAA. But more generally the property asks for the administrative territorial entity the item belongs to. From what I understand of w:en:Territories of the United States, Guam, Puerto Rico, etc. are part of the territory of the United States. -Ash Crow (talk) 22:14, 16 February 2020 (UTC)
Hi. The located in the administrative territorial entity (P131) in question is defined as places that are in other places (if you see the screenshot attached you can see what I mean) and the map you shared is a natural map which includes the "U.S. Caribbean region (in Spanish: El Caribe estadounidense) is a term used by the National Oceanic and Atmospheric Administration (NOAA) to refer to the waters belonging to the United States in the Caribbean Sea.[3] NOAA maps it as a natural region of the United States, located in the Caribbean Sea, made up of federal waters in and around Puerto Rico, the US Virgin Islands, Navassa Island, and the Guantánamo Bay Naval Base. Serranilla Bank, and inhabited island, and Bajo Nuevo Bank, which are currently controlled by Colombia but claimed by the United States, are sometimes included in the region by NOAA. The U.S. Caribbean region is a natural region and not a political or administrative region." These locations are not administratively located in the US. Have a great week.[4]  – The preceding unsigned comment was added by The Eloquent Peasant (talk • contribs).
Thank you. I think another field would be useful. A new field / parameter could explain something to this effect. --> The U.S. Secretary of Interior "Carries out Responsibilities for the U.S. Insular Areas" even though P.R. is not on this doi.gov site's list of public laws.[5] --The Eloquent Peasant (talk) 12:12, 17 February 2020 (UTC)
I'd also like to invite you all to read what the Wikiproject Puerto Rico team believes happens often regarding the editing of Puerto Rico articles, here in a 2014 Wikipedia Signpost.--The Eloquent Peasant (talk) 16:53, 17 February 2020 (UTC)
  1. 2010 US Census
  2. https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=6444&context=faculty_scholarship
  3. Delgado, Patricia; Stedman, Susan-Marie (2004). La región del Caribe Estadounidense: humedales y peces, una conexión vital. Silver Spring, MD: Administración Nacional de los Océanos y la Atmósfera (NOAA), Oficina de Pesquerías de NOAA, División de Conservación de Habitáculo. Via Google Books.
  4. Delgado. La región del Caribe Estadounidense: humedales y peces, una conexión vital. https://www.biodiversitylibrary.org/bibliography/62466
  5. https://www.doi.gov/oia/budget/authorities-public-law

US counties

There is now a dashboard at Wikidata:Lists/US counties/dashboard.

It still needs some tweaking to handle co-extensive cities. --- Jura 18:00, 5 February 2020 (UTC)

  • Looks like some items (and Wikipedia articles) missed the updates since creation: e.g. name and area changes that come with reorganization. --- Jura 05:48, 17 February 2020 (UTC)

Draft for why we need new ranks

I have written a draft for the case for having two new ranks of Uncertain and False. Feel free to comment on the talk page if you have input at the draft stage. ChristianKl❫ 11:27, 7 February 2020 (UTC)

It seems preferable to split them into separate statements, but "false" might not be the optimal term (compare with "erroneous" used also for deprecated). --- Jura 11:42, 7 February 2020 (UTC)
@Jura1: I'm open to terms other than "false". "Erroneous" doesn't feel to me like an improvement.  – The preceding unsigned comment was added by ChristianKl (talk • contribs).
It wasn't meant as an improvement, just a comparison for being too close. "contested" might do. --- Jura 21:58, 7 February 2020 (UTC)
"Refuted" suggesting the addition of the refutation reference? --SCIdude (talk) 14:34, 8 February 2020 (UTC)
  • Not sure about the description of the VIAF part: this is not the way these are currently handled, nor is deprecated rank currently appropriate for them. --- Jura 12:31, 7 February 2020 (UTC)
I do not like the idea of "new ranks":
  • Ranks as we have them now do deliberately not carry any semantic meaning; they are mere visibility controllers, and we can freely compose a ranking combination for all claims of a given property in an item for very different reasons. The proposed draft intends to change this completely.
  • Even the three current ranks are poorly understood by many editors and often used incorrectly.
  • The more ranks we add, the more complicated it becomes to understand which data is visible in which data retrieval scenario.
  • The more ranks we add, the more likely it becomes that multiple ranks appear applicable at the same time. Which one to choose then?
That said, I do acknowledge that it is currently difficult to annotate statements properly with editorial decisions in a structured manner. We do so by using a combination of qualifiers, references, and rank usage to deal with this shortcoming and it is a mess. This should somehow be improved, but please do not mess up ranks to try this. —MisterSynergy (talk) 12:49, 7 February 2020 (UTC)
In data retrieval scenarios today, sometimes a person will make adjustments for qualifiers and sometimes they won't; it doesn't seem clear to me. Can you point to an example where you think the proposed ranks wouldn't make it clear which rank to use?
I do agree that the current ranks are poorly understood, and that suggests to me that the current implementation of them is problematic. Ranks are essentially messed up right now. If it became more commonplace within Wikidata to make decisions about whether normal or uncertain is the more appropriate rank, this would result in users becoming more conscious of ranks. Showing ranks more prominently in the UI will also help. ChristianKl❫ 17:54, 7 February 2020 (UTC)
To paraphrase: ranks are poorly understood, so if we add a new rank they will become better understood. No. --Tagishsimon (talk) 18:38, 7 February 2020 (UTC)
Well, as ranks are visibility controllers, your proposed new ranks would be sort of sub-ranks that control claim visibility in the same way as one of the existing ranks. Rank "false" would actually be like "deprecated-false", and "uncertain" would be "normal-uncertain" (I guess). There would always be uncertainty about whether to use the sub-rank or the main rank, and very soon there would be demand for even more sub-ranks for various purposes. For that reason, I clearly prefer to keep it as "simple" as it is, i.e. continue with pure visibility controllers without any further semantics.
I think we should maybe overhaul Help:Ranking in order to better educate users about what ranks actually are; improving visibility should not be that difficult (there is at least a phabricator ticket for it, and we could probably offer a gadget within hours if we wanted to do so). --MisterSynergy (talk) 22:09, 8 February 2020 (UTC)
I think we currently also have "deprecated-uncertain", so it's not a straight subrank.
I agree that we shouldn't add ranks that don't have an effect on visibility. Both of my proposed ranks do affect visibility, and this would be a natural barrier to people proposing additional sub-ranks. ChristianKl❫ 08:06, 11 February 2020 (UTC)
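For readers following the "visibility controller" argument, the sketch below models in Python how the three existing ranks decide which statements a "best rank" (truthy) lookup returns: preferred statements if there are any, otherwise normal ones, and never deprecated ones. The `Statement` class and `best_rank` function are illustrative names invented here, not part of Wikibase.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Statement:
    value: str
    rank: str  # one of "preferred", "normal", "deprecated"


def best_rank(statements: List[Statement]) -> List[Statement]:
    """Return the statements a best-rank lookup would expose.

    Preferred statements hide normal ones; deprecated statements
    are never returned, whatever else is present.
    """
    preferred = [s for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.rank == "normal"]


if __name__ == "__main__":
    claims = [
        Statement("old value", "deprecated"),
        Statement("earlier value", "normal"),
        Statement("current value", "preferred"),
    ]
    for s in best_rank(claims):
        print(s.value, s.rank)
```

In this model a rank only controls which statements are surfaced; the reason for the rank lives elsewhere (qualifiers, references), which is the distinction being debated above.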
  • One potential use for the "false" rank is to prevent bad data from re-entering Wikidata. For example, if some source says the IMDb profile ID of X is y, but I've hand-audited that to be mistaken, I would like to be able to encode that information in Wikidata to prevent a later editor or bot from mistakenly re-adding the wrong IMDb profile link. Is that something you imagine the rank being used for? BrokenSegue (talk) 15:27, 7 February 2020 (UTC)
    • Deprecated rank with a reason such as error in source covers that use case. I'm not buying any of the new rank argument. --Tagishsimon (talk) 16:50, 7 February 2020 (UTC)
  • Opening the floodgate of large amount of uncurated data may be a concern.--GZWDer (talk) 22:52, 7 February 2020 (UTC)
  • I think the adding of uncurated data already happens today. When it comes to large-scale adding of data, the gate that we have is supposed to be the bot approval process, even if you try to circumvent it in cases like the Peerage import. ChristianKl❫ 10:05, 8 February 2020 (UTC)
  • Two thoughts:
1. I think it may make sense to distinguish between statements that we are pretty confident are wrong, and have therefore deprecated from the project, and statements (perhaps suggested by an AI, and in particular, in Commons Structured Data, suggested by machine vision) that we think might well be right, but that would benefit from a check or review. We could use deprecation for this latter case, but a specific new type of rank might be more understandable and transparent.
2. At the moment we don't have that many deprecated statements. But if we anticipate the number increasing, I think there is a fair case for reviewing the p:Pxxx prefix in the RDF dump and WDQS, and splitting out deprecated statements from it, to use a new prefix, eg pd:Pxxx.
At the moment, to exclude deprecated statements in SPARQL, you have to do something like the following:
  VALUES ?not_deprecated {wikibase:NormalRank wikibase:PreferredRank}
  ?item p:Pxxx ?stmt .
  ?stmt wikibase:rank ?not_deprecated .
or alternatively
  ?item p:Pxxx ?stmt .
  MINUS {?stmt wikibase:rank wikibase:DeprecatedRank} .
This is inefficient because it requires an extra join. More to the point, nobody ever does this, because it's a pain to include -- even people who are having to use the p:Pxxx form in their queries because they want to extract or filter by a qualifier value. So as a result, deprecated statements get included in their results, which is probably not what was intended. This is bad design. It's not helping people to get the results they are most likely to want.
The alternative, if we introduced a pd:Pxxx prefix for deprecated statements would be that
 ?item p:Pxxx ?stmt
would only return non-deprecated statements (probably the desired behaviour 99% of the time); while
 ?item p:Pxxx|pd:Pxxx ?stmt
could be used whenever deprecated statements were desired to be included -- an easy enough change to make, but requiring the query-writer to explicitly signal this intention.
Yes, this would be a breaking change for a limited number of existing queries and reports. But in my view it is one that would make sense. Jheald (talk) 14:25, 8 February 2020 (UTC)
Breaking change for queries that might already be broken .. I don't usually filter for deprecated rank when accessing qualifiers (which is kind of bad). --- Jura 22:14, 8 February 2020 (UTC)
I had sort of assumed until this point that p/ps was "normal plus preferred", not "normal plus preferred plus deprecated". Ooops. I like the idea of a specific "give me deprecated values" prefix. Andrew Gray (talk) 20:08, 10 February 2020 (UTC)
I support that change. I would guess it will fix more existing queries than it breaks. ChristianKl❫ 08:06, 11 February 2020 (UTC)
@Jheald, Andrew Gray: "nobody ever does this" (cough)
"return non-deprecated statements (probably the desired behaviour 99% of the time)": I think that number is way too high. In many cases, I believe the intention will not be “all preferred- and normal-rank statements”, but “all best-rank statements”, i. e. the same ones that you get with wdt: (and p: was used e. g. to access qualifiers, not to get all statements). And the correct way to do that is ?stmt a wikibase:BestRank. --TweetsFactsAndQueries (talk) 14:04, 18 February 2020 (UTC)
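The difference between "non-deprecated" and "best rank" discussed above can be illustrated with a small in-memory model. This is only a sketch: the statement values are invented, the function names are hypothetical, and it mimics the selection rules rather than querying WDQS.

```python
# Toy model of statement ranks. The values below are invented for illustration.
PREFERRED, NORMAL, DEPRECATED = "preferred", "normal", "deprecated"


def non_deprecated(statements):
    """Normal- and preferred-rank statements (what a deprecated-free p: would return)."""
    return [s for s in statements if s["rank"] != DEPRECATED]


def best_rank(statements):
    """Best-rank selection: preferred statements if any exist, otherwise normal ones."""
    preferred = [s for s in statements if s["rank"] == PREFERRED]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == NORMAL]


population = [
    {"value": 1000, "rank": NORMAL},
    {"value": 1200, "rank": PREFERRED},
    {"value": 9999, "rank": DEPRECATED},
]

print([s["value"] for s in non_deprecated(population)])  # [1000, 1200]
print([s["value"] for s in best_rank(population)])       # [1200]
```

The two selections only coincide when an item has no preferred-rank statement, which is why excluding deprecated statements alone does not reproduce what wdt: returns.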

Q12678612 vs Q12678613

Please help. --Kusurija (talk) 12:58, 10 February 2020 (UTC)

Vyžuona (Q12678613): tributary of Šventoji, in Lithuania --- Jura 13:01, 10 February 2020 (UTC)
Thank you very much. --Kusurija (talk) 13:07, 10 February 2020 (UTC)


It seems that there are actually 5 items with the same label:

All got expanded over the last days.

@Kusurija: --- Jura 06:01, 17 February 2020 (UTC)

Thank you. --Kusurija (talk) 08:25, 17 February 2020 (UTC)

How do you make citation needed constraint (Q54554025) work with qualifiers

This is a problem I have been running into. --Trade (talk) 19:13, 11 February 2020 (UTC)

  • Try Template:Complex constraint? But then, would there be a statement that only needs a reference when it has a qualifier? --- Jura 17:52, 12 February 2020 (UTC)
  • @Trade: In my opinion, the reference of a value can be inserted as a reference of a qualifier. I mean: a qualifier qualifies a value and a reference proves the existence and the usefulness of this value, so the qualifier is indirectly sourced, IMHO. applies to part (P518) can be part of the references to specify that a reference applies to a qualifier.
From a property perspective, I believe we can add citation needed constraint (Q54554025) to a property whose scope also includes as qualifiers (Q54828449). For example, place of marriage (P2842) is used as a qualifier in spouse (P26) and P26 must have a constraint reference. Otherwise there is no way to specifically reference a qualifier (if that was the question): a source references a value that can carry one or more qualifiers, and that seems sufficient, as described in Help:Sources. Does that help you or did I confuse you? With an example, your question will be clearer. —Eihel (talk) 19:23, 12 February 2020 (UTC)
@Eihel: I want to make 'citation needed' work with number of reviews/ratings (P7887) --Trade (talk) 20:09, 12 February 2020 (UTC)
@Trade: I went too far: in fact Q54554025 is useless for a qualifier, because the constraint violation page does not list this constraint (for a qualifier) and no alert is generated on other tools. As Jura writes, I think that a Complex constraint is the only alternative. —Eihel (talk) 22:12, 18 February 2020 (UTC)

located in the administrative territorial entity (P131) isn't really transitive...

Hi folks! Over on the located in the administrative territorial entity (P131) talk page, we're trying to figure out how to best handle the fact that administrative territorial entities at different logical levels aren't really transitive (but the conversation is sort of stuck, so I'm asking for more attention here). For example, the city of Atlanta (Q23556) exists in two counties: Fulton County (Q486633) and DeKalb County (Q486398). The problem arises when I state in P131 that an organization like Patch Works Art & History Center (Q76461608) is located in Atlanta...but then which county is it in? After a natural disaster, I need to be able to search for all organizations in a specific county. Without some fix for this, many organizations will show up as being in the wrong county. Сидик из ПТУ proposed Wikidata:Property proposal/hierarchy switch to explicitly state at the organization level which "grandparent" administrative territorial entity it belongs to, as a potential solution. However, my "inner ontologist" says that a property is either transitive or it isn't, and therefore we could say that located in the administrative territorial entity isn't transitive and we should just explicitly state all logical levels for each organization in the P131 field (returning to the example, explicitly stating that Patch Works Art & History Center is in Atlanta, Fulton County, and Georgia (U.S. State), all in the P131 field). I know that removing P131's transitive property status would be a dramatic change that could have a heavy impact, so I'm curious if there's consensus around one of these two options, or if there's a third option that we're not thinking of. Thanks for your time and attention on this! Clifflandis (talk) 14:13, 12 February 2020 (UTC)

Here we are talking about an exception to the rule; the solution to the problem is elementary and requires only the adoption of a new property. Thanks to the hierarchy of located in the administrative territorial entity (P131) declared by default, we can easily adjust and build chains from the village to the state with a minimum number of edits. If you specify not just the most accurate value but everything from parish to state, this will require significant duplication of effort, duplication of data, and the creation of less savvy algorithms with an extensive knowledge base formalized in code, which will be far from universal in contrast to the current state of affairs. In other words, if we reject the declared hierarchy and transitivity here, we will need switches at each access to the property to figure out which of the values is higher in status. Сидик из ПТУ (talk) 15:15, 12 February 2020 (UTC)
@Сидик из ПТУ: "an exception to the rule"? Hmm, you have forgotten this discussion about French intercommunalities and cantons. Ayack (talk) 15:33, 12 February 2020 (UTC)
Well, yes, for France it’s done topsy-turvy without clear reasoning. I understand that the arrondissement of France (Q194203) and canton of France (until 2015) (Q184188) existed in parallel there, but no one has told me why region of France (Q36784) is specified for commune of France (Q484170) if it is completely determined by the arrondissement of France (Q194203). Сидик из ПТУ (talk) 15:42, 12 February 2020 (UTC)
For an edge case like this, wouldn't it make more sense to create items for the portion of Atlanta (Q23556) within Fulton County (Q486633) and the portion within DeKalb County (Q486398), rather than having to change how we handle what is probably the 98+% case? - Jmabel (talk) 16:08, 12 February 2020 (UTC)
I think this is a bad idea. There is nothing better than using the Atlanta (Q23556) item in all cases (for place of birth (P19) and located in the administrative territorial entity (P131) or for 1996 Summer Olympics (Q8531) and Atlanta Thrashers (Q244039)). By just adding a switch qualifier as needed, we will not change anything at all for the 98+% of cases. Сидик из ПТУ (talk) 16:19, 12 February 2020 (UTC)
This isn't really an edge case for Georgia (Q1428). Of the 539 municipalities in Georgia, 51 (9.5 %) are in more than one county. Clifflandis (talk) 17:01, 12 February 2020 (UTC)
Thanks for bringing this issue here. I think I agree with Jura here, that we should only be using located in the administrative territorial entity (P131) when the subject is entirely within the object. For the exceptional cases like Atlanta, we should use territory overlaps (P3179) for the next-level geopolitical entities it falls within, and P131 for the smallest geopolitical entity it's within (Georgia). The problem with using P131 for multiple counties is that this practice effectively redefines P131 to be the same as P3179, rendering it non-transitive and far less useful. If 2% of P131 usage does not denote transitive geo-containment, then none of it does. Bovlb (talk) 20:39, 13 February 2020 (UTC)
  • Yes, you have correctly summarized my position (and, I hope, Jura's). It seems to me that the statement "Atlanta is in Fulton County" is simply false, so we should not say it, and queries asking "What is Atlanta in?" should not return "Fulton County". Cheers, Bovlb (talk) 17:55, 14 February 2020 (UTC)
  • Hmm. Just to be clear, I do think it is false in reality to claim that "Atlanta is in Fulton County", but now we're down to arguing what "in" means, which will have no satisfactory resolution. :)
I had a quick look around WPEN to try to find a definition of what inclusion in en:Category:Cities in Fulton County, Georgia is intended to denote, but I didn't find anything relevant to this question. It is the nature of the category system that the link between the article and the category is unspecified, so in edge cases it will end up having multiple possible meanings. Here at Wikidata, we're trying to be ontologists, so we should not tolerate such vagueness. Cheers, Bovlb (talk) 21:13, 14 February 2020 (UTC)
There are many languages in which the main label of P131 has no analogue of "in". But the logic in the example of the city is clear, and it's similar to continent (P30) for Russia (Q159). Сидик из ПТУ (talk) 21:51, 14 February 2020 (UTC)
  • In my view, we need to think about what queries will people most typically write, and how do we try to make sure they get back as much as possible of what they are looking for.
I don't like the territory overlaps (P3179) solution, because in practice people won't think to ask for it when they're writing queries.
I'm not sure I understand the "hierarchy switch" suggestion -- I don't see how this would work in queries, when people are most naturally just using wdt:P131*
For myself I think the best approach would be to use P131 with applies to part (P518) qualifiers when required eg on Atlanta, plus a P131 at whatever level can be given without qualification (eg Atlanta -> Georgia), plus 'leapfrogging' P131s when there is ambiguity (eg organisation -> Atlanta, and organisation -> Fulton County). This allows queries using P131* to return pretty much the right answers, while careful queries can return exactly the right answers, using P131* MINUS {chains that include a qualified P131} PLUS {chains where that qualified P131 is covered by a P131 from a lower entity}. Jheald (talk) 21:19, 13 February 2020 (UTC)
@Clifflandis: No, it would be Atlanta (Q23556) located in the administrative territorial entity (P131) Fulton County (Q486633) with qualifier applies to part (P518) = somevalue or applies to part (P518) = some list of districts of Atlanta that are in Fulton County (Q486633).
For Patch Works Art & History Center (Q76461608) I would suggest both Patch Works Art & History Center (Q76461608) located in the administrative territorial entity (P131) Atlanta (Q23556) and Patch Works Art & History Center (Q76461608) located in the administrative territorial entity (P131) Fulton County (Q486633), the latter probably qualified with an appropriate object has role (P3831) qualification to flag that Q486633 is not the regular next step up the hierarchy. Jheald (talk) 15:05, 14 February 2020 (UTC)
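The statements suggested above can be modelled in memory to see what a naive P131* traversal and a "careful" traversal would return. This is a sketch under assumptions: English labels stand in for QIDs, the county-to-state statements are added for completeness, and the function name is hypothetical. The careful variant simply skips any P131 statement that carries an applies to part (P518) qualifier.

```python
# Each P131 statement is (subject, object, partial). partial=True marks a
# statement carrying an "applies to part" (P518) qualifier, meaning it is true
# for only part of the subject. Labels stand in for QIDs.
P131 = [
    ("Patch Works", "Atlanta", False),
    ("Patch Works", "Fulton County", False),  # the "leapfrogging" statement
    ("Atlanta", "Fulton County", True),
    ("Atlanta", "DeKalb County", True),
    ("Atlanta", "Georgia", False),
    ("Fulton County", "Georgia", False),
    ("DeKalb County", "Georgia", False),
]


def located_in(item, statements, follow_partial):
    """Return everything reachable from item along P131 statements."""
    found = set()
    stack = [item]
    while stack:
        current = stack.pop()
        for subject, target, partial in statements:
            if subject != current or target in found:
                continue
            if partial and not follow_partial:
                continue  # skip statements that hold for only part of the subject
            found.add(target)
            stack.append(target)
    return found


naive = located_in("Patch Works", P131, follow_partial=True)
careful = located_in("Patch Works", P131, follow_partial=False)
print(sorted(naive))    # includes DeKalb County
print(sorted(careful))  # Atlanta, Fulton County, Georgia
```

In this model the naive traversal over-reports DeKalb County, while the careful one recovers Fulton County only because of the leapfrogging statement on the organisation itself.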
  • @Jheald: Thanks for spelling that out for me -- I'm following you now!
I don't think that applies to part (P518) will work as a way to try to connect cities and counties, since they're not logically related to each other, so there's no somevalue to apply. As you suggest, we could maybe cobble something together for Atlanta based around neighborhoods, but that approach won't work as well for smaller towns like Braselton (Q899020) where the city exists in four counties -- around 9.5% of municipalities in Georgia exist in more than one county, unfortunately. There's just no logical connection between city borders and county borders, at least in Georgia.
For Patch Works Art & History Center (Q76461608) located in the administrative territorial entity (P131) Fulton County (Q486633), where you suggest qualifying with object has role (P3831), would it look like this: Patch Works Art & History Center (Q76461608) located in the administrative territorial entity (P131) Fulton County (Q486633) / object has role (P3831) county (Q28575)? Does that make sense, or would it just be redundant?
The messiness between cities and counties is what has me leaning towards Сидик из ПТУ's proposed Wikidata:Property proposal/hierarchy switch as a pragmatic solution. Clifflandis (talk) 18:22, 14 February 2020 (UTC)
@Clifflandis: applies to part (P518) is not a connector, it's a warning -- it means that the statement does not apply to the whole of the subject, only a part of it. Even if we cannot precisely detail which parts of the subject item the statement applies to, nevertheless we can still use applies to part (P518) with the generic value somevalue to indicate that the statement Atlanta (Q23556) located in the administrative territorial entity (P131) Fulton County (Q486633) does not apply to the whole of Atlanta.
We need something like Patch Works Art & History Center (Q76461608) located in the administrative territorial entity (P131) Atlanta (Q23556), otherwise a query for archive centres in Atlanta will not return it. Jheald (talk) 16:12, 16 February 2020 (UTC)
@Jheald: For the statement Atlanta (Q23556) located in the administrative territorial entity (P131) Fulton County (Q486633), I tried saving applies to part (P518) as a qualifier, both with a blank value, and with "somevalue" as the value (to indicate the warning), but it won't save with either of those values. I'm not sure how to use applies to part without creating an artificial subdivision of Atlanta that doesn't exist. I'm probably misunderstanding how it should be used. Can you point me to an item that uses applies to part in the way you're describing? Thanks again for explaining! Clifflandis (talk) 13:09, 18 February 2020 (UTC)
@Clifflandis: Works for me: diff. Note that the UI represents somevalue as "unknown value", which is often quite untrue; that's a known shortcoming in the UI. Jheald (talk) 13:16, 18 February 2020 (UTC)
@Jheald: Oh, thanks! Once I changed it from somevalue to unknown (Q24238356), it saved. But for some reason I couldn't save it as "unknown value" the way that you did. Maybe it's a permission that I don't have. Oh well, either way it's handy to know that unknown (Q24238356) exists and can be used in a pinch. Thanks again! Clifflandis (talk) 15:05, 18 February 2020 (UTC)
I think we should approach this from the perspective of the "client". There are two major types of questions we are solving by located in the administrative territorial entity (P131):
  1. How to display a geochain in an infobox, like Patch Works Art & History Center (Q76461608) → Atlanta (Q23556) → Fulton County (Q486633) → Georgia (Q1428) → United States of America (Q30), in Lua. That seems to be quite trivial, assuming that Wikidata:Property proposal/hierarchy switch is implemented as P8000 and the statement Patch Works Art & History Center (Q76461608) located in the administrative territorial entity (P131) Atlanta (Q23556) has the qualifier P8000:Q486633.
  2. How to query for all instance of (P31) organization (Q43229) located in Fulton County (Q486633)? I'm not sure I have an immediate SPARQL snippet here. Maybe someone with better query-writing skills can do it? Ghuron (talk) 06:44, 15 February 2020 (UTC)
@Ghuron: Maybe something like this:
  {
    ?item wdt:P31/wdt:P279* wd:Q43229. # organisation
    ?item wdt:P131+ wd:Q486633. # located in Fulton County
  }
  MINUS
  {
    # remove the item if there is a hierarchy switch to another county
    ?county_of_georgia wdt:P31 wd:Q13410428.
    ?item wdt:P131*/p:P131/pq:P8000 ?county_of_georgia.
    FILTER (?county_of_georgia != wd:Q486633)
  }
--Dipsacus fullonum (talk) 13:33, 15 February 2020 (UTC)
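To see how the query above behaves, its logic can be mirrored on a small in-memory graph. This is a sketch only: P8000 is the placeholder number used in this thread for the proposed hierarchy switch qualifier (the property does not exist yet), "Other Org" is an invented item, and labels stand in for QIDs.

```python
# P131 statements as subject -> list of (object, switch). "switch" is the value
# of the proposed hierarchy-switch qualifier (placeholder P8000), or None.
P131 = {
    "Patch Works": [("Atlanta", "Fulton County")],
    "Other Org": [("Atlanta", "DeKalb County")],  # invented item for contrast
    "Atlanta": [("Fulton County", None), ("DeKalb County", None)],
    "Fulton County": [("Georgia", None)],
    "DeKalb County": [("Georgia", None)],
}
COUNTIES_OF_GEORGIA = {"Fulton County", "DeKalb County"}  # stands in for wdt:P31 wd:Q13410428


def ancestors(item):
    """Everything reachable via one or more P131 steps (wdt:P131+)."""
    found = set()
    stack = [item]
    while stack:
        for target, _switch in P131.get(stack.pop(), []):
            if target not in found:
                found.add(target)
                stack.append(target)
    return found


def switches(item):
    """Switch values on the item or any ancestor (wdt:P131*/p:P131/pq:P8000)."""
    values = set()
    for node in {item} | ancestors(item):
        for _target, switch in P131.get(node, []):
            if switch is not None:
                values.add(switch)
    return values


def in_county(item, county):
    """True if county is an ancestor and no switch points to another county."""
    if county not in ancestors(item):
        return False
    other_counties = (switches(item) & COUNTIES_OF_GEORGIA) - {county}
    return not other_counties


print(in_county("Patch Works", "Fulton County"))  # True
print(in_county("Patch Works", "DeKalb County"))  # False
```

An item with no switch anywhere in its chain is still returned for every county it can reach, so the approach relies on the qualifier being present wherever the chain is ambiguous.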

To summarize the discussion as I see it:

  1. Viewpoint one is that there is a hierarchy in which municipalities are considered to be lower than counties. That is the view of English Wikipedia (and probably also all other Wikipedias), which has the category Category:Cities in Fulton County, Georgia (Q15211928) but no category named w:Category:Counties in Atlanta, Georgia. For that viewpoint to be reflected in Wikidata, we will need the proposed hierarchy switch qualifier.
  2. Viewpoint two is that in Georgia municipalities and counties have equal hierarchical status, so chains of hierarchy should be like (1) Patch Works Art & History Center (Q76461608) → (2) [ Atlanta (Q23556) and Fulton County (Q486633) ] → (3) Georgia (Q1428). For that viewpoint to be reflected in Wikidata, the municipalities and counties of Georgia should point to each other with territory overlaps (P3179)

I agree that from a strictly logical point of view viewpoint 2 is correct. However, most sources, including the Wikipedias, assume viewpoint 1, so I support using viewpoint 1 as well, as I don't think it is sustainable for Wikidata to have a different model of the world from all or most other places. --Dipsacus fullonum (talk) 06:45, 16 February 2020 (UTC)

There is no such thing as "hierarchical status" where I live. The discussions here tend to be dominated by people from federal states (US/Germany/Russia) which have "hierarchical status", but many other nations do not have it. The 98+% case people talk about above is more of an exception here than a rule. If I have to add the smallest entity, I sometimes have to add "Europe" if I follow the rule by the book. 62 etc (talk) 08:47, 16 February 2020 (UTC)
I fail to see how you get from "Sweden does not have hierarchical status" to "wikidata should not capture hierarchical status where one clearly exists" Ghuron (talk) 12:56, 16 February 2020 (UTC)
I remember my talk with Yger, who said that transitivity works about OK from kommun to län to land. And when I look at the Swedish Wikipedia, I observe the following: "Hässjö distrikt är ett distrikt i Timrå kommun och Västernorrlands län" (Hässjö district is a district in Timrå Municipality and Västernorrland County), "Timrå kommun är en kommun i landskapet Medelpad i Västernorrlands län" (Timrå Municipality is a municipality in the province of Medelpad in Västernorrland County), where province of Sweden (Q193556) is not used for located in the administrative territorial entity (P131) nowadays. Moreover, I found many-to-one direct matching there in Lista över Sveriges kommuner and in similar district listings in articles and templates of municipalities. Сидик из ПТУ (talk) 14:14, 16 February 2020 (UTC)
I'm no expert on ontology, but if there's one thing I've learned from my own work trying to classify administrative regions into databases, it's that they are not hierarchical. You can convince yourself that at best they're mostly hierarchical. But if you have a scheme that assumes that they're perfectly hierarchical, a scheme that's going to break if confronted with a real-world exception, then the bad news is: it's going to break. There are a lot of exceptions out there in the real world.
But the other thing I strongly believe is that there needs to be an effective compromise. The convenience of assuming that something is "mostly" hierarchical is significant. A scheme that forces every entity to be explicitly categorized under its grandparents and great-grandparents (just because there are a relatively few entities whose direct parents happen to be ambiguous in that regard) is going to be way too much work.
So I believe our goal should be: that compromise. What's the right way of encoding situations like Atlanta's, that balances the needs of the people entering and maintaining the data, versus the needs of the people querying the database? —Scs (talk) 14:46, 16 February 2020 (UTC)
The right way is to a) get Wikidata:Property proposal/hierarchy switch approved; b) use the query by Dipsacus fullonum with this property. In this scenario everyone's needs will be satisfied, and the data entered will correspond to the main point of view stated in authoritative sources. Сидик из ПТУ (talk) 15:00, 16 February 2020 (UTC)
I basically agree with viewpoint 2, though wherever a municipality is entirely inside a county, we should express the simple hierarchy. - Jmabel (talk) 17:21, 16 February 2020 (UTC)
This discussion is (primarily) about how to represent cities that do not fall within a single county. I've seen several people above express opinions about the simpler case that, when a city falls within a single county, we should be allowed to represent that geo-containment directly. What I cannot see is anyone proposing otherwise, so I don't understand why it keeps coming up. Am I missing something? Bovlb (talk) 20:59, 17 February 2020 (UTC)
In case of ambiguity, we suggest that the new qualifier (Wikidata:Property proposal/hierarchy switch) will indicate the right choice at the next level of the hierarchy. The main idea is that all Georgia municipalities should have only counties in P131 as it corresponds to the official hierarchy of administrative territorial entities (Q4057633). Сидик из ПТУ (talk) 08:12, 18 February 2020 (UTC)
P131 works quite well as a simple transitive hierarchy in most cases, and the solution by Сидик из ПТУ will allow it to work correctly in most other cases. It is very important to mark only one administrative territorial entity on most elements in WD, as it drastically reduces the amount of work needed to create chains of ATEs. If one somehow destroys this system, it would be nearly impossible to put correct chains for all small ATEs of different times on millions of elements. Wikisaurus (talk) 16:40, 19 February 2020 (UTC)
  • In my recent explorations of counties, I came across Unorganized Borough (Q1474662). It's interesting as it's called a borough like all the others, and it even has some subdivisions that for stat purposes are considered equivalent to those other boroughs, but practically all government functions are with the state. --- Jura 20:16, 19 February 2020 (UTC)

Versionize property definitions ?

Occasionally an identifier scheme is replaced with some other: new identifers, possibly new domain name, etc. The question arises: what to do with the property?

The general approach for entities is not to repurpose existing entities.

Some users like to keep the integer (PID) they were aware of and would just want to redo the property definition, formatter and all values at Wikidata they can get hold of.

While this probably won't matter for properties that have never really been used around Wikidata, the question has a larger impact now that other WMF sites use properties and these can't even be tracked from Wikidata (read: Commons). For third party users .. well too bad for them.

If we try to define a version for each property definition, users could check if they still have the current scheme. A simpler approach could be to delete the existing property and create a new one. --- Jura 11:03, 13 February 2020 (UTC)

Sounds like a good argument to create new properties when significant changes like that happen. ArthurPSmith (talk) 15:11, 13 February 2020 (UTC)
I agree for any version change which could create a conflict between values assigned under different versions. If the bulk of the old version's values are still valid in the new one, a bot can copy them over as a one-time thing when the new property is created. Josh Baumgartner (talk) 20:55, 13 February 2020 (UTC)

ChristianKl
ArthurPSmith
d1g
JakobVoss
Jura
Jsamwrites
MisterSynergy
Salgo60
Micru
Pintoch
Harshrathod50
Wildly boy
ZI Jony
Ederporto
99of9
Danrok
Eihel
Emw
Fralambert
GZWDer
Ivan A. Krestinin
Jonathan Groß
Joshbaumgartner
Kolja21
Kristbaum
MSGJ
Mattflaschen
MichaelSchoenitzer
Nightwish62
Pablo Busatto
Paperoastro
PinkAmpersand
Srittau
Thierry Caro
Tobias1984
Vennor
Yellowcard
Ivanhercaz
DannyS712
Tinker Bell
Bodhisattwa


Notified participants of WikiProject Properties --- Jura 20:36, 13 February 2020 (UTC)

How to show the correct item if a statement is deprecated with 'applies to other...' reason?

Like in (4RS)-4-hydroxy-L-proline (Q27102938) I have two deprecated IDs (not removed, because someone probably would re-add these IDs in the future):

CAS Registry Number (P231) 30724-02-8 / reason for deprecation (P2241) applies to other chemical entity (Q51734763)
DSSTox substance ID (P3117) DTXSID60861573 / reason for deprecation (P2241) applies to other chemical entity (Q51734763)

What would be the best way to show the correct item (in which the same ID is added correctly)? Which qualifier should I use? Wostr (talk) 15:40, 13 February 2020 (UTC)

I have proposed Wikidata:Property proposal/intended subject.--GZWDer (talk) 22:41, 13 February 2020 (UTC)
What's wrong with normally adding the information to the correct item? ChristianKl❫ 14:14, 18 February 2020 (UTC)
And how do I easily find the correct item from the wrong item? By running a query every time? Wostr (talk) 09:51, 23 February 2020 (UTC)

Discussion of P180 ("Depicts") usage on Wikimedia Commons

This discussion may be of interest: c:Commons:Village pump#Misplaced invitation to "tag" images. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:50, 15 February 2020 (UTC)

This can be considered vandalism, can't it? --SCIdude (talk) 16:22, 15 February 2020 (UTC)
I'm sorry, User:SCIdude, what can be considered vandalism? Andy's linking to Commons? Adding a feature on Commons without the consensus of its participants? Trying to get that reversed? Adding bad depicts statements? Removing depicts statements? Having a discussion at c:Commons:Village pump? - Jmabel (talk) 18:46, 15 February 2020 (UTC)
Thanks. I'll pick "Adding bad depicts statements" without consensus there and here. It seems similar to running a WD bot without having read anything about WD. --SCIdude (talk) 20:28, 15 February 2020 (UTC)
Bots can do huge damage very fast, but it could be fixed. For new users—IPs on commons are rare—"fixing it" will happen after climate change reverted itself to Y2K levels, and at this point in time commons will have no mammal users anymore. –84.46.52.151 07:32, 20 February 2020 (UTC)

Archiving links

Hello,

Is it possible to run InternetArchiveBot here on Wikidata to check whether links to other websites still work? I think there are pages that are no longer available. For pages like these it would be good if they were archived. -- Hogü-456 (talk) 18:23, 15 February 2020 (UTC)

I tried to add an archive URL (P1065) mandatory qualifier constraint (Q21510856) to official website (P856), but one of the other users was strongly against it. --Trade (talk) 20:18, 15 February 2020 (UTC)
The problem with that is that some websites forbid archiving, or archiving fails for technical reasons (typically too much Javascript), so you'd end up with a lot of constraint violations or exceptions. Ghouston (talk) 01:54, 16 February 2020 (UTC)
I don't get why we need to add archive URLs to the entities, since the archive links can be programmatically generated after the fact if the link does go dead (assuming we also record a last-fetched timestamp). The real issue is making sure that we try to archive all the URLs soon after they are added to Wikidata. BrokenSegue (talk) 03:39, 16 February 2020 (UTC)
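As a rough illustration of generating an archive link after the fact, the sketch below builds a Wayback Machine address from a URL and a last-fetched timestamp. It assumes the Internet Archive's usual https://web.archive.org/web/<timestamp>/<url> pattern, where the archive serves the capture closest to the given time if one exists. The function name and the example URL are hypothetical, and nothing here checks whether a capture actually exists.

```python
from datetime import datetime, timezone

# Assumed URL pattern of the Internet Archive's Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url, fetched):
    """Build an archive link for url near the time it was last fetched.

    fetched should be a timezone-aware datetime; it is converted to UTC and
    formatted as YYYYMMDDhhmmss.
    """
    stamp = fetched.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{url}"


fetched = datetime(2020, 2, 16, 3, 39, tzinfo=timezone.utc)
print(wayback_url("https://example.org/", fetched))
# https://web.archive.org/web/20200216033900/https://example.org/
```

This only helps if the page was captured around that time, which is the point made above about archiving URLs soon after they are added.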
Constraints are mostly for human editors. If you want to run a bot, you don't need that. --- Jura 07:38, 16 February 2020 (UTC)
"I don't get why we need to add archive URLs to the entities": Because that way we can be assured that the URL has been archived after it has been added to Wikidata. Also, some values such as review scores might change with time, making an archived URL a must-have. --Trade (talk) 19:18, 16 February 2020 (UTC)
And if the archive goes offline? We need to add a backup url?
Anyways, if you think archiving is necessary, why not set up a bot to do it? --- Jura 05:19, 17 February 2020 (UTC)
I'm not against setting up a bot to do the job.--Trade (talk) 11:19, 17 February 2020 (UTC)
It would be great if someone could set up a bot for it. Maybe it is possible to use InternetArchiveBot for that task. -- Hogü-456 (talk) 17:27, 19 February 2020 (UTC)

Is gender a property that may violate privacy or likely to be challenged?

@Daniel Mietchen: removed sex or gender (P21) from Q28913663 per WD:BLP. It has been removed by other user(s), but the removal was reverted by @ديفيد عادل وهبة خليل 2:. There's a source that refers to the person as female. I don't know whether it is proper to include the gender information in Wikidata.--GZWDer (talk) 20:10, 15 February 2020 (UTC)

I have reformatted the above link to the item in question. --Daniel Mietchen (talk) 14:23, 19 February 2020 (UTC)
I've heard about religion, medical conditions and sexuality being information that violates privacy but never gender. It just seems like such extremely basic knowledge. --Trade (talk) 21:45, 15 February 2020 (UTC)
Not so basic if the only mostly binary value determines your competitors in sports, could mean "all or nothing" in heritage depending on the jurisdiction, etc. It's on the same privacy level as religion, sexual orientation, political views, address, phone number, or banking account. Not like the less critical and recently discussed weight or cup-size (boobpedia ID). –84.46.52.151 07:52, 20 February 2020 (UTC)
As in everything in life (Wikidata included), Common sense is required. Nothing is absolute. Gender may be non-controversial or obvious for the vast majority of living or historic people, but if there is reason to suspect that it is controversial, or sensitive, for some living people, then "basic knowledge" needs to take a hike, and exceptionally high sources are required. I have no knowledge of the subject in question, but in general, in some cases it may be preferable to leave the field empty, even if reliable sources can be scrounged from the depths of knowledge, if such information is not public, not widely circulated, and/or contradicts a person's stated or assumed gender. -Animalparty (talk) 23:51, 15 February 2020 (UTC)
I put it back, with a reference. Ghouston (talk) 01:17, 16 February 2020 (UTC)
This was not a reference, this was speculation in the reference section of the claim in question. I have removed it again, and a source that explicitly states a gender is the absolute minimum requirement in these situations. —MisterSynergy (talk) 23:22, 19 February 2020 (UTC)
@MisterSynergy: If the OTRS ticket shows that the value is incorrect, but doesn't give a correct value, wouldn't it be better to deprecate the statement? Perhaps make a new item for reason for deprecation (P2241) by analogy with consensus to remove (Q55193796), maybe "Reason for removal in OTRS ticket", and put the OTRS id in the Talk page (since there doesn't seem to be a property for OTRS ids; perhaps there should be, so that OTRS tickets could be used as references if they supply a correct value). Ghouston (talk) 01:31, 20 February 2020 (UTC)
The OTRS ticket is not a public source, thus we cannot use it here as a reference. Deprecation would be the way to go if there was a reference to a serious source, but we also found that the given value was incorrect. As long as there is no such reference, we do not need to keep the claim at all. —MisterSynergy (talk) 07:47, 20 February 2020 (UTC)
@Animalparty: If we know the person's stated or assumed gender we can easily use that as the gender we show. I find it hard to imagine that you can have controversy about someone's gender without having sources that you can use to have a sense about what gender might be appropriate for the person. In the worst cases you might have a sourced "unknown value" statement. ChristianKl❫ 14:30, 18 February 2020 (UTC)
@GZWDer, ديفيد عادل وهبة خليل 2, Trade, Animalparty, Ghouston, ChristianKl: I had non-public reasons to remove the information, and I shared them with OTRS. These reasons are clearly stated in WD:BLP, which is why I linked there from the edit summary. Perhaps that was not enough, and I am open for suggestions on how that process could be improved. In any case, I have pinged OTRS again, and the reasons for redacting the information have not changed, so I strongly suggest to keep it redacted, ideally in a way that would reference the OTRS ticket by something like Wikimedia OTRS ticket number (P6305), though that property was primarily intended for copyright stuff. --Daniel Mietchen (talk) 14:23, 19 February 2020 (UTC)
We do have some alternatives for people who don't want to be identified as either male or female, if that's the issue. There's a list on the sex or gender (P21) constraints. Ghouston (talk) 22:54, 19 February 2020 (UTC)
@MisterSynergy: so you don't believe that deduced from pronoun used (Q73168402) and deduced from given name (Q69652498) are actually good enough for Wikidata purposes? We need some other reference, like a statement in some other database where somebody has probably made the same assumption on our behalf? Or importing the claim from Wikipedia, where somebody else has added a gender based on a guess? My own guess is that those two "reasonings", often combined with checking their appearance in photos, are likely to be accurate in a vast majority of cases. No data is ever 100% certain. Some of the claims about education and former jobs may turn out some day to be fabrications (or perhaps never discovered), but I doubt that the percentage is very high. Ghouston (talk) 23:59, 19 February 2020 (UTC)
This is not about correctness or likelihoods that your speculation might be "correct". Instead, think of Wikidata as a secondary database that collects publicly available data. In this case, there is apparently no explicit information about the gender of the person available, which means that Wikidata does not know it either.
Of course, unsourced claims and wild deductions based on names or pronouns or even images are widely used, and this is borderline okayish for situations where nobody complains. Here, someone has complained and we should thus only include information that is explicitly mentioned in a serious source, and use the ranking tool in case the referenced information is found to be incorrect. —MisterSynergy (talk) 07:43, 20 February 2020 (UTC)
In this case is it meaningful to add sex or gender (P21)=somevalue? Especially if we meet constraint violations.--GZWDer (talk) 11:21, 20 February 2020 (UTC)
There are different opinions about how to use unknown value:
  • Some say it means that this information is generally unknown, and there are sources which explicitly state it as "unknown". According to this approach, you can only add it if you find a source that claims that the gender of the person in question is unknown; you should of course add this source in the reference section of the claim.
  • Others use it "because a thorough search for sources yielded no results, thus I speculate that the information is unknown (to the general public)". According to this approach, we could theoretically mass-add unknown value in plenty of properties to tons of items. I don't think that this would generate any benefit.
Which constraint is being violated here? Is it a item requires statement constraint (Q21503247) or value requires statement constraint (Q21510864) or similar on another property? These ones are often unfixable… —MisterSynergy (talk) 11:49, 20 February 2020 (UTC)

Preventing ping-pong protocol

We lack a mechanism to dissuade users from re-adding statements, in situations in which a subject has indicated that a property is, for them, a violation of their reasonable expectation of privacy (per WD:BLP "and which doesn't violate a person's reasonable expectations of privacy").

In the above case, a P21 value has been withdrawn 2 or 3 times. Right now if users check the item history punctiliously (they won't) or the talk page (maybe, maybe not) they may be alerted to an OTRS which may give them pause in re-adding the value. More likely, the value will be re-added, and removed again, and so on.

I think we need to do more by way of dissuasion and oversight, and I venture to suggest a mechanism: that in such cases there should be an OTRS log of the issue, and, after removal of the statement in an appropriate fashion (by edit, by oversight) a <no value> statement be added, with a {{P|6305}} qualifier. The logic here is 1) <no value> is consonant with the subject's wishes, that Wikidata hold no information for this property 2) OTRS qualifier is a strong hint in exactly the right place that there is an issue which should give users pause for thought about adding a value 3) we can use WDQS to report on occurrences of <no value> OTRS qualified statements having additional statements and 4) OTRS, or others, can maintain a list of such items against which removals of <no value> OTRS can be spotted. (It may be that we should go a step further and use e.g. sourcing circumstances (P1480) or another qualifier to hold a more clear "do not amend this property" value.) Thoughts? --Tagishsimon (talk) 08:13, 23 February 2020 (UTC)
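Point 3 of the proposal (a WDQS report) could look roughly like the sketch below. It only builds the SPARQL text, which can then be pasted into the query service. The function name and its defaults are illustrative assumptions; the query relies on the standard WDQS prefixes `p:`, `pq:` and `wdno:`, and on <no value> statements being typed as `wdno:P21` in the RDF export.

```python
def build_report_query(prop: str = "P21", ticket_qualifier: str = "P6305") -> str:
    """Build a SPARQL query listing items where a <no value> statement that
    carries the ticket qualifier coexists with another statement of the
    same property (hypothetical helper, not an existing tool)."""
    return f"""
SELECT ?item ?flagged ?other WHERE {{
  ?item p:{prop} ?flagged .
  ?flagged a wdno:{prop} ;                  # the statement is <no value>
           pq:{ticket_qualifier} ?ticket .  # and carries the ticket qualifier
  ?item p:{prop} ?other .                   # any statement of the same property
  FILTER(?other != ?flagged)                # other than the flagged one
}}
""".strip()


if __name__ == "__main__":
    print(build_report_query())
```

Any row returned by such a query would be an item where a value was added despite the flagged <no value> statement.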

  • Well, no value has a different semantic meaning. It is "there is no gender for this entity", not "there is no gender for this entity in Wikidata". The latter is expressed by the absence of claims with non-deprecated rank of the corresponding property in the item.
    Formally, using deprecation would be just right, with a descriptive qualifier that explains the rank selection if desired. The ranking mechanism is all about data visibility, and "Deprecated rank" makes the data already pretty invisible for actual data users (SPARQL or Wikibase client parser functions).
    What's unfortunate here is the fact that the Web-UI does not really make deprecated data less visible. This is not totally surprising, given that the Web-UI is basically an editor tool, not an interface for data users, and editors somehow need to see the data not to be added again in order not to add it again. (External) casual visitors, however, might be using the Web-UI to inspect the data about them, and they will likely not understand the concept and impact of chosen ranks. IMO the display of deprecated rank claims in the Web-UI should be improved, in a way that better informs actual editors as well as external visitors what's on. —MisterSynergy (talk) 08:37, 23 February 2020 (UTC)
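As a rough illustration of why deprecated data is "pretty invisible" to data users: the simple (`wdt:`) view and client parser functions only expose best-rank statements, meaning preferred ones if any exist, otherwise normal ones, and never deprecated ones. The sketch below mimics that selection on a simplified statement list; the dictionary shape and the placeholder values are assumptions made for the example, not the full Wikibase JSON.

```python
def best_rank_statements(statements):
    """Return the statements a data user would normally see:
    preferred ones if present, otherwise normal ones, never deprecated."""
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


# Hypothetical item whose only statement for a property is deprecated.
only_deprecated = [{"value": "some value", "rank": "deprecated"}]
print(best_rank_statements(only_deprecated))  # nothing is exposed
```

Under this rule an item whose only statement for a property is deprecated exposes no value at all to such consumers, while editors still see the statement in the Web-UI.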
I'm a couple of orders less concerned about the supposed semantic meaning of <no value> than I am about a mechanism by which we can satisfy the BLP policy. Deprecating a value that a subject has indicated is a privacy violation does not seem "Formally ... right" so much as 100% against that policy. --Tagishsimon (talk) 08:47, 23 February 2020 (UTC)
Set it to <no value> and then deprecate it? Ghouston (talk) 08:59, 23 February 2020 (UTC)
Is there a way to combine <no value> with a qualifier indicating deliberate suppression? - Jmabel (talk) 18:34, 23 February 2020 (UTC)

Misspelling of a moth name

I'm not sure how to fix it, so I'll post it here. Please see Talk:Q13393150. SchreiberBike (talk) 04:00, 16 February 2020 (UTC)

Wikidata_talk:WikiProject_Taxonomy might help. --- Jura 07:41, 16 February 2020 (UTC)
@Jura1: Thank you. SchreiberBike (talk) 22:47, 17 February 2020 (UTC)

Question about subclass and instance of

A few questions I couldn't find answers to in the docs. Maybe I'm being too pedantic here but I don't get how to model data here.

  • When do you use subclass v. instance of?
    • For example Q11421395 is an instance of a medal but there are lots of copies of the medal. Should it be an instance of a "kind of medal"? Or a subclass of medal?
    • Q868130 is an instance of a softdrink but really it's a soft drink brand not an actual soft drink? Or surely at least it's a subclass of softdrink? Should it be an instance of a brand and a subclass of soft drink?
    • Q12372598 is an instance of "food ingredient" but a subclass of "fruit". Should they both be sub-classes? Also, isn't marking it as a sub-class of "drupe" and "fruit" redundant? Is there value in having both?

Am I over thinking this? Or are there docs describing this I missed? Thanks. BrokenSegue (talk) 04:46, 16 February 2020 (UTC)

I struggle with this at times too and it would be good to have a dedicated help page that runs through things to help an editor decide which is more suitable. --SilentSpike (talk) 11:49, 16 February 2020 (UTC)


Notified participants of WikiProject Ontology

Please see Help:Basic membership properties. Joao4669 (talk) 12:32, 16 February 2020 (UTC)

Should caramel ice cream (Q84573720) be an instance or a subclass of ice cream (Q13233)?
Should Bubblegum Squash McFlurry (Q82751912) be an instance or a subclass of McFlurry (Q906754)? --Trade (talk) 18:47, 16 February 2020 (UTC)
Subclass in both cases as they are currently modeled, because you're not talking about a specific instance of a dessert. Now if "McFlurry" were a model (Q10929058): class of manufactured objects of similar design sold under a specific brand, not a dessert, then any flavor of McFlurry would be an instance. IMHO. - PKM (talk) 19:45, 16 February 2020 (UTC)
As to the original medal question, the first step would be to look if there is a specific project that has a guideline / schema. If not, naively I would agree that United Nations Peace Medal (Q11421395) is an instance of a medal class (there is class of award (Q38033430)), not instance of medal; and making it subclass of medal is correct but should be improved by giving the next-higher category of United Nations Peace Medal (Q11421395) if there is one. --SCIdude (talk) 08:23, 17 February 2020 (UTC)
As to plum (Q12372598) there is the problem that "plum" can have at least three meanings: 1. the type of plant (represented by the taxon item Prunus subg. Prunus (Q6401215)); 2. the fruits of (1); and 3. the food ingredient, which is not necessarily (2) because I expect there are more Prunus species/varieties whose fruits are not eaten, e.g. those varieties that are out of fashion. Now the item in question is (3) plum (Q12372598) and there are several plum varieties, so it is a class of ingredients, a subclass of fruits, and yes, since a drupe is a fruit, the subclass of fruit is redundant. --SCIdude (talk) 08:44, 17 February 2020 (UTC) PS: There is one case when you should not remove the subclass of fruit statement: if it has a better reference than the more specific statement.
It is not redundant - all drupes are fruit (Q1364), but not all are fruit (Q3314483); some drupes are culinary nuts (Q3320037), or inedible fruit (Q30312832). Peter James (talk) 12:38, 17 February 2020 (UTC)
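The redundancy question can be made concrete with a small sketch: a direct "subclass of" value is redundant only if it is already reachable through another direct parent. The two toy graphs below are assumptions for illustration (they are not the actual statements on the items); they differ only in whether plum points at the same "fruit" item that drupe does.

```python
def ancestors(item, graph):
    """All classes reachable from item by following subclass-of edges."""
    seen = set()
    stack = list(graph.get(item, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def redundant_parents(item, graph):
    """Direct parents that are already implied via another direct parent."""
    direct = graph.get(item, set())
    return {
        p for p in direct
        if any(p in ancestors(q, graph) for q in direct if q != p)
    }


# Toy graph 1: plum and drupe point at the same fruit item.
same_fruit = {
    "plum": {"drupe", "fruit (Q1364)"},
    "drupe": {"fruit (Q1364)"},
}
# Toy graph 2: plum points at a different fruit item than drupe does.
different_fruit = {
    "plum": {"drupe", "fruit (Q3314483)"},
    "drupe": {"fruit (Q1364)"},
}

print(redundant_parents("plum", same_fruit))
print(redundant_parents("plum", different_fruit))
```

In the first graph the fruit statement is implied by drupe; in the second it is not, which matches the objection that the two "fruit" items are different concepts.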

Do we have a rule against making revision deletion requests at Wikidata:Administrators' noticeboard?

@Jasper Deng: I'd like to have this clarified. I've made revision deletion requests at Wikidata:Administrators' noticeboard before but I never knew such a rule existed. --Trade (talk) 20:41, 16 February 2020 (UTC)

As we don't have an administrator mailing list this may be the only viable way. Another way is via IRC, but I don't think there's always someone online and actively monitoring messages (in October I made an Oversight request to remove a password unintentionally leaked by another user on Wikidata, and I could only find an oversighter after 80 minutes).--GZWDer (talk) 23:57, 16 February 2020 (UTC)
@GZWDer: If it's urgent then I suppose you could just ping an admin who has been active within the last hour. Also what's an oversighter? --Trade (talk) 11:14, 17 February 2020 (UTC)
@Trade: WD:OS. ‐‐1997kB (talk) 11:36, 17 February 2020 (UTC)
I remember there was a time where all admins were able to read deleted revisions. Any idea why that was changed?--Trade (talk) 11:54, 17 February 2020 (UTC)
There are two types of revdel:
  • admin-revdel (which admins can do and undo, and all admins can still see the admin-revdel'ed content; this is logged in Special:Log/delete)
  • oversight-revdel (which only oversighters can do and undo, and only oversighters can still see the content; this is *not* logged publicly)
If you think you need revdel, have a look at Wikidata:Deletion policy#Revision deletion first and decide which sort of revdel you need. As WD:AN is watched by hundreds of editors, it may in many situations be wiser to approach individual admins or oversighters via email, in order not to draw too much attention to the problematic content. This is usually the case for content pages (items, etc.), but often not so much on high-traffic project pages such as the Project chat. --MisterSynergy (talk) 12:04, 17 February 2020 (UTC)
Indeed, requests for revdel are best sent to individual admins using wikimail.--Ymblanter (talk) 19:54, 17 February 2020 (UTC)
While we don't have an admin mailing list, we do have oversight@wikidata.org for the people with Oversight rights. I think in most cases that's a better road than using Wikimail to contact individual admins. ChristianKl❫ 14:40, 18 February 2020 (UTC)

dead links - items with broken references

The data item about Sue Gardner (Q7524) has at least two broken references (for date of birth). At enwiki such links are tagged with {deadlink}. What is the practice here? Thanks in advance, Ottawahitech (talk) 02:32, 17 February 2020 (UTC)

Sue Gardner (Q7524). Ideally add archive URL (P1065), but in this case, the links don't seem to have been archived. If it was a main statement it could be deprecated, or given qualifiers, but I'm not sure what to do when it's already a qualifier. Ghouston (talk) 06:12, 17 February 2020 (UTC)

Wikidata weekly summary #403

Extract data from Wikipedia templates

It seems that the tool TemplateTiger is not working anymore, then I look for another tool to extract data from infobox templates to add them to Wikidata. Normally the tool HarvestTemplates would be the solution, but in some cases the data format can't be interpreted by the tool. Then I will need to extract the data, adjust the format and then import it to Wikidata in a batch. --Cavernia (talk) 21:21, 17 February 2020 (UTC)

@Cavernia You can use HarvestTemplate's demo mode and download the "demo results" as CSV for further editing. This usually works. Vojtěch Dostál (talk) 10:31, 20 February 2020 (UTC)
@Vojtěch Dostál: I've tried that, but the data is not included in the export, only the error message (like "no target page found"). --Cavernia (talk) 20:21, 20 February 2020 (UTC)
@Cavernia Sometimes you can fool HarvestTemplates by feeding it some specific type of property. If you give me your use case, I can try. Vojtěch Dostál (talk) 20:47, 20 February 2020 (UTC)
@Vojtěch Dostál: Example: [1] The item Q17764175 contains the information "Torpedert og senket 14. juli 1940" which means torpedoed and sunk 14th of July 1940. The motivation to extract this information is to restructure it and import it as a significant event (P793) entry of torpedo attack (Q50295027) with qualifier point in time (P585). --Cavernia (talk) 21:07, 20 February 2020 (UTC)
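For the restructuring step described here, a minimal sketch is shown below. It assumes the infobox text has already been extracted by some other means, that the free text follows the pattern "Torpedert ... <day>. <month> <year>", and that the result should be a tab-separated QuickStatements (version 1) line. The function name, the regular expression and the month table are illustrative assumptions, not part of any existing tool.

```python
import re

# Norwegian month names mapped to month numbers.
MONTHS = {
    "januar": 1, "februar": 2, "mars": 3, "april": 4, "mai": 5, "juni": 6,
    "juli": 7, "august": 8, "september": 9, "oktober": 10, "november": 11,
    "desember": 12,
}

# Matches e.g. "Torpedert og senket 14. juli 1940".
PATTERN = re.compile(r"[Tt]orpedert.*?(\d{1,2})\.\s*([a-zæøå]+)\s+(\d{4})")


def to_quickstatement(qid, text):
    """Turn the free-text note into one QuickStatements line:
    item, P793, Q50295027, with qualifier P585 at day precision (/11).
    Returns None when the text does not match the assumed pattern."""
    match = PATTERN.search(text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name)
    if month is None:
        return None
    date = f"+{int(year):04d}-{month:02d}-{int(day):02d}T00:00:00Z/11"
    return "\t".join([qid, "P793", "Q50295027", "P585", date])


print(to_quickstatement("Q17764175", "Torpedert og senket 14. juli 1940"))
```

Lines that do not match return None, so they can be set aside for manual review rather than imported with a guessed date.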

Translation tag help request for Wikidata Tours

Hi all

Over the past few months @NavinoEvans:, @Alicia Fagerving (WMSE): and I have been working on fixing the bugs and completing many of the missing Wikidata Tours. We now have a pretty extensive set of new tours ready to go, with more almost ready... but we have an issue: the translation tags on the new version of the main tours page are really broken. Does anyone have experience in fixing mangled translation tags? I spent over an hour today trying to fix it but am still getting loads of errors. Once we have this complete we can concentrate on churning out new tours and make it much easier for new contributors to learn how to contribute to Wikidata.

https://www.wikidata.org/wiki/User:Alicia_Fagerving_(WMSE)/Sand_box

The two jobs are:

  • Fix the existing translation tags
  • Explain on the talk page how to add new tags in a way that makes sense when we start adding additional tours

Thanks very much indeed

--John Cummings (talk) 22:14, 17 February 2020 (UTC)

Leading and trailing spaces ...

Labels and Descriptions can handle leading and trailing spaces. Why can't other string fields do the same? - PKM (talk) 22:56, 17 February 2020 (UTC)

Do you have an example where the spaces are significant? ArthurPSmith (talk) 14:28, 18 February 2020 (UTC)

Using image as a source

Sometimes I take images of plates and graves and I'd like to know if there is a way to use them as a source for a statement here (date/place of birth, date/place of death). I have never looked into that but I have always assumed that, as part of the general long-term goal of having structured data also on Commons, images should become more integrated into a network of structured data. Any previous discussion on the topic?--Alexmar983 (talk) 02:19, 18 February 2020 (UTC)

There's a property for linking a grave image, namely image of grave (P1442), which I suppose could be used as a reference statement. I'm not sure if that's good practice or not. Also, commemorative plaque image (P1801) for plaques. Ghouston (talk) 06:37, 18 February 2020 (UTC)
Can statement links be given as value to reference URL (P854)? --SCIdude (talk) 07:57, 18 February 2020 (UTC)
No, but you can put any statement in the reference section. Ghouston (talk) 08:24, 18 February 2020 (UTC)
I know that there is a property for specific images, but is there a way to specifically use that image as a reference for a clear specific fact? Not only the date of birth/death but also name of spouse, nationality, place of burial... Same for a document: suppose I have an image of a contract with details about a certain item. I suppose you could use the Commons URL but that's not very elegant. I think it's time we accept images as a source, and I mean doing this with the highest possible standard of quality (like with clear metadata on Commons). In a way it encourages the depth of metadata.--Alexmar983 (talk) 12:39, 18 February 2020 (UTC)

What about a "reference file" instead of "reference url"? I have a similar problem with some authorizations for WLM: they are stored in a database of Wikimedia Italy, and if I want to give a starting date for the competition and their WLM ID, or even a source for an address or proof of inclusion in a bigger complex, I could do it only with a URL. The scans of the files are CC BY-SA; they may not be bot-accessible, but this way everything is transparent.--Alexmar983 (talk) 13:38, 18 February 2020 (UTC)

Can we enlarge the concept beyond tombstones? I mean, a "normal" source is a document, and some of these documents can already be imported as images on Commons. If the related image on Commons has all the correct metadata to describe it, why shouldn't we use it directly? Of course we can add more tertiary sources when we have them, but why is this not an option? They are sources that everybody can quickly double-check.--Alexmar983 (talk) 13:58, 18 February 2020 (UTC)
Are you looking to do something like this: Cornelia Augusta Betts (Q85430571), look at the references for her date of death. --RAN (talk) 14:19, 18 February 2020 (UTC)
yes that is more structured, and flexible. Thank you RAN.--Alexmar983 (talk) 15:26, 18 February 2020 (UTC)
For working on instance_of=human. I also highly recommend getting a free Familysearch account and linking the entry here to any entry there, or creating a new entry at Familysearch. They have birth, marriage, and death records online for free. You can also add images there that are fairuse under international copyright law, that cannot be stored at Wikimedia Commons. Also apply for a free Newspaper.com account at Wikipedia:The Wikipedia Library. --RAN (talk) 19:24, 18 February 2020 (UTC)

Instances vs Classes for theme park attractions that are very similar

Hi, I'm working on Disney-related Wikidata entries. One thing I've noticed is that some attractions seem to be instances and some seem to be classes. For example, take Pirates of the Caribbean (Q1713564). This item clearly needs cleanup. My question is, there are no fewer than five versions of this attraction around the world. I'm assuming that it is improper to have one item with five different "part of" properties and five different "coordinates" properties; an item can't be a "part of" 5 different theme parks! Moreover, the version in Shanghai Disneyland Park (Q865312) is sufficiently different that it merits its own entry.

Is it fair to say that there should be a general Pirates of the Caribbean attraction entry, and five other new items that "instance of" that master item? Or is that too much bloat? --OnePt618 (talk) 03:38, 18 February 2020 (UTC)

@OnePt618: It is fair to say that there should be a general Pirates of the Caribbean attraction entry, and five other new items that are "instance of" that master item. Yes please. Make it so. --Tagishsimon (talk) 04:01, 18 February 2020 (UTC)
@Tagishsimon: Wonderful, that matches my understanding. Thank you so much! --OnePt618 (talk) 04:04, 18 February 2020 (UTC)

How to connect Model Trains Museum with rail transport modelling

Items are Q6888298 and Q623272. Smiley.toerist (talk) 09:43, 18 February 2020 (UTC)

Property:P921? Pietro (talk) 11:04, 18 February 2020 (UTC)
instance of (P31) museum (Q33506) / of (P642) rail transport modelling (Q623272). Circeus (talk) 23:45, 18 February 2020 (UTC)

Problem with applies to name (P5168)

It's not clear whether the value of this is a name of the subject or the object.

An example of this would be UNO (Q17267) which is named after 1 (Q199), but specifically the Spanish name "uno". However, it's not clear from the qualifier whether the subject is named after the object's Spanish name "uno"; or the subject's Spanish name "uno" is named after the object.

In fact, this can also be considered unclear (non-explicit) for any named after (P138) statement qualified with P5168 where the subject and object have multiple names. I'm sure there are other statements too where the use of this qualifier can be unclear.

It seems to me like there should actually be two qualifiers in the vein of subject has role (P2868) (applies to name of subject) and object has role (P3831) (applies to name of object). Does this seem sensible to anyone else? Would be happy to make proposals if there's support for this. --SilentSpike (talk) 11:41, 18 February 2020 (UTC)

  • See Wikidata:Property proposal/applies to name. --- Jura 12:22, 18 February 2020 (UTC)
    • I did have a look there, but it looks like discussion didn't really address this matter --SilentSpike (talk) 12:33, 18 February 2020 (UTC)
      • I think it did. The qualifier is for the subject, not the object. --- Jura 12:41, 18 February 2020 (UTC)
        • Then I would propose renaming to "applies to name of subject" and the creation of a second qualifier "applies to name of object". However, it seems likely there could be existing misuse of P5168 which can't always be assumed to apply to the subject. --SilentSpike (talk) 13:11, 18 February 2020 (UTC)

I have started a proposal here: Wikidata:Property_proposal/applies_to_name_of_object --SilentSpike (talk) 15:35, 18 February 2020 (UTC)

Grant Application Wikidata + Performing Arts

Project Grant Application by the Canadian Arts Presenting Association and the Conseil québecois du théâtre for the population of Wikidata with performing arts related data. – Please review, comment, endorse...! --Beat Estermann (talk) 22:04, 18 February 2020 (UTC)

Notified participants of WikiProject Cultural heritage

Instrument used

Descriptions of art objects sometimes include instrument used, like ball pen or brush. What property can be suggested ? - Kareyac (talk) 08:03, 19 February 2020 (UTC)

Randomly looking at the properties used to link to ballpoint pen (Q160137) (query), material used (P186) seems to be most common, with fabrication method (P2079) as the runner-up. (Caveat: that query won’t capture statements using the item as a qualifier or reference.) fabrication method (P2079) sounds promising to me (and is also listed on Wikidata:WikiProject Visual arts/Item structure#Describing individual objects); but maybe it’s also worth asking on some talk page of that WikiProject? --TweetsFactsAndQueries (talk) 14:47, 19 February 2020 (UTC)

soweego 2 proposal

(Please disregard this message if you have already read it in the Wikidata mailing list, and apologies for the distraction)

  • TL;DR: soweego 2 is on its way!
  • The Project Grant proposal is out for your consideration:

Hi everyone,

Does the name soweego ring a bell? It's an artificial intelligence that links Wikidata to large catalogs: https://soweego.readthedocs.io/ It's a close friend of Mix'n'match (Q28054658), which generally copes with small catalogs.

The next big step is to check Wikidata content against third-party trusted sources. In a nutshell, we want to enable feedback loops between Wikidatans and catalog maintainers. The ultimate goal is to foster mutual benefits in the open knowledge landscape.

I would be really grateful if you could have a look at the proposal.

Can't wait for your feedback.

Best,
Hjfocs (talk) 15:57, 19 February 2020 (UTC)


Soweego 1 review

There seem to have been several reports on Soweego 1, but I don't think they were shared at Wikidata or elsewhere, nor do the actual outputs seem to be widely known. Reports are:

The outputs seem to be:

The Mix'n'match catalogues seem to be either unused or unusable. I suppose they are "automatched" based on the Soweego scores and qid, but now contributors would have to confirm them manually.
As of today, this is hardly done and even users who work on other catalogues might not find them as the entries aren't searchable by text (try to find a record by name). The entries provide no information that would allow to do that without going back to the source database.
The few unmatched ones create horrible entries like [2], repeating the label and description from MnM: "8447 soweego confidence score: 0.5026371479034424"
For databases where this is possible, maybe a suitable use of the MnM catalogues could be to import most entries that don't have a potential duplicate in MnM.
Is there any guidance available what's meant to be done with the MnM catalogues? --- Jura 15:31, 22 February 2020 (UTC)


Notified participants of WikiProject Movies. As IMDb is included, WP Movies would probably be most interested. --- Jura 13:36, 23 February 2020 (UTC)

YVNG ID

Are these values being added correctly? Please look at Annemarie Loepert (Q58358600) and the YVNG_ID, when I click on it, it takes me to a generic page, not the entry for the person. The actual link appears below as a reference. The actual ID is formatted as "4118003&ind=35" with the second number as the database within the YVNG website and 4118003 is the entry in that database, as best as I can figure it out. YVNG has multiple databases that can be searched. Compare it to the "Jewish Museum Berlin person ID" in her entry and it takes me directly to her entry in the database. --RAN (talk) 07:08, 20 February 2020 (UTC)

The ID's URL template is wrong. The description reads "identifier in the Yad Vashem central database of Shoah victims' name", and the original proposal also specifically mentions the victims' names database. The identifier's name, "YVNG", is the same Yad Vashem (sometimes) uses for the personal identifiers, and the subdomain of that specific dataset. Even the examples included in the proposal link to the individual name records. The URL template, however, is different from the examples?

I would suggest changing the URL template to https://yvng.yadvashem.org/nameDetails.html?language=en&itemId=$1.

I am also wondering if the name should perhaps be changed to something more meaningful, such as "Shoa Victim Name ID". The abbreviation "YVNG" does not seem to be "official", but merely an internal, technical one. --Matthias Winkelmann (talk) 08:25, 20 February 2020 (UTC)
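To make the effect of the suggested URL template concrete, here is a minimal Python sketch of how a `$1` placeholder could be expanded. It assumes the identifier is substituted verbatim, which is a simplification and may not match how every Wikidata tool encodes the value; `expand_formatter_url` is a hypothetical helper name, not part of any Wikidata API.

```python
def expand_formatter_url(template: str, identifier: str) -> str:
    """Replace the $1 placeholder in a formatter URL with the identifier.

    Assumption: the identifier is inserted as-is, without URL-encoding.
    """
    return template.replace("$1", identifier)


# Template suggested in the comment above; ID format as described by RAN.
TEMPLATE = "https://yvng.yadvashem.org/nameDetails.html?language=en&itemId=$1"
url = expand_formatter_url(TEMPLATE, "4118003&ind=35")
print(url)
```

With verbatim substitution, the "&ind=35" part of the stored ID simply becomes a second query parameter of the resulting URL.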

Cemetery plot

A cemetery plot can contain multiple graves. Has anyone created an entry for one so I can see how it is structured and how the people buried there are listed? --RAN (talk) 08:48, 20 February 2020 (UTC)

Some are listed at Special:WhatLinksHere/Q1541002, although a quick perusal did not turn up anything spectacular. Just "has part" for the individual graves. Quantity Buried and Category of People Buried Here are also relevant.--Matthias Winkelmann (talk) 09:31, 20 February 2020 (UTC)

We have burial plot reference (P965), which is used as a qualifier to place of burial (P119). As every cemetery uses a different scheme to identify the plots, it's just a plain string. See e.g. Marc-Antoine Berdolet (Q152944) for how it is used. There are probably only very few cases where the grave is notable enough by itself to have an item here, and there the property then probably needs to be used as a statement. AFAIK there's no inverse property "people buried here", and that would get very unwieldy if applied to any bigger cemetery, so I think it won't be a good idea to create one. Ahoerstemeier (talk) 17:23, 20 February 2020 (UTC)

UK 1922 or 1927?

There are:

The descriptions on these two items use 1922 or 1927 (oddly, not even in one language the year is the same on both items). Also, thousands of items use both items as values in country of citizenship (P27) with start/end year in qualifiers (20818*12 April 1927) [3].

Obviously, the descriptions and these qualifiers should match. So, what should it be? Do we need to have a bot update or remove all qualifiers? --- Jura 10:28, 20 February 2020 (UTC)

The territorial change happened in 1922, but the name of Parliament only changed in 1927 by the Royal and Parliamentary Titles Act 1927. We certainly should not say that the UK was *founded* in 1927. Owain (talk) 11:40, 20 February 2020 (UTC)
I'm still mystified about how the UK could have been founded in 1922 or 1927, while the USA was founded in 1776. It's not like the territory of the US hasn't changed a bit since then, but apparently name changes are all-important. It means that the entity that we call the UK didn't take part in WWI, for example. Ghouston (talk) 11:46, 20 February 2020 (UTC)
That was the point I was trying to clarify. The UK wasn't founded in 1922 or 1927, but 1801, and of course Great Britain and Ireland shared the same monarch since 1603. You are right that territorial changes happen all the time, which is why I added the clarification "territorial extent from 1922". Even the names of states change, but that should not and does not change the date on which the state was founded. Owain (talk) 12:10, 20 February 2020 (UTC)
I think it's debatable if there should be two items, but if we have two, we should try to use them consistently and add adequate descriptions.
If the item is only applicable starting 1927, including 1922 in the description will likely lead the users to apply the wrong one. --- Jura 15:31, 20 February 2020 (UTC)
There will be two items even if only because Wikipedias have articles for both. But if you check en:United Kingdom, it gives a range of "formation" dates in the infobox, starting at 1535. I don't think it would be unreasonable to treat the United Kingdom of Great Britain and Ireland (Q174193) item as an historical period of United Kingdom (Q145) instead of a separate country in its own right. Ghouston (talk) 22:56, 20 February 2020 (UTC)

Whilst I have sympathy for the view that UKoGB&I = UK, equally, let me scotch a couple of Owain's points. "the UK wasn't founded in 1922 or 1927, but 1801" ... no. The UKoGB&I was founded in 1801. And something called the UK was founded in 1921/22/27 (take your pick); which had different territory, different parliament, different name. It's not a genie that you'll ever get back into a 'just a rename' bottle. "Great Britain and Ireland shared the same monarch since 1603" and Canada, Oz, NZ &c share the same monarch today, yet are not the same country, so we can dispense with that straw man. If the whole name/parliament/law thing does not mark, for you, the passing of one state and the birth of another, that's fine. But you have to ask why the loss of most of Ireland is not significant, whilst the union with Scotland, or the merging of GB & Ireland are significant. By the logic proffered, we could equally say this is all really the Kingdom of England rolling on as it does, gathering a country here, losing it there. You can look to the USA item and wonder why we don't just handle the UK like that. Or you could look at France (Q142) but also at French First Republic (Q58296) and French Third Republic (Q70802) (to choose just two of the many France Country type items) and conclude that if anything, the UK is slightly clearer. It's all not ideal; it is complex; it is not helped by eliding over the need to deal with the very real changes of state by asserting that an arbitrary set of them are the same. --Tagishsimon (talk) 18:20, 20 February 2020 (UTC)

In what way does the pre-1922 and post-1922 UK have a different Parliament? Why does a change of name make a material difference here, but not in say, Myanmar (Q836)? Owain (talk) 19:28, 20 February 2020 (UTC)
  • Just to return to the original question. Will we stick to 1927 to differentiate the two or not? --- Jura 15:29, 21 February 2020 (UTC)
  • Unclear unless we know the reason that we are considering them separate entities. Is it because the territory changed, or because the name changed? Ghouston (talk) 01:29, 22 February 2020 (UTC)
    • 1927 used in thousands of qualifiers is for Royal and Parliamentary Titles Act 1927 (Q7375047). --- Jura 08:24, 22 February 2020 (UTC)
      • That looks like a simple name change, which can be handled with multiple official name (P1448) statements with start and end qualifiers. The most significant change was the loss of most of Ireland in 1922. However, I still think it's a questionable interpretation that that constitutes the creation of a "new" United Kingdom, any more than the loss of the Philippines needs to be interpreted as the creation of a new USA. Note also that Royal and Parliamentary Titles Act 1927 (Q7375047) is marked as an instance of Act of Parliament of the United Kingdom (Q4677783), like many other acts passed by parliaments of "different countries" even back to the early 1800s such as Slave Trade Act 1807 (Q770832). Ghouston (talk) 23:33, 22 February 2020 (UTC)
      • In other words, I'd say it's most consistent with history, as most people understand it, to say that the UK lost some territory in 1922, and changed its name in 1927, than to say that the old UK was dissolved in 1922, a new UK formed, and then 5 years later they realized that they had forgotten to name the new country properly and renamed it. Ghouston (talk) 23:44, 22 February 2020 (UTC)
      • Well, saying "dissolved" and "newly formed" is unfair, it's more like the "old UK" split into the new state of Ireland and the "new UK". But many kinds of entities change over time, and we can represent them either with new and old items at the point of change, or a single item with start and end dates on particular statements where needed. Either way is presumably valid, but I think in this case, it's better represented as a state that lost some territory than as a split, since so many features of the UK remained the same. Ghouston (talk) 00:34, 23 February 2020 (UTC)
      • Note that after the formal split, both of the entities were eventually renamed: the UK in 1927, and the Irish Free State was renamed to Éire, or Ireland, in 1937 and "described" as Republic of Ireland in 1948, according to enwiki. But 1922 is more relevant for the split than 1927. Ghouston (talk) 00:50, 23 February 2020 (UTC)
      • A similar situation was discussed recently in the context of Scotland possibly leaving the UK. See https://publications.parliament.uk/pa/cm201213/cmselect/cmfaff/643/643.pdf. The options are discussed starting on page 13, but basically some model or other would need to be adopted, one option being that the "RUK", the remainder of the UK, would continue as the successor of the UK, for the purposes of ~14k treaties, membership of the UN, etc., and Scotland would need to start as a new country from scratch. On page 130, Lidington says: "If we look at analogous examples, when Ireland established the Irish Free State in 1922 the United Kingdom continued to exist. It was accepted as such. The Free State and subsequently the Irish Republic became new countries. The same applied when India, which as a dominion had been a founder member of the United Nations, separated from Pakistan. India was accepted as a continuing state; Pakistan was the new state and had to apply to join the international organisations. The same took place when Eritrea became independent from Ethiopia, when South Sudan became independent from Sudan, when Malaysia and Singapore separated. If you look at recent European history, it is very striking that at the time of German unification the Federal Republic of Germany continued to exist and was accepted as such and what happened in international law and in terms of membership of organisations was that new Länder from the former German Democratic Republic became part of that continuing Federal Republic of Germany." Ghouston (talk) 06:28, 23 February 2020 (UTC)
        • @Ghouston: Very good set of examples. That issue of being internationally recognized as a continuing state is probably crucial. - Jmabel (talk) 06:44, 23 February 2020 (UTC)
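The "single item with start and end dates on particular statements" option mentioned in this thread can be sketched roughly as follows. This is only an illustration in Python, not how Wikibase stores data: `NameStatement` and `name_at` are hypothetical names, and the dates (1 January 1801 and 12 April 1927) are used here as illustrative start and end qualifiers for the two official names.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class NameStatement:
    """One 'official name' value with optional start/end qualifiers."""
    name: str
    start: Optional[date]  # None means "no known start"
    end: Optional[date]    # None means "still valid"


def name_at(statements: List[NameStatement], when: date) -> Optional[str]:
    """Return the name whose [start, end) interval contains the date."""
    for s in statements:
        started = s.start is None or s.start <= when
        not_ended = s.end is None or when < s.end
        if started and not_ended:
            return s.name
    return None


# Illustrative data: one item, two dated names, no second "country" item.
UK_NAMES = [
    NameStatement("United Kingdom of Great Britain and Ireland",
                  date(1801, 1, 1), date(1927, 4, 12)),
    NameStatement("United Kingdom of Great Britain and Northern Ireland",
                  date(1927, 4, 12), None),
]

print(name_at(UK_NAMES, date(1914, 8, 4)))
print(name_at(UK_NAMES, date(1930, 1, 1)))
```

Under this model a consumer picks the name valid for a given date, while the item itself stays the same across the 1922 territorial change and the 1927 renaming.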

County organization of a state

When going through #US_counties, one comes across many items like county of Virginia (Q13415368). I think it would be interesting to add statements there about what government institutions or facilities can be found in each county of that state (various boards). The sample here is for counties, but it should also work for other classes of territorial government entities. Some aspects might be identical for every state, but the equivalence is generally for census purposes.

Eventually we might even have items for each institution in every county, but, to start, it might be good to identify the classes. has parts of the class (P2670) might work for that, or maybe it should be more specific. What do you think? --- Jura 11:27, 20 February 2020 (UTC)

Proposal to undeprecate comment (DEPRECATED) (P2315) ("Comment" property)

Even though this is not structured data, sometimes it is useful to add an editorial comment about an entity to notify editors of the item (e.g. what should not be changed or added). This is similar to comments in the source of the Donald Trump article in English Wikipedia. For example, Talk:Q19862406 contains some information that is important to editors. Another example is information that should not be added to Wikidata, or that was removed by consensus (currently there is no way to indicate this). This property is to be used as both main statement and qualifier. The recently created Wikimedia community discussion (P7930) is to be used together with this property (as a reference).

For clarity, I also propose to rename the property to "editorial comment" and explicitly state that these should be ignored by data (re)users. However, as it is important to editors, I propose to order it at the top of the entity page, even above instance of (P31).--GZWDer (talk) 12:15, 20 February 2020 (UTC)

Once again, what's wrong with suggesting people actually look at the Talk pages for this kind of thing? Most items have no talk page, so the existence of one (blue instead of red link) should be a clue to read it... ArthurPSmith (talk) 19:18, 20 February 2020 (UTC)
I don't think most users will read the talk page before editing an item.--GZWDer (talk) 22:39, 20 February 2020 (UTC)
What makes you think they will read this comment either? ArthurPSmith (talk) 14:26, 21 February 2020 (UTC)
Well, except that some people think it’s useful to create talk pages with nothing but {{Item documentation}}, producing blue links with no useful information. —Galaktos (talk) 12:57, 22 February 2020 (UTC)
  • Support. For analogy, consider the MARC standards for book cataloguing [4]. These are very very structured, and very very prescriptive, with the aim being to capture as much of the information about the book or edition as possible, in a very prescribed framework. But even MARC allows a number of fields (identified by a code in the 500 to 599 range) for notes of different kinds in free-text. Sometimes there are things (including important warnings and caveats) that are worth communicating to other people reading an item, that simply can't be expressed just with property statements and qualifiers. GZWDer is quite right that most people simply will not see such messages if they are on the talk page. Nor are they accessible for query and retrieval there. The assertion sometimes made that "this isn't structured data, therefore we can't have it here" is very thin, giving no answer to the question "Why not?" Indeed, being able to attach comments as qualifiers to particular statements does locate those comments in a very specific and structured way.
Obviously, wherever possible, we should try to express information through properties and values that are immediately internationalised, and expressed in terms of items that are themselves parts of the wikibase. Wherever possible we must make the extra effort to try to express information that way. But it isn't always possible, so in my view: yes, there is a role for free-text comments, attached as qualifiers to statements, or as main statements to items. Jheald (talk) 15:14, 21 February 2020 (UTC)

Questions about photo(s) on a wikidata page

1. Is only one photograph permitted per wikidata page about a person?

2. Can a wikidata page about a person have 3 photos, if they show that person in their youth, their middle age and their elder years to give a sense of the changes to that person appearance over their lifetime?

Thank you,

    Tibet Nation (talk) 18:26, 20 February 2020 (UTC)
The image (P18) property is intended to hold a single image that can be displayed in infobox templates, etc. If you want multiple representations, it's best to make a montage, and use collage image (P2716). Ghouston (talk) 22:48, 20 February 2020 (UTC)
The infobox displays only the first image. If you want to have it display the second image you have to deprecate the first image or prioritize the second image. The field does not give an error message if it contains more than one image, so there is no current restriction. When previously discussed, people were upset by more than three images. I personally think it should hold at least the two best images, and it would be even better if periodically it automatically switched the prioritized image just to change things around once in a while. --RAN (talk) 04:57, 21 February 2020 (UTC)
Restricting to a single image is the only restriction that doesn't seem completely arbitrary to me. If you are going to have multiple images, you may as well include every relevant image from Commons. Ghouston (talk) 05:08, 21 February 2020 (UTC)
Reductio ad absurdum always welcome! --RAN (talk) 06:27, 21 February 2020 (UTC)
When you have N images for an item, it's likely that somebody will think they don't depict the concept sufficiently, and they need N+1. That's my reasoning for stopping the process at 1. How many images would you need to depict a large city properly? Ghouston (talk) 01:26, 22 February 2020 (UTC)
Exactly. In the beginning, we did not limit the number of images in P18 and we ended up with items with 100+ images (somehow related to the item) in P18.--Jklamo (talk) 15:01, 22 February 2020 (UTC)
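The rank behaviour RAN describes (deprecate the first image or prioritize the second) can be sketched in Python as below. This is a simplified model of "best rank" selection, assuming an infobox takes the first of the best-ranked values; actual infobox modules differ between wikis, and `best_values` and `infobox_image` are hypothetical helper names.

```python
from typing import List, Optional, Tuple

Statement = Tuple[str, str]  # (file name, rank)


def best_values(statements: List[Statement]) -> List[str]:
    """Preferred-rank values if any exist, otherwise normal-rank values.

    Deprecated values are never returned.
    """
    preferred = [value for value, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    return [value for value, rank in statements if rank == "normal"]


def infobox_image(statements: List[Statement]) -> Optional[str]:
    """Assumption: the infobox shows the first best-ranked image."""
    best = best_values(statements)
    return best[0] if best else None


# Hypothetical file names for illustration only.
both_normal = [("Young.jpg", "normal"), ("Old.jpg", "normal")]
second_preferred = [("Young.jpg", "normal"), ("Old.jpg", "preferred")]
first_deprecated = [("Young.jpg", "deprecated"), ("Old.jpg", "normal")]

print(infobox_image(both_normal))
print(infobox_image(second_preferred))
print(infobox_image(first_deprecated))
```

Either change of rank makes the second image the one that is picked, which matches the two options given above.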

Need redirect auto replacement

Take a look at Mary Sarah Wellesley (Q75387210) and the field for father. Because of a merge the field contains the old value that leads to a redirect so it gets an error message. Is there any way that these can be auto-replaced with the correct value? It also leads to problems in the genealogical graphic at Commons. See Commons:Category:George_Cadogan,_5th_Earl_Cadogan where the bad value appears as the Q-number of the redirect. --RAN (talk) 20:46, 20 February 2020 (UTC)

Good old Peerage bulk import problems! There is said to be a bot that replaces these in due time, namely User:KrBot, according to User:GZWDer. Ivan A. Krestinin (talk • contribs • logs) might have more information. -Animalparty (talk) 22:06, 20 February 2020 (UTC)
KrBot will wait at least 24 hours, probably more, in case the merge is reverted.--GZWDer (talk) 22:15, 20 February 2020 (UTC)
Ordinary redirects may be handled by Lua modules without any special handling. But in this case it is a double redirect, so the Lua module will not work. Currently there are two bots fixing double redirects, Revibot and PLbot.--GZWDer (talk) 22:18, 20 February 2020 (UTC)
Lua does sometimes need special handling for redirects. That's one of the reasons we still fix them (the other is WDQS). --Matěj Suchánek (talk) 10:26, 22 February 2020 (UTC)
  • Excellent, thanks! Better to wait for the autofix than to risk making a mistake myself. --RAN (talk) 01:24, 21 February 2020 (UTC)
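The difference between an ordinary redirect and a double redirect, as discussed in this thread, can be illustrated with a small Python sketch that follows a redirect chain to its final target. It is not the code used by KrBot, Revibot or PLbot; `resolve_redirect` and the item IDs are made-up placeholders.

```python
from typing import Dict


def resolve_redirect(item_id: str, redirects: Dict[str, str],
                     max_hops: int = 10) -> str:
    """Follow redirects until a non-redirect item is reached.

    Raises ValueError on a redirect loop or an overly long chain.
    """
    seen = set()
    current = item_id
    while current in redirects:
        if current in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long at {current}")
        seen.add(current)
        current = redirects[current]
    return current


# Placeholder IDs: Q_old was merged into Q_merged, which was later
# merged into Q_final, producing a double redirect.
REDIRECTS = {"Q_old": "Q_merged", "Q_merged": "Q_final"}

print(resolve_redirect("Q_old", REDIRECTS))
print(resolve_redirect("Q_final", REDIRECTS))
```

A consumer that only follows one hop would stop at the intermediate item, which is the failure case described above.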

Special:MostRevisions

Special:MostRevisions wasn't updated since 25 November 2019. Eurohunter (talk) 22:31, 20 February 2020 (UTC)

It has been disabled: phab:T239072. --Matěj Suchánek (talk) 10:07, 21 February 2020 (UTC)

Property:P1801

Property:P1801 is currently used for "commemorative plaques" and I added "signage" in the description. Can we change the name to more generic "signage" so the field can contain more than commemorative plaques? We often have pictures of the signs on buildings so we correctly identify them. Rather than create a new field, how about expanding this one? See Jedediah Higgins House (Q74473194) for an instance and peek at the plaque field. --RAN (talk) 04:36, 21 February 2020 (UTC)

I don't think that's a very good example & don't think 'signage' is a very sensible extension of P1801. Your example is a sign associated with a house known as the Jedediah Higgins House. A commemorative plaque tends to be a plaque affixed to a building (that is not named for the person named in the plaque), generally saying "Foo Bar, lived here, 1739-1829" - example at https://www.wikidata.org/wiki/Q7251#P1801. The plaque commemorates the person. The sign merely tells you what the building is called. So, yeah: strongly oppose. --Tagishsimon (talk) 04:41, 21 February 2020 (UTC)
You want to tighten the use to instance_of=human? I am not against using the related_image field to hold signage images, if we remove the restriction of not allowing use of related_image if the image field is populated. --RAN (talk) 04:59, 21 February 2020 (UTC)
I would use place name sign (P1766), see for example Thomas Paine Cottage (Q7792975) and Ray Charles Childhood Home (Q55806705) Piecesofuk (talk) 09:52, 21 February 2020 (UTC)
Perfect! I will add it as a "See also" with image, so more people are aware of it. --RAN (talk) 16:19, 21 February 2020 (UTC)
I tried to add in all the image fields as see alsos in image, if you know of others, or can search for all of them, please add them. There are about a dozen so far. Most of them I was not aware of. --RAN (talk) 21:06, 22 February 2020 (UTC)

Change of identifier

I don't know if this is the right place to advise users about a proposal. Please have a look at this. --★ → Airon 90 07:56, 21 February 2020 (UTC)

  • For those who don't want to waste their time, it's a proposal to simplify the TripAdvisor ID. - Jmabel (talk) 16:48, 21 February 2020 (UTC)

TV Show Judges

Does anyone know of a property that I could use in order to list the people who were judges on a TV show? I can't find anything that would do the job properly, they're not a presenter (P371), participant (P710) or cast member (P161). Any suggestions? - X201 (talk) 11:13, 21 February 2020 (UTC)

They could be a cast member or participant with qualifier of object has role (P3831) = Reality television program judge (Q60118864) or similar. --Tagishsimon (talk) 18:50, 21 February 2020 (UTC)
  • Just for the sake of conversational clarity - there is a difference between TV Show Judges, and TV Show judges and judges on a TV show. Judy Sheindlin (Judge Judy) is different from Katy Perry (American Idol) which is different from Simone Missick who plays Judge Lola Carmichael on All Rise. Quakewoody (talk) 19:08, 21 February 2020 (UTC)
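As an illustration of Tagishsimon's suggestion (cast member or participant with an object has role qualifier), the following Python sketch builds one statement line with a qualifier. It assumes the tab-separated version 1 command format of QuickStatements, where qualifier property/value pairs follow the main triple; `Q_SHOW` and `Q_JUDGE` are placeholders rather than real item IDs, and `quickstatements_line` is a hypothetical helper.

```python
from typing import Iterable, Tuple


def quickstatements_line(item: str, prop: str, value: str,
                         qualifiers: Iterable[Tuple[str, str]] = ()) -> str:
    """Build one tab-separated statement line with optional qualifiers."""
    fields = [item, prop, value]
    for qualifier_prop, qualifier_value in qualifiers:
        fields.extend([qualifier_prop, qualifier_value])
    return "\t".join(fields)


# cast member (P161) with object has role (P3831) =
# Reality television program judge (Q60118864).
line = quickstatements_line(
    "Q_SHOW", "P161", "Q_JUDGE",
    qualifiers=[("P3831", "Q60118864")],
)
print(line)
```

Swapping P161 for participant (P710) would give the other variant mentioned above; which one fits depends on the kind of judge, as Quakewoody notes.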

Property talk:P131#Possible change of usage

Please note the possible rescoping of located in the administrative territorial entity (P131). --- Jura 12:18, 21 February 2020 (UTC)

Merging coronavirus pages

Shouldn't Q290805 and Q57751738 be merged? Specifically, shouldn't those sitelinks in Q57751738 that are actually titled "coronavirus" be linked to Q290805 instead? Huji (talk) 15:29, 21 February 2020 (UTC) Ping me in your response please.

Q290805 has a different taxon rank than Q57751738 and Q57751738. This is not Wikipedia, where the title is the defining field, in Wikidata the statements define the concept. --SCIdude (talk) 16:29, 21 February 2020 (UTC) ...oh I forgot to @User:Huji: you...
@User:Huji: also note that the ukwiki has articles for both. It's quite possible that some of the sitelinks are connected to the wrong item, though. Ghouston (talk) 01:13, 22 February 2020 (UTC)
@Ghouston: ok fair. But are those pages associated with Q57751738 that are named "coronavirus" really correctly linked to Q57751738 or should they be connected to Q290805 instead? Huji (talk) 01:22, 22 February 2020 (UTC)
I don't know enough to say. It depends on what exactly is the difference between these two items. Perhaps one includes a larger set of virus than the other, given that one is a genus and the other a subfamily. Then the pages (and other items that link to the items) could be distinguished by which viruses they include. There's also Coronaviridae (Q1134583) if you go up to the family level. Ghouston (talk) 01:42, 22 February 2020 (UTC)
The frwiki article fr:Coronavirus, for example, despite its title, is about the subfamily. Ghouston (talk) 01:43, 22 February 2020 (UTC)

Reversing the roles of Twitter properties

Twitter properties Twitter username (P2002) and Twitter user numeric ID (P6552) are currently used with the former as main statement and latter as qualifier. However, only the latter is truly an identifier and the former easily becomes stale data without both a point in time qualifier and the P6552 identifier qualifier.

If you enter Twitter data at all, please contribute to the discussion at Wikidata:Requests_for_permissions/Bot/SilentSpikeBot#Relevant_property_discussion where I seek to establish a bot task to handle tidying up Twitter data and have started a discussion on whether the swapping of these property roles should be part of that. --SilentSpike (talk) 16:26, 21 February 2020 (UTC)

Computational limit

I asked this once before, but I am still not sure why we are at our computational limit. Wikidata:Database reports/items with P569 greater than P570 looks for people who died before they were born. It no longer runs because it times out in the 1 minute allotted for computations. What can be done to be able to run one of our important error detection searches for instances_of=human? It has been used to correct >1,000 errors in the past. It is important since it also detects errors that we imported from VIAF, and VIAF uses our corrections to correct their own database. As well as detecting typos, it finds where we have conflated two people of the same name. --RAN (talk) 17:36, 21 February 2020 (UTC)

I added FILTER(?bdate > ?ddate), which seems to help a bit (I got results without timeout both times I tried it, though I may have gotten lucky). --TweetsFactsAndQueries (talk) 18:57, 21 February 2020 (UTC)
Excellent, thanks! Wow, lots more errors to fix tonight. [BTW it was temporary, you must have tried during a slow load time]. --RAN (talk) 21:59, 21 February 2020 (UTC)
Maybe we could also run two reports, say one for men and one for women, and subdivide further when needed? I'm curious for your evidence that VIAF are ingesting our corrections (which would be good news); I thought they'd stopped. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:53, 21 February 2020 (UTC)
  • Oh no! Why did they stop? I can see that some of the conflated records we have a list of are very difficult to tease apart. Some of the simple typographical errors where two numbers are transposed were easy to fix and he corrected them right away when notified. Did the one person retire? I will look and see if I still have his email. --RAN (talk) 21:11, 22 February 2020 (UTC)
Wouldn't the easiest fix just be to double the limit to two minutes instead of just one? Robin van der Vliet (talk) (contribs) 13:26, 22 February 2020 (UTC)
The query service is already at maximum capacity, we can't easily let it do more work. ChristianKl❫ 15:47, 22 February 2020 (UTC)
  • Is this one of the reasons we have slowed down in adding new large data sets?
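For readers who have not seen the report, here is a rough Python sketch that builds a query of the kind discussed in this thread, including the FILTER mentioned by TweetsFactsAndQueries and the idea of splitting the report by gender. It is not the actual query behind the database report; `born_after_death_query` is a hypothetical helper, and whether any variant stays under the timeout has not been verified.

```python
from typing import Optional


def born_after_death_query(gender_qid: Optional[str] = None,
                           limit: int = 1000) -> str:
    """Build a SPARQL query for humans whose birth date is after death date.

    gender_qid optionally restricts the query via sex or gender (P21),
    e.g. Q6581097 (male) or Q6581072 (female), to split the workload.
    """
    gender_clause = f"  ?item wdt:P21 wd:{gender_qid} .\n" if gender_qid else ""
    return (
        "SELECT ?item ?bdate ?ddate WHERE {\n"
        "  ?item wdt:P31 wd:Q5 ;\n"
        "        wdt:P569 ?bdate ;\n"
        "        wdt:P570 ?ddate .\n"
        f"{gender_clause}"
        "  FILTER(?bdate > ?ddate)\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(born_after_death_query())
print(born_after_death_query("Q6581072", limit=500))
```

The generated text could be pasted into the query service by hand; the sketch deliberately does not send any request itself.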

(badtoken) Invalid CSRF token.

I'm receiving it while adding a new language code. Eurohunter (talk) 18:01, 21 February 2020 (UTC)

Two profiles for one person

Can someone merge Q75788407 into Q7812365? -- Zanimum (talk) 18:32, 21 February 2020 (UTC)

@Zanimum: ✓ Done; you can have a look at Help:Merge if you want to see how you can do this yourself. Mahir256 (talk) 19:58, 21 February 2020 (UTC)

Edit rate in Wikidata

Hello,

in QuickStatements the batches are running slowly. I am not a programmer and don't know much about what happens after saving an edit in Wikidata. As far as I have understood, the speed of editing in Wikidata depends on the maxlag parameter. If there is a big lag at the servers (and in recent months, as far as I know, mostly at the query servers), then the editing rate goes down. What is the current plan to solve the problem of lag at the query servers? I think it is not good if editing Wikidata with batches takes a long time. Something I suggested here is to create lists of data related to a specific topic that can be downloaded. I think that would reduce the number of queries, and then the query servers would have more time to write the changes. I am interested in doing this, and for that I need the data; at my home it would take too long to download the dump. -- Hogü-456 (talk) 21:28, 21 February 2020 (UTC)

https://lists.wikimedia.org/pipermail/wikidata/2020-February/013793.html  – The preceding unsigned comment was added by Tagishsimon (talk • contribs) at 21. 2. 2020, 23:46‎ (UTC).

Quote or excerpt

We have quote_or_excerpt where we can add in a short excerpt from a public domain book, usually the opening few sentences. For those in other languages do we ever translate them, and add them in another field and mark them somehow as a translation of an excerpt? This would be for a work that has no translated version. --RAN (talk) 22:05, 21 February 2020 (UTC)

Wikisource categories for topics

Wikisource has no page at "Staffordshire", so should we link its "Category:Staffordshire" to Category:Staffordshire (Q8809886) or, as we would for many Commons categories, with no page at that project, to Staffordshire (Q23105)? I favour the latter. If you disagree, please give reasons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:51, 21 February 2020 (UTC)

Commons categories are generally linked with the category item, if it already exists. Ghouston (talk) 00:49, 22 February 2020 (UTC)
...and, from what I have seen, Wikinews categories are linked with the main item. I believe what's going on with Wikisource could go either way depending on the category page in question. Mahir256 (talk) 05:25, 22 February 2020 (UTC)
Really? That's not my experience. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:36, 22 February 2020 (UTC)

Wikidata and SEO

There seems to be quite an increase in the number of people who use Wikidata as a tool for search engine optimization (Q180711) self-promotion. I'm worried that the spam items are created faster than we can find them and nominate them for deletion. --Trade (talk) 22:57, 21 February 2020 (UTC)

Give some examples. --RAN (talk) 03:52, 22 February 2020 (UTC)
@Trade: from what I can tell, a lot of people don't see this as a problem in the first place. Self-promotion is actively encouraged in Wikidata. The notability guidelines have been changed recently to stop discouraging people from creating their own item. We are on the right track to gather the sum of all knowledge about all SEO professionals and Wikimedians. − Pintoch (talk) 12:05, 22 February 2020 (UTC)
  • When people claim that their job is Entrepreneur/Influencer/SEO, we really shouldn't let them on WD to build their brand. Their brand should already be built before we have them here. It is the "catch-22" of notability: you can't be a brand until you are on WD, but you should be a brand before you are on WD. It really should take more than having an internet connection for someone to claim to be an influencer who needs to use WD for SEO. Quakewoody (talk) 14:41, 22 February 2020 (UTC)
  • We have millions of people on Wikidata already, many of them living. Maybe some day we'll have billions. But SEO people should definitely not be a priority; I'll happily vote for their deletion if they are not otherwise notable. ArthurPSmith (talk) 20:50, 22 February 2020 (UTC)

National Museums Greenwich subject ID

Hi. Does anyone know if there is an equivalent of Property:P7332 for subjects of artworks at National Museums Greenwich rather than the artists? I was trying to add a subject link to Q5224177 but it comes out as a brief note on the individual but no files as they were subject (person) rather than maker. To get the subject files the link should be https://collections.rmg.co.uk/collections.html#!csearch;authority=agent-8868;browseBy=person but the property forces the link to https://collections.rmg.co.uk/collections.html#!csearch;authority=agent-8868;browseBy=maker From Hill To Shore (talk) 02:02, 22 February 2020 (UTC)

Undetected vandalism

See Cayetano Coll y Toste (Q5055304) as an example. I only detected it because they changed the date to have the person dead before they were born. Many others in the current error detection batch had the same IP vandalism, all from different IPs. There are 10 detected IP vandalisms so far in the current month's batch, so that is one every 3 days or so. How are we detecting more subtle vandalism, like changing a date by a few years instead of 100 years? --RAN (talk) 02:46, 22 February 2020 (UTC)

I realise this may not apply to a lot of data, but it seems to me that any time a sourced statement is changed or removed it would need review. This is one way subtle vandalism could be detected and why sourcing statements is so important (sadly, most data here is not). --SilentSpike (talk) 10:41, 22 February 2020 (UTC)
If the date field contained a reference, that reference would still be in place when the record was vandalized, and the statement would appear to be properly referenced. Recently a warning has been added for when a date is changed and the reference is not. That may help. --RAN (talk) 17:27, 22 February 2020 (UTC)
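The check described in the last two comments (a sourced value changes while its reference stays the same) can be sketched in Python as follows. This is an illustration of the idea only, not the logic of the actual warning; `needs_review`, the dict layout and the sample values are all hypothetical.

```python
from typing import Dict


def needs_review(old: Dict, new: Dict) -> bool:
    """Flag an edit where a sourced value changed but its references did not.

    Each statement is modelled as {"value": ..., "references": [...]}.
    """
    value_changed = old["value"] != new["value"]
    was_sourced = bool(old["references"])
    references_unchanged = old["references"] == new["references"]
    return value_changed and was_sourced and references_unchanged


# Hypothetical revisions of a date-of-birth statement.
before = {"value": "1850-01-01", "references": ["ref: some catalogue"]}
subtle_change = {"value": "1853-01-01", "references": ["ref: some catalogue"]}
resourced_change = {"value": "1853-01-01", "references": ["ref: new source"]}

print(needs_review(before, subtle_change))
print(needs_review(before, resourced_change))
```

Such a check catches a shift of a few years as readily as a shift of a century, but, as noted above, it only helps for statements that were sourced in the first place.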

Taxonomy

Hoi, Please read this blogpost where a scientific taxonomy conference reports on taxonomy in Wikidata. It states quite clearly that the notions on taxonomy that have prevailed are at least problematic. Can we stop bickering for a change and accept that taxonomy is different from the current preconceptions? Thanks, GerardM (talk) 10:42, 22 February 2020 (UTC)

Sure „The conclusion of the Taxonomy team was that taxonomy is hard.“ No breaking news at all. Would be great to have some real content from Taxon Names and Concepts group (TNC). --Succu (talk) 21:15, 22 February 2020 (UTC)
BTW: The idea of a taxon concept (Q38202667) was established 25 years ago in The Concept of "Potential Taxa" in Databases (Q28957948). --Succu (talk) 21:54, 22 February 2020 (UTC)
"Wikidata has somewhat lost its way with taxonomy and it can be seen from the data that users do not understand the intricacies of taxonomic names versus taxonomic concepts". Or in other words, academics will spend a few more decades developing the platonic ideal of a perfect ontology while we're just making something that works. I'm entirely fine with it. Nemo 10:19, 23 February 2020 (UTC)
Info: Here are the etherpad notes of Cost MOBILISE Wikidata Workshop 2020 (Q84943795). --Succu (talk) 15:32, 23 February 2020 (UTC)
  • The biggest issue is that we are not able to make a clear choice between taxon concept and name concept. If each taxon item represents a name, and only one, then that item should inherit all the properties of the name; for example, a "recombination" should be a subclass of a taxon. Otherwise, if we don't accept this kind of thing, which is understandable, then the names should be separated so that we can work on them. The issue is that we do neither one nor the other, nor a synthesis of both. Christian Ferrer (talk) 18:34, 23 February 2020 (UTC)

How to mass-remove descriptions?

Jinmaku (Q651348) was incorrectly marked as instance of (P31)=Wikimedia disambiguation page (Q4167410), and the bots have added descriptions in many languages. Is there an easy way to remove them? Thanks. Mike Peel (talk) 17:21, 22 February 2020 (UTC)

  • It always had that P31, so the better solution would be to move the sitelinks to a new item and delete this one. Otherwise it would look like re-purposing. BTW there is a gadget to delete all descriptions. --- Jura 17:32, 22 February 2020 (UTC)
    Hmm, maybe, but it seems a waste of an item. What's the gadget name? Another similar case is Pfyffer (Q2084488), but that is more complicated (see enwp). Thanks. Mike Peel (talk) 17:50, 22 February 2020 (UTC)
    @Mike Peel: I believe you're looking for MediaWiki:Gadget-dataDrainer.js. Mahir256 (talk) 17:56, 22 February 2020 (UTC)
    @Mahir256: That looks likely, any chance it could be added to MediaWiki:Gadgets-definition please? It doesn't seem to be there at the moment. I see you're an interface admin, so you should have edit access to that page? Thanks. Mike Peel (talk) 18:02, 22 February 2020 (UTC)
    I think it's not included on purpose. --- Jura 18:38, 22 February 2020 (UTC)
    BTW, the jawiki page is a dab. --- Jura 18:40, 22 February 2020 (UTC)

How to qualify P22 to indicate 'official father' vs biological father ?

In a case like Diana Cooper (Q128576) (cf enwiki), how best to qualify father (P22) to distinguish her official father from her biological father?

I have looked at the table at Wikidata:WikiProject_Parenthood, but it's not very specific on this.

As for a probable parentage (eg Q5541503#P22, where the source (ODNB) says the later George IV "was believed to be the father of her son"), I think sourcing circumstances (P1480) (or perhaps nature of statement (P5102)) is the way to go, though I'm not 100% sure of the best value in this case. But I see from the table at the WikiProject we do also have may be father (Q21152551). Should this be used as the value of type of kinship (P1039) instead? My instinct is to keep qualifiers for the nature of the relationship distinct from qualifiers for how certain it may or may not be. But perhaps there are good reasons that have led to the creation of Q21152551 ? I'd like to know what people think. Jheald (talk) 18:45, 22 February 2020 (UTC)

ChristianKl (talk) 15:11, 24 June 2017 (UTC) Melderick (talk) 12:22, 25 July 2017 (UTC) Richard Arthur Norton Jklamo (talk) 20:21, 14 October 2017 (UTC) Sam Wilson Gap9551 (talk) 18:41, 5 November 2017 (UTC) Jrm03063 (talk) 15:46, 22 May 2018 (UTC) Salgo60 (talk) 18:10, 18 June 2018 (UTC) Egbe Eugene (talk) Eugene233 (talk) 03:40, 19 June 2018 (UTC) Dcflyer (talk) 07:45, 9 September 2018 (UTC) Nomen ad hoc Gamaliel (talk) 13:01, 12 July 2019 (UTC) Pablo Busatto (talk) 11:51, 24 August 2019 (UTC) Theklan (talk) 19:25, 20 December 2019 (UTC)

Notified participants of WikiProject Genealogy Jheald (talk) 18:46, 22 February 2020 (UTC) User:Paweł Ziemian User:Jura1 (is this project family relationships?) User:Infovarius User:Melderick User:Bvatant


Notified participants of WikiProject Parenthood Jheald (talk) 18:47, 22 February 2020 (UTC)

Hello,
For Diana Cooper (Q128576)'s legal father, I would use qualifier type of kinship (P1039) = legal father (Q66363656).
As for uncertainty of a statement, I usually use nature of statement (P5102) with values like hypothetically (Q18603603), presumably (Q18122778), disputed (Q18912752) ... I am not a big fan of may be father (Q21152551).
--Melderick (talk) 20:18, 22 February 2020 (UTC)
@Melderick: Thanks! I'd missed that legal father (Q66363656) already existed. Jheald (talk) 22:57, 22 February 2020 (UTC)
@Melderick: What about the inverse relationship?
Is there a preferred type of kinship (P1039) qualifier for eg Louis XV of France (Q7738) child (P40) Charles Louis Cadet de Gassicourt (Q2618388)? Presumably biological child (Q53705034), though I see we also have illegitimate child (Q170393), illegitimate child of a nobleman (Q10499185), royal bastard (Q7375049). But it would be good to have proper guidance on this set down somewhere.
Also what qualifier for Louis Claude Cadet de Gassicourt (Q736277) child (P40) Charles Louis Cadet de Gassicourt (Q2618388) ? I'm not seeing an item for "legal child" or "legal son" or "not biological child". Perhaps it should be created, or does something similar already exist and I've missed it? Jheald (talk) 08:57, 23 February 2020 (UTC)
adopted child (Q25858158)? —Scs (talk) 18:43, 23 February 2020 (UTC)

Bad data imports

CERL ID, Catalogo della Biblioteca IDs and deutsche-biographie IDs have been added recently to people with the same name, born centuries apart. The years of birth and death were added from the sources. I can detect some errors since they cause a person to die before they were born, or cause a person to be older than 120 years. Are there other ways to detect the errors that may be more subtle? I noticed that some of the data from Catalogo della Biblioteca is already corrupt before we import it; should we exclude bad data before we import it? For example, http://catalogo.pusc.it/auth/126105 has conflated two people, and the data came from VIAF https://viaf.org/viaf/2713072/, which may in turn have come from us. --RAN (talk) 00:15, 23 February 2020 (UTC)

Have you asked the users doing the import first? I don't know Pusc but CERL people were working on a massive cleanup. Nemo 10:16, 23 February 2020 (UTC)
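For reference, the two sanity checks described above (death before birth, lifespan over 120 years) can be sketched as below, together with one possible extra check for the "same name, centuries apart" case: flagging an item whose birth years from different sources disagree widely. The function names and the one-year tolerance are assumptions for illustration, not an existing tool.

```python
from typing import List, Optional

MAX_LIFESPAN_YEARS = 120  # threshold mentioned in the thread


def lifespan_problems(birth_year: Optional[int], death_year: Optional[int]) -> List[str]:
    """Return the obvious inconsistencies between a birth year and a death year."""
    problems: List[str] = []
    if birth_year is None or death_year is None:
        return problems  # nothing to compare
    if death_year < birth_year:
        problems.append("death before birth")
    elif death_year - birth_year > MAX_LIFESPAN_YEARS:
        problems.append("lifespan over 120 years")
    return problems


def years_disagree(years: List[int], tolerance: int = 1) -> bool:
    """True if the years collected from different sources spread wider than the tolerance."""
    return bool(years) and max(years) - min(years) > tolerance
```

The second check would catch a conflation even when each individual pair of dates looks plausible, for instance when two identifiers on one item imply birth years a century or more apart.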

Common-law marriage, concubines, fiancées, domestic partnerships and other types of partners

I'm having a hard time figuring out how to specify these kinds of things. *Treker (talk) 09:03, 23 February 2020 (UTC)

I imagine use of the spouse (P26) property with object has role (P3831) as a qualifier taking appropriate values for common-law spouse, fiancée, &c. --Tagishsimon (talk) 09:06, 23 February 2020 (UTC)
@*Treker, Tagishsimon: Where the relationship does not constitute a legal marriage, use partner (P451). This can be qualified if desired as Tagishsimon indicates, if there is more detail to record. Jheald (talk) 09:31, 23 February 2020 (UTC)
What about concubines? Historically concubines would be considered legal spouses, just of a lesser status than the "main" wife. *Treker (talk) 09:34, 23 February 2020 (UTC)
There are likely different countries with different conceptions of what it means to be a concubine. The key test is whether there was or wasn't a marriage. ChristianKl❫ 20:50, 23 February 2020 (UTC)
There are likely different countries with different conceptions of what marriage is, so that's not a very good test. I wonder if it's very helpful to have 2 properties in this domain. What's the advantage? --Tagishsimon (talk) 00:41, 24 February 2020 (UTC)

Similar data-item

Q2098700 and Q1426123. Is one a subcategory of the other? Smiley.toerist (talk) 20:24, 23 February 2020 (UTC)

  • Yes, that's what the subclass claim is about. ChristianKl❫ 20:35, 23 February 2020 (UTC)

Could the bots do the same work in fewer edits more efficiently?

It seems to me that the lag problems come from the number of edits more than from the content of those edits. So couldn't we avoid some of the lag by asking the bots with many edits to do their work in fewer edits? There are very many cases where a bot first adds a statement, then maybe a qualifier in another edit, and then a reference in yet another edit. Or they add multiple labels or descriptions with only one label or description per edit. If they did all the work on the same item at once there would be far fewer edits, and the server load would be lower with the same work done. --Dipsacus fullonum (talk) 23:48, 23 February 2020 (UTC)

Short answer is yes, but whether this would make a noticeable difference I'm not sure. I just finished updating my bot (approval pending) and took this into consideration. You can see (Special:Contributions/SilentSpikeBot) I'm just making two edits for each claim I'm editing. Rather than adding qualifiers individually, I just make a new duplicate claim (copy qualifiers, sources, rank, value, snakType) and then locally edit the data before adding it as a new claim and removing the old one. The result is 2 edits per claim, rather than multiple edits for individual qualifiers. The downside to this is that it's not as clear what I've changed using the diff view. So you might ask: why don't I just make 1 edit to update the existing claim? That's unfortunately down to the Pywikibot (Q15169668) implementation (which I suspect a lot of bots are probably using), where I don't believe it's currently possible to edit an existing claim locally and then upload all the changes as one edit. --SilentSpike (talk) 00:06, 24 February 2020 (UTC)
You could use the editEntity function in pywikibot that maps to the wbeditentity API action to make those changes in one edit, but you’d need to change quite a lot of things. That function actually allows you to change all data you find in an item in one edit. Not sure whether it's worth using it here, though. --MisterSynergy (talk) 00:35, 24 February 2020 (UTC)
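To illustrate the "everything in one edit" idea, here is a minimal sketch that assembles a single JSON payload containing labels, descriptions and a complete statement (main value, qualifiers and a reference). It only builds the data and makes no network calls; the helper names are hypothetical, only item-valued snaks are covered, and the exact way a given bot framework submits the payload (e.g. as the `data` parameter of wbeditentity) is left as an assumption.

```python
import json
from typing import Dict, Iterable, List


def item_snak(prop: str, qid: str) -> dict:
    """Snak whose value is another item, e.g. item_snak("P31", "Q5")."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": int(qid[1:])},
        },
    }


def group_by_property(snaks: Iterable[dict]) -> Dict[str, List[dict]]:
    """Wikibase JSON keys qualifier and reference snaks by their property ID."""
    grouped: Dict[str, List[dict]] = {}
    for snak in snaks:
        grouped.setdefault(snak["property"], []).append(snak)
    return grouped


def statement(main: dict, qualifiers: Iterable[dict] = (),
              reference: Iterable[dict] = ()) -> dict:
    """A full statement: main snak plus optional qualifiers and one reference."""
    claim = {"type": "statement", "rank": "normal", "mainsnak": main}
    qualifiers, reference = list(qualifiers), list(reference)
    if qualifiers:
        claim["qualifiers"] = group_by_property(qualifiers)
    if reference:
        claim["references"] = [{"snaks": group_by_property(reference)}]
    return claim


def edit_payload(labels: Dict[str, str], descriptions: Dict[str, str],
                 claims: List[dict]) -> str:
    """Serialize all pending changes for one item into one JSON document."""
    data = {
        "labels": {l: {"language": l, "value": v} for l, v in labels.items()},
        "descriptions": {l: {"language": l, "value": v} for l, v in descriptions.items()},
        "claims": claims,
    }
    return json.dumps(data, ensure_ascii=False)
```

With this shape, a statement plus its qualifier and reference, and any number of labels or descriptions, would count as one edit instead of one edit per piece. The trade-off noted above still applies: the diff for a combined edit is harder to read.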

Merge Karl II (Q548215) with Charles II (Q42396638)

They both represent "Charles II" (查理二世). -- Eric Liu留言百科用戶頁 05:51, 24 February 2020 (UTC)