Wikidata:Project chat/Archive/2020/02


Now that UN document symbol (P3069) exists, would anyone here mind if I remove all 'full work available' statements from instances of United Nations Security Council resolution? --Trade (talk) 17:03, 1 February 2020 (UTC)

@Trade: remove? We have robots to replace that, see Property talk:P973 for examples. Multichill (talk) 22:38, 2 February 2020 (UTC)
@Multichill: I gave it a try --Trade (talk) 00:00, 3 February 2020 (UTC)

Wildly variant genealogies?

I thought I knew how to deal with genealogical variants: find the most authoritative source, use that as a backbone, then note variant parents / spouses / children / dates with sourcing, while distinguishing the apparently authoritative version with "preferred" rank.

But what if the genealogy from a particular source is *wildly* different -- eg keeping the same main sequence of names (more or less), but systematically assigning them the dates, spouses, and collateral children that in other sources are assigned to somebody else -- essentially giving the name a totally different bio, so it's not just a question of a couple of statements different, but instead more like a completely different person being attached to the name, so different is the presented tree.

I was merging apparently duplicate items created with the The Peerage person ID (P4638) import when I hit this, and now I'm not sure what to do.

Do I try to merge by the by-names (the first names all being "Ulick"), essentially mashing together two completely different biographies? Do I try to merge according to the biographies, where this seems possible, ignoring that they've been attached to different names? Or do I try to maintain both versions in the database unmerged, in which case what properties should I use to connect the different biographical "versions" attached to a particular name, and/or how do I indicate that an entire "version" (ie one particular item for that name) is to be preferred over the second one ?

I am tempted to use something like partially coincident with (P1382) to link items for two contending versions of lives attached to the same name; but I'm not sure it's appropriate for radically different incompatible reconstructions of the truth, as opposed to variant descriptions or terminology for a single reality. Nor am I clear how I would indicate that the other version should probably be the preferred version. Jheald (talk) 18:14, 1 February 2020 (UTC)

@Jheald: This means a conflation, see Help:Conflation of two persons. The Peerage item should be tagged as conflation (Q14946528) and have all data deprecated.--GZWDer (talk) 18:48, 1 February 2020 (UTC)
@GZWDer: Neat. Yes, that seems a very good way to deal with it. So how to relate the conflation to the underlying people (conflation "of" ?) on the conflated item, and how to link to the conflated item from the "good" items? ("different from" X + "object has role" = "conflation" ?) Jheald (talk) 18:55, 1 February 2020 (UTC)

--GZWDer (talk) 19:03, 1 February 2020 (UTC)

GZWDer is there a good way to indicate which items are conflated in the ID of interest? For a non-Peerage example, the Internet Broadway Database ID of Louise Allen (Q70776916) conflates the actress who flourished in the 1920s and 30s with her aunt of the same name (Louise Allen (Q70760915)) who was born in 1873 but who died in 1909. In cases where an External ID conflates two or more distinct items, would partially coincident with (P1382) or partially coincident with (Q78694451) be added as a qualifier to the deprecated ID, or are there better ways? -Animalparty (talk) 23:37, 2 February 2020 (UTC)

How do I post my logo to my newly created item profile

Waw*Mart (Q84203597)  – The preceding unsigned comment was added by Waw mart2 (talk • contribs) at 07:15, 2 February 2020‎ (UTC).

Waw mart2, if the logo is released under a free license, or is otherwise public domain, you first need to upload it to Wikimedia Commons (make sure it is under a compatible license: see Commons:Licensing). You can then add it to Q84203597 as logo image (P154). And, whether you are the subject of the item or not, please review Wikidata:Notability and Wikimedia FAQ on paid contributions without disclosure. Be prepared for pushback under an actual or perceived conflict of interest. Cheers, -Animalparty (talk) 23:59, 2 February 2020 (UTC)

Same data for P18 and P2716

Please move montage image (P2716) close to (directly below) image (P18). For some items the same image file is used for both, and this is hard to notice when the two are far from each other... Or just add a notification (error message) when someone tries to add an image that is already used in the other property on the same item. --Termininja (talk) 08:54, 2 February 2020 (UTC)

You can use
select ?item ?photo where {?item wdt:P2716 ?photo . ?item wdt:P18 ?photo}
to find them, I believe? Edoderoo (talk) 11:08, 2 February 2020 (UTC)
I know how to use SPARQL, the problem is about the item interface (example Heterobranchia (Q133143)) --Termininja (talk) 11:58, 2 February 2020 (UTC)
This is controlled by MediaWiki:Wikibase-SortedProperties - best place to request this be moved is there. In general it looks like P18 goes at the top, or some properties which probably get used instead of P18 (eg logo image), and then there's a generic group of "other images of the subject" lower down, including P2716 (except for P1442, image of grave, which is filed alongside date of death). Andrew Gray (talk) 20:18, 2 February 2020 (UTC)

Years ago I suggested that we should have a 'should not have the same value as' constraint. And a 'should have the same value as' one as well. They are still missing, unfortunately. Thierry Caro (talk) 17:38, 2 February 2020 (UTC)
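
A 'should not have the same value as' check of the kind suggested above could be sketched roughly as follows. This is only an illustration: the function name `shared_values` and the simplified item layout (a dict mapping property IDs to lists of values) are assumptions, not Wikibase's actual constraint machinery or JSON format, and the file names are made up.

```python
def shared_values(claims, prop_a, prop_b):
    """Return the values that appear under both properties of one item.

    `claims` is a simplified, hypothetical layout: {property_id: [values]}.
    """
    return sorted(set(claims.get(prop_a, [])) & set(claims.get(prop_b, [])))


# Hypothetical item where the same file was added as image (P18)
# and as montage image (P2716).
item_claims = {
    "P18": ["Example montage.jpg"],
    "P2716": ["Example montage.jpg", "Another montage.jpg"],
}

# Any value printed here would be flagged by such a constraint.
print(shared_values(item_claims, "P18", "P2716"))
```

A 'should have the same value as' check would be the mirror image: flag values that appear under only one of the two properties.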

User Account Verification

I think there are users here on Wikidata who also use their account for their work, or who edit Wikidata mostly as their job. In that case it is paid editing. The section "Paid contributions without disclosure" of the Wikimedia Foundation's Terms of Use states that users who receive compensation for any of their edits must disclose this on their user page, on the talk page of the edited article or item, or in the edit summary of every contribution for which they receive money. I think this is important. So that more people can read and understand it, it would be great to have a template for this, with the information in more languages than just English. -- Hogü-456 (talk) 20:41, 2 February 2020 (UTC)

Got any examples of paid editors working on Wikidata? --Trade (talk) 00:11, 3 February 2020 (UTC)
Since they haven't been required to declare here, that would be hard to identify in advance, but I bet if you look at the self-declared paid editors on the English-language Wikipedia you will find people who have edited Wikidata, and it's a pretty safe bet that they've done that in their paid capacity. - Jmabel (talk) 01:57, 3 February 2020 (UTC)
I check new company items frequently and see many suspected paid editors there. These are usually items from new users about (new) non-notable companies (and their founders), with a description in promotional style. I usually propose these for deletion. See for example Badassentrepreneur (talk • contribs) and Wizkidoftheinternet (talk • contribs).--Jklamo (talk) 02:26, 3 February 2020 (UTC)
@Jklamo: I've made a similar page with websites instead of companies. Might be worth keeping an eye on. --Trade (talk) 00:29, 4 February 2020 (UTC)

Commons link should exist

At David Kettlewell (Q50311931) there is a ! warning that "Commons link should exist." Exactly which commons link is required? Is the warning because there is a category link in the slot for commons? --RAN (talk) 23:54, 2 February 2020 (UTC)

That's an error that plagues every Commons Creator page statement. Ignore it. -Animalparty (talk) 00:20, 3 February 2020 (UTC)
Can we remove: item requires statement constraint:property:Commons category? I am not sure it serves any purpose. --RAN (talk) 02:32, 3 February 2020 (UTC)

Mismatched reference: first version to be deployed this week

Mismatched reference notification preview

Hello all,

As announced last month, we’ve been working on mismatched reference, a new feature that alerts users when editing a value without changing the existing attached reference. This feature has been tested over the past month. Based on the positive feedback we received, we are now able to move forward and enable the feature on wikidata.org. This will take place this week, in two different steps:

  • Today around 13:00 UTC, you will be able to see a notification (similar to the constraint ones) after saving an edit
  • On Thursday, February 6th, we will also enable a button that will allow you to hide the notification if you think that the reference is not mismatched

Please note that for now, the feature is not persistent: the editor who made the change will see it appear when they saved their edit, but if they reload the page, the notification will be gone. Other users also won’t be able to see it. We are considering adding this persistency feature in the future.

If you want to give feedback about the feature, feel free to use this talk page. If you want to report an issue directly on Phabricator, feel free to use this form. Cheers, Lea Lacroix (WMDE) (talk) 11:22, 3 February 2020 (UTC)


I have added a signature in Hebrew (signature (P109)) to Moshe Dayan (Q188783), which already contains a signature in English, but got the problem tag single-value constraint (Q19474404). I think that signatures in more than one language should be allowed. Geagea (talk) 13:14, 3 February 2020 (UTC)

@Geagea: I've amended the single value constraint on P109 so that language of work or name (P407) acts as a separator, to deal with the commonplace issue of a person having different signatures for different languages. The constraint no longer triggers on https://www.wikidata.org/wiki/Q188783#P109 ... thanks for spotting this. --Tagishsimon (talk) 13:36, 3 February 2020 (UTC)
Thanks Tagishsimon. Geagea (talk) 15:18, 3 February 2020 (UTC)
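
The effect of a separator on a single-value constraint can be illustrated with a rough sketch. The data layout (a list of value/qualifier pairs), the function name and the file names below are assumptions for illustration only; this is not how the constraint checker is actually implemented.

```python
from collections import defaultdict


def single_value_violations(statements, separator):
    """Group statements by their separator qualifier; report crowded groups.

    `statements` is a simplified, hypothetical layout: a list of
    (value, qualifiers) pairs, where qualifiers maps property IDs to values.
    """
    groups = defaultdict(list)
    for value, qualifiers in statements:
        groups[qualifiers.get(separator)].append(value)
    # Only groups holding more than one value count as violations.
    return {key: values for key, values in groups.items() if len(values) > 1}


# Two signatures distinguished by language of work or name (P407).
signatures = [
    ("Signature in English.svg", {"P407": "English"}),
    ("Signature in Hebrew.svg", {"P407": "Hebrew"}),
]
print(single_value_violations(signatures, "P407"))
```

With the separator in place, each language forms its own group, so two signatures in different languages no longer count as a conflict; two statements without the qualifier would still land in the same group.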

Wikidata weekly summary #401

Scaling WDQS

I made a grant proposal with an initial project plan including a potential solution. Please come and visit the grant page for a future-proof WDQS and discuss it in the talk page.

 Withdrawn as of 30 January 2020. –84.46.52.96 05:31, 4 February 2020 (UTC)

Limit for dates in Gregorian

What about allowing a date to be set to the Gregorian calendar only for years after 1582? Also, the last year for setting a date to the Julian calendar could be 1923, as Greece was the last European country to adopt the Gregorian calendar, in 1923. Xunonotyk (talk) 09:31, 30 January 2020 (UTC)

I think the en:Proleptic Gregorian calendar may be in use in some fields. Ghouston (talk) 09:34, 30 January 2020 (UTC)
Since 0000-03-01 depending on my mood, because an ISO year 0000 is one thing, but deciding on leap year or not goes too far. An ideal switch from proleptic Gregorian to Julian could be the century when both calendars agreed. 2nd or 3rd century—I could check it in a vintage 2006 template on Meta.:-)84.46.52.96 05:45, 4 February 2020 (UTC)
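
The remark about the century in which both calendars agree can be checked numerically. The sketch below uses the commonly published integer formulas for Julian Day Numbers; the function names are invented for this illustration, and only positive years have been considered.

```python
def gregorian_to_jdn(year, month, day):
    """Julian Day Number of a (proleptic) Gregorian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


def julian_to_jdn(year, month, day):
    """Julian Day Number of a Julian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def calendar_gap_days(year, month, day):
    """Days between the same date label read as Julian and as Gregorian."""
    return julian_to_jdn(year, month, day) - gregorian_to_jdn(year, month, day)


# Gap for a mid-third-century date, for 1582 and for 1923.
for year in (250, 1582, 1923):
    print(year, calendar_gap_days(year, 6, 1))
```

A gap of zero means that the two calendars assign the same label to the same day, which is the situation described above as the ideal switching point.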

Editing 5 books

Hi,

I just joined Wikidata. I would like to manually update 5 records:

https://www.wikidata.org/wiki/Q46802 https://www.wikidata.org/wiki/Q47228 https://www.wikidata.org/wiki/Q1521632 https://www.wikidata.org/wiki/Q2450557 https://www.wikidata.org/wiki/Q1144373

with information from the editions that we publish:

http://www.yogavidya.com/freepdfs.html

What does the community recommend?

Cinna Babu (talk) 19:52, 3 February 2020 (UTC)

@Cinna Babu: Follow the advice at Wikidata:WikiProject Books#Edition item properties and create new items for your editions based on the table there. --Tagishsimon (talk) 20:26, 3 February 2020 (UTC)
@Cinna Babu: for an example of what a book edition (Q57933693) can look like, check out Be Loved (Q84083827) --Trade (talk) 23:58, 3 February 2020 (UTC)


Thanks for the pointers! I just created my first Wikidata page: (Q84361166)

I was not able to add the translator: Brian Dana Akers

I was not able to add the publisher: YogaVidya.com

Not sure how to deal with the languages. The book is a bilingual Sanskrit-English text.

00:24, 5 February 2020 (UTC)

Quality of referencing required to support relationship

Hi. What level of referencing is required to support a claim of family relationship? I have been working on tidying some records related to the Crowdy family. I've been able to link three sisters together, Edith Frances Crowdy, Isabel Crowdy and Rachel Crowdy, with two sources saying they are the daughters of James Crowdy (solicitor) and Mary Isabel Ann Fuidge. [1][2] Meanwhile, James Fuidge Crowdy is stated by a forum post on a genealogy website to be the child of James Crowdy (solicitor) and Mary Isabel Ann Fuidge.[3] All four of the potential siblings were alive during the same period. Is the forum post considered reliable enough for Wikidata or is there a higher standard that has to be achieved? From Hill To Shore (talk) 22:36, 3 February 2020 (UTC)

I don't know of any relevant Wikidata policy; indeed Wikidata has been resistant to requiring decent sources. I have done genealogical research for my own family, and found user-contributed information on genealogy sites is often based on wishful thinking. Also, Wikidata is structured to provide a reference to a reliable source that plainly states the fact in question. Sources that are an essay about why something might be true are not suitable. Jc3s5h (talk) 23:19, 3 February 2020 (UTC)
Currently we use "reference_url=https://www.oxforddnb.com/view/10.1093/ref:odnb/9780198614128.001.0001/odnb-9780198614128-e-62129" with "stated_in=Oxford Dictionary" and even "quote=.... one of the four daughters of James Crowdy, solicitor, and his wife, Mary Isabel Anne Fuidge" if there is a sentence that can be cut and pasted. We have multiple entries for some birthdates based on genealogical sources handled that way. If new, more reliable info comes along you can deprecate the bad value. Don't forget to use the Wikidata family tree template if there is an entry at Wikimedia Commons. See Anton Julius Winblad I (Q20667111) and click through to Commons to see the chart. Also I recommend getting a free account at Familysearch, where you can enter a new person, if not already there, and link to Wikidata. If you visit Familysearch, I added in the family and attached them to the various censuses for England as well as extant birth, marriage, and death records, and linked to Wikidata for you. Good to see someone else working on connecting family data here. --RAN (talk) 01:27, 4 February 2020 (UTC)
@Richard Arthur Norton (1958- ): Thanks for the advice and sorting through those Crowdy records. It is a really useful perspective to just post information with whatever reference is available and then replace it later if higher quality references materialise. Coming from English Wikipedia, I'm used to much more stringent policies. I'm happy to adapt to how things are done here though.
By "Wikidata family tree template" do you just mean the Commons:Template:Wikidata Infobox used in Commons categories?
I don't think I will sign up to Familysearch. Submissions aren't covered by Creative Commons and I am not comfortable giving a religious organisation ownership of my work, especially where they have specific monetised purposes for the data listed in the site structure. I'm coming at this solely from the perspective of linking related Wikipedia subjects together through Wikidata rather than producing Wikidata for genealogy; if someone wants to take my work (released under Creative Commons) and put it on Familysearch then that is their choice. From Hill To Shore (talk) 13:19, 4 February 2020 (UTC)
@From Hill To Shore: Genealogical information is not covered by copyright; it does not meet the threshold of originality. Birth, marriage, and death information is available in public documents. Only if you synthesize that information into a prose biography using original research such as interviews or information in personal letters would it be capable of earning copyright protection. I don't think any single Wikidata entry is copyrightable. All of Wikidata as a data set would be, because of the "sweat of the brow" concept, just as a telephone book entry is not copyrightable, but the entire book may be. In Familysearch I am just clicking on the documents that belong to the family, like the census and the birth, marriage, death records, so that they are linked to a specific person. I don't know what "specific monetised purposes for the data listed in the site structure" means. They do not charge for the service, nor do they sell anything. I am not sure why you think a posting in a newsgroup is OK, but are wary of the actual primary source documents that have been scanned and indexed by volunteers. I am not a member of the LDS, I just use their documents and link to them here at Wikidata. Registration is required because there is information on living people in the census data. --RAN (talk) 13:41, 4 February 2020 (UTC)
Let us just leave it at, "I am not comfortable with supporting the activities of any religious organisation (of any faith)." This is a topic area that could become messy if other editors decide to become involved, so best not to continue on that theme.
In terms of quality of data, I have not said that the forum post is in any way superior to primary sources. I started this thread specifically because I was unsure whether we could use a forum post. I have seen a lot of unsourced or poorly sourced material here and was trying to see if there is a minimum standard that we need to achieve. From Hill To Shore (talk) 13:53, 4 February 2020 (UTC)
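
The reference pattern described in this thread (reference URL, stated in, quote) might be prepared as a batch command along the following lines. This is a hedged sketch that assumes the tab-separated QuickStatements V1 format, in which source properties are written with an S prefix. The helper name is invented, and `Q_PERSON`, `Q_FATHER` and `Q_SOURCE` are placeholders that must be replaced with real item IDs before use.

```python
def quickstatements_line(subject, prop, value, reference_url, stated_in, quote):
    """Build one tab-separated statement command with a three-part reference.

    S854 = reference URL, S248 = stated in, S1683 = quotation.
    String values are wrapped in double quotes; item IDs are not.
    """
    parts = [
        subject, prop, value,
        "S854", f'"{reference_url}"',
        "S248", stated_in,
        "S1683", f'"{quote}"',
    ]
    return "\t".join(parts)


line = quickstatements_line(
    "Q_PERSON",   # placeholder: item for the child
    "P22",        # father
    "Q_FATHER",   # placeholder: item for the father
    "https://www.oxforddnb.com/view/10.1093/ref:odnb/9780198614128.001.0001/odnb-9780198614128-e-62129",
    "Q_SOURCE",   # placeholder: item for the source work
    "one of the four daughters of James Crowdy, solicitor, and his wife, Mary Isabel Anne Fuidge",
)
print(line)
```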

Editing the header for a Wikidata entry

Every once in a while, when I am editing the titles for a Wikidata entry, I am brought to the screen where all the "also known as" entries are separated by a vertical bar, which was the entry screen during creation. How can I get there on purpose, so that I can edit "also known as" entries that have overflow text? There is a glitch that blocks entries with overflow text from being edited, but when I am mysteriously propelled to the creation edit screen I can make corrections because the field is longer. What magic can I use to go there on purpose? --RAN (talk) 01:16, 4 February 2020 (UTC)

@Richard Arthur Norton (1958- ): That’s the special page Special:SetLabelDescriptionAliases, which is the link target of that “edit” link. Usually, left-clicking the link will only follow it (and open the page) if the click happens early, before the page’s JavaScript has loaded; however, you can always follow the link in other ways, e. g. via right click > open link in new tab or (if you have a three-button mouse) middle click. And you can also go to Special:SetAliases directly, to set just the aliases without label and description – you can add the entity ID and language code in the title, e. g. Special:SetAliases/Q4115189/en. (Warning: those special pages don’t support editing aliases which themselves contain vertical bar characters, see phabricator:T219499.) And the issue with being unable to edit overlong aliases normally is phabricator:T234804. --Lucas Werkmeister (WMDE) (talk) 10:54, 4 February 2020 (UTC)
@Lucas Werkmeister (WMDE): Wow! Thanks, the right click workaround is perfect. --RAN (talk) 20:23, 4 February 2020 (UTC)
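
For anyone who wants to jump to the alias form directly, the title pattern described above (Special:SetAliases/ENTITY/LANGUAGE) can be assembled with a small helper. The function names are invented for this sketch; the check for vertical bars reflects the limitation mentioned in the warning above.

```python
from urllib.parse import quote

WIKI_BASE = "https://www.wikidata.org/wiki/"


def set_aliases_url(entity_id, language):
    """Return the URL of Special:SetAliases for one entity and language."""
    title = f"Special:SetAliases/{entity_id}/{language}"
    return WIKI_BASE + quote(title, safe="/:")


def join_aliases(aliases):
    """Join aliases with vertical bars, as the form expects them.

    Aliases that themselves contain a vertical bar cannot be expressed
    this way, so they are rejected here.
    """
    for alias in aliases:
        if "|" in alias:
            raise ValueError(f"alias contains a vertical bar: {alias!r}")
    return "|".join(aliases)


print(set_aliases_url("Q4115189", "en"))
print(join_aliases(["first alias", "second alias"]))
```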

Add instance of Jessica Harris, podcaster, film producer, director

For consideration, I would like to add the following data:

instance of: human
sex or gender: female
country of citizenship: United States of America
given name: Jessica
family name: Harris
birth name: Glass
occupation: radio personality, podcaster, documentary film producer, documentary film director
educated at: Harvard College, Columbia Business School
official website: http://fromscratchradio.org
References: IMDB https://www.imdb.com/name/nm0321952/, Wikipedia https://en.wikipedia.org/wiki/From_Scratch_(radio)

Thank you, Brian Williams Ploughdeep (talk) 03:59, 4 February 2020 (UTC)

@Ploughdeep: I've made a start at Jessica Harris (Q84332670) and note also From Scratch (Q5505508) --Tagishsimon (talk) 04:19, 4 February 2020 (UTC)
@Tagishsimon: Thank you very very much! -- Ploughdeep (talk) 02:48, 5 February 2020 (UTC)

Wikidata = quick-data

Just recalled that "wiki" is meant to mean "quick" .. so "QuickStatements" is just wiki-statements? --- Jura 20:51, 4 February 2020 (UTC)

Structured data on WikiSource - any takers?

Hi,

I'm a WikiSource editor working on the US Statutes at Large [4] from the 1904/05 Congressional session. There are thousands and thousands of federal pension grants and increases in these pages, so of passingly important social history/military history/PhD/genealogical value to somebody, and I'm pretty sure the Library of Congress isn't going to cover these in its digitisation project because they are technically Private Acts.

For logistical reasons, I have had to develop a suite of templates for proofreading, which means that all this lovely data now accidentally has a simple structure. In its current state, therefore, the data set is conceivably only a step away from being machine-readable/searchable (each pension has at least the following fields: date=, grantee=, quality=, rate=, gender=). Once the proofreading is finished on WS, we will just subst: the whole lot to avoid Template limits, whereupon it just goes back to being in a sense "dumb" data again. Does anyone have any ideas about who might want to host such data? Is it WikiData's bailiwick? Is there a potential SMW hoster out there somewhere? It would be such a shame to lose all the carefully applied data structure which should be of value to someone... CharlesSpencer (talk) 14:54, 4 February 2020 (UTC)

  • How many entries are we talking about? What's written in the quality and rate fields? ChristianKl 16:32, 4 February 2020 (UTC)
    • @ChristianKl: Rate is dollars per month in words only - generally something between 8 bucks and 30 bucks per month, I assume depending on a combination of rank of the pensioner and how well he/ his widow knew his congressman/senator!
    • Quality is generally something like “late Captain, Company X, New York Volunteer Infantry”, sometimes “widow of Joe Bloggs, late Captain [etc. etc.]”. For some of the more obscure engagements and/or units, there might be a terminal qualifier such as “war with Mexico” or “Seminole Indian disturbances”. Once it’s all been fully proofread, it could potentially be very interesting to further break down the “Quality” data into rank, unit (sometimes units plural), and maybe campaign (where specified). Who knows, you might even be able to develop a full nominal roll of “Captain Whosit’s Company, Wisconsin Irregular Militia, War with Mexico”!
    • And there are (at a guess) almost a thousand pages at 4-5 acts per page (99% of them pensions) just in Vol. 33 - there are probably tens of thousands over the years - a potentially fascinating social history resource. CharlesSpencer (talk) 00:31, 5 February 2020 (UTC)
  • @CharlesSpencer: Oh, this is interesting! First of all, it might be worth just taking the whole lot, converting into a TSV or similar plain text file, and posting the whole lot on figshare or another repository - this might be of interest to a researcher in its simplest form. I think we could definitely do something with these here, though, and a private act still seems notable and substantial enough to include.
In terms of data structure, we'd be able to use something like:
Title: An act granting a pension to XYZ (I presume these are the standard titles)?
instance of (P31): private act of the United States congress (might as well create a new subclass of Act of Congress in the United States (Q476068))
legislated by (P467): 58th United States Congress (Q4640944), with qualifier point in time (P585) for date of approval
publication date (P577): date of approval
legal citation of this text (P1031): formal citation calculated from the structured data
full work available at URL (P953): Wikisource link (unless you will have one WS page per private Act, in which case we can use a sitelink)
We wouldn't easily be able to make use of the rate ($30), gender, or quality ("widow of...") fields, unless we tried creating items for each recipient. Which wouldn't be impossible, but might be a bit unwieldy since we don't really know anything about them other than that they were granted a pension on a specific date. Andrew Gray (talk) 20:59, 4 February 2020 (UTC)
A possible solution is to annotate the HTML output by the templates with Microdata. It is what we do in the French Wikisource header templates (page example). Then you could use an HTML/microdata scraper to extract all the embedded metadata. Tpt (talk) 21:38, 4 February 2020 (UTC)
@Andrew Gray: @Tpt: I’m all at sea in Wikidata land, so I only wanted to alert people who know what they’re doing to make best use of my accidentally structured data - I’m afraid I have more than my hands full myself with the proofreading! And don’t worry - I shan’t be :subst-ing anytime soon. Please make use of the WS data to your hearts’ content! CharlesSpencer (talk) 00:31, 5 February 2020 (UTC)
Excellent - I'll have a think about how best to do this and let you know how it goes :-) Andrew Gray (talk) 00:32, 5 February 2020 (UTC)
It seems to me that we do know at least a bit about the person: "spouse of X", maybe with <SomeValue> (stated as...). When the person served as a captain in the New York Volunteer Infantry, that's also enough information for an item about the person. ChristianKl 08:21, 5 February 2020 (UTC)
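
The suggestion to export the pension records as a plain TSV file could look roughly like this. The field names are the ones listed in the original post (date, grantee, quality, rate, gender); the function name and the sample record are invented for illustration and do not come from the Statutes at Large.

```python
import csv
import io

FIELDS = ["date", "grantee", "quality", "rate", "gender"]


def pensions_to_tsv(records):
    """Serialise a list of pension dicts as tab-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Invented sample record, shaped like the descriptions in this thread.
sample = [
    {
        "date": "1905-02-01",
        "grantee": "Jane Doe",
        "quality": "widow of John Doe, late Captain, Company X, New York Volunteer Infantry",
        "rate": "eight dollars per month",
        "gender": "female",
    }
]
print(pensions_to_tsv(sample))
```

Keeping the "quality" text in a single column leaves room to split it later into rank, unit and campaign, as proposed above.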

Determination method [for copyright status]

Do we have a Determination_method [for copyright status] that covers the licenses {{Anonymous work}} and {{PD-US-unpublished}}? --RAN (talk) 07:03, 5 February 2020 (UTC)

Richard Bachman - pseudonym for Stephen King

Hello all, I know that pseudonyms were discussed here several times previously and the consensus was that they represent the same entity as the person. There is a very small minority of items which still do not follow the rule, such as Richard Bachman (Q3495759) who corresponds to Stephen King (Q39829) but they obviously have separate items. My opinion is that we should be consistent in following the rule above, but what is your preferred solution in case of an item which has so many sitelinks? Would you go for Wikimedia duplicated page (Q17362920)? --Vojtěch Dostál (talk) 10:02, 5 February 2020 (UTC)

How about said to be the same as (P460)? ChristianKl 12:20, 5 February 2020 (UTC)
@ChristianKl: Which one should have the unique identifiers then? They obviously trigger constraint violations :) Vojtěch Dostál (talk) 14:56, 5 February 2020 (UTC)
I would go by whatever the relevant authority considers to be the main name. And it's worth pointing out that VIAF also seems to have a distinct ID for both. ChristianKl 15:07, 5 February 2020 (UTC)

Property for a GLAM institution contributing data to a project/system

Hi everyone, I've got a question concerning GLAMs that are contributing with their collections data to a project, system or infrastructure. I want to display that museum XY is sharing data with e.g. Europeana, K-samsök/SOCH etc. Is there a specific property for that relation? Or which one would you recommend? Tulipasylvestris (talk) 17:12, 5 February 2020 (UTC)


Wikimedia 2030 community discussions: Halfway to the conclusions

Icons of 13 strategic recommendations

As I wrote in my last message here, the strategic recommendations for how we can achieve the Wikimedia 2030 vision are available for your final review. There are three weeks left to share your feedback, questions, concerns, and other comments.

These 13 recommendations are the result of more than a year of dedicated work by working groups comprised of volunteers and staff members from all around the world. These recommendations include the core content plus the Principles and the Glossary, which lend important context to this work and highlight the ways that the recommendations are conceptually interlinked. The Narrative of Change offers a summary introduction to the recommendations material. On Meta-Wiki, you can find even more detailed documentation.

Community input has played, and will continue to play, an important role in the shaping of these recommendations. They reflect this and cite community input throughout in footnotes.

In this final community review stage, we're hoping to better understand how you think the recommendations would impact our movement – what benefits and opportunities do you foresee for your community, and why? What challenges or barriers could they pose for you?

After this three-week period, the Core Team will publish a summary report of input from across affiliates, online communities, and other stakeholders for public review before the recommendations are finalized. You can view our updated timeline here as well as an updated FAQ section that addresses topics like the goal of this current period, the various components of the draft recommendations, and what's next in more detail.

Thank you again for taking the time to join us in community conversations, and I look forward to receiving your input. Happy reading!

SGrabarczuk (WMF) (talk) 02:20, 6 February 2020 (UTC)

There don't appear to actually be any changes to the content since the last message here, nor do the talk pages indicate any responsiveness to community concerns. Meh. --Yair rand (talk) 03:03, 6 February 2020 (UTC)
Do you mean changes to the content of the recommendations? This is because the changes will be done after 21 February, when all feedback has been collected. SGrabarczuk (WMF) (talk) 03:23, 6 February 2020 (UTC)

UUID search

Some time ago, we discussed URN (Q76497) here (Project_chat/Archive/2019/10). This led to the creation of URN formatter (P7470).

Possibly a direct application for this at Wikidata could be to enable search by Universally Unique Identifier (Q195284) (UUID). These alphanumeric strings are used by various identifiers. Sample is 03cefc64-0bb7-42e6-9bf2-2200f42c3318, a value of Musicbrainz for Yankee Stadium (Q675214). URNs normalize these to a format like <urn:uuid:03cefc64-0bb7-42e6-9bf2-2200f42c3318> (lowercase, with "-").

The property documentation template on talk pages of properties with URN formatter (P7470) already includes a query that generates such values for the property (see Property talk:P1004#URN). However, as some normalization is included, querying across identifiers isn't really efficient (test at Wikidata:Database reports/uuid).

If there is interest, a simpler approach could be to generate URN-triples by Wikibase directly from these identifier statements, similarly to the way other triples are generated. --- Jura 13:13, 3 February 2020 (UTC)

Random observation, not everything formatted like a UUID necessarily is a UUID. RFC 4122 has various verified errata, and we don't know what Musicbrainz actually does, e.g., is it correct? –84.46.52.96 06:05, 4 February 2020 (UTC)
Sure. Musicbrainz provides some explanation on their website on what goes into their identifiers.
The uniqueness of some versions of UUID is based on the assumption that not every database has, e.g. entries. This might eventually be incorrect. However, none of the "dups" on Wikidata:Database_reports/uuid seem to be due to that. --- Jura 10:23, 4 February 2020 (UTC)

Hi @Jura1:, thanks for mentioning this at https://www.wikidata.org/wiki/Property_talk:P7711#Format. The culture.fr thesauri use GUIDs for new entries, but they don't use <urn:uuid:>, instead they embed them in ark: URLs. Also, all their 30 thesauri are mixed in the same namespace. Could you please explain how P7470 could help this problem? --Vladimir Alexiev (talk) 11:33, 6 February 2020 (UTC)


@Vladimir Alexiev: If you have a UUID like 171394dc-5472-4322-9017-b8ca091e18b0, you could search

SELECT * { ?a ?b <urn:uuid:171394dc-5472-4322-9017-b8ca091e18b0> }

Of course, one could do:

SELECT * { ?a ?b "171394dc-5472-4322-9017-b8ca091e18b0" }

as P7711 has well formed values, but other databases might use a format for the same UUIDs like "171394DC547243229017B8CA091E18B0" and mere string search wouldn't find it. --- Jura 11:52, 6 February 2020 (UTC)
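
The normalization described here can be sketched in Python with the standard-library uuid module: both spellings parse to the same UUID, and its urn attribute gives the lowercase, hyphenated URN form. This is only an illustration of the idea, not what the property documentation query or Wikibase actually does; the helper name to_urn is made up for the example.

```python
import uuid


def to_urn(value: str) -> str:
    """Return the canonical urn:uuid form of a UUID given in any common spelling."""
    # uuid.UUID accepts hyphenated or unhyphenated hex in either case,
    # and also tolerates a leading "urn:uuid:" prefix or surrounding braces.
    return uuid.UUID(value).urn


# The two spellings mentioned above normalize to the same URN.
print(to_urn("171394DC547243229017B8CA091E18B0"))
print(to_urn("171394dc-5472-4322-9017-b8ca091e18b0"))
```

A string search only matches one spelling, whereas comparing the normalized URNs matches both.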

Any talk link should be prevented

e.g. special:diff/849336521.--Roy17 (talk) 20:35, 4 February 2020 (UTC)

@Incnis Mrsi:  – The preceding unsigned comment was added by Jmabel (talk • contribs) at 20:57, 4 February 2020‎ (UTC).
@Jmabel: pings only work if you add a signature in the same edit. @Incnis Mrsi: your edit is up for discussion. --Lucas Werkmeister (talk) 11:22, 5 February 2020 (UTC)
Oops… this was a side effect of a renaming kludge in Wikipedia. Override. Incnis Mrsi (talk) 11:27, 6 February 2020 (UTC)

Missing labels again

See near-square prime, position creation or Beatriz Mayordomo Pujol as random examples with {{Label}} that does not fetch the label in English, nor in other languages, nor with Lua mw.wikibase.getLabel, nor after purging. It was phab:T237984 closed as resolved. --Vriullop (talk) 10:59, 6 February 2020 (UTC)

It is spreading! Last night the Wikidata/FamilyTree template began displaying Q numbers and Category links instead of the names. I thought it would be fixed overnight, but still off. See Commons:Category:Sir Hector Og Maclean, 15th Chief --RAN (talk) 19:13, 6 February 2020 (UTC)
It shows labels up to Q74000000 ([Plan for cancer prophylaxis]), and then Q numbers after that (The risk of rapid and large citrated blood transfusions in experimental haemorrhagic shock). In Special:ExpandTemplates it shows labels, including for items above Q74000000. Peter James (talk) 19:33, 6 February 2020 (UTC) Special:ExpandTemplates only shows labels above Q74000000 if link=y is specified. Peter James (talk) 19:42, 6 February 2020 (UTC)

Statistics

Moin Moin together, I have a question about/for properties. I can see how many Wikidata objects we have, but where do I see the average number of properties per object and where a total number of all properties in total that are set. Is there something I can have a look at? Regards --Crazy1880 (talk) 20:09, 5 February 2020 (UTC)

Moin Moin, I had a look, but didn't find anything that helps me. --Crazy1880 (talk) 20:18, 8 February 2020 (UTC)

Bogus category constraint warning

I have just added to Category:Food banks in Canada (Q8463144) a statement thus:

Category:Food banks in Canada (Q8463144)category's main topic (P301)food bank (Q113603)of (P642)Canada (Q16)

I now see a constraint warning that "food bank should also have the inverse statement topic's main category (P910)Category:Food banks in Canada (Q8463144)", which is clearly nonsense. How widespread is this issue? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:26, 7 February 2020 (UTC)

It's not nonsense. P301 and P910 have one-to-one relation. Use category contains (P4224) instead. --Shinnin (talk) 23:35, 7 February 2020 (UTC)
More likely he wants category combines topics (P971). - Jmabel (talk) 23:42, 7 February 2020 (UTC)
It's nonsense unless you think food bank should have the statement topic's main category (P910)Category:Food banks in Canada (Q8463144). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:57, 7 February 2020 (UTC)
@Pigsonthewing: A fair amount of code, including in particular all the infoboxes on Commons, assume that category's main topic (P301) is a 1-to-1 relation. Please listen to the warnings you're getting, and don't break this. The statement you added does not conform to the expected behaviour for category's main topic (P301). Jheald (talk) 01:02, 8 February 2020 (UTC)
The warning I'm getting says, completely nonsensically, "food bank should have the statement topic's main category (P910)Category:Food banks in Canada (Q8463144)"; what it should apparently say is that the logical "X of Y" statement is not deemed a valid statement on a category, and that an alternative is preferred. Again: How widespread is this issue? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:29, 8 February 2020 (UTC)
Again: this is exactly why Category:Food banks in Canada (Q8463144)category's main topic (P301)food bank (Q113603)of (P642)Canada (Q16) is the wrong way to do this. It should be and . - Jmabel (talk) 15:38, 8 February 2020 (UTC)
Again: I'm pointing out that the constraint message that is shown when this method is employed tells people to do something nonsensical, rather than how to correct it; and am asking How widespread is this issue? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:48, 8 February 2020 (UTC)

Associated Press articles

When we have a Q item that is an Associated Press article do I add author=Associated Press or is that field only expected to contain a human? I know we have a bot calculating copyright status for books and articles based on the year of death for the human author. --RAN (talk) 03:07, 8 February 2020 (UTC)

I think this should be a human(s) or with an actual value of unknown value if there is no byline. There are some articles written by a machine (e.g. recaps of sporting events) but those are rare and probably not even copyright-able. —Justin (koavf)TCM 03:14, 8 February 2020 (UTC)
  • Where should we store "Associated Press" so that someone can search for all the Associated Press articles? --RAN (talk) 03:40, 8 February 2020 (UTC)
    • i'm no expert but you could list Q40469 as the publisher of the article and then query for all articles published by Q40469. or you could make a new entity representing an "AP article" and have the articles be an instance of that entity. not sure which would be preferred. BrokenSegue (talk) 04:10, 8 February 2020 (UTC)
    • I think that the Associated Press is strictly a "syndicate" and then an individual newspaper or magazine would have a "publisher", so that is how the AP should be stored. —Justin (koavf)TCM 04:50, 8 February 2020 (UTC)
I agree that the local paper is the publisher, so that leaves, temporarily, that the author is the Associated Press. We do something similar for studio photographs. Sometimes the creator is a person, and sometimes it is a photo studio. --RAN (talk) 06:10, 8 February 2020 (UTC)
No, but that's not what is happening here: A Peanuts comic strip published in the Indianapolis Star was not "authored" by United Feature Syndicate. It was authored by Charles M. Schulz and syndicated or distributed by United Feature Syndicate. These are totally different roles. —Justin (koavf)TCM 06:17, 8 February 2020 (UTC)
They have two roles, Associated Press reporters author news reports and they are sent by telegraph and later teletype to subscribing newspapers. You appear to be saying that they may act as a distributor for existing news stories authored by third parties. In your case above we know that Charles M. Schulz is the creator, while AP employees are anonymous, more similar to the photo studio employees that take images anonymously. --RAN (talk) 07:52, 8 February 2020 (UTC)
Yes, they have different roles but just because we know who Charles Schulz is and we don't know who wrote a particular AP piece doesn't change the fact that AP is just a company that distributes things, they are not an author or a photographer or an illustrator or one of any other persons who may make a news story. This is an excellent example of when and why we should use unknown value. —Justin (koavf)TCM 08:06, 8 February 2020 (UTC)
So should we use Property:P750 to list Associated Press as the "distributor", like we do for movies? —Scs (talk) 18:13, 8 February 2020 (UTC)

Is it a good idea? Is it doable by a bot?

Ideally each former building/structure should have a state of conservation (P5816) and a state of use (P5817), but I believe a bot would not be smart enough to guess.

Cheers! Syced (talk) 11:32, 8 February 2020 (UTC)

Next question, does anyone disagree that former administrative territorial entity (Q19953632) should only be applied to territorial entities that have been dissolved? Thanks! Syced (talk) 12:00, 8 February 2020 (UTC)
Partly yes, partly this value should be burnt with fire. It certainly should not be applied to a current administrative entity. It arguably should not be applied to anything. A council, for example, which has been dissolved as a result of local government reorganisation is a council which ideally has, at least, an end date. P31=Council plus End Date = all you need. It was a council. It was never a former council. Much the same holds for all 'former *' values. --Tagishsimon (talk) 12:17, 8 February 2020 (UTC)

Blocked for edits

Can someone explain to me why my bot, including PAWS, is constantly blocked for reading and writing, while [bots] are making several edits per minute at the same time? Edoderoo (talk) 16:19, 8 February 2020 (UTC)

Could you skip these items until the site is more stable? --- Jura 16:27, 8 February 2020 (UTC)
Hmm, the last time the site was stable, Johannes Paulus was still our pope. Your answer is anyway not an answer to my question. Why are my scripts blocked, while others keep on adding items at a higher rate than I can even add descriptions to them? Edoderoo (talk) 16:49, 8 February 2020 (UTC)
Adding an item for a scholarly article seems to be clearly more valuable than adding a Dutch name to an item for a scholarly article. Given that the number of edits the query service can handle is limited, it's easy to understand why someone might prefer limiting your bot. That said, if you want to hear something concrete when you ask about your bot, how about adding a link to your bot to your userpage or linking it in a discussion like this? ChristianKl 19:21, 8 February 2020 (UTC)
It seems that creating an item for a scholarly article is half as expensive as adding a "." to one of its descriptions. Isn't it? --- Jura 19:30, 8 February 2020 (UTC)
  • All bots should obey the "maxlag" constraint that blocks edits when maxlag is over 5. PWB also stops reading at high maxlag, but I'm not sure that was the intended behavior, I think that is being reviewed. In the meantime, please report misbehaving bots to the Admin noticeboard. ArthurPSmith (talk) 20:39, 8 February 2020 (UTC)
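
The maxlag rule mentioned above could be sketched roughly as follows. The MediaWiki API accepts a maxlag parameter and answers with an error whose code is "maxlag" when the servers are lagged beyond that value; everything else here (the send callable, the retry count, the wait time) is a hypothetical stand-in for whatever HTTP client a bot actually uses, and this is not how PWB implements it.

```python
import time

MAXLAG_SECONDS = 5  # threshold quoted in the discussion above


def is_maxlag_error(response: dict) -> bool:
    """True if a decoded API JSON response is a maxlag rejection."""
    return response.get("error", {}).get("code") == "maxlag"


def call_with_maxlag(send, params: dict, retries: int = 3,
                     wait: float = 5.0, sleep=time.sleep) -> dict:
    """Call send(params) with maxlag set, backing off while the server is lagged.

    `send` is a hypothetical callable that performs the request and returns
    the decoded JSON response as a dict.
    """
    params = {**params, "maxlag": MAXLAG_SECONDS}
    for _ in range(retries):
        response = send(params)
        if not is_maxlag_error(response):
            return response
        sleep(wait)  # give the servers time to catch up before retrying
    raise RuntimeError(f"server still lagged after {retries} attempts")
```

A bot that wraps both its reads and its writes this way pauses during high lag instead of adding to it.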

Fiction database IDs for real items?

Is it appropriate to add IDs from databases devoted to fiction franchises like Star Trek to real items like Andromeda (Q2469)? An example of such an edit may be found here. Jc3s5h (talk) 16:36, 8 February 2020 (UTC)

What alternative do you suggest? --- Jura 16:38, 8 February 2020 (UTC)
For fictionalised people it is important to have a separate item that is different from the item for the actual person. For fictionalised things or places... possibly not? It depends whether there are statements to be made about the thing that will be true only of the fictionalised item. A property like place of birth (P19) comes to mind. Does it matter if a fictionalised character has for their place of birth a real city? Is that going to create an unpleasant 'gotcha' for unwary query writers asking "who was born here?". Or, is it better to include fictional Dubliners by default, if the query writer does not include ?item wdt:P31 wd:Q5? There's a balance of different considerations to take into account here, I would say. Jheald (talk) 16:50, 8 February 2020 (UTC)
Imagine in some fiction the Andromeda Galaxy does not have exactly the same features/properties as the real one. Then someone may want to add those data, which obviously are not consistent with the data of the real one. It needs its own item then … now imagine in this fiction some statements use the real item, others the fictional one … this would be a mess. author  TomT0m / talk page 17:09, 8 February 2020 (UTC)
I’ll even add, because I read in some report about constraints that items for biographies or something could be a problem for constraints, and needs sometimes to add exceptions to constraint violations (see Wikidata:2020_report_on_Property_constraints#Fictional_entities), that this may even be done even for biographies who may be romanced for example. author  TomT0m / talk page 17:18, 8 February 2020 (UTC)
We probably want a fictional andromeda galaxy in star trek, fictional or mythical analog of (P1074) the real one. That way there is little chance that results about Star Trek pollute the real data about the Andromeda Galaxy. I think it is sane to do this systematically, as it avoids having to filter out fiction data in every query. Data about fiction can pop up any time in any item … This could break perfectly good queries or infoboxes at any time. author  TomT0m / talk page 17:02, 8 February 2020 (UTC)

Jura asks what alternative I suggest. I suggest that fiction, and databases devoted to fiction, will routinely mention vast numbers of real people and places, and not attribute to these real items any properties that are significantly different than the true properties. In general I don't think these occurrences are notable enough to make any mention of them in the item about the real item. One might make an exception if the mention in fiction affected the real item, for example, if a real given name (Q202444) suddenly became more frequently given to babies because it was the name of an important, popular, character. Jc3s5h (talk) 17:44, 8 February 2020 (UTC)

Note that creating fictional items linked to the real one still allows retrieving the information of the real item wherever it does not differ for the fictional item. This therefore does not require duplicating information, so it's not a lot of work to create the fictional item anyway. author  TomT0m / talk page 20:17, 8 February 2020 (UTC)
  • The basic function of identifier statements on items is to find the same concept in other databases. It seems to me that this is exactly what you are describing. What content the database includes wouldn't really matter. Obviously, if the database didn't have any content, I don't see why we added the identifier in the first place. --- Jura 20:23, 8 February 2020 (UTC)

Q62589320

We have Wikidata property for an identifier that does not imply notability (Q62589320) used for Findagrave. Is it being used to say "does not imply notability" for Wikidata or English Wikipedia? I would say for English Wikipedia it "does not imply notability". I see no restrictions in our basic tenets that would prevent adding any/all entry/entries in Findagrave to Wikidata. The thing preventing us is no one wants to do the work. Am I right or wrong? --RAN (talk) 02:16, 4 February 2020 (UTC)

@Andrew Gray: What is "the policy", can you quote it please! Current policy reads notable if: "It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references." Are we saying that Findagrave is frivolous, and not serious? Also there is no "one degree of separation" rule! We have entries for over 10 generations of US presidential families. ChristianKl's arguments are sound, in that we have not all agreed yet to upload that particular data set, but I see no Wikidata:notability rule preventing us from doing that. Large data sets need community approval, Wikidata is already so large that we are experiencing computational constraints. Searches already timeout because the data set is too large, I can no longer search for people who died before they were born to look for errors. My summary: the wording: "Does not imply notability" is incorrect, "not yet approved for bot upload" would be the proper wording. Or "does not imply notability for Wikipedia" would also be correct. --RAN (talk) 13:18, 5 February 2020 (UTC)

If we have serious doubts about Wikipedia itself being citable in this respect, then I would think of Findagrave as one level below that. Do people see that differently? - Jmabel (talk) 16:26, 5 February 2020 (UTC)

Is your concern reliability or notability? You can visit the page of, as of yet, uncorrected VIAF errors. You can look at the >1,000 fixes I made of people who were dead before they were born or lived more than 120 years who are not flagged as super-centenarians here at Wikidata. Most were typos here or typos at the source we imported the data from. Other errors were from conflating people of the same name and importing a birth or death date that was wildly incorrect. You would have to calculate error rates for each data set and compare to see which was the most reliable: Findagrave or Wikidata or VIAF. For instance Geni.com has now flagged all people who died before they were born, married before age 14, and lived over 120 years as potential errors, so these errors can be corrected. --RAN (talk) 18:02, 5 February 2020 (UTC)
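
The two sanity checks mentioned here (death before birth, lifespan over 120 years) are easy to state as code once the dates have been fetched. The sketch below is a plain Python illustration on date pairs, not the query or tool actually used for those fixes; the function name is made up.

```python
from datetime import date

MAX_PLAUSIBLE_AGE = 120  # years, the threshold mentioned above


def date_problems(born: date, died: date) -> list[str]:
    """Return human-readable flags for implausible birth/death date pairs."""
    problems = []
    if died < born:
        problems.append("died before birth")
    else:
        # Completed years of age at death.
        age = died.year - born.year - ((died.month, died.day) < (born.month, born.day))
        if age > MAX_PLAUSIBLE_AGE:
            problems.append(f"lived more than {MAX_PLAUSIBLE_AGE} years")
    return problems


print(date_problems(date(1900, 1, 1), date(1890, 1, 1)))
print(date_problems(date(1800, 5, 1), date(1925, 5, 2)))
```

Flagged pairs still need a human to decide whether it is a typo, a conflation of two people, or a genuine supercentenarian.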

Q29940641

At 80 years or more after author(s) death (Q29940641) there is a curious entry for start_date=31 December 1986. I am not a math expert but I think 2020-1986=34, am I missing something? Shouldn't it be 2020-80=1940 as the latest date for a death? ... and there is no point putting a fixed value in the data field, it will increment each year. --RAN (talk) 21:16, 7 February 2020 (UTC)

I haven't found at what point it was added, there is no clear note in the edit summary that I can find. --RAN (talk) 03:01, 8 February 2020 (UTC)
I suppose the statement means that copyright in Spain lasts until 80 years after the death of the author if the author died at latest 31 December 1986. (If the author died at a later date, the copyright will last until 70 years after the death). If you compare with c:Commons:Copyright rules by territory/Spain#General rules, you will notice that according to Commons the 80 years protection is valid for authors who died before 7 December 1987, thus almost a year longer than stated here. --Dipsacus fullonum (talk) 08:14, 8 February 2020 (UTC)
That makes sense, we always have trouble force fitting important information within the constraints of our field names. Better confusing information than the absence of information. --RAN (talk) 07:23, 9 February 2020 (UTC)
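
Read that way, the date on the item is a cutoff on the author's death date rather than something derived from the current year. A minimal sketch of that reading, assuming the two cutoffs quoted above and nothing else about Spanish law:

```python
from datetime import date

# Last death date that still gets the 80-year term, as recorded on Q29940641.
WIKIDATA_CUTOFF = date(1986, 12, 31)
# Commons says "before 7 December 1987", i.e. the last qualifying day is 6 December.
COMMONS_CUTOFF = date(1987, 12, 6)


def protection_years(died: date, last_death_for_80: date = WIKIDATA_CUTOFF) -> int:
    """Term in years after the author's death under the reading given above."""
    return 80 if died <= last_death_for_80 else 70


print(protection_years(date(1986, 12, 31)))
print(protection_years(date(1987, 6, 1)))
print(protection_years(date(1987, 6, 1), COMMONS_CUTOFF))
```

The last two calls show where the two sources disagree: an author who died in mid-1987 gets 70 years under the value on the item and 80 under the Commons rule.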

How to say "meets criteria" or "doesn't meet criteria"? and "How?"

For example, Dodgson's method (Q5287927) meets or fulfills Condorcet criterion (Q831304) while Borda count (Q576578) does not. Currently Dodgson's method (Q5287927) is listed as "instance of" Condorcet criterion (Q831304) which doesn't seem right.

(Also, Ranked pairs (Q1686629) is an "instance of" electoral system (Q182985), which seems right, while

Borda count (Q576578) is a "subclass of" ranked voting (Q1078003), which I'm not sure is right. I think it should be an instance of both?

Ranked voting is a subclass of voting system, and Borda count and Ranked pairs are instances of both classes?) Omegatron (talk) 02:49, 9 February 2020 (UTC)

@Omegatron:This actually came up on the project chat last week. While the discussion didn't come to a firm conclusion, using complies with (P5009) seems to be the most specific solution at the moment. It also seems that there are multiple kinds of Borda count (Q576578), so using subclass of (P279) is reasonable. Vahurzpu (talk) 03:07, 9 February 2020 (UTC)
@Vahurzpu: "complies with" sounds good, but how would you say "doesn't comply with"?
Multiple kinds of Borda count? You mean these variations? https://en.wikipedia.org/wiki/Borda_count#Voting_and_counting Omegatron (talk) 03:41, 9 February 2020 (UTC)
Accordingly, I added "How?" to the section header. --- Jura 13:49, 9 February 2020 (UTC)

Items refusing to appear

Does anyone else deal with this problem? It's pretty annoying constantly having to refresh a page just to get the items to appear. --Trade (talk) 00:46, 5 February 2020 (UTC)

Yes! Has always happened intermittently. Add a field, hit publish and it disappears until refresh. --RAN (talk) 13:14, 5 February 2020 (UTC)
Most of the times I can't simply "hit publish". I have to focus somewhere else, then return focus to edit field and then I can hit publish. Strange behavior. --Infovarius (talk) 00:03, 11 February 2020 (UTC)

Technical problem with the Living People policy

Hi, on Sasha Grey (Q2709) I removed mass (P2067) as ancient and undesirable per policy, cf. Talk:Q2709 here and on enwiki. Wikilinks to the 2019 history about the corresponding weight in an infobox are available on demand; meanwhile {{Infobox person}} no longer supports this on enwiki.
However, ruwiki still supports it, and of course folks try to import the "missing" statement here: How is that supposed to be handled? –84.46.53.138 09:44, 6 February 2020 (UTC)

If the source is accurate, it should be restored. It is physics, not "ancient and undesirable" information. And the decision to remove body masses appears to be one person's opinion here at Wikidata based on a decision at English Wikipedia. We remove living people information to prevent identity theft and protect minors, no one uses body mass as a personal identifier since it changes from year to year. --RAN (talk) 03:46, 8 February 2020 (UTC)
Only me happens often, but I wasn't involved in anything related to WD:BLP and its two lists. –84.46.53.249 08:41, 8 February 2020 (UTC)

Wikidata step-by-step

Is there some sort of a guide that suggests things to try for new contributors ?

e.g. if someone decides to invest some time every week they could go through a list of things to do (by actually editing). We currently have "tours", but they don't seem to actually lead to edits.

Some samples (each done in a separate, short session):

  1. find three missing facts to add to Wikidata. (maybe from some generated list)
  2. create three new items for Wikipedia articles
  3. fix three labels, etc.
  4. create a series of items on some topic manually
  5. create a series of items on some topic with Petscan
  6. create a series of items on some topic with Quickstatements

Not sure about the sequence, but the general idea is to get to know various aspects if one can't invest too much time at once.

Some thought is needed that a few users don't exhaust all samples. --- Jura 13:58, 9 February 2020 (UTC)

does Western Punjabi Wikipedia pnb.* پنجابی have a higher than average number of trolls?

I don't speak it, but a lot of things look suspect and when I look them up they make even less sense. I'm learning Urdu and I occasionally look up Western Punjabi to compare, since there is a lot of overlap. I find for most languages the matching wiki page is a good quick way to check i'm getting the right concept, and not mixing it up with a synonym. For most languages this works well and confirms or clarifies what I find in other sources, but for پن٘جابی I find the result often looks a bit off, and when I look it up it seems to be some sort of joke. But for Western Punjabi there aren't many good quality accessible resources to compare to, so am i just getting muddled or does pnb.wikipedia.org have a higher than average troll problem? Is there a way to approach it if i find what looks like a troll but i don't know how to fix it? Is there a way to flag things for a fluent speaker to check? Irtapil (talk) 01:29, 10 February 2020 (UTC)

I cannot help with the OP's problem, but I would love to have a way to flag things to be checked by a fluent speaker of a specific language. Bovlb (talk) 16:28, 10 February 2020 (UTC)

Interest in This Tool?

I made a browser extension that rips structured data from Google's knowledge panel search results and posts the data to wikidata. I'm aware of concerns about sourcing data from Google (it can create cycles of information given they source from us). But it's a good semi-automated tool that currently only pulls out:

  • Social media links
  • Freebase/Google KB ids

The changes it makes to wikidata look like this. The results need to be hand audited for accuracy since Google's data is noisy but it's much faster than doing it by hand.

My questions are:

  • Is there any interest in my open sourcing/distributing this tool? Doing so would require some effort to make it more user-friendly.
  • Are there thoughts about pulling more structured data from these panels (e.g. inception dates).

Also, is this the most active place to discuss the project or is there a mailing list/IRC that would be better?

BrokenSegue (talk) 22:17, 5 February 2020 (UTC)

I'm well aware of the concerns regarding sourcing data from Google but i think we should be very much safe extracting social media links. --Trade (talk) 22:30, 5 February 2020 (UTC)
Doesn't Google rip data from Wikidata? Seems likely to create circular citations, and amplify "citogenesis" , corrupting verifiability of the origins of data. -Animalparty (talk) 23:24, 5 February 2020 (UTC)
yes, but the tool only populates fields that are currently empty in wikidata and I hand audit the results (it's not fully automated and cannot be fully automated). examining the data it's clear Google is sourcing the social media information primarily from somewhere else. and the population of the freebase/google kb info seems unproblematic since they can't be sourcing that from us. BrokenSegue (talk) 00:23, 6 February 2020 (UTC)
This sounds like the tool does populate fields where Google mirrored our data and then we deleted our data. I think the tool is fine as long as there's hand auditing of the results. ChristianKl 16:45, 6 February 2020 (UTC)
Yeah that is a reasonable concern though I think that's a minority of cases and yeah i'm hand auditing the results. BrokenSegue (talk) 17:05, 6 February 2020 (UTC)
@BrokenSegue: Is it possible to have these edits include a reference? You have "scraped data from google" in the edit summary, but it would be really helpful in the long run if you could add some kind of reference like stated in (P248):Google Knowledge Graph (Q648625) (or possibly a more appropriate item) Andrew Gray (talk) 20:12, 10 February 2020 (UTC)
@Andrew Gray: good idea. I added that feature. you can see it here BrokenSegue (talk) 04:44, 11 February 2020 (UTC)

Q56299775 vs Q1122799

Comstock laws (Q56299775) vs Comstock laws (Q1122799). How should we handle this, should the two French Wikipedia articles be merged? We have the various laws passed under one article in the English Wikipedia but there are two articles in the French Wikipedia, one is on a single act of Congress and the other on the various laws passed. --RAN (talk) 01:28, 10 February 2020 (UTC)

Although one of the French articles - https://fr.wikipedia.org/wiki/Comstock_laws - leaves something to be desired, it appears to be about the set of laws, whilst https://fr.wikipedia.org/wiki/Comstock_Act seems to be about the 1873 'parent' act. My view: swap the FR sitelinks around, clarify the label and description of Q56299775 - it's an act, not laws - and then use has part / part of to link the two wikidata items. --Tagishsimon (talk) 09:11, 10 February 2020 (UTC)

Label/alias in mul

Is it possible to add label and aliases with 'mul' as a language? In some cases there are names, symbols etc. that are correct for many languages (like symbols of the elements for example) and it's quite tedious to add such things to many languages as having the same aliases in many languages does not help in searching etc. Wostr (talk) 15:09, 22 January 2020 (UTC)

Slightly OT, years ago I was on the language tag review and related RFC-update lists, exploring mul + und issues (after en-GB survived a basic plausibility check) can be fascinating. On YouTube I gave up trying to find a tag for Kobaian. Is und-Latn (example) a thing? After all you would at least know the script. –84.46.53.84 02:16, 28 January 2020 (UTC)
Previous discussions: Wikidata:Project_chat/Archive/2013/07#Global_labels, Wikidata:Project_chat/Archive/2019/06#Multilanguage_label has ended with nothing. --Infovarius (talk) 15:57, 29 January 2020 (UTC)
Given that there's support and no pushback, maybe we need a Phabricator ticket? ChristianKl 08:52, 3 February 2020 (UTC)
Please make sure to get this right per RFC 5646 section 4.1 clause 5.
I just saw that enwiki has mis for Kobaian, and zxx non-linguistic can be also interesting. In one of the two archived discussions wikilinked above you had a mul-Latn, that's not the same as und-Latn, there is a SHOULD NOT for mul in BCP 47 = RFC 5646. –84.46.52.96 05:17, 4 February 2020 (UTC)
There's a SHOULD NOT but I don't think it applies to our planned usage. "This subtag SHOULD NOT be used when a list of languages or individual tags for each content element can be used instead." A list of languages is not possible for our use-case. We actually mean all languages and not a subset that could be expressed by a list. I think we comply with the recommendations in that document. If you think we don't, please point to the specific portion that you think gets violated by the planned usage.
zxx seems to be a good idea for special unicode characters. ChristianKl 17:43, 10 February 2020 (UTC)
^.^b All fine if you checked it, the experts on phab: can sort it out. @Cbrescia: My comment about shp in January was misleading, if MediaWiki now tries to cover all plausible language tags. –84.46.52.252 11:36, 11 February 2020 (UTC)

RAL Colors

I was working on colors to improve them and I found that there are a lot of RAL colors instance of RAL classic color (Q17421658). I would like to suggest a batch change and an addition of a property.

First I think it would be better to have the name of the RAL colors as label and the RAL 0000 number part as an alias. Additionally I think it would be great to add the RAL identifier as an external identifier, so we would need a property RAL identifier or similar.

Any ideas, thought for the further process on this?

--DaSch (talk) 19:33, 5 February 2020 (UTC)

Link from Commons to Wikidata for entries on images

At Wedding Deferred, Commits Suicide (Q84572479), for instance, we have a Wikidata entry for a news article hosted at Commons. I can click on the image to get to Commons ... but if I was at Commons how would I know there was a Wikidata entry for this image? --RAN (talk) 06:13, 8 February 2020 (UTC)

I added a P6243 statement on its structured data, does that look right? Ghouston (talk) 06:23, 8 February 2020 (UTC)
Or you can check File usage on other wikis section on commons. ‐‐1997kB (talk) 06:48, 8 February 2020 (UTC)
Ok, both are clever, thanks! Now I see the "File usage on other wikis". --RAN (talk) 07:38, 8 February 2020 (UTC)
I wonder why we have a massive, and monolithic, quotation or excerpt (P7081) value on that item, rather than the whole (short) article, which is out of copyright, being transcribed on Wikisource, where it can enjoy the use of heading and paragraph markup? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:39, 8 February 2020 (UTC)
Can't we have both? I have had items deleted at Wikisource as out of scope in the past. Quote at Wikidata is limited to 1,500 characters, so not "massive, and monolithic". Sometimes redundancy is good since we have no control over the deletion policies at other projects. Wikipedia has purges of classes of people and things, and the information is only preserved because we have an entry for them here. At one time there was a purge of high schools at Wikipedia and more recently a purge of mayors and county administrators. They even tried deleting the entries for those people/things here at Wikidata. So ... redundancy can be beneficial. --RAN (talk) 17:38, 8 February 2020 (UTC)
@Richard Arthur Norton (1958- ): Storing data in Wikidata statements is more expensive than storing it elsewhere. The query service has limited capacity and that should encourage us to not store longer texts in statements, so that we can have more items/statements in the query service. ChristianKl 08:53, 11 February 2020 (UTC)

time-varying P31

Quite recently, the two small Indian union territories of Dadra and Nagar Haveli district (Q46107) and Daman and Diu (Q66710) merged to form Dadra and Nagar Haveli and Daman and Diu (Q77997266).

If the Wikipedia article is to be believed, the new territory is actually ((Dadra and Nagar Haveli) and Daman and Diu). That is, although the former union territory of Daman and Diu is no more, the former union territory of Dadra and Nagar Haveli is now one of the three districts making up Dadra and Nagar Haveli and Daman and Diu.

I have indicated this at Dadra and Nagar Haveli district (Q46107) with two different values for instance of (P31), qualified by start time (P580) and end time (P582) in the obvious way. I believe this is a good way to do it.

However, based on discussions I sometimes see here, I suspect there might be an argument for creating two distinct Q-entities for the former union territory as distinct from the current district.

So, if anyone is interested, feel free to correct the tagging at Q77997266 and related entities if I got it wrong, or lobby for creating the second, distinct entity for Dadra and Nagar Haveli. —Scs (talk) 21:05, 9 February 2020 (UTC)

If the new territory is the legal successor of one of the two previous ones - i.e. it was one territory added to another - then keeping the old and new in one item would be OK. But if a legally new territory was formed out of the two previous ones, then it must be a new item. It's a grey area which external identifiers for administrative units handle differently: some assign a new identifier at every merge/split of an administrative unit, others keep the old one, at least when the name of the successor stays the same. But the Wikipedia links will probably require two items anyway, as there will be at least one Wikipedia which chooses to have different articles about them. Ahoerstemeier (talk) 09:33, 10 February 2020 (UTC)
In this case there are already separate items, but I think the difficulty is that Dadra and Nagar Haveli district (Q46107) is being used to represent different kinds of entities before and after the change, union territory of India (Q467745) vs district of India (Q1149652). Since they are different kinds of administrative entities, should they have different items, even though they have the same name and possibly the same territory? Ghouston (talk) 09:58, 10 February 2020 (UTC)
Yes, that's precisely it: Same name and (AIUI) same territory, but a different "kind of entity".
And although I said above that "I believe this is a good way to do it", here's a pretty strong counterargument: Do we expect someone running a query for "all Indian states and territories" (that is, someone trying to check or recreate one of the lists at w:Administrative divisions of India) to explicitly qualify their wdt:P31 query to make sure they get only entities that are currently instances of whatever? —Scs (talk) 11:33, 11 February 2020 (UTC)
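To make the concern in this thread concrete, here is a minimal Python sketch of what "qualifying by time" means for a consumer of the data. It is not how the query service works internally; the statement layout, the cut-over date, and the function names are assumptions made up for illustration, and both statements are assumed to have the same rank.

```python
from datetime import date

# Assumed cut-over date, used only for illustration.
MERGER = date(2020, 1, 26)

# Hypothetical, simplified P31 statements for Dadra and Nagar Haveli district (Q46107),
# each with optional start time (P580) and end time (P582) qualifiers.
STATEMENTS = [
    {"value": "Q467745", "start": None, "end": MERGER},   # union territory of India
    {"value": "Q1149652", "start": MERGER, "end": None},  # district of India
]


def valid_on(statement, day):
    """Return True if the statement's start/end qualifiers cover the given day."""
    started = statement["start"] is None or statement["start"] <= day
    not_ended = statement["end"] is None or day < statement["end"]
    return started and not_ended


def instance_of(statements, day=None):
    """Without a day, return every value; with a day, only the values valid then."""
    if day is None:
        return [s["value"] for s in statements]
    return [s["value"] for s in statements if valid_on(s, day)]


print(instance_of(STATEMENTS))                     # unqualified lookup
print(instance_of(STATEMENTS, date(2020, 2, 11)))  # qualified by a date
```

The unqualified lookup returns both classes, which is the situation the counterargument describes: a list of "current union territories" built that way would still include the item unless the query also checks the qualifiers (or the statements are ranked differently).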

Wikidata weekly summary #402

We might expose a subgraph with only truthy statements. Or have language specific graphs, with only language specific labels.

It's not clear to me why the SPARQL server needs to know anything about labels or descriptions in the first place. I don't think we would lose much relevant functionality if labels and descriptions were moved to a different server. On a related note I think that the main venue for communicating information about Wikidata should be Wikidata and if there's a monthly edition of the state of the query service it would be great to have it on Wikidata (it's okay to additionally email it). ChristianKl 08:15, 11 February 2020 (UTC)

Merging 2 items

Could someone merge Q84564408 and Q55771445 ? Same individual. Txs--René La contemporaine (talk) 10:53, 11 February 2020 (UTC)

@René La contemporaine: see Help:Merge. --- Jura 11:15, 11 February 2020 (UTC)
Jura Thank you.--René La contemporaine (talk) 13:24, 11 February 2020 (UTC)

Reduced loading times for Wikidata/Wikimedia Commons

Hello all, While cleaning up (reviewing and rewriting) the code of the Wikidata and Wikimedia Commons backend in October 2019, the Wikidata team at WMDE together with WMF worked on reducing the loading time of pages. We managed to reduce the loading time of every Wikidata page by about 0.1-0.2 seconds. This is due to a reduction of the modules (sets of code responsible for a certain function) that need to be loaded every time a page is opened by someone. Instead of the 260 modules that needed to be loaded before, only 85 modules need to be loaded now when the page is called. This makes it easier to load Wikidata pages for people who only have a slow internet connection.

Size decrease of the initialization loader on Wikidata pages (on Grafana).

Reducing the number of modules called when loading the page equals a reduction of about 130 GB of network traffic for all users every day, or 47 TB per year. The reduction of network traffic translates into a reduction of electricity use; thus, this change is also good for the environment. Additionally, the interdependencies between the modules were reduced from 4 MB to 1 MB, which improved the loading time per page as well.
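As a quick sanity check of the figures quoted in this announcement, the arithmetic can be reproduced in a few lines of Python. The only inputs are the numbers stated above; the use of decimal units (1 TB = 1000 GB) and a 365-day year are assumptions.

```python
# Figures quoted in the announcement.
GB_PER_DAY = 130
MODULES_BEFORE = 260
MODULES_AFTER = 85

# Assumptions: a 365-day year and decimal units (1 TB = 1000 GB).
DAYS_PER_YEAR = 365
GB_PER_TB = 1000

gb_per_year = GB_PER_DAY * DAYS_PER_YEAR
tb_per_year = gb_per_year / GB_PER_TB

modules_removed = MODULES_BEFORE - MODULES_AFTER
share_removed = modules_removed / MODULES_BEFORE

print(f"{gb_per_year} GB per year, about {tb_per_year:.1f} TB")
print(f"{modules_removed} fewer modules, about {share_removed:.0%} of the original set")
```

Under these assumptions the yearly saving comes out at 47,450 GB, which matches the "47 TB per year" in the announcement once rounded down, and the module count drops by roughly two thirds.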

Many thanks to everyone involved in this improvement! If you want to get more details about the actions we performed, you can have a look at the Phabricator board.

If you are developing scripts or tools on top of the Wikidata UI, some documentation will walk you through the architecture of ResourceLoader, what page load performance is, and how to create module bundles with ResourceLoader.

For further questions or feedback, feel free to contact us on this page.

Cheers, for the Wikidata team: Max Klemm (WMDE) (talk) 11:05, 11 February 2020 (UTC)

Items failed to merge because two items contain the same sitelink

Category:People executed by the Qin dynasty (Q10813613) was merged to Category:Executed Qin dynasty people (Q24918671), but could not be redirected, because Category:People executed by the Qin dynasty (Q7016834) is linked to the same Chinese Wikipedia page as Q24918671. I don't know which is the correct item for the sitelink, as putting the description into Google Translate suggests Q7016834, but based on the categories Q24918671 looks more likely. Peter James (talk) 12:20, 11 February 2020 (UTC)

The two categories are distinct on both enwiki and zhwiki. Li Si seems to be in one enwiki category but not in the other. Before we can merge on Wikidata, the enwiki and zhwiki would have to merge their categories. ChristianKl 14:03, 11 February 2020 (UTC)
The merged items are Q10813613 and Q24918671, not Q7016834. The zhwiki links on Q24918671 and Q7016834 are identical, which makes it impossible to edit the items unless one of the links is removed, but I don't want to remove a correct link and keep an incorrect one. Peter James (talk) 14:59, 11 February 2020 (UTC)
It's not generally possible to put the same sitelink on multiple items anyway. I assume it was only a bug that allowed it to happen. I'd just remove it from one item and then complete the merge. Ghouston (talk) 22:02, 11 February 2020 (UTC)

Problem with P3054

Hi, I have a problem with Ontario MPP ID (P3054): apparently the format of the ID changed from a number to a name format. For Dalton McGuinty (Q568204), for example, it is now dalton-mcguinty instead of the old format. What should we do: change the ID, or mark this property as obsolete and start a new one? --Fralambert (talk) 01:35, 3 February 2020 (UTC)

Discussion continues on the property's Talk page to find a way out. —Eihel (talk) 11:06, 12 February 2020 (UTC)

Little riddle…

Something happened yesterday on Wikidata that had not happened since February 22, 2013, shortly after the beginnings of this project.

What is it? —Eihel (talk) 23:58, 7 February 2020 (UTC)

For one single day, no one trolled other editors, wrote nasty things, tried to ban their enemies? --RAN (talk) 02:58, 8 February 2020 (UTC)
No bogus items by Wikipedia editors? No redundant bot edits adding a description identical to the value of the P31 statement? --SCIdude (talk) 14:29, 8 February 2020 (UTC)
Please, tell us. --Matěj Suchánek (talk) 08:10, 12 February 2020 (UTC)
Solution Matěj Suchánek :
We congratulate Mike Peel for obtaining administrative rights. Note that it was very close to not getting it ;)
Indeed, since 2013, we have not seen such a high number of votes. Instead of Lymantria, I would have allowed myself a little joke. In the genre:
  • Sorry, I don't have enough fingers to count the voices… in doubt, ✓ Granted
  • or In front of the candidate's Stalinist score, I am inclined to ✓ granted the right. etc.
Again, congratulations Mike, I have no doubt that you will use it well. —Eihel (talk) 10:19, 12 February 2020 (UTC)
@Eihel: Thanks! I'll try to use the mop well. What was the event in February 2013, though? Thanks. Mike Peel (talk) 10:40, 12 February 2020 (UTC)
My apologies for not having been funny enough ;P Granting rights is serious stuff. ;) Lymantria (talk) 11:26, 12 February 2020 (UTC)

Described by source

Described by source (Property:P1343) is currently restricted to dictionaries and encyclopedias. Why can't the source be an obituary or a biography or a news article? Most people and events are not in existing encyclopedias, but are described by news articles and biographies and obituaries. If we have those news articles and biographies and obituaries as Q entries, shouldn't they be linked by "Described by source"? --RAN (talk) 18:57, 9 February 2020 (UTC)

I use "described by source" (judiciously) for other items like exhibition catalogs. I'd like to see this property opened up. - PKM (talk) 20:06, 9 February 2020 (UTC)
I tried to change the label and it was reverted. So I am seeking consensus to make the definition broader. We don't have to link to every news article we have that mentions George Washington or Abraham Lincoln, but if we have news articles on people with no coverage in an encyclopedia, we should link to those news items and books about them. Especially since Wikisource does a poor job of indexing and cataloging their entries. They are indexed by author but not by subject. There is no way to know that John Smith is the same as John Q. Smith is the same as J. Q. Smith across articles at Wikisource. --RAN (talk) 00:16, 10 February 2020 (UTC)
The property was proposed and created for dictionaries and encyclopedias describing the item. Don't just use it otherwise. If people start using properties as they like instead of as intended, it will be chaos. --Dipsacus fullonum (talk) 08:36, 10 February 2020 (UTC)
Or, you know, it'll be managed change. It's perfectly possible to use as a qualifier object has role (P3831) to specify the nature of the source; and that way we get a useful general purpose property. The alternative is we proliferate properties for no good reason, which makes consistency unlikely and reporting a PITA. There doesn't actually seem to be anything at all very useful about restricting the sources to two types, doesn't seem to be anything that'll get broken if we include an obituary or a magazine article. --Tagishsimon (talk) 09:03, 10 February 2020 (UTC)
It's not clear to me from the proposal that it can only be used with dictionaries and encyclopedias. The labels don't all match either, e.g., en-gb says "dictionary, encyclopaedia, etc. where this item is described", with the addition of "etc." perhaps expanding the usage considerably, and nl says "woordenboek, encyclopedie of naslagwerk", and who knows what "naslagwerk" (reference work) may encompass. So I've treated it as an analogue of described at URL (P973), which I don't think is limited by type of external content. Once you describe the resource at the URL with its own item, you have to switch to described by source (P1343). Ghouston (talk) 09:09, 10 February 2020 (UTC)
I've been using this property to specify "technical standards" which define or describe a concept; but why stop there and not allow any scientific article or source which, well, describes (as the label of the property reads) a concept. Therefore I  Support extending the documented scope of this property to any source. (I think that the current label in en, de, es, fr is good; the description could be updated, but the important point is that the outcome of this discussion should be recorded, say, on the property talk page and property usage examples so that others easily find guidance for how to use this property.) Toni 001 (talk) 10:47, 10 February 2020 (UTC)
I don't think that Project chat is the place to create consensus to change property descriptions. That discussion should happen on the property page, so that in future people can review it easily. If the discussion doesn't get enough attention, link it from here. ChristianKl 13:35, 10 February 2020 (UTC)
We can always migrate the discussion there when complete, but people only respond to highly visible discussions here. Even when you link to a discussion elsewhere, most people do not click through, including myself. --RAN (talk) 04:54, 11 February 2020 (UTC)

See for example special:diff/1096069043 by @Trade:. Visite fortuitement prolongée (talk) 21:45, 10 February 2020 (UTC)

Using described by source (P1343) seems better than described at URL (P973) for the items on Eurabia (Q737979), given that they have their own Wikidata entries. A problem with both properties is that they may turn into extensive lists, as is already happening there. The item for "London" could potentially list every encyclopedia and guidebook that mentions the city. Ghouston (talk) 01:15, 11 February 2020 (UTC)
We can always limit the number of references to 10 or 15, or only display the first 10, or rank the references by the number of times the subject name gets mentioned. There can be many technical solutions. For some obscure topics, Wikidata may be the only way to link to obscure news articles and scientific papers. --RAN (talk) 04:54, 11 February 2020 (UTC)
"Better than" described at URL (P973) is not necessarily "good enough", it could still be a maintenance nightmare. –84.46.52.252 13:57, 11 February 2020 (UTC)

hyphens in URLs but underscores in lists

Why do the tables on the pages e.g. https://www.wikidata.org/wiki/Q32762#sitelinks-wikipedia use underscores e.g. "fiu_vro" while the actual URLs use hyphens e.g. https://fiu-vro.wikipedia.org/wiki/V%C3%B5ro_kiil ? consistency would be much better. Is it possible to fix this? or too big? Irtapil (talk) 01:12, 10 February 2020 (UTC)

^.^b Good question, per #Label/alias in mul above I bet on bug, maybe try to report it on phab:. –84.46.52.252 14:06, 11 February 2020 (UTC)
We use MediaWiki:Gadget-SiteIdToInterwiki.js to strip -wiki etc. suffixes, perhaps it could also do this. Note that you can still see be_x_old, although it was renamed to be_tarask a few years ago. --Matěj Suchánek (talk) 08:17, 12 February 2020 (UTC)
MediaWiki:Gadget-SiteIdToInterwiki.js is definitely responsible for this, in my opinion. Wikibase itself shows site IDs in the sitelinks list (which I believe correspond to site_global_key) – that’s also why it still shows be_x_oldwiki, the database hasn’t been renamed yet (T127570). And these site IDs use underscores, not hyphens – if MediaWiki:Gadget-SiteIdToInterwiki.js is supposed to turn those site IDs into interwiki codes, then apparently it needs to turn the underscores into hyphens as well. --Lucas Werkmeister (WMDE) (talk) 18:12, 12 February 2020 (UTC)
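The conversion discussed in this thread can be sketched as follows. This is a Python illustration only, not the gadget's actual JavaScript; the function name is hypothetical, and the sketch only handles Wikipedia site IDs ending in "wiki", ignoring sister-project suffixes.

```python
def site_id_to_interwiki(site_id: str) -> str:
    """Turn a Wikipedia site ID such as 'fiu_vrowiki' into an interwiki code.

    Hypothetical helper: strips a trailing 'wiki' suffix if present, then
    replaces underscores with hyphens so the result matches the subdomain.
    """
    suffix = "wiki"
    code = site_id[: -len(suffix)] if site_id.endswith(suffix) else site_id
    return code.replace("_", "-")


for example in ("fiu_vrowiki", "be_x_oldwiki", "fiu_vro"):
    print(example, "->", site_id_to_interwiki(example))
```

Note that this would still yield be-x-old for be_x_oldwiki, since the site ID itself has not been renamed; mapping that to be-tarask would need a separate lookup.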

Duplicates about French mayors

Hi there,

I keep finding duplicate items about French mayors, especially from Gard, most recently Q60677070 and Q65579675 (now merged) about Mr. Gilles Dumas, CLH. They were all created en masse months ago without checking pre-existing ones. I can't check all the mayors of Gard one by one. How should this be dealt with?

Thanks for any idea... 86.193.172.227

  • Some checking was done, some dups happen. See Help:Merge.
Are you sure one isn't the father or a relative of the other?
@Arpyia: for info. --- Jura 22:04, 10 February 2020 (UTC)

I already knew the merge tools, thanks. And it is obvious that it is the same person (both items deal with a politician who holds the same position, at the same time). 86.193.172.227

Look: the start time isn't the same (because one item took into account the first election as mayor, and the second only the current elective term; both dates are correct, see for instance [5]), but as the position is still held now, two different people named "Gilles Dumas" can't both have been mayor of Fourques at the same time. 86.193.172.227

Thanks for merging them then. It wasn't visible from the data available here. Such duplicates might be a side-effect of the data model chosen for French mayors. --- Jura 12:02, 11 February 2020 (UTC)
@Jura1: I just found some other duplicates (Q65604477 and Q60824253; Q60824116 and Q65582364; Q60824933 and Q65591551; Q60677916 and Q65603310; Q60824490 and Q65572113; Q65569085 and Q60825842; Q60815680 and Q65571735; Q60815035 and Q65569716; Q60823841 and Q65587974), only having a look at Gard towns whose names start with A. But I don't think that it is up to me to merge all of them... Regards, 86.193.172.227
Of course not. This finds some 250 (out of 41557 mayors/35277 distinct offices). While the mayors of Tours are probably not duplicates, e.g. Q65577425 and Q83617920 should have been avoided. Eventually, these need to be looked into. The rate of possible duplicates seems fairly low given the number of items involved. --- Jura 08:58, 12 February 2020 (UTC)

I already merged a duplicate about Ms. Pilar Chaleyssin, CLH, mayor of Aubais, in August 2019. What's the sense of re-creating one in January 2020? I'm puzzled. 86.193.172.227
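One way to look for such candidates in bulk, sketched in Python under stated assumptions: the records below are a hypothetical export (item ID, name, office), the function name is made up, and the last record is invented filler. Because the thread notes that the two Gilles Dumas items had different start times, the grouping key deliberately uses name and office only.

```python
from collections import defaultdict

# Hypothetical export of (item ID, person name, office held).
MAYORS = [
    ("Q60677070", "Gilles Dumas", "mayor of Fourques"),
    ("Q65579675", "Gilles Dumas", "mayor of Fourques"),
    ("Q_EXAMPLE", "Jeanne Exemple", "mayor of Exempleville"),  # invented filler record
]


def candidate_duplicates(records):
    """Group item IDs that share the same normalised name and office."""
    groups = defaultdict(list)
    for qid, name, office in records:
        key = (name.casefold().strip(), office.casefold().strip())
        groups[key].append(qid)
    # Only groups with more than one item are worth a manual look.
    return [sorted(qids) for qids in groups.values() if len(qids) > 1]


print(candidate_duplicates(MAYORS))
```

The output is only a list of candidates: as asked above, a match could still be a father and son with the same name, so each pair needs a human check before merging.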

Items deletion request

Please delete these empty items:

--151.49.42.20 21:19, 10 February 2020 (UTC)

I checked one of them; it should have been merged, not emptied out. The same is likely true of others. Please read Help:Merge. ArthurPSmith (talk) 21:48, 10 February 2020 (UTC)
They were not emptied by me; I found them already empty... —151.49.42.20 21:53, 10 February 2020 (UTC)
I fixed the last one. Is this from Special:ShortPages, basically everything with a size of 158 or 159 bytes? Ghouston (talk) 00:35, 11 February 2020 (UTC)
So I can reuse them... Thanks! --2001:B07:6442:8903:E1F5:E3C3:7545:62E0 08:38, 11 February 2020 (UTC)
@2001:B07:6442:8903:E1F5:E3C3:7545:62E0: Merged items should never be reused.--GZWDer (talk) 08:54, 11 February 2020 (UTC)
@GZWDer: I am not talking about merged items. I stated my intention to use EMPTY items, NOT merged ones. Why don't you want to delete empty items? Empty (not merged) items should be deleted; otherwise they are available for reuse! --151.49.42.20 19:53, 11 February 2020 (UTC)
empty entries that have been merged elsewhere should be redirected. if it never had any content it should be deleted presumably. why re-use? BrokenSegue (talk) 00:59, 12 February 2020 (UTC)
It's not like we are running out of integers. Ghouston (talk) 01:04, 12 February 2020 (UTC)
+1. +1. +1. +1. +1. +1. +1. +1. +1. +1. +1. (concur w/ Ghouston.) - Jmabel (talk) 04:50, 12 February 2020 (UTC)
  • I suppose it won't break the site entirely if an IP changes an entity once in a while to something else. Still, I find it more concerning when more extensive re-purposing is done. --- Jura 17:46, 12 February 2020 (UTC)

P953

At Property:P953, can we delete the "mandatory qualifier constraint property=archive URL"? It is very confusing as to what it wants, and every use of "full work at" triggers the message, which looks like an error even though it is just a suggestion. Is it suggesting that I should be using a link from the Wayback Machine? If so, that is something that would be better automated by a bot. --RAN (talk) 04:44, 11 February 2020 (UTC)

For information to SPARQL query writers : workaround found to a bug

If you ever tried to use aggregation functions in SPARQL with the query service together with the wikibase:label service and it did not work, you’ll be happy to read about a workaround: Wikidata:SPARQL_tutorial#wikibase:Label_and_aggregations_bug

(and a bug has been filed)

This may help someone :)

author  TomT0m / talk page 21:19, 11 February 2020 (UTC)

There is no bug. It is just as documented in the User Manual. The label service in automatic mode is supposed to work on unbound variables in SELECT. You try to use it for variables in GROUP BY which isn't mentioned as supported, so there is no case. --Dipsacus fullonum (talk) 22:11, 11 February 2020 (UTC)
I think it had always been that way. Not sure if it's a bug or the absence of a feature, obviously, it could be simpler .. --- Jura 00:40, 12 February 2020 (UTC)
@Dipsacus fullonum: I get the point, I missed that. Nevertheless, technically it’s also used in the « select » clause in the aggregation expression, still unbound as it’s also unbound in the group by, so the user manual may be seen as ambiguous and this may still be seen as a bug. It’s not very intuitive anyway, and as most of the time the label service is used with default naming (look at the query examples, I think that none or almost none explicitly name the variables) I suspect most people forgot about that feature. Anyway I’ll modify the user manual. author  TomT0m / talk page 09:01, 12 February 2020 (UTC)
@TomT0m: “used in the select clause” is an underdefined term; strictly speaking, the label service looks at the projection of the query. I’ve updated the manual as well. --Lucas Werkmeister (WMDE) (talk) 12:55, 12 February 2020 (UTC)
@Lucas Werkmeister (WMDE): Is there a technical reason why this should remain as this and this could not be implemented to look inside the aggregation expression for variables ? Too much work to implement ? Because it’s the kind of papercuts that can be very frustrating and puzzling for no good reason. It’s already frustrating to have to care at all for labels where the work of choosing labels for items could be done by the query service UI … author  TomT0m / talk page 16:41, 12 February 2020 (UTC)
@TomT0m: I don’t know Blazegraph internals well enough to judge that, sorry. --Lucas Werkmeister (WMDE) (talk) 16:56, 12 February 2020 (UTC)

Academic papers about Wikidata

I would like to read some papers about Wikidata. Where should I start? Any recommendations what to read first? Thanks in advance! Bencemac (talk) 07:14, 12 February 2020 (UTC)

@Bencemac: Where to start? With a wikidata report, perhaps. Beyond that, it somewhat depends on what facet of wikidata is of interest.
SELECT ?item ?itemLabel (group_concat(?instLabel; separator="; ") as ?type) ?date ?inLabel
WHERE 
{
  ?item wdt:P921 wd:Q2013.
  ?item wdt:P31 ?inst .
  optional {?item wdt:P577 ?date . }
  optional {?item wdt:P1433 ?in . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". 
                         ?inst rdfs:label ?instLabel .
                         ?item rdfs:label ?itemLabel . 
                         ?in rdfs:label ?inLabel .}
} group by ?item ?itemLabel ?date ?inLabel
--Tagishsimon (talk) 13:13, 12 February 2020 (UTC)

Thank you very much, these links are very useful! To be more specific, I'd like to read about Wikidata in general and VIAF (maybe other library catalogues) integration. Regards, Bencemac (talk) 15:41, 12 February 2020 (UTC)

Fictional character as a value for notable work

Agatha Christie (Q35064) has notable work (P800) Hercule Poirot (Q170534). The value is for the fictional character Hercule Poirot, which seems strange since notable work is intended for creative works and has a value-type constraint (Q21510865), which includes work (Q386724). It turns out fictional character (Q95074) is a subclass of work in the hierarchy, so that's why it's not flagged as a constraint violation. Is it possible to exclude fictional entity (Q14897293) from being an allowed value type constraint, or is there a better way to address this issue? Chicagohil (talk) 20:20, 12 February 2020 (UTC)

If you think something else is notable, why not just add it in addition? This is not Wikipedia, you don't need to overwrite other contributors. --- Jura 21:15, 12 February 2020 (UTC)
It doesn't seem strange to me at all. Hercule Poirot (Q170534) is a creative work, not a person. What seems strange to me is that fictional human (Q15632617) is a subclass of person (Q215627). Ghouston (talk) 21:30, 12 February 2020 (UTC)
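The reason the constraint does not fire can be illustrated with a toy subclass check in Python. This is not how Wikidata's constraint system is implemented; the function names are hypothetical, and the exact subclass path between fictional character (Q95074) and work (Q386724) is assumed here purely for illustration.

```python
# Toy subclass-of (P279) edges. The intermediate step is an assumption;
# the thread only states that Q95074 ends up under Q386724 somewhere in the hierarchy.
SUBCLASS_OF = {
    "Q95074": ["Q14897293"],   # fictional character -> fictional entity (assumed)
    "Q14897293": ["Q386724"],  # fictional entity -> work (assumed)
}


def ancestors(cls, edges):
    """Collect every class reachable from cls by following subclass-of edges."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def allowed(value_class, required, excluded=frozenset(), edges=SUBCLASS_OF):
    """Pass if value_class is (a subclass of) required and hits no excluded class."""
    lineage = ancestors(value_class, edges) | {value_class}
    return required in lineage and not (lineage & set(excluded))


print(allowed("Q95074", "Q386724"))                            # plain value-type check
print(allowed("Q95074", "Q386724", excluded={"Q14897293"}))    # with an exclusion
```

With the plain check the fictional character passes because "work" is among its ancestors; adding an exclusion for fictional entity (Q14897293), as the original question proposes, would make the same value fail.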

Performance of tools is abysmal and is often denied

What are the plans to get back to a situation where the tools that editors use, like awarder and QuickStatements, perform acceptably? As it is, the performance is arguably so bad that I no longer consider Wikidata to be performing at an acceptable level. What are the plans, and do they account for the pent-up demand that is currently denied? Thanks, GerardM  – The preceding unsigned comment was added by GerardM (talk • contribs) at 20:20, 8 February 2020 (UTC).

The current bottleneck is the Wikidata Query Service, and they wrote about current plans yesterday in a posting on the wikidata mailing list (the author is one of the team members of the responsible WMF department). The short term plan is to optimize the software of the current setup, as they have identified some pretty expensive processes there which could potentially be streamlined. A mid-term plan is to restrict WDQS more than currently (we won't like this for sure). —MisterSynergy (talk) 22:29, 8 February 2020 (UTC)
This email is a technicians view. While valid, the question then becomes what the position is of the Wikimedia Foundation. I have written a blogpost where I ask the WMF to value projects other than English Wikipedia. GerardM (talk) 11:49, 9 February 2020 (UTC)
WMF is not equipped to have a position on particular tools like QuickStatements. WMDE sets the priority about how Wikidata gets developed.
There are ethical reasons for using money that's raised with banners that ask people to help Wikipedia at least partly to improve Wikipedia. It's alright when the WMF uses some of its funds for other purposes, but if you call for the majority of the donor money to be used in a way that's distinct from what the donors wanted to support, that seems problematic. ChristianKl 14:27, 9 February 2020 (UTC)
This is not about the use of tools. It is about the prospects available to the project- the future of Wikidata. As it is the usefulness of Wikidata is severely impacted by its performance.
The notion that the WMF is beholden to anyone because it raises funding does in my mind not register at all in this. Thanks, GerardM (talk) 17:00, 13 February 2020 (UTC)

Colour property for films

Santo contra el cerebro del mal (Q130087) has Property:P462 colour set to "colour" - is this the right property? It seems like a different thing from a field which can be, say, blue, green, red. All the best: Rich Farmbrough 14:40, 13 February 2020 (UTC).

According to Wikidata:WikiProject Movies/Properties#Other properties, yes, right thing. --Tagishsimon (talk) 14:45, 13 February 2020 (UTC)

Many new Wikidata Tours ready to be published but a software bug is stopping it :(

Hi all

I've spent the past few weeks working on creating many new Wikidata Tours which cover almost all the basics of Wikidata and some common specific tasks like adding coordinates. All the text is now written and Nav Evans has kindly agreed to do all the technical parts. However, the tour software has some bugs which mean the Tours won't work properly :(

I'd really appreciate your help (will take 2 mins):

https://phabricator.wikimedia.org/T244994

  • Non Programmers: subscribe to the task to let people know having functional tours is important
  • Programmers: Have a look at the issue and see if you can contribute

Thanks very much

--John Cummings (talk) 13:23, 12 February 2020 (UTC)

From the task it isn't really clear what needs fixing. The task mentions two or three script files which may need unification, and also some issues with each.
If you have successfully tested everything on test.wikidata, then perhaps just make {{Edit request}}'s at respective files, so that local (interface) admins know what needs updating.
Anyway, thanks for working on this debt. --Matěj Suchánek (talk) 15:09, 12 February 2020 (UTC)
Has the Recent Changes issue been addressed? Cheers, Bovlb (talk) 20:43, 13 February 2020 (UTC)

Can location be a vehicle, ship, airplane or similar?

What is the "most specific known" location if a person is born or died during transportation in an ambulance, other vehicle, ship, airplane etc.? Should it be the continent, country, highway, ocean or whatever is known about where the vehicle etc. was, or the name or type of the vehicle etc.? Or is there a way to state both? --Dipsacus fullonum (talk) 02:58, 13 February 2020 (UTC)

Specific vehicles seem fine as values for location but the type of vehicle is not a valid value. ChristianKl 08:44, 13 February 2020 (UTC)
Certainly that specific vehicle was at a specific location when "it" happened? --SCIdude (talk) 14:39, 13 February 2020 (UTC) PS. I just want to prevent people from adding "hospital bed" which can be considered a vehicle...
I am not sure if knowing that someone was born in an ambulance is any more or less valuable than knowing they were born in a hospital bed or a sofa or a grassy knoll. The actual location--'X highway', 'X hospital', 'X other place'--would seem fundamentally different and more important than the type of furniture/vehicle/etc. they were born in. Of course if the individual vehicle/room/etc. is known (e.g. the ambulance in which a famous person was born which is now on museum display), then sure, that makes sense to list. Aircraft and ships might be a complication since depending on the flag nationality of the vehicle, there may be legal ramifications independent of the geographic location of the vessel, but again this would require more than the general type of vehicle be listed to make sense. Josh Baumgartner (talk) 21:05, 13 February 2020 (UTC)
The reason for my question is the death of Søren Christensen (Q46730224). He died in an air ambulance (helicopter) while it flew between two Danish hospitals (from Aalborg Hospital (Q2757508) to Rigshospitalet (Q3357360)). It could be somewhere over Denmark or over the sea. What should the location of death be stated as? --Dipsacus fullonum (talk) 23:37, 13 February 2020 (UTC)
Sounds like we may want a way to model a location that means "In transit from [A] to [B]". - Jmabel (talk) 00:01, 14 February 2020 (UTC)
There's the location at sea (Q55438959) which can be used for death on a ship on an unknown sea. I think there may be a similar concept for "death in the air", but I can't find anything. If you know the place where the plane was flying over, would you record the death at that location? Then you can just use the most specific location, such as "Denmark", "Europe", "Northern Hemisphere". Ghouston (talk) 01:27, 14 February 2020 (UTC)

Incorrect genealogy

Following on somewhat from this earlier discussion ten days ago (Wikidata:Project_chat/Archive/2020/02#Wildly_variant_genealogies?), again: how to represent genealogy which is wrong, but found in a popular source, e.g. The Peerage person ID (P4638)? It is desirable, I think, to represent the wrong genealogy here, so we can mark it as wrong (or at least probably wrong), so that people working with the source that is incorrect can understand what we have done here, and that we do know about it, but haven't followed it.

To make things concrete, here's a case study: the genealogy of the first four people to be Viscount Wenman. I've done my best, but some of the items do seem a bit messy now as a result. I'd welcome any thoughts as to what might be improved.

According to current thinking (and in particular the History of Parliament ID (P1614) biogs, which tend to be of a very high quality), the sequence of relationships went as follows:

  0. Thomas Wenman (Q26790736) (c.1548-77) -- HoP
    m. Jane West (Q76172650) (d. about 1606)
  1. Richard Wenman, 1st Viscount Wenman (Q7329888) (1573-1640) -- HoP / HoP
    Son of #0
  2. Sir Thomas Wenman, 2nd Viscount Wenman (Q7794988) (1596-1665) -- HoP / HoP
    Son of #1.
  3. Philip Wenman (Q76151947) (d.1686)
    Younger son of #1, brother of #2
  4. Richard Wenman, 4th Viscount Wenman (Q7329889) (1657-1690) -- HoP
    Son of Mary Wenman (Q76265422) (m. 1651; d. 1657; see HoP), who was a daughter of #2.

However, the entries for the above people in The Peerage (Q21401824) follow a different genealogy, essentially equivalent to the one given at the bottom of this page from Burke's Extinct and Dormant Baronetcies (1844).

According to TP,

The TP version is clearly wrong:

  • #1's date of birth, and the identity and date of death of his father are presumably established by what sounds like a considerable paper trail alluded to in [6] following his father's death.
  • And if #3 died in 1686 and #1's father died in 1577, then there is no way that #3 can have been #1's brother. #3 was therefore surely #1's son; and #3's uncle, the Thomas that died in Ireland in the 1630s and remembered #3 in his will, must therefore have been #1's brother not his uncle.
  • As for Mary's father, he mathematically might have been a brother of #1; but a member of #2's generation fits much better. Given HoP's definiteness that her father was #2 (eg in [7] and elsewhere), it may well be that Mary and her husband were mentioned in #2's will, or some other of the documents, settling the point.
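The date argument in the second bullet comes down to simple arithmetic, which can be sketched in Python. This is only a rough plausibility check using the years quoted above; the function name, the one-year allowance for a posthumous birth, and the 100-year lifespan cap are illustrative assumptions, not anything taken from the sources.

```python
def could_be_child_of(parent_death_year: int, child_death_year: int,
                      max_lifespan: int = 100) -> bool:
    """Rough plausibility test for a parent/child link.

    Assumption: a child is born no later than one year after the
    parent's death, so the child's age at death is at least
    child_death_year - (parent_death_year + 1).
    """
    latest_birth_year = parent_death_year + 1
    minimum_age_at_death = child_death_year - latest_birth_year
    return minimum_age_at_death <= max_lifespan


# Years taken from the discussion above.
FATHER_OF_1_DIED = 1577   # Thomas Wenman (#0)
NUMBER_1_DIED = 1640      # Richard, 1st Viscount (#1)
NUMBER_3_DIED = 1686      # Philip Wenman (#3)

# #3 as a brother of #1 (i.e. a son of #0): needs an age of 108 or more.
print(could_be_child_of(FATHER_OF_1_DIED, NUMBER_3_DIED))

# #3 as a son of #1: needs an age of only 45 or more.
print(could_be_child_of(NUMBER_1_DIED, NUMBER_3_DIED))
```

Under these assumptions the first check fails and the second passes, which matches the conclusion that #3 was surely #1's son rather than his brother.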

So how to deal with the TP version on Wikidata?

But it does lead to quite a mess of deprecated statements, especially on Sir Richard Wenman (Q76115470) and Thomas Wenman (Q76265423), but also on some of the other items. Pinging @Andrew Gray, Salgo60, Tagishsimon, Pigsonthewing, GZWDer: and anyone else who's been working in this sort of area for your thoughts. Is this appropriate? What can be improved on?

Thanks, Jheald (talk) 20:05, 13 February 2020 (UTC)

For what it's worth, the DNB (s:Wenman, Thomas (1596-1665) (DNB00)) seems to have avoided the glitches in Burke's, at least for this family, as long ago as 1899. Jheald (talk) 20:39, 13 February 2020 (UTC)
Ping @Bvatant, Lesko987a: to people over in Wikidata land.... I don't think Wikidata should dig too deep into the land of unsure genealogy and research, is my feeling.... I guess we need better tools for describing hypotheses in Wikidata. I like the software Evidentia, which tries to follow en:Genealogical Proof Standard. Wikidata is good for sources we trust, not for questioning the relevance of sources - Salgo60 (talk) 21:28, 13 February 2020 (UTC)
When it comes to labeling things as unsure, see Draft:New_Ranks for a possible way to mark claims where the sources aren't strong. ChristianKl 22:10, 13 February 2020 (UTC)
@ChristianKl: Well yes, but the community doesn't seem to be going for it. Jheald (talk) 22:12, 13 February 2020 (UTC)
@Jheald: I don't think strong support is necessary to discuss how the proposal interacts with various challenges. At the moment it's a draft. Maybe I will bin it, and maybe I will put it up sooner or later as an RfC to see what the community wants. ChristianKl 22:23, 13 February 2020 (UTC)
@Salgo60: Not really the issue I was bringing up. Here I was wondering how to deal with something that is reliably wrong -- a source that is known for (occasional) weakness, that here is definitively contradicted by much stronger sources; the question being how best to acknowledge what the popular but incorrect source says, but nevertheless mark it as not correct.
The question of how to mark hypotheses, even strong ones, that all the same could use some more explicit and conclusive sources (eg how to indicate "citation needed"), is a different one, though also very relevant to genealogy. For example, in the same family, I know that Penelope Wenman (Q75785757) father (P22) Sir Richard Wenman (Q76115470) from ThePeerage is wrong. I think Penelope Wenman (Q75785757) father (P22) Richard Wenman, 1st Viscount Wenman (Q7329888) is the appropriate correction (and is what user trees at familysearch.org and geni.com also suggest). But I don't (yet) have a reference with a hard citation to put it beyond doubt. So how to try to indicate that is also a good question, but not really the one I was getting at above. Jheald (talk) 22:09, 13 February 2020 (UTC)

Conservation status (of monuments)

WikiProject Cultural heritage has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

conservation state (Q55553838) seems to be used for two different concepts - there are a handful of statuses (abandoned, restored, etc.) and hundreds of crashes, sunken ships, etc. Does anyone know what's happening here? Was there a bad merge or something? - PKM (talk) 21:07, 13 February 2020 (UTC)

@PKM: Interesting. The set of items that are direct instances of the class seems pretty well controlled ( https://w.wiki/HC9 ), and the values of state of conservation (P5816) also seem pretty well concentrated ( https://w.wiki/HC7 ).
But I think you're looking at this query, for things P31/P279* conservation state (Q55553838), https://w.wiki/HCA which includes all the wrecks, caused by the chain
shipwreck (Q852190) --> subclass of (P279) disaster remains (Q21073029) --> subclass of (P279) conservation state (Q55553838)
Could I think be fixed by changing disaster remains (Q21073029) to instance of (P31) conservation state (Q55553838), but might need to think about that. Jheald (talk) 21:37, 13 February 2020 (UTC)
I have made the change, which I hope now gives something closer to the expected behaviour. disaster remains (Q21073029) could use an additional class something like "remains" -- the kind of general concept we don't yet have an item for, because WP doesn't think such a generality is worth an article, yet is a grouping without which our ontology doesn't really express what things are. General and/or abstract grouping concepts without items -- something we hit again and again.
But what you raised at least now should be fixed. Jheald (talk) 21:50, 13 February 2020 (UTC)
@Jheald: Thank you! Yes, that was the query I was using (the default one on the "information" template). Now I need to think of a good "conservation status" for movable historical monuments that have gone missing (disappeared in "the war", weren't there when a new inventory was ordered, etc.) I'll probably use "missing".
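The effect of the fix discussed above can be illustrated with a small Python sketch of how a P31/P279* lookup walks the class tree. The dictionaries are toy data using the labels from this thread (plus a hypothetical item called "some wreck"); they are not real Wikidata exports, and the helper names are assumptions made for the example.

```python
from collections import deque


def superclasses(cls, subclass_of):
    """Return cls plus every class reachable via subclass of (P279) steps."""
    seen = {cls}
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def matches_p31_p279_star(item, target, instance_of, subclass_of):
    """True if item is an instance (P31) of a class that is P279* target."""
    return any(target in superclasses(cls, subclass_of)
               for cls in instance_of.get(item, ()))


# Before the change: disaster remains is a subclass of conservation state.
instance_of_before = {"some wreck": ["shipwreck"]}  # hypothetical item
subclass_of_before = {
    "shipwreck": ["disaster remains"],
    "disaster remains": ["conservation state"],
}

# After the change: disaster remains is an instance of conservation state.
instance_of_after = {
    "some wreck": ["shipwreck"],
    "disaster remains": ["conservation state"],
}
subclass_of_after = {"shipwreck": ["disaster remains"]}

print(matches_p31_p279_star("some wreck", "conservation state",
                            instance_of_before, subclass_of_before))
print(matches_p31_p279_star("some wreck", "conservation state",
                            instance_of_after, subclass_of_after))
```

In the toy data the wreck matches before the change and no longer matches afterwards, while "disaster remains" itself still shows up as a direct instance of the class.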

About a public nickname statement

Hi all,

the statement that Christian Giudicelli (Q1079854) was nicknamed "Eight One One" by Gabriel Matzneff (Q3093872) has been recently reverted, despite:

  1. it is supported by reliable sources - and very recently, by nationwide magazines such as L'Express ([8]) and The New York Times ([9]);
  2. as reported by L'Express, it is publicly admitted by Giudicelli himself, who wrote in Florent Georgesco (Q51884467) (ed.), Gabriel Matzneff, Le Sandre, 2010: "Durant notre premier séjour à l'hôtel Tropicana, lui habitait la chambre 804 [Eight o four] et moi la 811 [Eight one one] : ainsi, en bavardant, avons-nous pris l'habitude de nous désigner plutôt que par nos prénoms et [...] mon cher Eight o four prend soin de dissimuler son cher Christian sous l'aile protectrice d'Eight one one : un tour de passe-passe qui n'abuse plus depuis longtemps ses fidèles lecteurs." ("During our first stay at the Tropicana Hotel, he had room 804 [Eight o four] and I had 811 [Eight one one]: so, while chatting, we got into the habit of calling each other by these rather than by our first names and [...] my dear Eight o four takes care to hide his dear Christian under the protective wing of Eight one one: a sleight of hand that has long since ceased to fool his faithful readers.")

So what do you think about that? Thanks, 86.193.172.227 17:02, 13 February 2020 (UTC)

See fr:Discussion:Christian_Giudicelli, the editor is claiming that he or she is acting on a request by Christian Giudicelli. If this is true, do we retain "hostile" nicknames? Ghouston (talk) 00:53, 14 February 2020 (UTC)
Or nicknames used briefly while staying in a hotel, or between friends, that have no wider significance? Ghouston (talk) 00:57, 14 February 2020 (UTC)
Hi Ghouston. Please consider that:
  1. This is not a hostile nickname: Matzneff is a close friend of Giudicelli.
  2. It was used neither "briefly" nor only "in a hotel", but also in many books of Giudicelli and Matzneff; see L'Express: "dans plusieurs de leurs romans respectifs" ("in several of their respective books"); or Le Monde: "Dans les livres [de Matzneff], il figure souvent sous le surnom « Eight One One » [...], comme un numéro de chambre d’hôtel." ("In [Matzneff]'s books, he is frequently mentioned under the nickname Eight One One [...], like a hotel room number"; my emphasis). It can be checked on Google Books: Boulevard Saint-Germain and Voici venir le fiancé (2006); Les Demoiselles du Taranne (2007); Carnets noirs (2009); dedication of Les Nouveaux Émiles de Gab la Rafale (2013)
Best, 86.193.172.227
Fair enough, but it seems like a nickname used only by one person. If I understand the context, Gabriel Matzneff has been accused of sex crimes and Giudicelli is apparently trying to reduce the association. Ghouston (talk) 07:38, 14 February 2020 (UTC)
Correct. 86.193.172.227

Merge ?

St. John (Q7593634) vs St John (Q65953328) ? Jheald (talk) 23:20, 13 February 2020 (UTC)

See w:St_John_(name). --- Jura 00:48, 14 February 2020 (UTC)

How about the case of items where all the properties, including the image, are identical with the only exception of inventory number (P217). It is unclear if it is one object entered twice into institution database, or two objects with one having wrong image. See Kaiumers, the First King of Persia (Q64538958), Kaiumers, the First King of Persia (Q64538960); or Sultan Murad III receiving a book (Q64537998) and Sultan Murad III receiving a book (Q64538040). --Jarekt (talk) 04:55, 14 February 2020 (UTC)

It could be separate entries for each side of an object?
Occasionally, I add possibly invalid entry requiring further references (Q35779580) to P31 with preferred rank when I come across what seems to be database artifacts (but that's something different from the two items mentioned initially). --- Jura 08:46, 14 February 2020 (UTC)
I like possibly invalid entry requiring further references (Q35779580). It seems like a handy way of tagging problem items, especially if they lack references and URLs. As for the Q64538958/Q64538960 and Q64537998/Q64538040 pairs, we have the source but the source is unclear or wrong. It is not uncommon to have an item based on a single record in some institution database, which can give us inventory number (P217) but not much else. --Jarekt (talk) 16:44, 14 February 2020 (UTC)

Most common values for a given property

Is there a way to figure out what the most common values for depicts Iconclass notation (P1257) are? Thanks! Calliopejen1 (talk) 01:09, 15 February 2020 (UTC)

@Mahir256: thanks!! Calliopejen1 (talk) 01:16, 15 February 2020 (UTC)
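For readers looking for a starting point: one common way to answer this kind of question is a grouped count on the Wikidata Query Service. The Python sketch below only builds the SPARQL text and does not send it anywhere. The helper name and the default limit are assumptions made for illustration, and this is not necessarily the query that was linked in the reply above.

```python
def most_common_values_query(property_id: str, limit: int = 50) -> str:
    """Build a SPARQL query counting how often each value of a property occurs.

    Uses the wdt: prefix, which the Wikidata Query Service predefines
    for "truthy" (best-rank) statements.
    """
    if not (property_id.startswith("P") and property_id[1:].isdigit()):
        raise ValueError(f"not a property ID: {property_id!r}")
    return (
        "SELECT ?value (COUNT(?item) AS ?count) WHERE {\n"
        f"  ?item wdt:{property_id} ?value .\n"
        "}\n"
        "GROUP BY ?value\n"
        "ORDER BY DESC(?count)\n"
        f"LIMIT {limit}"
    )


# depicts Iconclass notation (P1257), as asked above.
print(most_common_values_query("P1257"))
```

The printed text can be pasted into the query service; for very heavily used properties a query of this shape may time out and need narrowing.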

Krolik w sauce

The user "Krolik w sauce" is engaged in Vandalism, bypassing the block, I'm already tired of canceling edits of this vandal. Please cancel all edits and block it.--Ilnur efende (talk) 18:43, 13 February 2020 (UTC)

Krolik w sauce (talk • contribs • logs)
@Ilnur efende: Could you please clarify what you mean by bypassing the block?
This user seems to be making a lot of quickstatements edits. I cannot make much of the Tatar, but is it possible they have conflated the label with the description? Also, I'm wondering what steps you have taken to interact directly with the user. I see their talk page is a redlink. Cheers, Bovlb (talk) 20:13, 13 February 2020 (UTC)

Bovlb, User:Maitsavend and Krolik w sauce (talk • contribs • logs) are the same person. He makes up his own words and adds them. When they are corrected, he reverts the contributions of others. For that he was blocked indefinitely.--Ilnur efende (talk) 15:38, 14 February 2020 (UTC)

Thanks. Here are some discussions about Maitsavend: Wikidata:Administrators'_noticeboard/Archive/2018/01#User:Maitsavend, Wikidata:Administrators'_noticeboard/Archive/2019/02#Maitsavend, Wikidata:Administrators'_noticeboard/Archive/2019/03#Vandalism
@Ymblanter: Do you want to weigh in? The language barrier makes it hard for me to confirm the similarity. Bovlb (talk) 17:48, 14 February 2020 (UTC)
Bovlb, is it possible to cancel the entire contribution of these users to this project? It takes a long time to cancel their contributions manually.--Ilnur efende (talk) 18:02, 14 February 2020 (UTC)
I blocked the user, because this is clearly the same person as the several accounts I previously blocked (mass-changing Tatar labels with only fixing orthography is a good indicator), but I do not know how to mass-revert their edits, there are hundreds of them.--Ymblanter (talk) 20:29, 14 February 2020 (UTC)
I rolled back whatever I could roll back--Ymblanter (talk) 20:50, 14 February 2020 (UTC)

Ymblanter, thanks,--Ilnur efende (talk) 05:14, 15 February 2020 (UTC)

Wikimedia 2030 community discussions: Last week begins

We are entering the last lap of the discussions on the Wikimedia 2030 strategic recommendations. Until next Friday, February 21, you can share your feedback, questions, concerns, and other comments.

In my last 2 messages on this village pump, I described how the recommendations were created, the role of your input, and the next steps. This time, let me describe just one selected recommendation, one that sounds to be particularly close to the activities on wiki: 'Improve User Experience'.

It states that anyone, irrespective of their gender, culture, technological background, or physical and mental abilities, should enjoy a fluid, effective, and positive experience during both the consultation and contribution to knowledge. This recommendation is, among others, about design improvements and the user interface, but also training and support programs, dedicated resources for newcomers, and, what I personally find especially interesting, mechanisms that allow finding peers with specific interests, roles, and objectives along with communication channels to interact and collaborate.

Please comment on the recommendations' talk pages. What do you think about this and other recommendations? Should some points be improved, removed, or added?

If something is not clear, please ping me. I will write back as soon as I can.

SGrabarczuk (WMF) (talk) 14:43, 15 February 2020 (UTC)

I made a clumsy visual representation of the connections between the recommendations (data taken from the sections called Connection to other recommendations). I didn't take into account the cases when a recommendation is connected to all the others. Anyway, I leave making the conclusions to you. In addition, I'm sharing a cloud with the most frequently used words in the recommendations. SGrabarczuk (WMF) (talk) 23:54, 15 February 2020 (UTC)

Cannot save

On The Lookout (Q85244833), I am trying to add a link to w:en:The Lookout (Laura Veirs album) but I keep on getting "Could not save due to an error. The save has failed." Can someone help me? —Justin (koavf)TCM 00:39, 16 February 2020 (UTC)

28th time is a charm apparently. —Justin (koavf)TCM 00:43, 16 February 2020 (UTC)

Coordinates

Hello,

is there a free source for coordinates of a map, where I can look up the coordinates of an address to enter them into Wikidata? OpenStreetMap is licensed under the Open Database License, so it is not possible to use it to get coordinates for Wikidata; I need another source, and I don't know one that I can use here. In the end, in cases like this, CC0 as a license makes it difficult to use information from other databases. -- Hogü-456 (talk) 17:51, 11 February 2020 (UTC)

I use this tool to pin a point on the map manually, to get the coordinates of something, somewhere. Edoderoo (talk) 19:45, 11 February 2020 (UTC)
Is it allowed to use this tool for Wikidata? I am not sure that it is. In the end I would be using OpenStreetMap, and that does not conform with Wikidata because of the license. Ultimately it is a question about the license of geodata, and whether it is possible to get free geodata without a license, i.e. in the public domain. I don't know of any geodata licensed that way. If there is no way to use coordinates from a map in Wikidata in conformance with the license, then I think there should be discussions about the license of coordinates. I don't want to get into problems, so it would be good to get an answer on what I am allowed to do with coordinates from a map and what not. -- Hogü-456 (talk) 20:32, 11 February 2020 (UTC)
Yes, it is allowed, why should it *not* be allowed? Edoderoo (talk) 19:19, 16 February 2020 (UTC)

Inaugural lecture

Does anyone have any experience with modeling the 'inaugural lecture' of a university teacher? First of all: what would be the appropriate P-number/property to use for 'inaugural lecture'? (It would be okay if I could put a string in it, but ideally it would use a normal Wikidata value/Q number.) Though I could use 'inaugural lecture: name of lecture' as a qualifier either for the chair that is held by the university teacher or the position accepted at the university, my preference is to be able to model the inaugural lecture as its own statement that includes its own qualifiers. I am simply wondering whether I haven't been able to find the property for inaugural lecture or whether it isn't available yet. If the latter, I will ask for the correct property to be created. Thanks for your feedback, Ecritures (talk) 12:12, 15 February 2020 (UTC)

@Ecritures: Are you trying to create an item about the lecture, or simply to add details of it to the item about the person? For the latter you could use significant event (P793). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:07, 15 February 2020 (UTC)
The latter indeed. Yeah I suppose that could be the way I can model it. Somehow it feels suboptimal but it will do. Thanks for the input! Ecritures (talk) 14:52, 15 February 2020 (UTC)
Maybe "notable work", but then they aren't necessarily .. I'd rather make an item about the lecture and link it to its author/speaker. --- Jura 15:02, 15 February 2020 (UTC)
@Ymblanter: you probably have input on this? Multichill (talk) 19:40, 16 February 2020 (UTC)
I would support "significant event". Lectures are sometimes being published afterwards, but not always, so it is not "work". Do we add info on say TED talks? Inaugural lecture should be probably described by the same property.--Ymblanter (talk) 19:44, 16 February 2020 (UTC)

US counties

There is now a dashboard at Wikidata:Lists/US counties/dashboard.

It still needs some tweaking to handle co-extensive cities. --- Jura 18:00, 5 February 2020 (UTC)

  • Looks like some items (and Wikipedia articles) missed the updates since creation: e.g. name and area changes that come with re-organization. --- Jura 05:48, 17 February 2020 (UTC)

dead links - items with broken references

The data item about Sue Gardner (Q7524) has at least two references that are broken (date of birth). At enwiki such links are tagged with {deadlink}. What is the practice here? Thanks in advance , Ottawahitech (talk) 02:32, 17 February 2020 (UTC)

Sue Gardner (Q7524). Ideally add archive URL (P1065), but in this case, the links don't seem to have been archived. If it was a main statement it could be deprecated, or given qualifiers, but I'm not sure what to do when it's already a qualifier. Ghouston (talk) 06:12, 17 February 2020 (UTC)

Which country is Puerto Rico in?

.. and then, which values to use for country (P17) and located in the administrative territorial entity (P131)?

There are some lengthy comments by User:The Eloquent Peasant on various talk pages of items about places in Puerto Rico and a few other non-US states, such as United States Virgin Islands (Q11703), Guam (Q16635): Talk:Q44547, Talk:Q11703, Talk:Q16635.

The arguments for not using Q30 seem to be that:

  • It's not a state of the US,
  • it's not part of the continental US,
  • some infobox at some Wikipedia displays it in a way not liked by some users,
  • the statements were added by an IP in 2013,
  • and/or, some US politicians think it's not in the US.

I don't quite see how any of this is relevant. --- Jura 07:36, 3 February 2020 (UTC)

The common approach is to use US for country (P17) and nothing for located in the administrative territorial entity (P131) because it's not in the U.S. .. Located in the U.S. are 50 states.--The Eloquent Peasant (talk) 11:54, 3 February 2020 (UTC)
However, in the case of the Arecibo Observatory, if we place United States in the country field, then the infobox incorrectly displays that Arecibo Observatory is located in the U.S. which it is not. Sorry, I should not have said common. What does Q30 mean? I'm not very familiar with wikidata terms but I do know what is in and not in the U.S. Thank you.--The Eloquent Peasant (talk) 12:05, 3 February 2020 (UTC)
As mentioned above, I don't think argument #3 is relevant to Wikidata. Q30 is linked further up. --- Jura 12:07, 3 February 2020 (UTC)
Regarding the located in the administrative territorial entity (P131) "Located in administrative...entity", please show me a map, an official map of the U.S. that shows that Guam, P.R. The US Virgin Islands, etc. are in the U.S. Something is either in the U.S. or it is not in. This is not a controversial topic. I fail to see how this is difficult to understand.--The Eloquent Peasant (talk) 12:11, 3 February 2020 (UTC)
Can you clarify what you mean with "U.S."? Argument #1 isn't a reason not to use Q30 in Wikidata. --- Jura 12:15, 3 February 2020 (UTC)
Argument #1 is a reason not to use Q30 in located in the administrative territorial entity (P131) . If you are in your house-you are in your house. If your left foot is in your house and your right foot is outside your house you are both in and out. However, in the case of these territories they are not in the U.S. so P131 should not be populated with Q30.The Eloquent Peasant (talk) 12:20, 3 February 2020 (UTC)
Regarding your comment "I think the "common" approach since 2013 was to use Q30 in P131." This was done by someone who used an algorithm to do it without giving much other thought to what he was doing with the algorithm and many editors called him out for introducing errors into wikidata, in 2013. The U.S. territories slipped through the cracks at that time. Because it's been there since 2013 doesn't make it right. Many errors exist in wikidata and wikipedia because editors don't notice them until later. So every wiki different language article after that assumed these territories are in the U.S.The Eloquent Peasant (talk) 12:25, 3 February 2020 (UTC)
So your argument is that "US territories" should not use P131=Q30. --- Jura 12:31, 3 February 2020 (UTC)
Washington DC is not inside one of the 50 states of the US either. The models we use here at Wikidata can never fully describe all the fine details in all relations. We have to accept that it sometimes is a little rough, and cannot fully describe the truth. I can accept both that Puerto Rico is described as located in the US and that it is not. But we cannot have one model for some parts of Puerto Rico and another model for other parts. We have many territories like this in the world. We maybe cannot even have the same model for Puerto Rico as for Greenland as for New Caledonia. But inside all of these territories we have to accept a common model. 62 etc (talk) 13:01, 3 February 2020 (UTC)
  • And that shows that the term "country" is poorly defined. In my language we do not use the same word (ie country) to describe Wales and UK. The country of Wales and the country of UK are not the same kind of entity. We do not even translate a "county" in England the same way as a "county" in US. 62 etc (talk) 17:07, 3 February 2020 (UTC)
So we should acknowledge that Wikidata has this flaw, and maybe attempt to correct it. I don't know how much coding is involved but because the entire world looks to wikipedia / wikidata for accurate information, I think we should try to get a fact such as whether a place is in a country or not in a country correct. --The Eloquent Peasant (talk) 17:42, 3 February 2020 (UTC)
What is a country? "A country can be part of a larger state" according to our very own wikipedia here, P.R. would be a country, part of (but not in) a larger state, the U.S. --The Eloquent Peasant (talk) 17:48, 3 February 2020 (UTC)
located in the administrative territorial entity (P131) is itself a bit of a strange concept, combining geographical location with administrative control. We have Puerto Rico at some level controlled by the United States (government), even if it may or may not be part of the United States geographically, depending on how we are defining "United States". Ghouston (talk) 22:23, 3 February 2020 (UTC)
@Jura1: I was confused and wondering why you made this change. Isn't this exactly what we are discussing here. Is Guam in the United States? What does in mean? What does located in the administrative territorial entity (P131) mean? --The Eloquent Peasant (talk) 20:43, 4 February 2020 (UTC)
It seems your deletion isn't supported, but, if the conclusion ends up being that it should be removed, we will do so. If you need a reference for Guam being a territory of the US, I can add one. --- Jura 20:46, 4 February 2020 (UTC)
located in the administrative territorial entity (P131) is about "administrative territorial entities", which I suppose are arbitrary areas administered by a government body. They won't necessarily have any connection to geographic entities. Puerto Rico doesn't have much geographical connection to Guam, but in international politics they are still considered possessions of the same state. Ghouston (talk) 23:07, 4 February 2020 (UTC)
@Jura1: I believe adding located in the administrative territorial entity (P131) = (U.S.) to Guam and PR. and other territories is wrong but I also see that from the beginning you don't care and will do whatever you want. I don't need a ref to say they're territories. I've never disputed that fact. The fact that they are territories is covered in other wikidata properties. So you're basically saying they're in the U.S. with located in the administrative territorial entity (P131) and that is incorrect--The Eloquent Peasant (talk) 23:57, 4 February 2020 (UTC)
Because you are defining U.S. to mean something other than all of the territory coming under the sovereignty of the U.S. government. It will also make a difference when calculating things like population and land area. en:United States says "The United States of America (USA), commonly known as the United States (U.S. or US) or America, is a country comprising 50 states, a federal district, five major self-governing territories, and various possessions." Ghouston (talk) 00:47, 5 February 2020 (UTC)
The territories are "of" but not "in" the U.S. Adding located in the administrative territorial entity (P131) = U.S. to the wikidata item of the territories will state they are in the U.S. when they are not. The property is "Wikidata property to indicate a location" --The Eloquent Peasant (talk) 02:33, 5 February 2020 (UTC)
Personally, I don't care either way, but I think things should be consistent, i.e., how Wikipedia defines it, the population counts and area for the US on Wikipedia and Wikidata, the P17 and P131 statements, should all match. In principle, you can create two or more items on Wikidata for the US, with different definitions, but it would be incredibly confusing. We do have contiguous United States (Q578170), but this is something else. Ghouston (talk) 04:34, 5 February 2020 (UTC)
It would seem germane that anyone born in Puerto Rico is a U.S. citizen. Not sort of a U.S. citizen, with some weird document. They are exactly as much a U.S. citizen as if they had been born on Boston or Chicago. - Jmabel (talk) 05:12, 5 February 2020 (UTC)
Do not look too deep into such words as "in", "of", "on" and "at". Their meaning seldom survives a translation. In fact, it is one of the most difficult things to manage when you are learning a new language. At least when the languages are closely related. 62 etc (talk) 07:13, 5 February 2020 (UTC)
First of all, some background reading: w:Dependent territory.
This stuff is complicated. (IIRC, I brought it up during the discussions leading up to the ongoing "Countries, subdivisions, and disputed territories" RFC, but left it out of the RFC itself because it added complications that could be left for later.) Some countries claim certain territories as their own while saying that the territory is not "part of" the country, making a distinction between all areas governed by the country and the area of the country "proper". This distinction is often applied to various legal things, and is inconsistent between countries. Similar complicated things: What are protectorates, tributary states, dominions, associated states, vassal states, puppet states, colonies, etc? There are many different levels of association an entity can have with a country in power over it. We need a broad and consistent solution for how to represent the levels of association. Current uses of country (P17) and located in the administrative territorial entity (P131) are ambiguous on this. --Yair rand (talk) 22:45, 5 February 2020 (UTC)
In this instance, we only have to solve the problem for the US territories, not come up with a general theory for every country at every point in history. There can be three answers: 1) they are part of the US (as we define it here) 2) they are not part of the US 3) they are variously part of or not part of, on different items or at different times depending on the whim of whoever edited it last. Option 3) will apply by default unless otherwise decided. Ghouston (talk) 00:50, 6 February 2020 (UTC)
There are just three involved (Puerto Rico, Guam, USVI). --- Jura 05:29, 6 February 2020 (UTC)
Also American Samoa and Northern Mariana. And some others like Guantanamo Bay, or in the past Panama Canal, in a completely uncertain status.--Ymblanter (talk) 13:09, 6 February 2020 (UTC)
Guantanamo Bay is interesting in that Cuba retains sovereignty, and should probably have country (P17), but maybe the located in the administrative territorial entity (P131) chain wouldn't lead to Cuba. At present though, it's set as part of a Cuban province: Guantanamo Bay Naval Base (Q762570). Should the population of Guantanamo Bay be counted under the US or Cuba? Presumably it's so low that it wouldn't make much difference. Ghouston (talk) 22:15, 6 February 2020 (UTC)
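The practical difference between the two positions in this thread shows up when a located in the administrative territorial entity (P131) chain is followed upward from a place. The Python sketch below uses toy data with plain labels rather than real item data, and it assumes a single P131 value per item, which is a simplification (real items can carry several). The function name and the "Arecibo" entries are illustrative only.

```python
def p131_chain(item, located_in):
    """Follow P131 upward from item and return the visited labels in order."""
    chain = [item]
    seen = {item}
    while chain[-1] in located_in:
        parent = located_in[chain[-1]]
        if parent in seen:  # guard against accidental loops in the data
            break
        seen.add(parent)
        chain.append(parent)
    return chain


# Model A: the territory carries P131 = United States.
model_a = {"Arecibo": "Puerto Rico", "Puerto Rico": "United States"}

# Model B: the territory has no P131 statement at all.
model_b = {"Arecibo": "Puerto Rico"}

print(p131_chain("Arecibo", model_a))
print(p131_chain("Arecibo", model_b))
```

Under model A the chain ends at the United States, so anything that aggregates along P131 (addresses, population or area roll-ups) would include the territory; under model B the chain stops at Puerto Rico and only country (P17) links the place to the US.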

────────────────────────────────────────────────────────────────────────────────────────────────────
The US Census does not include PR in its tally of US population.[1] The April 2010 US population was 308,745,538, not including Puerto Rico's 3,725,789 inhabitants. The way the US Census accounts for American personnel at US foreign bases (Guantanamo, etc) into the total tally of US population is via the Census tally at the state of residence of the individuals in question (they are residents of their individual states not of the US foreign base).

@The Eloquent Peasant: what were you suggesting above with "The property is 'Wikidata property to indicate a location.' " Were you saying that manipulating that parameter holds the answer to satisfactorily address the "in"-versus-"of" concern? Mercy11 (talk) 23:36, 11 February 2020 (UTC)

@Mercy11: Thanks for your comments on US population and clarifying that. The only thing I was saying was that - well see here .. https://www.wikidata.org/wiki/Talk:Q16635 ---- That an editor, in 2013, felt that every location needs to be "in" a country (as a matter of hierarchy) but I know that is not necessarily so for P.R., Guam, the Virgin Islands, etc. Because as we all know there are 50 states in the U.S. and they don't include those territories. I explained that the editor was mistaken when he made that change in 2013 with a bot. I explained, in the talk page, that I removed the parameter from those US territories wikidata items, because if we add it then addresses might also say something like "Arecibo, Puerto Rico, US" and that is incorrect per pretty much everyone's knowledge. Also, I warned that because of that change made in 2013, other language Wikipedia articles created geographic location articles that stated that P.R. is in the U.S. This is the parameter that I feel should not be on US territories = to US ---> https://www.wikidata.org/wiki/Property:P131 = P131 is a property that indicates a location. So it's clear that this property should not be added to US territories because they are not "in" the US. Finally, to answer your specific question I believe the definition of parameter located in the administrative territorial entity (P131) is to indicate a location, thus again should not be added to US territories. (Sorry to sound like a broken record) One editor mentioned not to get too hung up on "in" or "on" or "by" or "of".. (prepositions) but I think that something as simple as whether or not a place is in another place should be accurate on Wikipedia so that we don't perpetuate wrong information. I've had people say to me "Oh Puerto Rico is not in the U.S.?".. So that's all. (Obviously the relationship is complicated - It reminds me of the question- are you married? divorced? answer is "it's complicated".)
But I think the P131 parameter is clearly talking about a location (not relationship between US and PR).. but the location of PR is not complicated, and whether a location is in another location is not complicated. That's all. Bottom line "my argument is that "US territories" should not use P131=Q30" (and Q30 means US)--The Eloquent Peasant (talk) 23:56, 11 February 2020 (UTC)
@Jmabel: When you said "anyone born in Puerto Rico is a U.S. citizen...exactly as much a U.S. citizen as if they had been born on Boston or Chicago", how do you see citizenship having a bearing on the issue here? You seemed to be implying that citizenship can be a way to determine the "in" relationship, but I don't think I can agree with that because w:Puerto Rican citizenship explains how people born in Puerto Rico have both "Puerto Rican citizenship" and "US citizenship." But reviewing what happened after PR became a territory "of" the US, people born in PR were only "Citizens of Puerto Rico" and did not automatically become US citizens until an Act of Congress some 20 years later (1917). There is no record that the 1917 grant of citizenship incorporated PR "in"to the United States. The lesson is that citizenship is no more a determinant of PR being "in" the US than, say, Congress passing a Law granting earthquake relief moneys would be a determinant that PR is suddenly "in" the US. Seems to me that citizenship, like population, can't be used to determine whether or not PR is "in" the US, or was I missing something in your rationale? Mercy11 (talk) 02:25, 12 February 2020 (UTC)
It still seems like you can argue it either way. en:United States seems to be taking that position that Puerto Rico is part of the USA, for example. "United States (U.S. or US) or America, is a country comprising 50 states, a federal district, five major self-governing territories, and various possessions." and in geography: "the entire United States is approximately 3,800,000 square miles (9,841,955 km2),[218] with the contiguous United States making up 2,959,064 square miles (7,663,940.6 km2) of that. Alaska, separated from the contiguous United States by Canada, is the largest state at 663,268 square miles (1,717,856.2 km2). Hawaii, occupying an archipelago in the central Pacific, southwest of North America, is 10,931 square miles (28,311 km2) in area. The populated territories of Puerto Rico, American Samoa, Guam, Northern Mariana Islands, and U.S. Virgin Islands together cover 9,185 square miles (23,789 km2)". Ghouston (talk) 03:06, 12 February 2020 (UTC):
No one here disagrees that PR is part "of" the USA, and the article you cited also states it is part "of" it. The disagreement is whether or not PR is "in" the USA. The links offered by the various editors above show the problem is one of sometimes articles implying or openly stating PR is part "of" the US and sometimes articles implying or openly stating (with the help of wikidata) that PR is "in" the US (e.g., former version of Arecibo Observatory). The confusion seems to stem from a tendency to equate the two: that being "part of the USA" implies being "in the USA" and vice versa. PR is part "of" the US while at the same time not being "in" the US. The lead paragraph at the English WP US article you cited above is consistent with this, but says nothing about the "in" part. Are these Wikidata constructs (Q30, P17, P131, etc.) supposed to facilitate the work of the WP sister projects or are the sister projects supposed to perform workarounds to facilitate Wikidata's work? Mercy11 (talk) 04:09, 13 February 2020 (UTC)
Doesn't the "administrative territorial entity" of the USA consist of all the territories that are part "of" the USA, or administrated by the USA at some level? Wouldn't every geographical area that's part of the USA then be "in" this territorial entity? Ghouston (talk) 04:55, 13 February 2020 (UTC)
I don't think so. I took a snapshot of the also known as (multiple descriptions - please see attached image) of the parameter in question. And I went ahead and tabulated / summed the population as seen in the source provided by Mercy11. Total US Population 203,211,926 which can be seen at the top of US population in 1970 did not include Puerto Rico's population of 2,712,033 of 1970 (which is listed on the last line of same source). I just added them to check. PR's pop is listed but not included in the US total pop. --The Eloquent Peasant (talk) 11:42, 13 February 2020 (UTC)
Parameter 131 in question. I see the parameter as defining a place that is "in" another place not "of" or "belonging to"
So the Spanish national census agency is still doing the census in PR? --- Jura 12:02, 13 February 2020 (UTC)
Whoa- don't go getting salty --The Eloquent Peasant (talk) 12:37, 13 February 2020 (UTC)
@Ghouston: As I believe I read someone state above, the term country seems to be poorly defined. Like beauty, this too seems to be in the eyes of the beholder. Mercy11 (talk) 03:13, 15 February 2020 (UTC)
Sure. I don't know how this can be resolved: maybe somebody can be appointed to toss a coin, or we could have a vote? Otherwise, we will never know one way or the other. Ghouston (talk) 03:18, 15 February 2020 (UTC)
It's not a matter of a coin toss. @Jura1: Do you think the US territories are in the US? If you do, please share sources that say the US territories are in the US. Have you ever seen a map of the US? We are talking about Parameter 131, a parameter that I did not create but that was created and defined by someone on the wikidata project. The definition or "also known as" states "in" more than a dozen times and this is the most ridiculous thing at this point because you're not listening. You just want to win an argument, and this was your position from the beginning - from the moment I updated the wikidata items for US territories and added my "lengthy explanation" that you defined as "not relevant". Well it is relevant and it is important to get it right here.--The Eloquent Peasant (talk) 12:39, 15 February 2020 (UTC)
@Mercy11: When you say the territories are of the US, I think you mean belong to the US. I can have a dog or a kitty cat that belong to me but they are unINcorporated into my household, so they stay outside all the time. The territories belong to the US but again are not in the US, not INcorporated. Scholars from Yale talk about the issue here, using the term belong to.[2] --The Eloquent Peasant (talk) 12:00, 16 February 2020 (UTC)
The "in" in "incorporated" is the Latin prefix, not the English word. "please show me a map, an official map of the U.S. that shows that Guam, P.R. The US Virgin Islands, etc. are in the U.S.": Here is one by the NOAA. But more generally the property asks for the administrative territorial entity the item belongs to. From what I understand of w:en:Territories of the United States, Guam, Puerto Rico, etc. are part of the territory of the United States. -Ash Crow (talk) 22:14, 16 February 2020 (UTC)
Hi. The located in the administrative territorial entity (P131) in question is defined as places that are in other places (if you see the screenshot attached you can see what I mean) and the map you shared is a natural map which includes the "U.S. Caribbean region (in Spanish: El Caribe estadounidense) is a term used by the National Oceanic and Atmospheric Administration (NOAA) to refer to the waters belonging to the United States in the Caribbean Sea.[3] NOAA maps it as a natural region of the United States, located in the Caribbean Sea, made up of federal waters in and around Puerto Rico, the US Virgin Islands, Navassa Island, and the Guantánamo Bay Naval Base. Serranilla Bank, and inhabited island, and Bajo Nuevo Bank, which are currently controlled by Colombia but claimed by the United States, are sometimes included in the region by NOAA. The U.S. Caribbean region is a natural region and not a political or administrative region." These locations are not administratively located in the US. Have a great week.[4]  – The preceding unsigned comment was added by The Eloquent Peasant (talk • contribs).
Thank you. I think another field would be useful. A new field / parameter could explain something to this effect. --> The U.S. Secretary of Interior "Carries out Responsibilities for the U.S. Insular Areas" even though P.R. is not on this doi.gov site's list of public laws.[5] --The Eloquent Peasant (talk) 12:12, 17 February 2020 (UTC)
I'd also like to invite you all to read what the Wikiproject Puerto Rico team believes happens often regarding the editing of Puerto Rico articles, here in a 2014 Wikipedia Signpost.--The Eloquent Peasant (talk) 16:53, 17 February 2020 (UTC)
  1. 2010 US Census
  2. https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=6444&context=faculty_scholarship
  3. Delgado, Patricia; Stedman, Susan-Marie (2004). La región del Caribe Estadounidense: humedales y peces, una conexión vital. Silver Spring, MD: Administración Nacional de los Océanos y la Atmósfera (NOAA), Oficina de Pesquerías de NOAA, División de Conservación de Habitáculo – via Google Books.
  4. Delgado. La región del Caribe Estadounidense: humedales y peces, una conexión vital. https://www.biodiversitylibrary.org/bibliography/62466
  5. https://www.doi.gov/oia/budget/authorities-public-law

Please help. --Kusurija (talk) 12:58, 10 February 2020 (UTC)

Vyžuona (Q12678613): river in n Lithuania, tributary of Šventoji --- Jura 13:01, 10 February 2020 (UTC)
Thank you very much. --Kusurija (talk) 13:07, 10 February 2020 (UTC)


It seems that there are actually 5 items with the same label:

All got expanded over the last days.

@Kusurija: --- Jura 06:01, 17 February 2020 (UTC)

Thank you. --Kusurija (talk) 08:25, 17 February 2020 (UTC)

Misspelling of a moth name

I'm not sure how to fix it, so I'll post it here. Please see Talk:Q13393150. SchreiberBike (talk) 04:00, 16 February 2020 (UTC)

Wikidata_talk:WikiProject_Taxonomy might help. --- Jura 07:41, 16 February 2020 (UTC)
@Jura1: Thank you. SchreiberBike (talk) 22:47, 17 February 2020 (UTC)

Question about subclass and instance of

A few questions I couldn't find answers to in the docs. Maybe I'm being too pedantic here but I don't get how to model data here.

  • When do you use subclass v. instance of?
    • For example Q11421395 is an instance of a medal but there are lots of copies of the medal. Should it be an instance of a "kind of medal"? Or a subclass of medal?
    • Q868130 is an instance of a softdrink but really it's a soft drink brand not an actual soft drink? Or surely at least it's a subclass of softdrink? Should it be an instance of a brand and a subclass of soft drink?
    • Q12372598 is an instance of "food ingredient" but a subclass of "fruit". Should they both be sub-classes? Also, isn't marking it as a sub-class of "drupe" and "fruit" redundant? Is there value in having both?

Am I over thinking this? Or are there docs describing this I missed? Thanks. BrokenSegue (talk) 04:46, 16 February 2020 (UTC)

I struggle with this at times too and it would be good to have a dedicated help page that runs through things to help an editor decide which is more suitable. --SilentSpike (talk) 11:49, 16 February 2020 (UTC)


Please see Help:Basic membership properties. Joao4669 (talk) 12:32, 16 February 2020 (UTC)

Should caramel ice cream (Q84573720) be an instance or a subclass of ice cream (Q13233)?
Should Bubblegum Squash McFlurry (Q82751912) be an instance or a subclass of McFlurry (Q906754)? --Trade (talk) 18:47, 16 February 2020 (UTC)
Subclass in both cases as they are currently modeled, because you're not talking about a specific instance of a dessert. Now if "McFlurry" were a product model (Q10929058): class of manufactured objects of similar design sold under a specific brand, not a dessert, then any flavor of McFlurry would be an instance. IMHO. - PKM (talk) 19:45, 16 February 2020 (UTC)
As to the original medal question, the first step would be to look if there is a specific project that has a guideline / schema. If not, naively I would agree that United Nations Peace Medal (Q11421395) is an instance of a medal class (there is class of award (Q38033430)), not instance of medal; and making it subclass of medal is correct but should be improved by giving the next-higher category of United Nations Peace Medal (Q11421395) if there is one. --SCIdude (talk) 08:23, 17 February 2020 (UTC)
As to plum (Q12372598) there is the problem that "plum" can have at least two meanings, 1. type of plant (represented by the taxon item Prunus subg. Prunus (Q6401215)); 2. the fruits of (1); and 3. the food ingredient---which is not necessarily (2) because I expect more Prunus species/varieties whose fruits are not eaten, e.g. those varieties that are out of fashion. Now the item in question is (3) plum (Q12372598) and there are several plum varieties, so it is a class of ingredients, a subclass of fruits, and yes, since a drupe is a fruit, the subclass of fruit is redundant. --SCIdude (talk) 08:44, 17 February 2020 (UTC) PS: There is one case when you should not remove the subclass of fruit statement: if it has a better reference than the more specific statement.
It is not redundant - all drupes are fruit (Q1364), but not all are fruit (Q3314483); some drupes are nut (Q3320037), or inedible fruit (Q30312832). Peter James (talk) 12:38, 17 February 2020 (UTC)
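
Editorial sketch of the redundancy question above, in Python. It assumes a tiny hand-written set of subclass-of edges limited to the items named in this thread; the edge list is an illustration, not a dump of the live items, and "drupe" appears without a QID because none is given here. A direct parent counts as redundant only if another direct parent already implies it.

```python
# Hypothetical subclass-of (P279) edges, reduced to the items named in this thread.
# Assumption: plum's "fruit" statement points at the food item fruit (Q3314483),
# while drupe is a subclass of the botanical item fruit (Q1364).
SUBCLASS_OF = {
    "plum (Q12372598)": {"drupe", "fruit (Q3314483)"},
    "drupe": {"fruit (Q1364)"},
}


def superclasses(item, edges):
    """Return every class reachable from item by following subclass-of edges."""
    seen = set()
    stack = [item]
    while stack:
        for parent in edges.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def redundant_parents(item, edges):
    """Return the direct parents already implied by another direct parent."""
    direct = edges.get(item, set())
    implied = set()
    for parent in direct:
        implied |= superclasses(parent, edges)
    return direct & implied
```

Under these assumed edges, `redundant_parents("plum (Q12372598)", SUBCLASS_OF)` is empty, which matches the last reply: the two "fruit" items are different classes. If plum instead pointed at fruit (Q1364), that statement would be flagged as implied by drupe.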

Wikidata weekly summary #403

Translation tag help request for Wikidata Tours

Hi all

Over the past few months @NavinoEvans:, @Alicia Fagerving (WMSE): and myself have been working on fixing the bugs and completing many of the missing Wikidata Tours. We now have a pretty extensive set of new tours ready to go, with more almost ready... but we have an issue... the translation tags on the new version of the main tours page are really broken. Does anyone have experience in fixing mangled translation tags? I spent over an hour today trying to fix it but am still getting loads of errors.... Once we have this complete we can concentrate on churning out new tours and make it much easier for new contributors to learn how to contribute to Wikidata.

https://www.wikidata.org/wiki/User:Alicia_Fagerving_(WMSE)/Sand_box

The two jobs are:

  • Fix the existing translation tags
  • Explain on the talk page how to add new tags in a way that makes sense when we start adding additional tours

Thanks very much indeed

--John Cummings (talk) 22:14, 17 February 2020 (UTC)

Draft for why we need new ranks

I have written a draft for the case for having two new ranks of Uncertain and False. Feel free to comment on the talk page if you have input at the draft stage. ChristianKl 11:27, 7 February 2020 (UTC)

It seems preferable to split them into separate statements, but "false" might not be the optimal term (compare with "erroneous" used also for deprecated). --- Jura 11:42, 7 February 2020 (UTC)
@Jura1: I'm open to other terms than false. "erroneous" doesn't feel to me like an improvement.  – The preceding unsigned comment was added by ChristianKl (talk • contribs).
It wasn't meant as an improvement, just a comparison for being too close. "contested" might do. --- Jura 21:58, 7 February 2020 (UTC)
"Refuted" suggesting the addition of the refutation reference? --SCIdude (talk) 14:34, 8 February 2020 (UTC)
  • Not sure about the description of the VIAF part: neither is this the way these are currently handled nor is deprecated rank currently appropriate for these. --- Jura 12:31, 7 February 2020 (UTC)
I do not like the idea of "new ranks":
  • Ranks as we have them now do deliberately not carry any semantic meaning; they are mere visibility controllers, and we can freely compose a ranking combination for all claims of a given property in an item for very different reasons. The proposed draft intends to change this completely.
  • Even the three current ranks are poorly understood by many editors and often used incorrectly.
  • The more ranks we add, the more complicated it becomes to understand which data is visible in which data retrieval scenario.
  • The more ranks we add, the more likely it becomes that multiple ranks appear applicable at the same time. Which one to choose then?
That said, I do acknowledge that it is currently difficult to annotate statements properly with editorial decisions in a structured manner. We do so by using a combination of qualifiers, references, and rank usage to deal with this shortcoming and it is a mess. This should somehow be improved, but please do not mess up ranks to try this. —MisterSynergy (talk) 12:49, 7 February 2020 (UTC)
In data retrieval scenarios today sometimes a person will make adjustments for qualifiers and sometimes they don't, it doesn't seem clear to me. Can you point to an example where you think the proposed ranks wouldn't leave it clear which rank to use?
I do agree that the current ranks are poorly understood and that suggests to me that the current implementation of them is problematic. Ranks are essentially messed up right now. If it would become more commonplace within Wikidata to make decisions about whether normal or uncertain is the more appropriate rank, this will result in users getting more conscious about ranks. Having ranks more visible in the UI will also help making them more visible. ChristianKl 17:54, 7 February 2020 (UTC)
To paraphrase: ranks are poorly understood, so if we add a new rank they will become better understood. No. --Tagishsimon (talk) 18:38, 7 February 2020 (UTC)
Well, as ranks are visibility controllers, your proposed new ranks would be sort of sub-ranks that control claim visibility in the same way as one of the existing ranks. Rank "false" would actually be like "deprecated-false", and "uncertain" would be "normal-uncertain" (I guess). There would always be uncertainty whether to use the sub-rank or to main-rank, and very soon there would be demand for even more sub-ranks for various purposes. For that reason, I clearly prefer to keep it as "simple" as it is, i.e. continue with pure visibility controllers without any further semantics.
I think we should maybe overhaul Help:Ranking in order to better educate users what ranks actually are; improving visibility should not be that difficult (there is at least a phabricator ticket for it, and we could probably offer a gadget within hours if we wanted to do so). —MisterSynergy (talk) 22:09, 8 February 2020 (UTC)
I think we currently also have "deprecated-uncertain", so it's not a straight subrank.
I agree that we shouldn't add ranks that don't have an effect on visibility. Both of my proposed ranks do affect visibility and this would be a natural barrier towards people proposing additional sub-ranks. ChristianKl 08:06, 11 February 2020 (UTC)
  • one potential use for the "false" rank is to prevent bad data from re-entering wikidata. for example, if some source says the imdb profile id of X is y but I've hand audited that to be mistaken I would like to be able to encode that information in wikidata to prevent a later editor/bot from mistakenly re-adding the wrong imdb profile link. is that something you imagine the rank being used for? BrokenSegue (talk) 15:27, 7 February 2020 (UTC)
  • Opening the floodgate of large amount of uncurated data may be a concern.--GZWDer (talk) 22:52, 7 February 2020 (UTC)
  • I think adding of uncurated data already happens today. When it comes to large-scale adding of data the gate that we have is supposed to be the bot approval process even if you try to circumvent it in cases like the Peerage import. ChristianKl 10:05, 8 February 2020 (UTC)
  • Two thoughts:
1. I think it may make sense to distinguish between statements that we are pretty confident are wrong, and have therefore deprecated from the project, from statements (perhaps suggested by an AI -- and in particular, in Commons Structured Data, suggested by machine vision) that we think might well be right, but we think would benefit from a check or review. We could use deprecation for this latter case, but a specific new type of rank might be more understandable and transparent.
2. At the moment we don't have that many deprecated statements. But if we anticipate the number increasing, I think there is a fair case for reviewing the p:Pxxx prefix in the RDF dump and WDQS, and splitting out deprecated statements from it, to use a new prefix, eg pd:Pxxx.
At the moment, to exclude deprecated statements in SPARQL, you have to do something like the following:
  VALUES ?not_deprecated {wikibase:NormalRank wikibase:PreferredRank}
  ?item p:Pxxx ?stmt .
  ?stmt wikibase:rank ?not_deprecated .
or alternatively
  ?item p:Pxxx ?stmt .
  MINUS {?stmt wikibase:rank wikibase:DeprecatedRank} .
This is inefficient because it requires an extra join. More to the point, nobody ever does this, because it's a pain to include -- even people who are having to use the p:Pxxx form in their queries because they want to extract or filter by a qualifier value. So as a result, deprecated statements get included in their results, which is probably not what was intended. This is bad design. It's not helping people to get the results they are most likely to want.
The alternative, if we introduced a pd:Pxxx prefix for deprecated statements would be that
 ?item p:Pxxx ?stmt
would only return non-deprecated statements (probably the desired behaviour 99% of the time); while
 ?item p:Pxxx|pd:Pxxx ?stmt
could be used whenever deprecated statements were desired to be included -- an easy enough change to make, but requiring the query-writer to explicitly signal this intention.
Yes, this would be a breaking change for a limited number of existing queries and reports. But in my view it is one that would make sense. Jheald (talk) 14:25, 8 February 2020 (UTC)
Breaking change for queries that might already be broken .. I don't usually filter for deprecated rank when accessing qualifiers (which is kind of bad). --- Jura 22:14, 8 February 2020 (UTC)
I had sort of assumed until this point that p/ps was "normal plus preferred", not "normal plus preferred plus deprecated". Ooops. I like the idea of a specific "give me deprecated values" prefix. Andrew Gray (talk) 20:08, 10 February 2020 (UTC)
I support that change. I would guess it will fix more existing queries than it breaks. ChristianKl 08:06, 11 February 2020 (UTC)
@Jheald, Andrew Gray: "nobody ever does this": cough.
"return non-deprecated statements (probably the desired behaviour 99% of the time)": I think that number is way too high. In many cases, I believe the intention will not be “all preferred- and normal-rank statements”, but “all best-rank statements”, i. e. the same ones that you get with wdt: (and p: was used e. g. to access qualifiers, not to get all statements). And the correct way to do that is ?stmt a wikibase:BestRank. --TweetsFactsAndQueries (talk) 14:04, 18 February 2020 (UTC)
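
Editorial sketch of the two filters contrasted in this thread, in Python. It assumes statements are plain dictionaries with a "rank" key and made-up values; none of this is a Wikibase API. The first function mirrors the "not deprecated" filter, the second mirrors the "best rank" selection that wdt: gives.

```python
# Hypothetical statements for one property on one item; the values are made up.
statements = [
    {"value": "A", "rank": "preferred"},
    {"value": "B", "rank": "normal"},
    {"value": "C", "rank": "deprecated"},
]


def non_deprecated(stmts):
    """Keep normal- and preferred-rank statements (the VALUES / MINUS filter)."""
    return [s for s in stmts if s["rank"] != "deprecated"]


def best_rank(stmts):
    """Keep preferred-rank statements if any exist, otherwise normal-rank ones."""
    preferred = [s for s in stmts if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in stmts if s["rank"] == "normal"]
```

With the sample data, `non_deprecated` keeps A and B while `best_rank` keeps only A, which is the gap between "all non-deprecated statements" and "all best-rank statements" pointed out above.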

How do you make citation needed constraint (Q54554025) work with qualifiers

This is a problem i have been running into. --Trade (talk) 19:13, 11 February 2020 (UTC)

  • @Trade: In my opinion, the reference of a value can be inserted as a reference of a qualifier. I mean: a qualifier qualifies a value and a reference proves the existence and the usefulness of this value, so the qualifier is indirectly sourced, IMHO. applies to part (P518) can be part of the references to specify that a reference applies to a qualifier.
From a property perspective, I believe we can add citation needed constraint (Q54554025) to a property that also hosts as qualifier (Q54828449). For example, place of marriage (P2842) is used as a qualifier in spouse (P26) and P26 must have a constraint reference. Otherwise there is no way to specifically reference a qualifier (if that was the question): a source references a value that can contain one or more qualifiers and that seems sufficient, as described in Help:Sources. Does that help you or did I confuse you? With an example, your question will be clearer. —Eihel (talk) 19:23, 12 February 2020 (UTC)
@Eihel:, i want to make 'citation needed' work with number of reviews/ratings (P7887) --Trade (talk) 20:09, 12 February 2020 (UTC)
@Trade: I went too far: in fact Q54554025 is useless for a qualifier, because the constraint violation page does not list this constraint (for a qualifier) and no alert is generated on other tools. As Jura writes, I think that a Complex constraint is the only alternative. —Eihel (talk) 22:12, 18 February 2020 (UTC)

Leading and trailing spaces ...

Labels and Descriptions can handle leading and trailing spaces. Why can't other string fields do the same? - PKM (talk) 22:56, 17 February 2020 (UTC)

Do you have an example where the spaces are significant? ArthurPSmith (talk) 14:28, 18 February 2020 (UTC)

Using image as a source

Sometimes I take images of plates and graves and I'd like to know if there is a way to use them as a source for a statement here (date/place of birth, date/place of death). I have never looked into that but I have always assumed that as a general long-term goal of having structured data also on Commons, images should become more integrated into a network of structured data. Any previous discussion on the topic?--Alexmar983 (talk) 02:19, 18 February 2020 (UTC)

There's a property for linking a grave image, namely image of grave (P1442), which I suppose could be used as a reference statement. I'm not sure if that's good practice or not. Also, plaque image (P1801) for plaques. Ghouston (talk) 06:37, 18 February 2020 (UTC)
Can statement links be given as value to reference URL (P854)? --SCIdude (talk) 07:57, 18 February 2020 (UTC)
No, but you can put any statement in the reference section. Ghouston (talk) 08:24, 18 February 2020 (UTC)
I know that there is a property for specific images but is there a way to specifically use that image as a reference for a clear specific fact? Not only the date of birth/death but also name of spouse, nationality, place of burial... Same for a document, suppose I have an image for a contract with details about a certain item. I suppose you could use the commons url but that's not very elegant. I think it's time we accept images as a source, and I mean doing this with the possible highest standard of quality (like with clear metadata of commons). In a way it encourages the depth of metadata.--Alexmar983 (talk) 12:39, 18 February 2020 (UTC)

What about a "reference file" instead of "reference url"? I have a similar problem with some authorizations for WLM, they are stored on a database of Wikimedia Italy and if I want to put a starting date for the competition and their WLM ID or even a source for an address or a proof of an inclusion in a bigger complex, I could do it only with a url, but the scans of the files are CC BY-SA; they may not be bot accessible, but this way all is transparent.--Alexmar983 (talk) 13:38, 18 February 2020 (UTC)

Can we enlarge the concept beyond tombstones? I mean, a "normal" source is a document, some of these documents can be already imported as images on commons, if the related image on commons has all correct metadata to describe it, why shouldn't we use it directly? Of course we can add more tertiary source when we have them, but why is this not an option? They are sources that everybody can quickly double-check.--Alexmar983 (talk) 13:58, 18 February 2020 (UTC)
Are you looking to do something like this: Cornelia Augusta Betts (Q85430571), look at the references for her date of death. --RAN (talk) 14:19, 18 February 2020 (UTC)
yes that is more structured, and flexible. Thank you RAN.--Alexmar983 (talk) 15:26, 18 February 2020 (UTC)
For working on instance_of=human. I also highly recommend getting a free Familysearch account and linking the entry here to any entry there, or creating a new entry at Familysearch. They have birth, marriage, and death records online for free. You can also add images there that are fairuse under international copyright law, that cannot be stored at Wikimedia Commons. Also apply for a free Newspaper.com account at Wikipedia:The Wikipedia Library. --RAN (talk) 19:24, 18 February 2020 (UTC)

Instances vs Classes for theme park attractions that are very similar

Hi, I'm working on Disney-related Wikidata entries. One thing I've noticed is that some attractions seem to be instances and some seem to be classes. For example, take Pirates of the Caribbean (Q1713564). This item clearly needs cleanup. My question is, there are no fewer than five versions of this attraction around the world. I'm assuming that it is improper to have one item with five different "part of" properties and five different "coordinates" properties; an item can't be a "part of" 5 different theme parks! Moreover, the version in Shanghai Disneyland Park (Q865312) is sufficiently different that it merits its own entry.

Is it fair to say that there should be a general Pirates of the Caribbean attraction entry, and five other new items that "instance of" that master item? Or is that too much bloat? --OnePt618 (talk) 03:38, 18 February 2020 (UTC)

@OnePt618: It is fair to say that there should be a general Pirates of the Caribbean attraction entry, and five other new items that are "instance of" that master item. Yes please. Make it so. --Tagishsimon (talk) 04:01, 18 February 2020 (UTC)
@Tagishsimon: Wonderful, that matches my understanding. Thank you so much! --OnePt618 (talk) 04:04, 18 February 2020 (UTC)

How to connect Model Trains Museum with rail transport modelling

Items are Q6888298 and Q623272. Smiley.toerist (talk) 09:43, 18 February 2020 (UTC)

Property:P921? Pietro (talk) 11:04, 18 February 2020 (UTC)
instance of (P31): museum (Q33506), with qualifier of (P642): rail transport modelling (Q623272). Circeus (talk) 23:45, 18 February 2020 (UTC)

It's not clear whether the value of this is a name of the subject or the object.

An example of this would be Uno (Q17267) which is named after 1 (Q199), but specifically the Spanish name "uno". However, it's not clear from the qualifier whether the subject is named after the object's Spanish name "uno"; or the subject's Spanish name "uno" is named after the object.

In fact, this can also be considered unclear (non-explicit) for any named after (P138) statement qualified with P5168 where the subject and object have multiple names. I'm sure there are other statements too where the use of this qualifier can be unclear.

It seems to me like there should actually be two qualifiers in the vein of subject has role (P2868) (applies to name of subject) and object has role (P3831) (applies to name of object). Does this seem sensible to anyone else? Would be happy to make proposals if there's support for this. --SilentSpike (talk) 11:41, 18 February 2020 (UTC)

I have started a proposal here: Wikidata:Property_proposal/applies_to_name_of_object --SilentSpike (talk) 15:35, 18 February 2020 (UTC)

Grant Application Wikidata + Performing Arts

Project Grant Application by the Canadian Arts Presenting Association and the Conseil québecois du théâtre for the population of Wikidata with performing arts related data. – Please review, comment, endorse...! --Beat Estermann (talk) 22:04, 18 February 2020 (UTC) WikiProject Cultural heritage has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Hi folks! Over on the located in the administrative territorial entity (P131) talk page, we're trying to figure out how to best handle the fact that administrative territorial entities at different logical levels aren't really transitive (but the conversation is sort of stuck, so I'm asking for more attention here). For example, the city of Atlanta (Q23556) exists in two counties: Fulton County (Q486633) and DeKalb County (Q486398). The problem arises when I state in P131 that an organization like Patch Works Art & History Center (Q76461608) is located in Atlanta...but then which county is it in? After a natural disaster, I need to be able to search for all organizations in a specific county. Without some fix for this, many organizations will show up as being in the wrong county. Сидик из ПТУ proposed Wikidata:Property proposal/hierarchy switch to explicitly state at the organization level which "grandparent" administrative territorial entity it belongs to, as a potential solution. However, my "inner ontologist" says that a property is either transitive or it isn't, and therefore we could say that located in the administrative territorial entity isn't transitive and we should just explicitly state all logical levels for each organization in the P131 field (returning to the example, explicitly stating that Patch Works Art & History Center is in Atlanta, Fulton County, and Georgia (U.S. State), all in the P131 field). I know that removing P131's transitive property status would be a dramatic change that could have a heavy impact, so I'm curious if there's consensus around one of these two options, or if there's a third option that we're not thinking of. Thanks for your time and attention on this! Clifflandis (talk) 14:13, 12 February 2020 (UTC)
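
Editorial sketch of the problem described above, in Python. It assumes a hand-written table of P131 edges for just this example (labels only, no live data) and treats P131 as transitive by walking every edge upward.

```python
# Hypothetical P131 edges for the example in this thread; labels stand in for items.
LOCATED_IN = {
    "Patch Works Art & History Center": ["Atlanta"],
    "Atlanta": ["Fulton County", "DeKalb County"],
    "Fulton County": ["Georgia"],
    "DeKalb County": ["Georgia"],
}


def containing_entities(item, edges):
    """Follow P131 edges transitively and collect every entity reached."""
    found = []
    queue = list(edges.get(item, []))
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(edges.get(current, []))
    return found
```

Under these assumed edges, walking up from the organization reaches both Fulton County and DeKalb County, so a county-level search would list it under each. Stating every level directly on the organization, or adding a qualifier that selects one branch, are the two fixes discussed in this thread.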

Here we are talking about an exception to the rule; the solution to the problem is elementary and requires only the adoption of a new property. Thanks to the hierarchy of located in the administrative territorial entity (P131) declared by default, we can easily adjust and build chains from the village to the state with a minimum number of edits. If you instead specify not just the most precise value but everything from parish to state, this will require significant duplication of effort, duplication of data, and the creation of less savvy algorithms with an extensive knowledge base formalized in code, which will be far from universal in contrast to the current state of affairs. In other words, if we reject the declared hierarchy and transitivity here, we will need switches at each access to the property to figure out which of the values is higher in status. Сидик из ПТУ (talk) 15:15, 12 February 2020 (UTC)
@Сидик из ПТУ: "an exception to the rule"? Hmm, you have forgotten this discussion about French intercommunalities and cantons. Ayack (talk) 15:33, 12 February 2020 (UTC)
Well, yes, for France it’s done topsy-turvy without clear reasoning. I understand that the arrondissement of France (Q194203) and canton of France (Q184188) lived in parallel there, but no one answered me why the region of France (Q36784) is specified for commune of France (Q484170) if it is completely determined by the arrondissement of France (Q194203). Сидик из ПТУ (talk) 15:42, 12 February 2020 (UTC)
For an edge case like this, wouldn't it make more sense to create items for the portion of Atlanta (Q23556) within Fulton County (Q486633) and the portion within DeKalb County (Q486398), rather than having to change how we handle what is probably the 98+% case? - Jmabel (talk) 16:08, 12 February 2020 (UTC)
I think this is a bad idea. There is nothing better than using the Atlanta (Q23556) item in all cases (for place of birth (P19) and located in the administrative territorial entity (P131) or for 1996 Summer Olympics (Q8531) and Atlanta Thrashers (Q244039)). By just adding a switch qualifier as needed, we will not change anything at all for the 98+% of cases. Сидик из ПТУ (talk) 16:19, 12 February 2020 (UTC)
This isn't really an edge case for Georgia (Q1428). Of the 539 municipalities in Georgia, 51 (9.5 %) are in more than one county. Clifflandis (talk) 17:01, 12 February 2020 (UTC)
Thanks for bringing this issue here. I think I agree with Jura here, that we should only be using located in the administrative territorial entity (P131) when the subject is entirely within the object. For the exceptional cases like Atlanta, we should use territory overlaps (P3179) for the next-level geopolitical entities it falls within, and P131 for the smallest geopolitical entity it's within (Georgia). The problem with using P131 for multiple counties is that this practice effectively redefines P131 to be the same as P3179, rendering it non-transitive and far less useful. If 2% of P131 usage does not denote transitive geo-containment, then none of it does. Bovlb (talk) 20:39, 13 February 2020 (UTC)
  • Yes, you have correctly summarized my position (and, I hope, Jura's). It seems to me that the statement "Atlanta is in Fulton County" is simply false, so we should not say it, and queries asking "What is Atlanta in?" should not return "Fulton County". Cheers, Bovlb (talk) 17:55, 14 February 2020 (UTC)
  • Hmm. Just to be clear, I do think it is false in reality to claim that "Atlanta is in Fulton County", but now we're down to arguing what "in" means, which will have no satisfactory resolution.  :)
I had a quick look around WPEN to try to find a definition of what inclusion in en:Category:Cities in Fulton County, Georgia is intended to denote, but I didn't find anything relevant to this question. It is the nature of the category system that the link between the article and the category is unspecified, so in edge cases it will end up having multiple possible meanings. Here at Wikidata, we're trying to be ontologists, so we should not tolerate such vagueness. Cheers, Bovlb (talk) 21:13, 14 February 2020 (UTC)
There are many languages in which the main label of P131 has no analogue of "in". But the logic in the example of the city is clear, and it's similar to continent (P30) for Russia (Q159). Сидик из ПТУ (talk) 21:51, 14 February 2020 (UTC)
  • In my view, we need to think about what queries will people most typically write, and how do we try to make sure they get back as much as possible of what they are looking for.
I don't like the territory overlaps (P3179) solution, because in practice people won't think to ask for it when they're writing queries.
I'm not sure I understand the "hierarchy switch" suggestion -- I don't see how this would work in queries, when people are most naturally just using wdt:P131*
For myself I think the best approach would be to use P131 with applies to part (P518) qualifiers when required eg on Atlanta, plus a P131 at whatever level can be given without qualification (eg Atlanta -> Georgia), plus 'leapfrogging' P131s when there is ambiguity (eg organisation -> Atlanta, and organisation -> Fulton County). This allows queries using P131* to return pretty much the right answers, while careful queries can return exactly the right answers, using P131* MINUS {chains that include a qualified P131} PLUS {chains where that qualified P131 is covered by a P131 from a lower entity}. Jheald (talk) 21:19, 13 February 2020 (UTC)
@Clifflandis: No, it would be Atlanta (Q23556) → located in the administrative territorial entity (P131) → Fulton County (Q486633) with qualifier applies to part (P518) = somevalue or applies to part (P518) = some list of districts of Atlanta that are in Fulton County (Q486633).
For Patch Works Art & History Center (Q76461608) I would suggest both Patch Works Art & History Center (Q76461608) → located in the administrative territorial entity (P131) → Atlanta (Q23556) and Patch Works Art & History Center (Q76461608) → located in the administrative territorial entity (P131) → Fulton County (Q486633), the latter probably qualified with an appropriate object has role (P3831) qualification to flag that Q486633 is not the regular next step up the hierarchy. Jheald (talk) 15:05, 14 February 2020 (UTC)
  • @Jheald: Thanks for spelling that out for me -- I'm following you now!
I don't think that applies to part (P518) will work as a way to try to connect cities and counties, since they're not logically related to each other, so there's no somevalue to apply. As you suggest, we could maybe cobble something together for Atlanta based around neighborhoods, but that approach won't work as well for smaller towns like Braselton (Q899020) where the city exists in four counties -- around 9.5% of municipalities in Georgia exist in more than one county, unfortunately. There's just no logical connection between city borders and county borders, at least in Georgia.
For Patch Works Art & History Center (Q76461608) → located in the administrative territorial entity (P131) → Fulton County (Q486633), where you suggest qualifying with object has role (P3831), would it look like this: Patch Works Art & History Center (Q76461608) → located in the administrative territorial entity (P131) → Fulton County (Q486633) → object has role (P3831) → county (Q28575)? Does that make sense, or would it just be redundant?
The messiness between cities and counties is what has me leaning towards Сидик из ПТУ's proposed Wikidata:Property proposal/hierarchy switch as a pragmatic solution. Clifflandis (talk) 18:22, 14 February 2020 (UTC)
@Clifflandis: applies to part (P518) is not a connector, it's a warning -- it means that the statement does not apply to the whole of the subject, only a part of it. Even if we cannot precisely detail which parts of the subject item the statement applies to, nevertheless we can still use applies to part (P518) with the generic value somevalue to indicate that the statement Atlanta (Q23556) → located in the administrative territorial entity (P131) → Fulton County (Q486633) does not apply to the whole of Atlanta.
We need something like Patch Works Art & History Center (Q76461608) → located in the administrative territorial entity (P131) → Atlanta (Q23556), otherwise a query for archive centres in Atlanta will not return it. Jheald (talk) 16:12, 16 February 2020 (UTC)
@Jheald: For the statement Atlanta (Q23556) → located in the administrative territorial entity (P131) → Fulton County (Q486633), I tried saving applies to part (P518) as a qualifier, both with a blank value, and with "somevalue" as the value (to indicate the warning), but it won't save with either of those values. I'm not sure how to use applies to part without creating an artificial subdivision of Atlanta that doesn't exist. I'm probably misunderstanding how it should be used. Can you point me to an item that uses applies to part in the way you're describing? Thanks again for explaining! Clifflandis (talk) 13:09, 18 February 2020 (UTC)
@Clifflandis: Works for me: diff. Note that the UI represents somevalue as "unknown value", which is often quite untrue; that's a known shortcoming in the UI. Jheald (talk) 13:16, 18 February 2020 (UTC)
@Jheald: Oh, thanks! Once I changed it from somevalue to unknown (Q24238356), it saved. But for some reason I couldn't save it as "unknown value" the way that you did. Maybe it's a permission that I don't have. Oh well, either way it's handy to know that unknown (Q24238356) exists and can be used in a pinch. Thanks again! Clifflandis (talk) 15:05, 18 February 2020 (UTC)
I think we should approach this from the perspective of the "client". There are two major types of questions we are solving by located in the administrative territorial entity (P131):
  1. How to display a geochain in an infobox, like Patch Works Art & History Center (Q76461608) → Atlanta (Q23556) → Fulton County (Q486633) → Georgia (Q1428) → United States of America (Q30), in Lua. That seems quite trivial, assuming that Wikidata:Property proposal/hierarchy switch is implemented as P8000 and the statement Patch Works Art & History Center (Q76461608) → located in the administrative territorial entity (P131) → Atlanta (Q23556) has the qualifier P8000:Q486633.
  2. How to query for all instance of (P31) → organization (Q43229) items located in Fulton County (Q486633)? I'm not sure I have an immediate SPARQL snippet here. Maybe someone with better query-writing skills can do it? Ghuron (talk) 06:44, 15 February 2020 (UTC)
@Ghuron: Maybe something like this:
  # P8000 is only the placeholder ID assumed above for the proposed hierarchy switch qualifier
  SELECT DISTINCT ?item WHERE {
    {
      ?item wdt:P31/wdt:P279* wd:Q43229. # organisation
      ?item wdt:P131+ wd:Q486633. # located in Fulton County
    }
    MINUS
    {
      # remove the item if there is a hierarchy switch to another county
      ?county_of_georgia wdt:P31 wd:Q13410428.
      ?item wdt:P131*/p:P131/pq:P8000 ?county_of_georgia.
      FILTER (?county_of_georgia != wd:Q486633)
    }
  }
--Dipsacus fullonum (talk) 13:33, 15 February 2020 (UTC)
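
The effect of the query above can be tried offline on a toy graph. The following Python is only a minimal sketch, not a Wikidata client: the `P131` dictionary, the `SWITCH` table standing in for the proposed (not yet existing) hierarchy switch qualifier, and the second organization `Q_DEKALB_ORG` are assumptions made up for illustration.

```python
# Toy model of direct P131 statements: item -> list of parent entities.
P131 = {
    "Q76461608": ["Q23556"],           # Patch Works -> Atlanta
    "Q_DEKALB_ORG": ["Q23556"],        # hypothetical org in the DeKalb part of Atlanta
    "Q23556": ["Q486633", "Q486398"],  # Atlanta -> Fulton County, DeKalb County
    "Q486633": ["Q1428"],              # Fulton County -> Georgia
    "Q486398": ["Q1428"],              # DeKalb County -> Georgia
}
COUNTIES = {"Q486633", "Q486398"}

# Hypothetical hierarchy switch qualifiers: (subject, P131 value) -> county.
SWITCH = {
    ("Q76461608", "Q23556"): "Q486633",
    ("Q_DEKALB_ORG", "Q23556"): "Q486398",
}


def ancestors(item):
    """Everything reachable through one or more P131 steps (like wdt:P131+)."""
    seen, stack = set(), list(P131.get(item, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(P131.get(node, []))
    return seen


def switched_counties(item):
    """Counties named by a switch on the item or on any of its ancestors."""
    nodes = {item} | ancestors(item)
    return {
        county
        for (subject, _value), county in SWITCH.items()
        if subject in nodes and county in COUNTIES
    }


def located_in_county(item, county):
    """Transitive containment, minus items switched to a different county."""
    if county not in ancestors(item):
        return False
    return not any(other != county for other in switched_counties(item))
```

With this toy data, `located_in_county("Q76461608", "Q486633")` holds while `located_in_county("Q76461608", "Q486398")` does not, which mirrors what the MINUS clause is meant to achieve.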

To summarize the discussion as I see it:

  1. Viewpoint one is that there is a hierarchy where municipalities are considered to be lower in the hierarchy than counties. That is the view of English Wikipedia (and probably also all other Wikipedias) which has the category Category:Cities in Fulton County, Georgia (Q15211928), but there is no category named w:Category:Counties in Atlanta, Georgia. For that viewpoint to be reflected in Wikidata, we will need the proposed hierarchy switch qualifier.
  2. Viewpoint two is that in Georgia municipalities and counties have equal hierarchical status, so chains of hierarchy should be like (1) Patch Works Art & History Center (Q76461608) → (2) [ Atlanta (Q23556) and Fulton County (Q486633) ] → (3) Georgia (Q1428). For that viewpoint to be reflected in Wikidata, the municipalities and counties of Georgia should point to each other with territory overlaps (P3179)

I agree that from a strictly logical point of view viewpoint 2 is correct. However, most sources, including the Wikipedias, assume viewpoint 1, so I support also using viewpoint 1, as I don't think that it is sustainable to have a different model of the world in Wikidata than in all or most other places. --Dipsacus fullonum (talk) 06:45, 16 February 2020 (UTC)

There is no such thing as "hierarchical status" where I live. The discussions here tend to be dominated by people from federal states (US/Germany/Russia) which have "hierarchical status", but many other nations do not have them. The 98+% case people talk about above is more of an exception here, rather than a rule. If I have to add the smallest entity, I sometimes have to add "Europe" if I should follow the rule by the book. 62 etc (talk) 08:47, 16 February 2020 (UTC)
I fail to see how you get from "Sweden does not have hierarchical status" to "Wikidata should not capture hierarchical status where one clearly exists" Ghuron (talk) 12:56, 16 February 2020 (UTC)
I remember my talk with Yger who said that transitivity works about OK from kommun. län land. And when I look at the Swedish Wikipedia, I observe the following: Hässjö distrikt är ett distrikt i Timrå kommun och Västernorrlands län, Timrå kommun är en kommun i landskapet Medelpad i Västernorrlands län (where province of Sweden (Q193556) is not for located in the administrative territorial entity (P131) at our days). Moreover, I found there many-to-one direct matching Lista över Sveriges kommuner and similar district listings in articles and templates of kommuns. Сидик из ПТУ (talk) 14:14, 16 February 2020 (UTC)
I'm no expert on ontology, but if there's one thing I've learned from my own work trying to classify administrative regions into databases, it's that they are not hierarchical. You can convince yourself that at best they're mostly hierarchical. But if you have a scheme that assumes that they're perfectly hierarchical, a scheme that's going to break if confronted with a real-world exception, then the bad news is: it's going to break. There are a lot of exceptions out there in the real world.
But the other thing I strongly believe is that there needs to be an effective compromise. The convenience of assuming that something is "mostly" hierarchical is significant. A scheme that forces every entity to be explicitly categorized under its grandparents and great-grandparents (just because there are a relatively few entities whose direct parents happen to be ambiguous in that regard) is going to be way too much work.
So I believe our goal should be: that compromise. What's the right way of encoding situations like Atlanta's, that balances the needs of the people entering and maintaining the data, versus the needs of the people querying the database? —Scs (talk) 14:46, 16 February 2020 (UTC)
The right way is a) get Wikidata:Property proposal/hierarchy switch; b) use the query of Dipsacus fullonum with this property. In this scenario everyone's needs will be satisfied, and the filled-in data will correspond to the main PoV stated in authoritative sources. Сидик из ПТУ (talk) 15:00, 16 February 2020 (UTC)
I basically agree with viewpoint 2, though wherever a municipality is entirely inside a county, we should express the simple hierarchy. - Jmabel (talk) 17:21, 16 February 2020 (UTC)
This discussion is (primarily) about how to represent cities that do not fall within a single county. I've seen several people above express opinions about the simpler case that, when a city falls within a single county, we should be allowed to represent that geo-containment directly. What I cannot see is anyone proposing otherwise, so I don't understand why it keeps coming up. Am I missing something? Bovlb (talk) 20:59, 17 February 2020 (UTC)
In case of ambiguity, we suggest that the new qualifier (Wikidata:Property proposal/hierarchy switch) will indicate the right choice at the next level of the hierarchy. The main idea is that all Georgia municipalities should have only counties in P131 as it corresponds to the official hierarchy of administrative territorial entities (Q4057633). Сидик из ПТУ (talk) 08:12, 18 February 2020 (UTC)
P131 works quite well as a simple transitive hierarchy in most cases, and the solution by Сидик из ПТУ will allow it to work correctly in most other cases. It is very important to mark only one administrative territorial entity on most elements in WD, as it drastically reduces the amount of work to create chains of ATEs. If one somehow destroys this system, it would be nearly impossible to put correct chains for all small ATEs of different times on millions of elements. Wikisaurus (talk) 16:40, 19 February 2020 (UTC)
  • In my recent explorations of counties, I came across Unorganized Borough (Q1474662). It's interesting as it's called a borough like all other and it even has some subdivisions that for stat purposes are considered equivalent of those other boroughs, but practically all government functions are with the state. --- Jura 20:16, 19 February 2020 (UTC)

Archiving links

Hello,

Is it possible to run InternetArchiveBot here in Wikidata to check whether the links to other websites still work? I think there are pages that are no longer available. For pages like this it would be good if they were archived. -- Hogü-456 (talk) 18:23, 15 February 2020 (UTC)

I tried to add an archive URL (P1065) required qualifier constraint (Q21510856) to official website (P856) but one of the other users was strongly against it. --Trade (talk) 20:18, 15 February 2020 (UTC)
The problem with that is that some websites forbid archiving, or archiving fails for technical reasons (typically too much Javascript), so you'd end up with a lot of constraint violations or exceptions. Ghouston (talk) 01:54, 16 February 2020 (UTC)
I don't get why we need to add archive URLs to the entities since the archive links can be programmatically generated after the fact if the link does go dead (assuming we also record a last-fetched timestamp). The real issue is making sure that we try to archive all the URLs soon after they are added to Wikidata. BrokenSegue (talk) 03:39, 16 February 2020 (UTC)
Constraints are mostly for human editors. If you want to run a bot, you don't need that. --- Jura 07:38, 16 February 2020 (UTC)
"I don't get why we need to add archive URLs to the entities": Because that way we can be assured that the URL has been archived after it has been added to Wikidata. Also, some values such as review scores might change with time, making an archived URL a must-have. --Trade (talk) 19:18, 16 February 2020 (UTC)
And if the archive goes offline? We need to add a backup url?
Anyways, if you think archiving is necessary, why not set up a bot to do it? --- Jura 05:19, 17 February 2020 (UTC)
I'm not against setting up a bot to do the job.--Trade (talk) 11:19, 17 February 2020 (UTC)
It would be great if someone could set up a bot for it. Maybe it is possible to use InternetArchiveBot for that task. -- Hogü-456 (talk) 17:27, 19 February 2020 (UTC)
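
BrokenSegue's point above, that an archive link can be derived later from the URL plus a retrieval timestamp, can be sketched in a few lines of Python. This is an illustration under assumptions, not an existing bot: the function name `wayback_url` is made up, and it relies on the Wayback Machine address pattern `https://web.archive.org/web/<YYYYMMDDhhmmss>/<url>`, which resolves to the capture closest to the given time if one exists. It does not check that a capture was ever made.

```python
from datetime import datetime, timezone

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url, retrieved):
    """Build a Wayback Machine link for `url` near the aware datetime `retrieved`."""
    if retrieved.tzinfo is None:
        raise ValueError("retrieved must be timezone-aware")
    # The Wayback Machine expects a 14-digit UTC timestamp.
    stamp = retrieved.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{url}"


if __name__ == "__main__":
    when = datetime(2020, 2, 16, 3, 39, tzinfo=timezone.utc)
    print(wayback_url("https://example.org/", when))
```

This only covers link generation; as noted in the thread, the harder part is making sure a capture is requested soon after the URL is added.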

Instrument used

Descriptions of art objects sometimes include instrument used, like ball pen or brush. What property can be suggested ? - Kareyac (talk) 08:03, 19 February 2020 (UTC)

Randomly looking at the properties used to link to ballpoint pen (Q160137) (query), made from material (P186) seems to be most common, with fabrication method (P2079) as the runner-up. (Caveat: that query won’t capture statements using the item as a qualifier or reference.) fabrication method (P2079) sounds promising to me (and is also listed on Wikidata:WikiProject Visual arts/Item structure#Describing individual objects); but maybe it’s also worth asking on some talk page of that WikiProject? --TweetsFactsAndQueries (talk) 14:47, 19 February 2020 (UTC)

Versionize property definitions ?

Occasionally an identifier scheme is replaced with some other: new identifiers, possibly new domain name, etc. The question arises: what to do with the property?

The general approach for entities is not to repurpose existing entities.

Some users like to keep the integer (PID) they were aware of and would just want to redo the property definition, formatter and all values at Wikidata they can get hold of.

While this probably won't matter for properties that have never really been used around Wikidata, the question has a larger impact now that other WMF sites use properties and these can't even be tracked from Wikidata (read: Commons). For third party users .. well too bad for them.

If we try to define a version for each property definition, users could check if they still have the current scheme. A simpler approach could be to delete the existing property and create a new one. --- Jura 11:03, 13 February 2020 (UTC)

Sounds like a good argument to create new properties when significant changes like that happen. ArthurPSmith (talk) 15:11, 13 February 2020 (UTC)
I agree for any version change which could create a conflict between values assigned under different versions. If the bulk of the old version's values are still valid in the new one, a bot can copy them over as a one-time thing when the new property is created. Josh Baumgartner (talk) 20:55, 13 February 2020 (UTC)

WikiProject Properties has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. --- Jura 20:36, 13 February 2020 (UTC)

Discussion of P180 ("Depicts") usage on Wikimedia Commons

This discussion may be of interest: c:Commons:Village pump#Misplaced invitation to "tag" images. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:50, 15 February 2020 (UTC)

This can be considered vandalism, can't it? --SCIdude (talk) 16:22, 15 February 2020 (UTC)
I'm sorry, User:SCIdude, what can be considered vandalism? Andy's linking to Commons? Adding a feature on Commons without the consensus of its participants? Trying to get that reversed? Adding bad depicts statements? Removing depicts statements? Having a discussion at c:Commons:Village pump? - Jmabel (talk) 18:46, 15 February 2020 (UTC)
Thanks. I'll pick "Adding bad depicts statements" without consensus there and here. It seems similar to running a WD bot without having read anything about WD. --SCIdude (talk) 20:28, 15 February 2020 (UTC)
Bots can do huge damage very fast, but it could be fixed. For new users—IPs on commons are rare—"fixing it" will happen after climate change reverted itself to Y2K levels, and at this point in time commons will have no mammal users anymore. –84.46.52.151 07:32, 20 February 2020 (UTC)

Extract data from Wikipedia templates

It seems that the tool TemplateTiger is not working anymore, so I am looking for another tool to extract data from infobox templates to add them to Wikidata. Normally the tool HarvestTemplates would be the solution, but in some cases the data format can't be interpreted by the tool. In those cases I need to extract the data, adjust the format and then import it to Wikidata in a batch. --Cavernia (talk) 21:21, 17 February 2020 (UTC)

@Cavernia You can use HarvestTemplate's demo mode and download the "demo results" as CSV for further editing. This usually works. Vojtěch Dostál (talk) 10:31, 20 February 2020 (UTC)
@Vojtěch Dostál: I've tried that, but the data is not included in the export, only the error message (like "no target page found"). --Cavernia (talk) 20:21, 20 February 2020 (UTC)
@Cavernia Sometimes you can fool HarvestTemplates by feeding it some specific type of property. If you give me your use case, I can try. Vojtěch Dostál (talk) 20:47, 20 February 2020 (UTC)
@Vojtěch Dostál: Example: [10] The item Q17764175 contains the information "Torpedert og senket 14. juli 1940" which means torpedoed and sunk 14th of July 1940. The motivation to extract this information is to restructure it and import it as a significant event (P793) entry of torpedo attack (Q50295027) with qualifier point in time (P585). --Cavernia (talk) 21:07, 20 February 2020 (UTC)
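
The restructuring step Cavernia describes could look roughly like the following Python sketch. It assumes the infobox text has already been extracted by some other means, and it assumes the tab-separated QuickStatements V1 layout with a day-precision time value (`/11`); the function names and the month table are made up for this example.

```python
import re

# Norwegian month names mapped to month numbers.
NORWEGIAN_MONTHS = {
    "januar": 1, "februar": 2, "mars": 3, "april": 4, "mai": 5, "juni": 6,
    "juli": 7, "august": 8, "september": 9, "oktober": 10, "november": 11,
    "desember": 12,
}

# Matches dates written like "14. juli 1940".
DATE_RE = re.compile(r"(\d{1,2})\.\s*([a-zæøå]+)\s+(\d{4})", re.IGNORECASE)


def parse_norwegian_date(text):
    """Return (year, month, day) for the first date found, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = NORWEGIAN_MONTHS.get(month_name.lower())
    if month is None:
        return None
    return int(year), month, int(day)


def quickstatements_line(item, text):
    """Build one significant event (P793) row with a point in time (P585) qualifier."""
    parsed = parse_norwegian_date(text)
    if parsed is None:
        return None
    year, month, day = parsed
    time_value = f"+{year:04d}-{month:02d}-{day:02d}T00:00:00Z/11"
    return "\t".join([item, "P793", "Q50295027", "P585", time_value])


if __name__ == "__main__":
    print(quickstatements_line("Q17764175", "Torpedert og senket 14. juli 1940"))
```

Rows for which no date can be parsed come back as `None`, so they can be set aside for manual review instead of being imported with a guessed value.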

YVNG ID

Are these values being added correctly? Please look at Annemarie Loepert (Q58358600) and the YVNG_ID, when I click on it, it takes me to a generic page, not the entry for the person. The actual link appears below as a reference. The actual ID is formatted as "4118003&ind=35" with the second number as the database within the YVNG website and 4118003 is the entry in that database, as best as I can figure it out. YVNG has multiple databases that can be searched. Compare it to the "Jewish Museum Berlin person ID" in her entry and it takes me directly to her entry in the database. --RAN (talk) 07:08, 20 February 2020 (UTC)

The ID's URL template is wrong. The description reads "identifier in the Yad Vashem central database of Shoah victims' name", and the original proposal also specifically mentions the victims' names database. The identifier's name, "YVNG", is the same Yad Vashem (sometimes) uses for the personal identifiers, and the subdomain of that specific dataset. Even the examples included in the proposal link to the individual name records. The URL template, however, is different from the examples?

I would suggest changing the URL template to https://yvng.yadvashem.org/nameDetails.html?language=en&itemId=$1.

I am also wondering if the name should perhaps be changed to something more meaningful, such as "Shoa Victim Name ID". The abbreviation "YVNG" does not seem to be "official", but merely an internal, technical one. --Matthias Winkelmann (talk) 08:25, 20 February 2020 (UTC)
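
For illustration, this is how a formatter URL with a `$1` placeholder turns a stored identifier into a link. The Python below is a minimal sketch: the helper name `apply_formatter` is made up, the template is the one suggested above, and the identifier is the value RAN quotes for Annemarie Loepert (Q58358600). Whether the resulting address resolves on the Yad Vashem site is not verified here.

```python
# Formatter URL proposed in the discussion above; "$1" marks where the ID goes.
PROPOSED_FORMATTER = (
    "https://yvng.yadvashem.org/nameDetails.html?language=en&itemId=$1"
)


def apply_formatter(formatter_url, identifier):
    """Substitute the identifier for the $1 placeholder of a formatter URL."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    return formatter_url.replace("$1", identifier)


if __name__ == "__main__":
    # Identifier as quoted in the thread, including the "&ind=35" database part.
    print(apply_formatter(PROPOSED_FORMATTER, "4118003&ind=35"))
```

Because the stored value carries its own `&ind=35` part, the substitution is done verbatim rather than URL-encoding the identifier.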

Cemetery plot

A cemetery plot can contain multiple graves. Has anyone created an entry for one so I can see how it is structured and how the people buried there are listed? --RAN (talk) 08:48, 20 February 2020 (UTC)

Some are listed at Special:WhatLinksHere/Q1541002, although a quick perusal did not turn up anything spectacular. Just "has part" for the individual graves. Quantity Buried and Category of People Buried Here are also relevant.--Matthias Winkelmann (talk) 09:31, 20 February 2020 (UTC)

We have burial plot reference (P965), which is used as a qualifier to place of burial (P119). As every cemetery uses a different scheme to identify the plots, it's just a plain string. See e.g. Marc-Antoine Berdolet (Q152944) for how it is used. There are probably only very few cases where a grave is notable enough by itself to have an item here, and there the property then probably needs to be used as a statement. AFAIK there's no inverse property "people buried here", and that would get very unwieldy if applied to any bigger cemetery, so I think it won't be a good idea to create one. Ahoerstemeier (talk) 17:23, 20 February 2020 (UTC)

County organization of a state

When going through #US_counties, one comes across many items like county of Virginia (Q13415368). I think it would be interesting to add statements there about what government institutions or facilities could be found in each county of that state (various boards). The sample here is for counties, but it should also work for other classes of territorial government entities. Some aspects might be identical for every state, but the equivalence is generally for census purposes.

Eventually we might even have items for each institution in every county, but, to start, it might be good to identify the classes. has part(s) of the class (P2670) might work for that, or maybe it should be more specific. What do you think? --- Jura 11:27, 20 February 2020 (UTC)

Special:MostRevisions

Special:MostRevisions wasn't updated since 25 November 2019. Eurohunter (talk) 22:31, 20 February 2020 (UTC)

It has been disabled: phab:T239072. --Matěj Suchánek (talk) 10:07, 21 February 2020 (UTC)

Change of identifier

I don't know if this is the right place to advise users about a proposal. Please have a look at this. --★ → Airon 90 07:56, 21 February 2020 (UTC)

TV Show Judges

Does anyone know of a property that I could use in order to list the people who were judges on a TV show? I can't find anything that would do the job properly, they're not a presenter (P371), participant (P710) or cast member (P161). Any suggestions? - X201 (talk) 11:13, 21 February 2020 (UTC)

They could be a cast member or participant with qualifier of object has role (P3831) = reality television program judge (Q60118864) or similar. --Tagishsimon (talk) 18:50, 21 February 2020 (UTC)
  • Just for the sake of conversational clarity - there is a difference between TV Show Judges, and TV Show judges and judges on a TV show. Judy Sheindlin (Judge Judy) is different from Katy Perry (American Idol) which is different from Simone Missick who plays Judge Lola Carmichael on All Rise. Quakewoody (talk) 19:08, 21 February 2020 (UTC)

Please note the possible rescoping of located in the administrative territorial entity (P131). --- Jura 12:18, 21 February 2020 (UTC)

Merging coronavirus pages

Shouldn't Q290805 and Q57751738 be merged? Specifically, those pages in Q57751738 that are actually titled "coronavirus" should instead be linked to Q290805, no? Huji (talk) 15:29, 21 February 2020 (UTC) ping me in your response please

Q290805 has a different taxon rank than Q57751738 and Q57751738. This is not Wikipedia, where the title is the defining field, in Wikidata the statements define the concept. --SCIdude (talk) 16:29, 21 February 2020 (UTC) ...oh I forgot to @User:Huji: you...
@User:Huji: also note that the ukwiki has articles for both. It's quite possible that some of the sitelinks are connected to the wrong item, though. Ghouston (talk) 01:13, 22 February 2020 (UTC)
@Ghouston: ok fair. But are those pages associated with Q57751738 that are named "coronavirus" really correctly linked to Q57751738 or should they be connected to Q290805 instead? Huji (talk) 01:22, 22 February 2020 (UTC)
I don't know enough to say. It depends on what exactly is the difference between these two items. Perhaps one includes a larger set of virus than the other, given that one is a genus and the other a subfamily. Then the pages (and other items that link to the items) could be distinguished by which viruses they include. There's also Coronaviridae (Q1134583) if you go up to the family level. Ghouston (talk) 01:42, 22 February 2020 (UTC)
The frwiki article fr:Coronavirus, for example, despite its title, is about the subfamily. Ghouston (talk) 01:43, 22 February 2020 (UTC)

Reversing the roles of Twitter properties

Twitter properties Twitter (X) username (P2002) and Twitter (X) numeric user ID (P6552) are currently used with the former as main statement and latter as qualifier. However, only the latter is truly an identifier and the former easily becomes stale data without both a point in time qualifier and the P6552 identifier qualifier.

If you enter Twitter data at all, please contribute to the discussion at Wikidata:Requests_for_permissions/Bot/SilentSpikeBot#Relevant_property_discussion where I seek to establish a bot task to handle tidying up Twitter data and have started a discussion on whether the swapping of these property roles should be part of that. --SilentSpike (talk) 16:26, 21 February 2020 (UTC)

(badtoken) Invalid CSRF token.

I'm receiving it while adding new language code. Eurohunter (talk) 18:01, 21 February 2020 (UTC)

Two profiles for one person

Can someone merge Q75788407 into Q7812365? -- Zanimum (talk) 18:32, 21 February 2020 (UTC)

@Zanimum: ✓ Done; you can have a look at Help:Merge if you want to see how you can do this yourself. Mahir256 (talk) 19:58, 21 February 2020 (UTC)

Edit rate in Wikidata

Hello,

In QuickStatements the batches are running slowly. I am not a programmer and don't know much about what happens after saving an edit in Wikidata. As far as I understand, the speed of editing in Wikidata is related to the maxlag parameter. If there is a big lag at the servers (in the last months, as far as I know, mostly at the query servers), then the editing rate gets lower. What is the current plan to solve that problem of lag at the query servers? I think it is not good if editing Wikidata with batches takes a long time. Something I suggested here is to create lists of data related to a specific topic that can be downloaded. I think that would reduce the number of queries, and then the query servers would have more time to write the changes. I am interested in doing this, and for that I need the data, and at my home it would take too long to download the dump. -- Hogü-456 (talk) 21:28, 21 February 2020 (UTC)

https://lists.wikimedia.org/pipermail/wikidata/2020-February/013793.html  – The preceding unsigned comment was added by Tagishsimon (talk • contribs) at 21. 2. 2020, 23:46‎ (UTC).
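For readers wondering how the maxlag parameter slows batch editing in practice, here is a minimal sketch of a client that backs off when the API reports lag. It assumes the API answers with an error code of "maxlag" and a Retry-After header; the `send` callable, the function name, and the retry limits are hypothetical stand-ins for illustration, not how QuickStatements is actually implemented.

```python
import time


def post_with_maxlag(send, params, maxlag=5, max_retries=5, sleep=time.sleep):
    """Send an API request, waiting and retrying while the servers report lag.

    `send` is a hypothetical callable taking the request parameters and
    returning a (headers, body) pair of dicts; a real client would wrap
    an HTTP library here.
    """
    request = dict(params, maxlag=maxlag, format="json")
    for _ in range(max_retries):
        headers, body = send(request)
        error = body.get("error", {})
        if error.get("code") != "maxlag":
            return body  # either success or an unrelated error
        # The servers are lagged: wait as long as they ask, then retry.
        sleep(float(headers.get("Retry-After", 5)))
    raise RuntimeError("gave up: servers still lagged after retries")
```

The more often the servers report lag, the more time a batch spends waiting rather than editing, which would match the slowdown described above.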

National Museums Greenwich subject ID

Hi. Does anyone know if there is an equivalent of Property:P7332 for subjects of artworks at National Museums Greenwich rather than the artists? I was trying to add a subject link to Q5224177 but it comes out as a brief note on the individual but no files as they were subject (person) rather than maker. To get the subject files the link should be https://collections.rmg.co.uk/collections.html#!csearch;authority=agent-8868;browseBy=person but the property forces the link to https://collections.rmg.co.uk/collections.html#!csearch;authority=agent-8868;browseBy=maker From Hill To Shore (talk) 02:02, 22 February 2020 (UTC)
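To illustrate why the property "forces" the maker link: external-ID properties typically build their links by substituting the identifier into a URL pattern with a `$1` placeholder. The sketch below assumes that mechanism; the two patterns are reconstructed from the URLs quoted above, and the identifier form "agent-8868" is a guess, not the confirmed format of P7332.

```python
# Hypothetical URL patterns, reconstructed from the two links in the question.
MAKER_PATTERN = (
    "https://collections.rmg.co.uk/collections.html"
    "#!csearch;authority=$1;browseBy=maker"
)
PERSON_PATTERN = (
    "https://collections.rmg.co.uk/collections.html"
    "#!csearch;authority=$1;browseBy=person"
)


def build_url(pattern: str, identifier: str) -> str:
    """Substitute the identifier for the $1 placeholder in a URL pattern."""
    return pattern.replace("$1", identifier)


# The same identifier yields a different page depending on the pattern,
# so a subject link would need its own pattern (or its own property).
maker_url = build_url(MAKER_PATTERN, "agent-8868")
person_url = build_url(PERSON_PATTERN, "agent-8868")
```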

Proposal to undeprecate comment (DEPRECATED) (P2315) ("Comment" property)

Even though this is not structured data, it is sometimes useful to add an editorial comment about an entity to alert editors of the item (e.g. to what should not be changed or added). This is similar to comments in the source of the Donald Trump article in English Wikipedia. For example, Talk:Q19862406 contains some information that is important to editors. Another example is information that should not be added to Wikidata, or that was removed by consensus (currently there is no way to indicate this). This property is to be used as both main statement and qualifier. The recently created Wikimedia community discussion URL (P7930) is to be used together with this property (as a reference).

For clarity, I also propose to rename the property to "editorial comment" and to state explicitly that these should be ignored by data (re)users. However, as it is important to editors, I propose to order it at the top of the entity page, even above instance of (P31).--GZWDer (talk) 12:15, 20 February 2020 (UTC)

Once again, what's wrong with suggesting people actually look at the Talk pages for this kind of thing? Most items have no talk page, so the existence of one (blue instead of red link) should be a clue to read it... ArthurPSmith (talk) 19:18, 20 February 2020 (UTC)
I don't think most users will read the talk page before editing an item.--GZWDer (talk) 22:39, 20 February 2020 (UTC)
What makes you think they will read this comment either? ArthurPSmith (talk) 14:26, 21 February 2020 (UTC)
Well, except that some people think it’s useful to create talk pages with nothing but {{Item documentation}}, producing blue links with no useful information. —Galaktos (talk) 12:57, 22 February 2020 (UTC)
  •  Support For analogy, consider the MARC standards for book cataloguing. These are very very structured, and very very prescriptive, with the aim being to capture as much of the information about the book or edition as possible, in a very prescribed framework. But even MARC allows a number of fields (identified by a code in the 500 to 599 range) for notes of different kinds in free-text. Sometimes there are things (including important warnings and caveats) that are worth communicating to other people reading an item, that simply can't be expressed just with property statements and qualifiers. GZWDer is quite right that most people simply will not see such messages if they are on the talk page. Nor are they accessible for query and retrieval there. The assertion sometimes made that "this isn't structured data, therefore we can't have it here" is very thin, giving no answer to the question "Why not?" Indeed, being able to attach comments as qualifiers to particular statements does locate those comments in a very specific and structured way.
Obviously, wherever possible, we should try to express information through properties and values, that are immediately internationalised, and expressed in terms of items that are themselves parts of the wikibase. Wherever possible we must make the extra effort to try to express information that way. But it isn't always possible, so in my view: yes, there is a role for free-text comments, attached as qualifiers to statements, or main statements to items. Jheald (talk) 15:14, 21 February 2020 (UTC)

Questions about photo(s) on a wikidata page

1. Is only one photograph permitted per wikidata page about a person?

2. Can a wikidata page about a person have 3 photos, if they show that person in their youth, their middle age and their elder years to give a sense of the changes to that person's appearance over their lifetime?

Thank you,

    Tibet Nation (talk) 18:26, 20 February 2020 (UTC)
The image (P18) property is intended to hold a single image that can be displayed in infobox templates, etc. If you want multiple representations, it's best to make a montage, and use montage image (P2716). Ghouston (talk) 22:48, 20 February 2020 (UTC)
The infobox displays only the first image. If you want to have it display the second image you have to deprecate the first image or prioritize the second image. The field does not give an error message if it contains more than one image, so there is no current restriction. When previously discussed, people were upset by more than three images. I personally think it should hold at least the two best images, and it would be even better if periodically it automatically switched the prioritized image just to change things around once in a while. --RAN (talk) 04:57, 21 February 2020 (UTC)
Restricting to a single image is the only restriction that doesn't seem completely arbitrary to me. If you are going to have multiple images, you may as well include every relevant image from Commons. Ghouston (talk) 05:08, 21 February 2020 (UTC)
Reductio ad absurdum always welcome! --RAN (talk) 06:27, 21 February 2020 (UTC)
When you have N images for an item, it's likely that somebody will think they don't depict the concept sufficiently, and they need N+1. That's my reasoning for stopping the process at 1. How many images would you need to depict a large city properly? Ghouston (talk) 01:26, 22 February 2020 (UTC)
Exactly. In the beginning, we did not limit the number of images in P18 and we ended up with items with 100+ images (somehow related to the item) in P18.--Jklamo (talk) 15:01, 22 February 2020 (UTC)
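The rank behaviour RAN describes (prioritize one image, or deprecate another, to control what an infobox shows) can be sketched roughly as follows. This is only an illustration of the usual "best rank" rule, in which preferred statements win over normal ones and deprecated ones are ignored. The statement dicts and file names are made up, and real infobox modules may behave differently.

```python
def best_statements(statements):
    """Return preferred-rank statements if any exist, else normal-rank ones.

    Deprecated statements are never returned.
    """
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


def infobox_image(statements):
    """Pick the first best-ranked image, or None if nothing usable remains."""
    best = best_statements(statements)
    return best[0]["value"] if best else None


# Hypothetical P18 values for one person at three ages.
images = [
    {"value": "Youth.jpg", "rank": "normal"},
    {"value": "Middle age.jpg", "rank": "preferred"},
    {"value": "Old age.jpg", "rank": "deprecated"},
]
chosen = infobox_image(images)  # the preferred image wins over the first one
```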

Need redirect auto replacement

Take a look at Mary Wellesley (Q75387210) and the field for father. Because of a merge the field contains the old value that leads to a redirect so it gets an error message. Is there any way that these can be auto-replaced with the correct value? It also leads to problems in the genealogical graphic at Commons. See Commons:Category:George_Cadogan,_5th_Earl_Cadogan where the bad value appears as the Q-number of the redirect. --RAN (talk) 20:46, 20 February 2020 (UTC)

Good old Peerage bulk import problems! There is said to be a bot that replaces these in due time, namely User:KrBot, according to User:GZWDer. Ivan A. Krestinin might have more information. -Animalparty (talk) 22:06, 20 February 2020 (UTC)
KrBot will wait at least 24 hours, probably more, in case the merge is reverted.--GZWDer (talk) 22:15, 20 February 2020 (UTC)
Ordinary redirects may be handled by Lua modules without any special handling. But in this case it is a double redirect, so the Lua module will not work. Currently there are two bots fixing double redirects, Revibot and PLbot.--GZWDer (talk) 22:18, 20 February 2020 (UTC)
Lua does sometimes need special handling for redirects. That's one of the reasons we still fix them (the other is WDQS). --Matěj Suchánek (talk) 10:26, 22 February 2020 (UTC)
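As a rough illustration of the double-redirect problem discussed above: a consumer that follows only one hop ends up on another redirect, while following the chain to its end reaches the real item. The sketch below assumes the redirects are available as a simple mapping; the item IDs are placeholders, and this is not how KrBot, Revibot or PLbot are actually written.

```python
def resolve_redirect(qid, redirects, max_hops=10):
    """Follow a chain of redirects until a non-redirect item is reached.

    `redirects` maps a redirected item ID to its target. Loops and
    overly long chains raise ValueError instead of spinning forever.
    """
    seen = set()
    while qid in redirects:
        if qid in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long at {qid}")
        seen.add(qid)
        qid = redirects[qid]
    return qid


# Placeholder IDs: Q100 was merged into Q200, which was later merged into Q300.
redirects = {"Q100": "Q200", "Q200": "Q300"}
target = resolve_redirect("Q100", redirects)
```

A bot fixing statements would then replace the stale value with the resolved target, presumably after the waiting period GZWDer mentions in case the merge is reverted.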

Wikisource categories for topics

Wikisource has no page at "Staffordshire", so should we link its "Category:Staffordshire" to Category:Staffordshire (Q8809886) or, as we would for many Commons categories, with no page at that project, to Staffordshire (Q23105)? I favour the latter. If you disagree, please give reasons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:51, 21 February 2020 (UTC)

Commons categories are generally linked with the category item, if it already exists. Ghouston (talk) 00:49, 22 February 2020 (UTC)
...and, from what I have seen, Wikinews categories are linked with the main item. I believe what's going on with Wikisource could go either way depending on the category page in question. Mahir256 (talk) 05:25, 22 February 2020 (UTC)
Really? That's not my experience. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:36, 22 February 2020 (UTC)

Undetected vandalism

See Cayetano Coll y Toste (Q5055304) as an example; I only detected it because they changed the date to have the person dead before they were born. Many others in the current error detection batch had the same IP vandalism, all from different IPs. There are 10 detected IP vandalisms so far in the current month's batch, so that is one every 3 days or so. How are we detecting more subtle vandalism, like changing a date by a few years instead of 100 years? --RAN (talk) 02:46, 22 February 2020 (UTC)

I realise this may not apply to a lot of data, but it seems to me that any time a sourced statement is changed or removed it would need review. This is one way subtle vandalism could be detected and why sourcing statements is so important (sadly, most data here is not). --SilentSpike (talk) 10:41, 22 February 2020 (UTC)
If the date field contained a reference that reference would still be in place when the record was vandalized and appear to be properly referenced. Recently a warning has been added when a date changed and the reference not changed. That may help. --RAN (talk) 17:27, 22 February 2020 (UTC)
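The warning RAN mentions (a date changed while its reference stayed the same) boils down to a comparison like the one sketched here. The statement layout and the function name are invented for illustration; the actual warning in the Wikidata interface is implemented elsewhere and may use different criteria.

```python
def needs_review(old, new):
    """Flag an edit that changes a sourced value but leaves the source untouched.

    `old` and `new` are hypothetical dicts with a "value" and a list of
    "references". If the value moved but the cited references did not,
    the references may no longer support the statement.
    """
    value_changed = old["value"] != new["value"]
    was_sourced = bool(old["references"])
    references_unchanged = old["references"] == new["references"]
    return value_changed and was_sourced and references_unchanged


before = {"value": "1850", "references": ["stated in: some biography"]}
after = {"value": "1855", "references": ["stated in: some biography"]}
flagged = needs_review(before, after)
```

This would catch a date shifted by only a few years, which the died-before-born report cannot, but only for statements that were sourced in the first place.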

How to mass-remove descriptions?

jinmaku (Q651348) was incorrectly marked as instance of (P31)=Wikimedia disambiguation page (Q4167410), and the bots have added descriptions in many languages. Is there an easy way to remove them? Thanks. Mike Peel (talk) 17:21, 22 February 2020 (UTC)

Why I can't edit interwiki links?

https://www.wikidata.org/wiki/Q785653 There should be https://en.wikipedia.org/wiki/Archetype for EN wiki Page says: Could not save due to an error. The save has failed. Also at https://en.wikipedia.org/wiki/Archetype there is no russian --Oleh B (talk) 13:30, 22 February 2020 (UTC)

https://en.wikipedia.org/wiki/Archetype is already connected to Q131714, and https://en.wikipedia.org/wiki/Jungian_archetypes connects to Q785653; that is why the save fails. Also, there is no Russian page to connect to Archetype: there is a disambiguation page at Q346973 and one for Jungian archetype or archetype in psychology at Q785653, connecting to the Russian Wikipedia. An actual article for the generic archetype concept does not seem to exist in Russian. Pending a bigger refactoring, the current situation looks correct. Or is there a specific issue with the mapping? --Denny (talk) 21:48, 22 February 2020 (UTC)

Property:P1801

Property:P1801 is currently used for "commemorative plaques" and I added "signage" in the description. Can we change the name to more generic "signage" so the field can contain more than commemorative plaques? We often have pictures of the signs on buildings so we correctly identify them. Rather than create a new field, how about expanding this one? See Jedediah Higgins House (Q74473194) for an instance and peek at the plaque field. --RAN (talk) 04:36, 21 February 2020 (UTC)

I don't think that's a very good example & don't think 'signage' is a very sensible extension of P1801. Your example is a sign associated with a house known as the Jedediah Higgins House. A commemorative plaque tends to be a plaque affixed to a building (that is not named for the person named in the plaque), generally saying "Foo Bar, lived here, 1739-1829" - example at https://www.wikidata.org/wiki/Q7251#P1801. The plaque commemorates the person. The sign merely tells you what the building is called. So, yeah: strongly oppose. --Tagishsimon (talk) 04:41, 21 February 2020 (UTC)
You want to tighten the use to instance_of=human? I am not against using the related_image field to hold signage images, if we remove the restriction of not allowing use of related_image if the image field is populated. --RAN (talk) 04:59, 21 February 2020 (UTC)
I would use place name sign (P1766), see for example Thomas Paine Cottage (Q7792975) and Ray Charles Childhood Home (Q55806705) Piecesofuk (talk) 09:52, 21 February 2020 (UTC)
Perfect! I will add it as a "See also" with image, so more people are aware of it. --RAN (talk) 16:19, 21 February 2020 (UTC)
I tried to add in all the image fields as see alsos in image, if you know of others, or can search for all of them, please add them. There are about a dozen so far. Most of them I was not aware of. --RAN (talk) 21:06, 22 February 2020 (UTC)

Computational limit

I asked this once before, but I'm still not sure why we are at our computational limit. Wikidata:Database reports/items with P569 greater than P570 looks for people who died before they were born. It no longer runs because it times out in the 1 minute allotted for computations. What can be done to be able to run one of our important error detection searches for instances_of=human? It has been used to correct >1,000 errors in the past. It is important since it also detects errors that we imported from VIAF, and VIAF uses our corrections to correct their own database. As well as detecting typos, it finds where we have conflated two people of the same name. --RAN (talk) 17:36, 21 February 2020 (UTC)

I added FILTER(?bdate > ?ddate), which seems to help a bit (I got results without timeout both times I tried it, though I may have gotten lucky). --TweetsFactsAndQueries (talk) 18:57, 21 February 2020 (UTC)
Excellent, thanks! Wow, lots more errors to fix tonight. [BTW it was temporary, you must have tried during a slow load time]. --RAN (talk) 21:59, 21 February 2020 (UTC)
Maybe we could also run two reports, say one for men and one for women, and subdivide further when needed? I'm curious for your evidence that VIAF are ingesting our corrections (which would be good news); I thought they'd stopped. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:53, 21 February 2020 (UTC)
  • Oh no! Why did they stop? I can see that some of the conflated records we have a list of are very difficult to tease apart. Some of the simple typographical errors where two numbers are transposed were easy to fix and he corrected them right away when notified. Did the one person retire? I will look and see if I still have his email. --RAN (talk) 21:11, 22 February 2020 (UTC)
Wouldn't the easiest fix just be to double the limit to two minutes instead of just one? Robin van der Vliet (talk) (contribs) 13:26, 22 February 2020 (UTC)
The query service is already at maximum capacity, we can't easily let it do more work. ChristianKl15:47, 22 February 2020 (UTC)
  • Is this one of the reasons we have slowed down in adding new large data sets?
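For reference, the FILTER(?bdate > ?ddate) approach mentioned above, combined with Andy's suggestion of splitting the report by gender, might look something like the sketch below. The query text is an assumption about what such a report could contain, not the query the database report actually runs; it uses P31/Q5 for humans, P569 and P570 for the dates, and P21 for the optional split.

```python
def died_before_born_query(gender_qid=None):
    """Build a SPARQL query for humans whose birth date is after their death date.

    If `gender_qid` is given (e.g. Q6581097 for male, Q6581072 for female),
    the query is restricted to that value of P21, so that several smaller
    reports can be run instead of one large one.
    """
    gender_line = f"  ?item wdt:P21 wd:{gender_qid} .\n" if gender_qid else ""
    return (
        "SELECT ?item ?bdate ?ddate WHERE {\n"
        "  ?item wdt:P31 wd:Q5 ;\n"
        "        wdt:P569 ?bdate ;\n"
        "        wdt:P570 ?ddate .\n"
        f"{gender_line}"
        "  FILTER(?bdate > ?ddate)\n"
        "}"
    )


full_report = died_before_born_query()
female_report = died_before_born_query("Q6581072")
```

Whether a split version stays under the timeout would still depend on server load, as the replies above suggest.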

How to show the correct item if a statement is deprecated with 'applies to other...' reason?

Like in (4RS)-4-hydroxy-L-proline (Q27102938) I have two deprecated IDs (not removed, because someone probably would re-add these IDs in the future):

CAS Registry Number (P231): 30724-02-8, with qualifier reason for deprecated rank (P2241): applies to other chemical entity (Q51734763)
DSSTox substance ID (P3117): DTXSID60861573, with qualifier reason for deprecated rank (P2241): applies to other chemical entity (Q51734763)

What would be the best way to show the correct item (in which the same ID is added correctly)? Which qualifier should I use? Wostr (talk) 15:40, 13 February 2020 (UTC)

I have proposed Wikidata:Property proposal/intended subject.--GZWDer (talk) 22:41, 13 February 2020 (UTC)
What's wrong with normally adding the information to the correct item? ChristianKl14:14, 18 February 2020 (UTC)
And how to easily find the correct item from the wrong item? Using query every time? Wostr (talk) 09:51, 23 February 2020 (UTC)

How to qualify P22 to indicate 'official father' vs biological father ?

In a case like Diana Cooper (Q128576) (cf enwiki), how best to qualify father (P22) to distinguish her official father from her biological father?

I have looked at the table at Wikidata:WikiProject_Parenthood, but it's not very specific on this.

As for a probable parentage (eg Q5541503#P22, where the source (ODNB) says the later George IV "was believed to be the father of her son"), I think sourcing circumstances (P1480) (or perhaps nature of statement (P5102)) is the way to go, though I'm not 100% sure of the best value in this case. But I see from the table at the WikiProject we do also have may be father (Q21152551). Should this be used as the value of kinship to subject (P1039) instead? My instinct is to keep qualifiers for the nature of the relationship distinct from qualifiers for how certain it may or may not be. But perhaps there are good reasons that have led to the creation of Q21152551? I'd like to know what people think. Jheald (talk) 18:45, 22 February 2020 (UTC)

Notified participants of WikiProject Genealogy Jheald (talk) 18:46, 22 February 2020 (UTC)

User:Paweł Ziemian User:Jura1 (is this project family relationships?) User:Infovarius User:Melderick User:Bvatant

Notified participants of WikiProject Parenthood Jheald (talk) 18:47, 22 February 2020 (UTC)

Hello,
For Diana Cooper (Q128576)'s legal father, I would use qualifier kinship to subject (P1039) = legal father (Q66363656).
As for uncertainty of a statement, I usually use nature of statement (P5102) with values like hypothetically (Q18603603), presumably (Q18122778), disputed (Q18912752) ... I am not a big fan of may be father (Q21152551).
--Melderick (talk) 20:18, 22 February 2020 (UTC)
@Melderick: Thanks! I'd missed that legal father (Q66363656) already existed. Jheald (talk) 22:57, 22 February 2020 (UTC)
@Melderick: What about the inverse relationship?
Is there a preferred kinship to subject (P1039) qualifier for eg Louis XV of France (Q7738) → child (P40) → Charles Louis Cadet de Gassicourt (Q2618388)? Presumably biological child (Q53705034), though I see we also have child born out of wedlock (Q170393), frillo child (Q10499185), royal bastard (Q7375049). But it would be good to have proper guidance on this set down somewhere.
Also what qualifier for Louis Claude Cadet de Gassicourt (Q736277) → child (P40) → Charles Louis Cadet de Gassicourt (Q2618388)? I'm not seeing an item for "legal child" or "legal son" or "not biological child". Perhaps it should be created, or does something similar already exist and I've missed it? Jheald (talk) 08:57, 23 February 2020 (UTC)
adopted child (Q25858158)? —Scs (talk) 18:43, 23 February 2020 (UTC)

Bad data imports

CERL ID and Catalogo della Biblioteca IDs and deutsche-biographie have been added recently to people with the same name, born centuries apart. The years of birth and death were added from the sources. I can detect some errors since they cause a person to die before they were born, or cause a person to be older than 120 years. Are there other ways to detect the errors that may be more subtle? I noticed that some of the data from Catalogo della Biblioteca is corrupt before we import it; should we exclude bad data before we import it? See http://catalogo.pusc.it/auth/126105, which has conflated two people; the data came from VIAF https://viaf.org/viaf/2713072/, which may have come from us. --RAN (talk) 00:15, 23 February 2020 (UTC)

Have you asked the users doing the import first? I don't know Pusc but CERL people were working on a massive cleanup. Nemo 10:16, 23 February 2020 (UTC)
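The two checks RAN describes (death before birth, and a lifespan over 120 years) could be run on candidate data before import along these lines. This is a minimal sketch working on plain years; the function name, the messages and the record layout are made up for illustration, and only the 120-year threshold comes from the post above.

```python
def date_problems(birth_year, death_year, max_age=120):
    """Return a list of plausibility problems for a birth/death year pair."""
    problems = []
    if death_year < birth_year:
        problems.append("died before birth")
    elif death_year - birth_year > max_age:
        problems.append(f"lifespan over {max_age} years")
    return problems


# Hypothetical import rows: (name, birth year, death year).
rows = [
    ("Person A", 1720, 1790),
    ("Person B", 1850, 1650),  # two namesakes conflated
    ("Person C", 1500, 1790),  # birth and death taken from different people
]
suspect = [name for name, born, died in rows if date_problems(born, died)]
```

Conflations of namesakes who lived in the same period would pass both checks, so this only covers the grosser errors.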

Similar data-item

Q2098700 and Q1426123. Is one a subcategory of the other?Smiley.toerist (talk) 20:24, 23 February 2020 (UTC)

Taxonomy

Hoi, Please read this blogpost where a scientific taxonomy conference reports on taxonomy in Wikidata. It states quite clearly that the notions on taxonomy that have prevailed are at least problematic. Can we stop bickering for a change and accept that taxonomy is different from the current preconceptions? Thanks, GerardM (talk) 10:42, 22 February 2020 (UTC)

Sure. „The conclusion of the Taxonomy team was that taxonomy is hard.“ No breaking news at all. It would be great to have some real content from the Taxon Names and Concepts group (TNC). --Succu (talk) 21:15, 22 February 2020 (UTC)
BTW: The idea of a taxon concept (Q38202667) was established 25 years ago in The Concept of "Potential Taxa" in Databases (Q28957948). --Succu (talk) 21:54, 22 February 2020 (UTC)
"Wikidata has somewhat lost its way with taxonomy and it can be seen from the data that users do not understand the intricacies of taxonomic names versus taxonomic concepts". Or in other words, academics will spend a few more decades developing the platonic ideal of a perfect ontology while we're just making something that works. I'm entirely fine with it. Nemo 10:19, 23 February 2020 (UTC)
 Info: Here are the etherpad notes of Cost MOBILISE Wikidata Workshop 2020 (Q84943795). --Succu (talk) 15:32, 23 February 2020 (UTC)
What we have does not work; as discussed here at length. What heart rate does your name have? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:13, 24 February 2020 (UTC)
Ask your activity tracker (Q16001686) ;) --Succu (talk) 20:31, 24 February 2020 (UTC)
  • The biggest issue is that we are not able to make a clear choice between taxon concept and name concept. If each taxon item represents a name, and only one, then the taxon item should inherit all the properties of the name; for example, a "recombination" should be a subclass of a taxon. Otherwise, if we don't accept this kind of thing, which is understandable, then the names should be separated into their own items so that we can work on them. The issue is that we do neither the one nor the other, nor a compromise between the two. Christian Ferrer (talk) 18:34, 23 February 2020 (UTC)

Common-law marriage, concubines, fiancées, domestic partnerships and other types of partners

I'm having a hard time figuring out how to specify these kinds of things. *Treker (talk) 09:03, 23 February 2020 (UTC)

I imagine use of the spouse (P26) property with object has role (P3831) as a qualifier taking appropriate values for common-law spouse, fiancée, &c. --Tagishsimon (talk) 09:06, 23 February 2020 (UTC)
@*Treker, Tagishsimon: Where the relationship does not constitute a legal marriage, use unmarried partner (P451). This can be qualified if desired as Tagishsimon indicates, if there is more detail to record. Jheald (talk) 09:31, 23 February 2020 (UTC)
What about concubines? Historically concubines would be considered legal spouses, just of a lesser status than the "main" wife.*Treker (talk) 09:34, 23 February 2020 (UTC)
There are likely different countries with different conception of what it means to be a concubine. The key test is whether there was or wasn't a marriage. ChristianKl20:50, 23 February 2020 (UTC)
There are likely different countries with different conceptions of what marriage is, so that's not a very good test. I wonder if it's very helpful to have 2 properties in this domain. What's the advantage? --Tagishsimon (talk) 00:41, 24 February 2020 (UTC)
Yeah I think we should merge them into a single "Partner" item and have specifiers to signify the nature of the relationships. Common-law marriage for example is still a legal partnership that could be considered marriage, even if it's not the same kind as a regular marriage. *Treker (talk) 11:22, 24 February 2020 (UTC)

They both represent "Charles II" (查理二世)。—— Eric Liu留言百科用戶頁 05:51, 24 February 2020 (UTC)

Déjà vécu / déjà visité data

Several years ago I set about collecting data from Internet users concerning their déjà vécu and déjà visité experiences. The same questions were asked about both forms of déjà experience in an effort to determine what their similarities and differences are. Two papers were subsequently published, the first one in the Journal of Consciousness Studies (21 [11-12]:7-18, 2014), and the second in Explore (14[7-8]:277-282, 2018). Altogether there are 3258 data sets (ca. 80 questions per set) arranged in an Excel spreadsheet. Would there be interest in my uploading this data onto Wikidata and, if so, how should I go about it? Please reply to atf@alum.mit.edu. Thank you.

@ArtFunk: Having skimmed your paper, it seems that your dataset isn't a good fit for Wikidata. Generally, Wikidata stores information about individually notable entities and the connections between them, while your data seems to be tabular information about research subjects. This is not to say you shouldn't release your data; it would just be preferable to put it on Figshare (Q17013516) or another scientific data repository. Vahurzpu (talk) 16:18, 24 February 2020 (UTC)

Sitelinks vs interwiki

Wikidata has found Ferns (Q80005) on 94 languages of Wikipedia. Yet on English Wikipedia, the Languages section only lists 38. I'm curious why is this inconsistent? -- Zanimum (talk) 13:34, 24 February 2020 (UTC)

The enwiki interwikilink of fern (Q80005) is en:Polypodiophyta, which redirects to en:Fern. That page itself is connected to Polypodiopsida (Q373615) and gets its interwikilinks from there, not from Q80005. —MisterSynergy (talk) 13:42, 24 February 2020 (UTC)

Wikidata weekly summary #404

need assistance with linking item to entry

hi. I am trying to link item Q86242945 to the English Wikipedia article on "Live streaming world news". I keep getting an error message. could you please assist? thanks. --Sm8900 (talk) 17:54, 24 February 2020 (UTC)

@Sm8900: Fixed--Trade (talk) 18:02, 24 February 2020 (UTC)
@Trade:, thanks, that's terrific. by the way, would you happen to be able to please tell me the steps involved? or alternately, simply what I had been doing wrong? I really appreciate your help. thanks. --Sm8900 (talk) 18:04, 24 February 2020 (UTC)
actually, I just viewed the diffs for your edits, so I assume that illustrates the steps that I should follow for this. I appreciate your help. thanks. --Sm8900 (talk) 18:08, 24 February 2020 (UTC)
@Sm8900: If you go to Preferences > Gadgets there should be a gadget called 'Merge: This script adds a tool for merging items'. Activate that and there should be a menu at the right corner of the page called 'More'. Move your mouse over there and you will see an option called 'Merge with...' --Trade (talk) 18:22, 24 February 2020 (UTC)

How to edit label?

I am trying to edit the label of Category:Officiers of the Ordre des Palmes Académiques (Q8692654) to read (in English) "Category:Officers of the Ordre des Palmes Académiques", but when I try to publish the edit, I get the (cryptic) error message: "Could not save due to an error. The save has failed." Can anyone suggest the correct method of making this edit (or does it need a user with admin privileges to do it)? --R'n'B (talk) 21:43, 24 February 2020 (UTC)

There are several true duplicates of this item, so the item cannot be edited, because the edit would result in a sitelink conflict.--GZWDer (talk) 23:00, 24 February 2020 (UTC)

Do we have a rule against making revision deletion requests at Wikidata:Administrators' noticeboard?

@Jasper Deng: I'd like to have this clarified. I've made revision deletion requests at Wikidata:Administrators' noticeboard before but I never knew such a rule existed. --Trade (talk) 20:41, 16 February 2020 (UTC)

As we don't have an administrator mailing list, this may be the only viable way. Another way is via IRC, but I don't think there's always someone online and actively monitoring messages (in October I made an Oversight request to remove a password unintentionally leaked by another user in Wikidata, and I could only find an oversighter after 80 minutes).--GZWDer (talk) 23:57, 16 February 2020 (UTC)
@GZWDer: If it's urgent then I suppose you could just ping an admin who has been active within the last hour. Also what's an oversighter? --Trade (talk) 11:14, 17 February 2020 (UTC)
@Trade: WD:OS. ‐‐1997kB (talk) 11:36, 17 February 2020 (UTC)
I remember there was a time where all admins were able to read deleted revisions. Any idea why that was changed?--Trade (talk) 11:54, 17 February 2020 (UTC)
There are two types of revdel:
  • admin-revdel (which admins can do and undo, and all admins can still see the admin-revdel'ed content; this is logged in Special:Log/delete)
  • oversight-revdel (which only oversighters can do and undo, and only oversighters can still see the content; this is *not* logged publicly)
If you think you need revdel, have a look at Wikidata:Deletion policy#Revision deletion first and decide which sort of revdel you need. As WD:AN is watched by hundreds of editors, it may in many situations be wiser to approach individual admins or oversighters via email, in order not to draw too much attention to the problematic content. This is usually the case for content pages (items, etc.), but often not so much on high-traffic project pages such as the Project chat. —MisterSynergy (talk) 12:04, 17 February 2020 (UTC)
Indeed, requests for revdel are best sent to individual admins using wikimail.--Ymblanter (talk) 19:54, 17 February 2020 (UTC)
While we don't have an admin mailing list, we do have oversight@wikidata.org for the people with Oversight rights. I think in most cases that's a better road than using Wikimail to contact individual admins. ChristianKl 14:40, 18 February 2020 (UTC)
@ChristianKl:, how long is the response time usually when using that mail? --Trade (talk) 00:17, 25 February 2020 (UTC)
It should be quick, but sometimes they need two or three days to look at the edit and respond. Only three oversighters seem to be reading that list, and they are … currently not among the most active editors here ;-) This is another reason why asking an admin for a quick action can be a good idea. —MisterSynergy (talk) 07:21, 25 February 2020 (UTC)

Wikidata and SEO

There seems to be quite an increase in the amount of people who use Wikidata as a tool for search engine optimization (Q180711) self-promotion. I'm worried that the spam items are created faster than we can find and promote them for deletion. --Trade (talk) 22:57, 21 February 2020 (UTC)

Give some examples. --RAN (talk) 03:52, 22 February 2020 (UTC)
@Trade: from what I can tell, a lot of people don't see this as a problem in the first place. Self-promotion is actively encouraged in Wikidata. The notability guidelines have been changed recently to stop discouraging people from creating their own item. We are on the right track to gather the sum of all knowledge about all SEO professionals and Wikimedians. − Pintoch (talk) 12:05, 22 February 2020 (UTC)
If you're going to criticise the actions of individual contributors in good standing, such as Denny and Fuzheado, have the courtesy to name, and ping them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:23, 25 February 2020 (UTC)
  • When people claim that their job is Entrepreneur/Influencer/SEO, we really shouldn't let them on WD to build their brand. Their brand should already be built before we have them here. It is the "catch 22" of notability - you can't be a brand until you are on WD but you should be a brand before you are on WD. It really should take more than having an internet connection for someone to claim to be an influencer who needs to use WD for SEO. Quakewoody (talk) 14:41, 22 February 2020 (UTC)
  • We have millions of people on Wikidata already, many of them living. Maybe some day we'll have billions. But SEO people should definitely not be a priority; I'll happily vote for their deletion if they are not otherwise notable. ArthurPSmith (talk) 20:50, 22 February 2020 (UTC)
I think you are trying to apply English Wikipedia notability standards to Wikidata. Notability here is to actually exist, and not be a prank or vandalism. Since the language of the data that can be entered is limited, it is hard to use promotional language, you can really just show that you exist or a product exists. --RAN (talk) 14:45, 25 February 2020 (UTC)

Could the bots do the same work in fewer edits more efficiently?

It seems to me that the lag problems come from the number of edits more than from the content of these edits. So couldn't we avoid some of the lag by asking the bots with many edits to do their work in fewer edits? There are very many cases where a bot first adds a statement, then maybe a qualifier in another edit, and then a reference in yet another edit. Or it adds multiple labels or descriptions with only one label or description per edit. If bots did all the work on the same item at once there would be far fewer edits, and the server load would be lower with the same work done. --Dipsacus fullonum (talk) 23:48, 23 February 2020 (UTC)

Short answer is yes, but whether this would make a noticeable difference I'm not sure. I just finished updating my bot (approval pending) and took this into consideration. You can see (Special:Contributions/SilentSpikeBot) I'm just making two edits for each claim I'm editing. Rather than adding qualifiers individually, I just make a new duplicate claim (copy qualifiers, sources, rank, value, snakType) and then locally edit the data before adding as a new claim and removing the old one. Result is 2 edits per claim, rather than multiple edits for individual qualifiers. The downside to this is that it's not as clear what I've changed using the diff view. So you might ask: why I don't just make 1 edit to update the existing claim? That's unfortunately down to the Pywikibot (Q15169668) implementation (which I suspect a lot of bots are probably using) where I don't believe it's currently possible to edit an existing claim locally and only upload all the changes as one edit. --SilentSpike (talk) 00:06, 24 February 2020 (UTC)
You could use the editEntity function in pywikibot that maps to the wbeditentity API action to make those changes in one edit, but you’d need to change quite a lot of things. That function actually allows you to change all data you find in an item in one edit. Not sure whether it's worth to use it here, though. —MisterSynergy (talk) 00:35, 24 February 2020 (UTC)
Also QuickStatements adds qualifiers and references in separate edits. I would guess that it and many bots use the API functions wbcreateclaim, wbsetqualifier and wbsetreference. To create a new statement complete with all qualifiers and references you don't have to get and edit all data of the entity with wbeditentity. You can also use the simpler wbsetclaim. If processes like QuickStatements and bots did that, they would also have the benefit of doing the jobs much faster with the same number of edits per minute. And the server lag might be lower. --Dipsacus fullonum (talk) 07:53, 24 February 2020 (UTC)
@MisterSynergy: Thanks for the heads up, missed that method while trawling the source. See it now defined under class WikibasePage. I'll look into that. --SilentSpike (talk) 10:27, 24 February 2020 (UTC)
Answering the question: it depends. I have observed that the lag starts to grow when bots create new items with all the data in a single request, doing so many times per minute (public live data). So I believe this is about the amount of data rather than number of edits per unit of time. On the other hand, narrow API calls make better edit summaries which can be useful if you search for specific edits in contributions or history. Hopefully, it's going to be better with next development. --Matěj Suchánek (talk) 12:07, 24 February 2020 (UTC)
I was told that the WDQS update process already waits a short time after an item is edited before processing it, so if any more edits to the same item follow immediately they are processed together. So making fewer edits as proposed will not mean fewer WDQS updates. That also explains Matěj Suchánek's observations: if the bot can create or edit more different items per minute, it will probably increase the lag. --Dipsacus fullonum (talk) 10:27, 25 February 2020 (UTC)
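
For anyone who wants to try the wbsetclaim route mentioned above, here is a minimal sketch of what a single-edit statement could look like. It only builds the request parameters and sends nothing. The helper names (item_snak, full_statement, wbsetclaim_params) are made up for this example, the target is the sandbox item Q4115189, and the value Q1 in the reference is a placeholder rather than a real source. The occupation/of pairing is borrowed from the Flat-Earthers thread purely as sample data.

```python
import json
import uuid


def item_snak(prop, numeric_id):
    """Build a value snak that points at an item (Wikibase JSON layout)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": numeric_id},
        },
    }


def full_statement(entity_id, main, qualifiers=(), references=()):
    """Assemble one statement with its qualifiers and references attached.

    The statement GUID is generated client-side as '<entity id>$<UUID>'.
    `references` is a sequence of snak lists, one list per reference.
    """
    statement = {
        "id": f"{entity_id}${uuid.uuid4()}",
        "type": "statement",
        "rank": "normal",
        "mainsnak": main,
    }
    if qualifiers:
        grouped = {}
        for snak in qualifiers:
            grouped.setdefault(snak["property"], []).append(snak)
        statement["qualifiers"] = grouped
    if references:
        statement["references"] = []
        for ref_snaks in references:
            grouped = {}
            for snak in ref_snaks:
                grouped.setdefault(snak["property"], []).append(snak)
            statement["references"].append({"snaks": grouped})
    return statement


def wbsetclaim_params(statement, token, summary=""):
    """Parameters for one POST to api.php; the edit token must be fetched separately."""
    return {
        "action": "wbsetclaim",
        "format": "json",
        "claim": json.dumps(statement),
        "token": token,
        "summary": summary,
    }


# Sample data only: occupation (P106) with an "of" (P642) qualifier and a
# "stated in" (P248) reference whose value Q1 is a placeholder.
example = full_statement(
    "Q4115189",
    item_snak("P106", 19831149),
    qualifiers=[item_snak("P642", 660936)],
    references=[[item_snak("P248", 1)]],
)
params = wbsetclaim_params(example, token="PLACEHOLDER_TOKEN")
```

Sent as one POST, this would create the statement, qualifier and reference in a single revision instead of three. Whether that actually helps the lag is a separate question, as discussed above.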

Which property should I use when creating an item about a national sports team (Q1194951)? Both? --Trade (talk) 21:30, 24 February 2020 (UTC)

country for sport (P1532) is more appropriate I think. It's a subproperty of country (P17). -- Ajraddatz (talk) 02:07, 25 February 2020 (UTC)
Welp, if nobody minds I'll just remove Property:Country from the items. --Trade (talk) 15:26, 25 February 2020 (UTC)
Sometimes a country is a country only when it comes to sport, like Puerto Rico, and then there's cans-of-worms like Taiwan. Partially why there's a distinction I think? Moebeus (talk) 20:34, 25 February 2020 (UTC)

soweego 2 proposal

(Please disregard this message if you have already read it in the Wikidata mailing list, and apologies for the distraction)

  • TL;DR: soweego 2 is on its way!
  • The Project Grant proposal is out for your consideration:

Hi everyone,

Does the name soweego ring a bell? It's an artificial intelligence that links Wikidata to large catalogs: https://soweego.readthedocs.io/ It's a close friend of Mix'n'match (Q28054658), which generally copes with small catalogs.

The next big step is to check Wikidata content against third-party trusted sources. In a nutshell, we want to enable feedback loops between Wikidatans and catalog maintainers. The ultimate goal is to foster mutual benefits in the open knowledge landscape.

I would be really grateful if you could have a look at the proposal.

Can't wait for your feedback.

Best,
Hjfocs (talk) 15:57, 19 February 2020 (UTC)


Soweego 1 review

There seem to have been several reports on Soweego 1, but I don't think they were shared at Wikidata or elsewhere, nor do the actual outputs seem to be widely known. Reports are:

The outputs seem to be:

The Mix'n'match catalogues seem to be either unused or unusable. I suppose they are "automatched" based on the Soweego scores and qid, but now contributors would have to confirm them manually.
As of today, this is hardly done and even users who work on other catalogues might not find them as the entries aren't searchable by text (try to find a record by name). The entries provide no information that would allow doing that without going back to the source database.
The few unmatched ones create horrible entries like [12] repeating the label and description from MnM: "8447 soweego confidence score: 0.5026371479034424"
For databases where this is possible, maybe a suitable use of the MnM catalogues could be to import most entries that haven't a potential duplicate in MxM.
Is there any guidance available what's meant to be done with the MnM catalogues? --- Jura 15:31, 22 February 2020 (UTC)

WikiProject Movies has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. As Imdb is included, WP Movies would probably be most interested. --- Jura 13:36, 23 February 2020 (UTC)

Reply to soweego 1 review

Hey @Jura1: thanks a lot for your review, super appreciated and really valuable! Here are my replies.

  • 250,000 identifiers added. Excellent! Supposedly this is the key output of the project.
Yes, this is indeed the core task.
  • Oddly, the final report doesn't mention much of the problem we had with Twitter.
This is normal, the final report was completed before[1] this specific topic was raised.[2]
  • Another issue that also surfaced there (the absence of numeric ids in Wikidata) wasn't really followed up and is now handled by another user/bot.
This was added in the project backlog.[3] Thanks a lot for the pointer, I will mark the issue as resolved.
The Wikidata community is just great.
  • The Mix'n'match catalogues seem to be either unused or unusable.
I completely disagree.
People have already contributed some curation,[4][5] and the Auxiliary data matcher seems to be doing quite a lot of work as well.[6]
  • I suppose they are "automatched" based on the Soweego scores and qid, but now contributors would have to confirm them manually.
Correct.
  • the entries aren't searchable by text
This was already raised[7] and is in the project backlog.[8] It requires quite some work, and will be addressed if version 2 proposal gets selected.
  • The few unmatched ones create horrible entries like [2] repeating the label and description from MnM
Wow, thanks for spotting this, I totally agree. This seems to be MnM default behavior when creating a new item. I'm opening a new ticket for it.
  • maybe a suitable use of the MnM catalogues could be to import most entries that haven't a potential duplicate in MxM.
I'm not sure I understand this point, could you please expand?
  • Is there any guidance available what's meant to be done with the MnM catalogues?
What goes into MnM are medium-confident links, i.e., those needing curation.[9] MnM is the natural tool for that.

Cheers! --Hjfocs (talk) 12:17, 24 February 2020 (UTC)

References
  1. https://meta.wikimedia.org/w/index.php?title=Grants:Project/Hjfocs/soweego/Final&oldid=19240771
  2. https://www.wikidata.org/w/index.php?title=Topic:V6cc1thgo09otfw5&action=history
  3. https://github.com/Wikidata/soweego/issues/376
  4. see for instance the Users section in https://tools.wmflabs.org/mix-n-match/#/catalog/2709
  5. In https://tools.wmflabs.org/mix-n-match/#/catalog/2712 QTHCCAN has curated 163 potential matches
  6. 1k matches in https://tools.wmflabs.org/mix-n-match/#/catalog/2711 for instance
  7. https://github.com/Wikidata/soweego/issues/364
  8. https://github.com/Wikidata/soweego/issues/325
  9. m:Grants:Project/Hjfocs/soweego/Final#Summary

Further comments on soweego 1

  • About "The Mix'n'match catalogues seem to be either unused or unusable.": I doubt that 80 of 40,0000 entries highlight use of a MxM catalogue. As there is no auxiliary data present for these catalogues (AFAIK), the auxiliary data matcher merely copies entries already added to Wikidata back to MxM. For the three I checked one was by your bot [13], a bot prior to the project [14], the other manually [15]. Maybe the first two highlight some other problem. --- Jura 12:38, 24 February 2020 (UTC)
  • "maybe a suitable use of the MnM catalogues could be to import most entries that haven't a potential duplicate in MxM." To explain this a little further: it could be the key second outcome of your project.
    Consider all of a given resource:
    (1) either we have it already (nothing to do, statements are already here),
    (2) we can add it to existing items with confidence (what your bot did),
    (3) it's likely that we have an entry, but we aren't sure. Supposedly that's what's in MxM
    (4) Anything else that Wikidata is still missing. A large part of this could probably be added directly.
    Hope that clarifies it. --- Jura 12:38, 24 February 2020 (UTC)
  • I like that the matching is fed into mix'n'match. Relying on an existing tool may be more effective than trying to invent yet another one, like the Wikidata:Primary sources tool. It would be nice to learn more on what could increase usage of those mix'n'match tasks. Nemo 16:53, 24 February 2020 (UTC)
  • @Jura1: the example I mentioned actually has 226 manual matches out of 4,738 total entries, but of course there is room for improvement;
  • let me try to clarify your second point to see if I understand correctly: soweego mines matches over existing Wikidata items without a given catalog identifier. Confident ones are uploaded through the bot, while potential ones get to MnM for an additional check. The set of non-matching catalog identifiers could be used to create new Wikidata items: if the community thinks this is useful, it would be a key outcome of the project indeed.
  • @Nemo bis: I think that the missing labels and descriptions in MnM entries would increase the usability for sure, and that's scheduled work.
Cheers,
Hjfocs (talk) 16:58, 25 February 2020 (UTC)
Looking at some of the "226 manual matches" and the recent changes of the catalogue, it seems that the user simply clicked "manual sync" and an automated batch imported edits done by other users at Wikidata into MxM. It just shows that the identifiers get added directly to Wikidata despite the catalogue. It is similar to the "auxiliary data matcher" already discussed. --- Jura 09:01, 26 February 2020 (UTC)
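
To make the four cases from Jura's list concrete, here is a rough sketch of how catalog entries could be sorted into them. Everything in it is an assumption for illustration: the Bucket names, the classify function and especially the two score thresholds are invented here and are not taken from soweego.

```python
from enum import Enum


class Bucket(Enum):
    ALREADY_LINKED = 1    # (1) identifier is already on a Wikidata item
    CONFIDENT_MATCH = 2   # (2) can be added to an existing item by bot
    NEEDS_CURATION = 3    # (3) plausible match, goes to Mix'n'match
    MISSING = 4           # (4) no plausible item, candidate for creation


def classify(catalog_id, linked_ids, best_score, confident=0.8, plausible=0.5):
    """Sort one catalog entry into one of the four cases.

    `linked_ids` is the set of identifiers already present in Wikidata.
    `best_score` is the best match score found for the entry, or None
    when no candidate item was found. Both thresholds are hypothetical.
    """
    if catalog_id in linked_ids:
        return Bucket.ALREADY_LINKED
    if best_score is None or best_score < plausible:
        return Bucket.MISSING
    if best_score >= confident:
        return Bucket.CONFIDENT_MATCH
    return Bucket.NEEDS_CURATION
```

The interesting bucket for the "second outcome" discussed above is MISSING: those entries have no likely duplicate, so they are the ones that could be considered for direct import.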

Flat-Earthers

What's the proper property to link Mike Hughes (Q47546026) and flat Earth (Q660936)?--GZWDer (talk) 00:04, 25 February 2020 (UTC)
@GZWDer: On Paul Watson (Q201670) they have used movement (P135) to link him to conservation movement (Q1088777) and environmental movement (Q8189417). I think that might be a viable option. Andber08 (talk) 01:07, 25 February 2020 (UTC)

I've had similar thoughts. Only other thing I could come up with would be to make a "flat earther" "occupation". It's not really an occupation but we already have conspiracy theorist (Q19831149) which isn't really an occupation either. movement (P135) is probably better though. BrokenSegue (talk) 17:12, 25 February 2020 (UTC)
I would suggest Mike Hughes (Q47546026) occupation (P106) conspiracy theorist (Q19831149) with qualifier of (P642) flat Earth (Q660936). Visite fortuitement prolongée (talk) 22:53, 26 February 2020 (UTC)

giving consumers better opportunities for making informed decisions about which products and services to consume

Hello all, this is a post to ask you for your input on initiatives aimed at getting more data online, and in a more structured manner, about products and services, including reviews about those products and services. I am familiar with a few past and current initiatives that aim to do this, e.g.:

I've noted that some people, including myself, are interested in exploring the possibilities for breathing new life into what these initiatives are trying to accomplish: getting more data about products and services online, and in a more structured manner, using the power of wiki. See here and here, for example. The end-goal of these initiatives is, imo, to give consumers better opportunities for making informed decisions about which products and services to consume and how they wish to consume them. For many products and services, the information that would allow consumers to make more informed choices is still absent, so this is a goal for which I think there is still much to be gained. This post is to ask your thoughts on this: do you agree that these goals are indeed worthy of more attention and that there is much to be gained in this area? Are you aware of any more past or current initiatives that are aiming to do this, besides the ones listed above? Have there been discussions on how to give rise to these goals within Wikidata? What are the best ways of moving forward, do you think: a separate initiative or more work on the existing subsets of Wikidata for products and services? Have there been successes to date in getting data like these onto Wikidata? What have the most important challenges been? And what are potential business models for such an initiative, i.e., how would we fund the cost of building the ultimate goal of such an initiative: a wiki of all products and services, with items and properties developed by and for users, specific to product- and service-categories, including reviews by and for users? Any input would be much appreciated!! --Wikirik123 (talk) 19:59, 25 February 2020 (UTC)

Do you have a rough idea of the number of items you would need to create to do this well? I'm guessing at least in the millions, possibly hundreds of millions - so a separate Wikibase instance federated with Wikidata might make the most sense. See the Wikibase home page; also Wikidata:Federation input for how people are thinking they may want close federation to work (ideally). ArthurPSmith (talk) 21:35, 25 February 2020 (UTC)
Great sum up Wikirik123! I would just correct that it's already possible to add any kind of products on OpenFoodFacts (OFF), and not just food products. Also, I would point out that OFF has also started creating bridges to Wikidata, adding new properties such as Property:P5930 and Property:P1821, which amounts to mapping the metadata but not adding the products themselves in Wikidata. I don't think that would be necessary since OFF can provide this database and exposition of each product to the web, with a unique identifier. Going beyond that, I agree that a new instance of Wikibase (which I have experience in) would be the best idea, with the intended goal of making that the new engine and source of truth for the OFF project. Johanricher (talk) 11:48, 26 February 2020 (UTC)
Hi all, thanks for your input and great suggestions/thoughts!! Really appreciated!! Re ArthurPSmith: I don't know, but great question. Perhaps we will find an answer to this question through this new project! My guess would be that it's closer to the hundreds of millions than millions (perhaps even billions or tens/hundreds of billions). Thanks again for your suggestions, really useful.
A followup question Johanricher: i think one of the great powers of wiki is that in addition to users contributing data, users also contribute to the structure of those data (e.g. in defining specific categories/classes and properties for specific items or groups of items). What i see before me in a future wiki for products is that people who know a lot about product X, let's say TVs, discuss together all the useful properties and categories/classes there are for TVs. And that the people who know a lot about Lego sets, do that for Lego sets. Etc. I am a big fan of OFF, but since they ask users to provide data for a set group of product categories, that are all very relative to food items but not necessarily to other items, this makes me wonder if OFF is the right place for entering such data for products more broadly. I also think that even for food, users could have a lot to contribute about the useful properties and categories/classes that there are (i could imagine, for example, there being quite different things of interest for vegetables than for meat, and even different for food products in Holland than in Ghana, for example). I must admit i do also see the value of a simple interface like that for your everyday-user, but perhaps it is a good idea to combine the 2 in a future "product wiki"? With a more extensive interface for the hobbyists, who enjoy thinking up and discussing new product(group)-specific properties and categories/classes etc, and a simple interface for everyday-users? Something to think about. And great that you have experience in setting up new wiki instances, that will be useful in this project, if you will be willing to help! --Wikirik123 (talk) 00:03, 27 February 2020 (UTC)

Converting a Wikipedia watchlist to a Wikidata watchlist

I have got a question which I cannot answer, but maybe someone here can. Can a Wikipedia user easily convert their Wikipedia watchlist to a Wikidata watchlist, i.e. can they watch on Wikidata the items for the same articles they watch on Wikipedia? Let us assume for simplicity that the watchlist only contains articles and that the user is not a tech genius (I might be accidentally wrong about the second assumption though, but are there any layman solutions? Of course one may write a script and convert one to another). Direct pasting of the watchlist does not work.--Ymblanter (talk) 14:00, 26 February 2020 (UTC)

Hello @Ymblanter: in your preferences, for example at en:Special:Preferences#mw-prefsection-watchlist for the English language version, you can activate a checkbox in order to also get changes from Wikidata for all articles on your watchlist, so from my point of view there would be no need for converting a watchlist. --M2k~dewiki (talk) 14:17, 26 February 2020 (UTC)
Thanks, but the person specifically wanted to separate the watchlists, they are aware of this possibility.--Ymblanter (talk) 17:03, 26 February 2020 (UTC)

The following could work:

--- Jura 15:34, 26 February 2020 (UTC)
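
For the scripted route mentioned in the question, a minimal sketch could look like the following. It relies on the wbgetentities API action, which can look up items by site and page title. The function names are invented for this example, the code only builds request URLs and parses a response that has already been fetched, and it assumes the watchlist titles are article titles on one wiki (enwiki by default).

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def chunks(titles, size=50):
    """Split the title list into batches; 50 is the usual per-request limit."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def request_url(titles, site="enwiki"):
    """URL asking Wikidata which items are linked to the given article titles."""
    params = {
        "action": "wbgetentities",
        "sites": site,
        "titles": "|".join(titles),
        "props": "info",
        "format": "json",
    }
    return API + "?" + urlencode(params)


def qids_from_response(data):
    """Collect item IDs from a parsed wbgetentities response.

    Titles without an item come back as entries marked "missing" and are skipped.
    """
    found = [
        entity_id
        for entity_id, entity in data.get("entities", {}).items()
        if entity_id.startswith("Q") and "missing" not in entity
    ]
    return sorted(found, key=lambda qid: int(qid[1:]))
```

The resulting Q-ids, one per line, can then be pasted into Special:EditWatchlist/raw on Wikidata. Redirects and articles without an item would still need to be handled by hand.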

Additional interface for edit conflicts on talk pages

Sorry for writing this text in English. If you could help to translate it, it would be appreciated.

You might know the new interface for edit conflicts (currently a beta feature). Now, Wikimedia Germany is designing an additional interface to solve edit conflicts on talk pages. This interface is shown to you when you write on a discussion page and another person writes a discussion post in the same line and saves it before you do. With this additional editing conflict interface you can adjust the order of the comments and edit your comment. We are inviting everyone to have a look at the planned feature. Let us know what you think on our central feedback page! -- For the Technical Wishes Team: Max Klemm (WMDE) 14:15, 26 February 2020 (UTC)

Is gender a property that may violate privacy or likely to be challenged?

@Daniel Mietchen: removed sex or gender (P21) from Q28913663 for WD:BLP. It has been removed by other user(s), but the removal was reverted by @ديفيد عادل وهبة خليل 2:. There's a source that refers to the person as female. I don't know whether it is proper to include the gender information in Wikidata.--GZWDer (talk) 20:10, 15 February 2020 (UTC)

I have reformatted the above link to the item in question. --Daniel Mietchen (talk) 14:23, 19 February 2020 (UTC)
I've heard about religion, medical conditions and sexuality being information that violates privacy but never gender. It just seems like such extremely basic knowledge. --Trade (talk) 21:45, 15 February 2020 (UTC)
Not so basic if the only mostly binary value determines your competitors in sports, could mean "all or nothing" in heritage depending on the jurisdiction, etc. It's on the same privacy level as religion, sexual orientation, political views, address, phone number, or banking account. Not like the less critical and recently discussed weight or cup-size (boobpedia ID). –84.46.52.151 07:52, 20 February 2020 (UTC)
As in everything in life (Wikidata included), common sense is required. Nothing is absolute. Gender may be non-controversial or obvious for the vast majority of living or historic people, but if there is reason to suspect that it is controversial, or sensitive, for some living people, then "basic knowledge" needs to take a hike, and exceptionally high-quality sources are required. I have no knowledge of the subject in question, but in general, in some cases it may be preferable to leave the field empty, even if reliable sources can be scrounged from the depths of knowledge, if such information is not public, not widely circulated, and/or contradicts a person's stated or assumed gender. -Animalparty (talk) 23:51, 15 February 2020 (UTC)
I put it back, with a reference. Ghouston (talk) 01:17, 16 February 2020 (UTC)
This was not a reference, this was speculation in the reference section of the claim in question. I have removed it again, and a source that explicitly states a gender is the absolute minimum requirement in these situations. —MisterSynergy (talk) 23:22, 19 February 2020 (UTC)
@MisterSynergy: If the OTRS ticket shows that the value is incorrect, but doesn't give a correct value, wouldn't it be better to deprecate the statement? Perhaps make a new item for reason for deprecated rank (P2241) by analogy with consensus to remove (Q55193796), maybe "Reason for removal in OTRS ticket", and put the OTRS id in the Talk page (since there doesn't seem to be a property for OTRS ids. Perhaps there should be, so that OTRS tickets could be used as references if they supply a correct value). Ghouston (talk) 01:31, 20 February 2020 (UTC)
The OTRS ticket is not a public source, thus we cannot use it here as a reference. Deprecation would be the way to go if there was a reference to a serious source, but we also found that the given value was incorrect. As long as there is no such reference, we do not need to keep the claim at all. —MisterSynergy (talk) 07:47, 20 February 2020 (UTC)
@Animalparty: If we know the person's stated or assumed gender we can easily use that as the gender we show. I find it hard to imagine that you can have controversy about someone's gender without having sources that you can use to have a sense about what gender might be appropriate for the person. In the worst cases you might have a sourced "unknown value" statement. ChristianKl 14:30, 18 February 2020 (UTC)
@GZWDer, ديفيد عادل وهبة خليل 2, Trade, Animalparty, Ghouston, ChristianKl: I had non-public reasons to remove the information, and I shared them with OTRS. These reasons are clearly stated in WD:BLP, which is why I linked there from the edit summary. Perhaps that was not enough, and I am open for suggestions on how that process could be improved. In any case, I have pinged OTRS again, and the reasons for redacting the information have not changed, so I strongly suggest to keep it redacted, ideally in a way that would reference the OTRS ticket by something like Wikimedia VRTS ticket number (P6305), though that property was primarily intended for copyright stuff. --Daniel Mietchen (talk) 14:23, 19 February 2020 (UTC)
We do have some alternatives for people who don't want to be identified as either male or female, if that's the issue. There's a list on the sex or gender (P21) constraints. Ghouston (talk) 22:54, 19 February 2020 (UTC)
@MisterSynergy: so you don't believe that inferred from pronoun used (Q73168402) and inferred from person's given name (Q69652498) are actually good enough for Wikidata purposes? We need some other reference, like a statement in some other database where somebody has probably made the same assumption on our behalf? Or importing the claim from Wikipedia, where somebody else has added a gender based on a guess? My own guess is that those two "reasonings", often combined with checking their appearance in photos, are likely to be accurate in a vast majority of cases. No data is ever 100% certain. Some of the claims about education and former jobs may turn out some day to be fabrications (or perhaps never discovered), but I doubt that the percentage is very high. Ghouston (talk) 23:59, 19 February 2020 (UTC)
This is not about correctness or likelihoods that your speculation might be "correct". Instead, think of Wikidata as a secondary database that collects publicly available data. In this case, there is apparently no explicit information about the gender of the person available, which means that Wikidata does not know it either.
Of course, unsourced claims and wild deductions based on names or pronouns or even images are widely used, and this is borderline okayish for situations where nobody complains. Here, someone has complained and we should thus only include information that is explicitly mentioned in a serious source, and use the ranking tool in case the referenced information is found to be incorrect. —MisterSynergy (talk) 07:43, 20 February 2020 (UTC)
In this case is it meaningful to add sex or gender (P21)=somevalue? Especially if we meet constraint violations.--GZWDer (talk) 11:21, 20 February 2020 (UTC)
There are different opinions about how to use unknown value:
  • Some say it means that this information is generally unknown, and there are sources which explicitly state it as "unknown". According to this approach, you can only add it if you find a source that claims that the gender of the person in question is unknown; you should of course add this source in the reference section of the claim.
  • Others use it "because a thorough search for sources yielded no results, thus I speculate that the information is unknown (to the general public)". According to this approach, we could theoretically mass-add unknown value in plenty of properties to tons of items. I don't think that this would generate any benefit.
Which constraint is being violated here? Is it a item-requires-statement constraint (Q21503247) or value-requires-statement constraint (Q21510864) or similar on another property? These ones are often unfixable… —MisterSynergy (talk) 11:49, 20 February 2020 (UTC)

Preventing ping-pong protocol

We lack a mechanism to dissuade users from re-adding statements, in situations in which a subject has indicated that a property is, for them, a violation of their reasonable expectation of privacy (per WD:BLP "and which doesn't violate a person's reasonable expectations of privacy").

In the above case, a P21 value has been withdrawn 2 or 3 times. Right now if users check the item history punctiliously (they won't) or the talk page (maybe, maybe not) they may be alerted to an OTRS which may give them pause in re-adding the value. More likely, the value will be re-added, and removed again, and so on.

I think we need to do more by way of dissuasion and oversight, and I venture to suggest a mechanism: that in such cases there should be an OTRS log of the issue, and, after removal of the statement in an appropriate fashion (by edit, by oversight) a <no value> statement be added, with a Wikimedia VRTS ticket number (P6305) qualifier. The logic here is 1) <no value> is consonant with the subject's wishes, that Wikidata hold no information for this property 2) OTRS qualifier is a strong hint in exactly the right place that there is an issue which should give users pause for thought about adding a value 3) we can use WDQS to report on occurrences of <no value> OTRS qualified statements having additional statements and 4) OTRS, or others, can maintain a list of such items against which removals of <no value> OTRS can be spotted. (It may be that we should go a step further and use e.g. sourcing circumstances (P1480) or another qualifier to hold a more clear "do not amend this property" value.) Thoughts? --Tagishsimon (talk) 08:13, 23 February 2020 (UTC)

  • Well, no value has a different semantic meaning. It is "there is no gender for this entity", not "there is no gender for this entity in Wikidata". The latter is expressed by the absence of claims with non-deprecated rank of the corresponding property in the item.
    Formally, using deprecation would be just right, with a descriptive qualifier that explains the rank selection if desired. The ranking mechanism is all about data visibility, and "Deprecated rank" makes the data already pretty invisible for actual data users (SPARQL or Wikibase client parser functions).
    What's unfortunate here is the fact that the Web-UI does not really make deprecated data less visible. This is not totally surprising, given that the Web-UI is basically an editor tool, not an interface for data users, and editors somehow need to see the data not to be added again in order not to add it again. (External) casual visitors, however, might be using the Web-UI to inspect the data about them, and they will likely not understand the concept and impact of chosen ranks. IMO the display of deprecated rank claims in the Web-UI should be improved, in a way that better informs actual editors as well as external visitors what's on. —MisterSynergy (talk) 08:37, 23 February 2020 (UTC)
I'm a couple of orders less concerned about the supposed semantic meaning of <no value> than I am about a mechanism by which we can satisfy the BLP policy. Deprecating a value that a subject has indicated is a privacy violation does not seem "Formally ... right" so much as 100% against that policy. --Tagishsimon (talk) 08:47, 23 February 2020 (UTC)
Set it to <no value> and then deprecate it? Ghouston (talk) 08:59, 23 February 2020 (UTC)
Is there a way to combine <no value> with a qualifier indicating deliberate suppression? - Jmabel (talk) 18:34, 23 February 2020 (UTC)
I have created a new Wikibase reason for deprecated rank (Q27949697) - Q86535474 to prevent the statement from being readded.--GZWDer (talk) 21:26, 27 February 2020 (UTC)
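Point 3 of the proposal above (reporting on <no value> statements with an OTRS qualifier that sit next to other statements for the same property) could be sketched offline roughly as follows. This is only an illustration in Python, assuming entity data in the usual Wikibase JSON layout (`claims`, `mainsnak`, `snaktype`, `qualifiers`, `rank`); the function name `flag_readded_values` and the sample entity `Q0` are hypothetical, and a real report would more likely be a WDQS query.

```python
# Sketch only: flag properties where a <no value> statement qualified with
# P6305 coexists with other, non-deprecated statements that carry a value.
OTRS_QUALIFIER = "P6305"


def flag_readded_values(entity):
    """Return the property IDs on `entity` that look like a re-added value."""
    flagged = []
    for prop, statements in entity.get("claims", {}).items():
        # "Guard" statements: <no value> plus the OTRS qualifier.
        guards = [
            s for s in statements
            if s["mainsnak"]["snaktype"] == "novalue"
            and OTRS_QUALIFIER in s.get("qualifiers", {})
        ]
        # Any other statement with a real value that is not deprecated.
        others = [
            s for s in statements
            if s not in guards
            and s["mainsnak"]["snaktype"] == "value"
            and s.get("rank", "normal") != "deprecated"
        ]
        if guards and others:
            flagged.append(prop)
    return flagged


# Hypothetical entity: P21 has a guard statement and a re-added value,
# P569 has only the guard statement.
sample = {
    "id": "Q0",
    "claims": {
        "P21": [
            {"mainsnak": {"snaktype": "novalue", "property": "P21"},
             "qualifiers": {"P6305": [{"snaktype": "value"}]},
             "rank": "normal"},
            {"mainsnak": {"snaktype": "value", "property": "P21"},
             "rank": "normal"},
        ],
        "P569": [
            {"mainsnak": {"snaktype": "novalue", "property": "P569"},
             "qualifiers": {"P6305": [{"snaktype": "value"}]},
             "rank": "normal"},
        ],
    },
}

print(flag_readded_values(sample))
```

Only P21 is reported for the sample, since P569 carries the guard statement alone.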

UK 1922 or 1927?

There are:

The descriptions on these two items use 1922 or 1927 (oddly, not even in one language the year is the same on both items). Also, thousands of items use both items as values in country of citizenship (P27) with start/end year in qualifiers (20818*12 April 1927) [16].

Obviously, the descriptions and these qualifiers should match. So, what should it be? Do we need to have a bot update or remove all qualifiers? --- Jura 10:28, 20 February 2020 (UTC)

The territorial change happened in 1922, but the name of Parliament only changed in 1927 by the Royal and Parliamentary Titles Act 1927. We certainly should not say that the UK was *founded* in 1927. Owain (talk) 11:40, 20 February 2020 (UTC)
I'm still mystified about how the UK could have been founded in 1922 or 1927, while the USA was founded in 1776. It's not like the territory of the US hasn't changed a bit since then, but apparently name changes are all-important. It means that the entity that we call the UK didn't take part in WWI, for example. Ghouston (talk) 11:46, 20 February 2020 (UTC)
That was the point I was trying to clarify. the UK wasn't founded in 1922 or 1927, but 1801, and of course Great Britain and Ireland shared the same monarch since 1603. You are right that territorial changes happen all the time, which is why I added the clarification "territorial extent from 1922". Even the names of states change, but that should not and does not change the date on which the state was founded. Owain (talk) 12:10, 20 February 2020 (UTC)
I think it's debatable whether there should be two items, but if we have two, we should try to use them consistently and add adequate descriptions.
If the item is only applicable starting 1927, including 1922 in the description will likely lead the users to apply the wrong one. --- Jura 15:31, 20 February 2020 (UTC)
There will be two items even if only because Wikipedias have articles for both. But if you check en:United Kingdom, it gives a range of "formation" dates in the infobox, starting at 1535. I don't think it would be unreasonable to treat the United Kingdom of Great Britain and Ireland (Q174193) item as an historical period of United Kingdom (Q145) instead of a separate country in its own right. Ghouston (talk) 22:56, 20 February 2020 (UTC)

Whilst I have sympathy for the view that UKoGB&I = UK, equally, let me scotch a couple of Owain's points. the UK wasn't founded in 1922 or 1927, but 1801 ... no. The UKoGB&I was founded in 1801. And something called the UK was founded in 1921/22/27 (take your pick); which had different territory, different parliament, different name. It's not a genie that you'll ever get back into a 'just a rename' bottle. Great Britain and Ireland shared the same monarch since 1603 and Canada, Oz, NZ &c share the same monarch today; are not the same country, so we can dispense with that straw man. If the whole name/parliament/law thing does not mark, for you, the passing of one state and the birth of another, that's fine. But you have to ask why the loss of most of Ireland is not significant, whilst the union with Scotland, or the merging of GB & Ireland are significant. By the logic proffered, we could equally say this is all really the Kingdom of England rolling on as it does, gathering a country here, losing it there. You can look to the USA item and wonder why we don't just handle the UK like that. Or you could look at France (Q142) but also at French First Republic (Q58296) and French Third Republic (Q70802) (to choose just two of the many France Country type items) and conclude that if anything, the UK is slightly clearer. It's all not ideal; it is complex; it is not helped by eliding over the need to deal with the very real changes of state by asserting that an arbitrary set of them are the same. --Tagishsimon (talk) 18:20, 20 February 2020 (UTC)

In what way does the pre-1922 and post-1922 UK have a different Parliament? Why does a change of name make a material difference here, but not in say, Myanmar (Q836)? Owain (talk) 19:28, 20 February 2020 (UTC)
  • Unclear unless we know the reason that we are considering them separate entities. Is it because the territory changed, or because the name changed? Ghouston (talk) 01:29, 22 February 2020 (UTC)
    • 1927 used in thousands of qualifiers is for Royal and Parliamentary Titles Act 1927 (Q7375047). --- Jura 08:24, 22 February 2020 (UTC)
      • That looks like a simple name change, which can be handled with multiple official name (P1448) statements with start and end qualifiers. The most significant change was the loss of most of Ireland in 1922. However, I still think it's a questionable interpretation that that constitutes the creation of a "new" United Kingdom, any more than the loss of the Philippines needs to be interpreted as the creation of a new USA. Note also that Royal and Parliamentary Titles Act 1927 (Q7375047) is marked as an instance of Act of the Parliament of the United Kingdom (Q4677783), like many other acts passed by parliaments of "different countries" even back to the early 1800s such as Slave Trade Act 1807 (Q770832). Ghouston (talk) 23:33, 22 February 2020 (UTC)
      • In other words, I'd say it's most consistent with history, as most people understand it, to say that the UK lost some territory in 1922, and changed its name in 1927, than to say that the old UK was dissolved in 1922, a new UK formed, and then 5 years later they realized that they had forgotten to name the new country properly and renamed it. Ghouston (talk) 23:44, 22 February 2020 (UTC)
      • Well, saying "dissolved" and "newly formed" is unfair, it's more like the "old UK" split into the new state of Ireland and the "new UK". But many kinds of entities change over time, and we can represent them either with new and old items at the point of change, or a single item with start and end dates on particular statements where needed. Either way is presumably valid, but I think in this case, it's better represented as a state that lost some territory than as a split, since so many features of the UK remained the same. Ghouston (talk) 00:34, 23 February 2020 (UTC)
      • Note that after the formal split, both of the entities were eventually renamed: the UK in 1927, and the Irish Free State was renamed to Éire, or Ireland in 1937 and "described" as Republic of Ireland in 1948, according to enwiki. But 1922 is more relevant for the split than 1927. Ghouston (talk) 00:50, 23 February 2020 (UTC)
      • A similar situation was discussed recently in the context of Scotland possibly leaving the UK. See https://publications.parliament.uk/pa/cm201213/cmselect/cmfaff/643/643.pdf. The options are discussed starting on page 13, but basically some model or other would need to be adopted, one option being that the "RUK", the remainder of the UK, would continue as the successor of the UK, for the purposes of ~14k treaties, membership of the UN, etc., and Scotland would need to start as a new country from scratch. On page 130, Lidington says: "If we look at analogous examples, when Ireland established the Irish Free State in 1922 the United Kingdom continued to exist. It was accepted as such. The Free State and subsequently the Irish Republic became new countries. The same applied when India, which as a dominion had been a founder member of the United Nations, separated from Pakistan. India was accepted as a continuing state; Pakistan was the new state and had to apply to join the international organisations. The same took place when Eritrea became independent from Ethiopia, when South Sudan became independent from Sudan, when Malaysia and Singapore separated. If you look at recent European history, it is very striking that at the time of German unification the Federal Republic of Germany continued to exist and was accepted as such and what happened in international law and in terms of membership of organisations was that new Länder from the former German Democratic Republic became part of that continuing Federal Republic of Germany." Ghouston (talk) 06:28, 23 February 2020 (UTC)
I just noticed that in the case of Ireland, the Irish Free State (Q31747) and Republic of Ireland (Q27) have been set up as different countries. If there's a desire to treat every change of territory or adoption of a new constitution or name as effectively creating a new country, we also have the option of creating an additional item for the UK, to cover the 1922-1927 period. Ghouston (talk) 05:58, 25 February 2020 (UTC)
I made this change [17], so the enwiki article now actually mentions 1927. Ghouston (talk) 06:05, 25 February 2020 (UTC)
It seems to me that Wikidata country items are typically created because they have Wikipedia articles, and on Wikipedia, country names are the most important thing. A country generally gets a separate article for each historical name, and doesn't generally get a separate article for changes in constitution etc. Ghouston (talk) 13:00, 27 February 2020 (UTC)
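The single-item approach mentioned in this thread (one item, several official name (P1448) statements with start and end qualifiers) can be illustrated with a small lookup. This is a simplified Python sketch, not Wikidata's data model: years stand in for full dates, the tuple layout and the function name `official_name_in` are made up for illustration, and the 1801 and 1927 boundaries are the ones discussed above.

```python
from typing import Optional

# Simplified stand-in for official name (P1448) statements with
# start/end qualifiers. Years only; end=None means "still current".
OFFICIAL_NAMES = [
    ("United Kingdom of Great Britain and Ireland", 1801, 1927),
    ("United Kingdom of Great Britain and Northern Ireland", 1927, None),
]


def official_name_in(year: int) -> Optional[str]:
    """Return the official name whose validity period contains `year`."""
    for name, start, end in OFFICIAL_NAMES:
        if start <= year and (end is None or year < end):
            return name
    return None  # no statement covers this year


print(official_name_in(1914))
print(official_name_in(1930))
```

With this model the territorial change of 1922 would be recorded separately, as dated statements on the same item, rather than by starting a new item.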

What is the best primary title for an entry for an obituary?

What is the best primary title for an entry for an obituary? "Old Man Dies" the title of the obit in the newspaper? or "John Q. Smith (1880-1925) obituary"? or "John Q. Smith obituary"? Have we harmonized on any style? --RAN (talk) 14:25, 25 February 2020 (UTC)

The "title" statement value should probably be the title in the newspaper. I would generally favor that as the label as well; the description could maybe be more a standardized format with the name and dates. ArthurPSmith (talk) 14:54, 25 February 2020 (UTC)
19xx obituary of John Doe (18xx - 19xx) published in The New York Times How does this sound? --Trade (talk) 15:16, 25 February 2020 (UTC)
Interestingly we have obituaries published years before the actual death and some even years afterwards. --- Jura 09:07, 26 February 2020 (UTC)
i.e. "obituary" is generally sufficient instead of "19xx obituary". --- Jura 06:29, 27 February 2020 (UTC)
The title (P1476) should be whatever the actual title is in the source ("John Doe (18xx - 19xx)" or "actor John Doe dead at 99" or "Recent deaths" or whatever). Only in cases where the death notice lacks an official title should a more generic description ("obituary of John Q. Smith") be used as the Label, with title (P1476) set to no value or unknown value. -Animalparty (talk) 22:22, 26 February 2020 (UTC)
A problem we have with many entries generated by bot is that the journals they come from usually include a section header "Obituary" and then the name of the person with their lifespan. The items created here sometimes include just one, the other or both.--- Jura 06:29, 27 February 2020 (UTC)

Do we need error flags suggesting that we add a link to the Wayback Machine at every url link we have at Wikidata

Can someone peek at full_work_available (Property:P953) and described_at_URL (Property:P973)? They both have a suggestion_constraint called "archive URL". Can someone explain to me why we need this? It seems like something that can be handled by a bot, if we want to add a Wayback Machine link for every url entry that has one. I do not see the utility of adding an error flag to every url entry we have in Wikidata. --RAN (talk) 00:18, 26 February 2020 (UTC)

It's a suggestion, not an error. Isn't that the same question as we had with another property? I don't think the qualifiers suggested on that property are particularly useful, but maybe the user who added them wants to do some work in field .. --- Jura 09:13, 26 February 2020 (UTC)
@Jura1: I might be able to do work if Wikidata would stop glitching so much. --Trade (talk) 21:49, 27 February 2020 (UTC)

Is Wikidata's purpose to provide links to every (open) wiki?

Recently someone proposed Wikidata:Property proposal/EcuRed. I'm not against this specific proposal, and we have many other properties similar to this like Vikidia article and RationalWiki ID. However I doubt whether it is really a good idea to create a property for each open wiki, especially those comparable to Wikipedia (i.e. not subject-specific and not having a dedicated professional team to edit or review). One declined case is Baidu Baike ID, whose discussion mentions various issues that may potentially affect such a proposal (also likely affecting properties about other open wikis):

  • advertisement and paid editing - many sites do not care about paid editing, unlike Wikimedia project
  • content is not open - most of other sites linked from external ID are also not open
  • copyvio
  • misleading material

I think we should really consider which wikis are suitable for a property. This obviously does not include every wiki (you can create an open wiki in less than 10 minutes). Some cases for discussion, which are wikis without properties or property proposals:

The question is: how do the kinds of wikis that should have properties differ from others? Should we define some criteria for inclusion?

--GZWDer (talk) 03:04, 26 February 2020 (UTC)

In principle, for an exact correspondence between two items, properties are not needed, just use exact match (P2888). I would even argue to move external IDs to exact match (P2888) if there are less than 1k of them to be expected. So, instead of deciding from content/quality why not set a number above which existing exact match (P2888) statements can be converted to a dedicated external ID? Automating this decision process is desirable. The actual conversion of exact match (P2888) is also easily automated. --SCIdude (talk) 08:33, 26 February 2020 (UTC)
  • I just wanna point out that LyricsWiki has been locked down and closed for all editing for quite some time now. This was partially due to Wikia ToS changes regarding the hosting of copyrighted and offensive material (see pornogrind (Q584752) as an example why). --Trade (talk) 09:48, 26 February 2020 (UTC)
FYI Wikidata property linking to external MediaWiki wiki (Q62619638) and Special:WhatLinksHere/Q62619638 (there is currently no "Wikidata property linking to external wiki" item). Visite fortuitement prolongée (talk) 22:47, 26 February 2020 (UTC)
But my question is: for which wikis should we have a property? --GZWDer (talk) 01:09, 27 February 2020 (UTC)
  • Personally, I don't think many things linked with external-ids are that useful. Also some of the recently added identifiers are redundant. Obviously, the people adding these think differently. However, I don't see how this would differ by the use of the software, e.g. MediaWiki. --- Jura 06:25, 27 February 2020 (UTC)
    @Jura1: What identifiers do you consider redundant? --Trade (talk) 21:51, 27 February 2020 (UTC)
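The threshold idea raised by SCIdude above (keep using exact match (P2888) until a site has enough links, then convert to a dedicated external ID) could be automated along these lines. This is a rough Python sketch under assumptions: the input is a plain list of exact-match URLs, sites are grouped by host name, and the function name `hosts_ready_for_property` is hypothetical. Whether the cut-off should be "at least" or "more than" 1k is not settled in the thread; the sketch uses "at least".

```python
from collections import Counter
from urllib.parse import urlparse

THRESHOLD = 1000  # the "1k" figure suggested in the thread


def hosts_ready_for_property(exact_match_urls, threshold=THRESHOLD):
    """Return hosts with at least `threshold` exact-match URLs, sorted."""
    counts = Counter(urlparse(url).netloc.lower() for url in exact_match_urls)
    return sorted(host for host, n in counts.items() if n >= threshold)


# Toy input with a tiny threshold so the effect is visible.
urls = [
    "https://example.org/a",
    "https://example.org/b",
    "https://example.com/x",
]
print(hosts_ready_for_property(urls, threshold=2))
```

A count-based rule like this says nothing about the content concerns listed at the top of the thread (paid editing, copyvio, misleading material), so it would complement rather than replace a discussion of inclusion criteria.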

Create a help page

I would like to think that I'm not stupid but I find Wikidata hard to work on. I guess a lot of users come here with a problem and when they can't fix it they just give up and the problem is not fixed unless another user finds it.

Also it seems that Wikidata has no forum where users can ask for help or make simple requests.

For example, a user thinks that 2 items should be merged and the help page tells users to install a js-script or go through a manual merge process. The user thinks that it is too complex and asks for help. The reply at Help talk:Merge is basically to go study the manual.

I have the same problem. I have 2 items that I think should be merged but since I don't really edit here on Wikidata I don't feel like spending a lot of time figuring out how it works.

My guess is that experienced users can fix it in 30 seconds or explain why items should not be merged.

So why not make a page where users can ask for help? I think that if users come here again and again and get help, at some point they will be more interested in understanding how it works. And if they only come to Wikidata once or twice a year, then helping the newbies does not really take up a lot of the other users' time. --MGA73 (talk) 07:21, 27 February 2020 (UTC)

  • The problem is that we have many duplicates. With some experience, it's easy to merge them, but it's a basic skill every contributor should have.
A person who identified a duplicate already has some of the needed skill and actually doing the merge would help them acquire it. If an experienced user merges the items instead, the training opportunity is lost.
BTW maybe we should remove the explanation about the manual merge process ... --- Jura 07:44, 27 February 2020 (UTC)
why can't we make the merge script default enabled? that would seem to make this particular issue more user friendly. BrokenSegue (talk) 07:48, 27 February 2020 (UTC)
Some comments:
  • It is the purpose of help pages to help users, so that they don't have to seek help.
  • Users can always ask for help here or anywhere else appropriate.
  • Users are only told to go to their settings. That is the first thing they should do when they first arrive at a project anyway.
  • There is also Special:MergeItems. It can error out due to things the js tool handles better, but it is very simple (simpler than editing or asking for help) and the help page does mention it.
  • Merge script is not enabled by default due to concerns over possible misuse.
  • I recall we had a page where merges could be requested. But I don't know where it went.
--Matěj Suchánek (talk) 09:31, 27 February 2020 (UTC)
I think it would be useful to have a "Request a Merge" page similar to pages for requesting deletions or queries - merging/disambiguating here can be a little complex, and a bad merge has the effect of (after bot action within a day or so) changing a lot of statements that are hard to fully revert. People doing merges should probably have a firm understanding of Wikidata principles like what an item should represent, when it's appropriate to have 2 items rather than 1, policies on disambiguation and name pages, etc. ArthurPSmith (talk) 13:29, 27 February 2020 (UTC)
@Jura1: users have different things they master and different things they care about. Some know a lot about copyright. Some take good pictures. Some write good articles. Some know a lot about birds. Some are experts in plants. Some like to clean up stuff and organize stuff. We can't expect everyone to know everything.
I like to work on Commons and one of the things I would like to do is move music files from Wikisource to Commons, and it turns out that to do so it would be very helpful if we had Commons:Template:Creator for the composers. That template uses Wikidata, so when I move a music file from Wikisource to Commons I sometimes need to create the template on Commons. And it turns out someone needs to do something on Wikidata after I have created the template.
On Commons when someone asks about copyright we don't tell them to go read the laws unless they want to know about it. Often they just want to write an article on Wikipedia and need a photo to put in the article. All they care about is whether they can use the photo or not.
@Jura1:@BrokenSegue: Removing the manual thing and making the merge gadget standard could help. But won't it be harder to clean up if users merge stuff that should not be merged?
@ArthurPSmith: yes I think sometimes it is better to have users ask for help than having them do bad merges or giving up. --MGA73 (talk) 18:56, 27 February 2020 (UTC)
It seems hard to imagine active contributors who don't know how to merge. --- Jura 19:02, 27 February 2020 (UTC)
I find at least one case every week where someone blanks an item claim-by-claim instead of merging. I even created a template to save myself from having to give the same explanation so often. Bovlb (talk) 23:58, 27 February 2020 (UTC)
The prime reason not to make the merge functionality enabled by default is that we don't want to have people doing merges who don't understand what they are doing, because undoing merges can be quite complicated. If there were a one-click undo for merges, the way there is for normal edits, having the functionality enabled by default would likely be better.

Wikidata Languages Landscape dashboard

Wikidata languages as obtained from clustering across their Jaccard distances

Hello all,

A new dashboard got released for Wikidata’s birthday, and I realized that it was not properly announced here - sorry for that, let’s catch up :)

The Wikidata Languages Landscape dashboard provides insights into the ways languages are organized and used in Wikidata and across the Wikimedia projects that reuse Wikidata. It relies on different data sources: the Wikidata dumps, various datasets obtained directly from the Query Service, and datasets on Wikidata entity reuse statistics obtained from the Wikidata Concepts Monitor.

The dashboard provides different features:

  • The ontology tab, visualizing the graph of Wikidata ontology regarding languages
  • The language/class tab, generating graphs of items connected to a language, class or category
  • The label sharing graph, showing how similar the languages are judging from the extent of their overlap in what Wikidata items they have labels for
  • The language status tab, focussing on the UNESCO language endangerment categories and the Ethnologue language status
  • The language use tab, representing various indicators of language usage in Wikidata and across the Wikimedia projects

If you have any questions related to this dashboard, feel free to ask.

See also: full documentation of the dashboard. Lea Lacroix (WMDE) (talk) 13:49, 27 February 2020 (UTC)

Thank you for the dashboards and the CSV files with their content. After reading the definitions it is clear to me what the dashboards are about. Something I am interested in is an overview of the descriptions per language. It would be great to have a dashboard for that topic and a list in CSV format. There was a language statistic by user Pasleim, but the query for updating it timed out, so this overview is not up to date. Because Wikidata often has several items with the same label in a language, descriptions are important for choosing the correct item when using the Wikidata user interface. At the moment QuickStatements is slow, so I can't add as many descriptions as usual. I hope this improves soon. -- Hogü-456 (talk) 19:26, 27 February 2020 (UTC)
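For readers unfamiliar with the measure behind the label sharing graph: the Jaccard distance between two languages can be computed from the sets of items that have a label in each language. The following Python sketch shows the general formula only and is not taken from the dashboard's code; the item identifiers and the per-language sets are made-up toy data.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |a ∩ b| / |a ∪ b|; 0.0 means identical sets, 1.0 means disjoint."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# Toy data: which (made-up) items have a label in each language.
labelled_items = {
    "en": {"A", "B", "C", "D"},
    "de": {"A", "B", "C"},
    "sw": {"D", "E"},
}

for first, second in [("en", "de"), ("en", "sw"), ("de", "sw")]:
    distance = jaccard_distance(labelled_items[first], labelled_items[second])
    print(first, second, round(distance, 2))
```

In the toy data "en" and "de" share three of four items and are therefore close, while "de" and "sw" share none and sit at the maximum distance of 1.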

Plantations and insurrections/revolts

I am working on the data on plantations in Suriname (South Africa, once part of the colonial empire of the Dutch). On many of these plantations there were revolts/insurrections at several times (especially during the 19th century; the Dutch only abolished slavery in Suriname in 1863!). I am looking for a way to incorporate these events in the data of the Wikidata items of the plantations. Ideally (in my case) I would have a property for insurrection/revolt/fights (?) where I can simply add a date. I was hoping/thinking that maybe creating a Wikidata item for each revolt on every plantation is not the way to go. But I am still hesitating about the right way forward. Ecritures (talk) 17:18, 27 February 2020 (UTC)

After asking this question here I did find the following value: slave rebellion (Q1155622). Trying to think about the right way to incorporate this value (at least I can use qualifiers for start and possibly end time). Maybe simply with significant event (P793) > slave rebellion (Q1155622)? (Sort of answering my own question, I think), Ecritures (talk) 17:28, 27 February 2020 (UTC)
Concur with your answer. Lucky you were around to help the OP ;) --Tagishsimon (talk) 21:31, 27 February 2020 (UTC)
Yes!! I was happy to help :D Ecritures (talk) 21:32, 27 February 2020 (UTC)
Maybe just a mistyping, but Suriname is in South America, not South Africa. Ghouston (talk) 00:47, 28 February 2020 (UTC)

Dump of wikidata redirections

Hello, I am working on getting a programmatic redirection of wikidata IDs but cannot find this information in the current dumps. What's the best way to obtain a mapping from old id -> new id?

See here (this file is currently 534MB).--GZWDer (talk) 23:07, 27 February 2020 (UTC)
Thanks!, is there a way of running this query locally on a dump (e.g. latest-all.json), or is this data stored in a different database that is accessible? Jonathanraiman (talk) 23:16, 27 February 2020 (UTC)
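Once the old id -> new id pairs have been obtained, applying them programmatically is mostly a matter of following chains, since a redirect target may itself have been redirected later. A minimal Python sketch, assuming the pairs are already loaded into a dict (how to load them depends on the file's format, which is not shown here); the function name `resolve` and the sample IDs are arbitrary placeholders.

```python
def resolve(qid, redirects, max_hops=10):
    """Follow old -> new redirects until a non-redirected ID is reached."""
    seen = set()
    # Stop on unknown IDs, on cycles, or after max_hops steps.
    while qid in redirects and qid not in seen and len(seen) < max_hops:
        seen.add(qid)
        qid = redirects[qid]
    return qid


# Arbitrary placeholder pairs: Q111 was merged into Q222, then Q222 into Q333.
redirects = {"Q111": "Q222", "Q222": "Q333"}

print(resolve("Q111", redirects))
print(resolve("Q999", redirects))
```

IDs that are not in the mapping are returned unchanged, so the function can be applied to every ID in a local dataset without checking first.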

New deprecation reason "not disclosed"

I have created Q86535474 as one of possible value of reason for deprecated rank (P2241) as a solution of Wikidata:Project_chat#Preventing_ping-pong_protocol. @Succu: nominated the item for deletion. Feel free to discuss whether this is a good idea (to dissuade users from re-adding statements).--GZWDer (talk) 22:41, 27 February 2020 (UTC)

Perform changes

Could someone perform the changes suggested at Talk:Q5782572 and Talk:Q638210? Tgeorgescu (talk) 12:22, 28 February 2020 (UTC)

Capturing world records

Read today about this man who set a new Guinness World Record for longest plank held: https://www.nytimes.com/2020/02/27/us/marine-plank-record.html

Got me thinking about how we could model this kind of data in Wikidata, would need to capture:

  • The subject of the world record (plank (Q3094383))
  • The criteria of the record ("longest held" vs "highest altitude performed at" vs "heaviest weighted", etc.)
  • The value of the record (8:15:15 here)
  • The holder of the record (George E. Hood)
  • The governing body of the record (Guinness)
  • The time the record was set

Anyone have suggestions or thoughts? Perhaps this has been discussed before somewhere. --SilentSpike (talk) 14:48, 28 February 2020 (UTC)
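To make the list above concrete, the six pieces of information can be written down as a small record type. This is just a thinking aid in Python, not a proposal for specific Wikidata properties; the class and field names are invented, and the date the record was set is left empty because it is not stated here.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class WorldRecord:
    subject: str                    # what the record is about
    criterion: str                  # e.g. "longest held"
    value: timedelta                # only fits duration-type records
    holder: str
    governing_body: str
    date_set: Optional[str] = None  # not stated in the post above


plank_record = WorldRecord(
    subject="plank (Q3094383)",
    criterion="longest held",
    value=timedelta(hours=8, minutes=15, seconds=15),
    holder="George E. Hood",
    governing_body="Guinness",
)

# 8 h 15 min 15 s expressed in seconds: 8*3600 + 15*60 + 15
print(int(plank_record.value.total_seconds()))
```

The `value` field shows where the modelling gets harder: a duration works for "longest held", but "highest altitude" or "heaviest weighted" would need a quantity with a unit instead.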

Are academic ranks + the persons name a useful alias for the person’s Wikidata item?

Are academic ranks/titles a useful alias for a person's Wikidata item? E.g. Prof. Dr. h. c. Albert Einstein? Thanks, --Polarlys (talk) 19:19, 28 February 2020 (UTC)

Generally, I don't think so. I haven't seen it on any item either. Ahmadtalk 20:50, 28 February 2020 (UTC)

Item fix request

Please go to Wikipedia (Q52) and, in the Wikipedia language interlinks, delete “ዊኪፔዲያ” and write “ዊኪ ፔዲያ” for the Tigrinya language. Thanks!!! --151.49.42.20 20:14, 28 February 2020 (UTC)

✓ Done, thanks. Ahmadtalk 20:43, 28 February 2020 (UTC)

Repurposed item

The item Q2635280 has been about at least three different things during its lifetime: a hall, a conservatory, and now a "conflation". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:54, 28 February 2020 (UTC)

It was originally an item about the house (Wollaton Hall), but when a bot added National Heritage List for England number (P1216) it added the number for the conservatory (Camellia House) and created a new item Wollaton Hall (Q17528596) for the house. It was then repurposed based on the P1216 and image that was added at the same time, and sitelinks and most other content related to the house were moved to the item created by the bot. I found the item mostly repurposed but with some labels and an identifier for the house, so I moved them to Q17528596, and moved everything for the conservatory to a new item, Camellia House (Q86595130). There seems to be nothing to say "ambiguous item, do not use"; the closest I could find was Help:Conflation of two people, which is another type of item that should be split, which says replace with "conflation" - there is nothing to say conflation only applies to people so I used it on Q2635280. I don't know if there should be something more specific than "conflation" for this purpose; deletion would lose the links to the items that should be used instead. Peter James (talk) 21:55, 28 February 2020 (UTC)
Without having looked at this item in detail: we sometimes delete "conflation items" after all data has been pulled apart into separate items and backlinks have been migrated as well; in such cases, it is good practice to link all new items in the deletion reason. —MisterSynergy (talk) 22:40, 28 February 2020 (UTC)

knowledge ecosystem

Considering the widespread use of the term within wikimedia circles, it'd be good to have a nice definition of knowledge ecosystem (Q3578818) on wikidata. Any ideas? T.Shafee(evo&evo) (talk) 01:27, 29 February 2020 (UTC)

Siblings whose lives did not overlap

Is there a qualifier for sibling (P3373) to indicate that they were not contemporaneous? -- ie one died before the other was born. Without it, saying that they were siblings of each other seems a bit odd, if their lives never overlapped. Jheald (talk) 17:15, 28 February 2020 (UTC)

Is there a distinct or qualifying term in any language for this besides 'non-contemporaneous sibling'? Do real world sources make the distinction? -Animalparty (talk) 17:27, 28 February 2020 (UTC)
  • I don't think it needs stated. But could be included as "bonus material" if a qualifier existed. If you think about it, many half-siblings live at the same time but have never met. They are still (half) siblings even though they never met. Sibling is a legal term that applies to blood relations, it has nothing to do with where, when, or how someone lived. Quakewoody (talk) 18:20, 28 February 2020 (UTC)
  • It's already implied by birth and death dates. There would also be siblings who never met even though their lifespans did overlap. Ghouston (talk) 06:14, 29 February 2020 (UTC)
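The point that non-overlap is already implied by the dates can be checked mechanically. A short Python sketch, assuming full birth and death dates are known (real data often has only year precision or missing values, which this ignores); the function name and the sample dates are invented.

```python
from datetime import date
from typing import Optional


def lifespans_overlap(born_a: date, died_a: Optional[date],
                      born_b: date, died_b: Optional[date]) -> bool:
    """True if the two lifespans share at least one day (None = still living)."""
    end_a = died_a or date.max
    end_b = died_b or date.max
    return born_a <= end_b and born_b <= end_a


# Invented example: the elder sibling died before the younger was born.
print(lifespans_overlap(date(1800, 1, 1), date(1805, 6, 1),
                        date(1810, 3, 1), date(1870, 1, 1)))
# Invented example: overlapping lifespans.
print(lifespans_overlap(date(1800, 1, 1), date(1860, 6, 1),
                        date(1810, 3, 1), None))
```

As noted above, overlapping lifespans still do not tell you whether the siblings ever met.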

Glossary additions

Glossary has now entries on:

Please double-check, proofread as needed. --- Jura 12:26, 25 February 2020 (UTC)

And:

--- Jura 07:08, 27 February 2020 (UTC)

Also:

--- Jura 09:37, 28 February 2020 (UTC)

Continued:

Maybe the page format could be made to work differently. Directly add the text to Wikidata entities would probably be the easiest approach, but that might not make it easy for translations. For an alternative with subpages, I left a question at Wikidata:Translators'_noticeboard#Translation_extension_and_Wikidata:Glossary_format. --- Jura 21:06, 1 March 2020 (UTC)

Item with most labels or descriptions?

Are there lists of items with the most labels or descriptions? Eurohunter (talk) 20:19, 29 February 2020 (UTC)

Not that I'm aware of. You probably have to download the whole database and run a script to count for you. So9q (talk) 06:39, 1 March 2020 (UTC)
@So9q: There are statistics for items with most interwiki or most changes so why can't there be something like mentioned above? Eurohunter (talk) 14:33, 1 March 2020 (UTC)
In addition it could be based on category. Eurohunter (talk) 14:34, 1 March 2020 (UTC)
For descriptions, it might well be "all" category items or "all" disambiguation items. Which isn't really helpful.
#Wikidata_Languages_Landscape_dashboard has some analysis on labels. --- Jura 14:46, 1 March 2020 (UTC)
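The "download the whole database and run a script" route could look roughly like this. It is a Python sketch that assumes the JSON dump layout of one entity per line inside a JSON array (lines ending in a comma, with `[` and `]` on their own lines); the function names are made up, and the sample lines are toy data rather than real entities.

```python
import heapq
import json


def label_counts(lines):
    """Yield (number of labels, entity ID) for each entity line of a dump."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        entity = json.loads(line)
        yield len(entity.get("labels", {})), entity["id"]


def top_by_label_count(lines, n=10):
    """Return the n entities with the most labels, largest first."""
    return heapq.nlargest(n, label_counts(lines))


# Toy stand-in for dump lines.
sample_lines = [
    "[",
    '{"id": "Q1", "labels": {"en": {}, "de": {}, "fr": {}}},',
    '{"id": "Q2", "labels": {"en": {}}},',
    '{"id": "Q3", "labels": {"en": {}, "de": {}}}',
    "]",
]
print(top_by_label_count(sample_lines, n=2))
```

For a real run the lines would come from the dump file itself, for example an open handle on latest-all.json, and swapping `"labels"` for `"descriptions"` gives the equivalent count for descriptions. Because the lines are streamed and only the top n are kept, memory use stays small even though the pass over the full dump takes a long time.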

This item looks very much like "self-promotion" to me... Do we keep it ? --Hsarrazin (talk) 23:22, 29 February 2020 (UTC)

yeah, especially given the username of the creator @Gundolf_Meyer-Hentschel:. they seem possibly notable given one of the items they made is linked to a wikipedia article in german? BrokenSegue (talk) 23:46, 29 February 2020 (UTC)
Given the references to various databases, I think he's notable in our sense but there are plenty of statements that shouldn't be there. Humans don't have titles. ChristianKl 10:42, 1 March 2020 (UTC)
Honestly, whether it is or isn't self-promotion, is there a policy against it? Does the item satisfy notability? Does it meet Deletion policy? If yes and no, respectively, we keep it. It seems like the Wikidata community love to condemn any appearance of conflict of interest, but where does it explicitly say that's a bad thing? Where do we direct new users who may be unaware of the arcane, implicit rules of behavior? There was a proposal to create a Conflict of interest policy. It was rejected. As far as I can tell, being a regular contributor, Wikidata has almost no explicit policies or guidelines regarding conduct, and even when policies exist, they are not easy to find. Imagine how confusing it must be for newcomers. Until more transparent norms are clearly established, Wikidata will remain the Wild West, where anything goes until the posse objects. All data items will require curation, and incorrect or unverifiable properties should be addressed accordingly. Assume good faith is a policy however. -Animalparty (talk) 19:35, 1 March 2020 (UTC)

Images for Wikidata - "Global Young Academy"

Recently, OTRS has received several photo submissions, such as ticket:2020022410001019, alluding to a "Global Young Academy project" that says it "highlights global excellence and diversity in science by uploading notable members from science academies around the world" and, to that end, "are currently uploading the profiles and pictures of around 50 national young academies from across the globe".

However, none have Wikipedia articles anywhere that meet notability requirements.

Has an exception been made for this "Global Young Academy project" to host their photos at Wikidata anyway? JGHowes (talk) 13:37, 24 February 2020 (UTC)

@JGHowes: An exception to what? In any case, images are hosted at Wikimedia Commons, not on Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:05, 24 February 2020 (UTC)
To clarify, the policy at OTRS photo-submissions is not to accept files emailed to us for Wikimedia Commons if there's no related Wikipedia article. In recent weeks, we've received some 20 photo submissions for people not having Wikipedia bios. When the submitter has been informed that the image cannot be accepted for this reason, their response has typically been that it's not for Wikipedia, just the Wikidata page they've created about the person. So is there an exception we OTRS agents should be aware of, whereby the Global Young Academy project creates a Wikidata entry and submits a photo for Commons solely for use as a P18 statement on the wikidata page they've created? JGHowes (talk) 17:04, 24 February 2020 (UTC)
From a structure data perspective, if the entities are notable per our local standards then that should be fine. But as the hoster of the images, Commons is free to create its own standard for whether images can be hosted there. Do you have specific items referenced from the emails? I'd be happy to check to ensure that they do meet the local notability standard. -- Ajraddatz (talk) 17:17, 24 February 2020 (UTC)
See for example Renard Siew (Q65095592) created as Global Young Academy (Q5570796). Now Renard Siew has emailed his photo to Commons photo-submissions via said OTRS ticket for this Wikidata page. JGHowes (talk) 19:12, 24 February 2020 (UTC)
Are you supposed to reveal this level of detail? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:37, 24 February 2020 (UTC)
"policy at OTRS photo-submissions is not to accept files emailed to us for Wikimedia Commons if there's no related Wikipedia article" If so, then the policy is guaranteed to be harmful to this project, and our wider movement. OTRS should accept any genuinely-free file that meets Wikimedia Commons' purpose, and which are unlikely to be deleted there. That includes not only images that have (I paraphrase) "educational purpose", but also those which are used on Wikidata, Wikispecies or Wikisource, or other sister projects. Who set that policy, and where did they consult with the projects that use images? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:37, 24 February 2020 (UTC)
Regardless of the policy, it appears that the *only* place that there is a "page" of any type on a Wikimedia project is on Wikidata; there is no other place at this time where the subjects of these images have an entry. Thus, perhaps the bigger question is whether or not these individuals meet the Wikidata notability standards, and whether they should have an entry here. The policies of other projects (including OTRS) aren't truly within scope for Wikidata. It doesn't seem appropriate for Wikidata to be a place for young professionals to store their CV, but that's a decision for this project to make. It seems kind of circular that they're looking to upload images to Commons for the purpose of completing their Wikidata entry. Risker (talk) 19:55, 24 February 2020 (UTC)
Not really. My understanding is that the Google Knowledge box draws heavily from WD (e.g., [18] and presumably related). But OTRS permission queue shouldn't really have anything to do with the inclusion standards of WD. It should have to do with the inclusion standards of Commons. Good faith inclusion on a sister project automatically puts something within Commons' scope, but it doesn't mean that not being on WD automatically means it's outside Commons' scope. GMGtalk 20:10, 24 February 2020 (UTC)
@JGHowes, Risker: some people have turned Wikidata into a dumping ground for scientific papers and a phone book for scientists. This is a consequence of that. Multichill (talk) 21:24, 24 February 2020 (UTC)
A Wikidata entry not linked to any project file is a fine way to avoid the notability guidelines of Wikipedia, IMHO. --Ganímedes (talk) 23:11, 24 February 2020 (UTC)
Besides, the authors seem not to be familiar with our policies. Some days ago one person wrote saying "I'm the author, I'm the photographer" in the middle of the template text. I asked him how it could be possible that he's the photographer if he's in the photo with his arms crossed over his chest. He said something like "of course it's impossible, the photographer is X". Touché!.... --Ganímedes (talk) 23:16, 24 February 2020 (UTC)
@JGHowes: This is probably not the best place to hammer out Commons policy, but where (if anywhere?) did OTRS end up with that very limiting policy? I'm completely with Andy on this. I'm an admin on Commons, and I would strenuously object to that policy. I doubt that even half of our pictures on Commons relate to any Wikipedia article, unless you count, say, that any picture of any part of a city corresponds to us having an article on that city, or other reductio ad absurdum interpretations (which would lead to a far more liberal policy for OTRS, anyway). For example, we do not have, nor are we likely to have, a Wikipedia article on this long-gone Lutheran church in Seattle, but we'd certainly want more pictures of it. I could come up with a hundred similar examples. - Jmabel (talk) 23:21, 24 February 2020 (UTC)
While I understand the concerns of those wondering how OTRS came up with the *UPLOAD* procedure (for requests being sent to a queue that is supposed to deal with permissions only), that's a discussion to have between the Commons community and OTRS. Realistically, the Commons admins participating in the OTRS discussion are saying pretty clearly that they don't want automatic uploading of images on request, only for them to have to deal with junk images on that project; personal photographs are borderline at the best of times, and there's no reason why the person doesn't create their own account and upload it themselves. Whether or not the process should be tweaked is not really in scope for Wikidata. What *is* in scope for Wikidata is whether or not you want to have these entries on *this* project. I have the impression from some of the posts above that there is a level of dissatisfaction with random professionals creating Wikidata entries about themselves (or having them created for them), but that's an issue this project can and should decide for itself, and it has absolutely nothing to do with OTRS or with uploading of images to Commons. Risker (talk) 23:33, 24 February 2020 (UTC)
Repeating what I said on the email thread, if it's used/usable on WD, then it's in-scope for Commons. Part of our mission is to support our sister projects. Whether it's in your scope is a decision that the local community has to make, and it is not in the remit of OTRS or Commons to decide that. GMGtalk 01:54, 25 February 2020 (UTC)
Some people have turned Wikidata into a "dumping ground" for artworks and a "phone book" for artists. Different strokes, eh? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:40, 24 February 2020 (UTC)
"The policies of other projects (including OTRS) aren't truly within scope for Wikidata." When they have potential to impact this project - for instance, by disallowing images we might wish to display - then they are very much in scope for discussion here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:45, 24 February 2020 (UTC)
Who wants to display these files here? Why are these files different from any other personal photo? --Ganímedes (talk) 23:47, 24 February 2020 (UTC)
They are different because they depict the subjects we document. Note also that the OTRS restriction described near the top of this section is not limited to photographs of people. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:55, 24 February 2020 (UTC)
But who says we must document them? There's no reason to keep their q if there is no article linked to them, IMHO. I see all this as great self-promotion. Don't we have guidelines for that either? --Ganímedes (talk) 00:28, 25 February 2020 (UTC)
There is no must. But WD:N (a policy, not mere guideline) gives us scope. It is not for OTRS to override. Your "no reason to keep their q if there is no article linked to them" belies a gross misunderstanding of Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:32, 25 February 2020 (UTC)
You continue writing in the plural. Are you talking in someone else's name? --Ganímedes (talk) 00:34, 25 February 2020 (UTC)
I see you no longer wish to debate what I have to say; nor can you refute it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:38, 25 February 2020 (UTC)
No, you're who's avoiding to answer. --Ganímedes (talk) 00:42, 25 February 2020 (UTC)
Feel free to point out where that has occurred. You, on the other hand, have ignored "WD:N ... gives us scope". It is not for OTRS to override.; and you have yet to address your misunderstandings that Wikipedia links are a requirement; or that Wikipedia's notability policies have any bearing here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:50, 25 February 2020 (UTC)
"Feel free to point out where that has occurred" No response from user:Ganímedes, and no evidence of me "avoiding to answer". No acknowledgment nor attempt to refute that their "no reason to keep their q if there is no article linked to them" is a gross misrepresentation of Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:51, 27 February 2020 (UTC)
@JGHowes: answered you in the parallel discussion that you have unfortunately started on Commons, quoting OTRS guidelines. I don't intend to continue arguing with you in circles. It's exhausting, and pointless. Besides, I'm not following this thread anymore. Thanks. --Ganímedes (talk) 12:15, 27 February 2020 (UTC)
On Commons, I merely posted a pointer to this discussion. When you began posting the same content beneath that as you had posted here, I pointed out to you "It really would be better if you did not split the discussions between venues.". Above, I was asking you for evidence to support your claim "you're who's avoiding to answer"; it seems you have none. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:42, 3 March 2020 (UTC)

See also related discussion on Commons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:17, 25 February 2020 (UTC)

Having opened this discussion, it would appear that a clarification might be helpful. My question pertained to uploading of files about non-notable persons and it was in that context that I was referring to OTRS policy about photos of persons, not files in general. (Perhaps "policy" is too strong and "guideline" would have been a better choice of words): "If the person is trying to submit an image of a non-notable person (or one we don't have an article for), it might be best not to upload it. Use the 'no article, not notable' boilerplate."[19] This pertains only to files of people and should not be misconstrued to say more than what was intended. Anyway, the question remains: Is this Wikidata entry Q65095592 deemed to meet requirement #1 of WD:N? JGHowes (talk) 07:31, 25 February 2020 (UTC)
I've asked in the parallel discussion that has unfortunately started on Commons: "When and where was this guideline drawn up, what consultation took place, and how can it be urgently updated to be fit for purpose? Who can track down correspondence with the authors of any previously-rejected material, wanted by non-Wikipedia sister projects, that should have been accepted?". It would be good to have some answers. [Also, note that the OTRS wiki page on which the policy guideline you cite lives is not publicly viewable. That guideline was apparently written in 2010, two years before Wikidata existed.] Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:28, 25 February 2020 (UTC)
@JGHowes: I don't know if anybody else answered you on this, but Renard Siew (Q65095592) satisfies WD:N on grounds #2 (based on external id's) and #3 (as the author of A review of corporate sustainability reporting tools (SRTs). (Q38588420). ArthurPSmith (talk) 15:04, 25 February 2020 (UTC)
Finally, a cogent answer without all the grandstanding. Thank you @ArthurPSmith: and Ajraddatz. Accordingly, the file has now been processed as File:Renard Siew2.jpg. To avoid future misunderstandings, it would be a good idea to update the guidance provided to the OTRS team regarding Wikidata entries. JGHowes (talk) 16:43, 25 February 2020 (UTC)
I gave you the first of several cogent answers 21 hours ago: "OTRS should accept any genuinely-free file that meets Wikimedia Commons' purpose, and which are unlikely to be deleted there. That includes not only images that have (I paraphrase) "educational purpose", but also those which are used on Wikidata, Wikispecies or Wikisource, or other sister projects.". You've also been pointed - here and on Commons - to both WD:N and c:COM:SCOPE. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:51, 25 February 2020 (UTC)
  • Just a note that I've checked the example page listed above, and the subject is indeed notable under WD:N criteria #2 (identifiable with the Google Scholar and ORCID Ids). Wikidata is intentionally more inclusive than most Wikipedias because our aim is to build a database, rather than just encyclopaedic entries. I cannot dictate how Commons should implement its policies, but I do think that if the goal is to support sister projects, these images should be retained. -- Ajraddatz (talk) 16:45, 25 February 2020 (UTC)

Global Young Academy

A description of Global Young Academy (Q5570796) may be found at en:Global Young Academy. As:

members are expected to be several years past their doctoral studies [and] capped at 200 [and are recruited] based on scientific excellence, after a process of nominations from senior scientists, national societies, and self-nominations, together with peer review by members

then it seems highly likely that the 200 members and 258 alumni ([20], [21]) meet our notability requirements (and are far from "random professionals"); in which case we definitely want freely-licensed images of each of them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:55, 24 February 2020 (UTC)

Notability guidelines in Wikipedia say they must be covered by independent sources. Bringing their resumes here avoids this. Does this make them notable? --Ganímedes (talk) 00:26, 25 February 2020 (UTC)
Notability guidelines in Wikipedia have bugger all to do with commons or wikidata. --Tagishsimon (talk) 00:29, 25 February 2020 (UTC)
So they can create self-promotion q easily just because nothing says otherwise. I've got a degree in biochemistry (true). Can I create my q here and add my photo too? --Ganímedes (talk) 00:33, 25 February 2020 (UTC)
"they"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:38, 25 February 2020 (UTC)
The Young Academy scientists who're sending the photos and creating the elements here. --Ganímedes (talk) 00:42, 25 February 2020 (UTC)
Evidence of any of them doing so? But yes, if they meet the criteria, they may. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:50, 25 February 2020 (UTC)

Hi, just to jump in - thanks for all your volunteer work on this, your dedication is very impressive! Here's the list of young academies network: https://globalyoungacademy.net/national-young-academies/ Anyone who might contact you regarding picture verification has authored at least one important publication (at least enough to get selected nationally as among their country's most excellent young scientists in terms of research & societal impact). Most are professors/have wikipedia pages - we're trying to make those from underrepresented locations more visible (also often a language barrier!). Happy to hear suggestions about how to streamline this. The vision is to fully digitise national science academies, starting with the young ones =). I'm very glad that we have supporters in the wiki communities for this! --PPEscientist (talk) 21:15, 26 February 2020 (UTC)

This is the trick: according to our guidelines (Wikidata: Notability): "An item is acceptable if and only if it fulfills at least one of these two goals, that is if it meets at least one of the criteria below: 1. It contains at least one valid sitelink to a page on Wikipedia, Wikivoyage, Wikisource, Wikiquote, Wikinews, Wikibooks, Wikidata, Wikispecies, Wikiversity, or Wikimedia Commons." So, by adding a file to Wikimedia Commons and linking it here, they get a notable q. So, they become notable. This is how this works, right? --Ganímedes (talk) 00:42, 25 February 2020 (UTC)

Is that what WD:N says? WD:N is how it works. You seem to be distracting us significantly from the issue of OTRS policy and how it affects this project. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:50, 25 February 2020 (UTC)

Zzzz. Commons has its own notability policy. From that - at https://commons.wikimedia.org/wiki/Commons:Project_scope#Must_be_realistically_useful_for_an_educational_purpose - "Must be realistically useful for an educational purpose". Might a mugshot of "members ... expected to be several years past their doctoral studies [and] capped at 200 [and are recruited] based on scientific excellence, after a process of nominations from senior scientists, national societies, and self-nominations, together with peer review by members" pass that requirement? Very probably. Next. --Tagishsimon (talk) 00:53, 25 February 2020 (UTC)
As you wish. Commons also has other policies, but that's not the point here. --Ganímedes (talk) 01:06, 25 February 2020 (UTC)
The first few listed for Australia, Patrick Cobbinah (Q64907170), Cheng Zhiming (Q64910913), Aysha Fleming (Q57304466), Bartlomiej Kolodziejczyk (Q65007511), I think would easily meet notability in the way it's applied to other researchers. Ghouston (talk) 04:59, 25 February 2020 (UTC)

I am livid. Commons is to support its sister projects, it is NOT exclusively for Wikipedia. All members of the Global Young Academy are scientists, they often have publications and yes Wikidata supports images. It is not for Commons to destroy the efforts of what they are not familiar with. Thanks, GerardM (talk) 09:24, 25 February 2020 (UTC)

GerardM, I do not reply to obscene emails and have blocked receiving any further emails from you. JGHowes (talk) 14:15, 25 February 2020 (UTC)
Well, I reread the text, the only obscene issue is the wanton destruction that took place. Thanks, GerardM (talk) 11:01, 1 March 2020 (UTC)
Wikidata it's only a database. There's nothing in those elements but a name and a date. Not even a source. Why should these elements be kept? --Ganímedes (talk) 11:23, 25 February 2020 (UTC)
"Wikidata it's [sic] only a database" So? "nothing in those elements" Which "elements"? Did you look at the examples posted by Ghouston, above? They are far better populated than your comment suggests. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:41, 25 February 2020 (UTC)

GYA 2

Thanks all, I post this here (sorry, I'm bad at this - some wikimedia friends told me about this discussion so I came to check it out). Happy to connect everyone to the wonderful world of science academies :)) --> Dear all, thanks a lot for all of your engagement and countless volunteer hours. I'm representing this effort of the Global Young Academy as well as many different other networks who have joined this effort to bring excellent young scientists to wikidata (from India to Iraq to Italy). We are happy to receive advice on how to streamline this process. We are asking that scholars of national young academies themselves upload their pictures rather than doing this in bulk. Most scholars are professors, all of them are prize-winning scientists and all have wikidata entries now (Wikipedia pages exist for a great number of them, but these are not written by us (see here: https://w.wiki/DQr)). The Bangladesh Young academy https://nyabangladesh.org/ (to take one example out of 50) is one of the first contributors. Sooner or later, all 50+ national young academies will be submitting pictures. The plan is to then engage our senior academies and senior academy networks to do likewise, as well as the framework organizations through which they are organized (InterAcademy Partnership, ALLEA, African Academy of Sciences, Royal Academy...). So we are very much interested in setting up a process by which this is streamlined. Apologies for the many individuals who do not send in photos with the correct specifications, we want to support wikimedia as much as possible, help us to do this. PPEscientist (talk) 10:32, 25 February 2020 (UTC)

@PPEscientist: Thank you for your collaborative approach to this. I'm sorry you're having to be exposed to the above nonsense, but until it is satisfactorily resolved, I suggest you do not ask anyone else to submit images. Once it is, and you're ready to start again, you're welcome to refer people to this page (or to borrow text from it); that should help them to understand the issue of authorship of images. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:06, 25 February 2020 (UTC)
@PPEscientist: I'm speaking here mainly as a Commons admin, even if the discussion is on Wikidata. A few considerations to streamline this:
  1. Always be sure to distinguish author (photographer) from subject. Copyright belongs to the author, not the subject, and that is the person from whom we need permission.
  2. If one person is in a position to take multiple photos, that often makes matters simpler, since there are fewer separate grants of permissions to process.
  3. The simplest way (from Commons' point of view) to provide an appropriate license is for some other trusted site, or site clearly under control of the author/photographer, to indicate the license. It might be simpler for your organization to host photos of scholars of national young academies on your own website, with appropriate licenses indicated; those could then be used as a source for Commons, and anyone (not just the members of the OTRS team) could handle the uploads, without involving email at all. You could use a similar method to indicate licenses on your site to what we do on Commons.
  4. Failing that, you might create a customized version of the standard OTRS form that would let someone indicate author, subject, Wikidata item (if that has already been created), and one or more links (including presumably something linking to a site of your organization, and something for their university affiliation or affiliations, and their prizes/awards/publications) indicating notability. - Jmabel (talk) 16:16, 25 February 2020 (UTC)
This at least makes some sense: if it's up to OTRS, avoid using OTRS to ask agents to upload the files in your name. --Ganímedes (talk) 17:12, 25 February 2020 (UTC)
@PPEscientist: Why are you suggesting people submit the photos via OTRS rather than simply uploading the images to Commons themselves? As most of them are obvious selfies there isn't any issue that the copyright holder and the subject are not one and the same person. Nthep (talk) 17:49, 25 February 2020 (UTC)
Many are not "obvious", and others are not even selfies. --Ganímedes (talk) 18:36, 25 February 2020 (UTC)

Thanks for these helpful suggestions, I've amended our communication on this. We definitely don't want to cause a burden, so we'll ask them to upload themselves and not ask pictures to be uploaded (with scholars as diverse as from Iraq to Panama, it's also tricky to control this, just fyi, and language is frequently an issue). This is all a pretty good test case for our big aims to bring entire science organizations (think Leopoldina/Royal academy) on wikidata/wikimedia in the hopefully mid to long term future. We started with national young academies as these are typically more diverse in terms of gender/discipline/age than the senior academies. If anyone wants to join & help, let us know. PPEscientist (talk) 19:22, 25 February 2020 (UTC)

I don't understand, why was OTRS even involved in the first place? You don't need OTRS to release a work with a free license. Just put cc-by-sa everywhere on https://globalyoungacademy.net/legal-notice/ , upload the photos on the GYA website and be happy. Nemo 20:52, 28 February 2020 (UTC)
OTRS was involved because of the trigger-happy Commoners who delete images EVEN though they are linked to Wikidata and are included in a category structure. The thought is that by having an OTRS label you are safe and images will be retained. Thanks, GerardM (talk) 11:03, 1 March 2020 (UTC)
OTRS only assures that copyrights are respected; it does not prevent deletion for other reasons. @PPEscientist: Please explain to the customers that, when they state to OTRS "I'm the photographer, I'm the copyright holder", that should be true beyond any reasonable doubt. If a mugshot is provided, it must contain the EXIF information (camera, model, date, ISO, etc., which is pre-set in the device), or the customer should ask the photographer to send the permission template. It's not enough to claim to be the photographer: they need to be able to demonstrate it. The other option was named by Jmabel: if you upload the files to a website where a free license is stated, they can be imported to Commons. Regards. --Ganímedes (talk) 16:53, 1 March 2020 (UTC)
Thanks for the suggestion. Yes, I think that communication was pretty clear about this, outlining it exactly as you did; I'm sorry in case individuals don't follow this properly (when we deal with prize-winning botanists from, say, Egypt, sometimes there is also a language issue). By the way, as you see here: https://globalyoungacademy.net/national-young-academies/ we are a network (often informal), so the thousands of national academy members are not formally part of us. We encourage academies to put files on a website with free license from now on, that's a very good idea. We also ask people not to ask volunteers to upload for them, I see how this can cause volunteers excessive work. We also want to provide more value through our network to wiki (uploading general scientific information, their institutions etc). But I feel that people are quickly frustrated (see this discussion here). Would it not be nice to have more, new, and especially people from the Global South contribute to wikimedia for the first time? I'm happy to facilitate this, and also have spoken to more wikimedia people offline now. Happy week everyone PPEscientist (talk) 17:06, 1 March 2020 (UTC)

Representing recipes from Wikibooks Cookbooks in various languages in Wikidata

For providing background:

This is a question that was asked in 2017.

On a similar topic there has been a related question from the same year that was asked and where a conversation developed on how to represent processes, like recipes, in Wikidata.

I'm editing in both the English and the Swedish Wikibooks (although I'm a very new user regarding Cookbook projects) and I'm wondering if I can start adding Wikidata items for individual recipes:

Now for my question:

Can I add Wikidata items for recipes I create, i.e. this recipe I created 10 days ago (which is still a draft but I'm going to spend more time completing it), "Cookbook:Salmon with Rice and Sauce"? If it had an item on Wikidata it could (or not, depending on if I got this right or wrong, or parts thereof) contain has part(s) (P527)

What do you think about this, can I add Wikidata items for recipes I create in Wikibooks (and for other recipes that exist in Wikibooks)? Datariumrex (talk) 11:54, 27 February 2020 (UTC)

I think this is interesting. The example above shows well why there are doubts. I think it is not easy to find the right item. For a specific fish there is usually no item in Wikidata. The commonly used name for a fish is in some cases not its real scientific name. I think it would be great to find a solution for that. So this is something you need to check before adding your recipes here. How do you plan to show the steps that need to be done to prepare the dish the recipe is about? That is something that should be defined. -- Hogü-456 (talk) 19:44, 27 February 2020 (UTC)
On Commons we have separate categories like commons:Category:Salmon as food. - Jmabel (talk) 17:29, 1 March 2020 (UTC)
Regarding that, I found the species of the fish. In the Wikidata item Salmon with Rice and Sauce (Q86594655) for my recipe I've added which species of salmon is being used. Datariumrex (talk) 12:45, 3 March 2020 (UTC)
I finally created an item for my recipe on Wikidata Salmon with Rice and Sauce (Q86594655). I also added the "Category:Salmon as food" to the picture on Wikimedia Commons that depicts the finished dish of the recipe. Thanks for the tips! Datariumrex (talk) 12:43, 3 March 2020 (UTC)

Help page for determining when to merge

The current version of Help:Merge has only a cursory mention of how to determine whether two items should be merged. I started User:Vahurzpu/Consider before merging as a draft of an update to that section. I would appreciate any other heuristics, as well as examples of items that look like they should be merged, but shouldn't be. Vahurzpu (talk) 17:10, 27 February 2020 (UTC)

@Vahurzpu: Thanks, I added a few suggestions. ArthurPSmith (talk) 18:42, 27 February 2020 (UTC)

Bulk fixing error

It looks like a lot of bot-created articles from the journal The Art Bulletin (Q15766110) are listed as instance of scholarly article (Q13442814) instead of academic journal article (Q18918145). E.g. A Romanesque Fresco in the Plandiura Collection (Q57684351) and Whither Art History in a Globalizing World (Q57678930) before I fixed them. I'm not sure if any article ever published in this journal could be considered a "scientific article" but if there are any, I'd expect them to be quite rare. What would be the best way to bulk change these entries? And is there a way to keep this from happening again? I don't know the workflows for the automated article-item creation. Thanks, Calliopejen1 (talk) 17:47, 26 February 2020 (UTC)

@Calliopejen1: The workflow tends to be a WDQS report such as the below to find the items at issue; and then use of Quickstatements to fix the issue. QS, in essence, wants comma- or tab-separated lists of triples to be added to, or triples to be removed from, items. I'll amend the Art Bulletin set now.
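# Articles published in The Art Bulletin that are typed Q13442814 but not yet Q18918145
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P1433 wd:Q15766110 ;   # published in: The Art Bulletin
        wdt:P31 wd:Q13442814 .     # instance of: the class to be replaced
  FILTER NOT EXISTS { ?item wdt:P31 wd:Q18918145 . }   # skip items that already carry the target class
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}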
--Tagishsimon (talk) 18:00, 26 February 2020 (UTC)
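For readers unfamiliar with the second half of that workflow, here is a minimal sketch of turning a list of item IDs from such a report into QuickStatements V1 commands (tab-separated triples, with a leading "-" marking a statement to remove). The function name and the choice to emit the addition before the removal are assumptions made for illustration, not a description of how the batch in this thread was actually built.

```python
def reclassify_commands(qids, old_class="Q13442814", new_class="Q18918145", prop="P31"):
    """Build QuickStatements V1 lines that swap one P31 value for another.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    lines = []
    for qid in qids:
        # A plain tab-separated triple adds the statement.
        lines.append(f"{qid}\t{prop}\t{new_class}")
        # A leading "-" asks QuickStatements to remove the matching statement.
        lines.append(f"-{qid}\t{prop}\t{old_class}")
    return "\n".join(lines)


# The two example items named earlier in this thread.
batch = reclassify_commands(["Q57684351", "Q57678930"])
print(batch)
```

Emitting the addition first means that, if the batch is interrupted partway through, an item is more likely to end up with two P31 values than with none.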
Thanks!! Calliopejen1 (talk) 18:27, 26 February 2020 (UTC)
@Tagishsimon: This also applies to articles from The Seventeenth Century (Q15763184). Could you fix those as well? I just asked User:Sic19 to help me identify any other journals this could apply to, because it looks like these two at least started with batch edits he did, and I suspect there could be a lot more.... Calliopejen1 (talk) 18:58, 26 February 2020 (UTC)
And here's a search that identifies more articles that are likely misclassified (along with other articles in the journals where they're published).[22] This seems like a very widespread issue... Calliopejen1 (talk) 19:06, 26 February 2020 (UTC)
I'll do The Seventeenth Century (Q15763184) but note that the edits are only slowly trickling in (big queue at Quickstatements [23] & intermittently throttled API) so it may take a day or more for any set to be done. But I've got you; art history generally != science. --Tagishsimon (talk) 19:11, 26 February 2020 (UTC)
Thanks! Obviously no rush. Will keep you posted on other affected journals. Calliopejen1 (talk) 19:15, 26 February 2020 (UTC)
@Calliopejen1: Presumably you are not aware that academic journal article (Q18918145) is a subclass of work of science (Q11826511) (see https://tools.wmflabs.org/wikidata-todo/tree.html?lang=en-gb&q=Q11826511&rp=279)? Simon Cobb (User:Sic19 ; talk page) 22:46, 26 February 2020 (UTC)
@Sic19: scholarly article (Q13442814) is a subclass of academic journal article (Q18918145) which implies there are some academic journal articles that are not scientific articles. Seems like something has gone wrong with subclasses/definitions here. Calliopejen1 (talk) 22:48, 26 February 2020 (UTC)
I don't think scholarly work (Q55915575) should be a subclass of work of science (Q11826511) -- there is plenty of scholarly work that is not scientific... Calliopejen1 (talk) 22:50, 26 February 2020 (UTC) ([24] this change seems like a faulty edit Calliopejen1 (talk) 22:59, 26 February 2020 (UTC))
I'm reverting that referenced edit now. The real cause of the problem seems to be what Ghouston identified below. Calliopejen1 (talk) 05:10, 27 February 2020 (UTC)
There has been some confusion about the scholarly article (Q13442814), probably because somebody changed the English label from "scientific article" to "scholarly article" in 2018, and the subclasses ended up getting reversed. I reset it to its current state in January. Ghouston (talk) 23:34, 26 February 2020 (UTC)
@Ghouston: Yikes. Does that mean that most of the bulk-added articles from scholarly (but non-scientific) journals between 2018 and January 2020 may now have the incorrect value for "instance of"? Calliopejen1 (talk) 00:51, 27 February 2020 (UTC)
I don't know, but it wouldn't be surprising if at least some of the bots importing articles use the same instance for all. Ghouston (talk) 01:02, 27 February 2020 (UTC)
@Ghouston: The issue is that even if editors chose the "right" (looking) one for bulk imports (i.e. the one that used to be called "scholarly article" and now is called "scientific article") their choice would be wrong given how things stand now. I assume this affects thousands and thousands and thousands of items... Calliopejen1 (talk) 01:32, 27 February 2020 (UTC)
True. They could perhaps base it on whether the journal itself is an academic journal (Q737498) or a scientific journal (Q5633421), although not every journal may be set up to use the most appropriate one. Ghouston (talk) 03:31, 27 February 2020 (UTC)
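A rough sketch of that rule, for anyone wanting to try it in an import script. The mapping (scientific journal to Q13442814, academic journal to Q18918145) is an assumption based on how the two items were being used at this point in the thread, the function name is hypothetical, and the result can only be as reliable as the journal's own P31.

```python
SCIENTIFIC_JOURNAL = "Q5633421"
ACADEMIC_JOURNAL = "Q737498"

# Assumed mapping, following the usage discussed in this thread.
ARTICLE_CLASS_FOR_SCIENTIFIC = "Q13442814"
ARTICLE_CLASS_FOR_ACADEMIC = "Q18918145"


def article_class_for(journal_classes):
    """Pick an article class from the journal's P31 values.

    Returns None when the journal's classes do not decide the question,
    so the caller can queue the item for manual review.
    """
    classes = set(journal_classes)
    if SCIENTIFIC_JOURNAL in classes:
        return ARTICLE_CLASS_FOR_SCIENTIFIC
    if ACADEMIC_JOURNAL in classes:
        return ARTICLE_CLASS_FOR_ACADEMIC
    return None
```

Returning None rather than a default keeps journals with missing or unexpected P31 values out of any automated batch.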
Boo. I just spot checked some journals from that list and it appears that most of the humanities journals were themselves bot-created and incorrectly called instance of scientific journal. :( There may need to be a manual review of the list of journals to properly categorize the ones that appear humanities related, and then once that is done try to fix the articles in the journals. Calliopejen1 (talk) 05:17, 27 February 2020 (UTC)
Related request: Can someone fix my code here so the search returns the journal names/labels, not just the item numbers? I'm terrible at this.... Trying to figure out which journals may have the most articles with problems. Calliopejen1 (talk) 01:36, 27 February 2020 (UTC)
@Calliopejen1: fixed. --Tagishsimon (talk) 02:24, 27 February 2020 (UTC)
  • Couldn't we just change back the label of Q13442814 to "scholarly article"? I don't really see the value added by changing thousands of items back and forth depending on the current view about the nature of a journal. We would have to check two different P31 values instead of just one. --- Jura 06:18, 27 February 2020 (UTC)
Stopped --Tagishsimon (talk) 11:03, 27 February 2020 (UTC)
@Jura1: I'm adding ~1300 P31s to articles with no P31 fyi; deletions had got ahead of appends at the time I stopped. --Tagishsimon (talk) 19:37, 27 February 2020 (UTC)
  • Well, one difference would be that scientific articles are "part of" science, i.e., part of the body of scientific knowledge, and produced by scientists through the application of science, whereas non-scientific articles are not. But "instance of" may not be the best way to represent that: otherwise, we'd possibly want a whole subclass tree of things like "geological article" or "climate change article". Ghouston (talk) 10:56, 27 February 2020 (UTC)
    • I agree that "instance of" may not be the best way to represent that. The problem is that science (Q336) has a slightly different notion and scope in different languages (see Talk:Q336) and the English notion seems to be the most narrow one. Given your description in terms of Wikidata items (scholarly article (Q13442814) is part of science (Q336) and produced by scientist (Q901)) I would conclude [from a German language perspective] (inserted for clarification) that an article from an academic journal in the field of "Literaturwissenschaft" (literary studies (Q208217)) is a scholarly article (Q13442814). From my perspective there is actually not something profoundly different. Some time ago I opened a discussion how to represent the subject area of a publication (see Wikidata_talk:WikiProject_Books/2018#subject_areas_and_genres), I think this is somehow related to the question how to represent that an academic article is from "science" in the strict English meaning or from literary studies. - Valentina.Anitnelav (talk) 12:06, 27 February 2020 (UTC)
      • Yes "science" in one sense may be just any collection of knowledge, but in modern English it's usually considered to be the study of the natural world, using some kind of rational principles. The exact definition has been argued about extensively with various philosophical schools of thought, and I'm not sure that any overwhelming consensus has been reached (much like a typical Wikidata discussion). But we do have many Wikipedia articles that have been sitelinked together, and in principle are supposed to be roughly about the same topic. Ghouston (talk) 12:43, 27 February 2020 (UTC)
        • I'm sorry, I forgot to mention that I read Wikidata labels in German. I see that my comment can be a bit confusing without that. This is really just about the use of words/building of concepts in different languages or "conventions" in different cultures, I did not want to open a philosophical discussion about the essence of knowledge :). In German it is really quite uncontroversial that "Literaturwissenschaft" (literary studies (Q208217)) or "Geschichtswissenschaft" (study of history (Q1066186)) is a "Wissenschaft" (science (Q336)). (As wikidata should be multilingual English should not be the only language taken into consideration here.) - Valentina.Anitnelav (talk) 13:21, 27 February 2020 (UTC)
      • The obvious difficulty being exactly what's in scope for study: humans are considered part of the natural world, so anthropology and economics and political science can be considered sciences, so why not the study of art and literature too? If you can study the history of the Universe in cosmology, why not study the history of World War II and call it science? It probably just comes down to convention. Ghouston (talk) 12:50, 27 February 2020 (UTC)
See also exact science (Q475023). It's not true that Germans agree on including soft sciences into the "scientific article" set. This is an old debate. But would the solution not be to import a related meta-ontology? --SCIdude (talk) 14:54, 27 February 2020 (UTC)
My big-picture thoughts: At this point scholarly article (Q13442814) and academic journal article (Q18918145) are often indiscriminately applied to journal articles, regardless of how "science" and "scientific article" are defined. Once we get consensus on the distinction (if any) between scholarly article (Q13442814) and academic journal article (Q18918145), I think bulk edits of some sort will be necessary (either to move all to the more general term, or to appropriately distinguish). Once we get to this, a report by journal of how many scholarly article (Q13442814) and academic journal article (Q18918145) appear in each could be informative. I have no big stake in the meta-question, but I would like the art history articles I keep encountering to be correctly classified, whatever the correct way is. :) Calliopejen1 (talk) 17:42, 27 February 2020 (UTC)
Also, note that there are various subclasses of Q13442814 and Q18918145 and these should be appropriately dealt with in whatever taxonomy/conventions for descriptions of items are decided on. E.g. historical article (Q58901470), medical scholarly article (Q82969330). Calliopejen1 (talk) 17:46, 27 February 2020 (UTC)
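The per-journal report suggested above could be sketched along these lines. It assumes the (journal, class) pairs have already been exported from a query, one pair per article; the function name and the sample rows are hypothetical and only illustrate the tallying step.

```python
from collections import Counter, defaultdict


def class_counts_by_journal(rows, classes=("Q13442814", "Q18918145")):
    """Tally, per journal, how many articles carry each of the given classes.

    `rows` is an iterable of (journal_qid, article_class_qid) pairs,
    one pair per article. Pairs with other classes are ignored.
    """
    report = defaultdict(Counter)
    for journal, article_class in rows:
        if article_class in classes:
            report[journal][article_class] += 1
    return {journal: dict(counts) for journal, counts in report.items()}


# Made-up sample rows, for illustration only.
sample = [
    ("Q15766110", "Q13442814"),
    ("Q15766110", "Q13442814"),
    ("Q15766110", "Q18918145"),
    ("Q15763184", "Q18918145"),
]
print(class_counts_by_journal(sample))
```

Journals where both counts are non-zero would be the obvious first candidates for review, whichever way the class question is settled.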
Then we probably should start by asking if journals should be classified into different groups using "instance of", such as "medical journal" or "economics journal", as well as "scientific journal", or if they should just be academic journals with the field stored in some other property. Ghouston (talk) 01:14, 28 February 2020 (UTC)
The advantage of using some other property is that we can presumably link existing items like "economics" and "climate science", we don't need a parallel set of items just for journals. Ghouston (talk) 01:17, 28 February 2020 (UTC)
I don't really see an advantage of using different P31 for either. The type of statements that should be found on an item for an article from a history journal isn't really different than from physics journal.
The counts can be done by linking the articles to the journal they are published in and the field the journal covers. More detailed analysis can be done based on the actual subjects of the articles.
(Some mathematician might point out that there is only one true science: Mathematics, so every other field shouldn't claim that term.) --- Jura 11:31, 28 February 2020 (UTC)
I would support a new property that would cover the field of a journal. Maybe something like "academic field of publication"? And what to do with scholarly article (Q13442814) and academic journal article (Q18918145)? Should scholarly article (Q13442814) be labeled "scholarly article" in English again? Should they be merged, then? Should scholarly article (Q13442814) stay as it is but maybe not used in P31? - Valentina.Anitnelav (talk) 12:22, 28 February 2020 (UTC)
I'd continue using Q13442814 in P31, but label it "journal article" (change the few that use Q18918145). Most current labels/descriptions/sitelinks of Q13442814 could go on a new item. I think even on the item for a journal main subject (P921) can be used. --- Jura 12:49, 28 February 2020 (UTC)

What on earth is happening here? There is a redefinition and merging of one of the most central items in Wikidata. This does not make sense to me. We should go back to [25]. There may be a problem for English-speaking people: the Danish and German labels "videnskabelig" and "wissenschaftlicher" encompass technical science, natural science, social science and the humanities alike. "Scholarly" is an attempt to get the English-speaking people along, because "scientific" has a strong tendency to mean "natural science" (broadly, I suppose, including technical and mathematical). Why was such a dramatic change not discussed better? — Finn Årup Nielsen (fnielsen) (talk) 14:37, 4 March 2020 (UTC)

  • There are several questions to address: what describes best the 32 million items that currently use it? How many millions of items to change to something else if it isn't "exactly" covered? How often to re-classify items that meet it or no longer meet it depending on the view of the field or publication venue? What to do with sitelinks about "scientific article"? What to do about sitelinks about academic journal articles? What to do about sitelinks in journals about fields that some don't consider "Science"? --- Jura 14:46, 4 March 2020 (UTC)
There is also a problem with removing the "scholarly work" [26]. Why is that done? — Finn Årup Nielsen (fnielsen) (talk) 14:51, 4 March 2020 (UTC)
Most of the items we have in this domain are imported from PubMed and would be "wissenschaftliche article" (I am here avoiding the contested English word "scholarly"). Most of them are published in scientific journals, so most would also be "journal articles" and "scientific journal articles", and even "medical science journal articles". Articles coming from arxiv.org are "wissenschaftliche articles". They (at least in machine learning) would also very often be "conference articles" or "scientific conference articles". It is questionable whether one could call them "academic articles", since many of them are not written by academics (university and college people; it is unclear to me how one defines "academic") but may come from research departments in industry. There is also IBM Journal of Research and Development (Q15753899), which is an industry journal and a scientific journal. An article such as Der - som formelt subjekt i dansk (Q58799627), which is from a language researcher in the humanities domain, I would call a "wissenschaftliche article" and a "wissenschaftliche journal article". As I see it most would fit "wissenschaftliche article", i.e., the old Q13442814. There are various issues, e.g., short popular science articles that are summaries of longer scientific articles in Nature and Science, articles in Scientific American, Active Monitoring of Persons Exposed to Patients with Confirmed COVID-19 — United States, January–February 2020 (Q87056580) (has a DOI, but perhaps would be termed a report?), reviews in scientific journals, errata, etc. — Finn Årup Nielsen (fnielsen) (talk) 15:19, 4 March 2020 (UTC)
We should change back to [27] before too much havoc is done. — Finn Årup Nielsen (fnielsen) (talk) 15:44, 4 March 2020 (UTC)
What "havoc" do you think is currently done? I don't see anything dramatic about the change. I think the solution is far better than the "bulk fix" proposed initially. What would you do with articles that you think don't fit "wissenschaftliche article"? --- Jura 17:12, 4 March 2020 (UTC)
For instance, the roughly 10,000 articles under NeurIPS https://tools.wmflabs.org/scholia/series/Q15767446 are scientific conference articles. They are not "journal articles". — Finn Årup Nielsen (fnielsen) (talk) 01:22, 5 March 2020 (UTC)

The whole issue seems to rest on an Anglocentric misconception. A version of The Parable Paintings of Domenico Fetti (Q55884180) states that it is a Q13442814 (scholarly article), which is entirely correct. It is published in The Art Bulletin (Q15766110). That journal is regarded as a scholarly journal on the basis that it is listed in national listings of research/scholarly articles, e.g., ERA Journal ID (P1058) and Danish Bibliometric Research Indicator (BFI) SNO/CNO (P1250). It defines itself with "the journal has published, through rigorous peer review, scholarly articles" [28]. Articles in that journal are scholarly articles (or whatever you call it in English). With the label change of Q13442814 to "journal article", thousands of articles have a wrong label. Take most of the articles by Quoc Viet Le (Q29043123). Most of his scholarly articles are not journal articles, but conference papers. And that means that when we present the information in Scholia, we present wrong information, see https://tools.wmflabs.org/scholia/author/Q29043123 .

We should go back to this version as far as I can see, before @Ghouston:. It is unclear to me why this label change occurred. — Finn Årup Nielsen (fnielsen) (talk) 19:04, 6 March 2020 (UTC)
It was the obvious thing to do, because of the site links, many of the non-English labels, and the fact that the label was originally "scientific article" but had been changed in 2018. The site links have since been moved to a newly created scholarly article (Q86995636), and the English label on scholarly article (Q13442814) has been changed again. I thought Wikidata items weren't supposed to change meaning over time, to avoid breaking external links, and so that the various languages remain consistent. But I'll leave it to others to fix it how they please, I won't touch it further. Ghouston (talk) 21:52, 6 March 2020 (UTC)
I agree the best way forward in terms of the scope of Q13442814 (deliberately not using {{Q}} here, since discussions are hard to follow when labels change) is to go back to defining it as "scholarly article" because that is inclusive of articles written in any scholarly discipline and published essentially in any scholarly outlet, be it a journal, conference series or otherwise. The sitelinks do have varying scopes, so some may end up at Q13442814, while others would end up over at Q86995636 or maybe elsewhere. Note that the English Wikipedia has an article using the German term "Wissenschaft" as its title because there is no good English equivalent for that broad concept of systematic endeavours around engaging with knowledge, even though it is common in other languages — the Dutch wetenschap, the Russian наука and the Korean 학문 all map most closely to "Wissenschaft" and have separate terms for what is the modern narrow meaning of "science(s)" in English: Naturwissenschaften (de), natuurwetenschappen (nl), естественные науки (ru), 과학 (ko). Forcing people (or tools) into using "journal article" or "scientific article" when they want to refer to a broader notion of the content (e.g. Wissenschaft versus science) or the publication type (e.g. journal versus conference proceedings) and taking away the existing mechanism for that broader notion (i.e. Q13442814 as in this version) does not seem like a good Wikidata way of handling that. Wikidata should be able to reflect that there are multiple umbrella concepts above the English one of "science", that the corresponding publications can be grouped in a similar fashion, and that quite a bit of WikiCite tooling is built around the identifier Q13442814 for that broad concept of a "scholarly article", much like Q5 serves as an anchor for the concept of "human". 
Labeling Q13442814 as "journal article" would (or, as of now, does) destroy that, whereas restoring it to "scholarly article" would unbreak the workflows and still leave room to chisel out ways that fit particular subfields of Wissenschaft, be this the arts, sciences, humanities or others. --Daniel Mietchen (talk) 23:29, 10 March 2020 (UTC)
However, changing Q13442814 to "scholarly article" still leaves the question of how it differs from the original "scholarly article" item, Q18918145. Should they now be merged? Ghouston (talk) 23:44, 10 March 2020 (UTC)
No reason to merge them — Q18918145 has always been about "academic journal article", a subset of "scholarly article" that does not include things like articles in conference proceedings. --Daniel Mietchen (talk) 01:59, 11 March 2020 (UTC)
Then the "bulk fixing error" would still exist, in the form of numerous articles which are instances of the wrong item. Ghouston (talk) 23:33, 12 March 2020 (UTC)
Such inconsistencies can be sorted out as discussed at the beginning of this thread, perhaps on a per-journal basis. So in the short run, I propose going back to this version and sorting out the labels on that basis, though keeping the sitelinks at scholarly article (Q86995636) until this is sorted out more properly. I plan to do this edit in a week from now, i.e. on March 20, and am happy to be beaten to it by someone else.
In the long run, we probably need to rethink the entire system a bit more deeply, perhaps choosing a generic instance of (P31) like publication (Q732577) for every publication, similar to human (Q5) being used on all items about humans. On that basis, we can then build a proper ontology for all the different nuances of publications and specify them by something like a "publication type" property. --Daniel Mietchen (talk) 00:34, 13 March 2020 (UTC)

A lot of the confusion here comes from subjective value judgements. For you Art History is not science, for me it is. Even the distinction "journal article vs conference paper" is not always clear or permanent. Many conferences publish as special issues of journals (or invite the best papers to do so). Some of the leading research is published as preprints on arXiv and do these necessarily need to be reclassified when they appear in a "vetted" venue? Not necessarily. What's important is to have a pivot class "scholarly article" like we have Q5 for humans and use it to populate WikiCite and thus interlink the scholarly endeavour. We may never agree on the classification underneath, as it is subjective, language dependent and non permanent. But for the most part, it's also unimportant. There are some clearly non scholarly articles in scientific journals, eg obituaries or editorial intros to the issue. It would be nice to filter them out from sources like CrossRef but even that is not so important. And some historian of science may find them very valuable (especially if he wrote that obit ;-) So @Calliopejen1: I know you mean well but your efforts are misguided. Everyone, please listen to Finn and Daniel (and Jura) and don't mess up a working system -Vladimir Alexiev (talk) 02:26, 13 March 2020 (UTC)

It does seem desirable to avoid using a class hierarchy for either the subject (computer science / physics / art history) or publication venue (magazine / academic journal / proceedings / preprint archive / newspaper: the same obituary could perhaps appear in academic or non-academic publications). Do we even care if a written work is published or not? Publication doesn't necessarily change it, just produces more copies. Most of the works we deal with have been published though. We also have a distinction of "form": poems / novels / short stories / textbooks / academic papers. Ghouston (talk) 01:36, 15 March 2020 (UTC)
Sorry Ghouston, it isn’t just the number of copies. Publication means the work can be read and criticized by anyone, not just an ‘in crowd’. It’s an essential part of the practice of science. LeadSongDog (talk) 00:41, 26 March 2020 (UTC)
No, publication does not necessarily mean a work "can be read and criticized by anyone". A limited edition of 1000 is a published work. Posting on a paywall site still makes for a published work. Indeed, most academic journals are paywalled. - Jmabel (talk) 19:56, 26 March 2020 (UTC)

We are now back to the "scholarly article" version indicated above. On that basis, I think it is now a good time to discuss a data model for bibliographic items more generally, for which I will open a new thread. --Daniel Mietchen (talk) 15:57, 20 March 2020 (UTC)
That additional discussion has now been started over at WikiProject Source MetaData. --Daniel Mietchen (talk) 16:09, 20 March 2020 (UTC)