Wikidata:Project chat/Archive/2018/10


Merge proposals

Is there a page where one may propose items for merging? Q19394113 and Q17074026 should be merged, and I might do it manually, but it would be much quicker just to add somewhere a proposal and leave it to a bot to move all links and properties. --193.157.207.127 08:53, 3 October 2018 (UTC)

I have merged the items. Leaving a request here is fine I guess.--Ymblanter (talk) 09:04, 3 October 2018 (UTC)
Thanks for the answer and for the merge. Could you do the same for the following items:
This section was archived on a request by: Please see Help:Merge. --- Jura 08:18, 6 October 2018 (UTC)

Add wikipedia page

Can someone please add https://pt.wikipedia.org/wiki/Subdistrito to https://www.wikidata.org/wiki/Q3491994 ? Something won't let me do that; the "publish" button is grey.

Tentre (talk) 11:18, 3 October 2018 (UTC)

@Tentre: ✓ Done – another item was already using that sitelink, I’ve merged them now. —Galaktos (talk) 11:22, 3 October 2018 (UTC)
This section was archived on a request by: Please see Help:Merge. --- Jura 08:20, 6 October 2018 (UTC)

Merge

It would be good to merge Q24853030 and Q161374, it is the same tree, but I do not know how to do it.--Jan Spousta (talk) 14:55, 3 October 2018 (UTC)

This section was archived on a request by: --- Jura 08:19, 6 October 2018 (UTC)

Bad coordinate data from the Holocaust Museum encyclopedia

If you look at the coordinate location (P625) in (nearly) all the items that use Encyclopedia of Camps and Ghettos, 1933-1945 (Q6946780) as a reference, you will see that they are suffering from some sort of glitch. For example, Oranienburg concentration camp (Q119720) has two sets of coordinates, and the data from the Holocaust Museum encyclopedia is both extremely over-precise and also disagrees with the other by some 260 meters. Perhaps this dataset was uploaded incorrectly? Abductive (talk) 06:50, 27 September 2018 (UTC)

  • Can't speak to the overprecision, but 260 meters could easily still be within the camp. Maybe one is using an approximate center and the other is using an entrance? Or something like that? Have you looked at how the respective coordinates compare to a map on some of the ones whose boundaries would be clear? - Jmabel (talk) 15:47, 27 September 2018 (UTC)
  • It looks like the values were added using QuickStatements, which I assume used a fixed precision of 6 decimal places. Based on the decimal representation we have (52.75, 13.2333), I suspect that the original source had 52°45'N 13°14'E (converts to 52.75, 13.2333(333...) in decimal) which seems reasonable to me. - Nikki (talk) 16:41, 27 September 2018 (UTC)
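Nikki's arithmetic can be checked with a short sketch. The helper name `dms_to_decimal` is hypothetical, and the input 52°45'N 13°14'E is only the suspected source value from the comment above, not a confirmed reading of the encyclopedia.

```python
def dms_to_decimal(degrees, minutes=0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Suspected source value (an assumption): 52°45'N 13°14'E
lat = dms_to_decimal(52, 45, hemisphere="N")
lon = dms_to_decimal(13, 14, hemisphere="E")
print(round(lat, 6), round(lon, 6))  # 52.75 13.233333
```

One arcminute of latitude is roughly 1.85 km, so a source rounded to whole arcminutes could plausibly differ from a more precise position by a few hundred metres, even though the six printed decimals suggest much finer precision.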

Should persons' former citizenships be deleted?

Should we mark persons' former citizenships or not? For example, should Adolf Hitler (Q352)' country of citizenship (P27) have only Nazi Germany (Q7318)? If no, should Viktor Orbán (Q57641) have the person's former Hungarian People's Republic (Q16410) citizenship as well, because he was born and he lived there? Bencemac (talk) 13:02, 27 September 2018 (UTC)

  • In general, historical information should not be removed from items. Help:Ranking shows how to deal with it.
  • In this particular case regarding Viktor Orbán (Q57641), it is not clear to me whether Hungarian People's Republic (Q16410) is a valid value to be used with country of citizenship (P27) or not. I clearly tend to believe it is, but I am surprised to see that there are only 123 items with such a claim.
  • As far as I understand from Google-translations of some edit summaries in Viktor Orbán (Q57641), the reason for removal of this claim might be that there is some problem in a huwiki infobox when this historical claim is in the item. This is clearly not a problem of Wikidata and not a valid reason to remove the claim, as it should be fixed in the infobox rather than in Wikidata. I would consider it vandalism if an editor repeatedly removed claims for such reasons, but only after they have explicitly been informed of the situation here.
  • Also: @Pallerti, Csigabi as involved editors/admins.

MisterSynergy (talk) 13:38, 27 September 2018 (UTC)

The reason of the Hungarian People's Republic (Q16410)'s low usage is very simple: incorrect importations. See my bot request. Bencemac (talk) 14:14, 27 September 2018 (UTC)
Wouldn't the case of Hungary be taken care of by the replaces (P1365) from the People's Republic? 216.125.49.252 16:31, 27 September 2018 (UTC)

Ping ArthurPSmith as well. Bencemac (talk) 06:33, 28 September 2018 (UTC)

I have no particular expertise with regard to country of citizenship (P27); in general I would echo what MisterSynergy said above. ArthurPSmith (talk) 01:24, 29 September 2018 (UTC)

I have opened a discussion on one of the huwiki forums on the issue. The problem is more complicated than simply calling it a Wikidata problem to be solved; it relates to other Wikipedias as well. I ask everyone for more patience now. I promise I will summarize the results of the discussion on huwiki here. Csigabi (talk) 08:58, 29 September 2018 (UTC)

The question is wrong, because citizenship does not change with the changes of the political system in the country. My citizenship was and remains Hungarian, although I have lived in Hungary through three or four different political systems since 1947. --Szilas (talk) 17:33, 30 September 2018 (UTC)

Addition: Adolf Hitler also had only two citizenships in his life: Austrian (and not Austro-Hungarian) and German. The present list of his different "citizenships" is completely mistaken.--Szilas (talk) 17:42, 30 September 2018 (UTC)

Yes, this is the case as long as we consider Hungary (Q28) and Hungarian People's Republic (Q16410) to be the same country, with a change of government, instead of two separate countries. Ghouston (talk) 03:12, 2 October 2018 (UTC)

tangled/conflicting/duplicate Q items

As far as I can tell, at the moment https://www.wikidata.org/wiki/Q3227830 https://www.wikidata.org/wiki/Q21741420 are both about the same François Boucher painting, and neither one is about the general artistic theme... AnonMoos (talk) 21:05, 30 September 2018 (UTC)

  • Judging by the images they are about two different paintings on that same theme ("The Triumph of Venus"), although both are possibly by the same artist. - Jmabel (talk) 00:02, 1 October 2018 (UTC)
What I most clearly understand is that neither should have a commonswiki link to "Category:Triumph of Venus", because neither is about the general artistic theme... AnonMoos (talk) 01:11, 1 October 2018 (UTC)
I agree with that. - Jmabel (talk) 05:42, 1 October 2018 (UTC)
@Mike Peel: looks like your bot made a mistake. Can you check the logic and see if it edited more paintings? Multichill (talk) 08:02, 1 October 2018 (UTC)
The logic the bot uses is that the name should match, and the picture should be in that commons category, which can't already have a sitelink. That works well for 99.9..% of the cases it's editing. Any suggestions on how to modify that to avoid this false positive? In this case, a new Wikidata item is probably needed, but I'm not sure if it should be about the artistic theme, the general event, or instance of (P31)=Wikimedia category (Q4167836) + category combines topics (P971). Any thoughts? Thanks. Mike Peel (talk) 11:10, 1 October 2018 (UTC)
"Triumph of Venus" as an artistic theme has an iconclass code, so I have made triumph of Venus (Q56850955) (artistic theme). - PKM (talk) 02:55, 2 October 2018 (UTC)

Orphaned wikidata items without articles

Hi, Today I see for the first time that there are wikidata personal entries without any existing article in any wikiproject. I wonder if that has any system. What is the point of this?

We try to evaluate topical editing on swwiki and I find this irritating. I found several entries on Tanzanian women with no article on enwiki, swwiki or elsewhere. I saw some of these were started by user:Nattes à chat and user:Florentyna.

To me it looks like a recipe for a bit of chaos, not least because of the varying orthographies in this area of the world. Example: wd:Q26986911 "Nassra Jumaa". On dewiki she is mentioned and red-linked as Nasra Juma in de:Badminton-Afrikameisterschaft_1992, so if anybody uses that red link for an article it will be a new entry, and nobody will discover that user:Florentyna had already started a wikidata item with a weird different spelling from wheresoever. (Sorry if wrong: I brought this first to request for comments; if it is not its place, someone pls delete.) Kipala (talk) 10:30, 1 October 2018 (UTC)

@Kipala: There are millions of items without any sitelinks. Wikidata doesn’t just exist to serve the Wikipedias – there’s no reason not to collect data on notable items just because they don’t have a Wikipedia article yet. (And if you want to criticize the way Florentyna spelled the label, then IMHO it would be polite to at least ping them.) And if a duplicate item is created due to this, that’s not the end of the world, the items can be merged later – e. g. if someone notices the distinct-values constraint (Q21502410) violation because both items have the same Commonwealth Games Federation athlete ID (archived) (P4548) (assuming whoever creates the new item also adds that identifier). —Galaktos (talk) 14:20, 1 October 2018 (UTC)
Hi Kipala - As Galaktos said, it is acceptable and desirable to have Wikidata items for which there is no corresponding Wikipedia article in any language. Take a look at Wikidata:Notability and while number 1 has the most elaboration and criteria you recognize, points 2 and 3 describe millions of Wikidata entries in a way that is sometimes unfamiliar to folks who have only edited Wikipedia. Feel free to ask any other questions. -- Fuzheado (talk) 20:30, 1 October 2018 (UTC)
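The duplicate check Galaktos describes can be sketched as follows. This is only an illustration of the idea behind a distinct-values constraint, not the actual constraint-checking code; the function name, item IDs and identifier values are made up for the example.

```python
from collections import defaultdict


def find_shared_identifiers(items):
    """Group item IDs by identifier value and return values claimed by more than one item."""
    by_value = defaultdict(list)
    for item_id, identifier in items:
        by_value[identifier].append(item_id)
    return {value: ids for value, ids in by_value.items() if len(ids) > 1}


# Hypothetical data: (item ID, external identifier value)
sample = [
    ("item-A", "athlete-17"),
    ("item-B", "athlete-42"),
    ("item-C", "athlete-17"),
]
print(find_shared_identifiers(sample))  # {'athlete-17': ['item-A', 'item-C']}
```

Two items reported under the same identifier value are merge candidates, which is how a duplicate created under a different spelling could still be found later.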

Wikidata weekly summary #332

Results from global Wikimedia survey 2018 are published

Hello! A few months ago the Wikimedia Foundation invited contributors to take a survey about your experiences on Wikipedia. The report is now published on Meta-Wiki! We asked contributors 170 questions across many different topics like diversity, harassment, paid editing, Wikimedia events and many others.

Read the report or watch the presentation, which is available only in English. Add your thoughts and comments to the report talk page. Feel free to share the report on Wikipedia/Wikimedia or on your favorite social media. Thanks! -- EGalvez (WMF) 20:40, 1 October 2018 (UTC)

Mali changes

Hi, I'm not allowed to edit Mali (Q912). Could someone change the "driving side" to right and remove one occurrence of the president under "head of state" (he appears twice). Add "Ein Volk, ein Ziel, ein Glaube" as "motto text" (German). Thanks!--37.201.181.41 17:35, 1 October 2018 (UTC)

  • Done. Driving side changed to 'right' and reference added. Head of state already had one item only. German motto text added but no reference could be found other than the German Wikipedia (good enough for now I suppose, and Google translate indicates it's about right). Dhx1 (talk) 13:21, 2 October 2018 (UTC)

Facto Post issue 16, Cambridge Wikidata event 20 October

The latest issue of the Facto Post newsletter is here.

Wikidata:ContentMine/Cambridge Wikidata Workshop is the page for an event on 20 October, in Cambridge UK. Star guest is Magnus Manske. Charles Matthews (talk) 11:14, 2 October 2018 (UTC)

Technical Advice IRC Meeting

We'd like to invite you to the weekly Technical Advice IRC meeting. The Technical Advice IRC Meeting is a weekly support event for volunteer developers. Every Wednesday, two full-time developers are available to help you with all your questions about Mediawiki, gadgets, tools and more! This can be anything from "how to get started" over "who would be the best contact for X" to specific questions on your project.

The Technical Advice IRC meeting is every Wednesday 3-4 pm UTC as well as on every first Wednesday of the month 11-12 pm UTC.

If you know already what you would like to discuss or ask, please add your topic to the page of the next meeting. Cheers, -- Michael Schönitzer (WMDE) (talk) 16:12, 2 October 2018 (UTC)

RDF: 2 URI's for same entity :(

In this query, ?type is bound to long URI's like http://www.wikidata.org/entity/statement/Q36655806-1cef2a8a-4779-65c7-2f3c-a337d5d3dbbb

Instead, the URI http://www.wikidata.org/entity/Q36655806 is expected.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?type <http://www.wikidata.org/prop/statement/P279>
        <http://www.wikidata.org/entity/Q10675206> .
  OPTIONAL { ?type rdfs:label ?label }
}

Run it with Wikidata UI

As a consequence, ?label is not returned. But this would be true for most properties, as can be seen from this query:

select *
where { wds:Q36655806-1cef2a8a-4779-65c7-2f3c-a337d5d3dbbb ?p ?o }

--Jmvanel (talk) 10:41, 28 September 2018 (UTC)


I found that using this prefix for properties fixes the problem:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

But why different URIs for the same property? No other RDF dataset I know of does that. --Jmvanel (talk) 11:47, 28 September 2018 (UTC)

There are simple values and full values. In most cases, you want to have simple values which you get with wdt:. But in some cases you need additional information, e.g. precision, calendar, qualifiers or sources, for which you need to retrieve the full value. The complete RDF documentation you find on mw:Wikibase/Indexing/RDF Dump Format. --Pasleim (talk) 12:24, 28 September 2018 (UTC)
As Pasleim says, those are two different properties. They have different ranges, and thus different semantics. As you will see in the documentation, there is even more than just two of those :) --Denny (talk) 21:16, 2 October 2018 (UTC)
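The distinction discussed above can be made concrete with a small sketch that maps full URIs back to the prefixes predefined by the Wikidata Query Service. The helper `shorten` is hypothetical and the prefix table is deliberately partial; it only covers the namespaces mentioned in this thread.

```python
# Partial table of namespaces predefined by the Wikidata Query Service.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wds": "http://www.wikidata.org/entity/statement/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
}


def shorten(uri):
    """Return the prefixed form of a URI, preferring the longest matching namespace."""
    best = max(
        (item for item in PREFIXES.items() if uri.startswith(item[1])),
        key=lambda item: len(item[1]),
        default=None,
    )
    if best is None:
        return uri  # unknown namespace: leave the URI unchanged
    prefix, namespace = best
    return prefix + ":" + uri[len(namespace):]


print(shorten("http://www.wikidata.org/prop/statement/P279"))  # ps:P279
print(shorten("http://www.wikidata.org/prop/direct/P279"))     # wdt:P279
print(shorten("http://www.wikidata.org/entity/Q36655806"))     # wd:Q36655806
```

The predicate in the original query is ps:P279, whose subject is a statement node (wds:...) rather than the item, which is why no rdfs:label was found. In the full form, an item is linked to its statement node with p:P279 and the statement node to its value with ps:P279, whereas wdt:P279 links the item to the simple value directly.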

Wikimedia Commons-only properties?

Hello 👋🏻 everyone,

I remember there being a large discussion about the notability of Wikimedia Commons properties on Wikidata where it was discussed if things such as items which have exclusively a Wikimedia Commons category for example "Commons:Category:Collections of the Capital Museum" have their own Wikidata property. Now I went back to "Wikidata:Notability" (per this version, for these reading it in the distant future 🔮) and it reads "It contains at least one valid sitelink to a page on Wikipedia, Wikivoyage, Wikisource, Wikiquote, Wikinews, Wikibooks, Wikidata, Wikispecies, Wikiversity, or Wikimedia Commons.", now does this mean that the aforementioned Wikimedia Commons category can have "its own" Wikidata instance? This is quite important to know as the whole Structured Data on Wikimedia Commons thing is slowly being rolled out and this would be a major step towards achieving it. If this is the case, are there any plans to have a bot mass-create these category items "as they're being created"? -- 徵國單  (討論 🀄) (方孔錢 💴) 11:13, 1 October 2018 (UTC)

That's the reason why I think that such "manual updated list" tone should be modified, I always propose it to change to It contains at least one valid sitelink to a page on any projects that are deployed Wikibase client. --Liuxinyu970226 (talk) 14:43, 1 October 2018 (UTC)
@Liuxinyu970226: I'm pretty sure "deplymented" is not a word, and I can't guess the intended meaning. Can you clarify? - Jmabel (talk) 15:44, 1 October 2018 (UTC)
I think Liuxinyu970226 means "projects which have Wikibase client deployed/implemented on them". I'm not sure why we don't just say "contains at least one valid sitelink", link "sitelink" to Help:Sitelinks, and remove the list of projects. - Nikki (talk) 15:55, 1 October 2018 (UTC)
I think you confuse terms. What's a "Wikimedia Commons-only property"? Note that "the whole Structured Data on Wikimedia Commons" will live on Commons as well, so WD:N should be irrelevant to it. Matěj Suchánek (talk) 14:03, 2 October 2018 (UTC)
@Matěj Suchánek: I don’t think this is accurate – as far as I’m aware, Wikimedia Commons will reuse items and properties from Wikidata, Wikibase Federation (Commons’ entities will be a separate entity type, MediaInfo). See commons:Commons:Structured data/Development#Federation, especially this image shown there. --Lucas Werkmeister (talk) 11:20, 3 October 2018 (UTC)
Right, I forgot about that. So is the problem that WD:N doesn't deal with properties at all? How relevant are "sitelinks", then? Matěj Suchánek (talk) 15:18, 3 October 2018 (UTC)

Question about authority control ID about Maria José de Castro Rebello Mendes (Q10325936)

Hi. I was reading about pt:Maria José de Castro Rebello Mendes yesterday and decided to improve the available information about her. She was the first woman diplomat of Brazil. I found some sources about her, and then decided to update her wikidata entry. However, I didn't find her name in any of the Authority Control institutions like LCCN or WorldCat. Does she really exist in any database? Also, I'm not sure if I should consider her given name, Maria José de Castro Rebello Mendes, or her name after she was married, Maria José Mendes Pinheiro de Vasconcellos. And if she really doesn't exist in any authority control database, is it possible to suggest that she be added? Thanks! Tetizeraz (talk) 02:23, 29 September 2018 (UTC)

There is a field for "married name" and "surname" and one for "second surname in Spanish name" Property:P1950, so you can parse the name into the components. --RAN (talk) 21:03, 3 October 2018 (UTC)

question about how items appear

Why do some items display only item identifiers and not labels? Is there something about my particular settings or, maybe, about the particular item? I look at the item and only see property : Q#####, which is not informative. I have to click on the identifier, read it, and then go back to the item for the next identifier. - Trilotat (talk) 12:42, 3 October 2018 (UTC)

@Trilotat: that’s an unfortunate bug, phabricator:T205556. --Lucas Werkmeister (WMDE) (talk) 13:03, 3 October 2018 (UTC)
fwiw, @Trilotat: if you enable Preferences / Gadgets / General / PurgeTab: Adds a "Purge" tab. Clicking on it will purge the page, you'll get a tab to the right of View history, containing a Purge link, which will with a single click refresh the page, convert QIds to labels, and leave you on the correct page. --Tagishsimon (talk) 00:43, 4 October 2018 (UTC)

Distinguishing two names

What to do with distinctive names that look alike but are not the same thing? For instance, Nicolas Burq and Nick Berg. I could not find a template or hatnote to make note about the distinctions. George Ho (talk) 23:54, 29 September 2018 (UTC)

That's what I would use, but these two folks don't seem similar enough or easily confused enough to warrant that. Here's an example of how I'd use it for nicotine and (-)-nicotine. [1] -- Fuzheado (talk) 20:33, 1 October 2018 (UTC)
Yup, I add the property whenever I hear a name on the radio, and then do a search for that person to get more info. Once I distinguish the correct person, I add the sound alikes (near homophones) to make it easier for the next person to distinguish them. --RAN (talk) 20:58, 3 October 2018 (UTC)

Nicknames

What is best practice for nicknames? nickname (P1449) only takes a string as a value, not an item, and seems more suited to a unique byname (Q279239) than to a common nickname of the hypocorism (Q1130279) type. I'm not sure if this calls for a new property or just the considered use of qualifiers, but it's strange that short-form nicknames seem to be rare on biographical items. I've put in an example for John F. Kennedy (Q9696) and Jack (Q1159009) - how does Q9696#P735 look?--Pharos (talk) 11:39, 4 October 2018 (UTC)

  • It depends: if it's a name used for affection or ridicule, P1449 should do. Although in the latter case, for living people, there may be other issues.
    If it's just the first name people are generally known by, this should be in P735.
    --- Jura 11:57, 4 October 2018 (UTC)

Reminder: No editing for up to an hour on 10 October

12:03, 4 October 2018 (UTC)

How to indicate a work depicts the whole (or only a part) of a subject?

Example: how to indicate that a map depicts a country in its entirety, not just a part of the country? (cf: c:Category:Old maps of whole Wales (alone) as opposed to c:Category:Old maps of Wales)

applies to part (P518) in connection with depicts (P180) usually indicates a part of the subject of the statement, i.e. which part of the work the statement refers to, rather than which part of the stated object of the statement. So I don't think depicts (P180) Wales (Q25) / applies to part (P518) whole (Q16868672) would be appropriate.

shown with features (P1354) is often used to give further detail about the objects of depicts (P180), but I'm not quite sure I could get it to work here. Does shown with features (P1354) = whole (Q16868672) work ?

proportion (P1107) = 1 seems obscure and cryptic.

object has role (P3831) = whole (Q16868672) is a standard fall-back qualifier, and at least clearly indicates that it is the object of the statement that the qualifier is describing, but is whole (Q16868672) really a role? And object has role (P3831) = part (Q15989253) seems ambiguous -- would it be implying that this is part of what the map depicts, or that a part of this is what is depicted?

Equally, if a painting depicted some well-known item (with its own Q-number), but only part of it, how should one indicate this?

Anyone got any good ideas as to how to code this clearly and unambiguously? Jheald (talk) 12:07, 3 October 2018 (UTC)

proportion (P1107) = 1 was the closest I could guess on this. Maybe we need a new qualifier to indicate the fraction of a target item included in the original item? ArthurPSmith (talk) 14:42, 3 October 2018 (UTC)
@ArthurPSmith: Property proposed, as Wikidata:Property_proposal/Creative_work#depicted_part. Jheald (talk) 16:00, 3 October 2018 (UTC)
I’d say that unless specified otherwise, we can assume the whole is depicted. Otherwise, I think the qualifier applies to part (P518) has been created for this purpose. author  TomT0m / talk page 19:59, 3 October 2018 (UTC)
Yes, I think applies to part (P518) fits here. --Marsupium (talk) 20:23, 3 October 2018 (UTC)
@Marsupium, TomT0m:
Here is a query to survey the most common values for applies to part (P518) as a qualifier on depicts (P180): tinyurl.com/y7jth23e
Common values are background (Q13217555), foreground (Q2533752), left (Q13196750), right (Q14565199), obverse (Q257418), reverse (Q1542661), down (Q15332388), and up (Q15332375): all referring to the part of the image (or coin), not the part or extent of the thing depicted.
For heraldic objects, common values are coronet (Q50324108), supporter (Q725975), first quarter (Q27927512), second quarter (Q27927517), third quarter (Q27927518), fourth quarter (Q27927519), escutcheon (Q1227885), inescutcheon (Q740263), chief (Q1475346), fess (Q1088484), point (Q643584), canton (Q1267314), bordure (Q849732) -- all referring to the part of the heraldic design, not designating a part of the thing depicted.
This is also how the Structured Data team is expecting (and will be encouraging) P518 to be used -- see eg example in their most recent slide deck.
applies to part (P518) is to identify the part of the image or of the heraldic device that depicts (P180) applies to. We need something else, if we want to be able to specify the extent or part of the value of P518 that is depicted. Jheald (talk) 21:34, 3 October 2018 (UTC)
@Jheald: yes, I realised this in the property proposal. Are you aware if the structured data team envisioned the possibility to use whole items for parts of a work ? the part-items can then be described with plain statements and the whole has the item parts as part of. author  TomT0m / talk page 07:44, 4 October 2018 (UTC)
Nevertheless, I still think we should assume that wholes are depicted unless specified otherwise. There is also the possibility of creating items for parts such as « human leg » (or « woman leg »; I think I did things like that in the past). « Woman leg » is then part of « woman » (and a subclass of human leg), and it’s easy to say « work depicts woman leg ». And it’s reusable. author  TomT0m / talk page 07:44, 4 October 2018 (UTC)
@TomT0m: Obviously, if one has a more specific item, then one should use it for the value of depicts (P180). But I don't think we're going to be mass-creating items for things like "Queen Victoria's head". So it is useful to have the option to record depicts (P180) Victoria (Q9439) "depicted part" head (Q23640).
As to whether we should assume that wholes are depicted unless specified otherwise, I don't think that is a sound strategy for maps. When a map is classified under subject X, eg "Brittany", either in an external catalogue or a Commons category, one cannot assume that it is a map of the whole of Brittany - that is just not the way those sources work. And when an item here says "depicts Brittany", one cannot reliably assume that it depicts the whole of Brittany either - the statement may simply have been imported directly from a source that doesn't distinguish; and in any case, up until now there is no way to record that a map "depicts Brittany (part of)" or "depicts Brittany (location within)". So currently, if a statement says "depicts Brittany" one would be wise not to assume anything, and additional information "depicted part: entirety" would indeed be additional knowledge. Jheald (talk) 09:40, 4 October 2018 (UTC)
@Jheald: What sources are you referring to? If their data are that imprecise or that bad, it might not be wise to import them and curate them afterwards on Wikidata directly; we might instead envision a cleaning tool like a Wikidata game. We create a dataset to be validated by us, and the tool creates the statements after curation. With Brittany, we already have a bunch of items for its parts, for example. If in those sources « depicts: Brittany » is used as a kind of category, like « Brittany » in Wikipedia, this might be as bad as bulk-importing a wp category without curation. author  TomT0m / talk page 11:36, 4 October 2018 (UTC)
Also, in terms of a picture of a building: no one photo ever depicts a "whole" building. At most, it will depict 2 sides/façades (in a corner view). - Jmabel (talk) 15:51, 4 October 2018 (UTC)
That’s a general truth for any « model » of an object, no model (scientific, 3D, …) depicts all aspects of an object. If you have a map of a country, you’ll have feature in the map for say roads, mountains but no map will include all the informations. This implies maybe we need properties to describe the features of the « model » of the object (scale, resolution, features of the object depicted in the model, …). But I don’t think we should try to model pictures just as we can model datasets :) In the picture case, maybe we need a way to describe « viewpoint » taken, more than considering this as a depiction of some part ? author  TomT0m / talk page 17:43, 4 October 2018 (UTC)
@TomT0m: Well the structured data project is exactly trying to model pictures, and will include properties for scale, resolution, features of the object depicted (this property). Those properties can also be useful for pictures that have their own Wikidata items. We already have scale (P1752) for example. To describe photographs of models of things, we will probably need a wikidata item for the model, which would then hold the information about the model, albeit that this has some issues as laid out here. Jheald (talk) 19:17, 4 October 2018 (UTC)

Mystic Marriage of Saint Catherine

Does anyone see a reason why mystic marriage of Saint Catherine (Q4296527) and mystic marriage of Saint Catherine of Alexandria (Q20729926) should not be merged? - PKM (talk) 20:23, 3 October 2018 (UTC)

PS There are no 'mystic marriages' of other St Catherines. - PKM (talk) 21:15, 3 October 2018 (UTC)
Apparently one is for the commons category. I guess they can be merged since I don't know why these shouldn't be. Jane023 (talk) 11:45, 4 October 2018 (UTC)

The first one concerns both Catherine of Siena (Q229190) and Catherine of Alexandria (Q179718) while the second only concerns Catherine of Alexandria (Q179718). IMHO they shouldn't be merged. — Ayack (talk) 13:19, 4 October 2018 (UTC) FYI, I've just created mystic marriage of Saint Catherine of Siena (Q56876704). — Ayack (talk) 13:25, 4 October 2018 (UTC)

@Jura1, Ayack: And there's the answer, thank you! I was afraid I was missing something. - PKM (talk) 19:18, 4 October 2018 (UTC)

Does Wikidata not even have any way of entering both hyponyms and hypernyms?

So "vehicle" would be hyponym of "machine" but hypernym of "motor car" and "motor truck". As far as I can see, Wikidata has a property "subclass of" but no corresponding property "superclass of", and the explanation of the properties "hypernym of" and "hyponym of" manage to (unclearly) suggest that they both in fact capture the same semantic relationship (i.e., you can enter hypernyms but not hyponyms). I may not have time to come back here and see whether anyone responded to this question, but I am tossing the grenade just to say that either Wikidata's data model glaringly sucks or it is inadequately explained (or both). All of the "nyms" outlined at Wiktionary:Wiktionary:Semantic relations would need to be straightforwardly covered by Wikidata using predictable property names or aliases. Are they? Quercus solaris (talk) 01:03, 4 October 2018 (UTC)

  • I don't think we should generally have inverse properties where the typical relation is few-to-many, if not quite 1-to-many. We don't want a single item to have to carry 100+ values for the inverse property.
  • If we can come up with some sort of bot-maintained "shadow" inverse property, rather than have to do queries each time, that would be great, but all database design principles I know argue against maintaining an editable, independent inverse property in these circumstances. - Jmabel (talk) 01:47, 4 October 2018 (UTC)

Is this even modelable in WikiData ? I mean. Basically all mammals would have a direct relation with all animals. That is a humongous group.. There is a reason that most of the wiktionary pages only provide examples of such relations. Put otherwise: A set of related words can be structured in hyponyms and/or hypernyms, but can you model all possible words into all possible sets of hyponyms and hypernyms? It seems to me you'd require a different type of database to do that effectively. When I read about the concept it says: "Hypernyms and hyponyms are asymmetric" "Hyponymy is a transitive relation, if X is a hyponym of Y, and Y is a hyponym of Z, then X is a hyponym of Z". That at least doesn't seem like a concept with a child-parent relationship to me, so I think any Wikidata properties can never be more than "Example of a hypernym" and "example of a hyponym" really. TheDJ (talk) 09:26, 4 October 2018 (UTC)

  • Thank you all for your comments. I acknowledge (1) the problem of gargantuan scope for the inverse and (2) the fact that it would be superfluous anyway if the "shadow" could be presented dynamically as a w:database view or collated by a bot. I hope that someday the latter ideas will be realized. Regards, Quercus solaris (talk) 23:56, 4 October 2018 (UTC)
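The "shadow" inverse mentioned by Jmabel and Quercus solaris can be illustrated with a minimal sketch: only the child-to-parent relation is stored, and the inverse is derived on demand. The edge list reuses the vehicle example from the question; the function names are hypothetical and this is not how any Wikidata tool is known to implement it.

```python
from collections import defaultdict

# Stored "subclass of" edges as (child, parent), from the example in the question.
SUBCLASS_OF = [
    ("motor car", "vehicle"),
    ("motor truck", "vehicle"),
    ("vehicle", "machine"),
]


def direct_subclasses(edges):
    """Build the inverse index parent -> [children] from the stored child -> parent edges."""
    inverse = defaultdict(list)
    for child, parent in edges:
        inverse[parent].append(child)
    return dict(inverse)


def all_subclasses(edges, root):
    """Collect every transitive subclass of root by walking the derived inverse index."""
    inverse = direct_subclasses(edges)
    found, stack = [], [root]
    while stack:
        for child in inverse.get(stack.pop(), []):
            if child not in found:
                found.append(child)
                stack.append(child)
    return found


print(direct_subclasses(SUBCLASS_OF)["vehicle"])  # ['motor car', 'motor truck']
print(all_subclasses(SUBCLASS_OF, "machine"))     # ['vehicle', 'motor car', 'motor truck']
```

Because the inverse is computed from the single stored direction, there is no second, independently editable property that could drift out of sync, which is the database-design concern raised above.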

How to record reputation of Science Awards?

I want to record the reputation/ranking scores of academic awards according to IREG List of Academic Awards (Q56884231) (http://ireg-observatory.org/en/pdfy/IREG-list-academic-awards-EN.pdf) Should I record it like this, or is there a better way?

Kavli Prize in Neuroscience (Q18889781) review score (P444) 0.55,

--Vladimir Alexiev (talk)

  • I would say that sounds reasonable and, at worst, if you do that consistently, then it would be very easy to fix it to something else in the future. - Jmabel (talk) 16:21, 5 October 2018 (UTC)

Hello,

When a user has set certain languages, e.g. lb, pfl, ..., in the user preferences, it is not possible to access the main page via the side menu or the Wikidata logo at the top left of each Wikidata page. When you are logged in with the preferred language lb and click 'Haaptsäit' in the side menu, you are directed to https://www.wikidata.org/wiki/Haapts%C3%A4it (which is not really useful). For the language pfl you are redirected to https://www.wikidata.org/wiki/Schdadsaid

On the other hand for some other languages like gsw you are redirected to the German version of the main page.

Unfortunately I have not yet found out how to change this. Is there anyone who can help with this?

Many thanks for any comments on this. Robby (talk) 09:21, 7 October 2018 (UTC)

I've fixed lb and pfl and if there are any more you'd like fixing, I can do those too. The problem is caused by Wikidata being a multilingual site (Commons has the same problem) and made worse by Wikidata's main page not being in the main namespace. Right now, they have to be fixed one by one by creating MediaWiki:Mainpage/xx (where xx is the language code) with "Wikidata:Main Page" as the content. - Nikki (talk) 10:17, 7 October 2018 (UTC)
Many thanks for the quick reaction and the information given. As I mainly work with the lb interface, this was the most important language for me to have fixed. Robby (talk) 13:06, 7 October 2018 (UTC)

Are all articles listed on Wikidata?

Hello all! This might be the silliest question you see today, but I'm wondering whether, for example, all the articles on the Arabic Wikipedia are listed as entries on Wikidata. --Reem Al-Kashif (talk) 12:03, 7 October 2018 (UTC)

Thank you!--Reem Al-Kashif (talk) 14:43, 7 October 2018 (UTC)

Questions about creating a new article.

I, yes, me (in case of a COI rule), do not have a wikipedia article. But I am mentioned on wikipedia articles as parts of groups/teams.

I am older and most of my life/work was pre-internet. Therefore I do not have a "web presence", nor have I ever wanted one. But the office publicist recently told me "your information is already out there, so you might as well manage it". And that has sort of resonated with me. I do have a Knowledge Graph on Google, which I recently "claimed". And I have created some of the social networking profiles (not that I have much to say) that are not yet "verified".

  1. I know that I am not allowed to edit Wikipedia articles that I am personally involved with (whether personally mentioned or not). Can I edit the same Wikidata items?
  2. Because I do not have a Wikipedia article, can I have a Wikidata article? I do have the "properties" that will be listed on a Wikidata page (BnF, ISNI, IPI, etc).

My primary goal is to gather "the facts" into one place so that I am not confused with others of the same name. And maybe get a picture on "the commons" that can be used by the sites that grab the data from this site. Beyond that, personal promotion via the internet is new to me. I have no idea what else may come of it.  – The preceding unsigned comment was added by 73.79.233.39 (talk • contribs) at 6. 10. 2018, 19:28‎ (UTC).

I don't think there's a rule against editing your own Wikidata item. But check Wikidata:Notability to find out if you should have one. Then (ideally) only add data that is referenced against an external source. If you are an author of any works that already have an item on Wikidata, you'd meet the notability requirement. Ghouston (talk) 00:18, 7 October 2018 (UTC)
Most of that is gibberish to me. I mean, subpage of a module, mainspace pages, subpages of mainspace pages, clearly identifiable conceptual or material entity.
Like I said (in common words), I am mentioned on multiple wikipedia pages (I don't even know what some of those other wiki things are - source, quote, news, books) in multiple languages. I also have a Google Knowledge Graph, so I can only imagine that I am notable enough to have centralized interlanguage links. 73.79.235.89
Yeah, I guess it is (gibberish). If you create the item, at worst it will end up on Wikidata:Requests for deletions some day. It will usually be because the item is about someone who isn't mentioned on other websites (except user-generated, like Facebook etc.) Ghouston (talk) 10:09, 8 October 2018 (UTC)

SUL and WD password

Hi, for many weeks I have no longer been able to log in to WD through my SUL, even though it remains possible for other WM projects (Commons, WP, etc.). Has anyone had the same problem and/or solved it? Thanks for answering. 78.125.131.167 11:49, 7 October 2018 (UTC)

On one of my devices (Ipad) I have the same problem. I am logged in on two other Wikimedia projects but Wikidata shows me as if I am not logged in.--Ymblanter (talk) 11:54, 7 October 2018 (UTC)
What happens when you try to log in? ·addshore· talk to me! 20:25, 8 October 2018 (UTC)

Protection?

[2] Enigmaman (talk) 04:46, 9 October 2018 (UTC)

✓ Done Surprised this wasn't done earlier. Mahir256 (talk) 05:09, 9 October 2018 (UTC)

External-id max characters

I've tried to add InChI (P234) to icatibant (amide form) (Q27137008), but I got an error saying that the maximum number of characters is 400, while the correct value is longer (517 characters):

1S/C59H90N20O12S/c60-37(14-5-19-68-57(62)63)49(84)74-39(16-7-21-70-59(66)67)53(88)76-22-8-18-43(76)55(90)78-30-35(81)26-44(78)51(86)71-28-47(82)72-40(27-36-13-9-23-92-36)50(85)75-41(31-80)54(89)77-29-34-12-2-1-10-32(34)24-46(77)56(91)79-42-17-4-3-11-33(42)25-45(79)52(87)73-38(48(61)83)15-6-20-69-58(64)65/h1-2,9-10,12-13,23,33,35,37-46,80-81H,3-8,11,14-22,24-31,60H2,(H2,61,83)(H,71,86)(H,72,82)(H,73,87)(H,74,84)(H,75,85)(H4,62,63,68)(H4,64,65,69)(H4,66,67,70)/t33-,35+,37+,38-,39-,40-,41-,42-,43-,44-,45-,46+/m0/s1

Is it possible to add this value in some other way? That is certainly not the only item in which InChI (P234) should be that long (or longer). Wostr (talk) 19:48, 8 October 2018 (UTC)

This is the "common string" max length as currently defined in Wikibase. Looking at the code: "Defaults to 400 characters. This was an arbitrary decision when it turned out that 255 was to short for descriptions.". I guess this could be increased with a Phabricator ticket explaining why it is needed :) ·addshore· talk to me! 20:24, 8 October 2018 (UTC)
see phab:T154660 --Pasleim (talk) 21:34, 8 October 2018 (UTC)
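Until the limit is raised, values can be screened before an upload is attempted. The following is only a minimal sketch in Python, assuming the 400-character "common string" limit quoted above still applies; the function name and the constant are made up for illustration and are not part of Wikibase.

```python
from typing import Tuple

# Assumption: 400 is the "common string" limit quoted in the thread above.
WIKIBASE_STRING_LIMIT = 400


def check_string_length(value: str, limit: int = WIKIBASE_STRING_LIMIT) -> Tuple[bool, int]:
    """Return (fits, overshoot): whether the value fits, and by how many characters it is too long."""
    overshoot = max(0, len(value) - limit)
    return overshoot == 0, overshoot


if __name__ == "__main__":
    # Stand-in value with the length reported for the icatibant InChI (517 characters).
    placeholder = "x" * 517
    fits, overshoot = check_string_length(placeholder)
    print(f"fits={fits}, characters over the limit={overshoot}")
```

A batch of identifiers could be passed through this check first, so that the over-long ones are set aside for later rather than failing one by one in the interface.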

Wikidata weekly summary #333

A history of European Carnival sculpture makers

Hello everyone, I've noticed that there are no references to the contemporary or historical makers of Carnival Art, an ancient folk art activity, on Wikipedia. Although I am just beginning to understand the site, I'd like to research, write and publish an article about the subject. I'm beginning by interviewing contemporary makers about their practice and who influenced their development. I will transcribe, quote and paraphrase these interviews to create the article. Does this method qualify as a primary source or does it require further citations? Iotasteve (talk) 12:35, 9 October 2018 (UTC)

Wikisource: Notability of subpages of mainspace pages

Wikidata:Notability 1.5 states „The status of subpages of mainspace pages (for example, individual chapters) is undetermined”. This is from 2014. Is there any progress on this? --Succu (talk) 19:42, 9 October 2018 (UTC)

works of authors

How come Wikidata doesn't include the works (books) of authors? As far as I remember Freebase did this and had a huge catalogue of author-related works. Or did I just not find them? --Rabenkind (talk) 20:56, 9 October 2018 (UTC) Add: why I ask this: it makes it simpler to have a list of all books published and to look for translations of those books. --Rabenkind (talk) 21:01, 9 October 2018 (UTC)

@Rabenkind: It might help if you mentioned which books by which author you're looking for; as long as you're not self-promoting or spamming your or someone else's books on Wikidata and the books are either 1) mentioned in an external database (VIAF, LCCN, GND, etc.) or 2) used as a reference to a claim on Wikidata, then those books and authors are eligible for Wikidata items. Mahir256 (talk) 21:35, 9 October 2018 (UTC)
It's more about how to extract all books (in all known languages) by certain authors, e.g. George Orwell. As I found out that 'Animal Farm' is linked to Orwell, I was trying to use the Query Service to get that working, without success so far. --Rabenkind (talk) 21:46, 9 October 2018 (UTC)
@Rabenkind: Write a query with instance of (P31) = written work (Q47461344) AND author (P50) = George Orwell (Q3335). Snipre (talk) 22:45, 9 October 2018 (UTC)
Something like
# Written works (Q47461344) whose author (P50) is George Orwell (Q3335)
SELECT ?h ?hLabel
WHERE
{
	?h wdt:P31 wd:Q47461344 .
	?h wdt:P50 wd:Q3335 .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Or
# Books (Q571) whose author (P50) is George Orwell (Q3335)
SELECT ?h ?hLabel
WHERE
{
	?h wdt:P31 wd:Q571 .
	?h wdt:P50 wd:Q3335 .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Snipre (talk) 22:56, 9 October 2018 (UTC)

That's what I was looking for - thanks a lot! --Rabenkind (talk) 07:31, 10 October 2018 (UTC)
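For anyone who wants the same list for other authors, the query can be generated rather than edited by hand. This is only a sketch in Python: the function name is made up for illustration, it merely builds the query text (it does not contact the Query Service), and it assumes the works are modelled with instance of (P31) and author (P50) exactly as in the queries above.

```python
import re

# Item IDs look like "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def works_by_author_query(author_qid: str, class_qid: str = "Q47461344") -> str:
    """Build the SPARQL text for items of class_qid (P31) with the given author (P50)."""
    for qid in (author_qid, class_qid):
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a valid item ID: {qid!r}")
    return (
        "SELECT ?h ?hLabel\n"
        "WHERE\n"
        "{\n"
        f"  ?h wdt:P31 wd:{class_qid} .\n"
        f"  ?h wdt:P50 wd:{author_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "}\n"
    )


if __name__ == "__main__":
    # George Orwell (Q3335), first as written work (default), then as book (Q571).
    print(works_by_author_query("Q3335"))
    print(works_by_author_query("Q3335", class_qid="Q571"))
```

The printed text can be pasted into the Query Service. Items that are typed with some other class than the two used here would not be found by either variant.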

Using 'imported from Wikimedia project' for adding a reference calls anything and everything

When adding a reference using imported from Wikimedia project (P143), it doesn't restrict the suggested list to Wikimedia Foundation project (Q14827288), which necessitates extra typing and increases the chance for errors. Can this be fixed? Abductive (talk) 04:34, 6 October 2018 (UTC)

Round 2

Since the remarks indented under what I said in no way address what I said, I'm going to repeat it, hoping to get an actual response.

What User:Matěj Suchánek says above directly contradicts what I was told on this very page barely a month ago, in a discussion that led to this change in imported from Wikimedia project (P143) as a supposed clarification. There were similar changes in several other languages. What Matěj is saying had been my previous understanding; I was kind of "newbied" and told in no uncertain terms that I was wrong. - Jmabel (talk) 20:01, 9 October 2018 (UTC)

So which is correct? - Jmabel (talk) 20:02, 9 October 2018 (UTC)

  • Maybe my memory fails me, but I may have read the same thing you wrote two years ago. Does it really matter? I think the recent change is an advantage. I tend to add "I think" to my comments, mainly to make it clear that's primarily a personal view. Someone else would just write "The recent change is an improvement mandated by ..". The user you mention is generally known for thoughtful comments. --- Jura 20:22, 9 October 2018 (UTC)
  • As you are referring to my comment, here is some clarification: imported from Wikimedia project (P143) should only be used with Wikimedia projects as values, and stated in (P248) should not (must not) be used with Wikimedia projects as values. For both reference qualifiers it is irrelevant whether they are added via tools/bots or manually.
    That said, let’s have a closer look at the situation: imported from Wikimedia project (P143)’s main purpose is to track provenance of data that has been batch-imported from a Wikipedia project, which is generally considered an unreliable source in the Wikimedia universe. The import script cannot verify the imported data, thus the imported from Wikimedia project (P143) reference is a marker to use this data carefully unless there is another, serious source available. Wikipedia templates do indeed consider P143-only referenced statements as unreferenced. Now if you manually import from a Wikipedia, you can also use the imported from Wikimedia project (P143) reference to indicate provenance, but it would in many cases be advantageous if you manually verified the imported value against an external source whenever one is available, and provide this directly in the references section rather than the imported from Wikimedia project (P143) “auxiliary reference”.
    For that reason, I also think that it is reasonable not to offer the reference qualifier imported from Wikimedia project (P143) too much in the GUI. —MisterSynergy (talk) 21:22, 9 October 2018 (UTC)
  • I would say that about half the time I've used imported from Wikimedia project (P143) manually it was because it was already there, but due to the limited intelligence of bots the Wikidata statement did not accurately reflect what the article said. The other half has, indeed, been to add basic data that was sitting in the Wikipedia article but not in Wikidata. Are you saying that it would be better for me not to draw facts from Wikipedia at all when I have no interest in taking the time to find more solid sources? - Jmabel (talk) 23:44, 9 October 2018 (UTC)

Use version type (P548) as property

Currently, version type (P548) can be used only as a qualifier. What about using it as a property for items that represent program versions?--Malore (talk) 23:33, 6 October 2018 (UTC)

For instance? Matěj Suchánek (talk) 08:30, 7 October 2018 (UTC)
Use « instance of » for this kind of usage. author  TomT0m / talk page 07:53, 11 October 2018 (UTC)

Template markup doesn't work in Description field

I tried to use the {{P|P...}} template within a Description and was surprised to find that it didn't work. Is this a feature or a bug? Abductive (talk) 23:42, 6 October 2018 (UTC)

@Abductive: That's right. The descriptions are intended to be re-usable in all sorts of environments beyond Wikimedia, so template markup is not supported. Jheald (talk) 23:58, 6 October 2018 (UTC)
Even plain references to other entities should be avoided if possible. Matěj Suchánek (talk) 08:31, 7 October 2018 (UTC)
Is there a « (Not so) frequently (but this still pops up from time to time) asked question » page ? author  TomT0m / talk page 08:00, 11 October 2018 (UTC)

Using IMO number for ship owners and managers

The property IMO ship number (P458) is now used in more than 10,000 items, basically ships registered with a number in the database of International Maritime Organization (Q201054). The question is: Should we extend this property to also include ship owners and managers? E.g. Wilson Ship Management (Q47149956) has the IMO number 1168545 and Norled (Q7051397) has the IMO number 1956053. Is there any good reason that we should not add these IMO numbers? --Cavernia (talk) 18:50, 8 October 2018 (UTC)

I’ve seen some identifier properties used to identify the type of item they are on (it has this identifier => it’s a ship), but instance of (P31)/subclass of (P279) already does this, so I’d say that as it’s an identifier from the same database, go for it and extend the use of the property to cover the full range of kinds of objects it identifies. author  TomT0m / talk page 08:04, 11 October 2018 (UTC)

Mistakes

One of the problems of assuming that bodies like Historic England (formerly English Heritage) have got it right is that we propagate their mistakes here and onwards. This, for example, is not on Slately Road, it's on Slatey Road, and I've had to email a Minor Amendment to them. Will it be picked up here? Their geocoords are also often inaccurate. There's also no point using their titles here since they are not unambiguous, as ours are forced to be. "Lodge" is a common term they use, but we use "Lodge of Allerton Manor" on Commons. Could we have a bit more due diligence please, rather than the current obsession with quantity over quality? Thanks. Rodhullandemu (talk) 12:35, 9 October 2018 (UTC)

@Rodhullandemu: I too think more meaningful titles here would be useful, but there has been debate about this, e.g. as to whether "Church of St Mary", or "St Mary's, Little Snoring" is preferable. I would prefer the latter (close to how Commons names its categories; less likely to be confused; and easier to select correctly from a drop-down list), but there is a significant constituency of opinion for the former. (Note also in passing that the title does not in fact have to be unique on Wikidata (unlike Commons) -- on Wikidata the actual restriction is that an item has to have a unique combination of title + description.)
On the second point, quantity probably comes first; then scripts etc can be used to compare the data from different sources and identify anomalies. A mistake from a body like Historic England should be noted here with reference (including url and date), but ideally marked down to rank=deprecated, with a qualifier like reason for deprecated rank (P2241) = error in referenced source or sources (Q29998666). This (a) helps when checking what has and hasn't been imported from the database; and (b) can help us if we are drawing from further sources, that in turn may have drawn their data from Historic England. And (c) it gives us an easy way with a query to pull out a catalogue of issues we think we may have found with a particular source, that we can later check to see whether they may have updated. Jheald (talk) 09:04, 10 October 2018 (UTC)
It seems you also forgot to mention the use of aliases for this kind of usage: a string like « St Mary's, Little Snoring » can definitely be put in an alias to help find this item. author  TomT0m / talk page 08:09, 11 October 2018 (UTC)
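The "catalogue of issues" query mentioned above could look roughly like the text produced below. This is only a sketch in Python with several assumptions: the function name is made up, the source is assumed to be cited with stated in (P248) pointing at an item for the source (a reference that only carries a URL would not match), and the item ID of the source has to be supplied by the caller because it is not given in this thread.

```python
import re


def deprecated_by_source_query(source_qid: str) -> str:
    """Build SPARQL text listing statements deprecated as an error in the referenced source.

    Assumption: the source is cited via stated in (P248) on the statement's reference.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", source_qid):
        raise ValueError(f"not a valid item ID: {source_qid!r}")
    return (
        "SELECT ?item ?statement\n"
        "WHERE\n"
        "{\n"
        "  ?item ?claim ?statement .\n"
        # rank=deprecated, with reason for deprecated rank (P2241)
        # = error in referenced source or sources (Q29998666)
        "  ?statement wikibase:rank wikibase:DeprecatedRank ;\n"
        "             pq:P2241 wd:Q29998666 ;\n"
        "             prov:wasDerivedFrom ?reference .\n"
        f"  ?reference pr:P248 wd:{source_qid} .\n"
        "}\n"
    )
```

Calling the function with the item ID of the source in question gives query text that can be pasted into the Query Service and re-run later to see whether the source has since been corrected.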

Suitability of dataset for Wikidata

Hi. The National Library of Wales is planning on sharing a dataset, extracted from the Welsh Book of Remembrance. It contains data on 16,000 soldiers who died in action in the First World War. At a minimum these entries contain names, date and place of death and their regiment, but many also contain dates of birth, details of the war monument their names are recorded on and other biographical data. We are considering Wikidata as a possible home for this data, which was transcribed and enriched by volunteers, but I wanted to see how the community felt about adding such a large volume of relatively obscure people to Wikidata, although obviously there is great potential to link people to regiments, war memorials and battle sites. I would appreciate any feedback. Thanks! Jason.nlw (talk) 15:20, 9 October 2018 (UTC)

Use that dataset to complete data on existing items. The presence of a person in that dataset is not sufficient to justify the creation of an item, but each person who already has an item and is in that dataset can have the corresponding data added in WD. As I see the data addition, each year items from WD should be compared with the dataset, and if a match is found the data are transferred from the dataset to WD. During that annual check, previously added data could be checked to ensure data integrity. Snipre (talk) 16:46, 9 October 2018 (UTC)
A previous discussion at Wikidata:Project_chat/Archive/2018/07#Scholarly_projects_and_notability_policy and deletion request concluded that even an item like Nathaniel Oldham (Q55445723) is notable enough for Wikidata, because the person has been listed in an external database. The Welsh soldiers seem like an improvement on that one. Ghouston (talk) 19:49, 9 October 2018 (UTC)
  • For some of the entries, you might find additional elements that make them worth including. Personally, I'd find it interesting if you could generate counts for first names and surnames in these lists and add at least those. Similar to the ones we have for Rotterdam/the Netherlands, e.g. at Q4925477#P5323. I can add them to items if you post them somewhere on a Wikidata page. --- Jura 06:04, 10 October 2018 (UTC)

Thanks everyone. @Jura1: If we do go down the Wikidata route I will definitely share the data with you for the name counts. It sounds from the test case highlighted by Ghouston that this data would be deemed acceptable. It looks as though we have the CWGC person ID (P1908) for each person too, so each person could be linked to at least 2 external databases: Commonwealth War Graves and The Welsh Book of Remembrance. We would obviously be careful to match to people who already exist in Wikidata; however, I think if we were going to do this the Library would want to share the complete set, rather than just the richest parts of it. The benefit to the institution comes from being able to offer users full access to the data via Wikidata (there will also be a data dump on GitHub, I think). Obviously we can also make much better use of queries and visualizations if the data set is complete. Jason.nlw (talk) 08:22, 10 October 2018 (UTC)

  • I think it's worth discussing. w:WW1 notes some 9 million soldiers who died in WWI. Obviously, they all have some identifier. I don't think that makes them notable as such. --- Jura 08:53, 10 October 2018 (UTC)
Yes, this definitely needs discussing, and I'm grateful for all the input so far. I understand that many of these people might not be notable in the Wikipedia sense of the word, but does Wikidata really follow the same guidelines? In past discussions I've been told that Wikidata will generally accept all external datasets as long as they are deemed useful: something that links to, and enriches, other data. For a lot of people, having a single database of all 9 million WWI soldiers as part of a larger dataset, with linked data on regiments, places etc., would be an exciting prospect. Take the sum of all paintings project for example. Is a painting notable in itself for being in a museum? Some are for sure, but many are not. Yet it is the collection of all those items together that has real value (for research and discovery), rather than each individual piece... Well, that's what I think anyway! Please discuss. Jason.nlw (talk) 10:27, 10 October 2018 (UTC)
    • @Jason.nlw: Your own wikibase offering its own federated SPARQL query service might also be a possibility. I believe there's been quite a lot of work put in to try to make it easier for GLAMs to set up and run such a service, and various GLAMs that may be exploring this possibility for different projects. It might be a valuable thing for NLW to run as a pilot, to explore the technology. Not sure if there is a user group. Jheald (talk) 09:12, 10 October 2018 (UTC)
Hi Jheald, thanks for this. Yes, this is definitely where we want to get to; however, our current level of resources means we simply don't have the capacity to set this kind of thing up, which is one of the reasons we have been so keen on using Wikidata: the infrastructure is there already! And we have repeatedly found that sharing with Wikidata adds value to our data, as we can pull back data like coordinates, external identifiers and additional biographical information. Jason.nlw (talk) 10:27, 10 October 2018 (UTC)
@Jason.nlw: Understood. And of course, it's not just the setting up of your own wikibase, it would be the maintenance and the keeping of it updated, etc which requires ongoing resources. And as you say, it's lonely being away from the main community; even if you can match items to Wikidata, and share Wikidata through federation, you don't get the incidental improvements from bystanders, and all the data you pull back you have to do yourself.
But still, the issue of secondary databases, as hub-and-spoke satellites to Wikidata, is one that a number of projects are coming to have to look at, as they start to strain WD:Notability -- for example, the current discussion in WikiCite. So there are others that would be facing this too. And, in the same way that one can rent a wiki from a serviced wiki-farm in the cloud, it may well be that there is a similar availability of serviced wikibases, with somebody else managing the setup and the servers and the maintenance and the query service, so that NLW as an end-user might just be able to sign a turn-key contract, and then start putting in data. Jheald (talk) 12:23, 10 October 2018 (UTC)
@Jason.nlw: It's also possible you might be able to get a consortium of GLAMs together behind a UK biographical wikibase -- there might be several cases of datasets such as yours containing people not considered to fulfill WD:N, but that it would be useful to have in an ecosystem attached to Wikidata. Jheald (talk) 12:29, 10 October 2018 (UTC)
Thanks Jheald, there are some really interesting ideas here. In Wales, for example, there is now a big consortium of libraries using the same catalogue system, so perhaps having a wikibase as some kind of extension to that could be a way forward, and a way of opening up the data of smaller libraries, who may not even have heard of Wikidata! Jason.nlw (talk) 09:02, 11 October 2018 (UTC)
    • Personally, I don't think adding these at scale is a good idea. While it might technically comply with the notability policy, it's really pushing the spirit of it - eg in theory we could describe everyone alive in the UK or US in certain years using public census data to create WD items, but we'd probably agree that was too much.
In the example Ghouston linked to, while the deletion discussion passed, the discussion on project chat was more nuanced and suggested that this is a good case where a separate database would be able to give the information they want. It's worth noting that in that case, those two test items were the only ones created - it didn't proceed further - so I wouldn't take it as unqualified endorsement/precedent for this being a good approach. Andrew Gray (talk) 11:29, 10 October 2018 (UTC)
  • Thanks Andrew, I appreciate your input. I take your point about the cited test case, and I wouldn't want to try and force anything through based on that alone. However, what is becoming clear is that we do not have any kind of community consensus on what should be allowed on Wikidata beyond the WD:N guide, partly because the project is developing so fast, and so its purpose/mission too is ever developing. Technically this data meets points 2 and 3 of WD:N. However, the guide leaves a lot open to interpretation, perhaps deliberately, as Wikidata tries to find its purpose. Personally I think Wikidata should support and follow Wikipedia's goal of providing access to the sum of ALL human knowledge, and when it comes to data, shouldn't ALL mean ALL? Jason.nlw (talk) 09:20, 11 October 2018 (UTC)
If something technically complies with the notability policy, but isn't wanted on Wikidata, maybe the notability policy should be changed. I'm personally in favor of data donations to Wikidata, for the reasons stated above that a more robust dataset is more useful for queries and users. If Wikidata ends up adding census data I would not be opposed to that. Rachel Helps (BYU) (talk) 15:58, 10 October 2018 (UTC)
  • In my opinion, this dataset should be added to Wikidata. It has serious references and can even be linked to a second database. The fear of some users of big datasets is mostly unfounded; for example if one looks what is going on related to scholarly article (Q13442814) one sees that Wikidata can deal smoothly with datasets in the order of 10M. --Pasleim (talk) 20:21, 10 October 2018 (UTC)

Succu problems again

See this edit.

Followed by my explanation why the edit was wrong.

Followed by Succu making the same bad edit again through a revert to restore his bad edit rather than engage in a discussion to find a workable solution. --EncycloPetey (talk) 05:36, 10 October 2018 (UTC)

Sorry, I saw your entry at my talk page after my revert. And I saw this entry only by chance. More later, EncycloPetey. --Succu (talk) 05:44, 10 October 2018 (UTC)
I not only left a note about the matter on your Talk Page, but also explained the issue in the edit comment with a link to the community discussion on the matter. Are you saying that you missed both messages, or that you reverted first out of habit? --EncycloPetey (talk) 11:04, 10 October 2018 (UTC)
  1. Please avoid offensive headings
  2. Please assume good faith (that you reverted first out of habit)
  3. Please provide a direct link to the item you are talking about (The botany of the Antarctic voyage of H.M. discovery ships Erebus and Terror in the Years 1839-1843 (Q6435950))
  4. You referred to the ongoing discussion We need a clear model in WD if we want to follow a FRBR structure pushing your POV
  5. It looks like Help:Sources/de#Bücher (instance of book (Q571)) and Help:Sources#Books (instance of written work (Q47461344)) are out of sync
--Succu (talk) 20:22, 10 October 2018 (UTC)
  1. The heading is not offensive.
  2. I cannot assume good faith when you repeatedly respond to edits by immediately reverting. Further, you claim not to have seen the responses I made to you, but managed to pattern your comment on the revert to match the comment I made but which you claim you did not see. This breaks the assumption of good faith.
  3. I provided links to the diffs on the item.
  4. The discussion is not ongoing. Discussion was concluded in August (two months ago) and has moved on with changes already being enacted based on the consensus. Nor is it "my POV", but the consensus of Wikiproject:Books. I neither started nor coordinated the discussion, and I was far from the only participant as can be seen by the numerous comments in that discussion from multiple commenters.
  5. Yes, the German translation is out of sync, which is why I directed you to the primary discussion.
  6. In future please read comments provided for your education. We are all in this together, and the comments were provided to bring you up to speed on current consensus. Reverting first and reading later is a poor way to do things; in particular, it does not assume good faith. --EncycloPetey (talk) 22:24, 10 October 2018 (UTC)

Federated search with data.nobelprize.org do we have any experience/design patterns for using information like this in an article

We can now do a federated query to data.nobelprize.org link list - link map and get back

  • Nobel Lecture
  • Biographical
  • Banquet Speech
  • Prize motivation
  • Documentary
  • Diploma

using SPARQL (see tweet). My question is: has anyone tested a Wikipedia template that extracts information from a federated query and presents it in a wiki article, and do we have any good pattern for that, such as caching the data? - Salgo60 (talk) 09:01, 10 October 2018 (UTC)

@Salgo60: In general, Wikipedia templates can't be filled from a SPARQL query -- not even from our own WDQS service. Instead they draw from the linked Wikibase directly. It is possible to write Javascript gadgets to run a query and then insert something that looks like a template on a page, but it's not built into the template system, and it's slow and resource-expensive -- at best, that kind of thing is a hack that can be used in very limited circumstances (eg perhaps a gadget of use to a very few power users), but not appropriate for general deployment. Jheald (talk) 09:21, 10 October 2018 (UTC)
Thanks for the clarification. Then my question is: shouldn't this be the direction Wikipedia/Wikidata moves in, or should everything be stored in WD? I guess a good pattern is to have some kind of caching to get better performance - Salgo60 (talk) 09:27, 10 October 2018 (UTC)
Technical questions aside, is the data from that website even available under a free license? Because their Terms of Use say otherwise. Without a compatible license, it simply can't be used on Wikipedia, whether we want to or not. --Kam Solusar (talk) 20:59, 10 October 2018 (UTC)
Good point my initial thought was a "Nobel prize Infobox" with links to Nobel Data ==>
  • Nobel Lecture
  • Biographical
  • Banquet Speech
  • Prize motivation
As said earlier, Nobel Data is not a problem per se; the challenge I see is WikiDataSync, i.e. how to keep Wikidata updated. I guess we need to gather good examples and, at least in Sweden, teach data owners what is expected from them when they manage Linked data. I miss good examples of Open government data with good change management working together with Wikidata data - Salgo60 (talk) 05:04, 11 October 2018 (UTC)

Wikidata Synch - Notification

Another thing is that with queries like this we could easily find when a new Nobel Prize winner is added (see tweet), and it would be nice to have a subscription service that notifies me when something has changed - Salgo60 (talk) 09:01, 10 October 2018 (UTC)

Noticing that a new nobel prize has been awarded does not seem like a problem that needs a database solution. Plumbum208 (talk) 09:53, 10 October 2018 (UTC)
The question is more about finding good patterns, and as Nobeldata has SPARQL and an API and a rather small dataset with very high visibility that everyone understands, I guess it's a good candidate for creating good "patterns"
We have in Sweden the following "cleaning" projects that will be a nightmare to keep "in shape"
To be able to manage them, I guess it would be best if we could tell the owner of the information that Wikidata prefers using this pattern to keep Wikidata in sync with the data they have. My belief is that SPARQL is a very good pattern for easily maintaining data quality against external sources - Salgo60 (talk) 11:37, 10 October 2018 (UTC)
Create a report (see for example Wikidata:WikiProject sum of all paintings/Wiki monitor/nlwiki) and add it to your watchlist. Multichill (talk) 19:52, 10 October 2018 (UTC)
Yes, Listeria is the best pattern I have seen so far for synchronizing Wikidata with external datasets, e.g. Property_talk:P1006/Mismatches - Salgo60 (talk) 05:04, 11 October 2018 (UTC)
Thanks, I made a list, d:User:Salgo60/ListeriaNobelData3, that I feel will do the job - Salgo60 (talk) 11:21, 11 October 2018 (UTC)
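
The mismatch-report pattern discussed in this thread can be sketched in a few lines of Python. This is only an illustration, assuming that both the external dataset and Wikidata have already been reduced to plain mappings from an external identifier to a value. The function name find_mismatches and the sample identifiers are hypothetical; they are not taken from Nobel Data or from Listeria.

```python
def find_mismatches(external, wikidata):
    """Compare two id -> value mappings and report the differences."""
    report = {
        "missing_in_wikidata": [],   # ids only the external source has
        "missing_in_external": [],   # ids only Wikidata has
        "different": [],             # ids present in both, values disagree
    }
    for key in sorted(set(external) | set(wikidata)):
        if key not in wikidata:
            report["missing_in_wikidata"].append(key)
        elif key not in external:
            report["missing_in_external"].append(key)
        elif external[key] != wikidata[key]:
            report["different"].append((key, external[key], wikidata[key]))
    return report


# Hypothetical sample data, for illustration only.
external = {"id-1": "1903", "id-2": "1911", "id-3": "2018"}
wikidata = {"id-1": "1903", "id-2": "1912"}
report = find_mismatches(external, wikidata)
```

A report like this could be regenerated on a schedule and published on a wiki page, so that watching that page acts as the notification.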

It looks like Metadata2020 and the FORCE2018 conference are discussing the challenges of metadata and machine-actionable data management - Salgo60 (talk) 06:36, 11 October 2018 (UTC)

How to indicate the person who destroyed a work?

I encountered Un bon bock (Q2000314), a kind of early film that utilises the procedure of Théâtre Optique (Q2709131). It was, along with most of his other ribbons for the theatre optique, destroyed by its creator Charles-Émile Reynaud (Q286445), who smashed it and threw it into the Seine (some sources describe it as an act out of mental derangement or depression). How to express this?

I thought of using significant event (P793)/destruction (Q17781833) with qualifiers. To indicate the person I found three already used ways, none of them very convincing:

I like participant (P710) best, but does this exclude other ways of participating in this event (like trying to save it)? cause of destruction (P770) and killed by (P157) are a bit weird due to the domain/value (and I would say the reason for destruction of Un bon bock (Q2000314) is something like smashing, maybe even mental depression (Q4340209)). Is there a better way for modelling it?

To indicate that he threw it into the Seine I would just use location (P276) (although this is not very accurate, as he threw the ribbons into the Seine after he destroyed them, but this might be negligible).

Btw: That Un bon bock (Q2000314) was intended for Théâtre Optique (Q2709131) I expressed via Un bon bock (Q2000314) uses (P2283): Théâtre Optique (Q2709131). I'm not sure if this is the best way to put it. - Valentina.Anitnelav (talk) 08:45, 11 October 2018 (UTC)

Maybe one could also use significant person (P3342) qualified (object has role (P3831)) with some newly created item like <destroyer>. -Valentina.Anitnelav (talk) 09:53, 11 October 2018 (UTC)
There's also has cause (P828) as a qualifier of dissolved, abolished or demolished date (P576), which I don't like so much. The area of destructions in general is quite inconsistent... --Yair rand (talk) 16:45, 11 October 2018 (UTC)

Calandrinia corrigioloides

This discussion on User:Peter coxhead's English Wikipedia talk page, is pertinent to issues with Wikidata's handling of items about species. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:14, 11 October 2018 (UTC)

is a Wikinews article

Wikinews article (Q17633526) 2011 Norway attacks (Q79967) is both instance of (P31) terrorist attack (Q5710433) and instance of (P31) Wikinews article (Q17633526) (see also various Description). This is an issue IMHO. Visite fortuitement prolongée (talk) 18:29, 6 October 2018 (UTC); 19:26, 6 October 2018 (UTC)

@Visite fortuitement prolongée: There are two items which match your description; which one are you talking about? Mahir256 (talk) 18:53, 6 October 2018 (UTC)
@Superchilum, Laddo: regarding 2011 Norway attacks (Q79967); @Laddo: regarding 2017 Quebec City mosque shooting (Q28549976) (as "Huhbakker" is no longer around). Mahir256 (talk) 18:57, 6 October 2018 (UTC)
Sorry, I wanted to talk about 2011 Norway attacks (Q79967). Corrected. Thank you for finding 2017 Quebec City mosque shooting (Q28549976). Visite fortuitement prolongée (talk) 19:26, 6 October 2018 (UTC)
@Visite fortuitement prolongée: Can you clarify your point? These are events, and the first articles reporting that event. They should be separated? -- LaddΩ chat ;) 22:18, 6 October 2018 (UTC)
For Q79967 Description, fr Description is "article de Wikinews", de is "Artikel bei Wikinews", an is "articlo de Wikinews", bav is "Artike bei Wikinews", bs is "Wikinews članak" etc. Very different from en "two sequential lone wolf terrorist attacks in Norway on 22 July 2011", then I guess that there is an issue. Visite fortuitement prolongée (talk) 22:46, 6 October 2018 (UTC)
Even if you don't want a separate item, I think that at least the Description and instance of (P31) should not be "Wikinews article". Visite fortuitement prolongée (talk) 23:00, 6 October 2018 (UTC)
@Laddo: Yes. See the result of Wikidata:Requests_for_deletions/Archive/2018/07/08#Q17655696. Mahir256 (talk) 23:19, 6 October 2018 (UTC)

So a Wikidata item with ns0 articles on Wikipedia should not contain ns0 articles on Wikinews, but instead Wikinews categories? And all the ns0 articles on Wikinews should remain separated? --Superchilum(talk to me!) 08:12, 7 October 2018 (UTC)

Yes, I think that's the best solution indeed, if there are several Wikinews articles about the same news event. Especially the Dutch and Russian Wikinews versions work with categories in such cases. On Wikipedia, big news events have their own articles which contain all the relevant information, so categories are not needed there in that case. --De Wikischim (talk) 08:53, 11 October 2018 (UTC)
On the other hand, cases like d:Q28549976 are somewhat different. Since no Wikinews categories are involved here, all the articles (both Wikipedia and Wikinews) can just remain on the same Wikidata item. De Wikischim (talk) 08:59, 11 October 2018 (UTC)
When Wikinews is concerned, the following connections are made to other projects:
  • Wikinews article -> Wikinews article (no connection to other projects)
  • Wikinews category -> Wikipedia article / Wikivoyage article / Commons category
Ymnes (talk) 14:38, 11 October 2018 (UTC)
No they can't remain on the same Wikidata item. A mass shooting (Q21480300) is an event, not a text like a Wikinews article (Q17633526). -Ash Crow (talk) 16:27, 11 October 2018 (UTC)
Well, a lot of Wikidata items now containing both Wikipedia and Wikinews pages will have to be split up in that case. However, I wonder if it's really worth investing that much time and energy in. De Wikischim (talk) 19:31, 11 October 2018 (UTC)

a way to protect?

I have Romeo and Juliet (Q83186) on my watchlist, and I can easily remove it so feel free to tell me NO.

Every day, or at least often, some IP makes a wrong edit to it, one edit only. Can it be protected because of "stupid uncalled for gaming" or whatever?--RaboKarbakian (talk) 16:14, 11 October 2018 (UTC)

Looks like 6 edits in the last 2 weeks, but not too frequent before that. You might want to request this on the Wikidata:Administrators' noticeboard. ArthurPSmith (talk) 18:25, 11 October 2018 (UTC)
I've protected it for a month. If it's still attracting vandalism after that, let us know on the page ArthurPSmith linked above. - Nikki (talk) 18:48, 11 October 2018 (UTC)
The main page highlights Romeo and Juliet in the "Discover" section, but doesn't link to the item directly. Could this be the cause somehow? --Yair rand (talk) 20:30, 11 October 2018 (UTC)

Political movement or ideology

There is a question between me and @Fnielsen: about Lars Hedegaard (Q1806231), who belongs to the Counterjihad (Q3374768) (or Counter-jihad movement, or CJM). The CJM is currently a political movement (Q2738074) in Wikidata, and is described by scholars (see Q3374768#P1343) as a loose network of people and organisations. Is the CJM an organization (Q43229), a political movement (Q2738074) or a political ideology (Q14934048)? Main question: how does Wikidata say that somebody belongs to the CJM: member of (P463)? political ideology (P1142)? Something else? Visite fortuitement prolongée (talk) 16:48, 11 October 2018 (UTC)

This problem concerns not only persons but also organizations such as International Free Press Society (Q4354370). To complicate matters further, there is also the possibility of invoking affiliation (P1416) (beyond member of (P463) and political ideology (P1142)). — Finn Årup Nielsen (fnielsen) (talk) 17:02, 11 October 2018 (UTC)
Both affiliation (P1416) and member of (P463) seem to be for organisations, and Counterjihad (Q3374768) doesn't seem to be an organisation. I suppose it's correctly a political movement (Q2738074) and political ideology (P1142) should be used. I'm not sure what the difference is between political movement (Q2738074) and political ideology (Q14934048). Ghouston (talk) 19:55, 11 October 2018 (UTC)
There may be a distinction between political movement (Q2738074) and political ideology (Q14934048) (they both have articles on cswiki), in which case using political ideology (P1142) with a political movement (Q2738074) could be a mistake. I don't know how you decide whether a particular instance such as Counterjihad (Q3374768) is one or the other, or what property you should use with a political movement (Q2738074). Ghouston (talk) 20:53, 11 October 2018 (UTC)
There are a lot of small local political movements about things like preventing high-rise buildings or giving tenants the right to own pets which can't really be described as ideologies. An ideology is some kind of grand scheme about how politics or society should be organized, such as communism (Q6186) or conservatism (Q7169). Presumably political ideologies will also be political movements, or have associated political movements. In the past I suggested (in jest, but maybe it's not a bad idea) properties like "supporter of" and "opponent of" to allow recording a person or organization's opinions about random things. That discussion was about atheism (Q7066) and whether or not it's a religion, but it could even be used to record a sports team somebody supported. Ghouston (talk) 21:06, 11 October 2018 (UTC)

I cannot see that there is any difference between these two properties; both of them are Wikidata property related to geography (Q52511956) or a subclass of (P279) of that. At present list of monuments (P1456) has 7 805 items and appears in the heritage monument list (P2817) has 62 689 items. Pmt (talk) 07:16, 6 October 2018 (UTC)

  • I think we could probably delete the latter. It was created for a country that used WLM lists that don't follow an administrative structure. --- Jura 08:07, 6 October 2018 (UTC)

Merge two items

Could anyone please merge Q17003641 and Q20113514? --193.157.194.219 07:32, 11 October 2018 (UTC)

Could anyone please merge Q11980054 and Q24511179? --193.157.194.219 10:06, 11 October 2018 (UTC)
The geo coordinates for these two items point to two different fjords. [3] and [4]. I am not sure they should be merged. — Finn Årup Nielsen (fnielsen) (talk) 17:08, 11 October 2018 (UTC)
✓ Done for the first two, the second two are still problems. --Liuxinyu970226 (talk) 11:10, 12 October 2018 (UTC)

Author Qualifiers

In the item Q57077013, I would like to use qualifiers to distinguish Coordinating Lead Authors, Lead Authors and Review Editors.

Are the properties object has role (P3831) & subject has role (P2868) the right qualifiers?

Or should I be using another property more suited to authors? Any suggestions?  – The preceding unsigned comment was added by Wallacegromit1 (talk • contribs).

subject has role (P2868) is correct; object has role (P3831) as a qualifier of author name string (P2093) or author (P50) is wrong because the report is the subject and the authors are the objects. --Pasleim (talk) 09:13, 13 October 2018 (UTC)

Thanks for your quick feedback. But when I add object has role (P3831), a message pops up saying:

"object has role is not a valid qualifier for author name string (P2093) – the only valid qualifiers are: series ordinal, of, subject has role, or, affiliation"

A bit more clarity, please!

--- Wallacegromit1


Several "Wikilinkproblems" moved from talk page

Can someone wikilink en:Babić (Q20519335) with de:Babić (Q797730) ? 178.3.19.113 17:14, 10 October 2018 (UTC)

Can someone wikilink en:Čović (Q21487827) with de:Čović (Q341615) ? 178.3.19.113 17:17, 10 October 2018 (UTC)

Can someone wikilink en:Ivanić (Q21513651) with de:Ivanić (Q16849113) ? --178.3.19.113 17:20, 10 October 2018 (UTC)

Can someone wikilink en:Izetbegović (Q56538786) with de:Izetbegović (Q1255192) ? 178.3.19.113 17:23, 10 October 2018 (UTC)

Who deleted all the interwiki links of Bosnian politician surnames? --178.3.19.113 17:23, 10 October 2018 (UTC)

Can someone wikilink en:Ivanović (Q21507848) with de:Ivanović (Q526937) ? --178.3.19.113 17:27, 10 October 2018 (UTC)

Moved by Liuxinyu970226 (talk) 03:17, 13 October 2018 (UTC)

Not done: a family name has to use a different item than a disambiguation page (Q27924673) --Liuxinyu970226 (talk) 03:18, 13 October 2018 (UTC)
We definitely need some way to link items of these two types to each other; something akin to is a list of (P360). Suggestions? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:30, 13 October 2018 (UTC)

Double

Q12410695 was a double of Q7136547. I merged it there. Can somebody please delete Q12410695? Debresser (talk) 16:34, 13 October 2018 (UTC)

Please merge items with the instructions listed on Help:Merge next time. I've turned the former item into a redirect. Sjoerd de Bruin (talk) 17:16, 13 October 2018 (UTC)

Reminder about birth-death dates

Hi all, I spend time now and then filling in death dates for women, but of course this problem is true for all people, female, fictional and otherwise. Please try to be more exact in filling in dates. Someone who was obviously born after 1950 should at least have a birthdate with precision of a decade and not century (which reverts to 1901s). See this link to help work on the century of your choice: Wikidata:WikiProject Women/Centenarians. Thx Jane023 (talk) 14:04, 10 October 2018 (UTC)

If the year of birth isn't publicly available, the decade probably isn't either. I don't think guessing is a good idea. Ghouston (talk) 22:13, 10 October 2018 (UTC)
Guessing may be a bad idea, but unfortunately our sources have already done that and the result is we are left with lots of items with very strange birthdates - some are even set to the year 0 or 100. The 1901s is just an example. Jane023 (talk) 17:13, 14 October 2018 (UTC)
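
For readers unsure what "precision" means here: in the Wikibase JSON data model a time value carries a numeric precision, where 7 stands for century, 8 for decade and 9 for year. The sketch below builds such a value as a plain Python dict. It is only an illustration, assuming a positive year in the proleptic Gregorian calendar (Q1985727); the helper name time_value is hypothetical and not part of any Wikidata library.

```python
# Precision codes from the Wikibase JSON data model (subset).
PRECISION = {"century": 7, "decade": 8, "year": 9}

# Entity URI of the proleptic Gregorian calendar.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def time_value(year, precision):
    """Build a Wikibase-style time value for a positive year.

    Month and day are left as 00 because they are not known
    at century, decade or year precision.
    """
    if year <= 0:
        raise ValueError("this sketch only handles positive years")
    return {
        "time": f"+{year:04d}-00-00T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": PRECISION[precision],
        "calendarmodel": GREGORIAN,
    }


# "Born in the 1950s" rather than "born in the 20th century".
born_1950s = time_value(1950, "decade")
```

Using decade precision only makes sense when a source actually supports the decade, which is the concern Ghouston raises above.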

Why did Donna Strickland (Q56855591) not have a Wikidata item when her Nobel Prize was announced?

Over on the English Wikipedia, there is an insightful essay about the mechanisms that led to her not having an article there, but no Wikipedia had an entry, and neither had Wikidata, so I am wondering what we can learn from that for Wikidata-related workflows. --Daniel Mietchen (talk) 01:28, 12 October 2018 (UTC)

As with many authors of academic papers, we had items about her papers, with her name as an author name string (P2093) value (see the histories of Q33306866, Q33335086, Q35516328, Q36021270, for example), but our automated tools had, presumably, not been able to create an item about her, as her ORCID profile has "No public information available". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:22, 12 October 2018 (UTC)
Lile ORCID, which is as trustable as IMDB, is the only criteria... Sjoerd de Bruin (talk) 23:59, 12 October 2018 (UTC)
"Lile"? ORCID iDs are certainly more trustable than IMDB, in this context. I said nothing to claim that they are "the only criteria [sic]", but they are the primary ID used by our automated tools to create such items. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:35, 13 October 2018 (UTC)
  • Maybe she was just younger than the usual winners? --- Jura 16:38, 12 October 2018 (UTC)
  • Two further factors, besides presumably gender bias: (1) her unusual decision to never have applied for a full professorship and (2) the lack of a Wikipedia article in no small part because of what I would consider an over-strict application of the idea of "independent" sources, where we don't consider an accredited university "independent" enough to be citeable about their own faculty. - Jmabel (talk) 19:59, 12 October 2018 (UTC)
  • Much of Wikidata is created from pre-existing databases. The unfortunate consequence of this is that any bias in the source databases will be reflected in Wikidata. Where gender bias exists in the source databases, it takes proactive editing to overcome that bias. I have repeatedly been shocked to find that novels by leading women--even Pulitzer winners--have not had Wikidata entries, but on searching further I have often found that these same novels were also absent from VIAF and the Library of Congress. So unless Wikidata editors are actively seeking to counter pre-existing bias, we can expect the default to be a reflection of whatever bias was present in our sources. --EncycloPetey (talk) 17:44, 13 October 2018 (UTC)

Ethnic and religious composition of human settlement

How can I store data about the ethnic and religious composition of a human settlement (Q486972) at many different values of point in time (Q186408), if I know the number of members of a particular ethnic group (Q41710) or religious organization (Q1530022) at each point in time (Q186408)? - Kareyac (talk) 14:02, 13 October 2018 (UTC)

You could use Wikimedia Commons and Data namespace. strakhov (talk) 18:16, 13 October 2018 (UTC)
@strakhov. Thank you, I'll consider your proposal when making the decision. - Kareyac (talk) 08:48, 14 October 2018 (UTC)

A person's full or official name

Full name (passport name). There are several ways to indicate a person's name, but I cannot find a simple way to indicate a person's full name. An item usually uses the name a person is known under, which most of the time equals the article name on Wikipedia. Take Betty Ford (Q213122), who has the full name Elizabeth Anne Ford, probably used as her name in official documents and her passport. I am not able to find a property reflecting this. I can find birth name (P1477), but for Betty Ford the birth name is Elizabeth Anne Bloomer, and her married name (P2562) and name in native language (P1559) will be Betty Ford or derivatives of that with Ford as family name. Is there a need for a property reflecting a person's full name? For women who changed their last name on marriage, I would find a property for a person's full or official name useful. Pmt (talk) 22:06, 13 October 2018 (UTC)

way of itemstructure

I'm a little stuck on this: how do I state in the right way, including dates, that Walther Freise (Q57313462) is chairperson (Q140686) of Naturforschende Gesellschaft der Oberlausitz (Q1970282)? (I did it via part of (P361).) Thank you for your help, and thank you very much for your work, Conny (talk) 13:28, 14 October 2018 (UTC).

Items created unintentionally twice on several occasions

Hello, I have noticed the following behaviour several times:

When I create a new item by clicking the 'In Wikipedia Add links' button in the Commons sidebar, two items are created although I clicked just once. This happens every now and again, and I am not able to tell which circumstances lead to it. Moreover, the two items created have consecutive item numbers each time. The last time I noticed this was earlier today:

You will also notice that both items were created in the same second, so I think I can exclude that I performed two actions.

the same happened as well a week ago:

and as well a few minutes earlier here:

and the day before:

as well as twice here:

as well a week earlier:

and the first time here:

Does anyone have an explanation for this behaviour? I suggest creating an issue in Phabricator to allow deeper investigation.

Many thanks for any feedback on this issue. Robby (talk) 22:13, 11 October 2018 (UTC)

Many tools use SPARQL queries for duplicate detection, and when there is a lag on the server used by such a query, then duplicates may arise. Such lags have occurred multiple times during the relevant period, and I am not sure what causes them. --Daniel Mietchen (talk) 01:35, 12 October 2018 (UTC)
Particularly worrying is the fact that the sitelinks are present *both* in Category:1990s in the Balearic Islands (Q57215615) and Category:1990s in the Balearic Islands (Q57215616) - something I thought was impossible, as sitelinks are supposed to be unique. Pinging @Lydia Pintscher (WMDE). Thanks. Mike Peel (talk) 06:32, 12 October 2018 (UTC)
These are so-called true duplicates. --Pasleim (talk) 08:57, 12 October 2018 (UTC)
Any suggestions on how to proceed with this?
  • Create a new Phabricator ticket, or is there already an open issue on this in Phabricator?
  • Update the list of true duplicates, or first establish a complete list of all these true duplicates? (My knowledge of SPARQL does not allow me to write a query that generates such a list; I do not even know whether this would be possible.)
  • Merge the duplicates and take no further action?
Thanks for further feedback and/or proposals Robby (talk) 21:13, 12 October 2018 (UTC)
It looks like the developers will run a script to update the list soon, see Wikidata:Contact_the_development_team#True_duplicates_clean_up?, and then we can merge them. There's a Phabricator ticket linked from there too. - Nikki (talk) 12:29, 14 October 2018 (UTC)
thanks for this update. I've added a comment in the corresponding ticket in phabricator. Robby (talk) 13:44, 15 October 2018 (UTC)
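
For anyone who wants to build such a list of true duplicates offline, the check itself is simple once the sitelinks are available as rows. The sketch below is only an illustration: it assumes the rows come from some export as (item id, site, page title) tuples, the function name find_shared_sitelinks is hypothetical, and the site key "commonswiki" for the example pair mentioned above is an assumption.

```python
from collections import defaultdict


def find_shared_sitelinks(rows):
    """Return sitelinks that are attached to more than one item.

    rows: iterable of (item_id, site, title) tuples.
    Result: {(site, title): [item_id, ...]} for shared sitelinks only.
    """
    by_link = defaultdict(set)
    for item_id, site, title in rows:
        by_link[(site, title)].add(item_id)
    return {
        link: sorted(items)
        for link, items in by_link.items()
        if len(items) > 1
    }


# Example rows; the site key is an assumption, and Q1 is an
# unrelated item added only to show that it is not reported.
rows = [
    ("Q57215615", "commonswiki", "Category:1990s in the Balearic Islands"),
    ("Q57215616", "commonswiki", "Category:1990s in the Balearic Islands"),
    ("Q1", "enwiki", "Universe"),
]
duplicates = find_shared_sitelinks(rows)
```

Because sitelinks are supposed to be unique, any entry in the result points to a pair of items that needs merging.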

Solar Hijri Calendar

Hi, "time" data type does not support SH calendar. We don't use Gregorian calendar so we don't convert dates from SH to AD. For example when I want to edit date (property: inception) for item Q4819111 I don't know exact equivalent in AD calendar. How can I edit in my own language? --دوستدار ایران بزرگ (talk) 21:27, 14 October 2018 (UTC)

Wikidata does not yet support calendars other than the Gregorian and Julian calendars. There are some Phabricator tasks open for expanding this: See for example this task for Hebrew calendar support. You can open a similar task for Solar Hijri support on Phabricator. --Yair rand (talk) 23:05, 14 October 2018 (UTC)
✓ Done Thanks I made a thread in Phabricator. --دوستدار ایران بزرگ (talk) 06:32, 15 October 2018 (UTC)

Wikidata weekly summary #334

Mixup Valiha

There is a mix up around the word Valiha. As far as I can tell there are two different subjects, one a musical instrument and one a plant.

  • valiha (Q733370) is the instrument
  • Valiha (Q7911694) is the plant
  • Category:Valiha (Q8894884) mixes the two

At issue is that Category:Valiha (Q8894884) is on the top of the musical instrument page on Commons. A bot had added the musical instrument photos to the Swedish plant article (corrected).

Jacqke (talk) 11:31, 20 October 2018 (UTC)

I've moved the Commons link to the item for the instrument. - Nikki (talk) 18:24, 20 October 2018 (UTC)
Nikki Thank you. Jacqke (talk) 14:52, 21 October 2018 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 09:43, 21 October 2018 (UTC)

No label defined (Q25490615)

Please merge "No label defined (Q25490615)" into Gila Balas (Q57570999). רוזמן יצחק (talk) 05:01, 21 October 2018 (UTC)

→ ← Merged Matěj Suchánek (talk) 09:43, 21 October 2018 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 09:43, 21 October 2018 (UTC)

Denmark

Is it possible to settle which state (Q35 (Denmark) or Q756617 (Kingdom of Denmark)) Danish locations and Greenland locations belong to? There cannot be two states. Other properties are affected by this problem, too. I do not understand the difference between the two. There is no English article on Q756617. --RolandUnger (talk) 09:13, 14 October 2018 (UTC)

For a start, I think en:Denmark is about Kingdom of Denmark (Q756617), but has been incorrectly connected to Denmark (Q35). I don't know much about Denmark, but I think it's something similar to Kingdom of the Netherlands (Q29999) and its constituent countries, which can also confuse people in Wikidata. Ghouston (talk) 10:13, 14 October 2018 (UTC)
Kingdom of Denmark is the official name of Denmark, so it should be the same. At least, we need a unique value for country (P17). If Denmark is a state then Kingdom of Denmark is maybe only a name of a state but not the state itself. --RolandUnger (talk) 11:06, 14 October 2018 (UTC)
No, I believe Kingdom of Denmark includes Denmark proper, Faroe Islands, and Greenland. For example, Denmark is part of the EU, whereas Faroe Islands and Greenland (and, consequently, the Kingdom of Denmark) are not.--Ymblanter (talk) 12:10, 14 October 2018 (UTC)
Something similar exists between Netherlands (Q55) and Kingdom of the Netherlands (Q29999). However, it's complicated by the fact that the Caribbean Netherlands (Q27561) are part of both, while Curaçao (Q25279) only belongs to Kingdom of the Netherlands (Q29999). —Rua (mew) 12:37, 14 October 2018 (UTC)
We had the following request some time ago: Wikidata:Bot_requests/Archive/2016/12#Country_→_Denmark --- Jura 12:46, 14 October 2018 (UTC)
But I think the request was discussed but not executed. So we have the situation that cities like Copenhagen (Q1748) are situated in the state of Denmark (Q35) and cities in Greenland (Q223) are situated in the state of Kingdom of Denmark (Q756617), but these are the same country! And if you look at the data sets of both Denmarks, you can see that everything is completely mixed up in the Wikipedias. Normally we use country names like Germany (Q183) without political description (the official name is Federal Republic of Germany). Other countries like France and the Netherlands have overseas territories too, but there is only one state. I think Danish authors should help to remove the confusion. --RolandUnger (talk) 16:41, 14 October 2018 (UTC)
  • A possible way to resolve it would be to unify the interpretation of country of citizenship (P27) and country (P17): if a "country" is part of a larger state, but doesn't have its own citizenship, then don't allow it as a country (P17) target. That would mean changing quite a few statements for the Netherlands, but would avoid changing a lot for the UK. I can't see any indication either that Greenland (Q223) and Faroe Islands (Q4628) have separate citizenships. I think it would make sense, since a state shouldn't be treated differently in Wikidata depending on whether it calls its internal subdivisions "countries" or "states". Ghouston (talk) 20:53, 14 October 2018 (UTC)
    • Another option which doesn't require changing the Netherlands or the UK would be to allow things which have their own ISO 3166-1 codes. It's a widely-used international standard, so it seems reasonable to say that the things it lists are often considered to be countries. - Nikki (talk) 21:29, 14 October 2018 (UTC)
      • But then we'd have different definitions of country for country of citizenship (P27) and country (P17). ISO 3166-1 would also make places like Hong Kong, Macao and "United States Minor Outlying Islands" into countries. Ghouston (talk) 22:43, 14 October 2018 (UTC)
        • I don't think that's a problem. They're different properties, they can have different allowed values (and these are not the only two country properties we have, see country for sport (P1532) for example, a single set of allowed values for all country properties is not possible). country (P17) has a wide variety of uses so I don't think it makes sense to require the narrow definition that makes sense for country of citizenship (P27). - Nikki (talk) 10:37, 15 October 2018 (UTC)
          • Sure, but I don't see what's to be gained in this case by treating the Netherlands and Denmark differently to other states (or are there other exceptions too?). Many of them have regional parliaments and semi-autonomous areas. If "Kingdom of the Netherlands" was given an alias "Netherlands", and the existing statements assigned to it, then presumably it would appear first when selecting values and there'd be less confusion. The only advantage I can see in treating the Netherlands and Denmark differently to say the UK is that it's sort of the status-quo, and maintaining the status-quo seems to be an incredibly powerful force on Wikidata. I'm not sure why that is, for such a relatively new project. Ghouston (talk) 00:02, 16 October 2018 (UTC)

Amsterdam

Why are there two entries for Amsterdam in the Netherlands? There is no difference between Amsterdam as a capital and as a municipality. Both refer to the same administrative and legal body. It is also confusing, as both now contain the same properties, such as population.

Amsterdam (Q727) capital and largest city of the Netherlands

Amsterdam (Q9899) municipality in the Netherlands, containing the city of Amsterdam  – The preceding unsigned comment was added by Historazor (talk • contribs) at 12:12, 15 October 2018‎ (UTC).

Not so. The municipality of Amsterdam has a couple of populated places apart from the city, for instance Holysloot (Q672125) and Ruigoord (Q3252907). Lymantria (talk) 13:42, 15 October 2018 (UTC)
And one is shown as located within the other. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:55, 15 October 2018 (UTC)
But even if they were coextensive with (P3403) to each other, a municipality and a city are not the same. Lymantria (talk) 19:20, 15 October 2018 (UTC)
"Municipality" also seems to be ambiguous: it can refer to either a territory administered by a local government body, or by the body itself. For some places in Wikidata these have separate items, e.g., London Borough of Camden (Q202088) and Camden London Borough Council (Q5025790), and other places they don't. It's not really clear if Amsterdam (Q9899) refers to only the administrative body, or the territory as well. Ghouston (talk) 04:11, 16 October 2018 (UTC)
Both. Lymantria (talk) 06:40, 16 October 2018 (UTC)

Query Lexemes in the Query Service

Hello all,

Graph of Lexemes derived from L2087

I’m very happy to announce that another important feature for Lexicographical Data has been deployed: the ability to query Lexemes in the Query Service.

Here are a few examples:

The queries are based on the RDF mapping that you can find here. Feel free to help improve the documentation, so people can understand how to build queries out of Lexemes.
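
As a minimal sketch of what such a query can look like, the Python snippet below builds a request URL for the Query Service that lists ten Lexemes with their lemmas. It is an illustration only, not one of the examples from the announcement: the query text is an assumption based on the ontolex and wikibase vocabularies used in the RDF mapping, and the snippet only constructs the URL without sending it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed example query: ten Lexemes with their lemmas.
QUERY = """
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT ?lexeme ?lemma WHERE {
  ?lexeme a ontolex:LexicalEntry ;
          wikibase:lemma ?lemma .
}
LIMIT 10
"""

# Encode the query as URL parameters and ask for JSON results.
url = ENDPOINT + "?" + urlencode({"query": QUERY, "format": "json"})

# Decode it again to confirm that the query survives the round trip.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The resulting URL can be opened with any HTTP client; the same query text can also be pasted directly into the Query Service web interface.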

Thank you very much to Tpt who’s been doing a huge part of the work by mapping Lexemes in RDF, and Smalyshev (WMF) who made the RDF dumps available and integrated in the Query Service.

Feel free to play with it, bring some of these ideas of queries to life, and let us know if you find any issue or bug. These can be stored as subtasks of this one on Phabricator.

If you have questions about Lexicographical Data in general, feel free to write on the talk page of the project. If you have specific questions about the integration in the Query Service, you can also ping Stas onwiki or on IRC.

Cheers, Lea Lacroix (WMDE) (talk) 08:06, 16 October 2018 (UTC)

Thesis data in the Edinburgh Research Archive

Hi, following this discussion, we are in a position to begin the mass import of thesis data from the Edinburgh Research Archive. In a nutshell:

  1. ChaoticReality has developed a utility to import the thesis record metadata from ALMA catalogue entries for the Edinburgh Research Archive’s theses.
  2. Can we create and delete test Wikidata items to see that the QS export is working correctly?
  3. Following this testing phase, can we then place a bot request for the mass import, given the size of the sample we are talking about from the Edinburgh Research Archive? Stinglehammer (talk) 12:03, 16 October 2018 (UTC)

JSON-LD now on beta

Hello all,

We’re planning to add JSON-LD as a serialization format for Wikidata. This will allow for example an easier access to RDF data from Javascript.

This is now deployed on https://wikidata.beta.wmflabs.org. Example: https://wikidata.beta.wmflabs.org/wiki/Special:EntityData/Q64.jsonld

As with the other formats we already support (like turtle or rdf/xml), content negotiation is used if the format is not indicated by a suffix like .jsonld. The MIME type that can be used in the Accept header to request JSON-LD output is application/ld+json.
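
The content negotiation described above can be sketched with Python's standard library. This is only an illustration: it prepares a request for the beta URL from this announcement without the .jsonld suffix and sets the Accept header instead. Nothing is sent in the snippet itself, since the beta server may not be reachable.

```python
from urllib.request import Request

# Entity URL from the announcement, without the ".jsonld" suffix,
# so that the format is chosen by content negotiation.
ENTITY_URL = "https://wikidata.beta.wmflabs.org/wiki/Special:EntityData/Q64"

# MIME type for JSON-LD, as given in the announcement.
JSON_LD = "application/ld+json"

# Prepare (but do not send) a request that asks for JSON-LD output.
request = Request(ENTITY_URL, headers={"Accept": JSON_LD})

# To actually fetch the data, one would call
# urllib.request.urlopen(request) and parse the body with json.loads.
```

Requesting the URL with the .jsonld suffix instead should need no Accept header at all, per the description above.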

If you’re interested in this feature, please test it, and let us know if you find any issues. The related ticket is this one. If everything goes as planned, we will enable it on wikidata.org on October 31st.

Cheers, Lea Lacroix (WMDE) (talk) 13:05, 16 October 2018 (UTC)

Also pinging Cscott and Multichill who were involved in the previous discussions, and Maxlath who's working a lot with Javascript :) Lea Lacroix (WMDE) (talk) 13:05, 16 October 2018 (UTC)
Is there a description, comparable to mediawikiwiki:Wikibase/DataModel/JSON? Jc3s5h (talk) 13:59, 16 October 2018 (UTC)
I believe that mediawikiwiki:Wikibase/DataModel/ is the appropriate description, since the JSON-LD format is a direct representation of the underlying datamodel, as expressed in the existing turtle/RDFa/n-quads representations, and they don't appear to have separate description subpages. The "old" JSON format is an ad-hoc format with no standardized semantics, which is why it needed its own description. Specs for JSON-LD can be found linked from https://json-ld.org/. See also https://gerrit.wikimedia.org/r/465547 which will add appropriate JSON-LD links from article pages to the JSON-LD format data from wikidata (via content negotiation using the HTTP Accept header). Cscott (talk) 14:49, 16 October 2018 (UTC)
MisterSynergy Thierry Caro Vanbasten_23 Malore Сидик из ПТУ Mathieu Kappler Lee Vilenski Erokhin Dandilero Blackcat Looniverse

Notified participants of WikiProject Sports

I noted two problems about this property:

@Xaris333, Fundriver, Kooma: as main contributors --- Jura 17:28, 15 October 2018 (UTC)
Like
⟨ Premier League (Q9448) ⟩ sport league system ⟨ Q18559 ⟩
sports league level (P3983) ⟨ 1 ⟩
? Xaris333 (talk) 21:04, 15 October 2018 (UTC)
@Xaris333:Yes, like that.--Malore (talk) 22:31, 15 October 2018 (UTC)
Hello,
Well, first off: the German description talks about sports in general, so I think you should just correct the English description and you will be fine. Also, the English description doesn't really fit the name of the property.
On the suggestion: I wouldn't do this, because historically the sports league level (P3983) of a specific league has often changed because of the creation of a new league, and therefore it should be possible to put a start and end date on the league level (even though I don't think this has been done yet, I could name some leagues where it changed and where it would be appropriate to set a start and end date on the property).
Best regards,
Fundriver (talk) 13:35, 16 October 2018 (UTC)
@Fundriver: It's possible to have two different statements that differ only by qualifiers, something like that:
sport league system: imaginary league system
  sports league level: 1
  start time: 1970
  end time: 1980

sport league system: imaginary league system
  sports league level: 2
  start time: 1980
  end time: 1990
However, maybe it's better to have two separate properties.--Malore (talk) 15:12, 16 October 2018 (UTC)
Yes, I know, it would be possible. But I think the sports league level (P3983) is a main element for describing a sports league. Otherwise you are basically saying that your league was in the same sport league system four times, only with different qualifiers. Fundriver (talk) 17:19, 16 October 2018 (UTC)
Ok, you're right. I proposed league system property.--Malore (talk) 23:45, 16 October 2018 (UTC)
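
As a rough illustration of the qualifier-based modelling discussed in this thread, here is a minimal sketch of how a data consumer might pick the league level that applies in a given year. The data mirrors the imaginary example above; the dictionary layout, the function name `level_in_year`, and the choice to treat periods as half-open intervals are all assumptions of this sketch, not Wikidata conventions.

```python
from typing import Optional

# Hypothetical statements mirroring the example above: the same
# "sport league system" value twice, told apart only by qualifiers.
STATEMENTS = [
    {"system": "imaginary league system", "level": 1, "start": 1970, "end": 1980},
    {"system": "imaginary league system", "level": 2, "start": 1980, "end": 1990},
]


def level_in_year(statements: list, year: int) -> Optional[int]:
    """Return the league level valid in `year`, or None if no statement applies."""
    for st in statements:
        # Assumption: each period is half-open [start, end), so the boundary
        # year 1980 matches only the later statement.
        if st["start"] <= year < st["end"]:
            return st["level"]
    return None


print(level_in_year(STATEMENTS, 1975))
print(level_in_year(STATEMENTS, 1985))
```

This also shows the cost Fundriver points out: the level is only reachable by first inspecting the qualifiers of every matching statement.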

Number of peoples or animals among a nationality / population / breed

Hello. Sorry for my English. Is there a property for the number of "objects" in a population (animal or human)? As an example, for French (Q121842), we don't have any estimate of the number of French people in the world. This could be very useful (and should also be allowed for animals, like Q588252). Thanks. --Tsaag Valren (talk) 09:31, 16 October 2018 (UTC)

@Tsaag Valren: We have quantity (P1114), but I'm not sure it's the right property for this. --Yair rand (talk) 20:50, 16 October 2018 (UTC)

Quality of a source needs to be documented and communicated

I am comparing the data of Nobelprize.org and Wikidata (see Listeria list) and have run into the problem of which sources to trust.

My suggestion is that we formalize how Wikidata documents a source it uses (or start with sources that are WD properties):

  1. we start adding some quality guidance on what type of source this is, e.g.
    1. Basic facts
      1. Number of people working with this source (fulltime/community...)
      2. When the organisation maintaining it was started
      3. If they have people full time hired
      4. If they have a documented quality process
      5. If they have controlled change management
        1. Do you get a ticket when reporting problems and how do you report problems
        2. Can you review older versions of the source
        3. Is a reported change request possible to track
        4. Is this source Data driven
          1. Can we access it using SPARQL and federated search
          2. Can we access primary sources used
          3. Can we access a digital version of primary sources used
          4. Do they have external identifiers (en:Linked data) compare 5stardata.info
          5. Do they have Wikidata as a "same as" property, or do WD and the external source share an external property like VIAF ID (P214)
    2. Quality reviews of experts
      1. e.g. is this source used by documented authorities in the field
    3. Quality reviews of Wikipedians
      1. Comments from people who have used this source: what are their experiences
      2. Where to find reported problems/mismatches
    4. ???

- Salgo60 (talk) 05:27, 14 October 2018 (UTC)

  • Much of this has little to do with the quality of a source. For example, a database maintained by one reputable academic about an area in which he or she is expert can be fine; a database maintained by a group of followers of Lyndon LaRouche, or left as a legacy from the Stalin-era Soviet Union or Nazi Germany is inherently suspect no matter how formally proper it might seem and no matter the formal credentials of its participants. - Jmabel (talk) 01:38, 15 October 2018 (UTC)
@Jmabel: It has: if you don't try to write something down, then you don't communicate your opinion or your understanding. I can see a bigger need to document the quality of sources
  • with the increasing size of the Wikidata project
  • the project is getting more and more Global
  • number of added properties is "exploding" and it is more difficult to understand the value/quality of a source
I did a test (in Swedish) to document one of the best sources in Sweden, Dictionary of Swedish National Biography ID (P3217), a source that 99% of the people who have studied history at a Swedish university trust but that, I guess, nearly no one outside Sweden knows about; see link, some facts
- Salgo60 (talk) 10:39, 15 October 2018 (UTC)
I agree that there is a need for that. I don't know how to realize it. --Marsupium (talk) 20:55, 16 October 2018 (UTC)
@Marsupium: Better something than nothing?!?! Can't we start with something like featured article badge (Q17437796) but for a source... When you start looking at historically famous people in Sweden (e.g. Selma Lagerlöf (Q44519) Q44519#P569) we get 10+ sources indicating a birth date ==> it would be nice to filter/order them by trusted sources - Salgo60 (talk) 07:30, 17 October 2018 (UTC)
@Salgo60: Fine, here you have my brainstorming now ;):
I really feel the same and Q44519#P569 is a good example. I think we should even partly remove sources. A short previous related discussion is Wikidata:Project chat/Archive/2017/10#Redundant Wikipedia citations. But it is a hard task. I agree with Jmabel that most of the above mentioned doesn't say much about quality of a source (some we have already established ways to indicate, e.g. SPARQL endpoint).
Though as a small start: In the example Q44519#P569 I guess for an algorithmic evaluation https://web.archive.org/web/20160401152316/http://jeugdliteratuur.org/auteurs/selma-lagerlof should be ranked lowest, a bare URL without author (P50) or other subproperty of (P1647) of creator (P170), publisher (P123) or anything else, then Find a Grave (Q63056) (and some of the others?) as a crowdsourced source. Perhaps we should state Find a Grave (Q63056) instance of (P31) "crowdsourced work" (no item for that yet). Then it gets difficult. More criteria would be if a source is peer-reviewed.
For "Where to find reported problems/mismatches" deprecated statements referenced with a source and their proportion of all statements can be counted. Also the uses of Template:External reference error reports can help, also with "Do you get a ticket when reporting problems and how do you report problems". We should get more of that in the main Wikidata database to enable querying that information. Also issue tracker URL (P1401) is somehow related.
BTW: Do you have a specific use case for this? I think it might be good to try to handle a specific use to figure out how to deal with this in general. --Marsupium (talk) 10:00, 17 October 2018 (UTC)
@Marsupium: the use case I have is a federated search comparing Wikidata and Nobelprize.org in a Listeria list link
  • Lesson learned
    • Wikidata is excellent at quickly getting the correct death date
    • As the Nobel Prize is global, I quickly run into the problem of finding sources I don't know anything about
    • My dream scenario is that Wikidata produces lists like the one above indicating a difference AND that we also present the best sources confirming the facts
  • Rank sources
    • It's difficult, but for Q44519#P569 we have the church books, i.e. en:Primary Sources in electronic form link, telling us that Selma Olivia Lovisa was born..... If you can read that source, then that is what you trust... We also have one of the best-ranked sources (in Swedish) link ==> it's a secondary source but written by professionals using primary sources
- Salgo60 (talk) 10:24, 17 October 2018 (UTC)
@Salgo60: Comment: "secondary source but written by professionals using primary sources" is even better than a primary source, at least for Wikipedia:No original research (Q4656524).
Sorry, writing here, I forgot the first sentence of the section. "AND that we also present the best sources confirming the facts" meaning a SPARQL implementation? At least for the first mismatch with >1 reference Q106471#P569 ranking isn't too difficult, a SPARQL implementation for that example would be something like this:
SELECT ?reference ?referenceItem ?referenceItemLabel ?quality_step_c
WHERE 
{
  wds:q106471-515C1587-101A-48A6-AF11-61EE615B90FE prov:wasDerivedFrom ?reference. # the statement is https://www.wikidata.org/wiki/Q106471#P569
  ?reference pr:P248|pr:P143 ?referenceItem.
  BIND(0 AS ?quality_step_a)
  BIND(IF(EXISTS{?reference pr:P143 [].},?quality_step_a - 1,?quality_step_a) AS ?quality_step_b) # the reference has a property P143 (imported from Wikimedia project)
  BIND(IF(EXISTS{?referenceItem wdt:P629 [].},?quality_step_b + 1,?quality_step_b) AS ?quality_step_c) # the referenceItem has a property P629 (edition or translation of)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY DESC(?quality_step_c)
Would that be something you would want? And now a template shall output the best source? --Marsupium (talk) 11:20, 17 October 2018 (UTC)
Thanks, that is excellent. Also, user Larske (talk • contribs • logs) did some magic at Wikidata:Request_a_query#Status_update
I have spent too much time working with the secondary source mentioned above, Dictionary of Swedish National Biography ID (P3217), and the lesson learned is that they also have problems. They have published books since 1918, and now that we have the sources in electronic format and also tools like Wikidata, we can see that even professionals have problems, and I think Wikidata could be a good community to document that... - Salgo60 (talk) 11:36, 17 October 2018 (UTC)
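
The ranking rule in Marsupium's query above (start at 0, subtract 1 if the reference uses P143, add 1 if the referenced item has P629) can also be sketched outside SPARQL. The following is only an illustration of that scoring idea: the sample references are hypothetical placeholders, and the function name `reference_score` is an assumption of this sketch.

```python
def reference_score(reference_props: set, item_props: set) -> int:
    """Score one reference using the same two steps as the SPARQL query above."""
    score = 0
    if "P143" in reference_props:
        # "imported from Wikimedia project": treated as a weaker reference
        score -= 1
    if "P629" in item_props:
        # referenced item is an "edition or translation of" some work
        score += 1
    return score


# Hypothetical references, for illustration only (not real Wikidata content).
references = [
    {"label": "wiki import", "reference_props": {"P143"}, "item_props": set()},
    {"label": "plain stated-in", "reference_props": {"P248"}, "item_props": set()},
    {"label": "book edition", "reference_props": {"P248"}, "item_props": {"P629"}},
]

# Highest score first, like ORDER BY DESC(?quality_step_c) in the query.
ranked = sorted(
    references,
    key=lambda r: reference_score(r["reference_props"], r["item_props"]),
    reverse=True,
)
for ref in ranked:
    print(ref["label"], reference_score(ref["reference_props"], ref["item_props"]))
```

Further criteria raised in this thread (crowdsourced works, peer review, bare URLs) could be added as extra steps in the same function, but how to weight them is exactly the open question here.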

Languages actually used in labels and titles / translations of labels and titles

While the language of a label or title must be specified, the actual language may differ:

  • Bend It Like Beckham (Q369492) The German title of that British-Indian movie is Kick it like Beckham, which is English (German distributors like the appeal of English as a modern/hip language, but they want to avoid using the kind of words and phrases that a majority of potential ticket buyers may not understand; so, bend bad, kick good).
  • Die Hand Die Verletzt (Q4162624) This episode of an American television show has a German title, probably also because it sounds cool to American ears.
  1. Is there any way in Wikidata to specify the actual language? I would find it interesting to figure out how often and which different languages are being used, and in what countries this is more prevalent.
  2. Is there a way to specify the literal meaning of a title? That television episode title above means The hand that wounds (naturally, the German title of the episode is Satan). Downsides: many translations (N*N for N languages), translations would be mostly original research.
  3. Unrelated Wiki syntax question: Any way to not have this second (numbered) list be an indented part of the second bullet item of the first list?

If these things are not possible yet, do you have an opinion on whether they might be something worth supporting in the future? --109.91.87.44 11:49, 16 October 2018 (UTC)

For literal meanings, see literal translation (P2441). - Nikki (talk) 16:16, 16 October 2018 (UTC)
Consider also applies to jurisdiction (P1001). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:17, 16 October 2018 (UTC)
Re syntax: You can use *#, if there are no intervening line breaks. --Yair rand (talk) 21:21, 16 October 2018 (UTC)

For point #1, the title (P1476) statement usually specifies a language, and you can also add language of work or name (P407). Although, language of work or name (P407) may be hard to interpret in that case, since the original name is in one language but most of the dialogue in another. Then you've also got native label (P1705), since there may be multiple title statements and who knows which would be the original. But it gets confusing. Ghouston (talk) 22:56, 16 October 2018 (UTC)

Is Q22577145 an island?

According to GeoNames, East Island (Q22577145) is an island, but its coordinates map to an area of mainland Toronto. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:30, 23 October 2018 (UTC)

It's located a few hundred metres downwards... Sjoerd de Bruin (talk) 20:36, 23 October 2018 (UTC)
The coordinates from GeoNames via the Swedish/Cebuano Wikipedias can be a bit off. I've added the CGNDB ID for it (which is the most likely source GeoNames used for that item) and replaced the coordinates with the ones from there. - Nikki (talk) 20:59, 23 October 2018 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:17, 23 October 2018 (UTC)

Planned RDF ontology prefix change

We are planning to change the prefix and associated URIs in RDF representation for Wikidata from:

PREFIX wikibase: <http://wikiba.se/ontology-beta#>

to:

PREFIX wikibase: <http://wikiba.se/ontology#>

If you are using Wikidata Query Service, you do not have to do anything, as WDQS already is using the new definition.

However, if you consume RDF exports from Wikidata or RDF dumps directly, you will need to change your clients to expect the new URI scheme for Wikibase ontology.

Also, if you're using the Wikibase extension in your project, please be aware that the RDF URIs generated by it will use this prefix after the change. This is defined in repo/includes/Rdf/RdfVocabulary.php around line 175:

self::NS_ONTOLOGY => self::ONTOLOGY_BASE_URI . "#",

The new data will have schema:softwareVersion "1.0.0" triple on the dataset node[1], which will allow your software to distinguish the new data format from the old one.

The task tracking the change is phab:T112127. I will make another announcement when the change is merged and deployed and the data produced by Wikidata is going to change.

Please contact me (or comment in the task) if you have any questions or concerns. Smalyshev (WMF) (talk) 22:27, 17 October 2018 (UTC)
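
For clients that consume RDF dumps directly, one possible way to cope with the transition is to normalize the old ontology namespace to the new one while reading data. This is a minimal sketch under that assumption, not part of the announced change; the function name `normalize_ontology_uri` is hypothetical, and only the two namespace URIs are taken from the announcement above.

```python
# Namespace URIs as given in the announcement above.
OLD_NS = "http://wikiba.se/ontology-beta#"
NEW_NS = "http://wikiba.se/ontology#"


def normalize_ontology_uri(uri: str) -> str:
    """Map a URI in the old Wikibase ontology namespace to the new one.

    URIs outside the old namespace are returned unchanged, so the function
    is safe to apply to data in either format.
    """
    if uri.startswith(OLD_NS):
        return NEW_NS + uri[len(OLD_NS):]
    return uri


print(normalize_ontology_uri("http://wikiba.se/ontology-beta#Item"))
print(normalize_ontology_uri("http://www.wikidata.org/entity/Q64"))
```

A stricter client could instead inspect the schema:softwareVersion triple on the dataset node, as described above, and choose the expected namespace once per dump rather than per URI.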

Can we please rename be_x_old in left box to be_tarask?

We already renamed this Wikipedia years ago, so why is this box still called be_x_old instead of today's be_tarask? Can we please submit a Gerrit patch to rename that? --180.97.204.30 22:46, 17 October 2018 (UTC)

Still a challenge for developers. --Liuxinyu970226 (talk) 14:26, 18 October 2018 (UTC)
By the way, it's impossible to solve it with a single patch because the identifier is part of the data of items. Matěj Suchánek (talk) 15:25, 18 October 2018 (UTC)

Mobile errors (Microsoft Edge)

The word "image" here seems to appear in the upper left corner of the screen.

I attempt to edit Wikidata with my Microsoft Lumia 950 XL phablet with Microsoft Edge and for some reason I can't add "Statements" using the mobile view so I am forced to switch to "Desktop mode", however in desktop mode my browser seems to "flash and crash 💥" and then forcefully reloads. Adding statements can take up to 30 (thirty) minutes because of the frequent crashing, I don't have this error on any other website (including other Wikimedia projects), is anyone else experiencing this? -- 徵國單  (討論 🀄) (方孔錢 💴) 07:32, 18 October 2018 (UTC)

Senses are now part of Lexicographical Data

Hello all,

As previously announced, the next big piece of Lexicographical Data on Wikidata is now deployed: Senses.

Senses will allow you to describe, for each Lexeme, the different meanings of the word, using multilingual glosses: very short phrases giving an idea of the meaning. In addition, each of these Senses can have statements to indicate synonyms, antonyms, refers-to-concept and more. By connecting Senses to other Senses and to Items, you will be able to describe precisely the meaning of words with structured and linked data. But the most important thing that Senses will be able to do is collect translations of words between languages.

Thanks to Senses, you will be able to organize and connect the existing Lexemes better, and to provide a very important layer of information. With Senses support, we now have all the basic technical building blocks to allow structured machine-readable lexicographical data that can be reused within Wikimedia projects and by other stakeholders.

Feel free to try editing Senses. You can use the sandbox to make some tests. Let us know if you have questions or find bugs.

Note: there are still issues with sorting the IDs of Senses, Forms and sorting the glosses, that will be solved later this week. Thanks for your understanding.

Cheers, Lea Lacroix (WMDE) (talk) 10:16, 18 October 2018 (UTC)

Best practices for adding property constraints/violation checks.

I've noticed that a common error is to add position (Q4164871) and occupation (Q12737077) as values for instance of (P31) along with human (Q5) (presumably since in conversation we say "A is a writer" or "B is a senator").

See the following queries for examples:

 SELECT ?human ?humanLabel WHERE {
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
   ?occupation wdt:P279*/wdt:P31+ wd:Q12737077.
   ?human wdt:P31 wd:Q5.
   ?human wdt:P31 ?occupation.
 }


 SELECT ?human ?humanLabel WHERE {
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
   ?position_held wdt:P279*/wdt:P31+ wd:Q4164871.
   ?human wdt:P31 wd:Q5.
   ?human wdt:P31 ?position_held.
 } 



I've previously migrated these manually to the occupation (P106) and position held (P39) predicates respectively, but since this seems to be a frequent error I would love to add these as a constraint violation of instance of (P31). Since this property impacts so many items I was hoping to test this in a sandbox but can't seem to find the place to do this.


Cheers, ElanHR (talk) 19:11, 18 October 2018 (UTC)
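
The two queries above differ only in the class being checked, so they could be generated from one template. The following is a minimal sketch of that idea; the function name `build_misuse_query` and the variable name `?class` are assumptions of this sketch, and the query text otherwise follows the ones posted above.

```python
import re
from string import Template

# Same shape as the two queries above, with the class QID left as a parameter.
QUERY_TEMPLATE = Template("""\
SELECT ?human ?humanLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?class wdt:P279*/wdt:P31+ wd:$class_qid.
  ?human wdt:P31 wd:Q5.
  ?human wdt:P31 ?class.
}
""")


def build_misuse_query(class_qid: str) -> str:
    """Return a SPARQL query listing humans that are also 'instance of' the given kind of class."""
    # Only accept well-formed item IDs so nothing else can be injected into the query.
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    return QUERY_TEMPLATE.substitute(class_qid=class_qid)


print(build_misuse_query("Q12737077"))  # occupation
print(build_misuse_query("Q4164871"))   # position
```

The generated text can be pasted into the Wikidata Query Service; running it periodically is one way to watch the error rate until a proper constraint exists.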

How to link a branch/subfield to its parent topic?

Currently, branches and subfields are linked to their parent topics by part of (P361). For example,

⟨ arithmetic (Q11205) ⟩ part of (P361) ⟨ mathematics (Q395) ⟩.

Wouldn't it be better to limit the use of part of (P361) to physical things and use facet of (P1269) in cases like this?--Malore (talk) 02:08, 19 October 2018 (UTC)

They mean different things. history of mathematics (Q185264) is a facet of (P1269) mathematics, arithmetic (Q11205) is not. --Yair rand (talk) 02:13, 19 October 2018 (UTC)
An alternative would be
⟨ arithmetic (Q11205) ⟩ instance of (P31) ⟨ specialty (Q1047113) ⟩
of (P642) mathematics (Q395), but that doesn't seem ideal. For medical fields, there's an item medical specialty (Q930752) which is used quite a bit. Ghouston (talk) 02:52, 19 October 2018 (UTC)

Updating of the formatter url

What's up with formatter URL (P1630) on KMDb documentary film ID (P3750)? On the property page there's the correct link to https://www.kmdb.or.kr/eng/db/kor/detail/movie/A/$1 but on the item My Heart Is Not Broken Yet (Q11292885) it seems that http://www.kmdb.or.kr/vod/vod_basic.asp?nation=A&p_dataid=$1 is still used. The change is already 4 days old. Did a change introduce a bug that prevents these links from updating? ChristianKl 19:57, 17 October 2018 (UTC)

@ChristianKl: As with other things on item pages, formatter URLs are not updated unless the item itself changes in some way, or you clear it from the cache (load with ?action=purge). ArthurPSmith (talk) 20:23, 17 October 2018 (UTC)
What's the reasoning for it working like this? It seems like it's useful for changes in formatter URLs to propagate. ChristianKl 08:51, 19 October 2018 (UTC)
Simple consequence of caching. You could ask the developers I guess. ArthurPSmith (talk) 14:07, 19 October 2018 (UTC)
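
To make the mechanism in this thread concrete, here is a minimal sketch of how a formatter URL is turned into a link and how the purge URL mentioned by ArthurPSmith is formed. It is a simplification: plain `$1` replacement is assumed, any escaping the real software may apply is ignored, the function names are hypothetical, and the identifier `12345` is a made-up placeholder rather than a real KMDb ID.

```python
def apply_formatter(formatter_url: str, identifier: str) -> str:
    """Substitute an external identifier for the $1 placeholder of a formatter URL."""
    return formatter_url.replace("$1", identifier)


def purge_url(item_id: str) -> str:
    """URL that reloads an item page with ?action=purge, clearing its cached rendering."""
    return f"https://www.wikidata.org/wiki/{item_id}?action=purge"


# Formatter URLs quoted in the question above.
NEW_FORMATTER = "https://www.kmdb.or.kr/eng/db/kor/detail/movie/A/$1"
OLD_FORMATTER = "http://www.kmdb.or.kr/vod/vod_basic.asp?nation=A&p_dataid=$1"

# "12345" is a hypothetical identifier used only for illustration.
print(apply_formatter(NEW_FORMATTER, "12345"))
print(apply_formatter(OLD_FORMATTER, "12345"))
print(purge_url("Q11292885"))
```

The item page stores the rendered link, not the formatter, which is why a cached page keeps showing the output of the old formatter until the item is edited or purged.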

Usage statistics of Magnus' tools

Hello all! I'm trying to collect some statistics of how influential Magnus Manske's tools are for Wikidata and for our projects in general. (Does it make sense that a volunteer keeps doing and maintaining such crucial work?) You can help by looking for, and adding statistics of usage of your favorite Magnus tools in this Phabricator ticket: https://phabricator.wikimedia.org/T207370. (Phabricator was the first place I thought about to post such information, but if someone wants to move it elsewhere, feel free!) Cheers, Spinster 💬 06:13, 19 October 2018 (UTC)

Magnus' tools clearly play a central and critical role in data imports. The QuickStatements format has been adopted de facto as the standard way to represent edit batches in many tools. Does it make sense that a volunteer keeps doing and maintaining such crucial work? I think this is worth asking indeed. Currently the data import infrastructure is patchy and entirely maintained by volunteers or short term funded projects (Magnus' tools, Harvest Templates, wikidata-cli, the Primary Sources Tool, OpenRefine - and many others). I think it would be fantastic to see the WMF / WMDE step in on this issue, pretty much in the same way that WMDE developed the SPARQL query service (removing the need for Magnus' WDQ) or the quality constraints (which could eventually replace KrBot, maybe?).
Freebase had a lot of issues but one thing they got right (I think) was to invest early on in the data import pipeline with their bespoke query language MQL (which could be used to read or write the knowledge graph) and by developing Gridworks. MQL then inspired the popular GraphQL, and Gridworks morphed into OpenRefine, so this data import pipeline even outlived Freebase itself. − Pintoch (talk) 08:59, 19 October 2018 (UTC)
@Pintoch: re: data import infrastructure
I think Wikidata should be a driving force in pushing data-driven Linked Data and federated SPARQL. Last week I did a promising test to monitor quality in Wikidata where I
- Salgo60 (talk) 09:37, 19 October 2018 (UTC)

the request for deletion page

I would like to participate in the discussions. But I have a question -

Some items get requested and deleted almost instantaneously. While others have been sitting there for months with a few comments. What's the difference? Lazypub (talk)

  • Mainly obviousness of the case for deletion. - Jmabel (talk) 16:07, 19 October 2018 (UTC)
    • I guess I am just familiar with Wikipedia. "Speedy Deletion" doesn't appear (that I am aware of) on a discussion board. But requests for deletion do, and they always run at least a few days. Lazypub (talk)
Items are deleted a bit hastily; usually the creator of the item is not even pinged about the discussion (or about the deletion), although there is a rule to do so. I find this a bit unpleasant.
I started a discussion about that issue (see Wikidata:Administrators' noticeboard/Archive/2018/07#Pinging of item creators during deletions), but unfortunately there has been no progress on it.--Jklamo (talk) 17:02, 19 October 2018 (UTC)
@Jklamo: The gadget used to request deletions now pings the creator of the item in the edit summary. (This, of course, does not ping the bot operator if the item was created by a bot, but at least the current functionality is a start.) Mahir256 (talk) 17:10, 19 October 2018 (UTC)
Great, thanks at least for that.--Jklamo (talk) 17:24, 19 October 2018 (UTC)

Bug ?

Hi, I created Barroude cirque yesterday, with entries on Commons and French WP, but the article on French WP does not show any Commons link. The same goes for Barrosa cirque, which was created a long time ago. --Guérin Nicolas (talk) 05:46, 25 October 2018 (UTC)

Ok, it works now. --Guérin Nicolas (talk) 08:03, 25 October 2018 (UTC)
This section was archived on a request by: Sjoerd de Bruin (talk) 08:16, 25 October 2018 (UTC)

Soundex formatter URL

The formatter URL on Soundex (P3878) is https://www.wikidata.org/wiki/Special:Search?search=haswbstatement%3A"P3878%3D$1" . Is that correct? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:19, 14 October 2018 (UTC)

Seems useful and better than nothing. @Jura1: let me mention as you haven't been mentioned. Sjoerd de Bruin (talk) 12:41, 14 October 2018 (UTC)
I'm unclear how providing such a link to one of our data consumers is "Useful"; please can you enlighten me? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:22, 14 October 2018 (UTC)
Can you enlighten us what "consumer" you have in mind? --- Jura 13:34, 14 October 2018 (UTC)
Any. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:18, 14 October 2018 (UTC)
Can we have a sample use case? --- Jura 14:20, 14 October 2018 (UTC)
I hope so - that's what I'm asking for; the use case for providing this URL to data consumers. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:24, 14 October 2018 (UTC)
I don't know. You were talking about data consumers. Maybe you can enlighten us how they get it and what you have in mind. --- Jura 14:28, 14 October 2018 (UTC)

There being no use case for this formatter URL, I propose to remove it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:12, 15 October 2018 (UTC)

There being no argument that it has any utility, I have removed it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:21, 19 October 2018 (UTC)

...and was promptly reverted by Jura1, with the edit summary "apparently there is demand for it and you failed to explain yourself". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:01, 19 October 2018 (UTC)

Crowdsourcing "How terrorist groups end" and other analyses of terrorism data

What do you think about the suitability of Wikidata for crowdsourcing collaborative research on terrorism, including updating Jones and Libicki (2008) while perhaps linking it to the Global Terrorism Database (GTD) and inviting the lead authors and organizations of these sources to support this effort?

Jones and Libicki (2008) effectively say that the US is using the least effective approach to combating terrorism: Terrorists were more likely to win than be defeated militarily, at least according to the authors' analysis of the terrorist groups that ended between 1968 and 2006:

  • 7% were defeated militarily.
  • 10% won.
  • 40% were suppressed by law enforcement.
  • 43% changed to nonviolent political actors through negotiations.

In spite the seemingly profound implications for the War on Terror, I have yet to see Jones and Libicki (2008) mentioned as part of the public discussion of the War on Terror. I see the following as likely obstacles to overcoming this deficiency in the public discourse:

  1. The elites in the US who have the most influence on the editorial policies of the mainstream media seem to support the way the War on Terror has been pursued so far and would not like to see it ended.
  2. At least some leaders with connections to the US military seem to have a generally low opinion of RAND Corporation (Q861141).

I'd like to work with organizations opposed to the War on Terror encouraging replication and expansions of Jones and Libicki (2008) while perhaps linking it to the GTD. These organizations could also engage in activities designed to increase public awareness of these results while recruiting volunteers to crowdsource improvements of these data and exploring alternative analyses of the data based on alternative selection and classification of cases, etc. This could include raising funds to pay graduate students to do some of this work under the supervision of Jones and Libicki (2008) and the research professors associated with the w:Global Terrorism Database and related research.

references

Comments? DavidMCEddy (talk) 19:57, 19 October 2018 (UTC)

  • I think we already had this question a few months back. As the topic is fairly sensitive, I don't think it's particularly suitable for Wikidata beyond support for existing articles at Wikipedia. --- Jura 21:02, 19 October 2018 (UTC)
@Jura1: - Which topic is fairly sensitive? Terrorism or using Wikidata to crowdsource research?
- Thanks, DavidMCEddy (talk) 01:56, 20 October 2018 (UTC)
@DavidMCEddy: Jura probably is referring to the topic of terrorism. If you want to refine the stats found in the paper you mention by adding information about incidents, you're welcome to do so (before you start this, however, you should determine how to map info about an incident or incidents to each of the percentage outcomes you mention above). I am not sure, however, that this should be an organized, coordinated wholesale effort to effect such a refinement--beyond, of course, the obvious disclaimer of paid editing that you suggest for the grad students and the obvious COIs on the part of Jones, Libicki, and researchers affiliated with the GTD should any of them decide to take up editing Wikidata themselves. A lot of people here might take issue with your proposal simply because of an implied agenda ("In spite the...of RAND Corporation")--not that I am saying that you have one, but that some of your statements can be construed as implying one. (Don't take my word as final, however; you should await further comment on your proposal from others.) Mahir256 (talk) 02:37, 20 October 2018 (UTC)
@Jura1: @Mahir256: I'm keenly aware that "the definition of terrorism is controversial". I believe that Wikidata can help improve understanding of the extent to which the different definitions matter. Examples:
* Jones and Libicki (2008) use a RAND Corporation definition of terrorism and terrorist groups. The Global Terrorism Database uses different definitions for data curated at different dates. Creating links between Wikidata and both of these databases could facilitate crowdsourced discussions of which incidents were or were not "terrorism" under different definitions of "terrorism". This could further contribute to scientific discussions of the extent to which the conclusions of different studies, e.g., Jones and Libicki (2008), are robust or sensitive to which definition is used.
* The NAVCO Data Project has facilitated some work that I think is extremely important and seems largely overlooked in the mainstream discussion of how people should think about national security and appropriate response to conflict.
If Wikidata does not want to support this directly, what might Wikidata like or be willing to do to help other individual(s) or organization(s) crowdsource this kind of effort? I plan to contact individuals and groups like Jones and Libicki (2008), Erica Chenoweth, the RAND Corporation (Q861141), the National Consortium for the Study of Terrorism and Responses to Terrorism (Q18151703), International Physicians for Social Responsibility and the American Friends Service Committee to ask what they'd like to do to encourage extensions and wider use of related research. Before I do that, I'd like to get more discussion of these issues here.
A related question is that of paid editors: How is that policy enforced? Largely by the honor system -- and blocking edits by users with obvious high rates of biased editing?
I think the world needs something like Wikinews to grow into millions of local editions, with paid editors encouraging volunteers to contribute content of concern to them, and professional editors helping volunteers check their facts adequately and write from a neutral point of view in acceptable journalistic style. I've discussed this in more detail in "v:Everyone's favorite news site".
Comments? Thanks, DavidMCEddy (talk) 18:22, 20 October 2018 (UTC)

Archive (Q57495732)

Please delete Archive (Q57495732). I replaced it with Ziffer House Archive: Documentation and Research Center of Israeli Visual Arts (Q57535814). רוזמן יצחק (talk) 07:11, 20 October 2018 (UTC)

רוזמן יצחק: I have requested its deletion. Esteban16 (talk) 15:40, 20 October 2018 (UTC)

Merge request

Please merge Q49635349 and Q6473084. --193.157.212.243 12:58, 25 October 2018 (UTC)

Done. - Nikki (talk) 13:19, 25 October 2018 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 07:06, 26 October 2018 (UTC)

Could Wikidata help with bringing navigation templates to the English Wikipedia on mobile?

As I don't know of any forum to discuss things which affect more than one Wikimedia project, and Wikidata seems to be the closest thing to this, I will ask it here. Also, I couldn't find any individual developer on the English Wikipedia who does this, so maybe there are developers here. In the English-language Wikipedia, navigation templates don't seem to display at the bottom of any articles, and I'm forced to click on "Desktop view" and then jump to the bottom of the page if I want to edit any navigation template. Is there a reason why navigation templates are displayed in the Dutch-language Wikipedia (see the attached images above) but not in the Vietnamese-language Wikipedia, English-language Wikipedia, etc.? If so, why does this limitation only persist on some Wikimedia wikis but not others? -- 徵國單  (討論 🀄) (方孔錢 💴) 06:56, 21 October 2018 (UTC)

It's better to ask at eg. meta:Tech. Anyway, the reasons may be that some wikis customised CSS loaded on mobiles or completely migrated to TemplateStyles. Matěj Suchánek (talk) 09:26, 21 October 2018 (UTC)

may I use the images from wikipedia

I am an educator and I create videos on various platforms. I need various images to show in my videos to make my learners feel comfortable and to engage them in watching the video.

So my question is: may I use the images from Wikipedia in my videos freely, for both commercial and non-commercial purposes?

Please reply to me regarding this.  – The preceding unsigned comment was added by 2405:204:f09b:61f8:c549:339:8646:c987 (talk • contribs) at 07:04, 21 October 2018 (UTC).

See c:Commons:Reusing content outside Wikimedia. Some Wikipedias may allow non-free media, so conditions for that will be different. --EugeneZelenko (talk) 14:12, 21 October 2018 (UTC)

Featured on main page

I'm thinking Douglas Adams's second wife wasn't named "Wikidata Sandbox". Adele Onto (talk) 12:44, 21 October 2018 (UTC)

Fixed. (Anyone know if there's some way for the sandbox to be filtered out of Reasonator listings?) --Yair rand (talk) 19:13, 21 October 2018 (UTC)

Structured data on Commons related properties

The first pieces of Structured data on Wikimedia Commons (Q43387741) are going to be rolled out next month, which will allow Wikidata-like metadata storage for files on Commons. As a result we will need some new properties and some existing properties to be used in new ways. The Wikidata:WikiProject_Commons/Properties_table page lists many properties which might be used and many which might need to be created. I assume new properties will be proposed at Wikidata:Property_proposal/Sister_projects in the Wikimedia Commons section, and that we will advertise on Commons for people to come there and voice their opinion. Some of the properties might be almost unusable in a Wikidata context, like an "Other versions" file-type property for listing other images that depict the same thing (like the same artwork), or a "Digital representation of" property for linking files on Commons to Wikidata items, the way c:template:Artwork's Wikidata field does. However, even if a given property might never be used on Wikidata, it will be activated on both projects (Wikidata and Commons), since I do not think there is a way of limiting the scope of a property even if we wanted to. Before I (and others) start proposing new properties, I thought I would announce it here so it is less of a surprise. Pinging participants of the earlier discussions: @Multichill, Keegan, Jheald, SandraF (WMF), Pigsonthewing, Yann:. --Jarekt (talk) 22:00, 21 October 2018 (UTC)

Help needed to fill up an item about a book

I split the book Land and the Ruling Class in Hong Kong (Q57517186) out of the item Real Estate Hegemony (Q10928384). Q10928384 holds the en-wiki article, which is full of original research on real estate hegemony in HK, and the fr-wiki article, which is a translation of the en-wiki one. The new item is only about the book Land and the Ruling Class in Hong Kong, whose Chinese title translates literally as "real estate hegemony". So the two merely share a name and should not be housed in one item. I also created a draft on en-wiki to match the zh-wiki article, which is about the book. Although I do not have the full book in hand, I collected the ISBN and other parameters from Amazon and other online bookstores. However, how should I fill those into the new item, given that the book has multiple editions? Matthew hk (talk) 17:42, 19 October 2018 (UTC)

@Matthew hk: Try looking at the properties listed at Wikidata:WikiProject Books. (You can link the items with edition or translation of (P629) and has edition or translation (P747).) Jc86035 (talk) 13:02, 22 October 2018 (UTC)
@Matthew hk: We have to create one item for the work and then one item for each edition/translation including the first edition in the original language. Usually WP articles refer to the work item.
According to the draft you wrote, you need four items, one for each edition, plus one item for the work. The item Real Estate Hegemony (Q10928384) makes no sense with its current description. What is a concept? A better description should be provided. In my view, Real Estate Hegemony (Q10928384) should be defined as instance of (P31): plutocracy (Q131708). Snipre (talk) 13:28, 22 October 2018 (UTC)
@Matthew hk: I don't understand the book's publication history: was it first published in English and then translated into Chinese, or the reverse? Snipre (talk) 13:59, 22 October 2018 (UTC)

Q842433

Q842433 needs a review. It seems to be a mixture of at least, Atmospheric friction and Orbital mechanics. Voyager85 (talk) 23:54, 19 October 2018 (UTC)

@Voyager85: In what language do you see "atmospheric friction" here? The ones I am familiar with appear to be entirely focused on orbital mechanics (or astrodynamics). ArthurPSmith (talk) 14:39, 22 October 2018 (UTC)

More than one profession

Can one actually list more than one profession in Wikidata? I'm not familiar with this. For Jakob Joseph Matthys (Jakob Joseph Matthys (Q29365470)), the occupation is now given as "linguist". But his profession was Roman Catholic priest; languages were only his hobby, even though it is the hobby, and not his profession, that made him significant. Or what exactly is meant by "occupation"? Regards, --Freigut (talk) 13:04, 22 October 2018 (UTC)

Yes, you can enter several values there, and they all count equally. Below the existing statement there is a link where you can add further occupations. Roughly speaking, "occupations" means all activities that are relevant to the biography of the person described; it does not matter whether the activity is or was paid. What does not belong there are specific positions such as "Bundeskanzler" (Federal Chancellor) or "Erzbischof von Berlin" (Archbishop of Berlin); there are other properties for those. --MisterSynergy (talk) 13:16, 22 October 2018 (UTC)
All clear, many thanks! --Freigut (talk) 13:33, 22 October 2018 (UTC)

Wikidata weekly summary #335

Some suggestions about property "YouTube channel ID"

Excuse me for a possibly stupid topic. I am just a newbie in this wiki project and am afraid to make these moves myself. I suggest renaming the property "YouTube channel ID" (YouTube channel ID (P2397)) to "YouTube channel", adding the property "number of subscribers" (number of subscribers (P3744)) to its "allowed qualifiers constraint", and creating a property "number of views" and adding it to the "allowed qualifiers constraint" as well.

I also want to move the work of my bot, which updates the subscriber and view counts of YouTube channels, from ruwiki to Wikidata. What do you think about this? -- IEPCBM (talk) 16:48, 20 October 2018 (UTC)

@IEPCBM: What we store using the former property is an IDentifier. We generally name our identifiers with an "ID" suffix for clarity. What benefits do you think dropping that suffix would have? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:27, 20 October 2018 (UTC)
@Pigsonthewing: I think that if we add the qualifiers "number of subscribers" and "number of views" to the property, "YouTube channel" would be a more correct name, because the property would then carry additional data (not only the ID).  – The preceding unsigned comment was added by IEPCBM (talk • contribs) at 18:06, 20 October 2018 (UTC).
@IEPCBM: I think it would be better to add the subscriber data and so on with separate properties in separate statements, since you would need to add sources and point in time (P585) for each data point (for example, number of YouTube subscribers → 100, with qualifier point in time → 21 October 2018). Furthermore, on Wikidata usually old data shouldn't be overwritten with new data if the old data was correct at the point in time (P585). Jc86035 (talk) 12:43, 22 October 2018 (UTC)
Maybe tabular data would be a solution here. - Nikki (talk) 12:57, 22 October 2018 (UTC)
@Nikki: If there are no copyright concerns about this I think this would be the best way to do it, especially if e.g. a bot is going to be adding a new subscriber count for each channel multiple times per day (statements are a bit limited in this regard by the best time precision available being one day). Jc86035 (talk) 13:21, 22 October 2018 (UTC)
@Nikki: Does tabular data look like this? -- IEPCBM (talk) 16:02, 23 October 2018 (UTC)
@IEPCBM: Yes. For YouTube subscriber counts and so on I assume you would have a time column and a number of subscribers column for each channel, and some Wikipedia template would be automatically retrieving the most recent row of data through a Lua module. (You could also add historical data by sifting through the Internet Archive.) Jc86035 (talk) 16:25, 23 October 2018 (UTC)
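
The "most recent row" lookup described above can be sketched in Python rather than Lua. This is only an illustration under assumptions: the table below imitates the general shape of a Commons tabular data page (a schema plus rows), the column names `date` and `subscribers` are hypothetical, and the counts are made-up sample values.

```python
from datetime import date

# Hypothetical table in the general shape of Commons tabular data:
# a schema listing the columns, and the rows under "data".
# The numbers are invented sample values, not real subscriber counts.
channel_table = {
    "license": "CC0-1.0",
    "schema": {
        "fields": [
            {"name": "date", "type": "string"},
            {"name": "subscribers", "type": "number"},
        ]
    },
    "data": [
        ["2018-10-21", 100],
        ["2018-10-23", 112],
        ["2018-10-22", 105],
    ],
}


def latest_row(table):
    """Return the most recent row as a dict keyed by column name."""
    names = [field["name"] for field in table["schema"]["fields"]]
    rows = [dict(zip(names, row)) for row in table["data"]]
    # Rows are not assumed to be sorted, so compare the parsed dates.
    return max(rows, key=lambda row: date.fromisoformat(row["date"]))


print(latest_row(channel_table))
```

Because older rows stay in the table, the history is kept and only the lookup decides which data point is shown.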

Weights

I've been doing a little work today looking at heights and weights. One odd thing I've discovered is that if mass (P2067) is defined in terms of kilogram-force (Q216880), not kilogram (Q11570), then it seems to show up in Wikidata as though it was units of approximately, but not exactly, ten kilograms. See, eg, Mitsu Shimojo (Q6882969) - query data. They should be about equivalent. Any idea what to do about this? (Other than fix the use of kilogram-force, of course). I've not dealt with measures much. Andrew Gray (talk) 20:38, 20 October 2018 (UTC)

I hadn't heard of kilogram-force (Q216880), but it's a unit of force, not a unit of mass. I.e., it shouldn't be used as a unit for mass (P2067). Ghouston (talk) 21:45, 20 October 2018 (UTC)
The word "weight" existed before Newton's laws of motion, which first recognized the difference between the mass of an object and the force exerted by gravity on that object. The situation described by Andrew Gray seems to be using the meaning of "weight" used by modern physicists, which is the force exerted on an object by gravity. A kilogram-force is a poor unit of force that panders to people who don't want to learn to do it right; it is the force of gravity on a mass of one kilogram at the surface of the earth. It is approximately equal to 9.8 of the proper unit of force, the newton. So the poor unit, kilogram-force, is being converted to the proper unit, newton. Jc3s5h (talk) 22:10, 20 October 2018 (UTC)
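
As a worked illustration of why the values look like "approximately, but not exactly, ten" times too large: one kilogram-force is defined through standard gravity, 9.80665 m/s², so a figure entered in kilogram-force and normalized to newtons is multiplied by that factor. The function name below is made up for this sketch, and the 50 kgf input is an arbitrary example.

```python
# Standard gravity in m/s^2; by definition 1 kgf = 9.80665 N.
STANDARD_GRAVITY = 9.80665


def kgf_to_newton(kgf):
    """Convert a force in kilogram-force to newtons."""
    return kgf * STANDARD_GRAVITY


# A "weight" entered as 50 kgf normalizes to roughly 490 N,
# which reads like about ten times the intended 50 kg.
print(kgf_to_newton(50))
```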

New tool: Wikidata Image Positions

Hi all! I built a tool to show image areas indicated by relative position within image (P2677) qualifiers: Wikidata Image Positions. It shows you the image (P18) of a given item along with all the elements it depicts (P180), according to their relative position within image (P2677). Two examples:

You can explore more examples by using this query, which finds the items with the most depicts (P180) statements with position qualifiers:

# Items ranked by the number of depicts (P180) statements that carry a
# relative position within image (P2677) qualifier.
SELECT DISTINCT ?item ?itemLabel (GROUP_CONCAT(DISTINCT ?natLb; separator=" - ") AS ?nature) (COUNT(DISTINCT ?depictspos) AS ?nb_depicts) ?display
WHERE
{
  ?item p:P180 [ pq:P2677 ?depictspos ] .
  # English labels of the instance of (P31) values, concatenated into ?nature
  ?item wdt:P31 ?nat .
  ?nat rdfs:label ?natLb . FILTER (LANG(?natLb) = "en") .
  # Link to the tool page for the item
  BIND(IRI(CONCAT("https://tools.wmflabs.org/wd-image-positions/item/", REPLACE(STR(?item), "http://www.wikidata.org/entity/", ""))) AS ?display)
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
} GROUP BY ?item ?itemLabel ?display
ORDER BY DESC(?nb_depicts)

I’ve also set the tool up to be the new formatter URL (P1630) for the relative position within image (P2677) property, so you can just click the values to verify them. (You might need to purge the item first if it was last rendered before the formatter URL (P1630) was changed.) Hope this helps! --Lucas Werkmeister (talk) 19:44, 21 October 2018 (UTC)

Cool. It would be nice if the frame turned red for items that happen to lack a image (P18) statement. Thierry Caro (talk) 20:26, 21 October 2018 (UTC)
Great. I guess this is the Wikidata implementation of the Annotations used for a long time on Commons, for example here. We should probably import the bounding boxes already defined on Commons to here if the depicted thing has an item. --Jarekt (talk) 22:17, 21 October 2018 (UTC)
@Jarekt: pretty much, yeah – I’m not claiming to invent anything new here :) the relative position within image (P2677) property has existed for a long time, but perhaps the existence of this tool can encourage more use for it? (Also, I’m not sure if importing bounding boxes from Commons is okay, license-wise.) --Lucas Werkmeister (talk) 10:24, 22 October 2018 (UTC)
License-wise bounding box coordinates are not something you can copyright, and would fall under c:Template:PD-ineligible so we should be fine there. --Jarekt (talk) 12:26, 22 October 2018 (UTC)
@Lucas Werkmeister: Is making this the default formatter URL (P1630) for relative position within image (P2677) the right way to go?
Previously one could install the User:Husky/ifff-viewer-link.js script, which would link the relative position within image (P2677) to an IIIF crop of the image. This new URL formatter breaks that script. (Okay, the IIIF service is currently down and seems to be in no hurry to come up again, also breaking Shonagon's IIIF image cropper service used to create P2677 strings, as well as the Zoom viewer on Commons; but wouldn't that be a better long-term link?) Jheald (talk) 08:56, 22 October 2018 (UTC)
@Jheald: well, I wrote this tool because the other one is broken… if it ever comes back to life, I assume it should be possible to edit the user script so that it doesn’t break when the value is already a link, and instead replaces the link with a different one. (I’m also thinking of adding editing support to the tool, as it happens.) --Lucas Werkmeister (talk) 10:24, 22 October 2018 (UTC)
Update: the tool now supports editing! You can draw a relative position within image (P2677) region for any depicts (P180) statement that doesn’t have the qualifier yet by dragging the mouse across the image with the left mouse button held down. See User:Lucas Werkmeister/Wikidata Image Positions#Editing for longer instructions. --Lucas Werkmeister (talk) 15:03, 23 October 2018 (UTC)
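
For readers who want to work with the qualifier values themselves, here is a minimal Python sketch of turning a region string into a pixel box. It assumes the values use the IIIF-style percentage form `pct:x,y,w,h`; the function names and the 800×600 image size are hypothetical and are not taken from the tool.

```python
def parse_pct_region(value):
    """Parse 'pct:x,y,w,h' into four floats (percentages of the image size)."""
    prefix = "pct:"
    if not value.startswith(prefix):
        raise ValueError("expected a value starting with 'pct:'")
    parts = value[len(prefix):].split(",")
    if len(parts) != 4:
        raise ValueError("expected four comma-separated numbers")
    x, y, w, h = (float(part) for part in parts)
    return x, y, w, h


def region_to_pixels(value, image_width, image_height):
    """Convert a percentage region into (left, top, width, height) in pixels."""
    x, y, w, h = parse_pct_region(value)
    return (
        round(image_width * x / 100),
        round(image_height * y / 100),
        round(image_width * w / 100),
        round(image_height * h / 100),
    )


# A region starting a quarter of the way across an 800x600 image.
print(region_to_pixels("pct:25,10,50,20", 800, 600))
```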

I'm confused

Is this a malfunctioning bot? Because it looks a lot like a malfunctioning bot. GMGtalk 17:33, 22 October 2018 (UTC)

@GreenMeansGo: On the surface, it does. @Lymantria: for more opinions. Mahir256 (talk) 17:46, 22 October 2018 (UTC)
They do not have a bot flag. I was considering blocking them and asking to file a bot request, but I guess we need to wait for Lymantria--Ymblanter (talk) 19:32, 22 October 2018 (UTC)
Seems like the bot was running from the main account instead of from User:WDBot. Lymantria (talk) 19:36, 22 October 2018 (UTC)
It doesn't seem to work any better on that account. Ghouston (talk) 20:59, 22 October 2018 (UTC)
The problem being? Lymantria (talk) 05:35, 23 October 2018 (UTC)

Hi, I operate the bot User:WDBot. I hope to clear up the confusion. When I ported the bot script from test to production, I found that there was a problem with one reference for a few countries. Ireland was one of these countries. For Ireland and the other few countries with this issue, I corrected the reference manually (without using a script) using my account. Cheers! Datawiki30 (talk) 08:02, 23 October 2018 (UTC)

External Identifier Website elfilm.com is down

This concerns elFilm film ID (P3143) / elFilm person ID (P3144) with 137,010 entries. Shall we delete these properties? Queryzo (talk) 06:59, 23 October 2018 (UTC)

No. If it is permanently down, deprecate the formatter URL (P1630) claim on the property page which will deactivate link generation in the standard web UI and indicate to data users that there is no online accessibility currently. However, we intentionally keep the properties and all of its identifiers in such cases. —MisterSynergy (talk) 08:06, 23 October 2018 (UTC)
I have replaced the two formatter URLs with versions using archive.org Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:35, 23 October 2018 (UTC)
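
To illustrate why changing only the formatter URL (P1630) is enough: the stored identifiers stay untouched, and the link is produced by substituting each identifier for the `$1` placeholder in the formatter. The sketch below is a simplified illustration; the function name, the example URLs and the identifier `12345` are hypothetical, and the archive-style pattern is only an assumed shape, not the value actually set on the properties.

```python
def format_identifier_url(formatter_url, identifier):
    """Build a link by substituting the identifier for the $1 placeholder.

    Returns None when no usable formatter URL is available, which models
    the case where link generation is switched off.
    """
    if formatter_url is None:
        return None
    return formatter_url.replace("$1", identifier)


# Hypothetical formatter URLs, before and after pointing at an archive.
live = "https://example.org/film/$1"
archived = "https://web.archive.org/web/*/https://example.org/film/$1"

print(format_identifier_url(live, "12345"))
print(format_identifier_url(archived, "12345"))
print(format_identifier_url(None, "12345"))
```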

Countries and their subdivisions and territory

The question of which country a region or subdivision is part of can sometimes be quite complicated, and our current handling of this is very imprecise and occasionally inconsistent. I'd like to establish some standards for this and add them to the relevant documentation. To summarize some of the relevant things to take into account:

  • Some items correspond to regions: geographical features (islands, peninsulas, etc) or other areas which don't exist as a part of any country's set of subdivisions, but the territory of which is within one or more countries. Some items correspond to subdivisions, which may exist within the set of subdivisions used by only one country among a set of countries that the corresponding territory may be located in, or the same subdivision may be used by multiple countries that the area may be in. (That is, there are occasionally competing subdivision structures over the same area, with different items, and sometimes not.)
  • Territories can be controlled and administered by a country; a particular country can be "the government" over an area, in practice. Regular administration can be military or civilian. Under certain circumstances, control may be exercised by an alliance of countries, sometimes working under an international organization. Occasionally, military control and day-to-day civil administration of the same area may be run by different countries.
  • A country can claim a territory as their own. This can be done with or without claiming the territory as part of the country proper. Disputed claims between countries can be further complicated by the fact that a local government of a subdivision can have its own opinion, and claim itself to be part of a particular country.
  • A territory can be internationally recognized as being part of a country. Any country or group of countries can recognize an area as belonging to a country. (I'm unsure to what extent this existed before modern history.) There can be hundreds of different countries expressing opinions on this, so presumably we don't want to duplicate the whole list many times.

The existing properties used in this area are: country (P17), contains the administrative territorial entity (P150), located in the administrative territorial entity (P131), and territory claimed by (P1336). (Also somewhat relevant are coextensive with (P3403) and territory overlaps (P3179), which manage the relationship between regions and subdivisions of the same area.)

Using the existing properties, along with possibly new properties or qualifiers, we should clarify how these properties should be used to express the above information on regions and subdivisions, keeping several priorities in mind:

  • Specificity: Items should ideally include as much of the data as possible, as unambiguously as possible. Users should be able to query any specific element of association between a territory or subdivision and a country.
  • Simplicity: Editors should be able to easily figure out how to format a statement for any such kind of relation, without having to read endless complicated documentation.
  • Minimalism: Most areas are undisputed, and are claimed by, controlled by, administered by, and recognized as being part of, only one country. We should try to minimize the number of extra statements so that we don't need to add extras to every such ordinary situation.

Essentially all of the ways a subdivision/area can be connected to a country are not dependent on each other, making everything rather complicated. To take a fictionalized example, in case this helps:

The government of Q01 lists among its administrative subdivisions the Q02, which corresponds to the territory/area of the geographic feature Q03, which is claimed by Q04 as subdivision Q05, and administered by Q06 (as a different type of subdivision, which has a separate item Q07). The local government of Q02 itself considers itself to be part of Q07. The international community considers the area to be part of Q08 and recognizes it as such.
What statements should be used on each of these items to convey this information?

How should this data be ideally structured? --Yair rand (talk) 23:34, 14 October 2018 (UTC)

Wikidata:WikiProject Country subdivision has some pre-existing work in this area, but it seems to have been largely dormant for quite a while. There's still a lot of even quite basic work to do in this area (e.g. how much is missing or red on Wikidata:WikiProject Country subdivision/Items, even in terms of the first-level administrative division (Q10864048) in a lot of countries) so making sure that the underlying concepts are in place and well-documented so that people can help out by filling in a lot of those gaps, would be very useful. --Oravrattas (talk) 14:00, 16 October 2018 (UTC)
I've linked to this thread from the WikiProject talk page. --Yair rand (talk) 20:51, 16 October 2018 (UTC)
(Anyone know of any way to get more attention on this issue? An RfC, maybe?) --Yair rand (talk) 21:18, 23 October 2018 (UTC)

Petscan / toolserver replag issue - FIXED

Petscan queries involving wikidata are a bit wonky right now. Replication to the wikidata database used by Petscan (and other toolserver tools?) has been turned off whilst an issue is fixed. Current replag is ~15 hours. No ETA to fix, but the DBAs are all over it. --Tagishsimon (talk) 23:20, 19 October 2018 (UTC)

@Jura1: The first problem is still pending patches to fix, please do not mark section resolved yet. --Liuxinyu970226 (talk) 23:27, 22 October 2018 (UTC)
@Jura1: Replag is currently fine - https://tools.wmflabs.org/replag/ - and so phab T207524 has been closed. DBAs still seem to be doing final checks on the issues underlying T206743. I've amended the title of this thread so that readers are not misled. --Tagishsimon (talk) 19:17, 23 October 2018 (UTC)

Group of dubious items

I’m playing with ORES training datasets, and there is an edit which involves a dubious item that is probably not notable: Q41030406 has no interwiki and no authentication information. It even has statements with deleted items as values in parenthood statements (I did not even know that was possible). It may be that it points to a group of non-notable items. I could just request its deletion, but I’m not sure it’s the right thing to do, which is why I am leaving a message here. What do you think; how should we clean up that possible mess? Track the contributions of the creator of these items? Why were the deleted items deleted without their uses being cleaned up beforehand; what happened? It seems something went wrong in the process here. author  TomT0m / talk page 19:40, 23 October 2018 (UTC)

I've tried to track the creator, who added unsourced ethnicity statements or items about very young children with fake information, for a few weeks one or two years ago. I don't really have the time and patience for it anymore, due to the various IP ranges he/she uses. Sjoerd de Bruin (talk) 20:23, 23 October 2018 (UTC)

President of an organization/suggested qualifiers problem

I am trying to add the statement that Cornelia Storrs Adair (Q57662600) has <position held> president (Q1255921) of the National Education Association (Q3111510). I have the beta feature "Entity suggestions from constraint definitions" turned on. The system won't let me add "of" as a qualifier, and the suggestions offered for qualifiers seem to be appropriate for president (Q30461) of a country, not of an organization. Help? - PKM (talk) 19:56, 23 October 2018 (UTC)

@User:PKM I noticed the same thing, but discovered "of" is still there as an option, only waay down on the list. For me, the same thing happens when I try to select "male" or "female" as the value to "sex or gender". They're buried out of sight and under hermaphrodites, intersex and other options. Moebeus (talk) 22:57, 23 October 2018 (UTC)

Statements that contain a redirected item

A bot created a duplicate item Oliver Braddick (Q7087416) and linked some of his articles to it [5]. I merged it to the earlier item, but now when you try to query these items they don't get picked up [6]. Is this a bug? Ghouston (talk) 00:21, 24 October 2018 (UTC)

I think there may be something in the back-end software that will eventually migrate the statements to the new item automatically. This doesn't show up in the item histories. Ghouston (talk) 00:46, 24 October 2018 (UTC)
  • KrBot eventually updates the statements (no back-end software). If you don't want to wait, you need to query the redirect explicitly [7]. --- Jura 01:10, 24 October 2018 (UTC)
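
Querying "the redirect explicitly" can be sketched as follows. This is a hedged illustration, not the query behind link [7]: it assumes the query service exposes redirects as `owl:sameAs` triples, it assumes the articles are linked through author (P50), and the function name and the placeholder ID `Q123` are made up.

```python
def redirect_aware_query(property_id, redirect_qid):
    """Build a SPARQL query for statements that still point at a redirect.

    The OPTIONAL part looks up the item the redirect resolves to,
    assuming redirects are exposed as owl:sameAs triples.
    """
    return f"""SELECT ?subject ?target WHERE {{
  ?subject wdt:{property_id} wd:{redirect_qid} .
  OPTIONAL {{ wd:{redirect_qid} owl:sameAs ?target . }}
}}"""


# Q123 is a placeholder; substitute the ID of the merged (redirected) item.
print(redirect_aware_query("P50", "Q123"))
```

Once a bot has rewritten the statements to the surviving item, a query like this returns nothing and the ordinary query works again.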

Bad birthdays

Google's Knowledge Graph has informed me that en:Laura Vaccaro Seeger, Q17147769, was born at 00:00:00 on 1 January 1900, which is obviously an error as the world's oldest person was born in 1903. (User:GreenMeansGo has deleted her birth date from the record here). It turns out that this statement was added by Reinheitsgebot four months ago, drawing on her VIAF information, although her VIAF page doesn't have a birth date. I suspect that the bot misinterpreted the complete lack of data as an indication that she was born at 00:00:00 on 00-01-01.

Could someone query the database to find everyone who was allegedly born on 1900-01-01, and then analyse the results to see how many of these dates were added by this bot? I'm just guessing that if there are many of them, that most of them will be errors. Nyttend backup (talk) 19:29, 3 October 2018 (UTC)
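
One way to approach the first half of this request is sketched below in Python: it builds a SPARQL query for items whose date of birth (P569) is exactly the given day at day precision, and the URL for sending it to the public query service. The function names and the `limit` parameter are assumptions of this sketch. Note that it cannot tell which account added each date; that would need the item histories.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def born_on_query(iso_date, limit=1000):
    """Build a query for items born on exactly iso_date (YYYY-MM-DD)."""
    return f"""SELECT ?item ?itemLabel WHERE {{
  ?item p:P569/psv:P569 ?node .
  # Precision 11 means the value is precise to the day.
  ?node wikibase:timeValue "{iso_date}T00:00:00Z"^^xsd:dateTime ;
        wikibase:timePrecision 11 .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}"""


def query_url(query):
    """Return the GET URL that asks the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


print(born_on_query("1900-01-01"))
```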

<ns1:birthDate>1950</ns1:birthDate>
<ns1:deathDate>0</ns1:deathDate>
<ns1:dateType>flourished</ns1:dateType>

Date types can be 'lived', 'circa', or 'flourished'. To quote https://github.com/OCLC-Developer-Network/viaf-dates/blob/master/README.md:

'lived' says the dates are birth and death dates and should be accurate += 3 years. 'circa' says the dates are a guess and should be given a 10 year error of margin. 'flourished' are more likely to be dates the person worked or a century date and are given 100 years leeway.

In this case work period (start) (P2031) is a more appropriate property than date of birth (P569) (diff). But even when VIAF states dateType=lived, it can still be incorrect or indicate something else – e.g. https://viaf.org/viaf/72927896/viaf.xml for Ebenezer Hewlett (Q18671012):

<ns1:birthDate>1737</ns1:birthDate>
<ns1:deathDate>1747</ns1:deathDate>
<ns1:dateType>lived</ns1:dateType>

One more example – Mahākāśyapa (Q335304) hadn't died at age 1 (diff) https://viaf.org/viaf/62449721/viaf.xml:

<ns1:birthDate>-550</ns1:birthDate>
<ns1:deathDate>-549</ns1:deathDate>
<ns1:dateType>flourished</ns1:dateType>

Please see this article as well: Parsing and Matching Dates in VIAF. Santer (talk) 14:11, 11 October 2018 (UTC)
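
The date-type rules quoted above can be sketched as a small lookup. This is an illustration under assumptions: the README wording is read here as a symmetric margin in years, the function names are made up, and, as the examples above show, even `lived` dates still need checking by hand.

```python
# Margins in years, following the quoted README wording; treating them
# as symmetric around the stated year is an assumption of this sketch.
VIAF_DATE_MARGINS = {"lived": 3, "circa": 10, "flourished": 100}


def year_range(year, date_type):
    """Return the (earliest, latest) year implied by a VIAF date type."""
    margin = VIAF_DATE_MARGINS[date_type]
    return (year - margin, year + margin)


def suggested_property(date_type):
    """Suggest work period (start) (P2031) for 'flourished' dates,
    and date of birth (P569) otherwise."""
    return "P2031" if date_type == "flourished" else "P569"


# The first example above: birthDate 1950 with dateType 'flourished'.
print(year_range(1950, "flourished"), suggested_property("flourished"))
```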

  • The format looks just short of having
<ns1:birthDate>2018</ns1:birthDate>
<ns1:dateType>date of death</ns1:dateType>

--- Jura 09:26, 24 October 2018 (UTC)

Linking two items

Hi, I have two items, Milano Porta Garibaldi railway station (Q801187) and Garibaldi FS (Q1087026), which are linked, physically too: the first one is a railway station and the second one a subway station. Is there a way to link them on Wikidata? Thank you very much --★ → Airon 90 20:25, 23 October 2018 (UTC)

connects with (P2789), see for example Q800589#P2789 --Pasleim (talk) 22:03, 23 October 2018 (UTC)
@Pasleim: Thank you! Anyway, I found the property interchange station (P833). Is this better than your proposal? --★ → Airon 90 07:19, 24 October 2018 (UTC)

How should lifelong membership be modeled in Wikidata?

Some Wikidata items about humans (such as Vilho Setälä (Q3893579)) have member of (P463) Universal Esperanto Association (Q12565). This seems like the correct way to render membership. But some Wikidata items (such as Frank Buckley (Q12347962)) have member of (P463) lifelong member of UEA (Q12346649). This seems wrong to me. The item lifelong member of UEA (Q12346649) for some reason also has instance of (P31) Esperanto club (Q2367344), which also seems wrong to me. Robin van der Vliet (talk) (contribs) 22:35, 23 October 2018 (UTC)

Use has characteristic (P1552)=life member (Q20730801) like this. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:19, 24 October 2018 (UTC)
Following the suggestion of Ghouston, I have already used the property subject has role (P2868) as a qualifier everywhere. If you believe that has characteristic (P1552) is more appropriate, maybe we could change it everywhere with a bot? Robin van der Vliet (talk) (contribs) 09:34, 24 October 2018 (UTC)

Edinburgh Research Archive - thesis data in final furlong.

Hi all, ChaoticReality is looking to run some test edits using QuickStatements (exporting to the Wikidata test site) so we can ensure the thesis data items are being created correctly before running QS on Wikidata proper for all the theses. Could someone take a look at ChaoticReality's request for autoconfirmed/rollbacker status please so that he can run these test edits on the test site as this is the only thing that is holding up being able to add these theses to Wikidata at the moment. Very best, Stinglehammer (talk) 12:17, 24 October 2018 (UTC)

@Stinglehammer: Is there anything preventing User:ChaoticReality from making the requisite 50 edits needed to gain autoconfirmed status (the account is sufficiently old, I would believe)? Once that happens, and a request is explicitly made under the 'rollback' heading on WD:RFP, I would not mind granting the rollback flag. Mahir256 (talk) 14:18, 24 October 2018 (UTC)

Taxonomy: concept centric vs name centric

Hi there,
I'm a new editor to Wikidata but I'm already an extensive consumer of data from Wikidata using the SPARQL interface.
During the last year, there's been one point that's really bothered me. My plan is to fix it, first by hand, later by bot. However, my first edits were already reverted, so maybe I first have to reach out.
The point is that a concept with multiple names gets modelled by one item. The exception to that are taxa, which have one item per scientific name. Thus, a taxon with multiple scientific names has multiple items.
This generates a few problems:

  1. For data consumers (and likely new editors) it's difficult to understand the data structure if multiple ways of modelling are used across Wikidata.
  2. Links to Wikipedia are split; if different language versions of Wikipedia describe the same taxon under different names they are not linked to each other.
  3. Data is added to multiple items. For example "Image" or "taxon common name" are added on each item about the same taxon.

How can we resolve the problem? My solution is to merge items about the same taxon and use ranks and qualifiers to indicate which statements are outdated resp. preferred. 130.92.255.36 07:56, 15 October 2018 (UTC)

WikiProject Taxonomy has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. --- Jura 08:02, 15 October 2018 (UTC)

  • I added "taxonomy" to the section header above and pinged the relevant WikiProject. BTW it's not really specific to the field that links to Wikipedias don't follow Wikidata's structure. Merging items to connect random elements isn't really a good idea. --- Jura 08:01, 15 October 2018 (UTC)
    • I understand that connecting random elements isn't a good idea but my plan is to merge items describing exactly the same taxon, just known under different scientific names. As analogy, look at "Louis XIV of France". He is known under many different names, still only one item describes him. --130.92.255.36 08:20, 15 October 2018 (UTC)
Agree this is currently a major nuisance. As an example, the species formerly known as Madanga ruficollis (Q178830) was recently reclassified as Anthus ruficollis, but I know if I change the taxon name (P225) to that (apparently sacrosanct and "unchangeable", but nothing to indicate it, nor to make the change impossible), someone will just revert it. Currently, as far as I can see, it has to wait for someone to create a new item for it here, and transfer all the links across to it - very cumbersome! - MPF (talk) 08:41, 15 October 2018 (UTC)
  • In items here about taxa, the taxon name is leading. If an author publishes a paper to suggest that a species should be placed into a different genus - say that Madanga ruficollis has to be placed in the genus Anthus like in the example above - it is a misunderstanding that the earlier name has been rejected or disappeared. First of all authors may have different opinions on the issue. But also the earlier name has been published (often many times, for instance on a Wikipedia). You will see that taxonomy publications on Anthus ruficollis will mention the earlier name Madanga ruficollis. The most correct way on Wikidata is to make a new item on the new name and keep the old one - linking them by taxon synonym (P1420). Of course the confusion is understandable, since the plants or animals at hand do not change by placement in a different genus. One might think that just changing taxon name (P225) would be sufficient. However, the placement of a species in a different genus reflects more than just renaming; it shows new insights and/or opinions in the placement of the species in the tree of life in relation to other species. The header "concept centric vs. name centric" hence is not reflecting correctly what is going on as the name reflects the concept and the taxonomical concept has indeed changed. For convenience reasons, sitelinks may be collected at one item. Lymantria (talk) 09:22, 15 October 2018 (UTC)
  • True, it's just that the procedures for doing all this at Wikidata are so obscure and impenetrable. And if changing P225 is not the way to do it, why is this not locked to make it impossible to do accidentally? Is there really no easier way? - MPF (talk) 09:45, 15 October 2018 (UTC)


  • I fully agree with the initial comment in this section and in fact because there is no agreed solution to this problem Wikidata is no longer of much interest to me. I am mostly concerned about fungi, which are normally known by their scientific names (at least in English), and those names are currently undergoing enormous numbers of changes with new genera constantly appearing. For these homotypic synonyms there is no controversy at all that the species are exactly equivalent (it is true that there also exist cases where there are conflicting views as to the exact equivalence and those cases are more complicated to handle, but they are much less frequent). We need Wikidata items for all the important synonyms, but one of the items should be selected to mean the real organism; the other items should be restricted to taxonomical information. The special "Wikidata" item should contain the wikilinks and all properties which belong to the organism (and which do not need to be duplicated for all the items). We should avoid language which implies that the selected "Wikidata" name is the right one and that the other names are wrong. Almost the only thing that Wikidata does for the Wikipedia projects at present is provide the interlanguage links, but the various language versions may naturally happen to use different names for the same species, so unless one special "Wikidata" item is chosen for all, the various language pages will not be correctly linked together.
I think the title "concept centric vs. name centric" reflects well the main issue here. It is true that the exact definitions of the various names are not identical, but I think the place to document that is in the Wikipedia article, of which there should probably be only one. All the descriptions are attempts to define the same actual species which exists in nature and if there are conflicting definitions, and someone wants to include that information in Wikidata, I think that it should be documented by using references and author attributes to multiple properties on the same main organism item.
I made a detailed proposal for a solution in the discussion of the synonym property, but unfortunately it was not agreed. I proposed that only the special item representing the organism should have instance of (P31) = taxon and the others should have instance of (P31) = synonym, but some other method could also be used. In difficult cases multiple special "Wikidata" items could be allowed. Some objections were raised but I think they have solutions and anyway it is a very high priority that Wikidata should be able to have selected organism-level items. Otherwise I think the Wikidata data model is failing to support "tree of life" information satisfactorily. Strobilomyces (talk) 10:00, 15 October 2018 (UTC)
    1. This is not really the proper venue for this discussion, which should be at Wikiproject Taxonomy.
    2. Without a detailed case-by-case study it is not really possible to tell if something is the same taxon. Taxa are not necessarily stable, and may be highly dynamic. For example, to most authors Pentaphylacaceae consists of one species, but there are those who consider it to comprise some five hundred species. In most cases, the only known way to track taxa is by scientific name (or sometimes clade names); there is, as yet, no way to track taxa by other means. So there would be nothing to base items on (and once it does become possible to have identifiers for all taxa ever recognized, it likely will prove that these are very, very many).
    3. It is easy to track scientific names. One name, one item makes it easy to gather database identifiers and references referring to that name. Data retrieval is also easy, as in the enwiki taxonbar (pointed out above).
    4. Something of a problem is the placement of the sitelinks. All the sitelinks of homotypic names can be placed together in one item (so as to have them linked), as long as not too much value is attached to what item they are in. Various other solutions have been proposed, but as yet nobody supported an approach suggested by any other user.
    5. Another problem is that from time to time, there are users who feel that there is, or should be, a single accepted name for each and every taxon. Besides not being in the spirit of the WMF, there is the slight problem that these users do not necessarily agree with each other, or with most of the world literature. - Brya (talk) 10:55, 15 October 2018 (UTC)
  • @Brya: Thanks for your answer.
1. I would certainly be happy to have the discussion at Wikiproject Taxonomy.
2. In general it can be difficult to tell if two taxa are synonyms or not, but in many cases there is no doubt, for instance the fungus Marasmius alliaceus now has a new name Mycetinis alliaceus and it is absolutely clear that the two terms refer to exactly the same mushrooms, since the change was at the genus level. There are many hundreds of cases like that in the fungi and it would be useful to have a solution for those simple cases. I suppose that if there is doubt it may be necessary to keep the names separate but I think that the claim of synonymy should be with a reference and so if it is sometimes wrong, that does not mean Wikidata is wrong. Taxonomic databases often give synonym information and that is what I propose we should use to determine if taxa are the same. There will be difficult cases, but it is normal to have to take that sort of decision in this context. Wikipedia pages can sometimes cover multiple closely related groups and we should not need a separate taxon item in Wikidata unless a separate Wikipedia page is necessary. It would be an immense service to users to provide items for particular organisms even if the meaning of the organism is a bit vague and it is necessary to read the Wikipedia article to understand exactly what the various names mean.
If I understand you right, your example is at the family level; in the past family Pentaphylacaceae consisted of only one species, but DNA work has shown that that species belongs in the clade of family Ternstroemiaceae and due to the priority rules the name of the combined family has to be "Pentaphylacaceae", so it has gained 500 species. By the way, the author string of the family does not change in this process! Well, I think my proposal needs to apply mostly at the species level, which I think is the "organism" level. Perhaps I should not try to apply it at the family level.
3. It is true that because of the strict nomenclature rules, names are easier to track than taxa, but it is a terrible disadvantage if we don't have Wikidata items which correspond to the real species (and perhaps other levels). I don't think that we should give up and I think we can track the taxa using the Wikipedias, once they are covered. I think it is OK if initially Wikidata is loaded with thousands of items at the name level from external databases, but in those cases where the organisms get articles in language Wikipedias or categories in Wikimedia Commons we should improve the quality of the data. Then I think that synonyms should be linked together and one "Wikidata" item should be selected, so that we will have an item at the organism level.
4. I am very glad to see you say "All the sitelinks of homotypic names can be placed together in one item". I have done this in certain cases, but I was worried that I was breaking some rule. I also agree that not much value should be attached to the question of which item they are in. If there are non-taxonomic properties (which belong to the organism), they should also be added in the item with the sitelinks. In fact this means that where an organism has pages in at least two Wikipedias, we do actually have a selected organism-level item. If this could just be agreed as a formal principle, I would be much happier.
5. Yes, it is a problem if users are intransigent about pursuing their particular taxonomy, and I think that is one reason why we need clear rules about how to deal with divergent taxonomies. Strobilomyces (talk) 15:11, 15 October 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── A pointer to this discussion has been posted at Wikiproject Taxonomy; and members of the project have been pinged here. This venue is fine.

If all the sitelinks of homotypic names are to be placed together in one item, then that item would need to have all those names as aliases; there has previously been strong resistance to using such aliases. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:40, 15 October 2018 (UTC)

That would only be the case if the intent is to cause maximum confusion. - Brya (talk) 17:00, 15 October 2018 (UTC)
  • Using the number of Wikipedia sitelinks as a primary criterion to define items is probably the most complicated approach. For Wikipedias, it doesn't really matter across how many items information is spread: they can easily load from hundreds of structured items. Anyways, from an external standpoint, it doesn't really surprise me that a field called "-nomy" is name-centric. --- Jura 17:18, 15 October 2018 (UTC)
- Using the number of Wikipedia sitelinks is the only method we have at present, but I would like to see an agreed property used instead. I think you are grossly optimistic about how easy it is to get software implemented. The problem is not the difficulty of writing the software or the load on the system, it is the necessity to (1) define clearly and agree the required data model and specification, (2) persuade the sitelinks software team that these special rules for taxonomy items are important enough to devote resources, and (3) publish the rules clearly and win acceptance so that they will be taken into account in other less basic software such as Lua scripts. Strobilomyces (talk) 11:52, 17 October 2018 (UTC)
- Thanks for notifying Wikiproject Taxonomy, by the way. Strobilomyces (talk) 11:57, 17 October 2018 (UTC)

The Color of the Bikeshed

I had the same problem, actually, with magazines. Journals, if that is a more respectable name for ya. Bird-lore became Audubon Mag, or some such. A big difference between the two is the license to use them.

The early "find and identify" taxonomy, probably uses whatever reference material they had with them at the time. "Scientific Society" notes and inclusions, maybe a book; samples brought back home and identified with whatever reference material they had available.

150 years later, I find a species name here. None of the cited taxon groups within the species and none of the wikipedias use the name. I separated them, the interesting part was "who is calling this that". It got merged again and the one site that called it by the merge name was left off (another was found or created since then). I was treated as if I was the idiot.

Wikidata is not a taxon identifier authority. If there are two news agencies reporting a thing and they both claim that a different thing happened, this is not the place to determine the correctness of one or the other. So it is with taxon names.

If a user of the data doesn't like the complicatedness of the data, the user should be instructed, then, to go fix the "science" as it has been practiced or to find something less complicated to be involved in. I would like to call this my opinion, but it is not even that! It is the fact of taxonomy.--RaboKarbakian (talk) 16:30, 15 October 2018 (UTC)

User:RaboKarbakian - as far as I can tell you created an item for the same name (two items for one name), rather than an item for the homotypic synonym. [ Brya (talk) 17:00, 15 October 2018 (UTC) (- split comment)]
User:RaboKarbakian - I think that Wikidata should try to provide identifiers for particular species, and (less importantly) for taxa at other levels. Authors of Wikipedia pages have to confront this problem in deciding how many pages to create and selecting titles, and the definitions of Wikidata items depend on Wikipedia pages if those exist. The projects are doing a big service if they can accomplish that and it is very unhelpful to say that each user should have to review the entire taxonomic history to find the right item; only a few specialists should need to do that and the differences between the homotypic synonyms are unimportant for most purposes. Strobilomyces (talk) 12:59, 17 October 2018 (UTC)
I think that what you are suggesting is more of this: https://www.wikidata.org/w/index.php?title=Q5175878&type=revision&diff=759332100&oldid=759325089 (Which was an impressively ill informed waste of my time, and the time of others as well) Please, look at the links there to confirm. Wikimedia is actually able (by license) to host the documents that named and used the older takes on TOL and names of living things. It is far more interesting and honest to just report it as they did it. Without spending a lot of time discussing it and by spending more time working on it, all the species names can be documented throughout history with the papers and even the "vulgare" or "romantic" (old definitions to be used here) writings. To give this method of documentation the same limitations as an encyclopedia -- why even bother with this as the wikipedias are there doing what you want already?--RaboKarbakian (talk) 02:51, 18 October 2018 (UTC)
The "stated as: Leptinella plumosa" is superfluous in an item that deals with Leptinella plumosa, as only papers and database with the name "stated as: Leptinella plumosa" are included in that item. - Brya (talk) 03:16, 18 October 2018 (UTC)

homotypic synonyms

User:Strobilomyces - putting all homotypic synonyms in one item superficially looks nice (and it might have worked if the basic structure of Wikidata had been different), but it will either lose information or be frighteningly complicated, with every statement accompanied by qualifiers indicating to which name it applies. This may even be the case for something as basic as taxon rank; no reason why there may not be homotypic synonyms in three or more different ranks. It will be terribly awkward to edit or to read.
        Properties of taxa should preferably be referenced and placed with the name in the reference. What applies to one particular circumscription may not apply to a different circumscription. And many different circumscriptions for one taxon is not limited to the family-level. What is one species to one taxonomist may well be hundreds of species to another taxonomist. A notorious case is the dandelion.
        And of course we would need new properties and a new taxobox module.
        Maybe somebody will come up with a piece of software to link sitelinks in different items (the Bonnie and Clyde issue is still there as well). - Brya (talk) 17:00, 15 October 2018 (UTC)
User:Brya I am not suggesting putting all homotypic synonyms in one item, I am suggesting being allowed to mark one of them as special and to use that special one for sitelinks and for other properties which belong to the organism. Indeed there are homotypic synonyms at different ranks, that is OK. Normally properties of taxa are the same for all homotypic synonyms; they tend to be very general (example: has fruit type (P4000)). If the circumscriptions have different organism properties, I think they are not homotypic and at least that is a very special case which does not occur in the present Wikidata and which could be treated exceptionally. By giving priority to these obscure cases we are losing the opportunity to have an organism-level item in Wikidata, which could really add value to the project. I don't think that the dandelion mini-species change anything here; we can represent alternative taxonomies using references, qualifiers etc. and the same decisions have to be made with or without this proposal.
The taxobox module refers to name-related information and I actually doubt that any change to that is needed. But we would need new properties. I would like to put forward another simplified version of my proposal:
  1. The current name-based data structure would stay as it is and we would only add information to it.
  2. It would be allowed to add the new property "organism item" to taxonomy items to indicate a chosen item for sitelinks. I am not proposing a change to the software, but in time we should update the data so that those items with sitelinks have the new property and where homotypic synonyms (indicated by taxon synonym (P1420) and instance of (P31) with of (P642)) have different sitelinks, one item should be chosen and the sitelink information merged. Perhaps there could be exceptions, but I think if the Wikipedia pages are not equivalent, the items should not be homotypic synonyms. The "organism item" could be useful for other purposes, such as keeping in one place information which is independent of name. Perhaps another method of labelling the special item would be better, for instance it could be an attribute of instance of (P31) = taxon.
  3. A restriction would be that Wikipedia pages should only link to items marked as "organism items". This would not be enforced by software, at least at first, but it would solve the problem that sitelinks for the same organism can be scattered amongst different names.
  4. Also a new property "Authority for current name" should be added to allow indication (with a reference) that one of the items is the "real" current name. Conflicting views could easily be accommodated. This is a somewhat different issue, but it would make clear that the current name may be different from the "organism item".
This proposal would solve the sitelinks problem and allow organism-level data to be added without losing any information. In order to be useful I think it would just need a formal proposition and a consensus that this is a useful addition to the taxonomy part of Wikidata. Strobilomyces (talk) 19:21, 15 October 2018 (UTC)
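To make the proposal above more concrete, here is a rough Python sketch of points 2 and 3: given a set of homotypic synonym items, one is chosen as the "organism item" and the sitelinks of the others are moved onto it. Everything in it is illustrative. The `organism_item` field stands in for the proposed (not existing) property, the item IDs and page titles are placeholders, and choosing the item with the most sitelinks is only one arbitrary selection rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TaxonItem:
    qid: str          # placeholder identifier, not a real Wikidata ID
    taxon_name: str
    sitelinks: Dict[str, str] = field(default_factory=dict)  # wiki -> page title
    organism_item: Optional[str] = None  # stands in for the proposed property


def consolidate(synonyms: List[TaxonItem]) -> Tuple[TaxonItem, List[Tuple[str, str]]]:
    """Choose one item for sitelinks and move the other items' sitelinks onto it.

    Returns the chosen item and the (wiki, title) pairs that could not be
    moved because the chosen item already has a sitelink for that wiki.
    """
    # Arbitrary rule for this sketch: the item with the most sitelinks wins.
    chosen = max(synonyms, key=lambda item: len(item.sitelinks))
    conflicts = []
    for item in synonyms:
        item.organism_item = chosen.qid
        if item is chosen:
            continue
        for wiki, title in list(item.sitelinks.items()):
            if wiki in chosen.sitelinks:
                conflicts.append((wiki, title))  # left in place for manual review
            else:
                chosen.sitelinks[wiki] = title
                del item.sitelinks[wiki]
    return chosen, conflicts


# Placeholder data loosely based on the Marasmius/Mycetinis example in this thread.
newer = TaxonItem("Q_NEW", "Mycetinis alliaceus", {"enwiki": "Page A", "dewiki": "Page B"})
older = TaxonItem("Q_OLD", "Marasmius alliaceus", {"frwiki": "Page C"})
chosen, conflicts = consolidate([newer, older])
print(chosen.qid, sorted(chosen.sitelinks), conflicts)
```

Conflicting sitelinks are reported rather than overwritten, which matches the remark that non-equivalent Wikipedia pages may mean the items should not be treated as homotypic synonyms.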
  1. This is a quite different solution than proposed by others in this thread.
  2. This proposal would not so much solve the sitelinks problem, as move the problem to a different level. It requires a "consensus taxonomy" which does not exist in reality. - Brya (talk) 03:22, 16 October 2018 (UTC)
I don't agree that this requires a consensus taxonomy. For each set of homotypic synonyms it requires one to be selected, but there does not need to be any consensus that the one selected is the true current one. It needs consensus that the given names are really synonyms, but in many cases I think that is not a problem. The sitelinks are still useful and important even if the organisms of the articles they link are technically slightly different in some way. If there is doubt, the items can be left separate (i.e. not linked by synonyms to provide a unique item for sitelinks). Strobilomyces (talk) 14:33, 17 October 2018 (UTC)
If one is selected more or less at random, what is the difference with the current situation? Also, at the moment every possible synonym-relationship can be expressed in principle. It would probably be better if we had a "is a synonym of taxon" property and a "is homotypic with" property. - Brya (talk) 17:05, 17 October 2018 (UTC)
The difference with the current situation is that it would be an agreed policy which users could take advantage of and build on. It should be recognized as an error if someone adds sitelinks to a homotypic synonym of an item with sitelinks. In future software could be added to enforce it (perhaps with exceptions if necessary), which would not be possible without an agreement. There should be some formal way of marking which arbitrary item from the set of synonyms has been selected (just using the sitelinks is unsatisfactory; it only works if there happen to be sitelinks and does not look official). I am not sure what is the best way to mark the selected synonym; it is true that we could use the existing properties taxon synonym (P1420) and instance of (P31) with of (P642) to indicate this, but this has been rejected in the past because it is considered to be taking a POV as to the correct current taxonomy.
I agree with the "is a synonym of taxon" property (very like taxon synonym (P1420)) and the "is homotypic with" property. Also I think we need a "current name according to" property to show which synonym is considered to be the current one according to the given authority. This would allow conflicting taxonomies to be documented and would emphasize that the selected synonym for sitelinks is not necessarily the true current name. Strobilomyces (talk) 16:40, 18 October 2018 (UTC)
Ah, an explicit marker that does not mean anything?
  • The "is a synonym of taxon" property would be the inverse of "taxon synonym (P1420)".
  • We do have a "current name according to" property: it is "taxon name", which can be referenced.
- Brya (talk) 17:33, 18 October 2018 (UTC)
„For each set of homotypic synonyms it requires one to be selected“. WD is not a taxon authority. What we try to do is follow and model different taxonomic points of view (aka concepts) along with references. --Succu (talk) 21:13, 18 October 2018 (UTC)

Separate names from concepts?

Given the number of others who have jumped in... Perhaps it would simplify things for most purposes if the taxonomic name items were completely divorced from the items representing classes (i.e. any collective grouping) of organisms? We do have a brand new Lexeme namespace specifically designed to hold information about words and short phrases, their origins, and their meanings (linking those meanings to regular Q items where appropriate). Probably we're not ready to wholesale move those millions of Q items over to corresponding L entries, but supporting the distinction between words and their meanings seems at least a sensible start: let's create independent Q items, not in the "taxon" hierarchy, for the conceptual species, genera, families, etc at least where there are corresponding wikipedia pages, and just model them separately. ArthurPSmith (talk) 18:38, 15 October 2018 (UTC)

  • It might work for Common names (currently strings on some items), but I don't think it would solve it for the actual items. You'd still need to name them ;) --- Jura 18:44, 15 October 2018 (UTC)
  • I agree that a complete set of organism items separate from the name items would solve the theoretical problems nicely and achieve the modelling aims. But I have always assumed that this would not be acceptable as it would look like duplication, be confusing and difficult to explain, increase the amount of work in updating the data, and be a big change to the current system. But in my opinion the current system is not fit for purpose - we need taxon-level information. But we would have to have a very robust consensus in order to go in that direction. Strobilomyces (talk) 19:37, 15 October 2018 (UTC)
    • @Strobilomyces: One of the reasons I suggest it is that the taxon hierarchy has always been out of sync with the way we manage basic class membership relationships in the rest of wikidata via instance of (P31) and subclass of (P279) statements. The duplication issue I think can be explained simply by looking at those class relationship statements on the items - it's not uncommon in the rest of wikidata to have multiple items with the same name, distinguished mainly by their position in the class hierarchy. ArthurPSmith (talk) 20:40, 15 October 2018 (UTC)
  • I agree with the statements of several others above. It would in some ways be best if only the currently valid name (zoological def) had a page and synonyms were referred to it. However, that may work at Wikipedia and it's the preferred method at Wikispecies also. But Wikidata is attempting to database all terms. As stated earlier any synonym is still an available name (again zoological def); it has not been disposed of in any way and can at any time be resurrected. However some way of linking the synonyms would be a good idea; it would also be useful to identify if a name is the currently accepted name or combination. Cheers Scott Thomson (Faendalimas) talk 21:03, 15 October 2018 (UTC)
  • A separate name-space for taxon names is interesting in theory. It would presumably work well for species of birds. It may work for taxa where Wikipedias have genuine pages for taxa, that is at some length and in some detail. However, there are hundreds of thousands Wikipedia pages that don't have genuine information, but are based solely on the scientific name and taxonomic position ... - Brya (talk) 02:59, 16 October 2018 (UTC)
  • I have the impression that taxon items are supposed to represent the work of classification by a scientist, not the actual organisms involved. Since they are only "instance of taxon", and not subclasses of anything, they can't have instances. That means you need other items to represent that actual organisms, e.g., human (Q5) for humans. When you have an item like Wolf of Ansbach (Q39019), it presumably needs to be an instance of something other than wolf (Q18498), which gives a constraint violation. A new "wolf" item, perhaps, or just assign it to animal (Q729), which does have a subclass? Ghouston (talk) 23:03, 16 October 2018 (UTC)
    @Ghouston: parent taxon (P171) is a subproperty of (P1647) subclass of (P279). --Yair rand (talk) 23:41, 16 October 2018 (UTC)
    Thanks, subproperties of the subclass property is a new one to me. The constraints don't know about it either. Ghouston (talk) 00:27, 17 October 2018 (UTC)
    I've adjusted the constraint on instance of (P31).--99of9 (talk) 01:50, 17 October 2018 (UTC)
    There already is a "wolf item", at Q3711329. - Brya (talk) 05:16, 17 October 2018 (UTC)
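The point about parent taxon (P171) being a subproperty of subclass of (P279) can be sketched in Python with a small in-memory set of statements. The check below follows instance of (P31) once and then any chain of "subclass-like" properties. The intermediate item `Q_PARENT` and the chain up to animal (Q729) are made-up placeholders for illustration, not the actual statements on those items.

```python
# Sketch only: treat parent taxon (P171) as a subproperty of subclass of (P279)
# when checking class membership. The statements are illustrative placeholders.
SUBCLASS_LIKE = frozenset({"P279", "P171"})

statements = [
    ("Q39019", "P31", "Q18498"),     # Wolf of Ansbach, instance of, wolf
    ("Q18498", "P171", "Q_PARENT"),  # wolf, parent taxon, hypothetical parent item
    ("Q_PARENT", "P279", "Q729"),    # hypothetical parent, subclass of, animal
]


def is_instance_of(item, target, statements, subclass_like=SUBCLASS_LIKE):
    """True if item is an instance of target, directly or via subclass-like links."""
    stack = [o for s, p, o in statements if s == item and p == "P31"]
    seen = set()
    while stack:
        cls = stack.pop()
        if cls == target:
            return True
        if cls in seen:
            continue  # guard against cycles in the hierarchy
        seen.add(cls)
        stack.extend(o for s, p, o in statements if s == cls and p in subclass_like)
    return False


with_subproperty = is_instance_of("Q39019", "Q729", statements)
without_subproperty = is_instance_of("Q39019", "Q729", statements, frozenset({"P279"}))
print(with_subproperty, without_subproperty)
```

With only P279 followed, the chain breaks at the taxon item, which is one way to picture the constraint violation mentioned above.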

Conflated ranks

Utatsusaurus hataii (Q3053716) seems to conflate a genus and species. What's the best way to disentangle them? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:15, 16 October 2018 (UTC)

Leave it alone. Most fossil taxa are "monotypic". We are missing a lot of fossil type species. --Succu (talk) 21:58, 16 October 2018 (UTC)
Why would we "leave alone" an item which clearly conflates two distinct topics? What do other missing items have to do with this issue? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:45, 16 October 2018 (UTC)
@Pigsonthewing: It would seem to me that the genus and species for a monotypic genus both refer to the same thing. What am I missing? How are they different topics? - Jmabel (talk) 03:28, 17 October 2018 (UTC)
So what's Utatsusaurus (Q20672904)? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:22, 17 October 2018 (UTC)
@Pigsonthewing: I'm literally not sure I understand your question, assuming it was addressed to me. I have no specialized knowledge to answer some of the possible meanings of the question; what exactly is in doubt? - Jmabel (talk) 15:28, 17 October 2018 (UTC)
You asserted, AIUI, that it is correct that Q3053716 is about both the species and the genus, because they are "the same thing" and are not "different topics"; yet Q20672904 is about the genus alone. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:09, 17 October 2018 (UTC)
What I'm saying is that the things in the world that are in the genus are exactly the same as the things in the world that are in the (unique) species for that genus. Both concepts describe the same things. - Jmabel (talk) 22:08, 17 October 2018 (UTC)
Actually they are not quite the same thing. However, the problem is not here but at the Wikipedias, so unless Wikipedias split these, this is the straightforward way to handle them. - Brya (talk) 05:12, 17 October 2018 (UTC)
A genus and a species are not „the same thing”. You need a species to „define” a genus. --Succu (talk) 06:12, 17 October 2018 (UTC)
Agree, they are not the same thing: a genus is a group of species more similar to each other than to anything else. A species is a group of populations capable of successfully interbreeding that are more similar to each other than to anything else. Being monotypic does not change the definition, as it also includes the hypothetical unknown relatives in the genus, i.e. undiscovered fossil history that can be inferred from relationships, also known as ghost lineages. Cheers Scott Thomson (Faendalimas) talk 06:28, 17 October 2018 (UTC)

I see that Brya has made some changes, which go part way towards untangling (but do not fully untangle) the two subjects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:22, 17 October 2018 (UTC)

I did not "untangle" anything. I just adjusted the label; labels tend to be fairly meaningless (there are lots of wrong labels out there, as many users put in the title of the Wikipedia page rather than the topic of the item). And I moved it to fossil. - Brya (talk) 03:22, 18 October 2018 (UTC)

Genera and species should not be conflated into one item even in cases where the genus is monotypic. Fossil taxa *especially* need to make a very good distinction between those two. BTW, on a related note, Wikipedia should try to write about species of dinosaurs wherever possible, even in cases when the genus is monotypic. This is because when a new species gets discovered, we'll always have to separate species-specific from genus-specific information. --Vojtěch Dostál (talk) 12:18, 22 October 2018 (UTC)

Indeed. So, again: What's the best way to disentangle these items? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:42, 24 October 2018 (UTC)
For the moment there is nothing to disentangle. - Brya (talk) 17:02, 24 October 2018 (UTC)
"Nothing to disentangle"? Well, the item currently has the label "Utatsusaurus hataii" in several languages, and "Utatsusaurus" in others. It has the taxon name value of "Utatsusaurus hataii", but is linked to c:Category:Utatsusaurus. It has Wikipedia links to en:Utatsusaurus, es:Utatsusaurus, and several others of that title (despite claiming to have the taxon rank "species"), but also to it:Utatsusaurus hataii. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:03, 24 October 2018 (UTC)

Ixinandria (Q3753571) was synonymized to Rineloricaria (Q138093) but...

Hi, how should this be resolved? Ixinandria steinbachi is the only species in Ixinandria (other wikis link their species articles to Q3753571, which is named as a genus!). But Ixinandria was most recently recognized as a synonym of Rineloricaria (Q138093). I can't merge Q3753571 with Q138093 because of the links from other wikis – they should either be linked to a new item named Rineloricaria steinbachi (synonym: Ixinandria steinbachi), or Ixinandria (Q3753571) should be renamed Rineloricaria steinbachi, retaining all wikilinks. Ark (talk) 14:58, 18 October 2018 (UTC)

Pinging @Panellet, Beusson: to see whether merging is possible in Catalan. --117.13.95.102 01:45, 25 October 2018 (UTC)

property for "category for people imprisoned here"?

Hi, I've been working on Ebensee concentration camp, mostly using Mauthausen concentration camp as my guide. I see that there's a property for linking the Commons "category for people who died here." Would it also be useful to create a Commons category for "people imprisoned [in various prisons]" and then a Wikidata property for the category? Or is that unnecessarily complicated? Rachel Helps (BYU) (talk) 17:00, 24 October 2018 (UTC)

@Rachel Helps (BYU): Better use significant event (P793) in the item of the persons who were imprisoned with qualifier location (P276) to define the prison. Snipre (talk) 21:19, 24 October 2018 (UTC)
@Rachel Helps (BYU): There is also place of detention (P2632). Nevertheless, if you think that property ("category for people imprisoned here") would be useful for Wikidata, feel free to propose its creation at Wikidata:Property proposal (after all we already have category for people born here (P1464), category of associated people (P1792), category for people who died here (P1465), category for people buried here (P1791), category for recipients of this award (P2517), category for employees of the organization (P4195), category for films shot at this location (P1740)). strakhov (talk) 23:13, 24 October 2018 (UTC)

Music videos

I've moved some YouTube video ID (P1651)s from items about songs to items about music videos (e.g. Make You Feel My Love (Q56085848)). This is fairly clear-cut, especially where the audio of the video is actually different, but are there cases where videos should be associated directly with items about things which are originally audio-only (e.g. lyric videos, videos which are one static image + audio)? Jc86035 (talk) 13:17, 22 October 2018 (UTC)

I would say yes - there are millions of songs recorded before music videos were a thing, but enthusiasts publish these to YouTube, often with a picture of the label or as a recording of a spinning turntable. Moebeus (talk) 00:15, 23 October 2018 (UTC)
@Jc86035, Moebeus: I would say only if it is official...for lyric videos, this would have to be published by the record label or the artist him/herself; for static image videos, if they're on a channel of the form "(artist) - Topic" and the publisher is the one listed in the description "Provided to YouTube by (publisher)". Mahir256 (talk) 17:28, 23 October 2018 (UTC)
@Mahir256, Moebeus: The original thing I was concerned about was that videos are not audio, and so the video might be considered a separate, derivative work in its own right (even if the video is just the static image). On the other hand, if the video is considered to be the medium then it would make more sense to add the video links to items about songs (although this sort of technical correctness here is largely pointless right now, since there are still tens of thousands of songs conflated with their singles). Jc86035 (talk) 08:41, 25 October 2018 (UTC)

From list article

Old-timer as photographer in Commons and editor in WP, newbie to WD. ENWP has many list articles, such as en:National Register of Historic Places listings in Oyster Bay (town), New York of things that are automatically notable there. Some on the list have no photo or article as yet. I like to travel with my phone opened to the Commons App, which shows WD items that have geocoordinates but lack a photo. Is there an automated or semi-automated way to make WD items for those list items, with of course the coordinates, so I and others will see them in the map? Or, if the WD items do exist, how do I find them? Jim.henderson (talk) 17:46, 24 October 2018 (UTC)

@Jim.henderson: I use the toolforge:wikishootme tool to find photography targets. As far as creating bulk sets of items here from e.g. historic place registers, toolforge:mix-n-match can help with that. Sam Wilson 00:42, 25 October 2018 (UTC)
@Jim.henderson: There are a number of tools that can help create missing items; I have had good luck with Wikidata:Tools/OpenRefine - it does a nice job of reconciling your list of entries to any existing wikidata items, before you create new ones. Best of luck! ArthurPSmith (talk) 14:53, 25 October 2018 (UTC)

Label freshness (cache delay)

Today we saw in the WD UI a label "Actor Porno" for teacher (Q37226) (teacher). This comes from this revision, which was reverted just 2h later:

  • 19:10, 18 October 2018 Tacsipacsi (talk | contribs) . . (85,209 bytes) (+38) . . (Undo revision 767088636 by 152.172.115.246 (talk): vandalism)
  • 17:45, 18 October 2018 152.172.115.246 (talk) . . (85,171 bytes) (-38) . . (Changed [en] label, description and aliases: Actor porno, Actor de pornografia uwu, schoolmaster, schoolmistress, educator, professor, school teacher)

The SPARQL endpoint doesn't show this label.

The question is: why does the WD UI show such an obsolete label a whole week after it was reverted? --Vladimir Alexiev (talk) 07:35, 25 October 2018 (UTC)

Complex constraints

Is it possible to have property constraints like "if statement A→B, then also statement C→D" (example: if instance of (P31) → song (Q7366) then disallow instance of (P31) → single (Q134556)); or "if item X has statement A→B with qualifier P→Y, then item Y should have statement A→B with qualifier Q→X" (for follows (P155)/followed by (P156) qualifiers on part of the series (P179) statements)? Jc86035 (talk) 13:52, 25 October 2018 (UTC)

@Jc86035: Yes. See {{Complex constraint}}. --Yair rand (talk) 16:10, 25 October 2018 (UTC)
@Yair rand: Thanks. (I've posted a query request for the latter, since it is far beyond my primitive SPARQL skills.) Jc86035 (talk) 16:38, 25 October 2018 (UTC)
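The reciprocal rule asked about above ("if X is followed by Y within a series, then Y should follow X") can be illustrated offline. The following is only a minimal sketch in Python over an in-memory dictionary, not the {{Complex constraint}} template or a SPARQL query; the function name, the data layout and the item IDs "A", "B", "C" are all hypothetical.

```python
def missing_reciprocals(series):
    """Return (item, qualifier, expected_value) triples that are missing.

    `series` is a hypothetical layout: item id -> {"follows": id or None,
    "followed_by": id or None}, standing in for the P155/P156 qualifiers
    on one "part of the series" statement.
    """
    problems = []
    for item, quals in series.items():
        prev = quals.get("follows")
        # If this item follows `prev`, then `prev` should be followed by this item.
        if prev is not None and series.get(prev, {}).get("followed_by") != item:
            problems.append((prev, "followed_by", item))
        nxt = quals.get("followed_by")
        # If this item is followed by `nxt`, then `nxt` should follow this item.
        if nxt is not None and series.get(nxt, {}).get("follows") != item:
            problems.append((nxt, "follows", item))
    return problems


# Placeholder data: "C" lacks the "follows" qualifier pointing back to "B".
example = {
    "A": {"follows": None, "followed_by": "B"},
    "B": {"follows": "A", "followed_by": "C"},
    "C": {"follows": None, "followed_by": None},
}
print(missing_reciprocals(example))
```

On live data the same check would more naturally be written as a SPARQL query, as mentioned in the reply above.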

Multiple sandboxes for training

I'm going to be delivering a couple of training sessions on Wikidata in the near future: October 20 in Cambridge, and October 25 for Coventry University. I often do a brief introduction to Wikidata when doing Wikipedia training, but these two events will focus on Wikidata. I always prefer to have participants actively trying out editing, rather than just hearing about it, so my usual teaching involves them working in their sandboxes for Wikipedia training. I would very much like to develop active participation in these events, but on Wikidata, users don't have user-sandboxes in their user space. I'd therefore like them to each create one or more "sandbox-items" in mainspace that they can then edit for practice. For example: user:Fred may create Sandbox-Fred Coventry and make test edits there as if it were the Coventry Wikidata entry, without disturbing the real item. Each participant would have their own item(s) to avoid the confusion of multiple people editing the same item, as they would if we used Wikidata Sandbox (Q4115189). After the event, I would request deletion of all the "Sandbox-" items.

Now, would there be any objections to this scheme? Are there issues or complications that I haven't foreseen? Does someone know of a better way? I'm interested in any and all feedback. Cheers --RexxS (talk) 16:18, 12 October 2018 (UTC)

  • We do have three sandboxes. Maybe we could have a few more. The advantage of them being stable is that users don't get confused about the edits. Alternatively, maybe we could generate lists of potential items that could easily be created by new users. --- Jura 16:28, 12 October 2018 (UTC)
    Thank you, Jura. I'd be really interested in generators for lists of potential items for creation by new users, but I'd want that as an addition for the second lesson, after participants are comfortable with the interface. Cheers --RexxS (talk) 16:42, 12 October 2018 (UTC)
  • This seems reasonable, but do you have a mechanism to clearly identify these sandbox items, so they don't cause trouble later? Maybe encourage everybody to make them instance of (P31) Wikidata Sandbox (Q4115189), or add that when you become aware of one? ArthurPSmith (talk) 18:52, 12 October 2018 (UTC)
    @ArthurPSmith: Hopefully, I'll have three mechanisms:
    1. I intend that each item should take the form <"Sandbox-"><name-of-user><" "><name-of-item>. Currently no item begins with "Sandbox-", so they should be easy to track down afterwards.
    2. I will be asking the participants to post a message on my talk page (to experience talk page conversations). That will allow me to scan their contributions and I'll quickly spot any edits to items not fitting the pattern I prescribed.
    3. I usually have some of the session time where participants work freely on a topic that interests them, and I work my way round everybody, helping out and checking on what they are doing. That tends to be a safety-net where I can spot those who have problems (and they are the ones most likely to create items outside of the pattern).
    It's not foolproof, but hopefully will avoid leaving things for others to have to clean up. --RexxS (talk) 20:03, 12 October 2018 (UTC)
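    As an illustration of mechanism 1, labels following the "Sandbox-" pattern could be picked out mechanically. This is a rough sketch in Python; the function name is hypothetical, and it assumes the user name contains no spaces, which will not hold for every account.

```python
import re

# Pattern from the proposal: "Sandbox-" + user name + " " + item name.
# Assumption: the user name has no spaces, so the first space separates
# the user name from the item name.
SANDBOX_LABEL = re.compile(r"^Sandbox-(?P<user>\S+) (?P<item>.+)$")


def parse_sandbox_label(label):
    """Return (user, item) for a sandbox-style label, or None otherwise."""
    match = SANDBOX_LABEL.match(label)
    if match is None:
        return None
    return match.group("user"), match.group("item")


print(parse_sandbox_label("Sandbox-Fred Coventry"))
print(parse_sandbox_label("Coventry"))
```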
    •  Oppose as far as I'm concerned. --- Jura 20:08, 12 October 2018 (UTC)
      Care to elaborate? --RexxS (talk) 20:47, 12 October 2018 (UTC)
      • If you occasionally create test items with random statements, any Wikidata user who queries the database can end up getting them in their results and polluting them. In addition to my first suggestion, you could also create random items at https://test.wikidata.org --- Jura 20:54, 12 October 2018 (UTC)
        I can see the issue with potential result pollution, even for a brief session, so I'm sympathetic to that point. I agree that it wouldn't scale well. I've always shied away from using test.wikidata.org because of the possibility that the interface may be altered significantly by testing, but it looks quite comparable right now. I don't know anybody who has successfully used it for classroom work, but I think it's definitely worth trying out for at least one event. Thanks for the tip. --RexxS (talk) 22:32, 12 October 2018 (UTC)
  • I don't see an issue since you plan to RFD them after the event. I'm not sure I'd be particularly bothered without that caveat either, but I haven't thought it all through. --Izno (talk) 20:51, 12 October 2018 (UTC)
  • If the existing sandboxes aren't enough, then I think https://test.wikidata.org/ would be better, because people can happily test without messing up the live data and without having to worry about being reverted or even blocked or the items being deleted while they're trying to test. - Nikki (talk) 09:30, 13 October 2018 (UTC)
  • I don't see a good reason why you have to train with fake data. If you take real data that is currently missing from Wikidata, it would be more motivating for the people in your classroom.
    You might, for example, take books that are currently in Wikidata, that are only tagged as books and that have ISBN numbers, and then ask your students to fill the items with more true information. ChristianKl 15:03, 13 October 2018 (UTC)
    Thanks for that suggestion, ChristianKl. Many years of teaching Wikipedia editing have made me very cautious about letting new editors loose on a wiki without first letting them gain experience in a sandboxed environment. I'm also trying to develop a general course structure (with example lessons) that can be reproduced as many times as needed by other trainers. That means I need to be working with fake/sandboxed data that can be wiped clean before the next group of participants.
    I do agree that it's more motivating to work with real data, and I certainly would want to do that with any group, as soon as they've taken on board the structured lesson that starts them off. In many cases, these sorts of events have a specialist target audience ("Wikidata in the sciences and humanities" for Wikidata:ContentMine/Cambridge Wikidata Workshop and "Learning on the Open Web" for Coventry University), so many participants will already have access to a lot of data relevant to themselves, and we ought to tap into that expertise. Cheers --RexxS (talk) 13:10, 19 October 2018 (UTC)
  • For test.wikidata.org , you might need to define some items or properties before the course. (Maybe the bot running there could be configured to do that on each reset).
    The idea of looking for items with few statements seems like a good one as well, e.g. http://petscan.wmflabs.org/?psid=6118865 . You could also use categories with articles that need items. --- Jura 14:15, 14 October 2018 (UTC)
    Thanks again, Jura. I've looked around test.wikidata.org in the past and most things are in place – I've just added a description for 'point in time' there. I expect I'll have to actually run the first course to find where gaps may be, but I'll be looking out for them. Cheers --RexxS (talk) 13:10, 19 October 2018 (UTC)
    • @RexxS: thinking it over, using items with few statements here might be the better way to start. I don't think one can break much. It might be good to avoid using items about living people. --- Jura 04:37, 25 October 2018 (UTC)
      @Jura1: As it turned out, using test.wikidata.org was surprisingly successful. It helped that the participants understood that they were working in a test environment, which removed any hesitancy about making their first edits. They all soon became far more confident and capable, so I was no longer worried about them adding to items here. I'm pretty happy about how that went, so maybe the next time I'll move back from test.wikidata to here sooner and hopefully improve some items meaningfully. Cheers --RexxS (talk) 20:36, 25 October 2018 (UTC)

Suggestions based on constraints

Hello all,

A few months ago, we enabled suggestions based on constraints values for the constraints section of a property. We would like to explore further this possibility of having better suggestions for entities, that’s why we created a beta feature.

If you go to your Preferences and check the beta features list, you’ll be able to enable "Entity suggestions from constraint definitions".

At the moment there are two constraint types that are used for generating suggestions:

You can learn more on this page.

Feel free to try it, and let us know if you find it relevant and useful.

If you find any issue, feel free to report it in this ticket. Lea Lacroix (WMDE) (talk) 11:58, 17 October 2018 (UTC)

@Lea Lacroix (WMDE): Nice... Another thing: I am trying to use the constraints on Property:P778#P2302, and one problem I see is that the world outside Wikidata is not as pure and beautiful as I would like it to be, so for 3500 Swedish Church parishes I get a lot of exceptions to my constraints; see Wikidata:Database_reports/Constraint_violations/P778 (I am still cleaning). Given this experience, I feel an easy way to move things into the exception list would be nice. - Salgo60 (talk) 12:54, 17 October 2018 (UTC)
--- Jura 14:19, 20 October 2018 (UTC)

Clarification of "structured data" in main, property, and lexeme namespaces for CC0 license

Quoting from the terms of use banner at the bottom of an item page, "All structured data from the main, property and lexeme namespaces is available under the Creative Commons CC0 License; " Is there any unstructured data on main, property, and lexeme namespace pages? I can understand how the linked data itself might be a Google Doodle, for instance, and not covered, but is every character in the .json dump, for example, under CC0? Specifically, do the description and label for an item page and a property page count as unstructured data? --Notabotyet (talk) 00:25, 25 October 2018 (UTC)

As far as I know, it means all data from those namespaces but I can see how the wording is confusing. @Lydia Pintscher (WMDE): Perhaps the wording should be changed? - Nikki (talk) 09:33, 25 October 2018 (UTC)
Yes all of the data - including labels and descriptions and aliases. I'm not sure how to change it to make it clearer. Anyone got any suggestions? --Lydia Pintscher (WMDE) (talk) 03:40, 26 October 2018 (UTC)

Transliterations

Are McCune-Reischauer romanization (P1942) and similar properties supposed to be used as statements or as qualifiers? I've moved them to qualifiers of title (P1476) on Gangnam Style (Q890) but I'm not sure if this was correct. Jc86035 (talk) 08:19, 26 October 2018 (UTC)

They should generally be qualifiers, so that the transliterations are linked to the corresponding original text. - Nikki (talk) 09:04, 26 October 2018 (UTC)
@Nikki: For the property's example, Park Geun-hye (Q138048), would the qualifiers be duplicated for each value of name in native language (P1559) (one each for Hangul and Hanja)? Jc86035 (talk) 10:05, 26 October 2018 (UTC)
No idea. I don't think the hanja should be a separate statement at all (I proposed a property a long time ago, Wikidata:Property proposal/Archive/43#hanja, but it wasn't accepted), so my opinion is that it's badly modelled whether you duplicate the qualifiers or not. - Nikki (talk) 10:19, 26 October 2018 (UTC)

coordinate location

Hi, Is there a way to add coordinate location (P625) with quick statements? We have the data in the form of decimal numbers for lat/lon. Thx, Jane023 (talk) 15:26, 26 October 2018 (UTC)

@Jane023: This works:
qid,P625
Q3669835,@043.26193/010.92708
- Salgo60 (talk) 16:54, 26 October 2018 (UTC)
It's been well documented in Help:QS. Matěj Suchánek (talk) 17:10, 26 October 2018 (UTC)
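For a larger batch, rows like the one above can be generated from decimal lat/lon values. This is a minimal sketch in Python, assuming the `@LAT/LON` value syntax shown in the reply; the function names and the five-decimal default are illustrative assumptions, and the output is not zero-padded like the example above.

```python
import csv
import io


def qs_coordinate(lat, lon, digits=5):
    """Format decimal degrees as a QuickStatements coordinate value, @LAT/LON."""
    return f"@{lat:.{digits}f}/{lon:.{digits}f}"


def build_qs_csv(rows):
    """Build CSV text with a `qid,P625` header from (qid, lat, lon) tuples."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["qid", "P625"])
    for qid, lat, lon in rows:
        writer.writerow([qid, qs_coordinate(lat, lon)])
    return buf.getvalue()


# Same item and coordinates as in the reply above.
csv_text = build_qs_csv([("Q3669835", 43.26193, 10.92708)])
print(csv_text)
```

The number of decimals should match the real precision of the source data, since over-precise coordinates have been a recurring complaint on this page.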

Needs an explanation

On 2 August 2013 User:RobotMichiel1972 added incorrect coordinates to Chicago Theatre (Q642866) and cited them to the English Wikipedia. But the coordinates were correct (and more importantly, different) at that time in the English Wikipedia article, if one checks the article's history. What is the genesis of this error, and how do we know that the bot didn't systematically introduce thousands of such errors? Abductive (talk) 04:17, 26 October 2018 (UTC)

Perhaps ask the bot operator? For those who can't see it, the coordinates were imported from this revision and the difference is an arcsecond for both latitude and longitude... Matěj Suchánek (talk) 06:54, 26 October 2018 (UTC)
The bot operator is no longer editing. The only acceptable outcome is correcting errors. I have repeatedly pointed out errors on coordinates on Wikidata here in Project Chat. What kind of database allows systemic errors to persist without an investigation? Abductive (talk) 06:03, 27 October 2018 (UTC)
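To get a feel for how large a one-arcsecond discrepancy is, the offset between two coordinate pairs can be estimated. This is only a rough sketch in Python: it uses a flat approximation of about 30.9 m per arcsecond of latitude (1852 m per arcminute), the function names are hypothetical, and the coordinates are placeholders rather than the values from the item or the article.

```python
import math

# Approximation: one arcminute of latitude is about 1852 m.
METRES_PER_ARCSEC = 1852.0 / 60


def arcsec_diff(a_deg, b_deg):
    """Absolute difference between two decimal-degree values, in arcseconds."""
    return abs(a_deg - b_deg) * 3600


def approx_offset_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance in metres between two nearby points."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    north = arcsec_diff(lat1, lat2) * METRES_PER_ARCSEC
    # Longitude arcseconds shrink with the cosine of the latitude.
    east = arcsec_diff(lon1, lon2) * METRES_PER_ARCSEC * math.cos(mean_lat)
    return math.hypot(north, east)


# Placeholder points that differ by one arcsecond in both latitude and longitude.
lat, lon = 41.885, -87.6275
offset = approx_offset_m(lat, lon, lat + 1 / 3600, lon + 1 / 3600)
print(round(offset, 1), "metres")
```

A check like this could be run over a sample of a bot's imports to see whether similar offsets appear systematically.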

Subscribers

Have I done Q5608#P3744 correctly? (Should the conflicts-with-P31-Q5 constraint be removed from number of subscribers (P3744)? Is there a better property than online service (P2361) to indicate that it's for Twitter?) In general, is there a better way of showing this information / should this be removed and instead recorded on Commons as tabular data, as suggested in #Some suggestions about property "YouTube channel ID"? Jc86035 (talk) 08:00, 27 October 2018 (UTC)

It's much more common to add number of subscribers (P3744) as a qualifier to Twitter (X) username (P2002) instead, as with Q192912#P2002 --Oravrattas (talk) 09:16, 27 October 2018 (UTC)
@Oravrattas: I'd thought a lot of those were added relatively recently. I know that's much more common – I changed it from that as a (sort of?) test, because otherwise the old value has to be deleted every time the data is updated. Jc86035 (talk) 11:43, 27 October 2018 (UTC)
Ah, sorry, I wasn't aware of that context. In theory I guess you could continue to add new differently-qualified Twitter (X) username (P2002) statements each time the value changes (and mark the current one as preferred), rather than deleting the old ones, but I suspect that if you're wanting to keep a constantly changing history of things like this, then the Tabular Data approach might indeed be a better route. --Oravrattas (talk) 11:58, 27 October 2018 (UTC)

Link "other Websites"

Moin Moin, I'm currently unable to insert or delete a link in "other websites" in every object. Problem known or have I skipped something? Regards --Crazy1880 (talk) 10:18, 27 October 2018 (UTC)

I've been wondering the same. — Luchesar • T/C 10:31, 27 October 2018 (UTC)
The developing team has been notified. --MB-one (talk) 10:34, 27 October 2018 (UTC)
This should be fixed now. Sorry. --Lydia Pintscher (WMDE) (talk) 16:28, 27 October 2018 (UTC)

Commons link to Geddy Lee

Hi, why can't I add Commons:Category:Geddy Lee to Geddy Lee under "other sites"? When I click the "add" button, I don't get the "wiki" box like I normally do. Thx. --JuTa (talk) 14:44, 27 October 2018 (UTC)

Temporary problem, see two topics above... --ValterVB (talk) 14:53, 27 October 2018 (UTC)
I'm not sure why the category won't add, but it looks like ValterVB has an answer on that.
However, since there is a gallery on Commons, that is what should take the sitelink. (And did, until you removed it).
If you want to sitelink a Commons category as well (eg to make an infobox work), create an item for the category here, and connect it to the main item with a category's main topic (P301) / topic's main category (P910) reciprocal pair. Jheald (talk) 14:57, 27 October 2018 (UTC)
I tried to undo my change - see here - without any effect. I solved the problem on Commons by adding the qid= value to the Wikidata infobox at Commons:Category:Geddy Lee. --JuTa (talk) 15:02, 27 October 2018 (UTC)
This should be fixed now. Sorry for the trouble. --Lydia Pintscher (WMDE) (talk) 16:27, 27 October 2018 (UTC)

Commons => WD

Is it no longer possible to link Commons categories to WD items? OT38 (talk) 15:42, 27 October 2018 (UTC)

We're working on fixing that right now. Sorry for the trouble. The ticket for it is phabricator:208124. --Lydia Pintscher (WMDE) (talk) 16:13, 27 October 2018 (UTC)
This should be fixed again now. --Lydia Pintscher (WMDE) (talk) 16:27, 27 October 2018 (UTC)
The icon with the link to Commons at the centre left of the welcome page of the French WP has disappeared. Do you have any news? OT38 (talk) 17:40, 27 October 2018 (UTC)

Suggestion/proposal for future development: partial protection, protecting labels

It would be useful to protect labels (somehow), for example those of items that are an instance of given name (Q202444) or family name (Q101352) (or subclasses of these), at least the labels already included in an item. They do not need much updating. Commons is now using our labels (the English one in particular) to categorize its biographical categories by name.

This protection would avoid this yet-to-be-created category being filled with every guy named Ali after edits such as this vandalism. strakhov (talk) 14:04, 27 October 2018 (UTC)

Classification of songs / compositions

Wikidata usually has one item for single, song (track/recording), composition and music video due to their creation based on the existence of Wikipedia articles.

  1. Based on this discussion, the music video, single and song should be separated. In the near future, how should this be dealt with? Most songs still have instance of (P31) → single (Q134556) to indicate that they were released as singles, and have the artists' singles chronologies placed in follows (P155) and followed by (P156). This is strictly incorrect (I've created a new item for the single and used P155/P156 as qualifiers to part of the series (P179) → [artist's singles discography]), but removing these without doing the work to create items for the singles would cause data loss (although the data could be re-imported from Wikipedias properly at a later stage).
  2. Is an audio track (Q7302866) considered separate from or the same as a composed musical work (Q207628), given that most contemporary audio tracks usually have music production (Q959049), which makes them more similar to compositions than to raw recordings? (I've been using instance of (P31) → song (Q7366), which is quite vague.)
  3. With streaming and downloads, is a track on its own also considered a musical release (Q2031291)? Would separate items need to be created for the track downloads?
  4. English Wikipedia generally defines a single after about 2002 as being a song or songs which was/were given heavier promotion than others by an artist and thus labelled a single. For example, Death of a Bachelor (Q21198147), Getaway Car (Q56598871) and Secret Love Song (Q21775880) are all considered singles (due to radio promotion) despite not receiving separate digital or physical releases. Would these items get separate items for being released as singles? Would the singles chronologies be placed on the item about the song/track?

Jc86035 (talk) 09:00, 26 October 2018 (UTC)

@Jc86035: As I understand your problem, this is similar to the work/edition/exemplar problem for books. My proposal is to create a work item for each song and then additional items for each version, such as a single. All data related to the version would be stored in the version item, but general data like author or genre would be saved in the work item. Books use the following items to classify works and versions: written work (Q47461344) and version, edition or translation (Q3331189). See Wikidata:WikiProject Books for more information. Snipre (talk) 09:49, 26 October 2018 (UTC)
@Snipre: I agree, and I've done this for a few items already (as have others) – but there are tens of thousands of songs which are only labelled as singles through instance of (P31), and it would probably require a fairly complicated bot run and/or a lot of manual work to fix this. See also the deletion discussion for P1432 (P1432). Jc86035 (talk) 10:01, 26 October 2018 (UTC)
@Jc86035: Don't care about how the data is now, just define an appropriate model and make sure that the model is displayed in the best place in order to be useful. Snipre (talk) 13:47, 26 October 2018 (UTC)
@Snipre: Would that be a subpage or the main page of Wikidata:WikiProject Music, or somewhere else? I've already placed some example items at Help:Modelling/Arts. Jc86035 (talk) 14:03, 26 October 2018 (UTC)
(also, noting that 2–4 are separate issues which should also be resolved.) Jc86035 (talk) 14:04, 26 October 2018 (UTC)
@Jc86035: This should be defined by the project Music. Snipre (talk) 14:39, 26 October 2018 (UTC)

WikiProject Music has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

@Snipre: The project isn't too active, I'd think. Since I started the discussion here I've pinged the list of members. Jc86035 (talk) 14:58, 26 October 2018 (UTC)
When this is resolved, could someone please rework the content in Help:Modelling/Arts accordingly? Or work with me to do so? Thanks. This is probably an area where content is often modeled by people without a lot of WD experience. - Jmabel (talk) 16:20, 26 October 2018 (UTC)

@Jc86035 I'm very interested in helping sort this out! Maybe we could start with properly defining recordings/tracks on the Project Music style pages? A digital track on its own is not a release imo, as reflected by Spotify, iTunes and others in this space. It might be boring to create a single release with only one track in it, but this accurately reflects how it's being modeled by the industry. And a track is definitely separate from a composition, in almost every respect. Moebeus (talk) 16:21, 26 October 2018 (UTC)

@Moebeus: I think it would be simplest conceptually (and possibly more correct?) to define the composition to be the same as the track/recording, since this avoids creation of one item for the work/melody/lyrics and one item for the track/instrumentation/production. I mentioned this in the previous discussion – this would also resolve the issue of whether covers that are musically different are still just recordings of the original (they wouldn't be). Jc86035 (talk) 16:32, 26 October 2018 (UTC)
Yes, actually having definitions would really help. WikiProject Music doesn't have any subpages yet, though – maybe on Help:Modelling/Arts? Jc86035 (talk) 16:37, 26 October 2018 (UTC)
@Jc86035 I'm strongly against mixing the concept of a recording with that of a composition (going forward, I'm aware this is how it is now) as this would go against any established model and industry standard out there. But I'm all for a discussion on how to best solve it! Is there a more flexible way to do this though, something like a slack channel or something to have an easier discussion before putting forward a more "official" proposal? I'm not familiar with Help:Modelling/Arts, I'll check that out. Moebeus (talk) 16:50, 26 October 2018 (UTC)
@Moebeus: There is a Discord chatroom (which I've used), but there is currently no Wikidata-specific channel for it. I think that could be useful (if a few more people join it, anyway) since otherwise it might take quite a while to figure out how everything is supposed to work.
Is the composition defined as the sheet music, as the melody and lyrics, or something else? Would the recording be considered to be a version of the composition? If an artist has multiple recordings which are similar to each other (e.g. Make You Feel My Love (Q56085788), studio recording/cover by Adele; Make You Feel My Love (Q56085861), live recording by Adele), but which are both recordings of a modified version of someone else's song (Make You Feel My Love (Q1886329), song/recording by Bob Dylan), then how many items should be involved? Jc86035 (talk) 16:55, 26 October 2018 (UTC)
@Jc86035 For now I'm using Property:P2550 ("recording or performance of") on the recording to tie it to the composition. The composition is defined as per ISWC (International Standard Work Code). The recording likewise has a definition in ISRC (International Standard Recording Code). If our model more or less complies with this, we'll be fine to do more automation in the future. Moebeus (talk) 17:04, 26 October 2018 (UTC)
@Moebeus: For the purposes of that property (and others), would production, synthesizers, etc. be encompassed by "recording"? Jc86035 (talk) 17:11, 26 October 2018 (UTC)
@Jc86035 Yes. Substantial new arrangements might trigger a new composition, but not always. (p.s. - I'm having a look at Adele now, so don't stress if it looks a little messy for a few minutes ;-) Moebeus (talk) 17:22, 26 October 2018 (UTC)
@Moebeus: There are at least four distinct versions of the album; I'd think we would need to create items for all of them if we were to connect all the identifiers (ASIN, UK iTunes, MusicBrainz...) for those editions. Jc86035 (talk) 17:53, 26 October 2018 (UTC)

@Moebeus: Should sitelinks go to the composition or to the recording? Most articles' infoboxes have a mixture of the information but probably fit closest to the recording. Jc86035 (talk) 05:00, 27 October 2018 (UTC)

@Jc86035 This is a tricky one but here's what I do most of the time: If the Wikipedia page name is "Nnnnnnnnnn (song)" I link it to the original composition. If the WP name is "Nnnnnnnnnn (single)" I link it to the release. If the name is "Nnnnnnnnnn (Elvis song)" it's a little trickier: then I read the article to see if it's specifically about the Elvis version. If yes, I would link it to Elvis' recording. Unfortunately there is no foolproof answer here, as some articles are really about a bunch of singles, and others are more in-depth about stuff to do with the actual composition. Moebeus (talk) 08:49, 27 October 2018 (UTC)
@Moebeus: English Wikipedia doesn't have this distinction – PetScan says there are only 17 articles (of 60,325) which have titles using "single" for disambiguation; I'll probably be moving them to align them with the other 60,308. I guess it would make more sense to just have all sitelinks go to compositions, and then future Wikidata infoboxes could pull data from the earliest recording. (Alternatively if a composition is first published in the form of the recording then the infoboxes could use that?) Jc86035 (talk) 09:01, 27 October 2018 (UTC)
Overall it seems that except for single albums (South Korean maxi singles), all single releases are either referred to by their A-side (as songs) or have separate articles for all of their tracks. Jc86035 (talk) 09:19, 27 October 2018 (UTC)

Also, should writer/lyricist be added to both compositions and recordings? Jc86035 (talk) 08:45, 27 October 2018 (UTC)

@Jc86035 My general rule: the lyricist belongs on the composition while lyrics identifiers (like MetroLyricsID) belong on the recording. That is because most lyric sites tie to the recording artist's version, and not the original composition. Moebeus (talk) 08:53, 27 October 2018 (UTC)
Okay. I have a couple of my own items to clean up, then. Jc86035 (talk) 09:01, 27 October 2018 (UTC)

@Moebeus: Why have you linked specific versions of Adele's singles through the chronology? I would've thought that that information would have been left for edition items. Jc86035 (talk) 13:50, 28 October 2018 (UTC)

@Jc86035 I'm not done, but feel free to rearrange as you see fit. As you've probably experienced yourself WD is really suffering from not having the concept of "release group" or "master release" as MusicBrainz/Discogs/Allmusic operates with. "first edition" might work well for books, but for an international artist that might release an album in 50 countries at the same time that doesn't really scale. Moebeus (talk) 13:55, 28 October 2018 (UTC)

@Moebeus: Is Make You Feel My Love (Q56085802) not the Wikidata equivalent, so to speak, of a release group? Should a new item be created to represent the "release group" concept? (Also, is there a good way of choosing a "master"? I imagine many more people have streamed Adele's singles than have bought the vinyls.)
I think just using new items to define things which are structurally convenient would solve some of the other issues I posted at the top – for example, creating "track with individual download"/"track available for streaming" and "radio single" might solve or ameliorate #3 and #4 respectively (although there are probably better solutions, especially for #3). Jc86035 (talk) 14:14, 28 October 2018 (UTC)
@Jc86035 I think you've modeled Make You Feel My Love (Q56085802) perfectly! Yes, I think that a new item should be created to represent the release group, possibly both a "master single" and a "master album", maybe a "master EP" as well. The reason I haven't done it myself is I think it's a pretty big change that needs some consensus among the people on here interested in music. I also thought that it might be better to do it the other way around: create a "release single/LP/EP", as most of the existing entries in WD already function as kind of catch-all "masters". This would also be less disruptive. What do you think? Moebeus (talk) 14:28, 28 October 2018 (UTC)
@Moebeus: I'm not sure. I think both of them would make sense, but the release group/master edition(s) structure might be easier to manage, and with that structure it might be easier(?) to imply that there is in fact no "master" edition of a particular release.
I don't think it would be less disruptive: pretty much almost every song item will need to be separated out into release(s), recording(s), video (if any) and composition at some point, and only 1,762 items use tracklist (P658) out of tens of thousands of albums (with either approach, most of those would probably need to be changed anyway, and most of the rest would just remain incomplete unless I'm misunderstanding something). I think the "master edition" item might not even be necessary if preferred rank is used on has edition or translation (P747). Jc86035 (talk) 14:44, 28 October 2018 (UTC)
@Moebeus: Have I used performer (P175) correctly at Written in the Stars (Q57898036), or should there be one statement for each performer/track combination? Jc86035 (talk) 15:01, 28 October 2018 (UTC)
@Jc86035 I recommend using Property:P1706 to indicate featured artist and Property:P767 to list individual musicians, technicians, etc. ("Personnel"). The way performer (P175) is currently being used it should really be renamed "Release Artist" imo. Moebeus (talk) 15:14, 28 October 2018 (UTC)
@Moebeus: How does P1706 work if there is more than one main artist (which claim gets the qualifiers)? Usually the "featured artist" is one of the performers anyway so I don't think performer (P175) would be the problem.
Does the composition or the recording get NAACP Image Award for Outstanding Song (Q1659596)? I think it's probably safer to go with the recording for most awards that aren't specifically about honoring the songwriters. Jc86035 (talk) 15:22, 28 October 2018 (UTC)
@Jc86035 It certainly looks like it's the performer and not the songwriter taking home the prize, so that would indicate the recording to me. I haven't really checked the source website though, so don't take my word for it. Moebeus (talk) 15:31, 28 October 2018 (UTC)
@Moebeus: Do we need an equivalent property to samples from work (P5707) for work–work relationships? For Popular Song (Q57899540) I've used has melody (P1625), but the words, meter, and so on are also being quoted because of the P5707 relationship of the tracks and P1625 doesn't seem to express that. (edit – I've removed P1625 because it seems to be for the whole melody being used, which is a bit different.) Jc86035 (talk) 15:56, 28 October 2018 (UTC)
@Mahir256, Valentina.Anitnelav: thoughts? Jc86035 (talk) 14:44, 28 October 2018 (UTC)

problem with format (Q2085518)

WikiProject Books has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Currently, book format (Q18602566) is a subclass of format, but the latter is a subclass of "arranging of data" and, furthermore, is said to be the same as data format.--Malore (talk) 23:55, 26 October 2018 (UTC)

Thanks! Jane023 (talk) 12:41, 27 October 2018 (UTC)
@Jane023: What's the solution?--Malore (talk) 02:52, 28 October 2018 (UTC)
Funny - my reply went to the message above this one but due to a time lag got logged in the wrong place. I didn't even see this question until your alert. Sorry for the confusion! (nice to know that project chat is so popular that things like this happen) Jane023 (talk) 10:57, 28 October 2018 (UTC)

Property for normalised competition scores?

Do we have a property for a 'normalised score', the use case being instances of the Sweet Adelines International quartet competition (Q57775861) such as Q57780859#P1346 ... the competition changed its scoring method over time, notably bouncing the maximum absolute score from 3200 to 4800, and values in between. Expressing normalised scores (e.g. percentiles) facilitates comparisons across time. --Tagishsimon (talk) 23:04, 27 October 2018 (UTC)

Thank you user:Tagishsimon for raising this here.
With this specific example, you can see in this Wikipedia article table (from 1979 onwards in the “score” and “%” columns) how the score achieved by the winner is not comparable year-to-year. If you sort the table to show the ‘most number of points ever obtained’ that does NOT produce the same thing as ‘best score ever achieved’. Consequently, the more important measurement that community uses is the “% of maximum score”. (Here is an explanation of the scoring system changes over the years to show why that is the case).
User:Harmonia Amanda mentioned to me that in Ice Skating competitions there is a similar problem with the changing scoring system over the years. However in that case the different systems are [potentially] notable enough to have different classification systems in Wikidata in their own right. That would allow for qualifier statements to explain which score regime is applicable, an elegant solution, but not one which would be applicable for less “famous” things, I believe. Wittylama (talk) 08:11, 28 October 2018 (UTC)
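
To make the "% of maximum score" idea concrete, here is a minimal sketch in Python. The function name and the sample scores are hypothetical; only the two maximum values (3200 and 4800) come from the discussion above. It is not a proposal for how such a value should be stored in Wikidata.

```python
def percent_of_maximum(score: float, maximum: float) -> float:
    """Return a score as a percentage of the maximum achievable score."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * score / maximum


# Hypothetical winners under two different scoring regimes.
older = percent_of_maximum(2800, 3200)   # fewer points, older 3200-point scale
newer = percent_of_maximum(3900, 4800)   # more points, newer 4800-point scale

# The newer winner has more raw points but a lower normalised score.
print(older, newer)
```

With these made-up figures the raw points and the normalised scores rank the two winners in opposite order, which is the reason a separate normalised value is useful for comparisons across time.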

Primary sources tool

Why doesn't the primary sources tool usually provide a reference to the dataset from which the data was collected (particularly for properties like ethnic group (P172))? I didn't know about the tool before, but it seems really problematic to allow users to import all sorts of data without recording where it was actually imported from in Wikidata itself. Jc86035 (talk) 09:53, 28 October 2018 (UTC)

It wasn't sourced in Freebase or the given reference has been blacklisted. Sjoerd de Bruin (talk) 10:36, 28 October 2018 (UTC)

tinyurl no longer works with wikidata query due to fbclid

The facebook click id breaks the tinyurl service for Wikidata query. This is only for tinyurls shared on fb, but just wanted to report that it happens in chat windows as well as fb posts. I understand there is an alternative service using google url shorteners, but that may be disappearing soon. Jane023 (talk) 11:39, 28 October 2018 (UTC)

Currently, they are used both as main values and as qualifiers. Wouldn't it be better to use them only as qualifiers?--Malore (talk) 20:13, 28 October 2018 (UTC)

I'd prefer to keep academic degree (P512) this way, people mostly don't earn their doctorate on the same day their study stopped. Sjoerd de Bruin (talk) 20:18, 28 October 2018 (UTC)

QuickStatements v2 now supports lexicographical data

As a birthday present from Magnus Manske and me, QuickStatements v2 now supports editing lexicographical data on the statement level: lexeme, form and sense IDs can be specified as the subject of statements to be added or removed, or as the values of statements, qualifiers or references. For example, the following code produced this diff:

L123|P5188|L123
L123-F3|P5189|L123-F3
L123-S2|P5979|L123-S2

Editing lexeme lemmas, languages, or lexical categories, form representations or grammatical features, or sense glosses is not supported yet; neither is creating new lexemes, forms and senses.
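
For anyone preparing a larger batch, the three lines above could be generated with a small script. This is only a sketch: `qs_line` is a hypothetical helper, not part of QuickStatements, and the pipe separator and the IDs are simply copied from the example. The ID check assumes the usual shapes for lexeme (L123), form (L123-F3) and sense (L123-S2) IDs.

```python
import re

# Assumed ID shapes: lexeme L123, form L123-F3, sense L123-S2.
LEXEME_ENTITY_ID = re.compile(r"^L\d+(-[FS]\d+)?$")
PROPERTY_ID = re.compile(r"^P\d+$")


def qs_line(subject: str, prop: str, value: str) -> str:
    """Build one pipe-separated statement line (hypothetical helper)."""
    if not LEXEME_ENTITY_ID.match(subject):
        raise ValueError(f"not a lexeme, form or sense ID: {subject!r}")
    if not PROPERTY_ID.match(prop):
        raise ValueError(f"not a property ID: {prop!r}")
    return "|".join([subject, prop, value])


batch = "\n".join([
    qs_line("L123", "P5188", "L123"),
    qs_line("L123-F3", "P5189", "L123-F3"),
    qs_line("L123-S2", "P5979", "L123-S2"),
])
print(batch)
```

The script only builds text; the resulting lines would still need to be pasted into QuickStatements and run there.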

Pinging VIGNERON, Jura1 and Vesihiisi, who requested this. (Probably others too that I’m not aware of. I’ll copy this to Wikidata talk:Lexicographical data anyways, so hopefully the right people should see it either way.)

Go forth and edit! (at a reasonable pace (and preferably with sources)) --Lucas Werkmeister (talk) 21:53, 28 October 2018 (UTC)

Precisions (or lack of it)

Just noticed this diff on my watchlist. Definitely nothing to say about it, seems totally legit. The reason I speak here is the value of the statement: 2 622 433 959 604.16 American dollars! That's a huge number with an extraordinary precision of one cent :) How precise is this, really? My gut feeling is that the real precision is far lower than this… Is the World Bank data really that confident in its precision? TomT0m / talk page 09:16, 29 October 2018 (UTC)

@TomT0m: The data download doesn't actually specify how accurate the figures are, so not giving a precision/uncertainty value is probably correct. Jc86035 (talk) 12:35, 29 October 2018 (UTC)
@Jc86035: it's the same for the "life expectancy" data on the same item, and in that case some figures definitely lack meaning ... I think the least we can do is to qualify such statements, where the source is unclear, with sourcing circumstances (P1480), maybe with unknown value, or maybe do a more careful reading of the source's documents about the dataset (I remember having found a document that explains how to compute the precision for the French demographic institute, but the numbers were published in the dataset; that said, I did not make use of this /o\). TomT0m / talk page 16:58, 29 October 2018 (UTC)
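
As an illustration of the gap between the digits shown and the digits that may be meaningful, here is a small Python sketch that rounds the figure quoted above to a chosen number of significant digits. The choice of three or five digits is purely an assumption for the example; as noted above, the source does not state its actual precision, and `round_to_significant` is a hypothetical helper.

```python
import math


def round_to_significant(value: float, digits: int) -> float:
    """Round value to the given number of significant digits."""
    if digits < 1:
        raise ValueError("digits must be at least 1")
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 12 for a number around 2.6e12.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


gdp = 2622433959604.16  # the value quoted in the thread, in US dollars

print(round_to_significant(gdp, 3))
print(round_to_significant(gdp, 5))
```

Rounding like this changes the stated value, so it is not a substitute for recording an uncertainty alongside the figure as published.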

Is the period (full stop) necessary after the initials in author or author string?

I'm wondering if there is a guideline or practice that tries to reduce the use? It looks like the search tools are agnostic to the presence of the punctuation. I'm not trying to advocate for or against. -Trilotat (talk) 09:51, 29 October 2018 (UTC)

@Trilotat: The English Wikipedia's Manual of Style recommends the periods; see w:Wikipedia:Manual of Style/Biography#Initials. Help:Label (a proposed guideline) offers no guidance on this matter. Usually I would expect that the English Wikipedia policies and guidelines would be far more comprehensive (or, viewed cynically, much more bureaucratic). Jc86035 (talk) 12:31, 29 October 2018 (UTC)
As I understand it (I may be wrong), "author name string" is basically a holding form of "stated as" when the author doesn't have an item, so it should be exactly as stated in the primary source material (if available) or whatever source the string is extracted from. Circeus (talk) 14:36, 29 October 2018 (UTC)
ETA, since I didn't notice you were asking about the author property too: You can't pick one for author; the form that shows is whichever has been entered for your set language (default English), otherwise you get the Q number. Item names do take punctuation into account when searching. It will only be agnostic if the alternate version has been entered as an alias (in the "also known as" column, which is also searched).
@Trilotat: Oh, sorry, I didn't read the section header properly. I would suggest doing whatever the Manual of Style says, except in fields which are supposed to be for direct quotations, like "author name string" and "stated as". Jc86035 (talk) 15:02, 29 October 2018 (UTC)

Happy birthday Wikidata!

Hello all,

Today is the sixth anniversary of Wikidata. I'd like to take this opportunity to thank all of you for working so hard on the content and structure of Wikidata, as well as building tools, doing outreach, bringing new people into the community :)

On Wikidata:Sixth Birthday you will find a message from Lydia, the events happening around the world, the birthday presents and messages from the community.

Cheers, Lea Lacroix (WMDE) (talk) 10:35, 29 October 2018 (UTC)

Technical Advice IRC Meeting (changed time)

We'd like to invite you to the weekly Technical Advice IRC meeting. The Technical Advice IRC Meeting is a weekly support event for volunteer developers. Every Wednesday, two full-time developers are available to help you with all your questions about Mediawiki, gadgets, tools and more! This can be anything from "how to get started" over "who would be the best contact for X" to specific questions on your project.

The Technical Advice IRC meeting is every Wednesday 4-5 pm UTC as well as on every first Wednesday of the month 11-12 pm UTC. If you know already what you would like to discuss or ask, please add your topic to the page of the next meeting. Cheers, -- Michael Schönitzer (WMDE) (talk) 12:56, 29 October 2018 (UTC)

Ability to clone/copy items?

Is there any tool which I could use to clone existing item to the new one? The use case would be that user would create manually the first item as a template and then multiply them and fill the rest of the data which would differ between items. --Zache (talk) 14:24, 29 October 2018 (UTC)

@Zache: Yes: User:Magnus_Manske/duplicate_item.js. Use with care! You can also use QuickStatements (Q20084080)/ QuickStatements 2 (Q29032512) for your use-case. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:22, 29 October 2018 (UTC)

Facto Post, Birthday special issue

Up at w:User:Charles Matthews/Facto Post/Issue 17 – 29 October 2018. As ever, you can subscribe to have this mass message delivered to you on enWP. Charles Matthews (talk) 15:04, 29 October 2018 (UTC)

Wanted: Info on comparative usage of Wikidata in Wikipedias & sister projects

It would be helpful to have some data on which Wikipedias (and sister projects) are making the most use of Wikidata, and how. For example:

  • Which have the most Wikidata-derived templates in active use?
  • Which use Listeria?
  • Which have the most articles with some data from Wikidata?

It would also be useful, for any given project, to be able to see a qualitative and quantitative overview of how Wikidata is used.

Does anyone know where such information can be found? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:19, 29 October 2018 (UTC)

Wikidata Concepts Monitor (WDCM) should answer some of your questions. — Envlh (talk) 16:03, 29 October 2018 (UTC)
Thank you. I shall investigate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:29, 29 October 2018 (UTC)
This should answer your second question. Mahir256 (talk) 16:09, 29 October 2018 (UTC)
Thank you. Unfortunately, that does not seem to distinguish between articles and (e.g.) user-space pages. Perhaps Magnus might make it do so? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:29, 29 October 2018 (UTC)

Data loss during the data centre switchover

Hello all,

During the data centre switchover routine on October 10th, some unexpected problems occurred over the past days:

  • For a few hours, a small part of the data was not accessible. Some items and lexemes seemed to have disappeared.
  • Some data may have been lost, including edits, preferences changed as well as user accounts created during a period of about 50 minutes (from 2018-09-13 09:08:17 UTC to 2018-09-13 09:58:26).

Part of the data has already been restored (edits and revisions). The rest (user accounts, preferences) will be restored at the beginning of next week.

If you edited Wikidata on September 13th, please check your contributions. If you encounter any problem in the next days, like items not reappearing or something missing, let me know.

If you're interested in technical details, you can have a look at the Phabricator ticket. Thanks for your understanding, Lea Lacroix (WMDE) (talk) 14:45, 12 October 2018 (UTC)

Hello all,
Edits that occurred in the 50 minutes that were temporarily lost during the switch back to the eqiad data centre may now not exist in the current revisions of the pages. This is due to edits being made to these pages while either the edits did not exist in the revision table, or before the page_latest field was updated in the page table.
These lists should indicate all revisions that could be missing. Feel free to check them. Lea Lacroix (WMDE) (talk) 13:29, 17 October 2018 (UTC)
It impacted only Wikidata, which runs on its own server now. Lea Lacroix (WMDE) (talk) 14:22, 17 October 2018 (UTC)

I would like to clarify and expand on this now that the task is closed. No data loss happened, in the sense that no rows written to the database are missing anymore and all are available in the history of all pages. However, it is possible that some edits (e.g. by a bot or a tool doing automated edits) were made on the above pages during the specific time when an old version was being edited (the same idea as when a "conflict" happens from editing an older version, but the edit goes through). The incident happened a month ago, but was only exposed to users 2 weeks ago, during the dc switch. Neither other wikis nor the content itself were affected, only the edit and user metadata, which appeared outdated for some time.

This may look like those users removed data when in reality they were just editing an older version of the page. The reason for this is that the database stores "versions" of pages, not edits/diffs. The temporarily missing edits are now available back again since the start of last week, but they may be under an older version on the history, not being current.
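
A toy model may help show why an edit made on top of an old version looks like a removal. The sketch below is plain Python and is not how MediaWiki is implemented: the list of sets, the `save_edit` helper and the property IDs are all hypothetical stand-ins for full page versions.

```python
def save_edit(history: list, base_index: int, additions: set) -> set:
    """Store a new full version built from an earlier one plus additions."""
    new_version = set(history[base_index]) | set(additions)
    history.append(new_version)
    return new_version


# Each entry is a complete version of a page, not a diff.
history = [{"P31"}]

# A normal edit on top of version 0 adds P569.
save_edit(history, 0, {"P569"})

# A tool that still sees version 0 as the latest one adds P570.
# Nothing is deleted on purpose, but the new version lacks P569.
save_edit(history, 0, {"P570"})

apparently_removed = history[1] - history[2]
print(apparently_removed)
```

Comparing consecutive versions in this way is also roughly how a bot could flag pages where content seems to have been removed, so that someone can review them and restore the right version.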

I personally think someone within the community should be the person to determine which is the "right" version, and I strongly believe it is a really bad idea for server administrators to edit or delete contributors' actions from a technical perspective, due to the lack of transparency, so the final version should be determined on a case-by-case basis by the community. But please revert to an appropriate version where needed. There are I think only 200-400 pages to review, and a bot may detect removals of content quite reliably automatically. If someone thinks that the diffs could show a bot as destroying data when in reality it didn't, they should ask a Wikidata administrator to delete their edits through the normal "wiki" procedure.

I hope that clarifies what happened and which are the next steps. We are very sorry this happened, this was not a normal incident, and we are working now on preventing this from ever happening in the future. We worked really hard in the last 2 weeks to recover all data, which was not easy as the incident happened more than a month ago. Sorry again. --JCrespo (WMF) (talk) 10:53, 24 October 2018 (UTC)

@JCrespo (WMF): I reopened the task this morning, coincidentally at around the same time you posted here (and it remains open, with additional comments, as of just now), because an item is still not correct. In the specific case I give as an example there, I was asked [9] not to edit to restore the item. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:14, 24 October 2018 (UTC)
Sorry I wasn't clear enough, my fault-- I asked not to edit during the recovery process, because it was editing over old revisions that caused those history "conflicts". Now that the recovery is done, edits (merges) should be done as appropriate. Please let me know how I can help. JCrespo (WMF) (talk) 11:07, 25 October 2018 (UTC)
We should be able to have either users or bots work through the entities listed at User:Addshore/2018/10/DC_Switch_Issue ~150? to fix any lost data. ·addshore· talk to me! 15:01, 30 October 2018 (UTC)

Plant species sequenced by RAD-SEQ

Hi, had a question emailed to me from one of our academics about the meta-analysis of evolution in plants. They need to find as many plant species as possible which have been sequenced with RAD-seq (a sequencing method). Upon checking, they found that Web of Science only allows datasets to be downloaded one by one and that NCBI is not terribly comprehensive. Worse, people tend to build their own database to deposit this data rather than have a central hub like Wikidata. So, the question is whether plant species on Wikidata could, in theory, have RAD-seq info added, or whether this would be outwith project scope. Any thoughts, let me know. Stinglehammer (talk) 16:44, 17 October 2018 (UTC)

@Stinglehammer: What form would this data take? Can you give an example? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:01, 17 October 2018 (UTC)
@Stinglehammer: Nudge. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:57, 24 October 2018 (UTC)
@Pigsonthewing: Have asked for an example of the type of data this would take. Will resurrect this when I've discussed further with the academic in question and got a clearer idea. Cheers, Stinglehammer (talk) 17:50, 30 October 2018 (UTC)
Could you please try to enhance Restriction site associated DNA markers (Q7316324) first? --Succu (talk) 21:33, 17 October 2018 (UTC)

Community Wishlist Survey

Hi. I'm planning to propose two proposals for the Community Wishlist Survey related to Wikidata. Unfortunately each user can only "own" three proposals and I want to submit two others, so it would be helpful if someone else could take ownership of one (or both) of them.

  • Improving the verifiability and reliability of data
    • allow protection of classes of objects (e.g. semi-protection of all existing descriptions in German, or semi-protection of all existing statements with a reference with a reference URL (P854))
    • create items for sources used in Wikipedias (e.g. news articles)
    • link those sources to existing statements automatically or semi-automatically
  • Improving Wikibase and the data model
    • multilingual-text datatype; improvements to structures in lexicographical data
    • enabling time precision of hours/seconds/minutes
    • enabling globes other than Earth for geographic coordinates
    • fixing date localization

It would also be helpful if other things could be suggested to be included: I think small-scale proposals are unlikely to succeed, since only the top ten (by popular vote) will certainly be evaluated, so it would be better if the proposals could include some other minor issues that should be fixed. Is there something that should probably be added to these? Jc86035 (talk) 11:10, 26 October 2018 (UTC)

Isn't it a better approach to always verify data against trusted external sources and get alerts when there is a difference? I feel that this approach is more in line with the idea that the most important thing for Open Free Knowledge is that it can be confirmed with trusted sources. Compare SPARQL Federated search in this Listeria list: Wikidata <-> Nobelprize.org - Salgo60 (talk) 17:01, 26 October 2018 (UTC)
@Salgo60: I agree this is a good approach, but things like events might not even have their own database entries. Particularly with things other than people and places, there might not be a lot of good-quality open databases to draw information from.
Importing articles from websites which are already used a lot as sources in Wikipedias also increases the chance that the Wikipedias might sooner be able to use Wikidata for sourcing (which would allow information about sources to be improved en masse, especially for the sources which are used in tens of thousands of articles). Jc86035 (talk) 17:08, 26 October 2018 (UTC)
The problems I see with "protecting"
  1. I feel at least in the Swedish Wikipedia there is no consensus what a good source is and the discussion we have is rather "primitive"
  2. I think sourcing must start with defining the quality of a source, see Project_chat/Archive/2018/10#Quality_of_a_source_needs_to_be_documented_and_communicated. In my book of trust the number of sources is not important; it is the quality of a source that adds trust.... just that someone thinks something needs to be protected is not a good approach, and that it's used a lot on Wikipedia is not an argument.....
- Salgo60 (talk) 17:23, 26 October 2018 (UTC)
@Salgo60: I think protecting statements sourced to particular sources might be helpful, but I focused on things like properties because it might be easier to implement blocks for them. Preventing new/unregistered users from changing existing values for e.g. height, weight and gender (regardless of whether they're sourced) might also prevent a lot of vandalism, although in my experience vandals most often go for the labels and descriptions possibly just because they're at the top of the page. Jc86035 (talk) 19:31, 26 October 2018 (UTC)
It's a bit of a problem, when the king of the Netherlands can have his monogram vandalised [10] and it's a month and a half before it's fixed: what about the tens of millions of more obscure items that are unlikely to be monitored? Ghouston (talk) 08:55, 27 October 2018 (UTC)
@Ghouston: I think it would need to be tested – if it were technically possible e.g. to semi-protect all existing statements for external identifiers, or even to semi-protect all items with a PubMed identifier, then it might be possible to stop a lot of vandalism, but we would need to know what those things to be protected are. (In any case, the work that the WMF has done so far in this area is obviously not enough.) Jc86035 (talk) 11:36, 27 October 2018 (UTC)

@Ghouston, Salgo60: Posted at m:Community Wishlist Survey 2019/Wikidata/Improvements to the Wikidata data model and user interface and m:Community Wishlist Survey 2019/Wikidata/Improvements to the reliability of Wikidata. Jc86035 (talk) 08:42, 30 October 2018 (UTC)

Wikidata weekly summary #336

The Community Wishlist Survey

11:06, 30 October 2018 (UTC)

Extended Date/Time Format Specification

We should adopt the "Extended Date/Time Format Specification" (EDTF) profile for ISO 8601, which allows for, for example, uncertain and vague dates, for use in properties with a "time" datatype.

ISO 8601-2019, due in the middle of next year, is expected to support all of the features of EDTF.

Disclosure: I contributed, partly as a Wikimedian, to the draft of this specification. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:22, 22 October 2018 (UTC)

  • @Pigsonthewing: Interesting to hear about what may be included in the next edition of ISO 8601 (Q50101). The question that jumps to me straight away is whether alternative weather season systems such as those included within Indigenous Australian seasons (Q6024584) and dry season (Q146575)/wet season (Q3117517) have been considered? Dhx1 (talk) 08:47, 23 October 2018 (UTC)
  • Oppose. EDTF, like ISO 8601 that it extends, only allows the Gregorian calendar and proleptic Gregorian calendar. Wikidata allows both Gregorian and Julian. Jc3s5h (talk) 15:31, 23 October 2018 (UTC)
  • I have several questions:
    • The section on "intervals" seems to be about showing a span of time, as in, A/B means that the thing started at A and ended at B. Is this correct? If so, how does one indicate a range of ambiguity, as in, it started at some time between A and B and ended some time between C and D? (Assuming the range isn't something that can be shown by X's instead of digits.) Does this use something with the [] syntax?
    • Does this work with other calendars? Presumably we'd need some custom extensions for calendars that are very different, but can it work for Proleptic Gregorian or Julian without issue? (Does it have a year 0?)
    • We currently theoretically store time zone data and specificity data as separate components in the data, I think. Would these be replaced?
    • If we're changing around the data model for dates, we should probably fix existing time zones. We don't actually support time zones in the UI yet, but all the data is claiming to be UTC even though it's not. EDTF supports both time zones specified and just indicating "local time". Should existing dates be switched to the latter?
    • Why does it have separate options for "Spring - Northern Hemisphere" and "Autumn - Southern Hemisphere" when those refer to the same time period?
    • Would we have the option to use the full set of syntax on any date statement? Could we use this to replace sourcing circumstances (P1480) circa (Q5727902), earliest date (P1319), latest date (P1326), and birthday (P3150)?
    • Are these dates easily convertible to other formats? Can it work with existing SPARQL queries? I imagine that some types might be difficult.
    • Are the devs on board with this?
  • --Yair rand (talk) 19:36, 23 October 2018 (UTC)
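To make the interval and qualifier notation asked about above a little more concrete, here is a minimal Python sketch. It assumes the Level 1 conventions of the EDTF specification as commonly described: `A/B` is an interval from A to B, and a trailing `?`, `~` or `%` marks a date as uncertain, approximate, or both. The names `SimpleEdtfDate`, `parse_date` and `parse_interval` are hypothetical, and only a small subset of the syntax is handled (no seasons, no unspecified `X` digits, no `[]` sets).

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Illustrative subset only: YYYY, YYYY-MM or YYYY-MM-DD,
# optionally followed by one qualifier character.
_DATE = re.compile(r"(?P<body>\d{4}(?:-\d{2}(?:-\d{2})?)?)(?P<qual>[?~%]?)")


@dataclass(frozen=True)
class SimpleEdtfDate:
    body: str          # the plain date part, e.g. "1984" or "2004-06"
    uncertain: bool    # set by "?" or "%"
    approximate: bool  # set by "~" or "%"


def parse_date(text: str) -> SimpleEdtfDate:
    """Parse one date from the supported subset, or raise ValueError."""
    match = _DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not in the supported subset: {text!r}")
    qual = match.group("qual")
    return SimpleEdtfDate(
        body=match.group("body"),
        uncertain=qual in ("?", "%"),
        approximate=qual in ("~", "%"),
    )


def parse_interval(text: str) -> Tuple[SimpleEdtfDate, SimpleEdtfDate]:
    """Split "A/B" into its start and end dates."""
    start, sep, end = text.partition("/")
    if not sep:
        raise ValueError(f"no '/' separator in {text!r}")
    return parse_date(start), parse_date(end)
```

For example, `parse_interval("1984?/2004-06~")` yields an uncertain start year and an approximate end month. Each endpoint carries its own flags, but a range of ambiguity on each endpoint ("started between A and B") is not represented by this sketch.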
I think we need to ask what you're actually asking for here & in the phabricator ticket.
My take:
  • Is Wikidata going to primarily store dates as EDTF strings? IMO: No. EDTF strings are at best a serialisation, not structured data. You really don't want to have to use string operators in a query if you want to extract dates that have a particular characteristic. It's horribly inefficient and not very readable. Something more readily translatable into more atomistic RDF triples would continue to be my bet for Wikidata's preferred structure.
  • Might Wikidata/WDQS gain the ability to export dates in EDTF format? Possibly, but I'm dubious. It looks like quite a lot of code to write, and for dates with particular types of uncertainty (eg dates only known to a decade), it looks like EDTF has done the typical committee compromise thing of allowing everybody's standard as a possibility, so it's not clear which option should be preferred and targeted.
  • Might WDQS gain the ability to recognise EDTF strings and present them in some friendly way? (cf how it can interpret values of type geo:wktLiteral). Again I'm dubious. If one wanted to translate EDTF strings into natural text, properly internationalised, that is a huge number of languages to consider, for a module that is probably not very publicly editable. Yes the "other date" template on Commons might help you to quite a lot of those internationalisations, but still this wouldn't be a small job, so it wouldn't be something I'd expect one of the full-time paid developers to take on, or be assigned.
  • Could WDQS offer dates in EDTF format in parallel to dates in something closer to whatever format is being used internally - in the way that properties in the RDF dump have different forms, eg to connect to simple values or full value nodes, or human-readable links vs linked-data links? This is really a special case of #2, only with values being stored in the dump rather than generated by a service on the fly, and the same issues apply. Neither of these options is impossible, but I am dubious that the developer time would be forthcoming to make either a reality. Plus, given the apparent lack of clear preference as to how certain approximate dates should be written, I am not convinced that an EDTF string would necessarily emerge from being round-tripped unchanged.
But @Pigsonthewing: what do *you* have in mind by your ticket? When you call for WD to implement EDTF, what actually do you mean by that? Jheald (talk) 21:45, 23 October 2018 (UTC)
Primarily I want the community - especially devs - to discuss these issues. That's something I'd expect to happen over a period of months, not be resolved in a day or two [it's been nine years since I first contributed to an early draft of the spec!]. That should also involve keeping a watching brief on how other bodies use EDTF - whether that's W3C in their HTML standards, schema.org and the like, or bodies that we exchange data with, like VIAF/OCLC. As and when they do, I expect open-source code libraries for manipulating them to emerge. My instinct is that we should be accepting and outputting some, preferably all, levels of EDTF strings (I'm agnostic on the internal storage format), as an official evolution of ISO 8601. The use cases for doing so are, I trust, clear; and EDTF is designed in part to address some of the issues that have occurred on Wikimedia projects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:17, 23 October 2018 (UTC)

Having a discussion is good. If WD can make a breakthrough in the handling of approximate/uncertain dates, that would be a boon to the historical/culture communities. But I think WD, which has a variety of date precisions, is already ahead of many other approaches/communities:

  • VIAF has no approximate dates at all. You can only use the "standard" year/month/day precisions, that's all
  • RDF has the same, with the datatypes xsd:gYear, xsd:gYearMonth, xsd:date. Anything more you need to model separately
  • CIDOC CRM has an advanced model: crm:E52_Time-Span can have up to 4 date points: crm:P82a_begin_of_the_begin, crm:P81a_end_of_the_begin, crm:P81b_begin_of_the_end, crm:P82b_end_of_the_end; and the textual crm:P79_beginning_is_qualified_by, crm:P80_end_is_qualified_by. (The CRM itself (the spec) posits the existence of some all-powerful Date datatype in the underlying implementation, but the RDF rendition goes with these 4 date points.) But it's not very widely adopted (i.e. there are few if any datasets that use all these props).

I think the biggest omission in WD right now is that you can't specify uncertain begin/end (eg 579-583 AD) unless it matches the current precision boundaries (eg decade) --Vladimir Alexiev (talk) 08:06, 25 October 2018 (UTC)
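As a rough illustration of the four-point idea and of the precision-boundary problem, here is a minimal Python sketch. It is not how CIDOC CRM or Wikidata implement anything: the class `FuzzyTimeSpan` and the helper `fits_one_decade` are hypothetical names, bounds are plain year numbers, and a decade is assumed to mean years ending in 0 through 9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyTimeSpan:
    """Four year-valued bounds, named after the CRM properties quoted above."""
    begin_of_the_begin: int  # earliest possible start (P82a)
    end_of_the_begin: int    # latest possible start (P81a)
    begin_of_the_end: int    # earliest possible end (P81b)
    end_of_the_end: int      # latest possible end (P82b)

    def __post_init__(self) -> None:
        points = [self.begin_of_the_begin, self.end_of_the_begin,
                  self.begin_of_the_end, self.end_of_the_end]
        if points != sorted(points):
            raise ValueError("the four bounds must be in non-decreasing order")

    def possibly_ongoing(self, year: int) -> bool:
        """True if the span may have covered this year."""
        return self.begin_of_the_begin <= year <= self.end_of_the_end

    def definitely_ongoing(self, year: int) -> bool:
        """True if the span covered this year under every reading."""
        return self.end_of_the_begin <= year <= self.begin_of_the_end


def fits_one_decade(earliest: int, latest: int) -> bool:
    """True only if [earliest, latest] is exactly one decade, e.g. 570-579."""
    return earliest % 10 == 0 and latest == earliest + 9
```

With a start known only as 579-583 and an (invented, purely illustrative) end of 600-602, `FuzzyTimeSpan(579, 583, 600, 602).definitely_ongoing(580)` is false while `possibly_ongoing(580)` is true, and `fits_one_decade(579, 583)` is false, which is the case that cannot be expressed by a single precision value.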

In answer to the Gregorian/Julian calendar debate, the standard is open to extension to allow for this and also to allow for other calendars. I am assuming that there will be some sort of specifier before the date to say "This is the XYZ calendar". In particular I notice that the months are allocated numbers 1-12 and other sub-years are allocated numbers 21 onwards. This allows for a 13th month such as the 13th Jewish month that occurs in some years. Martinvl (talk) 21:46, 29 October 2018 (UTC)
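The numbering described above can be sketched in a few lines of Python. This assumes the commonly cited EDTF Level 1 season codes (21 Spring, 22 Summer, 23 Autumn, 24 Winter); the function name `classify_sub_year` is hypothetical, and any other code, including a would-be 13th month, is simply reported as unclassified rather than given a meaning.

```python
import re

# Level 1 season codes as commonly described for EDTF (assumption of this sketch).
SEASONS = {21: "Spring", 22: "Summer", 23: "Autumn", 24: "Winter"}

# A four-digit year (optionally negative), a hyphen, and a two-digit sub-year code.
_YEAR_SUB = re.compile(r"(-?\d{4})-(\d{2})")


def classify_sub_year(text: str) -> str:
    """Say whether the second component of "YYYY-NN" is a month or a season."""
    match = _YEAR_SUB.fullmatch(text)
    if match is None:
        raise ValueError(f"expected YYYY-NN, got {text!r}")
    code = int(match.group(2))
    if 1 <= code <= 12:
        return f"month {code}"
    if code in SEASONS:
        return f"season {SEASONS[code]}"
    return f"unclassified code {code}"
```

Here `classify_sub_year("2001-21")` reports a season, while `classify_sub_year("2001-13")` falls through to the unclassified branch, which is where an extension for a 13th month would have to be defined.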

The standard is tied to the next revision of ISO 8601. Wikidata has no control over it nor does Wikidata have any means to suggest any changes. Looking at how infrequently ISO 8601 has been revised, it appears the Julian calendar has missed this revision cycle and there probably won't be another opportunity for about a decade. Jc3s5h (talk) 22:14, 29 October 2018 (UTC)

 Support It's good to be able to enter data that's standardized in the official ISO specification directly into Wikidata without having to translate it into a different standard. Having a better ability to specify uncertain dates also has a lot of practical use and will lead to data in Wikidata that matches more accurately what's written in our sources. When reworking the time field it would also be good to fully introduce the ability to specify the exact time when an event happens down to hours/minutes/seconds. ChristianKl 10:32, 31 October 2018 (UTC)

Shall Q5837762 and Q19693229 be merged?

I'm new here, and pardon me if my question is stupid.

Q5837762 (Muhammed) and Q19693229 (Muhammad) look like the same thing to me. Shall they be merged? --Inufuusen (talk) 07:19, 29 October 2018 (UTC)

  • I undid an edit [11] that mixed them. One is for people with the name in Arabic script, the other with a specific spelling in Latin script. --- Jura 07:24, 29 October 2018 (UTC)
    • @Jura1: Thank you for replying. I kind of understand your explanation. I think both of them are people with a name. The difference is that the first one relates to the Arabic name written as mḥmd, meaning the names of those people are formally written in Arabic. The second one relates to the name with mḥmd as the original form but formally spelt as Muhammad. --Inufuusen (talk) 04:44, 31 October 2018 (UTC)

RfC draft for rank usage on dated data on historical entities

Hey all, here is a proposal for an RfC to review before publication: Wikidata:Requests_for_comment/Best_practices_for_statement_ranks_for_disappeared_entities. I think the question is of value for everyone and that we should give a Wikidata-wide answer; the resulting guidelines could be added to ranking, for example. Please don’t hesitate to edit the draft before publication, or add examples to discuss, if you agree it’s an important question. An additional question is about help:evolving knowledge.

I heard there were previous discussions about this, so if you have any links they would be useful. author  TomT0m / talk page 16:33, 25 October 2018 (UTC)

Wikidata:Requests_for_comment/Best_practices_for_statement_ranks_for_disappeared_entities is now open. author  TomT0m / talk page 21:52, 31 October 2018 (UTC)

Copyright and Descriptions

A few months ago I rewrote some descriptions by using the text from an authoritative source. My description was of the form

"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. (Authoritative source)".

The source was mentioned under "described by source". Yesterday another editor removed the text "(Authoritative source)" from the description. This raises a number of questions which impinge both on copyright and on good practice:

  1. Was I correct in quoting directly from the authoritative source?
  2. If so, should I have acknowledged the source in the description?
  3. If so, how should I have acknowledged the source?
  4. Finally, should this be noted in the guide on descriptions?

Martinvl (talk) 21:30, 29 October 2018 (UTC)

Descriptions are typically only supposed to be a few words long, and certainly don't need to supply sources in them. The descriptions are supposed to be licensed under CC-0. That (copyrightable-length) description should never have been imported in the first place, both because of description standards and copyright issues. --Yair rand (talk) 05:40, 30 October 2018 (UTC)
Because descriptions can be controversial, I consider it a design flaw for Wikidata to not make it possible to supply references for descriptions, so they can be attributed to a reliable source. Jc3s5h (talk) 15:20, 30 October 2018 (UTC)
Agree with that, in principle, and same for labels. Ability to add does not mean it's necessary. Because of the existence of the limited labels, title (P1476) etc. are either much underused or partly redundant. --Marsupium (talk) 19:55, 30 October 2018 (UTC)
This page from Merriam-Webster gives guidance on how to cite an entry from a dictionary. This tells me that they expect people to cite their work and to attribute it. The M-W guidance states what should be included rather than how to include it. Marrying up this page with current Wikidata practice, I suggest that the name of the source document should be added in parentheses and in italics at the end of the definition and that the full citation should be given in the section "Described by source". -- Martinvl (talk) 20:48, 30 October 2018 (UTC)
I’d like to point to User:Yair rand’s comment above again, in order to support its main point: a description of the length given in your example shouldn’t have been imported at all (although I’m in doubt whether it is already copyrightable). Descriptions are basically disambiguators of items with identical labels, for human data users; they should intentionally be kept short, neutral, and simple, following guidelines given in Help:Description. A proper and good description is uncontroversial and inherently not copyrightable and it thus does not need to be supported by a source. —MisterSynergy (talk) 21:02, 30 October 2018 (UTC)
May I give an example? The item System of quantities does not yet have a description. If I go to item 1.3 of this document, I see that a "system of quantities" is defined as "set of quantities together with a set of noncontradictory equations relating those quantities" and alongside it the French equivalent. Can you suggest a better description for Wikidata purposes? -- Martinvl (talk) 22:21, 30 October 2018 (UTC)
Why is this sentence (as a WD item description) a problem? Such short sentences are rarely subject to copyright laws. Wostr (talk) 22:58, 30 October 2018 (UTC)
OK, copyright is not a problem, but attribution is - see the Merriam-Webster guidance. Copying without attribution is plagiarism. -- Martinvl (talk) 23:03, 30 October 2018 (UTC)
For the record only: I agree with “Descriptions are typically only supposed to be a few words long, and certainly don't need to supply sources in them.” and MisterSynergy in general. However, sometimes descriptions have to be more than “disambiguators of items with identical labels”, namely definitions for class items. I often run into the problem when adapting definitions from Art & Architecture Thesaurus (Q611299); it would be good to have a possibility to add sources here. Copyrighted content isn't ok of course! --Marsupium (talk) 23:27, 31 October 2018 (UTC)