Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat


Wikidata project chat
A place to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.

Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.

On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2021/05.

How to add an identifier to a wikidata page

Hello everyone. I'm a relatively new user and am asking for help because I can't figure out how to add identifiers to data pages. Any help would be greatly appreciated.  – The preceding unsigned comment was added by PunkWillNeverDie44 (talk • contribs).

How to create a list of examples?

For example, I would like to create a list of English semordnilap words. The items are examples of such words. So, they are not exactly standalone items.

I assume the item representing such a list should be: instance of Wikimedia list article. What is the preferred statement for items representing examples of words? Usage example?  – The preceding unsigned comment was added by D. Senkyr (talk • contribs) at 18:26, 10 April 2021 (UTC).

Add 270,000,000 triples for last names and first names of people to papers?

Currently we have 135,555,066 statements with author name string (P2093), mostly on items about scholarly papers.

The following proposals:

propose to complete that with two more qualifiers/triples, each. That would be 270 million more triples on query service.

While there may be some benefit to it, I think we should be able to come up with a better solution. --- Jura 10:55, 3 May 2021 (UTC)
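The 270 million figure above follows from the statement count: two extra qualifiers for each author name string (P2093) statement. A minimal sketch of that arithmetic, assuming exactly one new triple per added qualifier (the function name is illustrative only):

```python
# Count quoted in the discussion: statements using author name string (P2093).
P2093_STATEMENTS = 135_555_066


def added_triples(statements: int, qualifiers_per_statement: int = 2) -> int:
    """Estimate new triples, assuming one triple per added qualifier."""
    return statements * qualifiers_per_statement


total = added_triples(P2093_STATEMENTS)
print(f"{total:,} additional triples")
```

This gives roughly 271 million, consistent with the rounded "270 million" in the post; any additional overhead per qualifier in the RDF export is not modelled here.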

These have been in discussion since last year, and are necessary to be able to reformat 'First Middle Last' as 'Last, First Middle' within en:Template:Cite Q at least - with the current setup it is impossible to know what is part of the first/last name and what isn't. They finally got marked as ready for creation yesterday, and now you start a discussion here??? Mike Peel (talk) 12:02, 3 May 2021 (UTC)
  • I don't see the size question addressed in the discussion and it's fairly unrelated to the proposals as such. --- Jura 12:30, 3 May 2021 (UTC)
  • I believe these should only be added where a bibliographic reference clearly makes the separation into first and last, which in my view is likely to be only a tiny fraction of the total. ArthurPSmith (talk) 17:40, 3 May 2021 (UTC)
    • What leads you to believe that? The problem on Wikipedia this attempts to solve isn't limited to these. Besides most author name strings are trivial to split into "last name" and "first name". Also, doesn't the source for most entries (Pubmed) split them for most names? --- Jura 05:27, 4 May 2021 (UTC)
  • Instead of complaining about size of two links to existing items added, think about the size needed for the author strings. In principle they could be deleted after first/last are added, and you would gain space. --SCIdude (talk) 07:15, 4 May 2021 (UTC)
  • Well, yes, but, mainly no. The size of the string stored pales into insignificance against the size of the triples created to store strings, and, moreover, the proposal is for qualifiers of the P2093 statement, which would not work very well if the P2093 statement was deleted. --Tagishsimon (talk) 07:56, 4 May 2021 (UTC)
  • Indeed, you might want to re-read the proposal. BTW, it's two per author (P50) or author name string (P2093) statement. So the total is probably 310,000,000 triples. --- Jura 08:00, 4 May 2021 (UTC)
So yes, 310M new triples, adding roughly 2.4% to the number of triples in WD, which is equivalent to 2 months of normal triple growth. It's unlikely the query service will care very much from a reporting perspective. The main pain arises in the RDF serialisation; but iirc that was recently substantially improved? And otoh, we probably should be here to make WD data usable, which CiteQ does well. WD is only going to get larger and its data more Baroque. I'm not convinced the argument for restraint, in this instance, is made. --Tagishsimon (talk) 08:29, 4 May 2021 (UTC)
  • At the risk of being late to this discussion, I am missing some acknowledgment/reasoning for going with what seems to be a rather untypical data structure. The straightforward data model would link papers to items for authors. Any possibly needed properties for formatting their names would then be added to those items. The advantages are, obviously:
  • Name strings are only recorded once, simplifying any corrections and saving storage space and triplets otherwise wasted on redundant information
  • Item references (integers) use less storage space than strings, further reducing storage requirements
  • Author items link between that author's papers, allowing the data to be used in the way all other data is used, such as by querying etc.
I get that it isn't entirely obvious how to distinguish between multiple authors with the same name, but:
  • The sources for the current data usually do link to author pages, either specific to a publication, at the author's institution, or at one of meta-services (researchgate etc)
  • When I started in academia, it was common practice to search one's name and find some variation of name/middle initial(s)/nickname that was unique
  • Subject matter, publication date, language, and names of collaborators could all be used to heuristically decide if two names reference the same person
  • For the purposes outlined above, i. e. formatting of names on citations, both extreme cases would still work flawlessly: collapsing all authors of the same name into a single item, or creating separate items for every author name on every paper.
Sorry if I missed a previous discussion of these issues. Karl Oblique (talk) 12:28, 4 May 2021 (UTC)
@Karl Oblique: Yes, the general plan is to replace author name string (P2093) with author (P50) statements where possible; however this does not capture how an author's name is represented as an author of a particular work which is the question here, and why at least in some cases these qualifiers unfortunately are still needed. ArthurPSmith (talk) 20:01, 4 May 2021 (UTC)
  • The main assumption seems to be that Lua on English Wikipedia can't read all the author items correctly. Oddly, it seems to work out on ruwiki and a few other wikis using the same module as ruwiki. --- Jura 12:14, 5 May 2021 (UTC)
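The formatting problem raised in this thread (turning 'First Middle Last' into 'Last, First Middle') can be illustrated with a small sketch. It is not the Cite Q implementation; both function names and the sample author name are made up for illustration. It only shows why a single name string is ambiguous while separately recorded name parts are not:

```python
def naive_last_first(name: str) -> str:
    """Guess that the final whitespace-separated token is the family name."""
    parts = name.split()
    if len(parts) < 2:
        return name
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def last_first_from_parts(given: str, family: str) -> str:
    """Format from explicitly recorded name parts; no guessing needed."""
    return f"{family}, {given}"


# Illustrative (made-up) author with a multi-word family name.
print(naive_last_first("Jan van der Berg"))          # splits in the wrong place
print(last_first_from_parts("Jan", "van der Berg"))  # uses the recorded parts
```

The naive split puts "van der" with the given name, which is the kind of error the proposed qualifiers are meant to avoid.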

How to indicate place of origin for a hyphenated ethnic group?

There are many instances of ethnic group (Q41710) that are essentially combinations of two distinct places, regions or countries. For example Taiwanese Canadians (Q7676603), Hong Kong Canadian (Q5894321), Canadian American (Q5029681), or Scottish American (Q3476361). In each case people came from one place to another and their place of origin gave them (and sometimes their ancestors) a distinct ethnic identity in the new place. I would like to find a good and consistent way of indicating for these ethnic groups both the current place of residence and the place of origin. And then for example I would like to be able to use Taiwan (Q865) -> Canada (Q16) to look up Taiwanese Canadians (Q7676603). How might I be able to indicate this? I see country of origin (P495) but this doesn't seem to like being applied to an ethnic group. It may also not be flexible enough to include diaspora communities that aren't from countries (e.g. Tibetan Canadians (Q7800410) or African Americans (Q49085) where neither Tibet nor Africa are countries). Any advice would be greatly appreciated! Nate Wessel (talk) 13:31, 3 May 2021 (UTC)

Why exactly does country of origin (P495) "[not] seem to like being applied to an ethnic group"? At first blush you shouldn't be getting error messages since ethnic group (Q2531956) is one of the values explicitly allowed to an item with the property, and the allowed targets seem flexible enough (Tibet (Q17252) is a valid one, as is Africa (Q15)). Circeus (talk) 18:36, 5 May 2021 (UTC)
I'm not sure. I see a warning on the country of origin (P495) statement that I added to Taiwanese Canadians (Q7676603). And also to the one I just tried adding to African Americans (Q49085). "Entities using the country of origin property should be instances or subclasses of one of the following classes (or of one of their subclasses), but Taiwanese Canadians currently isn't: Product... ". Nate Wessel (talk) 17:48, 6 May 2021 (UTC)
Actually, I think indigenous to (P2341) may be exactly what I've been looking for. Nate Wessel (talk) 18:18, 6 May 2021 (UTC)

Items on ghost names (taxonomy)

I recently discovered that the name 'Coenagrion exornatum' (Selys, 1872) was removed from the World Odonata List. It didn't take me very long to find out why: the name appeared not to represent an accepted species, but was a misprint for Coenagrion ecornutum (Selys, 1872). The 'name' first appeared in 1890 in the paper in which Kirby created the genus Coenagrion. After that, it started to live a life of its own, popping up in databases next to Coenagrion ecornutum as if they were both names for accepted species. At the latest in 2010 it was noted that the name was only a misprint, when Jin Whoa Yum et al. commented on the name. Here one can find a pdf of their paper, the statement is on p. 45. 'Names' of this type attract the attention of taxonomists because they lack everything an available name has: there is no nomenclatural type associated with it, there is no bibliographic reference to a protologue, and if there is one, the name can't be found there; there are no distribution data, in general there is nothing that indicates that a species with this name actually exists.

I tried to get rid of this piece of disinformation in Wikimedia by asking for speedy deletion in the projects that have an article with this name, giving the arguments of course. In some cases the article was indeed deleted, but on the Polish Wikipedia, they changed the name into a redirect, as if this ghost name were a synonym. I also tried to make clear in the Wikidata item that this is not the name of an accepted species but a misprinted name. I changed the authority from (Selys, 1872) into Kirby, 1890 because that's the place where the name first popped up. And I tried to add the above mentioned pdf as a reference for all of this. That however, appeared one step too far for me.

As this is only one example of literally hundreds of thousands of cases, and I guess it's important to take appropriate action if an article or an item suggesting it is the name for an accepted taxon appears to be disinformation, I'd like some advice on how to best tackle this in Wikidata. Most important: how to give a reference to a paper or other scientific work if the author of that reference is not recognized by Wikidata, and hence the reference refused, but there is a pdf or other source available online. 01:35, 4 May 2021 (UTC)

This is a prime example of the schizophrenic way Wikidata approaches taxonomenclature. It would be just fine if wikidata could handle names that do not correspond to accepted taxa (not just misspellings, but also things like replaced homonyms and other objective/homotypic synonyms), but Wikidata is incapable of doing so in any meaningful way without causing a huge amount of warnings. I gave up work in that area entirely a while ago because of that. Circeus (talk) 18:46, 5 May 2021 (UTC)
It is not clear from what you write where your problem is. You write "I tried to add the above mentioned pdf as a reference for all of this": it is not forbidden to give a URL to a PDF as a reference to a statement, so why did you not succeed? Also, which statements did you try to add? Please point us to these. --SCIdude (talk) 07:34, 6 May 2021 (UTC)
First of all: thanks for answering. The most practical problem is this. In the statement 'taxon name' (it is of course not a taxon, it is a misprinted name, but more on that later), under 'reference', I tried to add a link to the pdf of the paper of Jin Whoa Yum et al. (2010), because in that paper (p. 45) that claim is made: "Coenagrion exornatum [misprint] (Selys): Kirby, 1890: Syn. Cat. Neur.-Odon., London: 150" (and as soon as one has that hint, one can look up Kirby's paper and compare with Selys's publication, and realize Yum et al. are right). I was unable to create that reference, probably because of my inexperience with adding references in Wikidata.
The other problem is that I have no idea how to cope with a 'taxon name' that in fact is no name at all. It should be possible to have an item for the misprinted name 'Coenagrion exornatum', but without all the claims that it is a taxon or a taxon name. The only useful knowledge about that name is that Kirby made an error when he meant to refer to Coenagrion ecornutum. Nothing more, nothing less. But because 'Coenagrion exornatum' has the form of a taxon name, it seems that there are all kinds of mechanisms that want to have it taken up as the name of (in this instance) a species. With the risk that sooner or later it pops up as a direct child of Coenagrion, as it did before; it was even often listed next to Coenagrion ecornutum as if both names represented accepted species. And then all the work that contributors (in this case, me) have done to sort this out will have been in vain. 00:49, 7 May 2021 (UTC)

Requiring References in QS

Probably a controversial proposal but anyone have thoughts on requiring all statements added through QS/similar batch tools to have references? Looking through the recent history it's clear that these changes could all have references but the user just opted not to do it. A few examples: [1] [2] [3] [4] [5] [6] [7]. Alternative similar ideas would be to apply this restriction only to larger batches or to make users click through a scary warning if they aren't adding references.

I recognize this might sometimes produce redundant or obvious references but that is far better than having no idea if you can trust a statement and shouldn't be hard to do.

BrokenSegue (talk) 13:10, 4 May 2021 (UTC)

  • Labels cannot get references... So... I'd be against that idea (a good-faith idea, by the way). Another problem: if you wanted to batch-add country (P17) = France to all items whose coordinates are obviously well inside France, how would you do that if a reference were required? --Bouzinac💬✒️💛 13:55, 4 May 2021 (UTC)
    inferred from (P3452) geographic coordinate (Q104224919)? That being said, I think that example needs references more than most of BrokenSegue's examples, because you could plausibly run into a scenario where you find out that the coordinates are wrong and you need to know whether that makes the country invalid as well.
    That being said, instance of (P31) and external identifiers are often self-sourcing, unless you're doing something weird (either having some really specific non-obvious type for the item or using some sort of heuristics or cross-referencing for the identifier). In those cases, the source of the information is pretty much irrelevant, and adding references really only serves to juice the reference statistics without helping data consumers. Vahurzpu (talk) 15:45, 4 May 2021 (UTC)
  • I like the idea of encouraging references in batch jobs, but I am also nervous about the strong version of this proposal. Perhaps start with properties that have a citation needed constraint (Q54554025)[8]. I often feel that we could use better tools to validate QS batches. How about some sort of QS Lint that previews the violations that will be reported by the existing error checking? Bovlb (talk) 16:37, 4 May 2021 (UTC)
    @Bovlb: My guess is that such a linter would be much harder to implement than a blanket restriction and implementation effort is an important consideration. BrokenSegue (talk) 16:43, 4 May 2021 (UTC)
    CC @Amir Sarabadani (WMDE), Lucas Werkmeister (WMDE). Bovlb (talk) 17:16, 4 May 2021 (UTC)
  • @Bouzinac: Labels/descriptions are not statements (I think) so would be exempt here. And yeah your example should have a reference for sure and is a good example why we should have this rule. BrokenSegue (talk) 16:41, 4 May 2021 (UTC)
  • @Vahurzpu: I think unless you are creating the item then the instance of (P31) is not self-sourcing. It's probably sourced from wikipedia or some external identifier or from the label. Even if you are creating the item it should be easy to find a source that, say, Florida Dental Association (Q5461307) is an organization (though maybe it's pointless). As for external identifiers I think you either imported from that external-DB (so just say it was stated in that DB) or you joined from one DB to another (so say it's from the first DB). I would be ok exempting new item creation from this rule though. BrokenSegue (talk) 16:41, 4 May 2021 (UTC)

It's difficult enough as it is to keep wikidata synched with article creation on language wikipedias. Adding a new hurdle for that process will not be helpful. A rule which acts as a perverse incentive - e.g. to add less than useful references merely to make progress - also not helpful. Bottom line is that unreferenced statements are what they are: unreferenced, and therefore less complete & trustworthy than well-referenced statements. Users can decide for themselves whether to use/trust unreferenced statements. References are lovely but so is the completeness of wikidata sitelinks to WPs. Why exactly should the latter suffer because of your particular interest in the former. --Tagishsimon (talk) 18:16, 4 May 2021 (UTC)

@Tagishsimon: I'm sympathetic to the concern that this would burden users but your example seems particularly non-problematic. If you are syncing wikipedia to wikidata the reference often just says "imported from wikipedia". It's really easy. I agree people adding bad references is a risk. But for large batch edits, people should be able to succinctly express how the statements were arrived at with minimal effort. I don't see how this proposal would hurt "completeness of wikidata sitelinks to WPs" since sitelinks are unreferenced. BrokenSegue (talk) 19:54, 4 May 2021 (UTC)
  • I looked at some of the batches you mentioned above and their main source seems to be a Wikipedia import. For these, merely adding "imported from" can be interesting, but in general isn't much help. Also, [9] had some problems references won't solve.
    For batches adding P31, not sure if references help much: if P31 needs a reference, one is probably using the wrong value or the item has some basic problems. --- Jura 20:07, 4 May 2021 (UTC)
    @Jura1: yeah the batches I picked weren't the best they were just the arbitrarily chosen recent batches that didn't have references. I could be convinced P31 doesn't need a reference but I do think "imported from" is much better than nothing (especially if it's paired with a date). I also would hope some bad batch jobs wouldn't be run at all because of the difficulty in making a reference. BrokenSegue (talk) 17:26, 5 May 2021 (UTC)
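To make the "imported from" plus date idea concrete, here is a hedged sketch of what a referenced batch row and a very small reference check could look like. It assumes the tab-separated QuickStatements V1 layout, in which source columns use an S prefix (S143 for imported from Wikimedia project, S813 for retrieved). The helper names and the sample item IDs are illustrative, and this is not the full "QS Lint" suggested above, only a check for missing references:

```python
from datetime import date
from typing import List


def qs_row(item: str, prop: str, value: str, source_item: str, retrieved: date) -> str:
    """Build one tab-separated row with an 'imported from' + 'retrieved' reference."""
    stamp = f"+{retrieved.isoformat()}T00:00:00Z/11"  # /11 = day precision
    return "\t".join([item, prop, value, "S143", source_item, "S813", stamp])


def rows_missing_reference(batch: str) -> List[int]:
    """Return 1-based numbers of statement rows with no S-prefixed source column."""
    missing = []
    for number, line in enumerate(batch.splitlines(), start=1):
        columns = line.split("\t")
        # Skip CREATE rows, label/description rows and anything that is not a statement.
        if len(columns) < 3 or not columns[1].startswith("P"):
            continue
        # Qualifier and source property names sit at positions 3, 5, 7, ...
        has_source = any(c[:1] == "S" and c[1:].isdigit() for c in columns[3::2])
        if not has_source:
            missing.append(number)
    return missing


referenced = qs_row("Q4115189", "P31", "Q5", "Q328", date(2021, 5, 5))
batch = referenced + "\n" + "Q4115189\tP17\tQ142"
print(rows_missing_reference(batch))
```

Label and description rows are skipped on purpose, since they cannot carry references in any case.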

Statement for list

What statement should I use on, for example, list of tallest buildings (Q1779466) so that I can list all the buildings under one statement? I asked about this before but gave the wrong example, so the answers missed the point. Eurohunter (talk) 19:28, 4 May 2021 (UTC)

Are Wikipedia lists copied to Wikidata? --SCIdude (talk) 09:15, 5 May 2021 (UTC)
@SCIdude: I think no but what do you mean exactly? If data here is for use by Wikipedia I can imagine it could be used for tables across all Wikipedia versions. Eurohunter (talk) 10:18, 5 May 2021 (UTC)
No. If you have a Wikipedia list article, the articles listed should each have a Wikidata item with the same P31 value. So use a query to get all items. Second, your question: why not make Wikipedia lists by bot from a Wikidata list? No one makes and maintains Wikidata lists, because the items will have P31/P279 and you just query for them to get a list. Finally, one could have the idea of, e.g., making a Wikipedia list by using a Wikidata query; the problem is that in most cases there will be many more Wikidata items than Wikipedia articles, so most of the links in the Wikipedia list will be red. However, it is possible to show only those with an article. --SCIdude (talk) 14:00, 5 May 2021 (UTC)
As to the original question, try:
SELECT ?item ?itemLabel (MAX(?height) AS ?maxheight)
WHERE {
  # items that are instances of wd:Q18142 or of any of its subclasses
  ?item wdt:P31/wdt:P279* wd:Q18142.
  # height (P2048) of each item
  ?item wdt:P2048 ?height.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?item ?itemLabel
ORDER BY ?maxheight
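For anyone who wants to run this kind of query outside the query service UI, the following is a rough sketch that builds the query text and sends it to the public endpoint at https://query.wikidata.org/sparql. The descending order, the LIMIT, the fixed "en" label language, the function names and the User-Agent string are all additions made for this example and are not part of the query above:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Same pattern as the query above, plus DESC ordering and a LIMIT (both additions).
QUERY_TEMPLATE = """SELECT ?item ?itemLabel (MAX(?height) AS ?maxheight)
WHERE {
  ?item wdt:P31/wdt:P279* wd:%s.
  ?item wdt:P2048 ?height.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?item ?itemLabel
ORDER BY DESC(?maxheight)
LIMIT %d"""


def build_query(class_qid: str, limit: int = 10) -> str:
    """Fill in the class item and the row limit."""
    return QUERY_TEMPLATE % (class_qid, limit)


def run_query(query: str) -> list:
    """Send the query to the endpoint and return the result bindings (needs network)."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "project-chat-example/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


query = build_query("Q18142", limit=5)
# rows = run_query(query)  # uncomment to actually call the query service
```

Each returned binding is a dictionary keyed by variable name (item, itemLabel, maxheight), so the list can be turned into a table on the consuming side.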


Family names used as middle names

I just encountered a problem at Bob Herman (Q106690802) that I've come across several times now. The individual's name is Robert Dunton Herman. We have Dunton (Q37503615) as a family name, but Dunton isn't really a given name. However, our normal way to mark middle names is to put them as a given name with qualifier series ordinal = 2. When I try to do this, it gives me an exclamation mark error, complaining that I'm using a family name as a given name. How should we resolve this? It doesn't seem right to create an entry for "Dunton" as a given name, since it's not. It also wouldn't be good to set given name (P735) to accept family names, though, as that would lead to a lot of errors. And entering the data as Given name = Robert, Family name = Dunton (ordinal 1), Herman (ordinal 2) wouldn't be right, either, as that'd make it seem like he has a two-word last name. {{u|Sdkb}}talk 20:30, 4 May 2021 (UTC)

Especially in the English naming tradition, which is widely used in places such as Ireland, Canada, Australia, and the United States, people are allowed to give almost any given name to their children they please. Later, adults are free to change their given name to almost any name they please. So if some parents gave their child the first given name "Robert" and second given name "Dunton", then both of those are first names, even though some of Robert's ancestors may have used "Dunton" as a family name. These exclamation point errors are constraint violations. Constraints are not rigid rules; they are merely hints that the value might not be correct. It is often necessary to ignore them. Jc3s5h (talk) 21:13, 4 May 2021 (UTC)
I think you need more detail to know if it's a family name or not. Does one of his parents have surname "Dunton"? If so, I'd say he has two family names. Ghouston (talk) 01:35, 5 May 2021 (UTC)
According to Prabook, it's the maiden name of his mother, so I'd say it's a family name, like any double-barrelled surname. Ghouston (talk) 01:40, 5 May 2021 (UTC)
  • The solution was to create a QID for "Dunton" as a given name, even if it is unique to this person. That is how we handle someone with a given name that was a family name of a maternal ancestor. The most famous example is "Johns Hopkins". --RAN (talk) 04:56, 6 May 2021 (UTC)
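As an illustration of the modelling described in this thread (middle names stored as given name (P735) statements ordered by a series ordinal qualifier), here is a small sketch. The class and function are hypothetical and only mirror the idea; they are not how Wikidata or any client library represents statements:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NameStatement:
    prop: str                      # "P735" (given name) or "P734" (family name)
    label: str                     # label of the name item
    ordinal: Optional[int] = None  # series ordinal qualifier, if present


def full_name(statements: List[NameStatement]) -> str:
    """Join given names (sorted by ordinal), then family names (sorted by ordinal)."""
    def ordered(prop: str) -> List[NameStatement]:
        chosen = [s for s in statements if s.prop == prop]
        return sorted(chosen, key=lambda s: s.ordinal if s.ordinal is not None else 0)

    return " ".join(s.label for s in ordered("P735") + ordered("P734"))


# "Dunton" recorded as a second given name, as in the resolution above.
herman = [
    NameStatement("P734", "Herman"),
    NameStatement("P735", "Dunton", ordinal=2),
    NameStatement("P735", "Robert", ordinal=1),
]
print(full_name(herman))
```

With the alternative discussed above (Dunton and Herman both as family names), the same function would produce the same string, but data consumers would read it as a two-word surname.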

Qualifiers and statements

I think sometimes qualifiers for statements can be repeated as statements. For example, you can use country (P17) as a statement and as a qualifier for another statement in the same item. So sometimes I don't know whether I should use something as a statement or as a qualifier. Is it correct? Eurohunter (talk) 22:22, 4 May 2021 (UTC)

Depends on the statement. No way you can generalize it. --SCIdude (talk) 09:14, 5 May 2021 (UTC)
@SCIdude: So what if we have country (P17) as a statement and then a statement for product certification (P1389) with country (P17) as a qualifier? I know it's not the best example, because in this case the qualifier can be used to distinguish product certification (P1389) between countries, but there are surely other examples where it looks repeated; one is probably performer (P175) used as a qualifier for instance of (P31). Eurohunter (talk) 10:24, 5 May 2021 (UTC)


Hi. The user Tm insists on adding names in Portuguese in the English section of Q3375706. Could someone stop him? --Lojwe (talk) 23:37, 4 May 2021 (UTC)

And what about you deleting sourced alternative names in Portuguese, for several weeks? Tm (talk) 23:41, 4 May 2021 (UTC)

Those names are incorrect. They should be deleted too, but, as you are difficult to deal with, I prefer to go step by step. --Lojwe (talk) 23:49, 4 May 2021 (UTC)

Yes, ironically, those names are "incorrect names" by your own words, but so much wrong, that "Empreendimento Hidroeléctrico do Douro Internacional / Picote" is the name given by the portuguese state DGPC - Directorate-General for Cultural Heritage, "Aproveitamento Hidroeléctrico do Douro Internacional - Picote" is the name given by the portuguese Association of Architects or EDP, the electric utility that owned this dam and Aproveitamento Hidroeléctrico do Douro Internacional is the name used by one of the construction companies or " Empreendimento Hidroeléctrico do Douro Internacional" is used by the tourist route "Rota da Terra Fria" made by the association of the municipalities "Associação de Municípios da Terra Fria do Nordeste Transmontano". But who are they to know better than you? Tm (talk) 00:13, 5 May 2021 (UTC)

Perhaps as a compromise, Lojwe, you could stop your close policing of the addition of Portuguese language aliases as Portuguese language aliases; aliases are mainly useful as aids to discovery via search, and so in general more is better ... and Tm, you could stop adding Portuguese language aliases as English language aliases, since if they also exist as Portuguese language aliases then they will be found via search irrespective of their absence from the English language alias list. --Tagishsimon (talk) 05:32, 5 May 2021 (UTC)

  • Aren't native labels fairly common as aliases (at least) in English? The main oddity here it seems to be that it might add company names to what can be seen as a geographic feature. --- Jura 09:38, 5 May 2021 (UTC)
  • Do not remove Portuguese names from the aliases in this case. If they are the "original name" of a certain place or work, then they become a "common" alias for the name in English or another language and should be kept as aliases across all languages. Eurohunter (talk) 10:28, 5 May 2021 (UTC)
@Eurohunter: The user TM insists on introducing Portuguese names in the English section. He's exhausting. Can someone do something about it? --Lojwe (talk) 06:42, 11 May 2021 (UTC)
@Lojwe: Yes, and that's what we wanted (original Portuguese names for Portuguese places as aliases in English and all other languages). Did you read what I wrote above? Eurohunter (talk) 13:26, 11 May 2021 (UTC)
And user Lojwe insists on deleting sourced native names or changing their spelling, claiming a Wikidata policy that does not exist, in more than one place, or removing labels like "Linha Internacional de Barca d'Alva-La Fregeneda a La Fuente de San Esteban" that are the name of the article in the Portuguese Wikipedia. Tm (talk) 18:43, 11 May 2021 (UTC)

@Eurohunter: The name of this Portuguese place is Bemposta. Barragem (pt) is Dam (en) in English. That's what I complain about. I am not sure about what you meant before. --Lojwe (talk) 17:34, 11 May 2021 (UTC)

@Lojwe: Yes, so "Dam" for the label and "Bemposta" for the alias in English. Eurohunter (talk) 18:19, 11 May 2021 (UTC)
The name of this dam is not "Bemposta"; that is the name of the civil parish (see Q816392) where this dam is located. Tm (talk) 18:43, 11 May 2021 (UTC)
Yes, the dam is named after the parish. Bemposta is the name of the place. Barragem is not the name of a place; it is a word that means "dam" in English. --Lojwe (talk) 19:57, 11 May 2021 (UTC)

What Json Fields are Nullable?

I'm working on a library for parsing JSON of Wikidata Entities into a more native format in a strongly typed language (OCaml). Because this language is strongly typed, any fields that are sometimes null have to explicitly be marked as only optionally containing data; when accessing this data using my library, people will have to explicitly deal with the case that no data is present before the code will compile. I've read through the Wikibase JSON Format Documentation a number of times but I keep getting tripped up by fields being nullable that I thought weren't. For example, the property coordinate location (P625) of St John's College (Q691283) is a globe-coordinate with a null precision, but precision wasn't listed as being optional in the documentation. Is there any canonical list of fields that are nullable? I would just mark everything as being optional except that makes it really annoying to work with my library, especially considering there are many fields that are almost certainly never null (like the text field of monolingualtext). --ImpossiblyNew (talk) 04:38, 5 May 2021 (UTC)

A possible explanation for your sample [10] might be that it was added years ago and, while the format evolved, not all data on Wikidata was updated to match that. --- Jura 09:25, 5 May 2021 (UTC)
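Until a canonical list of nullable fields turns up, one pragmatic approach is to mark only the fields actually observed as null as optional. The sketch below shows that idea in Python rather than OCaml, for a globe-coordinate value. The class and function names are hypothetical, and the sample values are made up for illustration; they are not the actual data of the item mentioned above:

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlobeCoordinate:
    latitude: float
    longitude: float
    globe: str
    precision: Optional[float]  # reported as null on some statements


def parse_globe_coordinate(value: dict) -> GlobeCoordinate:
    """Parse a globe-coordinate value, tolerating a null or missing precision."""
    raw_precision = value.get("precision")
    return GlobeCoordinate(
        latitude=float(value["latitude"]),
        longitude=float(value["longitude"]),
        globe=value["globe"],
        precision=None if raw_precision is None else float(raw_precision),
    )


# Made-up sample in the shape of a globe-coordinate value with a null precision.
sample = json.loads(
    '{"latitude": 52.2, "longitude": 0.12, "altitude": null, '
    '"precision": null, "globe": "http://www.wikidata.org/entity/Q2"}'
)
coord = parse_globe_coordinate(sample)
print(coord)
```

Callers then have to handle the missing precision explicitly, while latitude, longitude and globe stay non-optional and fail loudly if the assumption about them ever proves wrong.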

Data Donation from "Open Editors" (Editors from Scholarly Journals)


I have a data-collection project underway called "Open Editors". It uses web scraping to collect data about almost half a million researchers who are on the editorial boards of ca. 6,000 scholarly journals. I plan to scrape regularly (annually over the next few years) so that the data get updated.

I would love the data to be freely available, not just with a CC0-license and a CSV-file (on GitHub), but at Wikidata, so that the data can be linked extensively.

However, I am too new at Wikidata; I lack the knowledge of how to proceed.

Thus, may I ask what I should do in order to initiate a data donation? Is the dataset even suitable for such a venture?

Thank you!

Andrepach (talk) 08:56, 5 May 2021 (UTC)

Nice. The journal link can be used as reference, as you also recorded the date when you visited it. --SCIdude (talk) 09:11, 5 May 2021 (UTC)

Call for Election Volunteer

Hi everyone,

Voter turnout in prior board elections was about 10% globally. We know we can get more voters to help assess and promote the best candidates, but to do that, we need your help.

We are looking for volunteers to serve as Election Volunteers. You can read more about this role here:

Election Volunteers should have a good understanding of their communities. The facilitation team sees Election Volunteers as doing the following:

  • Promote the election in their communities’ channels
  • Organize discussions about the election in their communities
  • Translate messages for their communities

Do you want to be an Election Volunteer for Wikidata or any of the Wiki projects, and connect your community with this movement effort? Check out more details about Election Volunteers and add your name next to the community you will support in this table or get in contact with a facilitator. We aim to have at least one Election Volunteer for Wiki Projects in the top 30 for eligible voters. Even better if there are two or more sharing the work.

If you have any questions or comments regarding this role please reach out to me or any of the board governance facilitators.

Best, Zuz (WMF) (talk) 09:27, 5 May 2021 (UTC)

url formatter

Any reason that this url formatter doesn't seem to be working? Property:P9505#P1630. Jheald (talk) 14:02, 5 May 2021 (UTC)

  • @Jheald: Works now. For some reason there's a delay of some number of hours before url formatters take effect in the UI. You may have to purge items that don't show the link yet. ArthurPSmith (talk) 17:18, 5 May 2021 (UTC)
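For context, a formatter URL (P1630) value contains a $1 placeholder that is replaced by the identifier to build the link. A minimal sketch of that substitution, using a made-up example.org pattern and identifier rather than the real P9505 formatter:

```python
def expand_formatter_url(formatter: str, identifier: str) -> str:
    """Substitute the identifier for the $1 placeholder in a formatter URL."""
    return formatter.replace("$1", identifier)


# Hypothetical formatter pattern and identifier, for illustration only.
link = expand_formatter_url("https://example.org/record/$1", "ABC123")
print(link)
```

This only shows the substitution itself; the delay and purge behaviour described above happen on the wiki side and are not modelled here.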

Subclass of heritage designation

listed historical resource (Q15203884) is a heritage designation by the city of Winnipeg, Canada. Buildings added to the Winnipeg List of Historical Resources (Q106714820) up to 2014 have a classification of grade I, II, or III, which affects their level of protection, but items added since do not (or they have grade “N/A”). It is important to record this, but I’d rather not create three or four new designations to capture this, but rather keep listed status in one place and add a qualifier. Is there an existing property or qualifier suitable for this? —Michael Z. 18:48, 5 May 2021 (UTC)

  • You could qualify statements like Q7885387#Q7885387$6e117367-4e14-d5e6-a7b5-78fe4a978942 with criterion used (P1013)? --- Jura 06:48, 7 May 2021 (UTC)
    The general criteria used to determine designations and grades are separately listed here, but the specific criteria for each designated property are not listed explicitly (maybe some can be inferred from each property’s report). So that’s not quite right, but that property led me to subject has role (P2868), which sorta kinda works if I create four items representing the “roles” Grade I, II, III, and N/A. Does this make sense?
    The alternative is to add three or four subclasses (“N/A” could be added to explicitly contrast with “no data,” and the encompassing superclass ideally becomes empty).
    Which approach is better? My instinct is to have a binary quality of designated or not, and add the grade as a qualifier only, but the implementation is feeling like a kludge. It would feel odd if this municipal designation were more complicated than the provincial and federal ones, but maybe it just is what it is. I guess subclasses is more natural in Wikidata and allows the most fine-grained application of data in the long run.
    I want to decide on this modelling before I start adding a lot of items and statements. —Michael Z. 14:05, 7 May 2021 (UTC)
    I'm not involved in heritage description so I do not understand why criterion used (P1013) with one of four items named like "Winnipeg List of Historical Resources classification grade I" is wrong, because it sounds fine to me. Can you please explain? --SCIdude (talk) 16:07, 7 May 2021 (UTC)
    Criteria are the standards or basis for an evaluation, and its result may be a designation (with or without a grade). This is the normal general meaning, and also used specifically by the city of Winnipeg: “Heritage Values are the architectural & historic significance of a resource and are based on the following criteria: AGE - Its importance in illustrating or interpreting the history of the city or a neighbourhood; PERSON . . . CONTEXT . . . STYLE . . . LOCATION . . . INTACTNESS.” Source(expand “learn more”). —Michael Z. 20:21, 7 May 2021 (UTC)

It's not very clear, your personal preference aside, why you are not following the established pattern seen, for instance, for England and Wales with Grade I listed building (Q15700818), Grade II listed building (Q15700834) &c and for Scotland, category A listed building (Q10729054), category B listed building (Q10729125) &c, all used as main heritage designation (P1435) values. This report provides counts of use. Were you to persist with using a qualifier, object has role (P3831) would be more appropriate than subject has role (P2868) IMO. criterion used (P1013) is useful when modelling the criteria used for something. Here the various grades are not in themselves criteria, but rather the result once criteria have been applied. --Tagishsimon (talk) 16:26, 7 May 2021 (UTC)

Because I was not aware of those. Thank you. —Michael Z. 20:21, 7 May 2021 (UTC)


Is there any way to copy references between items? Eurohunter (talk) 19:22, 5 May 2021 (UTC)

@Eurohunter: Preferences > Gadgets > enable DuplicateReferences Vahurzpu (talk) 19:28, 5 May 2021 (UTC)
@Vahurzpu: Yes, but this tool only moves a reference from one statement to another within one item, and I need to copy references from one item to other items. Eurohunter (talk) 19:45, 5 May 2021 (UTC)
I now realize that I misread your question. I don't know of a tool that does this, but maybe someone else does. There's always the option of creating an item for the reference and linking it with stated in (P248), assuming that makes sense for the reference in question. Vahurzpu (talk) 21:00, 5 May 2021 (UTC)
To my knowledge there is no simple way, which is another glaring reason most statements aren't referenced. When a reference doesn't exist as a fully self-contained item (and I don't believe every citation warrants a dedicated item down to page number), I use MoveClaims tool (User:Matěj Suchánek/moveClaim.js) to copy a referenced statement to the target item (even if it has no relevance beyond the shared reference), then modify the statement and/or reference as needed. -Animalparty (talk) 22:18, 5 May 2021 (UTC)

George Sand (Q3816)

What's the best way to add sex or gender (P21) = male (Q6581097) to the item?

Once added, we will have to set rank to deprecated, but that's another question. --- Jura 12:29, 6 May 2021 (UTC)

She liked to wear men's clothes but in what sense could she be regarded as "male"? — Martin (MSGJ · talk) 20:55, 6 May 2021 (UTC)
I had mostly in mind facts Wikidata knows about, i.e. authorship under pseudonym. --- Jura 06:43, 7 May 2021 (UTC)

GND and ZDB identifiers for journals

I have a question on how to link to journals with the ZDB and GND. I have Nature Methods (Q680640) which is recorded in these two datasets:

However I do not find a property for which identifier "026529807" works. Instead, there is ZDB ID (P1042) which has identifier 2163081-1 and using the formatter URL creates the following link which redirects to . So it seems to me that identifier "026529807" is preferred across GND and ZDB but I cannot link to it from Wikidata while the ZDB ID (P1042) identifier "2163081-1", to which we can link, seems to be redirected to the other identifier. Can someone explain what is going on and whether one identifier should be preferred over the other? Is there a property with a formatter URL which works for "026529807"? I am just confused here. --Hannes Röst (talk) 16:33, 6 May 2021 (UTC)

This seems quite mysterious. Google doesn't find anything other than this conversation linking 026529807 with an "IDN" identifier. Do you have any contacts at ZDB who could explain what IDN means? ArthurPSmith (talk) 15:04, 7 May 2021 (UTC)
It’s beyond complicated. GNDs also have IDNs, which are called PPNs (Pica-Produktions-Nummer) in other contexts. They are causing all sorts of troubles because sometimes you can only use the GND, sometimes only IDN/PPN, sometimes both and sometimes none, depending on the catalogue a library is using. I never really figured out the logic behind that, maybe @Kolja21, Wurgl, Raymond, Unukorno, Silewe: can help? --Emu (talk) 19:42, 7 May 2021 (UTC)
Like explained here "" are links for single editions. For ID 026529807 use DNB editions (P1292). --Kolja21 (talk) 20:08, 7 May 2021 (UTC)
PS: IDN = Identifikationsnummer. Explication in German. --Kolja21 (talk) 20:20, 7 May 2021 (UTC)
In short: If the link starts with it is a GND-Number. If there is no /gnd/ in the link, it is that IDN/PPN. Another hint: GNDs never start with a zero. --Wurgl (talk) 20:47, 7 May 2021 (UTC)
So should we propose a new property for the IDN, or is it already covered somehow? ArthurPSmith (talk) 20:59, 7 May 2021 (UTC)
In deWP, we use the ISSN for magazines, as you can see in the Infobox of de:Nature Methods --Wurgl (talk) 21:20, 7 May 2021 (UTC)
It seems that the ISSN links also into the ZDB, but then what are the circumstances where we would need ZDB ID (P1042), is that if there is no ISSN? Also it seems that the ISSN search reveals two results: [11] and [12] so it's unclear how to handle that, should we link both ZDB links? --Hannes Röst (talk) 14:24, 10 May 2021 (UTC)
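
The rule of thumb given by Wurgl in this thread (a link containing /gnd/ carries a GND number, any other link carries an IDN/PPN, and GND numbers never start with a zero) can be sketched in Python as below. This is only an illustration of that heuristic: the function names are invented here, the URLs use a placeholder host, and nothing in it comes from DNB or ZDB documentation.

```python
from urllib.parse import urlparse


def classify_dnb_link(url: str) -> str:
    """Apply the rule of thumb: '/gnd/' in the path means GND, otherwise IDN/PPN."""
    path = urlparse(url).path
    return "GND" if "/gnd/" in path else "IDN/PPN"


def could_be_gnd(identifier: str) -> bool:
    """Per the thread, GND numbers never start with a zero."""
    return not identifier.startswith("0")


# Hypothetical URLs on a placeholder host; only the path shape matters here.
print(classify_dnb_link("https://example.org/gnd/1234567-8"))
print(classify_dnb_link("https://example.org/026529807"))
print(could_be_gnd("026529807"))
```

Under this heuristic the identifier "026529807" from the question cannot be a GND number, which is consistent with Kolja21's advice to record it with DNB editions (P1292).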


How long does it take for a wikidata item to be indexed by a search engine? Or is that up to the people who run the search engine? CanadianOtaku Talk Page 01:15, 7 May 2021 (UTC)

that is up to the search engine people. BrokenSegue (talk) 02:12, 7 May 2021 (UTC)
  • You can run a test yourself, create a new QID and search in Google for it every day. Generally it takes about 30 days for new items in Wikidata to be indexed. The 30 days window gives us a chance to weed out fake entries. --RAN (talk) 16:18, 7 May 2021 (UTC)

Plural units

This might be a somewhat fussy/low-priority task, but it's always irked me a bit that we describe the cost of something as "47 United States dollar" or the height as "47 foot" or the area as "47 acre". We do have lexemes that seem to record the plural forms of these units, e.g. acre (L15846); would it be hard to get those connected so that plural units are displayed properly as plural? {{u|Sdkb}}talk 02:19, 7 May 2021 (UTC)

As far as I'm aware, in English and Dutch we do not use plural in these constructions. For other languages I do not know, but the English examples you give here are syntactically correct to me. Edoderoo (talk) 05:41, 7 May 2021 (UTC)
We'd say "the cost is 47 dollars" or "the campus is 48 acres", not "the cost is 47 dollar" or "the campus is 47 acre". {{u|Sdkb}}talk 17:53, 7 May 2021 (UTC)
The way units are used depends on the language. For instance, "three metres" in German is "drei Meter", using the singular form. It would be a nice project to develop a tool which renders grammatically correct quantity expressions for various languages using lexemes. Toni 001 (talk) 08:03, 8 May 2021 (UTC)
Wikidata is mainly for machine. Grammar no matter. Machine read data, third party user modify format as user want. All good. -Animalparty (talk) 21:04, 7 May 2021 (UTC)
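
As a rough illustration of the tool Toni 001 describes, here is a minimal Python sketch that picks a unit form per language. The `UNIT_FORMS` table is hand-written and hypothetical; a real tool would read the forms from lexemes such as acre (L15846). The "singular only for exactly 1" rule is a simplification that does not hold for every language.

```python
# Hand-written stand-in for unit forms that a real tool would read from lexemes.
UNIT_FORMS = {
    "en": {
        "acre": ("acre", "acres"),
        "foot": ("foot", "feet"),
        "metre": ("metre", "metres"),
    },
    # German keeps "Meter" after a number, as noted in the thread.
    "de": {"metre": ("Meter", "Meter")},
}


def render_quantity(amount, unit: str, lang: str) -> str:
    """Render an amount with the unit form appropriate for the language."""
    singular, plural = UNIT_FORMS[lang][unit]
    # Simplification: use the singular form only for exactly 1.
    form = singular if amount == 1 else plural
    return f"{amount} {form}"


print(render_quantity(47, "acre", "en"))
print(render_quantity(3, "metre", "de"))
```

Keeping the forms in a per-language table means the display layer can change without touching the stored quantity, which stays machine-readable as Animalparty prefers.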

Property for 2016 mirror(?) of Wikipedia

Currently there is Wikidata:Property_proposal/English_Everipedia_ID open for discussion. I'm not really sure what to think of it.

w:Everipedia#Content_and_users notes "as of May 2019, most or all Everipedia articles originating as Wikipedia articles, including those never edited on Everipedia, had not had updated Wikipedia content applied since 2016". I wonder if it's that site where I occasionally end up when looking for articles deleted from enwiki.

Anyways, I think the discussion could use more input. In the meantime, samples mostly pointing to copies of Wikipedia have been removed. --- Jura 06:41, 7 May 2021 (UTC)

I think Everipedia is a terrible idea, and no one should associate with it. -Animalparty (talk) 04:41, 8 May 2021 (UTC)

request to import data from project: "Cheung Chau Piu Sik Parade"

We want to import data to WikiData from our project: Cheung Chau Piu Sik Parade

Here is the project link:

Here is the dataset:

Thank you for your suggestions.

  • Wikidata is not about uploading your CSV file; Wikidata has its own structure. To upload data to Wikidata you have to think about how the information can be expressed in Wikidata's data structure. Items are about more than just label and description. ChristianKl❫ 11:17, 7 May 2021 (UTC)
  • We just want to do data donation, how can we start? --Hkbulibdmss (talk) 06:19, 10 May 2021 (UTC)

User profiles, wikidata and allow to have links to other profile languages


Can we officially allow (vote) a user to create a wikidata page only for the purpose of linking to profiles in different languages? Currently it is not officially strictly allowed.

For example, I have a lot of posts on skwiki and hrwiki (plus, of course, enwiki), so if someone opens my profile, they would see that I also have profiles on another language wiki.

I mean to officially enable this (red in the picture) on wiki profiles.

✍️ Dušan Kreheľ (talk) 08:59, 8 May 2021 (UTC)
✍️ Dušan Kreheľ (talk) 00:19, 9 May 2021 (UTC)

It would be nice if wikidata could take care of that. At the moment there is mw:Extension:Cognate, so I wonder if it could be repurposed to link between user pages in different wikimedia wikis.--MathTexLearner (talk) 15:23, 8 May 2021 (UTC)
You need to manage this locally via oldschool interwikilinks. This is not going to happen via Wikidata. —MisterSynergy (talk) 10:28, 9 May 2021 (UTC)
If it was possible, it would probably be nice to connect userpages via Wikidata! --Koreanovsky (talk) 10:59, 9 May 2021 (UTC)

I understand that the proposal involves creating a Q for an editor. I am against that idea. It will be great when we can have "global user pages" but creating Qs for editors is not the way. B25es (talk) 15:23, 9 May 2021 (UTC)

We already do have global user pages. However, that gives you the same content on every wiki, which you may not want (for instance, my Wikidata userpage has stuff about my Wikidata editing, while my Commons userpage has stuff about photography). Vahurzpu (talk) 16:15, 9 May 2021 (UTC)
@B25es: Do you also have some arguments/reasons why this is not good?
✍️ Dušan Kreheľ (talk) 23:09, 9 May 2021 (UTC)

Qs are about subjects (there used to be an old and often repeated sentence in Spanish "personas, animales o cosas", of course here we have also ideas, abstract concepts, facts, events, lands, classes of viruses...) that are relevant (worth to be mentioned) in this project. My turtle isn't and I am not either. Therefore, no Q shall be made about me (or my turtle). If at some point in time I happened to be of interest (for instance, if I run for alderman of my town hall, I think we have to be inclusive) then a Q could be made about my person.Dušan Kreheľ

If global user pages are not good enough, they should be improved. But adding Qs about subjects that do not merit it or -even worse- do not want them, that's not the way to solve the problem. I know of fellow users who have Qs because of being notable in some way: being a relevant member of WMF or a chapter, for instance. And I can't help wonder "do I really want to know who is this person's father?" -father/mother/place and date of birth/alma mater are pretty common properties about any person. Because Qs are to be filled with Ps and those filled with info. That's the nature of this project. And I really don't feel like we should have (y)our personal information here exposed as if we were Q181715, Q76754, Q57359, or Q196527.

I see your point, but the answer is not a Q for every editor. B25es (talk) 06:30, 10 May 2021 (UTC)

Dušan Kreheľ, create such a page in metawiki (link to your user page) which will act as global user page. There you can add hints to wikis where you are usually active. (Edit while still writing: I noticed that Vahurzpu already pointed to this feature.) This page will be displayed on all your user pages in the Wikimedia universe for which you do not have a dedicated user page created, but only if you have logged in at least once and your account exists in this project. You can take my user page here in Wikidata as an example and see how this may look. If there is some content on the page but you want the global page displayed, request a local deletion. If you want to add this information only to your userpage on Wikidata: I used the templates {{User SUL Box}} and {{BUser}} (the latter twice). — Speravir – 00:09, 11 May 2021 (UTC)
  • I think the easiest solution here would be on the software side - allow old-style interwiki links (like we had before Wikidata) to work on user pages. Guettarda (talk) 13:59, 11 May 2021 (UTC)
    Don't these old-style interwiki links still work? I can't really test it, as I have only a global user page on meta. —MisterSynergy (talk) 14:07, 11 May 2021 (UTC)
    They indeed still work. And this feature even works in preview of wikieditor (flavour of 2010). — Speravir – 00:34, 12 May 2021 (UTC)

Description of templates

I just tried to merge Q26105412 and Q10976602, but the attempt was blocked due to conflicting descriptions in various languages. It seems most of the conflicting names are of the type "Wikipedia template" vs "Wikimedia template". I am wondering whether the former type can be considered obsolete (or are there cases where "Wikipedia template" works, but not "Wikimedia template"?), and if so, whether a bot could run over and fix this across Wikidata. --2A02:587:B946:8D35:F8E0:13E0:66F5:E873 10:09, 9 May 2021 (UTC)

Don't know what the problem was. I've just merged them. No bots needed. --Tagishsimon (talk) 10:30, 9 May 2021 (UTC)
Thanks. Might be a problem only when using the merge special page. --2A02:587:B946:8D35:F8E0:13E0:66F5:E873 12:00, 9 May 2021 (UTC)
In general whenever you get an error message and want help with it, it makes sense to copypaste the error message. That's true on Wikimedia and also when you ask for help elsewhere on the internet.
In this case, it would be possible that there's some rule that doesn't allow non-autoconfirmed users to make certain merges. ChristianKl❫ 22:01, 9 May 2021 (UTC)
Not 100% certain on this, but as I believe I've run into this at least once with Special:MergeItems, this may be a quirk of how the special page works. The way most editors merge (I assume) is with the merge gadget, which has specific logic to make it ignore description conflicts. It's just that unregistered users have no way to enable the gadget. Vahurzpu (talk) 01:34, 11 May 2021 (UTC)
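
To make the description-conflict point concrete: the Wikibase API module `wbmergeitems` accepts an `ignoreconflicts` parameter, and `description` is one of its values. The Python sketch below only builds the request parameters and sends nothing. The helper name and the token string are placeholders, and whether the merge gadget issues exactly this call is an assumption, not something stated in this thread.

```python
def build_merge_request(from_id: str, to_id: str, csrf_token: str) -> dict:
    """Build POST parameters for a wbmergeitems call that ignores description conflicts."""
    return {
        "action": "wbmergeitems",
        "fromid": from_id,
        "toid": to_id,
        # Skip conflicting descriptions instead of aborting the merge.
        "ignoreconflicts": "description",
        "format": "json",
        "token": csrf_token,
    }


# Item IDs from the question above, used purely as an illustration;
# the token is a placeholder and no request is made.
params = build_merge_request("Q26105412", "Q10976602", "PLACEHOLDER_TOKEN")
print(params["ignoreconflicts"])
```

A real call would need a valid CSRF token and an account permitted to merge items.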

Member of a First Nation

What’s the right statement to describe a member of a First Nation (also called an Indian band, Indian reserve, etcetera)? It is a cultural community, not an ethnic group.

Related: the description of member of (P463) says not to use it for a social group, but not what to use. There is also social group (Q874405) and social community (Q4430245), but they aren’t defined with precision. —Michael Z. 02:32, 10 May 2021 (UTC)

@Mzajac: this was discussed pretty recently. maybe the old conversation will help some Wikidata:Project_chat/Archive/2021/04#Wikidata_properties_for_tribes? BrokenSegue (talk) 04:15, 10 May 2021 (UTC)
Thanks. Tagging user:*Treker, who was planning to propose a property. citizenship (Q42138) may be a good model for this.
For general info, a Canadian First Nation, a First Nation band (Q2882257), is not an ethnic group or sub-group. Some have only dozens of residents. Some have multiple Indian reserve (Q155239) associated with them, and also traditional lands beyond their boundaries. Membership in a First Nation is separate from residence there: one can have either or both. Most, but not all members are registered under the Indian Act (Indian Register (Q2095049)). Inuit and Métis people do not have this arrangement. (I am no expert; just sharing the basics I’m aware of.) —Michael Z. 15:24, 10 May 2021 (UTC)
@Mzajac: I am still planning on proposing a tribe property (and maybe two separate properties for federally recognized tribes of the US and First Nation bands of Canada), but I want to do a really good job of it and since I'm a bit sick right now I will likely wait until I have the energy in a few days. I think it's absolutely a thing that's needed for Wikidata.*Treker (talk) 19:09, 11 May 2021 (UTC)
Thank you. I don’t know much about US tribes, but it would be beneficial if a generalized property could work for membership in Indigenous groups everywhere, and perhaps specific subclasses added if and when necessary – otherwise this exercise will just be repeated again and again. Recently Canadian First Nations have been establishing that they legally exist beyond the borders of Canada.[13] There are also Indigenous peoples in hundreds of countries in Oceania, Australia and New Zealand, Lapland, Crimea, the Philippines, etc. —Michael Z. 19:21, 11 May 2021 (UTC)

Share your IP Masking comments and suggestions

Hello colleagues,

Due to global trends about user data collection and use, the Wikimedia Foundation will be masking IPs to protect editors' privacy but would also be building tools to ensure we are able to continue fighting vandalism and other abuse in the absence of IP addresses. We would like to know, how will IP Masking impact you? Also which tools will you need to be able to effectively govern the projects in absence of IPs?

Kindly read more about the project and the tools we are currently working on, you can offer critique, you can also suggest your own. Please use the talk page for this.

Best regards,
Anti-Harassment Tools Team
STei (WMF) (talk) 12:36, 10 May 2021 (UTC)

Wikidata weekly summary #467

"manual" update of constraint violations pages

Is there a way to make KrBot2 refresh the page Wikidata:Database reports/Constraint violations/P2991 again? --Gymnicus (talk) 15:24, 10 May 2021 (UTC)

No. KrBot2 updates are a pretty heavy operation. The bot operator processes an offline dump to update these reports every couple of days. —MisterSynergy (talk) 15:33, 10 May 2021 (UTC)
@MisterSynergy: Interestingly, the KrBot updated the page right now. How did it come about now? --Gymnicus (talk) 15:36, 10 May 2021 (UTC)
Well, that's unusual. Maybe you got a special treat by the operator? Usually KrBot2 makes these updates, not KrBot. —MisterSynergy (talk) 15:40, 10 May 2021 (UTC)
I started the update procedure manually. This can be done for small reports (fewer than ~2000 items). Direct access to Wikidata is used in this case. This makes many requests to the Wikidata engine, so unfortunately it cannot be used on a regular basis. — Ivan A. Krestinin (talk) 15:50, 10 May 2021 (UTC)
For some of the constraints, there is the option to run the queries on property talk pages. This should give the same or similar results. --- Jura 06:24, 12 May 2021 (UTC)

Merging Multiple Items


I recently noticed there are four entries for the same public library in Findlay Ohio and looked into merging them.

  • Q69961220
  • Q69961218
  • Q69487490
  • Q30289487 - Notably uses the full name instead of an abbreviation. Unsure if this should be merged with the others or not due to slight differences (I'm 99% sure that on the ground, these are one and the same, but maybe there's some library nuance I'm missing?)

However I got an error when I tried to use the merge tool, and noticed that these items refer to themselves anyway, so I figured I would ask a more experienced user first so I didn't mess anything up.

Thank you for your time! Mbrickn (talk) 00:55, 12 May 2021 (UTC)

The labels & descriptions used for the first three were arguably suboptimal, but their P31 statements specify that they are, respectively, a bookmobile, a main library and a library network - all imported from the same source, the Public Libraries Survey 2017. They are not duplicates. The fourth (described variously as an archive organization / cultural institution) might be a duplicate (of the main library or the network), but might equally be a discrete sub-entity worthy of an item. For now I've made it part of the network, which precludes it being merged. I think there's little more to do here, sadly. --Tagishsimon (talk) 01:16, 12 May 2021 (UTC)

Update Ancient History Encyclopedia

Hello! I was wondering whether someone with more knowledge about Wikidata than I could update Ancient History Encyclopedia ( The publication recently rebranded to World History Encyclopedia, and all links should now point to$1 instead of$1. Is it possible to change the name and URL accordingly? Thanks!

should be done. thanks for the tip. BrokenSegue (talk) 02:33, 12 May 2021 (UTC)

should i merge artist and artist commonswiki category?

when i try to add commonswiki site link on Linda McCartney (Q228899), it says interwiki conflict with Category:Linda McCartney (Q8591290). earlier i raised similar question on interwiki conflicts chat. however, i request you to suggest best way to resolve this issue. Gi vi an (talk) 07:00, 12 May 2021 (UTC)

@Gi vi an: A short answer is "no". We usually keep distinct items for concepts and their Wikimedia categories. Vojtěch Dostál (talk) 08:25, 12 May 2021 (UTC)

single (Q134556) or musical work (song) (Q2188189)

Today I was looking at wikidata item Mag ik dan bij jou (Q19696999). This item describes a beautiful song of the Dutch performer Claudia de Breij. It is labeled as instance of single (Q134556); in my opinion it should be labeled as musical work (Q2188189). A single is described as "type of music release usually containing one or two tracks". Musical work is described as "musical work of art" or "piece of music". My interpretation: a single is something you put on your record player. The musical work is the song itself, the creative work.

What do you think? JohnBoers (talk) 07:07, 12 May 2021 (UTC)