User talk:Jheald

Welcome to Wikidata, Jheald!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, please ask me on my talk page. If you want to try out editing, you can use the sandbox to try. Once again, welcome, and I hope you quickly feel comfortable here, and become an active editor for Wikidata.

Best regards! Liuxinyu970226 (talk) 23:58, 10 August 2014 (UTC)

WDQ

Yes, you can filter those results by label matching some characters, but for that you need to use Autolists2. The link is on the same page of autolists, on the top part, it says: "FOR EDITING WIKIDATA, please use this tool's successor, AutoList 2!"--Micru (talk) 06:42, 15 August 2014 (UTC)

Wikidata:WikiProject sum of all paintings

I see you're interested in GLAM and structured data. You might want to join this project. The project has its own goals and it will already give us a lot of experience on how to model data about works of art. Multichill (talk) 09:48, 17 August 2014 (UTC)

Wikidata:WikiProject Structured Data for Commons/Phase 1 progress/Links updates

I think it would be a good idea to set up periodic link updates (for example, once a day), or to provide an update link similar to c:User:OgreBot/Uploads_by_new_users/2014_September_06_06:00. --EugeneZelenko (talk) 14:56, 8 September 2014 (UTC)

@EugeneZelenko: Hi Eugene, thanks for the interest. I have to admit, I'm still quite a newbie when it comes to automation -- at the moment I'm trying to get my very first bot to run (on Commons), and struggling to understand why the perl module I was going to use is refusing to install -- so it looks like the moment has finally come that I'm going to have to get to know Python...
The pages were the best I can do at the moment with my current level of experience and understanding, because I haven't yet really worked out how to get myself set up on Labs to run my own queries; nor the bot framework to put it all together; nor how to trap a button being pressed and get it to run a bot. These are all things that I probably ought to try to find out quite soon, but for the moment it's beyond where I am at.
On the other hand, what should be reasonably easy with the links provided should be to get a current count to see if that's different to what's on the page; and to download a .tsv file that can then be cut-and-pasted in the edit window. Yes it's a pain, and it would be nice to have a shiny blue button that updated everything automatically, or a cron job automatically updating the pages every n days; if somebody wants to make that, I'd be very very happy to see it.
But for the moment, it did the basics of showing what's there. And if some nice person did go ahead and spend an afternoon clearing out eg all the direct file sitelinks that shouldn't be there, then I hope I have made it easy enough for them to regenerate the page. Jheald (talk) 15:33, 8 September 2014 (UTC)
Updated cygwin, so MediaWiki::Bot now installs: so I'm getting there... towards my first bot edit, anyway. Still a long way to writing automated pages though. :-) Jheald (talk) 15:45, 8 September 2014 (UTC)
I fixed some of the problems as time allowed, but I didn't have time for page updates. --EugeneZelenko (talk) 13:57, 9 September 2014 (UTC)
@EugeneZelenko: Wow! You have been busy! Now only 4 links to Creator pages, none at all to Institution pages, and a dozen fewer than there were to File pages. Impressive!
You're right, the updating is a pain (having now done it). It would be good to have auto-updated summary statistics on the summary page. This surely can be built, but I'm not sure I can do it very soon. (Still failing to get my first ever bot edit to actually happen over on Commons -- for some reason MediaWiki::Bot can't re-write the page, so it looks like the day has finally arrived for me to learn Python!)
However one important thing that may help, is to tell people that they need to be logged in to the Quarry tool in order for the "Submit Query" button to appear. Then, particularly if you're just working on one namespace, updating oughtn't be too painful. Jheald (talk) 16:47, 9 September 2014 (UTC)

Creator template

Hi Jheald, I'm traveling these days and I don't have time to look into it, but I recommend taking a look at the Authority template in Wikipedia; maybe you will get some ideas from there. Good luck!--Micru (talk) 12:30, 22 September 2014 (UTC)

Louis Carrogis Carmontelle (Q982053)

Hi James, I don't get these edits. Care to explain? Multichill (talk) 18:02, 21 November 2014 (UTC)

@Multichill: I was using Magnus's QuickStatements tool to add Commons Creator page (P1472) properties. But it seems I have got some of the Q-numbers wrong. It's conceivable that I wasn't careful enough to check whether a regex capture had been successful, and used an old value. I'll look at the scripts and try to identify what went wrong, and if there were any other Q-numbers that got spurious additional links. Thanks for spotting this & pulling me up about it. Jheald (talk) 18:33, 21 November 2014 (UTC)
@Multichill: Update: It looks like the script was working properly, but the Q-values on the creator templates were wrong. (Presumably due to a never-corrected cut and paste when they were created). There are also a few genuine duplicates, which I will turn into redirects. In all about 100 such Creator templates to go through, so I'll get on with that. All best, Jheald (talk) 19:01, 21 November 2014 (UTC)
Ok, thank you! You did see the constraint report? It's very useful for finding mistakes and duplicates. Can you revert Louis Carrogis Carmontelle (Q982053) when you're done? Multichill (talk) 19:14, 21 November 2014 (UTC)
@Multichill: Thanks, I'd forgotten that was there. So I'll be going through the list, and sorting out the dupes and incorrect Q-numbers. Jheald (talk) 19:19, 21 November 2014 (UTC)
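The failure mode suspected earlier in this thread (a regex capture that fails, so an old value gets reused) can be guarded against explicitly. The following is only a minimal Python sketch, not the script that was actually run: the `| Wikidata = Q…` field pattern, the helper names and the sample wikitext are all assumptions made for illustration, and the tab-separated output only approximates the kind of rows pasted into QuickStatements.

```python
import re

# Hypothetical pattern: a Creator template field such as "| Wikidata = Q982053".
QID_PATTERN = re.compile(r"\|\s*Wikidata\s*=\s*(Q[1-9]\d*)")


def extract_qid(wikitext):
    """Return the Q-number found in the wikitext, or None if there is none."""
    match = QID_PATTERN.search(wikitext)
    # Testing the match object avoids silently reusing a value that was
    # captured from a previous page.
    return match.group(1) if match else None


def creator_rows(pages):
    """Yield item <TAB> P1472 <TAB> "creator name" rows for pages with a Q-number."""
    for creator_name, wikitext in pages:
        qid = extract_qid(wikitext)
        if qid is None:
            continue  # no capture: skip the page rather than fall back to an old value
        yield f'{qid}\tP1472\t"{creator_name}"'


# Hypothetical sample input: (creator page name, template wikitext).
sample_pages = [
    ("Louis Carrogis Carmontelle", "{{Creator | Wikidata = Q982053 }}"),
    ("Page without a link", "{{Creator | Name = Somebody }}"),
]
rows = list(creator_rows(sample_pages))
```

A check like this only catches missing captures; as noted above, it cannot catch a Q-number that is present on the template but wrong.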

BBC Your Paintings artist identifier

You added Art UK artist ID (P1367) to John Jones (Q454248), Francina Margaretha van Huysum (Q15511647) and James Charles (Q6131230). What was the data source? These statements were all wrong but it does not seem like an error by you but by the data source as also other users added exactly the same wrong statements. --Pasleim (talk) 10:02, 2 December 2014 (UTC)

@Pasleim, Jane023: Good catch. These look like automated edits made by synchronising from Magnus's Mix-n-match tool. Somebody has wrongly identified "Your paintings" links with these items in the tool, and it is then trying to re-add the information every time somebody uses the synchronise option. (Choose catalog, then the 'Y' link at the end of the "Your paintings" line).
This will presumably continue until the incorrect identifications are removed from the tool. The best way to do that is probably to set up correct items for these links; then update the tool by importing from Wikidata; then check for double use of IDs. I'll get on to this. I think I can see the "John Jones" one; the others may take a little more investigation. Jheald (talk) 11:46, 2 December 2014 (UTC)
@Pasleim, Jane023: Update. I think I've removed all but the John Jones identification from the mix'n'match tool. The key was to use the 'Search' link, which then had a "Remove match" function. Unfortunately, there seem to be more "John Joneses" than the search can display, so it doesn't give me the option. So for the moment, anyone who uses the update function must remember to remove the "John Jones" link manually from the Quick Statements run. Jheald (talk) 12:38, 2 December 2014 (UTC)
Yes! Thanks Pasleim and Jheald! I have been "unmatching" these whenever they appear, because otherwise they just get added again. So far it seems the error ratio is quite low, but it does worry me. One thing I noticed is that when I unmatch these mistakes and then make another sync run the same day, the mistakes get made again, so you need to check the data if you make a sync run the same day. Otherwise it's best to wait a day between sync runs. Jane023 (talk) 16:18, 2 December 2014 (UTC)

RKD

Hi James, this doesn't work. It's just a report so it's easy to find items to work on. I added these manually now. Multichill (talk) 20:12, 7 December 2014 (UTC)

@Multichill: Okay, that makes more sense now. I thought it was a bit brutal to edit by hand! My immediate current focus is more on the "Your Paintings" list, as organised by time over at en:WP; on trying to wrap up the BL map tagging project; and try to get some experimenting done with en:Content based image retrieval, ideally to have something with the BL collection to have to show for a seminar on the 17th -- so I'm a bit committed at the moment. But I'll try to fit in some of the RKD artist lookups if I can find a moment. All best, Jheald (talk) 21:59, 7 December 2014 (UTC)

GEMET Thesaurus?

See https://www.eionet.europa.eu/gemet/theme_concepts?th=13&langcode=en . What do you think? Should we add it? Multichill (talk) 11:35, 12 April 2015 (UTC)


twins

FYI: https://de.wikipedia.org/wiki/Diskussion:Johann_Zacharias_Richter --- Jura 12:01, 23 September 2015 (UTC)

@Jura1: Very interesting. Thank you! Jheald (talk) 13:34, 24 September 2015 (UTC)

tinyurl and WDQS

You don't have to rely on tinyurl: copying and pasting the URL from WDQS includes the query. This makes it possible to build clickable links, at the cost of a slightly less readable diff. I must admit I prefer clickable links. author  TomT0m / talk page 09:05, 8 October 2015 (UTC)
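As an illustration of the point above, a query link can also be built by hand, since the WDQS address carries the percent-encoded query in its fragment. This is a minimal Python sketch; the `wdqs_link` helper and the sample query are assumptions made for the example.

```python
from urllib.parse import quote, unquote

WDQS_BASE = "https://query.wikidata.org/#"


def wdqs_link(sparql):
    """Return a WDQS URL whose fragment is the percent-encoded query."""
    # safe="" encodes every reserved character, so the URL contains no spaces
    # or brackets that would break a wiki external link.
    return WDQS_BASE + quote(sparql, safe="")


# Hypothetical sample query.
query = "SELECT ?item WHERE { ?item wdt:P373 ?category } LIMIT 10"
link = wdqs_link(query)
print(link)
```

Because the encoded URL contains no spaces, it can be dropped into single-bracket external link syntax with a label after it.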

Reason for deprecation

I've created reason for deprecation (P2241) based on your request. Mbch331 (talk) 11:46, 16 October 2015 (UTC)

Improved grouping

Hi Jheald,

This might help you for improved/simplified grouping. --- Jura 13:20, 18 October 2015 (UTC)

Thanks for commenting. Unfortunately, I don't think your alternate can work out. There are too many variations involved and what works with the English label "John and variants" doesn't necessarily lead to the same with the label for the same item in another language (e.g. ru:"Джон and variants"). --- Jura 14:44, 18 October 2015 (UTC)
@Jura1:: So create groups that combine everything that is considered as a variant in any language -- as per the searches at Wikidata:WikiProject Names/given-name variants.
Then, if it makes sense to define particular sub-groups within that overall group, that is straightforward too. Jheald (talk) 15:35, 18 October 2015 (UTC)
overlapping subgroups? --- Jura 15:40, 18 October 2015 (UTC)
@Jura1: Not a problem. An item can be a member of more than one subgroup. It is then possible to query for either subgroup and extract a list of corresponding "instances of". Jheald (talk) 15:45, 18 October 2015 (UTC)
Personally, my primary focus is not querying them. I thought the property might help you with your queries, but it seems it doesn't. I did find a way to solve the identical birth/death day question though.
I'm sure in theory your suggestion might work. It might even work in practice with a single user creating the groups. We frequently get such suggestions or comments in property proposal discussions, but one needs to bear in mind that this is Wikidata: many contributors from different backgrounds, editing in different languages. For things to work, you need to have clearly defined properties that can be referenced and checked. With names this is particularly tricky ... --- Jura 15:55, 18 October 2015 (UTC)

Just to know...

... how do you find this category so that you can make such an edit? I ask because if you find it in dewiki then it should be mentioned as a reference. --Aschroet (talk) 06:00, 7 November 2015 (UTC)

@Aschroet: I ran a search for every article-like item that had a sitelink to a Commons category, but didn't have a Commons category (P373). I did think about putting in a reference, but it seemed odd to give Wikidata itself as a reference, and I couldn't find a Q-number for 'sitelink'; and in any case the new 373 claim is only as strong as the existing cross-namespace sitelink, which is unreferenced. So it seemed reasonably appropriate to put it down as a similarly unreferenced bare claim. Jheald (talk) 08:06, 7 November 2015 (UTC)
Would it be possible to do the opposite as well, please - wherever P373 exists but there isn't a sitelink to a Commons category, add the sitelink? That would be incredibly useful for interwiki links on Commons. Thanks. Mike Peel (talk) 13:14, 7 November 2015 (UTC)
Hi @Mike Peel:.
On a purely technical level, it would be entirely possible and, in fact, dead straightforward. The only difference would be one of scale. There were about 80,000 items that had cross-namespace sitelinks but no Commons category (P373), whereas there are about 800,000 items that have a P373 but no sitelink. Magnus's Quick Statements tool is throttled to about 4500 edits an hour, ie about 100,000 edits in 24 hours going full tilt. (I'm currently making the edits in batches of 4,000 or 20,000 at a time). So whereas this job is going to take about a day to complete, the opposite would take about 10 times as long.
But that's not the real issue. The real issue here is political, not technical. Adding P373 statements is (or should generally be) completely uncontroversial -- it is exactly what the property was made for. On the other hand there is a definite controversy about sitelinks that go "cross-namespace", ie from an article-like item here to a category on Commons.
It is a controversy that may be edging towards resolution purely through the development of facts on the ground. A year ago there were 100,000 such cross-namespace links. I ran the same search a couple of months ago and found there were now 200,000. So it does look like we're moving towards a de facto acceptance on the ground. I posted these numbers at the time, both to the mailing list and to Project Chat, to ask whether people were okay with this, because if one wanted to take a definitive view on it, the time to do so would be now. But it seemed the response was just a resounding "Meh".
There definitely was a constituency here for a distinct Category <--> Category, Article <--> Gallery sitelink division. For one thing it means you know what kind of Commons page you're going to end up on, so there's predictability, whether for people or for bots or for tools; and for another thing, it means that if you allow no links from article items to categories, you can never get trapped wanting to add a link to a category but being caught out because there is already a link to a gallery blocking its path.
As I have said, I am not sure to what extent there is or is not still a constituency prepared to take action to enforce such demarcations. But at the same time given the current greyness of the issue, I am not sure that I would want to be one to steam into such muddy ground tooling up to make 800,000 edits. Jheald (talk) 15:17, 7 November 2015 (UTC)
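For reference, the scale estimate earlier in this reply can be reproduced with a few lines of Python. This is just the arithmetic from the figures quoted above (about 4500 edits an hour, about 80,000 and about 800,000 items); the names are illustrative, and it assumes the tool runs continuously at the throttled rate.

```python
EDITS_PER_HOUR = 4500  # throttle rate quoted above


def hours_needed(edit_count, rate=EDITS_PER_HOUR):
    """Hours to make edit_count edits at a constant rate."""
    return edit_count / rate


p373_job = hours_needed(80_000)       # adding P373 where a sitelink already exists
sitelink_job = hours_needed(800_000)  # adding sitelinks where P373 already exists

print(f"Edits per 24 hours: {EDITS_PER_HOUR * 24}")
print(f"P373 additions: about {p373_job:.0f} hours")
print(f"Sitelink additions: about {sitelink_job / 24:.1f} days")
```

At that rate the P373 job comes to roughly 18 hours of continuous running and the reverse job to a little over a week, which matches the "about a day" versus "about 10 times as long" estimate.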

Questionable use of withdrawn identifier value (Q21441764)

Hi, I don't think your practice WRT BBC Your Paintings "Identifiers" is of much help:

  1. Once the redirects are retraced in Mix'n'Match (which I just did) these false duplicates clobber both the duplicate list on Mix'n'Match and the Value Constraint Report for P1367. Removing non-actionable identifiers from Wikidata and making certain that they won't reappear (by setting them to N/A in Mix'n'Match) seems to me a much more appropriate way of handling this.
  2. Though impressive and somehow under curation (if not we wouldn't have the problem of "vanished" URLs at all) it's just a website which allows incoming links: Obviously they are performing clean-ups but don't even care to implement redirects. I don't think Wikidata's task should be to document changes on that website
  3. Those unusable identifiers stem from Mix'n'Match. Unfortunately the underlying dataset is not documented, Magnus may have harvested the Website at some (single?) point in time and/or may have had access to data files provided to him: So using P2241 actually means documenting the difference between this unclear dataset from reality? Not worth pursuing I think.
  4. Magnus may re-import the Website and equip Mix'n'Match with the set of then current identifiers, i.e. those not valid any more will cease to exist in the Mix'n'Match database but will survive perpetually on Wikidata? So (see point 2 above) Wikidata would provide some persistence for BBCYP "identifiers" the original provider obviously doesn't care about (at the moment). I'm not sure about the fundamental implications of persistence as added value by third parties but in the YP case it would be a crude approximation anyway: We have some peripheral (i.e. current M'n'M database) evidence that a certain identifier has existed (been actionable somewhere in the past) and some (soon to be removed) statement in M'n'M that this identifier was related to a certain Q-item. Transferring that to Wikidata as a P2241-qualified statement leaves us with something completely unverifiable...

Actually, there are cases where this new property together with withdrawn identifier value (Q21441764) (or some variants of it) makes sense: I remember the concept of a "cancelled ISSN": There is a regulation that periodicals which use ISSNs against policy (e.g. not getting floated after a no label (Q1514286) or assigning an ISSN for something other than a serial) won't get recycled but remain in the database, tagged as "cancelled". A related case is "wrong ISSNs": If the ISSN printed on the journal does not exist (has a checksum error, i.e. isn't any ISSN at all), is not assigned to any periodical (is formally valid but not existing at that time), or even officially assigned for something other (so the ISSN exists) then it's worth recording because queries may be performed based on face value.
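To make the "checksum error" case concrete: an ISSN's eighth character is a check digit computed from the first seven digits with weights 8 down to 2, modulo 11, with "X" standing for 10. The following Python sketch only tests that formal validity; the function names are illustrative, and passing the check says nothing about whether the ISSN was ever assigned to a periodical.

```python
def issn_check_character(first_seven):
    """Compute the ISSN check character for the first seven digits."""
    # Weights run 8, 7, 6, 5, 4, 3, 2 over the seven digits.
    total = sum(int(digit) * weight
                for digit, weight in zip(first_seven, range(8, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def is_formally_valid_issn(issn):
    """Return True if the string has the ISSN shape and a correct check character."""
    compact = issn.replace("-", "").upper()
    if len(compact) != 8 or not compact.isascii() or not compact[:7].isdigit():
        return False
    return issn_check_character(compact[:7]) == compact[7]
```

A value that fails this test would fall under the "formally invalid" case, while one that passes may still be unassigned or assigned to something else.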

Thus a (non exhaustive) list of withdrawal reasons might be:

  1. identifier formally invalid (but was used anyway)
  2. identifier did never exist (but was used anyway as such)
  3. identifier is not actionable any more (announces error)
  4. identifier is actionable but announces deprecation (acknowledges that it has existed)
  5. identifier exists but the corresponding object is deprecated (think of ODNB biographical articles where later research came to the conclusion that the person described is identical to another person or actually two different persons)
  6. ...

A similar thing on a much higher scale can be currently noticed for RKDartists ID (P650):

  1. The initial import accidentally contained thousands of "See" references (they have an identifier of their own, but no link to the object they are referring to)
  2. The initial dataset contained tens of thousands of entries "in bewerking" (under construction). Thousands of them have enough accompanying data to be spotted as quite obvious duplicates of other entries (and thousands of them do not have enough data at the moment to make matches possible - probably they should have been left out from Mix'n'Match at first hand, but identifier-wise these items definitely exist).
  3. There seems to be a massive weeding effort ongoing: Especially for artists from Belgium, writers from Germany and those from the lower parts of the alphabet the links aren't operational any more
  4. I noticed that because an IP from the Dutch National Library started removing bunches of RKD identifiers from items in the Constraint report: So actually there is some feedback loop, Wikidata reports are used by (institutions related to) the original providers to perform or at least prioritise data sanitation.
  5. Again: RKD itself does not care about the fate of identifiers for items which weren't proper items at all or were designating items they have removed for whatever reason (their initial data collection appears to have been extremely broad in scope and even clearly identifiable persons might be well beyond the topical restrictions of RKD).

So many RKD identifiers we currently know about may just be "leaked": They will be withdrawn as provisional or as not relevant (and in many cases as duplicates) and the question would be if we really should document the RKD identifier for persons RKD does not want to deal with at all? -- Gymel (talk) 15:28, 14 November 2015 (UTC)

@Gymel: We may not have been the only people to have harvested the BBC Your Paintings identifiers (or any other set of identifiers). It seems to me that it is useful to record retired identifiers (a discussion that's been had both on the mailing list, and at the Sum of All Paintings project recently), not least because people may match their copy of the old identifiers to our copy of the old identifiers.
As for these messing up the Constraint reports, or MnM single values, then that is simply a bug in the Constraint reports and in MnM that needs to be fixed -- deprecated values should not be considered for the single instance.
Another usefulness is that we now have a SPARQL-searchable list of the retired identifiers -- so for example, we can now generate a report of all retired identifiers for which there are not new identifiers, and ask the PCF "what happened to these?" -- in a couple of cases (of names that look genuine, and don't seem to have a new id) may be a system refresh glitch at their end.
I think I have now marked all the retired identifiers that we have items for (and merged any where we also have items for current identifiers). I think that they are worth keeping.
As for values for reason for deprecation (P2241), you are very welcome to create further value items to document such cases in more detail as you wish. Jheald (talk) 15:44, 14 November 2015 (UTC)
@Gymel:. To add to the above, I think the "single value" constraint report does ignore deprecated values. The 93 multiple values currently reported is similar to the number from several weeks ago (it was actually 97 then) -- as far as I can see, it represents genuine unmerged duplicates on the PCF site, and doesn't seem to have gone up. Jheald (talk) 15:48, 14 November 2015 (UTC)
Here's the start of the thread on wikidata-l : [1]
The discussion also continued into the next month : [2] Jheald (talk) 15:57, 14 November 2015 (UTC)
Interesting, I will pursue that: For VIAF ID (P214) we are sometimes marking identifiers as deprecated if the VIAF cluster exists but is not usable since it conflates different persons and an alternative cluster for that person also exists. VIAF may act on that findings and it would be good to know if the constraint report is not complete once one wants to reaccess these cases. -- Gymel (talk) 15:58, 14 November 2015 (UTC)
OK, I'm not impressed by the discussion on the mailing list. As I said before I can see use cases for keeping deprecated identifiers, but one has to differentiate:
  • VIAF was given as example several times: Every month they automatically cluster and recluster their constituent entries, currently they record >7M redirects (targeting about 28M entries) and provide resolution services. They also provide a change history for any single cluster. Given that Wikidata also has a version history for items actively recording obsolete identifiers here seems overkill.
  • Use cases of outdated information are construed and Wikidata should somehow step in so that the providers of the original data can be asked "what happened" (but not be bothered at the same time): Well, those utilizing the outdated identifiers could ask directly (increasing the pressure on the providers to operate more carefully). Wikidata could only serve as a place for acknowledging that an identifier indeed did exist and does not exist any more. However for these Wikidata cannot be as exhaustive as for valid ones.
  • Admittedly many data providers should invest more into persistence of identifiers, e.g. by at least "supporting" them by redirects. But those who do that usually have an interest of re-users eventually migrating to up-to date values. Wikidata IMHO should not thwart that by establishing a one-stop solution for the absolutely lazy.
  • My RKD example above shows that "support" will have limits: Some things will simply go because they shouldn't ever have been assigned an identifier (from the provider's point of view). That's the downside of presenting provisional entries to the public which IMHO generally is a good thing
  • Some sites like BBC YP are way too sloppy with their handling of what we perceive as identifiers. But are we really in a position to remedy that? You stated that you have recorded the 50 or so obsolete identifiers you deemed important here. But the actual number might be much higher and - as said above - what we can record is only the arbitrary difference between the unknown point in time some data was harvested and today.
  • Last, not least: Unactionable identifiers of that kind cannot be verified (but perhaps in the Your Paintings case by a link to the internet archive). Common opinion here is that identifiers don't have to be sourced, because they can be immediately verified again at any given time. That's obviously not the case here! -- Gymel (talk) 16:42, 14 November 2015 (UTC)
@Gymel: I have recorded all the obsolete identifiers I knew about, that I have so far been able to identify items for, based on the pages in this series, the identifier columns in which are based on Magnus's (or Jane's) original scrape in 2012.
You are correct, that these may no longer be verifiable and can no longer be confirmed. Mistakes may have crept in. But so what? They are dead links and marked as such. If a copying error has crept in, the worst case scenario is that then somebody may not be able to match their old reference link to our old reference link. That doesn't take away from the positive side, that in as many cases as possible, it will be possible for somebody to match their old dead reference to our old dead reference, and mostly we should also be able to give them a live new reference. Jheald (talk) 16:56, 14 November 2015 (UTC)
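The two checks discussed in this thread (a single-value report that ignores deprecated statements, and a list of retired identifiers with no current replacement) can be sketched in Python over a plain list of statements. This is only an illustration of the logic, not how the constraint reports or Mix'n'Match are implemented; the `(item, value, rank)` tuple layout and the placeholder item and identifier names are assumptions.

```python
from collections import defaultdict


def single_value_violations(statements):
    """Return items that have more than one non-deprecated identifier value."""
    live = defaultdict(set)
    for item, value, rank in statements:
        if rank != "deprecated":  # deprecated values do not count as duplicates
            live[item].add(value)
    return {item: sorted(values) for item, values in live.items() if len(values) > 1}


def retired_without_replacement(statements):
    """Return items whose only identifier values are deprecated ones."""
    retired = defaultdict(set)
    has_live = set()
    for item, value, rank in statements:
        if rank == "deprecated":
            retired[item].add(value)
        else:
            has_live.add(item)
    return {item: sorted(values) for item, values in retired.items()
            if item not in has_live}


# Hypothetical sample data: placeholder items and identifier values.
sample = [
    ("Q_A", "old-id-1", "deprecated"),
    ("Q_A", "new-id-1", "normal"),
    ("Q_B", "old-id-2", "deprecated"),
    ("Q_C", "id-3", "normal"),
    ("Q_C", "id-4", "normal"),
]
duplicates = single_value_violations(sample)
orphans = retired_without_replacement(sample)
```

In this sample, Q_C would be reported as a genuine duplicate and Q_B as a retired identifier to ask the provider about, while Q_A raises no flag.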

in support of User:Snipre and issue of (uncontrolled) bot imports from wikipedias

Would you be happy if someone not involved changed your topic? --Succu (talk) 20:10, 19 November 2015 (UTC)

@Succu: It's a project page. It involves everybody; and should have an appropriate neutral header. Jheald (talk) 20:17, 19 November 2015 (UTC)
Really? Any hint where I can find this rule? --Succu (talk) 20:20, 19 November 2015 (UTC)
@Succu: It's common sense, and happens all the time. Nobody 'owns' the header of a section of a public page. It should be whatever best, most neutrally and most succinctly tells the reader what follows, and encourages participation from all points of view. I would revert again, if I hadn't hit the 3 edit limit, because the present header is simply not appropriate, and would also be far clearer if shortened. Jheald (talk) 20:25, 19 November 2015 (UTC)
@Succu: But if you want a reference, here's the en-wiki guidance from en:Wikipedia:Talk_page_guidelines#Editing_comments,
Section headings: Because threads are shared by multiple editors (regardless how many have posted so far), no one, including the original poster, "owns" a talk page discussion or its heading. It is generally acceptable to change headings when a better header is appropriate, e.g., one more descriptive of the content of the discussion or the issue discussed, less one-sided, more appropriate for accessibility reasons, etc.
Wikidata may not have yet the same depth of conduct guidance, but the broad principle still makes sense. Jheald (talk) 20:30, 19 November 2015 (UTC)
The unreflected export of „rules“ of your home community is not very helpful. At dewiki we normally do not change the heading of a discussion (Kmhkmh). Especially if we are not involved in the discussion, Multichill. --Succu (talk) 20:51, 19 November 2015 (UTC)

Preferred rank

Hi,

I'm afraid I have no idea on how bots work, SPARQL and so on. I made these changes because the template Spanish Wikipedia uses for national sub-entities has changed and shows all instances as a subtitle. You can check out this problem on the Frankfurt article. Users who made those changes in the template are Agabi10 and Metrónomo. They suggested selecting «preferred rank» so that it only shows those values, and it seems to be solving these problems we have now in nearly every city article. I understand this has caused some problems with bots on Wikidata but, as I said, unfortunately I have no idea on bot operation or how to edit templates. Can you speak Spanish? If so, it would be useful if you read this talk page and further discuss the issue with them. Anyway, I'm going to tell them and hope you can work out this problem together.

Meanwhile, I stop my edits until a solution is found. Greenny (talk) 15:25, 20 November 2015 (UTC)

Done, you can check the talk here. As I deduce from your userpage that you can't speak Spanish, I've encouraged them to write in English from now on. Greenny (talk) 15:34, 20 November 2015 (UTC)

Hir and Bron

Actually I read "Hir" in an English Star Trek novel some time ago. Captain Riker and the starship Titan visited a planet of alien Invertebrata (Q43806) with only one sex. Instead of "Him" or "Her" they said "Hir".

And Saga says Hen in Swedish, something her Danish colleague dislikes. The Swedish word is considered as a gender-neutral version of "Hon" (She) and "Han" (He). The word is widely used in media and is today included in dictionaries. Personally, I think that word still isn't neutral enough, since it is promoted by political groups. I guess the word is imported from Finnish, which does not have genders in the same way as our German-derived languages have.

I followed Bron/The Bridge last season, but stopped watching this season, since I thought there was too much violence. My post-traumatic stress disorder (Q202387) became worse... -- Innocent bystander (talk) 16:10, 25 November 2015 (UTC)

@Innocent bystander: Sorry to hear that. In the pre-series publicity, I thought I had read the lead writer saying they thought they should dial down the body count this series, since they thought it had become a bit excessive the last couple of times. I'll just have to see how it goes -- they do like to pull surprises! Jheald (talk) 16:30, 25 November 2015 (UTC)
I think there is one episode left here (this Sunday) and my wife still follows it. I do not know if the number of bodies has increased or decreased and we are maybe not shown so much violence within the TV-frame. But the description of such things as missing body parts and how they have been removed is a more efficient way to give me new nightmares than many other ways to describe violence. That is the good thing with Star Trek novels. The close combats are few. -- Innocent bystander (talk) 17:26, 25 November 2015 (UTC)

Second Severn Crossing

Hi, I'm confused by this edit to Second Severn Crossing (Q1287969). Surely the bridge is in all of England, Wales, Monmouthshire and South Gloucestershire. However if only the lowest level should be included then why retain Wales? Thryduulf (talk: local | en.wp | en.wikt) 14:55, 28 November 2015 (UTC)

@Thryduulf: I was running an automated process to remove all located in the administrative territorial entity (P131) = England (Q21) when there was also an English county given. The same could be done for Wales, but one step at a time... (though I have now removed Wales in this case).
The Severn Bridge may be a special case, as it joins two different nations. So perhaps, in this case, England & Wales might be justified. But for most places, if we already have that the country = the UK, and the county, then England as well seemed just a distraction. Usually located in the administrative territorial entity (P131) = England is a sign that further refinement is needed. Jheald (talk) 15:04, 28 November 2015 (UTC)
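A minimal sketch of the pruning rule described above, in Python. This is not the actual process that was run: the function name and the idea of passing in a set of English county items are assumptions made for illustration, and special cases such as the Severn crossings would still need handling by hand.

```python
ENGLAND = "Q21"  # England (Q21), as named in the comment above


def prune_redundant_england(p131_values, english_counties):
    """Drop P131 = England when an English county is also given.

    p131_values: list of item IDs used as P131 values on one item.
    english_counties: set of item IDs treated as English counties
    (hypothetical input; how that set is built is left open here).
    """
    has_county = any(v in english_counties for v in p131_values)
    if not has_county:
        # England is the only localisation we have, so keep it.
        return list(p131_values)
    return [v for v in p131_values if v != ENGLAND]
```

For example, `prune_redundant_england(["Q21", "Q23240"], {"Q23240"})` keeps only Essex (Q23240), while an item whose sole P131 value is England is left untouched.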
Thanks for the explanation, it makes sense now. Thryduulf (talk: local | en.wp | en.wikt) 15:09, 28 November 2015 (UTC)

Wikidata:Database reports/Wikipedia versions[edit]

Dear Jheald; I have seen you contributing a lot to pages linked to https://www.wikidata.org/?curid=24028442# (as of today titled Wikipedia versions but intended in general for WMF projects). I would be happy if you could review the properties of these pages, create the missing Wikibook and Wikiversity project pages, comment on user:I18n/sandbox (where you may find many useful queries) and comment there with new / additional ideas. Best regards Gangleri also aka I18n (talk) 19:54, 9 January 2016 (UTC)

Hi! I want to let you know that the number of Wikidata:Database reports/WMF projects has increased to more than 385. You may be interested in adding labels and descriptions in other languages, following the discussion at property talk:P1800 and commenting there. Best regards Gangleri also aka I18n (talk) 02:59, 12 January 2016 (UTC)

BBC Your Paintings[edit]

...is called Art UK as of today. The properties need to be adjusted. --Jane023 (talk) 09:33, 24 February 2016 (UTC)

I informed Magnus and he is converting them now - 36k links!! --Jane023 (talk) 10:26, 24 February 2016 (UTC)
@Jane023: So: extraordinarily ugly new site, extraordinarily ugly new name, and they changed rafts of identifiers. Are these guys a complete bunch of muppets?
(And I see they don't even own their own twitter handle, so have to use this instead!)
My watchlist is lighting up with lots of old identifiers that Magnus is removing. Do you know if he will be replacing them with new ones?
And is there an old-to-new conversion list, so I can update the pages at en:Wikipedia:GLAM/Your paintings/header ?
Thanks for the heads-up, Jheald (talk) 14:46, 24 February 2016 (UTC)
Ask Magnus for a copy of his list? He already finished the conversion and will start updating 16k new links. I was very annoyed as well (I was informed by news letter yesterday). --Jane023 (talk) 15:28, 24 February 2016 (UTC)

links to random items[edit]

Hi Jheald,

From time to time I come upon items on categories where you have put in links to random, unrelated items (for example here), apparently because these have a name with the same spelling. Items on categories are not disambiguation pages, but are there to gather and connect sitelinks to categories on the same topic. - Brya (talk) 05:33, 18 May 2016 (UTC)

IIIF-tool for the property relative position within image (P2677)[edit]

Hello James! Thanks a lot for the property relative position within image (P2677). I'm not using it yet, but it could be a great improvement for visual artworks. One issue is that we need a tool to help us provide data. So I made a little fork of Liz Fischer's IIIF tool created for image annotation on the IIIF standard: Cropper. It's just a draft (I'm not a developer) of what we could have. Maybe that could be interesting for you. Best regards --Shonagon (talk) 01:35, 22 May 2016 (UTC)

Example of use: Virgin among the Virgins (Q21013224) --Shonagon (talk) 02:14, 22 May 2016 (UTC)
Hello Jheald. An additional development to display the image fragments of an artwork has been done. It's multilingual, so it's possible to display labels and links to Wikipedia in different languages. Surely a more robust tool could be made, but we now have a first interface to edit and display image artwork annotation, which is essential for using relative position within image (P2677). Best regards --Shonagon (talk) 07:26, 28 June 2016 (UTC)

Dorset description[edit]

Hey Jheald,

Just wanted to let you know I partially reverted this change. The text about "Q21694711" was showing up in search results on Google, Wikipedia.org, top of the article in the Wikipedia app on Android and iOS, and other places that utilise Wikidata descriptions. Thanks! --Krinkle (talk) 03:06, 20 July 2016 (UTC)

Best way to get sitelinks for lots of items at once[edit]

Hi! If you're interested in Special:Permalink/243943252#Best way to get sitelinks for lots of items at once ?, there is probably a much better way: use SPARQL. With a query, you can get the data in JSON format by putting that query in place of {} in this link: https://query.wikidata.org/bigdata/namespace/wdq/sparql?query={}&format=json. You can of course include other needed columns there. --Edgars2007 (talk) 10:41, 4 September 2016 (UTC)
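A minimal sketch of that approach in Python, in case it helps. It only builds the request URL from the link template above; the helper names, the choice of English Wikipedia as the site, and the shape of the query are assumptions for illustration, not the query originally linked.

```python
from urllib.parse import quote

# URL template from the comment above; {} is where the query goes.
ENDPOINT = "https://query.wikidata.org/bigdata/namespace/wdq/sparql?query={}&format=json"


def sitelinks_query(qids, site="https://en.wikipedia.org/"):
    """Build a SPARQL query returning the sitelink on `site` for each item.

    qids: item IDs such as "Q23240". The `site` default is an assumption.
    """
    values = " ".join("wd:" + q for q in qids)
    return (
        "SELECT ?item ?article WHERE { "
        "VALUES ?item { " + values + " } "
        "?article schema:about ?item ; "
        "schema:isPartOf <" + site + "> . }"
    )


def query_url(sparql):
    """Percent-encode the query and drop it into the {} placeholder."""
    return ENDPOINT.format(quote(sparql, safe=""))


url = query_url(sitelinks_query(["Q23240", "Q546198"]))
```

The resulting `url` can be fetched with any HTTP client; more columns (labels, other sites) can be added to the SELECT in the same way.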

Unused property[edit]

This is a kind reminder that the following property was created more than six months ago: metasubclass of (P2445). As of today, this property is used on less than five items. As the proposer of this property you probably want to change the unfortunate situation by adding a few statements to items. --Pasleim (talk) 19:15, 17 January 2017 (UTC)

Art UK links[edit]

Hi James, you mentioned ART UK on Commons. One thing I realized with Art UK artist ID (P1367) and Art UK artwork ID (P1679) is that their links are rather unstable. When the name of the artwork changes, so does the URL, breaking our links. That's a shame because for artworks they do seem to have a unique id. See for example http://artuk.org/discover/artworks/bacchus-and-ariadne-114356, the id is 114356 (you can find it in the HTML source too). Wouldn't it be nice to be able to just record that integer here instead of "bacchus-and-ariadne-114356"? Do you happen to have any contacts at Art UK you can use? I can easily import several thousand Art UK artwork ID (P1679) links, but I'm a bit reluctant to do that now with the unstable links. Multichill (talk) 11:08, 26 January 2017 (UTC)
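To illustrate the point about the trailing integer, here is a small Python sketch that pulls it out of a slug or full URL. It assumes, based only on the single example above, that the numeric id is always the final hyphen-separated part; the function name is made up for illustration.

```python
import re

# Assumption: the numeric id is the run of digits after the last hyphen.
_TRAILING_ID = re.compile(r"-(\d+)$")


def artuk_numeric_id(slug_or_url):
    """Return the trailing integer id, or None if there is none."""
    match = _TRAILING_ID.search(slug_or_url.rstrip("/"))
    return int(match.group(1)) if match else None


print(artuk_numeric_id("http://artuk.org/discover/artworks/bacchus-and-ariadne-114356"))
```

Whether Art UK would resolve a URL given only that integer is a separate question this sketch does not answer.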

Hi @Multichill:
I've just this morning had an email back from User:Charles Matthews. He and wmuk:User:Richard Nevell (WMUK) from WMUK met with some Art UK people last month.
...
I think the painter identifiers we have now are broadly correct -- I will do a verification run to confirm later today, or in the next couple of days.
As for painting identifiers, I was thinking about making a trial run on some of the collections we currently have the best accession number coverage for -- eg National Gallery, National Portrait Gallery, Tate -- but I am very happy to coordinate with you.
As to identifier stability, the important thing of course is to be able to serve people URLs that work. With luck, the big identifier change was when they moved to their new site. Beyond that, until they publish any regular list of recent identifier changes, then all I think we can do is regular verification runs, and use the "Accessed" qualifier to make a note of what date the identifier was valid. It would be nice if they had a more stable scheme; maybe that will come, and we do need to keep knocking on their door, I think. But it seems we first need to prove ourselves more. Jheald (talk) 13:57, 26 January 2017 (UTC)

@Multichill: I can explain more about the meeting, but in a mail. Charles Matthews (talk) 14:51, 26 January 2017 (UTC)

Trial run sounds like a plan. I'll write some import code; I already have most of it, so it should be done soon. I'll share the link to GitHub here; it will be in Python.
@Charles Matthews: please do :-) Multichill (talk) 15:40, 26 January 2017 (UTC)
@Multichill: I was just going to add Art UK painting identifiers for paintings where we already had accession numbers, and then just add them using QuickStatements. But it would be easy enough to pass you what doesn't match. Jheald (talk) 15:47, 26 January 2017 (UTC)
Ok, bot and example edit. It's running now. Multichill (talk) 16:41, 26 January 2017 (UTC)
Thanks. Jheald (talk) 16:47, 26 January 2017 (UTC)
I'm importing quite a few new links. I updated the constraints on Property talk:P1679 to catch more useful stuff. Might need a bit more tweaking. Multichill (talk) 19:36, 26 January 2017 (UTC)

@Multichill: Grrr... Just done the validation scrape. Over 250 no-longer-working identifiers to investigate. (BTW I saw you're asking Magnus for a full rescrape for Mix'n'Match -- I suppose that will adapt to identifiers that have been updated here in the meantime.) Jheald (talk) 22:09, 31 January 2017 (UTC)

Quantity on ART UK links[edit]

Hi James, this seems wrong. Quantity on an identifier of 4? You're trying to say art uk has 4 works, but this is not the way to do it. Also doing such a large controversial import without discussion is not the best way to go or did I miss the discussion somewhere? Multichill (talk) 16:33, 6 February 2017 (UTC)

You are running a bot job, someone objects. You should pause and discuss it. Multichill (talk) 17:25, 6 February 2017 (UTC)
@Multichill: Stopped. (Sorry I didn't see your message sooner).
So, where and how to identify the number of works Art UK has in its catalogue under this identifier?
Because the same artist may sometimes have more than one identifier at Art UK, and this information relates specifically to the identifier rather than the artist, it seems to me the appropriate place is as a qualifier on the identifier.
So then, which property to use? quantity (P1114) seems the most generic, for a "quantity, total number, number of instances, number, amount, total" as its list of (English-language) equivalent names gives for it.
In particular, this is the "number of instances" for the identifier in the Art UK database -- so if P1114 is intended for use including "number of instances", this seems entirely appropriate.
But if there is an alternative that you would suggest, that you think would be more appropriate, then I am very open to discussion.
I would like to get on with things though, because Art UK have been complaining they haven't been getting enough hits from us; so I'd like to be revising and rolling out a template on en-wiki including this information as soon as I can get it done. Jheald (talk) 17:46, 6 February 2017 (UTC)
Got distracted by other things. I did this change to make it a bit clearer, but still doesn't feel right. At first I thought you meant the person had 4 Art UK artist ID (P1367) links. I had to check the link to realize you meant that on the linked page it had 4 paintings.
I'm not sure you should even document it this way. In some point in the future we'll have all art uk works and you can just do a query to get this information. Multichill (talk) 20:17, 6 February 2017 (UTC)
@Multichill: Even if we did, they wouldn't necessarily have it, so the information would still be germane in documenting their database. Besides, I want to use this information in a WP template this week, not at some far distant point in the misty future.
I'm not sure your edit helps, because the "of" is placed as a qualifier on the identifier, not on the number of works. At some point in the future, when the data is next updated and re-written, the ordering could get changed; or other qualifiers might get added and upset the order, eg one for "preferred form of name" (in this database, associated with this identifier). It doesn't feel safe to me to rely that WD is always going to show the same qualifiers always in the same particular order. Jheald (talk) 20:28, 6 February 2017 (UTC)
You're putting time pressure on this. My experience in (wiki) projects is that this hurts the quality. I would appreciate it if you could discuss this in a broader venue before adding more. Multichill (talk) 20:40, 6 February 2017 (UTC)
@Multichill: Okay. Where do you suggest? Jheald (talk) 20:43, 6 February 2017 (UTC)
What about Property talk:P1367 and a link at Wikidata:Project chat to get some people to comment on it? Multichill (talk) 20:46, 6 February 2017 (UTC)
Mind to stop your silly additions? --Succu (talk) 22:15, 9 February 2017 (UTC)
@Succu: Task is now 95% complete, so I am going to finish it. It makes no sense to leave the last 5% not done. Jheald (talk) 22:21, 9 February 2017 (UTC)
Cool, then we have to remove 100% of query results at a certain point of time. --Succu (talk) 22:27, 9 February 2017 (UTC)
@Succu: I'm sorry, what are you talking about? Jheald (talk) 22:32, 9 February 2017 (UTC)
But I am curious as to why you think the addition is "silly" ? Jheald (talk) 22:23, 9 February 2017 (UTC)
Are you prepared to update this fixed number when the count at Art UK (Q7257339) is updated? We have queries for this. --Succu (talk) 22:35, 9 February 2017 (UTC)
@Succu: And how exactly do you propose querying something which is not stored on Wikidata? Jheald (talk) 22:37, 9 February 2017 (UTC)
OK, vice versa. What do you want to express with this addition? --Succu (talk) 22:48, 9 February 2017 (UTC)
@Succu: It expresses that Art UK (a catalogue of UK public collections) has 16 paintings by Esther Tyson (Q21458718), compared to eg only 1 by Hendrick van Zuylen (Q28431499) Jheald (talk) 22:58, 9 February 2017 (UTC)
... which means I can now write queries eg like this, for the total number of works at Art UK by painters that we have items for: tinyurl.com/zgjvucp. Jheald (talk) 00:53, 10 February 2017 (UTC)
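For readers who cannot follow the shortened link, a query of that general kind might look like the sketch below. This is a reconstruction under assumptions, not the query behind the tinyurl: it assumes the counts are stored as quantity (P1114) qualifiers on Art UK artist ID (P1367) statements, as described in this thread, and the helper for summing the JSON results is hypothetical.

```python
# SPARQL held as a Python string; it would be sent to the Wikidata query service.
# Assumption: one P1114 qualifier per P1367 statement, as discussed above.
WORKS_PER_ARTIST = """
SELECT ?artist ?n WHERE {
  ?artist p:P1367 ?statement .
  ?statement pq:P1114 ?n .
}
"""


def total_works(results):
    """Sum the ?n column of a SPARQL JSON result (standard bindings layout)."""
    rows = results["results"]["bindings"]
    return sum(int(row["n"]["value"]) for row in rows)


# Hand-made sample in the SPARQL JSON shape, using the two counts quoted above.
sample = {
    "results": {
        "bindings": [
            {"artist": {"value": "Q21458718"}, "n": {"value": "16"}},
            {"artist": {"value": "Q28431499"}, "n": {"value": "1"}},
        ]
    }
}
print(total_works(sample))
```

An artist with two Art UK identifiers would contribute two rows, which is the behaviour wanted here since the count belongs to the identifier rather than the artist.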
+1. Jheald, how do you plan to update those numbers every time when any item is added to the catalogue? Or is there plan to have those numbers obsolete forever? --Infovarius (talk) 16:12, 2 March 2017 (UTC)
@Infovarius: The Art UK external IDs are only mildly stable -- they change if Art UK revise the name for an artist, or modify an artist's dates, or e.g. add a date of death. I asked them whether they could publish a regular record of ID changes, but apparently they can't -- apparently they don't hold the data centrally. It's only quite a small proportion that get changed; but it does mean that at regular intervals we will need to re-check the ID links to make sure they still work; we can check the quantity data at the same time.
The quantities probably won't change much -- Art UK was set up to be a survey of oil-on-canvas works in publicly-owned collections, and that survey is now complete. But they may change a little: Art UK may in future add some sculpture, and a limited number of works on paper.
So it's possible that the numbers may go out of date. But there is a retrieved (P813) date in the referencing for each statement, so it should always be possible to tell how recently the data was checked. Jheald (talk) 16:29, 2 March 2017 (UTC)

GSS[edit]

I see that you've been removing GSS codes from a number of items, e.g. [3]. What is the reason for this? This property is currently used by w:zh-yue:Template:Infobox English county. Deryck Chan (talk) 14:15, 16 March 2017 (UTC)

Hi @Deryck Chan: There were a number of GSS codes that were on the wrong items, eg Essex (Q23240) -- they were on items for the ceremonial counties, when (as is clear eg from how the map, if you follow the GSS links, excludes eg Southend-on-Sea and Thurrock) they ought to be on the items for the County Council areas, eg Essex (Q21272241).
This also applies to most of the other identifiers on the ceremonial counties, eg FIPS 10-4 (countries and regions) (P901), OpenStreetMap Relation identifier (P402), NUTS code (P605) etc, which should also be moved across in the near future.
Compare en:Essex, where the facts that apply to the non-metropolitan county are shown in a different part of the infobox to those that apply to the ceremonial county.
en-wiki combines the two; but to make co-referencing and properties like located in the administrative territorial entity (P131) work properly, we have two different items for the two concepts.
Hope this makes some sense now. All best, Jheald (talk) 15:18, 16 March 2017 (UTC)

Golden Hind[edit]

Can we continue geeking out about the Golden Hind here? I think I'm getting pretty far down into the weeds for the Project Chat page.  :-)

I'm going to keep digging for an end date for the original. And for the actual citation in Stow - having an oddly difficult time finding it. - PKM (talk) 00:42, 24 March 2017 (UTC)

"The original Golden Hinde remained in Deptford for about 100 years, until it started to disintegrate and had to be broken up." it says here. - PKM (talk) 00:47, 24 March 2017 (UTC)
And here's a citation for inception date, built place, commissioned by, the wharf where it was displayed, and even "ship museum" if you want to use it! http://goldenhind.co.uk/pages/education/the-original-golden-hind/88 - PKM (talk) 00:53, 24 March 2017 (UTC)
And bingo! "AD 1668. John Davies, of Camberwell, the storekeeper of Deptford dockyard, caused a chair to be made out of the remains of the ship, 'The Golden Hind' ..." here. - PKM (talk) 01:01, 24 March 2017 (UTC)
@PKM: Superb! Hope you're adding this to en-wiki as well. 1:30 am here, so I'm turning in; but really pleased you're on the case! Jheald (talk) 01:32, 24 March 2017 (UTC)
Will do, soon. - PKM (talk) 05:34, 24 March 2017 (UTC)
EN Wiki updated and I found a source for the date of renaming the Pelican to Golden Hind <does happy dance>. Lots of updates made at Golden Hind (Q546198) since I had all the references open anyway. - PKM (talk) 19:47, 24 March 2017 (UTC)
@PKM: That's looking really good now. Thank you so much. Jheald (talk) 20:46, 24 March 2017 (UTC)

CPs[edit]

First, thanks for all the work you're doing adding statements for parishes, but I just wondered what to do with some where there are 2 items but the main one has statements for both, for example Q2055282 (settlement and parish) and Q24674398 (parish only). While I do think we should probably have separate items for districts even if they have similar boundaries (like Exeter), I'd suggest that it is unnecessary for parishes (except for cases like Q1002828 and Q21347409 where the parish doesn't include the settlement). Although I think cases like Q637298 and Q24662858 seem OK, as it is a town and the ONS population is much smaller than the parish's. The reason why some parishes have 2 items is Lsjbot, which sometimes created pages for the settlement as well as the parish; maybe they should be marked with Property:P460 or Q17362920, although I think items are only true duplicates if they are unquestionably on the same topic, not just where a distinction has been made. Why don't you also do the same thing with JhealdBatch for wards as well, as cases like Bristol Q21693433 don't have any parishes? I did create items for wards but most don't have any (although Bristol does). Lucywood (talk) 20:06, 31 March 2017 (UTC)

Hi @Lucywood: Sorry not to get back to you sooner. My wife and I were having a long weekend away from the Internet. (Overdue and very much needed!)
With regard to the CPs, I do hope we're getting there. Some key queries I have been watching:
  • tinyurl.com/n43ysz2 - Latest count of number of distinct, non-deprecated GSS codes for civil parishes. Latest value: 10123 ; should be: 10449 => still to find: 326
  • tinyurl.com/mnoklwy - Items marked as current CPs, that do not have GSS codes. (Currently: 287). I do find this quite a brutally slow list to work through.
Some were CPs, that need to have an end time (P582) qualifier added to their P31. Some are in fact civil parish group (Q29043077)s, though editors on en-wiki may not be aware of the fact. Some are completely other things altogether (eg public baths, etc). Some do match entries in the GSS list, but the formal name that GSS has (and usually Commons too, following the GSS) may be slightly different, sometimes opening up questions of what to link to what, and also whether or not the Commons category tree is accurately reflecting this.
But you are quite right that there is also a very real issue with some CPs being claimed by multiple entries here. (Some of which I may have created or added to, by tagging settlements as CPs). The following queries try to reveal this:
  • tinyurl.com/mw3e4pb GSS values claimed by more than one item. (Currently: 84).
  • tinyurl.com/mfdlwjl Commons categories for CPs claimed by more than one item. (Currently: 81).
  • tinyurl.com/kkuz36e CPs that are in areas that are also claimed as CPs. (Currently: 72).
  • tinyurl.com/mts62qu A query that tries to combine the above. (Currently: 142).
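As a rough illustration of the "claimed by more than one item" checks in the list above, here is a Python sketch that does the grouping locally on (item, code) pairs, for instance rows exported from a query. It is not the logic behind any of the shortened links; the function name and the sample GSS code are made up for illustration.

```python
from collections import defaultdict


def codes_claimed_by_multiple_items(pairs):
    """Return {code: sorted item IDs} for every code used on 2+ distinct items.

    pairs: iterable of (item_id, code) tuples, e.g. item / GSS code rows.
    """
    items_by_code = defaultdict(set)
    for item, code in pairs:
        items_by_code[code].add(item)
    return {
        code: sorted(items)
        for code, items in items_by_code.items()
        if len(items) > 1
    }


# The GSS codes below are invented placeholders, not real values.
rows = [
    ("Q2055282", "E04999991"),
    ("Q24674398", "E04999991"),
    ("Q1002828", "E04999992"),
]
print(codes_claimed_by_multiple_items(rows))
```

The same function works for the Commons category check if the second element of each pair is a category name instead of a GSS code.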
This is partly what I opened the discussion at User talk:Kelly to try to think through.
The last query appears to reveal broadly two groups -- one is (mostly) parishes in South Kesteven, where a settlement item and a parish item share a Commons category; the other, almost completely distinct case, is where there are two items both marked as CPs, with one usually a P131 of the other.
My own view is that the link from Commons to CP items here is very valuable, eg for us to be able to use the very categorisation there to infer statements, to add to items here (ie: which parish is a geographical item in). We would lose out if items here did not have a Commons link.
Equally the link to/from Commons categories via Wikidata items from/to Wikipedia items is clearly very valuable.
User:Nilfanion makes the interesting point that ultimately Commons may be quite happy to have distinct categories for parishes and for settlements of the same name. However, the fact remains that for the moment Commons does not for the most part make such a distinction, and that making and populating such a split will/would be no small amount of work.
So my own view is that for the moment it probably does make sense to combine items for settlements and parishes, until such time as they get split on Commons. The other thing that weighs with me is that so many of the current properties seem to be quite relevant to both parishes and settlements -- eg KEPN ID (P3639), OpenDomesday settlement ID (P3118), Vision of Britain place ID (P3616), British History Online VCH ID (P3628) -- I'd probably place most of these on the settlement, if forced to choose, but it's not a clear-cut thing.
On the other hand I am reluctant to undo somebody else's work, and merge back the items that User:Kelly has split out for the South Kesteven parishes. (And similarly User:Robevans123 for communities in Anglesey).
The bulk of the others, as you note above, appear for the most part to reflect sv-wiki and ceb-wiki stub articles created by Lsjbot, whose operator I understand has since retired from Wikipedia editing.
So what to do with these? said to be the same as (P460) and Wikimedia duplicated page (Q17362920) are both interesting options. But if people are content that we don't try to force there to be separate items, just because of stubs created by a bot, then perhaps the best way forward may just be to kill the stubs, by redirecting the stubs on sv-wiki and ceb-wiki (merging any content that seems particularly useful to keep), allowing the corresponding items to be merged here. Would anybody have any objection to this? (And is there anywhere else we should ask first?)
With regard to wards, I have tended to keep those separate from parishes (and they have different GSS codes, starting "E05"). Around 18 months ago, when I asked the UK project on en-wiki what was most valuable to have in the P131 hierarchy for UK places here, the view was that parishes are useful, because they have typically been comparatively stable over comparatively long periods of time; whereas often wards seem to be much less stable: much more likely to be re-drawn as population numbers change. So I haven't seen it as such a priority to create and populate items for wards. I don't think they should be combined with items for CPs; but a new property "coterminous with" might be useful to connect them with parishes (& v.v.), in the occasions where they do have equivalent boundaries. Jheald (talk) 19:57, 3 April 2017 (UTC)
Whether it makes sense to systematically create wards for unparished areas, ie (typically) areas that were former metropolitan boroughs, I am not sure. For the City of London, I think: certainly -- and I think these items all exist & have Commons cats (though still need GSS codes). For other areas, eg Bristol, I don't know. Clearly if en-wiki has articles, we will have items, and they should be described as well as we can. Beyond that, my inclination would be to see how far Commons goes at the moment. If Commons has categories, particularly if they are well populated, it probably makes sense to have items here. If Commons doesn't have categories, then maybe there are other higher priorities for work here. Jheald (talk) 20:14, 3 April 2017 (UTC)
I think there is a distaste in a significant proportion of both en.wp and Commons communities for using wards for localisation (apart from the City of London). There are several drawbacks:
  1. Wards are very variable units. As an example in Bristol, compare the 2009 wards of Hartcliffe and Bishopsworth to the 2016 wards of Hartcliffe & Withywood, and Bishopsworth. The two wards in 2009 split their combined area into a West and East, while the two wards in 2016 cover the exact same area, but are a North/South split.
  2. Wards have low recognition. If you asked someone where they live, the ward is unlikely to be quoted and an area of the city is more likely to be quoted.
  3. When both exist, there is a complex relationship between CPs and wards. Sometimes one is a subset of the other, sometimes not. That makes a logical hierarchy awkward, as CPs are desirable.
In the absence of anything better, Commons sometimes goes to street-level to provide the granular localisation.
At the same time, there is a strong desire to get localisation within the unparished areas. One possibility is shown by my work in commons:Category:Districts of Plymouth, which basically splits the city into the regions known by residents (and all could potentially have WP articles). A better solution might be to use the city council's neighbourhoods, which are defined in terms of community identity and natural boundaries, and unlike people's perceptions are objectively defined. Following the Localism Act 2011, Neighbourhood Areas have been established in many large cities; where they exist these might be ideal.--Nilfanion (talk) 23:51, 3 April 2017 (UTC)

wards[edit]

@Lucywood: Despite User:Nilfanion's cautions above, I have started adding GSS code (2011) (P836) links to items marked as ward (Q1195098) or ward or electoral division of the United Kingdom (Q589282), on the basis that if that is how items here have been marked, then we might as well link to their boundaries etc.

I have also extracted a list of wards from sub-categories of en:Category:Wards_of_England, which should probably be marked up as such here, since in many cases the items have no existing P31. (Though in some cases they are identified as some sort of human settlement (Q486972)).

A further complication that I had not appreciated (on top of all Nilfanion has written) is that there can be a distinct difference between electoral divisions used to elect County councillors (see eg item note at the OS), and the wards used to elect district councillors -- I should have read en:Wards and electoral divisions of the United Kingdom more closely. I had been happily assuming that everything was the latter, but then I hit Pulborough (Q7259268), this link on OS OpenData; which is significantly different from the district ward "Pulborough and Coldwaltham" this link -- in each case, click on the value for 'Extent' to compare the boundaries.

I am hoping that the only county electoral divisions that have got into Wikidata are those that are subcategories of en:Category:Electoral divisions of England -- but it would be useful if you confirm. Jheald (talk) 19:08, 16 April 2017 (UTC)

@Lucywood: There were also a couple of wards that you added that I've had a bit of trouble identifying. Is there any help you can give me with either of the following?
  • Courtfield (Q28938159). Said to be in LB Brent. The only ward I could find was in Kensington & Chelsea: [4]
  • Devon (Q27889472). There's one in Newark-on-Trent, in Nottinghamshire [5]. But I couldn't find one in South Kesteven, Lincolnshire.
Jheald (talk) 20:29, 16 April 2017 (UTC)
The first one was probably a mistake, sorry, corrected, the second one used to exist, see [6]. As you know many change quickly but I was using mainly the Ordnance Survey data. Lucywood (talk) 07:49, 17 April 2017 (UTC)
@Lucywood: Thanks. I've found its dates now, thanks to data from the Elections Centre at Plymouth University [7]: appears 1979, disappears 1999 -- I had been thrown because the en-wiki page en:List_of_electoral_wards_in_Lincolnshire#South_Kesteven only had names back to 1999.
It seems quite a random sort of an item to have created. Out of interest, has there been a system or a pattern to the wards you created items for? Jheald (talk) 10:10, 17 April 2017 (UTC)
No there wasn't really apart from Suffolk, Essex and Cumbria and some unusual names like Devon. However as I was suggesting why not use your bot to create items for all of them? Lucywood (talk) 12:59, 17 April 2017 (UTC)
Just to add I see no harm in creating them on Wikidata, but I can't see them getting much use either. However, be aware that there are several classes of wards, and these should be given distinct groupings - ward or electoral division of the United Kingdom (Q589282) may not be a sensible concept.
The various types include:
  1. County electoral divisions (eg Tonbridge for Kent County Council)
  2. Unitary Authority electoral divisions (eg Bugle for Cornwall Council)
  3. "Normal" wards (eg Axminster Rural for East Devon District Council)
  4. English Parish Council wards (eg North for Tavistock Town Council)
  5. Welsh Community Council wards (eg Plymouth for Penarth Town Council)
AFAIK, ONS codes are only applied to the wards that elect to councils with district-level (or unitary) powers.--Nilfanion (talk) 18:12, 22 April 2017 (UTC)

Looking to do DNB queries and data population ...[edit]

Here to seek some help.

For enWS, we have a mechanism to check that each article of DNB00/01/12 is in WD (done), and we can run a check to note that each WD item has a main subject (excluding the instances of DNB redirect). What I would now like to ensure is that we have the reciprocal of DNB item/main subject:person item <-> person item/described by:DNB00/01/12 (qualified) stated in:DNB item. Note that we have a number of instances where some have a direct described by "DNB item", often as a duplicate, that we need to remove after we are sure that we have the correct "described by" statements in place. I suspect that we are going to need to do SPARQL queries to work it out.

Hope that you can help. Thanks.  — billinghurst sDrewth 11:18, 6 May 2017 (UTC)

Follow-up, once we have the relationships in place, please hold the queries as then we can look to populate family names from the articles "Surname, Given name ... (DNBXX)" through to the respective people items.
@billinghurst: Let's see if I can translate the above into queries, to see whether I have understood correctly what you've told me (and what you are looking for).
So currently we have 30,684 items tinyurl.com/lmg9aop that are published in (P1433) Dictionary of National Biography (1885-1900) (Q15987216) or Dictionary of National Biography, first supplement (Q16014700) or Dictionary of National Biography, second supplement (Q16014697); and these all have a link to en-wikisource tinyurl.com/kc82p8k (number doesn't change if we add that latter requirement). From what you have written above I infer that you are able independently to confirm that this is the number that there should be.
That number falls to 30,289 if we require that each article-item has a main subject (P921) tinyurl.com/m4686wh.
The remaining 404 tinyurl.com/lwrzpy3 are redirects at wikisource, eg Audelay (DNB00) (Q19052970)[8], and you believe that this is the number that there should be.
These 404 are all tagged as instance of (P31) DNB redirect page (Q19648608) (tinyurl.com/mgeaw9o)
However, only 22,938 (tinyurl.com/ks9lmxr) of those 30,289 subject items have described by source (P1343) the expected release of the DNB; leaving 7351 which do not tinyurl.com/mabf22y -- however this information could now be added from the results of this query using Quick Statements (although there may be a few more checks we want to make first).
Updated, to exclude redirects: tinyurl.com/mzn276h (7219)
Of the 22,938 there are 22,616 that have an appropriate stated in (P248) qualifier linking back to the article item (tinyurl.com/kwl5wnq).
This leaves 451 that don't have a stated in (P248) qualifier linking back to the article item. (tinyurl.com/kbs9xqy).
However, looking at this list reveals some oddities. For example:
So there may be some more checking needed on those main subject (P921) statements before adding the inverses in bulk.
Is that the sort of investigation you were looking for ? Jheald (talk) 14:19, 6 May 2017 (UTC)
Correction on that last query. It was getting confused if there were two different DNB articles both describing (or being purported to describe) the same person.
Here's a revised query, with 323 hits, for when there is a link back to the right release of the DNB, but not the original article-item:
tinyurl.com/mxsjz35. Jheald (talk) 14:33, 6 May 2017 (UTC)
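The check described above (article items published in one of the three DNB releases, with a main subject, where the subject has no matching described by source statement qualified with stated in) can be sketched as a SPARQL query built in Python. This is a minimal sketch, not the query behind the tinyurl links, which is not reproduced on this page. It assumes the standard Wikidata Query Service prefixes (wd:, wdt:, p:, ps:, pq:); the function name is hypothetical.

```python
# The three DNB releases named in the thread (main work and two supplements).
DNB_EDITIONS = ("Q15987216", "Q16014700", "Q16014697")


def missing_link_back_query(editions=DNB_EDITIONS):
    """Build SPARQL for articles whose main subject has no qualified link back."""
    values = " ".join(f"wd:{qid}" for qid in editions)
    return f"""
SELECT ?article ?subject WHERE {{
  VALUES ?edition {{ {values} }}
  ?article wdt:P1433 ?edition ;     # published in
           wdt:P921 ?subject .      # main subject
  # Skip items tagged as DNB redirect pages.
  FILTER NOT EXISTS {{ ?article wdt:P31 wd:Q19648608 }}
  # Keep only subjects lacking described by source + stated in (qualifier).
  FILTER NOT EXISTS {{
    ?subject p:P1343 ?statement .
    ?statement ps:P1343 ?edition ;
               pq:P248 ?article .
  }}
}}
""".strip()


if __name__ == "__main__":
    print(missing_link_back_query())
```

The printed text can be pasted into the query service; the counts it returns will differ from those quoted above as the data changes.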
Pretty much. Let me look at fixing the errors firstly, then we can review where we are.

Note that DNB articles can refer to multiple people, so we may not have a one-to-one relationship in that direction (though my guess is that we may be missing a number of those, and I have an inkling how to track them)  — billinghurst sDrewth 15:00, 6 May 2017 (UTC)

@billinghurst: Turns out that most of those 323 were redirects, eg Falconberg (d.1471) (DNB00) (Q19019837) (but which had their own "main subject" property, which is why they were being included).
Excluding the redirects brings the number down to 37 (tinyurl.com/lmqfau3) that have a "main subject", but where the subject does not have a "stated in" in turn. Jheald (talk) 15:09, 6 May 2017 (UTC)
For those 37, it looks as if some do have a "stated in", but it's been added as a reference not a qualifier (I was specifically looking for it as a qualifier). Let me know if you'd like an example query to look for this.
Also, there may be some pseudonyms (eg Sawtrey, James (DNB00) (Q19024116)), where there is a link-back from the subject to the subject's main DNB article, but not the DNB article for their pseudonym. Jheald (talk) 15:17, 6 May 2017 (UTC)
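For the case mentioned above, where "stated in" has been added as a reference rather than a qualifier, a query might look like the following. This is a sketch under assumptions: it uses the standard Wikidata Query Service modelling of references (prov:wasDerivedFrom and the pr: prefix), it handles one DNB release at a time, and the function name is hypothetical.

```python
def stated_in_as_reference_query(edition="Q15987216"):
    """Build SPARQL for subjects whose link back to the article sits in a reference."""
    return f"""
SELECT ?subject ?article WHERE {{
  ?article wdt:P1433 wd:{edition} ;   # published in
           wdt:P921 ?subject .        # main subject
  ?subject p:P1343 ?statement .       # described by source
  ?statement ps:P1343 wd:{edition} ;
             prov:wasDerivedFrom ?reference .
  # stated in appears inside the reference ...
  ?reference pr:P248 ?article .
  # ... but not as a qualifier on the same statement.
  FILTER NOT EXISTS {{ ?statement pq:P248 ?article }}
}}
""".strip()


if __name__ == "__main__":
    print(stated_in_as_reference_query())
```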
That is lots of twists and turns to get my head around at this late hour. I think that I need to consolidate and clean the oddballs first. I would also like to standardise the DNB redirect set: 1) I don't think that they should have main subjects (they are redirects), 2) it seems worthwhile them having the "of" statement.  — billinghurst sDrewth 15:21, 6 May 2017 (UTC)
I am happy for you to give me lists of inconsistent data approaches/errors/weirds and I will fix those.  — billinghurst sDrewth 15:31, 6 May 2017 (UTC)
@billinghurst: 23 where the "stated in" is in a reference, not a qualifier: tinyurl.com/nxf5e42 ✓ Done
16 "weird" tinyurl.com/ny5ujhx (might include some double-counting). Jheald (talk) 15:32, 6 May 2017 (UTC) ✓ Done
@billinghurst: 816 redirects with no "of": tinyurl.com/l8qdmzj
some of which have a "main subject"; and some of the subject items have a "stated in". Jheald (talk) 15:48, 6 May 2017 (UTC)
Comment I am proposing to enWS that we delete the DNB redirect 'articles' and accordingly delete the Wikidata items. They are a redundancy, placeholders for a book; we don't need them as we can manage by web means. So will park that cpt for the moment.  — billinghurst sDrewth 05:53, 7 May 2017 (UTC)
@billinghurst: I have now added described by source (P1343) + stated in (P248) + imported from (P143) corresponding Wikidata item (Q20651139) for most of the 7000 subjects that didn't have it (see Special:Contributions/JhealdBatch), with the rest going in as we speak.
Will be away from my computer now until much later, but it should be easier to pick up & deal with anomalies once this lot are all in. Jheald (talk) 07:50, 7 May 2017 (UTC)
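The batch described above (described by source (P1343), qualified with stated in (P248), and referenced with imported from (P143) corresponding Wikidata item (Q20651139)) could be prepared from query results roughly as follows. This is a sketch only: the actual commands used for the batch are not shown on this page. It assumes the QuickStatements version 1 text syntax, in which fields are tab-separated, a qualifier follows the statement as a property/value pair, and reference properties are written with an S prefix instead of P. The function name and the input layout are hypothetical.

```python
def quickstatements_rows(triples):
    """Turn (subject, edition, article) QID triples into QuickStatements v1 rows.

    Each row adds: subject -> described by source (P1343) -> edition,
    with qualifier stated in (P248) -> article,
    and reference imported from (S143) -> Q20651139.
    """
    rows = []
    for subject, edition, article in triples:
        fields = [subject, "P1343", edition, "P248", article, "S143", "Q20651139"]
        rows.append("\t".join(fields))
    return rows


if __name__ == "__main__":
    import sys

    # Expects one "subject<TAB>edition<TAB>article" line per result on stdin,
    # for example an export of the query results.
    triples = [line.rstrip("\n").split("\t") for line in sys.stdin if line.strip()]
    print("\n".join(quickstatements_rows(triples)))
```

Taking the edition from each result row, rather than hard-coding it, keeps the statement pointing at the release the article was actually published in.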

State of play

@billinghurst:
The only wrinkle is Henry Elsynge (Q15072637) which does have a described by source (P1343) back to the DNB, but one that has been ranked deprecated, as "misinformation" -- see link on talk page for more.
I think that we can live with the qualification, it is what it is.  — billinghurst sDrewth 14:12, 9 May 2017 (UTC)
@billinghurst: Also the following query, which looks for where the same subject item links back to more than one DNB item, might be worth checking through, just to make sure they're all kosher: tinyurl.com/m7bl37o; currently returns 155 rows. Jheald (talk) 12:40, 9 May 2017 (UTC)
✓ Done fixed misapplied, and applied DNB redirects.  — billinghurst sDrewth 12:56, 10 May 2017 (UTC)
Excellent query, shows up redirects, wrongly attributed and over-enthusiastic. Will take a little while to work through.  — billinghurst sDrewth 14:12, 9 May 2017 (UTC)
@billinghurst: Though on reflection, it may be entirely appropriate that there are additional people and things that can be said to be "described by" a biographical article, in addition to its main subject. Cf this query for items that are not humans "described by" DNB articles tinyurl.com/kkdbsle -- most of which may be entirely appropriate.
So perhaps one should also (or first) look at this way round: tinyurl.com/mqqhuw6 -- articles which have more than one "main subject"; and/or tinyurl.com/mgd9t24 articles where the main subject is not human. Jheald (talk) 14:34, 9 May 2017 (UTC)
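The first of the checks discussed above, subject items that link back to more than one DNB article item, can be sketched with GROUP BY and HAVING. This is an illustrative sketch rather than the query behind the tinyurl link; it assumes the standard Wikidata Query Service prefixes, and the function name is hypothetical.

```python
# The three DNB releases named earlier on this page.
DNB_EDITIONS = ("Q15987216", "Q16014700", "Q16014697")


def multiple_articles_per_subject_query(editions=DNB_EDITIONS):
    """Build SPARQL for subjects whose statements point at more than one article."""
    values = " ".join(f"wd:{qid}" for qid in editions)
    return f"""
SELECT ?subject (COUNT(DISTINCT ?article) AS ?articles) WHERE {{
  VALUES ?edition {{ {values} }}
  ?subject p:P1343 ?statement .     # described by source
  ?statement ps:P1343 ?edition ;
             pq:P248 ?article .     # stated in (qualifier)
}}
GROUP BY ?subject
HAVING (COUNT(DISTINCT ?article) > 1)
ORDER BY DESC(?articles)
""".strip()


if __name__ == "__main__":
    print(multiple_articles_per_subject_query())
```

The reverse check (articles with more than one main subject) follows the same pattern, grouping on ?article over wdt:P921 instead.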
This leaves 487 such items with no "of" (query: tinyurl.com/l8qdmzj). Typically these have a "main subject" that is a title, eg "Earl of X", rather than a person. There are also a fair number with no main subject (P921).
I'll leave it to the project to consider whether keeping these redirects is useful. But it may be handy to keep them around, to make sure that e.g. such alt forms are reflected in aliases on the "main subject" item, etc.
  • I won't go adding back any more (DNB00)s etc (diff, diff). But if ppl do seriously want to get rid of these, there are about 30,000 to go... Jheald (talk) 12:22, 9 May 2017 (UTC)
  • Comment @Charles Matthews: for the local DNB project page we should grab the queries that are usable for ongoing checks and other maintenance.  — billinghurst sDrewth 12:56, 10 May 2017 (UTC)

DNB articles

Note that ultimately the DNB articles will all be moved to be subpages of the work, not root level items where they currently sit. It is a quirk of the time when the project started that they sit with their suffix. As to the suffix in items here, or for those that are subpages, the guidance here is that the title is kept simpler, with clarification taking place in the description. So the DNB00/01/12 suffixes have been disappearing, though we have kept the years of life. For other works, we have been removing the book title from the subpage, and showing the article, and the name of the work in the descriptor. There are lots to come, as adding them has been problematic up to this point.  — billinghurst sDrewth 11:50, 9 May 2017 (UTC)

I'll bow to whatever the project thinks best... Personally I do quite like the (DNB00)s and similar on items, as scarecrows to stop people linking to items of the wrong sort (or merging them). But whatever people want. Jheald (talk) 12:26, 9 May 2017 (UTC)
I've argued against removing the suffix style in the past, on grounds of ease of search on Wikisource (which I use all the time). There is a possible technical fix in the search, there. Here, I'm a fan of the suffix style for the same reason as James. Charles Matthews (talk) 13:07, 10 May 2017 (UTC)

The Thames at Westminster (Q19660486)

I noticed that this one was created by Poulpybot and had no external link to an NT website, but of course these are also on Art UK. I wonder if you know how to go through and update these (I didn't check how many have been created) with the Art UK links; that would be a worthwhile thing to do, in my opinion. Jane023 (talk) 13:04, 21 September 2017 (UTC)

@Jane023: Hmm. Looks like one can do a search for "National Trust" at Art UK to get pages like this, then follow the link to each pic to get the NT accession number (and NT URL where available), then match on the NT accession number against NT pics here. Shouldn't be too much of a challenge to script that, I'll put it on my to-do list, but I can't promise immediate action -- or would it be helpful to have these links urgently? Jheald (talk) 15:27, 21 September 2017 (UTC)
Seems we only have 25 National Trust paintings in the system at the moment though, tinyurl.com/ybng96om. A motley bunch, doesn't seem to be much rhyme or reason to them. Jheald (talk) 15:31, 21 September 2017 (UTC)
Thanks! No I don't need them urgently - I only noticed because I was working on Canaletto and picked up a few. I think the NT website has permanent urls, and the Art UK site does too, so I thought it might be a good idea to get the back end of both of these hooked up somehow with the Wikidata paintings (at least for the collections we have, so e.g. Tate, NPG, NG etc) Jane023 (talk) 15:38, 21 September 2017 (UTC)
Ha I see now that I have done most of these probably, working with Art UK images, such as The Sense of Taste (Q29569637). Jane023 (talk) 15:40, 21 September 2017 (UTC)
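The matching step suggested above (join Art UK records to National Trust paintings here on the NT accession number) might be sketched as below. This is only an outline of the join: it assumes the Art UK records and the Wikidata items have already been collected into lists of dictionaries, and the field names (accession, art_uk_url, qid) and function names are hypothetical. How the Art UK pages are actually fetched is not covered.

```python
def normalise(accession):
    """Collapse whitespace and case so ' nt  123 ' and 'NT 123' compare equal."""
    return " ".join(accession.split()).upper()


def match_by_accession(art_uk_records, wikidata_items):
    """Return (qid, art_uk_url) pairs for records sharing an accession number."""
    # Index the Wikidata side once, keyed on the normalised accession number.
    by_accession = {normalise(item["accession"]): item["qid"] for item in wikidata_items}
    matches = []
    for record in art_uk_records:
        qid = by_accession.get(normalise(record["accession"]))
        if qid is not None:
            matches.append((qid, record["art_uk_url"]))
    return matches
```

Records with no counterpart are simply left out, so the unmatched remainder can be reviewed by hand.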

Shand Mason

I seem to have done something wrong and brought about this:[9]

I've tried to understand but I can't. How do I learn what my mistake was? Thanks, Eddaido (talk) 21:55, 23 September 2017 (UTC)

@Eddaido: What's the problem? On 11 August you created a sitelink from en:Shand Mason on English Wikipedia to c:Category:Shand Mason fire engines, a sitelink which seems entirely reasonable, in the process creating the Wikidata item Shand Mason (Q35956189), which is great -- every English Wikipedia article ought to have a corresponding Wikidata item.
Today a batch process of mine has added a Commons category (P373) property to the Wikidata item, because sometimes these are easier to deal with for some purposes than sitelinks (and sitelinks can't always be created, eg if the link is already taken by a category). So everything seems fine, just as it should be. Jheald (talk) 22:18, 23 September 2017 (UTC)
I have added a few more statements to the item here. Nothing too incorrect, I hope. Jheald (talk) 22:43, 23 September 2017 (UTC)