User talk:Sic19


Welcome to Wikidata, Sic19!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, please ask me on my talk page. If you want to try out editing, you can use the sandbox to try. Once again, welcome, and I hope you quickly feel comfortable here, and become an active editor for Wikidata.

Best regards!

Notability

You seem to be running an unapproved bot to create quite a few items of questionable notability. Please stop and discuss this first at the Wikidata:Project chat. Multichill (talk) 23:19, 5 December 2015 (UTC)

I'm using quick statements, not running a bot. Every item is a clearly identifiable material entity - Europeana IDs will be added.--Sic19 (talk) 12:39, 6 December 2015 (UTC)
Quick statements is also an automated tool. Please stop or I'll do it for you. Multichill (talk) 12:58, 6 December 2015 (UTC)
You have been blocked. Wikidata is not a site where you can just dump data. If someone asks you to discuss something first, you're not going to just continue. Your items have several issues:
  • Why are these photographs notable enough to have an item? This is the main issue. Besides that:
  • Labels are incorrect. "NLW3365149" at no label (Q21670427) looks like an inventory number, that shouldn't be in the label
  • Descriptions should be expanded. Something like "photograph by John Thomas"
  • inventory number (P217) is missing, this will give a huge number of constraint violations
  • The same Commons category (P373) shouldn't be on every item
If you agree to stop creating new items until the discussion is finished I'll unblock you right away. Multichill (talk) 13:15, 6 December 2015 (UTC)


I wasn't sure exactly what or where to discuss - I read a lot of posts but wasn't sure where I should start a new discussion. But, yes let's discuss it and decide what happens next. If you unblock I won't create items while we discuss and I'll respect the outcome. So do we discuss here or in chat? --Sic19 (talk) 13:36, 6 December 2015 (UTC)

Sorry for having to pull the emergency brake on you. I'll open a topic at Wikidata:Project chat#Notability of photographs in a minute. Let's continue over there so we can get input from other people. Multichill (talk) 13:50, 6 December 2015 (UTC)
✓ Done, see Wikidata:Project chat#Notability of photographs. Multichill (talk) 14:14, 6 December 2015 (UTC)
Hi Sic19, sorry for getting back to you after such a long time. I was meaning to do it earlier. Looking at Wikidata:Project_chat/Archive/2015/12#Notability_of_photographs I would say there are no objections to continue with this batch of photographs. I talked about this with some other people who are active here. I think we want artworks on wikidata, not artifacts. Not all photographs are artworks, only a small subset of them due to age or who made them. Still a pretty grey area and I'm not sure if this wording is correct. I just don't want to have an item for every image in Commons:Category:Images from the Tropenmuseum (or at least not yet).
Sorry for getting in your way. If you need any help, feel free to ping me. Multichill (talk) 20:39, 3 February 2016 (UTC)
Hi Multichill, thanks for getting back to me. I'm glad that there are no objections as it is an interesting collection and working on it in this way is a good way to establish links between the people depicted and other sources of information about them. In this instance surely some of these photographs are the only extant image of these people so I think that makes the effort worthwhile. But it is fair to say that working on the collection has not been as straightforward as I had envisaged, and there has been quite a bit of trial and error to find the best solutions to represent the collection. This is ongoing. It is also apparent that creating these items brings with it a need to create other items, particularly for what they are depicting - something that is obvious in hindsight. I understand your point concerning photographs as artworks: for example, it would be hard to justify an item for every celebrity image that is published, and it becomes necessary to distinguish between the iconic and the ephemeral. Some of the Tropenmuseum's photographs are fantastic and, in my opinion, should have wikidata items, but in other instances the image is merely a surrogate for the artefact, so having an item for the image is probably an unnecessary level of abstraction. It is a difficulty I encountered with the photograph actually being a glass plate negative, which has an access print that now has a digital facsimile - I certainly don't want three items but I find it hard to represent this distinction in a single item. The joys of classification!
I have a few items that seem to be fairly good examples of photographs with rich metadata if anyone needs a guide for similar items e.g. Q21667894 ; Q21667482 ; Q21668521 and I'd be glad to share my experience of working on this collection if anyone else is undertaking anything similar.
Also I'm working on a small collection of watercolour paintings at the moment and I'd like to get involved with the sum of all paintings project. This probably isn't the best place to discuss that but I know that you are involved in that project.
All the best,Sic19 (talk) 22:53, 3 February 2016 (UTC)

digital image (Q1250322)

Please do not add digital image (Q1250322) as an instance of a historical work of art (for instance, a 19th century photograph). Each item should be about the original work, and digital images are fairly recent. Andreasm háblame / just talk to me 05:09, 14 December 2015 (UTC)

Invitation to Wikidata user study

Dear Sic19,
I am a researcher of the Web and Internet Science group of the University of Southampton.
Together with a group of other researchers from the same University, we are currently conducting research to discover how newcomers become full participants in the Wikidata community. We are interested in understanding how the usage of tools, the relationships with the community, and the knowledge and application of policy norms change from users' first approach to Wikidata to their full integration as fully active participants.
This study will take place as an interview, either by videotelephony, e.g. Skype, phone, or e-mail, according to the preference of the interviewees. The time required to answer all the questions will likely be about an hour. Further information can be found on the Research Project Page Becoming Wikidatians: evolution of participation in a collaborative structured knowledge base.
Any data collected will be treated in the strictest confidentiality, no personal information will be processed for the purpose of the research. The study, which has submission number 20117, has received ethical approval following the University of Southampton guidelines.
We aim at gathering about 20 participants, chosen among experienced Wikidata users who authored a large number of contributions.
Should you be interested in taking part or wish to receive further information, you can contact us by writing to the e-mail address ap1a14+wikidata_user_study@ecs.soton.ac.uk.
Thank you very much, your help will be much appreciated!
--Alessandro Piscopo (talk) 23:21, 25 May 2016 (UTC)

National Library of Wales

Please have a look at, for example, Q25906415. The height is set to 178 and the width to 145, while the source says: 87*145 millimeter and 178*240 millimeter. I'm afraid it is this way for many items of this kind. I don't know how you got the data, but your code for doing this was wrong. --Molarus 11:38, 8 August 2016 (UTC)

Thank you for pointing this out - I will update with the correct data.Sic19 (talk) 12:04, 8 August 2016 (UTC)
I have seen that the error was already happening in commons, you just copied the data from commons to Wikidata. That is even worse, because this way we don't get the right data. I don't know what to do, maybe we should ask at Wikidata:Project chat? --Molarus 12:06, 8 August 2016 (UTC)

Hey colleague, you might have meanwhile noticed that I work on a repair job for the physical dimensions of prints in the Welsh Landscape Collection (Q21542493). I plan to adjust all wrong numbers, and there seem to be plenty of them, and add sources as well. This will require another iteration of edits over the entire collection (~4600 items, at least two edits per item). Since you appear to be experienced with this set of items, is there anything I should know before I start (which will be tomorrow at earliest)?

Details: I plan to add image sizes, specifically height times width in millimeters, as given on the linked web pages, to height (P2048) and width (P2049) properties; right now I don’t plan to add paper sizes, and neither do I plan to add qualifiers; however, this is not a final plan and I am very open for input. Thanks and regards, MisterSynergy (talk) 16:58, 13 June 2017 (UTC)

Hi, thanks for your message - this sounds fantastic! I would really appreciate it if you can do this work on the Welsh Landscape Collection prints. Please note that all of the height and width values with a reference are correct. Best regards, Sic19 (talk) 17:23, 13 June 2017 (UTC)

  • Good to know, thanks!
  • Just to make sure: NLW provides dimensions always as height x width, unlike many other collections, right?
  • If I have further questions before I start, I will also come back to you :-)
MisterSynergy (talk) 17:27, 13 June 2017 (UTC)

Yes, the dimensions for this collection are all height x width.Sic19 (talk) 17:39, 13 June 2017 (UTC)

Do you have contact to NLW? I could point to a couple of entries which would benefit from a repair on their side… —MisterSynergy (talk) 17:47, 13 June 2017 (UTC)
Jason.nlw (talk · contribs · logs) is Wikipedian in Residence at the NLW. Sic19 (talk) 18:00, 13 June 2017 (UTC)
@MisterSynergy: Hi! Happy to help in any way i can. Jason.nlw (talk) 07:07, 14 June 2017 (UTC)

@Jason.nlw: Nice, let’s try something! For the correction job I need information about image sizes. I have crawled JSON files (whoever is responsible for their availability: thanks a lot!) for all 4669 items from the website of the collection, and took the image size information from the “Physical description” field. The following entries do not contain interpretable image sizes:

  • Q21626279 (1130695): 1 print : engraving, b&w ; image size 79115 mm., paper size 180 x 250 mm.
  • Q23690604 (1130274): 1 print : engraving, b&w ; image size.mm., paper size 80 x 105 mm.
  • Q23719782 (1133324): 1 print : lithograph, b&w ; paper size 256 x 319 mm.
  • Q23719811 (1133367): 1 print (2 images): engraving, b&w ; paper size 440 x 290 mm.
  • Q23719831 (1133385): 1 print : engraving, b&w ; image size.mm., paper size 140 x 208 mm.
  • Q23729939 (1133592): 1 print (5 images) : engraving, b&w ; paper size 291 x 224 mm.
  • Q23767806 (1133456): 1 print : aquatint, b&w ; image size 13378 mm., paper size 234 x 296 mm.
  • Q24069254 (1133777): 1 print : aquatint, col. ; image size 390 mm., paper size 502 mm.
  • Q24256073 (1130984): 1 print : engraving, b&w ; image size * mm., paper size 257 x 153 mm.
  • Q25906369 (1130011): 1 print : lithograph, b&w ; image size see notes, paper size 335 x 415 mm.
  • Q25907582 (1130987): 1 print : engraving, b&w ; paper size 342 x 270 mm.
  • Q25910635 (1131532): 1 print : engraving, b&w ; image size 106165 mm., paper size 125 x 200 mm.
  • Q25915160 (1131868): 1 print (2 images) : engraving, b&w ; paper size 288 x 237 mm.
  • Q25917090 (1133425): 1 print : aquatint, col ; image size 240 mm., paper size 225 x 310 mm.

Whether or not those are actual mistakes is beyond my knowledge. However, if someone in the collection could check these values, I’d re-crawl those few JSONs and include the Wikidata items in the correction job as well.

There are a couple of entries which have “sub-optimal” information as well, but it is still interpretable and thus not that important for me. If NLW is interested, I could also look for problems with paper sizes. In case of questions feel free to ask! —MisterSynergy (talk) 07:36, 14 June 2017 (UTC)
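For anyone who wants to repeat this kind of check, the filtering described above could be sketched roughly as follows. This is only an illustration, not the script that was actually used: the function name `parse_size` and the regular expression are assumptions inferred from the example strings in the list, and the well-formed sample string is constructed from the dimensions quoted earlier in this thread for Q25906415.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: "<label> <height> x <width> mm", as seen in the quoted
# "Physical description" strings. This is inferred from the examples on this
# page and is not a documented NLW format.
_SIZE = r"(\d+(?:\.\d+)?)\s*[x×*]\s*(\d+(?:\.\d+)?)\s*mm"


def parse_size(description: str, label: str) -> Optional[Tuple[float, float]]:
    """Return (height_mm, width_mm) for the given label, or None if unreadable."""
    pattern = rf"{re.escape(label)}\s+{_SIZE}"
    match = re.search(pattern, description, flags=re.IGNORECASE)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Hypothetical well-formed entry (numbers taken from the Q25906415 example).
good = "1 print : engraving, b&w ; image size 87 x 145 mm., paper size 178 x 240 mm."
# Entry quoted in the list above, where the image size cannot be interpreted.
bad = "1 print : engraving, b&w ; image size 79115 mm., paper size 180 x 250 mm."

image_ok = parse_size(good, "image size")
image_bad = parse_size(bad, "image size")
```

Entries for which `parse_size(..., "image size")` returns `None` would be the ones to report back to the collection, while the paper size can still be read from the same string where it is well formed.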

@MisterSynergy: Thanks for your help with this. I have spoken with our metadata team and it looks as though these are errors from when the data was created. I will need to arrange for the images to be physically remeasured. The metadata team can then update the catalogue entries and the JSON files will be corrected. The only difficulty might be where there are multiple images on the same ID. With those 3 I can probably update the wikidata items manually but the metadata standard will not allow us to put in multiple image sizes for one item. I will try and get back to you ASAP. Thanks again! Jason.nlw (talk) 08:28, 14 June 2017 (UTC)

I am still preparing, but the update code is already in good shape. I have another issue to discuss with the two of you: NLW provides information about image size and paper size on their collection website. I learnt that image sizes are usually stored in Wikidata’s height (P2048) and width (P2049), and paper sizes (in general: all kinds of frames) have been more or less ignored until now. Therefore I suggest removing all applies to part (P518) image (Q478798) qualifiers from those claims; we have only around 25 of them per dimension property in this collection right now. The P2048 and P2049 claims of paper size (only 6 as far as I see) should not be deleted and should keep their applies to part (P518) paper (Q11472) qualifier, but where both are present the image size data gets preferred rank. That setup would make data about this collection consistent with other data here, while not losing anything we already have. Any objections? I’d be able to update items as proposed otherwise. Regards, —MisterSynergy (talk) 06:18, 15 June 2017 (UTC)
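The proposed clean-up rule can be sketched on a simplified data structure. This is a hedged illustration only: the dict layout (`value_mm`, `qualifiers`, `rank`) and the function name `normalise_dimension_claims` are hypothetical and do not mirror the real Wikibase JSON or the update code mentioned above. Only the property and item IDs come from the discussion.

```python
from typing import Dict, List

APPLIES_TO_PART = "P518"  # applies to part
IMAGE = "Q478798"         # image
PAPER = "Q11472"          # paper


def normalise_dimension_claims(claims: List[Dict]) -> List[Dict]:
    """Apply the proposed rule to the claims of one dimension property.

    Each claim is a simplified, hypothetical dict:
    {"value_mm": number, "qualifiers": {property_id: item_id}, "rank": str}
    """
    result = []
    for claim in claims:
        qualifiers = dict(claim.get("qualifiers", {}))  # copy, keep input intact
        # Image sizes become the plain, unqualified value.
        if qualifiers.get(APPLIES_TO_PART) == IMAGE:
            del qualifiers[APPLIES_TO_PART]
        result.append({
            "value_mm": claim["value_mm"],
            "qualifiers": qualifiers,
            "rank": claim.get("rank", "normal"),
        })

    has_paper = any(c["qualifiers"].get(APPLIES_TO_PART) == PAPER for c in result)
    has_image = any(APPLIES_TO_PART not in c["qualifiers"] for c in result)
    # When both sizes are present, the image size gets preferred rank.
    if has_paper and has_image:
        for c in result:
            if APPLIES_TO_PART not in c["qualifiers"]:
                c["rank"] = "preferred"
    return result


height_claims = [
    {"value_mm": 87, "qualifiers": {"P518": "Q478798"}, "rank": "normal"},
    {"value_mm": 178, "qualifiers": {"P518": "Q11472"}, "rank": "normal"},
]
cleaned = normalise_dimension_claims(height_claims)
```

With this rule, queries that only ask for the best-ranked value would see the image size, while the paper size stays available with its qualifier.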

Sounds good to me. Thanks!Sic19 (talk) 16:32, 15 June 2017 (UTC)
@MisterSynergy: Hey. The images with bad data (listed above) should now be updated. Let me know if there are any other issues. Best Jason.nlw (talk)
Thanks for ping. I’ll have a look at it soon. —MisterSynergy (talk) 14:11, 10 July 2017 (UTC)

Hey Sic19 and @Jason.nlw, as you both might have seen on your watchlists, I am meanwhile running the repair job and the first ~1200 of 4655 items have already been updated this night. Sorry for the delay, I was way too busy in real life to invest time here earlier. The update now adds references in line with the database model in Help:Sources (i.e. much more re-usable than the few old references), it corrects wrong dimension claims (mostly height (P2048)), and it adds data in the few cases where it was missing (all image sizes as provided by NLW). The job will still take until at least tomorrow, since I can’t edit faster than 1 edit per 10 seconds.

Now I also found a couple of further flaws with the items of the collection, and I suggest fixing them as well once the dimensions are okay. They are not terribly bad, but since we are working on these items anyway these days and the repair isn’t that complicated, I’d like to bring it up here:

What’s your opinion about my ideas? Regards, MisterSynergy (talk) 08:54, 17 September 2017 (UTC)

Hey. Thank you very much for investing your time to fix issues with this collection! Your proposed fixes all sound fine to me, but i would like Sic19's input before going ahead with the changes. Cheers! Jason.nlw (talk) 09:29, 18 September 2017 (UTC)
I'm not completely against these changes but I'd question whether they're really necessary. Whilst the work on the dimensions is fixing an error, I think this is just a different way of doing the same thing. Primarily, my concern is that queries on the NLW collection will miss this entire collection unless they also include subclasses in the query - this might cause issues in a lot of example queries that we have made and shared. Please can we discuss this further before you do any work on these aspects of the collection. Thanks for your help. Sic19 (talk) 13:45, 18 September 2017 (UTC)
  • Thanks for your answer. I will of course not change anything before we agree on a solution, and you have brought up valid concerns.
  • Do you happen to know which of the properties I suggested to change are actually in use in queries? If we could just move part of (P361) to collection (P195) (which would then have two values, the library and the collection), the situation would already be okay. Right now we have the problem that part of (P361) is a symmetrical property with has part (P527), but we can't add 4670 values in the collection item with has part (P527).
  • The repair job will probably be finished this evening. There are some hundred items to go.
MisterSynergy (talk) 13:58, 18 September 2017 (UTC)
Best, Sic19 (talk) 18:33, 18 September 2017 (UTC)
If I was to set this up from scratch, I’d add to the ~4670 items (but not the P195 statements with the library item), and to the collection item itself. This would be very “queryable” with SPARQL:
A very similar situation is to some extent already set up for Peniarth Manuscripts (Q11019402) and John Ingleby Watercolours (Q21731178), which are other collections at the NLW. For Peniarth Manuscripts (Q11019402), the very few items of that collection still have a part of/has part relation, but this only works because the number of elements is apparently very small (or this dataset is incomplete, I don’t know). For the other, John Ingleby Watercolours (Q21731178), there are redundant collection (P195) claims with the library item as value in the collection items. However, you could already use those two collections for query testing and you would only have to use P195, as outlined above. Do you see any problem with such an approach? (If you need help with the queries, I’d provide them to you.) —MisterSynergy (talk) 19:11, 18 September 2017 (UTC)
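As a rough illustration of the P195-only approach discussed here, a query for the members of one collection could be built like this. It is a sketch under assumptions and not the query originally posted in this thread: the helper name `collection_query` is hypothetical, and the query relies on the `wd:`, `wdt:`, `wikibase:` and `bd:` prefixes that the Wikidata Query Service predefines.

```python
def collection_query(collection_qid: str, limit: int = 10) -> str:
    """Build a SPARQL query listing items whose collection (P195) is the given item.

    The returned text is meant to be pasted into the Wikidata Query Service.
    """
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {collection_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P195 wd:{collection_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


# John Ingleby Watercolours (Q21731178), one of the collections named above
# as suitable for query testing.
query = collection_query("Q21731178")
print(query)
```

Because the collection is matched directly through collection (P195), such a query does not depend on part of (P361) or on walking a class hierarchy.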
Yes, this makes sense and I do think it is a sensible approach. Although it is technically unnecessary duplication, I would like to retain the wdt:P195 wd:Q666063 statements for an interim period to make sure that the Welsh Landscape Collection doesn't disappear from existing example queries, particularly the Histropedia example. Sic19 (talk) 05:31, 19 September 2017 (UTC)
No problem, of course. If you give me a go, I’d move the P361:Q21542493 to P195:Q21542493 with one edit per item. The P195:Q666063 and anything else would be left untouched. —MisterSynergy (talk) 07:36, 19 September 2017 (UTC)
The dimension repair job has now finished. The collection has 4675 items (according to Wikidata). There were 4655 items corrected with my script with the crawled data from NLW, and I looked into a couple of further items manually. There are issues with five items which I cannot resolve:
Inform @Jason.nlw just in case he does not watch this page.
All other items now have image sizes with full references added. Units and bounds are also used correctly in all cases.
The information should somehow be corrected at Wikimedia Commons as well. However, I have no idea how to do this… Thanks for your input and regards, —MisterSynergy (talk) 07:36, 19 September 2017 (UTC)
Hey MisterSynergy, thanks very much for correcting this issue for us. I will take a look at the problems above and see if I can fix them manually. To fix the issue on Commons I think we would need someone who could run a little script on the API. I will have a think on that one. Cheers! Jason.nlw (talk) 10:49, 19 September 2017 (UTC)

Places depicted in artworks

I updated Crotos/Callisto. And wow... what you did on Wales is very impressive and pleasant. Bravo! Best regards. --Shonagon (talk) 15:59, 16 October 2016 (UTC)

Thanks for updating Callisto - it's great to see the collection on a map! All the best, Sic19 (talk) 21:13, 17 October 2016 (UTC)

Categories

Please stop adding categories, until this discussion reaches a consensus. Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:42, 1 July 2017 (UTC)

Henry Charles (Q20733139) and contributed to creative work (P3919)

Hi. I fiddled with your edit on q20733139, though wondering whether rather than qualifying further with an "of" statement that we look to better articulate the qualifier for "subject has role". We could look to create something like "letter writer", "writer of letter" (parts exist for that already) or something else that I cannot think of. Thoughts?  — billinghurst sDrewth 06:12, 7 September 2017 (UTC) (please ping me when responding)

@billinghurst: Hi, thanks for your edits and apologies for my slow reply. I'm thinking that the qualifier subject has role (P2868) - correspondent (Q1155838) is probably sufficient as further information is available from the reference, if required. That said, I wouldn't object to any of the more specific roles that you've suggested, like letter writer, if you choose to represent the relationship in this way. There probably isn't an entirely right answer here. Thanks again for your help and suggestions - happy to discuss further if it will be useful. All the best, Sic19 (talk) 18:51, 18 September 2017 (UTC)

Welsh Journals

Hi Sic19!,

I have started uploading images for pre1880 Welsh journals. I am hoping there will be another batch in the coming days and then i will send you the data if you like - which should make it easier to match the image to the Wikidata items. Best Jason.nlw (talk) 13:49, 19 September 2017 (UTC)

Documentation of WikidataCon

Hello,

Thank you so much for your work on this page! It's incredibly helpful for people who would like to have an overview of the event, and also for our reporting tasks :)

Cheers, Lea Lacroix (WMDE) (talk) 09:47, 2 November 2017 (UTC)

Hey,
No problem at all. Yes, I thought it would be beneficial to have a single page with links to all the notes, slides and videos from WikidataCon 2017. There is a lot of really useful information and hopefully it will now be easy to access.
Thanks for organising WikidataCon. It was a fantastic weekend and I'm glad we have a great series of videos from the event - I know I'll be revisiting some of the sessions very soon.
All the best, Sic19 (talk) 00:01, 3 November 2017 (UTC)

Help

I want to participate in these two projects. What do I need to do to become a participant?

  1. https://www.wikidata.org/wiki/Wikidata:WikiProject
  2. https://www.wikidata.org/wiki/Wikidata:WikiProject_Music

And −−Gilrn (talk) 20:28, 2 November 2017 (UTC)

Sorry, I don't think I can be much help. Perhaps you could post a message on the WikiProject Music discussion page. Sic19 (talk) 21:31, 2 November 2017 (UTC)

Bad data

I reverted you here because you linked to an item about an organisation, not a place. Also, there was already a correct value for that property. Please check your other data. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:58, 7 November 2017 (UTC)

I was thinking in terms of the library network as a spatial entity but, as you've quite rightly pointed out, this item is for the library as an organisation rather than the public library system of a local authority. I'll revisit the other data I created in the same batch and put it right. Thanks for bringing this to my attention. All the best, Sic19 (talk) 17:49, 7 November 2017 (UTC)

"Editor"

It appears that you've made a number of automated edits based on DWB which confused some human titles with other things with the same name, e.g. this edit that had editor (Q985394) when I'm sure it meant editor (Q1607826). I've made the editor (Q985394) -> editor (Q1607826) change while migrating all existing uses of no label (P794) for job titles to position held (P39), but there might be other stray entries with similar mistakes. Deryck Chan (talk) 17:04, 24 November 2017 (UTC)

Yes, I did indeed want to use editor (Q1607826) in these edits. Thanks for correcting and letting me know about it. Sic19 (talk) 18:20, 24 November 2017 (UTC)

"college library" vs. "academic library" -- and "further education library"

I noticed on February 27, 2018 you made college library a subclass of academic library. A little over a week before that I had posted a discussion item to merge the two: https://www.wikidata.org/wiki/Talk:Q1622062#Merge_with_%22academic_library%22_(Q856234) If you check the history of college library, it has changed its subclass hierarchy multiple times (it has been a subclass of academic library before). It would be a good thing to discuss.

Also, rather than creating more specialized subclasses of college library like "further education library", I think it would be better to just call those instances of either academic library, or school library. Dan scott (talk) 22:50, 3 March 2018 (UTC)

Share your experience and feedback as a Wikimedian in this global survey

WMF Surveys, 18:57, 29 March 2018 (UTC)

... House

Note that most buildings in London with names ending in "... House" are not instance of (P31) house (Q3947) -- they are more likely to be large office buildings, corporate headquarters, blocks of mansion flats, etc. Jheald (talk) 17:56, 2 April 2018 (UTC)

Many would actually have been built as houses but your point is valid. I'm trying to add the missing instance of (P31) claims to over 100,000 listed buildings in the UK and I'd prefer not to use the very broad, overarching architectural structure (Q811979) on all of them. If you want to send a list of the items that are definitely incorrect I will amend the instance of (P31) statements. Or perhaps you can offer some constructive suggestions on how to identify an appropriate instance of (P31) for these structures? Simon Cobb (Sic19 ; talk page) 18:45, 2 April 2018 (UTC)

Reminder: Share your feedback in this Wikimedia survey

WMF Surveys, 01:40, 13 April 2018 (UTC)

Your feedback matters: Final reminder to take the global Wikimedia survey

WMF Surveys, 00:50, 20 April 2018 (UTC)

Multilingual captions testing is available

Greetings,

The early prototype for multilingual caption support is available for testing. More information on how to sign up to test is on Commons. Thanks, happy editing to you. - Keegan (WMF) (talk) 17:06, 24 April 2018 (UTC)

Great work on University of Leeds publications

Hi Simon, thanks for your great contributions in this area! I just added University of Leeds (Q503424) to the organization examples in Scholia (Q45340488). Do you use Scholia in your curation workflows? If so, we'd like to know about any suggestions you might have on how to improve the web service, the reference manager or other aspects of the functionality, documentation or presentation. --Daniel Mietchen (talk) 00:13, 8 May 2018 (UTC)

Hanes y Brytaniaid a'r Cymry

While creating some version, edition, or translation (Q3331189) items for several books available online from the British Library, I have just created Hanes y Brytaniaid a'r Cymry (1873 edition) (Q53576694).

I now see that you have already created Hanes y Brytaniaid a'r Cymry (Q29572839).

What do you think would be the best thing to do with the two items? Wikidata:WikiProject Books appears to advocate having two separate items, one for the work and a further one for each identifiable edition. In practice, I'm not sure how far people follow this if there has only been one edition of the work (as appears to be the case here); but I suppose it's always possible that someone might produce a new edition, with eg a new introduction.

Do you think it's worth keeping two items for the book? (In which case, which fields should live where?) Or alternatively, do you think it would be better to merge the two? Jheald (talk) 22:12, 15 May 2018 (UTC)

@Jheald: I think it is OK to merge these items - the only thing that is making me slightly hesitant is catalog code (P528) on Hanes y Brytaniaid a'r Cymry (1873 edition) (Q53576694) which is probably copy specific. In most cases, it is unnecessary to have data about anything more specific than an edition of a work - obvious exceptions are rare books like Gutenberg Bible (Q158075) or First Folio (Q833645) when individual copies might be important enough as objects to have an additional item despite being the same edition. Perhaps I misunderstand the usage of this property?
On a few occasions recently I've spotted duplicates that are different digitised versions of the same edition e.g. On the laws and practice of horse racing (UPenn copy) (Q51514189) and On the laws and practice of horse racing, etc., etc (UC copy) (Q51425849) - they have different identifiers from the same sources but I don't think that alone is sufficient reason for importing both.
Thanks for asking before merging. Simon Cobb (Sic19 ; talk page) 18:22, 17 May 2018 (UTC)
Thanks, Simon.
For what it's worth, the BL 'System' numbers I think in fact do approximate, at least more or less, what we denote as version, edition, or translation (Q3331189) -- so under the 'Holdings' screen in an item's BL catalogue entry, there are sometimes a number of different copies (with different shelfmarks) all matched to the same System number. But on reflection as a result it may therefore have been misleading of me to use catalog code (P528) -- I've been thinking of applying for an external ID property for it instead, as it can be turned into a URL link.
In respect of "the Laws of Horseracing", I think the key issue isn't different identifiers, but that there are different scans available, each one coming from a different source -- that might then, for example, have two different Commons categories of images. I recently raised this as a specific question on Wikidata talk:WikiProject Books, where User:Snipre responded that (at least in his opinion) there should be separate items in such cases. The advantage, I think, is that then one can document in full the details for each scan series -- which physical copy it came from, which identifiers it corresponds to (eg here where one of the scan-series has a corresponding identifier both at the Internet Archive and the Biodiversity History Library) -- having a specific item for this copy means that the two identifiers can be connected.
All the same, at least for myself, I would be slow to create such items, unless I was convinced there was a clear need for them (eg two different Commons categories). So for example with A Topographical Dictionary of Wales (3rd Edition) (Q52243033) I didn't/haven't taken it that step further, so haven't created items to connect eg which Hathi trust or Internet archive copies connect to which Google scans. But if somebody wanted to clarify that information, new items would seem quite reasonable to do so. However, I also strongly think that the version, edition, or translation (Q3331189)-type item should continue to contain a full account of all the online versions/sources available for a particular edition, so if the new items were created IMO it would make sense to connect them via statement is subject of (P805) statements.
Similarly, for "the Laws of Horseracing", I think what is actually needed is a further item for the version, edition, or translation (Q3331189), bringing together all of the available online versions information, using statement is subject of (P805) statements to the further two items to distinguish the different scan families.
(Incidentally, it's quite annoying that both of the "Laws of Horseracing" items are identified as instance of (P31) publication (Q732577) -- this (IMO) is too vague, and instead it would be better for them to be instances of something like "single book, and scans made of it", perhaps a subclass of exemplar of text (Q5419997)).
On the subject of separate items or not for books and editions, it seems that you would think that where there only ever was a single edition, it's probably best to only have a single item. Though if one did go for that, I think the item then ought to be identified as instance of (P31) both book (Q571) and version, edition, or translation (Q3331189), perhaps with the required edition or translation of (P629) statement then pointing to itself.
I think that does make some sense (which is why I have so far been holding off making separate 'book' items for the 230 'edition' items I created earlier this week).
But I'd also like to run it past Wikidata:WikiProject Source MetaData to see what their thinking is on book-items and edition-items -- for example, which is the one that should be cited? And there are some values of the statements I've added that might not work well with a cite template. But I do need to have a good read through the talk pages there first. Jheald (talk) 22:13, 17 May 2018 (UTC)
I changed On the laws and practice of horse racing (UPenn copy) (Q51514189) and On the laws and practice of horse racing, etc., etc (UC copy) (Q51425849) to now be instance of (P31) a new item individual book (Q53731850), and moved most of their statements to a new edition item On the laws and practice of horse racing (1866 edition) (Q53738443); but they could probably be done away with altogether, and the scan provenance instead be distinguished using a collection (P195) qualifier.
Thread open at Wikidata_talk:WikiProject_Books#Does_there_*always*_need_to_be_a_separate_work_and_edition_item? as to whether this is right, or what makes most sense. Jheald (talk) 23:13, 18 May 2018 (UTC)

Structured Data on Commons IRC Office Hour, Tuesday 26 June

Greetings,

There will be an IRC office hour for Structured Data on Tuesday, 26 June from 18:00-19:00 UTC in #wikimedia-office. You can find more details, as well as date and time conversion, at the IRC Office Hours page on Meta.

Thanks, I look forward to seeing you there if you can make it. -- Keegan (talk) 20:54, 25 June 2018 (UTC)

What properties does Commons need?

Greetings,

Structured Commons will need properties to make statements about files. The development team is working on making the software ready to support properties; the question is, what properties does Commons need?

You can find more information and examples to help find properties in a workshop on Commons. Please participate and help fill in the list, and let me know if you have any questions. Thanks! -- Keegan (WMF) (talk) 18:53, 28 June 2018 (UTC)

Structured Data feedback - Depicts statements draft requirements

Greetings,

A slide presentation of the draft requirements for depicts statements on file pages is up on Commons. Please visit this page on Commons to review the slides and discuss the draft. Thank you, see you on the talk page. -- Keegan (WMF) (talk) 21:20, 7 August 2018 (UTC)

newspapers

I found you when I ran this query (Wales lit up very brightly). We're making a major push at the moment to improve newspaper coverage, and would love to have you join in at w:Wikipedia:WikiProject_Newspapers (also note the Wikidata subpage). --99of9 (talk) 07:42, 10 August 2018 (UTC)

Thesis data matching

I've just spotted Modulation of human neutrophil apoptosis by tumour necrosis factor-α (Q56457702) - this has been matched to J. K. Rowling (Q34660). Are you sure about this? The thesis is dated 1998; Rowling wasn't married until 2001, and apparently only rarely uses the married name Murray. Her WP article doesn't mention a doctoral thesis, and it seems very unlikely it would miss something so substantial as completing a doctorate whilst writing her novels...

I'm not sure what algorithm was used for the matching of theses to items here, but it might be worth looking into it for other false positives like this. Andrew Gray (talk) 20:57, 4 September 2018 (UTC)

Structured Data feedback - structured licensing and copyright

Mockups of structured licensing and copyright statements on file pages are posted. Please have a look over the examples and leave your feedback on the talk page. -- Keegan (WMF) (talk) 20:32, 7 September 2018 (UTC)

Wikidata weekly summary #330

New discussion on Commons talk:Structured data

Hello. I've started a new, important discussion about creating properties for Commons on Wikidata. Please come join in, if the process is something that interests you or if you can help. Keegan (WMF) (talk) 16:48, 19 September 2018 (UTC)

Book vs version debate

Hello, I noticed you've been creating a lot of book records. I implore you to please create separate items for versions, instead of adding data to the book item. My main argument is that it is easy to go from 2 items to 1, but significantly more difficult (and sometimes impossible) to do the reverse. Consider Aftermath : remembering the Great War in Wales (Q20599115). According to OpenLibrary, the ISBN-13 is for a 2000 edition, whereas the ISBN-10 is for a 1998 edition. Without external information, it would be impossible to determine which data belongs to the 2000 edition and which data belongs to the 1998 edition. This is why the split between books/versions exists. (This split is based on a model called FRBR, based in library science). Although the folks at WikiProject Books are still in debate as to what exactly Wikidata's book model will be, combining items as you have will make it significantly harder to update records like Aftermath : remembering the Great War in Wales (Q20599115) to whatever that model might be. Especially since your edits appear to be automated, please create separate items for "versions", and link them using the "version of" property. Cheers, --Hardwigg (talk) 05:45, 20 September 2018 (UTC)
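One mechanical way to spot this kind of mix-up is to normalise the ISBN-10 to its ISBN-13 form and compare it with the ISBN-13 recorded on the same item. The sketch below is only an illustration of that check: the function names are made up for this example, and the sample ISBN is a commonly used textbook value, not the record discussed above. It also assumes the usual 978 prefix for converted ISBN-10s.

```python
def normalise(isbn: str) -> str:
    """Strip hyphens and spaces and upper-case a trailing 'x' check digit."""
    return isbn.replace("-", "").replace(" ", "").upper()


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to the equivalent 978-prefixed ISBN-13."""
    digits = normalise(isbn10)
    if len(digits) != 10:
        raise ValueError(f"not an ISBN-10: {isbn10!r}")
    # Drop the old check digit and prepend the 978 prefix.
    core = "978" + digits[:9]
    if not core.isdigit():
        raise ValueError(f"not an ISBN-10: {isbn10!r}")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over 12 digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def same_edition_identifier(isbn10: str, isbn13: str) -> bool:
    """True if the ISBN-10 and ISBN-13 are two spellings of one identifier."""
    return isbn10_to_isbn13(isbn10) == normalise(isbn13)


# Illustrative values only (hypothetical, not taken from Q20599115).
print(same_edition_identifier("0-306-40615-2", "978-0-306-40615-7"))
```

If the check returns False for an item carrying both identifiers, the two ISBNs point at different editions, and the item is a candidate for splitting.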

Thanks for your message. Didn't you notice that I have been creating separate items for each edition, reprint etc.? I understand FRBR but applying it to messy and inconsistent Wikidata items is never going to work so I am adding statements to the existing items to make sure it is clear which version each is supposed to represent and hopefully this will avoid creating duplicates. The current situation is there are a lot of University of Wales Press items (i.e. the book (Q571) items), which do not have enough statements to determine which edition they are supposed to represent. Most of these items are from Welsh Wikipedia articles but not all represent the first edition and that makes it very difficult to reconcile the data I am uploading with existing Wikidata items. My plan is to upload the data for the University of Wales Press editions, which is over 1000 items, and then link the versions to the work - both must first be created before the linking can happen. If there is only one expression/manifestation I do not intend to create a separate item for the work.
Please keep in mind that I am ingesting approximately 250,000 catalogue records into Wikidata and I cannot check each item individually. Also note this is what will happen repeatedly if books/versions do not have statements that disambiguate them without the need to check external sources.
Finally, this item should indeed be two items as you suggest - thanks for flagging it. Simon Cobb (Sic19 ; talk page) 07:33, 20 September 2018 (UTC)
If you're dealing with post-1950s books, has the linked data version of the British National Bibliography [1] been any use? If I've understood what it's meant to do correctly, does it give matches to external identifiers for the values of a lot of the most relevant fields? Jheald (talk) 10:20, 20 September 2018 (UTC)
I tried to query it a while ago and the endpoint was being very temperamental. To be honest, I forgot all about it since then and haven't revisited it. The big problem I am facing is existing items with sparse statements and I think it makes sense to improve these items with the data I have from the National Library of Wales before modelling the relationships between editions. When I finish the ingest of the University of Wales Press corpus, I would welcome feedback and criticism from WikiProject Books participants to develop the workflow for the ingest of more records. Simon Cobb (Sic19 ; talk page) 10:35, 20 September 2018 (UTC)
Thanks for the response, Sic19. I'm excited to see an import of this scale, but want to make sure the data is organized, consistent, and easily queryable. Your plan sounds reasonable, except for the last part. Although WikiProject Books is discussing model changes, the current accepted model is 1 item for the work and 1 item for each edition (as described at Wikidata:WikiProject_Books). No one's saying it's perfect, but we need a model for consistency and that's the one Wikidata has chosen. Individual user edits are harder to regulate, but mass imports should definitely be following this model. Also, if you're ingesting hundreds of thousands of records, I would hope you've confirmed your import plan with at least Wikidata:WikiProject_Books. Is there a discussion somewhere I can look at? I'd be interested in seeing that community's thoughts on this as well.
And you're right, it looks like you did create separate items as well. It looks like ~270 items are combined. I stumbled on 11 of these while my bot was adding Open Library IDs to Wikidata items using ISBNs as identifiers. I noticed I was getting a higher than usual number of "Multiple OpenLibrary items found" warnings, and this seemed to be the cause. Open Library's far from perfect, but these look like genuine differences. These are only the errors I was able to find because a) the book was new enough to have an ISBN, and b) had multiple editions on Open Library. I imagine there are a lot more errors which would be much more difficult to find.
The most logical plan to me seems to be 1) create instanceOf: version items; 2) link these to existing (or new if absent) work items. When in doubt for (1), i.e. something already exists but it is uncertain if it is the same version, create a new item. It's always easier to merge 2 items into one than to split 1 into 2.
--Hardwigg (talk) 20:26, 20 September 2018 (UTC)
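As a rough illustration of steps (1) and (2) above, the following sketch builds QuickStatements (version 1, tab-separated) commands for a new edition item. It assumes the properties named in this thread: instance of (P31), version, edition, or translation (Q3331189) and edition or translation of (P629). The function name and the label are hypothetical, and this is not the workflow actually used for the import.

```python
from typing import List, Optional

INSTANCE_OF = "P31"         # instance of
EDITION_OF = "P629"         # edition or translation of
EDITION_CLASS = "Q3331189"  # version, edition, or translation


def edition_commands(label: str, work_qid: Optional[str] = None) -> List[str]:
    """Build QuickStatements V1 lines that create one edition item.

    If work_qid is None, the link to the work is left out so that it can
    be added in a later pass, once the work item exists.
    """
    # Assumption: swap double quotes so the quoted label stays well-formed.
    safe_label = label.replace('"', "'")
    commands = [
        "CREATE",
        f'LAST\tLen\t"{safe_label}"',
        f"LAST\t{INSTANCE_OF}\t{EDITION_CLASS}",
    ]
    if work_qid is not None:
        commands.append(f"LAST\t{EDITION_OF}\t{work_qid}")
    return commands


# Hypothetical label; Q20599115 is the existing item mentioned earlier.
batch = edition_commands(
    "Aftermath : remembering the Great War in Wales (2000 edition)",
    work_qid="Q20599115",
)
print("\n".join(batch))
```

Creating a fresh edition item whenever the match is uncertain keeps the later clean-up to merges rather than splits.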

Wikidata weekly summary #331

Structured Data - upcoming changes to viewing old file page revisions

How old revisions of file pages work is likely going to have to change for structured data. There is information about the change on the SDC hub talk page; please read it over and leave feedback if you have any. Keegan (WMF) (talk) 15:30, 28 September 2018 (UTC)

Wikidata weekly summary #332

Structured Data - IRC office hours today, 4 October

There will be an IRC office hour for Structured Data on Commons today, 4 October 2018, from 17:00-18:00 UTC in #wikimedia-office. You can find date/time conversion, as well as a link to join the chat in your browser if needed, on the IRC Office hours page on Meta. I look forward to seeing you there. -- Keegan (WMF) (talk) 05:49, 4 October 2018 (UTC)

Structured Data - search prototype

There is a search prototype for structured data on Commons available. Please visit the search prototype page on the structured data hub for information on testing and feedback. -- Keegan (WMF) (talk) 19:07, 5 October 2018 (UTC)

Wikidata weekly summary #333

Wikidata weekly summary #334

Wikidata weekly summary #335

Wikidata weekly summary #336

Structured Data - IRC office hour today, 1 November

There will be an IRC office hour for Structured Data on Commons today, 1 October 2018, from 17:00-18:00 UTC in #wikimedia-office. You can find date/time conversion, as well as a link to join the chat in your browser if needed, on the IRC Office hours page on Meta. I realize this may be short notice for some people; I am experimenting with advanced notice times to see what works best for the most people, I'll be giving more warning before the next office hour. I look forward to seeing you there. -- Keegan (WMF) (talk) 16:02, 1 November 2018 (UTC)

Structured Data - IRC office hour today, 1 November

The above message says 1 October in the body when it should say 1 November, as the subject line says. Apologies for making a new section by mass message, it's the only way to get this out quickly. See you in twenty minutes! -- Keegan (WMF) (talk) 16:37, 1 November 2018 (UTC)

Structured Data - copyright and licensing statements

I've posted a second round of designs for modeling copyright and licensing in structured data. These redesigns are based off the feedback received in the first round of designs, and the development team is looking for more discussion. These designs are extremely important for the Commons community to review, as they deal with how copyright and licensing is translated from templates into structured form. I look forward to seeing you over there. -- Keegan (WMF) (talk) 16:25, 2 November 2018 (UTC)

Wikidata weekly summary #337

Wikidata weekly summary #338

Geoshapes draft instructions

Hey

I've made a start on the Geoshapes instructions here. I think I roughly have the steps correct, but it needs some examples and some filling out. I added a nice example of a map, but it needs more; the Welsh map you have is a nice example (I think it needs to be a template to get the formatting right?). A map that shows some kind of geology would also be nice, and a species distribution map would be amazing. I'll have a look for some shapes.

--John Cummings (talk) 15:54, 17 November 2018 (UTC)

@Sic19:, great work so far, just a few things left to do. I'll keep looking for sources of data; there's a huge amount produced by the US government. My only suggested change is to put the example code either on a separate page or in a collapsible drop-down, so that we can have more than one example and people can see where the code goes within the template code. John Cummings (talk) 10:24, 20 November 2018 (UTC)

Wikidata weekly summary #339