Wikidata talk:WikiProject Books

On this page, old discussions are archived. See: 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023.

Project Gutenberg ebook ID (P2034) constraints

Why can't this apply to works? I get it, that we want everything and its grandmother on the versions, but I think an exception can be made for this. For example, there are cases where we can't know what specific version is being hosted on Gutenberg, such as Aladdin O'Brien, which doesn't list any of the title page information, or the publisher data. So to connect that to a specific version of Aladdin O'Brien is misleading...

Furthermore, it doesn't look like they even care about versions like we do; their focus seems to be one ebook per work, regardless of version. If it's one ebook per work, then maybe the ID can sometimes go on the work item and not the version item. Scripts that use our Wikidata items need to get both the work data and the version data. PseudoSkull (talk) 02:07, 27 July 2023 (UTC)

Project Gutenberg typically alters their text, often to fit modern typographical convention, even when their text can be traced back to a specific source. Gutenberg editions are, therefore, really editions in their own right. I treat them as such, for example: Middlemarch (Q111272552). --EncycloPetey (talk) 02:23, 27 July 2023 (UTC)
Also, your assumption that Gutenberg limits themselves to one ebook per work is incorrect. The Sophocles play Antigone exists here, and here, and here, and here, and here, and here. Gutenberg includes not only English works, but works in translation, both into English and into other languages. --EncycloPetey (talk) 02:33, 27 July 2023 (UTC)
Ah, well then I concede that point (I didn't take translations into account). PseudoSkull (talk) 03:15, 27 July 2023 (UTC)
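
For anyone who wants to see where Project Gutenberg ebook ID (P2034) currently sits, a rough query sketch might look like the one below. It assumes the standard Wikidata Query Service prefixes and treats "not an edition" simply as "has no instance of (P31) value of version, edition or translation (Q3331189)", which is a simplification. The LIMIT is arbitrary.

```sparql
# Sketch: items carrying a Project Gutenberg ebook ID that are not typed as editions
SELECT ?item ?itemLabel ?gutenbergId WHERE {
  ?item wdt:P2034 ?gutenbergId.
  # Simplifying assumption: an edition is anything with P31 = Q3331189
  FILTER NOT EXISTS { ?item wdt:P31 wd:Q3331189. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```

Items typed only with a subclass of Q3331189 would also show up in the results, so the list needs checking by hand.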

How to mark a separate part of a book?

Some books are divided into parts. Here is an example: Infection Prevention and Control in Healthcare, Part II: Clinical Management of Infections, An Issue of Infectious Disease. It's a book that contains different articles. The book has at least 3 parts. The preface calls those parts issues, but how do I mark a Wikidata entity as a part? D6194c-1cc (talk) 20:07, 27 July 2023 (UTC)

Wikidata:Property proposal/Identifiant Médiathèque Numérique CVS

Hi, as I can't ping the project, here is a message about a proposal to add an ID used by a video-on-demand service that a few libraries in France use. Misc (talk) 18:34, 6 August 2023 (UTC)

Property Proposal: state of transmission

Dear Project Participants,

may I direct your attention to another recent property proposal? Best, Jonathan Groß (talk) 19:13, 6 August 2023 (UTC)

publication date and multivolume books

How should we deal with publication date (P577) for multi-volume books published over a long period of time? For volumes within a decade or century, it makes sense to use the decade/century value with qualifiers P580 (first volume) and P582 (last volume). But what if the publication crosses the century boundary (as in the case of Ottův slovník naučný (Q2041543) or Q121625294)? In Ottův slovník naučný (Q2041543) no value is used, and in Q121625294 unknown value is used, neither of which seems quite right to me. Jklamo (talk) 20:49, 19 August 2023 (UTC)

@Jklamo: I would simply use the millennium or above value with P580 and P582; I believe that the value in those qualifiers should take precedence over anything that is in the main value. --Jahl de Vautban (talk) 11:21, 20 August 2023 (UTC)
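
To illustrate how a consumer could read the qualifiers rather than the main value, here is a minimal query sketch. It assumes the standard Wikidata Query Service prefixes (p: for the statement node, pq: for qualifiers) and simply lists publication date (P577) statements that carry both P580 and P582. The variable names and the LIMIT are only illustrative.

```sparql
# Sketch: publication date statements whose range is given by qualifiers
SELECT ?item ?itemLabel ?firstVolume ?lastVolume WHERE {
  ?item p:P577 ?statement.            # any P577 statement, whatever its main value
  ?statement pq:P580 ?firstVolume;    # start time qualifier
             pq:P582 ?lastVolume.     # end time qualifier
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```

Because the pattern never touches the main value, it should match the statement regardless of how coarse that value is.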

Which x_work to use for a book, and why so few books from a publisher

I understand the ontological reason for having x_work separate from book, but if I wanted to add the books from a particular publisher, would they go under the written work (Q47461344) or the creative work (Q17537576) hierarchy? And for the vast majority of books that are only ever produced in one edition from one publisher, does it really make sense to have two items for them, one for the work and one for the physical object? I came here because my military interests mean I'm interested in Osprey Publishing (Q2697821), who have produced 2000-odd short books over 60 years in long, narrowly focused series that seem ideal for recording on WD, but I was surprised that a query only showed 28 titles, with all sorts of instances: a really muddled dataset.

# Items published by Osprey Publishing (Q2697821), with their instance-of classes
SELECT DISTINCT ?item ?itemLabel ?instanceLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item ?instance WHERE {
      ?item wdt:P123 wd:Q2697821.
      ?item wdt:P31 ?instance.
    }
  }
}

And the Fortress (Q113697822) series of 110 books only has 3 members here:

# Items that are part of the Fortress series (Q113697822), with their instance-of classes
SELECT DISTINCT ?item ?itemLabel ?instanceLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item ?instance WHERE {
      ?item wdt:P179 wd:Q113697822.
      ?item wdt:P31 ?instance.
    }
  }
}

Vicarage (talk) 06:10, 29 August 2023 (UTC)

We don't use "book" because that word has too many different meanings. A "book" can be a particular work, an individual copy, a part of a longer work, or a volume in a series. So we avoid the term "book". We use "work" (for the general) and "edition" (for a particular publication of that work). This has to be done because library databases do this. There will be a data item in a library database for the work, and a separate data item for each edition. In order to interface with library databases, we have to do the same. An "instance" is a particular copy, like a copy of the Gutenberg Bible held in a specific collection, or a manuscript in a particular library.
The data item for a literary work is the general item that covers properties not specific to any edition, such as the language of composition, date of first publication, author, and library IDs for the work. A publisher issues an edition of a work, and the publisher goes on the version, edition or translation (Q3331189) for the edition from that publisher. The data item for an edition carries all the data specific to that edition: date of publication, publisher, number of pages, place of publication, scans of that edition, etc.
Yes, some of the datasets are muddled, because people import data from Wikipedias without correcting it to match the Wikidata structure, or edit without understanding the concepts of "work" and "edition", or because the data was imported from WorldCat, whose database is very muddled. --EncycloPetey (talk) 19:48, 29 August 2023 (UTC)
I think it's worth maintaining the distinction between a work and an edition @Vicarage.
I understand the concern about adding complexity; however, I think we can effectively SPARQL that problem away into a single query.
I don't know the series you're interested in, but from my own patch one surprise I've had is the range of formats, even for a single edition.
I've been listening to lots of works in an audiobook format, which I think we should keep distinct from other editions (print, ebook), but it still makes sense to have a single abstract record (the work) that we can use to unite all these and say we're talking about a fundamentally similar set of things with shared commonalities.
I wonder if your Osprey books might have the same? Perhaps there was only one edition, but might you want to distinguish between audiobook, ebook and print editions in future? Or perhaps even translations, if they were of international interest?
Plus, you never know: a second edition of interest could emerge in the future.
The edition / work split seems like a really robust data model, even if it is a bit more work to create a query, IMO. Huw Diprose (talk) 13:25, 1 September 2023 (UTC)
With only 1% of the Osprey books recorded, WD is no use for my military history work. By contrast, Goodreads has recorded all of the Fortress series. I don't think I will be able to embark on adding them here, and will focus on the fortifications themselves, which are well recorded here. Vicarage (talk) 13:42, 1 September 2023 (UTC)
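
As a sketch of how the work/edition split can be joined back together in one query, the following lists editions from a given publisher alongside their works. It assumes the standard Wikidata Query Service prefixes and that editions are typed as version, edition or translation (Q3331189) and linked to their work with edition or translation of (P629). Items modelled differently will not appear.

```sparql
# Sketch: editions published by Osprey Publishing (Q2697821) with their works
SELECT ?edition ?editionLabel ?work ?workLabel WHERE {
  ?edition wdt:P31 wd:Q3331189;      # instance of: version, edition or translation
           wdt:P123 wd:Q2697821.     # publisher: Osprey Publishing
  OPTIONAL { ?edition wdt:P629 ?work. }   # edition or translation of, if recorded
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

The OPTIONAL keeps editions that have no work item yet, which makes the gaps visible.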

Publication Date Vs Inception

Hey folks.

@Pfadintegral and I have been having a conversation over here about how to use Inception and publication date on works and editions.

I note that you only have Inception under the work here and publication date on the edition, so I'd been assuming the work is an abstraction that isn't published per se, and so doesn't have a publication date.

Pfadintegral has pointed out that publication date's description mentions it's only ever the first publication (so perhaps as with title or subtitle we could have one on both the work and updated values on the edition).

I note there are warnings when putting publisher details on works, but not publication dates.

So I sense there must be some prior conversations here.

What do the bookfolk of Wikidata recommend? Huw Diprose (talk) 08:15, 1 September 2023 (UTC)

Publication date is for the date of publication. Inception is used for when the author started writing the work, and is not related to publication at all. There are some works whose inception was during the lifetime of the author, but were only published after the author died. --EncycloPetey (talk) 15:30, 1 September 2023 (UTC)
That was also my understanding. If there is a consensus on this, I would propose amending the project page to list both inception and publication date under "work item properties" and use a work where they are different as an example for both. Pfadintegral (talk) 15:40, 2 September 2023 (UTC)
Agreed, that would be really helpful! Huw Diprose (talk) 16:50, 2 September 2023 (UTC)
Great, thanks for the clarification @EncycloPetey! Huw Diprose (talk) 16:51, 2 September 2023 (UTC)
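
To find candidate works for such an example, a rough query along these lines might help. It assumes the standard Wikidata Query Service prefixes, uses inception (P571) and publication date (P577), and for simplicity only looks at items directly typed as written work (Q47461344), so it will miss works typed with a more specific class. The year comparison and the LIMIT are illustrative choices.

```sparql
# Sketch: written works whose inception year precedes their publication year
SELECT ?work ?workLabel ?inception ?published WHERE {
  ?work wdt:P31 wd:Q47461344;    # simplifying assumption: direct instances only
        wdt:P571 ?inception;     # inception
        wdt:P577 ?published.     # publication date
  FILTER(YEAR(?inception) < YEAR(?published))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 50
```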

Open access vs free to read

Notifying about the opened discussion since it is related to citation templates: Wikidata:Project chat#Open_access_vs_free_to_read. D6194c-1cc (talk) 19:50, 4 September 2023 (UTC)

Modeling specific chapters of books

I wonder if there is any existing guidance on the best way to model chapters of a book. To explain my use case: I read a few books on movies, and some focus on analysing specific movies. For example, in Queer Muslim diasporas in contemporary literature and film, most chapters examine a single movie (sometimes two); chapter 5, for instance, analyses My Brother the Devil (Q769753). I would like to connect the movie item with the book item (once created), if possible at a more granular level than the whole book.

One way is to add described by source (P1343) to the movie item with some qualifiers for the chapter. For example, this is done on Star Trek: First Contact (Q221236) (I found only 10 items modelled this way). Another way would be to add several values of main subject (P921) to the book item, pointing to the movie items, again with proper qualifiers. This is done on Star Trek: The Art of John Eaves (Q107023900), minus the qualifiers. A third and fourth way is to create one item for each chapter, connect them to either the edition (third way) or the work (fourth way), and add main subject (P921) to each, and/or described by source (P1343) pointing to the chapter. For example, this is the approach used for “Machines Making Machines? How Perverse.” Racism, (White) Sexual Anxiety, the Droids of Star Wars and the Prequel Trilogy (Q120761367) (one of the few items using that scheme, but I was lazy and limited my search to movies).

I searched the archives and found no consensus on the best way to achieve this, and in fact few discussions. And while each method would work (more or less) for my purpose, I would prefer not to have a bespoke system for my specific needs. I ran a few queries, and while I found some examples of each scheme, none was overwhelmingly used.

I personally lean towards option 3 (a separate item for each chapter, linked to editions). This is more natural, easier to query, and allows adding specific values to each chapter. However, it also means that the list of chapters would be duplicated for each edition. I am not sure how a mix of 3 and 4 would work (e.g., link to the work-level item by default, unless there is a reason to have a different number of chapters at the edition level). Or maybe have it on both, with a bot filling in the editions based on the work-level information.

So do people have opinions on that? Was it already discussed to death and I missed it? Misc (talk) 21:39, 8 September 2023 (UTC)
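
To make the comparison between the options a bit more concrete, here is a rough query sketch that looks up everything linked to one movie under either direction of modelling. It assumes the standard Wikidata Query Service prefixes. The ?via variable is just a label added for readability, and qualifiers are ignored entirely.

```sparql
# Sketch: sources connected to My Brother the Devil (Q769753) under the schemes above
SELECT DISTINCT ?source ?sourceLabel ?via WHERE {
  {
    wd:Q769753 wdt:P1343 ?source.    # option 1: described by source on the movie item
    BIND("described by source (P1343)" AS ?via)
  } UNION {
    ?source wdt:P921 wd:Q769753.     # options 2 to 4: main subject on a book or chapter item
    BIND("main subject (P921)" AS ?via)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

With chapter items (options 3 and 4) the second branch returns the chapter directly, whereas with option 2 it only returns the book, so the chapter would have to come from a qualifier.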

(New) related property proposal: Plate

Hello, Wikidata:Property proposal/Plate may be relevant to this WikiProject, thank you, Maculosae tegmine lyncis (talk) 07:34, 9 September 2023 (UTC)

Property_proposal: BISAC_Subject_Heading

Property proposal: EDItEUR Thema id

Wikidata:Property proposal/EDItEUR Thema id Vladimir Alexiev (talk) 15:32, 12 September 2023 (UTC)

Property_proposal: ONIX_Subject_Scheme_id

Wikidata:Property_proposal/ONIX_Subject_Scheme_id Vladimir Alexiev (talk) 16:04, 12 September 2023 (UTC)

Property proposal: Diktyon

Comments are welcome at Wikidata:Property proposal/Diktyon. Jonathan Groß (talk) 13:29, 14 September 2023 (UTC)

Distinction between manuscripts and works

While trying to improve our data on ancient written works, I sometimes come across items that conflate manuscript (Q87167) and written work (Q47461344). A prominent example is Cologne Mani-Codex (Q657420). Obviously it would be a good idea to create distinct items for (a) the work (as an abstract concept) and (b) the textual witness it is transmitted in. While this distinction may seem artificial to some, it is in my opinion the best way forward. Examples of how to do this are Berlin Chronicle (Q21100459), which is transmitted on Egyptian Museum and Papyrus Collection, P 13296 (Q21100575), and Alexandrian World Chronicle (Q21100150), which is transmitted on Goleniscev Papyrus (Q21100168).

The reason I'm writing this here is that I am unsure what to do with existing sitelinks. Wikipedia articles notoriously conflate textual witnesses and the works they transmit. I can think of two solutions:

  1. Keep sitelinks on the old item and make it a dedicated item for the work (as this is what most readers will be interested in)
  2. Keep sitelinks on the old item and make it a dedicated item for the textual witness (which is often used as illustration).

Thoughts? Jonathan Groß (talk) 14:14, 15 September 2023 (UTC)

Given that it is a complex problem, I would have a slight preference for 2: when a sitelink conflates many concepts which have distinct items in Wikidata, I usually try to move the sitelink to the item which corresponds more precisely to the title and the incipit of the sitelink. Since articles regarding manuscripts and the work(s) inside them usually are titled with the name of the manuscript and have an incipit like "X is a manuscript etc.", I think 2 is slightly better and probably also the most intuitive one for users to apply. Anyway, disentangling these will require a lot of effort :( --Epìdosis 08:03, 16 September 2023 (UTC)

Perhaps we could make a subpage for these cases and document what we're about to do there? Checking the items from Category:Papyrus (Q7356868) would be a good starting point. Jonathan Groß (talk) 08:08, 16 September 2023 (UTC)

Also, a Database report could help by listing items that have both instance of (P31) written work (Q47461344) and instance of (P31) manuscript (Q87167). Jonathan Groß (talk) 11:55, 16 September 2023 (UTC)

Here is a SPARQL query doing just that:

# Items that are an instance of both written work (Q47461344) and manuscript (Q87167),
# including subclasses of either
SELECT DISTINCT ?item ?itemLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  ?item p:P31 ?statement0.
  ?statement0 (ps:P31/(wdt:P279*)) wd:Q47461344.
  ?item p:P31 ?statement1.
  ?statement1 (ps:P31/(wdt:P279*)) wd:Q87167.
}

Jonathan Groß (talk) 12:08, 16 September 2023 (UTC)

Many relatively empty items, such as Codex Speculum (Q5140244), suffer from the same problem, cf. e.g. w:de:Diskussion:Codex Speculum. HHill (talk) 15:41, 16 September 2023 (UTC)

New properties proposed: Pinakes IDs

Allow me to direct your attention to this proposal by Epìdosis. Jonathan Groß (talk) 06:21, 29 September 2023 (UTC)

Add link please

Can someone familiar with page translation please add a link to Wikidata:WikiProject Manuscripts/Data Model under the "Manuscript Properties" section on Wikidata:WikiProject_Books? Thanks! PKM (talk) 01:47, 19 October 2023 (UTC)

Do reviews apply to works or editions?

Hi there, @INS Pirat: and I have been disagreeing here on whether reviews like E. J. KENNEY, W. V. CLAUSEN (edd.), The Cambridge History of Classical Literature, II: Latin Literature. Cambridge, University Press, 1982. xviii, 974 pp. Pr. £ 40,- (Q123251163) or The Cambridge History of Classical Literature 2: Latin Literature (Q123251162) apply to the work or the edition (currently only Q123250189 exists, whose P31 needs to be determined). I'm arguing for the latter, as the reviewers are reviewing what they have in their hands, and that is an edition, not a work. The fact that some reviews go so far as to include details such as date and place of publication, publisher, number of pages, and price makes me all the more inclined to consider that only editions are reviewed. Happy to hear other voices on that. --Jahl de Vautban (talk) 19:57, 31 October 2023 (UTC)

The beginning. I would elaborate on my last point in there. We demonstrate the notability of a literary work as an article topic in Wikipedia by providing the reviews. The fact that some reviews may contain the publication data of a specific edition or of multiple editions usually doesn't affect the reviews themselves (and their mutual relevance) at all. INS Pirat ( t | c ) 20:59, 31 October 2023 (UTC)
To me it seems that while there are certainly reviews that have a specific edition of a work as their subject (for example, a revised edition of an academic reference work or a new translation of a classic), most reviews take as their subject the work itself, and some might not even reference a specific edition. If a review discusses a work in general and references a specific edition, I see no reason not to link both items as its subject. Note that if the more specific review of (P6977) rather than the general main subject (P921) is to be used for linking reviews with editions, its current description and constraints would need to be changed. Pfadintegral (talk) 07:40, 1 November 2023 (UTC)
@Pfadintegral: would you have an example of a review that doesn't mention a specific edition? I admit that I have only ever seen reviews of editions and I wasn't aware some could exist for works. In the end, however, I find it odd that review of (P6977) could link to either works or editions. I would be fine, however, with a triangle where review → review of (P6977) → edition, edition → edition or translation of (P629) → work, and work → described by source (P1343) → review. --Jahl de Vautban (talk) 07:23, 2 November 2023 (UTC)
Take this review for example: - which has been republished in a print book and has a UID at the Internet Speculative Fiction database, so it should be notable by the standards of Wikidata. Pfadintegral (talk) 09:30, 2 November 2023 (UTC)
Okay, I admit that for this kind of review it is hard to pinpoint a given edition, though I still get the feeling that this is more of a shortcoming and that it shouldn't be the norm. I had hoped for more points of view, but two to one I stand wrong. @INS Pirat: you can revert me back to where it was. --Jahl de Vautban (talk) 09:38, 5 November 2023 (UTC)
I'm sorry I missed this discussion. I would say Walton's piece is more of an essay than it is a book review, and perhaps assessments of works rather than editions should be essays or scholarly articles (depending on context) and assessments of editions, especially when issued shortly after publication, can best be considered book reviews. - PKM (talk) 00:20, 19 November 2023 (UTC)
What is the claim that reviewers generally care which specific edition they review based on? INS Pirat ( t | c ) 13:07, 19 November 2023 (UTC)
I often read reviews whose author made comments on the editing (typos, or whether the notes are endnotes or footnotes) or cited some pages because they wanted to engage with the content of that page - none of that should be relevant to the work. --Jahl de Vautban (talk) 18:12, 20 November 2023 (UTC)
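
Whichever way the modelling goes, a consumer can resolve a review to the work in a single query. The sketch below assumes the standard Wikidata Query Service prefixes and that review of (P6977) may point at either a work or an edition. When the target is an edition, the work is reached through edition or translation of (P629). Variable names and the LIMIT are illustrative.

```sparql
# Sketch: reviews, the item they review, and the work behind it when the target is an edition
SELECT ?review ?reviewLabel ?target ?targetLabel ?work ?workLabel WHERE {
  ?review wdt:P6977 ?target.               # review of
  OPTIONAL { ?target wdt:P629 ?work. }     # bound only when ?target is an edition with a work
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```

Rows where ?work is unbound are reviews linked to a work directly, or to an edition that has no work item yet.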

Formats & Forms

I’ve been cleaning up some confusion between “book format” in the sense of folio, quarto, octavo and “book format” in the sense of hardcover, ebook, audio book. The sitelinks and all but one of the external IDs on book format (Q18602566) were specifically about the folio, quarto sense, so I’ve started my cleanup there. The subclasses and instances were all over the place (I think because the description was vague). Now we have book format (Q18602566), book form (Q104624828), book distribution format (Q123330346) and print book format (Q82046811) - but book form (Q104624828) still allows for confusion and needs more thought. Does anyone have ideas about how to proceed? - PKM (talk) 03:36, 5 November 2023 (UTC)

What should Diana Gabaldon's Outlander series look like on Wikidata?

At first I thought that a bunch of the books in Outlander series (Q18153036) were duplicates, e.g., Written in My Own Heart's Blood (Q17184122) and Written in my own heart's blood (Q54870952). But looking at the sitelinks led me to it:Diana_Gabaldon#Serie_di_Outlander which (via Google Translate) seems to indicate that most of the books were published in two parts in Italy. So what's the correct way to model that in Wikidata? Should each of those books have one item for the overall work, and one item for each part? Should the items for each part be works, or editions? dseomn (talk) 01:24, 17 November 2023 (UTC)

@Dseomn: I'm not sure how far the division of the books, as done by the Italian publisher, was intended by the author. I therefore doubt that they should be considered works in their own right, but perhaps they could stand as part of a work (Q88392887)? --Jahl de Vautban (talk) 18:19, 20 November 2023 (UTC)
That makes sense. I'm still not sure how to handle the different items though. Most of the labels, descriptions, and claims on both items in each pair seem to be about the work as a whole, and it's just the itwiki sitelinks that are different. So maybe it would make the most sense to create new items for each of the parts of a work, move the itwiki links to those, and then merge the existing items into each other? Or should I try to figure out which of each pair is the overall work, and only create 1 new item for each pair, to move the itwiki link to? dseomn (talk) 01:37, 23 November 2023 (UTC)

OCLC Classify is being discontinued

I've started a conversation at meta:Talk:The Wikipedia Library § OCLC Classify is being discontinued that is relevant to Wikidata:WikiProject Books. Daask (talk) 18:23, 20 November 2023 (UTC)

Property proposal: Walmart product ID

Hello! This property proposal hasn't had any participants in seven weeks. Could any member of this WikiProject give their opinion on it? Thank you, Horcrux (talk) 19:11, 20 November 2023 (UTC)

What are these items supposed to be?

I can't figure out what literary forms of (Q26213430) and Literary forms and genres (Q30032140) are supposed to be; they link to nothing on Wikidata and seem superfluous. StarTrekker (talk) 14:53, 4 December 2023 (UTC)

If you follow the link ID on literary forms of (Q26213430) it points to an article. I think it's meant to be a data item for that web-based article. --EncycloPetey (talk) 15:38, 4 December 2023 (UTC)
I've looked at both links and I'm having a hard time figuring out what the descriptions and statements should be for them. StarTrekker (talk) 14:54, 10 December 2023 (UTC)