Wikidata talk:WikiProject Books
On this page, old discussions are archived. See: 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025.
New proposal for "OpenITI Author URI" property
Dear WikiProject Books members, could you please take some time to review and (if you like it) support my proposal:
https://www.wikidata.org/wiki/Wikidata:Property_proposal/OpenITI_Author_URI
This is a first step towards adding metadata for thousands of (mostly premodern) Arabic and Persian books to Wikidata.
Thanks, Pverkind (talk) 14:57, 23 September 2025 (UTC)
what is a book
I was putting together some stuff for the Wikidata Ontology course and noticed an apparent problem with books. book (Q571) is a subclass of document (Q49848) which is a subclass of intellectual work (Q15621286). But a (physical) document is not an intellectual work so this subclass link should be removed. I think that a decision has to be made as to whether instances of book (Q571) can be physical books or are intellectual works and the appropriate subclass link added to Wikidata. Peter F. Patel-Schneider (talk) 20:32, 20 May 2025 (UTC)
- The use of book (Q571) is discouraged anyway, see the Project Page. Dajasj (talk) 11:58, 21 May 2025 (UTC)
- in fact, book (Q571) is - at the same time - the intellectual work, its publication, and the physical object you can hold in your hand... so, using it to describe an item is completely discouraged, as it only adds confusion... - except perhaps for "paintings" representing someone holding a "book"...
- It was widely used, at the beginning of Wikidata, and the cleaning of wrongly described items is still ongoing... please, PLEASE, never use it for a specific written work (Q47461344), literary work (Q7725634), version, edition or translation (Q3331189) or an individual copy of a book (Q53731850)... ! Hsarrazin (talk) 10:12, 27 May 2025 (UTC)
- Thank you for providing more explanation. I would like to make changes to the Wikidata ontology that impact book (Q571) which I imagine will not be a problem for this group; however, I'd like to double check anyways. As you say, a book is also an intellectual work and I would like to formally establish that in the Wikidata ontology even if people continue to use it while being discouraged to do so.
- To my knowledge, none of these changes conflict or negate the work or guidelines of this group. My hope in writing is to ensure that this is the case.
- Does the group take issue with any of the suggested changes? Copied below for convenience.
- Fixes to books include:
- Making book (Q571) a subclass of intellectual work (Q15621286)
- Removing document (Q49848) as a subclass of intellectual work (Q15621286), leaving it a subclass of information resource (Q37866906) and manifestation (Q286583). It has no characteristic such as having a version, edition or translation (Q3331189), or anything else beyond being a manifestation of an artificial object (Q16686448)
- Relating audiobook (Q106833) and ebook (Q128093) with the property union of (P2737)
- Making individual copy of a work only a subclass of artificial object (Q16686448), removing artwork copy (Q1784021). Low stakes: extremely low usage; 3 instances of, and 8 subclass of. Removing artwork copy resolves disjoint errors that affect paperback (Q193934), printed book (Q11396303), and individual copy of a book (Q53731850). This makes sense because, to me, an individual copy of a work sounds like an object or manifestation of a work and not an abstract entity; currently the class tree shows both, which produces a disjointness violation.
- If you'll excuse my waxing poetic on the project description, I will link to the full project page here: Wikidata:WikiProject_Ontology/Ontology_Course/Books Kind data (talk) 20:34, 5 June 2025 (UTC)
- Gentle nudge here to see if there are any issues with these suggestions. If not, we will move forward. Thank you again for considering. Kind data (talk) 17:15, 12 June 2025 (UTC)
- I'm not sure that you understood the main point of the discussion above. This project discourages the use of book (Q571) in any field associated with the project. There are too many possible meanings attached to the word, even in literature and publishing. We also would not recommend connecting it into the Ontological hierarchy because its meaning is too broad and too vague. A paperback "book" is a physical object, but an audio"book" is not. A "book" as a physical object could be full of words to be read, or it could contain only pictures or music with no words at all. A "book" could be one volume of several in a multi-volume work, or it could be one section of a written work covering only a few pages. The word is so broad and so vague that it would be inappropriate to use "book" anywhere in the Ontology. --EncycloPetey (talk) 18:39, 18 June 2025 (UTC)
- Thank you for your reply. I understood the main point of the discussion above and that the group discourages the use of book (Q571) in any field associated with the project. I understand that book (Q571) has broad meaning and can take many forms; an audiobook can in fact be a physical object. Whether or not its use is discouraged, it is being used, and as a result is already part of the Wikidata ontology. I do not see the harm in making book a subclass of an intellectual work along with communications media (Q340169), document (Q49848), product (Q2424752), and publication (Q732577). I agree with User:Hsarrazin that book (Q571) is many things at once, including the intellectual work. It is no surprise to me that there are over 35,000 uses of book (Q571) currently in Wikidata and that users frequently gravitate towards it. I do not believe my recommendations conflict with the guidelines of the group nor am I asking for the group to change their recommendations in order to encourage its use. I follow the guidelines of this group when adding bibliographic data to Wikidata. My aim is to address the items that are associated with book (Q571) regardless. Kind data (talk) 23:46, 18 June 2025 (UTC)
- The Wikidata Ontology includes many classes, many of them associated with Wikipedia pages that have multiple or broad or vague meanings. Just because "book" and https://en.wikipedia.org/wiki/Book have multiple or broad or vague meanings is not a reason to not place book (Q571) in a useful place in the Wikidata Ontology. In fact, I would say that this is a good reason to place book (Q571) in a useful place - because "book" is a prominent word in English, book (Q571) should be placed in a useful place in the Wikidata ontology, either reflecting the general meaning of "book" or one of its more specialized meanings. There may be good reasons to not use the resulting book (Q571) directly, but then there should be instructions on what to do for items - physical, digital, analog, and intellectual - that might naturally use book (Q571). Peter F. Patel-Schneider (talk) 20:38, 4 July 2025 (UTC)
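The disjointness problem raised in this thread can be illustrated with a small Python sketch. It uses a toy class tree, not the live Wikidata hierarchy: the link from "artwork copy" to "abstract entity" and the disjoint pair are assumptions made only to mirror the description above, and all function names are hypothetical.

```python
from collections import deque

# Toy subclass-of edges (child -> parents). These mirror the situation
# described in the thread, NOT the live Wikidata class tree.
SUBCLASS_OF = {
    "individual copy of a book": ["individual copy of a work"],
    "individual copy of a work": ["artificial object", "artwork copy"],
    "artwork copy": ["abstract entity"],  # assumption for illustration
}

# Pairs of classes assumed to be declared disjoint (illustrative only).
DISJOINT_PAIRS = [("artificial object", "abstract entity")]


def ancestors(cls, edges):
    """Return every class reachable from cls via subclass-of links."""
    seen = set()
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def disjoint_violations(cls, edges, disjoint_pairs):
    """List the disjoint pairs that cls falls under on both sides."""
    reachable = ancestors(cls, edges) | {cls}
    return [pair for pair in disjoint_pairs
            if pair[0] in reachable and pair[1] in reachable]


def without_edge(edges, child, parent):
    """Copy of edges with one subclass-of link removed."""
    return {c: [p for p in ps if not (c == child and p == parent)]
            for c, ps in edges.items()}
```

In this toy tree, "individual copy of a book" is flagged until the link from "individual copy of a work" to "artwork copy" is removed with `without_edge`, which is the kind of change proposed above.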
Data arrangement between "work" versus "edition"
To what degree is duplication desired between the 'work' and 'edition', and how should the work handle discrepancies between editions? For something like "author" or "main subject" which is the same across all editions, should that go only on the work, or also on all the editions?
In the opposite situation, how should differing data be handled? "Publication Date", "Publisher", "Translator" will likely be different across editions. This should obviously be recorded on the edition, but how should it be handled on the work? Omit? Use the oldest? Use any? Driftingdrifting (talk) 23:11, 19 June 2025 (UTC)
- There is typically little overlap. Author is one thing placed on edition that also appears on the work, usually with the same value, but is put on both because authorship of some works does change with editions, such as textbooks. Likewise title, which can change on editions or translations, goes on both, with the work item carrying the database standard forms of the title, and edition-specific titles on the editions. Publication date gets placed on the work data item, where it refers to date of first publication. But any data specific to a particular edition should be placed on a data item for that edition. Publisher and translator are always edition specific, and so should not appear on the work. --EncycloPetey (talk) 00:59, 20 June 2025 (UTC)
- Thanks! I assume that the constraints on the edition on the work are the other place for 'overlap' and should generally be kept in sync, yes? I see constraints like "translator", "publication date", "publisher" used frequently and I'm assuming there is no magic pulling those dynamically. Driftingdrifting (talk) 01:07, 20 June 2025 (UTC)
- The data model somewhat conflates things....
- An author writes a "work", and it gets published in a "version, edition, or translation" (edition).
- That work gets translated into a new "version, edition or translation" (translation) that then gets published in a "version, edition, or translation" (edition). These are all separate items.
- The way to keep this all working is by using "edition of" and "published in" to explicitly differentiate between works (the result of authorship/translation) and editions (groups of artificial objects that are the result of typesetting).
- Also, a new impression (a new version resulting from the same setting of type, either a later reprinting or a digital reproduction) isn't a new "edition". That direction lies madness, lol. Use qualifiers. Jarnsax (talk) 03:01, 3 September 2025 (UTC)
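The placement rules given in the first reply of this thread can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and field names are hypothetical, the sample data is invented, and taking the earliest recorded edition as the date of first publication is a simplification that holds only when the first edition is actually recorded.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Properties the thread treats as edition-specific; they should not sit on a work.
EDITION_ONLY = {"publisher", "translator"}


@dataclass
class Edition:
    title: str                 # edition-specific title, as printed
    author: str                # usually the same as the work's, but may differ
    publisher: str
    publication_year: int
    translator: Optional[str] = None


@dataclass
class Work:
    title: str                 # standard form of the title
    author: str
    editions: List[Edition] = field(default_factory=list)

    def first_publication_year(self) -> Optional[int]:
        """Earliest year among the recorded editions (None if none are recorded)."""
        years = [e.publication_year for e in self.editions]
        return min(years) if years else None


def misplaced_on_work(statement_names):
    """Return the edition-only properties found among a work's statement names."""
    return sorted(set(statement_names) & EDITION_ONLY)


# Hypothetical data, for illustration only.
work = Work(title="Example Work", author="A. Author")
work.editions.append(Edition("Example Work", "A. Author", "First Press", 1895))
work.editions.append(Edition("Beispielwerk", "A. Author", "Zweiter Verlag", 1920,
                             translator="T. Translator"))
```

Here `misplaced_on_work(["author", "title", "publisher"])` would report `publisher`, since author and title may appear on both items while publisher and translator belong on the edition.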
How to correctly split a trilogy
To give this as an example: Q722313
Currently we have a single Wikidata item that is used in Wikipedia for the entire trilogy, the first edition of the trilogy as well as an omnibus edition that links all 3 volumes into a single book.
How should this be split up? Should we:
1. Create a new overarching 'series' entry that's named Orbitor/Blinding.
2. Create 3 separate works and link them to the previously created series. (currently the previously mentioned Wikidata item describes the first volume of the trilogy, so that will be reused and another 2 items will be created)
3. Create editions for each volume as they were released.
4. Create an edition type for each omnibus book (books that contain all 3 volumes) and link each to all 3 previously created works.
Is this correct? Should this be approached in a different way?
Rapiteanu (talk) 08:41, 25 June 2025 (UTC)
- That is basically correct. In particular...
- You need a "series" item for the trilogy as a whole, with three "has parts of the class" = "volume" items, listed in order (use "series ordinal" qualifiers) under "has parts". These three volumes should also be instances of "novel" in their own right, as well as "part of the series" with "volume", "follows" and "followed by" qualifiers.
- You then need four "version, edition or translation" objects. The first three should be obvious: they are "edition of" the three individual works (as well as instances of "volume"). For the omnibus, it gets "instance of" = "version, edition, or translation", "has parts of the class" = "novel", and then three "has parts" (with series ordinal qualifiers too).
- The three "novel" items then get "published in" for both the separate publication and for the omnibus, with "publication date" both for the first publication date, and as a qualifier for each "published in".
- Or, at least, that is how I do it... and please use "has parts of the class". It makes it far easier for other people to follow along, since you often end up with items that are "instance of" two different things because the data model conflates them. Jarnsax (talk) 02:17, 3 September 2025 (UTC)
- Modeling it like this (the omnibus as a publication of the three individual works together, not as a publication of the "series") is more generic... authors have done things like publish a series of three books, then an omnibus, then two more, then a larger omnibus. Try to make things as explicit as possible. Jarnsax (talk) 02:32, 3 September 2025 (UTC)
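The structure described in this reply can be written out as plain data. The following Python sketch is one possible reading of it, not an official data model: statement names are the labels used in the reply, the item labels ("Volume 1", "(separate edition)", "(omnibus edition)") are placeholders, the function name is hypothetical, and publication dates are omitted.

```python
def build_trilogy(series, volumes):
    """Return {item label: statements} for a series, its novels and their editions."""
    omnibus = f"{series} (omnibus edition)"
    ordered_parts = [{"value": v, "series ordinal": i}
                     for i, v in enumerate(volumes, start=1)]

    items = {
        # The series item lists its volumes in order.
        series: {
            "has parts of the class": ["volume"],
            "has parts": ordered_parts,
        }
    }

    for i, volume in enumerate(volumes, start=1):
        edition = f"{volume} (separate edition)"
        membership = {"value": series, "volume": i}
        if i > 1:
            membership["follows"] = volumes[i - 2]
        if i < len(volumes):
            membership["followed by"] = volumes[i]

        # The work: a novel in its own right and part of the series.
        items[volume] = {
            "instance of": ["novel"],
            "part of the series": [membership],
            "published in": [edition, omnibus],
        }
        # The separate publication of that one work.
        items[edition] = {
            "instance of": ["version, edition or translation", "volume"],
            "edition of": [volume],
        }

    # The omnibus publishes the three works together, not the "series".
    items[omnibus] = {
        "instance of": ["version, edition or translation"],
        "has parts of the class": ["novel"],
        "has parts": ordered_parts,
    }
    return items
```

Calling `build_trilogy("Orbitor", ["Volume 1", "Volume 2", "Volume 3"])` yields eight items: one series, three novels, three separate editions and one omnibus edition.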
Property proposal: National Library of Uruguay book ID
I'd like to share with you a proposal of a new Wikidata property consisting of the National Library of Uruguay book ID: Wikidata:Property proposal/National Library of Uruguay book ID
This property would complement the existing National Library of Uruguay authority ID (P12595), which is used for authors. The idea behind this new property is to use it for book editions published in Uruguay. Best regards! Pepe piton (talk) 19:51, 2 July 2025 (UTC)
WikiCite 2025
Hi. I am Alessandro from the academic committee of the 2025 edition of WikiCite, the wikicon about sources and references that we are rebooting this year (last edition was WikiCite 2020). It will take place in Bern, August 29th to 31st, mostly online with an on-site part featuring a dozen invited speakers (but only from Europe).
We’re de facto "rebooting" the conference, and as such, we’re keen to listen and respond to the needs emerging from the community. The theme of "sources and reference" is quite big in the end... the academic committee has already selected a set of key themes, including author profiles, authority control, ontologies and controlled vocabularies, data round-tripping, and new perspectives using Wikibase.
However, there may still be room to introduce additional topics. To that end, several sessions on Saturday 30th and Sunday 31st have been designed so that themes can be chosen by volunteers. If you’re interested in proposing a full presentation, a lightning talk, a discussion topic (in French, Spanish or German), or a specific activity for the do-a-thon, please visit the proposal page. The template should be quite simple: you only need to list your suggestion, username, preferred language and slot. Alexmar983 (talk) 19:52, 8 July 2025 (UTC)
Adding metadata using openRefine, and perhaps a bot
We have a database of bibliographical metadata compiled during an academic research project in literature studies. This data was largely scraped from the US and German Project Gutenberg sites and is therefore publicly available/verifiable, but there is quite a bit of work involved in collecting it, as I'm sure many of you have experienced. I thought it would be great to make the fruits of this labour, that is, the obtained metadata, available on Wikidata (lots of it isn't). ...and perhaps other researchers in the digital humanities can be encouraged to do likewise with the metadata they compile.
There are a number of challenges. Firstly, there are the millions of pitfalls of representing metadata, especially in the semantic web style used by wikidata (we're neither librarians nor wikidata gurus). Secondly there is the issue of matching with existing data---this seems to be quite doable, with a bit of work, using openRefine. Thirdly, there is also the issue of verification. Not every statement is likely to be accurate, as is of course often the case with publicly available data --- but perhaps there are ways to verify this data against other library sources as well, semi-automatically? Finally, there are lots of special cases, such as individual poems, which (for now, in any case) I will exclude.
In any case, I would welcome your thoughts and input. I will also be asking many more concrete questions here, in the near future. Katdav-wd-lit (talk) 06:31, 11 October 2025 (UTC)
- Gutenberg texts would be their own edition data items such as at The Red Badge of Courage (Q55816148). This results because Gutenberg texts typically differ from previous editions, and because they have a publication date and publisher different from other editions. They are sometimes modeled on a specific earlier edition, but that information is not made public, and Gutenberg proofreaders will often update, normalize, or otherwise change spellings and punctuation.
- So the goal would not be to match Gutenberg texts with existing data items, but would be (1) to split off Gutenberg IDs into new data items for the Gutenberg editions, and (2) mark them as edition or translation of (P629) the literary work they are an edition of. --EncycloPetey (talk) 14:45, 11 October 2025 (UTC)
- Thanks, that's very helpful. Of course I would make these entries into separate version, edition or translation (Q3331189) items, and match them to their work with has edition or translation (P747) (and edition or translation of (P629)).
- The German edition of Gutenberg does say (well, sometimes) which edition it was based on, and includes some metadata about that edition such as publisher, translator, even illustrator. Is there a good way to link to that edition as well, for example, via some kind of based on qualifier? Should it be made an item, as well, if it doesn't exist?
- The German edition of Gutenberg has no Gutenberg IDs, or none that I think wikidata would care for. Perhaps instead include the URL. It does have its own ISBN, which is the same for all works it includes. Another thing I do not know how to represent! Katdav-wd-lit (talk) 00:09, 12 October 2025 (UTC)
- It might be possible to use based on (P144), but in general we don't classify editions as editions of other editions. In working on Wikidata, we've found that Gutenberg copies often differ substantially from the copy they are "based on". One of H. G. Wells' novels had several chapters with Americanized spelling, but other chapters with UK spelling, presumably because different proofreaders worked on the chapters separately.
- It is possible to use work available at URL (P953) if German Gutenberg does not use IDs. But if they do have IDs (but Wikidata has no property) then you can petition for a new property to accommodate the IDs. There are separate properties for ISBN-10 and ISBN-13, depending on the length of the ISBN. --EncycloPetey (talk) 02:38, 12 October 2025 (UTC)
- After some more thought, I realize that the ISBN of the German Gutenberg probably pertains to a version of the site distributed via CD or USB stick, so it probably shouldn't be used for the online edition at all.
- Adding a based on (P144) and connecting to the other edition can't hurt, can it? Once this information is lost, it's much harder to collect it again, while ignoring it is easy... Also, the metadata provided at the Gutenberg site about that older edition may not be available on wikidata yet.
- By the way, German Gutenberg claims the following: Apart from obvious typographical errors, the text of the book edition is generally not altered; in cases of doubt, the text is reproduced exactly as it appears in the book [...]. Cuts are made only when unavoidable for copyright reasons—for example, if a contemporary wrote the foreword or afterword, or if the illustrations are still under copyright. The original spelling is retained; no book is adapted to the “new” spelling rules or altered in content. However, some submitted works and contemporary translations are written in modern spelling. Endnotes and annotations are presented as footnotes at the point of reference. Although this technically contradicts the principle of absolute fidelity to the original, it eliminates the need to click back and forth to read the notes.
— https://www.projekt-gutenberg.org/info/texte/info.html
- For the US Gutenberg, this is very different, as far as I know. Katdav-wd-lit (talk) 03:12, 12 October 2025 (UTC)
- whatever you decide to do, can I make a plea that given the creative work, I can get an online text with a simple query, not one involving lots of properties and conditions. For many users the minutiae of texts really don't matter. Like going to a physical library and asking 'do you have a copy of Animal Farm?' Vicarage (talk) 06:44, 12 October 2025 (UTC)
- I agree that would be nice. But I'm not quite sure what you mean? Is it that you want to be able to find the link to the text easily using a query? Katdav-wd-lit (talk) 10:22, 14 October 2025 (UTC)
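As a rough illustration of the "simple query" asked for above: if each online text is an edition item linked to its work with edition or translation of (P629) and carries work available at URL (P953), two triple patterns are enough. The Python helper below only builds the query text; the function name is hypothetical, the `wd:` and `wdt:` prefixes are those predefined by the Wikidata Query Service, and whether a given work returns any rows depends on how its editions were actually modeled.

```python
import re


def online_text_query(work_qid):
    """Build a SPARQL query listing online-text URLs for editions of one work."""
    # Accept only plain item IDs such as "Q1234" to keep the query well-formed.
    if not re.fullmatch(r"Q[1-9][0-9]*", work_qid):
        raise ValueError(f"not a Wikidata item ID: {work_qid!r}")
    return (
        "SELECT ?edition ?url WHERE {\n"
        f"  ?edition wdt:P629 wd:{work_qid} ;  # edition or translation of\n"
        "           wdt:P953 ?url .           # work available at URL\n"
        "}"
    )
```

For example, `print(online_text_query("Q1234"))` prints a query that can be pasted into the query service; `Q1234` is a placeholder here, not the ID of any particular work.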
Property proposal: MIT Press Direct Edition Id
I've recently published a new property proposal for the MIT Press Direct edition id which would cover the unique identifiers used for editions published directly by The MIT Press via their MIT Press Direct platform.
Comments and feedback are appreciated: Wikidata:Property proposal/MIT Press edition id Iamcarbon (talk) 22:17, 19 October 2025 (UTC)
Should main_subject be an allowed property of an edition
edition A company of forts : a guide to the medieval castles of west Wales (Q77004016) has a value for main subject (P921) without comment. As this edition has no parent literary_work I presume one should be created with the main subject (P921) value. But why don't we have a constraint violation to encourage this? Vicarage (talk) 07:15, 27 October 2025 (UTC)
- I think it would be better to instead just have one item and have the data about editions in the one item. For example via qualifiers on the ISBNs. This would make many things easier and cut down on the number of items. Prototyperspective (talk) 13:04, 10 November 2025 (UTC)
- Ta. Constraint added. Vicarage (talk) 13:26, 10 November 2025 (UTC)
Should guidebooks be also versions
We have Brough Castle, Cumbria (Q107475146) and Brough Castle (Q98665987); one is a 1982 guide book (Q223638), the other both a 1986 guide book (Q223638) and a version, edition or translation (Q3331189). They have different authors. Should the correct ontology have the creation of a master entry (and is guide book (Q223638) or literary work (Q7725634) with form of creative work (P7937) guide book (Q223638) preferable) with a main subject (P921) but not authors? If so, why is there not a constraint violation to dissuade people from making things both guidebooks and editions? Vicarage (talk) 08:55, 27 October 2025 (UTC)
Labels of editions
Dear all, I've run into several disagreements [1][2] with @Quesotiotyo over essentially an identical topic: Should book editions have labels which are translations of the edition's actual title? My reading of VIGNERON's comment here and Epìdosis's comment here has led me to believe that we should generally not translate book titles on edition items (because that edition was actually never called that) but Quesotiotyo disagrees saying that the translated name is 'the most sensible display name for English-reading users of this item'. Thanks all for clarifying this, let's settle this once and for all, so that we don't run into further disagreements :). Vojtěch Dostál (talk) 22:17, 3 November 2025 (UTC)
- I agree with your position on this. To your argument I would add the point that it is the only policy that can be applied consistently and that giving all editions of a work the same label independent of language is more likely to create confusion than improve clarity. Pfadintegral (talk) 07:08, 4 November 2025 (UTC)
- I agree with your interpretation. Edition items should use the original title in its original language, preferably with mul - and NOT be translated. Work items can have translated labels and editions, and should be created when they don't exist to link everything together. Iamcarbon (talk) 17:14, 4 November 2025 (UTC)
- A bald Q value returned by a query is very off-putting, mul is the solution here. Vicarage (talk) 17:50, 4 November 2025 (UTC)
- We do run into problems when the published edition has multiple titles, in different languages. These are not common, but we should have an approach decided upon for these books as well. I run into this issue with bitexts occasionally. --EncycloPetey (talk) 16:57, 5 November 2025 (UTC)
- Thanks @Iamcarbon @Pfadintegral. Is it then OK to automatically restore the English labels to the actual Czech edition titles? We do have about 900 items where it was changed, often by @Quesotiotyo... Vojtěch Dostál (talk) 12:42, 6 November 2025 (UTC)
- If Czech has already been copied into Mul, maybe I would just remove English if it differs from Czech. Epìdosis 12:44, 6 November 2025 (UTC)
- @Epìdosis Surely you meant to write that they should be removed if they do not differ from Czech (meaning that they are the same in both languages, and thus the mul label can function for both)?
- --Quesotiotyo (talk) 22:06, 7 November 2025 (UTC)
- @Vojtěch Dostál That is not true, I never changed a label on any of these items. I only added them where they were missing (and by using the book's original title, never translating as you have suggested).
- --Quesotiotyo (talk) 17:28, 6 November 2025 (UTC)
- @Quesotiotyo Thanks for specifying your actions. You are right, but my points of concern remain. Removal of the en labels now suggested by @Epìdosis, and leaving mul, is fine with me. Vojtěch Dostál (talk) 08:19, 7 November 2025 (UTC)
- The English labels are not the same as the mul (or the Czech) ones on the items in question, so no, they should not be removed.
- --Quesotiotyo (talk) 22:03, 7 November 2025 (UTC)
- @Quesotiotyo: having a different label on an edition is an obvious mistake, it should absolutely and immediately be removed. Cheers, VIGNERON (talk) 08:42, 9 November 2025 (UTC)
- @VIGNERON But this is a multilingual project, and Help:Label says Note that an item will have multiple labels in different languages. I cannot understand your viewpoint that having a label in a second language (and that is not simply a duplication of the first) is a mistake. Why should these Wikidata items in particular be legible only for those who can read Czech? For what it's worth, the Czech National Bibliography ID (P3184) record typically contains the book's original English title, and in my opinion it was an oversight that so many of these items were created without including this important bit of information, especially where there was no edition or translation of (P629) statement. At the very least, having these original-title labels will be helpful for linking with the corresponding work items when they are eventually created.
- --Quesotiotyo (talk) 06:56, 10 November 2025 (UTC)
- A written work like Animal Farm of course can have labels in multiple languages. When a Czech publisher commissions a translation, and gives it a Czech title, that goes in a mul label. That edition was never called Animal Farm, or any of the other language names that publishers in other countries might pick for their translations. A WD editor has some leeway to do label translation of work titles if native editions don't exist. It is trivial to write a query on the edition/work pair to construct the sentence "A Czech edition of Orwell's book Animal Farm was published under the title Farma zvířat in 1990". And we should not have orphan editions without their parent works. It's easy to create them with item duplication tools where you review the parent and child to keep the properties distinct (see the project page for this talk page). Fixing Czech orphans is much more useful than editing their labels. Vicarage (talk) 07:34, 10 November 2025 (UTC)
- +1 with @Vicarage: with a simpler example, which makes more sense in English: "Romeo and Juliet is an edition of Romeo and Juliet" or "ロミオとジュリエット is an edition of Romeo and Juliet"? And it's not just us, for centuries book catalogers have made a clear distinction between work and edition; as you said yourself, the translation is the original work title, not the title of the edition the item is about. Also yes, the lack of an item for the work is a problem, but this is a different problem. Finally, it's not really a "mul" problem: before "mul" the exact same title was copied verbatim in all labels, and for other kinds of publication like scholarly articles, it's also done like that. Cheers, VIGNERON (talk) 08:07, 10 November 2025 (UTC)
Done per https://editgroups.toolforge.org/b/wikibase-cli/422291e515035/ Vojtěch Dostál (talk) 08:49, 6 December 2025 (UTC)
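The point made in this thread, that an edition keeps its own title while the reader-facing name comes from the work, can be sketched in Python. The dictionaries below stand in for the result of a query over an edition/work pair; the key names and the function are hypothetical, and the comments note which properties such values would plausibly come from (that mapping is an assumption, not something settled above).

```python
def describe_edition(work, edition):
    """Compose a reader-facing sentence from a work record and an edition record."""
    return (
        f"A {edition['language']} edition of {work['author']}'s book "
        f"{work['label']} was published under the title "
        f"{edition['title']} in {edition['year']}"
    )


# Values taken from the example sentence in the thread.
work = {
    "label": "Animal Farm",     # work label in the reader's language
    "author": "Orwell",         # author, stated on the work
}
edition = {
    "title": "Farma zvířat",    # the edition's own title, never translated
    "language": "Czech",        # language of the edition
    "year": 1990,               # publication date of this edition
}
```

`describe_edition(work, edition)` reproduces the sentence quoted above, using the untranslated edition title together with the English label of the work.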
item structure for multi-volume books (encyclopedias, histories, etc)
I'm curious as to best practices for making items for reference works that consist of two or more separate volumes, e.g. History of Paterson and Its Environs (Q136262472) or Historical Register and Dictionary of the United States Army (Q136721231). As in the previous examples, I generally use instance of = book edition (Q57933693), number of parts of this work (P2635) = x volumes, and generally don't bother to make items for each individual volume of an edition. I aim for practicality in referencing statements, not esoteric data purity, so generally don't create largely redundant meta items like literary work (Q7725634), especially if a historic book only has one edition or printing (modern public domain reprints-on-demand notwithstanding). This gets a bit more complicated with encyclopedias, which may have multiple editions printed years apart, with each edition consisting of multiple volumes. With this approach, if I create items I generally repeat the above for each edition (e.g. "Encyclopedia of Whatever (1905 edition)"), in addition to creating the underlying literary work (Q7725634) for the editions to stem from. For instance Harper's Encyclopedia of United States History (Q5663319) has at least 3 editions (1902, 1905, and 1915), with each edition consisting of 10 volumes. Does this approach seem rational? Is there something I could be doing better? -Animalparty (talk) 18:49, 8 November 2025 (UTC)
- Sounds very sensible. Vicarage (talk) 07:42, 10 November 2025 (UTC)
Should we have work/edition constraints
My change to main subject (P921) to discourage its use on editions in favour of works was challenged by @Fnielsen because "we have many editions that don't have matching works". I would argue that is a mistake that needs rectifying, not a reason to avoid using the constraint system at all. Do we have many other work and edition rules that would need to be removed if we accepted that an edition can stand alone? Vicarage (talk) 17:51, 13 November 2025 (UTC)
- Well, I agree with you, P921 should be used on "work", not on "editions"...
- but I understand @Fnielsen's objection : if you simply add a constraint, the list of constraint violations to clean up will probably be enormous... and people who usually clean this list won't necessarily know how to do this : by creating the "work" item, with original title, author, original language, and then transferring the statements... - we risk that people not knowing how this project works will simply add "work" as P31, or worse, replace version, edition or translation (Q3331189), which would be even more problematic
- so, perhaps, before putting a constraint, it would be necessary to begin with cleaning up... - is someone able to retrieve from SPARQL the number of items concerned (in the Book project, since there are some strange items, too... High Adventure Role Playing (Q5486)) ?
- maybe it would be possible to have a suggestion constraint (Q62026391), not a mandatory constraint (Q21502408) ? Hsarrazin (talk) 19:31, 13 November 2025 (UTC)
- 137k P921 values for editions, 129k of them do not have a parent work declared. Of the ones that do, 5400 do not have the subject in the work, 2400 do. So moving P921 from edition to work is practical, fixing the edition problem is not. A warning rather than a mandatory constraint is fine with me.
- What about the general point: should we add warnings about editions having work properties, and is it possible to write a warning constraint "you have used main subject (P921) on an edition that has an edition or translation of (P629) statement pointing to a work"?
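# Editions (Q3331189) that carry a main subject (P921) statement.
# Uncomment the P629 and MINUS lines to keep only editions whose
# parent work does not have the same subject.
SELECT DISTINCT ?item ?itemLabel #?subjectLabel
WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en-gb,mul,en". }
  {
    SELECT DISTINCT ?item ?subject WHERE {
      ?item wdt:P31 wd:Q3331189.
      ?item wdt:P921 ?subject.
      # ?item wdt:P629 ?work.
      #MINUS
      # {?work wdt:P921 ?subject}
    }
  }
}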
- allows you to play with the subject. Vicarage (talk) 20:03, 13 November 2025 (UTC)
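For anyone who wants to run the two variants of the query above from a script, here is a minimal Python sketch that only builds the query text. The function name and the `limit` parameter are illustrative assumptions, not part of any existing tool; the property and item IDs are the ones used in the query above.

```python
def editions_with_subject_query(only_missing_on_work: bool = False, limit: int = 100) -> str:
    """Build a query for editions (Q3331189) that carry main subject (P921).

    With only_missing_on_work=True, keep only editions that are linked to a
    work via P629 where that work does not have the same subject.
    """
    extra = ""
    if only_missing_on_work:
        extra = (
            "  ?item wdt:P629 ?work.\n"
            "  MINUS { ?work wdt:P921 ?subject }\n"
        )
    return (
        "SELECT DISTINCT ?item ?subject WHERE {\n"
        "  ?item wdt:P31 wd:Q3331189.\n"
        "  ?item wdt:P921 ?subject.\n"
        f"{extra}"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Print the variant that lists subjects present on the edition but not on its work.
    print(editions_with_subject_query(only_missing_on_work=True, limit=50))
```

The output can be pasted into the query service as is; the LIMIT is only there to keep exploratory runs small.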
- Thanks !
- Yes, we indeed have an enormous amount of work to do to add "work" items to "editions"... - including cleaning up items that are at the same time "work" and "edition", like Cup of Salvation (Q40554339)
- Wikisource projects (of which I am) are probably responsible for a not small part of it :(
- on frws, we have a small script that lets us quickly build a "work" item FROM the "edition" one, using the "authors" set on the edition... and creating the work/edition links... - you still need to work a little on the "work" afterwards, but it simplifies doing it, and eases the transfer of properties from the edition afterwards (with the moveClaim gadget)
- it is part of our main script, which retrieves data from the Wikisource header... maybe it could be made into a more versatile tool that everybody could use -> see from line 52 User:Tpt/ws2wd.js ?
- I indeed support having a warning on "editions" that have a work to add the "subject" on the "work", NOT the "edition"
- PS : some of the "editions" with P629 are just horrors to be cleaned up :
- Soupers rapides (Q105948090), where P629 was used to add Amazon Kindle (Q136469)
- this one, novel from an author that claims to be an edition of 2 other works by 2 other authors... -- I cleaned up this one, and created the "work" to show the work of the script
- Hsarrazin (talk) 07:38, 14 November 2025 (UTC)
I usually only add the edition item and not the work item. I do it because 1) I need to cite it with Cite Q on Wikipedia, 2) I would like to list it in Scholia, or 3) I create an item for a Wikisource page. I only add a work item if 1) I write about the work in Wikipedia, which is not that often, or 2) there are multiple editions. Furthermore, note that editions might have different P921. For instance, People, States and Fear (Q7165832) has an edition from 1983 and one from 1991. The latter considers the post-Cold War era, which could mean that P921 could be different. Having the P921 on editions may make SPARQL queries simpler. I am not too much into IFLA Library Reference Model (Q54410458) and Functional Requirements for Bibliographic Records (Q16388), but aren't subject headings (P921) usually applied on the manifestation level, i.e. edition level? Does OCLC WorldCat only secondarily assign subject headings on the work level? Aren't Library of Congress Subject Headings assigned to each edition? If the librarians assign on edition/manifestation level, I do not think that Wikidata should diverge. -- Finn Årup Nielsen (fnielsen) (talk) 06:20, 20 November 2025 (UTC)
- One problem with quoting practice in other bibliographic systems is that they may not have the WD policy (or indeed capability) of assigning information at the highest possible level, and getting queries to propagate the information. Clearly some editions have different scopes, but the vast majority don't, so a warning with an explanatory text "use in work unless substantially different" would encourage best practice here.
- I originally wanted to only add combined single edition/work items, but I accepted that was not best WD practice. I decided that the work was more important, and others, especially bots, could fill in the editions later. Vicarage (talk) 08:21, 20 November 2025 (UTC)
- Those library catalogs (LC, OCLC, BNF, and every other National library catalog) were built long before FRBR, and of course, subjects were added to any edition (manifestation) record, since the concept of "work" (in the bibliographical meaning) was not even invented... this leading to different editions of the same work having very different "subject" indications, because they were not catalogued at the same time by the same person... which did not help identifying different editions of the same work, especially when the Title was different...
- The main object of frbr/lrm was to have a comprehensive view of catalog records, and to be able to build work records progressively - Most national libraries still struggle with millions of records to be revised : the building of Frbr/Lrm is still a work in progress in all main libraries in the world...
- Wikidata, on the contrary, chose to adopt the logic of this system (work/editions) at the very beginning of the Book project, and thus does not (should not) have a backlog of millions of records to correct...
- Wikidata does not have to comply with the past of library catalogs, and put subjects on all editions of a book... it can boldly go where no one has gone before -> directly into the future of libraries ;) -- and many said libraries are looking up to Wikidata to see how we do...
- I agree though, that editions of a book that are substantially different (expanding historic scale, for example), could need added subjects...
- that's why a warning with "use in work unless substantially different on THIS edition" seems to me a good solution... -- otherwise, putting all subjects on the "work" item is the good solution... Hsarrazin (talk) 09:46, 26 November 2025 (UTC)
If we change our property constraint to accommodate items violating our modeling of items then, for all practical purposes, said modeling is no longer enforced and by extension no longer exists. Personally I would never have known that we weren't supposed to use main subject on editions were it not for the constraint. I cannot see how this could be a bad thing. --Trade (talk) 09:29, 26 November 2025 (UTC)
Bot edits
I am experimenting with bot edits, using wikidataintegrator.
- I would like to have these edits be recognizable as being done by the bot. But I can't get it to work. Is it necessary to log in with OAuth for this to work? Is it sufficient? Am I just not setting the user agent header correctly? I've registered my bot, I just don't get how I tell the API that it's the bot and not me making edits. I'm looking into edit summaries as well, but so far all I seem to be able to do is work by trial and error, as I can't find very good documentation of how it works.
- For various reasons, it would be great if one could do few moderately big edits instead of many small ones. Currently, I am making one edit per work, and edition. It could be potentially fewer big edits, each creating several items in a batch, and again batched edits adding references between editions and works. Has anyone tried this and can help me with ideas on how to implement this?
Katdav-wd-lit (talk) 20:12, 15 November 2025 (UTC)
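On the "fewer, bigger edits" question, one possible direction is sketched below with the standard library only. It builds the POST parameters for a single `wbeditentity` API call that creates an item together with its label and statements, instead of one edit per statement. The helper names are hypothetical, the CSRF token and login session are deliberately left out, and whether wikidataintegrator exposes the same knobs is not covered here. As far as I understand the API, the `bot` flag is only honoured when the logged-in account has the bot right, so treat that part as an assumption to verify.

```python
import json


def item_claim(prop: str, target_qid: str) -> dict:
    """One statement whose value is another item, in Wikibase JSON form."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(target_qid[1:]),
                    "id": target_qid,
                },
            },
        },
        "type": "statement",
        "rank": "normal",
    }


def new_item_request(label: str, lang: str, claims: list, summary: str) -> dict:
    """POST parameters for one wbeditentity call that creates a whole item."""
    data = {"labels": {lang: {"language": lang, "value": label}}, "claims": claims}
    return {
        "action": "wbeditentity",
        "new": "item",
        "data": json.dumps(data, ensure_ascii=False),
        "summary": summary,
        "bot": "1",  # assumed to be ignored unless the account has the bot right
        "format": "json",
        # A "token" parameter (CSRF token of the logged-in session) must be added.
    }


if __name__ == "__main__":
    # Illustrative only: an edition item linked to an already existing work item.
    claims = [item_claim("P31", "Q3331189"), item_claim("P629", "Q136801480")]
    params = new_item_request("Meine Tante Anna", "de", claims, "create edition item")
    print(params["data"])
```

Since a new item's ID is only known after it has been created, a work and its edition cannot both point at each other within their creating edits. One option is to create the work first, create the edition with P629 already included, and then add the reverse link to the work in a third edit.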
- I cannot give advice about tagging "bot" edits... or batch edits in general...
- However, looking at Meine Tante Anna (Q136801480) and its editions, I would strongly advise you to search for and add library catalog IDs (at least one) to allow for further completing the items, like I did for Meine Tante Anna (Q136801481) - It is extremely difficult to complete editions without them : no publisher, no address, no page...
- for a german book, I could find it quite easily (though I do not understand german at all), but for english language books, it could be very very difficult to find those catalog IDs afterwards, without knowing where the book was published (UK, US, Canada, anywhere else ?) :) Hsarrazin (talk) 11:39, 16 November 2025 (UTC)
- You say "for a german book, I could find it quite easily" --- could you give me some pointers how you were able to do that? Did you search it by hand in the DNB? All our books are German publications. Katdav-wd-lit (talk) 03:11, 23 November 2025 (UTC)
- Yes, that's exactly what I did..., which is not simple for me, as I do not understand german, at all, and contrary to other national libraries around (Belgium, NL), there is no way to switch interface in english :)
- Here was my query (title in the box, and then sort by ascending dates) https://portal.dnb.de/opac/simpleSearch?query=Meine+Tante+Anna+&cqlMode=false&sortOrderIndex=jhr_asc
- Are all your books Gutenberg transcriptions in German ? and are the other "edition" you input the one the Gutenberg transcription was done with, or always the first edition ?
- if so, you should link Gutenberg to the edition linked for transcription with the Gutenberg one, to indicate this... or do like we do for Wikisource transcriptions :
- for Wikisource editions (I'm frws admin, and one of the librarians working on WD/WS links), we simply catalog the edition on which the transcription is done, and then just link the edition item to Wikisource, like this [work] Les Fleurs du mal (Q216578) + [some of its numerous editions] Les Fleurs du mal, 1857 edition (Q18683398), Les Fleurs du mal, edition of 1861 (Q23890839), Les Fleurs du mal (Q61959690)... Hsarrazin (talk) 12:52, 23 November 2025 (UTC)
- If the separate entries containing only the date of the first publication known to us are considered incomplete data, I can delete those, but I don't have library IDs for these editions in my dataset. I could try to see if one can automate something... You see, this dataset came together in the course of a research project in comparative literature, where we cared about the date, but had no particular reason to record library IDs.
- What I want to say is: I don't think I will be able to convince my fellow researchers to go back and find the library IDs again for over four thousand data records. We simply wanted to share what we had accumulated. I'm also not sure we can change our work flow so that such IDs are recorded in future projects, as it's not necessary for our research and we don't have the resources. So the question is really: Should wikidata have this data without a clear reference for how it was found (although it can be found, if someone else has the resources; I know this is not ideal), or should we not upload it (it meaning the edition entity with only authorship, language, date, and data linking it to the main entry and the edition at Gutenberg.)
Katdav-wd-lit (talk) 14:20, 16 November 2025 (UTC)
- Briefly to chime in, I tend toward this being a useful addition. However, a few thousand items aren't many in the total number of missing book items and do you really have no ID whatsoever? I think the ISBN should always be there so that when another bot/tool imports more data, duplicate items can be avoided and more data for the items be retrieved based on the unique identifier. One could however also use title + author to prevent creation of duplicate items so I still tend toward the proposed import being beneficial. Prototyperspective (talk) 11:54, 20 November 2025 (UTC)
- there are NO isbn on books older than 1970... which was the reason for my asking...
- not giving catalog IDs for old books (those without ISBN) makes the work very tedious and difficult for WD librarians who try to complete the items... Hsarrazin (talk) 12:38, 20 November 2025 (UTC)
- We are more than happy to help. But at least, with the data we provide, one knows the publication year of the first edition. At least for our use case, this is useful information, as statistics about corpora can be based on it. Katdav-wd-lit (talk) 03:14, 23 November 2025 (UTC)
Best description for book editions
"2022 English-language paperback edition of the true crime book Slenderman: Online Obsession, Mental Illness, and the Violent Crime of Two Midwestern Girls by Kathleen K. Hale (978-0-8021-6182-6)"
How does this sound? Trade (talk) 03:56, 28 November 2025 (UTC)
- Overlong, hard to read and wasteful of database resources. Descriptions are to aid human disambiguation and are read in the context of the label; they are not summaries of the item. "2022 paperback edition" would suffice, with "French translation" if non-native. The long-awaited automatic descriptions could be templated to that, but any push to have 100m editions with long descriptions in 100 languages would cripple WD. Vicarage (talk) 06:26, 28 November 2025 (UTC)
- that looks like a promotional line on the back of a paperbook :D Hsarrazin (talk) 06:30, 28 November 2025 (UTC)
- while a simple date/publisher or date/format mention should be enough... "2023 ebook edition", "2022 hardcover edition" - the title, and author name are in the edition item, the genre should be in the linked "work" item, NOT in the description of every edition... - as for the language, it will be necessary only if the book gets translated ;)
- also, I see that you created no less than 7 editions for this book... is it really necessary to create all of those ? Wikidata is not an advertising platform for books : work + first paper edition (or the edition you really use) is enough for citation purposes...
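The short description pattern suggested in the replies above ("2022 paperback edition", plus a translation note when relevant) could be templated roughly as follows. This is only a sketch: the function name and its parameters are made up for illustration, and no existing automatic-description feature is implied.

```python
from typing import Optional


def edition_description(year: int, book_format: Optional[str] = None,
                        translation_language: Optional[str] = None) -> str:
    """Short disambiguating description for an edition item.

    Title, author and genre are left out on purpose: they live in the label
    and in the linked work item, as argued in the discussion above.
    """
    parts = [str(year)]
    if book_format:
        parts.append(book_format)
    parts.append("edition")
    description = " ".join(parts)
    if translation_language:
        description += f", {translation_language} translation"
    return description


if __name__ == "__main__":
    print(edition_description(2022, "paperback"))
    print(edition_description(2023, "ebook", translation_language="French"))
```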
- Is documenting every different ISBN-13 edition really a negative thing? --Trade (talk) 13:04, 28 November 2025 (UTC)
- since Wikidata is neither a library catalog nor a bookseller catalog, I don't see the point... - is there any real interest in knowing that there are many formats, and more than one publisher ?
- I'd say, as long as a specific edition is not needed for citation purpose, there is no need to have an item for it on wikidata... they are already on Goodreads and other sites, who specialize in these Hsarrazin (talk) 15:01, 28 November 2025 (UTC)
- But Goodreads is not Wikidata nor is it a knowledge database.
- Is there an actual need for banning book editions for literary works above an arbitrary amount? Trade (talk) 21:53, 29 November 2025 (UTC)
- It's been estimated that 150m books have been published, with perhaps 500m editions. WD currently has 120m entries and is already having capacity issues, hence mul and the database split. So yes, we won't be able to handle them all. Vicarage (talk) 22:12, 29 November 2025 (UTC)
- what database split Trade (talk) 22:48, 29 November 2025 (UTC)
- "WD currently has 120m entries and is already having capacity issues" But the majority of items are either mass batch creations of scholarly articles or mass batch creations of said authors of scholarly articles. Restricting book editions seems backwards when they are not even the ones causing the issue in the first place Trade (talk) 22:56, 29 November 2025 (UTC)
- https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split which explains how we now have to run different SPARQL server setups for main data and those scholarly articles. Now that's not the database we add data to, but I can't see it coping with tenfold increases in number of items, wherever they come from. Vicarage (talk) 23:01, 29 November 2025 (UTC)
- We are currently at less than 2500 book edition items with only 1041 of those having ISBN 13 so I'll say we have a long time before it ever becomes an issue Trade (talk) 23:53, 29 November 2025 (UTC)
- I checked and was amazed to see for instance of (P31) we only have 800k editions and 400k works, such a small fraction of the possible space. I'd have expected mass import campaigns to have produced far more items than that. Vicarage (talk) 06:40, 30 November 2025 (UTC)
- Do you mind me including the ISBN-13 in the description? I feel that makes it much easier to differentiate different editions
- I changed the description to "2022 paperback edition published by Grove Press (978-0-8021-6182-6)" Trade (talk) 13:06, 28 November 2025 (UTC)
- What if there's two 2022 paperback editions? Trade (talk) 13:09, 28 November 2025 (UTC)
- I'd say the role of the description is to disambiguate. For most books format and date would suffice. Only remarkable books get foreign translations or more than one similar edition a year, and then we should play it by ear. It's never a good idea to plan policy based on edge cases. And certainly adding ISBNs to descriptions does not really aid human readers to say, "oh, that one!" Vicarage (talk) 13:26, 28 November 2025 (UTC)
- My idea is mostly that if I have the ISBN of a book I can simply search the code to see if this particular edition already has an item before I make a new one Trade (talk) 13:57, 28 November 2025 (UTC)
- You'd be better off doing a SPARQL query of the subject area and searching that. Gemini will write the code for a php search page trivially, and it will be tailored to your needs. As a rule we don't expect descriptions to be formatted versions of item properties, just an aide-memoire. Vicarage (talk) 14:08, 28 November 2025 (UTC)
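As a rough illustration of checking for an existing edition by ISBN without putting the ISBN in the description, the Python sketch below validates an ISBN-13 (standard alternating 1/3 weights) and builds a query on ISBN-13 (P212). The function names are hypothetical. The query compares hyphen-stripped values because stored ISBNs may be hyphenated differently from the one in hand; that means it scans all P212 values, which is assumed to be acceptable for occasional lookups.

```python
def normalize_isbn13(raw: str) -> str:
    """Strip hyphens/spaces and verify the ISBN-13 check digit."""
    digits = raw.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a 13-digit ISBN: {raw!r}")
    # Weights alternate 1, 3, 1, 3, ...; a valid ISBN-13 sums to a multiple of 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    if total % 10 != 0:
        raise ValueError(f"bad ISBN-13 check digit: {raw!r}")
    return digits


def isbn13_lookup_query(raw: str) -> str:
    """Query text for items whose ISBN-13 (P212) matches, ignoring hyphens."""
    isbn = normalize_isbn13(raw)
    return (
        "SELECT ?item ?isbn WHERE {\n"
        "  ?item wdt:P212 ?isbn.\n"
        f'  FILTER(REPLACE(?isbn, "-", "") = "{isbn}")\n'
        "}"
    )


if __name__ == "__main__":
    print(isbn13_lookup_query("978-0-8021-6182-6"))
```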
Proposal
- Replace any statements using non-fiction (Q213051) on genre (P136) with non-fiction literature (Q27801)
- Replace any statements using non-fiction work (Q20540385) on genre (P136) with non-fiction literature (Q27801)
- Remove any statements using non-fiction (Q213051) on instance of (P31) and add genre (P136) > non-fiction literature (Q27801) instead
- Remove any statements using non-fiction work (Q20540385) on instance of (P31) and add genre (P136) > non-fiction literature (Q27801) instead
How does this sound? Trade (talk) 00:34, 30 November 2025 (UTC)
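To make the four rules above concrete, here is a small Python sketch that restates them as a function mapping one existing statement to the removals and additions the proposal would imply. It only mirrors the proposal as posted, makes no edits, and the names used are illustrative assumptions.

```python
NONFICTION_LITERATURE = "Q27801"
VALUES_TO_REPLACE = {"Q213051", "Q20540385"}  # non-fiction, non-fiction work
GENRE = "P136"
INSTANCE_OF = "P31"


def proposed_changes(prop: str, value: str) -> list:
    """Return (action, property, value) tuples implied by the proposal."""
    if value not in VALUES_TO_REPLACE or prop not in (GENRE, INSTANCE_OF):
        return []
    # In both cases the old statement goes away and the genre statement is added.
    return [
        ("remove", prop, value),
        ("add", GENRE, NONFICTION_LITERATURE),
    ]


if __name__ == "__main__":
    print(proposed_changes("P31", "Q213051"))
    print(proposed_changes("P136", "Q20540385"))
```

A real run would also need to skip the addition when the item already has genre (P136) = non-fiction literature (Q27801), which this sketch does not check.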
- I'd like to see a breakdown of the current usages of the terms. I think non-fiction (Q213051) and non-fiction literature (Q27801) need review as they both point to a single ENWP article, and other languages are scattered across them. As a gut feeling, literature to me is fiction, so non-fiction literature (Q27801) is rather jarring. Vicarage (talk) 06:35, 30 November 2025 (UTC)
- I don't understand either the difference between non-fiction (Q213051) and non-fiction literature (Q27801), and I'm a professional librarian... :/
- if those are equivalent, why not "merge" the items ? Hsarrazin (talk) 12:36, 30 November 2025 (UTC)
- There does exist an item for nonfiction film (non-fiction film (Q24960157)), a term also used by the Library of Congress (https://id.loc.gov/authorities/genreForms/gf2011026423.html) - Valentina.Anitnelav (talk) 14:28, 30 November 2025 (UTC)
- is it not documentary film (Q93204) ? or are there other types of "non fiction" films ? Hsarrazin (talk) 14:39, 30 November 2025 (UTC)
- @Hsarrazin public service announcement (Q1196567), educational advertisement (Q5341269) and public information film (Q7257915) often convey non-fiction (Q213051) messages to the public, though the messages can be couched in a fictional format as well. For example, a film with a basic message of "It is illegal to drive without a seatbelt" is non-fiction (Q213051) without being a documentary film (Q93204). However, a film showing the fictional story of someone driving without a seatbelt and then being involved in a crash would be fiction (Q8253). news media (Q1193236) can also be non-fiction (Q213051) (depending on the integrity and intentions of the source) without being a documentary film (Q93204). From Hill To Shore (talk) 15:32, 30 November 2025 (UTC)
- The term "literature" implies that the work was written as a literary work, that is, written for the quality of prose or to entertain. If a work is literary, then it might be included as part of a literature course, because it shows the styles popular during, or specific to, a particular literary period. The reason we have a separate item for non-fiction (Q213051) is that the vast majority of non-fiction works are not literary works. Instead they are written to inform, to educate, or to summarize information, without trying to be literary.
- Most textbooks are non-fiction, but they are not literary works. Most scientific papers are non-fiction, but they are not literary works. Autobiographies and memoirs might be both non-fiction and literary, if the writer put the effort in to make the prose more than simply factual. Some famous essays are both non-fiction and literary. But most non-fiction works are not literary.
- And I am not convinced that any sort of "non-fiction" is a genre, otherwise On the Origin of Species and The Languages of West Africa and The Autobiography of Benjamin Franklin are all the same genre, though they certainly do not feel the same in any meaningful way. The label of "non-fiction" simply means that the work is "not fiction", and so defines the work by what it isn't rather than by what it is. --EncycloPetey (talk) 13:39, 30 November 2025 (UTC)
- indeed, and most language links do not refer to "literature" but only to "non-fiction", which means that they should be linked to Q213051, not to Q27801
- to me, the problem seems to be that most people using one of those on a work item make no difference on the level of "language or style quality" used in the book, and just want to mark that it is not fiction... In library sorting use, I've never seen any difference made... all is sorted as non-fiction - the specific "autobiography" or "memoirs" being added or substituted if appropriate... Hsarrazin (talk) 13:47, 30 November 2025 (UTC)
- I've never understood why written work (Q47461344) (a subclass of literary work (Q7725634)) in its description doesn't mention books. That must discourage people from using it in favour of literary_work. And in WD, with so much incomplete data, I think it's important to assert that a book's genre is factual, rather than imply it by the absence of a value. Vicarage (talk) 13:55, 30 November 2025 (UTC)
- it's the contrary : literary work (Q7725634) is the subclass of written work (Q47461344) - a written or spoken type of work, later transcribed (speech, conferences, etc.) -, which means that some written work (Q47461344) are NOT literary... those are (or should be) non-fiction (Q213051)
- part of the problem is that in some languages and contexts "literary" only means textual, composed of letters, while in others, a certain quality is implied... this "quality" being subjective in many cases, without any intent from the author, but implied, afterwards, by critics and historians...
- IMHO, only the objective aspect "fiction"/"non fiction" should be really valid :/ Hsarrazin (talk) 14:04, 30 November 2025 (UTC)
- I meant that (in my head at least!) Vicarage (talk) 14:09, 30 November 2025 (UTC)
- The description doesn't mention "books" for multiple reasons: (1) "book" has many, many, many meanings, as this group has discussed many times, so it is not a useful word to use; (2) written works also include inscriptions, calligraphic brushwork, hieroglyphics, writing on monuments, scrolls, papyri, and many other forms that are not books; (3) correspondence, poetry, scripts, and memoranda are written works but are not books. --EncycloPetey (talk) 17:09, 2 December 2025 (UTC)
- neither are speeches, conferences, or even drama plays (not all of them are ever published) Hsarrazin (talk) 17:13, 2 December 2025 (UTC)
- People know what a book means in common parlance for intellectual works, a few hundred pages of text on a subject. To exclude the term means that people will flounder when wanting to add them to WD. The guidance for this page, use written_work or its subclasses, is pretty hopeless for non-fiction works. I decided in the absence of good guidance to use 'literary_work' when adding non-fiction bibliographies, but I could have chosen reference_work, academic_work, non_fiction_literature, scholarly_work or written_work itself. It would be a shame if the only way I can query "books on forts" is to quote a long list, or get swamped by all written work on the subject. Vicarage (talk) 17:58, 2 December 2025 (UTC)
- some "books" are only 10 pages... your idea of a book is probably not my idea of a book (I'm a librarian), and probably not the same as a collector's idea...
- we excluded it, because it could be either work, edition, specific item, and could thus be used very wrongfully by different people - it has been the case for years, and I think the cleanup of wrongfully described items is not finished... "book" is clearly a no-go as P31 here...
- if you want to use some meaningful description, use written work (Q47461344), then specify with genre (P136), form of creative work (P7937), and don't forget main subject (P921)... Hsarrazin (talk) 18:13, 2 December 2025 (UTC)
- The irony, as mentioned in another thread, is that written_work has lots of aliases for things that aren't books, but not actually 'book' itself. And some think the alternative, literary_work, should be reserved for "literature", which rather excludes non-fiction and popular fiction in many languages. Vicarage (talk) 19:32, 10 December 2025 (UTC)
- You may use the term "literature" to only refer to works written to entertain or having a certain style, but this is not the only use of the term. The English Wikipedia defines "literary work" as "a generic term for works of literature, i.e. texts such as fiction and non-fiction books, essays, screenplays". The description on the item literature (Q8242) recognises that there are two uses of the word: one narrower, one broader.
- We have an item for "scientific literature", scientific literature (Q12042160), a term also used by the Library of Congress.
- The term "non-fiction literature" would just refer to a subclass of the broader concept of "literature", not to the narrower. - Valentina.Anitnelav (talk) 14:28, 30 November 2025 (UTC)
- I agree, it's the same in French... in the broad sense, "literature" refers to anything made of letters or words, as opposed to art, painting, sculpture, film, etc. Hsarrazin (talk) 14:40, 30 November 2025 (UTC)
- I would prefer to use non-fiction (Q213051) over non-fiction literature (Q27801), in the same way I prefer to use children's fiction (Q56451354) over children's literature (Q131539) and so on. I don't think any of the "literature" items should be used as a genre over the "fiction" item when there are both. —Xezbeth (talk) 09:36, 1 December 2025 (UTC)
How to classify picture books?
See Picture book. These are books that contain mainly images and usually nearly no text and sometimes maybe arguably no text at all.
They are not instances of literary work (Q7725634) as that is a subclass of written work (Q47461344).
Note that this also affects many/several items that do have the above set – it would be good to have a query for these to correct them.
One idea would be setting instance of: book and genre: picture book.
Prototyperspective (talk) 14:14, 3 December 2025 (UTC)
Simplify the process of adding "books"
It's hard to add a "book" to WD now for people who don't know the current conventions for books; the obvious way doesn't work.
When I tried to add a book with instance of (P31) = book (Q571) I got an error about "no items must be a book". It suggested using literary work (Q7725634), written work (Q47461344) or version, edition or translation (Q3331189) instead. But the 1st and 2nd suggestions conflict with place of publication (P291), so only the 3rd suggestion is the right choice.
So a book in WD is not a book in people's minds. In other cases (humans, paintings, stars etc.) the WD class matches. But not for books! I'd suggest renaming book (Q571) to something else, and renaming version, edition or translation (Q3331189) to "book" in the sense of an edition. Then people will add "books", which is the obvious way. Or at least there should be a "book" for ordinary people: a class of items they can use to add books. If that's too complex or will break something, tell me how I can write a manual for adding and citing books from WD in Wikipedia :-) Vmartyanov (talk) 12:18, 10 December 2025 (UTC)
- Would you please consider reading the previous discussion about the SAME subject... ?
- NO ! "book" is not a simple matter, and all ordinary people do not understand the same thing when using it... a book is at the same time the work (the novel "Frankenstein", for example), all of its translations (in French, German, Italian, etc.), all of its various editions through time in all of these languages, all of the various copies that exist in any library, including "collector" ones, or simply an object with a cover and pages (any object with a cover and pages, in fact)...
- which makes it practically impossible to describe these as a simple item... Hsarrazin (talk) 14:07, 10 December 2025 (UTC)
- The continuing confusion over this means we do need to issue much clearer advice. To me that means we need to encourage the creation of a written work (Q47461344) first, and optionally a linked version, edition or translation (Q3331189) with publication details. We need tools that take one or the other and create the paired item, splitting properties appropriately, and allow them to work on the book (Q571) that users will inevitably create, as they know what books are. Re-labelling it as physical_book and creating a new 'book' item would attract users to this environment. But you can't get away from the fact that in common use a book is both a physical object and its contents. Vicarage (talk) 19:06, 10 December 2025 (UTC)
- I would make "book" be an alias for "written work" or even its main label. If non-experts are entering information about a book into Wikidata, it's not likely that they are entering something about a physical book (due to notability concerns). So it's either going to be a written work or an edition and, in my view, there shouldn't be an edition without a written work to back it up, so they have to create a written work anyway. Good instructions for use (and some movement on the decade-long-outstanding ticket group starting with https://phabricator.wikimedia.org/T97566) would help a lot, I think. Peter F. Patel-Schneider (talk) 19:26, 10 December 2025 (UTC)
- I agree very much that people have a notion of what a book is and that it would be useful to have that reflected in Wikidata. But is the notion 'good enough' to be a class in Wikidata? I would say yes - the popular notion of a book (from English Wikipedia) is "a written work of substantial length created by one or more authors". So why not make book (Q571) be this and have instructions for use supporting it? Peter F. Patel-Schneider (talk) 18:47, 10 December 2025 (UTC)
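To make the "paired item" tool idea from this thread concrete, here is a minimal Python sketch of how properties on a flat "book" record might be split between a written work (Q47461344) and a version, edition or translation (Q3331189). It is an illustration only, not an existing tool: the function name `split_book` and the choice of which properties count as edition-level (publisher P123, place of publication P291, ISBN-13 P212, ISBN-10 P957) are assumptions made for the example, and values are plain placeholder strings rather than real Wikidata items.

```python
# Hypothetical sketch: split a flat "book" record into a work and an edition.
# The property grouping below is an assumption for illustration, not project policy.

WORK = "Q47461344"     # written work
EDITION = "Q3331189"   # version, edition or translation

# Assumed edition-level properties:
# publisher, place of publication, ISBN-13, ISBN-10
EDITION_PROPS = {"P123", "P291", "P212", "P957"}


def split_book(claims):
    """Return (work_claims, edition_claims) from a flat property -> value mapping."""
    work = {"P31": WORK}
    edition = {"P31": EDITION}
    for prop, value in claims.items():
        if prop == "P31":
            continue  # drop the original class, e.g. book (Q571)
        # Everything not listed as edition-level is kept on the work.
        target = edition if prop in EDITION_PROPS else work
        target[prop] = value
    return work, edition


# Placeholder input: a record entered as instance of book (Q571).
book = {"P31": "Q571", "P50": "Mary Shelley", "P291": "London"}
work, edition = split_book(book)
print(work)
print(edition)
```

A real tool would additionally need to create both items and link them, for example with edition or translation of (P629) on the edition and has edition or translation (P747) on the work. That step is left out here because it depends on item IDs that only exist after creation.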
Page offset (proposed property)
Page offset is a new proposed property at Wikidata:Property proposal/Creative work#Page_offset. حبيشان (talk) 13:59, 10 December 2025 (UTC)