Wikidata talk:WikiProject Books

On this page, old discussions are archived. See: 2013, 2014, 2015, 2016, 2017.

misuse of NNL work ID (P3959)

Every instance thus far where I have come across NNL work ID (P3959), it has been misused. This property is supposed to link to work identifiers from work data items, but what I'm finding is that it links to copy records of Hebrew translations from the work data item, which should never happen. --EncycloPetey (talk) 19:22, 5 November 2017 (UTC)

make it clearer in the description? maybe look to add/suggest a constraint that flags the addition/use as incorrect.  — billinghurst sDrewth 23:43, 6 November 2017 (UTC)
I also noticed what you say EncycloPetey, but not reading hebrew, I did not dare remove them. Do you think I should ?
in fact, do you know if there are real work IDs in NNL catalog, or only editions ids (like in many libraries) ? if there are not, maybe rephrase the property as "id for an edition" ? (in all languages) --Hsarrazin (talk) 10:53, 7 November 2017 (UTC)
If you follow the links, you get the library record, which includes the standard library catalog headings in English. Without a single exception, every single record I've come across includes a "...--Translations into Hebrew" header, which means it is not an authority record for the original work. I honestly don't know if the NNL catalog has any work IDs or authority records, but I certainly haven't seen any linked. So the way in which the property is being used does not match its description in Wikidata at all. --EncycloPetey (talk) 13:47, 7 November 2017 (UTC)
mmmmh. no, in fact, I get the standard library catalog heading in hebrew (e.g. http://aleph.nli.org.il/F/?func=direct&con_lng=heb&local_base=nnl01&doc_number=001251501 from Don Quixote (Q480)'s first NNL link). And when I click on the "English" button to go to the english interface, I get an English page that invites me to make a search, not the En version of the notice. But what I can clearly see is that there is a place set, and a date... which means it is an edition ^^
almost all NNL work ID (P3959) data were added by a single user @זאב קטן: (3614/3743 according to NavelGazer data), which means that if we can agree with them, it would be much easier...
is there someone who understands hebrew on the Books project, who could search the database, and maybe help us here ? --Hsarrazin (talk) 14:04, 7 November 2017 (UTC)
Look about halfway down the listings you mentioned (try the second, third, etc items). You'll see catalog information in English such as "Fiction--Translations into Hebrew". I just looked at the listings on Don Quixote (Q480) that you mentioned and the problem is clearly visible to me. I'm not sure why you're not seeing it, and no, you don't have to click on the "English" button to see the cataloging information. The items are standard in English, and are placed among the Hebrew catalog information. --EncycloPetey (talk) 14:54, 7 November 2017 (UTC)
@Nahum, Dovi: Are you or any of your heWS colleagues able to shed light on this topic?  — billinghurst sDrewth 22:01, 7 November 2017 (UTC)
I don't think the NNL has "works" as such in its database, only records of editions. I don't know what "NNL work ID" should link to. I believe the NNL ID property should link to the edition(s) that exist in the library, because that is the place where the information about the work is stored at this library. This at least has been my personal experience with their database.--Nahum (talk) 22:22, 7 November 2017 (UTC)
Then it sounds (and looks to me) as though what the NNL database has are records of its specific holdings, which would mean specific copies of books rather than works or editions. If that is indeed what it lists, then we have no framework yet for including them on Wikidata. You could argue that the listings could also be treated as editions (or translations), but those would require listing as separate data items rather than inclusion on the data item for the work. --EncycloPetey (talk) 04:08, 8 November 2017 (UTC)
Hello! I'm sorry, I only happened across this just now. I'm bilingual English/Hebrew, and very active in cataloging books with these systems. I'm also doing this in cooperation with the Israel National Library cataloging staff. so, I'll be glad to try to clarify anything, as well as I can.
For now, I'll just say that National Library of Israel ID (P949) is parallel to: VIAF ID (P214) while NNL work ID (P3959) is parallel to: LCOC LCCN (bibliographic) (P1144). "authority records" for works are at- National Library of Israel ID (P949). -- Shilonite (talk) 11:22, 13 December 2017 (UTC)
Every use of NNL work ID (P3959) I've come across is using it as parallel to VIAF ID (P214), which (as you say) is incorrect. The name is also confusing, since "work" has a specific meaning in WikiProject_Books --EncycloPetey (talk) 14:20, 15 December 2017 (UTC)
I'm sorry I didn't get back to you till now (i'm not very well...). I understand what you are saying, but please, so we would both ‘be on the same page’, try to examine the cataloging that I have done, and see if it agrees with your method.
If it does, and I understand that we are both working by the same method, then I will go back and correct whatever you find erroneous. If not please point out the differences to me.
For example, what's your opinion about this: https://www.wikidata.org/w/index.php?title=Q20278655&diff=next&oldid=600585790 . ...and in general, what I've done with that book. I'm asking to learn, if perhaps I've misunderstood.
thank you, Shilonite (talk) 11:42, 4 January 2018 (UTC)
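As a rough illustration of the check described earlier in this thread (a subject heading ending in "--Translations into Hebrew" marks a record for a translation, not an authority record for the original work), here is a minimal Python sketch. The record structure is an assumption made for the example; it is not the NNL catalogue's actual export format, and the helper name is hypothetical.

```python
# Hypothetical helper: the list-of-headings input is an assumption,
# not the NNL catalogue's real data format.
TRANSLATION_MARKER = "--translations into hebrew"


def looks_like_translation_record(subject_headings):
    """Return True if any heading marks the record as a Hebrew translation."""
    return any(TRANSLATION_MARKER in heading.lower() for heading in subject_headings)


# Heading style as quoted in the thread; the second list is an invented counter-example.
print(looks_like_translation_record(["Fiction--Translations into Hebrew"]))
print(looks_like_translation_record(["Fiction"]))
```

A check like this could only flag candidates for manual review; it says nothing about whether the catalogue holds work-level records at all.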

Distinguishing the types of digital text

In book scholarship there are crucial differences between types of editions, serving different scholarly purposes. See for example these course notes. To take a practical example, a Project Gutenberg edition of a book gives you the electronic text but does not show you what the pages looked like, whereas Archive.org or Google Books usually provide visual scans of the book (plus uncorrected OCR). Which you want to use depends on whether your interest is in readable text or in the exact layout and typography (or handwriting) of the original. full work available at (P953) is not specific enough to tell the user what the link will provide, so we need to express these different types. One list given to me by an academic has these core types:

These could be used in instance of (P31) statements, but my interest is in using them to distinguish the different digital versions of a text in the Wikidata entry for an edition. It seems like the most appropriate way to do this is with the qualifier object has role (P3831). See Political Disquisitions (1775 edition) (Q42788256) for an example, where I represent that all 3 volumes of this edition of this book are available from the Oxford Text Archive in a detailed text edition while page scans of individual volumes are available from other sources. I'm feeling my way here, so I welcome other perspectives and improvements. MartinPoulter (talk) 15:37, 7 November 2017 (UTC)

@MartinPoulter: So I don't presume or misread ... can you explain more about how you see these being used? Are you looking to use these as qualifiers for full work, or are you looking to use these as the base instance of (P31)? In the classifications in your notes above there are various types of new editions, and secondary, or maybe tertiary, reproductions.

Then I suppose I am trying to figure out how/where reproductions like enWS's and Gutenberg's works would fall. I have considered them as belonging to the edition published at the time, though they are really becoming editions of their own; does that then mean we shouldn't link them to those editions, as they are their own derivatives? [We so need professional guidance]  — billinghurst sDrewth 22:36, 7 November 2017 (UTC)

Hmm, now you talk of "digital text": how are we seeing that as different from version (Q3331189)?  — billinghurst sDrewth 22:41, 7 November 2017 (UTC)
@billinghurst: For my purposes, I'm looking to apply facsimile (Q194070) and no label (Q42794047) as qualifiers for full work available at (P953), though I could imagine annotated edition (Q4769619) and eclectic edition (Q42793760) being in use for instance of (P31) (and possibly there could be exceptions both ways). I'm not proposing to create separate entries for, say, the Oxford Text Archive transcription of an edition. As I understand it, that properly belongs as a full work available at (P953) property in the entry for that edition. I'm looking to tag the full work available at (P953) property in a way that tells the user whether they will get the faithful text of the book or faithful scans of the book. Maybe the "edition" terminology is confusing when used this way, but this is the terminology of academic bibliography. Hope this clarifies. MartinPoulter (talk) 20:48, 14 November 2017 (UTC)
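To make the proposal concrete, here is a small Python sketch that builds a QuickStatements-style line attaching object has role (P3831) as a qualifier to a full work available at (P953) statement. It assumes the tab-separated QuickStatements v1 layout (item, property, value, then qualifier property and value) with the URL given as a quoted string; the URL is a placeholder, not a real scan location, and the function name is hypothetical.

```python
def quickstatements_row(item, url, role):
    """Build one tab-separated line: item, P953, quoted URL, qualifier P3831, role item."""
    return "\t".join([item, "P953", f'"{url}"', "P3831", role])


# Q42788256 and Q194070 (facsimile) are the items named in this thread;
# the example.org URL is a placeholder for illustration only.
row = quickstatements_row("Q42788256", "https://example.org/scan-vol-1", "Q194070")
print(row)
```

The point of the qualifier is that a consumer can filter P953 links by role, for example keeping only facsimiles when page layout matters.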

Are we conflating editions and translations; or are we missing translations as their own works?

To me when we have the creative work (Q17537576) or the variations that we use, we then have edition or translation of (P629). Are we right to combine editions and translations? If we have a translation by an author it gets its own copyright for the translator, and to me that makes it its own creative work (Q17537576) of which then there can be their own editions. I know for the work of Anton Chekhov (Q5685) that enWS has the same works by different translators, so at the work level, are we right to group all the Russian language editions, and the variety of different language translations under the one Property:P629?  — billinghurst sDrewth 22:51, 7 November 2017 (UTC)

technically, a translation is not a creative work (Q17537576), it is a derivative work (Q836950). --Hsarrazin (talk) 19:15, 8 November 2017 (UTC)
What is the gain ? We can always try to make the system more complex, but for what purpose ? And when I read about the difficulties some contributors have creating work and edition items, I think the creation of an additional work item for the translations will just be a nightmare (one item for the translation as edition, one item for the translated work and one item for the original work). Snipre (talk) 01:04, 8 November 2017 (UTC)
The gain is clarity, consistency, and a disentanglement of conflated concepts that are actually very different. An editor of an edition does a very different job from a translator, and the results of the two processes require different kinds of information. Stuffing them together into a single block has resulted in all sorts of editing headaches. --EncycloPetey (talk) 01:27, 8 November 2017 (UTC)
I understand the conundrum, I have hesitated to post over this matter for months. I am more trying to have an open conversation and deciding to do nothing with our eyes wide open, rather than having to unpick a situation with "Why didn't I say something earlier". In the whole conversations as they have persisted, we have the issue of the conceptual idea (creative work/work/...), to the manifestation/output (edition/book/...). We will continue to be caught by this until we do a far better job explaining this matter.

At the Wikisources we are governed by public domain/free licences, we list at the conceptual level, and reproduce at the manifestation(s). So when we have translations we need to explain the concept of dual licenses. At this point in time we manage all the data and manually apply licenses, though that is not the best way to undertake the curation, especially when it is common data across WP/WS/Commons. Ultimately we should be able to suitably licence translations according to the concept of author and translator irrespective of edition, and one day it will be fed from Wikidata. [And I am probably doing a shithouse job of describing as I need a whiteboard and a marker and to draw pictures, supported by hand-waving, rather than explanation.]  — billinghurst sDrewth 03:54, 8 November 2017 (UTC)

@EncycloPetey: Please provide an example how the distinction of editions in original language and translations will help to clarify the situation: just write the relations between the editions in original languages, the translations and the corresponding work items. I did the job with the current system and I will be happy to compare with your simplified system. Snipre (talk) 17:11, 8 November 2017 (UTC)
I don't understand the symbolic language you have used to describe the relations. Please convert your model into prose or some other understandable form, if you would like me to assess it. --EncycloPetey (talk) 17:14, 8 November 2017 (UTC)
@EncycloPetey: You don't need to understand my model, you need to describe once your model, using words, graphics or what is relevant. But try once to put your ideas on the paper and SHOW where the simplicity is. Snipre (talk) 20:39, 27 November 2017 (UTC)
@billinghurst: Ok, you indicate the possible gain, even if I don't see clearly what prevents you now from doing what you want to do, but I would like to see the cost of that model modification in terms of item relations: can you show us how we would have to link the different edition, translation and work items with your model ? We don't need discussions, we need diagrams to be able to validate a model and that's what is missing now. An ontology follows mathematical rules so discussions are useless: tables, diagrams, systematic descriptions of relations, that's what is important.
You mention some automatic addition of license values to items, this implies bots so you should convert your idea in some programming language. Snipre (talk) 17:11, 8 November 2017 (UTC)
the problem of translations is one of the reasons why FRBR uses 3 levels to describe books (+1 for exemplars, which is not our problem). The need for an intermediate state between the original work and the edition... but the modelling on wikidata seems really difficult, and I'm not sure the linking of editions to the original work through a "translation" level would allow the retrieval of info like "date of creation of the original work" from the edition item. :/
moreover, like billinghurst says, it's already a very difficult task to explain on wikisource how the 2-level model works… if we have to apply a 3-level model, it will be a nightmare
and, if it could probably be achieved for books (with a lot of difficulties), it would be absolute hell for poems and short texts... --Hsarrazin (talk) 19:15, 8 November 2017 (UTC)

First hack at some cases

  • A1Y1: a translation of work A1 of author X1 by translator Y1 ... language detail
  • A1Y2: a translation of work A1 of author X1 by translator Y2 ... language detail
  • A2Y1: a translation of work A2 of author X2 by translator Y1 ... language detail
  • A2Y3: a translation of work A2 of author X2 by translator Y3 ... language detail

manifestations of these cases each roll into the edition model thereafter, they are just editions (and editions of the translation)

So A1 has editions in the same language or translations into other languages. A1 does not have editions in other languages except via the translations.

the why

So we need a means to identify the one translation of a work, then the variety of places that it appears. Please feel perfectly entitled to update this for clarity. If I can get time at a whiteboard, then I will.  — billinghurst sDrewth 21:22, 8 November 2017 (UTC)
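One way to picture the cases above is a three-level structure in which an edition points either at a work (same-language edition) or at a translation. The Python sketch below is only an illustration of that idea, not an agreed model: the class names, languages and years are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Work:
    label: str
    author: str
    language: str


@dataclass
class Translation:
    of: Work
    translator: str
    language: str


@dataclass
class Edition:
    of: object  # a Work (same-language edition) or a Translation
    year: int


# A1 by X1, translated by Y1 and by Y2; languages and years are invented.
a1 = Work("A1", "X1", "ru")
a1y1 = Translation(a1, "Y1", "en")
a1y2 = Translation(a1, "Y2", "en")
editions = [Edition(a1, 1900), Edition(a1y1, 1910), Edition(a1y1, 1925), Edition(a1y2, 1930)]


def editions_of(target):
    """Editions attached directly to one work or one translation (identity match)."""
    return [e for e in editions if e.of is target]


print(len(editions_of(a1y1)))  # editions of the single translation A1Y1
```

Here A1 only reaches other languages through its translations, and each translation can be identified once and then traced to every place it appears.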

  • Oppose having a new distinct property for translation without at least one good reason; I don't see the problem with the current uses of edition or translation of (P629). Moreover, as @Hsarrazin: pointed out, there are some over-simplifications in the initial statements; a translation doesn't really have its « own copyright » (see derivative work (Q836950)) and, on the other hand, an edition can also be considered a derivative work (Q836950) with protection of its own. More importantly, FRBR doesn't care about copyright to distinguish the levels, nor should we. Cdlt, VIGNERON (talk) 09:12, 9 November 2017 (UTC)
    Comment I understand that a translation is a derivative work; even so, it does have its own copyright as a creative work. Many pages around explain this, e.g. http://bookwormtranslations.com/copyright-law-and-translation-what-you-need-to-know/  — billinghurst sDrewth 11:16, 9 November 2017 (UTC)
    Well, as always with laws, it's complicated; translations don't really have their « own copyright » but they have « some copyright of their own » (as such, the translator is not the sole author of the translation but just the co-author with the author of the original work, and depending on the country the translator can have fewer rights over his translation than the original author).
    But anyway, I don't see why and how copyright intervenes here. Translations are very specific editions, but they are still editions (and there are editions far stranger than translations: should we have a different property when an editor transforms a poem in verse into a poem in prose? and vice versa? or when other significant changes are made to the original work? in some extreme cases, the best option is just to consider that the modifications are so important that this is an entirely new work, for instance no label (Q548338) with Iliad (Q8275)).
    Cdlt, VIGNERON (talk) 12:03, 9 November 2017 (UTC)
A translation can definitely be a new work ("FRBR-style"), because as @Nonoranonqui: patiently explained to me the fundamental discrimination between works is the "Authorial responsibility"... So a translation is both a new work, and it's based on/derived from/a translation of another one. But we probably don't need a new property: we can create an item for a translation, and use
  1. edition or translation of (P629)
  2. translator (P655)
  3. based on (P144)
If I'm not mistaken, these 3 properties give us what we need for understanding the relationship between a book and its translation. A query could look at authors, languages and what not to understand everything. I'm not a very good wikidatian, but if properties are simple and clear it is better for everyone: we still have queries for complex relations between items. 80.181.62.189 16:46, 13 November 2017 (UTC)
Except it doesn't. Where a translation has multiple editions of its own, this model fails or is corrupted.  — billinghurst sDrewth 21:22, 13 November 2017 (UTC)
I fear that this part of the problem has no solution, from a theoretical point of view. A good translation is both a work and an edition, even for librarians. It's like the wave-particle issue in physics: it's both, depending on how you look at it, what are your needs. Wikidata works with item, which should be "unique". But books don't work that way. So we have to deal with the ambiguity of what we need. I suggest everyone to read this very good free book from @Kcoyle:, she's a great librarian and information professional, and also she's one of us ;-) Aubrey (talk) 09:50, 17 November 2017 (UTC)
@Aubrey: Wrong, there is no obvious solution but we can define a solution with some advantages/disadvantages. We just need to have a logic solution which can be handled by any programming language like SPARQL. Snipre (talk) 15:15, 29 November 2017 (UTC)
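For readers who want to see what "handled by SPARQL" could look like under the current two-level model, here is a hedged Python sketch that builds a query for items linked to a work through edition or translation of (P629) and carrying a translator (P655). It assumes the `wd:` and `wdt:` prefixes predefined by the Wikidata Query Service; the function name is hypothetical, and Q480 (Don Quixote) is used only because it is mentioned elsewhere on this page.

```python
def translations_query(work_qid):
    """Return a SPARQL query listing translated editions of one work."""
    return f"""SELECT ?item ?translator ?language WHERE {{
  ?item wdt:P629 wd:{work_qid} ;
        wdt:P655 ?translator .
  OPTIONAL {{ ?item wdt:P407 ?language . }}
}}"""


query = translations_query("Q480")
print(query)
```

Note the limitation raised above: a query like this returns every item with a translator, so several editions of the same translation appear as separate rows with nothing tying them together beyond the translator value.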
@billinghurst: I think we need to distinguish 2 different problems:
Take the case
E1, an edition of work W1 with author X1 in language L1
E2, an edition of work W2 with author X2 in language L1
T1, a translation of edition E1 by translator X3 in language L2
T2, a translation of edition E2 by translator X4 in language L2
If an editor decides to create a new book containing T1 and T2 as
E3, an edition of work W3 by editor X5 containing T1 and T2
There is no problem to create new items for E3 and W3 if we consider that collecting different works is a kind of new work. The Wikidata model is able to handle that situation.
The second problem is to link E1 and W1 to T1 and T2.
To be correct, the information about the fact that T1 and T2 are parts of E1/W1 has to be integrated in W1 and not in E1. Then we have to answer the question: can we accept the following relations
* W1 has part T1
* W1 has part T2 ? Snipre (talk) 23:55, 18 November 2017 (UTC)
I still don't understand the distinction between edition and translation, and even less the need for a distinction.
I'm not sure I understand the case you present here either; do you have a concrete example? For W1 has part T1, T2, there are already some cases, see this query. Is it what you were thinking about?
Cdlt, VIGNERON (talk) 08:53, 22 November 2017 (UTC)
The use of based on (P144) to link a translation to the document used as the source text for the translation is not the best choice: some books, like this one, are a translation of this one, which is based on the game Mass Effect (Q275960). So based on (P144) could be used twice on the same item, once for the translation relation and once for the topic relation. Better to avoid that situation. Snipre (talk) 15:15, 29 November 2017 (UTC)

Decameron

Does anyone know of a Linked Open Data dataset for the stories in Boccaccio's Decameron? We have articles on a couple of stories (no label (Q18600581), no label (Q26710491)) but no structure for the days and the stories for each day that I can find. It would be nice not to have to do this from scratch. - PKM (talk) 20:38, 13 November 2017 (UTC)

  • Agree. Somehow I was exhausted after I 1. ;) I got better with QuickStatements in the meantime, so we could try together.
    --- Jura 20:44, 13 November 2017 (UTC)
To start: brigata (Q43256358), days.
--- Jura 16:07, 17 November 2017 (UTC)
Oh excellent! I can add some references to these. - PKM (talk) 21:04, 19 November 2017 (UTC)
@Jura1: Wow, I have realized just how much deep structure you built here! I am stunned.
I have added novella (Q43334491): short prose tale popular in Renaissance Italy, progenitor of the short story <different from> the modern genre novella (Q149537): written, fictional, prose narrative normally longer than a short story but shorter than a novel, and made novella in the Decameron (Q43303440) a subclass. Much more work to do as time permits. Onward! - PKM (talk) 21:54, 19 November 2017 (UTC)
@PKM: I started a list at Decameron editions and translations and included what I found at enwiki/wikisource. Maybe it's possible to give it a reasonable coverage.
--- Jura 12:29, 24 November 2017 (UTC)
@Jura1: thank you for this page and thank you for creating items about editions and translations. As you've seen, I've made some corrections to fit the model of WikiProject Books; you reverted me but I see no reason not to use the model of WikiProject Books (especially as there is another discussion, which is leaning more toward keeping the current model). Cdlt, VIGNERON (talk) 14:16, 24 November 2017 (UTC)
It seems consistent with the current model, except maybe that the manuscripts should use "exemplar of" and not "edition of". I don't mind if you change that. I noticed that some of items used the wrong "translation" item, thus the constraint violations. It's fixed now.
--- Jura 14:20, 24 November 2017 (UTC)
I see many points not respecting the model: there was an edition of an edition (but edition or translation of (P629) is not transitive; corrected now), there was a wrong instance of (P31) (thank you for fixing it), there are still several constraint violations (for manuscripts but not only; identifiers too, e.g. something is wrong on Q16438#P1256) and in the end, there is a lot of missing information and some wrong information (like Q16438#P577: the property should be inception (P571) and the values should be better indicated, more precise and referenced with a better source, like the entry in the Treccani). Cdlt, VIGNERON (talk) 14:37, 24 November 2017 (UTC)
Q16438 isn't even on the list. The source at enwiki I was mentioning is at w:The_Decameron#Translations_into_English. It should be possible to find the same information in Wikidata. Other languages have similar lists.
--- Jura 14:54, 24 November 2017 (UTC)
Q16438 is the work, it's not on the list but it's the more important item of this list.
And please, learn how to use edition or translation of (P629) and has edition (P747) as it was intended (between a work and an edition, never between two editions ; more information on Wikidata:WikiProject Books and on en:Functional Requirements for Bibliographic Records).
Cdlt, VIGNERON (talk) 15:39, 24 November 2017 (UTC)
No problem. I thought you were trying to present some argument and reference about the items on the list you were breaking. Yes, I think we all agree that Wikidata isn't complete yet and you are obviously invited to contribute. A list of French translations could be interesting ..
--- Jura 15:56, 24 November 2017 (UTC)

Additional properties

Why aren't these properties used at all: country of origin (P495) and after a work by (P1877)? --Infovarius (talk) 10:35, 16 November 2017 (UTC)

Hi, Infovarius
AFAIK, after a work by (P1877) is more for artworks (like an etching after a work by (P1877) an original painting)... for books, I'd probably use based on (P144) or inspired by (P941) - these should be applied on the work item, of course, not on the edition.
as for country of origin (P495), what is the point of giving a country of origin ? the work has an author, and a language ; the country in which the author lived at the time is not necessarily the origin of the work (see Voltaire (Q9068)'s works, written in French, but written in Prussia, and published in Prussia (because of French censorship))... should they have Kingdom of Prussia (Q27306) as country of origin (P495) ? this seems rather inadequate.
on version (Q3331189) items there is already place of publication (P291) - why would you add country of origin (P495) ? --Hsarrazin (talk) 11:17, 16 November 2017 (UTC)
after a work by (P1877) has a different sense from based on (P144) or inspired by (P941) - it has value "person" not "work".
I understand difficulties like with Voltaire (Q9068). But what if everything is unambiguous: the work has been created and first published in one country, which is also the author's country of citizenship. Why not mark this in the work item? --Infovarius (talk) 16:45, 17 November 2017 (UTC)
Have you referred to the creation proposal Wikidata:Property_proposal/Archive/31#P1877 ?  — billinghurst sDrewth 06:27, 18 November 2017 (UTC)
@Infovarius: do you have an example for after a work by (P1877)? based on (P144) and inspired by (P941) seems more than enough to me in all cases I can think of (if it is really after *a* work by a person it seems more accurate to directly link to this work instead of the person, plus see the hijacking of after a work by (P1877) which was not at all intended to be used in that way :/ ).
For country of origin (P495), I don't see the need: there are plenty of ways to find where a book comes from (directly with a property like place of publication (P291), which is far more intuitive and easier to reference, or through the author(s)'s data). Is there a case where the value in country of origin (P495) would be different from the value in place of publication (P291)?
Cdlt, VIGNERON (talk) 10:05, 22 November 2017 (UTC)

I was checking: of the 94,017 items with instance of (P31) = book (Q571), 34,025 (36 %) have a country of origin (P495). Maybe it should be accepted on works, as place of publication (P291) is only for editions. It's redundant (which is bad in itself) but it would make it easier to do queries and other stuff (like using the redundancy to check the consistency). Cdlt, VIGNERON (talk) 14:43, 24 November 2017 (UTC)
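The 36 % figure and the idea of using the redundancy as a consistency check can be sketched in a few lines of Python. The counts are the ones quoted in the comment above; the consistency function is hypothetical and assumes that each edition's place of publication (P291) has already been resolved to a country, which is a simplification. The country names in the usage example echo the Voltaire case discussed earlier and are not real query results.

```python
books_total = 94_017   # items with instance of (P31) = book (Q571), as quoted above
with_country = 34_025  # of those, items that also have country of origin (P495)

share = with_country / books_total
print(f"{share:.1%}")


def is_inconsistent(work_country, edition_countries):
    """True when the work's country matches none of its editions' publication countries.

    Assumes edition_countries were derived from P291 beforehand (a simplification).
    """
    return bool(edition_countries) and work_country not in edition_countries


# Invented values, for illustration only.
print(is_inconsistent("France", {"Kingdom of Prussia"}))
```

A mismatch flagged this way is not necessarily an error, as the Voltaire example shows; it only marks items worth a second look.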

Language property

I am completely confused which property (language of work or name (P407) or original language of work (P364)) should be used for books, works and films and which is deprecated and will be deleted. User:Pasleim deletes P407 statements, sometimes deletes P364, User:VIGNERON deletes P364. Can you come to an agreement and explain to others? --Infovarius (talk) 20:35, 23 November 2017 (UTC)

original language of work (P364) is deprecated and in the process of deletion (for several months now, it's even written in the original language of work (P364) description) as it was meaningless most of the time (for multiple reasons, but thanks to the FRBR model). For information, I deleted original language of work (P364) only on items about 'edition' *and* only when there was already a language of work or name (P407) with the exact same value (about 200 items IIRC). So globally, never use original language of work (P364) and always use language of work or name (P407). The first removal you cite was an obvious mistake. Cdlt, VIGNERON (talk) 21:12, 23 November 2017 (UTC)
Consensus was reached on WD:PFD to merge original language of work (P364) into language of work or name (P407). However, members of the WikiProject Movies insist on keeping both properties for movies. If you think this is confusing, your comment is highly appreciated on Wikidata:Properties for deletion#Closure of stale thread. --Pasleim (talk) 08:30, 24 November 2017 (UTC)
@Jura1: what are you talking about? The plan is quite clear and logic, see Wikidata:WikiProject Books. And AFAIK, information is not lost (at least not by me, I checked that the information was already there before deleting the deprecated property). Cdlt, VIGNERON (talk) 09:17, 24 November 2017 (UTC)
For items like Les Débuts littéraires de Thingum Bob (Q17352560), there were at least two clues that it is an edition: 1. not in a language spoken by the author and 2. link to Wikisource. I improved the items (which weren't following the plan at all, so it is illogical to use this item as an example of alleged failure of the plan); I think it's clear now.
Cdlt, VIGNERON (talk) 09:17, 24 November 2017 (UTC)
We were looking for a conversion plan. Not that it matters now, we already lost the information in relation to books.
https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 was correct when it was created/edited, but the change of the property on other items made us lose the information that it was just the language of the edition. The same probably applies to all similar items. You will probably need to find a new source to rebuild the information.
--- Jura 09:27, 24 November 2017 (UTC)
https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 was wrong since the beginning. It had a sitelink to Wikisource but was instance of a work. --Pasleim (talk) 09:45, 24 November 2017 (UTC)
Somehow I got the impression the contributor who made it is an expert in the field. So if the approach isn't clear to them, it's unlikely to scale well.
--- Jura 09:54, 24 November 2017 (UTC)
@Jura1: again: what on Earth are you talking about? this edit is clearly and entirely correct, what is wrong with it? (besides the obvious missing properties on the same item but that's beside the point, the item is better after this addition ; and why are you even mentioning it? it's very loosely related to the problem here). What information is lost exactly? For the conversion plan, it's quite easy: delete all original language of work (P364) and replace them by language of work or name (P407) with the same value (and in bonus: check the instance of (P31) and other properties like edition or translation of (P629) and country calling code (P474)). Cdlt, VIGNERON (talk) 10:11, 24 November 2017 (UTC)
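The conversion step described in the comment above (replace every original language of work (P364) value with a language of work or name (P407) statement carrying the same value, without duplicating values that are already there) can be sketched as follows. This is a toy model: an item is represented as a plain dictionary of property IDs to value lists, which is an assumption for illustration and not the Wikibase JSON format, and the function name is hypothetical.

```python
def convert_language_claims(claims):
    """Return a copy of the claims with P364 values merged into P407 and P364 removed."""
    result = {prop: list(values) for prop, values in claims.items()}
    old_values = result.pop("P364", [])
    if old_values:
        merged = result.setdefault("P407", [])
        for value in old_values:
            if value not in merged:  # keep existing P407 values, avoid duplicates
                merged.append(value)
    return result


# Illustrative item: an edition (Q3331189) in French (Q150) carrying both properties.
item = {"P31": ["Q3331189"], "P364": ["Q150"], "P407": ["Q150"]}
print(convert_language_claims(item))
```

The sketch deliberately does nothing about instance of (P31) or edition or translation of (P629); whether the item is really a work or an edition still has to be checked separately, which is the point under dispute in this thread.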
I think you are confusing things. Pasleim is stating that the item was wrong to begin with. At least you seem to be satisfied with the approach that seems to be applied for books.
--- Jura 10:16, 24 November 2017 (UTC)
I don't think I'm confusing things but clearly I'm confused by you. The item Q17352560 was wrong in the beginning as it was empty and missing a lot of properties and the instance of (P31) was too general (but reminder: it was created back in 2014). In 2017, @Hsarrazin: added a language of work or name (P407) and it was a good thing. The only « mistake » (but can we really call it that way?) is that she didn't add other properties or correct the P31, but the edit in itself was good. In the end, none of that really matters as original language of work (P364) is not at all involved here.
Can we move on and use a more relevant example? For instance no label (Q19157120) and the P364 deletion I made two days ago. Is there anything you consider as lost here? and why? (I don't see any loss but maybe I'm missing something). If not, do you have an explicit example?
Cdlt, VIGNERON (talk) 10:25, 24 November 2017 (UTC)
Apparently Pasleim and you disagree on https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 . I'm not sure what I can add to help you with this. I'm not aware that an edit is or can be considered incorrect if one doesn't add more statements.
--- Jura 10:31, 24 November 2017 (UTC)
I actually fully agree with VIGNERON. P31 needed to be corrected which wasn't done till today, but this correction should have happened independently of the language property merge. --Pasleim (talk) 10:37, 24 November 2017 (UTC)
Well, P31 on https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 being too general and being wrong aren't really the same thing. It's not independent of the language properties (original language of work and language of edition), because the use of P407 made it clear that it may not have been the original language of the written work. Once all written works just use language of edition, it's no longer clear.
--- Jura 10:43, 24 November 2017 (UTC)
It's maybe not entirely independent (everything is connected in this Universe) and indeed on this particular example it had to be corrected/completed but now, it seems good to me for this item. Do you have other example where it's unclear or where allegedly « information was lost »? PS: language of work or name (P407) is *not* « language of edition ». Cdlt, VIGNERON (talk) 10:53, 24 November 2017 (UTC)
Ideally, the conversion plan would have taken care of such problems. As Pasleim is doing it, maybe he can detail.
--- Jura 11:00, 24 November 2017 (UTC)
I'm not the only user who developed a conversion plan and I'm not the only user doing the conversion (btw, I'm not even a member of this WikiProject).
I think you had too high expectations of the conversion. language of work or name (P407) was not and is not "language of edition". Whether an item is an edition or a work is defined by P31, P279, sitelinks and external identifiers. If these values were set wrong, they are still wrong now after conversion, but it didn't lead to any information loss. --Pasleim (talk) 11:22, 24 November 2017 (UTC)
So if P364 is going to be deprecated, I don't understand why User:Pasleim is massively deleting P407 in favor of P364? --Infovarius (talk) 15:27, 24 November 2017 (UTC)
The current rules of WikiProject Movies say to use P364 for movies, therefore I remove P407 in cases where it is redundant to P364. Whether P364 is going to be deprecated depends on whether or not users accept community consensus. --Pasleim (talk) 18:38, 24 November 2017 (UTC)
We reached consensus to deprecate P364, but we never reached a consensus about the manner in which it would be deprecated, or how the data currently in the property would be handled. The most obvious problem is that the property is explicitly for language of a work, and not for language of editions, nor does the process of deprecation attend to the issue of marking source languages for translations or editions of translations, which is not always the same as the language of the ultimate source (=work). Nor does it solve the problem that works themselves do not have a language; only individual editions / copies will have a language. A "work" refers to the creative piece independently of any specific copy. --EncycloPetey (talk) 05:09, 25 November 2017 (UTC)
Why do you think that work doesn't have a language?? Usually literary works are created in one and only language which is the language. And editions in other languages are just translations from the original language (translation itself can be regarded as creation of new creative work, in different language). Infovarius (talk) 19:51, 29 November 2017 (UTC)
Sorry ? of course all works (textual works obviously) must have a language. But it is language of work or name (P407) not original language of work (P364). --Hsarrazin (talk) 20:00, 29 November 2017 (UTC)
@Infovarius: But the work data item is for the work as a whole, meaning every edition and not any specific edition. A work can appear in any language to which it is edited or translated. We have chosen to eliminate the "original language" property, and now have no means of indicating the original language unless there is a "first edition". This itself is a problematic issue, as some works have no known first edition, and some have a first serialized edition that predates the first bound (book) edition, etc.
We also have no property for marking "language of edition" or "language of translation". We only have a property for "language of work". --EncycloPetey (talk) 02:06, 30 November 2017 (UTC)
Maybe I am not understanding, but if we are talking about the "work" the language P407 is used, and it replaces P364. Editions have P407, and have no requirement for original language as you refer back to the work.  — billinghurst sDrewth 08:50, 30 November 2017 (UTC)
Billinghurst: Why would an edition be marked with language of work or name (P407), since that property is explicitly for the language of the work? Editions are not works.
Also, how do we mark the source language for translations and for editions of translations? We currently have no logical means of doing that. Yes, there can be pointers back to a "work", but for translations we cannot agree on whether the translation is an "edition" or is a "work" and needs its own edition data items.
And, yes, current practice puts language of work or name (P407) on works, but that makes no logical sense. Language is not a property of a work; it is a property of an edition. The language can differ in various translations/editions, so it is not a property native to the work. An item's properties must be invariant, or they are not properties of that data item. "Author" is a property of a work, because a work will always have that author, and this is why we do not replicate the author information on all the data items for the editions. But the date of publication varies with every edition, so we do not put "date of publication" on the work item, but rather on the individual items for each edition. The work instead gets a "date of first publication", or no date at all. The "language" property is in the same category as "date"; it varies with editions/translations, and is not inherent to the work. Yes, a work has an original language of composition, but we've decided to eliminate that property. --EncycloPetey (talk) 14:35, 30 November 2017 (UTC)
language of work or name (P407) on a work is the language in which it was originally composed by the author... How can you say that it makes no sense to put a language on a work... the language is intrinsic to the work... this way, when an edition is in the same language, it means it was not translated, whereas when it is different it means it is a translation... Work notice at Bnf (for ex.)
what characterizes a work is :
  1. an author,
  2. a title (sometimes conventional),
  3. a language,
  4. a date of creation.
Without a language, how can you say that Shakespeare wrote in English, Molière in French or Goethe in German ? --Hsarrazin (talk) 15:14, 30 November 2017 (UTC)
Pretty much my point of view. I would even take it a step further and say that all language belongs on a work, not on edition. Though to do that I have to go back to my argument that each translation is a work too. Any edition of a work, or of a translation, has to be in the same language of its respective parent.  — billinghurst sDrewth 17:02, 30 November 2017 (UTC)
I may be wrong but I think we are mixing very different definitions and senses of the word work here. The sense of work in frbr (that I will write down workfrbr) is very narrow. Editions (and by extension translations, which are expressionsfrbr that we defined to be equivalent to editionwikidata) are not workfrbr but they are work. When P407 says work, I believe this is lato sensu, not stricto sensu. billinghurst: I hear your argument but I feel this is unnecessary, or at least I don't see the need (and meanwhile, I see a lot of potential trouble, especially as languages are not always clearly delimited: one can argue that Shakespeare and Molière were not writing in English or French but in Early Modern English (Q1472196) and Classical French (Q3100376)). Cdlt, VIGNERON (talk) 17:20, 30 November 2017 (UTC)

Ancient Greek works[edit]

@billinghurst, Hsarrazin, VIGNERON, Snipre: There are currently 13 items left which use both P364 and P407.

SELECT ?item ?itemLabel WHERE {
  ?item wdt:P364 []; wdt:P407 [] .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

EncycloPetey and I disagree on how to apply the guidelines on the front page of this WikiProject. Can one of you help us with these items? --Pasleim (talk) 09:34, 30 November 2017 (UTC)

@EncycloPetey, Pasleim: what is the disagreement exactly? Didn't we all agree that original language of work (P364) is deprecated and shouldn't be used? (and if it weren't used for movies too, the property would probably have been deleted months ago)
This is maybe beside the point, but I've looked at the results and there is something wrong: The Comedies of Aristophanes (Q21286489) is indicated as a work in instance of (P31), but much of the data indicates this is in fact an edition (the 1853 edition by Hickie, according to the link to WS and the publication date (P577)). All the items seem to be in the same situation.
Cdlt, VIGNERON (talk) 11:22, 30 November 2017 (UTC)
IMO, original language of work (P364) should be removed from The Comedies of Aristophanes (Q21286489) but EncycloPetey reverted that change 4 times during the last months [1] [2] [3] [4]. I also think that it should be marked as edition or translation but also that edit was reverted by them [5]. --Pasleim (talk) 12:28, 30 November 2017 (UTC)
@VIGNERON: If you still don't understand the disagreement after months and months of discussion, then explaining it to you all over again isn't going to do any good, is it? Please look at all the previous discussion. There is a lot of it already.
Re: The Comedies of Aristophanes (Q21286489). If you believe this is an edition, then what is it an edition of? What is the work? --EncycloPetey (talk) 14:37, 30 November 2017 (UTC)
@EncycloPetey: I saw the discussions and was wondering if there was a new argument because I thought (wrongly apparently) we were over this, almost all original language of work (P364) has been removed, your 13 texts are the only ones left.
True, the case of an anthology is a bit strange, but since you use properties for editions, it seems better to say it's an edition (or subclass of edition). It should be checked, but an item for s:en:Comedies of Aristophanes would be good for the work, wouldn't it? The other way around, I can ask you almost the same question: « If you believe this is a work, then what is its edition ? ». And on other items like The House of Atreus (Q30349006) you add instance of (P31) = translation (Q7553), which is weird as translation (Q7553) is an action; you probably mean translation (Q39811647), which is a subclass of version (Q3331189). Cdlt, VIGNERON (talk) 14:56, 30 November 2017 (UTC)
Why would you think it was resolved? No consensus on what action to take was ever reached. We agreed to deprecate the one property, but never agreed on how to go about that or how to preserve the information it indicates.
Re: the anthology: huh? No, s:en:Comedies of Aristophanes would not be good for the work. That's a disambiguation page for multiple works that bear the same title. There is no work for this to come from. This is, as you say, an anthology composed entirely of components that are editions, but itself has no parent work to come from. As I keep saying, the data model we are using is both flawed and incomplete, so an appeal to that flawed and incomplete model is merely circular reasoning. --EncycloPetey (talk) 16:18, 30 November 2017 (UTC)
I thought it was resolved since all original language of work (P364) has been removed without trouble (except these 13 items). Is there anyone else except you who disagree?
Why not use s:en:Comedies of Aristophanes (and replace the s:en:disambig with s:en:Template:Versions, as it is closer to the second)? If we take the criteria given by Hsarrazin, all these anthologies have the same author, the same title, the same language, and the same date of creation. It seems to fit the bill. True, the content is different for each edition, but a work has no content so it doesn't really matter (and other properties are already there to make that explicit). Another solution is to create a specific work for each different version (a bit overkill, but it works too).
FRBR may not be perfect and there always will be tricky cases but this is the best system I know. Plus, it's quite well documented and this project agreed on to use FRBR. Do you know a better cataloguing system?
Cdlt, VIGNERON (talk) 17:04, 30 November 2017 (UTC)
@EncycloPetey: "...the data model we are using is both flawed and incomplete..." This is perhaps true but where are your contributions to solve that ? Perhaps it is time to act as a contributor and to propose a better model. Can we hope once to see contribution to a model ? Snipre (talk) 21:17, 30 November 2017 (UTC)
@Snipre: So, you agree that the data model is both flawed and incomplete, and are willing to explore change? This is the first indication that you or anyone else has made that change might be possible. If we can get more of the community to agree to this, then we will be able to proceed. Up to now, everyone has been pushing to spread the flaws. --EncycloPetey (talk) 21:21, 30 November 2017 (UTC)
@EncycloPetey: Read again what I said: "This is perhaps true that the data model we are using is both flawed and incomplete". But as you never show how to complete or improve the current model, how can I judge what is missing ? Until you propose something, and this is something I have been asking you to do for several weeks, the current model is the best we have and we have to use it in order to be coherent inside WD. I prefer to have a bad solution than only criticisms saying we can do better. Snipre (talk) 21:30, 30 November 2017 (UTC)
Since you're not willing to move forward or admit change, it's disingenuous to criticize others for not proposing solutions. If you're not going to implement the ideas of others, there's no reason to propose those ideas. --EncycloPetey (talk) 21:34, 30 November 2017 (UTC)
@EncycloPetey: Where do you read that I am not ready to change my mind ? Please link to one of my comments saying that. You are not being logical, because you asked people to change their mind BEFORE showing them any reasons to do it. We have a model which needs to be extended, but we have something real. And you, what do you have ? You never presented anything, so for me you have nothing to propose. WD is not a poker game, so put your cards on the table or leave the table. Snipre (talk) 21:54, 3 December 2017 (UTC)
  • Comment: I agree with Pasleim on these. We can call the original language by reference to the parent work rather than trying to replicate it in every edition. For that set of works, that smattering of the instance of (P31) is just getting ugly ... book/translation/edition. Compilations don't fit the model; I wonder how we go with a "Greatest Speeches of ..." compilation, it is going to shred your models. /me throws his hands into the air, and leaves it to the experts. I think I will stick with doing editions.  — billinghurst sDrewth 17:20, 30 November 2017 (UTC)
  • The correct way to handle a case like The Comedies of Aristophanes (Q21286489) is to consider it as a normal book: we need a work item and an edition item. The work item for this book will contain the links to other work item of the works composing the book
So having this case:
  • A1: a work of author X1 in language L1 represented in WD by QAAA
  • A2: a work of author X2 in language L2 represented in WD by QBBB
  • D1: a combined edition of A1 and A2 by translator X3 in language L3
To represent D1 in wikidata we need 2 items:
QXXX, Work item for D1 with the following statements:
QYYY, Edition item for D1 with the following statements:
Do you agree with the proposed model ? Snipre (talk) 21:17, 30 November 2017 (UTC)
@billinghurst, Hsarrazin, VIGNERON, Snipre, EncycloPetey, Pasleim: With my proposition we solve the question of original language of work (P364). Snipre (talk) 21:20, 30 November 2017 (UTC)
With which proposition is that? I see nothing that solves that questions currently under consideration. What we need are two new properties: (1) language of composition (or first performance, or first publication), and (2) language of edition (which might be adaptable from "language of work"). And for translations we need some third item to indicate language of source text, and/or the identity of the source text, from which the translation was prepared. --EncycloPetey (talk) 21:24, 30 November 2017 (UTC)
@EncycloPetey:Can't you use your capacity to infer ? This is one principle of database: use relations to deduce information not written. Example, if I said that all dogs are mammals and Floppy is a dog, can't you deduce that Floppy is a mammal even if I don't say it ?
So if you don't have the language of the original text in an item defining a translation, go to the item of the corresponding original text. The original language of D1 is the language of A1 and A2, so I have to extract the language value from QAAA and QBBB.
If I want to know the gender of the author of The Knights (Q1215817), why do I have to look in item Aristophanes (Q43353) and not in The Knights (Q1215817) ? Snipre (talk) 21:54, 30 November 2017 (UTC)
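The inference described above can be sketched as a lookup that walks from the edition item to its work item and then to the component works. This is only an illustration under stated assumptions: the statements on QXXX and QYYY are not spelled out in the proposal, so the use of P629 (edition to work) and P527 (work to component works) here is assumed, and items are modelled as hypothetical plain dicts with placeholder language values L1, L2 and L3.

```python
# Hypothetical in-memory items; property choices on QXXX/QYYY are assumptions.
ITEMS = {
    "QAAA": {"P407": ["L1"]},                  # work A1 in language L1
    "QBBB": {"P407": ["L2"]},                  # work A2 in language L2
    "QXXX": {"P527": ["QAAA", "QBBB"]},        # work item for D1, has parts A1 and A2
    "QYYY": {"P629": ["QXXX"], "P407": ["L3"]},  # edition item for D1 in language L3
}


def original_languages(edition_id, items):
    """Collect P407 values from the edition's work item and its component works."""
    languages = []
    for work_id in items[edition_id].get("P629", []):
        work = items[work_id]
        # Look at the work item itself first, then at each component work.
        for source_id in [work_id] + work.get("P527", []):
            for language in items[source_id].get("P407", []):
                if language not in languages:
                    languages.append(language)
    return languages


source_languages = original_languages("QYYY", ITEMS)
print(source_languages)
```

The lookup returns whatever is stored on the linked items, so it is only as reliable as those links.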
I'm sorry that you still can't understand the problem after all the discussion we've been through. Your analogy is flawed for all the reasons we've discussed elsewhere. If all dogs are mammals then that is an invariant quality. It does not change. Likewise, the gender of an author does not (usually) change, so there is no need to replicate it or mark it elsewhere because it is an invariant quality. But language of a work varies and does change, and it is context dependent upon the particular edition, translation, or performance of that work, so it is not an invariant property. Likewise "date" of a work depends upon the specific edition, translation, or performance. You cannot deduce anything when the values are inconstant.
And we've already been through the problem of identifying "original" texts. There is no means of marking that reliably. A translation of a text might be made from the "original", or it might be made from a derivative text in another language. --EncycloPetey (talk) 01:13, 1 December 2017 (UTC)
Please (again) don't mix work and workfrbr: work may have several languages (and even that is a bit dubious to me) but workfrbr clearly always has only one language (the one inside the head of the author). For translation (Q7553) vs. no label (Q23808533), this is a different and separate matter (already discussed in multiple sections of this page). Whether an edition of Hamlet in German has been translated from one of the first editions in Early Modern English or from a different edition in English or French doesn't change the fact that Shakespeare was thinking his workfrbr in Early Modern English or that the first editions were in that language; this is clearly invariant. Cdlt, VIGNERON (talk) 14:27, 1 December 2017 (UTC)
Edit: my mistake, in FRBR, workfrbr has no language, this is a property of expressionfrbr only. Cdlt, VIGNERON (talk) 14:48, 1 December 2017 (UTC)
@Snipre: it seems good to me and it seems to be more or less what the FRBR recommends : FRBR 2008 (look on pages 30, 67 and passim, could you take a look and confirm if it fits or not?). Cdlt, VIGNERON (talk) 14:27, 1 December 2017 (UTC)
  • Comment: If P407 by itself isn't sufficient to express the information, it probably needs qualifiers. If such cases are rare, this might scale. If they are frequent, a solution with a dedicated property might be needed. I don't see how it helps us determine the meaning of a statement if we just say that one solution is "correct" or "what mother recommends implicitly". Once we have a solution, one can try to determine whether it can be interpreted in this or that scheme.
    --- Jura 14:33, 1 December 2017 (UTC)
    @Jura1: the solution chosen is quite simple: P407 is the language of the item, if P407 is used on an item with P31 = work (or subclass of), then this is the language of the work, if P407 is used on an item with P31 = edition, then this is the language of the edition (and if P407 is used on something else, then look at the P31). We can use qualifier to make it more explicit and duplicate the P31 but honestly, you just have to look at the P31 to already infer a clear answer. Cdlt, VIGNERON (talk) 14:48, 1 December 2017 (UTC)
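    The rule stated above (read P407 in the light of P31) can be sketched as a small function. This is an assumption-laden illustration: items are hypothetical plain dicts, and the class names "work" and "edition" are placeholders standing in for the real classes and their subclasses.

```python
# Placeholder class labels; real data would use Q-ids and subclass lookups.
WORK_CLASSES = {"work"}
EDITION_CLASSES = {"edition"}


def language_role(item):
    """Say what a P407 statement means on this item, judging by its P31 values."""
    if "P407" not in item:
        return None
    classes = set(item.get("P31", []))
    is_work = bool(classes & WORK_CLASSES)
    is_edition = bool(classes & EDITION_CLASSES)
    if is_work and is_edition:
        # Item mixes the two levels, so P407 cannot be read unambiguously.
        return "ambiguous: split into work and edition items"
    if is_edition:
        return "language of the edition"
    if is_work:
        return "language of the work"
    return "unclear: check P31"


roles = [
    language_role({"P31": ["work"], "P407": ["grc"]}),
    language_role({"P31": ["edition"], "P407": ["en"]}),
    language_role({"P31": ["work", "edition"], "P407": ["grc", "en"]}),
]
print(roles)
```

    The third call shows why an item that mixes the work and edition levels cannot be interpreted by this rule alone.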
    • In this case, there are several languages associated with the item. It is up to us to find a solution to qualify the statements correctly.
      --- Jura 14:55, 1 December 2017 (UTC)
      • @Jura1: (if this case is The Comedies of Aristophanes (Q21286489)) there shouldn't be, as already said this item is mixing the work (in Ancient Greek) and edition level (in English). The solution is to do as usual (as Snipre put it « consider it as a normal book ») one item for the work, one for the edition. Cdlt, VIGNERON (talk) 15:13, 1 December 2017 (UTC)
        • @VIGNERON: So how many data items will be required to set up the book currently at The Comedies of Aristophanes (Q21286489)? If we do it your way (as I understand it), there will be 27 data items, or maybe more. That's just for the one book that exists in a single edition on a single Wikisource. --EncycloPetey (talk) 21:07, 1 December 2017 (UTC)
          • @EncycloPetey: I would say only 2: one for the 'work' (to create), one for the 'edition' (The Comedies of Aristophanes (Q21286489), which already exists, with some changes on instance of (P31), and edition or translation of (P629) moved to has part (P527) in the new 'work' item so P629 on the edition can be linked to the new 'work' item). We already do this kind of split for usual books, so why not do it for anthologies? After all, anthologies are books (and yes, it requires a bigger number of items, but we put less data on each item, so all in all and in the long run this is clearly better). @Snipre: do you confirm? Cdlt, VIGNERON (talk) 22:45, 1 December 2017 (UTC)
            • @VIGNERON: So no data items for each of the two volumes? No data items for each of the plays included in the anthology (both the work (translation) and the edition (in this anthology))? Why would you not include those? --EncycloPetey (talk) 23:07, 1 December 2017 (UTC)
              • @EncycloPetey: oh yes, you're right, a work and an edition item for each play in the anthology. I didn't look at the plays in detail but it seems to be already done: The Acharnians (Q1059987) the work and The Acharnians (Q19077417) an edition in English (at least you have the works). So with 11 plays, it's 22 items, plus 2 for the ensemble; not sure about the volumes (I would say no, but I will have to see the previous discussions). And if you want to count everything, you need an item for the place of edition(s), for the editor(s), for the translator(s), etc., and a lot of items for the characters in the plays too ;) Cdlt, VIGNERON (talk) 10:58, 2 December 2017 (UTC)
              • I feel this is somewhat problematic when we try to discuss this and people just keep repeating things they already wrote and don't actually look at items.
                --- Jura 11:53, 2 December 2017 (UTC)
@VIGNERON: That's what I have been saying for several months: a compilation of works is a new work. But here I see a potential problem for some particular cases: if I have a work for one original text, I don't have a work for a corresponding translated edition of that work, so if someone decides to publish a new book containing the original text and the translated text, then the proposed model requires a work item for the translation. Does it mean we need a work item for every translation? Perhaps not, but we have to find a solution for this case. For books, this case is rare, but for poems it is more frequent. Snipre (talk) 02:30, 3 December 2017 (UTC)
@Snipre: It's not uncommon for texts at all, and is common far beyond poetry. It applies to drama, correspondence, essays, and most of all it applies to a high proportion of classical literature (Greek, Latin, Chinese, etc.) where parallel texts are common and also anthologies of translations are common. --EncycloPetey (talk) 21:22, 3 December 2017 (UTC)
@EncycloPetey: And ? Do you have a solution or a proposition ? Snipre (talk) 21:44, 3 December 2017 (UTC)
A large part of our problem, in a nutshell, is that we are limited to a binary system of [ "work" or "edition/translation" ]. Translations are neither wholly one or the other, yet they do have editions. So, we need a third option of "translation" that effectively lies in between the levels of "work" and "edition". That doesn't solve all the issues, but would be a positive step if we could implement it. --EncycloPetey (talk) 21:51, 3 December 2017 (UTC)
@EncycloPetey: Good. This is a first step. Can you please provide then the relations between the work, the edition in original language and your new class translation in order to see how complex the model is. Do we have all properties or do we need to create some new ones ? Snipre (talk) 22:06, 3 December 2017 (UTC)
I'm not sure what you're asking or what you're driving at. I know "relation" in the mathematical sense and the biological sense, but think it must have a slightly different meaning the way you are using it. How do you expect a response to be framed? --EncycloPetey (talk) 22:13, 3 December 2017 (UTC)
We have 3 classes (work, edition, translation) so we need at least 3 relations and possibly 3 others if we want to have reverse properties. Snipre (talk) 23:41, 3 December 2017 (UTC)
The approach at Wikidata:Lists/Decameron editions and translations works out quite well. We just need to find a good way to express what language something was translated from.
--- Jura 07:48, 4 December 2017 (UTC)
With your approach one can easily determine the language something was translated from by following the edition or translation of (P629) chain. The concern is that you are using edition or translation of (P629) to link both edition with translation and translation with original work. But maybe widening the scope of P629/P747 is more comprehensible than creating a handful of new properties. --Pasleim (talk) 11:23, 4 December 2017 (UTC)
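A minimal sketch of following the P629 chain, assuming a hypothetical in-memory model (plain dicts, placeholder item IDs and language codes) rather than live Wikidata. It records the P407 values found at each hop, so a reader can see both the language of the immediate parent and the language at the end of the chain.

```python
def p629_chain(start_id, items):
    """Follow P629 from an item upwards, returning (item_id, languages) per hop.

    Only the first P629 value is followed at each step, and a visited set
    guards against cycles in the data.
    """
    chain = []
    seen = set()
    current = start_id
    while current is not None and current not in seen:
        seen.add(current)
        item = items.get(current, {})
        chain.append((current, item.get("P407", [])))
        parents = item.get("P629", [])
        current = parents[0] if parents else None
    return chain


# Placeholder data: an edition of a translation of an original work.
items = {
    "Q_EDITION": {"P629": ["Q_TRANSLATION"], "P407": ["ru"]},
    "Q_TRANSLATION": {"P629": ["Q_WORK"], "P407": ["ru"]},
    "Q_WORK": {"P407": ["it"]},
}
chain = p629_chain("Q_EDITION", items)
print(chain)
```

The walk depends on every link in the chain being present. If an intermediate translation has no item, the last hop gives the language of the ultimate work, which is not necessarily the language the translation was made from.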
Agree for this sample, but it's more complicated for EncycloPetey's. I noticed that there was some inconsistency in the labels of P629/P747. Maybe "translation" should be included in both properties and all languages.
--- Jura 16:45, 4 December 2017 (UTC)
RE Pasleim: "With your approach one can easily determine the language something was translated from by following the edition or translation of (P629) chain." But that won't always work. Assuming that the chain exists, and is complete, and isn't confounded by more than one layer of translation/edition (not all these conditions are always met), all the end of the chain may tell you is the language of an ultimate work, not necessarily the language from which the translation was made. The English book The Waning of the Middle Ages is ultimately a translation of a Dutch work, but the translation was made from an unpublished French translation that was radically different from the original Dutch. I also have a book I'm working with on Wikisource where the original language of composition was German, but the English translation was published first because of the death of the author before the German could be published. So the language of composition and the language of first publication are not the same. Our methods for indicating basic information like author, date, and language are too simplistic to cope with a lot of the data we need to record. --EncycloPetey (talk) 17:06, 4 December 2017 (UTC)
────────────────────────────────────────────────────────────────────────────────────────────────────
A main principle of database design is to avoid duplicate data. You find in the web a lot of literature explaining why duplicate data are bad in a database. If we agree to aim for a good database design, we store the language of the original work only once, namely on the item about the original work. The same with the author. The question left to answer is then how to link editions/derived works/translations with the item about the original work. We currently have edition or translation of (P629)/has edition (P747) and has part (P527)/part of (P361) and published in (P1433). If you think this is not sufficient, please make proposals for new properties. --Pasleim (talk) 17:38, 4 December 2017 (UTC)
WMF DE seems to be for duplicates (at least Wikibase supports symmetric constraints and explicitly doesn't develop better alternatives). WMF gives grants for triplicate schemes .. So I think with an occasional supplementary statement we are still much closer to the ideal.
--- Jura 18:50, 4 December 2017 (UTC)

Edition of an edition[edit]

Aubrey
Viswaprabha (talk)
Micru
Tpt
EugeneZelenko
User:Jarekt
Maximilianklein (talk)
Don-kun
VIGNERON (talk)
Jane023 (talk) 08:21, 30 May 2013 (UTC)
Alexander Doria (talk)
Ruud 23:15, 24 June 2013 (UTC)
Kolja21
arashtitan
Jayanta Nath
Yann (talk)
John Vandenberg (talk) 09:14, 30 November 2013 (UTC)
JakobVoss
Danmichaelo (talk) 19:30, 16 February 2014 (UTC)
Ravi (talk)
Mvolz (talk) 08:21, 20 July 2014 (UTC)
Hsarrazin (talk) 07:56, 9 August 2014 (UTC)
Accurimbono
Mushroom
PKM (talk) 19:58, 10 October 2014 (UTC)
Revi 16:54, 29 November 2014 (UTC)
Giftzwerg 88 (talk) 23:36, 1 January 2015 (UTC)
Almondega (talk) 00:17, 5 August 2015 (UTC)
maxlath
Jura to help sort out issues with other projects
Epìdosis
Skim (talk) 13:52, 24 June 2016 (UTC)
Marchitelli (talk) 12:29, 5 August 2016 (UTC)
BrillLyle (talk) 15:33, 26 August 2016 (UTC)
Alexmar983 (talk) 23:53, 28 August 2016 (UTC)
Finn Årup Nielsen (fnielsen) (talk) 10:44, 29 August 2016 (UTC)
Chiara (talk) 14:15, 29 August 2016 (UTC)
Thibaut120094 (talk) 20:31, 14 September 2016 (UTC)
Ivanhercaz | Discusión 15:30, 31 October 2016 (UTC)
YULdigitalpreservation (talk) 17:35, 10 November 2016 (UTC)
User:Jc3s5h
PatHadley (talk) 21:51, 15 December 2016 (UTC)
Erica (ohmyerica) (talk) 19:26, 1 January 2017 (UTC)
User:Timmy_Finnegan
Mauricio V. Genta (talk) 05:38, 12 March 2017 (UTC)
Sam Wilson 09:24, 24 May 2017 (UTC)
Sic19 (talk) 22:25, 12 July 2017 (UTC)
Andreasmperu
MartinPoulter (talk) 09:21, 20 July 2017 (UTC)
Thelmadatter (talk) 01:11, 13 September 2017 (UTC)
Zeroth (talk) 15:01, 16 September 2017 (UTC)
Emeritus
Ankry
Beat Estermann (talk) 20:07, 12 November 2017 (UTC)
Shilonite - specialize in cataloging Jewish & Hebrew books
Elena moz
Oa01 (talk) 10:52, 3 February 2018 (UTC)
Maria zaos (talk) 11:39, 25 March 2018 (UTC)
Wikidelo (talk) 13:07, 15 April 2018 (UTC)
Mfchris84 (talk) 10:08, 27 April 2018 (UTC)
Mlemusrojas (talk) 3:36, 30 April 2018 (UTC)
salgo60 Salgo60 (talk) 12:42, 8 May 2018 (UTC)
Dick Bos (talk) 14:35, 16 May 2018 (UTC)
Marco Chemello (BEIC) (talk) 07:26, 30 May 2018 (UTC)
Harshrathod50
Notified participants of WikiProject Books

Hi,

It seems obvious and trivial to me, and as documented on the property page, the main page here Wikidata:WikiProject Books and in the FRBR, that an « edition of an edition » is not possible and doesn't even make sense (an edition is by definition the thing edited from a work). So logically, I corrected it on Декамерон (Q43475477) but Jura1 (talk • contribs • logs) reverted me and is asking for « references ».

For me it's as obvious as « the sky is blue » or « water is wet »; I don't know what more to explain... Any ideas, remarks, etc.?

Cdlt, VIGNERON (talk) 16:01, 24 November 2017 (UTC)

PS: to be sure, I checked again in the FRBR, it's clearly stated « Translations from one language to another, musical transcriptions and arrangements, and dubbed or subtitled versions of a film are also considered simply as different expressions of the same original work. » (FRBR, pages 17-18)

Q43475477 is an edition of a 19th-century translation: Q43169039.
Similar to Q43517456 which is an 1860 edition of the 15th century translation Q43516994.
Maybe the Commons sitelinks shouldn't be on these items.
The objective is to provide a full list of translations at Wikidata:Lists/Decameron editions and translations
similar to w:The_Decameron#Translations_into_English.
--- Jura 16:14, 24 November 2017 (UTC)
I don't speak Russian well enough to know whether the edition Q43475477 is based on the edition Q43169039 (BTW, they're both editions, as translations are editions). But in any case, the property to indicate this information is based on (P144), not has edition (P747)/edition or translation of (P629). As for the list, it is easier to build the exact same list when all editions are linked to the same work (the SPARQL query would be shorter with just P629 rather than P629+, which doesn't really make sense as P629 is not transitive).
Cdlt, VIGNERON (talk) 16:20, 24 November 2017 (UTC)
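To make the P629 versus P629+ point concrete, here is a minimal Python sketch over a hand-written set of links. The identifiers W1, E1, E2 and E3 are hypothetical placeholders, not real items, and the two functions only imitate what a single-hop and a one-or-more-hops property path would return.

```python
# Hypothetical P629 links as (item, thing it is an edition of).
# E2 points at another edition (E1) instead of the work (W1).
P629_LINKS = [
    ("E1", "W1"),
    ("E2", "E1"),
    ("E3", "W1"),
]


def direct_editions(work, links):
    """Items linked to `work` by exactly one P629 statement (like wdt:P629)."""
    return sorted(src for src, dst in links if dst == work)


def transitive_editions(work, links):
    """Items that reach `work` through one or more P629 hops (like wdt:P629+)."""
    found = set()
    frontier = [work]
    while frontier:
        target = frontier.pop()
        for src, dst in links:
            if dst == target and src not in found:
                found.add(src)
                frontier.append(src)
    return sorted(found)


print(direct_editions("W1", P629_LINKS))
print(transitive_editions("W1", P629_LINKS))
```

In this toy data the single-hop lookup misses E2, which is why a list built on items modelled as editions of editions needs the longer path.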
  • "the property to indicate this information is based on (P144)" what leads you to this conclusion? Is this something you just made up now or is it documented somewhere at Wikidata?
    --- Jura 00:20, 26 November 2017 (UTC)
The way the FRBR Group 1 classes have been implemented on Wikidata does not make it possible to express that two editions (frbr:Manifestation) are embodiments of the same frbr:Expression. based on (P144) is not adequate to express this, as it could also be used to express that an adaptation is based on a particular translation. I'm not sure to what extent it is necessary in Jura1's example and use case to actually be able to express such subtleties. I can however understand that some confusion may arise if one has the distinction between frbr:Expression and frbr:Manifestation in mind. --Beat Estermann (talk) 00:25, 26 November 2017 (UTC)
It seems that the labels/definitions of edition or translation of (P629) aren't the same in all languages. At some point, "translation" was added to English ([6]) and some other languages. The approach chosen for Decameron seems consistent with current constraints.
--- Jura 13:29, 26 November 2017 (UTC)
« edition of » and « edition or translation of » are the same thing, as translations are editions (at least until now in this project and in FRBR); the added precision in the label is just a way to make this more explicit for users. Formally there are no constraints right now that forbid an edition of an edition, but there should be, as (I feel) this is not at all in the spirit of this project, where 'edition of' is supposed to link only the Work and Edition levels, not items within the Edition level. Cdlt, VIGNERON (talk) 23:03, 28 November 2017 (UTC)
Comment: This seems to be what I have been addressing at #Are we conflating editions and translations; or are we missing translations as their own works?. The generic "translation" can be at the work level, or at the edition level. Like the misuse of "book", which can refer to the creative work or to a specific edition of the work. The difference in the jargon is not important to most people.  — billinghurst sDrewth 23:04, 26 November 2017 (UTC)
@billinghurst: exactly but even if we choose to consider editions and translations to be different things (which I think to be a bad and unnecessary idea, as most databases and references consider translations to be editions) then we would need a new property 'translation of' and in this case we wouldn't have edition of editions, right ? Cdlt, VIGNERON (talk) 23:03, 28 November 2017 (UTC)

For information, right now there are 341 results for edition of edition:

SELECT ?item1 ?item1Label ?item2 ?item2Label ?item3 ?item3Label WHERE {
  ?item1 wdt:P629 ?item2 .
  ?item2 wdt:P629 ?item3 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Some of them seem to violate multiple constraints (including a lot of manuscripts, which should probably use exemplar of (P1574) instead). What should we do with these items?

Cdlt, VIGNERON (talk) 23:03, 28 November 2017 (UTC)
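For readers who want to see what the query's two-step join does, here is a small Python sketch that runs the same pattern on a hand-written list of statements. The identifiers (E1, W1, M1 and so on) are hypothetical and the data is made up for illustration; it is not the result set mentioned above.

```python
from collections import defaultdict

# Hypothetical P629 statements as (subject, object) pairs.
STATEMENTS = [
    ("E1", "W1"),
    ("E2", "E1"),  # an edition linked to another edition
    ("M1", "E1"),  # e.g. a manuscript that might fit P1574 better
    ("E3", "W2"),
]


def edition_of_edition_chains(statements):
    """Return (item1, item2, item3) where item1 -P629-> item2 -P629-> item3."""
    targets = defaultdict(list)
    for subject, obj in statements:
        targets[subject].append(obj)
    chains = []
    for item1, item2 in statements:
        for item3 in targets.get(item2, []):
            chains.append((item1, item2, item3))
    return chains


for chain in edition_of_edition_chains(STATEMENTS):
    print(" -> ".join(chain))
```

Each reported triple is a candidate for review: either item1 should point directly at item3, or a different property is more appropriate.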

@VIGNERON: For exemplars we need to use exemplar of (P1574) and link the exemplar to the edition, or to the work if no edition exists. Snipre (talk) 13:10, 29 November 2017 (UTC)
The main problem comes from this kind of item, Septuagint manuscript (Q7452368): this is typically a Wikipedia structure which doesn't correspond to the Wikidata model and creates an additional layer in the instance/subclass classification without having any meaning in the FRBR classification.
The second problem is items such as Septuagint (Q29334) or Vulgate (Q131175), which are defined as works but described as translations of the Bible. We need to find a solution for these items. Snipre (talk) 14:30, 29 November 2017 (UTC)
The Septuagint and Vulgate are effectively anthologies of translations. The Vulgate and Septuagint both have the same collection of translated texts, but "The Bible" can vary in what it contains depending upon the form of Christianity (Coptic, Ethiopian, Orthodox, Protestant) or Hebrew (which will not contain the New Testament). So "The Bible" is not a fixed text nor a definite anthology. --EncycloPetey (talk) 21:31, 30 November 2017 (UTC)

frbr:Expression

Hi,

I'm presently working on the ingest of a pilot dataset of performing arts productions. The Expressions that are to be described in the context of the performing arts are not necessarily editions (often, they have not been published, but we do know who the translator or the adapter was, we know their language, etc.). I would therefore suggest creating a separate class "Expression", corresponding to frbr:Expression, and slightly modifying the description of version (Q3331189) on the Books project page: in fact, version (Q3331189) seems to correspond first and foremost to frbr:Manifestation.

When describing editions from the perspective of physical artefacts, as is most common in the library world, the approach that was used so far, employing the classes version (Q3331189) and creative work (Q17537576), would be maintained as is. However, when describing expressions from the perspective of their content, as is the case in the theatrical databases I'm working with, a more refined data model could be used which distinguishes between the four FRBR Group 1 classes.

I've described the rationale in more detail here and am looking forward to your comments. --Beat Estermann (talk) 00:05, 26 November 2017 (UTC)

@Beat Estermann:
On this page it's indicated: « Not to complicate too much, we didn't use the FRBR terms "expression" or "manifestation", as the boundary between the definitions it's not easy to grasp. So we used "edition" instead, collapsing those 2 FRBR layers in 1 (other conceptual frameworks similar to FRBR (like Bibframe) collapse those 2 layers too). Thus the double layer work - edition has been used for creating Book properties. »
I don't know whether our "edition" level is closer to the "expression" or to the "manifestation" FRBR level (if I had to say, I would have said "expression", but it's true that our edition level is, wrongly, seen more as physical than intellectual) and I'm not even sure the question makes sense, as it's both by design.
As for the creation of a new class for "expression", it's a good idea (at least we would be exactly aligned with FRBR), but I don't know how to make it usable for everyone (most of the problems on this project arise because the simplified model is already too complicated, even if meanwhile some people want to add a fifth level for translation... so I'm not sure we can deal with one more level).
I've read your text quickly and some things seem a bit strange, but it sounds good overall; I'll try to read it more thoroughly soon.
Cdlt, VIGNERON (talk) 16:18, 28 November 2017 (UTC)

Berg Encyclopedia of World Dress and Fashion

Berg Encyclopedia of World Dress and Fashion (Q4891400) is a 10-volume encyclopedia published in hardcover, ebook, and online. Each volume has its own ISBNs, DOI, editors and subject area (e.g. African dress). Should I make a work item for each volume or can they all be editions of one work item Berg Encyclopedia of World Dress and Fashion (Q4891400) annotated as to volume number, editors, and subject area? (see contents) - PKM (talk) 23:55, 12 December 2017 (UTC)

After much thought, I am going to add a single edition for the online Encyclopedia and just include the volume information and reference URLs in individual references until I see if this source is useful enough to create items for each volume. - PKM (talk) 20:08, 14 December 2017 (UTC)

Proposing change to qualifiers: remove P248, add P805

Following a discussion in Wikidata:Project chat about the constraint that stated in (P248) is only to be used for references, I will replace it with the identified preferred property, statement is subject of (P805). I will look to set up a references section on the same page and have P248 entered there.  — billinghurst sDrewth 04:50, 14 December 2017 (UTC)

I don’t get why you need a qualifier and not a reference. The examples do not help. author  TomT0m / talk page 16:42, 11 January 2018 (UTC)
@TomT0m: Where we are using described by source (P1343) it is not a reference, it is a qualifier to the work, as such it is a statement of origin. There has been widespread use of P248 in this situation, and this is a constraint violation, see Property:P248. So this is to offer a contextually corrected property to use for P1343.  — billinghurst sDrewth 04:55, 13 January 2018 (UTC)
See Charles Dickens (Q5686) for some examples of difference. If you have the "show constraint violations" gadget operating, you will see the highlights.  — billinghurst sDrewth 04:58, 13 January 2018 (UTC)

NB: I changed behavior of s:ru:Модуль:Другие источники to use P805 instead of P248 (diff). -- Sergey kudryavtsev (talk) 06:50, 13 January 2018 (UTC)

NB2: w:ru:Модуль:External links uses P805 too.

@billinghurst, TomT0m: Can you run a bot to replace P248 with P805? -- Sergey kudryavtsev (talk) 07:02, 13 January 2018 (UTC)

Well outside my skill set. I have placed this request for a bot to be run. There is a discussion section there if there is any comment to be made about the requested replacement.  — billinghurst sDrewth 10:53, 13 January 2018 (UTC)
Under way with user:PLbot undertaking.  — billinghurst sDrewth 13:21, 24 January 2018 (UTC)
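The replacement being requested can be pictured with a short Python sketch. It is not the code PLbot runs: the statement layout below is a simplified, hypothetical structure (not the real Wikibase JSON), and Q_SOURCE, Q_ARTICLE and Q_AUTHOR are placeholder values.

```python
import copy


def move_qualifier(statement, old="P248", new="P805"):
    """Return a copy of `statement` with qualifier `old` re-keyed as `new`."""
    result = copy.deepcopy(statement)
    qualifiers = result.get("qualifiers", {})
    if old in qualifiers:
        # Append to existing values of the new property instead of overwriting.
        qualifiers.setdefault(new, []).extend(qualifiers.pop(old))
    return result


def fix_item(statements):
    """Touch only described by source (P1343) statements; leave the rest as-is."""
    return [
        move_qualifier(s) if s["property"] == "P1343" else s
        for s in statements
    ]


# Simplified, hypothetical statements for one item.
item = [
    {"property": "P1343", "value": "Q_SOURCE", "qualifiers": {"P248": ["Q_ARTICLE"]}},
    {"property": "P50", "value": "Q_AUTHOR", "qualifiers": {}},
]
print(fix_item(item))
```

Restricting the change to P1343 statements matters because P248 remains the correct property inside references.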

Pseudonyms

Currently it seems we are assuming that pseudonyms do not have their own items. That is not the case in external databases that have a proper identifier (a Qid here) for pseudonyms. This raises questions from our users (see Talk:Q7245 or Topic:U5ied71lz96i7r8m). I think we should think about this. Any previous discussion about this? Any known documentation? author  TomT0m / talk page 17:07, 11 January 2018 (UTC)

There are definitely cases where pseudonyms have items, though the article needs to be about the pseudonym, not about the individual, i.e. an article exists at WP because the pseudonym itself is notable. I have seen this more in cases of collective pseudonym (Q16017119).  — billinghurst sDrewth 05:02, 13 January 2018 (UTC)
Here is a related question: is there a case where a Wikimedia project has two different pages, one for the person and one for the pseudonym (and is it a common practice on any Wikimedia project)? I don't know of any, but if there is, Wikidata would have to deal with it. Cdlt, VIGNERON (talk) 11:14, 13 January 2018 (UTC)
@VIGNERON: check for instance of (P31) -> pseudonym (Q61002) and see what is there. I know that there are articles for collective pseudonyms.  — billinghurst sDrewth 12:52, 13 January 2018 (UTC)
My unknown case is when we know a text is signed with a pseudonym but we know nothing about who the author actually is. Is the currently proposed model
< text > author (P50) < unknown value >
credited as < string pseudonym >
?
This raises another question: a pseudonym is supposed to be its own identifier. What happens if several authors use the same pseudonym at some point in time, and we don’t know who one or two of them actually are, but we are fairly sure the author is not the same? In other words, there are two « personas », each with an author, with the same signature. Are there anonymous authors of whom we know only their pseudonyms? This implies, if an anonymous author has several pseudonyms and we create a « human » item for each, that one person can have several « human » Wikidata items. This is not the case if we choose to have « persona » items. If we have « persona » items, we can also refer to a pseudonym without using its signature string, and we can have several « personas » for one pseudonym string. author  TomT0m / talk page 12:27, 13 January 2018 (UTC)
If there is no article/item for an author, just a pseudonym, and an unknown one, you probably should just consider using author name string (P2093). I see little point generating items for people who are basically anonymous.  — billinghurst sDrewth 12:53, 13 January 2018 (UTC)
Good questions.
The precise meaning and use of named as (P1810) is clearly unclear (there is a constraint, used as qualifier constraint (Q21510863), but the given example is a direct property :/ I will raise this point on the talk page, but there are other unclear points, among others: is it limited to people or not?). Nonetheless, your model seems good, with just one detail: it's not always an unknown value; it can also be used with a known value for alternative names, which act the same way as pseudonyms
< some old edition of the 'Sonnets' > author (P50) < William Shakespeare (Q692) >
named as (P1810) < Shake-speares >
(and with statement is subject of (P805) = spelling of William Shakespeare's name (Q7575898)). author name string (P2093) is a good solution too (but it depends on the context).
At least, one point is sure : anonymous (no name) and pseudonymous (some name) are mutually exclusive. It's either one or the other.
Cdlt, VIGNERON (talk) 12:56, 13 January 2018 (UTC)
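As a rough illustration of the two cases discussed here (author unknown with a pseudonym, and author known with an alternative name), the following Python sketch models an author (P50) statement with a named as (P1810) qualifier. The class, the function and the label table are hypothetical helpers written for this example, not part of any Wikidata tooling.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuthorStatement:
    # None stands for the special "unknown value"; otherwise an item ID.
    author: Optional[str]
    # Qualifier values keyed by property ID, e.g. {"P1810": "Shake-speares"}.
    qualifiers: dict = field(default_factory=dict)


def credited_name(statement, labels):
    """Name to display: the 'named as' qualifier if present, else the item label."""
    named_as = statement.qualifiers.get("P1810")
    if named_as is not None:
        return named_as
    if statement.author is None:
        return "unknown"
    return labels.get(statement.author, statement.author)


LABELS = {"Q692": "William Shakespeare"}  # minimal label table for the example

known = AuthorStatement("Q692", {"P1810": "Shake-speares"})
pseudonymous = AuthorStatement(None, {"P1810": "Some Pseudonym"})
plain = AuthorStatement("Q692")

for statement in (known, pseudonymous, plain):
    print(credited_name(statement, LABELS))
```

The same statement shape covers both cases; only the main value changes between an item and the unknown value.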
@VIGNERON: I can also see it being used as qualifier to a reference, see Chet Baker (Q2274) which is causing constraints issues too. From my reading of the English description, it is used for proper nouns, rather than people.

Re your Shakespeare example, does it not come under my earlier explanation? I would have said that would just be the addition of the pseudonym property item added to Shakespeare, and then on the work, use author -> Shakespeare, then qualify with "named as" -> given pseudonym/alternate spelling/whichever  — billinghurst sDrewth 15:20, 13 January 2018 (UTC)

@billinghurst: indeed, I see that this point is already discussed on Property talk:P1810.
Maybe, but I'm not sure I understand: what « earlier explanation » are you talking about?
To get back to the original question, some databases have several identifiers but some have only one (BnF has only one for Samuel Clemens/Mark Twain). Cdlt, VIGNERON (talk) 15:52, 13 January 2018 (UTC)
Once more we’re discussing a global issue (pseudonyms) by looking at the small picture. This tends to spread discussions everywhere :( It amounts to questioning Wikidata’s objective on this. I tend to think we’re one place where we can add information that is not held by other databases. Wikidata has a large scope and tends to be inclusive. I think that, as a consequence, we should allow it to hold information about personas. author  TomT0m / talk page 16:04, 13 January 2018 (UTC)
@billinghurst: I wonder if the lack of a « persona » concept in this model makes it hard to treat cases in a generic way. There are a lot of properties and ways to use them, and it is hard to take all the possible cases into account without forgetting something. If a writer likes to play with the histories of his identities and invents false biographies for them (see https://en.wikipedia.org/wiki/Romain_Gary for example, who let his cousin play the role of one pseudonym for the press), it is hard to model any of this. If we consider « Emile Ajar » a fictional character, then we can have an item for it and link it to the item of Gary’s cousin. Authors have also been known to change pseudonyms according to the field of work, e.g. Special:EntityPage/Q309240, who signed « Moebius » only for his science fiction work. We can’t really link the pseudonym with science fiction properly if we don’t have an item for Moebius. As a qualifier for the pseudonym maybe … but that’s a limited approach. Also, a single persona may have several signature strings. The « persona item » model allows us to treat all kinds of corner cases elegantly, and seems to me easier to query while being more flexible. I think we should have « persona » items and a property to link them to their puppeteers. author  TomT0m / talk page 15:56, 13 January 2018 (UTC)
(ec) I said above:
There are definitely cases where pseudonyms have items, though the article needs to be about the pseudonym, not about the individual, i.e. an article exists at WP because the pseudonym itself is notable. I have seen this more in cases of collective pseudonym (Q16017119).
So no items for pseudonyms unless there is a wikidata item that says "this is a pseudonym" and not about the person for who it was a pseudonym.

So, for where there are multiple authority controls they are usually both entered against the person and each is qualified with "named as." If there is more than one BnF, then it will have corresponding multiple VIAFs, and it is my understanding that this will put the duplicates into a queue to be considered for merging.  — billinghurst sDrewth 16:02, 13 January 2018 (UTC)

@TomT0m: You can list multiple pseudonyms against one author. The task is to link a work to the author, irrespective of the name used, where the additional names are qualified.  — billinghurst sDrewth 16:05, 13 January 2018 (UTC)
If someone is creating false biographies for a pseudonym, then that sounds like it reaches into one of those where an article is being written about the pseudonym, and it does get its own item.  — billinghurst sDrewth 16:07, 13 January 2018 (UTC)
Then remember that I discussed collective pseudonym (Q16017119) so Ellery Queen (Q586362) and Michael Field (Q839369) have articles and have multiple people involved.  — billinghurst sDrewth 16:09, 13 January 2018 (UTC)

How to include books in a practical manner

I have read the documentation and there is only one concern that I have. It is wonderful as a database but it fails me in several ways. I want to add all the books of all the authors we know. The objective is to provide information about books that are available for reading.

When I read about the database model, I find that there is nothing practical in there. The notion that LUA should be the glue to bind it all is not even an excuse. There are a few scenarios that I want an effective answer for.

  • I want all Wikisource books to be effectively registered so that we know what books are available for reading in what language. I really want us to advertise those books, I want them to be read.
  • I want us to import all books from the Open Library that have an author we have an identifier to the Open Library for. I do not mind to restrict it at first to include only the books with ebooks. To be truthful, I also want to include the books the Biodiversity Heritage Library has at the Internet Archive. For them we have to import many more authors .. but it is an option to treat them like we do scientific publications where authors are only added at a later date.

Now, as for database design: it is one thing to subscribe to what libraries do; it makes sense when we accomplish things in this way. My challenge is how we can effectively register books and find an audience for these books. Thanks, GerardM (talk) 19:03, 25 January 2018 (UTC)

For Wikisource texts, work is being done now, by frwikisource and Tpt, to allow a fairly automatic import of texts as editions, and to ease the creation of work items. But it is not complete yet. You may read what's been done so far here (sorry, it's in French). --Hsarrazin (talk) 19:21, 25 January 2018 (UTC)
That is cool, even important. It is obvious that without data we cannot do much. But how is this going to enable more readers? How will this be a template for all the other Wikisources? How about all the other issues that I raise? To paraphrase a Wendy's advert: Where is the beef? Thanks, GerardM (talk) 19:53, 25 January 2018 (UTC)
My two cents: Wikisource is still in the initial stages of adding to WD, and only the French and English Wikisources are really large enough and varied enough to be doing much. Many other Wikisources are small, poorly staffed, and have little oversight to maintain consistent formatting and data. Even on the English Wikisource, we face the issue that many older works and editions are so poorly curated, that they practically have to be done over again from scratch.
We've managed to do a decent job of adding authors and author data, but works, editions, and translations still have many challenges to overcome. I have requested a customizable tool for the addition of Wikisource works, but such tools seem to take low priority with the developers, who favor Wikipedia-tools because of the much larger participation. --EncycloPetey (talk) 00:59, 9 February 2018 (UTC)

Works

What is the current best practice on instance of (P31) for non-fiction works? And if book (Q571) is not the correct P31 for works - and I am sure it's not - could we please change the example on the project page? - PKM (talk) 19:47, 8 February 2018 (UTC)

I would say it depends on the item being added. For entries in the 1911 Encyclopædia Britannica, it's common to use encyclopedic article (Q17329259). There are also options for textbook (Q83790), academic journal article (Q18918145), etc. The use of book (Q571) is simply the most generic sort of example, and sometimes the only meaningful option. --EncycloPetey (talk) 00:54, 9 February 2018 (UTC)
I would say: in theory, all documents, fiction or non-fiction, should follow the FRBR. In practice, since almost all non-fiction has only one FRBR edition per FRBR work, there is no real need to use the FRBR, and most Wikidatians only create one item (which ideally is more or less wrong but pragmatically is more or less right). But if you follow the FRBR, I see no reason not to use book (Q571) for the FRBR work or, as @EncycloPetey: said, any subclass of it; for instance, a general and obvious choice is non-fiction book (Q20540385). Do you have a specific work in mind? Cdlt, VIGNERON (talk) 07:41, 9 February 2018 (UTC)
Mmm we’re actually not really « using FRBR ». You mean « create a work item » ? author  TomT0m / talk page 12:22, 9 February 2018 (UTC)
@TomT0m: mmm too, the first sentence of the first section on Wikidata:WikiProject Books is literally « We used the Functional Requirements for Bibliographic Records (FRBR) model »; we adapted it (like everybody: nobody uses exactly the FRBR, and even the FRBR has adapted itself several times since 1997), but adaptation is still usage. Anyway, that doesn't matter that much; as PKM was speaking of « non-fiction works », I guessed (maybe wrongly) that she was indeed talking about creating a work item. Cdlt, VIGNERON (talk) 13:16, 9 February 2018 (UTC)
I’m of the opinion that, if we create a single item, it’s maybe best to create the work one anyway. author  TomT0m / talk page 14:12, 9 February 2018 (UTC)
@TomT0m: well yes, I think I get your idea but if you have only one item, you're outside the FRBR and work/edition separation, the item is neither and for the constraints you have to be both. Cdlt, VIGNERON (talk) 14:51, 9 February 2018 (UTC)
I don’t understand. FRBR describe a model, it does not require us to have items for every part of it ? Or does it ? author  TomT0m / talk page 15:15, 9 February 2018 (UTC)
Nobody is holding a knife to Wikidatians' throats to make them create both an item about the work and one about the edition ;) But logically, we are creating items about works and editions. One is less meaningful and useful without the other. Cdlt, VIGNERON (talk) 15:21, 9 February 2018 (UTC)
@VIGNERON: The problem with using non-fiction book (Q20540385) for instance of (P31) is that it's not simply a form ("instance") but a form/genre combination item. That is, "book" is a form but "non-fiction" is a genre. So I wouldn't use that value at all. I would also point out that many non-fiction works have gone through multiple editions. I have books on my shelf about anatomy, botany, Greek theatre, and Latin grammar, as well as dictionaries, encyclopedias, biographies, writing guides, and statistical reference works which have all gone through multiple editions. --EncycloPetey (talk) 16:41, 9 February 2018 (UTC)
@EncycloPetey: There is nothing in instance of (P31) or in Help:BMP that says it classifies works of art by form. It’s a generic property that can handle classification by genre as well; it classifies by many criteria (and that is its strength: no need to reinvent the wheel to classify things). As both genre classes and form classes are subclasses of « work », it follows that there is no problem in creating a subclass of both non-fiction and books. We don’t have to, though: using only instance of (P31), we could just as well add statements with the two values. It seems practical, however, to create such classes for common combinations. author  TomT0m / talk page 17:38, 9 February 2018 (UTC)
@TomT0m: Yet we have no guiding philosophy or principle on this matter. I would argue that instance of (P31) should be limited to a form or structure, and leave the genre (for fiction works) to its own separate statement, and likewise the main subject (for non-fiction works) should be kept separate from the "instance of" statement to the greatest extent possible. --EncycloPetey (talk) 17:46, 9 February 2018 (UTC)
@EncycloPetey: I’d argue that other ontology projects have handled taxonomies with very few properties (two, actually: one to link instances to their class, and one for the subclass relationship) and several class trees, instead of creating one property for each taxonomy, with great success. See for example https://en.wikipedia.org/wiki/OBO_Foundry (and they have many class trees). Following their path would probably help interoperability if we share common principles with them. And we will have a hierarchy of artistic genres anyway (if not several, as there may be several ways to classify genres), so having a specific property to deal with them is not much help in my opinion. author  TomT0m / talk page 18:02, 9 February 2018 (UTC)
@TomT0m: Unfortunately, that is an encyclopedic categorization primarily for a single scientific subject field, and for a project like Wikisource, the structure quickly collapses. On Wikisource, we have followed the classification principles of the w:Library of Congress Classification. --EncycloPetey (talk) 18:26, 9 February 2018 (UTC)
@EncycloPetey: This is quite a large field, with many subfields and subontologies that are designed to work well together, which is not easy, as there are many, many ways to model things such that the models won’t be easily combinable with each other. It is definitely comparable with Wikisource in complexity, if not waaay more complex. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2814061/ for example. From a quick look at Library_of_Congress_Classification, it’s actually one topic classification amongst many others, like for example the ACM one; there is no objective reason to align with only one. As Wikidata is inclusive, this highlights the fact that we will indeed have to deal with several classification systems. Plus, Wikidata has a very rich system of items to precisely describe the topics of a book and the relations between them, so it might be more efficient to use the « topic » property for the precise subject, and to use the knowledge-field items themselves to find close topics, than to use a rigid topic classification tree designed for the non-digital library era. author  TomT0m / talk page 19:01, 9 February 2018 (UTC)
And as for the collapsing, I don’t know what you are referring to precisely, but if you’re referring to the category system and its loops, instance of (P31) and subclass of (P279) are in no way comparable to it. Sure, there are problems in our class tree, but projects like OBO give very good principles to avoid them by design. We try to ensure that a class is never a subclass of itself through a chain of « subclass of » statements, for example, as by definition that would mean all of the classes in that path are equal. Classification (should) obey strong principles like the https://en.wikipedia.org/wiki/Type%E2%80%93token_distinction which are strong and well established, while categories are … way less structured. author  TomT0m / talk page 19:01, 9 February 2018 (UTC)
@PKM: could you give some context or example so we can see clearer here. Cdlt, VIGNERON (talk) 14:51, 9 February 2018 (UTC)
Sure! In general, I always make both work and edition items for my references, since (1) I am not always using the most recent edition of physical books and (2) I frequently use Oxford reference works which have separate editions for the online edition which will have a different date and ISBN than the physical publication. Also, I make heavy use of "main subject" and "genre" on these.
The work I was struggling with most recently was Patterns of Fashion 4 (Q48046762), a work on costume history, (edition: Patterns of Fashion 4 (Q48047403)). I used "book" here because I'm not sure what is better.
Another example is The Concise Oxford Companion to English Literature (Q47463825) (editions: The Concise Oxford Companion to English Literature (Q47463849) online 3rd, The Concise Oxford Companion to English Literature (Q47463828) online 4th). I used "creative work" here, again with some uncertainty. I have used "reference work" for items identified as "dictionary of..." or "encyclopedia of..." in the past, but I'm not sure that's right for a general history of something. - PKM (talk) 19:24, 9 February 2018 (UTC)

can someone help me create a property for books?[edit]

hello! I've never created a new property. Is someone available to help? thanks! שילוני (talk) 14:03, 12 February 2018 (UTC)

Some tricky properties of a book[edit]

I'm clearly out of my depth here. It would be appreciated if someone else can take on cleaning up Q1366818 (the book Escape to Life) and then ping me to look at how it would be done correctly.

BEGIN: copied from Wikidata:Project chat.

Q1366818 (the book Escape to Life) presents an interesting situation on several counts. I'm wondering what, if anything, of the following we can somehow convey.

  • The book was originally published in 1939 by Houghton Mifflin. We have an existing entity Q390074 for present-day publisher Houghton Mifflin Harcourt, but not for this predecessor. It would be inaccurate to say that the book was published by Houghton Mifflin Harcourt; what should we do?
  • Klaus and Erika Mann originally wrote the book in German, but it was first published in English translation. A German edition did not come out until 1991. Is there any way to convey that the book was written in German, but first published in English? Is there any way to indicate the first German edition as being just that?

Jmabel (talk) 05:37, 16 February 2018 (UTC)

@Jmabel: - There's quite a lot of prior art at Wikidata:WikiProject Books; they seem to list the pertinent statements for the Work, and for the Edition, as far as I can see from a quick glance. (Sorry I'm pointing you elsewhere rather than answering in detail.) hth --Tagishsimon (talk) 09:02, 16 February 2018 (UTC)
Interesting. As a relatively casual user of Wikidata, how would I be likely to have found that page, other than coming here to ask? - Jmabel (talk) 16:13, 16 February 2018 (UTC)
@Tagishsimon: Even after reading that page, I don't see answers to either of the questions I asked above. Did you read the page and see answers to my questions? Or was this just "there's a lot of stuff about books at Wikidata:WikiProject Books, your questions might be answered there"? I think someone more expert than I on Wikidata would do well to see if this can be expressed with current properties (and if so I'd be interested in learning how). In particular, I'm guessing that for Houghton Mifflin there is some way to do this with custom properties, but I haven't been able to work out how to create one of those. - Jmabel (talk) 16:27, 16 February 2018 (UTC)
@Jmabel:
In fact, Escape to Life (Q1366818) is flawed because it is defined as a work (Q7725634) but contains publication information. If you have read the Wikidata:WikiProject Books page, you have seen that works and version (Q3331189) are two different types. The work should contain only information about the authors, the original language (German), the original (German) title, the genre, and links to edition items.
Information about editions, both in English and German, must each go into a version (Q3331189) item, which would then have all the properties about the publisher, the year of publication, the title of that publication, etc., like a traditional library catalog. Each edition needs its own item, and it would be preferable to add a library ID for the edition, LoC for example, to be able to differentiate editions and have a reference.
Then each edition is linked to the work item through edition or translation of (P629), and in the work item you may link to the publications through has edition (P747). You can then indicate on the English edition that it was the first edition, as I did with Escape to Life (Q48914392). The same should be done for the first German edition, for which I have no information at all.
This may seem a little complicated, but it is the only way to manage data about the work and data about the different editions without mixing them up.
If you need help, you may seek it on the discussion page of the project.
As for your question about the publisher: on Houghton Mifflin Harcourt (Q390074), I see it was created in 1880, so it is the right publisher. Publishers often change their name over time, and the name is written differently on many books, yet it is still the same publisher... If it is the exact name used in 1939 that bothers you, you can add a stated as (P1932) qualifier to set the exact name of the publisher at the time of publication. :) --Hsarrazin (talk) 16:50, 16 February 2018 (UTC)
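A minimal sketch of how that qualifier could be read back on the Wikidata Query Service, assuming the edition items are linked to Escape to Life (Q1366818) with edition or translation of (P629) and carry a publisher (P123) statement. The variable names are only illustrative. It lists each edition's publisher item together with the name recorded in stated as (P1932), if any:

```sparql
# Publisher of each edition of Escape to Life (Q1366818),
# with the name as printed, when a "stated as" qualifier exists
SELECT ?edition ?editionLabel ?publisher ?publisherLabel ?statedAs WHERE {
  ?edition wdt:P629 wd:Q1366818;        # edition or translation of (P629)
           p:P123 ?statement.           # full publisher (P123) statement
  ?statement ps:P123 ?publisher.        # the publisher item itself
  OPTIONAL { ?statement pq:P1932 ?statedAs. }  # stated as (P1932) qualifier
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

The ?statedAs column simply stays empty for publisher statements that have no such qualifier.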
Houghton Mifflin Harcourt doesn't seem to me like just a "change of name" of Houghton Mifflin. It represents a merger with the historically equally important Harcourt Brace Jovanovich (previously Harcourt Brace, then Harcourt, Brace, and World, then Harcourt Brace Jovanovich). Aside: there used to be a joke in the publishing industry that the name was changed because Jovanovich thought he was more important than the world.
I'm clearly out of my depth here. I'll bring it to Wikidata talk:WikiProject Books. - Jmabel (talk) 17:04, 16 February 2018 (UTC)

END: copied from Wikidata:Project chat. - Jmabel (talk) 17:06, 16 February 2018 (UTC)

@Jmabel: I totally agree with Hsarrazin, you should have one item for each edition, it's the easiest and simplest way to go. Cdlt, VIGNERON (talk) 08:35, 23 February 2018 (UTC)
@Hsarrazin, VIGNERON: So no item at all for the book as a work, just for editions? Because that is not at all the way that, for example, Hamlet (Q41567) is handled. - Jmabel (talk) 16:51, 23 February 2018 (UTC)
Also, I still see no way to express that the work was written in German, but first published in English translation. - Jmabel (talk) 16:52, 23 February 2018 (UTC)
this is deduced from the fact that the work item's language is German, while the first edition's language is English. --Hsarrazin (talk) 17:04, 23 February 2018 (UTC)
@Jmabel: you obviously need to keep the current item (Escape to Life (Q1366818)) about the work, but you also need items for the editions (ideally for all the editions). Reminder: a work is an intangible object; it is *never* published. What is published is de facto an edition. When you say "the work is published in English", it's in fact "the work has an edition in English". If you have several items, then it's easy to say "give me date and language of the first (or all, or the last) edition(s) of this work". Cdlt, VIGNERON (talk) 17:24, 23 February 2018 (UTC)
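As an illustration of that kind of request, here is a rough sketch for the Wikidata Query Service. It assumes the editions are linked with edition or translation of (P629) and that each carries language of work or name (P407) and publication date (P577); P407 is an assumption here, as it is not named in this thread.

```sparql
# Editions of Escape to Life (Q1366818), oldest first, with language and date
SELECT ?edition ?editionLabel ?languageLabel ?date WHERE {
  ?edition wdt:P629 wd:Q1366818.               # edition or translation of (P629)
  OPTIONAL { ?edition wdt:P407 ?language. }    # language of work or name (P407), assumed
  OPTIONAL { ?edition wdt:P577 ?date. }        # publication date (P577)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?date
```

Editions without a publication date sort first, so the first dated row is the earliest known edition. Comparing its language with the language on the work item is what would show "written in German, first published in English".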
One solution:
one item Qxx0 for the work, with language in German
one item Qxx1 for the manuscript, with language in German
one item Qxx2 for the first edition, with language in English, translated from Qxx1, edition of Qxx0
one item Qxx3 for the second edition, with language in German, edition of Qxx0
For the editor problem, two cases:
1) Houghton Mifflin bought Harcourt Brace Jovanovich and changed its name at the same time. In that case, one item is sufficient, with two significant events, one for the purchase and the second for the name change.
2) Houghton Mifflin merged with Harcourt Brace Jovanovich in a new entity called Houghton Mifflin Harcourt and in that case a new item is necessary for Houghton Mifflin Harcourt. Snipre (talk) 22:47, 23 February 2018 (UTC)
@Jmabel: Snipre (talk) 23:08, 23 February 2018 (UTC)
The history of Houghton Mifflin & Harcourt is even more complicated: Reed Elsevier had bought Harcourt, turned it into a couple of divisions of Reed Elsevier while keeping the names, then eventually sold those divisions (and also, I believe, some things that were never part of Harcourt) to Houghton Mifflin, which changed its name to Houghton Mifflin Harcourt at the time of the acquisition. So I guess it's more like your case 1, though I doubt we have entities in Wikidata that describe exactly what Houghton Mifflin acquired. - Jmabel (talk) 00:14, 24 February 2018 (UTC)

Collection[edit]

Is there any way to include an edition in a collection of books? I mean, if I want to say some French edition belongs to Le Livre de poche (Q1629027), I can't use collection (P195) without triggering constraint issues, since this property is intended just for paintings, sculptures, etc. I don't know how to handle it -- maybe "part of", "series" or anything else. Any ideas? Thanks. Wikidelo (talk) 14:54, 18 April 2018 (UTC)

You should definitely not use collection (P195), as this is used to link the item to a collection assembled by a collector or collecting organization. Your example rather fits the definition of schema:Series; series (Q20937557) has <equivalent class> schema:Series. However, schema:Series is defined as a sub-class of Creative Work, while series (Q20937557) is not. Instead, Wikidata uses a qualifier to indicate the type of items of a series, e.g. Welsh Triads (Q2542444) (manuscripts), Zanja de Alsina (Q301895) (fortifications), Triumph Tiger (Q3539718) (motorcycles). In addition, several sub-classes of series (Q20937557) have been defined - you may explore them using the Wikidata Ontology Explorer. Some of them relate to creative works. Maybe this would require some tidying up. --Beat Estermann (talk) 06:09, 19 April 2018 (UTC)
Maybe the following queries answer your question better. You may just replace the "P1433" in the second query to output the same list for the other properties. --Beat Estermann (talk) 06:51, 19 April 2018 (UTC)
#List of properties linking an item to a book series, ordered by frequency of use
SELECT ?property ?propertyLabel ?count WITH {
  # Inner query: count the distinct items using each property
  SELECT ?property (COUNT(DISTINCT ?item) AS ?count) WHERE {
    ?bookseries wdt:P31/wdt:P279* wd:Q277759.
    ?item ?wdt ?bookseries.
    ?property a wikibase:Property;
              wikibase:directClaim ?wdt.
    FILTER(?property != wd:P31)
  }
  GROUP BY ?property
  ORDER BY DESC(?count)
  LIMIT 10
} AS %results WHERE {
  INCLUDE %results.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?count)


# List of item / book series pairs linked with the property P1433 (published in)
SELECT ?item ?itemLabel ?bookseries ?bookseriesLabel 
WHERE
{
  ?bookseries wdt:P31/wdt:P279* wd:Q277759.
  ?item wdt:P1433 ?bookseries.
   
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}


@Beat Estermann: But a collection like Le Livre de poche (Q1629027) can be considered a collection of written books by an editor. This is similar to a collection of paintings by a collector. Written works are creative works, so I don't really see why we should make a distinction. Snipre (talk) 07:40, 19 April 2018 (UTC)
In fact, you're talking about a series which is called "collection". Let's take the example of stamps - you may have distinct series of stamps issued by the postal service which may respond to specific design criteria, and you may have collectors building collections of stamps according to their own criteria. I believe the case of Livre de poche is more like the case of the postal service. And yes, I think it makes sense to keep the two cases apart in the data model. Cheers, Beat Estermann (talk) 07:48, 19 April 2018 (UTC)
@Beat Estermann: Thank you very much for the examples. My skills with SPARQL are just slightly better than my personal best at pole vaulting (4 inches). Anyway, I see your point. I've been looking at several wikis and it seems the concept of "collection" has two ambiguous approaches: for French, Italian, Spanish, and a few more wikis, the idea of "collection" is more or less consistent [[7]]; nevertheless for enwiki there's no main category for "collection" (Penguin Classics, Everyman's Library, Folio...), just "series" [[8]]. Maybe I'm too picky, but for me, "series" should be Zuckerman Bound (Q17054053) and all the Zuckerman's books, or Harry Potter (Q8337), but Pocket Penguins (Q25037402) or Colección Austral (Q5776710) are quite another thing. I've also seen that there is little or non-existent consistency populating those items in wikidata. A lot of them don't even have a single statement, some are qualified as editorial collection (Q20655472), which is also weird because it's an instance of "art collection" and "catalog", and some are instances of "imprint" (Penguin Classics (Q11281443)). Messy. I agree with @Snipre:, "why we should do a difference"? But I also agree with @Beat Estermann:: maybe mixing art collections with paperback editions is not a good idea for the data model. Wikidelo (talk) 21:47, 19 April 2018 (UTC)
@Infovarius: It seems a good candidate. I'll try it. Thanks! Wikidelo (talk) 14:29, 20 April 2018 (UTC)
I don't think published in (P1433) is a good solution; the definition implies that it should be used for example for papers published in a scientific journal or for individual stories published in a larger work. In fact, many of the hits the above query produces point to scientific "book series" which are treated like journals (they also get a journal identifier). I would rather use series (P179) instead; and possibly use different classes for different types of "series"/"collections". --Beat Estermann (talk) 17:38, 20 April 2018 (UTC)
@Beat Estermann, Wikidelo: Please provide an objective criterion to exclude groups of books from collection. Please explain what common parameter a collection of paintings and a collection of sculptures share that a collection of books does not.
series (P179) is not appropriate, as series (P179) implies an order, a sequence: each element of a series has to be connected to one or two other members of the series using qualifiers like followed by (P156) or follows (P155). A collection is an ensemble of items grouped according to an arbitrary criterion.
@Jura1: Do you have some elements to distinguish series (P179) from collection (P195) or do we have to merge both properties ? Snipre (talk) 23:20, 20 April 2018 (UTC)
@Snipre: I think I provided the distinction above - a "collection" has been collected by someone according to some criteria and is different from a "book series" (definition: "a set of independent books in a common format or under a common title or supervised by a common general editor", Oxford Dictionary). I am totally aware of the fact that the latter definition coincides with one of the definitions for "collection"@fr ("Série d'ouvrages, de publications ayant une unité", Le Petit Robert). – "collection"@en does not have this meaning; and as the definition given in Le Petit Robert implies, the property <"series"@en; "série"@fr> should be absolutely fine in this context - even from a francophone point of view. This said, we still may want to distinguish different types of series/collections at the level of the class definitions. The following distinguishing criteria come to mind (I don't have time right now to do a thorough terminological research across several languages, but we should maybe do that at some point - create a table with multi-lingual definitions and alignment with the respective WD class):
  • Have similar items been collected ex post, or have they been designed to be similar from the beginning?
  • Do the items have one common originator (publishing house, editor, etc.) or do they have different originators?
  • Can the items be brought into a natural order (are they even numbered)?
  • Does the order of the series apply to its content or rather only to its publication date or similar?
  • Is the series complete (we know all the items) or open (further items may be added)?
So much for now... --Beat Estermann (talk) 07:27, 21 April 2018 (UTC)
@Beat Estermann: I don't know if it answers all your questions, but here's an example at LibraryThing [[9]] (note that the covers are randomly displayed; they aren't the actual covers of this particular editions). Wikidelo (talk) 11:47, 21 April 2018 (UTC)
@Jura1, Beat Estermann, Snipre: I guess series (P179), catalog (P972) or collection (P195) could be tweaked to somehow solve the issue, but I can't figure out which one is the best candidate or in which way it could distort the main purpose of each property. I agree with all of your points of view but, as a non-librarian and a noob, I can't tell which one is best. I think the LibraryThing approach is quite interesting [[10]]. They use "Publisher series" for the "collection" concept I was talking about in the beginning. The closest we have here is editorial collection (Q20655472). So, we can broaden editorial collection (Q20655472) to include "Publisher series" or we can create a new "Publisher series" item. Then we can consistently make Penguin Classics (Q11281443), Pocket Penguins (Q25037402), Le Livre de poche (Q1629027), etc, instances of editorial collection (Q20655472) or the new Q:"Publisher series". Wikidelo (talk) 11:34, 21 April 2018 (UTC)
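To see how these items are currently typed before settling on one class, a small sketch query may help. It only reads instance of (P31) for the four items named in this thread; nothing else is assumed about them.

```sparql
# Current classes of the publisher series / collections mentioned above
SELECT ?item ?itemLabel ?class ?classLabel WHERE {
  VALUES ?item {
    wd:Q11281443   # Penguin Classics
    wd:Q25037402   # Pocket Penguins
    wd:Q1629027    # Le Livre de poche
    wd:Q5776710    # Colección Austral
  }
  OPTIONAL { ?item wdt:P31 ?class. }   # instance of (P31); empty if no statement
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?itemLabel
```

An item with several P31 values appears on several rows, and an item with none appears once with an empty class, which makes the inconsistencies easy to spot.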
@Jura1: I think that should be only applied to version (Q3331189), otherwise we could end up with a mess, like book (Q571) with one or multiple ISBN. Wikidelo (talk) 11:34, 21 April 2018 (UTC)

Edition item properties[edit]

I've found a couple of properties that could be useful for Wikidata:WikiProject_Books#Edition_item_properties:

What do you think? Wikidelo (talk) 20:07, 21 April 2018 (UTC)

yep, seems interesting :
typeface/font (P2739) would be mostly interesting on ancient books, where editions can be discriminated through font (I worked on a 1502 Venetian edition (Alde) this week, and can very well see where it would be useful). --Hsarrazin (talk) 09:25, 19 May 2018 (UTC)

Some questions, based on my first items for works / editions[edit]

Hi everybody. I've just created my first couple of sets of items for editions and works, at England Delineated (Q52228403) + editions, and A Topographical Dictionary of Wales (Q52240439) + editions, and I wondered if somebody could sanity-check what I've done, to see if there's anything I've done that's not right, or could be done better.

I've got quite a few editions/works that I'm about to organise some images for on Commons, that I'm intending to create Wikidata items for at the same time, so it would be good to know whether I've got things basically right.

The item-sets above I've hand-created as sort of a target to aim for. For the items going forward, I hope to be working much more from existing metadata, doing much less by hand, so they probably won't end up being as complete. But I thought if I do a couple by hand, then I could see where the different fields would fit, when or if I have them.

A few questions occurred to me along the way (below), that didn't seem entirely explained in the existing guidance at Wikidata:WikiProject_Books#Bibliographic_properties.

Sorry if it is rather a lot of questions, but it was my first time making works / editions items, so I was very much feeling my way, and hoping I was getting things right (or at least not too wrong!) So many many thanks in advance for any thoughts or comments, to set me on the right way. Regards, Jheald (talk) 23:59, 28 April 2018 (UTC)

Titles and subtitles of works/editions[edit]

Since nineteenth-century book titles can be quite long, what should go where? Are title (P1476) and subtitle (P1680) the right properties to use? Where should a title be split between them? Should one try to get everything into title (P1476) if one can, eg as done at England Delineated (Q52228403), or should I have tried to split this title?
With A Topographical Dictionary of Wales (Q52240439) I ran into the apparent problem that string fields can only be 400 characters long, leading to a rather unnatural split between the two parts of the title. Is there a better way round this? (eg is there any alternative property that ought to be used instead? Are there some properties that Wikidata allows to have longer strings?)
In practice this probably won't be a problem, because long titles in my metadata look as if they have already been truncated with ellipsis (...); so I shall probably go with just dumping the whole of that title into title (P1476). But it would be good to know what the 'right' thing to do is considered to be.
Query results for longest book titles tinyurl.com/ycnqu9s3, showing a few other examples
Usually there is a short 'convenience' title, which is typically what has been used for the item label -- should we have a property for this? If these Wikidata entries were being used to power reference citations, how would one most normally expect the title to be stated in such a reference? Are we storing the information to generate that?
MARC 245 divides a title into: 'title' ($a) and 'remainder of title' ($b), stating that
"In records formulated according to ISBD principles, subfield $a includes all the information up to and including the first mark of ISBD punctuation (e.g., an equal sign (=), a colon (:), a semicolon (;), or a slash (/)) or the medium designator (e.g., [microform])."
Is that something we should try to emulate, or does it bring its own problems?
If the "remainder of title" would not fit into 400 characters, do we need a new property such as "subtitle (continuation)" ? Perhaps as a qualifier, since it would need to be linked to a particular value of "subtitle" ? Jheald (talk) 11:46, 29 April 2018 (UTC)
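A rough sketch of a query for spotting titles that approach the length limit mentioned above. It is not the query behind the short link; the 300-character threshold and the restriction to items that have edition or translation of (P629) are arbitrary assumptions to keep the result small, and it may still time out on the public endpoint.

```sparql
# Edition items whose title (P1476) is longer than 300 characters
SELECT ?item ?title ?length WHERE {
  ?item wdt:P629 ?work;          # only items that are an edition of some work
        wdt:P1476 ?title.        # title (P1476), a monolingual text value
  BIND(STRLEN(STR(?title)) AS ?length)   # STR() drops the language tag before measuring
  FILTER(?length > 300)
}
ORDER BY DESC(?length)
LIMIT 50
```

Swapping wdt:P1476 for wdt:P1680 would give the same view for subtitle (P1680).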

Naming of items[edit]

I've gone with shortened titles for item names, pretty much truncating at the first punctuation. I presume this is okay. One thing I wondered about was preference for title-case / sentence-case (ie how many words to capitalise in item names -- and in book titles). Again, I'm likely to follow the metadata, but I notice that for full titles, there seems to be an avoidance of title-case, presumably because it simply makes them too unreadable. For the shorter forms used for item names I think I prefer title case, but I did notice that a couple of existing items (England delineated (Q28872042) -- unrelated; and A topographical dictionary of Wales (Volume I) (Q25219289) -- a bit mixed up) both preferred sentence case. Are there any strong feelings one way or the other?
Also, naming of editions -- I've gone with England Delineated (1st edition) (Q52281940), England Delineated (2nd edition) (Q52229333), etc. Is this appropriate; what is the preference between this or using years for the suffix instead? (Or using both, eg "(1st edition, 1790)" -- are years helpful to include?) It may be easier to extract years from the data; on the other hand, there may be multiple years for the same edition (as we shall see).


Edition number[edit]

I'm getting a constraint warning for including series ordinal (P1545) as a qualifier to make the edition names more machine interpretable. Is there an objection to these? It seemed to me they might be helpful.
Also, does anyone know of any gadget or add-on to make it easier to re-order statements? eg the sequence of editions on England Delineated (Q52228403) is a bit of a mess at the moment. (I was kind of hoping that if I added series ordinal (P1545) qualifiers the software might sort them itself, but no joy). Yes, it's no big deal to do by hand; but if they were machine-added it could be a bit of a pain.
Your solution is not sufficient to sort editions: if several publishers each produced several editions, then you would need further qualifiers to tell which editions have to be grouped together before being sorted.
Ex. Publisher Y published a first and a second edition of a work, and publisher X published a first, a second and a third edition of the same work: how will your system help to sort the editions of publisher Y? To work correctly, you should not use data from the work item to deal with data about edition items. You are used to the Wikipedia format, where everything is mixed together in the same document. WD is a database: data are split across several containers, or items, and you have to extract them from the different items instead of duplicating data across items. Snipre (talk) 22:30, 29 April 2018 (UTC)
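A sketch of the extraction described here, assuming each edition item of England Delineated (Q52228403) carries its own publisher (P123) and edition number (P393). The grouping and ordering then happen in the query rather than through qualifiers on the work item:

```sparql
# Editions of England Delineated (Q52228403), grouped by publisher,
# then ordered by edition number within each publisher
SELECT ?edition ?editionLabel ?publisherLabel ?editionNumber WHERE {
  ?edition wdt:P629 wd:Q52228403.                    # edition or translation of (P629)
  OPTIONAL { ?edition wdt:P123 ?publisher. }         # publisher (P123)
  OPTIONAL { ?edition wdt:P393 ?editionNumber. }     # edition number (P393), stored as a string
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?publisherLabel ASC(xsd:integer(?editionNumber))
```

Edition numbers that are not plain integers fail the cast and sort first within their publisher, so they would need checking by hand.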

Different scans of the same edition[edit]

Turning to the edition items now, I have sometimes found cases where there are multiple different scans in circulation for the same edition -- see eg A Topographical Dictionary of Wales (3rd Edition) (Q52243033) for a particular case.
The constraint checker doesn't seem to like there being multiple different Google Books ID (P675) values for the same item. (Similarly for Open Library ID (P648) and Internet Archive ID (P724)). It also doesn't seem to like them being qualified with volume (P478). Are there real objections to consider here, or would it be appropriate for these constraint conditions to be relaxed a bit?
It probably also doesn't like me using publication date (P577) as a qualifier. In this case, the same edition has gone through various impressions, eg 1843, 1844, 1845. These differ little, apart from a few entries in the errata page, and a change of address at the end of 1843 for the publisher. To me it very much makes sense to group the different impressions together into a single item for the edition -- it makes it much easier to see the major developments of the text.
On the other hand, is it sometimes helpful to track particular scans? For example, very often the Internet Archive or Hathi scan corresponds to a particular scan from Google (although not always). Is it helpful to indicate this in some way? But if so, then how?
Also, it may sometimes make sense to put images extracted from different scans of the same edition into different Commons categories (so that each category corresponds to images from a single exemplar of the book). In that case I'm guessing it may make sense to treat them as different exemplars from within the same edition, connected back by exemplar of (P1574) -- but most of the time I'm thinking that is probably an unnecessary complication? Is there an inverse property, to announce that the edition may have different exemplars?
Scans are similar to exemplars, and should not be mixed into the edition level. If you want to add data about one scan, then you should create a new item for the scan and link that item to the edition item, just as you link the edition item to the work item.
Ex.: if you want to add the location of one exemplar of the Gutenberg Bible, you should not add this data to the edition item, but create a new item. If three different IDs and a property like publication date differ between the scans, this justifies new items to avoid confusion. Snipre (talk) 22:36, 29 April 2018 (UTC)
Okay, so let's take a look at how this might work.
I have created a new class individual book (Q53731850) for distinct printed copies of books (a few more eyes checking over its statements would be very welcome!), and changed the statements on a pair of existing items (hat-tip to User:Sic19), namely On the laws and practice of horse racing, etc., etc (UC copy) (Q51425849) and On the laws and practice of horse racing (UPenn copy) (Q51514189) to make them instances of it, and exemplar of (P1574) a new item On the laws and practice of horse racing (1866 edition) (Q53738443), to which I have moved statements that were specific to the edition rather than the copies.
However, with respect to information about online copies, I have duplicated this on On the laws and practice of horse racing (1866 edition) (Q53738443) as well as the items for the distinct copies, using statement is subject of (P805) to distinguish which copy each scan is taken from. I think it's useful to collect together information about all online copies on the edition item, because I think this is where people will look for that information, both directly as humans, and when writing queries. This creates an issue in the form of a violation of the 'unique values' constraint, but this can perhaps be worked around.
In practice I wouldn't expect many such items for distinct printed copies to be created. Internet Archive ID (P724) already allows collection (P195) to be used as a qualifier, and that should usually be enough to distinguish different scan families without the need for new items. So I would expect individual book (Q53731850) items only to be created rather lazily, in particularly complicated cases, or when there is specific information related to particular copies that people want to record. Jheald (talk) 22:10, 18 May 2018 (UTC)
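A sketch of how the copies could be pulled back from the edition item, assuming copies point to the edition with exemplar of (P1574) as described above and may carry an Internet Archive ID (P724):

```sparql
# Individual copies of On the laws and practice of horse racing (1866 edition)
SELECT ?copy ?copyLabel ?iaId WHERE {
  ?copy wdt:P1574 wd:Q53738443.            # exemplar of (P1574) the edition item
  OPTIONAL { ?copy wdt:P724 ?iaId. }       # Internet Archive ID (P724), if recorded
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

With this in place, a query starting from the edition can reach every scan through its copies, which is one alternative to duplicating the identifiers on the edition item.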

Annotations for edition number (P393), publisher (P123), and printed by (P872)[edit]

In each case I have used stated as (P1932) to indicate how the name was actually stated on the title page. I hope this is acceptable. IMO it's for example quite useful to know that eg all copies of England Delineated (2nd edition) (Q52229333) were stated to be "Second Edition, with Additions and Corrections" -- this doesn't eg indicate an edition "2.1" following on from an initial release "2.0".
Where there isn't yet an item for the printer or publisher, I have used the special <some value> value, and then annotated it with the text from the page -- as eg at Q52283171#P872. In fact, this is what I am intending to do systematically for printers and publishers on initial item upload, then going back to see if there are ones I can match. I hope this is acceptable.
Also, where there is an identifiable address, I have put this in a located at street address (P969) qualifier. Again, it's not suggested on the Books item style page, but I hope this is considered reasonable. With enough of these, it may be quite nice to be able to track the different addresses for a printer or publisher over time, with the works issued from each one.
The only complication I found was with A Topographical Dictionary of Wales (3rd Edition) (Q52243033), where the publisher's address changed during the print-run and/or re-issues of the edition. You can see how I've dealt with this, but I am open to suggestions, if anyone has a better thought.
Wrong. You mix the person and the company. You have to create an item for the company, i.e. S. Lewis and Co., and the address should be saved in the item of the company, not in the book's item. Again, you mix data about different concepts into one item. An address is not a characteristic of a book but of a company. Snipre (talk) 22:43, 29 April 2018 (UTC)

Annotations for full work available at (P953)

What qualifiers/annotations are recommended for P953? You can see what I have done at eg Q52243033#P953. Is this appropriate, or are there other things that should be added? There are quite a lot of potential qualifiers in current use, eg tinyurl.com/ya5g4mxa. Are there any that it would be particularly valuable to try to make a point of including?
(BTW, I am presuming that if an edition has Google Books ID (P675) or HathiTrust ID (P1844) or Internet Archive ID (P724), then that suffices and it is unnecessary to add a P953 to the same scan?)
Also (perhaps related to the question that User:MartinPoulter raised a few threads above), what is the best way to indicate that a site offers eg a cleaned-up transcription of the full text, such as at Q52243156#P953, rather than the more common page scans + OCR ?
One other question that came up was how best to indicate multiple volumes available at the same link -- for example with Q52241009#P1844, both volumes are available at the same link (but as two different files). I indicated this by adding both volume (P478) = 1 and volume (P478) = 2 as qualifiers on the same statement. But at Q52241558#P953 the two volumes have been combined together into a single scan-file (they may also have been bound together). I tried to indicate this using volume (P478) = "1 & 2", but the constraint checker doesn't like this. Is the preferred way therefore to do what I did for the Hathi trust case? Or is there a different way to indicate two volumes together?
Same as above: instead of creating a bunch of qualifiers, create one item for the scan or the electronic version with all data including the link to the online version. Snipre (talk) 22:46, 29 April 2018 (UTC)
  • There is a qualifier to identify a specific page in a pdf: title page number (P4714), information that can't be stored otherwise. For the remainder, I think it depends how far you want to go. If you think a detailed description is needed, it might be preferable to create separate items.
    --- Jura 11:42, 4 May 2018 (UTC)

England Described (1818) (Q52284408)

I wasn't sure how to treat this. Should it be treated as an edition, or would it be more appropriate to treat it as a new work in its own right?
On the one hand, it is a much more extensive enlargement and rewriting of England Delineated (Q52228403) than the previous new editions. But on the other hand, it is an enlargement of Q52228403, albeit with a lot of new material, leading to a somewhat different focus.
If one is looking down the list of editions at England Delineated (Q52228403), is it helpful to see it included? (Google in fact titles it as such). Or would a stand-alone item and based on (P144) have made more sense?
There is no rule for that: at what point does a modified edition become a new edition, or even a new work? Usually the contributor who is adding the version has to choose, based on expert or historical considerations. Snipre (talk) 22:51, 29 April 2018 (UTC)
I'm not sure what to use for the P31-statement but to express the relationship to England Delineated (Q52228403) you could use modified version of (P5059) instead of edition or translation of (P629) (or based on (P144)) to express that it is not a direct edition or translation but a modified version. - Valentina.Anitnelav (talk) 15:31, 3 May 2018 (UTC)

Does there *always* need to be a separate work and edition item?

Finally, if (as would be the case with England Described (1818) (Q52284408)), this is the only time the title was issued, is it appropriate to try to combine 'work' and 'edition' in the same item (as OpenLibrary does, or at least displays) ? Or is it still required to create two items, even though they will be rather redundant to each other?
Thanks in advance, Jheald (talk) 00:10, 29 April 2018 (UTC)
Any more thoughts about this?
Per guide to item structure on the project page, a lot of properties are expected to be located on version (Q3331189) items.
So, in cases when there has only ever been one edition, if we do accept that only a single item should be created, it would make sense to me for it to be made instance of (P31) both version (Q3331189) and book (Q571).
If we do go down this route, it might be helpful to include edition or translation of (P629) statement pointing to itself -- I think query writers would find this useful, so that separate edition and work items and combined edition/work items could both be dealt with in the same way.
Does that seem a sensible suggestion to people? Jheald (talk) 22:28, 18 May 2018 (UTC)
Yes, if and only if both concepts are merged inside the same item, meaning that all properties linking the edition and the work, as well as all edition properties and all work properties, are present in the same item, then we can consider that solution. The risk is more about constraints: we will complicate the monitoring of property use. Snipre (talk) 23:50, 20 May 2018 (UTC)
@Snipre: So you're saying it would also need has edition (P747) on the item, pointing to itself? Jheald (talk) 09:24, 21 May 2018 (UTC)
@Jheald: Exactly. That's the only way for Lua scripts retrieving data from both work and edition items to be able to work without complicating the code. But again, this solution is possible but not recommended, as we will have problems with some constraint definitions. Snipre (talk) 11:13, 21 May 2018 (UTC)
Question: Has England Described (1818) (Q52284408) even been published in translation? That would require a separate data item for the edition. So, we're talking about finding a way to have a single data item for a work that was issued only once, in only one language, by only one publisher, from only one location, and never translated nor reprinted. --EncycloPetey (talk) 00:16, 21 May 2018 (UTC)
@EncycloPetey: Agreed. But it's not such an uncommon case -- in fact I would think it is the most common situation for most classes of old books. Jheald (talk) 09:22, 21 May 2018 (UTC)
That's not my experience with old books. My experience is that many were published as UK/US editions, or were published in another language, or had the contents appear later in another edition. --EncycloPetey (talk) 15:07, 21 May 2018 (UTC)
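To make the self-reference idea concrete, here is a minimal sketch of why pointing edition or translation of (P629) and has edition (P747) back at a combined item would let one lookup serve both layouts. It uses a toy in-memory dictionary, not the Wikibase data model; the item IDs (`Q_work`, `Q_ed1`, `Q_ed2`, `Q_combined`) and the helper `editions_of` are hypothetical placeholders.

```python
# Toy in-memory model; item IDs and the helper name are hypothetical.
ITEMS = {
    # Conventional layout: one work item with two separate edition items.
    "Q_work": {"P31": ["book"], "P747": ["Q_ed1", "Q_ed2"]},
    "Q_ed1": {"P31": ["version"], "P629": ["Q_work"]},
    "Q_ed2": {"P31": ["version"], "P629": ["Q_work"]},
    # Combined layout: P629 and P747 both point back to the item itself.
    "Q_combined": {
        "P31": ["book", "version"],
        "P629": ["Q_combined"],
        "P747": ["Q_combined"],
    },
}


def editions_of(work_id, items=ITEMS):
    """Follow has edition (P747) and keep targets that link back via P629."""
    return [
        edition_id
        for edition_id in items[work_id].get("P747", [])
        if work_id in items[edition_id].get("P629", [])
    ]
```

With this data, `editions_of("Q_work")` and `editions_of("Q_combined")` go through the same code path, which is the convenience being argued for; the sketch says nothing about the constraint problems raised in the replies.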

subject areas and genres

Sometimes genre (P136)-statements have subject areas/academic disciplines as their values. The most frequent are philosophy (Q5891), art history (Q50637) and history (Q309), but there are also cases like statistics (Q12483) and finance (Q43015). I see that this somehow mirrors the practice in book shops but I'm rather sceptical if it is the best way to express the fact that a work is of interest for a certain discipline. I see following options to deal with subject areas used as genres:

  1. Generally allow instances of academic discipline (Q11862829) to be used as values in genre (P136)-statements
  2. Create a new genre-item for each subject area that is used as a genre
  3. Expand the scope of an already existing property to be applicable for those cases, too (field of work (P101) is the one that comes to my mind, but maybe there are others)
  4. Create a new property <subject area> that has works as its domain and subject areas as its value

I don't really like the first two approaches (they tend to misuse genre (P136) as a catch-all), but what do you think about this issue? - Valentina.Anitnelav (talk) 14:53, 3 May 2018 (UTC)

@Valentina.Anitnelav: Yes, I agree that there is some confusion, mainly because no clear classification exists for written texts.
In my opinion we need 4 properties to describe correctly
If I take the examples you provided, history, finance, statistics,... these are subjects, not genres. If I had to characterize your examples, I would propose textbook (Q83790) as written format and essay (Q35760), treatise (Q384515), scientific writing (Q1965486),... as written genre.
History, finance, statistics are not genres but subjects, and the property main subject (P921) should be used instead of genre (P136). Snipre (talk) 01:04, 4 May 2018 (UTC)
I mainly agree with you, Snipre, and I especially like the idea to separate between form (or written format) and genre.
I also thought about using main subject (P921) for subject areas. I abandoned the idea as this is actually not very accurate: the subject area of a work is seldom the main topic. See for example The religious and historical paintings of Jan Steen (Q29589359). It is a catalogue about Jan Steen (Q205863), not about art history. Art history is the subject area this book is written in or of interest for. On the Genealogy of Morality (Q230302) is about morality, not about philosophy (unlike The Problems of Philosophy (Q3393210)) - Valentina.Anitnelav (talk)
@Valentina.Anitnelav: You can add several subjects, so I think you can really say that the book is about art history and Jan Steen (Q205863), and even add the list of works mentioned in the catalog. Snipre (talk) 10:03, 4 May 2018 (UTC)
@Snipre: I see a problem with the use of main subject (P921) because those statements would be inaccurate (not because of the number of values). On the Genealogy of Morality (Q230302) is not about philosophy (e.g. its principles, questions, methods, development) and The religious and historical paintings of Jan Steen (Q29589359) is not about art history (e.g. its principles, questions, methods, development). It should be possible to get all books having philosophy as its main topic (e.g. The Problems of Philosophy (Q3393210) and What is Philosophy? (Q7991586)) without getting every book in the field of philosophy. - Valentina.Anitnelav (talk) 10:56, 4 May 2018 (UTC)
That's why libraries often use "Schlagwortketten" (subject strings) like "Philosophie - 19. Jahrhundert - Nietzsche - Moral". Imho main subject (P921) is the right property, since an editor is free to add a second "main subject" or replace a general subject like "philosophy" with a more precise term like "Frankfurt School". --Kolja21 (talk) 01:36, 11 June 2018 (UTC)
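For anyone wanting to see how widespread the practice is, a rough sketch of a query for genre (P136) statements whose value is an instance of academic discipline (Q11862829) follows. It only builds the SPARQL text; the function name is hypothetical, the `wd:`/`wdt:` prefixes are assumed to be those predefined by the Wikidata Query Service, and disciplines that are not direct instances of Q11862829 would be missed.

```python
def discipline_genre_query(limit=100):
    """Build SPARQL text listing works whose genre (P136) value is an
    instance of academic discipline (Q11862829)."""
    return (
        "SELECT ?work ?genre WHERE {\n"
        "  ?work wdt:P136 ?genre .\n"
        "  ?genre wdt:P31 wd:Q11862829 .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )
```

The returned text can be pasted into the query service; nothing here claims what the result set looks like.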

Time for a new "subject facet" property ?

@Valentina.Anitnelav, Snipre: Further to the above, I wonder if it would be useful to propose a new "subject facet" property ?

For the Biodiversity Heritage Library (BHL) books, discussed in this section below, that we now have 60,000 items for, the BHL releases a 'keywords' dataset, and it would be useful to think about how best to add it.

Looking at keywords that have more than 400 hits (from the volumes of the whole collection, not just the items we have titles for), a few we might consider to relate to the form of the item (ie what the item is), viz:

Periodicals (43829); Catalogs (10750); Pictorial works (1635); Internet resource (1455); Electronic books (929); Collected Works (704); Early works to 1800 (682); Catalogs and collections (571);

But mostly they are indicative of the subject matter, ie:

Natural history (10273); Science (9892); Botany (8022); Nursery stock (7413); United States (6392); Plants (6044); Birds (6012); Seeds (5556); Zoology (4745); Nurseries (Horticulture) (4643); Flowers (4330); Plants, Ornamental (3622); Agriculture (3482); Entomology (3219); Gardening (2985); Trees (2984); Seedlings (2855); Vegetables (2828); Geology (2746); Insects (2739); Fruit (2705); Insect pests (2670); Societies, etc (2635); Forests and forestry (2491); Fruit trees (2414); Great Britain (2329); Paleontology (2210); Control (2184); California (2184); Shrubs (2109); New York (State) (2078); Biology (2068); Angiospermas (2017); Bulbs (Plants) (1939); Flora (1803); Mollusks (1799); Germany (1717); Classification (1703); North America (1661); Fisheries (1651); Bibliography (1604); Horticulture (1577); Fishes (1512); Horses (1264); Equipment and supplies (1262); Australia (1220); France (1215); Ornithology (1177); Massachusetts (1155); Diseases and pests (1114); Montana (1091); Grasses (1089); England (1049); Canada (1043); Description and travel (1005); Research (1004); Pennsylvania (999); Alberta (965); History (924); Mexico (916); Anatomy (871); Fruit-culture (868); Illinois (845); India (834); Italy (818); Europa (809); Washington (State) (809); physiology (797); Oceanography (796); Taxonomía (782); Ohio (771); Hunting (750); Varieties (740); Península Ibérica (739); Iowa (725); Marine biology (708); Mammals (689); Pteridófitos (682); Gimnospermas (655); Scientific Expeditions (643); Botanical illustration (637); Roses (630); Lepidoptera (614); Bees (611); America (601); Agricultural implements (585); Berries (585); Statistics (584); North Carolina (583); Prices (582); Beetles (581); Europe (578); Animals (572); Forest reserves (566); Poultry (552); Fishing (544); Obras clásicas (543); Antiquities (534); Seattle (533); Game and game-birds (532); New Jersey (516); University of Washington Botanic Gardens (516); Anatomy, Comparative (514); Colorado (508); Diseases (508); Hongos y líquenes (494); Michigan (491); Evolution (488); Environmental aspects (482); Fungi (481); Veterinary medicine (477); 1809-1884 (463); Plant diseases (462); Engelmann, George, (461); Forest management (461); Wildlife conservation (460); Briófitos (458); Ethnology (455); Medicine (453); Beneficial insects (453); Natuurlijke historie (442); Austria (440); Floriculture (439); New York (439); Field notes (438); Africa (435); Indonesia (429); Plant collecting (427); Florida (424); Learned institutions and societies (421); Microscopy (413); Asia (412); Plantas útiles o venenosas (412); Minnesota (402); Identification (401);

But these are not, in almost all cases, the main subject (P921) of the item. Instead they are more like en:faceted search terms.

So, for example, if we take a book like A systematic arrangement of British plants :with an easy introduction to the study of botany (Q51423679), library catalogues might give the subject as "Botany -- Great Britain" and "Botany -- Ireland" (those two from OCLC, which for copyright reasons we can't take; but an edition at the LoC might have something quite similar). This would correspond to our P921.

On the other hand, the BHL [11] gives keywords "Great Britain", "Ireland", "Plants".

For the reasons Valentina was expressing above, I think these need a different property, that might perhaps be called "subject facet". What do people think? Jheald (talk) 10:55, 8 June 2018 (UTC)

Proposed, at Wikidata:Property_proposal/Creative_work#subject_facet Jheald (talk) 23:20, 10 June 2018 (UTC)

Pauly-Wissowa

Hi! I've just noticed that the volumes of Paulys Realenzyklopädie der klassischen Altertumswissenschaft (Q1138524) have instance of (P31)  Wikimedia category (Q4167836) (e.g. Pauly-Wissowa vol. I,2 (Q26414652)); however, this conflicts with the constraint on published in (P1433) (e.g. in RE:Ancites (Q15892059)). The problem affects thousands of items and creates thousands of constraint violations. My proposal is to transform the items of the volumes from categories to items of books (e.g. Pauly-Wissowa vol. S I (Q26469375) is actually both!). What do you think? --Epìdosis 15:22, 9 May 2018 (UTC)

✓ Done. --Epìdosis 08:40, 30 May 2018 (UTC)

Award for book or for author ?

Hi, I wonder where award received (P166) should go: on the author item or on the book item. Is it acceptable to have it duplicated on both? What about inconsistencies? Excuse me if this has been discussed before and I didn't find it. Thanks, Amadalvarez (talk) 07:27, 12 May 2018 (UTC)

I have added many awards to people. When an award is also associated with a book, it is easy enough to add them as qualifiers. A secondary notion is that authors are more likely to have an item than a book. Thanks, GerardM (talk) 08:08, 12 May 2018 (UTC)
@GerardM: When you say "... to add them as qualifiers.", under which property would you use award received (P166) as a qualifier? Maybe author (P50)? Thanks, Amadalvarez (talk) 22:35, 12 May 2018 (UTC)
In awards such as the Hugos, sometimes the same author can be nominated (finalist) to the same category two times for two different works, so I interpret that in those cases the award winners are not the authors but the works. However, since the winner (P1346) property is a person, I add award received (P166) to both person and literary work. Also, if your local Wikipedia templates support Wikidata integration, this way they can automatically show the awards received by the person in his/her infobox, and by the literary work in its infobox too. --JavierCantero (talk) 08:28, 13 May 2018 (UTC)

Recording the edition format

One of the English-language aliases for property distribution (P437) is "book format".

Is distribution (P437) appropriate to record the format of the books in a particular edition -- eg folio (Q772267), quarto (Q2122442), octavo (Q1307353), duodecimo (Q1266414) etc -- as at eg Q53576187#P437 ?

Or should the values allowed for distribution (P437) be restricted to the currently permitted hardback (Q193955), paperback (Q193934), pamphlet (Q190399), softcover (Q990683), Library binding (Q6542551); and perhaps a new property be introduced for book format, akin to newspaper format (P3912) ? Jheald (talk) 20:36, 19 May 2018 (UTC)

Number of pages

We have the property number of pages (P1104).

If a source gives the number of pages for a book as eg "viii, 187 p." or "338, xlviii p.", are there agreed values to attach to an applies to part (P518) qualifier to denote the number of pages of front matter (front matter (Q24033349)), main content, and appendices respectively ? Jheald (talk) 20:47, 19 May 2018 (UTC)

I would love to hear a cataloguer's professional point of view. From my experience with reproducing old works there is no such thing as uniformity, so at best guess you are seeing the last numbered page of each section. Page numbering in the fore sections of a work is often variable; some will label plates, some will not. "Number of pages" as a concept is itself problematic, and it changes through time. If we are going to go via sections, then we would also need to start recording the number of plates in a work. Then, to make things more complex, when there are additional editions you can even see inserted pages with nnnA, nnnB, ... so they didn't have to renumber the whole work. Dashed variability and changes in time!  — billinghurst sDrewth 23:28, 20 May 2018 (UTC)
"last numbered page of each section" makes a lot of sense. My second example above is in fact actually stated as "238 [338], xlviii p." in the catalogue -- presumably the last numbered page was also wrongly numbered in this case!
Do you think it would be worth a specific new property, last numbered page? Jheald (talk) 09:18, 21 May 2018 (UTC)
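As an illustration of the "last numbered page of each section" reading, here is a minimal sketch that splits a catalogue pagination statement such as "viii, 187 p." into its sequences. The function names and the output shape are hypothetical, and the handling of a bracketed correction (as in "238 [338]") assumes the bracketed figure is the intended last page. Which sequence counts as front matter, main content or appendix is left open, since that is exactly the modelling question.

```python
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

# One page sequence: a roman or arabic number, optionally followed by a
# bracketed correction such as "238 [338]".
SEQUENCE = re.compile(r"^\s*([ivxlcdm]+|\d+)\s*(?:\[(\d+)\])?\s*$", re.IGNORECASE)


def roman_to_int(numeral):
    """Convert a roman numeral (any case) to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.lower()]
    total = 0
    for i, value in enumerate(values):
        # Subtractive notation: a smaller value before a larger one.
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def parse_pagination(statement):
    """Split e.g. 'viii, 187 p.' into one entry per page sequence."""
    body = re.sub(r"\s*p\.?\s*$", "", statement.strip())  # drop trailing "p."
    sequences = []
    for part in body.split(","):
        match = SEQUENCE.match(part)
        if match is None:
            raise ValueError(f"unrecognised page sequence: {part!r}")
        printed, corrected = match.groups()
        if corrected is not None:
            last_page = int(corrected)  # assumed: bracket gives the true number
        elif printed.isdigit():
            last_page = int(printed)
        else:
            last_page = roman_to_int(printed)
        numerals = "arabic" if printed.isdigit() else "roman"
        sequences.append(
            {"numerals": numerals, "printed": printed, "last_page": last_page}
        )
    return sequences
```

Each returned entry could then be recorded as its own number of pages (P1104) statement with a qualifier, or under a new "last numbered page" property if one is created; the sketch does not cover plates or inserted pages like nnnA.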

Using volume as a unit

I have been using volume (Q1238720) as a unit for the property number of parts of a work of art (P2635), eg at Q53574199#P2635.

Does anyone know if there is a way to make volume appear in the plural, ie as volumes ? Jheald (talk) 09:36, 20 May 2018 (UTC)

@Jheald: wouldn't the application of plural be something that is more general, and probably be language specific? As we know for many languages there is a general rule, and maybe it could be managed by a general rule with exceptions, though still that will be extensive when you get to number of languages.  — billinghurst sDrewth 23:18, 20 May 2018 (UTC)
@billinghurst: I was thinking of perhaps a slightly more general mechanism, as to whether there was a string that could be specified (in each language) to be shown when the item is being used as a unit. (Though that would still leave the single/plural issue, but we don't usually state when a book or edition is only a volume). I found and added unit symbol (DEPRECATED) (P558) and unit symbol (P5061), but they don't seem to help. Jheald (talk) 08:46, 21 May 2018 (UTC)

Do we keep both or merge?

How would we handle a situation such as Our native ferns and their allies; with synoptical descriptions of the American Pteridophyta north of Mexico (Q51515725) and Our native ferns and their allies; with synoptical descriptions of the American Pteridophyta north of Mexico (Q51515726)?

These two items are for two different scans of the same edition of a book. The first was scanned by Cornell University from their holdings, but the second was scanned from the University of California libraries. It is the same edition, just different scans from copies at different libraries. --EncycloPetey (talk) 01:22, 30 May 2018 (UTC)

@EncycloPetey: You need one item for the edition, without any data about the scan (available at URL,...) and 2 items, one for each scan. The items about the scan should be linked to the edition item using exemplar of (P1574) and defined as instance of exemplar (Q512674). We already discussed that problem above (see Wikidata_talk:WikiProject_Books#Different_scans_of_the_same_edition). @Jheald: You proposed to use instance of individual book (Q53731850) for a particular exemplar: I don't like the term "book", because this term is not well defined and some people can use it for the edition or even for the work. I think exemplar (Q512674) is more neutral: we have work, edition and exemplar, and all can be a book depending on the point of view. Snipre (talk) 11:57, 30 May 2018 (UTC)
That method presents a very real problem, however. Each item on Wikisource has been labelled an "edition" or "translation" up until now, but those "editions" are typically backed by a specific scan. So, if we do what you're suggesting, then every single Wikisource-hosted copy will have to be redone, because you're suggesting they are actually exemplars. And thus, every Wikisource copy will have (1) an exemplar item where the Wikisource copy is linked, (2) a separate edition item, and (3) associated work item where the Wikipedia entry is linked. --EncycloPetey (talk) 15:02, 30 May 2018 (UTC)
@EncycloPetey, Jheald: Wikisource does what it wants and I don't concern myself with the model they choose. WD has to deal with other kinds of data: first, some exemplars can have a history, like the bible of George Washington. Then, particular exemplars have some characteristics like library identifiers, so WD has to have specific items to deal with that kind of data. Finally, you didn't distinguish between editions and print runs: one edition can have several print runs, and each print run has small differences which can change the references (for example, some data can be printed on a different page number). For these reasons WD has to have a specific item for each exemplar. Snipre (talk) 22:06, 7 June 2018 (UTC)
@Snipre: If somebody has a particular need for a particular exemplar then they can create an item for it. But in general, WD does not need to have a specific item for each exemplar. Lazy creation at the time of need will usually be quite sufficient. Jheald (talk) 22:15, 7 June 2018 (UTC)
I would merge the two into one item, and use collection (P195) as a qualifier to distinguish where a particular scan-set is taken from.
There may be a few complicated cases where it may make sense to create distinct items for specific exemplars, but in most cases I think that would be an unnecessary complication. Jheald (talk) 15:22, 30 May 2018 (UTC)
@EncycloPetey, Snipre, Jheald: in this case, I would merge too; maybe create an item about exemplars but not for the scans. Snipre: is your point to consider scans as a fifth level of FRBR? Because scans are not exemplars at all. If a library scanned the same exemplar 10 times (which is not unusual at all), do you really suggest creating items for 1 work, 1 edition, 1 exemplar and 10 scans? PS: does anyone know how to contact openlibrary to ask for a merge of OL26454530M and OL7247817M (indicated as two different editions but it's clearly the same one)? Cdlt, VIGNERON (talk) 13:31, 12 June 2018 (UTC)
@EncycloPetey, VIGNERON, Jheald: Not exactly, I am not creating a new level, I just consider a hard copy of a book and its scan as two exemplars in terms of FRBR. So if I take your example of a library making 10 scans of a paper copy of a book, then we have one work, one edition and eleven exemplars. I don't consider the scan as a sublevel of the hard copy but as an equal level. Why? That is the only way to correctly record all the data of the scans without mixing them. Each scan can have one specific identifier, one specific URL for online access, one specific creation date,... how can you treat all that information in one item? How can you retrieve the specific data of one scan when everything is mixed? The question is not whether we can merge the items; the question is which data about scans can be described in WD. If you have 2, 3 or more data per scan, then you HAVE to create several items. So unless you can ensure that, now and in the future, no more than one data per scan will be possible, we have to create several items, one for each scan or hard copy.
By the way, explain to me how you plan to add the data about one scan if nobody specifies the data about the hard copy which was used for the scanning? The scan, if it can be identified by any set of specific data, is an exemplar, and the scanning can be considered analogous to a translation operation: we still need a property to indicate which edition was used as the original text for a translation. In some cases the original version in the original language was not used as the text for translation. Ex.: an original version in English was translated into German and the German version, not the English one, was used to generate a French version. In that case, we should be able to specify that relation. We could use the same property to create a relation between a hard copy and a scan, if we have enough data about both "texts" to justify the creation of 2 items. Snipre (talk) 22:39, 13 June 2018 (UTC)
@Snipre: oh ok, I see. That can make some sense. For the FRBR, exemplars are only physical (and not always unique), see FRBR, pages 24, 47-48, but as FRBR itself says, « dynamic nature of entities recorded in digital formats merit further analysis ».
In my example, there is only one physical document, the 10 scans are not materialized (reprints exist but are rare, most digital contents stay only digital). I don't see any problem with putting 10 URLs of the same physical exemplar on one item for the exemplar; the URL is the only property specific to the scans, everything else is specific to the exemplar (and often specific only to the edition). I agree with your thought but you miss an important point: except for the URL (and derivatives of the URL like identifiers), 99% of the time, there is no specific data about specific scans. For exemplars, it's already quite rare to have specific data (the collection and the history of owners and that's it).
To go back to the original example here, what data would justify having 2 exemplar items? I can understand one item for the edition and one for the exemplar, but right now both are about the edition (and the same 6th edition). I would propose to merge the two current items and maybe create an item for the exemplar (but do we really need an item about the exemplar? I'm not even sure)
Cdlt, VIGNERON (talk) 08:39, 14 June 2018 (UTC)
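To compare the two positions side by side, here is a minimal sketch of the "merge and qualify" option: one edition item carrying two Internet Archive ID (P724) statements, each with a collection (P195) qualifier. The structure is a simplified dictionary, not Wikibase JSON, and the identifier strings and the helper name are hypothetical placeholders; only the two holding institutions come from the thread.

```python
# Simplified stand-in for one merged edition item; identifier values are
# hypothetical placeholders, not real Internet Archive IDs.
EDITION = {
    "P724": [
        {
            "value": "example-scan-cornell",
            "qualifiers": {"P195": "Cornell University"},
        },
        {
            "value": "example-scan-california",
            "qualifiers": {"P195": "University of California libraries"},
        },
    ],
}


def scan_ids_by_collection(item, prop="P724"):
    """Map each collection (P195) qualifier value to the scan identifier."""
    return {
        statement["qualifiers"].get("P195"): statement["value"]
        for statement in item.get(prop, [])
    }
```

This works while each scan only needs an identifier plus its source collection. Once a scan needs several independent facts of its own (its own URL, creation date, further identifiers), the qualifiers no longer keep them together, which is the case where separate exemplar items are being argued for above.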

Biodiversity Heritage Library

Announcing: Wikidata:WikiProject BHL

The Biodiversity Heritage Library (Q172266) is a large multi-institution project to digitise and make available literature from the past relating to zoology, botany, and the diversity of life.

A few weeks ago, Magnus's Reinheitsgebot created Wikidata items for 63,000 BHL titles out of the (currently) 136,000 in Mix'n'match catalogue 1131. Since then some progress has been made identifying Wikidata items for BHL creators and adding BHL creator ID (P4081); replacing author name string (P2093) with author (P50); and adding some further fields from online sources; but there is still a considerable way to go.

For current statistics, see the dashboard pages for title progress and creator progress now created at Wikidata:WikiProject BHL.

The data has its quirks. The BHL 'title' dataset combines various different sorts of material, including books, periodicals, catalogues, individually bound article reprints, technical reports, etc. Initially these have all been given instance of (P31) = publication (Q732577); a few (but not all periodicals) have now been given instance of (P31) = periodical literature (Q1002697) based on keywords from BHL. It will be quite a challenge to further refine the identification of the material.

Also be aware that the BHL dataset of 'creators' for the titles (currently imported as author (P50) / author name string (P2093)) actually includes people with a considerable variety of relationships to the printed material -- including authors, editors, illustrators, corporate sponsors, various other contributors to works, even former owners of the texts in a few cases. This too could usefully use quite a lot of refinement.

But it's an important collection. Commons currently includes almost 250,000 files from the BHL, coordinated through the c:Commons:Biodiversity Heritage Library project page -- so work structuring the information here may make a real difference to building new pathways to make those Commons images more accessible. With 60,000 titles, I think it's also a very useful test-set to work on, to put our ideas for book data into practice, and to see what practical issues and questions arise, when applying them to a (very diverse) real-world sample of this size.

Anyone with an interest in this data, and/or ideas on how to improve it, is very welcome to add themselves to the Participants section at the bottom of Wikidata:WikiProject BHL page. Jheald (talk) 16:12, 7 June 2018 (UTC)

How many edition items for On the Origin of Species ?

We currently have 24 different items for English-language versions of On the Origin of Species (Q20124), mostly arising from different copies scanned for the Biodiversity Heritage Library (Q172266) (plus four more versions that are translations).

This query, tinyurl.com/yd4yjod9 gives a summary, because the list on Q20124 has become pretty much impossible to navigate.

Question: How many of these items should we keep, and which (if any) should we merge?

Background: Darwin himself produced six editions of the text, the first and the last being

His official authorised publishers were John Murray (Q1232629) in London, and D. Appleton & Company (Q3011053) in New York. With the exception of On the Origin of Species (1859) (Q20968204), all of the Murray and Appleton items that we have correspond to the 1872 version of the text.

The page-counts in the 'pp' column of the query correspond to the number of scan frames, so small differences here may not be that significant: the highest-numbered page of the Murray 1872, 1880, and 1886 copies is in each case 458; on the other hand that for the 1910 "popular impression" is 432. The Appleton copies are complicated by being in two volumes, sometimes bound together and sometimes not. The highest-numbered page of the second volume is 338 for the two 1889 copies, and 339 for the 1899, 1909, 1915, and 1917 copies.

Other publishers produced versions that may or may not correspond to Darwin's final 6th edition, depending on the copyright observance and/or expiry and/or their own particular whims.

The 1902 and 1905 Collier copies (numbered as two volumes) both conclude at page 356; the 1909 "Harvard Classics" edition (single-volume) from the same publisher concludes at page 552.

The 1872(?) and 1899 Burt copies would both conclude at page 538 (except this is missing in the 1899 copy); the earlier copy then adds several pages giving a list of other works in Burt's "Library of the World's Best Books". This pagination also matches the Merrill and Baker copy, from a series called "World's Famous Books".

The Hurst and the Caldwell copies appear to have identical pagination (final page 501), though their title pages are different.

The Books Inc. copy re-orders the material (moving the historical preface to the end), and omits both the index and the comparison of the 6th with earlier editions.

... etc ...

So: how to bring sense to all of this?

We have a large number of copies based on the same text; some are also based on the same typography and pagination; some may even be from the same printing (Q51515167 / Q51515141); though even then we have different scannings, with each scanning available from multiple different sources (i.e. BHL vs IA).

The page at On the Origin of Species (Q20124) brings little sense of any of this; it certainly doesn't group together the copies based on the same underlying text.

Does it make sense to differentiate e.g. the three John Murray texts between 1872 and 1886, all with the same pagination, or would it make sense to group these in some way?
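One possible way to picture such a grouping is sketched below. This is only an illustration, not a proposed data model: it buckets copies by their highest-numbered page, using only the figures quoted above for the single-volume Murray, Hurst and Caldwell copies. The names `COPIES` and `group_by_last_page` are made up for this sketch, and the years of the Hurst and Caldwell copies are left as `None` because they are not stated above.

```python
from collections import defaultdict

# (publisher, year, highest-numbered page), taken from the figures quoted in
# this discussion. Years not stated in the discussion are left as None.
COPIES = [
    ("John Murray", 1872, 458),
    ("John Murray", 1880, 458),
    ("John Murray", 1886, 458),
    ("John Murray", 1910, 432),  # the "popular impression"
    ("Hurst", None, 501),
    ("Caldwell", None, 501),
]


def group_by_last_page(copies):
    """Bucket copies that share the same highest-numbered page.

    Identical pagination is only a hint that two copies share a typesetting;
    it does not prove they are the same edition or printing.
    """
    groups = defaultdict(list)
    for publisher, year, last_page in copies:
        groups[last_page].append((publisher, year))
    return dict(groups)


if __name__ == "__main__":
    for last_page, members in sorted(group_by_last_page(COPIES).items()):
        print(last_page, members)
```

Grouping on pagination alone would put the Hurst and Caldwell copies together even though their title pages differ, so any real grouping would presumably need publisher and title-page information as well.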

How can we best bring some order to all of this? Jheald (talk) 19:46, 12 July 2018 (UTC)

Publishers and imprints

Aubrey
Viswaprabha (talk)
Micru
Tpt
EugeneZelenko
User:Jarekt
Maximilianklein (talk)
Don-kun
VIGNERON (talk)
Jane023 (talk) 08:21, 30 May 2013 (UTC)
Alexander Doria (talk)
Ruud 23:15, 24 June 2013 (UTC)
Kolja21
arashtitan
Jayanta Nath
Yann (talk)
John Vandenberg (talk) 09:14, 30 November 2013 (UTC)
JakobVoss
Danmichaelo (talk) 19:30, 16 February 2014 (UTC)
Ravi (talk)
Mvolz (talk) 08:21, 20 July 2014 (UTC)
Hsarrazin (talk) 07:56, 9 August 2014 (UTC)
Accurimbono
Mushroom
PKM (talk) 19:58, 10 October 2014 (UTC)
Revi 16:54, 29 November 2014 (UTC)
Giftzwerg 88 (talk) 23:36, 1 January 2015 (UTC)
Almondega (talk) 00:17, 5 August 2015 (UTC)
maxlath
Jura to help sort out issues with other projects
Epìdosis
Skim (talk) 13:52, 24 June 2016 (UTC)
Marchitelli (talk) 12:29, 5 August 2016 (UTC)
BrillLyle (talk) 15:33, 26 August 2016 (UTC)
Alexmar983 (talk) 23:53, 28 August 2016 (UTC)
Finn Årup Nielsen (fnielsen) (talk) 10:44, 29 August 2016 (UTC)
Chiara (talk) 14:15, 29 August 2016 (UTC)
Thibaut120094 (talk) 20:31, 14 September 2016 (UTC)
Ivanhercaz | Discusión 15:30, 31 October 2016 (UTC)
YULdigitalpreservation (talk) 17:35, 10 November 2016 (UTC)
User:Jc3s5h
PatHadley (talk) 21:51, 15 December 2016 (UTC)
Erica (ohmyerica) (talk) 19:26, 1 January 2017 (UTC)
User:Timmy_Finnegan
Mauricio V. Genta (talk) 05:38, 12 March 2017 (UTC)
Sam Wilson 09:24, 24 May 2017 (UTC)
Sic19 (talk) 22:25, 12 July 2017 (UTC)
Andreasmperu
MartinPoulter (talk) 09:21, 20 July 2017 (UTC)
Thelmadatter (talk) 01:11, 13 September 2017 (UTC)
Zeroth (talk) 15:01, 16 September 2017 (UTC)
Emeritus
Ankry
Beat Estermann (talk) 20:07, 12 November 2017 (UTC)
Shilonite - specialize in cataloging Jewish & Hebrew books
Elena moz
Oa01 (talk) 10:52, 3 February 2018 (UTC)
Maria zaos (talk) 11:39, 25 March 2018 (UTC)
Wikidelo (talk) 13:07, 15 April 2018 (UTC)
Mfchris84 (talk) 10:08, 27 April 2018 (UTC)
Mlemusrojas (talk) 3:36, 30 April 2018 (UTC)
salgo60 Salgo60 (talk) 12:42, 8 May 2018 (UTC)
Dick Bos (talk) 14:35, 16 May 2018 (UTC)
Marco Chemello (BEIC) (talk) 07:26, 30 May 2018 (UTC)
Harshrathod50
Notified participants of WikiProject Books

I'd like to do some work on publishers and imprints. Does anyone know of a standard reference or database (preferably freely accessible online) with info about the dates of publisher mergers, acquisitions, spinoffs, etc.? - PKM (talk) 19:22, 13 July 2018 (UTC)