Wikidata:Property proposal/description



Description


   Under discussion
Description (aliases: Definition, Abstract, Summary, Curatorial comment, Biographic note): a well-edited short description of the item, with mandatory reference. Keep it short (a couple of paragraphs): Wikidata is not a full-text repository, and respect others' copyright.
Represents: definition (Q101072), description (Q1200750), abstract (Q333291), summary (Q776754)
Data type: Monolingual text
Domain: any entity
Allowed values: Reuse some great regexps from title (P1476) that warn on LaTeX chars, HTML tags, etc.
Example 1: rocking chair (Q14963)
  • "A chair that is mounted on two curved strips of wood (called rockers or bends) which connect the front and back legs, allowing it to be rocked backward and forward. It usually has a high, straight, or slightly curved back and often open arms.\nUsed for comfortable sitting or relaxing while rocking."
    language: English,
    reference: Nomenclature for Museum Cataloging (P7749) 1128
  • "Siège généralement à dossier haut et parfois incliné, avec ou sans accotoirs, dont les pieds sont munis d'arceaux qui permettent de le faire osciller de l'avant vers l'arrière par un simple mouvement du corps.\nPermet à une personne de s'asseoir et de se balancer."
    language: French,
    reference: Nomenclature for Museum Cataloging (P7749) 1128
Example 2: The Virgin Cataphyge (Refuge) and St. John the Evangelist (Q84545297)
  • "Иконата е двустранна (лицева страна, от обратната страна е изобрзено иконата "Видението на пророците Иезекиил и Авакум"). Подарена е около 1395 г. от императрица Елена Палеолог, дъщеря на ктитора на манастира деспот Деян и внучка на цар Иван Александър, на Погановския манастир. Класическа красота, пропорционалност, аристократичност на колорита в духа на средновековната естетика."
    language: Bulgarian,
    reference: reference URL (P854) http://bidl.cc.bas.bg/viewobject.php?id=1&lang=bg
  • "A two-sided icon (the right side, on the back side is painted "The vision of the prophets Ezekiel and Habakkuk"). Donated about 1395 to the Poganovo monastery by Empress Elena Paleologus, daughter of the monastery's sposor Despot Deyan and granddaughter of Tsar Ivan Alexander. Classical beauty, proportionality, aristocratic colouring in the spirit of mediaeval aesthetics."
    language: English,
    reference: reference URL (P854) http://bidl.cc.bas.bg/viewobject.php?id=1&lang=en
Example 3: Mother of God Pantonhara (Q84296272)
  • "The Mother of God on this icon is depicted as an empress with infant Christ sitting on her left arm. Both the Virgin and Christ are shown in imperial regalia. She is dressed in a purple-pink maphorion decorated with engraved, golden and green floral ornaments. Her chiton is green, ornamented similarly as the maphorion and decorated with two golden, crossed stripes with floral decoration. She has a golden crown on her head, a scepter in her right hand, while in the left one she holds a huge stalk of wheat. Christ is dressed as the Grand Archpriest in a golden-orange vestment decorated with floral ornaments and golden stripes. In the right hand he holds a green sphere with cross, scepter in the left one and also a golden crown on the head. From left to the right there are shown: St. George, St. Demetrios, St. John the Baptist, St. Clement of Ohrid, St. Nicholas, St. Athanasios and St. George the New Martyr of Ioannina."
    language: English,
    reference: described by source (P1343) The icon of the Mother of God Pantonhara in the Icon Gallery (Q84291564); Academia.edu publication ID (P7896) 9843052
Source
  • external equivalent property: dc:description, dct:description, skos:definition, skos:description, schema:description;
  • external subproperty: dct:abstract, schema:abstract (schema:text), bio:olb
See also: title (P1476), name (P2561), official name (P1448), native label (P1705), subtitle (P1680), media legend (P2096), currency symbol description (P489)
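
The "Allowed values" field suggests reusing regexps that warn on LaTeX characters and HTML tags. A minimal sketch of what such a check might look like, in Python. The patterns and the function name `description_warnings` are illustrative assumptions; they are not the actual constraints defined on title (P1476).

```python
import re

# Hypothetical patterns for illustration only; NOT the real P1476 constraints.
HTML_TAG = re.compile(r"</?[A-Za-z][^>]*>")          # e.g. <b>, </i>, <br/>
LATEX_MARKUP = re.compile(r"\\[A-Za-z]+|\$[^$]+\$")  # e.g. \emph, $x^2$


def description_warnings(text: str) -> list[str]:
    """Return human-readable warnings for a candidate description value."""
    warnings = []
    if HTML_TAG.search(text):
        warnings.append("contains HTML tag")
    if LATEX_MARKUP.search(text):
        warnings.append("contains LaTeX markup")
    if text != text.strip():
        warnings.append("leading or trailing whitespace")
    return warnings
```

A clean value such as the rocking chair description would yield an empty list, while a value pasted from a web page with leftover markup would be flagged for review rather than rejected outright.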

    Motivation

    Unlike WD props, the labels and descriptions on top don't have provenance. Furthermore, descriptions are intended to be short and used only for disambiguation, and don't have any quality control. To compensate, we have title (P1476), name (P2561) (and a whole slew of sub-props or variants like official name (P1448), native label (P1705), subtitle (P1680), media legend (P2096), currency symbol description (P489)) for labels. But there is no similar prop for Descriptions.

    • I propose one prop for several different things for now (Definition, Abstract, Biographic note) and we can specialize such props later: what do you think?
    • In many cases it's important to respect newlines, how could that be supported?
    • Maybe we could even use this for Song lyrics or Poem text? But I think that's going too far and we should have a separate prop for this. Should I propose one? (Moebeus expressed copyright concerns, so I'm dropping this idea, even though there are MANY lyrics sites)

    Vladimir Alexiev (talk) 09:36, 7 February 2020 (UTC)

    Discussion

    Support Makes sense to me. WD description fields are currently being used in many ways that are not compatible with "front-facing" display on Wikipedia etc. A dedicated property would help with that, querying would be easier, constraints could be written, etc. Moebeus (talk) 10:04, 7 February 2020 (UTC)

    • Support --Crowjane7 (talk) 15:55, 7 February 2020 (UTC)
    • Support --Brimwats (talk) 08:33, 8 February 2020 (UTC) -- as libraries, archives, galleries, and museums draw upon and use Wikibase more (such as Project Passage) the description text becomes even more important and essential for use.
      • If another organization uses Wikibase and needs a certain property they can create it on their end. They don't need Wikidata to create properties. ChristianKl❫ 18:54, 10 February 2020 (UTC)
    • Support --illipmich (talk) 17:18, 7 February 2020 (UTC)
    • Oppose primarily for the overt generality of this property. At least with something like scope and content (P7535) there are some constraints on its use and how it is structured, where here no such limitations on this proposed property's use exist. With respect to the front-facing aspect, Wikidata descriptions should in fact be usable as something that can be exposed to an end-user; the stuff useful only to a Wikidata editor should be placed in a separate Wikidata property (à la Wikidata usage instructions (P2559)) but the proper display of this property within the interface is presently blocked for technical reasons. Mahir256 (talk) 18:23, 7 February 2020 (UTC)
    • Oppose I don't believe unstructured text belongs in Wikidata. We have had this conversation before and I can't remember precisely in what context - might have been "credit line". The sample text reads very much like the text type & length that used to be acceptable as a Wikipedia stub. That Wikipedia now has minimal length requirements is not a reason to put stubby text snippets in Wikidata. Also, such text descriptions of objects are generally copyrighted, if they are not from some 100-year-old catalog. Jane023 (talk) 19:22, 7 February 2020 (UTC)
    • Oppose: unrelated to structured data. Nomen ad hoc (talk) 21:14, 7 February 2020 (UTC).
    • Oppose inherently unstructured data. If it’s a block quote then it’s copyright problematic (certainly not cc0-pure like wikidata likes to keep things). Longform freetext properties seem to be the opposite of what wikidata’s about, I would have thought. Wittylama (talk) 22:48, 7 February 2020 (UTC)
      • As the proposed description says "respect others' copyright". It's no more copyright problematic than any other info. Eg Nomenclature (one of the examples) is released under appropriate copyright in its entirety, labels and descriptions alike. I challenge everyone to give examples of datasets where "data" is cc0 but descriptions are not.
      • @Wittylama: I am surprised that you as a GLAM person don't see the value of such a prop for GLAMs.
      • WD has many items that will probably never get a WP page. Such items will be the poorer if one can't record a rich description. Eg CHIN want to add rich descriptions to the 15k Nomenclature objects...
      • Many GLAM projects want to use WD as an integration platform. Eg we plan to import 500 Orthodox icons and enrich data about painters, saints, monasteries etc, see http://rawgit2.com/VladimirAlexiev/my/master/pres/20200130-Wikidata-Icons/Slides.html. But without rich description it makes little sense to describe icons. --Vladimir Alexiev (talk) 19:42, 8 February 2020 (UTC)
        • I can and do see the value of rich, contextual, nuanced, descriptive information (especially for glams). However, I am saying that paragraph-length Freetext is not structured data, and therefore isn’t within the scope of Wikidata. Moreover, lengthy quotes (such as in the examples given above) are more than trivial because they represent the entire text of a professionally-written description and therefore are copyrighted information. This is different from information like ‘height’ or ‘year’ or ‘creator’ which are simple facts and therefore non-copyrightable. Thus, this property could only be used when a glam has proactively decided to share ALL fields of their collection records under CC0 - this is not information that can legally be scraped from databases and republished by us unless they’ve released it.
          • @Wittylama: No, you cannot scrape somebody's database just because you believe "simple facts are non-copyrightable". Collections of such facts are very much copyrightable and people pay big bucks to obtain appropriate datasets. Descriptions are no more copyright-problematic than other fields --Vladimir Alexiev (talk) 11:17, 11 February 2020 (UTC)
    We have had this discussion before with glam concepts like “credit line”. It is very useful and important information about objects in glam collections, BUT “credit lines” are unstructured data/freetext and potentially copyrightable, and therefore we do not [yet] have a solution for how to handle them in wikidata.
    Instead of republishing the full text of the description, could you instead use a property like this to link/reference to where the full description can be found (URL or published catalogue), when such a description exists? Wittylama (talk) 09:12, 9 February 2020 (UTC)
    @Wittylama: Sure we can and will, but that's no substitute for having a decent description on the item. --Vladimir Alexiev (talk) 11:17, 11 February 2020 (UTC)
    • Comment I suppose we needed something like this to avoid showing bias against anything that is not archive related (given that we do have scope and content (P7535)). --- Jura 09:56, 8 February 2020 (UTC)
    • Comment I think datatype should be monolingual string. Multilingual isn't currently in the works. --- Jura 09:57, 8 February 2020 (UTC)
    • Support --eroux108 (talk) 16:53, 8 February 2020 (UTC)
    • Oppose. The point of Wikidata is to collect structured data, which this isn't. --Yair rand (talk) 23:55, 9 February 2020 (UTC)
    • Oppose The motivation says the property is for disambiguation purposes. But we already have "description" and its Help page explains why we need to use the minimum amount of text to disambiguate. Thadguidry (talk) 17:02, 10 February 2020 (UTC)
      • hi @Thadguidry: I think you've misread the motivation: "descriptions (at the beginning) are intended to be short and used only for disambiguation, and don't have any quality control". This one is for an authorized/editorial description --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
    • Oppose This will cause more trouble than it's worth. Moderating would get a lot more complicated because of all the possible copyright violations, and it will be very unclear how this property relates to Wikipedia articles. There are lots of potentials for useless duplication of content, and newcomers will be confused by having both a description field and a description property. Note that just for 'paragraph-length' summaries of Wikipedia articles there is already an excellent API method. Husky (talk) 18:16, 10 February 2020 (UTC)
      • @Husky: Why don't people get confused by the props "name" & "title" vs the "label" on top? Wikipedia abstracts are OK, but about 2/3 of WD items have no WP article --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
    • Oppose To me even the examples look copyright violating. The Bulgarian website doesn't have any statement that suggests that their content is in the public domain. ChristianKl❫ 18:54, 10 February 2020 (UTC)
      • @ChristianKl: I assure you that I have spoken to the creators of the site (Institute of Mathematics and Informatics of the Bulgarian Academy of Sciences), and we'll be importing these icons to Wikidata. Do you only doubt the descriptions, or all data on the site? That's my point: a longer text is no more inherently copyright-problematic than other data --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
        • Statements about who created an icon or when it was created are factual statements that are not subject to copyright. Nothing on the example item The Virgin Cataphyge (Refuge) and St. John the Evangelist (Q84545297) seems to me like it's protected by copyright. On the other hand there's creative work in a description that is protected by copyright.
    There's the issue of EU database right given that Bulgaria is an EU country but that's a separate issue from the copyright. ChristianKl❫ 13:55, 11 February 2020 (UTC)

    I see this one will get shot down, so let me explain what will happen:

    • We will import icon descriptions in the existing poor "description" fields (without provenance), because 99% of these will be new WD items.
    • However, we won't be able to import Nomenclature descriptions in the poor "description" fields because many of these are existing WD items, and there's only 1 slot per language, and we can't be sure we won't overwrite information in that slot, no matter how crappy the existing description may be. In fact even for newly created Nomenclature items, we already make poor descriptions eg Exterior Shutter (Q80794411) has the ancestor path "(Category 01: Built Environment Objects> Building Components> Door & Window Elements> Window Element)" in lieu of a proper description (enough for disambiguating, but not satisfactory). When Nomenclature are ready with their proper editorial description (a big effort on their part), we won't be able to refresh the crappy descriptions in WD items --Vladimir Alexiev (talk) 11:24, 11 February 2020 (UTC)
      "(Category 01: Built Environment Objects> Building Components> Door & Window Elements> Window Element)" isn't a valid description. Why create it in the first place? --Yair rand (talk) 22:46, 11 February 2020 (UTC)
      @Yair rand: It serves the purpose to disambiguate and is better than nothing, so why do you think it's invalid? Rather than critique, why didn't YOU make a better description of Exterior Shutter (Q80794411) if you dislike this one? --Vladimir Alexiev (talk) 15:51, 12 February 2020 (UTC)
      It's not a valid description under WD:D. I think it would be far preferable to leave it blank, and have it be easily discoverable as a descriptionless item, than to have a non-description taking up the field. I'd recommend deleting all such descriptions. --Yair rand (talk) 06:05, 17 February 2020 (UTC)

    I just found out that the poor "description" field has a limit of 250 chars. This means we can't import useful descriptions from GLAM datasets like Nomenclature (@Crowjane7:) or icons. Cheers! --Vladimir Alexiev (talk) 13:39, 13 February 2020 (UTC)
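
The two obstacles raised above (one description slot per language, and the 250-character limit) can be sketched as a simple pre-import check. This is only an illustration in Python: the function `can_import_description` and the in-memory `existing` mapping are hypothetical stand-ins, not part of any Wikidata or Wikibase API, and the limit is taken from the comment above rather than from documentation.

```python
# Limit on the built-in description field, as reported in the comment above.
DESCRIPTION_LIMIT = 250


def can_import_description(
    existing: dict[str, str], lang: str, text: str
) -> tuple[bool, str]:
    """Decide whether `text` may be written to the built-in description slot.

    `existing` maps language codes to descriptions already on the item
    (a hypothetical in-memory stand-in for the item's description fields).
    """
    if len(text) > DESCRIPTION_LIMIT:
        return False, f"too long: {len(text)} > {DESCRIPTION_LIMIT} characters"
    if existing.get(lang):
        return False, f"'{lang}' slot already filled; would overwrite"
    return True, "ok"


# English description of rocking chair (Q14963) from Example 1.
rocking_chair_en = (
    "A chair that is mounted on two curved strips of wood (called rockers or "
    "bends) which connect the front and back legs, allowing it to be rocked "
    "backward and forward. It usually has a high, straight, or slightly curved "
    "back and often open arms.\nUsed for comfortable sitting or relaxing while "
    "rocking."
)
```

Under these assumptions the Nomenclature text from Example 1 fails the length check even on a brand-new item, and any short text fails on an existing item whose slot for that language is already occupied.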

    • Comment I think a more narrowly focused proposal with limited domain or source, similar to the existing property scope and content (P7535), would be more likely to succeed. ArthurPSmith (talk) 18:01, 13 February 2020 (UTC)
    • Leaning oppose While I understand the rationale behind this: 1) this isn't really structured data, 2) I am concerned about licensing issues, and 3) I think that as proposed it has a bit too broad a scope and is too easy to misuse, leading to more trouble. --Kostas20142 (talk) 22:42, 16 February 2020 (UTC)