Shortcut: WD:PP/GEN

Wikidata:Property proposal/Generic




This page is for the proposal of new properties.

Before proposing a property
  1. Check if the property already exists by looking at Wikidata:List of properties (manual list) and Special:AllPages.
  2. Check if the property is already pending or has been rejected.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically. See WD:WikiProject Infoboxes for suggestions.
  4. Select the right datatype for the Property.
  5. Start writing the documentation based on the preload form below and add it in the appropriate section.

Creating the property

  1. Creation can be done after 1 week by a property creator or an administrator.
  2. See steps when creating properties.

Add a request

This page is archived, currently at Archive 29.

To add a request, you should use this form:

=== {{TranslateThis | anchor = en
| en = PROPERTY NAME IN ENGLISH
| de = <!-- PROPERTY NAME IN German (optional) -->
| fr = <!-- PROPERTY NAME IN French (optional) -->
<!-- |xx = property names in some other languages -->
}} ===
{{Property documentation
|status                 = <!--leave this empty-->
|description            = {{TranslateThis
  | en = ...
  }}
|subject item           =  <!-- item corresponding to the concept represented by the property, if applicable; example: item ORCID (Q51044) for property ORCID (P496) -->
|infobox parameter      = Wikipedia infobox parameters, if any; ex: "population" in [[:en:template:infobox settlement]]
|datatype               = put datatype here (item, string, media, coordinate, monolingual text, multilingual text, time, URL, number)
|domain                 = types of items that may bear this property
|allowed values         = type of linked items (Q template or text), list or range of allowed values, string pattern...
|source                 = external reference, Wikipedia list article, etc.
|example                = {{Q|1}} => {{Q|2}}
|formatter URL          = 
|filter                 = (sample: 7 digit number can be validated with edit filter [[Special:AbuseFilter/17]])
|robot and gadget jobs  = Should or are bots or gadgets doing any task with this? (Checking other properties for consistency, collecting data, etc.)
}}
;{{int:Talk}}
Motivation: 

Proposed by: ~~~

(Add your motivation for this property here.) ~~~~

For a list of infobox parameters, you might want to use table format:

{{List of properties/Header}}

{{List of properties/Row|id=
|title          = audio
|type           = media
|qualifier      =
|description    = Commons sound file
|example-subject= Q187 <!-- Il Canto degli Italiani -->
|example-object = Inno di Mameli instrumental.ogg
}}

</table>

For blank forms, see Property documentation and List of properties/Row


Generic properties

sort key

   In progress
Description key indicating the order in which the item's label should be sorted
Data type String
Template parameter In en:Template:Persondata, de:Vorlage:Personendaten
Domain all
Allowed values strings
Source Persondata
Example item and value


Proposed by Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits
Discussion

May be used as a qualifier for name in native language (P1559); or qualified by a language. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:46, 24 October 2014 (UTC)

Question Is there an international standard for how names should be sorted? As far as I know, how names are sorted is extremely culture-specific. --Pasleim (talk) 13:10, 24 October 2014 (UTC)
No idea, but we often know it, from sources, or knowledge of our own culture. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:26, 24 October 2014 (UTC)
Chinese names are certainly sorted by the similarity of the characters and the number of strokes in them (knowing the exact order of 2500 characters is quite a difficult task), not alphabetically by their transcriptions.--Giftzwerg 88 (talk) 17:25, 24 October 2014 (UTC)
In Europe and America usually by last name and then first name. --Crazy1880 (talk) 17:27, 24 October 2014 (UTC)
See de:Hilfe:Personendaten/Name (German help page) --Crazy1880 (talk) 17:29, 24 October 2014 (UTC)
Comment If it's a qualifier, string datatype would probably be sufficient. Otherwise, it might need to be a monolingual string. --- Jura 18:59, 24 October 2014 (UTC)

As a librarian, I have to warn you that this is a very complex matter that can't be simply reduced to "In Europe and America usually by last name and then first name". But, rejoice: There is a standard work of reference, it's called Names of Persons and issued by the IFLA. A scanned version of the 4th edition is available as a PDF at the IFLA website. For example, in Iceland the sort order is "First name - last name" (the last name in Iceland usually being not a family name, but a patronymic - most Icelanders don't have a family name). And there are languages such as Spanish with multi-part name where the sort order also isn't obvious. "Names of Persons" helps in these matters. Gestumblindi (talk) 23:31, 27 October 2014 (UTC)

Oh, and of course - I Support this proposal, as a meaningful sort order is very important. Gestumblindi (talk) 00:10, 28 October 2014 (UTC)

Comment Persondata DOES NOT contain a sort value for names. The |name= is supposed to be surname, firstname. In about 20% of cases it is entered wrong, usually as firstname, surname. DEFAULTSORT contains the sort value in all biography articles. The sort value does not equal surname, firstname in a lot of cases.

Examples:
Otto von Bismarck... DEFAULTSORT:Bismarck, Otto   persondata |name=von Bismarck, Otto.
Francisco da Costa Gomes... DEFAULTSORT:Gomes, Francisco   persondata |name=da Costa Gomes, Francisco.

Persondata name value contains names with ligatures, accents and other characters. Sort value is only to be the standard 26-letter English alphabet plus ".'.

Example:
Two people, one named José Márquez, the other Jose Marquez. If standard 26-letter alphabet is not used, the two names will be sorted in different spots.
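The effect described in this example can be illustrated with a minimal Python sketch. It is not how DEFAULTSORT is maintained on any wiki; it only assumes that accented Latin letters should be folded to their base letters before sorting. The function name `ascii_sort_key` is hypothetical, and ligatures such as "æ" are not handled by this approach.

```python
import unicodedata


def ascii_sort_key(name: str) -> str:
    """Fold accented Latin letters to their unaccented base letters."""
    # NFKD decomposition splits a letter such as "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks and keep only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["Martin, Ann", "Márquez, José", "Marquez, Jose"]
# With the folded key, both Marquez spellings sort next to each other.
print(sorted(names, key=ascii_sort_key))
```

Without the key function, a plain code-point sort would place "Márquez" after "Martin", which is the scattering the comment warns about.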

DEFAULTSORT values DO FOLLOW IFLA guidelines. See en:WP:NAMESORT for rules. There are two exceptions. WikiProject Iceland has said to follow western sort order for DEFAULTSORT. WikiProject Brazil and WikiProject Football have said Brazilian footballers have DEFAULTSORT set to their nickname.

I have been the maintainer of DEFAULTSORT on enwiki for several years now. If you have questions, I would be the one to ask. Bgwhite (talk) 01:11, 28 October 2014 (UTC)

Yes, Persondata often contains wrong name sorting, but it does try to do name sorting - otherwise, why should the sort order in "name" be "supposed to be surname, firstname" at all? That's a way of sorting, too. I'm not that familiar with English Persondata, more with German Personendaten. Maybe there are more errors in the English variant. "von Bismarck, Otto" is wrong, it should be "Bismarck, Otto von", of course. - That said, if it turns out that the sort order in Persondata is wrong too often, maybe it would be better for Wikidata to extract it from DEFAULTSORT instead of Persondata. - There may be conflicting results when extracting from different Wikipedia language versions. For example, German Wikipedia follows the IFLA guidelines for sorting Icelandic names in DEFAULTSORT resp. (in German) SORTIERUNG (so, first name first). Gestumblindi (talk) 14:37, 28 October 2014 (UTC)

Comment as a librarian too, I can only agree with the need for a sorting value... at least for persons' names...

but I see problems, even between Latin-language names...
  • the sorting habits are not the same in all countries, even in countries using the same language; therefore, different sorting values should be made for different languages...
  • and this will be even more complicated for Russian/Chinese/etc. languages... Transliteration makes it very difficult to sort names... just have a look at Tchekov's name :S - and this is for a "modern" person... imagine for medieval names which could be written differently by the same person :/ - and of course, it works backwards, for our "simple" Latin names, when transliterated into... just how many non-Latin languages are there in Wikidata? --Hsarrazin (talk) 15:49, 28 October 2014 (UTC)
how could this be solved in Wikidata...? probably, it would be best to have each part of the name in a separate property, and then let each project assemble them to have a sortkey...
Sounds reasonable (though a bit complicated). A sort key may also be valuable for other names than names of persons - e.g. work titles - you don't want to sort all book titles beginning with "The ..." under T, e.g. English Wikipedia has DEFAULTSORT:Shining, The. Gestumblindi (talk) 20:27, 31 October 2014 (UTC)
of course not… :D — pardon me for laughing, but that is exactly what the library-catalog at my job does… :(
for titles, if you look at wikisource fr for example… the text pages are named with the correct title, and, without adding manually a DEFAULTSORT, except in very rare cases, we have a "Classement" module, automatically applied through Title or Proofreadpage header template, so that the texts are sorted according to French rules… I don't think the same rules apply in all languages… but a similar system could perhaps be set for every language :) --Hsarrazin (talk) 01:42, 1 November 2014 (UTC)
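The title case raised in this thread (DEFAULTSORT:Shining, The) can be sketched in a few lines of Python. This is only an illustration under the assumption of English articles; the names `LEADING_ARTICLES` and `title_sort_key` are hypothetical, and, as noted above, other languages would need their own rules.

```python
# Assumption: English leading articles only; other languages need other lists.
LEADING_ARTICLES = ("The", "A", "An")


def title_sort_key(title: str) -> str:
    """Move a leading article to the end, e.g. 'The Shining' -> 'Shining, The'."""
    first, sep, rest = title.partition(" ")
    if sep and rest and first in LEADING_ARTICLES:
        return f"{rest}, {first}"
    # Titles without a leading article are returned unchanged.
    return title


print(title_sort_key("The Shining"))
```

A per-language module, like the "Classement" module mentioned for French Wikisource, would presumably swap in a different article list and possibly further rules.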
Support but it's important we indicate which system applies - names in Irish (Gaelic) are sorted by different rules (you ignore the Ó or Mac prefix so Ó Rourke and Mac Raeman are sorted together). Filceolaire (talk) 20:49, 31 October 2014 (UTC)
Actually no. You do not ignore Ó or Mac per University College, Dublin and National Library of Ireland. However, I've seen both systems used. But you do hit the most important thing... indicate which system applies. If supported, there should be a group who writes out the rules. Sounds like Hsarrazin and Gestumblindi should be in the group. Sounds like the German and English Wikipedias base things on IFLA guidelines, so that is probably the best starting point. I personally favour a fight to the death. :) Bgwhite (talk) 00:01, 2 November 2014 (UTC)
  • Reject as proposed. This needs a multilingual field as the sortkey will differ by language. --Izno (talk) 07:52, 21 November 2014 (UTC)
It depends. Libraries try to apply a standard that is on the one hand language-specific, on the other hand universal, such as in the already mentioned Names of Persons. Persons are entered according to the custom in the respective person's language, notwithstanding the language of the library's location. So, a library following "Names of Persons" will sort Spanish persons according to Spanish convention, and Icelandic persons according to Iceland convention, even if it's a library in Switzerland or in Poland. This would be a possible approach here IMHO. So we wouldn't need a "Spanish sorting" and a "Polish sorting" for the same person, but just one, the one according to the person's language. Gestumblindi (talk) 21:58, 25 November 2014 (UTC)
The problem with that approach is that isn't how it's done on any particular wiki--each of which will and do have different sorting conventions. --Izno (talk) 00:12, 27 November 2014 (UTC)
Support but using DEFAULTSORT as the source. I am speaking as someone who has used the metadata provided by Persondata. I gave up using the name parameter - it was just too random. Even using the name parameter to display the person's name was impossible! My final solution was to display the wiki page name and use DEFAULTSORT for ordering. Periglio (talk) 02:03, 18 January 2015 (UTC)

Support But I suggest linking it to some language rules. As mentioned above, there are different rules for different languages, so we need a name of type monolingual (the original name as a person would describe themselves in their mother tongue) and a set of rules for a defined group of some languages and maybe some other sets of rules for some other languages. We need this property if we want to replace the templates with properties of Wikidata.--Giftzwerg 88 (talk) 00:39, 16 February 2015 (UTC)

Doubts sv.wikisource and sv.wikipedia do not have the same sorting order, so I find it difficult to find a good solution here, even if we have one order for each language. And the sorting order of Swedish names depends on their age, so the language does not give enough information. -- Innocent bystander (talk) 09:31, 23 March 2015 (UTC)

language, except for works or persons

   In progress
Description language of item. Use more specific properties language of work (P407) or original language of work (P364) for works and native language (P103) or languages spoken or published (P1412) for persons.
Data type Item
Domain any item that has or uses a language except works or persons, like names, words, phrases, proverbs.
Allowed values items for languages
Example item and value


Jean (Q4160311), female given name → English (Q1860)
Jean (Q7521081), male given name → French (Q150)
Robot and gadget jobs no persons and no works of any kind allowed.
Discussion
  • Comment P:P364 used to have the label "language". To match more closely its description, it's now labelled "language of the original work". This leaves a gap for cases like the above. --- Jura 04:52, 12 December 2014 (UTC)
  • Support --- Jura 04:52, 12 December 2014 (UTC)
  • Support Comment One day we will also have Wiktionary linked, so every word must have a language. Not only names, but also radio/TV stations broadcast in languages. However, names are a bad example, because names tend to move between languages, sometimes unchanged over centuries and sometimes more or less modified.--Giftzwerg 88 (talk) 05:14, 12 December 2014 (UTC)
Comment I changed the domain to "any". --- Jura 05:16, 12 December 2014 (UTC)
In that discussion User:Snipre came up with the idea to merge language of work (P407) and original language of work (P364) into a "language" property which would no longer be restricted to works. --Pasleim (talk) 12:56, 18 April 2015 (UTC)
I don't see that discussion going anywhere, nor is the merge proposal formulated that way. The result is that we still haven't sorted out this issue. --- Jura 13:03, 18 April 2015 (UTC)
@Jura1, Pigsonthewing: Agree, it might take some time until there is a consensus so I won't oppose the creation of this property. --Pasleim (talk) 09:21, 19 April 2015 (UTC)

On hold per Pasleim. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:02, 18 April 2015 (UTC)

  • Those other two properties simply have no impact on this. --- Jura 12:05, 18 April 2015 (UTC)
    • @Pigsonthewing: I removed "on hold" as Pasleim changed his comment. --- Jura 15:48, 20 April 2015 (UTC)
    @Jura: Please consider this possibility: language of work (P407) and original language of work (P364) merged into a new property called "language". Why do we have to have one property "language of work" and one property "language of name"? Snipre (talk) 13:57, 20 April 2015 (UTC)
    The reason for this request is that someone changed the label of P364. --- Jura 15:46, 20 April 2015 (UTC)

distribution map

   Done: P1846
distribution map
distribution of item on a mapped area (for range maps use range map image (P181))
Description distribution of item on the mapped area. For taxa use range map image (P181).
Data type Commons media file
Domain names
Allowed values distribution maps
Example item and value Roberts (Q1646493) => File:Roberts.png
Discussion

@Jura1, Ivan A. Krestinin, Emw: ✓ Done, as datatype = "commons media file", distribution map (P1846) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:43, 24 April 2015 (UTC)

different from

   In progress
Description This item is different from that one, and they are often confused. Unlike said to be the same as (P460), which expresses uncertainty ("... the statement is disputed"), different from expresses a strong negation.
Data type Item
Template parameter not that I know of
Domain any item
Allowed values any item
Source owl:differentFrom (OWL) and gvp:aat2100_distinguished_from (Getty's LOD)
Example item and value Philip A. Goodwin (Q7183094) (a politician) is different from Philip R. Goodwin (Q7184251) (a painter). Many pairs in P245 (ULAN) Unique value are examples of "different from" (and the rest should be merged).
Robot and gadget jobs


  • Check that the two items do not have the same authority identifier (AAT, TGN, ULAN, VIAF, GND, etc). The "Unique value constraint" on these identifiers already checks this (eg see P245 (ULAN) Unique value), but together with an explicit claim "different from", we can split that section into "Delete authority identifier from one of the items" vs "Merge the two items".
  • Check that the two items are not subject of the same statement: if Q1 different from Q2, there should be no statements Q3 P1 Q1 and Q3 P1 Q2. Of course, that's a warning not a certain mistake.
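The two checks listed above can be sketched in Python as follows. This is a minimal illustration over in-memory data, not an existing bot; the function names, the identifier values and the Q1/Q2/Q3/P1 statements are all hypothetical placeholders.

```python
from collections import defaultdict


def shared_identifiers(ids_a, ids_b):
    """Identifier properties on which both items carry the same value."""
    return sorted(p for p in ids_a.keys() & ids_b.keys() if ids_a[p] == ids_b[p])


def shared_statements(statements, item_a, item_b):
    """(subject, property) pairs whose values include both items."""
    objects = defaultdict(set)
    for subject, prop, obj in statements:
        objects[(subject, prop)].add(obj)
    return sorted(key for key, objs in objects.items() if {item_a, item_b} <= objs)


# Hypothetical identifier values for two items claimed to be "different from" each other.
item_a_ids = {"P245": "ulan-0001", "P214": "viaf-0001"}
item_b_ids = {"P245": "ulan-0001", "P214": "viaf-0002"}
# A shared P245 value means: delete the identifier from one item, or merge the items.
print(shared_identifiers(item_a_ids, item_b_ids))

# Hypothetical statements as (subject, property, value) triples.
statements = [("Q3", "P1", "Q1"), ("Q3", "P1", "Q2"), ("Q4", "P1", "Q1")]
# Q3 P1 points at both Q1 and Q2, which is a warning rather than a certain mistake.
print(shared_statements(statements, "Q1", "Q2"))
```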
Proposed by Vladimir Alexiev (talk)
Discussion

Automated jobs (and less often people) make mistakes and identify two things that are in fact different. It's useful to document the difference, so others won't repeat the same mistake. It can also be used to improve authority identifier management (see the first "robot job" above).

Important: Obviously it's crucial to add a reference to such a statement, lest endless discussion ensues. But what to add? I found that Philip A. Goodwin (Q7183094) is different from Philip R. Goodwin (Q7184251) by inspecting their properties (different years, different professions), and ULAN for the latter. Ideally, I'd like to add a comment to that effect and my user name: but I can't do that in a reference. --Vladimir Alexiev (talk) 21:52, 11 January 2015 (UTC)

  • Support --- Jura 16:17, 7 February 2015 (UTC)
  • I think this should be placed in talk pages instead of item, as clients don't need it.--GZWDer (talk) 06:14, 16 February 2015 (UTC)
         select * {?x gvp:aat2100_distinguished_from ?y}
  • Support agree with Vladimir TomT0m (talk) 14:01, 22 March 2015 (UTC)
  • Question - Is the intent that this essentially work as a "not to be confused with" statement? If so, I support it as useful externally as well as internally. Josh Baumgartner (talk) 18:01, 27 February 2015 (UTC)
    @Joshbaumgartner: It certainly is a useful use case. TomT0m (talk) 14:07, 22 March 2015 (UTC)
  • Isn't the fact that we have two objects in our database always going to imply this property (except where that property contains a "same as")? Objects are different concepts always. --Izno (talk) 17:28, 9 March 2015 (UTC)
    @Izno: This depends on whether or not we make the Unique name assumption (Q7886954) (View with Reasonator). The OWL language, from which this property proposal is inspired, does not make it, so ... maybe in Wikidata in general it is not possible to make it, although it will be possible safely in subsets of Wikidata's dataset. TomT0m (talk) 14:07, 22 March 2015 (UTC)
The purpose is exactly "not to be confused with" (see the description). This will be used sparingly on two items that are likely to, or already have been, confused. OWL has a similar statement: owl:differentFrom. --Vladimir Alexiev (talk) 09:37, 1 April 2015 (UTC)

number of entries/articles

   In progress
Description number of entries/articles of encyclopedia/database
Data type Number
Domain encyclopedia/database
Allowed values number>0
Example item and value Wikidata (Q2013) => 13,909,366
Proposed by GZWDer (talk)
Discussion

GZWDer (talk) 06:10, 21 January 2015 (UTC)

  • Support --- Jura 12:29, 22 February 2015 (UTC)
  • @GZWDer: I'm amazed that there don't seem to be any properties yet for size, extent, height, width (I tried autocomplete on some item; maybe I didn't search right?), etc. So some dimension properties are definitely needed, but I think they should be defined in a more general way. E.g. in this case "size" or "extent" with a qualifier "unit" that can be left out (as in this case) or be specified as an item "entries", "records", "pages", or whatever is needed. --Vladimir Alexiev (talk) 11:48, 3 April 2015 (UTC)
    Quantity with unit data type is still in development. (Happens to be one of the top priorities.) --Izno (talk) 17:16, 3 April 2015 (UTC)
  • Question Wouldn't this be better as a more generic "number of parts", to be used with a qualifier ("articles", "pages", "bricks", or whatever)? Unit measurements such as length and width would then be for other properties. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:37, 24 April 2015 (UTC)

number of registered users/contributors

   Done: P1833
number of registered users/contributors
number of registered users on a website
Description number of registered users/contributors of website or book edition (if there're many)
Data type Number
Domain website/edition
Allowed values number>0
Example item and value Wikidata (Q2013) => 1,898,798
Proposed by GZWDer (talk)
Discussion

GZWDer (talk) 10:06, 21 January 2015 (UTC)

Support looks good, especially for cases where listing all contributors is not practical (i.e. for Wikidata) Ajraddatz (talk) 22:03, 18 February 2015 (UTC)
  • Support --- Jura 12:29, 22 February 2015 (UTC)
  • Question Wouldn't it be better to label it "number of authors"? "Contributor" is Wikipedia- or free-software-centric; why not generalise this a little bit? @GZWDer, Jura1, Ajraddatz: TomT0m (talk) 16:21, 23 February 2015 (UTC)
    • "author" is more specific than "user". --- Jura 18:28, 23 February 2015 (UTC)
  • Support but I agree with Jura that the name needs changing. I prefer "Number of Contributors" - Users are people who read a website or a book. Filceolaire (talk) 16:30, 2 March 2015 (UTC)
  • @Filceolaire, TomT0m, Jura1: I have changed it to "registered users". This can be used in some non-encyclopedia sites like Facebook (Q355).--GZWDer (talk) 04:56, 3 March 2015 (UTC)
    • ok. BTW, I didn't mind the previous name. --- Jura 04:58, 3 March 2015 (UTC)
  • @GZWDer, Ajraddatz, TomT0m, Jura1, Filceolaire: ✓ Done :) --George (Talk · Contribs · CentralAuth · Log) 18:37, 19 April 2015 (UTC)

EAGLE id

   In progress
Description identifier of an epigraphic concept in the EAGLE controlled vocabulary (Europeana network of Ancient Greek and Latin Epigraphy)
Represents EAGLE Vocabulary (Q19371183)
Data type String
Domain materials, types of inscriptions, object types (support of inscribed object), inscription writing/execution techniques, decoration types
Allowed values digits? or vocab/digits?
Example item and value Rostrum (Q13581079) -> 1961
Robot and gadget jobs It would be really nice if a bot could do this but I suspect it will be better to do it one by one as some cases might be ambiguous
Proposed by Pietromarialiuzzo (talk)
Discussion

Motivation Pietromarialiuzzo (talk) 15:53, 26 January 2015 (UTC) I would like to make the epigraphy-specific vocabularies created by the EAGLE project accessible to everyone by linking them to Wikidata, so that other projects can use them freely, and to contribute with this work to Wikidata's mission. I am following advice given to me in Berlin by the Wikidata team in September of last year, albeit very belatedly... I hope I have filled in the request form correctly; there are many things I do not understand.

@Pietromarialiuzzo:

  • translate to English
  • Move to WD:PP/AUTH
  • "Represents Rostrum" is wrong. Make an item for "EAGLE repository" and quote in "subject item" above

--Vladimir Alexiev (talk) 15:25, 23 February 2015 (UTC)

Thanks @Vladimir Alexiev:, I am new here, so please help me understand the best way to do this. I am trying to explain this in a better way rather than translating, if that is OK. The Wikidata team suggested that I map our EAGLE project vocabularies to Wikidata. We only have concepts related to ancient epigraphy, and I would like to add to the corresponding items in Wikidata a link to our items, if that is fine. I would then add a property to these items to say "This item has ID xxx in EAGLE". I am not sure if this is best practice for aligning vocabularies; please let me know if there is a different way to do this. I do not understand why I should create an EAGLE repository item and I do not know if I have done this right, but I have tried to follow your suggestion. Thank you very much!--Pietromarialiuzzo (talk) 10:48, 2 March 2015 (UTC)

The repository item is so we have somewhere where we can explain what the EAGLE project is and can link the proposed property to this. Filceolaire (talk) 16:24, 2 March 2015 (UTC)

@Pietromarialiuzzo: All the above is clear, but we're trying to help you get it into better shape. It is getting better.

     <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/"
                     rdf:about="http//:www.eagle-network.eu/resources/vocabularies/objtyp.html"/>

I think it should be rdf:resource, and the URL is malformed

  • And again, I suggest to move this whole section to WD:PP/AUTH

--Vladimir Alexiev (talk) 11:45, 4 March 2015 (UTC)
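The malformed URL noted above (`http//:` instead of `http://`) is the kind of error a simple automated check can catch before a vocabulary file is published. The following Python sketch uses the standard library's `urllib.parse.urlparse`; the helper name `is_absolute_http_url` is hypothetical, and the check is deliberately minimal (it only looks for an http/https scheme and a host).

```python
from urllib.parse import urlparse


def is_absolute_http_url(url: str) -> bool:
    """True if the URL has an http/https scheme and a non-empty host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# The value quoted in the discussion, and its presumably intended form.
print(is_absolute_http_url("http//:www.eagle-network.eu/resources/vocabularies/objtyp.html"))
print(is_absolute_http_url("http://www.eagle-network.eu/resources/vocabularies/objtyp.html"))
```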

Thanks a lot @Vladimir Alexiev:, you have indeed spotted many things, which tells you how much of a non-expert I am. I really appreciate your help. I have no idea why that Tematres instance is still there, but it is an old version which should not exist anymore; that is also why you don't find Rostrum there. The SKOS vocabulary was exported from there and then updated as such, abandoning Tematres. The vocabularies are listed and described here: http://www.eagle-network.eu/resources/vocabularies/. They are:

Then each concept has a URL in this form: http://www.eagle-network.eu/voc/objtyp/lod/1961 (http://www.eagle-network.eu/voc/typeins/lod/1 and so on), where the number is unique and the part before lod is the name of the vocabulary as above. I am aware that the /lod/ bit is superfluous, but for many reasons in our workflow we could not change that at this stage. I have now fixed the errors in the SKOS and next Sunday it should be uploaded and OK. This should be the URI, and you can have the HTML or the RDF. I do not know Turtle unfortunately, but I will certainly have a look at how to do it. If you suggest having many "coreferencing prop", that is also fine; please let me know what is best and I am happy to go down that way. I also have nothing at all against moving this to WD:PP/AUTH, but I do not know how to do it: cut and paste everything? Thank you very much indeed. yours --Pietromarialiuzzo (talk) 21:33, 4 March 2015 (UTC)
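The URL pattern described in this comment suggests how a stored value and a formatter URL could relate. The Python sketch below builds and parses concept URLs of that shape. It is only an illustration: the function names are hypothetical, and the assumption that vocabulary names consist of lowercase letters is inferred from the two examples given (objtyp, typeins), not confirmed by EAGLE.

```python
import re

BASE = "http://www.eagle-network.eu/voc"
# Assumption: vocabulary names are lowercase letters, as in "objtyp" and "typeins".
CONCEPT_URL = re.compile(
    r"^http://www\.eagle-network\.eu/voc/(?P<vocab>[a-z]+)/lod/(?P<number>\d+)$"
)


def concept_url(vocab: str, number: int) -> str:
    """Build a concept URL from a vocabulary name and a concept number."""
    return f"{BASE}/{vocab}/lod/{number}"


def parse_concept_url(url: str):
    """Split a concept URL into (vocabulary name, concept number)."""
    match = CONCEPT_URL.match(url)
    if match is None:
        raise ValueError(f"not an EAGLE concept URL: {url!r}")
    return match["vocab"], int(match["number"])


print(concept_url("objtyp", 1961))
print(parse_concept_url("http://www.eagle-network.eu/voc/typeins/lod/1"))
```

Whether the property value should be the bare number ("1961") or the vocabulary-qualified form ("objtyp/1961") is the open question under "Allowed values"; the parser returns both parts so either choice can be derived.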


source language of given name

   In progress
Description language this spelling of a first name or given name comes from. Of use for Wikidata:WikiProject Names.
Data type Item
Domain given name items
Allowed values items for languages
Example item and value

Ash Crow
Dereckson
Harmonia Amanda
Hsarrazin
Jura
Чаховіч Уладзіслаў
Notified participants of WikiProject Names

  • Comment This is to replace P:P364 on items for given names. P364 used to have the label "language", but is now labelled "language of the original work", so it can't be used anymore and we need to find a replacement. The more general proposal above (#language.2C_except_for_works_or_persons) didn't seem to fly. --- Jura 11:36, 7 February 2015 (UTC)
  • Support per proposal. --- Jura 11:36, 7 February 2015 (UTC)
  • Question @Jura1: What's wrong with the generic "language" properties in this case? It does not seem right to create one property for names, another for whatever else, etc. A gender statement or/and make a statement like seems useful, however
    TomT0m (talk) 16:39, 23 February 2015 (UTC)
    • I don't know. It's just that I need some sort of a property for these items. Given the new (English) label of P:P364, I can't use that anymore. --- Jura 18:10, 23 February 2015 (UTC)
    • BTW, I improved the presentation of the example above. --- Jura 17:43, 24 February 2015 (UTC)
  • Question What about native label (P1705)? It can be used in any supported language instead of a label, and contains both the source language and the original written form. That method is best if an infobox wants to show the name in its original form in a Wikipedia article. However, there is a problem for spoken-only languages, but do we need it? If yes, I would prefer a generic property that points to the source language of any name, not only the given one, for etymological purposes. Paweł Ziemian (talk) 14:49, 11 April 2015 (UTC)
  • Oppose. A property that is only used for given names is way too narrow. We do need a language but this isn't it. Filceolaire (talk) 17:58, 11 April 2015 (UTC)

Category auxiliary item

   Not done
Description An item included in this category in one or more wikis, other than a related list or "main topic" article, that is included for reasons other than the generic inclusion criteria specified in the category's P360
Data type Item
Domain Wikimedia category (Q4167836)
Source Wikimedia categories
Example item and value Category:Harvard University alumni (Q7234382) => Harvard alumni health study (Q5676636)
Robot and gadget jobs Auxiliary items may typically have a different P31 to that specified in the category inclusion criteria, which could be used to generate reports of likely candidates.
They may also be identified by a '*' or other unusual piped character, to lift them out of the normal sequence of the category.
Proposed by Jheald (talk)
Discussion

Same logic as the proposal two entries above, viz:

When harvesting information from categories it is important to be able to identify "auxiliary" items intentionally included in the category, which in general will not conform to the normal category inclusion criteria specified by is a list of (P360).

Such auxiliary items should also be white-listed, when identifying category-member items that appear to be in constraint violation of the category's is a list of (P360) criteria, that should either have necessary missing properties added, or their category membership reviewed.

The most common type of such auxiliary items in categories are survey articles on the topic of the category as a whole, identified by category's main topic (P301). The second most common type of auxiliary item is a list paralleling the membership of the category, as proposed two entries above. This property would be used to identify any other auxiliary item in the category. Jheald (talk) 18:24, 27 February 2015 (UTC)

  • Oppose. We will be harvesting info only once from each category. After that, Wikidata will have more info than any one Wikipedia's category system. As such, this 'black list' of items that the harvesting tool should ignore should be part of the harvesting tool. There is no need to make it a permanent part of Wikidata. Filceolaire (talk) 14:24, 10 April 2015 (UTC)

 Not done; no sign of consensus emerging. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:33, 24 April 2015 (UTC)

Parent category

   Not done
Description A category that is a parent category of the current category in one or more wikis
Data type Item
Domain Wikimedia category (Q4167836)
Allowed values instance of Wikimedia category (Q4167836)
Source Wikipedia category structures
Example item and value Category:Portrait painters from the Netherlands (Q13421910) => Category:Dutch painters (Q6486661), Category:Portrait painters (Q12981746)
with qualifiers wiki=en, wiki=fr, wiki=sv (eg using stated in (P248) => English Wikipedia (Q328) etc)
Robot and gadget jobs Arguably this entire property should ultimately be maintained on-the-fly as a virtual property pulled straight from the central SQL databases as and when needed. But until that is possible, it would be useful to create it as a normal concrete property to allow experimentation. Either way, it would certainly be useful for it to appear to be a regular property, so it could be included in WDQ 'tree' and 'web' queries.
Proposed by Jheald (talk)
Discussion

One of the great possibilities opened up by specifying category inclusion criteria using is a list of (P360) is the chance to deliver automatically "augmented" categories, if the reader of a particular wiki so chooses -- ie to give the option to add to the regular view of the category (if the user wishes) all those articles not in the category but that fit the category inclusion criteria as specified in the P360 that are blue links on that wiki; or all the potential articles that could be in the category that are red links. (Or to generate either list as a report for a particular category).

To do this, one needs to be able to exclude items which match the inclusion criteria, but also match the more specific inclusion criteria of one of the sub-categories of the category. To do this, one needs to be able to discover what are the sub-categories of the category, in that particular wiki. And it is convenient to be able to find that by asking Wikidata, the same place as all the other information about the article-items and the category.

So this proposal is for a property to track those parent-child relationships; one could track either parentage or progeny (or both); but it seems most convenient to specify parentage, just as we do on the wikis.

I should underline that this property is intended to be descriptive, to help bots and automated systems, not normative -- it is to record how the wiki-categories are structured in each particular language, and not any single way that they in any sense ought to be structured. Jheald (talk) 19:19, 27 February 2015 (UTC)
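
The filtering described above (offer items that match a category's inclusion criteria, unless they are already members or fit a more specific sub-category) can be sketched as follows. This is only an illustration under stated assumptions: the `Category` class, the `criteria` callables standing in for is a list of (P360) rules, and the `item-N` identifiers are all hypothetical, not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Category:
    name: str
    # Hypothetical stand-in for a P360-style inclusion rule.
    criteria: Callable[[dict], bool]
    # Ids of items already in the category on one particular wiki.
    members: set = field(default_factory=set)
    # Child categories on that wiki (what "parent category" would let us discover).
    subcategories: list = field(default_factory=list)


def augmentation_candidates(category, items):
    """Ids of items that fit the category but are not yet listed in it
    and do not belong in one of its sub-categories instead."""
    result = []
    for item in items:
        if item["id"] in category.members:
            continue  # already shown in the regular category view
        if not category.criteria(item):
            continue  # does not meet the inclusion criteria
        if any(sub.criteria(item) for sub in category.subcategories):
            continue  # better placed in a more specific sub-category
        result.append(item["id"])
    return result


# Illustrative data only.
items = [
    {"id": "item-1", "country": "NL", "occupation": "painter", "genre": "portrait"},
    {"id": "item-2", "country": "NL", "occupation": "painter", "genre": "landscape"},
    {"id": "item-3", "country": "FR", "occupation": "painter", "genre": "portrait"},
    {"id": "item-4", "country": "NL", "occupation": "painter", "genre": "landscape"},
]


def is_dutch_painter(item):
    return item["country"] == "NL" and item["occupation"] == "painter"


portrait_nl = Category(
    "Portrait painters from the Netherlands",
    lambda i: is_dutch_painter(i) and i["genre"] == "portrait",
)
dutch = Category(
    "Dutch painters",
    is_dutch_painter,
    members={"item-4"},
    subcategories=[portrait_nl],
)

candidates = augmentation_candidates(dutch, items)
```

Without knowledge of the sub-categories, `item-1` would also be suggested for the parent category, which is the case the proposed property is meant to avoid.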

  • Until someone puts a bot together to work keeping Wikidata's copy of a particular wiki's category structure up-to-date, this is a flat oppose as unworkable. Additionally, seems to duplicate data that is inconsistent between the wikis, which I don't think we should be attempting to do. Maybe there is some sense in a prescriptive category system (which is better defined by relationships already defined on the category's main topic), but there wouldn't be consensus for that I don't believe. --Izno (talk) 17:15, 9 March 2015 (UTC)
  • Oppose, per Izno argument. Visite fortuitement prolongée (talk) 20:12, 31 March 2015 (UTC)
    One category can have several parent categories (such Category:Dutch painters (Q6486661) and Category:Portrait painters (Q12981746) for Category:Portrait painters from the Netherlands (Q13421910)). This is not an issue.
    But the data will be different for each Wikimedia project. This is a huge data size issue.
    And the data is not stable (like a birth date). A weekly/daily/real time copying/import will be needed. This is a huge data rate issue. Visite fortuitement prolongée (talk) 21:32, 2 April 2015 (UTC)
    @Visite fortuitement prolongée: Is it really such a data size issue, to add a stated in (P248) qualifier (or similar) for each wiki the category applies to? A large amount of data, perhaps; but not a horrific amount, surely. Jheald (talk) 21:58, 2 April 2015 (UTC)
    I don't know. Let's try a rough estimate:
    • currently about 10,000,000 items in Wikidata
    • currently about 300 Wikimedia project (Q14827288)
    • 400 B/category property
    • 400 B/qualifier
    • if 2 categories/item on average
    • if each category has 10 interwiki links, so 10 qualifiers/category property on average
    • then the qualifiers would weigh: 10,000,000 × 2 × 10 × 400 B = 80 GB
    • With more of all: 20,000,000 × 5 × 200 × 400 B = 8 TB
    And there is the history. Wikidata keeps a record of revisions.
    And Wikidata does not store the data raw. It wraps it in some way.
    Visite fortuitement prolongée (talk) 21:03, 3 April 2015 (UTC)
    @Visite fortuitement prolongée: According to WDQ there are currently 2.5 million items identified as instance of (P31) Wikimedia category (Q4167836) -- so the data requirement per your calculations above would be 20GB, not 80GB.
    One of the beauties of Mediawiki is that to record the history, only the diffs are required. So that probably only adds say a factor of 3 to the storage cost.
    Also note that there is a "long tail" effect here -- as one goes down the tail increasingly many of the categories become very specific, and more likely to relate to only a single wiki -- this again will tend to deflate your estimates (and is likely to remain true, even if other languages 'catch up' to some extent in adding categories -- which they may or may not do).
    So overall then, an estimate of 80GB including history (or 20 GB for the live data) is probably more realistic. To be sure, this is a substantial amount of data. But it is not a monstrous amount; and I submit that, even without some direct virtual system, this model would still be well worth such storage, for the ability to query wikidata information and WP category information together in a single unified integrated way, and to use it to augment and improve what is currently presented on Wikipedia category-view pages.
    As a side-note, thanks to the effect of Moore's Law on data storage, I believe I am correct in saying that even with the steady multilingual growth of Wikipedia, and the inexorable increase of history, both the cost and the physical size of its storage have in fact been falling steadily over the last several years. So this should be seen in perspective. Jheald (talk) 08:33, 10 April 2015 (UTC)
    Yes, I forgot that this section is about a proposed property only for categories, thank you for the correction.
    Also, if I understand correctly, https://dumps.wikimedia.org/wikidatawiki/20150307/ says that last month the size of the Wikidata data, history included, was 27.7 GB. Adding "80GB including history" would then be a 285 % increase in size. Visite fortuitement prolongée (talk) 15:01, 12 April 2015 (UTC)
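
The back-of-envelope figures in this exchange can be re-derived with a few lines of arithmetic. The sketch below only restates the numbers quoted in the thread (400 B per qualifier, the item and category counts, the 27.7 GB dump size); the function and variable names are made up for illustration, and decimal units (1 GB = 10^9 B) are an assumption.

```python
GB = 10**9
TB = 10**12
BYTES_PER_QUALIFIER = 400  # figure assumed in the thread


def qualifier_bytes(items, parents_per_item, wikis_per_statement,
                    bytes_per_qualifier=BYTES_PER_QUALIFIER):
    """Rough size of the per-wiki qualifiers on 'parent category' statements."""
    return items * parents_per_item * wikis_per_statement * bytes_per_qualifier


# First estimate: all ~10 million items, 2 parents each, 10 wikis each.
first_estimate = qualifier_bytes(10_000_000, 2, 10)

# Pessimistic estimate: "with more of all".
upper_estimate = qualifier_bytes(20_000_000, 5, 200)

# Revised estimate: only the ~2.5 million category items.
categories_only = qualifier_bytes(2_500_000, 2, 10)

# Size relative to the 27.7 GB dump, using the quoted 80 GB "including history".
with_history = 80 * GB
dump_size = 27.7 * GB
relative_increase_percent = 100 * with_history / dump_size

print(first_estimate / GB, upper_estimate / TB, categories_only / GB)
print(round(relative_increase_percent))
```

With these inputs the relative increase comes out at roughly 289 %, the same order of magnitude as the 285 % quoted above.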
  • Comment: There are currently 1841 categories with subclass of (P279) to another Wikimedia category (Q4167836), see "claim [279:(tree[4167836][][31])]" in http://tools.wmflabs.org/autolist/autolist1.html . Visite fortuitement prolongée (talk) 20:12, 31 March 2015 (UTC)
    See also "CLAIM[31:4167836] AND CLAIM[361]". Visite fortuitement prolongée (talk) 21:32, 2 April 2015 (UTC)

Support. Agree with User:Izno that we need a bot, but that doesn't make the property useless. Remember that most Wikidata data has been copied from Wikipedia.

  • "Inconsistent between the wikis" is patently false. Nobody has ever claimed or designed a global category system across all wikis, and it's natural that each wiki's category system reflects local differences or priorities. If we accepted the argument that categories are not useful because they are different across wikis, we should also declare inter-language links useless, because there's no 1:1 correspondence between them. Eg consider
  • The categories are in fact very useful data and Wikidata should reflect them at some point. DBpedia has them (as skos:broader, whereas each category is a skos:Concept). --Vladimir Alexiev (talk) 06:29, 2 April 2015 (UTC)
    "Nobody has ever claimed or designed a global category system across all wikis, and it's natural that each wiki's category system reflects local differences or priorities." → This is not a rebuttal of the argument. The claim is not "patently false", but right. Visite fortuitement prolongée (talk) 21:32, 2 April 2015 (UTC)
    "Eg consider (...)" → Those examples are errors or minor design flaws. Not one hundred million statements to be added (huge data size), not very useful because inconsistent, and to be updated weekly/daily/real time (huge data rate). Visite fortuitement prolongée (talk) 21:32, 2 April 2015 (UTC)
    "The categories are in fact very useful data and Wikidata should reflect them at some point." →‎ Why? And at which point? Visite fortuitement prolongée (talk) 21:32, 2 April 2015 (UTC)
    @Visite fortuitement prolongée: Labels also change on a daily basis and are not sourced. Yet they are very important. --Vladimir Alexiev (talk) 07:59, 7 April 2015 (UTC)

    There are three fundamental problems, as I noted:

    One is that the data changes in an unstable fashion. Categories are added and removed on a daily basis from other categories, meaning these are not facts or even statements which have permanency, or which even you can put a truth test to on one day and get the same the next day. Our "import" of such categories would be idiotic for this reason alone, frankly. Even a bot is a bad solution to this problem because it ends up being thousands of edits a day (or more) just to keep up. The data isn't useful enough (though it might be interesting).

    The second is that a category in wiki A will not have the same parent categories as in wiki B. This is a problem from our point of view because it turns a query regarding this into a mess (without qualifiers).

    Third is that we end up duplicating data nonsensibly in two ways: one in that we're duplicating the main topic's data almost item for item, and two, we're duplicating each local wiki's data. Neither are desirable.

    I'm going to respond to one particular point: "If we accepted the argument that categories are not useful" — nowhere was this argued (as I think most would find such an argument laughable). I did argue that this property is not useful. --Izno (talk) 17:47, 3 April 2015 (UTC)

    @Izno: If you accept that categories are useful, don't you think Wikidata should reflect them somehow? --Vladimir Alexiev (talk) 07:59, 7 April 2015 (UTC)

    No. Fundamentally, it comes down to "can we describe the topics themselves, rather than putzing about with categories"? has part (P527) could describe what components are in a food item; that's useful because it tells me the kinds of food that I might find salt (Q11254) in. A property assigning a food item to a location of origination would be interesting; that's useful to a historian. A property assigning a food item to a method of preparation would be interesting; a cook or some such might want to look at different ways to make a food item. subclass of (P279) would be useful to describe a taxonomy of foods (Macaroni and Cheese has part (P527) Macaroni, Salt, Cheese; Macaroni is a subclass of (P279) pasta, which is a subclass of (P279) food), which probably satisfies your need best. And so forth. Categories, on the other hand, do not give me an explicit relation between an item in that category and some parent/sub category, and so at best I'm left with vague or inconsistent data/relations and at worst I'm left with wrong data/relations.

    Wikidata can probably help you and your specific use case. It's just that you haven't tried to go about getting help in the way that will most benefit, well, everyone. What kind of data do you actually need to work with? What kind of properties does your project/do you think are actually interesting about foods? That's the kind of information we should capture. --Izno (talk) 19:14, 10 April 2015 (UTC)

  • Oppose. Wikidata is not here to describe the wikipedia categories. Wikidata is here to describe real world concepts. The wikipedia category system does work as a sort of primitive wikidata and does contain useful information. That doesn't mean wikidata should duplicate the category system. It means that wikidata should harvest the information that is useful and translate it into wikidata statements using wikidata properties about real topics. Once this is done wikidata will have much better information than the categorisation systems do and categorisation can be left as a quaint legacy system to be used by wikipedians who don't want to migrate to using wikidata. Filceolaire (talk) 14:50, 10 April 2015 (UTC) OK. Jheald you have convinced me (see comment below). Support. Filceolaire (talk) 18:07, 11 April 2015 (UTC)
    @Filceolaire: The sheer arrogance behind that statement is breathtaking. Wikidata is here to describe anything that we want to describe, which includes anything which would be helpful to other projects. The category system will continue to persist on Wikipedias, and will continue to develop in parallel with Wikidata, for the foreseeable future. For all its shortcomings, it even has some advantages over Wikidata -- the speed at which new categories can be defined or subdivided, the ease with which relevant free-text information can be added to the category view page, the informational value of hand-made hierarchies adapted to the specific data, even the very looseness of the categorisation system, which can allow related articles to be discovered even if the nature of that relatedness is complex and would be hard to define accurately in Wikidata. So the question should be how the two systems can best be helped to support and enhance and inform each other.
    Wikidata can be used to support and improve Wikipedia categories -- if properties like "parent category" are available -- eg by giving a place and environment where the natures of categories can be described and documented in a structured way; or by making it possible to suggest content that could be used to automatically augment and improve existing category returns. This would be of real value to Wikipedians working with categories, which in turn would give them a significant motive to learn more about Wikidata, and get involved here, to improve an output that they were interested in.
    On the Wikidata side too, it is naive to think that all the information in a particular category tree can be absorbed into Wikidata in a single hit. Sure, one might capture the most obvious memberships. But the sheer flexibility of the category system, and its continuing hand-edited evolution, means that it will inevitably also include less obvious associations and new associations to be added in turn. Plus at the moment, nobody even knows which categories have been mined, or for what, or how complete that extraction was -- so anything that makes questions like that easier to analyse would be of use.
    Thinking of categories as a 'quaint duplicate legacy system' that Wikidata will somehow bury is a fundamental misconception, as well as incredibly hostile to what is still the overwhelming majority of Wikipedians. Instead, much better to think how drawing on Wikidata can help the existing category system evolve, until perhaps it becomes primarily Wikidata-driven. Jheald (talk) 21:52, 10 April 2015 (UTC)
  • Oppose, per arguments in the redundant proposal for category below. Emw (talk) 23:33, 22 April 2015 (UTC)

 Not done; no sign of consensus emerging. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:33, 24 April 2015 (UTC)

Redirect to (if no sitelink available in a language)

   Not done
Description This is to help with the 'Bonnie and Clyde problem' and the 'Hatter/hatmaking problem' by suggesting where sitelinks to appropriate articles can be found when there is no suitable sitelink in a particular language. In the future this can be used to add sitelinks to languages which are missing at the moment.
Data type Item
Domain any items
Allowed values any items
Example item and value Bonnie Parker (Q2319886) => Bonnie and Clyde (Q219937), hatmaking (Q663375) => hatter (Q1639239), hatter (Q1639239) => hatmaking (Q663375)
Discussion

Motivation:

Proposed by: Filceolaire (talk) There are many cases where the structure of wikidata means that there isn't always a one to one match between the articles in some wikipedias and the wikidata items and so it is difficult for a wikipedia to find the sitelinks to relevant articles in other languages where the article names are slightly different. This property will help in some of those cases. Filceolaire (talk) 00:12, 13 March 2015 (UTC)

  • Comment: This could be solved if a link to a redirect page was allowed. For example, the Emma Lockhart (Q448108) wikilink is a redirect to Ace Ventura, Jr.: Pet Detective, which is ideal. This was only possible because the link was made before the page became a redirect. Periglio (talk) 01:19, 13 March 2015 (UTC)
  • Support. We need something in place to allow multiple links to the same wiki article. I have noticed a tendency on Wikipedia to merge biographies into a single article. Periglio (talk) 01:19, 13 March 2015 (UTC)
  • This looks like a pragmatic scalable solution. Multichill (talk) 08:36, 13 March 2015 (UTC)
  • Comment (originally Oppose): A better solution is to allow sitelinks to redirects, with automatic reports possible that they are redirects and where they redirect to. Manual tracking of such redirects is likely to be sporadic at best, and unlikely to keep up with changes made to the different wikis. Trying to maintain a property like this by hand is not work that is worth doing. Jheald (talk) 09:31, 13 March 2015 (UTC)
    • Jheald I used to think that too but now I think this property will be easier to maintain than redirects in 250 different languages. The advantage of this property is that once it is done for one language it is available for use in every language. Filceolaire (talk) 10:31, 13 March 2015 (UTC)
    • It will also mean we can split items like 'the murder of Foo by Bar' into three separate items without having to worry as much about messing up the sitelinks. Filceolaire (talk) 10:39, 13 March 2015 (UTC)
      • @Filceolaire: Reducing my oppose to a comment, because on reflection it probably is useful to track on Wikidata how "redirects with possibilities" (as en-wiki tags them) are being used.
      However, I don't see this as an alternative to sitelinking to redirects, just a way of describing them. But just to be clear, are you suggesting that the long-term aim would be for a wiki-page to automatically be augmented with interwiki links to fallback pages using this property, if there were no interwiki link? That would be quite a transfer of control away from wikis, as to what redirects they want to exist, and where they should point to. Jheald (talk) 22:46, 13 March 2015 (UTC)
      • Jheald: Each wiki would have the choice to just show the sitelinks on the wikidata item (including any links to redirects) or to show these sitelinks plus sitelinks found using this wikidata property. Filceolaire (talk) 17:17, 14 March 2015 (UTC)
  • Comment: Like Periglio and Jheald, I think it would be simpler to allow links to redirects so I'm not convinced this is needed. Pichpich (talk) 16:13, 14 March 2015 (UTC)
  • Question: How is this supposed to handle different article splits in different languages? With a qualifier, for example
    < Bonnie > redirects to < Bonnie and Clyde >
            in wikipedia < en >
    ? I'm not sure it's a good idea to maintain redirects twice, in Wikidata and in Wikipedia. Why not use a small stub article or a soft-redirect-like template in those cases? TomT0m (talk) 18:56, 15 March 2015 (UTC)
    • It is not meant to reflect what is happening in any particular wiki. It is a suggestion from the wikidata community as to how we think wikipedias - especially small wikipedias with fewer resources - might manage the problem of missing sitelinks due to mismatches between the scope of different wikipedia articles. If a wikipedia feels it has the resources to manage redirects itself then it can decide to ignore our suggestions. Filceolaire (talk) 20:41, 19 March 2015 (UTC)
      • @Filceolaire: Then I think we had better use the existing relations, or a property inspired by the Simple Knowledge Organization System (Q2288360). For example, for the Bonnie & Clyde project, we know that Bonnie is a member of the duo. A Lua template could automatically make the suggestion: if the article about the part does not exist, then the article about the whole, if it exists, is a good candidate for the redirect. TomT0m (talk) 14:15, 22 March 2015 (UTC)
        • TomT0m that works for duos but there are a lot of other types of group articles - trios, families, partners, groups, lists of characters in a book, taxonomic genus, municipalities containing samename towns, three battles with the same name that occurred years apart, two castles on an island in the Hebrides, etc. Having a property to tell the template how to deal with this seems a lot easier than hoping the template is smart enough to spot all the cases with no false positives or false negatives. Filceolaire (talk) 16:45, 22 March 2015 (UTC)
          • @Filceolaire: I don't think you are right. Our properties are generic, so duos and trios are exactly the same thing as families or partners, like all part/whole relationships. For geographic entities, why not use the smallest entity that contains the entity we want? That seems a good criterion. Rather than adding a property, a few smart choices in the template, guided by generic properties, would be a lot more efficient and could solve most cases without anything further. TomT0m (talk) 17:25, 22 March 2015 (UTC)
            • TomT0m I agree that this will take care of most of these cases but there are still the strange ones where English Wikipedia has chosen to group things together because they have the same name. "three battles with the same name that occurred years apart, two castles on an island in the Hebrides," aren't just random ideas - they are real examples on en:WP that I have come across while trying to sort out wikidata sitelinks. They are not anything that will ever be linked by any sensible ontology but without the hint provided by this proposed property the other language wikis won't have links to these English articles which are actually more informative than any of the other language articles.
            • I will, however, in future, create WP redirects when I come across these weird cases and sitelinks to the redirects now that I know how to do it. This is a workaround that doesn't need a new property. Filceolaire (talk) 20:37, 22 March 2015 (UTC)
@Filceolaire: "It is a suggestion from the wikidata community as to how we think wikipedias - especially small wikipedias with less resources - might manage the problem of missing sitelinks". Well, I'm not sure if we should "suggest" things; we should add statements based on sources, not opinions. For example, enwp tends to redirect "Non-notable minor planet no 123456" to "List of minor planets (123000-123999)". Such linking looks very frustrating to me, since it gives no information at all. I would prefer a link to "Main belt asteroid" or "Spectral class X-asteroid" instead. But that is an opinion, and nothing we should build claims on! -- Innocent bystander (talk) 09:45, 23 March 2015 (UTC)
  • Support. Useful property and safe experiment. Visite fortuitement prolongée (talk) 17:35, 29 March 2015 (UTC)
  • I do not support. I think there enough properties to take care of this; notably part of and use/occupation. --Izno (talk) 20:36, 29 March 2015 (UTC)
  • Oppose. This is a dirty "workaround" for a serious problem. It won't fix anything. --Succu (talk) 21:50, 2 April 2015 (UTC)
  • Oppose. Redirects have to be handled on the WP side and not on the WD side, mainly because there is no uniform way across the different WPs to handle such situations. WD is not a mirror of WP but a database with its own structure. We have to find solutions, but I think multiplying tricks to do that is going in the wrong direction. Snipre (talk) 14:12, 7 April 2015 (UTC)
  • Oppose. Per Succu and Snipre. Casper Tinan (talk) 14:38, 7 April 2015 (UTC)
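
The alternative raised in the discussion above, deriving a fallback link from existing part/whole relations instead of a dedicated property, can be sketched roughly as below. Everything here is an assumption made for illustration: the `sitelinks` and `part_of` dictionaries, the wiki codes `wiki_a` and `wiki_b`, and the function name; only the two item ids come from the proposal's example.

```python
def fallback_sitelink(item_id, wiki, sitelinks, part_of):
    """Return (item id, page title) for the closest item that has a page on
    `wiki`, walking up part-of links when the item itself has none.
    Returns None if nothing suitable is found."""
    seen = set()
    current = item_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against cycles in the part-of data
        title = sitelinks.get(current, {}).get(wiki)
        if title is not None:
            return current, title
        current = part_of.get(current)
    return None


# Illustrative data: wiki_a has a separate article, wiki_b only covers the duo.
sitelinks = {
    "Q2319886": {"wiki_a": "Bonnie Parker"},
    "Q219937": {"wiki_a": "Bonnie and Clyde", "wiki_b": "Bonnie and Clyde"},
}
part_of = {"Q2319886": "Q219937"}

direct = fallback_sitelink("Q2319886", "wiki_a", sitelinks, part_of)
via_whole = fallback_sitelink("Q2319886", "wiki_b", sitelinks, part_of)
missing = fallback_sitelink("Q2319886", "wiki_c", sitelinks, part_of)
```

As noted in the thread, this only covers cases where a part/whole relation exists; articles grouped merely because they share a name would still need an explicit hint or a redirect on the wiki itself.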

name in kana

   Done: P1814
name in kana
the reading of a Japanese name in kana
Description The reading of a Japanese name written in kana (hiragana and/or katakana)
Represents kana
Data type String
Template parameter ex: "各国語表記" ja:Template:政治家 (Japanese politicians)
Domain (People, places)
Allowed values Hiragana and Katakana (=kana) characters, numbers and Latin characters if the original name has some.
Source ?
Example item and value Junichiro Koizumi (Q130852) => こいずみ じゅんいちろう
Discussion

Motivation: (short proposal in the Japanese discussion page; disclaimer: I am not a native speaker, and have only limited skills in Japanese). Currently, the names of Japanese people or places are only written in Kanji characters (Chinese characters) in Wikidata (in addition to kana/Latin characters/numbers… if the official name uses them). It is not possible to know from the name in Kanji how to read it without ambiguity. In the Japanese Wikipedia, the name in Kanji at the beginning is often/always followed by the pronunciation/reading in hiragana. Note that the names in foreign languages can help, but they may not match the Japanese pronunciation (ex: 東京 is Tokyo in English, but in fact it is pronounced Tōkyō with two long o).

An alternative would be to enter these kana names as aliases of the names.

Proposed by: Fabimaru (talk) 09:47, 29 March 2015‎

Support. As qualifier for any monolingual property with value in Japanese i.e. official name (P1448), birth name (P1477) or native label (P1705) etc. There is similar pinyin transliteration (P1721) for Chinese. Paweł Ziemian (talk) 18:36, 29 March 2015 (UTC)
  • Support. I don't know the languages, but would it make sense to only have one property for this or multiple? It seems like it should be multiple from the range provided. --Izno (talk) 20:35, 29 March 2015 (UTC)
I made a request for the creation of a new Property, but in fact I think that it will not be enough (so maybe all of you may agree with the need for having the kana names, but there is no concrete proposal yet). The names of a given Wikidata item can be present in at least 3 places: the label of the item (in particular the label in Japanese), the "alias" entries of the label, and also in other properties (like official name). For these 3 cases, when there is a name in Kanji (ex: 日本=Japan) there should be one or several associated names in kana (ex: Japan = can be pronounced both nippon and nihon!) that would explain how to read it. I cannot see a single way to store the kana names for these 3 cases. Let's review these 3 cases to see all the challenges:
  • For the label, I don't know where we would store it. In a property? (ex: kana version of the label). We could store
  • For the "aliases" of the label, there can be several names (ex: Emperor Hirohito), and each of them should have a kana version explaining the pronunciation. If we add the kana as additional aliases, there will be no way to make the relationship between the Kanji name and its pronunciation(s).
  • For the properties (like "Official name"), it seems a good fit to use a qualifier (I just discovered this concept, I am not a seasoned user of Wikidata).
So I don't know how we could handle the names in the section "label". Any idea is welcome. Fabimaru (talk) 18:50, 4 April 2015 (UTC)
In my experience labels and descriptions are usable only for searching data, and nothing more. Any valuable data needs to be stored in properties. BTW there is Hirohito (Q11709840), and I think it is a good place to store native label (P1705), which can be extended with a qualifier of the type in this proposal. After reading an article about Hirohito (Q34479) I think you can put two values in official name (P1448), one "裕仁" and second "昭和天皇", with qualifier start time (P580) equal to date of death (P570). When this "name in kana" property is ready, all presented examples can be extended with the new qualifier with corresponding kana values. Paweł Ziemian (talk) 19:58, 4 April 2015 (UTC)
There is a special alphabet (the International Phonetic Alphabet (Q21204)) to describe the pronunciation of words in English and other languages and we don't bother to include this in wikidata - though we may in the future. Does the Japanese Wikipedia have names in kana for places and people outside Japan?
Labels and aliases are used for searching for items. If the 'name in kana' is likely to be used as a search term then it should be included as an alias.
We have a lot of 'name' properties (a search for "P:name" will find most of them) and this property should be used as a qualifier to all of them. Filceolaire (talk) 02:39, 10 April 2015 (UTC)
  • Support. This property is essential data in the Japanese language. A person's or place's name generally has a single kana reading, but other concepts can have several. In Japanese Wikipedia articles, the kana are almost always written in the first sentence, within round brackets. (P.S. I'm a native Japanese speaker, but that does not make me a specialist. I'm just an average speaker.)--Was a bee (talk) 19:03, 31 March 2015 (UTC)
  • Support. Name in kana (読み仮名) is indispensable to dictionaries and encyclopedias in Japan. There seems to be no property corresponding to it at present. --本日晴天 (talk) 03:01, 7 April 2015 (UTC)
  • Support per 本日晴天. Kanjis (sinograms) have multiple readings, a property for kana reading will be really useful. Thibaut120094 (talk) 17:58, 7 April 2015 (UTC)
  • Support. We should also have a property to link to Commons audio files giving the pronunciation of names of people and places. Filceolaire (talk) 02:39, 10 April 2015 (UTC)
This is pronunciation audio (P443) --Pasleim (talk) 08:22, 10 April 2015 (UTC)
@Caliburn: - Sorry, but in fact I did not make it clear that I finally did not think that my initial proposal would solve anything. I think that the contributors agreed with the problem to be solved, but the way to solve it is still to be defined precisely.
The problem I was trying to solve was to store for each Japanese name (either in the label, the aliases and the properties) one or several readings (in kana characters). As mentioned above, it seems that a new qualifier would solve the problem nicely for the Properties containing a name (like "Official name"), so that would be a new proposal (I guess I should do a new proposal for that). Remaining problem: what do we do with the Japanese names stored as a label or an alias? I don't really have an answer. Would the contributors have to copy the name from the label into a property in order to be able to provide its associated reading(s) (via the new qualifier)? For example: for a temple, if a contributor wants to enter the reading, he will have to copy the name in a Property native label (P1705), and add one or several qualifiers. It seems a bit tedious, but the data model looks fine. - Fabimaru (talk) 18:53, 11 April 2015 (UTC)
I am afraid that you can do nothing more with label and aliases than adding more aliases, for example new aliases in kana. In fact all of them describe alternative names (spoken and/or written) of the item. Paweł Ziemian (talk) 21:19, 11 April 2015 (UTC)
(NB: I thought that a special type of new Property was required for qualifiers, but that is not the case.) Here is what I understand about the new Property (I hope it is correct, and that the Wikidata contributors will understand it the same way): if name in kana (P1814) is added as a statement on a given Item, then it describes the Japanese label of the item. For another Property containing a name (like official name), a contributor should add name in kana (P1814) to it as a qualifier. Thanks everyone for your collaboration. - Fabimaru (talk) 20:23, 12 April 2015 (UTC)
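
The qualifier usage described above can be pictured with a small data sketch. The dictionary layout is a deliberately simplified, hypothetical structure (not the real Wikibase JSON format), and the character check is only one possible reading of the "allowed values" line of the proposal: hiragana, katakana, digits, Latin letters and spaces.

```python
import re

# Hiragana (U+3040-U+309F) and katakana (U+30A0-U+30FF),
# plus digits, Latin letters and whitespace.
KANA_READING = re.compile(r"^[\u3040-\u309F\u30A0-\u30FF0-9A-Za-z\s]+$")


def is_kana_reading(text):
    """True if `text` contains only characters allowed in a kana reading."""
    return bool(KANA_READING.match(text))


# Hypothetical, simplified statement: official name (P1448) carrying
# name in kana (P1814) as a qualifier, one entry per reading.
statement = {
    "property": "P1448",
    "value": "日本",
    "qualifiers": {"P1814": ["にっぽん", "にほん"]},
}


def kana_readings(stmt):
    """Return the valid kana readings attached to a name statement."""
    return [r for r in stmt["qualifiers"].get("P1814", []) if is_kana_reading(r)]


readings = kana_readings(statement)
```

Attaching the readings to the statement that holds the kanji name keeps each reading tied to the name it explains, which is the relationship that plain aliases cannot express.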

useful for

   Not done
Description Source is a useful data item for target. Aliases: topic, tag
Data type Item
Domain any item
Allowed values any item
Example item and value Category:Food and drink (Q5645580) & List of films about food and drink (Q17097925) & List of books about food and drink (Q17097548) => Europeana Food and Drink (Q19723898)
Discussion

Motivation: Proposed by: Vladimir Alexiev (talk)

The Europeana Food and Drink project needs to find all Wikipedia categories and articles related to Food and Drink (with the intent of using these for classifying Cultural Heritage Objects on the same topic). I've searched for properties that can express this (eg topic, subject, tag), but none really seems to fit.

It's best to redefine P971 to make it more general. Otherwise, define this new property. Vladimir Alexiev (talk) 06:11, 2 April 2015 (UTC)

@Vladimir Alexiev: You forgot topic's main category (P910) and category's main topic (P301). If the items are classes then you can follow the class hierarchy with subclass of (P279) to find broader and narrower classes associated with that category, which is a good start. TomT0m (talk) 10:29, 2 April 2015 (UTC)
@TomT0m: This (pair) is one of the inappropriate properties that I considered. It's limited to article~category only and it's 1:1 (there cannot be more than one on either side of the property). My proposed "topic" is applicable to any item, and doesn't have such isomorphic flavor. Looking at my examples: Category:Food and drink (Q5645580) doesn't have topic's main category (P910) (since there is no category Chipotle); and Europeana Food and Drink (Q19723898) is not a category at all. --Vladimir Alexiev (talk) 10:37, 2 April 2015 (UTC)
  • This sounds like a post-it, which would be a very bad idea in my opinion. Do you know that you can edit the talk page of Wikidata items? Visite fortuitement prolongée (talk) 21:33, 2 April 2015 (UTC)
    • @Visite fortuitement prolongée: Not sure what you mean by "post it". If you mean "a comment", my proposal is to attach an item, not free text. Of course I know I can write on Discussion, but I cannot get it in a structured way with WDQ or the RDF export. --Vladimir Alexiev (talk) 11:42, 3 April 2015 (UTC)
  • Oppose - overly broad (i.e., undefined), and can be solved by querying the Wikipedia's local categories. Heck, en:WP:AWB can do this. --Izno (talk) 17:12, 3 April 2015 (UTC)
  • Oppose. We have is a list of (P360) as well. I don't think we need any more properties for describing these special Wikimedia pages which don't represent anything in the real world. If Europeana want to use Wikidata to find items related to food and drink then they should forget about Wikipedia categories and Wikipedia lists and think about how they can use Wikidata properties and statements to identify those items. If they have problems doing this then we need to look at what we need to do to solve those problems - not waste time trying to create some sort of second-class Wikidata using categories and lists. Filceolaire (talk) 02:20, 10 April 2015 (UTC)
  • Oppose per Filceolaire: if you want to use WP structure, use WP data directly; if you want to use WD data, use WD structure with appropriate properties. Snipre (talk) 13:57, 17 April 2015 (UTC)
  • @Filceolaire, TomT0m: Sorry for giving an overly narrow example. Generally, there's no property to say "Item1 is related or useful for Item2". You seem to claim (same as CIDOC CRM) that a general "relatedness" property is not useful. But one does not always know a more specific nature of the relation! Filceolaire, "If they have problems doing this then we need to look at what we need to do to solve those problems": exactly! Look at the next section: I am denied an equally general property "category" because "Wikipedia categories are a mess". Yet **there does not at present exist** anything better to catch all terms or concepts related to "food and drink". --Vladimir Alexiev (talk) 07:18, 22 April 2015 (UTC)

 Not done; no support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:30, 24 April 2015 (UTC)

category

   Not done
Description a Wikipedia category of the source item
Data type Item
Template parameter Wikipedia category assignments
Domain any item
Allowed values instances of Wikimedia category (Q4167836)
Source Wikipedia category assignments
Example item and value Chipotle (Q860808) => Category:Chili peppers (Q7580291), Category:Nahuatl words and phrases (Q7297107), Category:Smoked food (Q7469226)
Robot and gadget jobs a robot to synchronize with Wikipedias is a definite prerequisite
Discussion

Proposed by: Vladimir Alexiev (talk)

Category assignments (article<category and category<category) are not in Wikidata.

  • They are present on DBpedia (dct:subject and skos:broader respectively).
  • In contrast, the "topical relation" article~category is in Wikidata:

#Parent category proposes to add category<category. I am proposing to add article<category.

In fact it's better to merge the two proposals to just "category"

We need such relations for the Europeana Food and Drink project. If we don't get them from Wikidata, we have to get them from DBpedia. --Vladimir Alexiev (talk) 04:11, 30 March 2015 (UTC)

Because those things are not always the same in every Wikimedia project. Sjoerd de Bruin (talk) 05:36, 30 March 2015 (UTC)
@Sjoerddebruin: So what? Assume article A1 on enwiki has cats C1,C2; article A2 on dewiki has cats C3,C4. Assume A1=A2 and C1=C3 but C2<>C4. Then A1=A2 will get 3 cats: C1=C3, C2, C4. Don't see what is the problem. --Vladimir Alexiev (talk) 17:16, 30 March 2015 (UTC)
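The merge described in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the category names C1..C4 come from the comment, while the identifiers "item-1" to "item-3" and the mapping from local categories to shared items (assumed here to be derived from sitelinks) are hypothetical placeholders.

```python
# Hypothetical sketch: union of one article item's categories across wikis,
# deduplicated through the shared item that each local category maps to.

def merged_categories(local_cats, cat_to_item):
    """Return the set of shared category items for one article item."""
    merged = set()
    for wiki, cats in local_cats.items():
        for cat in cats:
            # Two local categories linked to the same item are counted once.
            merged.add(cat_to_item[(wiki, cat)])
    return merged


# A1 (enwiki) has C1, C2; A2 (dewiki) has C3, C4; A1 = A2.
local_cats = {"enwiki": ["C1", "C2"], "dewiki": ["C3", "C4"]}

# Placeholder mapping: C1 = C3 share one item, C2 and C4 do not.
cat_to_item = {
    ("enwiki", "C1"): "item-1",
    ("dewiki", "C3"): "item-1",
    ("enwiki", "C2"): "item-2",
    ("dewiki", "C4"): "item-3",
}

result = merged_categories(local_cats, cat_to_item)
print(sorted(result))
```

Under these assumptions the merged item ends up with three category values, which is the count given in the comment.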
@Vladimir Alexiev: I don't think we need category assignments in wikidata. Category is no knowledge but an arbitrary way to sort items that depends on the wikimedia project. It would not be possible to source that kind of statement and it would generate a lot of repetitions with the other statements. --Casper Tinan (talk) 20:29, 30 March 2015 (UTC)
@Casper Tinan: Categories are most assuredly knowledge. A bit messy, lacking in organization (true) but very comprehensive. Wikipedians are very serious about their categorization (see numbers in the linked report). We need them for http://vladimiralexiev.github.io/pubs/Europeana-Food-and-Drink-Classification-Scheme-(D2.2).pdf, or how else would you delineate a domain as wide as Food and Drink and its reflection in Culture? We also don't source Labels and Descriptions, but do you think you could live without them? As for "duplication", I am not sure what you mean. Cheers! --Vladimir Alexiev (talk) 22:16, 30 March 2015 (UTC)

See Help:Basic membership properties, topic's main category (P910) and Wikidata:Property proposal/Generic#Parent category. Visite fortuitement prolongée (talk) 20:14, 31 March 2015 (UTC)

I want to rebuff some patently false arguments that have been made:

  • "This is not true/useful knowledge": it is useful to me. Wikipedians are very serious about categorization. If you argue with a long-time Wikipedia editor that categories are useless, you'll get slapped in the face
  • "These are subjective, not sourced": so are labels and descriptions, yet we couldn't live without them
  • "They are inconsistent across Wikipedias": only someone who doesn't understand how language works can presume that a single global category system is possible or desirable. Local wikipedias reflect local differences and priorities. Inter-language links are also "not consistent" across languages, see eg Wikidata:Project_chat#The_Mortar_and_Pestle_problem.

--Vladimir Alexiev (talk) 07:34, 2 April 2015 (UTC)

"We need such relations for the Europeana Food and Drink project." →‎ Explain. Visite fortuitement prolongée (talk) 21:38, 2 April 2015 (UTC)
We plan to use Articles as concepts to classify cultural heritage objects, and Categories as means to organize these concepts. See http://vladimiralexiev.github.io/pubs/Europeana-Food-and-Drink-Classification-Scheme-(D2.2).pdf which analyzes some 20 datasets for relevance to Food and Drink and concludes that only Wikidata has the breadth to cover this general topic --Vladimir Alexiev (talk) 11:39, 3 April 2015 (UTC)
@Vladimir Alexiev: Did you read the Help:Classification page ? From what you just say, I think you should use metaclasses instead of categories (see also metaclass (Q19478619) (View with Reasonator) ) TomT0m (talk) 11:46, 3 April 2015 (UTC)
@TomT0m: Wikipedia categories are navigation links, they make no claim about the nature of the relation. Eg a category "books by X" typically will be applied not only to books, but also to (the article about) X himself. So I don't think metaclass (Q19478619) (View with Reasonator) is appropriate. --Vladimir Alexiev (talk) 11:58, 3 April 2015 (UTC)
@Vladimir Alexiev: First, the typical use case: for example, Reasonator naturally finds all the books of some author, just using regular properties. Second, metaclasses can be used for classification purposes; for example, we could have a metaclass "Jules Verne related class", with
< Jules Verne Book > instance of (P31) < Jules Verne related class >
The relationship is well defined, although it's a bit weak in expressing how the class is related. TomT0m (talk) 12:28, 3 April 2015 (UTC)
@TomT0m: An excellent example in my favor! WD (and Reasonator) shows only about 10 "notable works". In contrast, https://en.wikipedia.org/wiki/Category:Works_by_Jules_Verne (a subcat of the Main Category of the item) lists 77 works (in 2 subsubcats). This is the richness of categories! It could/might be transferred into properly structured properties in Wikidata...maybe in 10 years time. But I'm not talking what we could have: I'm talking about what is already in Wikipedia but not reflected in Wikidata. --Vladimir Alexiev (talk) 15:47, 3 April 2015 (UTC)
@Vladimir Alexiev: 1) nope, you did not look well enough. Scroll down a little bit, click on the right place, and you'll see all the books from Jules Verne. 2) And the other part of my message? TomT0m (talk) 16:14, 3 April 2015 (UTC)
@TomT0m: Sorry I didn't see that. But take for example Bulgaria's biggest writer Ivan Vazov: Reasonator knows only 7 works (mostly coming from plwiki), whereas the bgwiki category https://bg.wikipedia.org/wiki/Категория:Иван_Вазов has 38 works (the full list is a couple of hundreds but a lot of them still don't have articles: https://bg.wikipedia.org/wiki/Категория:Иван_Вазов). Do you disagree that it will be several (maybe 10) years until ALL of the information from Wikipedia categories is available in separate properties? You could model them with metaclasses (I'd just use a construct from Description Logics, tracing the "author" property from the writer in question), that's not the point. The point is that this wealth of info exists in Wikipedia TODAY and is not available through Wikidata. --Vladimir Alexiev (talk) 07:48, 7 April 2015 (UTC)
Wikipedia use categories for classification. Wikidata use something else (see Help:Classification and Help:Basic membership properties). And I still do not know why this Food and Drink project need categories inside Wikidata, and how it plan to use them. Visite fortuitement prolongée (talk) 21:20, 3 April 2015 (UTC)
@Visite fortuitement prolongée: Model the info available in Wikipedia categories any way you like. But model it, don't leave it out! --Vladimir Alexiev (talk) 07:48, 7 April 2015 (UTC)
"it is useful to me." →‎ Explain. Visite fortuitement prolongée (talk) 21:38, 2 April 2015 (UTC)
"only someone who doesn't understand how language works can presume that a single global category system is possible or desirable" →‎ This is not a rebuttal of the argument. Visite fortuitement prolongée (talk) 21:38, 2 April 2015 (UTC)
  • Question @Vladimir Alexiev: Why do we need to reflect the Wikipedias' categories? They will continue to be there and to be maintained, queryable and extractable. Why do we need to mimic them here? Wouldn't they be messy and double the work of maintaining them, since a change on some Wikipedia would have to be reflected here? Tools that use this information will be able to get it from other sources, won't they? On the other hand, here we try to use other tools to classify, like Help:basic membership properties and Help:classes. Isn't there some chance that maintaining a category hierarchy will make things blurrier for users? Not to mention the fact that this is a central point and that there are as many category hierarchies as there are Wikipedias; it would be a mess to centralize everything here ... TomT0m (talk) 10:36, 2 April 2015 (UTC)
    @TomT0m: These are very good questions. For providers (editors), this proposal gives no benefits; unlike the inter-language link, which converted a combinatorial explosion (N*N) into a linear complexity (N).
    For consumers, it gives benefits: it's easier to get all data from one place. I was working on the assumption that a goal of WD is to ultimately centralize all knowledge. @Multichill: wrote a couple days ago "We have bots and we're not afraid to use them... Wikidata will help make linked open data mainstream and will become the central hub for LOD, just like Wikipedia is the central hub for people looking for information."
    If that's not true (or not yet feasible), suspend this proposal. I'm not afraid to load and interlink DBpedia and Wikidata into the same repo, but many other consumers are.
    In this train of thought... What happens to the Labels that people are furiously editing here? Nothing yet? --Vladimir Alexiev (talk) 10:47, 2 April 2015 (UTC)
    @Vladimir Alexiev: I still don't understand for what tasks you specifically need a category property. Could you give us more detail ?
    Now this proposal has in my opinion several flaws. It would enable users to use only one property to express almost anything. For instance, in the proposed example, the main ingredient of the meal (chili-pepper), the preparation process (smoking) and the etymology of the word are put on the same level. It would be impossible to get precise information from a query using that property. It would also probably generate repetitions. Let’s take the example of Richie McCaw (Q726148). If I added to the Wikidata item the categories attributed in the article on the French Wikipedia (New Zealand national team rugby player, Canterbury Crusaders rugby player, flanker, Born in New Zealand, Born in December 1980), it would not add any more information about the person. There are already statements to tell that to the Wikidata users.
    1. There are many millions of items with very little info. As I said, Wikipedians take their categories very seriously: each article has 4.4 categories on average. This is a wealth of info that's currently not on WD. 2. Many of the categories represent various relations that will not be captured into separate properties for a very long time. Eg see "chipotle" above, none of its categories fits the "when/where" pattern you gave as example for that rugby player. --Vladimir Alexiev (talk) 11:39, 3 April 2015 (UTC)
    You don't need a category property to transfer the information provided by Wikipedia categories to Wikidata. You need a bot and a good understanding of the relation between the category and its elements. There are already bots doing just that. A category property would be too vague regarding the nature of the relation between the two items linked together. Casper Tinan (talk) 19:46, 3 April 2015 (UTC)
    By the way, your proposal would be better defended if you could
    1/ use a less aggressive tone (‘rebuff’, ‘patently false arguments’)
    2/ avoid straw man fallacies: nobody said categories are not useful at all. They are useful to browse through Wikipedia pages and do maintenance. I just challenge the fact it is needed within Wikidata. Nobody said the Wikipedia users are not serious about the categories they attribute to an article. But the definition of the categories, of how articles need to be sorted out, is at best an educated guess about how users browse through Wikipedia. As Wikidata is queryable, there is more efficient way to search items in the base.
    3/ use real arguments: ‘People would slap you in the face if you said that’ is not a proof that they’re right. It is a proof that they’re violent.
    Casper Tinan (talk) 12:41, 2 April 2015 (UTC)
    Sorry for using a more colorful language than you like. But how familiar are you with categorization on Wikipedia, and the editorial guidelines regarding categories? I've read them, and I repeat that Wikipedians take their categories very seriously --Vladimir Alexiev (talk) 11:39, 3 April 2015 (UTC)
    By the way, there is a mistake in the example provided. It is wrong to attribute the 'Nahuatl word' category to Chipotle (Q860808) as the item is about smoked food, not about a word. Casper Tinan (talk) 22:05, 2 April 2015 (UTC)
    Ah, the distinction about term/concept and its denotation. But WD does not have separate entries for a concept and its denotation (nor it is desirable to have them, since that would explode the data and get many people confused). Or maybe you'd argue that Chipotle (Q860808) is not a word: Well then, would you argue that BabelNet's integration of Wordnet and Wikipedia/Wikidata (eg see http://babelnet.org/synset?word=bn:00018522n&details=1&orig=chipotle&lang=EN sec Sources) is wrong? In any case, I did not make this category assignment, Wikipedia editors did. --Vladimir Alexiev (talk) 11:39, 3 April 2015 (UTC)
    As WD is a multilingual database, WD items only refer to concepts and not to words. A statement saying computer (Q68) is an English word would be right for German speakers and wrong for French speakers.
    Casper Tinan (talk) 20:07, 3 April 2015 (UTC)
    "Ah, the distinction about term/concept and its denotation. But WD does not have separate entries for a concept and its denotation" →‎ Actually, the separation is made between Wiktionary and Wikipedia. Wiktionary store term/word, Wikipedia store denotation. Visite fortuitement prolongée (talk) 21:20, 3 April 2015 (UTC)
    "In any case, I did not make this category assignment, Wikipedia editors did." →‎ Indeed. Visite fortuitement prolongée (talk) 21:20, 3 April 2015 (UTC)
Categories provide very rich structured knowledge (albeit they are messy etc etc). You are refusing this knowledge a place on Wikidata. I thought Wikidata should be the home of all structured knowledge from Wikipedia, guess I was mistaken. Then I'll have to get it from DBpedia. But I'll also have to provide some way for our project partners to enter additional items somewhere else. --Vladimir Alexiev (talk) 07:54, 7 April 2015 (UTC)
  • Support everything that @Vladimir Alexiev: has written. WP Categories are things that are well worth analysing -- what's in them, what's not in them (but could or should be), how they differ from one language to another, etc, etc. In many cases a WP category can tell you something not reflected on Wikidata, if it's capturing something that is not yet the subject of a property, or data not yet input into one. Yes, WP categorisation has been human and noisy and awkward in all sorts of ways. But it reflects a human-curated understanding of the item that is worth more than just dismissing and throwing away. And having it would add to the more structured properties already here, not replace them. (Plus, not least, having the information here would open possibilities for improving, or at least suggesting improvements for categorisation on WP). One advantage of WD reflecting information gathered from many different sources is that it is then possible to compare and consistency-check them. WP categories should not be excluded from input into that process.

    At the moment, you can just about do it -- if you're a real Magnus, and prepared to work with all the WP SQL tables alongside Wikidata. But for ordinary mortals like me, that's a very real barrier. I for one would welcome a section at the bottom of each item page, showing what categories the item had been categorised for, and in which languages -- firstly because this would often reveal more properties that should be added; secondly, because I think it's of interest in itself to be able to see the different categorisation patterns in different languages; and thirdly, because for analysis purposes it would be far easier to have this information directly integrated into Wikidata, directly available in a straightforward, consistent way from WDQ and dumps, rather than what we have at the moment which simply isn't. Jheald (talk) 15:56, 2 April 2015 (UTC)

    @Jheald: Would some Wikidata game help, one that suggests statements or produces reports such as "this item is categorised in that Wikipedia category, but there is no relationship, or only a weak one, between those items in Wikidata"? I guess that based on this kind of report we would understand things better and maybe find useful properties that are missing, and whose absence currently prevents us from expressing a correct relationship. Do you know that tools like autolist2 from Magnus allow queries that cross data from Wikidata with Wikipedia's categorisation? TomT0m (talk) 16:19, 2 April 2015 (UTC)
    @TomT0m: Yes, that can do certain things. But it still has its limits -- compared, for example, to how easily one could write a programmatic WDQ query to return all items in a category that didn't match the category conditions as expressed by the category's is a list of (P360). That's the sort of flexibility that putting the category membership information into the same structure as everything else makes possible. Jheald (talk) 16:34, 2 April 2015 (UTC)
    Relevant question about the potential for "virtual" properties asked at Wikidata:Contact_the_development_team#.22Virtual.22_properties_.3F. Jheald (talk) 16:36, 2 April 2015 (UTC)
    This different proposal, in which Wikidata would store neither the data (except for some caching) nor the data history, and would only work as a gateway, would likely be spared the first (huge data size) and second (huge data rate) issues that I have mentioned. Visite fortuitement prolongée (talk) 21:38, 2 April 2015 (UTC)
    I like this idea. --Izno (talk) 17:59, 3 April 2015 (UTC)
  • Oppose this proposal in its current version, for the same arguments as in #Parent category (the data size and data rate will be even higher here than in the other proposal). Visite fortuitement prolongée (talk) 21:38, 2 April 2015 (UTC)
  • Comment - Another argument I forgot. I wrote above "For providers (editors), this proposal gives no benefits". That's true when the data is already in Wikipedia.
    • But many of the content providers in Europeana Food and Drink are small museums, and they don't have the staff to write WP articles about cultural concepts used in their collections, yet missing in their national Wikipedias ("We will not deliver articles to Wikipedia, as unfortunately we don't have time for such additional activities").
    • Eg Swiecenie Koszyczek, "blessing of the baskets", is a very colorful tradition represented in artefacts of one of our providers, but unfortunately missing in Polish Wikipedia.
    • They would however be able to add a Wikidata item, since that's a lot less work, and no pesky Notability questions asked.
    • What I also need is for them to tie it up to some categories, eg "Easter traditions", "Egg-related dishes", "Easter foods". Then I can do useful classification with these categories.
    So, would you accept this property as a small number of additional category assignments, not replacing those in Wikipedia? Thanks! --Vladimir Alexiev (talk) 11:39, 3 April 2015 (UTC)
    gift. Visite fortuitement prolongée (talk) 21:20, 3 April 2015 (UTC)
What does this link have to do with the Polish cultural tradition? And how does it help the fact that it's not on Wikipedia nor Wikidata, and your refusal to add categories makes it pointless to create it on Wikidata? --Vladimir Alexiev (talk) 07:54, 7 April 2015 (UTC)
  • Reject per [1]. Also per the wikis' present usage of categories to categorise items related to a particular topic in a generic sense rather than a more specific sense, per TomT0m. --Izno (talk) 17:53, 3 April 2015 (UTC)
Oppose. WP categories are outdated: categories are a specific way to group articles which reflects not common knowledge but a particular point of view. And WD can move away from this way of grouping by using queries: you want the list of all books of Jules Verne? Just perform a query using instance of book with author equal to Jules Verne. WD moves to dynamic management of lists and categories, whereas WP works with a static solution. How can WP categories provide the list of musicians living in the XVIII century who were speaking English? Think dynamic. Snipre (talk) 09:59, 4 April 2015 (UTC)
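As a rough illustration of the "dynamic" selection argued for in the comment above, here is a minimal sketch that filters a toy set of statements instead of reading a static category. The item names, property labels and data are hypothetical placeholders, not real Wikidata content, and a real query would go through a query service rather than an in-memory dictionary.

```python
# Toy data, all hypothetical: item -> {property label -> set of values}.
items = {
    "work-1": {"instance of": {"book"}, "author": {"Jules Verne"}},
    "work-2": {"instance of": {"book"}, "author": {"someone else"}},
    "work-3": {"instance of": {"film"}, "author": {"Jules Verne"}},
}


def select(items, conditions):
    """Return the sorted items whose statements satisfy every condition."""
    return sorted(
        item
        for item, statements in items.items()
        if all(
            value in statements.get(prop, set())
            for prop, value in conditions.items()
        )
    )


# "instance of book with author equal to Jules Verne"
books_by_verne = select(items, {"instance of": "book", "author": "Jules Verne"})
print(books_by_verne)
```

The same function answers a different question simply by changing the conditions, which is the point being made about queries versus fixed category membership. It only finds what has already been entered as statements.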
@snipre: I want the list of all articles related to Food and Drink. Do you know of any magical query to fetch this? --Vladimir Alexiev (talk) 07:48, 7 April 2015 (UTC)
@Vladimir Alexiev: You have to use the current "instance of" and "subclass of" properties and a Food and Drink ontology to classify items in WD and then recover the items you want through a query. What you want is to avoid building an international ontology and to use an existing one which is not completely formulated. WP categories are a kind of classification without global overview. So instead of using an empirical classification, just start the work correctly: 1) define the set of items you want to classify, 2) define the properties/characteristics you need to know about each item to be able to query, 3) organize with a bot the correct classification using your rules and existing parameters like categories. As I see from your examples, if you want to query according to some ingredients, you need a property "ingredient"; for the country or culture you need another one; for specificities like the period when the food/drink is prepared, another one, ... Have a look at that introduction to have some background.
And if you don't want to create a completely new ontology, you can try to find one which already exists (see here). Snipre (talk) 12:41, 7 April 2015 (UTC)
  • Oppose. If the categorisation can be expressed via a property:value statement then we should add that statement, not this vague category statement. If the categorisation can not be expressed as a statement then we almost certainly don't need it. The categorisation systems on en and other wikipedias certainly do contain a lot of information which can be harvested and converted into useful statements and we should certainly do this conversion even if it means we need to create more properties. For instance we need a "preparation method" property so we can include the statement "Chipotle:preparation method:smoked". The raw data however should stay in the wikipedias. Filceolaire (talk) 02:06, 10 April 2015 (UTC)
  • Oppose. Transferring the content of Wikipedia's categories as it is is not an appropriate way to exploit the data. This would generate repetitions and errors and force users to develop a second ontology to define the relation between categories. Moreover, it would be impossible to implement useful constraints for this property. Casper Tinan (talk) 18:47, 10 April 2015 (UTC)
    • @Casper Tinan: Actually, developing categorisation constraints (or at least anomaly detection) would be one of the things that would be made more straightforward by having this property. One would use is a list of (P360) statements to define the main content-areas of the category (as @GerardM: is already doing). Any lists related to the category by list related to category (P1753) would be automatically whitelisted, as would any other items specifically identified by the proposed #Category auxiliary item property above; then everything else could be reported as an anomaly -- either as a constraint violation against the P360 statements on the category, or against the category statements on the item. Jheald (talk) 22:09, 10 April 2015 (UTC)
      • @Jheald:No, it would be made more difficult. The only constraints that you would be able to implement with this property is that the value should be an instance of category. You wouldn't be able to add any constraint regarding the nature of the item (a member of a category can be a category, or an instance or a subclass of anything) or the number of values (WP articles are generally members of many categories). Casper Tinan (talk) 07:51, 11 April 2015 (UTC)
        • @Casper Tinan: You're right -- if we only consider the basic constraints currently defined at Template:Constraint. However, with a is a list of (P360) in place for a category, it becomes possible to define an additional type of check. Suppose the category is for women engineers, with a P360 for the category item defined as per the example in the template of the top of Property talk:P360, ie with is a list of (P360) => human (Q5), with qualifiers occupation (P106) => Sas van Gent (Q81086) and sex or gender (P21) => female (Q6581072). Then it becomes possible to throw a violation/anomaly for any article-like item of a member of the category that does not contain the statements instance of (P31) => subclass of human (Q5), occupation (P106) => subclass of Sas van Gent (Q81086), and sex or gender (P21) => subclass of female (Q6581072), and record that as a violation on the article item, the category item and P360. This is really quite a powerful set of constraint checks that integrating category membership data into Wikidata could make possible. Jheald (talk) 18:12, 11 April 2015 (UTC)
          • @Jheald: Except ... categories are a mess. You often find the article about some occupation categorized in the category named after that occupation. The properties defined in Help:Basic membership properties, instance of and subclass of, do not have this problem, as a class hierarchy has to satisfy certain constraints. You can't say building is an instance of the building class, but the Eiffel Tower is. You'd get plenty of false positives with these constraints, assuming you can associate a query with them. By contrast, we could associate a Wikidata query with the class item, which would give a precise definition of the class. The is a list of property actually has a feature to do that, but it is unclear how this will work with the future query engine in Wikidata ... I would not fight for this at this point in the project. TomT0m (talk) 13:25, 12 April 2015 (UTC)
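The check discussed in this thread can be sketched roughly as follows. This is a simplified illustration, not an existing Wikidata constraint: the requirement values are the ones quoted in the comment above, the member names and their statements are hypothetical, and values are matched directly with no "subclass of" traversal. The last member shows the kind of false positive raised in the reply, where the article about the occupation itself sits in the category.

```python
# Requirement derived from the category's P360 statement and its qualifiers,
# using the identifiers quoted in the discussion.
REQUIRED = {"P31": "Q5", "P106": "Q81086", "P21": "Q6581072"}


def anomalies(members, statements, required):
    """Map each category member to the required properties it fails."""
    report = {}
    for member in members:
        claims = statements.get(member, {})
        failed = [
            prop
            for prop, value in required.items()
            if value not in claims.get(prop, set())
        ]
        if failed:
            report[member] = failed
    return report


# Hypothetical members of the category and their statements.
statements = {
    "person-A": {"P31": {"Q5"}, "P106": {"Q81086"}, "P21": {"Q6581072"}},
    "person-B": {"P31": {"Q5"}, "P106": {"Q81086"}},   # P21 not yet entered
    "topic-article": {"P279": {"occupation"}},          # about the occupation itself
}
members = ["person-A", "person-B", "topic-article"]

report = anomalies(members, statements, REQUIRED)
print(report)
```

Here "person-B" is flagged as missing data that could usefully be added, while "topic-article" is flagged although its category membership is legitimate on Wikipedia, so a whitelist or manual review would still be needed.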

First, thanks for not rejecting this yet, despite all the Opposition :-) @JHeald, TomT0m, Filceolare, Snipre, GerardM:

  • Thanks to everyone for pointing out how to design ontologies (I'm an ontology engineer myself) and food ontologies (I'd researched 3 such and 20 various datasets). The problem is that I need to catch everything related to Food and Drink. Wikipedia & DBpedia have only 6.6k proper "Foods" (with infobox, DBpedia class, ingredient list etc). Wikidata visual class hierarchy has only 2050 foods and 184 drinks as of 22-Jan-2015. But I also need food, drinking, hunting & agriculture tools, traditions, styles, events, people, etc. My estimate is that there are 200-500k FD-related items (in all 11 Wikipedias across the EFD project languages).
  • Please help me to represent the following case. I created an item t'ala cup (Q19825902) "standing cup used to drink t'ala beer" corresponding to this object type at the Horniman. I made it a subclass of cup (Q2100893). But I can't relate it to category Category:Drinkware (Q7440281)!
  • This begs the question: what's the purpose of categories like Category:Drinkware (Q7440281), if they only hold a single object through category's main topic (P301)? If categories are so messy as to be useless then remove all category items from Wikidata!
  • You may tell me "use the class and instance hierarchy" but that's not enough. Eg Święconka (Q877920) (which has to do with Easter eggs but also other foods) has 3 wiki site links, and the following categories (pl, de are translated):
    • enwiki:Święconka: Easter traditions, Polish traditions
    • plwiki:Święconka: Easter Traditions, Old Polish Traditions, German Cuisine
    • dewiki:Osterspeisensegnung_in_Polen: Food and Beverages (Easter), Festivals and Customs (Poland), Roman Catholicism in Poland, Sacramental
    • As you see, I need the dewiki category to catch it as Food and drink
    • And BTW, the Wikidata class hierarchy is as messy as the categories: 16k classes of which 2/3 don't even have 5 instances, and lots of cleanup is needed. I'll use it if I need to, but I'll also have to use the categories.
  • So please allow me a property to let me add new categories to new Wikidata items. It's extremely hard for GLAMs to add Wikipedia articles, much easier to add Wikidata items (I'll post a presentation about this soon). But without the ability to link these items in the same way as Wikipedia articles can be linked, GLAMs just won't do it.
  • I'm not trying to force anyone to use the messy categories. But 18M categorylinks exist in enwiki (plus more in other languages), and we have to use them since there's nothing better at present! We are used to wrangling with large amounts of data, using machine learning and classification approaches. But you are forbidding me from using categories at Wikidata, even as an addition to (not duplication of) Wikipedia categories.
  • @Casper Tinan: can you please explain how I can use is a list of (P360)? AFAIK, it doesn't apply to categories. Please point to an example.

Thanks! --Vladimir Alexiev (talk) 07:59, 22 April 2015 (UTC)

@Vladimir Alexiev: A quick thought about your example of a tool for drinking beer:
  • propose a property (if it does not exist) for the function of a tool class
  • create a
    < Stand cup > subclass of (P279) < cup >
  • add a statement
    < Stand cup > function < drink beer >
    with a
    < beer drinking > subclass of (P279) < drinking >
  • we only need something to link the process of beer drinking to one of its inputs: beer. I must admit I have not yet managed to design a community-accepted ontology for the inputs of processes, but it's similar to a raw material property, I guess ... maybe we can propose generic process input and process output properties, this would be a start. But I understand this does not fully satisfy you. What I'm afraid of is that if we allow properties that are too weak we will have a bunch of barely usable statements and that nobody will do the work to make the relationships precise. We will end up with a database that does no better than the category system or a basic textual context analysis that some big player like Facebook can already do with deep learning technology ... TomT0m (talk) 08:10, 22 April 2015 (UTC)
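The modelling outlined in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the items are plain labels rather than real Q-ids, "function" is the proposed property and does not exist yet, and the helper names superclasses and items_with_function are hypothetical. It shows how a query for tools whose function is a kind of drinking could follow the subclass of (P279) chain.

```python
from collections import defaultdict

SUBCLASS_OF = "subclass of (P279)"
FUNCTION = "function"  # assumption: the proposed property, not an existing one

# Statements from the example, with labels standing in for items.
triples = [
    ("stand cup", SUBCLASS_OF, "cup"),
    ("beer drinking", SUBCLASS_OF, "drinking"),
    ("stand cup", FUNCTION, "beer drinking"),
]


def superclasses(item, triples):
    """Return the item together with every class reachable via subclass-of."""
    parents = defaultdict(set)
    for subject, prop, obj in triples:
        if prop == SUBCLASS_OF:
            parents[subject].add(obj)
    seen, stack = {item}, [item]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def items_with_function(process, triples):
    """List subjects whose stated function is the process or a subclass of it."""
    return sorted(
        subject
        for subject, prop, obj in triples
        if prop == FUNCTION and process in superclasses(obj, triples)
    )


print(items_with_function("drinking", triples))
```

Asking for "drinking" finds the stand cup even though its statement only mentions "beer drinking", which is the kind of inference the proposal relies on.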
I don't think there should be an item "beer drinking", since that's a pre-coordinated concept (AND) of "beer" and "drinking". LCSH uses precoordination ("Italian love poetry 16th century") and is widely criticised for that.
Being afraid of the weak semantics is right. But rejecting the wealth of info available in Wikipedia cats on these grounds is wrong. Allow computer scientists like us to try to make meaningful use of it, don't reject it. Eg Babelnet has used the cats profitably (for disambiguation) to align Wikipedia to Wordnet. --Vladimir Alexiev (talk) 08:45, 22 April 2015 (UTC)
@Vladimir Alexiev: Beer and drinking are clearly not of the same nature. If we don't have that class, it becomes much harder to say that the usage of a cup is to drink beer, and harder to find all types of cups or glasses that are dedicated to beer. What we need to manage this is a good inference system that notices when there is some kind of redundancy. Otherwise, to say this we would have to create a weird usage of scary qualifiers like "of", which will have some weird use cases and will probably be inconsistent ... But I'm open to discussion. I just think my model does the trick and is generic enough to open a vast field of application. TomT0m (talk) 09:00, 22 April 2015 (UTC)

People are already misusing topic's main category (P910), eg Amethystium (Q470368) (a group) => Q8946960 (Q8946960) (electronic/ambient music). But topic's main category (P910), being the inverse of category's main topic (P301), should be used only once per category. As I pointed out above, singleton categories are useless, so if you reject this property then be consistent and remove category items from Wikidata. --Vladimir Alexiev (talk) 08:45, 22 April 2015 (UTC)

Category items are mainly used in WD to link pages between WPs, so no deletion can be approved. But I agree that the use of the properties topic's main category (P910) and category's main topic (P301) is a problem, because people try to use them to recreate the category system in WD. But I don't understand why you want to use WP categories when you yourself point out the problems of that system (see the example of the differences between the English, Polish and German WPs in the categorization of articles). The category system is pointless because there is no unique system, only different ways to classify articles. So even if we create your proposed property, can you explain to us which category system you will use? The German one? The English one? A mix of the different WPs? I think that the main opposition to using the WP systems comes from the discrepancies between the different WPs. A WD classification doesn't exist yet, but the advantage of a WD system will be its uniqueness.
You will find more support if you provide the structure of the system you want to use in WD: we need to be able to learn and to work together according to a defined classification which can be proposed to everyone who wants to work with the items concerned. WP categories don't provide this feature because 1) categories are WP-dependent (there is no single classification system), 2) there is no common and centralized documentation. To overcome these difficulties, the best solution is to start a project about the Food and Drink subject and to propose or develop a classification system there based on "instance of" and "subclass of". Snipre (talk) 13:50, 22 April 2015 (UTC)

Oppose. As a long-time Wikipedian who has spent many hours manually curating and browsing categories on Wikimedia Commons, I oppose adding further support for categories on Wikidata. I often do not agree with Snipre and TomT0m, but I agree with their robust arguments in this case. I would support deleting topic's main category (P910) and category's main topic (P301). I actually supported topic's main category (P910) as an intermediate step to eliminate categories, but I have not seen any notable work on that front in the 1.5 years since then. Thus I now believe that adding more properties to support categories would do less to help deprecate Wikimedia's category system than it would to help categories become a permanently entrenched legacy system that would fundamentally -- and redundantly, less expressively -- bifurcate how we structure data. Emw (talk) 23:31, 22 April 2015 (UTC)

 Not done; no sign of consensus emerging. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:31, 24 April 2015 (UTC)