Wikidata:Property proposal/Term

This page is for the proposal of new properties.

Before proposing a property
  1. Check if the property already exists by looking at Wikidata:List of properties (manual list) and Special:AllPages.
  2. Check if the property is already pending or has been rejected.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically. See WD:WikiProject Infoboxes for suggestions.
  4. Select the right datatype for the Property.
  5. Start writing the documentation based on the preload form below and add it in the appropriate section.

Creating the property

  1. Creation can be done after 1 week by a property creator or an administrator.
  2. See steps when creating properties.

Add a request

This page is archived, currently at Archive 29.

To add a request, you should use this form:

=== {{TranslateThis | anchor = en
| de = <!-- PROPERTY NAME IN German (optional) -->
| fr = <!-- PROPERTY NAME IN French (optional) -->
<!-- |xx = property names in some other languages -->
}} ===
{{Property documentation
|status                 = <!--leave this empty-->
|description            = {{TranslateThis
  | en = ...
  }}
|subject item           =  <!-- item corresponding to the concept represented by the property, if applicable; example: item ORCID (Q51044) for property ORCID (P496) -->
|infobox parameter      = Wikipedia infobox parameters, if any; ex: "population" in [[:en:template:infobox settlement]]
|datatype               = put datatype here (item, string, media, coordinate, monolingual text, multilingual text, time, URL, number)
|domain                 = types of items that may bear this property
|allowed values         = type of linked items (Q template or text), list or range of allowed values, string pattern...
|source                 = external reference, Wikipedia list article, etc.
|example                = {{Q|1}} => {{Q|2}}
|formatter URL          = 
|filter                 = (sample: 7 digit number can be validated with edit filter [[Special:AbuseFilter/17]])
|robot and gadget jobs  = Should or are bots or gadgets doing any task with this? (Checking other properties for consistency, collecting data, etc.)
}}

Proposed by: ~~~

(Add your motivation for this property here.) ~~~~

For a list of infobox parameters, you might want to use table format:

{{List of properties/Header}}

{{List of properties/Row|id=
|title          = audio
|type           = media
|qualifier      =
|description    = Commons sound file
|example-subject= Q187 <!-- Il Canto degli Italiani -->
|example-object = Inno di Mameli instrumental.ogg
}}


For blank forms, see Property documentation and List of properties/Row

To reduce page size and improve loading, some proposals are kept on separate pages:

  1. For transportation-related item property proposals, see Wikidata:Property proposal/Transportation.
  2. For economics-related item property proposals, see Wikidata:Property proposal/Economics.
  3. For natural science-related item property proposals, see Wikidata:Property proposal/Natural science.

Products & software products

Languages / Sprachen / Langues


   In progress
Description type of wine made primarily from a single named grape, herb or fruit variety
Data type Item
Domain Zinfandel (Q204433), Cabernet Sauvignon (Q207310), Chardonnay (Q213332), Merlot (Q213338), apple (Q89), Pinot gris (Q778601), Sangiovese (Q509162), Tempranillo (Q519874), Pinot meunier (Q947208), Pinot noir (Q223701) and many others from List of grape varieties (Q1357585), etc.
Allowed values Q, Text
Source List of grape varieties, Varietal
Proposed by SarahStierch (talk)

Varietals aren't "genres", and that is the closest thing I can find to what would fit for listing types of grapes, fruits, or herbs used to make wines, liquors, beers, ciders, etc. This is my first time requesting a property and I'm unsure whether I did it right, but perhaps others who are interested can help improve this request. Thank you for your consideration. SarahStierch (talk) 17:18, 6 July 2014 (UTC)

Comment: Sarah, could you give an example of how you see this property being used in a Wikidata statement? For something like Zinfandel (Q204433), we might be able to build a set of varietals with existing properties, e.g. "Zinfandel use (P366) varietal". Other useful statements might be "Zinfandel instance of (P31) cultivar", "Zinfandel subclass of (P279) Vitis vinifera". That instance of and subclass of usage would be consistent with how Wikidata classifies cats and dogs, e.g. Chihuahua (Q653).
Anyhow, food and drink is an interesting area for structured data. Let me know if the above makes sense or if you have other ideas about how to model things. Cheers, Emw (talk) 19:59, 6 July 2014 (UTC)
Hi User:Emw! Hmmm.... I'm not really sure actually...I'm pretty open minded and didn't give that aspect much thought outside of wanting to be able to use "varietal" as a statement. I'm not too sure...I can see the cultivar and vitis vinifera probably working best as statements versus the first varietal - but, since Zinfandel is a varietal, and it's a type of vitis vinifera.....hmmmm.... I'm leaning towards your experience in this to guide me! SarahStierch (talk) 20:06, 6 July 2014 (UTC)
Sarah, things like "Zinfandel" are tricky because they're polysemous. As you say, Zinfandel is a type of wine and a type of grape. However, many statements for Zinfandel wine are false for Zinfandel grape, and vice versa. For example, Zinfandel wine is not susceptible to bunch rot and Zinfandel grapes do not have an alcohol by volume range of 12-17%. The wine derives from the grape. I think these statements illustrate the need for two separate items for the two different senses of Zinfandel.
This kind of polysemy exists elsewhere, e.g. with the concept "influenza". Influenza is formally a type of disease, but it is often also used to refer to a type of virus. Separating the two concepts into two Wikidata items -- influenza (Q2840) and influenza virus (Q287246) -- allows us to be much more precise and expressive about each subject.
Perhaps we could do the same here with Zinfandel wine and the Zinfandel grape it derives from. That is, we could reserve Zinfandel (Q204433) for the grape and create a new item Zinfandel wine (Qx) for the wine. What do you think? Emw (talk) 00:29, 7 July 2014 (UTC)
User:Emw you are a genius! (But you probably knew that already). Making new items for wine is a GREAT idea. How do we make that happen? SarahStierch (talk) 01:59, 8 July 2014 (UTC)
@SarahStierch: I have started Wikidata:Property_proposal/Natural_science#fruit, I think the first step would be to start creating items for the fruits and linking them with their plants.--Micru (talk) 12:33, 17 August 2014 (UTC)
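
The two-item approach suggested above can be sketched as plain subject-property-value triples. This is only an illustration and not how Wikidata stores statements: `Qx` is the placeholder used in the discussion for a wine item that does not exist yet, and the labels `made from`, `susceptible to` and `alcohol by volume` are hypothetical, since the thread does not say which properties would carry those relations.

```python
# Statements as (subject, property, value) triples.
# "Qx" is the placeholder from the discussion, not a real item ID.
GRAPE = "Zinfandel (Q204433)"
WINE = "Zinfandel wine (Qx)"

statements = [
    (GRAPE, "instance of (P31)", "cultivar"),
    (GRAPE, "subclass of (P279)", "Vitis vinifera"),
    (GRAPE, "susceptible to", "bunch rot"),       # hypothetical property label
    (WINE, "made from", GRAPE),                   # hypothetical property label
    (WINE, "alcohol by volume", "12-17%"),        # hypothetical property label
]


def statements_about(subject, triples):
    """Return the (property, value) pairs whose subject matches exactly."""
    return [(prop, value) for subj, prop, value in triples if subj == subject]


grape_props = [prop for prop, _ in statements_about(GRAPE, statements)]
wine_props = [prop for prop, _ in statements_about(WINE, statements)]
print(grape_props)
print(wine_props)
```

With separate items, a statement that is true only of the wine never ends up attached to the grape, and the wine item can still point back to the grape it derives from.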

ISO 639 scope

   In progress
Description The language scope as defined by ISO 639.
Data type Item
Domain language (Q34770)
Allowed values individual language, dialect (Q33384), ISO 639 macrolanguage (Q152559), language collection.
Source ISO 639-3 (Q845956)
Example item and value English (Q1860) => individual language
Robot and gadget jobs might be imported by bot
Proposed by Pathoschild

This property would contain the language scope defined in the ISO 639-3 standard; see Scope of denotation for language identifiers for a full description. This property is of general interest — the standard is widely recognized (including by the Wikimedia Foundation for its language codes), and the scope is used by the Wikimedia language committee to determine the eligibility for new wikis. —Pathoschild 20:56, 16 August 2014 (UTC)

citation needed. Visite fortuitement prolongée (talk) 21:05, 1 January 2015 (UTC)
duplicate of ISO 639-1 code (P218), ISO 639-2 code (P219) and ISO 639-3 code (P220)? --Pasleim (talk) 18:55, 21 January 2015 (UTC)
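
As a rough sketch of what the "Allowed values" line above implies, the scope could be treated as a closed set of four values. The class and function names below are hypothetical, and the value strings are simply the ones listed in this proposal, not identifiers taken from the standard itself.

```python
from enum import Enum


class Iso639Scope(Enum):
    # Values copied from the "Allowed values" line of the proposal.
    INDIVIDUAL_LANGUAGE = "individual language"
    DIALECT = "dialect"
    MACROLANGUAGE = "macrolanguage"
    LANGUAGE_COLLECTION = "language collection"


def parse_scope(value: str) -> Iso639Scope:
    """Normalise a free-text scope and reject anything outside the allowed set."""
    return Iso639Scope(value.strip().lower())


# Example from the proposal: English (Q1860) => individual language
english_scope = parse_scope("Individual language")
print(english_scope)
```

A value outside the list, such as `parse_scope("language family")`, raises `ValueError`, which is the kind of check a bot import could apply before writing statements.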

Endangered languages

Please redirect this if it's being posted in the wrong place. I went to Wikimedia (Wikimedia topic Endangered Languages) with the idea and as far as I can tell this is where I am supposed to bring it.

I'd like to propose that all Wikipedia articles on languages include their conservation status, in a format nearly identical to that used for animals. While articles on animals get their citations from the IUCN Red List, the conservation status of languages would be cited from the UNESCO Red Book on world languages. Since this is a rather broad idea affecting a large number of articles, I wanted to bring it up somewhere I thought it would be heard rather than on an individual article. User:PiRSquared17 suggested it could also include conservation data from Ethnologue. At any rate, let me know what your thoughts are, everyone. Interlaker (talk) 22:18, 7 February 2015 (UTC)

  • I personally Support the creation of the property, which could be called "endangered language status" or something like that. Is there an official range of values? Presumably we can just adopt UNESCO's. However, note that this is similar to the case of the property cultural heritage site/monument; I can't remember whether that one was created or whether the information was instead added within instance of (P31). --Nemo 16:32, 8 February 2015 (UTC)

UNESCO language status

   In progress
Description UNESCO language status
Data type String
Domain languoid (Q17376908)
Allowed values listed here
Example item and value Japanese (Q5287) => 1 safe, Ainu (Q27969) => 5 critically endangered
Proposed by Visite fortuitement prolongée (talk) 22:04, 9 February 2015 (UTC)

Motivation: Suggested by Interlaker and Nemo bis.

Personally, of the two options, I Support the use of the UNESCO values due to the simpler structure and the fact that it does not value oral-only vibrant languages any lower than literary vibrant languages. Interlaker (talk) 00:13, 12 February 2015 (UTC)

EGIDS language status

   In progress
Description EGIDS language status
Data type String
Domain languoid (Q17376908)
Allowed values listed here
Example item and value Japanese (Q5287) => 1 national, Ainu (Q27969) => 8b nearly extinct
Proposed by Visite fortuitement prolongée (talk) 22:04, 9 February 2015 (UTC)

Motivation: Suggested by Interlaker and Nemo bis.

So how do we proceed here? Interlaker (talk) 22:11, 9 February 2015 (UTC)

Interlaker: We wait for users to support or oppose, and when an admin thinks there is a quorum they either reject the proposal or create the property. This can take a month or even longer. If the property is approved, you can start using it to add the status to the Wikidata item for each language. Once the Wikidata item has the status, it can be added to infoboxes in any of the 280 language Wikipedias that have an article about this language. Start by adding your support below. Filceolaire (talk) 23:49, 1 March 2015 (UTC)


In the age of Open Data and Linked Data, we should be able to describe structured data sources, such as data set (Q1172284), thesaurus (Q179797), database (Q8513), authority control (Q36524) (eg Integrated Authority File (Q36578), Virtual International Authority File (Q54919), etc).

The leading ontologies for describing datasets are VOID, DCAT and ADMS by the W3C. The Getty LOD documentation shows summary diagrams & links, and uses all of them to describe the Getty dataset. Our intent is not to replicate all this information, but to provide only some critical entities and properties to allow finding datasets and access points. This is a somewhat complex topic, so before proposing properties, we should look at the above models, look at some examples, and synthesize a simpler version.


Let's first look at some examples



Data Model

Based on the examples, we can extract the following items (Q), properties (P), qualifiers (q):

  • <a structured database>
  • Dataset Distribution (Q): particular version/release/format of a structured Data Set (database) that is publicly available. May be created and published by someone other than the database owner (that's quite common)
    • official website (P856): human-readable documentation
    • URL (P) of technical documentation (Q)
    • URL (P) of Datahub (Q): structured description at Datahub, including access URLs, examples, etc
    • URL (P) of VOID machine-readable description (Q)
      • format (q): eg VOID, OAI-PMH, explain.z3950
    • URL (P) for access/download: URL that allows search/access/download of a dataset distribution
      • protocol (q): eg SPARQL, SRU, OAI-PMH. Direct file access by HTTP is default and need not be mentioned
      • file format (q): file format and/or metadata schema, eg RDF/XML, Turtle, NTriples, JSONLD, MARC21plus-xml, MARC21-xml, xMetaDiss, oai_dc, ONIX-xml, sync-repo-xml
      • file compression format (q): eg zip, gz
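
To make the outline above more concrete, here is a minimal sketch of it as Python dataclasses. It is an assumption-laden illustration and not a Wikidata schema: the class names, field names and the `example.org` URLs are hypothetical, and only the structure (a distribution with several URLs, each access URL carrying protocol, file format and compression qualifiers) comes from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AccessURL:
    """A URL for access/download, with the qualifiers from the outline."""
    url: str
    protocol: Optional[str] = None                          # e.g. "SPARQL", "SRU", "OAI-PMH"
    file_formats: List[str] = field(default_factory=list)   # e.g. ["NTriples"]
    compression: Optional[str] = None                       # e.g. "zip", "gz"

    def effective_protocol(self) -> str:
        # Direct file access by HTTP is the default and need not be stated.
        return self.protocol or "HTTP"


@dataclass
class DatasetDistribution:
    label: str
    official_website: Optional[str] = None               # human-readable documentation
    technical_documentation_url: Optional[str] = None
    datahub_url: Optional[str] = None
    description_url: Optional[str] = None                # machine-readable description
    description_format: Optional[str] = None             # e.g. "VOID"
    access_urls: List[AccessURL] = field(default_factory=list)

    def endpoints(self, protocol: str) -> List[str]:
        """Return the access URLs reachable through the given protocol."""
        return [a.url for a in self.access_urls if a.effective_protocol() == protocol]


# Hypothetical distribution; all URLs are placeholders.
example = DatasetDistribution(
    label="Example LOD dataset",
    official_website="https://example.org/dataset",
    access_urls=[
        AccessURL("https://example.org/sparql", protocol="SPARQL"),
        AccessURL("https://example.org/dump.nt.zip",
                  file_formats=["NTriples"], compression="zip"),
    ],
)
print(example.endpoints("SPARQL"))
print(example.endpoints("HTTP"))
```

Keeping the qualifiers on each access URL, rather than on the distribution as a whole, reflects the point that one distribution may expose several endpoints with different protocols and formats.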


   In progress
Description Particular version/release of a structured Dataset that is publicly available
Data type Item
Template parameter none
Allowed values data set (Q1172284)
Source dcat:Distribution, adms:AssetDistribution
Example item and value
Format and edit filter validation n/a
Robot and gadget jobs In the future, yes.
Proposed by Vladimir Alexiev (talk)

distribution (dataset)

   In progress
Description Particular manner of distribution of a Data Set (database or file) that is publicly available
Data type Item
Template parameter none
Domain data set (Q1172284)
Allowed values Dataset Distribution (Q18814183)
Source dcat:Distribution, adms:AssetDistribution
Example item and value
Format and edit filter validation n/a
Proposed by Vladimir Alexiev (talk)

file format

(aliases: format, file type, compression format)

   In progress
Description File format, compression type, or ontology used in a file. May use several (eg Zip (Q136218) and NTriples (Q18814471))
Represents file format (Q235557)
Data type Item
Template parameter none
Domain Use as qualifier of a URL or file name (often used with data set (Q1172284) or Dataset Distribution (Q18814183))
Allowed values

Instances of file format (Q235557), such as

Target will often have Internet media type (P1163) and file extension (P1195)
Example item and value AAT LOD dataset => URL of (P642) download (Q7126717), file format Zip (Q136218), file format NTriples (Q18814471) =>
Format and edit filter validation n/a
Proposed by Vladimir Alexiev (talk)


protocol

(aliases: communication protocol)

   In progress
Description Computer communication protocol
Represents communications protocol (Q132364)
Data type Item
Template parameter none
Domain Use as qualifier of a URL (often used with data set (Q1172284) or Dataset Distribution (Q18814183))
Allowed values Instances of communications protocol (Q132364), such as SPARQL (Q54871), Open Archives Initiative Protocol for Metadata Harvesting (Q2430433)
Example item and value AAT LOD dataset => URL of (P642) API (Q165194), protocol SPARQL (Q54871) =>
Format and edit filter validation n/a
Proposed by Vladimir Alexiev (talk)


(aliases: webpage, page)

   In progress
Description URL of something, other than official website (P856), reference URL (P854), or archive URL (P1065). Must qualify with of (P642)
Data type URL
Template parameter none
Domain any
Allowed values URL
Example item and value
Format and edit filter validation n/a
Robot and gadget jobs validate that URL resolves
Proposed by Vladimir Alexiev (talk)

Discussion of Dataset properties

PLEASE first comment on the need to describe Datasets, and then on the specific implementation proposed above (of course, we could have a different implementation)

What do you think of this? --Vladimir Alexiev (talk) 14:53, 7 January 2015 (UTC)
@Emw, Snipre, Kolja21, Fralambert: What do you think of this? --Vladimir Alexiev (talk) 19:35, 19 January 2015 (UTC)

Ruud Koot
Notified participants of Wikiproject Informatics

I welcome thoughts on how to describe datasets, and the above is a good basis for having a discussion around that – thanks, Vladimir. The proposed implementation would seem to work for the examples given, but I am missing thoughts on licensing and versioning of datasets, as well as on the scope of datasets to be annotated this way. For instance, if there is an item about a scholarly publication and that publication has some associated data in a database, it would make sense to annotate the item about the paper with information about the dataset. This would not necessarily require an item about the dataset itself, though that might be an option if WD:N does not stand in the way. --Daniel Mietchen (talk) 00:48, 20 January 2015 (UTC)
Sure, that's just a start. But I think we don't want to repeat all the detailed info at Datahub and VOID files. Vladimir Alexiev (talk) 01:16, 20 January 2015 (UTC)
Suggesting if there is a CKAN/DataHub entry, that should be linked from the WP page? Other than that, I think VoID actually does do a good job at the provenance. Other than that, what is the envisioned difference between a data set and a data base? The latter has a clear visibility, with a website, etc. What makes a data unique? Egon Willighagen (talk) 07:49, 20 January 2015 (UTC)
Egon, Datahub is just one of the important URLs: above I give other examples. VOID does a good job, but is only applicable to RDF datasets, and less than 30% have a VOID file.
"Dataset" and "Database" are just about the same, and there are many other similar items (eg Authority list). I've only proposed a property "dataset distribution" to point to a particular distribution of a database, since often there are many: see GND examples above.
"What make a data unique": I don't know and I don't care. Clarify the question. --Vladimir Alexiev (talk) 18:56, 21 January 2015 (UTC)
Hi, I have started working on this subject. Can I reuse, complete, or duplicate the "Property documentation" box? Should I write these boxes in the comments, or on the talk page? Maybe create a Wiki4R project and a subpage for this part? --Karima Rafes (talk) 13:57, 20 January 2015 (UTC)
Karima, go ahead and edit above, it's a wiki. I added "protocol". If you want to move the section "Datasets" to a more permanent location, go ahead, but only after the voting for the properties (when it will be moved to a subsection "Archive"). --Vladimir Alexiev (talk) 18:56, 21 January 2015 (UTC)
Hello, I split the dataset distribution into dataset and distribution, as in the DCAT ontology. I proposed some properties with the aim of making a map like on the website and for future web agents. I don't know exactly whether the properties already exist in Wikidata. --Karima Rafes (talk) 14:26, 22 January 2015 (UTC)
@Karima Rafes: 1. I don't see a need for "dataset" vs "distribution" properties, and you don't seem to propose different examples. We don't need to copy dcat or any other ontology. 2. The markup following "Here proposition for the infobox" is broken, please fix it!! 3. Put your contributions in separate sections (one per property) and sign them (not just this comment). 4. If you want to propose a property (eg xxx:statusAccessURL), do it using the appropriate template at the top of this page, so it can be discussed and critiqued independently. (Critique about this one: one dataset may have several access URLs, so a single "status" prop won't do.) 5. "I don't know exactly if the properties exist or not" is no excuse: don't propose properties before checking whether similar properties already exist. Pick any item, go to Claims, click Add and use the autocomplete. --Vladimir Alexiev (talk) 08:41, 25 January 2015 (UTC)
1. "We don't need to copy dcat or any other ontology." DCAT is an ontology for software agents. If you don't split, you impose that there is only one way to access the data, and so only for humans. It's not logical to use DCAT only for humans. 2. Sorry, I don't have the time at the moment. I will try to fix it if Wikidata wants to be a hub for software agents. Is this the moment for that debate (and the right place)? 3. 4. 5. It's not urgent. For the moment I have moved my examples to my personal page: examples of properties for a dataset object and distribution --Karima Rafes (talk) 09:30, 25 January 2015 (UTC)
I think wikidata should aspire to be a hub for software agents.
I think that there is a case for specialised url properties rather than using "of" as a qualifier, particularly for 'machine readable' access and particular types of info - such as 'technical info URL', 'VOID url', 'API url'.
'Datahub' should perhaps be a string property since it always refers to the datahub site - like all the other database properties.
A 'download url' property should be designed to be useable to download digital copies of books or songs as well as databases. This applies in particular when the content is available as a free download but the licence terms are not compatible with Commons.
I support 'file format', 'compression file format', 'protocol' as qualifier properties.
I'm not sure what 'dataset' and 'distribution (dataset)' are for and what they link to. Should we have a separate item for each edition of the dataset, as well as the item for the database? What if a new edition is published every day? Could we use version (P348) or version type (P548) or edition(s) (P747) for these? Filceolaire (talk) 03:49, 22 March 2015 (UTC)