Wikidata:Property proposal/HTML entity


HTML entity


   Ready Create
Represents SGML entity (Q285300)
Data type String
Domain character (Q3241972)
Allowed values &[A-Za-z0-9]+;
Example
Source https://dev.w3.org/html5/html-author/charref
Planned use The data should be completed.
See also Unicode character (P487)
Motivation 
It would be nice to have character ↔ HTML entity mappings in Wikidata.
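
As a rough illustration of the proposed constraint and mapping, the sketch below checks a candidate value against the "Allowed values" pattern from this proposal and looks up the character it stands for. It assumes Python's standard `html.entities.html5` table as the entity list, which is not the source named above; the function names `is_allowed_value` and `entity_to_character` are hypothetical.

```python
import re
from html.entities import html5  # keys look like "amp;", values like "&"

# Pattern copied from the proposal's "Allowed values" field.
ALLOWED = re.compile(r"&[A-Za-z0-9]+;")


def is_allowed_value(value: str) -> bool:
    """Return True if the whole string matches the proposed pattern."""
    return ALLOWED.fullmatch(value) is not None


def entity_to_character(value: str):
    """Return the character(s) for a named entity, or None if the name is unknown.

    Assumption: the Python standard-library table stands in for the
    authoritative list.
    """
    if not is_allowed_value(value):
        raise ValueError(f"not a well-formed named entity: {value!r}")
    # Drop the leading "&"; the table keys keep the trailing ";".
    return html5.get(value[1:])


if __name__ == "__main__":
    for candidate in ("&amp;", "&Auml;", "&nosuchentity;"):
        print(candidate, "->", entity_to_character(candidate))
```

Note that the pattern deliberately rejects numeric references such as `&#124;`, because `#` is not in the allowed character class.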
Discussion
Open questions:
  1. Should we include & and ; at the beginning and the end, respectively?
  2. Should this be a qualifier of Unicode character (P487)?
Matěj Suchánek (talk) 09:00, 13 November 2017 (UTC)
  • Support I would include & and ; as I'm more used to seeing HTML entities written with these characters. --Pasleim (talk) 20:28, 13 November 2017 (UTC)
  • Support we should link to an authoritative source website for this. ArthurPSmith (talk) 21:12, 13 November 2017 (UTC)
  • Support Giovanni Alfredo Garciliano Díaz diskutujo 23:28, 13 November 2017 (UTC)
  • Support David (talk) 08:17, 14 November 2017 (UTC)
  • Support; though I think this should be an external identifier; both as the string does "identify" the entity, and so that we can use formatter URLs. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:46, 14 November 2017 (UTC)
  • Support as external identifier, though in that case we should find a formatter URL for a site describing these entities. Mahir256 (talk) 17:18, 14 November 2017 (UTC)
    • It's not a unique identifier since multiple strings can represent the same thing (for example, &verbar;, &vert; and &VerticalLine; all represent a vertical bar). String datatype is correct here. ArthurPSmith (talk) 18:46, 14 November 2017 (UTC)
      • External identifier was also an idea by me... Multiple characters cannot be mapped to a single HTML entity. Wikidata's representation of "symbols" is quite immature, though. With Ä (Q9987), both lower and upper case symbol can be represented, each having a different Unicode character (P487) and also HTML entity. I think we will need to consider creating entities for each letter variant, separate from the common understanding of a "letter", which might later turn out to be useful with Wiktionary integration. Matěj Suchánek (talk) 08:47, 15 November 2017 (UTC)
      • External IDs must uniquely identify a subject; but need not be unique in doing so; we have several properties for which there can be more than one ID for a subject, ranging from VIAF to listed buildings in England. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:33, 17 November 2017 (UTC)
        • External IDs are really not very useful if there are more than rare exceptions to the uniqueness relationship. It makes linking and lookups much harder. I think HTML entities have too many exceptions to qualify in this case. Plus, we have no formatter URL, so there is no advantage there. ArthurPSmith (talk) 19:57, 20 November 2017 (UTC)
  • Support Just checked and fortunately, while MediaWiki automatically formats &nbsp; into the respective character, Wikibase doesn't. ChristianKl 20:07, 15 November 2017 (UTC)
  • Support - Surprised this doesn't exist already. -- Fuzheado (talk) 21:02, 16 November 2017 (UTC)
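
The uniqueness question raised in the thread (several entity names for one character, but never several characters for one entity name) can be checked with a short sketch. It assumes Python's standard `html.entities.html5` table as a stand-in for the authoritative list; `entities_by_character` is a hypothetical helper name.

```python
from collections import defaultdict
from html.entities import html5  # keys look like "vert;", values like "|"


def entities_by_character():
    """Group named entities by the character(s) they stand for."""
    groups = defaultdict(set)
    for name, chars in html5.items():
        # Skip legacy names without a trailing ";" because the proposed
        # pattern requires both "&" and ";".
        if name.endswith(";"):
            groups[chars].add("&" + name)
    return groups


if __name__ == "__main__":
    groups = entities_by_character()
    # Every entity name that stands for the vertical bar.
    print(sorted(groups["|"]))
    # How many characters have more than one entity name.
    print(sum(1 for names in groups.values() if len(names) > 1))
```

Because the table is a plain dictionary, each entity name maps to exactly one value, while a single character may collect several names. Under this assumption the data behaves like a many-to-one mapping, which is the case the String datatype argument relies on.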