Topic on User talk:Tpt

RolandUnger (talkcontribs)

Your bot TptBot formats phone numbers incorrectly when they include an extension. For instance, on Maritime Museum of San Diego (Q3330638) it changes the correct number "+1-619-234-9153 ext. 101" to the incorrect "+1-619-234-9153;ext=101". The new format does not match the regular expression of phone number (P1329).
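To illustrate the mismatch, here is a minimal Python sketch. The pattern is a simplified stand-in and not the actual format constraint on P1329; it only assumes a "+"-prefixed, hyphen-separated number with an optional " ext. " suffix. The names `SIMPLIFIED_PATTERN` and `matches_constraint` are hypothetical.

```python
import re

# Hypothetical, simplified stand-in for the P1329 format constraint.
# The real regular expression is longer and is not reproduced here.
SIMPLIFIED_PATTERN = re.compile(r"\+\d+(?:-\d+)*(?: ext\. \d+)?")


def matches_constraint(value: str) -> bool:
    """Return True if the whole value matches the simplified pattern."""
    return SIMPLIFIED_PATTERN.fullmatch(value) is not None


print(matches_constraint("+1-619-234-9153 ext. 101"))  # True
print(matches_constraint("+1-619-234-9153;ext=101"))   # False
```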

Tpt (talkcontribs)

Hi! Thank you for pointing out this problem. Do you know why the regular expression was crafted this way? The English description of the property says "telephone number in standard format (RFC3966), without 'tel:' prefix", and RFC 3966 requires the ";ext=" syntax. I have opened a discussion on the property talk page about it.

RolandUnger (talkcontribs)

Unfortunately I was not involved in establishing this property. When it was created, the property was of string type because a phone-number data model did not exist yet. At that time nobody mentioned RFC 3966, surely because it is not a human-readable format. In December 2016 a first regular expression was added, but again without any mention of the RFC.

On Jan 29, 2018 user:Tpt added a reference to RFC 3966 but did not change the regular expression. In April of this year user:Verdy p changed the regular expression in a way that is not RFC-compliant but is human-readable, as used on many wiki pages.

Verdy p (talkcontribs)

I did not change the format for extensions; it was ALREADY specified without ANY semicolon or equals sign (which are what the RFC uses). What I did was only to validate the main part of the phone number, and I left the support for extensions unchanged. In fact the RFC allows a LOT of extensions for phone numbers, each separated by a semicolon and specified as a property (i.e. a key=value pair with a unique key, but with keys in arbitrary order). We can't use that in Wikidata or we won't be able to validate the format. So, given this RFC, if we want to support an arbitrary list of properties and check the uniqueness of these unordered properties, we had better have a separate property in Wikidata, outside the phone number itself.
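To make the validation problem concrete, here is a rough Python sketch of what checking such semicolon-separated key=value properties would involve. It is not a full RFC 3966 parser; the function name `split_parameters` and the sample parameter values are hypothetical, and lower-casing the keys is an assumption made for the uniqueness check.

```python
def split_parameters(value: str) -> tuple[str, dict[str, str]]:
    """Split 'number;key=value;...' into the number and a parameter dict.

    Raises ValueError if a key is empty or appears more than once.
    """
    number, *raw_params = value.split(";")
    params: dict[str, str] = {}
    for raw in raw_params:
        # A parameter without "=" gets an empty string as its value.
        key, _, val = raw.partition("=")
        key = key.lower()  # assumption: keys are compared case-insensitively
        if not key:
            raise ValueError(f"empty parameter name in {value!r}")
        if key in params:
            raise ValueError(f"duplicate parameter {key!r} in {value!r}")
        params[key] = val
    return number, params


number, params = split_parameters("+1-619-234-9153;ext=101;phone-context=example.com")
print(number)  # +1-619-234-9153
print(params)  # {'ext': '101', 'phone-context': 'example.com'}
```

A uniqueness check like this needs real code; it is awkward to express in a single regular expression, since the keys may come in any order.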

The RFC 3966 format is an encapsulation format, which is not what the ITU standard prescribes. And there are other competing encapsulation formats, including in other RFCs! (This RFC is still not part of any BCP, so it can be deprecated at any time.) The number of properties is growing with the number of transport protocols and virtual service providers on the net, and even phone numbers are insufficient for them, as they can use other letters, can require authentication (for billing, privacy, or legal reasons), can be used to group calls in a meeting with more participants, or can require an additional protocol format (e.g. for audio and video codecs).

This RFC 3966 is not suitable here. We just want a single basic format. And until now, the basic extension mechanism has never used the semicolon or equals sign, so the ;ext=(\d*) syntax is clearly invalid; it should just be / ext\. (\d*)/ as it was (with optional space or dot separators).

Adding RFC 3966 support would be wrong in Wikidata and in fact impossible to check correctly. We don't want such encapsulation, just as we don't want the MIME format in a single property value. The only allowed encapsulation format is for URIs (but then we don't check the URIs, except that we accept only a few URI schemes and enforce a policy on the host.domain part with blacklists against spam).

Note: I have added support for ";ext="; it was a minor change (until now the ";" and "=" were invalid; this was never present or requested before, so my improvement to the regexp did not take it into account). This does not change the capture group numbers at all, so phone number parsing is unchanged, and it now allows a bot to reformat validated phone numbers, if needed, to use ";ext=" instead of " ext. ", "/ext ", or just "x".
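As a rough Python sketch of the idea (not the actual P1329 regexp): the extension separators sit in a non-capturing group, so adding ";ext=" as another alternative leaves the capture group numbers untouched. The pattern `PHONE` and the function `normalize_extension` are hypothetical and much simpler than the real constraint.

```python
import re

# Hypothetical, simplified pattern. Group 1 is the main number and
# group 2 the extension digits. The separators are alternatives inside
# a non-capturing group, so adding one does not renumber the groups.
PHONE = re.compile(
    r"(\+\d+(?:[-. ]\d+)*)"                   # group 1: main number
    r"(?:(?:;ext=| ext\. |/ext |x)(\d+))?"    # group 2: extension digits
)


def normalize_extension(value: str, separator: str = ";ext=") -> str:
    """Rewrite any accepted extension form using the given separator."""
    match = PHONE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid phone number: {value!r}")
    number, ext = match.groups()
    return number if ext is None else f"{number}{separator}{ext}"


print(normalize_extension("+1-619-234-9153 ext. 101"))  # +1-619-234-9153;ext=101
print(normalize_extension("+1-619-234-9153x101"))       # +1-619-234-9153;ext=101
```

Passing `separator=" ext. "` would convert in the other direction, so a bot could normalize to whichever form is agreed on.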

Also, I did not change the following rule: only ONE phone number may be specified; you need to use separate properties to specify several phone numbers (possibly qualified for distinct usage). The change was also applied to fax numbers.

I also updated the reference URL, which points to an online regexp test that has also been updated with this new version.
