User talk:Eihel/2019
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
Help with regex
Hi! Could you help me with format as a regular expression (P1793) on some properties?
- Treccani ID (P3365): the character º should be allowed
- Treccani's Biographical Dictionary of Italian People ID (P1986)/Treccani's Enciclopedia Italiana ID (P4223)/Treccani's Dizionario di Storia ID (P6404): the characters () should not be allowed
- ANICA ID (P6151): the ID should always consist of 8 characters: two capital letters, then 0 to 3 spaces and 4 to 1 digits (so the possible combinations are 0 spaces and 4 digits, 1 space and 3 digits, 2 spaces and 2 digits, 3 spaces and 1 digit)
Thank you very much, --Epìdosis 14:35, 3 July 2019 (UTC)
- Yep Epìdosis, I will be with you once I have made the necessary changes on the skwiki and frwiki pages. —Eihel (talk) 14:40, 3 July 2019 (UTC)
- Just between the two of us, Epìdosis: are you on IRC right now, so that we can discuss in real time? Which channel?
- In P3365, ° (so, degree, right?): where should it go? In the first match list ([a-zA-Z]), the second match list ([a-zA-Z0-9()'_-]), or somewhere else entirely? Give me an example and I will change what you want (in Wikidata property example (P1855)).
- In P1986, ...: same question for both properties: where in the RegEx?
- For P6151: I will make you a proposal in a few moments. —Eihel (talk) 15:24, 3 July 2019 (UTC)
- For P6151: [A-Z]{2}(\d{4}|\s\d{3}|\s{2}\d{2}|\s{3}\d)
- Does that sound right, Epìdosis? —Eihel (talk) 15:30, 3 July 2019 (UTC)
- Hi! I don't use IRC, unfortunately. What exactly is the difference between the first and second match lists? An example for P3365 is James Stanhope, 1st Earl Stanhope (Q332603), so I would say the second match list.
- For P6151: it's OK! Thanks, --Epìdosis 15:32, 3 July 2019 (UTC)
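The P6151 pattern agreed on above can be checked with a short Python sketch. It assumes the format constraint has to match the whole value (hence `fullmatch`), and the sample values are made up for illustration; they are not real ANICA identifiers.

```python
import re

# Pattern quoted from the discussion for ANICA ID (P6151).
ANICA_ID = re.compile(r"[A-Z]{2}(\d{4}|\s\d{3}|\s{2}\d{2}|\s{3}\d)")


def is_valid_anica_id(value: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return ANICA_ID.fullmatch(value) is not None


# Hypothetical sample values (not taken from the ANICA database).
samples = ["AB1234", "AB 123", "AB  12", "AB   1", "AB12345", "ab1234", "AB 1234"]
results = {value: is_valid_anica_id(value) for value in samples}

for value, ok in results.items():
    print(repr(value), ok)
```

Every alternative is 4 characters long, so an accepted value always has 6 characters in total. Note that `\s` also accepts tabs and other whitespace, not only the space character.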
@Epìdosis: Sorry, IRL property: eat with family. BBQ ID. I'm full... burp!
- For P6151: I see a problem: you told me that there are 8 alphanumeric characters, but your combinations give only 6 characters in total. As a replacement I propose: [A-Z]([A-Z]|\d|\s){5,7}. This RegEx covers all the combinations you gave me.
- For P3365: I see: you entered the identifier in the example you gave me, but the RegEx did not cover all cases. You gave me the "degree" character (°), whereas the URL contains the "ordinal indicator" character (º). Aside: we may sometimes come across a diacritic (line below). [\w\(\)'\-°º]+ covers all the examples of the property and your request, with the "degree" character in addition.
- For P1986 and P4223: if it's a question, my answer is yes: parentheses are not allowed. I do not see what you are asking for. Would you like parentheses to be allowed or not? P4223 contains 2 different RegExes (statement and constraint)?! P1986 contains an exception to the RegEx via exception to constraint (P2303). Aside 2: which is completely stupid. Logically, a RegEx should not have an exception; otherwise the item no longer has any constraint and its identifier could be "chocolat_power". The exception must be built into the RegEx. So Giuseppe Berlendis (Q61913848) can have parentheses (normally). It's always easier with examples, please.
- For Treccani's Dizionario di Storia ID (P6404): there is no RegEx, so no constraint one way or the other.
Now I'm off for my little coffee. With pleasure. —Eihel (talk) 20:37, 3 July 2019 (UTC)
- P6151: sorry, my fault, there should be only 6 characters, so the previous regex is OK.
- P3365: OK, I added this constraint.
- P1986 and P4223: I set the constraint for both properties to [a-z]+(\-[a-z]+)*(_res\-[0-9a-f]{8}(\-[0-9a-f]{4}){3}\-[0-9a-f]{12})? and removed the exception (Giuseppe Berlendis (Q61913848) should not have had parentheses and also had to be merged); it was a question and an assertion at the same time: I meant "are parentheses not allowed? If so, OK; if they are allowed, they should not be".
- P6404: I set the constraint to [a-z0-9-]; if it does not allow parentheses, it's OK.
- I think all the problems have been solved. Excuse my unclear way of expressing myself (unfortunately I'm not an expert in regex); I hope to be clearer next time. Thank you very much for your patience, --Epìdosis 06:51, 4 July 2019 (UTC)
- No problem, with pleasure, Epìdosis. Q431243 previously had "giuseppe-berlendis_(Dizionario-Biografico)" as its ID. That is why there was an exception, and the URL was http://www.treccani.it/enciclopedia/giuseppe-berlendis_(Dizionario-Biografico)-(Dizionario-Biografico)/. That must have been some time ago, because both URLs work now, so this is no longer a real problem (Treccani has surely made this change). Which means that, back then, the parentheses really were needed.
- Now each property has its own RegEx. Based on the few items I checked on Special:WhatLinksHere/Property:P4223 and Special:WhatLinksHere/Property:P6404, I am correcting these properties. SYL. —Eihel (talk) 08:53, 4 July 2019 (UTC)
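As a rough check of the Treccani patterns quoted in this thread, here is a minimal Python sketch. It assumes whole-value matching, uses a made-up `_res-` suffix, and the `+` variant of the P6404 character class is my own assumption rather than something stated in the discussion.

```python
import re

# Pattern quoted in the thread for P1986 and P4223.
TRECCANI_ID = re.compile(
    r"[a-z]+(\-[a-z]+)*(_res\-[0-9a-f]{8}(\-[0-9a-f]{4}){3}\-[0-9a-f]{12})?"
)
# Character class quoted in the thread for P6404 (no quantifier).
P6404_AS_QUOTED = re.compile(r"[a-z0-9-]")
# Assumed variant with "+", which is NOT taken from the thread.
P6404_WITH_PLUS = re.compile(r"[a-z0-9-]+")


def matches(pattern: re.Pattern, value: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


plain = matches(TRECCANI_ID, "giuseppe-berlendis")
with_parens = matches(TRECCANI_ID, "giuseppe-berlendis_(Dizionario-Biografico)")
# Hypothetical suffix in the expected 8-4-4-4-12 hexadecimal shape.
with_res = matches(
    TRECCANI_ID, "giuseppe-berlendis_res-0123abcd-0123-4567-89ab-0123456789ab"
)

quoted_multi = matches(P6404_AS_QUOTED, "abc-1")   # class alone = one character
plus_multi = matches(P6404_WITH_PLUS, "abc-1")
plus_parens = matches(P6404_WITH_PLUS, "abc_(x)")

print(plain, with_parens, with_res, quoted_multi, plus_multi, plus_parens)
```

Under whole-value matching, the bare class `[a-z0-9-]` accepts a single character only, so a quantifier would be needed for it to accept longer identifiers.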
Opinion in a property proposal
Good evening Eihel. I really did not understand the substance of your opinion in Wikidata:Property proposal/Ex æquo with; could you please explain:
- why you bring up the doctoral degree, when nobody had mentioned it before,
- how my comments would indicate that I do not want "to know the previous or next one [in the series]"?
Many thanks in advance, regards,
Nomen ad hoc (talk) 23:03, 5 August 2019 (UTC).
Help again with regex
Hi! I need some help with the constraint on FIG gymnast licence number (P2696): it should allow for the possible presence of "&type=licence" after the numerical ID (e.g. in Victoria Voronina (Q3557709)). Thank you very much! --Epìdosis 06:06, 7 October 2019 (UTC)
- The problem seems more complex than I thought: you can have a look at Property talk:P2696#URL format, if you have any idea how to solve it. --Epìdosis 08:39, 7 October 2019 (UTC)
- Solved. --Epìdosis 09:22, 17 October 2019 (UTC)
- Hello @Epìdosis:,
Yes, resolved… or almost. Creating another property was the best solution. It was either that or using one of the external URL formatters with an identifier of the form X:license (where X is the site ID number and the license part is optional, for example). But the discussion starts (by @Horcrux:) with 2 examples for the athlete Marco TORRES (licensed and unlicensed). Note that the Wikidata item Marco Torrès (Q3108931) does not correspond to identifier 597: they are not the same person, and it is the FIG page that is wrong (date of birth). You must find at least one other valid example to replace this one (to avoid a warning: 2 examples minimum per property). I look forward to hearing from you. —Eihel (talk) 15:27, 17 October 2019 (UTC)
- Thank you, example substituted. --Epìdosis 15:36, 17 October 2019 (UTC)
- Hello @Epìdosis:,
- In fact, I only partially answered the question: how to add the identifier with or without a license? That is to say, how to merge the two properties into one. Alternative solutions are available to Wikidatians:
- The precise case: the solution given above, with two pieces of information: the athlete's number and the license. P1630 is populated with a third-party formatter URL and the identifier is sent to it. For example, it can be of the form "123456:license" or "123456:yes_no". This solution requires a format constraint with the mandatory RegEx and an explanation of the identifier syntax. There are not many of these across all properties.
- The case "the identifier does not look like an identifier": for this identifier, the whole "QUERY" part of the URL must be the identifier, so the encoding is included in the search and transformed in the browser. The RegEx: ^\?Id=[1-9]\d{0,5}(&type=license)?$. For Aly Raisman (Q238663), we would have "?Id=13822&type=license" (licensed) and for Sérgio Sasaki (Q10376234), we would have "?Id=25168" (not licensed).
- The rare case: a change of datatype to URL. This requires community consensus and a request to the development team.
- On the other hand, there is a template (from PHP) that is not suitable in the main namespace of WD: urlencode. Small example: [//tools.wmflabs.org/erwin85/xcontribs.php?user={{urlencode:{{subst:REVISIONUSER}}|WIKI}}], which gives [1]. All the best. —Eihel (talk) 21:11, 29 October 2019 (UTC)
- Since the two identifiers are not mutually exclusive, the properties shouldn't be joined. --Horcrux (talk) 14:36, 30 October 2019 (UTC)
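The "whole query string as identifier" idea can be tried out with a small Python sketch. The regex is the one quoted by Eihel; the helper name and the rejected sample values are assumptions added for illustration.

```python
import re

# RegEx quoted in the thread for the query-string form of the identifier.
FIG_QUERY_ID = re.compile(r"^\?Id=[1-9]\d{0,5}(&type=license)?$")


def is_valid_fig_query(value: str) -> bool:
    """Return True when the whole value matches the quoted RegEx."""
    return FIG_QUERY_ID.fullmatch(value) is not None


licensed = is_valid_fig_query("?Id=13822&type=license")
unlicensed = is_valid_fig_query("?Id=25168")

# Hypothetical values that the pattern rejects.
lowercase_key = is_valid_fig_query("?id=25168")        # "Id" is case-sensitive
leading_zero = is_valid_fig_query("?Id=0123")          # first digit must be 1-9
too_long = is_valid_fig_query("?Id=1234567")           # at most 6 digits
british = is_valid_fig_query("?Id=13822&type=licence")  # only "license" is listed

print(licensed, unlicensed, lowercase_key, leading_zero, too_long, british)
```

The pattern is strict about letter case and spelling, so a value written with `?id=` or `&type=licence` would be flagged as a constraint violation unless the RegEx were widened.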
Hello,
We can continue in French if you prefer. :)
- Special:Diff/1028638765: OK, it seemed harmless to me as a way to tidy up the discussion threads a little; but OK, sorry.
- Special:Diff/1028639457: in general, editing other people's comments is not considered very polite; and in particular the edit summary "google translation" is a tad insulting. I do not claim that my English is perfect, but I have not used Google Translate in ages (except for my shaky German ^_^)
Now, given the atmosphere, I do not think I will keep participating on Property talk:P7375. All the best! Jean-Fred (talk) 10:55, 9 October 2019 (UTC)
- There was nothing insulting about it: I use Google Translate for all the small wikis and I know it has the annoying tendency to translate "answer" as "répondre" and "reply" as "demander". It was for the sake of comprehension. I will still reply on Property talk:P7375, to keep the same thread history. As for the "Congratulations", I admit it was meant teasingly, at most, because I had already roughly gone through the db and had already listed some types on the proposal. But you claimed the opposite by writing that there were only 2 entity types, which I had already noted. If I write "the catalog entry types are numerous" and you find only 2 distinct types that I had already introduced, then you did not search far enough. It is only fair play for me to congratulate you on contradicting what I wrote, and then on adding other types. Moreover, you are starting a discussion on another topic: the proposal was approved without opposition, after staying open a week longer than planned. Once the property was created, 2 questions popped up in a jumble. I wanted to give Dominikmatus a hand, so if I am a bit rough, I sincerely apologize. Being a bit busy IRL, I ask you to be patient for the reply on the other page. Regards, and all the best to you too, @Jean-Frédéric:. —Eihel (talk) 13:08, 9 October 2019 (UTC)
Format constraint violation?
Why is there a constraint violation in Q4295#P7041? Perseus author ID (P7041) can have values like "2", like any number. Thank you very much, --Epìdosis 09:22, 17 October 2019 (UTC)
- Re-hello @Epìdosis:
Because the RegEx is [1-9]\d+. The + means "one or more" of what immediately precedes it. So at least 2 digits are expected: a digit between 1 and 9, then another digit. The + must be replaced with *, which means "zero or more" of what immediately precedes it. I am replacing the RegEx of P7041 with the correct one. This error is not the only one: see here. Looking forward. —Eihel (talk) 15:44, 17 October 2019 (UTC)
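The difference between the two quantifiers can be seen in a few lines of Python. This is only an illustrative sketch that assumes whole-value matching; the sample values are arbitrary.

```python
import re

ONE_OR_MORE = re.compile(r"[1-9]\d+")   # RegEx that caused the violation
ZERO_OR_MORE = re.compile(r"[1-9]\d*")  # replacement described above


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


# Arbitrary sample values, compared under both patterns.
samples = ["2", "42", "0", "07"]
table = {
    value: (accepts(ONE_OR_MORE, value), accepts(ZERO_OR_MORE, value))
    for value in samples
}

for value, (plus_ok, star_ok) in table.items():
    print(value, plus_ok, star_ok)
```

With `+`, the single-digit value "2" is rejected because a second digit is mandatory; with `*`, it is accepted, while values starting with 0 are still rejected by both patterns.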
New page for catalogues
Hi, I created a new page for collecting sites that could be added to Mix'n'match and I plan to expand it with the ones that already have scrapers by category. Feel free to use, expand. Best, Adam Harangozó (talk) 19:57, 19 October 2019 (UTC)
Help again (quater) with regex
Hi! When you have time, could you try to turn my words into a real regex in Wikidata:Property proposal/The Cardinals of the Holy Roman Church ID? Thank you very much! --Epìdosis 10:03, 26 November 2019 (UTC)
- Done —Eihel (talk) 14:06, 26 November 2019 (UTC)
Question
How do I make good translation edits without using Google Translate? LoganTheWatermelon (talk) 15:14, 1 December 2019 (UTC)
- Hello, to start,
- Only translate into your own language, @LoganTheWatermelon:, or into a language that you speak fluently. Moreover, do not trust Google Translate blindly. For example, see this contradictory translation: [2]. Your translation must make sense, but some things should not necessarily be translated. Your changes on Tin Toy (Q549465) do not reflect IMDb, and your changes on Canimals (Q3655358) do not match the Wikipedia links at the bottom of the item (sitelinks). These are titles of works (art). Take Quarter Pounder (Q1573107), for example: it makes no sense to use the pound (Q100995) outside Anglo-Saxon countries. Start by adding Babel to your user page, so that we can see a little more clearly. ZI Jony put links on your talk page to get you started. —Eihel (talk) 16:34, 1 December 2019 (UTC)
Question
How can you support me?? – The preceding unsigned comment was added by LoganTheWatermelon (talk • contribs) at 19:48, 1 December 2019 (UTC).
- To start once more: Hello,
- First of all, try to find the information by yourself: look at the help pages, read what others have done, etc. If you are unsure, do not make any changes, such as deleting a completely valid change or requesting deletion of a page when there is no need. To ask for help, you can post a message on Wikidata:Project chat and wait for a reply. As a last resort, you can send me a message (without flooding me with messages). Cordially. —Eihel (talk) 23:17, 1 December 2019 (UTC)
Well, thank you… for your thanks.
I saw that you had added a new scraper, #3046, on Mix'n'match. I am trying to make one for another proposal that is practically identical to Hoopla (except for the root of the URL, of course), but I cannot get it to work. I do not want to disturb Magnus, who is already busy enough. The help pages are quite sparse and the example does not help me. I tried a thousand different ways, without success, and created two completely useless entries in the M'n'm catalog. Would you be so kind as to send me screenshots, or tell me precisely what you put in the fields? It is an identifier for journals, with 8-digit numbers starting at 1. Do not do it for me, because I want to understand. Best regards. —Eihel (talk) 23:43, 1 December 2019 (UTC)
@Eihel: This is recreated from memory, but hopefully it's correct. :) Let me know if any part isn't clear.
I used http://com.hoopladigital.web.s3.amazonaws.com/sitemap2/prod/2019/12/01/sitemap-1-series.xml as my source file, which I reached from https://www.hoopladigital.com/sitemap.xml. There is only 1 file for series, but for now we'll pretend there are 10 (sitemap-2-series, sitemap-3-series, etc.). So for Levels use:
- Level 1 range
- Start 1
- End 10
- Step 1
Scraper: The pages to be scraped have URLs in the format ...sitemap-__-series.xml where the blank will be a number from 1 to 10. So for URL Pattern we'll use
- http://com.hoopladigital.web.s3.amazonaws.com/sitemap2/prod/2019/12/01/sitemap-$1-series.xml
Each item we want to scrape is a URL, something like https://www.hoopladigital.com/series/Edgar-Allen-Poe-Mystery/2169944278. We want the number at the end for the ID, and the text before that for the name. So for the RegEx entry we will use
- https://www.hoopladigital.com/series/(.+?)/(\d+)
The first regex group will match anything up to a slash. The second group will match any string of numbers.
For Resolve, the ID will come from the second group, without any modifications, so we put $2 here. The name will come from the first group, so we put $1 here. We also want to change the dashes to spaces, so we will add a regular expression replacement, with - for the matching pattern, and a space for the replacement.
- id: $2
- name: $1
- Regular expression replacement:
- Matching pattern: - (a hyphen)
- The replacement: (a space)
For the description there's nothing we can use from the source file, so we'll just put "series".
For URL, we use
- https://www.hoopladigital.com/series/wd/$2
In this case, the series title in the URL is actually for search engine optimization purposes and doesn't affect the link, so I use wd.
For type, I used series of creative works (Q7725310), so I entered Q7725310.
Trivialist (talk) 01:14, 2 December 2019 (UTC)
Tara Fares
I forgot to log in. The claim that Tara Fares is Christian is wrong. The source mentions that it took its information from this site; if you look at the video, they say that she was Christian and converted to Islam, and did not want to discuss it. Another site mentions that she and her family converted to Islam here, and The New York Times here. So we do not have much information about Tara Fares's religion, and it is better not to add Christianity. --Jobas (talk) 08:51, 11 December 2019 (UTC)
- Ok ?! —Eihel (talk) 08:59, 11 December 2019 (UTC)
On my TA rights and my reaction!
I am not sure what kind of reaction you were expecting from me, but yes, I don't have a nicer way of reacting (other than keeping quiet!) to personal attacks like the one on my request for TA. It's not an isolated incident for me and for many of us who started on small-language wikis like Marathi, where we have yet to learn democracy and still live under old dictatorial regimes. Users like me who talk about accountability get harassed routinely, and that attack was part of that routine harassment. I have done everything I could, from Trust and Safety to the stewards. Everyone had their own reasons. Now the question for me is how long I can survive this. Accusations such as hat collecting become baseless when one actually looks into my contributions. But it's a ritual for Tiven to re-utter the same old shit about hat collecting on my every request for permission. As for the work, I will continue to do my part, and I have my goals set for myself. QueerEcofeminist (talk) 03:46, 12 December 2019 (UTC)