Topic on User talk:Repf72

GAN (talkcontribs)

Some time ago, the list of winners displayed the word "missing" for a disqualified winner. Now, if I'm not mistaken, nothing is displayed, just as when the podium finisher is not known (for example, in the Tour of Turkey). This can be somewhat misleading.

On a related note: how do you fill in the stage results so quickly? With 10 riders per stage, it is probably not done manually.

Repf72 (talkcontribs)

If you mean a disqualified winner, for example "TDF 1999 - 2004" because of the Lance Armstrong case, it was working some days ago. There may be a bug; perhaps our friend Dipsacus fullonum could check it.

To fill in listofwinners and classifications quickly, I do the following:

  1. I get the list of cyclists using SPARQL (about 25K as of the last run).
  2. I copy the results from procyclingstats into an Excel sheet with formulas that convert times (4:00:50 -> 14450 s) and normalize cyclist names (GEOGHEGAN HART Tao -> tao geoghegan hart).
  3. With the cyclist names from Wikidata (via SPARQL) and the results from procyclingstats both in Excel, I use VLOOKUP to look up each name and find its item Qxxx.
  4. Some VLOOKUPs do not match, because the cyclist lacks P106=Q2309784 (when I find this, I normally add P106=Q2309784) or because some letters in the name differ. In those cases I use other procedures, or even fill in the data manually. And of course I create the cyclist if I am sure the item does not exist.
  5. When everything is ready, I use Quick Statements to upload all the data: https://tools.wmflabs.org/quickstatements/#/batches/
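
For readers without the Excel templates, steps 2 to 4 could be approximated in Python. This is only a sketch of the idea, not the actual spreadsheet formulas: the function names are hypothetical, and the name rule assumes procyclingstats writes the surname in capitals before the given name, so mixed-case surnames such as "McNULTY" would need extra handling.

```python
def time_to_seconds(text):
    """Convert 'H:MM:SS' (or 'MM:SS') to a number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def normalize_name(pcs_name):
    """Turn 'SURNAME Given' into lowercase 'given surname'.

    Assumption: surname words are fully upper-case, given names are not.
    """
    words = pcs_name.split()
    surname = [w for w in words if w.isupper()]
    given = [w for w in words if not w.isupper()]
    return " ".join(given + surname).lower()


def match_riders(results, label_to_qid):
    """Dictionary lookup standing in for VLOOKUP.

    results: list of (name as published, time string)
    label_to_qid: lowercase Wikidata label -> item id
    Returns (matched, unmatched); unmatched names need manual review.
    """
    matched, unmatched = [], []
    for name, time_text in results:
        qid = label_to_qid.get(normalize_name(name))
        if qid is None:
            unmatched.append(name)
        else:
            matched.append((qid, time_to_seconds(time_text)))
    return matched, unmatched
```

The unmatched list corresponds to the VLOOKUP misses in step 4: those riders still need a manual check before anything is uploaded.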

Example data for "BinckBank Tour 2018", Stage 1:

Q55839941 P2417 Q20743779 P2781 14460U11574 P1352 1
Q55839941 P2417 Q462839 P2911 0U11574 P1352 2
Q55839941 P2417 Q2933765 P2911 0U11574 P1352 3
Q55839941 P2417 Q23058129 P2911 0U11574 P1352 4
Q55839941 P2417 Q17580397 P2911 0U11574 P1352 5
Q55839941 P2417 Q15143907 P2911 0U11574 P1352 6
Q55839941 P2417 Q1859227 P2911 0U11574 P1352 7
Q55839941 P2417 Q2435157 P2911 0U11574 P1352 8
Q55839941 P2417 Q2500129 P2911 0U11574 P1352 9
Q55839941 P2417 Q19958425 P2911 0U11574 P1352 10
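
Lines like these could also be generated by a small script instead of spreadsheet formulas. The sketch below follows the pattern of the example (P2781 with the race time for the first rider, P2911 with the time gap for the others, P1352 for the rank, and the U11574 unit suffix). The function name is hypothetical, and it assumes the fields are separated by tabs when pasted into Quick Statements.

```python
SECONDS_UNIT = "U11574"  # unit suffix copied from the example lines


def stage_statements(stage_qid, finishers, winner_time_s):
    """Build one Quick Statements line per finisher.

    finishers: list of (rider_qid, gap_in_seconds) in finishing order.
    winner_time_s: race time of the first rider, in seconds.
    """
    lines = []
    for rank, (rider_qid, gap_s) in enumerate(finishers, start=1):
        if rank == 1:
            qualifier = ["P2781", f"{winner_time_s}{SECONDS_UNIT}"]
        else:
            qualifier = ["P2911", f"{gap_s}{SECONDS_UNIT}"]
        fields = [stage_qid, "P2417", rider_qid, *qualifier, "P1352", str(rank)]
        lines.append("\t".join(fields))  # assumption: tab-separated input
    return lines


if __name__ == "__main__":
    top2 = [("Q20743779", 0), ("Q462839", 0)]
    print("\n".join(stage_statements("Q55839941", top2, 14460)))
```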

The SPARQL query to get the list of cyclists is:

SELECT ?descES ?descEN ?descFR ?elemento ?pais ?nacido ?sexo
WHERE {
  # items whose occupation (P106) is sport cyclist (Q2309784)
  ?elemento wdt:P106 wd:Q2309784.
  # labels in Spanish, English and French, when they exist
  OPTIONAL {?elemento rdfs:label ?descES filter (lang(?descES) = "es").}
  OPTIONAL {?elemento rdfs:label ?descEN filter (lang(?descEN) = "en").}
  OPTIONAL {?elemento rdfs:label ?descFR filter (lang(?descFR) = "fr").}
  # country of citizenship, date of birth, sex or gender
  OPTIONAL { ?elemento wdt:P27 ?pais. }
  OPTIONAL { ?elemento wdt:P569 ?date. }
  OPTIONAL { ?elemento wdt:P21 ?sexo. }
  # keep only the year of birth
  BIND(YEAR(?date) AS ?nacido)
}
ORDER BY ?nacido
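
The query can be pasted into the editor at https://query.wikidata.org/ and the result downloaded from there. As an alternative, here is a minimal Python sketch that sends a query to the public endpoint https://query.wikidata.org/sparql and flattens the standard SPARQL JSON result into rows. The function names and the User-Agent string are made up for this example.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_url(query):
    """Build the GET URL for a query, asking for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def rows_from_results(payload):
    """Flatten SPARQL JSON results into a list of dicts.

    Variables left unbound by an OPTIONAL clause become empty strings.
    """
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({v: binding.get(v, {}).get("value", "") for v in variables})
    return rows


def run_query(query):
    """Send the query (needs network access) and return the rows."""
    request = urllib.request.Request(
        build_url(query),
        headers={"User-Agent": "rider-list-sketch/0.1 (example)"},  # placeholder
    )
    with urllib.request.urlopen(request) as response:
        return rows_from_results(json.load(response))
```

The returned rows can then be written out with the csv module and opened in Excel.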

GAN (talkcontribs)

I understand the principle, except that I use Widar. But I did not understand how to run the "sparql". Where should I put this code / text? Or could you export your "sparql" list for "Excel 2003" (I have an old version)? The riders' "Q" numbers are the most valuable part. :) So far I have only put the dates of the bike races from cyclingarchives.com into Excel.

Repf72 (talkcontribs)

Here is the query:

https://query.wikidata.org/#SELECT%20%3FdescES%20%3FdescEN%20%3FdescFR%20%3Felement%20%3Fcountry%20%3FbornYear%20%3Fsex%0AWHERE%20%7B%0A%20%20%3Felement%20wdt%3AP106%20wd%3AQ2309784.%0A%20%20OPTIONAL%20%7B%3Felement%20rdfs%3Alabel%20%3FdescES%20filter%20%28lang%28%3FdescES%29%20%3D%20%22es%22%29.%7D%0A%20%20OPTIONAL%20%7B%3Felement%20rdfs%3Alabel%20%3FdescEN%20filter%20%28lang%28%3FdescEN%29%20%3D%20%22en%22%29.%7D%0A%20%20OPTIONAL%20%7B%3Felement%20rdfs%3Alabel%20%3FdescFR%20filter%20%28lang%28%3FdescFR%29%20%3D%20%22fr%22%29.%7D%0A%20%20OPTIONAL%20%7B%20%3Felement%20wdt%3AP27%20%3Fcountry.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Felement%20wdt%3AP569%20%3Fdate.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Felement%20wdt%3AP21%20%3Fsex.%20%7D%0A%20%20BIND%28YEAR%28%3Fdate%29%20AS%20%3FbornYear%29%0A%7D%0A%0AORDER%20BY%20%3FbornYear

Here is the file

https://file.io/PGEDdF

GAN (talkcontribs)

Gracias. Everything worked out. It is hard to translate all the results into Cyrillic by hand. :)

Repf72 (talkcontribs)

You're welcome, friend.

It makes things very fast. I spend just a few minutes a day on Procyclingstats -> Excel templates -> Quick Statements -> Wikidata. Problems arise when Procyclingstats has errors (so some simple checks are necessary). On the other hand, checking whether the teams' riders are up to date still has to be done by hand.

For listofwinners on World Tour races, it took me between 30 min and 2 h per race. Having the list of cyclists makes it fast, but I had to create about 100 riders (using Quick Statements).

By the way, if you have time, could you check some races to see whether the listofwinners are OK? (Some positions may contain errors.)

GAN (talkcontribs)

I check the podium finishers selectively, when I transliterate the riders' names into my native language. I started taking the data from firstcycling.com, because I had been swapping given name and surname in places where a name had more than two words and I could not tell which was which. A couple of questions arose.

1. When you update the "sparql" list, can you somehow make the newly created riders appear at the end of the list?
2. I decided to add the stage winners and leaders to 1974 Tour de France (Q754426).

This is the data for no label (Q28054922):

Q28054922 P1346 Q103756 P642 Q20882747
Q28054922 P1346 Q103756 P642 Q20882763
3. For multi-day races :) do you enter the start and finish places of the stages by hand? That would also be a very, very, very long list.
Repf72 (talkcontribs)

Hi GAN,

For 1, in the query I sent you can replace "ORDER BY ?bornYear" with "ORDER BY strlen(str(?element)) ?element", so that the most recently created items appear at the end of the results.
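
The reason this ordering works: item IDs are compared as text, so sorting by length first and then alphabetically gives the same result as sorting by the number after the "Q", and higher numbers generally belong to newer items. The same trick on a list already downloaded, as a small Python sketch (the function name is hypothetical):

```python
def creation_order(qids):
    """Sort item IDs by length, then alphabetically.

    For IDs of the form 'Q<number>' this equals numeric order,
    which mirrors ORDER BY strlen(str(?element)) ?element.
    """
    return sorted(qids, key=lambda qid: (len(qid), qid))


if __name__ == "__main__":
    print(creation_order(["Q99", "Q100", "Q2309784", "Q5"]))
```

A plain alphabetical sort would put Q100 before Q5, which is why the length comes first.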

For 2, excellent. Unfortunately, if the stage winner (Q20882747) is also the leader (Q20882763), Quick Statements merges both statements and they have to be separated by hand. Anyway, it is less work to fix some of the data than to fill in everything by hand.
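
Such cases can be spotted before uploading. The sketch below groups the lines of a batch by item, property and value, and reports every group that appears more than once with different qualifiers. It is an illustration only: the function name is hypothetical, and it splits on whitespace, so it assumes no value in the batch contains spaces.

```python
from collections import defaultdict


def find_collisions(lines):
    """Return statements that occur on several lines of a batch.

    Key: (item, property, value); value: list of qualifier field lists.
    Only keys seen on more than one line are returned.
    """
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        groups[tuple(fields[:3])].append(fields[3:])
    return {key: quals for key, quals in groups.items() if len(quals) > 1}


if __name__ == "__main__":
    batch = [
        "Q28054922 P1346 Q103756 P642 Q20882747",
        "Q28054922 P1346 Q103756 P642 Q20882763",
    ]
    for key, qualifiers in find_collisions(batch).items():
        print(key, qualifiers)
```

The reported items are the ones to review by hand after the batch has run.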

For 3, I look up the item code of each city by hand to build the file for Quick Statements. So far I do not know how to get the cities organized by country, so that I could do something similar to what I did with the cyclist database.

Repf72 (talkcontribs)

Hi GAN

Please try to use Help:QuickStatements instead of Widar, because it is more advanced and avoids some bugs, such as adding "±0" to every number, as in Q55839945.

GAN (talkcontribs)

I think I have figured out QuickStatements: 2009 Giro di Lombardia (Q675390)

I used this: https://tools.wmflabs.org/wikidata-todo/quick_statements.php - this one also has QuickStatements in its name :)

For cities, I managed the following: as the value of instance of (P31) I used one of the main territorial-administrative unit types of the country. But the list for France came out at almost 40K.

For "Pro and World Tour" races you need 9 countries: Italy, Spain, France, Belgium, Netherlands, Switzerland, Poland, Australia, Germany. The idea is one country per Excel sheet.

Also, cities with the same name can exist in different countries, or in different territories of one country.
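
One possible way to handle this, sketched in Python rather than Excel: key the lookup by country and name together, and refuse to answer when more than one item shares the key, so that ambiguous cities are left for a manual check. The function names and the placeholder IDs in the demo are hypothetical, not real items.

```python
from collections import defaultdict


def build_city_index(rows):
    """Index cities by (country, lowercase name).

    rows: iterable of (country, city name, item id).
    """
    index = defaultdict(list)
    for country, name, qid in rows:
        index[(country, name.casefold())].append(qid)
    return index


def resolve_city(index, country, name):
    """Return the item id only if exactly one city matches, else None."""
    candidates = index.get((country, name.casefold()), [])
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    # Placeholder ids for illustration only.
    index = build_city_index([
        ("France", "Saint-Martin", "Q_A"),
        ("France", "Saint-Martin", "Q_B"),
        ("Belgium", "Saint-Martin", "Q_C"),
    ])
    print(resolve_city(index, "Belgium", "saint-martin"))
    print(resolve_city(index, "France", "Saint-Martin"))
```

Same-name cities in different countries resolve cleanly this way; same-name cities inside one country return None and still need a human decision.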

My priority is all one-day races, plus the multi-day races in "Pro and World Tour" (since 2005) that have a page for each stage.

Reply to "Disqualified Winner"