User talk:Edoderoo

Previous discussion was archived at User talk:Edoderoo/Archive 1 on 2015-12-29.

Before you start doing further automatic tasks...

Sotho Tal Ker (talkcontribs)

...I suggest that you educate yourself about the properties and their constraints. If you continue to add values that violate those constraints, I will revert them in the future. You cannot rely on other people to cleanup the mess you make, for example on [Q95994985]. So please adhere to the established standards or suggest changes if you deem them to be inappropriate.

Edoderoo (talkcontribs)

There is a bot to add useless spaces into the ISNI value. It is useless, because the link to the information gives exactly the same result. So if you revert my edits just because you have a personal preference on the formatting, I will consider that vandalism. Please assume good faith in each other's edits; you might get the same good faith back.

Edoderoo (talkcontribs)

And about the title. Please do your research well, the edits you "improved" were manual edits. Maybe you should not fill in the information you don't know. Not every edit on Wikidata is "automatic".

Edoderoo (talkcontribs)

And this wiki editor stinks. Good day.

Sotho Tal Ker (talkcontribs)

Property constraints are there for a reason. You can personally think of them what you want, but it was decided some time ago that Wikidata wants the ISNI values with spaces. It is not my personal preference. It's not hard to either stick to the standards or propose a change (whatever the outcome may be). If people adhered to the standards, no bots would be needed to correct values later. Also: I never thought these edits were automatic, but from your manual edits I deduce the work your bot will probably do... Maybe you are going to prove me wrong. Happy editing :)

Edoderoo (talkcontribs)

Well, if you believe you will help the project by reverting edits ''because you don't like the format'', although it's just a mismatch with a wrong constraint (half of the constraints and warnings make no sense on WD), I think you are making a huge mistake. And your attitude towards users you have never spoken to before is also wrong. What happened to ''assume good faith'' and being friendly to begin with?

Sotho Tal Ker (talkcontribs)

Well, if you believe you will help the project by doing such edits because "There is a bot to add useless spaces into the ISNI-value.", then go for it. Luckily there are others who care about the consistency of data and keep fixing that for you because you are too lazy to add 3 little spaces. I personally do not care if we store the raw or human-readable version of ISNI, but at least it should be consistent and not sometimes this and sometimes that. The constraint here is not "wrong", it is there for consistency. Also: can you give me a reason to be especially friendly to a user who apparently does not care about adding data the way it should be? Do you want a hug and a present before I tell you my thoughts that your edits have been subpar? Your answers have shown that you are in no way interested in improving your edits in the future, so what "good faith" shall I assume here?
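The "3 little spaces" mentioned above can be added mechanically. The following is a minimal sketch, assuming the spaced form discussed in this thread is four blocks of four characters; the function name `format_isni` is hypothetical and is not part of any Wikidata tool.

```python
import re


def format_isni(raw: str) -> str:
    """Return a 16-character ISNI grouped as four blocks of four."""
    # Drop any spaces or hyphens that are already present.
    compact = re.sub(r"[\s-]", "", raw).upper()
    # An ISNI is 15 digits followed by a check character (a digit or X).
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        raise ValueError(f"not a 16-character ISNI: {raw!r}")
    return " ".join(compact[i:i + 4] for i in range(0, 16, 4))


print(format_isni("0000000121032683"))  # prints 0000 0001 2103 2683
```

Because existing spaces are stripped first, running the function on an already spaced value returns it unchanged, so it could be applied to raw and formatted input alike.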

Edoderoo (talkcontribs)

If you are blind as you are, I prefer you keep away from my talk page. This project is not about me or you, especially not about you. If you believe my edits are all wrong, please contact an admin.

Sotho Tal Ker (talkcontribs)

Another person who cannot stand criticism *shrugs*. All I asked was that you adhere to the agreed standards for values for properties, not only ISNI, as there have been others. But feel free to continue as you like, I do not want you melting any further.

77.11.59.27 (talkcontribs)
Edoderoo (talkcontribs)

This "fake information" came from the category structure as in 2017.

Reply to "Before you start doing further automatic tasks..."

Fake information inserted by you removed

77.11.59.27 (talkcontribs)
Reply to "Fake information inserted by you removed"
Klaas van Buiten (talkcontribs)

Dear Edoderoo,

Given your use of the rollback button [https://www.wikidata.org/w/index.php?title=Q16190604&oldid=prev&diff=1222006373&markasread=602225537&markasreadwiki=wikidatawiki here], I want to tell you that the description should be as short as possible and that properties (Pnnn) exist for these dates. This is how it is handled in every language version. Consistency is desirable. See also Wikidata:De_kroeg#Geboorte-_en_sterfdata


Moreover, I cannot imagine that there are several people who both have this name and are or were French painters.

Edoderoo (talkcontribs)

Yet over the past five years several people have asked me to build descriptions preferably this way, so I do not know where you get the wisdom that you know how I should write descriptions, and why your method would then be "universally the only right one".

Klaas van Buiten (talkcontribs)

I no longer remember where, but for descriptions the advice is: "as short as possible". If you then go adding properties like these everywhere, while this is done in no language other than Dutch, I wonder whether those "several people" think they have a monopoly on wisdom. Can and will you name names? Perhaps they too have meanwhile adapted to the rest of the world.

Edoderoo (talkcontribs)

Occupation and country are also ''properties''. An occupation-nationality (birth-death) construction is still a short description.

Klaas van Buiten (talkcontribs)

As English speakers say: ''agree to disagree''. Let us await a (heated?) discussion in De kroeg; see the link at the start of this thread.

Edoderoo (talkcontribs)

You are the first in five years to make a fuss about this, so that discussion will not get very heated at all. But above you act as if I am the blind man in the land of the seeing, so I am not too worried about it.

Klaas van Buiten (talkcontribs)

My point is that some Dutch people see bears on the road (that is, imagine obstacles) in countries where no bears live in the wild.

Edoderoo (talkcontribs)

Maybe you should try linking more of these people to other Wikidata items. Then you will see for yourself that a decent description is worth far more than a botched description that is too short. And of course, people who have nothing to do all day can also open ten items from the search results, study the properties, and filter out the right item. But the monks and the Middle Ages have been a thing of the past for 500 years, and that is why I was asked back in 2015 to include these dates in the description (read: search results) where possible. The people working on "the sum of all paintings" will curse you if you go and remove those dates from all painters.

Reply to "Geboorte- en sterfdata"
Woordenbrij2 (talkcontribs)

Hello Edoderoo,

Did I just get in your way at ‘Hero Elementary’? Sorry if so!


Last year you sent me a script to insert on a meta page, so that I got to see DPs in turquoise. This afternoon I tried to enter it again for my current account, but without the desired result. Could you explain once more where exactly I need to enter it? Or does it have to do with confirming? (I work with an iPad, so perhaps that makes things harder.)


Have a pleasant evening!


Kind regards, ~~~~

Edoderoo (talkcontribs)
Woordenbrij2 (talkcontribs)

OK, thank you very much!

I will try that.

Reply to "Hero Elementary"
Multichill (talkcontribs)

The lag is high, but your bot keeps editing. What did you set as maxlag? Please lower it to 5 like everyone else.

Edoderoo (talkcontribs)

There was one script running every hour with that setting; the others, which run for weeks on end, were running properly with 5. But now it has stopped completely, because last night the login tokens were invalidated, so tonight I first have to re-enter my password...

Multichill (talkcontribs)

That setting? Which setting?

Edoderoo (talkcontribs)

maxlag=55 ... last week, when hundreds of butterflies were being added every day, I linked them all to Wikidata with two scripts, but with the server lag as we have known it over the past months, that is so much stop-and-go that it does not really work. You see that best with PAWS, which has become completely unusable as a result. In addition, a script starts on my side every hour that gives the new WD items of the past hour a description. When the server lag is high, that is often only very few items, but those were still running with a higher maxlag; that is what you saw. That three other scripts are on hold here is somewhat harder to see. But that maxlag is now back to 5, because for the script that runs on the hour it does not matter much when it finishes.
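For readers unfamiliar with the setting: a client sends `maxlag` with each API request, and when replication lag exceeds that value the MediaWiki API answers with an error whose code is `maxlag` instead of performing the request. A minimal sketch of the client side follows. The helper `call_with_maxlag` and the `send` callable are hypothetical names for illustration; this is not the code of any bot discussed here, and Pywikibot handles this internally through its own `maxlag` configuration value.

```python
import time


def call_with_maxlag(send, params, maxlag=5, max_retries=3, sleep=time.sleep):
    """Send an API request with a maxlag value and back off while servers lag.

    `send` is a hypothetical callable that takes the request parameters and
    returns a (json_dict, retry_after_seconds) tuple.
    """
    request = dict(params, maxlag=maxlag)
    for attempt in range(max_retries + 1):
        data, retry_after = send(request)
        # Any response other than a maxlag error is handed back to the caller.
        if data.get("error", {}).get("code") != "maxlag":
            return data
        if attempt == max_retries:
            break
        # Wait as long as the server asked, or 5 seconds if it did not say.
        sleep(retry_after or 5)
    raise RuntimeError("server lag stayed above maxlag; giving up")
```

With `maxlag=55` the request would still go through while the servers are up to 55 seconds behind, which is why a script with that value keeps editing when scripts set to 5 are paused.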

Reply to "Maxlag again"
MrProperLawAndOrder (talkcontribs)
MrProperLawAndOrder (talkcontribs)
Edoderoo (talkcontribs)

I took it from the DBid, Deutsche Biografie. I saw they have more data we don't have on Wikidata yet, so I will extend my scraper script a bit more.... It will make the items more useful, and they will appear in more searches when more properties are filled in. And by adding a VIAF-id that is unique, we might find some duplicates that can be merged afterwards. This way your import is going to be even more useful. Cooperation Galore ;)

MrProperLawAndOrder (talkcontribs)
Edoderoo (talkcontribs)

I know ... I will do a full run to fill in the VIAF, maybe this weekend ... not every single one will get a VIAF, but that 190,000 can come down quite a bit...

MrProperLawAndOrder (talkcontribs)

Yeah! Great! Any support very welcome. But please don't take VIAF from that website. Could be a bit out of date. Yes, VIAF adding is high prio to find more duplicates! The GND LDS (linked data service) offers even more data than only VIAF. Some stuff on DtBio website is just from a GND LDS dump, other stuff is their own - also depending on the items. But VIAF is always from the GND LDS dump, so safer to get it from there or from viaf.org.

MrProperLawAndOrder (talkcontribs)

Why do you still take VIAF from the DtBio website? It is not as safe as taking it from GND LDS. During item creation I purposely avoided loading it from there.

MrProperLawAndOrder (talkcontribs)

Is it difficult to do it from GND LDS?

Edoderoo (talkcontribs)

Oh, I think I misunderstood the discussion elsewhere. What about the other data on DtBio, like birthdate, profession, gender, etc?

MrProperLawAndOrder (talkcontribs)

Most comes from GND, including the errors. DtBio aggregates GND IDs from various sources from DE, AT, CH and mixes them with GND dumps. So, DtBio has data on a subset of GND IDs. What they have on top of GND data are the external links, which are really helpful for finding out more about these humans. Some of these external links are also tracked in WD properties. So, DtBio is really helpful for WD. Only items in ADB (which is public domain and also in Wikisource) and NDB have a lot of extra data, but importing data from NDB could probably be copyvio. GND is CC0; birthdate, profession, gender are all in GND. A GND bot/script would be great.

Edoderoo (talkcontribs)

I now see where we are talking past each other. The values for P7902 and P227 are the same. I use the link for P7902 with the data from P227, so effectively I grab the data from GND.

MrProperLawAndOrder (talkcontribs)

In most cases the value for P7902 should exist in P227. Some differences can exist. Also P227 can contain deprecated values (redirects) while in P7902 these should be deleted. Effectively you only "need" to work on P227 for P31=Q5. The existence of P7902 is just an indicator that that human could be more "important".

MrProperLawAndOrder (talkcontribs)
Edoderoo (talkcontribs)

Good, they will add more data than I do right now. Because the database is locked every other 10 minutes, it becomes a bit frustrating to develop a new script, as PAWS is at those times not allowed to even read data from Wikidata. It is stupid, but there seems to be no way around it. When the time is right I might add more data to my script; if others have already filled the gap with QS, no problem at all, I will find another task when there is time for it...

MrProperLawAndOrder (talkcontribs)

Sad to hear about the DB problems. For me the SPARQL service is sometimes annoying, long waiting and sometimes time outs. PAWS not allowed to read, would not have thought reading is a problem.

Edoderoo (talkcontribs)

Also for bots reading is not allowed in those moments, but I have the choice to use my bot account on my Ubuntu box, or use my personal account on my PAWS environment. It is anyways weird that PAWS is limited, it is already limited to 6 write transactions per minute (per user), so it is unlikely that PAWS will ever be an issue, even for writing.

MrProperLawAndOrder (talkcontribs)

I sometimes got 90/60s via QS, including the edits that QS performed twice, so I had to do extra QS edits later to remove statements or merge items. If hardware is limited, maybe the QS software could be improved. And via PAWS one can add several statements in one edit? Have you ever done that? Not good for edit count, but one could insert more content per unit of time. At least in QS "create" I could add several things with one edit (the first for that item). Good that you can run the bot via Ubuntu.

2001:1C02:1E0C:B700:DD8A:F1BC:2080:9F60 (talkcontribs)

PAWS is an online python environment, you can run python scripts from there...

MrProperLawAndOrder (talkcontribs)
2001:1C02:1E0C:B700:DD8A:F1BC:2080:9F60 (talkcontribs)

It is taking the data from GND....

MrProperLawAndOrder (talkcontribs)

the edit above didn't show it took a VIAF ID at all, and the bot writes as reference DtBio. And as I told you, others are operating on the DtBio humans. Please stop.

Edoderoo (talkcontribs)

It is probably the reference that might be Q36578 instead of Q1202222. The url to fetch the data is https://d-nb.info/gnd/{id}/about/lds which is the Deutsche National Bibliothek that is also the link from the P227/GND-property.
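The URL pattern quoted above can be filled in from a P227 value as follows. This is only a sketch of the string construction: the function name `gnd_lds_url` is hypothetical, the ID in the example is a placeholder, and nothing here reflects how the actual script fetches or parses the response.

```python
GND_LDS_TEMPLATE = "https://d-nb.info/gnd/{id}/about/lds"


def gnd_lds_url(gnd_id: str) -> str:
    """Build the d-nb.info linked-data URL for a GND ID (the P227 value)."""
    gnd_id = gnd_id.strip()
    if not gnd_id:
        raise ValueError("empty GND ID")
    return GND_LDS_TEMPLATE.format(id=gnd_id)


# Example with a placeholder ID; substitute the P227 value of the item.
print(gnd_lds_url("118540238"))  # prints https://d-nb.info/gnd/118540238/about/lds
```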

MrProperLawAndOrder (talkcontribs)

You are inserting wrong values and maybe wrong references. And it may interfere with the work of the two others that started working on this earlier and don't add only P214 but also other values. They do it via QS. They prepared it for all DtBio items. And I have not seen any error by them yet.

Edoderoo (talkcontribs)

I'm NOT inserting wrong values.

Reply to "Add VIAF to GND human"
Eurohunter (talkcontribs)
Edoderoo (talkcontribs)

Hello, that's how it is called in Dutch

Eurohunter (talkcontribs)

There is a difference between "song" and "audio track". Are they really the same in Dutch?

Edoderoo (talkcontribs)

song=liedje. Audio-track is geluidsspoor, but usually a track on a CD consists of several mixed audio tracks. The separate "items" on a CD, single, LP, tape, etc are called "nummers", also when they are performed live. So I don't understand why you are reverting edits that are not wrong, for a language you don't speak.

Eurohunter (talkcontribs)

I mean there is a noticeable difference between "song" and "audio track", so I was surprised.

Edoderoo (talkcontribs)

So then you delete a description that has no noticeable difference between what it was before and what you believe it had to be. How will we ever create "the sum of all knowledge" if we all acted like that?

Reply to "Audio track"
MrProperLawAndOrder (talkcontribs)

https://paws-public.wmflabs.org/paws-public/User:Edoderoo/workitems/db-id%20to%20VIAF.ipynb - shall we turn it into a GND human script/bot? Reading from DtBio website itself seems not to be priority 1 and if usable for GND humans in general it can be used for other GND humans too. "gndh.ipynb" ? I also used python for locally creating my QS statements, but I didn't know yet how to use it for writing to WD. Never used pywikiframework. But maybe I still can help something with "gndh.ipynb". Soon there will be 1 mio GND humans in WD. But GND DB has 12 mio or more.

MrProperLawAndOrder (talkcontribs)

in the future it could be used to write whole new humans to WD maybe in one edit so very fast. Next GND dump may come out in june/july. The way I create new humans is a bit complicated. GNDhumanbot would be much better.

Edoderoo (talkcontribs)

You can forget about "very fast". Pywikibot is as quick or slow as the other tools ;-) But it works, usually in the background, so wth...

MrProperLawAndOrder (talkcontribs)

I meant fast, if several statements are written in one revision.

MrProperLawAndOrder (talkcontribs)

Maybe WMF should rewrite all their code in Julia.

Reply to "GND human script"
Ptinphusmia (talkcontribs)

Hi Edoderoo! Thanks for the Wikidata introductory piece you sent me. I deeply appreciate it. So sorry I'm replying late though. I hope to better understand wikidata and be an active editor. Your further assistance would sure make it a reality hopefully. ~~~~

Edoderoo (talkcontribs)

there is a very active Telegram Wikidata group (see the link in the top box on Wikidata:Project chat) where you can send all your questions. Further on you will learn a lot by doing it. Don't be too afraid to do something wrong, every edit can be reverted. On the other hand, on Wikidata it can take a while before others see your mistakes.

Ptinphusmia (talkcontribs)

Thanks and well acknowledged. I'm in the telegram group already. I just don't understand so many things conversed in the group. But I'll go with your recommendation and ask questions where necessary.

Reply to "Thanks for your kind support"
Herzi Pinki (talkcontribs)

Hi Edoderoo, seen your edit: . This is a mountain right at the border, so stating this is berg in Oostenrijk is only half the truth. While berg in Duitsland is the other half. (also reflected in country (P17)) IMHO mountains at the border should have a symmetrical description regarding countries / regions. best

Edoderoo (talkcontribs)

All of my edits are made automagically with a script. It will indeed pick up only one (the very first) country (or any other property). And it will only add the description if it was <blank> before. Feel free to improve any description manually; my scripts are only meant as a very first suggestion that is correct in about 99% of the cases, and "partly correct, so not wrong" for the other 1%. Putting too much "intelligence" in the script also creates a risk that it generates descriptions that are in some way unacceptable; think of a river flowing through more than five countries, all listed in the description.
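The behaviour described above (use the first country only, and write only where the description was blank) can be sketched roughly as follows. This is an illustration, not the actual script: the function `suggest_description` and its `max_countries` parameter are hypothetical, and the Dutch pattern "X in Y" is taken from the example in this thread.

```python
def suggest_description(type_label, countries, existing="", max_countries=1):
    """Suggest a Dutch description, or None if one already exists.

    type_label    -- e.g. "berg" (mountain)
    countries     -- country labels in the order they appear on the item
    existing      -- the current description, blank if none
    max_countries -- how many countries to mention (1 = first country only)
    """
    # Never overwrite a description that is already there.
    if existing and existing.strip():
        return None
    shown = list(countries)[:max_countries]
    if not shown:
        return type_label
    if len(shown) == 1:
        place = shown[0]
    else:
        # "en" is Dutch for "and".
        place = ", ".join(shown[:-1]) + " en " + shown[-1]
    return f"{type_label} in {place}"


print(suggest_description("berg", ["Oostenrijk", "Duitsland"]))
# prints berg in Oostenrijk
print(suggest_description("berg", ["Oostenrijk", "Duitsland"], max_countries=2))
# prints berg in Oostenrijk en Duitsland
```

Raising `max_countries` to 2 would give border mountains the symmetrical description requested above, while the cap still keeps a river through many countries from producing an overlong description.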

Herzi Pinki (talkcontribs)

My Dutch is nonexistent, so I'm not able to improve your descriptions. So I can only delete partly incorrect Dutch descriptions. But I, as a human, do not think it is wise to get into an edit war with a bot.

If there are two contradicting truths, none of them can be partly correct.

Edoderoo (talkcontribs)

I have changed this one manually. My script is meant as a helper, as 99.99% of all items will not get a manual (Dutch) description in the coming 10 years, and manually adding systematic descriptions is meaningless work... now the humans can concentrate on the last 0.01% (like the one you mentioned), and/or on adding statements to make items more meaningful. A half-correct description is always better than no description at all; that is why the script is run. Things can always be better; even the description I've added can maybe be amended to something more useful.

Herzi Pinki (talkcontribs)

We almost do have 100 million items. Your 99.99 % would leave 10000 for manual attention and correction (it is unclear which 10000).

Counting in Austria, we currently have 418 mountains in at least two countries (Query) in relation to 8481 mountains overall (Query), which gives a percentage of 5 %. Critical objects include mountains and mountain ranges, bridges, lakes (and of course rivers, but rivers might be complex, I agree), boundary stones, border crossings, ferries, dams: some instance of (P31) classes and subclasses thereof. I was working hard to get this done from both sides of the border for almost the whole of last year; most of the missing double countries were caused by Lsjbot articles created based on geonames and imported from WP:ceb and WP:sv to wikidata. So Austria might be representative for the ratio, but still getting more. (Mountains often define the borders.)

Overall there are 525017 (Query) mountain objects in wikidata. 5 % of those will be about 25000 (more mountains than your estimation of 10000 from above). I know there are island states without borders, boring flat borders, etc, but I feel that your 99.99 % is an underestimation.

Looking at Switzerland, we have 7960 mountains (Query) with 245 (Query) in two countries (but as there is a border between Austria and Switzerland, many of those border mountains are already included). This gives a ratio of 3 %.

For a flat and faraway country like Sudan (is Sudan flat enough?) we have a ratio of 2 / 1321 for mountains and 457 / 12592 (4 %) for any cross-border stuff.
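The ratios quoted in this post can be re-checked with a few lines of arithmetic. The snippet below only reuses the counts stated above (they are not re-queried), and the variable names are chosen for illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


austria = percent(418, 8481)         # border mountains / all mountains
switzerland = percent(245, 7960)
sudan_mountains = percent(2, 1321)
sudan_all = percent(457, 12592)      # any cross-border objects

# 5 % of the 525017 mountain items, using integer arithmetic.
five_percent_of_all = 525017 * 5 // 100

# 0.01 % of 100 million items, i.e. what 99.99 % coverage leaves over.
left_for_manual = 100_000_000 // 10_000

print(round(austria), round(switzerland), round(sudan_all))  # prints 5 3 4
print(five_percent_of_all, left_for_manual)                  # prints 26250 10000
```

The 5 % figure applied to all mountain items gives 26250, which the post rounds to "about 25000"; either way it exceeds the 10000 items implied by 99.99 %.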

Edoderoo (talkcontribs)

If any Dutch person is as concerned about describing mountains 100% correctly as you are, I will invite them to overwrite my bot suggestions with a better description. No problem at all. But there is "only so much" that my bot can do, and I personally think that people from Dutch-speaking countries will not care too much whether a mountain is in 1, 2, 3 or more countries in the end. The description will help them find that mountain and separate it from bands with the same name and rivers with the same name, which makes *mountain* way more important than the country/countries it is located in. If the goal is to have sourced, correct, peer-reviewed, double-checked descriptions for 94,000,000 items, it is bound to fail by default anyway.


When I then study your last 500 edits, and see that over 30% of them are deletions of information, I'm afraid I'm talking to a vandal, while the goal of Wikimedia is to share knowledge, not to delete knowledge. Please be constructive, instead of complaining to others that they do not delete as much information as you did.

Herzi Pinki (talkcontribs)

We do have a different understanding of what knowledge is. Knowledge is not a statement that might be true or false, half true or half false.

What is the correct procedure to get rid of false information? The path this information comes into wikidata is wrong / inaccurate data in geonames, then mainly gets imported to sv:WP and ceb:WP by User:Lsjbot, gets imported to wikidata by some other bots (with wrong / incomplete names and descriptions) and gets translated by e.g. your bot to various languages. I was expecting that deleting wrong information will lead to an action for some bots (triggered by empty description) to enter a better translation. While keeping the wrong information will not trigger the bot to regenerate it. I feel I was overly optimistic about that which makes me a vandal. Based on multiple values of country (P17) both ways should be possible.

I was not complaining about your bot, it seems to do valuable work, but I considered my remarks as a proposal for improvement. Sorry, if I was not able to communicate this.

For wikidata, to act as the data backbone of the world, reliable and correct information is a must. Otherwise it will fail anyway. At least reliability and correctness and completeness should be our noblest concern.

Edoderoo (talkcontribs)

Well, deleting descriptions that are not "good enough" in your point of view is not going to improve the work of LSJbot. I'm trying to add value to Wikidata with all possible tools, but unfortunately they get harder to use lately, due to all lag on the replication servers and petscan being down 99% of the time. We need to add missing statements and replace wrong statements, and keep an eye on these bots that they use reliable sources for their imports. Most of the time that is OK though, but yesterday a bot was importing scientific papers with a title of True or [Not available] for an endless range of items.

Herzi Pinki (talkcontribs)

Sorry, but Lsjbot operates on the GIGO principle. And adds more garbage. The best improvement for lsjbot articles is to delete them throughout the wikiverse. IMHO.

The mere existence of misspelled names here in wikidata or in geonames or in some other fancy platform has a bunch of consequences. Yesterday I renamed the misspelled Kleiner Archkenkopf to Kleiner Archenkopf (Kleiner Archenkopf (Q21873202)), a mountain that never existed as Kleiner Archkenkopf but nevertheless had and still has more than 1000 matches in google search (https://www.google.com/search?client=firefox-b-d&q=%22Kleiner+Archkenkopf%22) - weather forecast, maps, accommodation, etc. So this was a simple typo. Even worse for non-existing features. When checking existence of such an object, you have to find the single human created information in between the bunch of bot generated stuff.

Reply to "border objects"