User talk:Ivan A. Krestinin

On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at User talk:Ivan A. Krestinin/Archive.
I'm not sure why you deprecated the DOI on Barringer Medal citation for Michael R. Dence (Q101634288). It's a working DOI, so that deprecation seems unfounded. Trilotat (talk) 07:27, 26 November 2020 (UTC)[reply]

Are you asking about this edit? The bot did not change the deprecation rank. It just upper-cased the value. — Ivan A. Krestinin (talk) 15:49, 2 January 2021 (UTC)[reply]

Non-capturing regex group (?:)

Hi Ivan,

At Wikidata:Property_proposal/URL_match_pattern, we are trying to figure out what would be a sensible default replacement pattern.

At Property_talk:P973, this would be probably \2, but, if Krbot supports non-capturing regex groups, we could use "\1".

this tries to test it with that url. Will Krbot convert it? It seems to be busy with other things in the meantime.

Wikidata:Property_proposal/URL_match_pattern could probably be useful for Krbot as well. used by (P1535) could qualify the ones used by Krbot. --- Jura 07:54, 27 November 2020 (UTC)[reply]

PCRE v8.39 is used by KrBot. Please see the library documentation for the supported syntax. As I understand it, the general idea is to replace {{Autofix}} with properties. The Autofix template currently supports several different use cases. Do you have an idea how to describe all of them using properties? — Ivan A. Krestinin (talk) 16:03, 2 January 2021 (UTC)[reply]
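
To illustrate the group-numbering question raised in this thread, here is a minimal sketch in Python. Python's `re` module is not PCRE, although it shares the `(?:...)` syntax for non-capturing groups; the URL and both patterns are hypothetical and are not taken from Property talk:P973.

```python
import re

# Hypothetical URL; not an actual P973 value.
URL = "https://www.example.org/id/12345"

# With a capturing group around the optional "www.", the ID is group 2.
capturing = re.compile(r"https?://(www\.)?example\.org/id/(\d+)")

# With a non-capturing group (?:...), the ID becomes group 1.
non_capturing = re.compile(r"https?://(?:www\.)?example\.org/id/(\d+)")

def extract(pattern, replacement, url):
    """Replace the matched URL with the given backreference."""
    return pattern.sub(replacement, url)

id_via_group2 = extract(capturing, r"\2", URL)
id_via_group1 = extract(non_capturing, r"\1", URL)
```

Both calls yield the same identifier; the non-capturing form simply keeps the numbering stable at `\1` regardless of optional prefixes.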

Connection to other wikis

Hey! Can you move Q9212417 to Q8564503 ?

The topic is the same: en.: Category:Jazz clubs

it.: Categoria:Locali jazz

The move will allow the connection to other wikis.

Thanks! --CoolJazz5 (talk) 10:59, 2 December 2020 (UTC)[reply]

@CoolJazz5 I've done this. In future, please specify links to the pages that you write about. This greatly simplifies execution of your request. Michgrig (talk) 22:41, 2 December 2020 (UTC)[reply]
Michgrig Ok, thanks!

Question about the edits made by the bot

Why did the bot edit the two entries Q87402631 and Q104417514? It is true that the announcements that the bot moved were displayed with a warning. But that doesn't mean they're wrong too. --Gymnicus (talk) 10:48, 24 December 2020 (UTC)[reply]

As I noticed just after resetting, there were no warnings at all. For this reason, the edits of the bot make even less sense from my point of view. --Gymnicus (talk) 10:59, 24 December 2020 (UTC)[reply]
It is a very common mistake for a person-specific property to be set on items that describe groups of people. See this edit as an example. The bot fixes such cases. It is logically incorrect to use properties like country of citizenship (P27) or sex or gender (P21) for human groups. — Ivan A. Krestinin (talk) 16:21, 2 January 2021 (UTC)[reply]
I don't see it as easy as you do. For the property country of citizenship (P27) I would go with you. But I can't understand that with the property occupation (P106). There is also an example with the data object Igor and Grichka Bogdanoff (Q12920) where your bot does not remove this property and this statement was added on November 22, 2020. Then why does he remove it from the examples I mentioned? --Gymnicus (talk) 10:28, 6 January 2021 (UTC)[reply]

Request

Could you make your bot:

  • Add P750 > Q1486288 to items with P2725
  • Add P750 > Q1052025 to items with P5944
  • Add P750 > Q1052025 to items with P5971
  • Add P750 > Q1052025 to items with P5999
  • Add P750 > Q22905933 to items with P7294
  • Add P750 > Q135288 to items with P5885

--Trade (talk) 21:55, 27 December 2020 (UTC)[reply]

Please add statements like:
  • property constraint (P2302): item-requires-statement constraint (Q21503247)
  • property (P2306): distributed by (P750)
  • item of property constraint (P2305): PlayStation Store (Q1052025)
  • constraint status (P2316): mandatory constraint (Q21502408)
Bot fixes such statements automatically. — Ivan A. Krestinin (talk) 16:49, 2 January 2021 (UTC)[reply]
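
The effect of such item-requires-statement constraints can be sketched as follows. This is only an illustration, not KrBot's code: the mapping is copied from the request above, while the simplified item model (a dict from property ID to a list of values) and the function name are assumptions.

```python
# Mapping from the request above: identifier property -> required P750 value.
REQUIRED_DISTRIBUTOR = {
    "P2725": "Q1486288",
    "P5944": "Q1052025",
    "P5971": "Q1052025",
    "P5999": "Q1052025",
    "P7294": "Q22905933",
    "P5885": "Q135288",
}

def missing_distributors(claims):
    """Return the P750 values an item still lacks.

    claims: dict mapping a property ID to a list of values
    (a hypothetical, simplified model of a Wikidata item).
    """
    present = set(claims.get("P750", []))
    required = {
        value for prop, value in REQUIRED_DISTRIBUTOR.items() if prop in claims
    }
    return sorted(required - present)
```

For an item that has P5944 and P2725 but only lists Q1052025 under P750, the function would report Q1486288 as missing.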

Fenestella troubles

There are two different genera named Fenestella, namely an animal one (Q20975616) and a fungus one (Q17317929). Despite the names, they are completely different, and so should their corresponding article items and their category items (Q9651255 and Q18283983, respectively) be. However, through name confusion, there have been mistakes, like linking the Swedish fungus article to the Commons category for the animals. Unfortunately, these mistakes also seem to have caused confusion in Wikidata.

I do not understand why your robot talks about a "redirect from Q20975616 to Q17317929" in the summary of this edit, but I know that this edit, and others mentioned in the summary details, have contributed to the confusion. I'll revert these edits, correct what I can, and add further "different from" properties to the central items (the four enumerated supra). I hope that this will lessen the risks for this particular confusion in the future. Best regards, JoergenB (talk) 20:01, 29 December 2020 (UTC)[reply]

I think different from (P1889) should be enough. Thank you for your work! — Ivan A. Krestinin (talk) 17:17, 2 January 2021 (UTC)[reply]

Ανδρέας (Greek given name) abusively mutated to Slavic equivalent(?) Анджей

Problem here may be caused by one or more edits in Q14941830... To correct it manually is hard: got an error message: 'Item Q87263878 already has label "Ανδρέας" associated with language code el, using the same description text.' Perhaps a bot can be more helpful.

Happy New Year, Ivan A. Krestinin, Klaas `Z4␟` V 11:31, 1 January 2021 (UTC) (on behalf of NameProjectMembers & notifiers)[reply]

Looks like everything was fixed already. Happy New Year! — Ivan A. Krestinin (talk) 17:21, 2 January 2021 (UTC)[reply]

Protection of user page

Hi! As suggested here, I've raised the protection level of your user page to administrators, in order to avoid accidental creations by registered users. If you want to create the page in the future, you can always make a request to WD:AN. Best regards and happy 2021! --Epìdosis 14:46, 1 January 2021 (UTC)[reply]

Good, thank you! Happy new year! — Ivan A. Krestinin (talk) 17:35, 2 January 2021 (UTC)[reply]

Error in BnF correction

Hi,

The correction made here is wrong, the correct id is 12746940n.

When it starts with FRBNF, the last character is always wrong and must be recalculated.

eru [Talk] [french wiki] 17:54, 1 January 2021 (UTC)[reply]
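
The recalculation mentioned above can be sketched as follows, assuming the NOID check-character scheme used for ARK identifiers, applied to "cb" plus the eight digits. That assumption reproduces the id 12746940n quoted above, but the function names and the FRBNF input value are hypothetical.

```python
# Alphabet of the NOID check-character scheme (29 characters, no vowels or "l").
NOID_XDIGITS = "0123456789bcdfghjkmnpqrstvwxz"

def bnf_check_char(eight_digits):
    """Compute the trailing check character for the 8-digit part of a BnF id."""
    base = "cb" + eight_digits
    # Weighted sum: each character's alphabet index times its 1-based position.
    total = sum(
        NOID_XDIGITS.index(ch) * pos for pos, ch in enumerate(base, start=1)
    )
    return NOID_XDIGITS[total % len(NOID_XDIGITS)]

def fix_frbnf(value):
    """Turn an FRBNF-style value into digits plus a recalculated check character."""
    if value.startswith("FRBNF"):
        value = value[len("FRBNF"):]
    digits = value[:8]  # discard the unreliable last character
    return digits + bnf_check_char(digits)
```

For example, `fix_frbnf("FRBNF127469405")` (a made-up input) returns `"12746940n"`.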

What did KrBot do?

Hi Ivan, I don't get the reason for this change by your bot. As far as I can see, the bot didn't change anything :-)

and of course - a happy new year. greetings from Germany --Z thomas (talk) 13:01, 2 January 2021 (UTC)[reply]

Hi Thomas, the bot removed non-printable characters at the end of the name. Happy new year! — Ivan A. Krestinin (talk) 17:53, 2 January 2021 (UTC)[reply]
Thanks. I assumed something like that. Greetings --Z thomas (talk) 18:19, 2 January 2021 (UTC)[reply]
Hi, could you also add non-breaking spaces (\u202F and \u00A0) and multiple spaces (\s{2,} -> " ") to your script, please? Recently there was a not very successful import from DACS (ping @Hannolans:), containing such spaces in en/nl labels. --Lockal (talk) 16:31, 9 January 2021 (UTC)[reply]
Oops, I wasn't aware of this; this was a download of the unmatched Mix'n'match entries that I uploaded with OpenRefine. It would be great if the bot could repair this. Handling double spaces would also be very useful. --Hannolans (talk) 22:36, 9 January 2021 (UTC)[reply]
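
The requested clean-up could look roughly like this. It is a sketch of the rule as described in the request, not KrBot's actual implementation, and the function name is an assumption.

```python
import re

def normalize_label(label):
    """Replace non-breaking spaces and collapse runs of whitespace in a label."""
    # U+202F (narrow no-break space) and U+00A0 (no-break space) -> plain space.
    label = label.replace("\u202f", " ").replace("\u00a0", " ")
    # Two or more whitespace characters -> a single space.
    label = re.sub(r"\s{2,}", " ", label)
    # Drop leading and trailing whitespace.
    return label.strip()
```

A label such as `"Jan\u00a0van\u202f  Eyck "` would come out as `"Jan van Eyck"`.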

Murder -> Homicide

Hi!

Could you check the discussion at Property_talk:P1196#Allow_assassination? and, unless there is a good reason that I and others have so far managed to overlook, kindly ask your bot to stop edit-warring? Best, --Matthias Winkelmann (talk) 19:58, 2 January 2021 (UTC)[reply]

Hi! Are you asking about this edit? Please remove the corresponding {{Autofix}} rule from the Property talk:P1196 page if the rule is inapplicable in some cases. — Ivan A. Krestinin (talk) 22:53, 14 February 2021 (UTC)[reply]

Edits made based on a wrong constraints on MUBI ID

Hey, Trade added two item-requires-statement constraints (Q21503247) on MUBI film ID (P7299) that weren't right, since MUBI has IDs for many films that are not on MUBI. They are basically there to show up in search and then suggest similar titles for potential viewers. Your bot is adding statements based on these constraints. Could you undo them? Thanks, Máté (talk) 05:18, 3 January 2021 (UTC)[reply]

Seconding this. Please undo these edits. Trivialist (talk) 18:50, 24 January 2021 (UTC)[reply]
Hi! ✓ Done — Ivan A. Krestinin (talk) 23:58, 14 February 2021 (UTC)[reply]
Thanks! Máté (talk) 08:21, 15 February 2021 (UTC)[reply]

Hi! I have noticed that you have added a number of distributed by (P750) statements based on identifiers on items. First of all, I have not found any bot request for this work; one is needed to approve a new task. For music items it's not correct to add Spotify (Q689141), Tidal (Q19711013), Deezer (Q602243), etc. as distributors. That is like saying ham (Q170486) is distributed by (P750) Walmart (Q483551). Within the music industry, music distribution is the way that the music industry makes recorded music available to consumers. This is often done by a record label. So, please remove the added statements from music-related items and create a bot request for this work. --Premeditated (talk) 14:08, 5 January 2021 (UTC)[reply]

Wait, I thought we followed the same model with music releases as we do with video games and film? --Trade (talk) 15:20, 5 January 2021 (UTC)[reply]
What do you mean? For example, Rocky IV (Q387638) is distributed by (P750) Metro-Goldwyn-Mayer (Q179200), not Netflix (Q907311) or Apple TV+ (Q62446736) (just examples) merely because it is available on those sites. For games I guess there is more of a publisher type of distribution, but I don't know much about how that works for games. --Premeditated (talk) 09:02, 6 January 2021 (UTC)[reply]
The theatrical and home media video versions of Rocky IV (Q387638) are distributed by (P750) Metro-Goldwyn-Mayer (Q179200), while the video on demand (Q723685) version is distributed by (P750) Netflix (Q907311) (by virtue of being distributed on Netflix's video-on-demand service).
'For games I guess there is more of a publisher type of distribution, but I don't know much about how that works for games' A publisher is the one who publishes the game. A distributor is the website that the game download is sold on, though sometimes there are exceptions for physical releases and streaming platforms. @Premeditated: --Trade (talk) 12:59, 8 January 2021 (UTC)[reply]
@Trade: I think you are confusing distribution format (P437) with distributed by (P750). For example, The Beatles (Q3295515) has distribution format (P437) music streaming (Q15982450). --Premeditated (talk) 13:47, 8 January 2021 (UTC)[reply]

I corrected my examples. So, why do you think that listing music streaming platforms is outside the scope of distributed by (P750)? @Premeditated:--Trade (talk) 00:42, 10 January 2021 (UTC)[reply]

@Trade: Sorry for the late response. I think a new property named "distribution platform" should be created, which could be used for all of those platforms like Steam (Q337535), Spotify (Q689141), Microsoft Store (Q135288), etc., instead of cluttering distributed by (P750). - Premeditated (talk) 12:22, 20 January 2021 (UTC)[reply]

KrBot malfunction at Wikidata:Database reports/Constraint violations/P8988

Hello, your bot failed to detect any violations at Wikidata:Database reports/Constraint violations/P8988, which is improbable. Please could you look at what is wrong? Vojtěch Dostál (talk) 15:08, 5 January 2021 (UTC)[reply]

Looks like everything is fine with the page now. — Ivan A. Krestinin (talk) 22:45, 14 February 2021 (UTC)[reply]

KrBot malfunction at Wikidata:Database reports/identical birth and death dates

Hello, some entries Wikidata:Database reports/identical birth and death dates were fixed some days ago but not removed, could you have a look? Some examples:

Thank you! --Emu (talk) 21:40, 6 January 2021 (UTC)[reply]

Now all the items are removed. Maybe somebody fixed the items, or maybe it was a caching issue... — Ivan A. Krestinin (talk) 22:43, 14 February 2021 (UTC)[reply]

KrBot2 sleeping

Hello Ivan! KrBot2 fell asleep. Please wake it up! :) Thanks Palotabarát (talk) 00:13, 13 February 2021 (UTC)[reply]

Ah, I get it. Now I know where the dump is, which gives the data. Thanks for the reply! Palotabarát (talk) 00:15, 15 February 2021 (UTC)[reply]
Unfortunately, 20210303.json.gz is also corrupted. The issue is tracked here. — Ivan A. Krestinin (talk) 09:54, 8 March 2021 (UTC)[reply]
Fixed — Ivan A. Krestinin (talk) 19:52, 29 March 2021 (UTC)[reply]

Resolving redirects

My understanding is that withdrawn identifiers should be handled by deprecating them and marking them as withdrawn. Your bot is instead simply replacing them. The withdrawn (and replaced) identifiers are still used in other systems and linking them may still be desired.

For example withdrawn VIAF identifiers are still used by Worldcat Identities. Though Worldcat should update and merge their entries, until they do the old VIAF ID is still useful. Int21h (talk) 00:12, 15 March 2021 (UTC)[reply]

@Int21h: Hi! For VIAF ID (P214) there was consensus here for the removal of redirected and withdrawn IDs since VIAF clusterization has many problems (e.g. Q212872#P214) and keeping trace of it would be quite problematic. Bye, --Epìdosis 07:44, 15 March 2021 (UTC)[reply]
Ok thanks I wasn't aware of previous discussions. Good to know! Int21h (talk) 16:28, 15 March 2021 (UTC)[reply]

KrBot and Single Constraint

Hi, when checking the Single Constraint violations of identifiers, would it be possible to ignore the ones that have a deprecated rank and reason for deprecated rank (P2241): redirect (Q45403344) as a qualifier? One example would be this one, which is listed in the constraint report. Those are considered valid values and should be kept, so having them in the report makes maintenance and cleanup harder. -- Agabi10 (talk) 18:00, 18 March 2021 (UTC)[reply]

Hello, there are technical difficulties in implementing this. Maybe I can propose an alternative. Does IMDb allow getting all valid (non-redirect) identifiers? If yes, I can create a bot that will fix such redirects continuously. — Ivan A. Krestinin (talk) 19:56, 29 March 2021 (UTC)[reply]
I don't know if it allows getting all identifiers, but at least for now they shouldn't be replaced; as long as they have been valid identifiers, they should be kept with deprecated rank. If checking the qualifier is too much trouble, would just ignoring the statements with deprecated rank when creating the report be more feasible? -- Agabi10 (talk) 09:45, 7 April 2021 (UTC)[reply]
@Agabi10: That's a good interim solution - yes, skipping items with deprecated statements would really be best. Vojtěch Dostál (talk) 10:12, 7 April 2021 (UTC)[reply]

Hi, it's been more than a year since the request. If checking the qualifiers and ignoring the ones with given deprecation reasons isn't technically feasible, would it be feasible to ignore all the claims with deprecated rank for the Single Constraint violations? It's not ideal, but most of the violations in the Single Constraint of IMDb ID (P345) are for claims that should be left as is, which makes that section of the report completely useless, at least in this case. -- Agabi10 (talk) 13:20, 27 September 2022 (UTC)[reply]
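
The interim solution discussed in this thread, skipping deprecated statements when checking for single-value violations, can be sketched as below. The statement model (a list of value/rank pairs) and the identifier values in the usage note are hypothetical; this is not how KrBot stores data.

```python
def single_value_violation(statements, ignore_deprecated=True):
    """Report whether one property on one item has more than one distinct value.

    statements: list of (value, rank) tuples, where rank is one of
    "preferred", "normal" or "deprecated".
    """
    values = {
        value
        for value, rank in statements
        if not (ignore_deprecated and rank == "deprecated")
    }
    return len(values) > 1
```

With `[("tt0000001", "normal"), ("tt0000002", "deprecated")]` the check passes when deprecated statements are ignored and fails when they are counted.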

Reverted merge

Hello. Q20540007 was mistakenly merged with Q17165321. Then KrBot re-linked statements pointing to the redirect. Then the merge was reverted. Could you also revert the bot-actions? Thanks in advance. Greetings, --FriedhelmW (talk) 14:52, 21 March 2021 (UTC)[reply]

Hello, ✓ Done — Ivan A. Krestinin (talk) 19:57, 29 March 2021 (UTC)[reply]
Thank you! --FriedhelmW (talk) 16:09, 30 March 2021 (UTC)[reply]

NGA number

Hello. I have seen your bot doing great work fixing up light characteristic (P1030) and ARLHS lighthouse ID (P2980). I wonder if you might be able to help with NGA lighthouse ID (P3563)? Often these are written as a 5-digit number but are missing the 3-digit volume prefix. (Compare [1]). The volume depends on the geographic area, which may be deducible from country (P17). This map shows how the 7 volumes are distributed. If you can help, that would be great. MSGJ (talk) 21:45, 22 March 2021 (UTC)[reply]

Hi, this is outside the bot's current tasks. It is better to post the request at Wikidata:Bot requests. — Ivan A. Krestinin (talk) 20:06, 29 March 2021 (UTC)[reply]

David van Dantzig

@KrBot: Hi Ivan: The University of Utrecht was never an employer of David van Dantzig. Please see the biography of van Dantzig written by Gerard Alberts, Twee geesten van de wiskunde : biografie van David van Dantzig, published in 2000, or the paper by his student Jan Hemelrijk, The Statistical Work of David Van Dantzig (1900-1959), published in 1960, or the short biography in Academic Genealogy of Mathematicians (page 310) by Sooyoung Chang, published in 2011. Moreover, Utrecht University is not cited in the Complete Dictionary of Scientific Biography nor in the MacTutor History of Mathematics.--Ferran Mir (talk) 11:22, 23 March 2021 (UTC)[reply]

Please @KrBot:, read my arguments against the statement that University of Utrecht was employer of David van Dantzig.--Ferran Mir (talk) 15:00, 23 March 2021 (UTC)[reply]
Hi Ferran, KrBot is just a bot :) It uses a very simple rule: each item with a Catalogus Professorum Academiae Rheno-Traiectinae ID (P2862) property should have an employer (P108) = Utrecht University (Q221653) statement, according to the constraints specified on Property:P2862. This edit will help. — Ivan A. Krestinin (talk) 20:17, 29 March 2021 (UTC)[reply]
OK @KrBot: @Ivan A. Krestinin:, I have seen the exception included in the restriction. That's right! Thanks.--Ferran Mir (talk) 07:40, 30 March 2021 (UTC)[reply]

Qualifier reason for deprecated rank (P2241) on property constraints

Hi Ivan A. Krestinin,

to use Help:Property_constraints_portal/Entity_suggestions_from_constraint_definitions, some constraint statements have the above qualifier (and deprecated rank). Can you skip those constraints for Krbot? In the most recent update, the report throws an error. --- Jura 10:23, 13 April 2021 (UTC)[reply]

Hi Jura, I added the property to the ignore list. The current update is already in progress, so it will still report the error. The next update should be fine. — Ivan A. Krestinin (talk) 20:48, 24 April 2021 (UTC)[reply]

Bot is doing strange things on Petit-Rocher Lighthouse (Q106498634) — Martin (MSGJ · talk) 20:17, 19 April 2021 (UTC)[reply]

The bot executes rules from {{Autofix}}. The rules were added by Jura, so it is better to discuss the issue with him. — Ivan A. Krestinin (talk) 20:32, 24 April 2021 (UTC)[reply]
Yes, it seems to work as planned (removing the dots). --- Jura 20:48, 24 April 2021 (UTC)[reply]
It took 8 edits to do it though? — Martin (MSGJ · talk) 19:29, 25 April 2021 (UTC)[reply]
To avoid breaking things, I think I did "."→" ", as sometimes the space following them was missing and some "." shouldn't be replaced.
We could probably have more Autofix rules that try to do it in fewer steps, but then these would have to be checked on every run as well.
This report has more patterns that might need to be normalized, but it's a tricky thing. --- Jura 07:11, 29 April 2021 (UTC)[reply]


constraint scope (P4680) qualifier, error on KrBot update

Hi Ivan,

Maybe the qualifier should be handled somehow or ignored.

I removed it at [2] for [3], but maybe there are cases where it's useful (possibly at this property). --- Jura 15:11, 29 April 2021 (UTC)[reply]

single-value constraint (Q19474404) is checked for main values only, so the property does not look useful. — Ivan A. Krestinin (talk) 15:57, 10 May 2021 (UTC)[reply]
I tend to agree. I think people started adding them as there was some oddity with the Wikibase extension (initially checking by default everywhere). Or was that about the distinct value constraint? Go figure. --- Jura 09:38, 12 May 2021 (UTC)[reply]
In that case, would you be able to provide constraint reports while ignoring that qualifier (instead of throwing an error and producing no report, as is currently done)? Mahir256 (talk) 18:35, 6 June 2021 (UTC)[reply]
@Ivan A. Krestinin: Thoughts on the idea of ignoring that qualifier? Mahir256 (talk) 17:56, 17 June 2021 (UTC)[reply]
Hi! I reviewed several usages of the property. As far as I can see, it looks completely redundant. Why not just remove it? It is also very confusing because it is similar to property scope (P5314). — Ivan A. Krestinin (talk) 20:49, 21 June 2021 (UTC)[reply]
@Ivan A. Krestinin: I agree that it is redundant for you, given that you only check main values, which is why I'm asking if you could ignore it and possibly other properties not applicable in that situation when generating reports. I believe that P4680 is still useful for the gadget for which @Lucas Werkmeister (WMDE): proposed that property in the first place, and possibly for other future tools which are developed for constraint checking. (@MisterSynergy:, as the proposer of P5314, who might have more to say on that point.) Mahir256 (talk) 21:42, 21 June 2021 (UTC)[reply]
property scope (P5314) defines where a property might be used, while constraint scope (P4680) defines where the constraint should be checked.
As an example, consider identifier properties. They are usually allowed (via property scope (P5314)) as main values and references. A distinct-values constraint (Q21502410), however, should not be checked on references, as the reference value might occur on different claims and even different items.
MisterSynergy (talk) 09:39, 23 June 2021 (UTC)[reply]
distinct-values constraint (Q21502410) is checked for main values only. I do not see any reason to duplicate this fact on each property page. — Ivan A. Krestinin (talk) 22:56, 23 June 2021 (UTC)[reply]

resolved redirects

I came here to pat the bot. I had no idea that I had left so many links to my redirects. Today, there was a good bot, thanks for that.--RaboKarbakian (talk) 00:26, 2 May 2021 (UTC)[reply]

Lingering uses of redirects in statements

Hi! I understand your bot is responsible for fixing statements that refer to redirected items. I see that high-mass X-ray binary (Q71963720) was redirected on 2019-12-10T17:42:33‎, but there are still many statements using the old item (e.g. here is one I fixed today). Do you know why these are not getting fixed? (I noticed this when trying to work out why there were so many type violations on Wikidata:Database_reports/Constraint_violations/P59.) Cheers, Bovlb (talk) 16:30, 26 May 2021 (UTC)[reply]

Hello, currently the bot fixes each redirect only once. After that, the bot adds the item to a special "already fixed" list and ignores it. The bot fixed all redirects to Q71963720 on 2019-12-12, but after that the redirect was used again by User:Ghuron, see [4] as an example. It looks like I need to create a special algorithm to detect reuse of fixed redirects. — Ivan A. Krestinin (talk) 09:47, 6 June 2021 (UTC)[reply]
I need to fix that in my script, thanks Ghuron (talk) 15:45, 6 June 2021 (UTC)[reply]
I also added detection of such items to my bot. Links to high-mass X-ray binary (Q71963720) were fixed. — Ivan A. Krestinin (talk) 05:59, 7 June 2021 (UTC)[reply]
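
The behaviour described in this thread, a one-time fix per redirect plus detection of later reuse, might be modelled like this. The data structures, function name, and item IDs are all hypothetical; the bot's real algorithm is not documented here.

```python
def redirects_needing_fix(redirects, already_fixed, still_linked):
    """Select redirects whose incoming statement links should be resolved.

    redirects: dict mapping a redirect item ID to its target item ID
    already_fixed: set of redirect IDs that were processed in an earlier run
    still_linked: set of redirect IDs currently used as statement values
    """
    known = set(redirects)
    # Redirects that were never processed.
    first_time = known - already_fixed
    # Redirects that were processed before but are in use again.
    reused = already_fixed & still_linked & known
    return sorted(first_time | reused)
```

Given redirects Q100, Q200 and Q300, with Q100 and Q200 already fixed and Q200 linked again, the function selects Q200 and Q300.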

Removal of "occupation" for Peter and Rosemary Grant

Hi, your bot is removing one of the occupations ("evolutionary biologist", the most important one !) of Peter and Rosemary Grant (Q3657692), as in this edit, and I cannot understand why ? (by the way, a bot should probably not repeat twice the same edit if it has been manually reverted; it should likely lead to a discussion). Cheers, Schutz (talk) 09:11, 28 May 2021 (UTC)[reply]

Hi, the bot removes person-specific properties like birth/death dates, nationality, spoken language, occupation, etc. from items about groups of people. It is a very common mistake that is repeated many times by different bots, semi-automatic procedures and some users. — Ivan A. Krestinin (talk) 09:53, 6 June 2021 (UTC)[reply]
In any case, the bot should not blindly remove several times the same information -- it should alert the user instead. But I don't really see why "writer", in this case, is kept, while "evolutionary biologist" is not. In this case, the latter is not an error, as the couple worked together as evolutionary biologists. The problem is that by simply removing the property, nothing meaningful appears in the infobox at w:fr:Peter et Rosemary Grant (at the moment it is only "writer", translated in French). If you have any suggestion about how the interesting information can be displayed (in other words, how the Wikidata item can include the information that the pair has worked as evolutionary biologists, and not "writers", so that this information can trickle down to the infobox), I'd love to hear it. Otherwise, could you change your bot so that it does not remove this useful (and correct) information ? Many thanks in advance, Schutz (talk) 13:54, 5 July 2021 (UTC)[reply]
Wikidata is not just a collection of information; the information should also be structured. This type of error is too common, and too many users make it from time to time. The bot has already fixed 4649 cases of this type... 4649 notifications on users' pages would look like a spam bot :) I changed the article a bit to make its information cleaner. — Ivan A. Krestinin (talk) 22:17, 12 July 2021 (UTC)[reply]

Request

Can this bot replace all current and future value in Namuwiki ID (P8885) from %20 to space (example from 머라이어%20캐리 to 머라이어 캐리)? Thanks. Hddty (talk) 01:05, 9 June 2021 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 17:00, 10 June 2021 (UTC)[reply]
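
The requested replacement amounts to a one-line string substitution. The sketch below only handles the `%20` case from the request; whether the bot decodes other percent-escapes is not stated here, and the function name is an assumption.

```python
def fix_namuwiki_id(value):
    """Replace percent-encoded spaces in a Namuwiki ID (P8885) value."""
    return value.replace("%20", " ")

# Example from the request above.
fixed = fix_namuwiki_id("머라이어%20캐리")
```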

Resolving redirect (Q12368077_Q5334199)

As a consequence of a bad merge there are now 292 erroneous links [5]. These links should be "unresolved". May I also suggest that your bot wait longer after a merge before resolving links, perhaps a week or even a month, so that bad merges are likelier to be caught beforehand? 2001:7D0:81DA:F780:A8E0:C965:57F9:B464 06:53, 24 June 2021 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 07:59, 26 June 2021 (UTC)[reply]

Update the report of P2991

Hi Ivan. Could you maybe let your KrBot run over the property IBSF athlete ID (P2991) regarding Wikidata:Database reports/Constraint violations/P2991? I had removed several bugs, but I also noticed that there are definitely false messages and I would be interested to know whether these will now be removed. --Gymnicus (talk) 11:25, 25 June 2021 (UTC)[reply]

Thank you --Gymnicus (talk) 14:54, 25 June 2021 (UTC)[reply]

To merge

Why no updates of User:Ivan A. Krestinin/To merge anymore? - FakirNL (talk) 08:24, 2 July 2021 (UTC)[reply]

Hi! The bot failed on wrong different from (P1889) values like this and this. I added some checks to skip such values. The reports should be updated in 1-2 days. — Ivan A. Krestinin (talk) 09:26, 3 July 2021 (UTC)[reply]

Wrong statements based on wrong constraint

Hey, could you please revert edits based on this statement? The constraint was erroneous. – Máté (talk) 04:50, 4 July 2021 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 10:45, 4 July 2021 (UTC)[reply]


Hi Ivan,

Similarly to replacement property (P6824), can you ignore this when found in property constraint? Otherwise KrBot would generate an error. --- Jura 11:33, 11 July 2021 (UTC)[reply]

Hi Jura, interesting property. I added it to the ignore list for conflicts-with constraint (Q21502838) and none-of constraint (Q52558054). But it looks like the bot should do something more than just ignore this property. Currently the property is used a bit randomly. For example, I do not understand the property's usage for location of creation (P1071):
Do you have any ideas on how to make its usage more structured? — Ivan A. Krestinin (talk) 21:22, 12 July 2021 (UTC)[reply]
I had seen that use too, but wasn't sure what to think of it. The problem seems to be that the replacement isn't always applicable. Personally, I'd remove that.
The samples at replacement value (P9729) are closer to how I'd use them. @Dhx1: documented them at Help:Property_constraints_portal/None_of.
I noticed platform (P400) has plenty of constraints that can use it. --- Jura 14:54, 13 July 2021 (UTC)[reply]

Thank you

Just wanted to drop a big "thank you" for cleaning up the apparent mess that Pi bot made with all the duplication. Great catch. Huntster (t @ c) 23:52, 15 August 2021 (UTC)[reply]

New constraint Label in Language

Hi Ivan,

Please see phab:T195178. To test the future deployment, I added label in language constraint (Q108139345) at [6]. You might need to have Krbot skip it. --- Jura 12:30, 7 September 2021 (UTC)[reply]

Hi Jura, I added a fake implementation for the constraint. It is not so hard to add a real implementation, but this requires loading information about labels, which takes some memory, and memory is unfortunately a critical resource now. See my message below for details. — Ivan A. Krestinin (talk) 20:58, 8 September 2021 (UTC)[reply]
I had previously implemented it with complex constraints, see Help:Property_constraints_portal/Label_language. --- Jura 07:59, 9 September 2021 (UTC)[reply]

Wikidata:Database_reports/Constraint_violations/P2088 has not been regenerated for 16 days.

Is there a way to force it to revalidate this property? --Vladimir Alexiev (talk) 14:24, 8 September 2021 (UTC)[reply]

# Note: before https://phabricator.wikimedia.org/T201150 is fixed, the result will only be partial
SELECT DISTINCT ?item ?itemLabel ?value WHERE {
	?statement wikibase:hasViolationForConstraint wds:P2088-DD4CDCEA-B3F6-4F02-9CFB-4A9E312B73A8 .
	?item p:P2088 ?statement .
	?statement ps:P2088 ?value.
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}
  • Unfortunately this returns fewer violations compared to the pages generated by KrBot. See the comment in the query: "Note: before https://phabricator.wikimedia.org/T201150 is fixed, the result will only be partial"

"Unique value" violations due to duplicate external-id

Looking at https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P2088#%22Unique_value%22_violations, we see many Qnnn values that are the same.

I described them as "false positives" but then looked at some instances eg https://www.wikidata.org/wiki/Q5013693#P2088 and see that indeed there's a problem: the same external-id is recorded with and without a reference. The one without reference should be removed --Vladimir Alexiev (talk) 12:55, 9 September 2021 (UTC)[reply]

Culture Bas-Saint-Laurent

Hi Ivan,

As an organization, we have taken care to complete the entry for Culture Bas-Saint-Laurent (Q108475391). We will request the removal of the entry Q87727973, since it is now obsolete.

Thank you

 – The preceding comment was unsigned.

Hi Ivan,

{{Autofix}} allows adding additional statements based on existing values.

An interesting enhancement could be to do this as constraint as well, e.g.

if currentProperty + currentPropertyValue then requiredProperty + requiredPropertyValue

Also:

if currentProperty + currentPropertyValue then requiredProperty

Maybe the property in the condition could be an argument as well:

if currentProperty + otherProperty + otherPropertyValue then requiredProperty
if currentProperty + otherProperty + otherPropertyValue then requiredProperty + requiredPropertyValue

--- Jura 09:46, 12 September 2021 (UTC)[reply]
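The rules proposed above can be read as a conditional check on an item's statements. The following is only a minimal sketch of that reading, not how KrBot or Wikibase evaluates constraints: the `ConditionalRule` class, the `violates` function and the simplified item representation (property ID mapped to a set of values) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class ConditionalRule:
    """Hypothetical encoding of 'if X then Y' as proposed in the thread."""
    current_property: str                 # property the constraint is attached to
    condition_property: str               # property whose value triggers the rule
    condition_value: str                  # value that triggers the rule
    required_property: str                # property that must then be present
    required_value: Optional[str] = None  # None means any value is acceptable


def violates(item: Dict[str, Set[str]], rule: ConditionalRule) -> bool:
    """Return True if the item meets the condition but lacks the requirement."""
    if rule.current_property not in item:
        return False  # the constraint does not apply to this item
    if rule.condition_value not in item.get(rule.condition_property, set()):
        return False  # the condition is not met
    found = item.get(rule.required_property, set())
    if rule.required_value is None:
        return not found  # any value of the required property is enough
    return rule.required_value not in found


# Illustrative rule only: occupation (P106) = actor (Q33999) requires P31 = human (Q5).
rule = ConditionalRule("P106", "P106", "Q33999", "P31", "Q5")
print(violates({"P106": {"Q33999"}}, rule))
```

Setting `condition_property` to a property other than `current_property` gives the variants with `otherProperty`; leaving `required_value` as `None` gives the variants that require only the presence of a property.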

Hi Jura, could you provide some examples for testing? — Ivan A. Krestinin (talk) 10:46, 19 September 2021 (UTC)
How about these? --- Jura 14:28, 19 September 2021 (UTC)
I misread your message the first time. Now I understand your idea. It is possible to create such a constraint. But maybe {{Autofix}} is enough? Do we need to check such cases with constraints as well? — Ivan A. Krestinin (talk) 14:46, 19 September 2021 (UTC)
I will try to dig up better examples. When Autofix is (safely) possible (for item-datatype properties with a predefined value), the constraint wouldn't be that useful. --- Jura 14:53, 19 September 2021 (UTC)

Samples:

Sorry for the delay. Happy holidays. --- Jura 00:32, 24 December 2021 (UTC)[reply]

COSPAR and Coordinates

Hello Ivan. Regarding your reverts to my removals at COSPAR ID (P247) and coordinate location (P625), you have to realize that more than just satellites are sent into space and receive COSPAR IDs. Probes sent to other worlds have need for coordinates (primarily Mars since that planet is supported in our system, but others are as well), and it's entirely possible that Earth-bound spacecraft may potentially have a need for it as well. My point is, making these two properties mandatorily conflicting doesn't make sense in modern spaceflight. Huntster (t @ c) 13:25, 19 September 2021 (UTC)[reply]

A landing point is just one of the points in a spacecraft's life. So we can specify coordinates for a particular event, but not for the spacecraft as a whole. It is the same as specifying geographic coordinates for a human. Just add the coordinates as a qualifier to the relevant event. Like this or this. — Ivan A. Krestinin (talk) 13:49, 19 September 2021 (UTC)


spouse (P26) duplicate statements

Hi Ivan,

What do you think of Wikidata:Bot_requests#Merge_multiple_P26_statements? Didn't your bot merge some statements? --- Jura 21:15, 22 September 2021 (UTC)[reply]

Hi Jura, usually my bot does not clean up such duplicates because the values have different qualifiers. I started a special job for this case. It is in progress now. — Ivan A. Krestinin (talk) 23:01, 22 September 2021 (UTC)
✓ Done, please check the remaining 18 items manually. The bot failed to resolve the data conflicts in them. — Ivan A. Krestinin (talk) 21:49, 23 September 2021 (UTC)

Good afternoon. Some strange update has arrived here. The first and second items lack the second value, and the third has no property at all. It is like this for almost all of them. It feels as if the bot generated the report back on 17 September but published it only now, on 28 September. 185.16.139.123 20:52, 28 September 2021 (UTC)

DOI format restriction

Hi, I noticed something strange at the DOI property, maybe you can identify the root of the problem? As you can see in Wikipedia in Health Professional Schools: from an Opponent to an Ally (Q108747926), the DOI property is falling under a format restriction, [a-z]*. Not sure how to fix it. Good contributions, Ederporto (talk) 00:13, 30 September 2021 (UTC)[reply]

Hello, just use upper case: [7] — Ivan A. Krestinin (talk) 16:09, 30 September 2021 (UTC)
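The advice to use upper case can be illustrated with a short sketch. It assumes only what the thread states, namely that a DOI value containing lower-case letters (the `[a-z]` pattern) is flagged and that the fix is to upper-case it. The function names and the sample DOI are hypothetical.

```python
import re

# Pattern mentioned in the thread: any lower-case ASCII letter triggers the warning.
LOWERCASE = re.compile(r"[a-z]")


def has_lowercase(doi: str) -> bool:
    """Return True if the DOI would trip a check that rejects lower-case letters."""
    return LOWERCASE.search(doi) is not None


def normalize_doi(doi: str) -> str:
    """Upper-case the DOI and strip surrounding whitespace."""
    return doi.strip().upper()


sample = "10.1234/abc.Def-5678"  # hypothetical DOI, not a real identifier
print(has_lowercase(sample), normalize_doi(sample))
```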

Men's basketball

The bot is adding "men's basketball" to male basketball players. This property is not for individuals; it is for clubs, teams, competitions. Therefore all those bot contributions create an exclamation mark (!) which can be avoided by stopping this activity. When I remove the thingy from the individual sportsmen's items, the bot comes and adds it again! (Now I used an exclamation mark. :) Cheers. --E4024 (talk) 15:23, 3 October 2021 (UTC)[reply]

Hi, could you provide a link to a sample edit? — Ivan A. Krestinin (talk) 19:26, 3 October 2021 (UTC)
Ömer Faruk Yurtseven (Q18129444) and many others... --E4024 (talk) 23:13, 3 October 2021 (UTC)
This bot behavior was caused by this edit. I deleted the constraint, so the bot will not make such edits anymore. I also added a "conflicts with" constraint for better protection. What should we do with the existing values of competition class (P2094) in human (Q5) items? 1. Delete all such values. 2. Delete only those added by my bot. 3. Move all values to sports discipline competed in (P2416). 4. Something else? — Ivan A. Krestinin (talk) 16:30, 4 October 2021 (UTC)

Merge overlaps

Hi. I am trying to work out how we can avoid reporting different versions of a work as duplicates, without having to put them on a "do not merge" list or mark them as different. Example

Here we have the parent work and its respective versions or translations across wikis, and they are listed on the parent. We expect this to become a widespread situation as more and more works are transcribed. Would it be worthwhile not to list items as duplicates when they are both listed on the parent with has edition or translation (P747)? Thanks for the consideration.  — billinghurst sDrewth 23:47, 10 October 2021 (UTC)

Hello, it is better to ask User:Pasleim about this. He is the author of the reports. Adding different from (P1889) may possibly help. — Ivan A. Krestinin (talk) 17:09, 11 October 2021 (UTC)

Good afternoon. The GeoNames reports are sheer hell; it would take two or three lifetimes to sort them out. Could you check that the settings are accurate? As an option, split the lists by country, call for volunteers… In short, the problem is not going to solve itself. 194.50.15.241 05:35, 12 October 2021 (UTC)

Greetings, the same kind of hell is found here in almost every popular property. Unfortunately I do not have enough capacity for all properties. You can find the report generation settings at Property:P1566. Less widespread problems, for example incorrect format, are easier to fix by hand; there are only 10 items there. For mass problems you can try to identify groups of errors and propose some automated procedures for fixing them. You can get help with this at Wikidata:Bot requests. — Ivan A. Krestinin (talk) 05:59, 12 October 2021 (UTC)
I was more interested in whether your bot could sort the problems by object type (rivers, lakes, mountains) and by country (Russia, CIS). Then it would become feasible to work through them. 194.50.15.241 20:13, 12 October 2021 (UTC)
Yes, this can be achieved by adding the group by (P2304) property. See how it is done, for example, here: Property:P1538. There is one unpleasant catch, though: it is not possible to group by two properties at once. It is either by country or by type. You can also try to build a custom report using SPARQL. — Ivan A. Krestinin (talk) 20:48, 12 October 2021 (UTC)
Anonymous users are not allowed to edit properties. Could you help? Types: lake (Q23397), river (Q4022), mountain (Q8502); countries: Russia (Q159), Ukraine (Q212), Belarus (Q184). 194.50.15.241 18:48, 13 October 2021 (UTC)
I added grouping by country. But it would be better to create an account and continue yourself. Or use SPARQL. An example that retrieves all items in Russia that have more than one code:
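# Items with country (P17) = Russia (Q159) that have more than one GeoNames ID (P1566)
SELECT ?item ?itemLabel
WHERE
{
	{
		SELECT DISTINCT ?item {
			?item wdt:P1566 ?value1 .
			?item wdt:P1566 ?value2 .
			?item wdt:P17 wd:Q159 .
			# Two different values on the same item mean more than one code
			FILTER( ?value1 != ?value2 ) .
		}
	} .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "ru,en" } .
}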
Ivan A. Krestinin (talk) 21:33, 14 October 2021 (UTC)[reply]
Thank you. As I understand it, the grouping will appear when the report is updated? It is currently dated 8 October. 194.50.15.241 03:18, 16 October 2021 (UTC)
Yes, at the next update. The Wikidata database has grown a lot recently; unfortunately the bot now needs about ten days to generate the next version of the reports. SPARQL is more convenient in this respect. — Ivan A. Krestinin (talk) 15:14, 16 October 2021 (UTC)

Q10497835

Is it possible to reverse this replacement? Eurohunter (talk) 15:48, 15 October 2021 (UTC)

✓ Done — Ivan A. Krestinin (talk) 09:53, 16 October 2021 (UTC)

An actress changes sex and becomes an actor... :-)

Hello Ivan. In the item Maurane (Q509029), a Belgian female singer, KrBot repeatedly changes another of her occupations, actress, to its male counterpart, "actor". I don't know why, but could you please fix this? Thanks a lot in advance: Tatvam (talk) 16:10, 15 October 2021 (UTC)

@Tatvam: This processing is quite intentional. The data object actor (Q33999) does not describe only male actors; it also covers female actors, i.e. actresses. The item is no different from the data object singer-songwriter (Q488205), which also describes both female and male persons. --Gymnicus (talk) 18:45, 15 October 2021 (UTC)
Thank you for your answer, but I used the data object actress (Q21169216), not actor (Q33999), and I would like it to stay like that. It is KrBot which repeatedly changes actress (Q21169216) to actor (Q33999) without reason. Tatvam (talk) 18:57, 15 October 2021 (UTC)
@Tatvam: If you do not want this change, you would have to raise this concern on the discussion page of occupation (P106), because that is where the bot is asked to make these changes. --Gymnicus (talk) 19:19, 15 October 2021 (UTC)
Hello, @Tatvam:, KrBot makes the changes because Property talk:P106 has an {{Autofix}} template for this value. Please discuss the case on Property talk:P106 and delete the autofix if required. — Ivan A. Krestinin (talk) 10:47, 16 October 2021 (UTC)

Problematic bot

Your bot (KrBot) is replacing proper item (Q4354683) with some nonsensical disambiguation page (Q3537858), tens of pages are affected. Please stop that. --Orijentolog (talk) 18:17, 15 October 2021 (UTC)[reply]

@Orijentolog: Ivan can now do very little about these changes. The bot is programmed to resolve redirect links, and the two data objects mentioned were merged by a user on May 18, 2021 and only separated from each other on October 4th. In the meantime, the bot did its job and replaced the redirect links. The bot cannot see that the merge was wrong. --Gymnicus (talk) 18:41, 15 October 2021 (UTC)
Thanks for the info, it's mostly OK now because I fixed most mistakes manually. I just want to be sure that bot won't repeat the same mistakes. Greetings to both. :) Orijentolog (talk) 18:45, 15 October 2021 (UTC)[reply]
@Orijentolog: The bot should not change it back, since it is no longer a redirect link and in principle the bot ignores these links now. If such changes happen, then something is wrong with the programming. --Gymnicus (talk) 19:06, 15 October 2021 (UTC)
Hi, Orijentolog, I reverted the bot's changes. The bot waits for some time before resolving redirects. But a wrong merge that exists for a long time creates issues not only for my bot. Humans and other bots use the target item only. So reverting an old wrong merge requires reviewing all links in any case. — Ivan A. Krestinin (talk) 14:35, 16 October 2021 (UTC)

Lexeme constraint

Did you see Property talk:P1296#Lexeme language? FogueraC (talk) 07:43, 16 October 2021 (UTC)[reply]

Hi, I have too many notifications, so I did not see the {{Ping}} mention. Sorry. — Ivan A. Krestinin (talk) 15:10, 16 October 2021 (UTC)
No problem. And thanks! FogueraC (talk) 15:59, 16 October 2021 (UTC)[reply]

detected wrong merge

I just discovered a faulty merge, which unfortunately also led to edits by your bot. Could you see if you can undo the edits made by your bot where it changed the data object Chowdhury (Q30971895) to Chaudhry (Q1068345)? --Gymnicus (talk) 22:59, 16 October 2021 (UTC)

I've created a new item - Q108911685 for non-Latin surnames. Probably it also needs separation. --Infovarius (talk) 23:39, 16 October 2021 (UTC)[reply]
@Infovarius: Thank you, I also see that as useful. But shouldn't we also separate the individual languages (Bengali, Nepalese, Urdu and Newari)? At least the names look very different to me as a layman who has no idea about these languages. --Gymnicus (talk) 11:07, 17 October 2021 (UTC)
I reverted my bot edits. — Ivan A. Krestinin (talk) 01:56, 17 October 2021 (UTC)[reply]
Thank you very much --Gymnicus (talk) 11:07, 17 October 2021 (UTC)[reply]

Hi;

Why have you removed this property? I would ask you to undo the change, please. —Ismael Olea (talk) 10:58, 31 October 2021 (UTC)

Hi, both have too many violations (more than four hundred). That is too many for a mandatory constraint (Q21502408). This flag was created for monitoring and manually fixing a small number of unexpected cases. But the mechanism is broken now. Wikidata:Database reports/Constraint violations/Mandatory constraints/Violations stopped updating because its current size is ~7 Mb (the page size limit is 2 Mb). — Ivan A. Krestinin (talk) 11:26, 31 October 2021 (UTC)

Why is KrBot removing Swedish Open Cultural Heritage URI (P1260) like this? Swedish Open Cultural Heritage URI (P1260) is allowed to have duplicate values. /ℇsquilo 13:40, 4 November 2021 (UTC)[reply]

Why would you want a duplicate value? — Martin (MSGJ · talk) 15:14, 4 November 2021 (UTC)[reply]
Several absolutely equal values are a mistake in most cases. I can add Swedish Open Cultural Heritage URI (P1260) as an exception. But maybe it is better to add some qualifier to the values? For example applies to part (P518). Currently the values look very strange to humans as well. It looks like Wikidata has a single item for the lighthouse, and the Swedish database has a single record too. It is not obvious why the identifier should be specified twice. — Ivan A. Krestinin (talk) 15:24, 4 November 2021 (UTC)
I guess the squirrel didn't notice that the values were exactly the same! — Martin (MSGJ · talk) 18:09, 4 November 2021 (UTC)[reply]
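The distinction drawn in this thread, between absolutely equal values and values that differ by a qualifier such as applies to part (P518), can be sketched as follows. This is an illustration under assumed data shapes, not KrBot's code: the statement dictionaries, the function name and the sample identifier are hypothetical.

```python
from typing import Dict, List


def remove_exact_duplicates(statements: List[Dict]) -> List[Dict]:
    """Drop statements whose value and qualifiers both repeat an earlier one.

    Each statement is assumed to look like
    {"value": "...", "qualifiers": {"P518": ["Q1", ...]}}.
    Statements with the same value but different qualifiers are kept.
    """
    seen = set()
    kept = []
    for statement in statements:
        qualifiers = statement.get("qualifiers", {})
        # Build an order-independent, hashable key from value and qualifiers.
        key = (
            statement["value"],
            tuple(sorted((prop, tuple(sorted(values)))
                         for prop, values in qualifiers.items())),
        )
        if key in seen:
            continue  # exact duplicate: same value, same qualifiers
        seen.add(key)
        kept.append(statement)
    return kept


claims = [
    {"value": "example/id/1"},                                # hypothetical identifier
    {"value": "example/id/1"},                                # exact duplicate, dropped
    {"value": "example/id/1", "qualifiers": {"P518": ["Q1"]}},  # kept: qualifier differs
]
print(len(remove_exact_duplicates(claims)))
```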

Constraint Violation Statistics

Hi Ivan,

I'm currently conducting research on the constraints violations of Wikidata and I have found your bot KrBot2. My question is if the queries/scripts for violations counting are available in some git repo, or if there is another way to get them. Thank you for your time!

Cheers, Nicolas

Hi Nicolas, the code is currently located in a private repo. The code loads full and incremental Wikidata dumps. This process takes ~9 days and requires a significant amount of memory. So I am not sure that the code will be useful for your task. Maybe this report will be enough for your research. — Ivan A. Krestinin (talk) 14:36, 5 November 2021 (UTC)

Hi! Why did the bot make this edit? The gallery and the category are two completely different concepts; existence of the one doesn’t imply the existence of the other, and even if they both exist, their names may not match (cf. c:Category:Moscow vs c:Москва). Now no statement carries the information that there’s a gallery about Evolution. And I don’t even see anything on Property talk:P935 that would instruct the bot to do so, so I can’t stop it. (Which is unfortunately often an issue with your bots: not open source so I can’t look at the source, often unclear edit summaries, no community control over certain tasks.) —Tacsipacsi (talk) 14:07, 7 November 2021 (UTC)

Hello, Evolution (Q336251) does not have a gallery on Commons. Evolution (software) is a redirect to the category. I agree, the edit summary is a bit confusing in this case. — Ivan A. Krestinin (talk) 16:24, 7 November 2021 (UTC)
I see. Yes, please use an appropriate edit summary in this case—the bot didn’t move the statement value, it removed it because it was no longer appropriate. As I explained above, I can’t even imagine a case where this edit summary was appropriate assuming the edits themselves are correct. —Tacsipacsi (talk) 00:55, 8 November 2021 (UTC)[reply]

Update on constraints reports

Nice bot. Any chance to get an update on the constraints report for Norwegian historical register of persons ID (P4574)? I might report a few more properties with recently added constraints soon assuming that's ok. Thanks. --Infrastruktur (T | C) 07:45, 18 November 2021 (UTC)[reply]

Hello, the bot does not touch a report if the only change is the item count or the date. I updated it manually. The bot did not detect any constraint violations. — Ivan A. Krestinin (talk) 22:07, 18 November 2021 (UTC)

How to undo the consequences of an incorrect merge?

Hi,

There was an incorrect merge of Alsace wine (Q80114014) and Alsatian Vineyard Route (Q1334019) (done by Andrew Dalby and undone by Jon Harald Søby last October, thanks).

But I noticed today that your bot had replaced the first with the second, which leaves the current situation a mess: 700+ wines are now defined as a road... (and this also causes 700+ constraint violations), see Q41437058 for one example.

My question: is there a simple way to undo the replacement? (or at least to find the list to do an overwrite with Quickstatements).

Cheers, VIGNERON (talk) 12:45, 20 November 2021 (UTC)[reply]

Sorry about that. It seemed useful at the time and I had no idea that this chain reaction would happen. Andrew Dalby (talk) 14:05, 20 November 2021 (UTC)[reply]
@Andrew Dalby: no problem, it happens... that's why I'm very careful when merging, there can quickly be dire consequences, but errare humanum est, so shikata ga nai. Cheers, VIGNERON (talk) 08:14, 21 November 2021 (UTC)
Thanks for your reply, @VIGNERON:. Thinking it over, I guess one should consider before merging whether the pages have, or ought to have, the same instance of (P31) values. It will normally be true. But in this case it wouldn't have been true, and that would have been a warning. Andrew Dalby (talk) 15:38, 23 November 2021 (UTC)[reply]
Hello, I rolled back the link changes. Previously links like this were also used for rollback. See User_talk:Ivan_A._Krestinin/Archive#suggestion: using edit groups for solving redirects. for details. @Pintoch:, @Pasleim: currently the edit group tool shows an 'Edit group "Q80114014_Q1334019" not found' error. Is it a bug? — Ivan A. Krestinin (talk) 22:00, 20 November 2021 (UTC)
@Ivan : yes, I noticed this error too, I was wondering if it was just me or not. Anyway, thanks for the quick answer and I'll let you look into it. Cheers, VIGNERON (talk) 08:14, 21 November 2021 (UTC)[reply]
@VIGNERON, Ivan A. Krestinin: thanks for the notification. KrBot still seems to have its edits tracked by EditGroups (https://editgroups.toolforge.org/?user=KrBot) but somehow this batch seems to have been missed, it is not clear to me why. I will look into the problem. − Pintoch (talk) 14:36, 24 November 2021 (UTC)[reply]

Unitless range constraint (Q21510860) for united quantity isn't checked "correctly"

Many properties are naturally expressed with units, but have unitless range constraints. (This should be deprecated and fixed, but that's another discussion).

For example, duration (P2047) has a constraint which Wikibase interprets as meaning that the maximum allowed duration is a billion seconds.

KrBot2 currently doesn't list violations of the billion-seconds constraint, such as 70 years or more after author(s) death (Q29870196). Wikibase does. (In this case, the constraint is inappropriate and should be removed, IMHO).

If it's not too hard to do, it would be nice if KrBot2 and Wikibase could agree on how to interpret such constraints.

Streetmathematician (talk) 13:29, 22 November 2021 (UTC)[reply]

Originally the "Range" constraint checks the value only. Units are ignored by this constraint. It looks like it was reimplemented in Wikibase using a somewhat strange normalization algorithm. I agree that some properties may require taking units into account. But there are few examples of such cases. The use of the "Range" constraint on duration (P2047) looks like an error. For example, the lifetime of the Sun is more than 1000000000 seconds. I fixed the constraint. The reason why units are not supported by the Range constraint is very simple: conversion from one unit to another might be non-trivial, for example Mach number (Q160669) -> kilometre per hour (Q180154). Also, the set of all units used on Wikidata is not defined. I can implement unit support for some specific cases, but not for all possible units. — Ivan A. Krestinin (talk) 17:05, 23 November 2021 (UTC)
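The two interpretations discussed in this thread can be compared in a short sketch. It is only an illustration, neither KrBot2's nor Wikibase's implementation: the function names, the plain-string unit names and the small conversion table are assumptions, and a year is taken as 365.25 days.

```python
from typing import Optional

# Assumed conversion table covering only a few time units (seconds per unit).
SECONDS_PER_UNIT = {
    "second": 1.0,
    "minute": 60.0,
    "hour": 3600.0,
    "day": 86400.0,
    "year": 365.25 * 86400.0,
}


def in_range_ignoring_units(amount: float, minimum: float, maximum: float) -> bool:
    """Check only the numeric amount, as the original Range constraint is described."""
    return minimum <= amount <= maximum


def in_range_in_seconds(amount: float, unit: str,
                        minimum: float, maximum: float) -> Optional[bool]:
    """Normalize to seconds first; return None when the unit cannot be converted."""
    factor = SECONDS_PER_UNIT.get(unit)
    if factor is None:
        return None  # unknown or non-trivially convertible unit
    return minimum <= amount * factor <= maximum


# "70 years" against a maximum of one billion.
print(in_range_ignoring_units(70, 0, 1_000_000_000))      # amount only
print(in_range_in_seconds(70, "year", 0, 1_000_000_000))  # after normalization
```

With these assumptions the same value passes the unit-agnostic check and fails the normalized one, which is the disagreement reported above. The `None` result stands for units outside the table, the case raised in the reply about the undefined set of units.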

Please don't edit Q4233718

The edits to anonymous (Q4233718) by your bot are incorrect. Please make sure that your bot doesn't edit that item. Multichill (talk) 16:45, 27 November 2021 (UTC)[reply]

Fixed: [8], [9], [10], [11], [12]. — Ivan A. Krestinin (talk) 21:11, 27 November 2021 (UTC)[reply]

Hello! Your bot used the old dead link vwo.osm.rambler.ru for this list: Wikidata:Database reports/Constraint violations/P884. Машъал (talk) 11:25, 8 December 2021 (UTC)

Greetings, the bot always takes the first value of the formatter URL (P1630) property, regardless of its rank. That is the limitation. I worked around the problem by swapping the patterns. Come to think of it, what is the point of keeping outdated patterns on the property page... History belongs on the history page... — Ivan A. Krestinin (talk) 22:33, 10 December 2021 (UTC)
Thank you. I do not know why either, especially since the link is dead. But someone set it up that way; perhaps the rules require it? Машъал (talk) 19:06, 14 December 2021 (UTC)
No, there are no special rules about this; someone simply did not want to delete the outdated link. — Ivan A. Krestinin (talk) 21:18, 14 December 2021 (UTC)
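The limitation described in this thread, taking the first formatter URL regardless of rank, can be contrasted with a rank-aware choice. This sketch does not reproduce the bot's code: the tuple representation, the function names and the example URLs are hypothetical, and `best_ranked_value` is one possible alternative rather than existing behavior.

```python
from typing import List, Optional, Tuple

# Assumed representation: (formatter URL, rank), where rank is one of
# "preferred", "normal" or "deprecated".
Statement = Tuple[str, str]


def first_value(statements: List[Statement]) -> Optional[str]:
    """Behavior described in the thread: take the first value, ignoring rank."""
    return statements[0][0] if statements else None


def best_ranked_value(statements: List[Statement]) -> Optional[str]:
    """Alternative: prefer 'preferred', then 'normal'; never use 'deprecated'."""
    for wanted in ("preferred", "normal"):
        for url, rank in statements:
            if rank == wanted:
                return url
    return None


formatters = [
    ("https://old.example/$1", "deprecated"),  # hypothetical dead link listed first
    ("https://new.example/$1", "normal"),
]
print(first_value(formatters))
print(best_ranked_value(formatters))
```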

Scholarly article duplicates

Hi - I've gone through the list at User:Ivan A. Krestinin/To merge/Scholarly articles as it was a few weeks ago, and merged a large number of them (over 1000). However I see the list has been updated. Would it be possible to sort this list by items recently created (for example Q109?????? duplicates at the top, etc.) Or is the list possibly incomplete and items might be added that were created a long time ago but weren't caught by your checks yet? ArthurPSmith (talk) 21:29, 13 December 2021 (UTC)[reply]

Oh - I just realized you put it in a sortable table so I could have sorted on Qid from the start! Anyway, I guess I'll wait for the next update to this to see what I may have missed. I've contacted one person who was creating duplicates and that seems to have ceased, so hopefully we won't get so many going forward. ArthurPSmith (talk) 21:35, 13 December 2021 (UTC)[reply]
Hi Arthur, you did a great job, thank you! The list is incomplete, of course. It is limited by size (1 Mb). The full report size is 36 Mb now. The bot sorts the report by an internal rank, so the end of the full report contains mostly false positives. I update the report after Wikidata:Database reports/Constraint violations/P356 and some other pages are updated. — Ivan A. Krestinin (talk) 21:17, 14 December 2021 (UTC)
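Why the published list is incomplete can be shown with a small sketch of truncating a ranked report to a size limit. The internal rank is not described in the thread, so the numeric score here is an assumption (higher means more likely a true duplicate), as are the function name and the row format.

```python
from typing import List, Tuple


def truncate_report(rows: List[Tuple[float, str]], max_bytes: int) -> List[str]:
    """Keep the highest-ranked rows that fit into max_bytes of UTF-8 text.

    Each row is (rank, text); one extra byte per row accounts for a newline.
    """
    kept: List[str] = []
    used = 0
    for _rank, text in sorted(rows, key=lambda row: row[0], reverse=True):
        size = len(text.encode("utf-8")) + 1
        if used + size > max_bytes:
            break  # everything after this point is lower ranked and omitted
        kept.append(text)
        used += size
    return kept


rows = [(0.2, "c"), (0.9, "a"), (0.5, "b")]  # hypothetical ranks and rows
print(truncate_report(rows, 4))
```

Under this reading, the rows that are cut off are the lowest-ranked ones, consistent with the remark that the end of the full report is mostly false positives.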

Database reports/identical birth and death dates

The valuable report Wikidata:Database reports/identical birth and death dates/1 seems to contain a lot of matches with 1 January for date of birth or date of death at the moment, most of which are probably spurious precision. Would it be worth suppressing these, as they very seldom represent an actual match ? Jheald (talk) 13:00, 23 December 2021 (UTC)[reply]

In my opinion, the best solution here is just to fix the wrong values. Did you contact the user who created the values with the wrong precision? Maybe they have some tool for fixing them. — Ivan A. Krestinin (talk) 22:26, 23 December 2021 (UTC)
You're probably right, that it's better to try to fix issues than just to hide them. I've been in touch with Ghuron, who created about 760 entries like this as part of a recent upload of data from The Righteous Among the Nations Database (Q77598447) at Yad Vashem. Some others may have been created back in 2014 by User:GPUBot (since blocked). There may also be others again, created in other uploads (cf https://w.wiki/4bcR - quite a diverse set of items); but with luck the situation should become clearer once the Yad Vashem ones are sorted out. Jheald (talk) 17:23, 24 December 2021 (UTC)[reply]
I started a task that fixes January 1 values from The Righteous Among the Nations Database (Q77598447), edit example: [13]. — Ivan A. Krestinin (talk) 08:31, 25 December 2021 (UTC)
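The "spurious precision" problem discussed here can be sketched with Wikibase time values, where precision 11 means day and 9 means year. The thread does not show how the bot's task works, so this is only one plausible shape of such a fix. The function names are hypothetical, and the sketch leaves the time string untouched and changes only the precision, which is a simplification.

```python
from typing import Dict

DAY_PRECISION = 11   # Wikibase precision code for "day"
YEAR_PRECISION = 9   # Wikibase precision code for "year"


def is_january_first(value: Dict) -> bool:
    """True for a day-precision date that falls on 1 January."""
    date_part = value["time"].lstrip("+-").split("T")[0]
    _year, month, day = date_part.split("-")
    return value["precision"] == DAY_PRECISION and month == "01" and day == "01"


def downgrade_to_year(value: Dict) -> Dict:
    """Return a copy with year precision if the date looks like a placeholder."""
    if not is_january_first(value):
        return value
    return {**value, "precision": YEAR_PRECISION}


suspicious = {"time": "+1901-01-01T00:00:00Z", "precision": 11}
print(downgrade_to_year(suspicious))
```

Applying such a change blindly would also affect people who really were born or died on 1 January, so a real task would presumably be restricted to a known source, as in the reply above.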

Autofix - P17

Hi, can you stop the automated edits of Catalonia to Spain? While I am totally against these edits as established by our Wikipedia community, Catalonia, like other nations such as Kurdistan, is not only in what is considered a sovereign state, but divided, in our case, in two (Spain and France). Therefore, it is possible that certain technically wrong edits may be made. Regards, --KajenCAT (talk) 10:12, 7 January 2022 (UTC)[reply]

Hello, just remove the {{Autofix|pattern=Q5705|replacement=Q29}} line from Property talk:P17. I do not know the situation with Catalonia in detail. Maybe it would be good to start a discussion on Property talk:P17 before or after removing the autofix template. — Ivan A. Krestinin (talk) 17:46, 7 January 2022 (UTC)
Thank you for your response. I will provisionally withdraw it and open the subject. Thank you again. KajenCAT (talk) 23:16, 7 January 2022 (UTC)[reply]

Remove audio podcast

Can you please remove distribution format: audio podcast from JRE episodes such as JRE #312 - Steve Rinella, Bryan Callen (Q109306593)? Most episodes are video podcasts and only very few of them are audio only. Germartin1 (talk) 10:41, 8 January 2022 (UTC)

Hello, I added a rollback task to the bot's task list. The bot should roll back 1420 items today or tomorrow. — Ivan A. Krestinin (talk) 18:16, 8 January 2022 (UTC)
✓ Done — Ivan A. Krestinin (talk) 19:44, 9 January 2022 (UTC)
Thanks, what about these ones, some of them are video podcasts https://www.wikidata.org/w/index.php?title=Q101011923&type=revision&diff=1408455138&oldid=1335982903 Germartin1 (talk) 11:28, 14 January 2022 (UTC)[reply]
✓ Done I also reverted the edits based on Spotify show ID (P5916). — Ivan A. Krestinin (talk) 01:10, 15 January 2022 (UTC)

Adding internet archive identifiers to items for people

Hello,

Can your bot stop adding an internet archive identifier to items for people such as Q110486431? Thank you. Gamaliel (talk) 16:59, 10 January 2022 (UTC)

Hello, just delete the line {{Autofix|pattern=<nowiki>https?://archive\.org/details/([0-9A-Za-z@][0-9A-Za-z._-]+)|replacement=\1|move_to=P724}}</nowiki> from Property talk:P973. This job was added by Jura1 several years ago. Maybe it would be good to discuss it with him. I also added one more conflicting value, which should prevent such edits as well. — Ivan A. Krestinin (talk) 17:48, 10 January 2022 (UTC)
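How a pattern/replacement pair like the one quoted above transforms a value can be sketched in a few lines. KrBot uses PCRE, so Python's `re` module is a stand-in here. It is also an assumption that the pattern must match the whole value, and the `apply_autofix` helper and the sample URL are hypothetical.

```python
import re
from typing import Optional

# Pattern and replacement copied from the Autofix line quoted in the thread.
ARCHIVE_PATTERN = r"https?://archive\.org/details/([0-9A-Za-z@][0-9A-Za-z._-]+)"
ARCHIVE_REPLACEMENT = r"\1"


def apply_autofix(value: str, pattern: str, replacement: str) -> Optional[str]:
    """Return the rewritten value, or None if the pattern does not match.

    Assumption: the pattern has to match the entire value.
    """
    match = re.fullmatch(pattern, value)
    if match is None:
        return None
    return match.expand(replacement)


# Hypothetical described-at URL; with move_to=P724 the result would become
# the value of the Internet Archive ID statement.
print(apply_autofix("https://archive.org/details/example_item",
                    ARCHIVE_PATTERN, ARCHIVE_REPLACEMENT))
```

The same helper covers plain item replacements such as `pattern=Q5705`, `replacement=Q29`, where the replacement contains no group reference.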

Good evening! I would like to invite you, as an engineer of the Russian Wikipedia, to WikiProject Russia. You can help the community integrate RuWiki data more closely into the common data bank, create the needed properties, and so on. MasterRus21thCentury (talk) 16:54, 18 January 2022 (UTC)

Greetings, if there are specific tasks, feel free to ask. I do not have much free time, but I may be able to handle some tasks. — Ivan A. Krestinin (talk) 17:01, 18 January 2022 (UTC)
For example, right now you can take part in the discussion of proposed Wikidata properties from Russian sources. MasterRus21thCentury (talk) 17:35, 18 January 2022 (UTC)

Closing Wikidata property proposals

Ivan, hi! Could you close the discussions on Wikidata property proposals? New properties have not been created since Monday, and 61 properties have accumulated that are awaiting a decision by an administrator or property creator. MasterRus21thCentury (talk) 17:12, 21 January 2022 (UTC)

Greetings, I create new properties rather rarely, so it is better to ask other users. I mainly specialize in automated procedures for maintaining data integrity and quality. — Ivan A. Krestinin (talk) 17:15, 21 January 2022 (UTC)

Many articles with PubMed ID = 9541661

Hi. I find there are 24 articles with PubMed ID = 9541661. e.g., Indirect (repeat) prescribing (Q84597236), The pharmaceutical industry (Q84597219). Can we recover the edits? And is this a one-time event? Kanashimi (talk) 06:23, 23 January 2022 (UTC)[reply]

Hello, it was a one-time task. The edit looks correct: both 19790808 and 19790797 were deleted by PubMed. PubMed marked them as duplicates of 9541661. It looks like all these IDs were merged because they actually belong to a single large work, and the Wikidata items correspond to chapters of this work. Usually we have no separate item for each chapter on Wikidata. I suggest merging all these items, following PubMed. — Ivan A. Krestinin (talk) 10:28, 23 January 2022 (UTC)
Thank you. Kanashimi (talk) 10:36, 23 January 2022 (UTC)[reply]

Taxonomy bug?

Hi, no idea how this happened, just reporting: https://www.wikidata.org/w/index.php?title=Q469652&diff=1564671901&oldid=1564583128&diffmode=source

Best, AdrianoRutz (talk) 12:52, 28 January 2022 (UTC)[reply]

regular constraint reports

Hi Ivan,

Seems constraints reports are much more frequently updated, almost daily. Excellent news. Thanks!

Maybe we should mention it on Wikidata:Status_updates/Next#Other_Noteworthy_Stuff --- Jura 08:22, 10 February 2022 (UTC)[reply]

Cool. I added a note to the weekly news. --- Jura 13:59, 11 February 2022 (UTC)[reply]
Update: after spending an extra $280 on more RAM, the update cycle is 9 hours now. In practice the update frequency is limited by the 24-hour period of incremental dump generation. — Ivan A. Krestinin (talk) 21:21, 15 February 2022 (UTC)

stats on of (P642) as qualifier by property

Maybe you have seen Property talk:P642.

I think it would be helpful to have statistics about the properties currently using it as qualifiers.

As there are 14 million uses, this is hard to do on query service.

I noticed the constraint report for P31 has them (197165).

Do you have a simple way to generate a summary for all properties (even those without allowed qualifier constraints, e.g. P279). --- Jura 12:01, 10 February 2022 (UTC)[reply]

It is not so simple, but I am thinking about a possible implementation. — Ivan A. Krestinin (talk) 16:33, 11 February 2022 (UTC)
In the meantime we got some approximation with the query Vojtěch provided.
Maybe stats on each pair property / qualifier could be interesting, beyond P642.
OTOH, the problematic ones might not necessarily be the most used ones. Personally, I think "applies to part" is the most problematic one. --- Jura 17:01, 11 February 2022 (UTC)
@Jura1: usage report: User:Jura1/P642 usage. — Ivan A. Krestinin (talk) 10:48, 13 February 2022 (UTC)[reply]

instance of (P31) removal of maintained by wikiproject

Why would you do this? Lectrician1 (talk) 01:11, 11 February 2022 (UTC)[reply]

Just because it throws an error and is not something commonly used. Question: can the discussed cases be fixed automatically? Or do they require non-trivial manual work? Maybe it is better to add {{Autofix}} or something like it? — Ivan A. Krestinin (talk) 16:42, 11 February 2022 (UTC)
Then why don't we just make it an allowed qualifier? I don't think we should autofix this stuff. Lectrician1 (talk) 17:47, 12 February 2022 (UTC)[reply]
I just applied the old principle: entities should not be multiplied beyond necessity. Is the qualifier used for some automated work? — Ivan A. Krestinin (talk) 21:09, 15 February 2022 (UTC)
@Ivan A. Krestinin It's to give people an idea about who to contact if they have questions about the constraint. A lot of the constraints are for managing the Wikiproject Music data model, which is complex, and new contributors might have questions about it. Lectrician1 (talk) 01:06, 16 February 2022 (UTC)
I added the qualifier to the ignored qualifiers list. Please roll back my edit. — Ivan A. Krestinin (talk) 21:41, 16 February 2022 (UTC)

identical dates and deprecated January 1

Hi Ivan,

As we kept getting entries with deprecated January 1 dates, I started listing them at False_positives#pairs_with_deprecated_January_1_date. I left some notes about it at #January_1_as_date.

Since then, more get created with deprecated rank directly added (sample: Q110842925#P569).

Accordingly, I'd filter any deprecated "January 1"-date by default. --- Jura 10:35, 11 February 2022 (UTC)[reply]

DOI removal

Good afternoon. Why does your bot remove DOI codes that, although non-working, are confirmed by a source? Among other things, they help avoid duplicate items. --INS Pirat ( t | c ) 05:11, 12 February 2022 (UTC)[reply]

Greetings. If anything, they make it harder to find and merge duplicate items. An article usually has only one correct code, whereas there can be any number of incorrect ones. As a result, two items describing the same article end up with different DOI codes. A mass cleanup of incorrect codes was carried out some time ago, and as a result the number of merged items already exceeds 50 thousand. Some information about this work: Property talk:P356#15138 wrong values. — Ivan A. Krestinin (talk) 07:07, 12 February 2022 (UTC)[reply]
You are talking about incorrect codes in general, not about the case at hand. And how exactly do they get in the way? If a duplicate item is created from the same source, the DOI will be flagged as already in use just the same (even though it does not work). Also, a work can have several valid codes too. --INS Pirat ( t | c ) 09:53, 12 February 2022 (UTC)[reply]
There were many pairs of items where one had the correct code and the other an incorrect one. A bot or a human saw two items with different codes and drew the logical conclusion: these are different articles and must not be merged. By the way, do you know where these incorrect codes come from? What produces such a large number of invalid values? Also, DOI is not the only identifier that can be used to look for duplicates. For the article we are discussing, duplicates can be found perfectly well using the valid value of Cairn publication ID (P4700). — Ivan A. Krestinin (talk) 16:10, 12 February 2022 (UTC)[reply]
I don't quite understand your position. Yes, it is not the only identifier. But I don't think that precludes using others. The fact is that the primary source gives a specific DOI. I recorded it accordingly (that is what deprecated rank and qualifiers are for). If, I repeat, a work has more than one working DOI, the situation is the same as the one you describe. --INS Pirat ( t | c ) 16:54, 12 February 2022 (UTC)[reply]
The position is simple: if we turn Wikidata into a collection of errors (even appropriately marked ones), it becomes extremely hard to perform even simple operations such as finding duplicate items. The problem is made worse by the fact that some users start mass-marking correct codes with deprecated rank and setting the same qualifiers. That is when complete hell breaks loose. Let's just not drag invalid values into Wikidata without a particular need. The fact that we can do it does not mean that we must. — Ivan A. Krestinin (talk) 17:24, 12 February 2022 (UTC)[reply]
Why would anyone do that, let alone en masse? Where have you seen this? Besides, users' actions should not affect whether information is admissible. And it is still unclear what obstacles to finding duplicates you see (as I said, it should help instead). More broadly: bots should not touch deliberately added, non-vandal information (say, with ranks/qualifiers) at all, or at the very least not again after being reverted. --INS Pirat ( t | c ) 20:45, 12 February 2022 (UTC)[reply]
See, for example, these edits: [14], [15], [16]. It is a different identifier there, not DOI, but the point is the same. Let me describe the whole story in detail. Duplicates among items describing scholarly articles have been and continue to be uploaded by the thousands. I decided to take up mass merging. The work is important: with this much data the SPARQL engine will die in the near future, and cleaning out duplicates will postpone that at least a little. The main danger in this work is merging too much, because reverting one wrong merge is hard, and reverting a couple of hundred wrong merges is a disaster. So the algorithms have to be highly paranoid: at the slightest difference the merge must be aborted. Relying on identifier ranks does not work here, because the "deprecated" rank is assigned fairly randomly (see the examples above). The bot ran successfully and merged about 10,000 pairs of items. After that I started analysing the cases where the bot did not merge similar items. It turned out that in a large number of cases one item had a correct code and the other an invalid one, or both had invalid codes. This was not limited to DOI, but DOI was one of the most informative and most "polluted" identifiers. Then a long story of cleaning out invalid codes began. Some of the broken codes were deleted as complete garbage of unknown origin that did not even look like a DOI in format. Then codes turned up that looked like DOIs in format but were not DOIs. I had to arrange with CNRI, the organization that maintains this identifier, to validate all 27 million codes we have. The bot ran for more than a month, but in the end it cleaned out almost 80 thousand codes that are not DOIs. After all this work, more than 50 thousand pairs of items have been merged, and the work continues. Just today the bot merged more than 2,000 pairs of items.
As for not repeating changes that someone has reverted: on the one hand I agree with you, it would probably be great to work that way. But there are three problems: 1. it is technically quite complex, and technically complex systems usually contain many bugs and are therefore prone to invalid behaviour; 2. many types of errors are repeated many times by different users; 3. on Wikidata one has to operate on tens of thousands, if not millions, of items, and there is no way to manually fix the cases where reverts occurred. — Ivan A. Krestinin (talk) 22:47, 12 February 2022 (UTC)[reply]
That is somewhat orthogonal to the question of relying on sources, but it is more convincing. (Why didn't you start with this right away?)
1) Can the algorithm merge items where one has a correct DOI and the other has none (and no other unique identifier)? 2) Can the algorithm merge items with different but valid DOIs? --INS Pirat ( t | c ) 17:45, 17 February 2022 (UTC)[reply]
It is pointless to rely on a source when it gives an obviously erroneous identifier. A typo, data pulled in incorrectly; all sorts of things happen at large volumes. Should we now turn into a collection of every error in the world... 1) If an item has no identifier, the bot can still find a match by title (P1476). 2) No, the algorithm currently relies only on exact matches. There is also a report of what looked similar to the bot but which it "did not dare" to merge automatically: User:Ivan A. Krestinin/To merge/Scholarly articles. — Ivan A. Krestinin (talk) 20:20, 17 February 2022 (UTC)[reply]
doi.org uses a very high-quality dataset, but like any dataset it contains some errors. Is there a way to mark an "invalid" DOI (P356) so that the bot skips it? For example, How much of the solar system should we leave as wilderness? (Q63858167) has DOI (P356) 10.1016/J.ACTAASTRO.2019.03.014, and we can verify this by following this link, although https://doi.org/10.1016/J.ACTAASTRO.2019.03.014 returns 404.
The same goes for Photometric and spectroscopic observations of the neglected near-contact binary Cl* Melotte 111 AV 1224 (Q68976229), which has DOI (P356) 10.1088/1538-3873/AAD5D9 (see https://iopscience.iop.org/article/10.1088/1538-3873/aad5d9).
Perhaps some reason for deprecated rank could be used? Ghuron (talk) 12:07, 17 October 2022 (UTC)[reply]
Colleague, take a look at the history of Randomization-based inference for Bernoulli trial experiments and implications for observational studies. (Q49942301). What you and I are doing is pointless. Ghuron (talk) 07:14, 23 October 2022 (UTC)[reply]
Colleague, I understand that you get many messages, but please take note of this request. Ghuron (talk) 08:27, 27 December 2022 (UTC)[reply]
Yes, I apologize, I really did not notice the messages. I added codes that are correct from DOI's point of view to these items. Usually, when a situation like this arises, I search Google for the exact article title plus "DOI". The correct code is usually among the first few results. — Ivan A. Krestinin (talk) 16:52, 27 December 2022 (UTC)[reply]
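
The "paranoid" matching described in this thread (merge only on exact matches, abort at the slightest difference) can be illustrated with a minimal Python sketch. This is not KrBot's code, which is unpublished; the function names `normalize_doi` and `can_merge`, the dict-of-sets item model and the sample DOIs are all assumptions made for illustration.

```python
from __future__ import annotations


def normalize_doi(doi: str) -> str:
    """Upper-case a DOI so that case differences alone do not block a match."""
    return doi.strip().upper()


def can_merge(a: dict[str, set[str]], b: dict[str, set[str]]) -> bool:
    """Allow a merge only if at least one identifier property is shared and
    every shared property carries exactly the same set of values."""
    shared = a.keys() & b.keys()
    if not shared:
        return False  # nothing to compare: leave the pair for manual review
    return all(a[prop] == b[prop] for prop in shared)


# Hypothetical items keyed by property ID (P356 = DOI).
correct = {"P356": {normalize_doi("10.1000/example")}}
same = {"P356": {normalize_doi("10.1000/EXAMPLE")}}
wrong = {"P356": {normalize_doi("10.1000/exampel")}}

print(can_merge(correct, same))   # same DOI after normalization
print(can_merge(correct, wrong))  # a single differing code blocks the merge
```

Under this rule a single invalid DOI on one item of a duplicate pair is enough to prevent the merge, which is the effect described above.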

Regarding Wikidata Q3105247

You recently changed the SBN author ID (P396) from IT\ICCU\BVEV\090371 to BVEV090371, but now the authority control of en.Wikipedia says: "The ICCU id BVEV090371 is not valid". Why?--Ruotailfoglio (talk) 16:29, 20 February 2022 (UTC)[reply]

Hello, it is most probably some kind of cache. I do not see any marks at Q3105247#P396. Please try F5 and Ctrl+F5 in your browser; maybe it is the browser cache. Or wait for a day or two. — Ivan A. Krestinin (talk) 16:33, 20 February 2022 (UTC)[reply]
Thank you! Ruotailfoglio (talk) 18:24, 21 February 2022 (UTC)[reply]
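
For illustration, the value change discussed here can be expressed as a small Python sketch. The pattern is inferred only from the single example in this thread (IT\ICCU\BVEV\090371 becoming BVEV090371); the function name `to_short_sbn` is hypothetical, and this is not the rule KrBot actually runs.

```python
import re

# Old-style value: the literal prefix IT\ICCU\ followed by two
# backslash-separated parts (an assumption based on one example).
OLD_STYLE = re.compile(r"^IT\\ICCU\\([^\\]+)\\([^\\]+)$")


def to_short_sbn(value: str) -> str:
    """Drop the IT\\ICCU\\ prefix and join the remaining two parts.
    Values that do not match the old style are returned unchanged."""
    match = OLD_STYLE.match(value)
    if match is None:
        return value
    return match.group(1) + match.group(2)


print(to_short_sbn("IT\\ICCU\\BVEV\\090371"))
```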

Reverting an Autofix?

Hello,

Last week a mistaken {{Autofix}} was added to platform (P400) (replacing personal computer (Q16338) with Microsoft Windows (Q1406)), and KrBot duly applied the replacement on thousands of items. Is there a way to revert all these autofixes? See Property_talk:P400#Autofixes for details. Thanks! Jean-Fred (talk) 20:29, 21 February 2022 (UTC)[reply]

Did you review the source? Arlo Barnes (talk) 22:12, 24 February 2022 (UTC)[reply]

Very quickly. Skeleton is not a sex or gender. It looks like an inappropriate value for sex or gender (P21); please see the property constraints. — Ivan A. Krestinin (talk) 23:30, 24 February 2022 (UTC)[reply]

Code sample

Hi Ivan, hope you're having a great day :) Any hope of seeing the code that Krbot2 uses to update the constraint violation reports? I'd really appreciate it. Joseph202 (talk) 17:17, 1 March 2022 (UTC)[reply]

Hi Joseph, the code has not been published. The bot is written in C++. The constraint report update task shares a lot of code with other wiki-related tasks (200+ tasks). Please write me an email and I can send some parts. Or ask questions here if you actually need implementation details rather than the code. — Ivan A. Krestinin (talk) 03:54, 2 March 2022 (UTC)[reply]
@Ivan A. Krestinin: Thank you for your reply, actually, I want to use the code on a third-party installation of wikibase, that's why I was asking Joseph202 (talk) 17:18, 6 May 2022 (UTC)[reply]
The bot works with dumps. Do you have plans to provide dumps in a compatible format? What amount of data is planned? We can think about connecting my bot to your project too. — Ivan A. Krestinin (talk) 18:47, 6 May 2022 (UTC)[reply]
@Ivan A. Krestinin: Currently, the only way we get/generate dumps is via our Special:DataDump, although there seems to be a way via the API that I haven't tried before.
But you can have a look if you wish. Joseph202 (talk) 20:40, 6 May 2022 (UTC)[reply]
@Ivan A. Krestinin: Hi, I trust you're having a great day. Per the above, we actually get dumps via the Special:EntityData special page, and we can also get them in the formats that Wikidata provides. How can we begin to work on this?
Hope to hear from you soon. Joseph202 (talk) 08:26, 12 May 2022 (UTC)[reply]
Hi Joseph, I took a look at the project. One problem is the different identifiers. For example, P31, P279 and Q21502404 are all hard-coded now; some parametrization is needed. Another question is project size. The constraints system requires some effort to deploy and maintain. I am not sure that it is reasonable to use it on small projects. Maybe it is better to focus on increasing the data volume first. — Ivan A. Krestinin (talk) 19:15, 12 May 2022 (UTC)[reply]
Hello, thanks for replying.
Yes, the IDs are different; is it not possible to configure it to fit Gratisdata? And yes, there are over 3000 data items available currently and still counting; is that not considerable in terms of volume?
I'd love to hear from you Ivan, thank you! Joseph202 (talk) 18:37, 13 May 2022 (UTC)[reply]
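
The parametrization mentioned above can be pictured with a short Python sketch. The bot itself is written in C++ and its code is unpublished, so this only illustrates the idea: the class name `WikibaseIds`, its field names and the third-party IDs (P1, P2, P3, Q10) are hypothetical, while P31, P279, P2302 and Q21502404 are the Wikidata IDs named on this page.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WikibaseIds:
    """Entity IDs that would otherwise be hard-coded; defaults are Wikidata's."""
    instance_of: str = "P31"
    subclass_of: str = "P279"
    property_constraint: str = "P2302"
    format_constraint: str = "Q21502404"


WIKIDATA = WikibaseIds()

# A third-party Wikibase numbers its entities differently (made-up IDs).
OTHER_WIKI = replace(
    WIKIDATA,
    instance_of="P1",
    subclass_of="P2",
    property_constraint="P3",
    format_constraint="Q10",
)


def is_instance_statement(property_id: str, ids: WikibaseIds) -> bool:
    """Check a statement's property against the configured ID, not a literal."""
    return property_id == ids.instance_of


print(is_instance_statement("P31", WIKIDATA))
print(is_instance_statement("P31", OTHER_WIKI))
```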

explain why constraints aren't applicable on {P7883}

Hi, I noticed you removed mandatory constraint (Q21502408) from multiple properties including Historical Marker Database ID (P7883), Would you provide clarification why these were removed? Given they were just removed without comment and I'm the one who added them I'd like to know how to more accurately apply these in the future. Wolfgang8741 (talk) 19:25, 2 March 2022 (UTC)[reply]

Hello, the constraints have 100+ violations and do not look easy to fix. Could you add the flag after fixing most of the violations? Wikidata:Database reports/Constraint violations/Mandatory constraints/Violations was made as a tool for quickly reverting vandalism or wrong edits. But the report is unmaintainable now; it is too large. I am trying to improve the quality situation using different approaches. — Ivan A. Krestinin (talk) 19:42, 2 March 2022 (UTC)[reply]
Ah, thanks for the explanation. Prior to adding the type constraint, there was no means of ensuring consistent use across the IDs and why applying the type constraints generated a large report. I started a discussion to constrain and cleanup the marker IDs at WikiProject_Cultural_heritage#Adding_item_Type_Constraints_for_Historic_Marker_Properties ideally leading to a model for IDs related to markers. I'm still digging deeper into Wikidata's structure for nudging for data consistency and preventing conflation of concepts. Does one of your approaches rely upon the Wikidata:Database_reports/EntitySchema_directory? These constraints weren't meant to be left, but prompt cleanup moving the IDs. So what I'm hearing is once the IDs are cleaned up adding the mandatory constraint would be appropriate. Wolfgang8741 (talk) 20:32, 2 March 2022 (UTC)[reply]
I do not use Wikidata:Database_reports/EntitySchema_directory directly. My approaches use property constraint (P2302), {{Autofix}}, and many different custom bot tasks, for example automatic cleanup of duplicate values. Most tasks are focused on fixing different common types of mistakes. Adding the "mandatory" mark is good practice after completing work on a property. — Ivan A. Krestinin (talk) 22:07, 2 March 2022 (UTC)[reply]

Yandex Zen ID (P8816)

Good day. For a number of persons (example; I personally know of at least 14 such persons), the bot changes the value of Yandex Zen ID (P8816) by removing the "id/" segment, after which the Yandex Zen page no longer opens. Formally the bot acts in accordance with the defined URL pattern https://zen.yandex.ru/$1, but in practice working links in such items stop working. As I understand it, when the identifier was discussed, nobody took into account that, besides the main pattern that most persons on Yandex Zen have, some also have this one with "id/". So the question is how to solve the problem: either configure the bot so that it does not change the identifier in such items, or are there other options? Thank you. --Uchastnik1 (talk) 11:12, 3 March 2022 (UTC)[reply]

  • As I understand it, the replacement happened after this edit. --Uchastnik1 (talk) 11:40, 3 March 2022 (UTC)[reply]
  • I looked through the bot's contributions for the relevant period: the number of such articles/Wikidata items has grown to about 20. It also turned out that this affects not only persons but other entities (things) as well, for example Kion, Вокруг ТВ, Холодильник.ру. So one can hardly say that, in the spirit of the original conditions under which the identifier was created, it was not meant to cover these entities (that is, the difference is purely technical, in this extra "id/" and nothing more, and the same applies to persons). --Uchastnik1 (talk) 14:55, 3 March 2022 (UTC)[reply]
    • Greetings. Yes, you found the right edit, the one that told the bot to strip the id prefix. If it is reverted, the bot will stop doing this. I will launch a revert of these changes a little later. It would also be good to fix the description and properties on the page Property:P8816 so that they allow such a prefix. — Ivan A. Krestinin (talk) 16:48, 3 March 2022 (UTC)[reply]
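
Why stripping the prefix breaks the links can be seen from how a formatter URL is expanded. The following is a minimal Python sketch: the pattern https://zen.yandex.ru/$1 is the one quoted above, while the sample value `id/0123456789abcdef` and the function names are made up for illustration and do not reflect the bot's actual implementation.

```python
FORMATTER_URL = "https://zen.yandex.ru/$1"  # URL pattern quoted in this thread


def expand(formatter: str, value: str) -> str:
    """Substitute the identifier value for the $1 placeholder."""
    return formatter.replace("$1", value)


def strip_id_prefix(value: str) -> str:
    """The kind of automatic fix discussed here: drop a leading 'id/'."""
    return value[len("id/"):] if value.startswith("id/") else value


original = "id/0123456789abcdef"  # hypothetical identifier value
print(expand(FORMATTER_URL, original))
print(expand(FORMATTER_URL, strip_id_prefix(original)))
```

The two printed URLs differ, so a page that is reachable only under the `id/` path can no longer be reached from the stripped value.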

Remove also non-existing files from references and qualifiers

Could your bot also remove non-existing files from references and qualifiers? I have found a few of them and fixed them manually (example), but it would be nice if this were done automatically. I have seen your bot make similar changes, so maybe this could be done by it as well?

Similarly, bot could change a filename of a Commons file used in a reference or a qualifier, when it is moved on Commons (example). Mitar (talk) 14:22, 11 March 2022 (UTC)[reply]

Hi, I am working on fixing qualifiers. But processing references will be a harder task. — Ivan A. Krestinin (talk) 13:55, 12 March 2022 (UTC)[reply]

Check P2190 constraint config as it moves from string to numeric ID

Hi, could you double check if I migrated the property constraints correctly to numeric from string on C-SPAN person ID (P2190). This move was discussed on the property talk page and project chat. Wolfgang8741 (talk) 15:11, 14 March 2022 (UTC)[reply]

Hi, everything is fine. I just improved the property a bit. — Ivan A. Krestinin (talk) 20:49, 14 March 2022 (UTC)[reply]
Hi, thanks for that. Looking at the constraint report, the deprecated string values are flagged for format violations. Shouldn't a deprecated value be exempt from the currently accepted format checks as well as from a single value constraint? Two reasons for retaining the deprecated IDs are:
1. matching with archived versions of the data using the old ID, affording checks of data consistency over time or of when the data was initially added
2. assisting in matching existing data to convert to the new identifier.
This is partially a technical question and partially a statement, as I noticed that a few IDs have been removed instead of being deprecated per Help:Deprecation, even though the values were valid prior to the transition, just less reliable. They could still be matched to the Internet Archive or another archive. Wolfgang8741 (talk) 15:05, 19 March 2022 (UTC)[reply]
More ideas:
I fixed the constraints; now we have zero violations. But splitting the property is the more correct way. — Ivan A. Krestinin (talk) 22:00, 21 March 2022 (UTC)[reply]
Thank you, that would have been helpful guidance a while back. Where is it documented that splitting the property is the more correct way? This is an important process to have documented. I asked both on the property talk page and in Project chat for this guidance, and for nearly a month no one answered my questions about the process for changing the property format with certainty or a clear path. How is one supposed to find the "more correct way" documentation or learn it? Should we go about splitting the property so the ID can be properly constrained for monitoring? Wolfgang8741 (talk) 16:37, 24 March 2022 (UTC)[reply]
I know too little about the Wikidata project documentation. Actually, different approaches were used for different properties in the past. I just highlighted the best approach from my point of view and listed the reasons why it is the best. You are right, splitting allows the constraints to be improved. — Ivan A. Krestinin (talk) 19:39, 24 March 2022 (UTC)[reply]

Languages statistics on a lexeme property

Hello,

On Wikidata:Database reports/Constraint violations/P10338#Languages_statistics, it is stated that the property Dico en ligne Le Robert ID (P10338) is used 9 times. In fact, it is used much more than that. Do you know why the statistics seem incorrect? Maybe it is related to the fact that lexemes are in a separate dump (I don't know how your bot works, so it's just a blind guess)?

Cheers, — Envlh (talk) 21:48, 27 March 2022 (UTC)[reply]

Hello, this looks like a real bug. I will investigate it. Thank you. — Ivan A. Krestinin (talk) 23:00, 31 March 2022 (UTC)[reply]
Fixed — Ivan A. Krestinin (talk) 20:37, 8 April 2022 (UTC)[reply]
Thank you! I confirm it is properly working since you fixed it :) Cheers, — Envlh (talk) 16:23, 25 April 2022 (UTC)[reply]

Must revert some KrBot changes

Hello!

There is a problem: two different entries, dispersed settlement (Q1372205) and dispersed settlement in Latvia (Q16352482), had been merged. That was on March 26. On March 28 KrBot set claim values (see this change). There are very many changes that must be reverted. Can these changes be undone by the bot? --Treisijs (talk) 12:19, 30 March 2022 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 23:01, 31 March 2022 (UTC)[reply]

Does KrBot still update P214 monthly ?

Hi, this article https://ejournals.bc.edu/index.php/ital/article/view/12959 states that KrBot "updates links in Wikidata items to redirected VIAF clusters and removes links to abandoned VIAF clusters." on a monthly basis. Is that still the case? I'm not sure, based on the statistics I got here: https://bambots.brucemyers.com/NavelGazer.php. Thanks!

Hello, the bot had some trouble: not all items were processed. I fixed the issue, and all items have been processed now. Everything should be fine. Thank you! — Ivan A. Krestinin (talk) 20:18, 8 April 2022 (UTC)[reply]
Thanks for the info and for the great job ! 193.52.26.94 06:49, 11 April 2022 (UTC)[reply]

Wikimedia import URL constraint violations

Edits like this one trigger this constraint violation. The constraint was added by user:Tacsipacsi (and last edited by user:Nikki). Not sure how to best solve it? You could use http://www.wikidata.org/entity/Q111645043#P10039 & http://www.wikidata.org/entity/P10039#P2302 (or even http://www.wikidata.org/entity/P10039#P10039$4d1d74b4-4a3f-cc5b-c760-d133e2ac8fd9)? Or we can just remove the whole constraint... Multichill (talk) 16:36, 18 April 2022 (UTC)[reply]

Looks like a valid use case (more or less: as an outsider, it may not be obvious at first what those URLs mean). I definitely don’t recommend working around the constraint by using a different URL; if a constraint is wrong, it should be fixed, not worked around. However, I still see a constraint here: using Wikimedia import URL (P4656) with a Wikidata URL is valid if, and only if, an inferred from (P3452) statement is also present (which it makes more precise). Unfortunately this cannot be expressed with the current constraint system, so this constraint either needs to be replaced by a complex constraint that handles this situation, or a new property needs to be created for this purpose (which could have non-complex constraints that require that it points to a Wikidata URL, and that it’s always used together with P3452). The former avoids creating yet another property; the latter lets us continue to use non-complex constraints, which provide feedback to the user in context, not only on some hidden constraint report pages. —Tacsipacsi (talk) 15:41, 19 April 2022 (UTC)[reply]
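
The condition proposed above (a Wikidata URL in P4656 is acceptable only alongside P3452) could be checked along the following lines. This is a Python sketch of the logic only, not an existing constraint type or complex-constraint query; the dict-based reference model, the function name `reference_ok` and the sample P3452 value are assumptions.

```python
WIKIDATA_PREFIXES = ("http://www.wikidata.org/", "https://www.wikidata.org/")


def reference_ok(reference: dict[str, list[str]]) -> bool:
    """A reference passes unless it has a Wikimedia import URL (P4656)
    pointing at Wikidata without an inferred from (P3452) value."""
    urls = reference.get("P4656", [])
    points_to_wikidata = any(url.startswith(WIKIDATA_PREFIXES) for url in urls)
    has_inferred_from = bool(reference.get("P3452"))
    return (not points_to_wikidata) or has_inferred_from


with_both = {
    "P4656": ["http://www.wikidata.org/entity/Q111645043#P10039"],
    "P3452": ["Q111645043"],  # hypothetical value
}
url_only = {"P4656": ["http://www.wikidata.org/entity/Q111645043#P10039"]}

print(reference_ok(with_both))
print(reference_ok(url_only))
```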

Reverting redirect resolutions after wrong merger

Hi Ivan, human sexual behavior (Q608) and Sex (Q98040930) were wrongly merged. Could you please revert the redirect resolutions? Thanks, Máté (talk) 05:45, 22 April 2022 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 16:58, 23 April 2022 (UTC)[reply]

Hello! Very nice things your bot does, however it's making an understandable mistake. It is removing the statement has part(s) (P527) YouTube comment (Q110875209) on YouTube comment (Q110875209). I undid it with the reason "YouTube comments can have other YouTube comments as replies", but it did it again. AntisocialRyan (Talk) 14:56, 24 April 2022 (UTC)[reply]

Hello, usually a recursive link in has part(s) (P527) is just a mistake. This is the reason why the bot removes it. But your case is interesting. I think a reply comment is another comment. It is linked to the original comment by the specific relation "reply to". This relation is not similar to the "has part" relation. Let's take an analogy from another area to verify the statement: for example, "human has part human" because one human is a child of another human. This statement looks wrong. So I think the statement "comment on YouTube has part comment on YouTube" is wrong too. — Ivan A. Krestinin (talk) 16:12, 24 April 2022 (UTC)[reply]
Alright, I see where you're coming from actually. YouTube comments can have replies, but replies to YouTube comments can't have more replies. I will create a new item for this, thanks! AntisocialRyan (Talk) 16:59, 24 April 2022 (UTC)[reply]

Monks

Hi Ivan! Could your bot make these edits so that, instead of deleting the value, it replaces it with monk (Q733786)?

I understand why you delete the monastic order a person belongs to from the occupation property, but if the item doesn't state that the person is a monk, it can be left without an occupation (see). Thanks Palotabarát (talk) 21:52, 26 April 2022 (UTC)[reply]

Hi @Palotabarát:! Thanks for raising the point. We are trying to establish a standard for data regarding members of Catholic religious orders at Wikidata_talk:WikiProject_Religions#Members_of_Catholic_religious_orders; since there wasn't any objection, I applied this change, but of course it is reversible and improvement can be made (e.g. adding new P106s in order to fill the gap). We can continue the discussion there. Good night, --Epìdosis 22:42, 26 April 2022 (UTC)[reply]

Laurence Olivier Award for Best xxxx

Hi, Ivan. The KrBot changed a number of entries I made yesterday for the series of Laurence Olivier Awards (Laurence Olivier Award for Best Actor in a Supporting Role in a Musical (Q19870586), Laurence Olivier Award for Best Actress in a Supporting Role in a Musical (Q19870588), Laurence Olivier Award for Outstanding Achievement in Music (Q16995976), Laurence Olivier Award for Best Actress (Q6500774), etc.), changing the country from "England" to "United Kingdom".

The Olivier Awards are not presented to theatre productions in the entire United Kingdom, made up of the individual countries of England, Scotland and Wales, along with Northern Ireland. These awards are only presented for theatre work in England (specifically London's West End theatre district), while the other countries of the United Kingdom have their own theatre awards.

When I chose "England", it even comes up with the phrase "constituent country of the United Kingdom", as it is, in fact, an actual country within the UK.

I tried using the statement for "applies to jurisdiction", set to England, but that makes my statement for London raise the flag "An entity with located in the administrative territorial entity should also have a statement country." I want the country to be England, as that is the country that matters, but the bot will just change it to United Kingdom.

I do not think that the KrBot needs to be changed, nor anything like that. Just need your help. Do you know a way that I can tell it "England" and "London", and not have any flags nor have a bot make a change? Thanks. Jmg38 (talk) 02:09, 28 April 2022 (UTC)[reply]

Hi again. I think I'll use "country (P17)" of United Kingdom, and "applies to jurisdiction (P1001)" of London. That captures everything I need, and avoids having to mention England at all while also avoiding having to fuss about England being a country, as the real important part is the "applies to jurisdiction (P1001)" of London. The KrBot was helpful in ways you may not have expected, as it forced me to think through what I was doing, which is never a bad thing! Thank you. Jmg38 (talk) 05:16, 28 April 2022 (UTC)[reply]
Hello! Could you review the previous England-related discussions on Property talk:P17? The bot just executes the {{Autofix}} rule from the property discussion page. You may delete the rule and the bot will stop making such edits. — Ivan A. Krestinin (talk) 17:03, 28 April 2022 (UTC)[reply]

Q7686436 and Q105550321 merge and replace

Hello Ivan! @Howard61313: has incorrectly merged the occupation (military aviator (Q105550321)) and the category (Category:Military aviators (Q7686436)), and Krbot2 has changed the occupation of all persons in the profile (example). Can you undo that? (links) Thanks. Palotabarát (talk) 10:57, 1 May 2022 (UTC)[reply]

✓ Done — Ivan A. Krestinin (talk) 18:46, 1 May 2022 (UTC)[reply]
Thank You! Palotabarát (talk) 07:05, 2 May 2022 (UTC)[reply]

Single value constraints for Dutch municipalities

Dear Ivan A. Krestinin,

In your report, DATABASE_REPORTS/CONSTRAINT_VIOLATIONS/P382#"SINGLE_VALUE"_VIOLATIONS, I notice that it probably searches for items with more than one claim of an identifier (which should be unique by definition). A Dutch CBS municipality code can become obsolete (then it should have an end date), be changed into a new code (and then have doubles), or be an entirely new code when a new municipality is created. When we look, for instance, at Etten-Leur Q9833#P382, we see that two CBS codes are claimed, which is correct: the old one has an end date and the current one is still valid. Recently I also updated its rank in an attempt to work with it from a script.

Would it be possible, in your database report, to include a check on the end date and perhaps a double check on the rank, and only report when more than one code is claimed without an end date or with an equal rank?

Furthermore, if you are interested, perhaps the check on end dates could be automated. I recently tried something with the API of CBS and found a table (70739ned) with the dates registered. See a script for population data and a script for surface areas where the table is used for mapping. Please forgive my level of programming; I gave it my best try. If you have any questions, want to work together or have remarks for me, please let me know. (My availability varies, but I intend to respond asap.)

Best regards, Démarche Modi (talk) 18:02, 3 May 2022 (UTC)[reply]

And see https://nl.wikipedia.org/wiki/Gebruiker:D%C3%A9marche_Modi/Kladblok/python/cbs_codes for the whole table I mentioned. Démarche Modi (talk) 18:54, 3 May 2022 (UTC)[reply]
If more values are possible over time, the property should have single-best-value constraint (Q52060874) instead of single-value constraint (Q19474404). And the item statement should have normal rank instead of deprecated, as deprecated rank is not for once-correct information. —Tacsipacsi (talk) 01:03, 4 May 2022 (UTC)[reply]
Specifying separator (P4155) for the constraint fixes the issue. See [17]. Of course, start time (P580) and end time (P582) need to be specified in the items. — Ivan A. Krestinin (talk) 21:45, 4 May 2022 (UTC)[reply]
All done (for the ones reported with violations), statements have normal rank again and are provided with a start and end date. For the remaining bulk (without warnings in the report) I will consider a bot request. Démarche Modi (talk) 14:55, 6 May 2022 (UTC)[reply]
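
How a separator changes the single-value check can be sketched in Python. This is an illustration of the idea under a simplified model, not the report's actual implementation: statements are plain dicts, the function name `single_value_violation` is hypothetical, and the code values and dates are made up.

```python
def single_value_violation(statements, separators=("P580", "P582")):
    """Report a violation only if two statements carry the same combination
    of separator qualifier values (start time P580, end time P582)."""
    seen = set()
    for statement in statements:
        qualifiers = statement.get("qualifiers", {})
        key = tuple(qualifiers.get(prop) for prop in separators)
        if key in seen:
            return True
        seen.add(key)
    return False


# An old code with an end date next to a current code with a start date.
dated = [
    {"value": "old-code", "qualifiers": {"P582": "2010-12-31"}},
    {"value": "new-code", "qualifiers": {"P580": "2011-01-01"}},
]
# Two codes without any distinguishing qualifiers.
undated = [{"value": "old-code"}, {"value": "new-code"}]

print(single_value_violation(dated))
print(single_value_violation(undated))
```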

what right does your robot have to deny reality?

https://www.wikidata.org/w/index.php?title=Q28105001&diff=prev&oldid=1634524482

?????????????

A.BourgeoisP (talk) 14:53, 8 May 2022 (UTC)[reply]

The statement is already on "significant event"; the bot would've added it if it wasn't already there. Whether that is the right property is beyond me. AntisocialRyan (Talk) 15:32, 8 May 2022 (UTC)[reply]
https://www.wikidata.org/w/index.php?title=Q28105001&action=history
A.BourgeoisP (talk) 12:11, 14 May 2022 (UTC)[reply]
@A.BourgeoisP: position held (P39) shouldn't be used for oldest human (Q254917) (it's for formal positions), use significant event (P793) instead. Reverting the bot over and over again is as useful as banging your head against a wall. Multichill (talk) 12:42, 14 May 2022 (UTC)[reply]
why oldest man in France (Q107344155) and list of European supercentenarians (Q1637694) are in p 39 and oldest human (Q254917) is in p793 ? what justifies separating them? Read the˞ infobox of French version of the article... we see oldest man in France (Q107344155) and list of European supercentenarians (Q1637694) but not oldest human (Q254917)! Why ? That is rediculous and stupid ! Please solve this problem. Wikipedia.fr does not have to be vandalized by the action of a robot on Wikidata! A.BourgeoisP (talk) 19:03, 14 May 2022 (UTC)[reply]
Helloooo!? A.BourgeoisP (talk) 19:26, 17 May 2022 (UTC)[reply]
Hello, the bot makes such changes because the page Property talk:P39 has the template {{Autofix|pattern=Q254917|replacement=Q254917|move_to=P793}}. Just delete the template and the bot will stop this activity. But I recommend discussing the issue on Property talk:P39 or on Wikidata:Project chat first. — Ivan A. Krestinin (talk) 20:41, 17 May 2022 (UTC)[reply]
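For readers unfamiliar with the template, the following is a minimal sketch of how a rule with pattern, replacement and move_to could be applied to an item's values. It is not KrBot's actual code: the function name apply_autofix and the dict-of-lists item model are hypothetical, and Python's re module stands in for the regex library the bot uses.

```python
import re


def apply_autofix(claims, prop, pattern, replacement, move_to=None):
    """Apply one Autofix-like rule to a simplified item.

    claims: dict mapping property IDs to lists of string values
            (a hypothetical stand-in for real Wikidata statements).
    Returns a new dict; the input is left unchanged.
    """
    fixed = {p: list(values) for p, values in claims.items()}
    untouched, rewritten = [], []
    for value in fixed.pop(prop, []):
        match = re.fullmatch(pattern, value)
        if match is None:
            untouched.append(value)            # rule does not apply
        else:
            rewritten.append(match.expand(replacement))
    if untouched:
        fixed[prop] = untouched
    target = move_to or prop                   # no move_to: fix in place
    for value in rewritten:
        bucket = fixed.setdefault(target, [])
        if value not in bucket:                # do not duplicate existing values
            bucket.append(value)
    return fixed


item = {"P39": ["Q254917", "Q107344155"], "P793": []}
print(apply_autofix(item, "P39", "Q254917", "Q254917", move_to="P793"))
```

With the rule quoted above, only the value Q254917 leaves P39; other P39 values stay where they are.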

compositions ≠ musical works

en:Category:2020 compositions (Q97275139) don't contain en:Category:2020 albums. be:Катэгорыя:Музычныя_творы_2020_года, ru:Категория:Музыкальные_произведения_2020_года and zh:Category:2020年音樂作品 (Q111684969) contain en:Category:2020 albums. Compositions ≠ Musical works, Musical works contain Compositions. -- 15:32, 15 May 2022 (UTC)[reply]

Q9059213 ≠ Q5626704; Q9059213 contains Q5626704. -- 15:36, 15 May 2022 (UTC)[reply]
  • I merged the items because "ru:Категория:Музыкальные произведения XXXX года" is the same as "en:Category:XXXX compositions". I do not see the difference between "Composition" and "Musical work". They look like full synonyms after translation into my native language. Could you fix the issue? Fixing this looks like too hard a task for me. You may use different from (P1889) to prevent wrong merges in the future. — Ivan A. Krestinin (talk) 15:51, 15 May 2022 (UTC)[reply]

"Determination method" error

Hello! determination method (P459) was added as a main statement to just setting up my twttr (Q64790997). Instead, it should be added as a qualifier to number of likes (P10649), number of comments (P10651), number of dislikes (P10650), and number of reblogs (P10756).

https://www.wikidata.org/w/index.php?title=Q64790997&diff=1640345856&oldid=1639539094

This would actually be super helpful because it is annoying to add! AntisocialRyan (Talk) 23:52, 16 May 2022 (UTC)[reply]

Hello! Bot will stop such edits after this change. Could you discuss with User:Trade this edit? — Ivan A. Krestinin (talk) 20:56, 17 May 2022 (UTC)[reply]
Hello! That isn't the problem, the problem is that it was added as a main statement to the item and not a qualifier of one of the properties.
See its revision here, which I have since reversed: https://www.wikidata.org/w/index.php?title=Q64790997&oldid=1640622098
It did it to a number of items. AntisocialRyan (Talk) 21:10, 17 May 2022 (UTC)[reply]

Wrongly adding the country "France"

https://www.wikidata.org/w/index.php?title=Q1713379&type=revision&diff=1648001537&oldid=1647679444 seems wrong to me.--Danny Diggins (talk) 12:15, 27 May 2022 (UTC)[reply]

@Danny Diggins: as you can see in the edit, this is based on Mérimée ID (P380) which was incorrectly added by User:Ayack. Undid the edits. Multichill (talk) 17:00, 27 May 2022 (UTC)[reply]

Constraint violations P3503

Hello, your KrBot2 is not detecting any violations at Wikidata:Database reports/Constraint violations/P3503. Even if new items are added with the same property and some violations are being fixed, the bot is not running. Do you know what is wrong with it? Elemar (WMIT) (talk) 09:32, 30 May 2022 (UTC)[reply]

Hi, the bot uses incremental dumps as its data source. They are generated daily, but the data in them is delayed. Anyway, everything looks fine now. — Ivan A. Krestinin (talk) 20:23, 10 June 2022 (UTC)[reply]

FandangoNow

Hi. Apparently, inferring the statement distributed by (P750) = FandangoNow (Q80948336) from the presence of the FandangoNow ID (P7970) property is not correct. For example, Bugsy (Q241085) has a valid ID, but it does not appear to have been available to watch either on FandangoNow previously or on Vudu now. And judging by what is written in the discussion at ru:Википедия:Форум/Общий#Викиданные, there are a great many such cases. It would probably be best to remove all additions of FandangoNow as distributor that were made on the basis of the property's presence. —putnik 17:14, 30 May 2022 (UTC)[reply]

See also ru:Обсуждение участника:KrBot#FandangoNow. colt_browning (talk) 11:02, 31 May 2022 (UTC)[reply]
Greetings, I have reverted all the changes. — Ivan A. Krestinin (talk) 12:46, 11 June 2022 (UTC)[reply]
Sorry, which "all"? In Beetlejuice (Q320384), for example, the statement is still in place. (Admittedly, that statement was also edited by another user.) colt_browning (talk) 10:56, 12 June 2022 (UTC)[reply]
Naturally, the bot did not touch statements that had been edited by people. It would not be quite right to wipe out users' contributions with a bot. Only 10 values remain now; could you take a look at them and delete them if necessary? — Ivan A. Krestinin (talk) 19:20, 16 July 2022 (UTC)[reply]

Please do not remove deprecated statements

They are present for a reason: when the source database contains incorrect information, it prevents the bad data from being mistakenly reimported to Wikidata. I had to revert several edits you made removing deprecated NTIS accession number (P7791) statements on 27 February 2022. John P. Sadowski (NIOSH) (talk) 19:02, 4 June 2022 (UTC)[reply]

Hello John, I think data validation during import is a better way. This will make our project clearer for users. Collecting all the mistakes in the world makes nobody happy. But it is a long discussion... — Ivan A. Krestinin (talk) 19:04, 16 July 2022 (UTC)[reply]

Autofix for qualifiers

Hi! In Property talk:P1013 I added two {{Autofix}} six months ago to fix some thousands of occurrences in qualifiers (see https://w.wiki/4tkM and https://w.wiki/4tkU). Is it normal, I am missing something or there is a problem with the bot? Thanks very much in advance, --Epìdosis 20:01, 4 June 2022 (UTC)[reply]

Hello, {{Autofix}} with the move_to parameter does not process qualifiers. It is unclear how this case should be processed. But your queries now report only two items. It looks like the cases are already fixed. — Ivan A. Krestinin (talk) 20:51, 10 June 2022 (UTC)[reply]
Thanks for the reply. These cases should be processed simply in this way, and many thousands are still to be fixed (this is one of the two cases). Maybe I should write {{Autofix}} in a different way, perhaps only {{Autofix|pattern=Q1376230|move_to=P3831}} instead of {{Autofix|pattern=Q1376230|replacement=Q1376230|move_to=P3831}}? Thanks, --Epìdosis 11:11, 12 June 2022 (UTC)[reply]
Unfortunately, moving qualifiers is not supported by {{Autofix}}. Could you create a request on Wikidata:Bot requests? — Ivan A. Krestinin (talk) 19:24, 16 July 2022 (UTC)[reply]
Thanks, request created. And I specified that qualifiers moving is not supported in the documentation. --Epìdosis 07:48, 17 July 2022 (UTC)[reply]

Please don't merge academic theses with scientific publications

Hi there, I've found three instances where you have merged a dissertation with a publication, they are not the same thing and it makes a real mess trying to untangle them again. Please don't merge any thesis (especially those that are a part of the NZThesisProject) with publications, even if they have the same title. Thank you.

Here is one of them - you merged this doctoral thesis https://www.wikidata.org/wiki/Q111965723 (one author, held at the University it was submitted to) into this edited book with three authors and published by Taylor and Francis. https://www.wikidata.org/wiki/Q57409615 Obviously these are not the same thing, but if you could find a way for your bot to avoid merging entirely different entities like this it would be helpful. I have had to go back and fix the references from the author items that you also changed, and which broke constraints as the "academic thesis" statement was pointing at a publication rather than a thesis. People often publish a paper with the same title as their thesis, you need to be able to distinguish them. DrThneed (talk) 23:21, 8 June 2022 (UTC)[reply]
It would be helpful, Ivan, if you could outline for us the exact criteria by which you judged a P31=scholarly article to be the same as a P31=doctoral thesis - in the case of Psychological Aspects of Inflammatory Bowel Disease (Q111965723). From the outside it seems inexplicable that these would be merged merely by a label-string match. --Tagishsimon (talk) 00:35, 9 June 2022 (UTC)[reply]
The bot merged the items because they have the same title, the same publication year and a similar type. No conflicting properties were detected. I have now changed the bot's algorithm; such items will not be merged. Also, different from (P1889) will protect the items. — Ivan A. Krestinin (talk) 21:14, 10 June 2022 (UTC)[reply]
Thank you, Ivan. The P31 values on the merged items were not "similar type"; I hope that that is the area of your algorithm which has been improved. You are right that different from (P1889) will protect the items, but users should not have to take extra measures to protect their items from bots, in a situation in which there are clear distinctions - P31; number of author statements &c - between the items the bot erroneously merged. --Tagishsimon (talk) 21:47, 10 June 2022 (UTC)[reply]
Thank you Ivan. There were plenty of conflicting properties between the items - different numbers of authors, publisher, DOI, etc. A different publisher should be all that it takes not to merge items, as a single author could legitimately publish the same titled article in different venues in the same year, and those items should not be merged. So it would be reassuring to know that the bot algorithm will not touch items like that? Further it is not possible to protect every thesis item against merging with false duplicates that have not even been added to Wikidata yet, we just need your bot to be more discriminating. Thank you, DrThneed (talk) 22:04, 10 June 2022 (UTC)[reply]
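To make the request in this thread concrete, here is a minimal sketch of a conservative merge check along the lines the participants describe: the title and year must match, and any differing type, publisher, DOI or author count blocks the merge. This is not the bot's actual algorithm; the function name, the field names and the flat-dict item model are all hypothetical.

```python
# Fields whose disagreement should block a merge (hypothetical names).
CONFLICT_KEYS = ("instance_of", "publisher", "doi", "author_count")


def safe_to_merge(a, b):
    """Return True only if two simplified items look like true duplicates."""
    # Title and publication year must both match.
    if a.get("title") != b.get("title") or a.get("year") != b.get("year"):
        return False
    # Any field present on both items with different values is a conflict.
    for key in CONFLICT_KEYS:
        va, vb = a.get(key), b.get(key)
        if va is not None and vb is not None and va != vb:
            return False
    return True


thesis = {"title": "Same title", "year": 2020,
          "instance_of": "doctoral thesis", "author_count": 1}
book = {"title": "Same title", "year": 2020,
        "instance_of": "scholarly article", "author_count": 3}
print(safe_to_merge(thesis, book))
```

A check like this errs on the side of not merging: a missing field never counts as agreement on its own, but a single disagreement is enough to keep the items apart.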

multi-value constraint and constraint scope:

Hi Ivan, looks like this edit made your bot unhappy. I didn't know it either, but according to Help:Property constraints portal/Multi value, constraint scope (P4680) is a valid qualifier. Can you update your bot to handle this? Thanks, Multichill (talk) 16:57, 20 June 2022 (UTC)[reply]

Hello, this is a mistake or incomplete documentation. Single and multi value constraints are checked for main values only, so the qualifier is redundant. — Ivan A. Krestinin (talk) 18:46, 16 July 2022 (UTC)[reply]
The documentation describes how the current constraint software behaves. You can see at Q46316#Q46316$FBABC74E-0209-4129-96E4-6931F395B8E6 and Q134942#Q134942$2e7aa178-4ae3-0c0e-4912-d98d2130f83f that currently the constraint is triggered. If you add the constraint scope, it will no longer be triggered. Can you update the bot to do the same?
@Lucas Werkmeister (WMDE), Wostr: ^^ Multichill (talk) 08:28, 17 July 2022 (UTC)[reply]
There are multiple implementations of the constraint system. Generally their behavior is the same, but some details may differ. The documentation does not describe the differences. Let's assume that the multiple-value constraint is supported for qualifiers. Q46316 has 3 values of depicts (P180). It is unclear whether the multiple-value constraint should be triggered or not. 3 > 1, so the current behavior of the implementation integrated into Wikidata looks like a bug. This is the reason why most constraints were supported for main values only. Another question is why P180 should have multiple values. For example, for Q1170315#P180, what value should be added? — Ivan A. Krestinin (talk) 10:46, 17 July 2022 (UTC)[reply]

It looks like this happened again with these edits breaking Wikidata:Database reports/Constraint violations/P1651. I have no experience with constraints, and it appears to be a complicated topic, so I'd rather have someone more experienced look at this than trying to understand it myself. @Middle river exports: ^ –JustAnotherArchivist (talk) 05:07, 29 August 2022 (UTC)[reply]

This does become a problem for editing lexemes because there are a number of properties which generally are not necessary for lexemes/senses/forms but which can make sense on occasion in a reference on one of these types. (The reason I made this edit is because especially for spoken language / pronunciation, video references can be helpful. Also most news media in Pakistan is in video format rather than articles, so citing these makes sense on lexemes for Pakistani languages.) Middle river exports (talk) 05:18, 29 August 2022 (UTC)[reply]

Constraint violations P373

Hi, Wikidata:Database reports/Constraint violations/P373 has not been updated since 4 July and is giving a somewhat cryptic error message "too many files for existance check". Any details or a workaround to fix it? Jklamo (talk) 11:16, 4 July 2022 (UTC)[reply]

Hello, I am investigating the issue. — Ivan A. Krestinin (talk) 18:48, 16 July 2022 (UTC)[reply]
Fixed. — Ivan A. Krestinin (talk) 18:58, 17 July 2022 (UTC)[reply]

Please leave "invalid" DOIs on the item and flag them with a qualifying statement instead

I had input a list of DOIs to create items for in Magnus Manske's SourceMD tool, and unfortunately it appears to have created a couple of items for articles which it could not find the title for. In these cases, the DOI link did not lead to the document, but the DOI was not incorrect. The problem with removing the identifier is that there is then no way to correct the issue, and an item is left blank without any metadata. The "incorrect" DOI link leads to a form to contact the DOI database maintainers to fix the error, which is a useful link to have.

Here is an example of what I mean: A Basic Parts of Speech (POS) Tagset for morphological, syntactic and lexical annotations of Saraiki language (Q113190216) was added with a DOI and nothing else. The article can be found here, and the DOI for it is at the top of the page, and consistent with the format for other articles from this journal: https://scholar.archive.org/work/p7avuxo46nagdlollxvnf5xone/access/wayback/http://journal.buitms.edu.pk/j/index.php/bj/article/download/459/281

The fact that the DOI didn't work is an error on either the DOI maintainers' or the publishers' part, so I have contacted them to fix this. However, the DOI will likely be retained whenever it is fixed so it does not make sense to remove, especially when there was no other way to tell what article the item was for. --Middle river exports (talk) 21:02, 24 July 2022 (UTC)[reply]

Hello, it looks like the identifier really was wrong. I specified the correct DOI value in the item. — Ivan A. Krestinin (talk) 21:27, 24 July 2022 (UTC)[reply]
Ah, thank you very much. How did you find the correct one? Is there a way for the bot to tell this? Middle river exports (talk) 21:28, 24 July 2022 (UTC)[reply]
I found it using Google. This exact case looks hard to automate. Maybe some other cases will be easier to automate. — Ivan A. Krestinin (talk) 21:34, 24 July 2022 (UTC)[reply]

P968 no longer automatically fixed

Format issues for email address (P968) for which an autofix rule exists are no longer fixed by your bot. As far as I can tell all regexes are valid. Can you check what's causing this? Mbch331 (talk) 13:39, 25 July 2022 (UTC)[reply]

Hello, the bot has ~650000 items to fix now. This needs time. Mostly it is caused by this issue. I will temporarily disable fixing of DOI references to give other autofixes a chance. — Ivan A. Krestinin (talk) 17:31, 25 July 2022 (UTC)[reply]

incorrect

https://www.wikidata.org/w/index.php?title=Q49133&type=revision&diff=1278140635&oldid=1277973898&diffmode=source Oursana (talk) 10:33, 6 August 2022 (UTC)[reply]

Hi Ivan. Regarding your edit here, while I partially agree, I've been thinking about the utility of having both instance of and subclass of being requirements for rockets. All rockets should realistically be instances of either rocket family (Q109542585), rocket series (Q111722634), or rocket model (Q110055303), and I've been filling these in as I find them, but the majority of rocket items are still missing it. Having instance of be a requirement would serve as a useful indicator that the field is missing. Just a thought. Huntster (t @ c) 16:57, 7 August 2022 (UTC)[reply]

Hi, you may add a second value-type constraint (Q21510865) to space launch vehicle (P375). As for me, instance of (P31) is not as useful as the rocket families hierarchy. But you may create and use a parallel classification. space launch vehicle (P375) has one more issue: usually this property is used to specify the type of vehicle, but sometimes it is used to specify an exact instance of a vehicle. For example: Ulysses (Q156081). — Ivan A. Krestinin (talk) 21:42, 8 August 2022 (UTC)[reply]

When does this bot stop?

When does this bot stop deleting correct information? Roelof Hendrickx (talk) 21:45, 8 August 2022 (UTC)[reply]

Hello, the bot executes many different tasks. Could you provide a link to an example edit? — Ivan A. Krestinin (talk) 22:12, 8 August 2022 (UTC)[reply]
See: https://www.wikidata.org/w/index.php?title=Q29400988&action=history&curid=31048137 which is the latest deletion of correct information. Roelof Hendrickx (talk) 08:09, 9 August 2022 (UTC)[reply]
@Roelof Hendrickx, is this what you are looking for? Michgrig (talk) 17:43, 9 August 2022 (UTC)[reply]
That should have been the end result yes. Moving instead of deleting. Roelof Hendrickx (talk) 22:04, 9 August 2022 (UTC)[reply]
The bot performs a move in two steps: 1. adding the new value; 2. deleting the original value. If the new value already exists, the bot just removes the original value. — Ivan A. Krestinin (talk) 22:25, 9 August 2022 (UTC)[reply]
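The two-step move described above can be sketched as a small planning function that lists the edits in order. This is only an illustration of the described behavior, not the bot's code; plan_move, the tuple-based edit format and the example value Q1 are hypothetical.

```python
def plan_move(claims, source_prop, target_prop, value):
    """List the edits needed to move a value between two properties.

    claims: dict mapping property IDs to lists of values (simplified model).
    Returns a list of (action, property, value) tuples in execution order.
    """
    if value not in claims.get(source_prop, []):
        return []                                   # nothing to move
    steps = []
    if value not in claims.get(target_prop, []):
        steps.append(("add", target_prop, value))   # step 1: add new value
    steps.append(("remove", source_prop, value))    # step 2: delete original
    return steps


# Target does not have the value yet: two edits appear in the history.
print(plan_move({"P39": ["Q1"]}, "P39", "P793", "Q1"))
# Target already has the value: only the removal appears.
print(plan_move({"P39": ["Q1"], "P793": ["Q1"]}, "P39", "P793", "Q1"))
```

The second case explains why an item's history can show only a deletion even though the operation was logically a move.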

Good afternoon! It looks like something broke, see the history. Машъал (talk) 19:28, 19 August 2022 (UTC)[reply]

Single best value constraint

Hi Ivan, some time ago single-best-value constraint (Q52060874) was introduced as a refinement of single-value constraint (Q19474404) so that cases like this one don't trigger a constraint. Can you update your bot to report single best value instead of single value violations? The query is only slightly different. I just noticed by the way that the single value constraint query (example) filters out statements with deprecated ranks. Maybe you can do the same? Thanks, Multichill (talk) 10:17, 21 August 2022 (UTC)[reply]

+1 to this. This report's single-value violations report will grow to thousands of invalid violations over time. —seav (talk) 11:04, 18 December 2022 (UTC)[reply]
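The difference between the two constraints can be sketched as follows. This assumes the usual rank semantics (best values are the preferred-rank statements if any exist, otherwise the normal-rank ones) and assumes, as noted in the request, that the single-value check skips deprecated statements. The function names and the (value, rank) tuple model are hypothetical, and this is not the bot's or the constraint extension's code.

```python
def best_values(statements):
    """statements: list of (value, rank), rank in
    {"preferred", "normal", "deprecated"}.
    Returns the best-rank values: preferred if any, else normal."""
    preferred = [v for v, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    return [v for v, rank in statements if rank == "normal"]


def violates_single_value(statements):
    # Assumption: deprecated statements are ignored, everything else counts.
    return len({v for v, rank in statements if rank != "deprecated"}) > 1


def violates_single_best_value(statements):
    # Only the best-rank values count.
    return len(set(best_values(statements))) > 1


# One current (preferred) value plus one older, once-correct (normal) value.
history = [("new value", "preferred"), ("old value", "normal")]
print(violates_single_value(history), violates_single_best_value(history))
```

An item that keeps an older value at normal rank and marks the current one as preferred is reported by the single-value check but not by the single-best-value check, which is the case the request is about.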

Q283299 and Q21491395 merge and replace

Hello Ivan! @Terot: has incorrectly merged the family (Bethlen family (Q283299)) and the family name (Bethlen (Q21491395)), and KrBot2 has changed the family name of all persons in the profile (example). Can you revert that? Pallor (talk) 15:15, 21 August 2022 (UTC)[reply]

Poland for GREL correct?

Hi Ivan,

is this change correct?-- Négercsókk (talk) 12:00, 28 August 2022 (UTC)[reply]

English labels for items created for Russian railway stations