Wikidata:Requests for permissions/Bot/WikiTrackBot
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Lymantria (talk) 06:10, 28 June 2018 (UTC)
WikiTrackBot
Operator: Habst
Task/s: WikiTrackBot adds athletics (Q542) information to Wikidata. In the initial run it will add a very limited set of results to existing athlete items that have a World Athletics athlete ID (P1146) set, such as Usain Bolt (Q1189), based on IAAF database pages. The bot will be extremely conservative, will never delete content (except its own content if there are mistakes) and will keep a log of all changes made. As per policy I will begin a small test run after announcing it here, and report back with more details afterwards.
Code: wikiTrackBot on NotABug.org (flexible)
Function details: The code is too fluid and simple at the moment to really go in depth about the functions, but it is 100% public and freely licensed. --Habst (talk) 08:00, 13 June 2018 (UTC)
Report after test run: I fiddled with the code a bit, but I think I'm pretty satisfied with the test run. Mostly made changes to Usain Bolt (Q1189) to include his most notable results. I've also added a lot of substance to the code, linked above. The bot should work on all athletics competitors for any discipline, but I'm still going to be conservative about things at first if it gets approved. So I think it's ready to be approved now, and of course I can make any improvements if suggested. --Habst (talk) 04:08, 14 June 2018 (UTC)
Update: After comments from MisterSynergy on my talk page, I'm going to revise the bot to meet a new schema and make some more attended edits today, overwriting my old Bolt ones. --Habst (talk) 15:34, 14 June 2018 (UTC)
Update 2: The bot has now been mostly revised so that it uses participant in (P1344), and the Bolt edits reflect that. The only big feature that may change in the future is the ability to descend into meet pages and add metadata there instead of keeping some of it per-result. I think the bot has sufficient info now to begin doing some attended runs of a few hundred edits on a select few athletes. I'm not sure if I would need the bot flag to do that. --Habst (talk) 02:15, 21 June 2018 (UTC)
- A few hundred is a lot. Please do a test run of 50-250 edits. Lymantria (talk) 06:20, 21 June 2018 (UTC)
- Hi @Lymantria, thanks for your response. I had the bot run exactly 250 edits today which got the Bolt races back to 2015 before I stopped it. I also made a decent update to the code to ensure it won't duplicate or overwrite anyone else's (or its own) work. The test run was still mostly limited in scope to Bolt, but I'm satisfied with the results so far. What do you think? --Habst (talk) 21:52, 21 June 2018 (UTC)
- Thanks. I would like to see input from others on this idea and the way you implement it, before I judge if I can approve this bot request. Lymantria (talk) 05:44, 22 June 2018 (UTC)
- This definitely goes in the right direction, but I have some remarks:
- Use of no value when there is no event item yet is formally not correct (unknown value wouldn’t be correct either). Besides that, I’m concerned that those values will never be properly updated once actual values are available. I suggest setting up individual event items (such as 2017 World Championships in Athletics – men's 100 metres (Q30765148)) and meet items (2017 World Championships in Athletics (Q175508)) in case nothing existing can be found (accidental duplicates can easily be merged later). Those items should contain some basic data about the meet and the individual event, be properly connected to each other (with part of (P361) and has part(s) (P527)), and then be used right away for the results instead of no value.
- I have similar concerns about the use of meet items as values for participant in (P1344) when no individual event items are available. Maybe it is wiser to create basic individual event items as well in those cases.
- The reason for these issues is basically that there is quite some basic setup work to do in this field. We cannot expect results to wait until everything is organized perfectly, but I think it is worth investing some effort here, in order to add results in a shape that does not need improvements later. I don’t think this would be very excessive, and I’m willing to help (although I’m rather busy right now). --MisterSynergy (talk) 06:22, 22 June 2018 (UTC)
- Hi @MisterSynergy, I do understand that and I think it is within my programming abilities to implement the auto-creation of event-specific WD items and meets when none can be found. I'll fully implement both the event creation (when one can't be found) and the meet creation (when one can't be found) some time this week, so every claim in participant in (P1344) will have a corresponding meet and event item linked.
- Also, this isn't strictly related to the bot, but just for some idea of how these results might be used: I've been messing around on enwiki and created an expandable results infobox that automatically grabs the data from WD via Lua, example at w:User:Habst/sandbox/athresults and w:Module:Sandbox/Habst/Athletics results. Part of my hesitance to move the result metadata to meet / event items was that it makes the Lua scripting harder, having to gather data from multiple Q items per page load (probably hundreds for people like Bolt) rather than just one, which also increases the Wikipedia page load times. That being said, it would still be possible, so maybe I just need to figure it out and get that to work as well. --Habst (talk) 10:28, 22 June 2018 (UTC)
- Regarding the latter point: I think it is fine to have sports results in the participant item as well as in the individual event item. With arbitrary access one could of course try to centralize it in one place, but it would be a really complicated task to collect everything then, particularly since we cannot use SPARQL in Lua. Page loading times are not much of a concern here (to my knowledge), as Wikipedia pages are completely cached for readers anyway. There could be problems with resource allocation when you access too many items while saving a page, however. --MisterSynergy (talk) 11:48, 22 June 2018 (UTC)
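The linking scheme suggested in the thread above (event item part of the meet, meet has part(s) the event, athlete participant in the event) can be sketched roughly as follows. This is only an illustration on a plain in-memory dictionary, not the bot's actual code: the helper names `link_event_to_meet` and `add_participation` are hypothetical, and a real bot would write these statements through the Wikidata API instead.

```python
# Property IDs taken from the discussion above.
PART_OF = "P361"
HAS_PART = "P527"
PARTICIPANT_IN = "P1344"


def link_event_to_meet(items, event_qid, meet_qid):
    """Add mutual part of / has part(s) statements, skipping any already present."""
    event = items.setdefault(event_qid, {})
    meet = items.setdefault(meet_qid, {})
    if meet_qid not in event.setdefault(PART_OF, []):
        event[PART_OF].append(meet_qid)
    if event_qid not in meet.setdefault(HAS_PART, []):
        meet[HAS_PART].append(event_qid)


def add_participation(items, athlete_qid, event_qid):
    """Point the athlete's participant in claim at the event item, never at 'no value'."""
    claims = items.setdefault(athlete_qid, {}).setdefault(PARTICIPANT_IN, [])
    if event_qid not in claims:
        claims.append(event_qid)


# Hypothetical in-memory stand-in for Wikidata: item ID -> {property ID -> [values]}.
items = {}
link_event_to_meet(items, "Q30765148", "Q175508")  # men's 100 metres -> 2017 World Championships
add_participation(items, "Q1189", "Q30765148")     # Usain Bolt -> men's 100 metres
```

Both helpers are written so that running them twice changes nothing, which matches the stated goal of not duplicating existing statements.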
Update 3: After lots of toiling, I think I have finally revised WTB so that it can now create missing meet and event pages when none are found. This significantly increases the edit count per performance, but it does make results now viewable per-meet and per-event. I ran it for roughly 250 more edits at Usain Bolt (Q1189), and I caught some bugs along the way but they should be squashed now. Thanks to @MisterSynergy for all the guidance and help he has provided with this. I'm really satisfied with the progress, and I think it was a worthwhile change. Having the ability to do more edits with a bot flag would really help the usefulness of the bot. I can of course make any requested changes, but I hope it should be good to go now. --Habst (talk) 03:04, 25 June 2018 (UTC)
- I am ready to approve the request in a couple of days, provided that no objections are raised. Lymantria (talk) 05:23, 26 June 2018 (UTC)
- Thank you, I've been notifying the members of WD:WikiProject Athletics on our talk page so hopefully we will be good to go soon. --Habst (talk) 05:43, 26 June 2018 (UTC)
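For readers wondering why creating missing meet and event items raises the edit count per performance, as noted in Update 3, here is a minimal find-or-create sketch. It is an assumption-laden illustration, not WikiTrackBot's code: `ItemStore`, `record_performance` and the `TEST…` IDs are hypothetical, lookup is by exact label only, and a real bot would need far more careful matching against existing Wikidata items.

```python
import itertools


class ItemStore:
    """Hypothetical stand-in for Wikidata: maps labels to item IDs and counts writes."""

    def __init__(self):
        self._by_label = {}
        self._ids = itertools.count(1)
        self.edits = 0

    def find_or_create(self, label):
        """Return the existing item ID for a label, creating (and counting) one if absent."""
        qid = self._by_label.get(label)
        if qid is None:
            qid = f"TEST{next(self._ids)}"  # placeholder ID, not a real Q-number
            self._by_label[label] = qid
            self.edits += 1
        return qid


def record_performance(store, claims, athlete, meet_label, event_label):
    """Ensure meet and event items exist, then add the claim unless it is already there."""
    store.find_or_create(meet_label)
    event = store.find_or_create(event_label)
    key = (athlete, event)
    if key in claims:
        return False  # duplicate guard: nothing is written twice
    claims.add(key)
    store.edits += 1
    return True
```

In this sketch the first performance at a new meet costs three writes (meet item, event item, claim), a further event at the same meet costs two, and repeating an already recorded performance costs none.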