Shortcuts: WD:RFBOT, WD:BRFA, WD:RFP/BOT

Wikidata:Requests for permissions/Bot

To request a bot flag, or approval for a new task, in accordance with the bot approval process, please input your bot's name into the box below, followed by the task number if your bot is already approved for other tasks.

Old requests go to the archive.

Once consensus is obtained in favor of granting the bot flag, please post requests at the bureaucrats' noticeboard.



Bot Name Request created Last editor Last edited
JneubertAutomated 3 2018-11-19, 07:41:42 Jneubert 2018-11-20, 06:32:44
MewBot 2 2018-11-13, 12:01:06 ArthurPSmith 2018-11-20, 14:57:01
Abbe98 Bot 2 2018-11-16, 12:29:40 Multichill 2018-11-18, 17:12:57
Soweego_bot_2 2018-11-05, 18:09:45 Jura1 2018-11-20, 06:50:04
JonHaraldSøbyWMNO-bot 2018-10-25, 13:00:07 Jura1 2018-10-25, 13:54:57
MewBot 2018-09-22, 09:38:20 Pamputt 2018-10-30, 21:58:48
soweego_bot 2018-09-12, 10:58:31 Lymantria 2018-09-18, 05:24:48
zbmathAuthorID 2018-08-27, 16:09:16 Ymblanter 2018-08-30, 17:46:59
ScorumMEBot 2 2018-08-06, 14:39:27 Lymantria 2018-09-01, 06:04:00
GZWDer (flood) 3 2018-07-23, 23:08:28 GZWDer 2018-09-13, 12:08:46
GZWDer (flood) 2 2018-07-16, 13:56:24 Liuxinyu970226 2018-09-15, 22:41:50
crossref bot 2018-04-19, 21:12:41 Crossref bot 2018-04-20, 14:38:57
WikiBot 2018-06-17, 15:10:10 Matěj Suchánek 2018-08-03, 09:09:13
PricezaBot 2018-06-14, 09:18:09 Praxidicae 2018-06-14, 19:29:22
schieboutct 2018-04-22, 01:39:47 Matěj Suchánek 2018-08-03, 09:09:43
wikidata get 2018-06-15, 10:51:58 Matěj Suchánek 2018-08-03, 09:15:02
Wolfgang8741 bot 2018-06-18, 02:17:10 Wolfgang8741 2018-09-05, 15:51:10
CanaryBot 2 2018-05-10, 23:46:00 Ivanhercaz 2018-05-14, 18:26:33
maria research bot 2018-03-13, 06:15:42 Mahdimoqri 2018-03-30, 14:06:13
AmpersandBot 2 2018-02-22, 01:43:22 Jura1 2018-03-12, 10:18:09
Arasaacbot 2018-01-15, 12:28:44 Matěj Suchánek 2018-08-08, 11:24:07
taiwan democracy common bot 2018-02-09, 07:09:27 Jura1 2018-03-21, 21:37:02
Newswirebot 2018-02-08, 13:00:18 Dhx1 2018-09-23, 11:53:12
KlosseBot 2017-11-17, 20:40:22 Matěj Suchánek 2018-08-03, 09:19:57
NIOSH bot 2017-11-14, 05:59:08 Ymblanter 2018-08-26, 20:33:45
neonionbot 2017-10-19, 06:15:18 ArthurPSmith 2017-10-19, 13:12:49
Handelsregister 2017-10-16, 07:39:42 Pasleim 2018-02-09, 08:46:30
Jntent's Bot 2017-06-30, 23:37:11 Matěj Suchánek 2018-08-03, 09:21:28
WikiProjectFranceBot 2017-05-08, 20:01:48 Lymantria 2018-05-31, 13:51:32
Jefft0Bot 2017-04-17, 15:16:29 Matěj Suchánek 2018-08-03, 09:18:22
MexBot 2 2017-06-08, 03:00:53 ValterVB 2017-06-25, 14:32:26
Emijrpbot 8 2017-03-25, 11:42:28 Matěj Suchánek 2017-06-09, 06:47:22
ZacheBot 2017-03-04, 23:29:38 Zache 2017-07-11, 11:13:15
YULbot 2017-02-21, 18:05:13 YULdigitalpreservation 2018-03-06, 13:15:37
YBot 2017-01-12, 16:43:19 Pasleim 2018-06-03, 17:52:12


JneubertAutomated 3

JneubertAutomated (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Jneubert (talk • contribs • logs)


Task/s:

Create items missing in Wikidata from 20th century press archives (Q36948990) (PM20) metadata

Code:

https://github.com/jneubert, https://github.com/zbw/sparql-queries

Function details:

A federated SPARQL query creates a list of PM20 entries (currently companies; the same approach could work for persons and other types), which is consumed by a script that transforms it into QuickStatements. The query makes sure that the PM20 folder ID (P4293) does not already exist in Wikidata, and it can restrict the entries to a subset that has already been checked in a Mix'n'match catalog.

Example items are Ruberoidwerke (Q58712849) or Banque d'Anvers (Q58718259). According to this discussion, the script has been extended to include official name (P1448).
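A minimal sketch of such a pipeline (not the operator's actual code, which lives in the repositories linked above) might look like this; it assumes the list of PM20 entries has already been retrieved from the PM20 endpoint, and the P31 target used for companies is an illustrative assumption:

```python
# Illustrative sketch only. Input: PM20 company entries (folder ID, English label)
# already fetched via the federated SPARQL query described above.
import requests

WDQS = "https://query.wikidata.org/sparql"

def pm20_id_exists(folder_id):
    """Check whether some item already carries this PM20 folder ID (P4293)."""
    query = 'ASK { ?item wdt:P4293 "%s" }' % folder_id
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    r.raise_for_status()
    return r.json()["boolean"]

def to_quickstatements(entries):
    """Emit QuickStatements v1 lines for entries not yet in Wikidata."""
    for folder_id, label_en in entries:
        if pm20_id_exists(folder_id):
            continue  # already present, skip
        yield "CREATE"
        yield 'LAST\tLen\t"%s"' % label_en
        yield 'LAST\tP31\tQ4830453'            # instance of: business (assumption)
        yield 'LAST\tP4293\t"%s"' % folder_id  # PM20 folder ID

if __name__ == "__main__":
    sample = [("co/000123", "Example AG")]     # hypothetical entry
    print("\n".join(to_quickstatements(sample)))
```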

MewBot 2

MewBot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Rua (talk • contribs • logs)

Task/s: Importing attested lexemes from en.wiktionary

Code:

Function details: Since Wikidata:Requests for permissions/Bot/MewBot doesn't seem to be going anywhere, I'd like to start with importing attested lexemes instead. I would like to import Northern Sami lemmas (Wiktionary's term for lexeme) onto Wikidata. Only the lexeme itself will be imported; no senses, no forms and no etymology. It will import the Álgu ID (P5903) property if possible, though. This is again a feasibility study, to see how well it works. A problem that I foresee is that of homographs having the same part of speech. When there are already one or more lexemes with the same spelling and part of speech on Wikidata, the bot has no way of telling which of them belongs to which Wiktionary lemma. Moreover, if they were imported, there'd be no way to distinguish them on Wikidata either and they'd look like duplicate lexemes until senses and/or etymology are added. The bot will skip these, but it will mean that some lexemes cannot be imported easily. A possible solution would be to add the Wikidata lexeme ID into the wikicode on Wiktionary's side, but I doubt the people on Wiktionary would like that as they seem to be a bit allergic to Wikidata.
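A hedged sketch of the homograph check described above (this is not MewBot's code): before creating a lexeme, ask the query service whether a lexeme with the same lemma and lexical category already exists; Q33947 (Northern Sami) and Q1084 (noun) are used as example values.

```python
# Sketch only: skip lemmas for which one or more matching lexemes already exist.
import requests

WDQS = "https://query.wikidata.org/sparql"

def existing_lexemes(lemma, language_qid, category_qid):
    query = """
    SELECT ?lexeme WHERE {
      ?lexeme dct:language wd:%s ;
              wikibase:lemma ?lemma ;
              wikibase:lexicalCategory wd:%s .
      FILTER(STR(?lemma) = "%s")
    }""" % (language_qid, category_qid, lemma)
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    r.raise_for_status()
    return [b["lexeme"]["value"] for b in r.json()["results"]["bindings"]]

# Example: Northern Sami (Q33947) noun (Q1084) "giella"
if existing_lexemes("giella", "Q33947", "Q1084"):
    print("possible homograph already on Wikidata, skipping")
```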

The code will be adapted from the first proposal, so it has already been demonstrated to work. I only need to change the language and remove the code that imports the etymology. I would like to import other languages using this method later, so that Wikidata will have a good set of lexemes to start with and other users can then add the data to them as they see fit. Having the lexemes already present also makes adding etymologies easier.

Copyright shouldn't be an issue, as lemmas and parts of speech don't seem like copyrightable things to begin with.

--—Rua (mew) 12:00, 13 November 2018 (UTC)

  • Support Can't you distinguish homographs via the ID property? ArthurPSmith (talk) 15:20, 19 November 2018 (UTC)
    • On Wikidata, yes, but how do you know which of them a particular Wiktionary lemma belongs to? —Rua (mew) 15:40, 19 November 2018 (UTC)
      • I meant Álgu ID (P5903) - doesn't that uniquely identify each homograph? But I guess you indicated it was not available for all of them. ArthurPSmith (talk) 14:57, 20 November 2018 (UTC)

Abbe98_Bot

Abbe98_Bot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Abbe98 (talk • contribs • logs)

Task/s: Import of IIIF manifests for artworks from the Nationalmuseum (Sweden). Example edits: https://www.wikidata.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user&target=Abbe98+Bot&namespace=&tagfilter=&topOnly=1&start=2018-11-16&end=2018-11-16

Code: https://gist.github.com/Abbe98/67f6702478a4b556609cb186162a60a6

Function details: Imports IIIF Manifests for artworks already in Wikidata together with a source claim. --Abbe98 (talk) 12:29, 16 November 2018 (UTC)
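For illustration only, a hedged pywikibot sketch of this kind of import (the actual code is in the gist above); the manifest property and source item IDs below are placeholders, not confirmed values:

```python
# Sketch with placeholder IDs -- not the code from the gist above.
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

P_MANIFEST = "P0000"     # placeholder for the property holding the IIIF manifest URL
P_STATED_IN = "P248"     # stated in, used for the source reference
Q_SOURCE = "Q0000"       # placeholder for the Nationalmuseum source item

def add_manifest(qid, manifest_url):
    item = pywikibot.ItemPage(repo, qid)
    claim = pywikibot.Claim(repo, P_MANIFEST)
    claim.setTarget(manifest_url)                      # URL-valued statement
    item.addClaim(claim, summary="Adding IIIF manifest")
    source = pywikibot.Claim(repo, P_STATED_IN)        # source claim, as described above
    source.setTarget(pywikibot.ItemPage(repo, Q_SOURCE))
    claim.addSources([source])
```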

I renamed the request because we already had Wikidata:Requests for permissions/Bot/Abbe98 Bot. Looks good to me. Multichill (talk) 17:12, 18 November 2018 (UTC)

Feedback

  • @Lymantria, Hjfocs: in the sample edits, it duplicated "official website" and IMDb ID to "described at URL". --- Jura 19:33, 18 November 2018 (UTC)
@Jura1: thank you very much for your feedback. I have posted a reply here to avoid further modification of this archived page: User_talk:Jura1#Thanks_for_your_feedback_on_User:Soweego_bot_task_2


Dear Jura1 (talkcontribslogs),

In Wikidata:Requests_for_permissions/Bot/Soweego_bot_2, you mentioned 2 important points. Let me explain.

  1. IMDb: the bot sees raw URLs from target sources like MusicBrainz (Q14005) and tries its best to convert them to known external identifiers. To do so, it attempts to match each given input URL against all formatter URL (P1630) values of external identifier properties, and to extract the correct identifier through format as a regular expression (P1793). This is done via SPARQL queries (a sketch of this matching logic follows below the list). Unfortunately, IMDb ID (P345) seems to have an exotic formatter URL marked as preferred statement: https://tools.wmflabs.org/wikidata-externalid-url/?p=345&url_prefix=https://www.imdb.com/&id=$1. It will never match an IMDb input URL, which is why you are seeing them added as is. Do you have any suggestions to avoid this? Of course, consider that building a custom rule for each exception is not a sustainable solution;
  2. official website (P856) vs. described at URL (P973): I totally understand this point and will implement an extra check for official website (P856) values.
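A hedged sketch of the matching logic described in point 1 (not the actual soweego code): build one regular expression per external-ID property from its formatter URL (P1630) and format as a regular expression (P1793), then try each against a raw input URL.

```python
# Sketch only: resolve raw URLs to (property, identifier) pairs where possible.
import re
import requests

WDQS = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?property ?formatter ?pattern WHERE {
  ?property wikibase:propertyType wikibase:ExternalId ;
            wdt:P1630 ?formatter ;
            wdt:P1793 ?pattern .
}"""

def load_resolvers():
    data = requests.get(WDQS, params={"query": QUERY, "format": "json"}).json()
    resolvers = []
    for b in data["results"]["bindings"]:
        formatter, id_pattern = b["formatter"]["value"], b["pattern"]["value"]
        # Escape the formatter URL, then put the ID pattern back in place of "$1".
        url_regex = re.escape(formatter).replace(re.escape("$1"), "(%s)" % id_pattern)
        resolvers.append((b["property"]["value"], re.compile(url_regex)))
    return resolvers

def resolve(url, resolvers):
    for prop, regex in resolvers:
        match = regex.fullmatch(url)
        if match:
            return prop, match.group(1)
    return None   # no formatter matched: fall back to a plain URL statement

# An exotic formatter URL, like the preferred one for IMDb ID (P345), will simply
# never match an input URL -- exactly the failure mode described in point 1.
```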

Thanks again for your precious comments.

Cheers,
Hjfocs (talk) 09:46, 19 November 2018 (UTC)

Sounds good. BTW, I added the comment above as well, as it might be easier for other people to find. Further, I included the recommended "new section" header. --- Jura 06:49, 20 November 2018 (UTC)

JonHaraldSøbyWMNO-bot

JonHaraldSøbyWMNO-bot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Jon Harald Søby (WMNO) (talk • contribs • logs)

Task/s: Add items (and keep them up-to-date) from the Sami bibliography from the National Library of Norway to Wikidata.

Code: Not published yet (but will be eventually, see phab:T205631 et al)

Function details: As part of Wikimedia Norge's Northern Sami project, we have prepared an import of the data from the Sami bibliography to Wikidata. The Sami bibliography is a listing of all works published in Sami languages or about Sami people/culture in Norway. It contains around 26,000 work editions with plenty of metadata, and the items will be structured according to the standards laid out in Wikidata:WikiProject Books. I am also planning to write a script to keep the data up-to-date, but the first priority is doing the import. --Jon Harald Søby (WMNO) (talk) 12:59, 25 October 2018 (UTC)

  • interesting. Please do some tests once ready. --- Jura 13:54, 25 October 2018 (UTC)

MewBot

MewBot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Rua (talk • contribs • logs)

Task/s: Importing lexemes from en.Wiktionary in specific languages

Code:

Function details: The bot will be used to parse entries from English Wiktionary using pywikibot and mwparserfromhell, and then either create lexemes on Wikidata, or add information to existing lexemes. Care is taken to not duplicate information: the script checks if the lexeme exists and already has the desired properties and only adds anything if not. In case of doubt (e.g. multiple matching lexemes already exist) it skips the edit. I made some test edits using my own user account, they can be seen from [3] to [4]. Today I did a few on the MewBot account.
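A hedged sketch of the parsing step described above (not MewBot's actual code), using pywikibot and mwparserfromhell as mentioned:

```python
# Sketch only: fetch an en.wiktionary page and extract its headword templates.
import pywikibot
import mwparserfromhell

enwikt = pywikibot.Site("en", "wiktionary")

def headword_templates(title):
    page = pywikibot.Page(enwikt, title)
    wikicode = mwparserfromhell.parse(page.text)
    # "head" is the generic headword template; language-specific headword
    # templates would need their own handling.
    return [t for t in wikicode.filter_templates() if t.name.matches("head")]

# The import step would then look up existing lexemes with the same lemma and
# lexical category and skip the edit in case of doubt, as described above.
```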

Individual imports will be proposed at the lexicographical data project first, as the project leaders have asked for care with imports at first. The current proposal is for Proto-Samic and Proto-Uralic lexemes, see Wikidata talk:Lexicographical data#Requesting permission for bot import: Proto-Uralic and Proto-Samic lexemes. Once the project leaders give the OK for all imports, permission will no longer be needed for individual imports. Planned future imports are for Dutch and the modern Sami languages. --—Rua (mew) 09:37, 22 September 2018 (UTC)

I am ready to approve this request in a couple of days, provided that no objections are raised in the meantime. Lymantria (talk) 05:27, 25 September 2018 (UTC)
I just noticed that Wikidata:Bots says I need to indicate where the bot copied the data from. How do I indicate that the data came from Wiktionary? —Rua (mew) 10:51, 25 September 2018 (UTC)
Could you run your bot on few entries in order to evaluate it? Thanks in advance. Pamputt (talk) 10:59, 26 September 2018 (UTC)
I did, already. Do I need to do more? —Rua (mew) 11:02, 26 September 2018 (UTC)
Oppose Ah sorry, I did not check before asking. For all reconstructed forms, I think a reference is mandatory. As these "words" do not exist, they come from specialists' work and have to be sourced. Two linguists may reconstruct different forms. That said, I am not sure about the copyright status of reconstructed forms. They probably belong to the public domain as scientific work, but it would be better to be sure. Pamputt (talk) 21:42, 26 September 2018 (UTC)
Not all reconstructions on Wiktionary can be sourced to some external work. Some were reconstructed by Wiktionary editors. This is because not all reconstructed forms are available in external works, and we have to fill the gaps ourselves. The bot adds links to Álgu and Uralonet if one exists. —Rua (mew) 22:26, 26 September 2018 (UTC)
I strongly disagree with importing reconstructed forms that do not come from scientific works. One needs criteria to accept such forms, and an academic paper is a good one. Otherwise, anyone can guess their own form. So if you run your bot, please import only "validated" forms. Pamputt (talk) 14:18, 27 September 2018 (UTC)
I agree with that. Only sourced reconstructed forms should be imported. Unsui (talk) 15:50, 27 September 2018 (UTC)
Wiktionary's goal is to be an alternative to existing dictionaries, including etymological dictionaries, not to be dependent on them. The criterion used by Wiktionary is that reconstructions follow established sound laws. Some reconstructions from linguistic sources don't pass that criterion. It fits with the general policy on Wiktionary of not blindly copying from dictionaries but making sure that forms make sense. Reconstructions that are questionable, whether from an external source or not, can be discussed and deleted if found to be invalid. If you have doubts about any of the reconstructions on Wiktionary, you should discuss it there.
That said, what should be done if words in different languages come from a common source, but there is no source that gives a reconstruction? Can lemmas be empty? —Rua (mew) 15:54, 27 September 2018 (UTC)
Here are some cases where Wiktionary has had to correct errors and omissions in sources. I provide a link to Wiktionary, and a link to Álgu, which gives its source.
...and many more. So you see if we have to rely on sources, we become vulnerable to errors, whereas we can correct those errors on Wiktionary, making it more reliable. If Wikidata can't apply the same level of scientific rigour then that is rather worrying. —Rua (mew) 16:42, 27 September 2018 (UTC)
Wiktionary's goal is to be an alternative for existing dictionaries, including etymological dictionaries, not to be dependent on them.
This is maybe the case on the English Wiktionary, but on the French Wiktionary original work on etymology is not allowed; all etymological information has to be sourced. Still, Wikidata has to define its own criteria, and regarding reconstructed forms nothing has been decided so far. About your question "what do we do when a source gives wrong information", I would say that in this case we set a deprecated rank. Pamputt (talk) 19:05, 27 September 2018 (UTC)
You say, for example, "North Sami requires final *ā". OK, but why not *ö? Because linguists have defined laws for this language. It is always linguists' work. Hence, it is possible to put a reference. Otherwise anything may be created as a reconstructed form. Unsui (talk) 07:16, 28 September 2018 (UTC)
That's nonsense. It still has to stand up to scrutiny. —Rua (mew) 10:02, 28 September 2018 (UTC)
  • For how many new ones is this? --- Jura 11:11, 26 September 2018 (UTC)
  • Oppose for now. It's unclear how many would be imported and we need to solve the original research question first. --- Jura 08:03, 27 September 2018 (UTC)
    Can you elaborate? I don't see what the problem is. —Rua (mew) 10:07, 27 September 2018 (UTC)
    Apparently, you don't know how many you plan to import. --- Jura 10:12, 27 September 2018 (UTC)
    I gave a link to the categories in the other discussion. —Rua (mew) 10:20, 27 September 2018 (UTC)
    • Can you make a reliable statement? Categories tend to evolve and change subcategories. --- Jura 10:22, 27 September 2018 (UTC)
    wikt:Category:Proto-Samic lemmas currently contains 1303 entries. —Rua (mew) 10:25, 27 September 2018 (UTC)
  • I've made a post regarding the import and the conflict in Wiktionary vs Wikidata's policies: wikt:WT:Beer parlour/2018/September#What is Wiktionary's stance on reconstructions missing from sources?. —Rua (mew) 17:36, 27 September 2018 (UTC)
  • Is there any news on this? —Rua (mew) 10:08, 17 October 2018 (UTC)
    @Jura1:, are you fine now with the approval of this bot?--Ymblanter (talk) 13:01, 21 October 2018 (UTC)
    • I will try to write something tomorrow. --- Jura 18:21, 21 October 2018 (UTC)
    • First: sorry for the delay. The question what to do with lexemes reconstructed at Wiktionary remains open. In general, we would only import information from other WMF sites when we know or can assume that it can be referenced to other quality sources. This isn't the case here. One could argue that Wiktionary is an independent dictionary website and should be considered a reference on its own. Whether or not this is the case depends on how Wikidata and the various Wiktionaries will work going forward. The closer Wiktionary and Wikidata would work together going forward the less we can consider it as such. --- Jura 04:14, 25 October 2018 (UTC)
      • The majority of the Proto-Samic entries on Wiktionary does have an Álgu ID (P5903). Proto-Uralic entries mostly have Uralonet ID (P5902), but the lemma is not always identical to the form given on Uralonet, for which User:Tropylium is mostly responsible as the primary Uralic expert on Wiktionary. Would it be acceptable to import only those entries that have one of these IDs?
      • If so, that leaves the question of what to do with the remainder. It would be a shame if these can't be included in Wikidata, and would mean that Wiktionary is always more complete than Wikidata can be. Words that have an etymology on Wiktionary would have none on Wikidata, because of the Proto-Samic ancestral form being missing. —Rua (mew) 18:43, 30 October 2018 (UTC)
    @Rua: yes, importing lexemes that have Álgu ID (P5903) or Uralonet ID (P5902) is fine with me. However, lexemes for which the lemma is not identical to the form given on Uralonet should not be imported, because they are not verifiable. They have to match what the source says. Pamputt (talk) 21:58, 30 October 2018 (UTC)
  • Now pinging @Pamputt: as well.--Ymblanter (talk) 20:02, 21 October 2018 (UTC)
    I did not change my opinion, because this bot wants to import reconstructed forms without any academic references. If the bot uses academic works as sources, it is fine with me; if not, I oppose (and the discussion shows that we are in the latter case). Pamputt (talk) 20:08, 21 October 2018 (UTC)

zbmathAuthorID

zbmathAuthorID (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Zbmath authorid (talk • contribs • logs)

Task/s: Add the external ID zbmath author ID (P1556) to Wikidata items of mathematicians, based on manually checked data curation at zbmath.org

Code:

Function details: The mathematical bibliographic database zbmath.org (Q18241050) maintains links to several services and databases, among others to Wikidata (e.g. https://zbmath.org/authors/?q=ai:dieudonne.jean-alexandre links to Q371957). It currently has 11,000 such links, half of which have been established manually. I would like to register a bot that, for each of these zbmath→Wikidata links, stores the corresponding back link Wikidata→zbmath on the Wikidata side, i.e. adds one P1556 claim. It would run on a daily basis, with a very low load (approximately 5 edits a day).
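A hedged sketch of such a daily back-link job (not the operator's code); input is assumed to be (QID, zbMATH author ID) pairs exported from zbmath.org:

```python
# Sketch only: add zbMATH author ID (P1556) where it is not yet present.
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def add_backlink(qid, zbmath_id):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    if "P1556" in item.claims:     # back link already stored
        return
    claim = pywikibot.Claim(repo, "P1556")
    claim.setTarget(zbmath_id)
    item.addClaim(claim, summary="Adding zbMATH author ID from zbmath.org")

# Example pair taken from the description above (Jean Dieudonné):
add_backlink("Q371957", "dieudonne.jean-alexandre")
```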

--Zbmath authorid (talk) 16:09, 27 August 2018 (UTC)

Please make some test edits.--Ymblanter (talk) 17:46, 30 August 2018 (UTC)

ScorumMEBot

ScorumMEBot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: ScorumME (talk • contribs • logs)

Task/s: Creation and updating of football statistics on Wikidata; at the moment these are wins, losses, draws, goals scored and goals conceded in the league for certain teams. The bot will work in semi-manual mode. I am ready to take responsibility for all edits made by the bot.

Function details: The bot is written in Node.js and uses the Wikidata Edit and Wikidata SDK libraries by Maxlath (https://www.wikidata.org/wiki/Wikidata:Tools/External_tools). The server listens to a feed that provides real-time football match statistics and, after parsing it, sends requests to update the corresponding Wikidata data. All request rate limits are respected.

Information about the identifiers of the Wikidata records is stored in the bot's database.

Question: How long should we wait for your decision on our request? – The preceding unsigned comment was added by ScorumMEBot (talk • contribs).
Please perform some test edits. Lymantria (talk) 06:03, 1 September 2018 (UTC)

GZWDer (flood) 3

GZWDer (flood) (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: GZWDer (talk • contribs • logs)

Task/s: Creating items for all Unicode characters

Code: Unavailable for now

Function details: Creating items for 137,439 characters (probably excluding those not in Normalization Forms):

  1. Label in all languages (if the character is printable; otherwise only Unicode name of the character in English)
  2. Alias in all languages for U+XXXX and in English for Unicode name of the character
  3. Description in languages with a label of Unicode character (P487)
  4. instance of (P31)Unicode character (Q29654788)
  5. Unicode character (P487)
  6. Unicode hex codepoint (P4213)
  7. Unicode block (P5522)
  8. writing system (P282)
  9. image (P18) (if available)
  10. HTML entity (P4575) (if available)
  11. For characters in Han script also many additional properties; see Wikidata:WikiProject CJKV character

For characters with existing items the existing items will be updated.
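As a hedged illustration of where much of this per-character data can come from, Python's unicodedata module already provides the name, printable form and hex codepoint used in the list above; block, script and image statements would need additional data sources:

```python
# Sketch only: derive a few of the statements listed above for one codepoint.
import unicodedata

def character_statements(codepoint):
    char = chr(codepoint)
    name = unicodedata.name(char, "")        # e.g. "GREEK CAPITAL LETTER OMEGA"
    return {
        "label": char if char.isprintable() else name,
        "aliases": ["U+%04X" % codepoint, name],
        "P31": "Q29654788",                  # instance of: Unicode character
        "P487": char,                        # Unicode character
        "P4213": "%04X" % codepoint,         # Unicode hex codepoint
    }

print(character_statements(0x03A9))          # Ω (GREEK CAPITAL LETTER OMEGA)
```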

Question: Do we need only one item for characters with the same normalized forms, e.g. Ω (U+03A9, GREEK CAPITAL LETTER OMEGA) and Ω (U+2126, OHM SIGN)?--GZWDer (talk) 23:08, 23 July 2018 (UTC)

CJKV characters belonging to CJK Compatibility Ideographs (Q2493848) and CJK Compatibility Ideographs Supplement (Q2493862), such as 著 (U+FA5F) (Q55726748) and 著 (U+2F99F) (Q55738328), will need to be split from their normalized form, e.g. (Q54918611), as each of them has different properties. KevinUp (talk) 14:03, 25 July 2018 (UTC)

Request filed per suggestion on Wikidata:Property proposal/Unicode block.--GZWDer (talk) 23:08, 23 July 2018 (UTC)

Support I have already expressed my wish to import such a dataset. Matěj Suchánek (talk) 09:25, 25 July 2018 (UTC)
Support @GZWDer: Thank you for initiating this task. Also, feel free to add yourself as a participant of Wikidata:WikiProject CJKV character. [14] KevinUp (talk) 14:03, 25 July 2018 (UTC)
Support Thank you for your contribution. If possible, I hope you will also add other code (P3295) values such as JIS X 0213 (Q6108269) and Big5 (Q858372) to the items you create or update. --Okkn (talk) 16:35, 26 July 2018 (UTC)
  • Oppose the use of the flood account for this. Given the problems with unapproved, defective bot runs under the "GZWDer (flood)" account, I'd rather see this being done with a new account named "bot" as per policy.
    --- Jura 04:50, 31 July 2018 (UTC)
  • Perhaps we could do a test run of this bot with some of the 88,889 items required by Wikidata:WikiProject CJKV character and take note of any potential issues with this bot. @GZWDer: You might want to take note of the account policy required. KevinUp (talk) 10:12, 31 July 2018 (UTC)
  • This account has had a bot flag for over four years. While most bot accounts contain the word "bot", there is nothing in the bot policy that requires it, and a small number of accounts with the bot flag have different names. As I understand it, there is also no technical difference between an account with a flood flag and an account with a bot flag, except for who can assign and remove the flags. - Nikki (talk) 19:14, 1 August 2018 (UTC)
  • The flood account was created and authorized for activities that aren't actually bot activities, while this new task is one. Given that defective bot tasks have already been run with the flood account, I don't think any actual bot tasks should be authorized. It's enough that I already had to clean up tens of thousands of GZWDer's edits.
    --- Jura 19:46, 1 August 2018 (UTC)
I am ready to approve this request, after a (positive) decision is taken at Wikidata:Requests for permissions/Bot/GZWDer (flood) 4. Lymantria (talk) 09:11, 3 September 2018 (UTC)
  • Wouldn't these fit better into Lexeme namespace? --- Jura 10:31, 11 September 2018 (UTC)
    There is no language with all Unicode characters as lexemes. KaMan (talk) 14:31, 11 September 2018 (UTC)
    Not really a problem. language codes provide for such cases. --- Jura 14:42, 11 September 2018 (UTC)
    I'm not talking about language code but language field of the lexeme where you select q-item of the language. KaMan (talk) 14:46, 11 September 2018 (UTC)
    Which is mapped to a language code. --- Jura 14:48, 11 September 2018 (UTC)
Note: I'm going to be inactive due to real-life issues, so this request is On hold for now. Comments are still welcome, but I will not be able to answer them until January 2019.--GZWDer (talk) 12:08, 13 September 2018 (UTC)

GZWDer (flood) 2

GZWDer (flood) (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: GZWDer (talk • contribs • logs)

Task/s: Create new items and improve existing items from cebwiki and srwiki

Code: Run via various Pywikibot scripts (probably together with other tools)

Function details: The work include several steps:

  1. Create items from w:ceb:Kategoriya:Articles without Wikidata item (plan to do together with step 2)
  2. Import GeoNames ID (P1566) for pages from w:ceb:Kategoriya:GeoNames ID not in Wikidata
  3. Import coordinate location (P625) for pages from w:ceb:Kategoriya:Coordinates not on Wikidata‎
  4. Add country (P17) for cebwiki items
  5. Add instance of (P31) for cebwiki items
  6. (probably) Add located in the administrative territorial entity (P131) for cebwiki items
  7. (probably) Add located in time zone (P421) for cebwiki items
  8. Add descriptions in Chinese and English for cebwiki items (only if steps 4 and 5 are completed)

For srwiki, the actions are similar.
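For illustration, a hedged sketch of steps 1-2 with a pre-creation duplicate check on the GeoNames ID (this is not the operator's script, and extract_geonames_id is a hypothetical helper):

```python
# Sketch only: iterate unconnected cebwiki pages and skip likely duplicates.
import re
import pywikibot
from pywikibot import pagegenerators

ceb = pywikibot.Site("ceb", "wikipedia")
repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def extract_geonames_id(wikitext):
    """Hypothetical helper: pull a GeoNames ID out of the article wikitext."""
    m = re.search(r"geonames\.org/(\d+)", wikitext)
    return m.group(1) if m else None

def item_with_geonames_id(geonames_id):
    """Return an item already carrying this GeoNames ID (P1566), if any."""
    query = 'SELECT ?item WHERE { ?item wdt:P1566 "%s" } LIMIT 1' % geonames_id
    for item in pagegenerators.WikidataSPARQLPageGenerator(query, site=repo):
        return item.title()
    return None

category = pywikibot.Category(ceb, "Kategoriya:Articles without Wikidata item")
for page in category.articles():
    geonames_id = extract_geonames_id(page.text)
    if geonames_id and item_with_geonames_id(geonames_id):
        continue   # probably a duplicate of an existing item: skip or flag for merging
    # ...otherwise create the item and add P1566, as in steps 1-2 above
```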

--GZWDer (talk) 13:56, 16 July 2018 (UTC)

Note: until phab:T198396 is fixed, this can only be done step by step, not multiple tasks at a time.--GZWDer (talk) 14:02, 16 July 2018 (UTC)
Support Thank you for your elaboration! Keeping to my word now. Mahir256 (talk) 13:59, 16 July 2018 (UTC)
@Mahir256: Please unblock the bot account. I'm not going to import more statements from cebwiki (and srwiki) until the discussion is closed, and I have several other (low-speed) uses of the bot account.--GZWDer (talk) 14:01, 16 July 2018 (UTC)
Yes, I did that, as I said I would do. Although @GZWDer: what will differ in your procedure with regard to the srwiki items? A lot of those places might have eswiki article equivalents (with the same INEGI code (Q5796667)); do you plan to link these if they exist? Mahir256 (talk) 14:02, 16 July 2018 (UTC)
The harvest_template script can not check duplicates and duplicates can only be found after data is imported (this may be a bug, though).--GZWDer (talk) 14:04, 16 July 2018 (UTC)
@Pasleim: Would this functionality be easy to add to the tool? It certainly seems desirable, especially with regard to GeoNames IDs. Mahir256 (talk) 14:06, 16 July 2018 (UTC)
See phab:T199698. I do not use Pasleim's harvest template tool because the tool stops automatically when it encounters errors (it should retry the edit; if it hits the rate limit, it should retry after some time)--GZWDer (talk) 14:10, 16 July 2018 (UTC)
Oppose cebwiki is, as many users have noted, the black hole of wikis. This so-called "data" has too many mistakes. --Liuxinyu970226 (talk) 14:15, 16 July 2018 (UTC)
  • Oppose Needs to do far more checking as to whether related items already exist, to add the information and sitelink to existing items if possible, and to appropriately relate the new item to existing items if not. If other items already have any matching identifiers (but are e.g. linked to a different ceb-wiki item), or there is any other reason to think it may be a duplicate, then any new item should be marked instance of (P31) Wikimedia duplicated page (Q17362920) as its only P31, and be linked to the existing item by said to be the same as (P460). Jheald (talk) 14:19, 16 July 2018 (UTC)
    • Duplicates are easier to find after they are imported to Wikidata than on cebwiki.--GZWDer (talk) 14:24, 16 July 2018 (UTC)
@Jheald: It may be worth our time (or worth the time of those who already make corrections on cebwiki) to go to GeoNames and correct things our(them)selves so that in the event Lsjbot returns it doesn't recreate these duplicates. Mahir256 (talk) 14:34, 16 July 2018 (UTC)
@GZWDer: I try bloody hard to avoid creating new items that are duplicates, going to considerable lengths with off-line scripts and augmenting existing data to avoid doing so; and doing my level best to clear up any that have slipped online, as quickly as I can. I don't see why I should expect less from anybody else. Jheald (talk) 14:45, 16 July 2018 (UTC)
  • Comment Given the capacity problems of Wikidata and the fact that cebwiki is practically dormant, I don't think this should be done. Somehow I doubt the operator will do any of the announced maintenance, as I think they announced that a couple of months back and then left it to other Wikidata users. So no, not another 600,000 items. For the general discussion, see Wikidata:Project_chat#Another_cebwiki_flood?.
    --- Jura 20:18, 16 July 2018 (UTC)
    • cebwiki is not dormant as the articles are still being maintained.--GZWDer (talk) 00:30, 17 July 2018 (UTC)
    • Is there a way to see this on ceb:? I take it that any user on ceb:Special:Recent changes without a local user page isn't really active there.
      --- Jura 04:41, 26 July 2018 (UTC)
  • Oppose Per Jheald. Planning on the basis that it "is much easier to find such duplicates if the data is stored in a structured way", and thus deliberately importing duplicates (which won't be merged within a very short time), is an abuse of Wikidata and our resources. Resources spent on cleaning the mess of some origin are missing at other places to bring high-quality data to other wikis and elsewhere. The duplicates are a big problem; they pop up in search and queries etc. Sitelinks might be added after the data is cleaned off-Wikidata (if cleaning is feasible at all; perhaps deletion of articles on cebwiki is a better solution than importing cebwiki sitelinks here). --Marsupium (talk) 23:26, 18 July 2018 (UTC)
    • Duplicates already exist everywhere in Wikidata, so it is not guaranteed that different items refer to different concepts (though that is usually the case), and nobody should use search and query results directly without care. Searches are not intended to be used directly by third-party users. For queries, if data consumers really think duplicates in Wikidata query results are an issue, they can choose to exclude cebwiki-only items from the query result.--GZWDer (talk) 23:45, 18 July 2018 (UTC)
  • Oppose Thanks a lot for your work on other wikis, it is immensely useful, but this workflow is really not appropriate for cebwiki. Creating new cebwiki items without being certain that they do not duplicate existing items creates a significant strain on the community. It is not okay to expect people to find ways to exclude cebwiki-only items in query results as a result: these items should not be created in the first place. − Pintoch (talk) 09:55, 19 July 2018 (UTC)
    • Probably 90% of entries are unique to cebwiki. It may be wise to import these unique entries first.--GZWDer (talk) 16:38, 20 July 2018 (UTC)
      • Well, whatever the actual percentage is, many of us have painfully experienced that it is way too low for our standards. It may be wise to be more considerate to your fellow contributors, and stop hammering the server too. A lot of people have complained about cebwiki item creations, and it is really a shame that a block was necessary to actually get you to stop. So I really stand by my oppose. − Pintoch (talk) 07:34, 21 July 2018 (UTC)
    • The approach outlined above doesn't really address any of the problems with the data.
      --- Jura 04:41, 26 July 2018 (UTC)

Plan 2

The plan only does:

  1. Create items from w:ceb:Kategoriya:Articles without Wikidata item (plan to do together with step 2)
  2. Import GeoNames ID (P1566) for pages

Therefore:

  1. It is easier to find articles that exist in other Wikipedias by search and projectmerge (and possibly Mix'n'match and other tools)
  2. It is also possible to find entries from the GeoNames ID, and vice versa
  3. As no other data will be imported in plan 2, it will not pollute query results and OpenRefine (unless one specifically queries the GeoNames ID)
  4. Others may still import other data to these items, but only if they're confident to do so; they had better import coordinates etc. from a more reliable database (e.g. GEOnet Names Server)

--GZWDer (talk) 06:09, 26 July 2018 (UTC)

Oppose I just oppose your *cebwiki* importing; you are free to import Special:UnconnectedPages links from wikis other than this one. --Liuxinyu970226 (talk) 04:45, 31 July 2018 (UTC)
  • @Pasleim: seems to have done quite a lot of maintenance on cebwiki sitelinks. I'm curious what his view is on this.
    --- Jura 06:39, 31 July 2018 (UTC)
Oppose, this still pollutes OpenRefine results - especially when reconciling via GeoNames ID, which should be the preferred way when this id is available in the table. I don't see how voluntarily keeping the items basically blank would be a solution at all; it makes it harder to find duplicates. − Pintoch (talk) 11:54, 5 August 2018 (UTC)
Do you have experience with matching based on existing GeoNames IDs then? I still see items on a regular basis which have the wrong ID thanks to bots which imported lots of bad matches years ago (e.g. Weschnitz (Q525148) and River Coquet (Q7337301)), so it would be great if you could explain what you did to avoid mismatches so that bots can do the same. If bots assume that our GeoNames IDs are correct, they'll add sitelinks/statements/descriptions/etc to the wrong items and make a mess that's much harder to clean up than duplicates are. - Nikki (talk) 20:09, 5 August 2018 (UTC)
@Pintoch: Wikidata QIDs are designated as persistent identifiers; they are still valid when items are merged, but no guarantees should be assumed that any item (whether bot-created or not) is never merged or redirected. There are plenty of mismatches between cebwiki and Wikidata (which should be solved), but creating new items will not bring any new mismatches. Also, why do you think that leaving cebwiki pages unconnected makes it easier to find duplicates?--GZWDer (talk) 09:28, 6 August 2018 (UTC)
@Nikki: Yes I have experience with matching based on GeoNames IDs, and it generally gives very bad results because many items get matched to cebwiki items instead of the canonical item. I don't have any good strategy to avoid mismatches and that is the reason why I regret that these cebwiki items have been created without the appropriate checks for existing duplicates. I understand that cebwiki imports are not the only imports responsible for the unreliability of GeoNames ids in Wikidata, but in my experience the majority of errors came from cebwiki. I am not sure I fully get your point: are you arguing that it is fine to create duplicate cebwiki items because GeoNames IDs in Wikidata are already unreliable? I don't see how existing errors are an excuse for creating more of them. − Pintoch (talk) 09:02, 12 August 2018 (UTC)
@Pintoch: I am arguing that we need to avoid linking the cebwiki pages to the wrong items because merges are vastly better than splits, and that will involve some duplicates. Duplicate IDs continue being valid and will point to the right item even after a merge. The same is not true of splitting and you never know who is already using the ID. I agree that it would be nice to reduce the number of duplicates it creates, but nobody seems to have any idea how it should do that without creating even more bad matches, which is why I was hoping you might have some tips. - Nikki (talk) 13:12, 12 August 2018 (UTC)
@Nikki: okay, I get your point, thanks. So, no I haven't looked into the problem myself. If I had time I would first try to clean up the current items rather than creating new ones (and you have worked on this: thanks again!). I don't think there is any rush to empty w:ceb:Kategoriya:Articles without Wikidata item, so that's why I oppose this bot request. − Pintoch (talk) 18:24, 12 August 2018 (UTC)
@GZWDer: creating new items will not bring any new mismatches: creating new items will create new duplicates, and that is what disrupts our workflows. I personally don't care about the Wikidata <-> cebwiki mapping. If you care about this mapping, then please improve it without creating duplicates (that is, with reliable heuristics to match the cebwiki articles to existing items). If you do not have the tools to do this import without being disruptive to other Wikidata users, then don't do it. If someone else files a bot request to do this task, with convincing evidence that their import process is more reliable than yours, I will happily support it. − Pintoch (talk) 09:02, 12 August 2018 (UTC)
@Pintoch: Your argument is basically "creating new duplicates is harmful in any case" - but duplicates already exist everywhere, created by different users. They may eventually be merged, and their IDs will still be valid. There are many more cases where no match is found, and no items will be created for them in the foreseeable future (as it is not possible to handle all 500,000 pages manually).
@GZWDer: there are three differences between other users' duplicates and your duplicates: the first is the scale (500,000 items for this proposal), the second is the absence of any satisfactory checks for existing duplicates (which is unacceptable), the third is the domain (geographical locations are pivotal items that many other domains rely on - creating a mess there is more disruptive than in other areas). This is about creating 500,000 new geographical items with no reconciliation heuristics to check for existing duplicates. This is really detrimental to the project, and I am not the only one complaining about it. − Pintoch (talk) 10:31, 19 August 2018 (UTC)
Also, what about first creating items only for pages without existing items with the same labels (this is the default setting of PetScan)?--GZWDer (talk) 20:12, 13 August 2018 (UTC)
I think checks need to be more thorough than that, for instance because cebwiki article titles often include disambiguation information in brackets. For instance, these heuristics would fail to identify https://ceb.wikipedia.org/wiki/Amsterdam_(lungsod_sa_Estados_Unidos,_Montana) and Amsterdam-Churchill (Q614935). − Pintoch (talk) 10:31, 19 August 2018 (UTC)
  • Oppose. Although I'm not aware of this being a policy so far, I believe new items should be created from the encyclopedia that is likely to have the best information on them. A bot shouldn't create new items from a Russian Wikipedia item about a US state or a US politician, and a bot shouldn't create new items about a Russian city or politician from an English Wikipedia article. This restriction wouldn't necessarily apply to items that are not firmly connected to any particular country, such as algebra for example. Jc3s5h (talk) 16:18, 30 August 2018 (UTC)
    • No, this isn't a policy and it never could be. One of Wikidata's main functions is to support other Wikimedia projects by providing interwiki links and structured data. Requiring links to a particular Wikipedia before an item is considered notable would cripple Wikidata. We also can't control which Wikipedias people copy data from. We can refuse to allow a bot to run but that doesn't stop people from doing it manually or with tools like Petscan and Harvest Templates. - Nikki (talk) 12:08, 31 August 2018 (UTC)
  • @Ivan_A._Krestinin: In the meantime, KrBot seems to be doing this. --- Jura 10:28, 11 September 2018 (UTC)
  • Have no time to read the discussion. My bot is importing country (P17), coordinate location (P625), GeoNames ID (P1566) from cebwiki now. — Ivan A. Krestinin (talk) 21:24, 11 September 2018 (UTC)
    • @Ivan_A._Krestinin: There is a lot of opposition to mass-creating new items for cebwiki items (see above), so you should create a new request for permissions before continuing. - Nikki (talk) 12:05, 12 September 2018 (UTC)
        • Ok, I disabled new item creation. I have code for connecting pages from different wikis. But it does not work without item creation, because it is based on the scheme: import data, find duplicate items, analyze data conflicts, labels, etc., merge items. — Ivan A. Krestinin (talk) 20:07, 12 September 2018 (UTC)
        • Thanks. The main issue is that people don't want duplicates. If you can explain what your bot does to avoid duplicates when you create a new request for permissions, it will hopefully be enough to change people's minds. :) - Nikki (talk) 09:00, 13 September 2018 (UTC)

If someone is creating items for all cebwiki articles, I still plan to add statements and descriptions to them. However, due to real-life issues I'd like to place the request On hold until January-February 2019 and see what happens. Comments and questions are still welcome, but I am probably not able to answer them anytime soon.--GZWDer (talk) 06:10, 12 September 2018 (UTC)

@GZWDer: Since there are too many oppose comments, and privacy concerns have already been raised with WMF Trust & Safety, it's unlikely that your work can be approved, so why not withdraw it? --Liuxinyu970226 (talk) 22:41, 15 September 2018 (UTC)

crossref bot

crossref bot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Mahdimoqri (talk • contribs • logs)

Task/s: to add missing journals from crossref api

Code: https://github.com/moqri/wikidata_scientific_citations/tree/master/add_journal/crossref

Function details: add missing journals from crossref --Mahdimoqri (talk) 21:12, 19 April 2018 (UTC)
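A hedged sketch of what such an import could look like (the bot's actual code is in the linked repository): page through Crossref's /journals endpoint and report titles whose ISSN (P236) is not yet on any Wikidata item.

```python
# Sketch only: find Crossref journals whose ISSN is missing from Wikidata.
import requests

CROSSREF = "https://api.crossref.org/journals"
WDQS = "https://query.wikidata.org/sparql"

def issn_in_wikidata(issn):
    query = 'ASK { ?item wdt:P236 "%s" }' % issn
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    return r.json()["boolean"]

def missing_journals(rows=20, offset=0):
    data = requests.get(CROSSREF, params={"rows": rows, "offset": offset}).json()
    for journal in data["message"]["items"]:
        issns = journal.get("ISSN", [])
        if issns and not any(issn_in_wikidata(i) for i in issns):
            yield journal["title"], issns

for title, issns in missing_journals():
    print(title, issns)
```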

See the discussion here and the data import request and workflow here

@DarTar, Daniel_Mietchen, Fnielsen, John_Cummings, Mahir256: any thoughts or feedback?

WikiBot

WikiBot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: 1succes2012 (talk • contribs • logs)

Task/s: access and parse data from Wikipedia

Code: To be developed

Function details: get article summaries, get data like links and images from a page and return it back to my users --1succes2012 (talk) 15:09, 17 June 2018 (UTC)

Comment For accessing data, a bot account is not necessary (unless you are about to hit security limits). Matěj Suchánek (talk) 09:09, 3 August 2018 (UTC)

PricezaBot

PricezaBot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Pricezabot (talk • contribs • logs)

Task/s: Add price to wikidata commercial products (e.g. phone, electronics, camera, etc)

Code:

Function details: --Pricezabot (talk) 09:18, 14 June 2018 (UTC) Priceza is a price comparison engine in SEA; we have a lot of pricing data for commercial products, and this bot will create statements in Wikidata with pricing details from our website.

Comment If you're going to be importing data from your own aggregate website, this would quite literally be a spambot... Chrissymad (talk) 19:29, 14 June 2018 (UTC)

schieboutct

schieboutct (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Ctschiebout (talk • contribs • logs)

Task/s: create bot to add missing demonyms

Code:

Function details: --Ctschiebout (talk) 01:39, 22 April 2018 (UTC)

Comment Code? Source? Matěj Suchánek (talk) 09:09, 3 August 2018 (UTC)

wikidata get

wikidata get (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: 27.4.240.118 (talk • contribs • logs)

Task/s:

Code:

Function details: --27.4.240.118 10:51, 15 June 2018 (UTC)

Comment Please expand this request. Matěj Suchánek (talk) 09:15, 3 August 2018 (UTC)

Wolfgang8741 bot

Wolfgang8741 bot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Wolfgang8741 (talk • contribs • logs)

Task/s: Openrefine imports to Wikidata.

Code: N/A

Function details: Data imports from Openrefine datasets --Wolfgang8741 (talk) 02:16, 18 June 2018 (UTC)

Comment What kind of data from what source? – The preceding unsigned comment was added by Matěj Suchánek (talk • contribs).
@Matěj Suchánek: Sorry I missed this comment. This is not a fully automated bot; it is the human-assisted tool OpenRefine used for larger imports, starting with small-scale tests before larger application. The current focus is on the GNIS import at this time. Yes, the import description and process need to be built out a bit more; I'm not using this until I refine the process and get community approval for the import. Initial learning curve and orientation to the Wikidata processes are in progress. Wolfgang8741 (talk) 15:49, 5 September 2018 (UTC)

CanaryBot 2

CanaryBot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Ivanhercaz (talk • contribs • logs)

Task/s: set labels, descriptions and aliases in Spanish on properties that lack them in Spanish.

Code: the code is available in PAWS as a Jupyter IPython notebook. When I have time I will upload the .ipynb and .py files to the CanaryBot repository on GitHub.

Function details: Well, the task that I am requesting is very similar to the first task that I requested, but I asked Ymblanter for an opinion about it, and they recommended that I request a new task because, as Ymblanter said, I am going to use new code: in this case I am going to set labels, descriptions and aliases on properties, not on items as I did in my last task.

In addition, this script works differently: I extracted all the properties without a Spanish label, a Spanish description, or both, and merged them into one CSV in which I am filling the cells with their respective translations. When all the cells are filled I will run the script, which will read each row, check whether the property has labels, descriptions and aliases in Spanish and, if not, add the content of the respective cells.
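A hedged sketch of the workflow described above (not the actual PAWS notebook); the CSV column names are assumptions, and aliases would be handled analogously:

```python
# Sketch only: add Spanish labels/descriptions from a translation CSV
# when the property does not have them yet.
import csv
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

with open("property_translations.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):   # assumed columns: property, label_es, description_es
        prop = pywikibot.PropertyPage(repo, row["property"])
        prop.get()
        if row["label_es"] and "es" not in prop.labels:
            prop.editLabels({"es": row["label_es"]}, summary="Adding Spanish label")
        if row["description_es"] and "es" not in prop.descriptions:
            prop.editDescriptions({"es": row["description_es"]}, summary="Adding Spanish description")
```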

I still have to test and improve some things in the script. It is very basic, but it works for what I want to do. I log everything so that I know how to solve an error if it happens.

Well, I await your answers and opinions. Thanks in advance!

Regards, Ivanhercaz (Talk) 23:45, 10 May 2018 (UTC)

I improved the code: added some stats, fixed the log report... I think it is ready to run without problems. What I need now is to finish the translations of the properties. I await your opinions about this task. Regards, Ivanhercaz (Talk) 15:54, 12 May 2018 (UTC)
I am ready to approve the bot task in a couple of days, provided that no objections are raised. Lymantria (talk) 06:54, 13 May 2018 (UTC)
  • Could you link the test edits? I only find Portuguese.
    --- Jura 16:24, 13 May 2018 (UTC)
    Of course, Jura. I thought I had shared the contributions on test.wikidata, but I had not, excuse me. I think you are referring to the edits in Portuguese that my bot made with its first task; in this case I only work with Spanish labels, descriptions and aliases. You can check my latest contributions on test.wikidata. I noticed the edit summary was wrong because it said "setting es-label" for everything, i.e. also when the bot changed a description or an alias; I just fixed it and now it shows the correct summary, as you can see in the last three edits. But I have found a bug that I have to fix: if you check this diff, you can see how the bot replaced the existing alias with the new one, while what I want is to append the new aliases and keep the old ones, so I have to fix it.
    I am not worried about the time, or whether the task is accepted now or in the future; I just wanted to propose it and talk about how it would work. But, to be honest, I still have to fill the CSV file, so I have plenty of time to fix this type of error and improve it. For that reason I requested another task.
    Regards, Ivanhercaz (Talk) 17:19, 13 May 2018 (UTC)
    For bot approvals, operators generally do about 50 or more edits here at Wikidata. These "test edits" are expected to be of standard quality.
    --- Jura 17:23, 13 May 2018 (UTC)
    I know, Jura, but I thought I could not make the test edits on Wikidata without authorization or someone requesting them, because this task is not approved yet. Well, as you are requesting these test edits, once the aliases bug has been solved I will run the bot on Wikidata and report here whether it works fine or not. Regards, Ivanhercaz (Talk) 17:29, 13 May 2018 (UTC)
    I fixed the aliases bug, as you can check here. I will notify you, Jura, when I have done the test edits on Wikidata and not on test.wikidata. Regards, Ivanhercaz (Talk) 18:26, 13 May 2018 (UTC)
  • @Jura1, Ymblanter: Today I could only make a few test edits. I will make more in the next days to check it better. Regards, Ivanhercaz (Talk) 18:15, 14 May 2018 (UTC)
    I forgot to share the log with you; if you check the notebook you can see the generated graph. Regards, Ivanhercaz (Talk) 18:26, 14 May 2018 (UTC)

maria research bot

maria research bot (talk • contribs • new items • SUL • Block log • User rights log • User rights)
Operator: Mahdimoqri (talk • contribs • logs)

Task/s: add missing articles and citations information for articles listed on PubMed Central

Code: https://github.com/moqri/wikidata_scientific_citations

Function details: --Mahdimoqri (talk) 06:15, 13 March 2018 (UTC)

Support Mahir256 (talk) 22:37, 13 March 2018 (UTC)
Comment This Fatameh-based script is useful for most of phase 1 and works fine for PubMed IDs and for some Crossref IDs as well, but it does not address the citation part from phase 2 onwards. --Daniel Mietchen (talk) 13:27, 14 March 2018 (UTC)
Thanks Daniel Mietchen, I modified the description of the task here to confirm what the bot does at the moment. Mahdimoqri (talk) 15:52, 14 March 2018 (UTC)
Support That looks good to me. --Daniel Mietchen (talk) 19:54, 14 March 2018 (UTC)
Support The Fatameh edits from this bot seem fine so far. It is a nice simple script. I note some Fatameh artifacts in the titles, e.g., "*." in BOOKWORMS AND BOOK COLLECTING (Q50454030). But I suppose we have to live with that... — Finn Årup Nielsen (fnielsen) (talk) 18:44, 14 March 2018 (UTC)
I was going to write the same thing. Can we remove the trailing full stop (".") ? I'm sure some bot could clean up the existing ones as well.
--- Jura 20:37, 14 March 2018 (UTC)
Thanks Finn Årup Nielsen (fnielsen) and Jura, I would be happy to add another script to remove asterisks or to fix any other issues you find, after the PMC items are added. Mahdimoqri (talk) 23:10, 14 March 2018 (UTC)
For the final dot, can you remove this before adding it to label/title statement?
--- Jura 23:17, 14 March 2018 (UTC)
Thanks Jura! Unfortunately, as far as I know, Fatameh does not have any out-of-the-box option for such changes. I'd recommend that a separate script be written just for this purpose, since there are currently 14 million other articles with this problem (https://www.wikidata.org/w/index.php?tagfilter=OAuth+CID%3A+843&limit=50&days=7&title=Special:RecentChanges&urlversion=2). Daniel Mietchen might be interested in such a script too. Mahdimoqri (talk) 02:51, 15 March 2018 (UTC)
@T Arrow, Tobias1984: could you fix Fatameh?
--- Jura 07:21, 15 March 2018 (UTC)
There is a task for it here: https://phabricator.wikimedia.org/T172383 Mahdimoqri (talk) 15:08, 15 March 2018 (UTC)
Do any of the people who wrote the code actually follow phabricator? I tried to find the part of the code where the dot gets added/should be removed, but I was probably in the wrong module. Any ideas?
--- Jura 05:16, 16 March 2018 (UTC)
I'm just not checking it all that regularly. I've replied to the ticket. Fatameh relies on wikidatintegrator to do most of the heavy lifting. This uses PubMed as the data source and (unfortunately?) they actually report all the titles as ending in a period (or other punctuation). I think we need to find a reference for the titles without the period rather than just changing all the existing statements. There was a short discussion on the WikiCite Mailing List as well. I'm happy to work on a solution but I'm not really sure what is the best way forward. T Arrow (talk) 09:26, 16 March 2018 (UTC)
Jura, I added the fix for the trailing dots and asterisks in a separate script (fatameh_sister_bot). Any other issues that I can address to have your support?Mahdimoqri (talk) 06:22, 17 March 2018 (UTC)
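A hedged sketch of this kind of cleanup (not the actual fatameh_sister_bot code from the linked repository): strip the surrounding asterisks and a single trailing period from the English label.

```python
# Sketch only: clean PubMed title artifacts ("*", trailing ".") from labels.
import re
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def cleaned(title):
    return re.sub(r"\.$", "", title.strip("*").strip())

def fix_label(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    old = item.labels.get("en", "")
    new = cleaned(old)
    if new and new != old:
        item.editLabels({"en": new}, summary="Removing trailing dot/asterisk artifact")
```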

Thanks all for providing feedback and offering solutions/help to address the issue with Fatameh. It seems there will be a fix either in Fatameh or in a separate script. In either case, it is to be applied to all article items, which I believe could be done independently of this bot. Meanwhile, could you support and accept this bot so I can get it started, and maybe set up a new bot for fixing other issues? Mahdimoqri (talk) 21:12, 16 March 2018 (UTC)

Oppose I don't think we should approve another Fatameh based bot until major concerns are fixed. --Succu (talk) 21:24, 16 March 2018 (UTC)
Thanks for your feedback Succu. I just created a bot (Fatameh_sister_bot) that fixes the issue with the label for the items created using Fatameh. I'll make sure I run it on everything maria research bot creates to address the concern with the titles. Are there any other issues that I can address? Mahdimoqri (talk) 06:04, 17 March 2018 (UTC)
@Succu: I also fixed this issue from the root in Fatameh source code here so new items are created without the trailing dot.
Title statements would need the same fix and some labels have already been duplicated into other languages (maybe this is taken care of, but I haven't seen any in the samples).
--- Jura 09:35, 18 March 2018 (UTC)
Thanks for the feedback Jura. The translated labels (if any) are added to labels. I will take care of the title statement now.
@Jura1: the titles are also fixed and the code has been updated (https://github.com/moqri/wikidata_scientific_citations/blob/master/fatameh_sister_bot/fix_labels_and_titles.py). Any other issues that I can address to have your support for the bot?
I think the cleanup bot/task can be authorized.
--- Jura 12:30, 21 March 2018 (UTC)
@Jura1: wonderful! this is the request for the cleanup bot: fatameh_sister_bot. Could you please state your support there, for a bot flag?
I don't think edits like this one are OK, Mahdimoqri, because you are ignoring the reference given. And please wait with this kind of correction until you get the flag. --Succu (talk) 22:36, 22 March 2018 (UTC)
@Succu: the title in the reference is not exactly correct. Please refer to this reference or this reference for the correct title. Would you like the bot to change the reference as well? – The preceding comment was added without a signature.
The cleanup should be fine. It just strips an artifact PMD adds.
--- Jura 09:24, 23 March 2018 (UTC)
Translated titles are enclosed within brackets. This should be changed. The current version overwrites existing page(s) (P304) with incomplete values. --Succu (talk) 10:08, 18 March 2018 (UTC)
@Succu: thanks for the feedback! I could not find any instance of either of the issues! Could you please reply with one instance of each of these two issues that is created by my bot so that I can address them? Mahdimoqri (talk) 03:17, 19 March 2018 (UTC)
[Sexually-transmitted infection in a high-risk group from Montería, Colombia]. (Q50804547) is an example for the first issue. Removing the brackets only is not the solution. --Succu (talk) 22:30, 22 March 2018 (UTC)
I will not import any items with translated titles (until there is a consensus on what is the solution on this). Mahdimoqri (talk) 14:06, 30 March 2018 (UTC)
We should try to figure out how to handle them (e.g. import the original language and delete the "title" statement, possibly find the original title and add that as title and label in that language). For new imports, it would just need to skip adding the title statement and add a language of work or name (P407).
--- Jura 09:24, 23 March 2018 (UTC)
Or use original language of work (P364). Anyway, it should be made clear that the original title is not English. -- JakobVoss (talk) 14:45, 24 March 2018 (UTC)
Should we attempt to add a statement that identifies them as not being in English before we actually manage to determine the original language?
--- Jura 21:12, 24 March 2018 (UTC)

AmpersandBot[edit]

AmpersandBot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: PinkAmpersand (talkcontribslogs)

Task/s: Generate descriptions for village items in the format of "village in <place>, <place>, <country>"

Code: https://github.com/PinkAmpersand/AmpersandBot/blob/master/village.py

Function details: With my first approved task (approved in July 2016, but not completed until recently), I set descriptions for about 20,000 Ukrainian villages based on their country (P17), instance of (P31), and located in the administrative territorial entity (P131) values. Now, I would like to use the latter two values to generalize this script to—ominous music—every village in the world!

The script works as follows:

  1. It pulls up 5,000 items backlinking to village (Q532)
  2. It checks whether an item is instance of (P31)  village (Q532)
  3. It then labels items as follows:
    1. It removes disambiguation from labels in any language:
      1. It runs a RegEx search for ,| \(
      2. It removes those characters and any following ones
      3. It sets the old label as an alias for the given language
      4. If the alias contains non-ASCII characters, it creates an ASCII version and sets that as an alias as well
      5. It compiles a new list of labels and aliases for the relevant languages, and updates the item with all of them at once
    2. It sets labels in all Latin-script languages:
      1. It checks if the current Latin-script languages all use the same label.
      2. If they don't, it does nothing except log the item for further review.
      3. If they do, it sets that label as the label for all other Latin-script languages, using a list of 196 languages (viewable in the source code)
      4. If the label contains non-ASCII characters, it also sets an ASCII version of the label as an alias
      5. It compiles a new list of labels and aliases for the relevant languages, and updates the item with all of them at once
  4. And describes items as follows:
    1. It checks whether the item either a) lacks an English description or b) has an English description that merely says "village in <country>" or "village in <region>". (I've manually coded into the RegEx the names of every multi-word country. This still leaves a blind spot for multi-word entities other than countries. I welcome advice on how to fix this.)
    2. If so, it gets the item's parent entity. If that entity is a country, it describes the item as "village in <parent>"
    3. If the parent entity is not a country, it checks the grandparent entity. If that is a country, it describes the item as "village in <parent>, <grandparent>"
    4. Next onto the great-grandparent entity. "village in <parent>, <grandparent>, <great-grandparent>"
    5. For the great-great-grandparent entity, only the top three levels are used: "village in <grandparent>, <great-grandparent>, <great-great-grandparent>". This is slightly more likely to result in dupe errors, but the code handles those.
    6. Ditto the thrice-great-grandparent entity.
    7. If even the thrice-great-grandparent is not a country, the item is logged for further review. If people think I should go deeper, I am willing to; I may do so of my own initiative if the test run turns up too many of these errors.
  5. After 5,000 items have been processed, another 5,000 are pulled. The script continues until there are no backlinks left to describe.
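To make steps 3.1 and 4 above more concrete, here is a rough sketch of the label cleanup and the description climb. It is illustrative only, not the actual village.py code; it assumes Pywikibot plus the third-party unidecode package for the ASCII aliases, and the country test via instance of (P31) country (Q6256) is an assumption.

# Illustrative sketch of steps 3.1 and 4, not the actual village.py code.
# Assumes Pywikibot and the third-party unidecode package.
import re
import pywikibot
from unidecode import unidecode

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

DISAMBIG = re.compile(r',| \(')  # the RegEx from step 3.1.1

def clean_labels(item):
    # Step 3.1: cut labels at the first comma or " (" and keep the old label as an alias.
    new_labels, new_aliases = {}, {}
    for lang, label in item.labels.items():
        match = DISAMBIG.search(label)
        if not match:
            continue
        new_labels[lang] = label[:match.start()]
        aliases = [label]
        if unidecode(label) != label:
            aliases.append(unidecode(label))  # ASCII alias for non-ASCII labels
        new_aliases[lang] = aliases
    if new_labels:  # one update per item, as in step 3.1.5
        item.editEntity({'labels': new_labels, 'aliases': new_aliases},
                        summary='strip disambiguation from village labels')

def is_country(entity):
    # Assumed check: instance of (P31) country (Q6256).
    return any(c.getTarget().getID() == 'Q6256' for c in entity.claims.get('P31', []))

def describe(village):
    # Step 4: walk up P131 until a country is reached, then keep at most the top three levels.
    chain, current = [], village
    for _ in range(6):  # parent .. thrice-great-grandparent
        parents = current.claims.get('P131', [])
        if not parents:
            return None  # log for further review
        parent = parents[0].getTarget()
        parent.get()
        chain.append(parent.labels.get('en', ''))
        if is_country(parent):
            return 'village in ' + ', '.join(chain[-3:])
        current = parent
    return None  # no country within six levels: log for further review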

Does this sound good? — PinkAmpers&(Je vous invite à me parler) 01:43, 22 February 2018 (UTC) Updated 22:17, 3 March 2018 (UTC)

Test run here. The only issue that arose was some items, like Koro-ni-O (Q25694), being listed in my command line as updated, but not actually updating. It's a bug, and I'll look into it, but its only effect is to limit the bot's potential, not to introduce any unwanted behavior. — PinkAmpers&(Je vous invite à me parler) 02:16, 22 February 2018 (UTC)
I will approve the bot in a couple of days provided no objections have been raised.--Ymblanter (talk) 08:39, 25 February 2018 (UTC)
Cool, thanks! But actually, I'm working on a few more things for the bot to do to these village items while it's "in the neighborhood", so would you mind holding off until I can post a second test run? — PinkAmpers&(Je vous invite à me parler) 00:23, 26 February 2018 (UTC)
This is fine, no problem.--Ymblanter (talk) 10:42, 26 February 2018 (UTC)
@Ymblanter:. Okay. I'm all done. I've updated the bot's description above. Diff of changes here. New test run here. There was one glitch in this test run, namely that the bot failed to add ASCII aliases for Unicode labels while performing the Latin-script label unanimity function. This was due to a stray space before the word aliases in line 247. I fixed that here, and ran a test edit here to check that that worked. But I'm happy to run a few dozen more test edits if you want to see that fix working in action. — PinkAmpers&(Je vous invite à me parler) 22:17, 3 March 2018 (UTC)
Concerning the Latin-script languages, not all of them use the same spelling. For example, here I am sure that in lv it is not Utvin (most likely Utvins), in lt it is not Utvin, and possibly in some other languages it is not Utvin (for example, crh uses phonetic spelling; Utvin may be fine, but other names will not be). I would suggest restricting this part of the task to major languages (say German, French, Spanish, Portuguese, Italian, Danish, Swedish, maybe a couple more) and doing some research for the others - I have no idea, for example, what Navajo uses. The rest seems to be fine.--Ymblanter (talk) 07:48, 4 March 2018 (UTC)
I'm concerned about exonyms too. Even if a language uses the same name variant as other Latin-script languages for most settlements, there may still be particular settlements for which it does not. 90.191.81.65 14:30, 4 March 2018 (UTC)
I considered that, 90.191.81.65, but IMHO it's not a problem. The script will never overwrite an earlier label, and indeed won't change the labels unless all existing Latin-script labels are in agreement. So the worst-case scenario here is that an item would go from having no label in one language to having one that is imperfect but not incorrect. An endonym will always be a valid alias, after all. — PinkAmpers&(Je vous invite à me parler) 21:37, 4 March 2018 (UTC)
I'm not sure that all languages consider an endonym as a valid alias if there's an exonym too. And if it is considered technically not incorrect then for some cases an endonym would still be rather odd. My concern on this is similar to one currently brought up in project chat. 90.191.81.65 07:58, 5 March 2018 (UTC)
I would think that an endonym is by definition a valid alias. The bar for "valid alias" is pretty low, after all. So if there isn't consensus to use endonyms as labels, I can set them as aliases instead. — PinkAmpers&(Je vous invite à me parler) 17:51, 5 March 2018 (UTC)
Also, all romanized names are probably problematic. Many languages may use the same romanization system (the same as in English or the one recommended by the UN) for particular foreign language, but there are also languages which have their own romanization system. So a couple of the current Latin-script languages using the same romanization would be merely a coincidence. 90.191.81.65 14:49, 4 March 2018 (UTC)
I'm confused about your concern here. The only romanization that the script does is in setting aliases, not labels. — PinkAmpers&(Je vous invite à me parler) 21:37, 4 March 2018 (UTC)
All Ukrainian, Georgian, Arabic etc. place names apart from exonyms are romanized in Latin-script languages. And there are different romanization systems, some specific to a particular language, e.g. Ukrainian-Estonian transcription. For instance, currently all four Latin labels for Burhunka (Q4099444) happen to be "Burhunka", but that wouldn't be correct in Estonian. 90.191.81.65 07:58, 5 March 2018 (UTC)
Well that's part of why I'm using a smaller set of languages now. Can you give me examples of languages within the set that have this same problem? — PinkAmpers&(Je vous invite à me parler) 17:51, 5 March 2018 (UTC)
Thanks for the feedback, Ymblanter. I've pared back the list, and posted at project chat asking for help with re-expanding it. See Wikidata:Project chat § Help needed with l10n for bot. — PinkAmpers&(Je vous invite à me parler) 21:37, 4 March 2018 (UTC)

I note that here the bot picks up the name of a former territorial entity, though preferred rank is set for the current parish. Also, is the whole territorial hierarchy really necessary in the description if there's no need to disambiguate from other villages with the same name in the same country? For a small country like Estonia I'd prefer simpler descriptions. 90.191.81.65 14:30, 4 March 2018 (UTC)

The format I'm using is standard for English-language labels. See Help:Description § Go from more specific to less specific. — PinkAmpers&(Je vous invite à me parler) 21:37, 4 March 2018 (UTC)
The section you refer to concerns the order in which you go from more specific to less specific in a description. As for how specific you should go, it leaves that open, apart from saying in the above section that adding one subregion of a country is common and giving two examples where the whole administrative hierarchy is not shown. 90.191.81.65 07:58, 5 March 2018 (UTC)
To me, the takeaway from Help:Description is that using a second-level subregion is not required, but also not discouraged. It comes down to an individual editor's choice. — PinkAmpers&(Je vous invite à me parler) 17:51, 5 March 2018 (UTC)
  • Pictogram voting comment.svg Comment I'm somewhat concerned about the absence of a plan to maintain this going forward. If descriptions in 200 languages for hundreds of thousands of items are being added, this becomes virtually impossible to correct manually. Descriptions can need to be maintained if the name changes, or if the P131 is found to be incorrect or irrelevant. Already now, default labels for items that may seem static (e.g. categories/lists) aren't maintained once they are added; this would just add another chunk of redundant data that isn't maintained. The field already suffers from the absence of maintenance of cebwiki imports, so please don't add more to it. Maybe one would want to focus on English descriptions and native label statements instead.
    --- Jura 10:16, 12 March 2018 (UTC)

Arasaacbot[edit]

Arasaacbot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Lmorillas (talkcontribslogs)

Task/s: Search info and taxonomies, and translate our images at http://arasaac.org

Code: https://github.com/lmorillas/aradata

Function details: Search image names at wikidata and get info about them --Arasaacbot (talk) 12:28, 15 January 2018 (UTC)

@Arasaacbot, Lmorillas: Your GitHub repository doesn't have any actual code in it. It would be helpful if you could upload the source code. Also, can we please see a test run of 50-250 edits? Assuming you still plan on using this bot. — PinkAmpers&(Je vous invite à me parler) 23:27, 23 February 2018 (UTC)
@Arasaacbot, Lmorillas: Still interested? Matěj Suchánek (talk) 09:28, 3 August 2018 (UTC)
@Matěj Suchánek, PinkAmpersand: Sorry for the delay. I want to use Wikidata to improve the content of our image service. I asked a friend who uses Wikidata and he said that if we only need read access, no special permissions are needed, are they? Lmorillas (talk) 09:43, 7 August 2018 (UTC)
No, except in some situations like big queries etc. Matěj Suchánek (talk) 11:24, 8 August 2018 (UTC)

taiwan democracy common bot[edit]

taiwan democracy common bot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Twly.tw (talkcontribslogs)

Task/s: Input Taiwanese politician data; it is a project from mySociety

Code:

Function details: follow this step to input politician data, mainly position held (P39) statements with the related term, constituency and political party.

The operator can't be the bot itself. So who's going to operate the bot? Mbch331 (talk) 14:46, 18 February 2018 (UTC)
Operator will be Twly.tw (talkcontribslogs), bot: taiwan democracy common bot.
This is Twly.tw (talkcontribslogs), based on Wikidata:Requests for permissions/Bot/taiwan democracy common. --Ymblanter (talk) 09:42, 25 February 2018 (UTC)
I would like to get some input from uninvolved users here before we can proceed.--Ymblanter (talk) 18:56, 1 March 2018 (UTC)
  • The bot might need a fix for date precision (9→7). It seems that everybody is born on January 1: Q19825688, Q8274933, Q8274088, Q8350110. As these items already had more precise dates, it might want to skip them.
    --- Jura 11:00, 12 March 2018 (UTC) Fixed, thanks.
@Jura1:, can we proceed here?--Ymblanter (talk) 21:06, 21 March 2018 (UTC)
I have a hard time trying to figure out what it's trying to do. Maybe some new test edits could help. Is the date precision for the start date in the qualifier of Q19825688 correct?
--- Jura 21:37, 21 March 2018 (UTC)

Newswirebot[edit]

Newswirebot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Dhx1 (talkcontribslogs)

Task/s:

  1. Create items for news articles that are published by a collection of popular/widespread newspapers around the world.

Code:

  • To be developed.

Function details:

Purpose:

  • New items created by this bot can be used in described by source (P1343) and other references within Wikidata.
  • New items created by this bot can be referred to in Wikinews articles.

Process:

  1. For each candidate news article, check whether a Wikidata item of the same title exists with a publication date (P577) +/- 1 day.
    1. If an existing Wikidata item is found, check whether publisher (P123) is a match as well.
    2. If publisher (P123) matches, ignore the candidate news article.
  2. For each candidate news article, check whether an existing Wikidata item has the same official website (P856) (full URL to the published news article).
    1. If official website (P856) matches, ignore the candidate news article.
  3. If no existing Wikidata item is found, create a new item.
  4. Add a label in English which is the article title.
  5. Add descriptions in multiple languages in the format of "news article published by PUBLISHER on DATE".
  6. Add statement instance of (P31) news article (Q5707594).
  7. Add statement language of work or name (P407) English (Q1860).
  8. Add statement publisher (P123).
  9. Add statement publication date (P577).
  10. Add statement official website (P856).
  11. Add statement author name string (P2093) which represents the byline (Q1425760). Note that this could be the name of a news agency, or a combination of news agency and publisher, if the writer is not identified.
  12. Add statement title (P1476) which represents the headline (Q1313396).
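A rough sketch of the duplicate check in steps 1-2, querying the Wikidata Query Service with requests; the exact-title match and the one-day filter are simplifications, not the final bot logic.

# Illustrative duplicate check for steps 1-2 above; the exact-title comparison is a
# simplification. Uses the public Wikidata Query Service via requests.
import requests

WDQS = 'https://query.wikidata.org/sparql'

def find_existing(title, date_iso, url):
    # Items with the same title (P1476) published (P577) within one day,
    # or with the candidate URL as official website (P856).
    query = '''
    SELECT ?item WHERE {
      {
        ?item wdt:P1476 ?t ; wdt:P577 ?d .
        FILTER(STR(?t) = "%s")
        FILTER(ABS(?d - "%s"^^xsd:dateTime) <= 1)   # WDQS date difference is in days
      } UNION {
        ?item wdt:P856 <%s> .
      }
    } LIMIT 5
    ''' % (title.replace('"', '\\"'), date_iso, url)
    r = requests.get(WDQS, params={'query': query, 'format': 'json'},
                     headers={'User-Agent': 'Newswirebot-sketch/0.1 (example)'})
    r.raise_for_status()
    return [b['item']['value'] for b in r.json()['results']['bindings']]

# e.g. find_existing('Example headline', '2018-02-08T00:00:00Z', 'https://example.org/article')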

Example sources and copyright discussions:

--Dhx1 (talk) 13:00, 8 February 2018 (UTC)

Interesting initiative. How many articles do you plan to create per day? --Pasleim (talk) 08:44, 9 February 2018 (UTC)
I was thinking of programming the bot to regularly check Grafana and/or Special:DispatchStats or a similar statistics endpoint, raising or lowering the rate of edits up to a predefined limit. It appears that larger publishers may publish around 300 articles per day, so if the bot was developed to work with 10 sources, that is around 3000 new articles per day, or one new article every 30 seconds. For the initial import, an edit rate of 1 article creation per second (what User:Research_Bot seems to use at the moment) would allow 86,400 articles to be processed per day, or approximately 30 days' worth of archives processed per day. At that rate, it might take 4-5 months to complete the initial import. Dhx1 (talk) 10:12, 9 February 2018 (UTC)
We probably need the code and test edits to continue this discussion.--Ymblanter (talk) 08:31, 25 February 2018 (UTC)
@Dhx1: What do you think about Zotero translators? Could they be somehow used in order to speed up the process?--Malore (talk) 16:09, 20 September 2018 (UTC)
@Malore: I have been using scrapy, which is trivial to use for crawling and extracting information. The trickier part at the moment is finding matching Wikidata articles that already exist, and writing to Wikidata. Pywikibot doesn't seem to allow writing a large Wikidata item at once with many claims, qualifiers and references. The API allows it, however, and the WikidataIntegrator bot also allows it, albeit with little documentation to make it clear how it works. Zotero could be helpful if a large community forms around it for news website metadata scraping (for bibliographies). Dhx1 (talk) 11:53, 23 September 2018 (UTC)
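For what it's worth, writing a whole item in one go with WikidataIntegrator looks roughly like the sketch below; the values are placeholders and the wdi_core/wdi_login calls are from memory of that library's documented API, so treat the exact signatures as assumptions to verify.

# Rough sketch of creating a news-article item in a single write with WikidataIntegrator.
# Placeholder values; the wdi_core API is assumed as documented upstream.
from wikidataintegrator import wdi_core, wdi_login

login = wdi_login.WDLogin(user='Newswirebot', pwd='...')   # bot credentials

statements = [
    wdi_core.WDItemID(value='Q5707594', prop_nr='P31'),    # instance of: news article
    wdi_core.WDItemID(value='Q1860', prop_nr='P407'),      # language of work or name: English
    wdi_core.WDItemID(value='Q00000', prop_nr='P123'),     # publisher (placeholder QID)
    wdi_core.WDMonolingualText(value='Example headline', language='en', prop_nr='P1476'),
    wdi_core.WDString(value='Jane Doe', prop_nr='P2093'),  # author name string
    wdi_core.WDTime('+2018-02-08T00:00:00Z', prop_nr='P577'),
    wdi_core.WDUrl(value='https://example.org/article', prop_nr='P856'),
]

item = wdi_core.WDItemEngine(data=statements, new_item=True)  # new_item: assumed flag to force creation
item.set_label('Example headline', lang='en')
item.set_description('news article published by Example Publisher on 8 February 2018', lang='en')
print(item.write(login))  # returns the QID of the written item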

KlosseBot[edit]

KlosseBot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Walter Klosse (talkcontribslogs)

Task/s: The bot makes mass edits with Widar.

Function details: This bot will mass-create items with QuickStatements, predominantly about comic book characters, with information from sites such as marvel.com or the Marvel Database.

--Walter Klosse (talk) 20:40, 17 November 2017 (UTC)

Please be more precise about which task you are going to perform with your bot. Permission should be asked for task by task. See also Wikidata:Bots. Lymantria (talk) 18:45, 25 November 2017 (UTC)
I am confused. Edits your bot has made since this request do not fall within the scope you described above, but seem to focus on programming languages. Lymantria (talk) 13:03, 8 December 2017 (UTC)
My bad, originally I was thinking that this bot would only work on comic book characters, but now I make edits on more topics. --Walter Klosse (talk) 21:17, 15 December 2017 (UTC)
Please keep in mind that permission should be requested task by task. See also Wikidata:Bots. But if your tasks are "small", perhaps a bot flag is not needed. Lymantria (talk) 15:17, 17 December 2017 (UTC)
@Walter Klosse: Still interested? Matěj Suchánek (talk) 09:19, 3 August 2018 (UTC)

NIOSH bot[edit]

NIOSH bot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Harej (talkcontribslogs)

Task/s: Synchronize Wikidata with the NIOSHTIC-2 research database.

Code: https://github.com/harej/niosh2wikidata

Function details: NIOSHTIC-2 is a database of occupational safety and health research published by NIOSH and/or supported by NIOSH staff. As part of my work with NIOSH I have developed scripts to make sure NIOSHTIC has corresponding entries in Wikidata (but, where possible, it will not create duplicates of entries that already exist on Wikidata). This allows NIOSH's data to be part of a greater network of data, for instance by including data from other sources such as PubMed. Better indexing of this data is part of a longer-term effort to make it easier for Wikipedia editors to discover these reliable resources. --Harej (talk) 05:59, 14 November 2017 (UTC)

Please make some test edits.--Ymblanter (talk) 11:51, 19 November 2017 (UTC)
@Harej: Still interested? Matěj Suchánek (talk) 09:19, 3 August 2018 (UTC)
@Matěj Suchánek: In principle yes; however, I'm currently in the process of reworking my scripts so that they will work for Wikidata at its current size. Harej (talk) 15:09, 3 August 2018 (UTC)

Ymblanter, Matěj Suchánek, I have made some test edits. Please let me know if you have any questions. Harej (talk) 17:17, 12 August 2018 (UTC)

I am fine with the test edits and can approve the bot in several days provided there have been no objections raised.--Ymblanter (talk) 21:22, 12 August 2018 (UTC)
@Harej: For the first two test edits I get a „We are sorry, the page you are looking for was not found.“ message. --Succu (talk)
Succu, I generally find that happens when an entry is new enough in the NIOSHTIC database that it has a listing in the search engine but not a corresponding static page. However if you search NIOSHTIC for the date range during which the article was published, the article will still show up in the search results. I would link to search results, but it's one of those search engines where the results expire. (Frustrating, I know.) Harej (talk) 05:21, 13 August 2018 (UTC)
Why not omit them until the changes are online? Sorry for the delayed answer, Harej. --Succu (talk) 19:06, 26 August 2018 (UTC)
Succu, I have no way of distinguishing between which ones are online and which ones aren't. They show up in the search engine results anyway, so I would consider them valid assigned numbers. Harej (talk) 19:42, 26 August 2018 (UTC)
Load the page and look for „We are sorry, the page you are looking for was not found.“ I think if this string is not present all is fine. --Succu (talk) 19:47, 26 August 2018 (UTC)
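That screening could be a couple of lines, e.g. the sketch below; the URL builder is hypothetical, since the exact NIOSHTIC-2 static-page URL pattern isn't shown here.

# Sketch of the "page not found" screening suggested above.
import requests

SENTINEL = 'We are sorry, the page you are looking for was not found.'

def nioshtic_page_is_live(url):
    # Fetch the NIOSHTIC-2 static page and check for the "not found" sentinel string.
    response = requests.get(url, timeout=30)
    return response.ok and SENTINEL not in response.text

# usage sketch: skip (or flag) the entry until its static page is online
# if not nioshtic_page_is_live(build_nioshtic_url(nioshtic_number)):  # build_nioshtic_url is hypothetical
#     continue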

Ymblanter, do you have any further questions or concerns regarding NIOSH bot? Harej (talk) 19:03, 26 August 2018 (UTC)

I am going to sleep now, I hope you will resolve the above issue by tomorrow, and then I will approve the bot.--Ymblanter (talk) 20:33, 26 August 2018 (UTC)

neonionbot[edit]

neonionbot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Jkatzwinkel (talkcontribslogs)

Task/s: Map semantic annotations made with annotation software neonion to wikidata statements in order to submit either bibliographical evidence, additional predicates or new entities to wikidata. Annotation software neonion is used for collaborative semantic annotating of academic publications. If a text resource being annotated is an open access publication and linked to a wikidata item page holding bibliographical metadata about the corresponding open access publication, verifiable contributions can be made to wikidata by one of the following:

  1. For a semantic annotation, identify an equivalent wikidata statement and provide bibliographical reference for that statement, linking to the item page representing the publication in which the semantic annotation has been created.
  2. If a semantic annotation provides new information about an entity represented by an existing wikidata item page, create a new statement for that item page containing the predicate introduced by the semantic annotation. Attach bibliographic evidence to the new statement analogously to scenario #1.
  3. If a semantic annotation represents a fact about an entity not yet represented by a wikidata item page, create an item page and populate it with at least a label and a P31 statement in order to meet the requirements for scenario #2. Provide bibliographical evidence as in scenario #1.
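As an illustration of scenario #1 (with the fall-through into scenarios #2/#3), attaching such a bibliographical reference with Pywikibot could look roughly like this; the use of stated in (P248) and all QIDs are illustrative assumptions, not a description of neonion's actual implementation.

# Illustrative sketch of scenario #1: attach a reference pointing to the item of the
# annotated open access publication. Not neonion code; P248 and the QIDs are assumptions.
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def add_publication_reference(subject_qid, prop, target_qid, publication_qid):
    subject = pywikibot.ItemPage(repo, subject_qid)
    subject.get()
    for claim in subject.claims.get(prop, []):
        target = claim.getTarget()
        if not isinstance(target, pywikibot.ItemPage) or target.getID() != target_qid:
            continue
        ref = pywikibot.Claim(repo, 'P248')  # stated in: the publication item
        ref.setTarget(pywikibot.ItemPage(repo, publication_qid))
        claim.addSources([ref], summary='add bibliographical evidence from annotated publication')
        return True       # scenario #1: statement existed, reference added
    return False          # scenario #2/#3: statement (or item) has to be created first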


Code: Implementation of this feature will be published on my neonion fork on github.

Function details: Prerequisite: Map model of neonion's controlled vocabulary to terminological knowledge extracted from wikidata. Analysis of wikidata instance/class relationships ensures that concepts of controlled vocabulary can be mapped to item pages representing wikidata classes.

Task 1: Identify item pages and possibly statements on wikidata that are equivalent to the information contained in semantic annotations made in neonion.

Task 2: Based on the results of task 1, determine if it is appropriate to create additional content on wikidata in form of new statements or new item pages. For the statements at hand, provide an additional reference representing bibliographical evidence referring to the wikidata item page representing the open access publication in which neonion created the semantic annotation.

What data will be added? The proposed scenario is meant to be tried first on articles published in the scientific open-access journal Apparatus. --Jkatzwinkel (talk) 06:15, 19 October 2017 (UTC)

I find this proposal very hard to understand without seeing an example - can you run one or mock one (or several) up using the neonionbot account so we can see what it would likely do? ArthurPSmith (talk) 13:12, 19 October 2017 (UTC)

Handelsregister[edit]

Handelsregister (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: SebastianHellmann (talkcontribslogs)

Task/s: Crawl https://www.handelsregister.de/rp_web/mask.do, then go to UT (Unternehmenstraeger) and add an entry for each German organisation with the basic info, especially the registering court and the ID assigned by the court, into Wikidata.

Code: The code is a fork of https://github.com/pudo-attic/handelsregister (small changes only)

Function details:

Task 1, prerequisite for Task 2 Find all current organisations in Wikidata that are registered in Germany and find the corresponding Handelsregister entry. Then add the data to the respective Wikidata items.

What data will be added? The Handelsregister collects information from all German courts, where all organisations in Germany are obliged to register. The data is passed from the courts to a private company running the Handelsregister, which makes part of the information public (i.e. UT - Unternehmenstraegerdaten, core data) and sells the other part. Each organisation can be uniquely identified by the registering court and the number assigned by this court (the number alone is not enough, as two courts might assign the same number). Here is an example of the data:

  • Saxony District court Leipzig HRB 32853 – A&A Dienstleistungsgesellschaft mbH
  • Legal status: Gesellschaft mit beschränkter Haftung
  • Capital: 25.000,00 EUR
  • Date of entry: 29/08/2016
  • (When entering date of entry, wrong data input can occur due to system failures!)
  • Date of removal: -
  • Balance sheet available: -
  • Address (subject to correction): A&A Dienstleistungsgesellschaft mbH
  • Prager Straße 38-40
  • 04317 Leipzig

Most items are stable, i.e. each org is registered when it is founded and assigned a number by the court: Saxony District court Leipzig HRB 32853. After that, only the address and the status can change. For Wikidata, it is no problem to keep companies that no longer exist, as they should be preserved for historical purposes.

Maintenance should be simple: once a Wikidata item contains the correct court and the number, the entry can be matched 100% to the entry in the Handelsregister. This way the Handelsregister can be queried once or twice a year to update the info in Wikidata.
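In code terms, the reconciliation could hinge on a composite key, along the lines of the sketch below; the normalisation rules are only an illustration, and the court/number values are assumed to come from the crawler and from a query over the (yet to be decided) Wikidata properties.

# Sketch of matching on the composite key (registering court, register number).
# Normalisation rules are illustrative; the properties from Question 2 do not exist yet.
def handelsregister_key(court, number):
    # e.g. ("Amtsgericht Leipzig", "HRB 32853") and ("amtsgericht leipzig", "HRB32853") match
    return (court.strip().lower(), number.replace(' ', '').upper())

def build_index(wikidata_rows):
    # wikidata_rows: iterable of (qid, court, number) extracted from existing items.
    return {handelsregister_key(court, number): qid for qid, court, number in wikidata_rows}

def match(index, crawled_entries):
    # crawled_entries: iterable of dicts produced by the Handelsregister crawler.
    for entry in crawled_entries:
        qid = index.get(handelsregister_key(entry['court'], entry['number']))
        yield entry, qid  # qid is None when no Wikidata item carries this key yet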

Question 1: bot or other tool. How is the data added? I am keeping the bot request, but I will look at Mix and Match first. Maybe this tool is better suited for task 1.

Question 2: modeling. Which properties should be used in Wikidata? I am particularly looking for the property for the court as the registering organisation, i.e. the one that has the authority to define the identity of an org, and then also for the number (HRB 32853). The types, i.e. legal status, can be matched to existing Wikidata entries; most exist in the German Wikipedia. Any help with the other properties is appreciated.

Question 3: legal. I still need to read up on the legal situation for importing crawled data. Here is a hint given on the mailing list:

https://en.wikipedia.org/wiki/Sui_generis_database_rights You'd need to check whether in Germany it applies to official acts and registers too... https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights

Task 2 Add all missing identifiers for the remaining orgs in the Handelsregister. Task 2 can be rediscussed and decided once Task 1 is sufficiently finished.

It should meet notability criteria 2: https://www.wikidata.org/wiki/Wikidata:Notability

  • 2. It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references. If there is no item about you yet, you are probably not notable.

The reference is the official German business registry, which is serious and public. Orgs are also by definition clearly identifiable legal entities.

--SebastianHellmann (talk) 07:39, 16 October 2017 (UTC)

Could you make a few example entries to illustrate what the items you want to create will look like? What strategy will you use to avoid creating duplicate items? ChristianKl (talk) 12:38, 16 October 2017 (UTC)
I think this is a good idea, but I agree there needs to be a clear approach to avoiding creating duplicates - we have hundreds of thousands of organizations in wikidata now, many of them businesses, many from Germany, so there certainly should be some overlap. Also I'd like to hear how the proposer plans to keep this information up to date in future. ArthurPSmith (talk) 15:13, 16 October 2017 (UTC)
There was a discussion on the mailing list. It would be easier to complete the info for existing entries in Wikidata at first. I will check mix and match for this or other methods. Once this space is clean, we can rediscuss creating new identifiers. SebastianHellmann (talk) 16:01, 16 October 2017 (UTC)
Is there an existing ID that you plan to use for authority control? Otherwise, do we need a new property? ChristianKl (talk) 20:40, 16 October 2017 (UTC)
I think that the ID needs to be combined, i.e. registering court and register number. That might be two properties. SebastianHellmann (talk) 16:05, 29 November 2017 (UTC)
  • Given that this data is fairly frequently updated, how is it planned to maintain it?
    --- Jura 16:38, 16 October 2017 (UTC)
* The frequency of updates is indeed high: a search for deletion announcements alone in the limited timeframe of 1.9.-15.10.17 finds 6682 deletion announcements (which legally is the most serious change and makes up approx. 10% of all announcements). So within one year, more than 50,000 companies are deleted - which for sure should be reflected in the corresponding Wikidata entries. Jneubert (talk) 15:44, 17 October 2017 (UTC)
Hi all, I updated the bot description, trying to answer all questions from the mailing list and here. I still have three questions, which I am investigating. Help and pointers highly appreciated. SebastianHellmann (talk) 23:36, 16 October 2017 (UTC)
  • Given that German is the default language in Germany I would prefer the entry to be "Sachsen Amtsgericht Leipzig HRB 32853" instead of "Saxony District court Leipzig HRB 32853". Afterwards we can store that as an external ID and make a new property for that (which would need a property proposal). ChristianKl (talk) 12:33, 17 October 2017 (UTC)
Thanks for the updated details here. It sounds like a new identifier property may be needed (unless one of the existing ones like Legal Entity ID (P1278) suffices, but I suspect most of the organizations in this list do not have LEI's (yet?)). Ideally an identifier property has some way to turn the identifiers into a URL link with further information on that particular identified entity, that de-referenceability makes it easy to verify - see "formatter URL" examples on some existing identifier properties. Does such a thing exist for the Handelsregister? ArthurPSmith (talk) 14:58, 17 October 2017 (UTC)

Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl Vladimir Alexiev User:Pintoch Parikan User:Cardinha00 User:zuphilip MB-one User:Simonmarch User:Jneubert Mathieudu68 User:Kippelboy User:Datawiki30

Pictogram voting comment.svg Notified participants of WikiProject Companies for input.

@SebastianHellmann: for task 1, you might also be interested in OpenRefine (make sure you use the German reconciliation interface to get better results). See https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation for details of its reconciliation features. I suspect your dataset might be a bit big though: I think it would be worth trying only on a subset (for instance, filter out those with a low capital). − Pintoch (talk) 14:52, 20 October 2017 (UTC)

Concerning Task 2, I'm a bit worried about the companies' notability (or lack thereof), since the Handelsregister includes any and all companies. Not just the big ones where there's a good chance that Wikipedia articles, other sources, external IDs, etc. exist, but also tiny companies and even one-person companies, like someone selling stuff on Ebay or some guy selling Christmas trees in his village. So it would be very hard to find any data on these companies outside the Handelsregister and the phone book. --Kam Solusar (talk) 05:35, 21 October 2017 (UTC)

Agreed. Do we really need to be a complete copy of the Handelsregister? What for? How about concentrating on a meaningful subset instead that addresses a clear usecase? --LydiaPintscher (talk) 10:35, 21 October 2017 (UTC)
That of course is true. A strict reading of Wikidata:Notability could be seen as requiring at least two reliable sources. But then, that could be the phone book. Do we have to make those criteria more strict? That would require an RfC. Lymantria (talk) 07:58, 1 November 2017 (UTC)
I would at least try an RfC, but I am not immediately sure what to propose.--Ymblanter (talk) 08:05, 1 November 2017 (UTC)
If there's an RfC I would say that it should say that for data-imports of >1000 items the decision whether or not we import the data should be done via a request for bot permissions. ChristianKl (talk) 12:35, 4 November 2017 (UTC)
@SebastianHellmann: is well-intended, but I agree not all companies are notable. Even worse than one-man shops are inactive companies that nobody bothered to close yet. Just "comes from a reputable source" is not enough: e.g. OpenStreetMap is reputable, and it would be OK to import all power stations (e.g. see Enipedia) but imho not OK to import all recyclable garbage cans. We got 950k BG companies at http://businessgraph.ontotext.com/ but we are hesitant to dump them on Wikidata. Unfortunately official trade registers usually lack measures of size or importance...
It's true that Project Companies has not gelled yet and there's no clear community of use for this data. On the other hand, if we don't start somewhere and experiment, we may never get big quantities of company data. So I'd agree to this German data dump by way of experiment. --Vladimir Alexiev (talk) 15:46, 19 November 2017 (UTC)

Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl Vladimir Alexiev User:Pintoch Parikan User:Cardinha00 User:zuphilip MB-one User:Simonmarch User:Jneubert Mathieudu68 User:Kippelboy User:Datawiki30

Pictogram voting comment.svg Notified participants of WikiProject Companies Pictogram voting comment.svg Comment As best I know, Project Companies has yet to gel a workable (for the immediate term) notability standard, so the area remains fuzzy. Here is my current thinking [[22]]. Very much like the above automation of updates. Hopefully the fetching scripts for Germany can be generalized to work in most developed countries that publish structured data on public companies. Would love to find WikiData consensus on notability vs. its IT capacity and stomach for volumes of basically tabular data. Rjlabs (talk) 16:47, 3 November 2017 (UTC)

  • @Rjlabs: That hope is not founded because each jurisdiction does its own thing. OpenCorporates has a bunch of web crawling scripts (some of them donated) that they consider a significant IP. And as @SebastianHellmann: wrote their data is sorta open but not really. --Vladimir Alexiev (talk) 15:46, 19 November 2017 (UTC)
  • I Symbol support vote.svg Support importing the data. Having the data makes it easier to enter the employer when we create items for new people. Companies also engage in other actions that leave marks in databases, such as registering patents or trademarks, and it makes it easier to import such data when we already have items for the companies. The ability to run queries about the companies that are located in a given area is useful. ChristianKl (talk) 17:20, 3 November 2017 (UTC)
    • @ChristianKl: at least half of the 200M or so companies world-wide will never have notable employees nor patents, so "let's import them just in case" is not a good policy --Vladimir Alexiev (talk) 15:46, 19 November 2017 (UTC)
  • When it comes to these mass imports I would only want to mass import datasets about companies from authoritative sources. If we talk about a country like Uganda, I think it would be great to have an item for all companies that truly exist in Uganda. People in Uganda care about the companies that exist in their country, and their government might not have the capability to host that data in a user-friendly way. An African app developer could profit from the existence of a unique identifier that's the same for multiple African countries.
When it comes to the concern about data not being up-to-date, there were multiple cases where I would have really liked data about 19th century companies while doing research in Wikidata. Having data that's kept up-to-date is great, but having old data is also great. ChristianKl () 20:11, 13 December 2017 (UTC)
  • @Rjlabs: We did go back and forth with a lot of ideas on how to set some sort of criteria for company notability. I think any public company with a stock market listing should be considered notable, as there's a lot of public data available on those. For private companies we talked about some kind of size cutoff, but I suppose the existence of 2 or more independent reference sources with information about the company might be enough? ArthurPSmith (talk) 18:01, 3 November 2017 (UTC)
  • @ArthurPSmith:@Denny:@LydiaPintscher: Arthur, let's make it any public company that trades on a recognized stock exchange, anywhere worldwide, with a continuous bid and ask quote, and that actually trades at least once per week, automatically considered "notable" for WikiData inclusion. This is by virtue of the fact that real people wrote real checks to buy shares, there is sufficient continuing trading interest in the stock to make it trade at least once per week, and some exchange somewhere allows that firm to be listed. We should also note that passing this hurdle means that SOME data on that firm is automatically allowable on WikiData, provided the data is regularly updated. Rjlabs (talk) 19:35, 3 November 2017 (UTC)
    • @Rjlabs, Denny, LydiaPintscher: Public Companies are a no-brainer because there's only 60k in the world (there are about 2.6k exchanges); compare to about 200M companies world-wide. --Vladimir Alexiev (talk) 15:46, 19 November 2017 (UTC)
  • Some data means (for right now) information like LEI, name, address, phone, industry code(s), a brief text description of what they do, plus about 10 high-level fields that cover the most frequently needed company data (such as: sales, employees, assets, principal exchange(s) down to where at least 20% of the volume is traded, unique symbol on that exchange, CEO, URL to the investor relations section of the website where detailed financial statements may be found, Central Index Key (or equivalent) with a link to regulatory filings / structured data in the primary country where it's regulated). For now that is all that should be "automatically allowable". No detailed financial statements, line by line, going back 10-20 years, with adjustments for stock splits, etc. No bid/offer/last trade time series. Consensus on further detail has to wait for further gelling up. I ping Lydia and Denny here to be sure they are good with this potential volume of linked data. (I think it would be great, a good start and limited. I especially like it if it MANDATES LEI, if one is available.) Moving down from here (after 100% of public companies that are alive enough to actually trade) there is of course much more. However it's a very murky area. >=2 independent reference sources with information about the company might be too broad, causing WikiData capacity issues, or it may be too burdensome if someone has a structured data source that is much more reliable than WikiData to feed in, but lacks that "second source". Even if it was one absolutely assured good-quality source, and WikiData capacity was not an issue, I'd like to see a "sustainability" requirement up front. Load no private company data where it isn't AUTOMATICALLY updated or expired out. Again, would be great to have further Denny/Lydia input here on any capacity concern. Rjlabs (talk) 19:35, 3 November 2017 (UTC)
    • "A modicum of data" as you describe above is a good criterion for any company. --Vladimir Alexiev (talk)
    • On WikidataCon there was a question from the audience about whether Wikidata would be okay with importing the 400 million entries about items in museums that are currently managed by various museums. User:LydiaPintscher answered by saying that her main concerns aren't technical but whether our community would do well with handling a huge influx of items. Importing data like the Handelsregister will mean that there will be a lot of items that won't be touched by humans, but I don't think that's a major concern for our community. Having more data means more work for our community, but it also means that new people get interested in interacting with Wikidata. When we make decisions like this, however, technical capabilities do matter. I think it would be great if a member of the development team would write a longer blog post that explains the technical capabilities, so that we can better factor them into our policy decisions. ChristianKl (talk) 12:35, 4 November 2017 (UTC)
I agree with Lydia. The issue is hardly the scalability of the software - the software is designed in such a way that there *should* not be problems with 400M new items. The question is do we have a story as a community to ensure that these items don't just turn into dead weight. Do we ensure that items in this set are reconciled with existing items if they should be? That we can deal with attacks on that dataset in some way, with targeted vandalism? Whether the software can scale, I am rather convinced. Whether the community can scale, I think we need to learn that.
Also, for the software, I would suggest not to grow 10x at once, but rather to increase the total size of the database with a bit more measure, and never to more than double it in one go. But this is just, basically, for stress-testing it, and to discover, if possible, early unexpected issues. But the architecture itself should accommodate such sizes without much ado (again - "should" - if we really go for 10x, I expect at least one unexpected bug to show up). --Denny (talk) 23:25, 5 November 2017 (UTC)
Speaking of the community being able to handle dead weight, it seems we mostly lack the tools to do so. Currently we are somewhat flooded by items from cebwiki, and despite efforts by individual users to deal with one or the other problem, we still haven't tackled them systematically, and this has led to countless items with unclear scope complicating every other import.
--- Jura 07:00, 6 November 2017 (UTC)
I don't think we should just add 400M new items in one go either. I don't think that the amount of vandalism that Wikidata faces scales directly with the number of items that we host: if we double the number of items, we don't double the amount of vandalism.
As far as the cebwiki items go, the problem isn't just that there are many items. The problem is that there's unclear scope for a lot of the items. For me that means that when we allow massive data imports we have to make sure that the imported data is of high enough quality that the scope of every item is clear. This means that having a bot approval process for such data imports is important, and suggests to me that we should also get clear about the necessity of having a bot approval for creating a lot of items via QuickStatements.
Currently, we are importing a lot of items via WikiCite and it seems to me that process is working without significant issues.
I agree that scaling the community should be a higher priority than scaling the number of items. One implication of that is that it makes sense to have higher standards for mass imports via bots than for items added by individuals (a newbie is more likely to become involved in our community when we don't greet them by deleting the items they created).
Another implication is that the metric we celebrate shouldn't be focused on the number of items or statements per item but on the number of active editors. ChristianKl () 09:58, 20 November 2017 (UTC)

Now what?[edit]

Lots of good discussion above. Would anyone care to summarize, and how do we move to a decision? --Vladimir Alexiev (talk) 15:10, 5 December 2017 (UTC)

  • Some seem to consider it too granular. Maybe a test could be done with a subset. If no other criteria can be determined, maybe a start could be with companies with a capital > EUR 100 mio.
    --- Jura 20:21, 13 December 2017 (UTC)

Jntent's Bot 1[edit]

Jntent's Bot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Jntent (talkcontribslogs)

Task/s:

The task is to add assertions about airports from template pages.

Code:

The code is based on pywikibot's harvest_templates.py  under scripts in https://github.com/wikimedia/pywikibot-core

Function details:


I added some constraints for literal values with regular expressions to parse "Infobox Airport" and similar templates in other languages. See the constraints listed below.

I hope to scrape the airport templates from a few languages. The "Infobox Airport" template contains links to pages about airport codes. Here is an example of the template:

{{Infobox airport
| name         = Denver International Airport
| image        = Denver International Airport Logo.svg
| image-width  = 250
| image2       = DIA Airport Roof.jpg
| image2-width = 250
| IATA         = DEN
| ICAO         = KDEN
| FAA          = DEN
| WMO          = 72565
| type         = Public
| owner        = City & County of Denver Department of Aviation
| operator     = City & County of Denver Department of Aviation
| city-served  = [[Denver]], the [[Front Range Urban Corridor]], Eastern Colorado, Southeastern Wyoming, and the [[Nebraska Panhandle]]
| location     = Northeastern [[Denver]], [[Colorado]], U.S.
| hub          =
...
}}

I will use links to pages about airport codes to find airports. One example is:

https://en.wikipedia.org/wiki/International_Air_Transport_Association_airport_code

Template element, property and constraining regex (from the properties):
  • IATA: Property:P238, regex [A-Z]{3}
  • ICAO: Property:P239, regex ([A-Z]{2}|[CKY][A-Z0-9])[A-Z0-9]{2}
  • FAA: Property:P240, regex [A-Z0-9]{3,4}
  • coordinates: Property:P625, 6 numbers and 2 cardinalities surrounded by "|" from the coord template: {{coord|39|51|42|N|104|40|23|W|region:US-CO|display=inline,title}}
  • city-served: Property:P931, the first valid link (standard harvest_templates.py behavior)
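A sketch of how those constraints could be enforced on the harvested values (the regexes are copied from the table above; the validate helper itself is illustrative and not part of harvest_templates.py):

# Illustrative validation of harvested infobox values against the constraints above;
# not part of harvest_templates.py itself.
import re

CONSTRAINTS = {
    'P238': r'[A-Z]{3}',                             # IATA
    'P239': r'([A-Z]{2}|[CKY][A-Z0-9])[A-Z0-9]{2}',  # ICAO
    'P240': r'[A-Z0-9]{3,4}',                        # FAA
}

def validate(prop, value):
    # Accept a harvested value only if the whole string matches the constraining regex.
    pattern = CONSTRAINTS.get(prop)
    return pattern is None or re.fullmatch(pattern, value.strip()) is not None

assert validate('P238', 'DEN')
assert validate('P239', 'KDEN')
assert not validate('P238', 'DEN, [[Denver]]')  # reject unparsed infobox leftovers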

 – The preceding unsigned comment was added by Jntent (talk • contribs).

  • Pictogram voting comment.svg Comment I think there were some problems with these infoboxes in one language. Not sure which one it was. Maybe Innocent bystander recalls (I think he mentioned it once).
    --- Jura 11:28, 8 July 2017 (UTC)
    Well, I am not sure if I (today) remember any such problems. But it could be worth mentioning that these codes can also be found in sv:Mall:Geobox and ceb:Plantilya:Geobox, which are used in the Lsjbot articles. These templates are not specially adapted to airports, but Lsj used the same template for this group of articles as well. The Swedish template has special parameters for this ("IATA-kod" and "ICAO-kod") while the cebwiki articles use a parameter "free" and "free_type". (Could be worth checking free1, free2 too.) See ceb:Coyoles (tugpahanan) as an example. -- Innocent bystander (talk) 15:17, 8 July 2017 (UTC)
  • @Jntent: in this edit I see the bot replaced FDKS with FDKB, while in the en.wp infobox and lead section there are two values for the ICAO code: FDKS/FDKB. I would suggest not changing any existing value, or, if changed, these should probably be checked manually. The safest way to act here would be to just add missing values. XXN, 14:07, 17 July 2017 (UTC)
  • @Jntent: Still interested? Matěj Suchánek (talk) 09:21, 3 August 2018 (UTC)


WikiProjectFranceBot[edit]

WikiProjectFranceBot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Alphos (talkcontribslogs)

Task/s: Replace all located in the administrative territorial entity (P131) statements pointing from communes of France to cantons of France by territory overlaps (P3179) statements pointing from the same communes to the same cantons, including qualifiers (there are currently only date qualifiers), and adding a P794 qualifier on each new statement to indicate the subclass of canton.

Code: Partially available (for the first step) on GitHub

Function details: As has been the plan of WikiProject France since we proposed properties to better reflect the relationship between communes and cantons of France, we're now getting to actually push all the statements corresponding to these relationships from located in the administrative territorial entity (P131) to territory overlaps (P3179), and add the exact kind of P3179 this represents as qualifiers to said statements, without removing the original statements at first. Roughly 80 000 edits are to be expected.

At a later date, after checking everything went fine on the first pass, we plan on removing the (faulty) P131 statements between communes and cantons entirely, which will also be done by this bot.
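A rough sketch of the per-statement move, assuming Pywikibot and driven by the SPARQL results shown further down in this request; only the end time (P582) qualifier is copied here, and the P794 value would be canton of France (Q18524218) or canton of France (until 2015) (Q184188) depending on the canton type.

# Sketch of moving one commune-canton relation from P131 to P3179 with a P794 qualifier.
# Assumes Pywikibot; not the exact WikiProjectFranceBot code.
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def move_statement(commune_qid, canton_qid, canton_type_qid, end_time=None):
    commune = pywikibot.ItemPage(repo, commune_qid)
    commune.get()
    new_claim = pywikibot.Claim(repo, 'P3179')            # territory overlaps
    new_claim.setTarget(pywikibot.ItemPage(repo, canton_qid))
    commune.addClaim(new_claim, summary='move commune/canton relation from P131 to P3179')
    kind = pywikibot.Claim(repo, 'P794')                  # "as": kind of canton
    kind.setTarget(pywikibot.ItemPage(repo, canton_type_qid))
    new_claim.addQualifier(kind)
    if end_time is not None:                              # copy the date qualifier, if any
        until = pywikibot.Claim(repo, 'P582')             # end time
        until.setTarget(end_time)                         # a pywikibot.WbTime
        new_claim.addQualifier(until)
    # the original P131 statement is only removed in the later second pass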

VIGNERON
Mathieudu68
Ayack
Aga
Ash Crow
Tubezlob
PAC2
Thierry Caro
Pymouss
Pintoch
Alphos
Nomen ad hoc
GAllegre
Jean-Frédéric
Manu1400
Thibdx
Pictogram voting comment.svg Notified participants of WikiProject France

--Alphos (talk) 20:00, 8 May 2017 (UTC)

@Alphos: Could you provide an example please? Thanks. — Ayack (talk) 09:05, 9 May 2017 (UTC)
Of course.
Nielles-lès-Bléquin (Q1000003) located in the administrative territorial entity (P131) canton of Lumbres (Q1726007)
would be replaced by :
Nielles-lès-Bléquin (Q1000003) territory overlaps (P3179) canton of Lumbres (Q1726007) (no label (P794) canton of France (Q18524218))
and
Sainte-Croix (Q1002122) located in the administrative territorial entity (P131) canton of Montluel (Q1726339) (end time (P582) 2015-03-21)
would be replaced by :
Sainte-Croix (Q1002122) located in the administrative territorial entity (P131) canton of Montluel (Q1726339) (end time (P582) 2015-03-21 ; no label (P794) canton of France (until 2015) (Q184188))
Other "examples" (in fact the whole list) can be found here :
The following query uses these:
  • Properties: subclass of (P279), instance of (P31), located in the administrative territorial entity (P131)

    SELECT DISTINCT ?commune ?canton ?qualProp ?time ?precision ?timezone ?calendar WHERE {
      ?commune p:P31/ps:P31/wdt:P279* wd:Q484170 .
      ?commune p:P131 ?cantonStmt .
      ?cantonStmt ps:P131 ?canton .
      ?canton wdt:P31 ?cantonType .
      VALUES ?cantonType { wd:Q18524218 wd:Q184188 } .
      OPTIONAL {
        ?cantonStmt ?qualifier ?qualVal .
        ?qualProp wikibase:qualifierValue ?qualifier .
        ?qualVal wikibase:timePrecision ?precision ;
                 wikibase:timeValue ?time ;
                 wikibase:timeTimezone ?timezone ;
                 wikibase:timeCalendarModel ?calendar ;
      }
    }
    ORDER BY ASC(?commune) ASC(?canton)

(which is what the bot works on)
Alphos (talk) 09:44, 9 May 2017 (UTC)
Symbol support vote.svg Support The query seems good to me. Can you run a sample batch? -Ash Crow (talk) 18:26, 14 May 2017 (UTC)
The query is undeniably good, but I noticed an issue with edge cases on cantons with double status, working on it and running a small batch (LIMIT 20 or maybe a small french departement), probably later this week. Alphos (talk) 00:05, 16 May 2017 (UTC)
Symbol support vote.svg SupportAyack (talk) 09:02, 16 May 2017 (UTC)
Please, let the bot run a couple of test edits. Besides, please, create the user page of the bot account (e.g. {{bot|Alphos}}). Lymantria (talk) 20:40, 25 June 2017 (UTC)
@Alphos: Any progress to be expected? Lymantria (talk) 13:51, 31 May 2018 (UTC)

Jefft0Bot[edit]

Jefft0Bot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Jefft0 (talkcontribslogs)

Task/s: Add references to external ontologies

Code:

Function details: Add equivalent class (P1709) for an external ontology when that ontology already defines mappings to Wikipedia or Wikidata.
For example, Umbel version 1.50 has mappings to Wikipedia here: https://raw.githubusercontent.com/structureddynamics/UMBEL/d3d1d6c0a566fed335fecfadb75f5501437f9163/External%20Ontologies/wikipedia.n3
such as
<http://umbel.org/umbel/rc/MaoriLanguage> umbel:isRelatedTo <http://wikipedia.org/wiki/Māori_language> .
and that Wikipedia page links to Wikidata item Māori (Q36451). So this item should have equivalent class (P1709) to http://umbel.org/umbel/rc/MaoriLanguage with a reference URL (P854) to the file above. --Jefft0Bot (talk) 15:15, 17 April 2017 (UTC)
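A sketch of that lookup chain with Pywikibot; the N3 parsing is left out and the helper below just takes an already-extracted pair of Umbel URI and Wikipedia title, so treat it as illustrative only.

# Illustrative sketch: resolve a Wikipedia title to its Wikidata item and add
# equivalent class (P1709) with a reference URL (P854). Not the final bot code.
import pywikibot

MAPPING_FILE = ('https://raw.githubusercontent.com/structureddynamics/UMBEL/'
                'd3d1d6c0a566fed335fecfadb75f5501437f9163/External%20Ontologies/wikipedia.n3')

repo = pywikibot.Site('wikidata', 'wikidata').data_repository()
enwiki = pywikibot.Site('en', 'wikipedia')

def add_equivalent_class(umbel_uri, wikipedia_title):
    page = pywikibot.Page(enwiki, wikipedia_title)
    item = pywikibot.ItemPage.fromPage(page)     # follow the sitelink to the Wikidata item
    item.get()
    claim = pywikibot.Claim(repo, 'P1709')       # equivalent class (URL datatype)
    claim.setTarget(umbel_uri)
    item.addClaim(claim, summary='add equivalent class from UMBEL wikipedia.n3 mapping')
    ref = pywikibot.Claim(repo, 'P854')          # reference URL
    ref.setTarget(MAPPING_FILE)
    claim.addSources([ref])

# e.g. add_equivalent_class('http://umbel.org/umbel/rc/MaoriLanguage', 'Māori language')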

Please make several test edits.--Ymblanter (talk) 19:48, 28 July 2017 (UTC)
@Jefft0: Still interested? Matěj Suchánek (talk) 09:18, 3 August 2018 (UTC)

MexBot 2[edit]

MexBot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: MarcAbonce (talkcontribslogs)

Task/s: Add official population data for Mexican municipalities.

Code: https://gitlab.com/a01200356/MexBot/blob/master/poblaciones.py

Function details:
The script finds all Mexican municipalities with an INEGI municipality ID and gets all the official population data available from the API of INEGI (the Mexican public institute that conducts the census).
It will either add or update this data, with INEGI as the source.
It will also add census as the determination method for years ending in 0, when the census is carried out.
MarcAbonce (talk) 21:45, 8 June 2017 (UTC)
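For comparison, adding one such population value with Pywikibot would look roughly like the sketch below (poblaciones.py has its own implementation; the reference here is reduced to a stated in (P248) claim, and the QIDs marked as placeholders are not the real items for "census" and INEGI).

# Rough sketch of one population (P1082) update with qualifiers and an INEGI reference.
# Not the poblaciones.py code; placeholder QIDs must be replaced with the real items.
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def add_population(item_qid, population, year, is_census):
    item = pywikibot.ItemPage(repo, item_qid)
    item.get()
    claim = pywikibot.Claim(repo, 'P1082')                    # population
    claim.setTarget(pywikibot.WbQuantity(amount=population, site=repo))
    item.addClaim(claim, summary='import population from INEGI API')
    when = pywikibot.Claim(repo, 'P585')                      # point in time
    when.setTarget(pywikibot.WbTime(year=year, precision=9))  # year precision
    claim.addQualifier(when)
    if is_census:                                             # census years end in 0
        method = pywikibot.Claim(repo, 'P459')                # determination method
        method.setTarget(pywikibot.ItemPage(repo, 'Q00001'))  # placeholder QID for "census"
        claim.addQualifier(method)
    source = pywikibot.Claim(repo, 'P248')                    # stated in
    source.setTarget(pywikibot.ItemPage(repo, 'Q00002'))      # placeholder QID for INEGI
    claim.addSources([source])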

Symbol support vote.svg Support --PokestarFan • Drink some tea and talk with me • Stalk my edits • I'm not shouting, I just like this font! 23:16, 8 June 2017 (UTC)
Pictogram voting comment.svg Comment: Under which license INEGI publishes population data? XXN, 14:41, 9 June 2017 (UTC)
Not explicitly stated, but it is like CC BY; see point f in the section "Del libre uso de la información del INEGI" of the Términos de uso. I don't think it is compatible. --ValterVB (talk) 17:35, 9 June 2017 (UTC)
Indeed, it only requires attribution, which is precisely what my script intends to add. Why would it be incompatible? Most of this data has already been manually added by people, and apparently by a Wikipedia scraping script too, but it's mostly unsourced. --MarcAbonce (talk)
Here we use CC0; if data here needs citation, the data is incompatible with the license. --ValterVB (talk) 05:47, 11 June 2017 (UTC)
Can census data even be licensed, though? As far as I know, facts cannot be licensed anywhere. If this is the case, this license would only be enforceable with the statistical data they generate (which I'm not using) but it wouldn't be enforceable for a simple, "natural" fact such as a total population.
Also, as I mentioned, this data is already allowed in practice. Wikipedia importing bots have added census data into Wikidata by claiming Wikipedia as the source (which is also CC0 incompatible, by the way), but this data is not generated by Wikipedia, but rather taken from INEGI and imported without source.
So, unless you actually plan to delete all the unsourced and Wikipedia sourced Mexican population data from this site, the most reasonable thing to do would be to treat this data the way it has been treated so far, for the sake of consistency.
--MarcAbonce (talk)
Symbol support vote.svg Support Mexico is outside of the EU and thus there are no sui generis database right concerns. Population data itself is about facts that by their nature aren't protected by copyright. ChristianKl (talk) 09:31, 25 June 2017 (UTC)
The license does not depend on whether Mexico is in or out of the EU. Wikidata uses CC0; INEGI explicitly asks that one "must give credit for the INEGI as an author", so for me they aren't compatible. --ValterVB (talk) 14:32, 25 June 2017 (UTC)

Emijrpbot 8[edit]

Emijrpbot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Emijrp (talkcontribslogs)

Task/s:

The bot adds imported from Wikimedia project (P143) references to Wikinews article (Q17633526) items. In particular, it adds references to the instance of (P31) and language of work or name (P407) statements. See example.

Code: not coded yet

Function details:

The bot uses the sitelink to detect which language version of Wikinews hosts the article, and adds the imported from Wikimedia project (P143) reference accordingly. When there is more than one sitelink, it picks just one (the largest Wikinews edition), based on the number of articles.

--Emijrp (talk) 11:42, 25 March 2017 (UTC)
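Sketched in Pywikibot, the selection logic described above could look roughly like the following; this is not the bot's code (none is written yet, per above), the mapping of Wikinews editions to their Wikidata items is left as a placeholder, and claims that already carry an imported-from reference are skipped.

    # Illustrative sketch only: pick the sitelinked Wikinews edition with the
    # most articles and add an imported from Wikimedia project (P143) reference
    # to the item's P31 and P407 claims.
    import pywikibot

    WIKINEWS_ITEMS = {}   # to fill in: site dbname -> item for that Wikinews edition

    repo = pywikibot.Site('wikidata', 'wikidata').data_repository()

    def article_count(dbname):
        lang = dbname.replace('wikinews', '')
        return pywikibot.Site(lang, 'wikinews').siteinfo['statistics']['articles']

    def add_imported_from(item_id):
        item = pywikibot.ItemPage(repo, item_id)
        item.get()
        links = [db for db in item.sitelinks if db.endswith('wikinews')]
        if not links:
            return
        largest = max(links, key=article_count)
        source_item = pywikibot.ItemPage(repo, WIKINEWS_ITEMS[largest])
        for prop in ('P31', 'P407'):
            for claim in item.claims.get(prop, []):
                if any('P143' in source for source in claim.sources):
                    continue            # already has an imported-from reference
                ref = pywikibot.Claim(repo, 'P143')
                ref.setTarget(source_item)
                claim.addSources([ref])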

For my opinion, see my comment in the previous request for permission. Matěj Suchánek (talk) 17:53, 25 March 2017 (UTC)
  • Comment It's good to add "imported from" as a "source" when importing data from Wikipedia (or Wikinews here), but I don't think it adds much in terms of references. To calculate ratios, one might as well ignore it. For P31, such ratios probably don't add much anyway.
    --- Jura 18:19, 25 March 2017 (UTC)
  • @Matěj Suchánek, Jura1:, are we ready for approval given that the previous one was withdrawn?--Ymblanter (talk) 16:04, 7 April 2017 (UTC)

ZacheBot[edit]

ZacheBot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Zache (talkcontribslogs)

Task/s: Import data from pre-created CSV lists.

Code: based on Pywikibot (Q15169668), sample import scripts [23]

Function details:

--Zache (talk) 23:29, 4 March 2017 (UTC)
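For reference, here is a generic sketch of what a CSV-driven Pywikibot import can look like; the column names and the item-valued property are assumptions for illustration only, not the linked sample scripts [23].

    # Generic, illustrative sketch of a CSV-driven import. Assumes a CSV with
    # columns item,property,value where value is another item ID; the real
    # import scripts may differ substantially.
    import csv
    import pywikibot

    repo = pywikibot.Site('wikidata', 'wikidata').data_repository()

    with open('import.csv', newline='', encoding='utf-8') as fh:
        for row in csv.DictReader(fh):
            item = pywikibot.ItemPage(repo, row['item'])
            item.get()
            if row['property'] in item.claims:
                continue            # simple consistency check: skip if already set
            claim = pywikibot.Claim(repo, row['property'])
            claim.setTarget(pywikibot.ItemPage(repo, row['value']))
            item.addClaim(claim, summary='Import from pre-created CSV list')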

@Zache:, could you please make a couple of test edits? I do not see any lakes in the contributions of the bot.--Ymblanter (talk) 21:20, 14 March 2017 (UTC)
@Zache: Are you still planning to do this task? If so, please provide a few test edits. --Pasleim (talk) 08:13, 11 July 2017 (UTC)
Hi, I did the vaalidatahack import without bot permissions, so that one is done already. The lake import is an ongoing project; it is currently done using QuickStatements for single lakes, and the CC0 licence screening for larger imports is still the same. Most likely there will also be some WLM-related data imports from me this summer, but I am not sure how big (most likely under 2000 items, some of which are updates to existing items and some new). User Susannaanas started this, and I am continuing by filling in the details for the WLM targets. Most likely this WLM work will be done with Pywikibot instead of QuickStatements, because I can do consistency checks in the code. --Zache (talk) 11:12, 11 July 2017 (UTC)

YULbot[edit]

YULbot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: YULdigitalpreservation (talkcontribslogs)

Task/s:

  • YULbot has the task of creating new items for pieces of software that do not yet have items in Wikidata.
  • YULbot will also make statements about those newly-created software items.

Code: I haven't written this bot yet.

Function details:

This bot will set the English-language label for these items and create statements using publisher (P123), ISBN-13 (P212), ISBN-10 (P957), place of publication (P291), and publication date (P577). --YULdigitalpreservation (talk) 18:04, 21 February 2017 (UTC)
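A hedged Pywikibot sketch of that item-creation step might look like the following; the bot has not been written yet (as noted below), and all argument values here are placeholders.

    # Illustrative sketch only: create a new software item, set its English
    # label, and add the listed statements. All argument values are placeholders.
    import pywikibot

    repo = pywikibot.Site('wikidata', 'wikidata').data_repository()

    def create_software_item(label, publisher_qid, place_qid, isbn13, year):
        item = pywikibot.ItemPage(repo)                 # new, empty item
        item.editEntity({'labels': {'en': label}},
                        summary='Create item for software title')

        def add(prop, target):
            claim = pywikibot.Claim(repo, prop)
            claim.setTarget(target)
            item.addClaim(claim)

        add('P123', pywikibot.ItemPage(repo, publisher_qid))   # publisher
        add('P291', pywikibot.ItemPage(repo, place_qid))        # place of publication
        add('P212', isbn13)                                      # ISBN-13
        add('P577', pywikibot.WbTime(year=year))                 # publication date
        return item.getID()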

It would be good to run a test with a few examples so we can see what you're planning! ArthurPSmith (talk) 20:46, 22 February 2017 (UTC)
Interesting. Where does the data come from? Emijrp (talk) 12:04, 25 February 2017 (UTC)
The data is coming from the pieces of software themselves. These are pieces of software in the Yale Library collection. We could also supplement with data from oldversions.com. YULdigitalpreservation (talk) 13:07, 28 February 2017 (UTC)
Please let us know when the bot is ready for approval.--Ymblanter (talk) 21:12, 14 March 2017 (UTC)

YBot[edit]

YBot (talkcontribsnew itemsSULBlock logUser rights logUser rights)
Operator: Superyetkin (talkcontribslogs)

Task/s: import data from Turkish Wikipedia

Code: The bot, currently active on trwiki, uses the Wikibot framework.

Function details: The code imports data (properties and identifiers) from trwiki, aiming to ease the path to Wikidata Phase 3 (having items store the data served in infoboxes) --Superyetkin (talk) 16:42, 12 January 2017 (UTC)

It would be good if you could check for constraint violations instead of just blindly copying data from trwiki. These violations are probably all caused by the bot. --Pasleim (talk) 19:26, 15 January 2018 (UTC)
Yes, I am still interested in this. --Superyetkin (talk) 12:20, 4 March 2018 (UTC)
@Superyetkin: If that is the case, can you take away concerns as indicated by Pasleim, by showing how you'll avoid the constraint violations? Lymantria (talk) 13:53, 31 May 2018 (UTC)
I think I can check for constraint violations using the related API method. --Superyetkin (talk) 17:55, 1 June 2018 (UTC)
@Pasleim: Would that be sufficient? Lymantria (talk) 09:10, 3 June 2018 (UTC)
That API method works only for statements which have already been added to Wikidata. It would be good if some consistency check could be made prior to adding a statement. For example, the unique value constraint of YerelNet village ID (P2123) can be checked by downloading all current values [24], importing them into an array, and then, before saving a statement, checking whether the value is already in the array. The format constraint can be checked in PHP with preg_match(). Item constraints don't need to be checked because they only indicate missing data, not wrong data. --Pasleim (talk) 17:52, 3 June 2018 (UTC)
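To make that suggestion concrete, here is a small sketch of the two pre-checks. It is written in Python rather than the bot's PHP, and the format pattern is a placeholder that would have to match the property's actual format constraint.

    # Illustrative sketch of the suggested pre-checks: load all existing
    # YerelNet village ID (P2123) values once, then test format and uniqueness
    # before saving a statement. FORMAT_RE is a placeholder pattern.
    import re
    import requests

    FORMAT_RE = re.compile(r'^\d+$')     # placeholder; use the real format constraint
    QUERY = 'SELECT ?value WHERE { ?item wdt:P2123 ?value }'

    response = requests.get('https://query.wikidata.org/sparql',
                            params={'query': QUERY, 'format': 'json'})
    existing_values = {b['value']['value']
                       for b in response.json()['results']['bindings']}

    def may_add(value):
        """Return True only if the value is well-formed and not yet in use."""
        return bool(FORMAT_RE.match(value)) and value not in existing_values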