Wikidata:Events/30 lexic-o-days 2021/Outcomes
On this page, let's list all the things that have been created, worked on or improved during 30 Lexic-o-days. It doesn't have to be a big achievement! Did you improve some Lexemes, update documentation or write a cool query? Feel free to add it here.
If you want to, you can also present your work during the showcase, taking place on April 14th (more details here).
Template to add something:
* Short description of the outcome and what you did, with links if possible ([[user:X|X]])
Contributions
If you made contributions to the content of Lexicographical Data, for example creating or improving Lexemes, feel free to add a summary here.
- Hundreds of new lexemes in Norwegian Bokmål, complete with IPA transcription (P898) for each form (Jon Harald Søby)
- I added a large number of English proper nouns, with forms and senses, and added many senses to English nouns where they were missing. Still working on both of these. I also added forms for tens of thousands of English lexemes that previously had no forms at all (nouns, verbs, adjectives) using the bulk feature in the Lexeme Forms tool (forms generated in a semi-automated fashion using a text editor with search/replace tools). ArthurPSmith (talk) 20:18, 1 April 2021 (UTC)
- I am currently adding
- Malayalam (proper) nouns, with forms, senses (in English) and the values for item for this sense (P5137)
- French and English nouns related to COVID-19 and climate change (as part of Climate lexeme week), with forms, senses (in English) and the values for item for this sense (P5137). John Samuel (talk) 20:47, 1 April 2021 (UTC)
- Creation of the first Lexemes in Lorrain (Q671198).
- Lexemes for the names of each of the 118 chemical elements in Swedish, complete with forms and senses connected with item for this sense (P5137) (Belteshassar)
- New property proposals Wikidata:Property proposal/Den Danske Ordbog article ID and Wikidata:Property proposal/Den Danske Ordbog idiom ID--So9q (talk) 07:09, 11 April 2021 (UTC)
- Created Lexeme entries along with basic tense forms for all of the Malayalam verbs in ml Wiktionary: 3700 entries. Vis M (talk) 17:34, 13 April 2021 (UTC)
- I have been contributing a lot to Dagbani lexemes, adding many senses, statements and forms. Dnshitobu (talk) 01:28, 3 May 2021 (UTC)
Please check out my contributions via this link: https://www.wikidata.org/wiki/Special:Contributions/Dnshitobu
Documentation
Did you create or improve documentation pages during 30 Lexicodays? Awesome! Please add the links below!
- Added Swedish example lexemes (in a two-hour stream) Ainali (talk)
- Documentation for Malayalam Lexemes based on template by User:VIGNERON. John Samuel (talk) 09:52, 11 April 2021 (UTC)
Videos
Demos, live-editing or tutorials produced about Lexemes or tools.
- Live editing lexemes (YouTube, Facebook)
- Live editing lexemes (in French) by VIGNERON - YouTube
- Live SPARQL queries on Lexemes (in English) by VIGNERON - YouTube
- SPARQL queries around lexemes, words, lexicography (in French) by VIGNERON - YouTube
- Editing lexemes on Wikidata (in Swedish) by Ainali - YouTube
- Launch of Climate Lexeme Week (YouTube, Facebook)
- Data imports for Wikidata Lexemes, panel discussion with Yurik, Uziel and Kirill (YouTube, Commons)
- Leveraging text corpora for curating lexicographical data, brainstorming with Daniel Mietchen (YouTube, Commons)
Forthcoming work
Some documentation work is underway that was initiated by Lexic-o-days: see Dan Shick's project page for more info.
Discussions
Important discussions and decisions made during 30 Lexicodays (for example, on data modelling) can be summarized here.
- Making sense, essay and discussion that attempt to create a straw-dog proposal on how to model senses, and collect some resources on the topic. (Denny)
- Reverse engineering the identifiers of Den Danske Ordbog (Q1186741)--So9q (talk) 07:09, 11 April 2021 (UTC)
- Found section identifiers of Svenska Akademiens Ordbok (Q1935308) and talked with User:Salgo60 about whether a new property is needed. I just decided it is, because we want to use it on the sense level to refer to the same sense in SAOB.--So9q (talk) 07:09, 11 April 2021 (UTC)
- Brainstorming about how to use the existing texts on the Wikimedia projects to generate Lexemes - Leveraging text corpora for curating lexicographical data: etherpad notes of the brainstorming facilitated by Daniel Mietchen, recording of the session (Youtube, Commons), follow-up wikipage: Wikidata:Text corpora to lexicographical data
Tools
If you created a new Lexeme-related tool, or added improvements to an existing one, we want to know! Also, don't forget to add the new tools to this page.
- New templates for Wikidata Lexeme Forms: Latvian nouns, Malayalam nouns and verbs, Portuguese nouns and adjectives, Breton nouns and adverbs. Templates can now have optional forms. You can now link directly to a language on the tool main page. Bulk mode now shows more helpful error messages, which are also translatable. The tool no longer automatically redirects users who are not logged in to the login page, which makes “edit mode” in particular more useful as a general “viewer” for the forms of a lexeme.
- Improved lexeme support in @wikilinksbot on Telegram: Individual forms and senses are now exposed, and you can specify a language code for lemmas with multiple representations. (Jon Harald Søby)
- Adding an inputbox to help creating new subpages on Wikidata:Lexicographical data/Documentation/Languages
- Discussion about how Lingua Libre could be even better connected with Wikidata, and especially improvements on Lingua Libre Bot (T224312). We mentioned fixing the feature that allowed recording words based on a SPARQL query and linking them to Forms, and also creating a gadget (T279826) and a special page (T279822) that would list newer recordings for Lexemes already having one.
- I wrote and rewrote Wikidata:Tools/LexSAOB.--So9q (talk) 11:08, 8 April 2021 (UTC)
- I forked and improved User:So9q/Gadget-CreateNewEntity.js to support creating lexemes from the entity suggester dropdown menu; see also the original author there.--So9q (talk) 11:08, 8 April 2021 (UTC)
- I finished new templates for Lexeme Forms: Danish adjectives. It entailed discussion in Telegram about avoiding redundancy ("more" and "most" forms were therefore dropped as is also the case with Swedish and English).--So9q (talk) 18:31, 12 April 2021 (UTC)
- The lexicographical coverage dashboard code has now been moved to GitHub and can accept pull requests
Queries
Any interesting query that you wrote or adapted for your work on Lexemes can go here!
- Finding nouns without forms and plugging them into the lexeme-forms tool (Ainali (talk))
- missing sv labels on climate items that need to be researched or found via svWP.--So9q (talk) 09:21, 9 April 2021 (UTC)
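As an illustration of the first kind of query above (nouns without forms), here is a minimal sketch in Python that builds a SPARQL query string and a request URL for the Wikidata Query Service. It is not the query written by the contributors. The function names are hypothetical, and the identifiers Q9027 (Swedish) and Q1084 (noun) are assumptions chosen for the example, as is the use of the standard `query.wikidata.org/sparql` endpoint.

```python
from urllib.parse import urlencode

# Assumed Wikidata identifiers, chosen only for illustration.
SWEDISH = "Q9027"  # assumption: item for the Swedish language
NOUN = "Q1084"     # assumption: item for the lexical category "noun"

# Assumption: the public Wikidata Query Service SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"


def nouns_without_forms_query(language_qid: str = SWEDISH, limit: int = 50) -> str:
    """Build a SPARQL query listing noun Lexemes that have no forms (hypothetical helper)."""
    return f"""
SELECT ?lexeme ?lemma WHERE {{
  ?lexeme dct:language wd:{language_qid} ;
          wikibase:lexicalCategory wd:{NOUN} ;
          wikibase:lemma ?lemma .
  FILTER NOT EXISTS {{ ?lexeme ontolex:lexicalForm ?form . }}
}}
LIMIT {limit}
""".strip()


def query_url(query: str) -> str:
    """Return a GET URL for the query; this does not send any request."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    q = nouns_without_forms_query(limit=10)
    print(q)
    print(query_url(q))
```

The resulting lemmas could then be entered into the Lexeme Forms tool by hand or through its bulk mode; how the original query was connected to the tool is not described here.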
Mass imports
If you imported a lot of data to Lexemes in a (partially) automated way, feel free to mention it here, and to provide details on the tools you used.
- Imported ~5000 SAOB identifiers using LexSAOB that could be matched automatically.--So9q (talk) 09:17, 9 April 2021 (UTC)
Development
Here we will list the work that the Wikidata development team has been doing on the topic of Lexicographical Data during 30 Lexicodays.
- Add new graphs at the bottom of the Wikidata Site Stats Grafana board, showing number of users and active users per namespace (so you can check the evolution of the number of editors working on Lexemes)
- Work on the ability to edit Statements on Senses via the API (work in progress)
- Prototype of a Wikidata Game to help people connecting Senses to Items (not yet available)
- Support and conclude the discussion about focus languages (TBD)
- Bug Triage Hour about Lexemes, improving Phab tickets together with editors (notes)
- Start the research phase and user interviews for improving the interface of Lexeme pages, especially Senses
- Onboard new colleagues on the topic of Lexicographical Data, generally increase awareness and knowledge inside the different teams about Lexicographical Data on Wikidata