Wikidata talk:Wikidata Lexeme Forms/Yoruba

Yoruba language support

Hello Lucas Werkmeister, I hope you are doing fine. I am pinging you per the instructions at Wikidata:Wikidata_Lexeme_Forms#Language_support. Please add support for the Yoruba language. Thank you. T Cells (talk) 20:37, 30 September 2021 (UTC)

@T Cells It looks like so far, the templates are very closely based on the English ones. I think that’s a misunderstanding – they need to be adapted to the target language, and can be very different between languages. For example, the grammatical feature combination of simple present (Q3910936) + third person (Q51929074) + singular (Q110786) is very specific to English verbs and probably shouldn’t be used in most other languages.
Also, I’m not sure many forms will be needed at all. w:Yoruba language#Grammar is unfortunately pretty short, but from what I gather, it seems like Yoruba words usually don’t have many forms, and variations like number or tense are expressed through additional words instead? In the templates you’ve provided, the “verb” examples all seem to contain the same form, kọrin; both of the “noun” examples share the words gbé, ọmọ and (though I don’t know which of them is the noun); and the “adjective” examples also seem similar (though dára is maybe not the same as dárá, I’m not sure). Lucas Werkmeister (talk) 19:36, 5 October 2021 (UTC)[reply]
Lucas Werkmeister, you were right about the misunderstanding. I believe I have fixed it now. Please let me know if I am missing anything. Regards. T Cells (talk) 21:14, 6 October 2021 (UTC)
@T Cells Thanks, that looks better, but there are still some issues. I think it’s best if we start with general points:
  1. Usually, each form should have at least one grammatical feature. (An exception are templates with a single form – so far, mostly adverb templates.) Several of the forms have no grammatical features at the moment.
  2. Each form’s example sentence must have exactly one placeholder ([this part]), which will be replaced by an input field for the form representation(s) in the tool. Currently, in several examples no part of the example is marked as a placeholder (maybe some part is meant to be the placeholder, but without speaking the language, I can’t tell), and one example has two placeholders (Ìyá [bu] omi [mu]).
  3. If at all possible, avoid putting the placeholder at the beginning of the sentence. Otherwise, it probably encourages users to mistakenly capitalize the word in the input field, even though the lexeme form shouldn’t actually be capitalized (unless it’s a proper noun, or capitalization is required for another reason). Try to rephrase the sentences so that some other words can go before the placeholder.
I hope this helps. (We’ll probably need at least one more round of feedback to resolve smaller issues, but I think we’re making progress.) Lucas Werkmeister (talk) 14:52, 9 October 2021 (UTC)
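The three general points above lend themselves to a mechanical check. What follows is only a rough sketch in Python, not the tool's actual code: the function name check_form, its parameters, and the bracket-matching regular expression are assumptions made for illustration.

```python
import re

# Hypothetical pattern: a placeholder is any non-empty text in square brackets.
PLACEHOLDER = re.compile(r"\[[^\[\]]+\]")


def check_form(example, features, single_form_template=False):
    """Return a list of problems for one form, following the three points above."""
    problems = []

    # Point 2: exactly one placeholder per example sentence.
    placeholders = PLACEHOLDER.findall(example)
    if len(placeholders) != 1:
        problems.append(
            f"expected exactly one placeholder, found {len(placeholders)}"
        )

    # Point 3: the placeholder should not open the sentence.
    if example.lstrip().startswith("["):
        problems.append("placeholder is at the beginning of the sentence")

    # Point 1: at least one grammatical feature, unless the template
    # has only a single form.
    if not features and not single_form_template:
        problems.append("form has no grammatical features")

    return problems


# The example with two placeholders mentioned in point 2:
print(check_form("Ìyá [bu] omi [mu]", ["Q110786"]))
```

A check like this cannot tell whether the bracketed word is the right one for the form; that still needs someone who speaks the language.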
@Lucas Werkmeister: I believe everything is in order now. Thank you. T Cells (talk) 19:45, 24 October 2021 (UTC)
@T Cells: I still see one form without a grammatical feature, and two example sentences where the placeholder is at the beginning (capitalized). The auxiliary verb (Q465800) grammatical feature also seems odd to me; that item feels more like a lexical category than a grammatical feature. (It’s also only used as a grammatical feature once so far, on na (L616289).) And if the “noun” template has a form with the grammatical feature singular (Q110786), then I’d also expect another form with a grammatical feature like plural (Q146786), or maybe dual (Q110022) or something else – but if there’s just one form, it seems strange to me to describe it as singular (Q110786). Lucas Werkmeister (talk) 20:35, 25 October 2021 (UTC)
@Lucas Werkmeister: Thanks for the clarification. I didn't know that the placeholder shouldn't be capitalised at the beginning of the sentence. I wanted to ask you for clarification at the time but got distracted. I am a native speaker of the Yorùbá language and I studied it as a compulsory subject in high school for 6 years. I am a sysop on Yoruba Wikipedia as well. I understand the language very well. I was just confused about how you want the lexeme done here. Please take a look at it and let me know if there are things for me to fix. Thanks for your patience. Best Regards. T Cells (talk) 21:08, 26 October 2021 (UTC)
@T Cells Thanks! In the example sentence mo tẹ̀lé [Dele] lọ oko., the placeholder word is still capitalized – is that correct or should it be lowercase? (It could be capitalized if Yorùbá grammar requires it, of course.) The beginning of the example can also be capitalized, but it doesn’t have to be :)
One other thing I’m noticing is that, to my eye, some of the example words look very different – Dele and àwọnọkùnrin seem totally different to me, and to a lesser extent also pupa and rẹwà. Usually, all the placeholder forms are different forms of the same word, since the real input would be different forms of the same word as well – is that the case here? (Maybe these really are forms of the same word, and I just can’t tell because I don’t know any Yorùbá: I just want to double-check.)
And then I still feel like some grammatical features are off. The noun template has singular (Q110786) and third-person plural (Q51929517) – but if it uses that item, doesn’t that mean there should be some other plural forms as well (otherwise I’d expect plural (Q146786))? And the first adjective form has no grammatical features, maybe that should be positive (Q63302092)? (I’d also expect a superlative (Q1817208) form, but maybe Yorùbá doesn’t have that, that’s totally possible.) Lucas Werkmeister (talk) 21:11, 28 October 2021 (UTC)
@Lucas Werkmeister thanks for your response. The placeholder word [Dele] should have the first letter capitalised regardless of whether it starts a sentence or not. It's the name of a person. In Yoruba, the first letter of a name should be capitalized. It is correct in its current state. [Dele] and [àwọnọkùnrin] are totally different. [Dele] is a singular and [àwọnọkùnrin] is a plural. Dele is the name of one person while àwọnọkùnrin means "men". We can also replace [Dele] with [ọmọkùnrin] (third-person singular). Please, what did you mean by "all the placeholder forms are different forms of the same word"? The form with no grammatical feature has been fixed and a form of adjective (a superlative) with an example has been added. T Cells (talk) 22:13, 28 October 2021 (UTC)
@T Cells Okay, but if Dele is a name, then it shouldn’t be the placeholder word at all. If I understand you correctly, ọmọkùnrin should be the placeholder for a third-person singular form (though I’m confused by a third-person singular for a noun… are there other singular forms for nouns as well?). This is what I mean by all the placeholder forms being different forms of the same word, too: you should know which forms a word of a certain lexical category can have (e.g. for nouns: singular/plural? singular/dual/plural? singular/plural plus different cases? etc.), then create all those forms for a single example word, and then build example sentences around those words, so that the word occurs in the sentence in that particular form. (This can mean that the example sentences have to be very different from one another – especially when the language has multiple cases, in my experience.) Does that make sense? Lucas Werkmeister (talk) 23:56, 28 October 2021 (UTC)[reply]
@Lucas Werkmeister thank you. Please check the page and let me know if I missed anything. T Cells (talk) 13:34, 29 October 2021 (UTC)
Pinging @Lucas Werkmeister in case they missed my earlier ping. T Cells (talk) 19:20, 3 November 2021 (UTC)
@T Cells Thanks. There are still two aspects of the grammatical features I find strange: that the noun template has a third-person plural (Q51929517) form, but no other kind of plural (are there other plurals for nouns? or should it just be plural (Q146786)?), and that the adjective template only has superlative (Q1817208) and comparative (Q14169499) forms (no positive (Q63302092)?). Apart from that, I find it odd that most of the forms just use the placeholder word as the label, but if you think that makes the most sense, I could live with it. Lucas Werkmeister (talk) 12:45, 6 November 2021 (UTC)[reply]
@User:Lucas Werkmeister, there are no other types of plural. As for why most of the forms just use the placeholder word as the label: that is fine for the Yoruba language. There is a positive (Q63302092); I have added the form. See this book. T Cells (talk) 14:46, 6 November 2021 (UTC)
@T Cells Okay, then why is the grammatical feature third-person plural (Q51929517) and not just plural (Q146786)?
Adjectives seem to be more difficult… the book you linked (thanks!) says (page 11) that the comparative is formed by and the superlative by julọ, but in the example given, Eyi tibai gu mejeji lọ, it seems to be divided from the rest of the adjective, and I’m not sure how to model that. (Also, the example has rather than ju for the superlative?) Meanwhile, this other book claims that comparative and superlative forms don’t really exist at all – instead, it seems like and lọ are treated as separate words there, with a space before the and the lọ (but jùlọ as one word when adjacent?). Maybe we should just leave out adjectives for now… Lucas Werkmeister (talk) 15:31, 6 November 2021 (UTC)[reply]
@User:Lucas Werkmeister yes, it should just be plural (Q146786). I forgot to fix it. I have fixed it now. "ju" and "jù" mean the same thing. It is the context that determines which one to use and determines the grammatical feature. For example "ilé mi ni ó ga [jù] lọ" (my house is the tallest) - superlative. "ilé mi ga [ju] tì ẹ lọ" (my house is taller than yours) - comparative. "jùlọ" is superlative and when separated as "jù lọ", it has the same contextual meaning. It could also be written as "jùúlọ" and in this case, it would be used as comparative. lọ does not follow [ju] as "julo" (julo has no real meaning in Yoruba). lọ only follows [jù] as either [jùlọ] or [jù lọ] - superlative. My analysis above, with its examples, is correct. The book seems to be contradicting itself, and the reason is that the author fails to assign the diacritical marks correctly. T Cells (talk) 17:43, 6 November 2021 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @T Cells: Okay. Regarding the latest noun edit: the placeholder words are now kan and àwọn, but English Wiktionary claims that kan means “one” and àwọn “precedes a noun to make it plural”, so I don’t think these are the right placeholder words. It sounds like nouns should really only have one form, and singular/plural is just specified using additional words, not with additional forms of the noun itself. (The book you linked seems to agree with this, stating on page 10 that only one word has a plural: ọmmọde “child” / majeṣí “children”.) And maybe the same is true for adjectives as well, that ju/jù/jùlo are separate words, rather than creating extra forms of the adjective? --Lucas Werkmeister (talk) 19:04, 6 November 2021 (UTC)[reply]

@User:Lucas Werkmeister, in the Yoruba language, words like "gbogbo" and "àwọn" must precede a noun to make a plural. If the settings will only work if the plural form is expressed as a single word such as màjèsín, then we should leave the plural out. I'd suggest that we leave ju/jù/jùlo in their current forms. T Cells (talk) 20:53, 6 November 2021 (UTC)
@T Cells In principle, it’s totally possible to have lexemes and forms with more than one word, but if the plural is always formed the same way, that doesn’t seem very useful to me. It seems like it would be easier to just have a single form for Yoruba nouns, and then the knowledge “to form the plural, add àwọn before the noun” could be somewhere else (e.g. for Abstract Wikipedia, in the Renderer functions, I think?), instead of being duplicated in each lexeme.
Does the àwọn always go directly before the noun, or can there be other words between them depending on the sentence? (Like e.g. “many great languages” in English.) If there can be other words in between, then I would lean even more heavily towards only having one form. Lucas Werkmeister (talk) 17:31, 7 November 2021 (UTC)[reply]
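The idea of keeping the plural rule in one place, rather than storing it on every lexeme, can be illustrated with a small Python sketch. This is not an actual Abstract Wikipedia renderer: the names PLURAL_MARKER and render_noun are hypothetical, and the sketch assumes that the marker always goes directly before the noun, which is the very question asked above.

```python
# Hypothetical constant: the word that precedes a noun to make it plural.
PLURAL_MARKER = "àwọn"


def render_noun(noun: str, plural: bool = False) -> str:
    """Render a noun from its single stored form.

    The plural is built by rule at render time, so no separate
    plural form has to be stored on each lexeme.
    """
    if plural:
        return f"{PLURAL_MARKER} {noun}"
    return noun


# Every noun lexeme only needs one form; the rule lives here.
print(render_noun("ilé"))
print(render_noun("ilé", plural=True))
```

If other words could come between the marker and the noun, a function like this would need the whole noun phrase rather than a single word, which is one more reason not to store the plural on the lexeme itself.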
@User:Lucas Werkmeister; àwọn always goes directly before the noun. I agree with you that we should just have a single form for nouns, to avoid repetition. T Cells (talk) 17:42, 7 November 2021 (UTC)
Hello @User:Lucas Werkmeister; are there any other things I need to do? Thank you. T Cells (talk) 21:27, 11 November 2021 (UTC)
@T Cells If you agree that we should just have a single form for nouns, then please update the template accordingly :) then that one should be ready to go. (I’m still not sure about the other templates, but I think it’s best if we get the nouns done first.) Lucas Werkmeister (talk) 12:36, 13 November 2021 (UTC)[reply]
@User:Lucas Werkmeister, the template for nouns has been updated. Let's move on with the nouns. Aside from the adjective template, which is unclear, which of the templates are you not sure of? T Cells (talk) 21:18, 13 November 2021 (UTC)
@T Cells: Sorry, I just realized one more thing about the noun template: the singular (Q110786) grammatical feature should probably be removed now, right? If there’s only one form, it’s not unusual for it to have no grammatical features. Lucas Werkmeister (talk) 12:52, 14 November 2021 (UTC)[reply]
@User:Lucas Werkmeister, ✓ Done. Thank you. T Cells (talk) 13:44, 14 November 2021 (UTC)
@T Cells: Alright, nouns are deployed now \o/ I’ll take another look at the other templates now. Lucas Werkmeister (talk) 14:58, 14 November 2021 (UTC)[reply]
@User:Lucas Werkmeister, thank you. I will keep an eye on this page. T Cells (talk) 15:46, 14 November 2021 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @T Cells: Alright, sorry for the delay.

  • Adjectives: I’m still not convinced it makes sense to store comparative and superlative forms. From your “house is tallest/taller” example, comparative and superlative seem more like different ways to build the sentence, rather than forms that need to be stored on every adjective lexeme. (And if they are forms of each adjective, then the adjective stem/root(?) should at least be included in the form representation, I believe: so in the house examples from earlier, that would be …ó [ga jù lọ] instead of …ó [ga] jù lọ, and …mi [ga ju] tì… instead of …mi ga [ju] tì…, if ga is the adjective.)
  • Verbs: Are ń, tán and nní really forms of the same verb? To me they don’t look very similar (though of course the same could be said about e.g. be/am/are/is/was ^^), and I also don’t see much of a relation between these three forms and the forms listed in A Grammar of the Yoruba Language (pages 22–27).
  • Pronouns: To me it doesn’t seem useful to include them in the Wikidata Lexeme Forms tool – I assume Yoruba has a fairly limited set of pronouns, which can be created manually. (So far, I haven’t added templates for pronouns in any other languages either, though they were proposed for Igbo, see discussion.) The tool is mainly useful for lexical categories with a great number of words, all of which have similar forms; I don’t think that’s usually the case for pronouns.

--Lucas Werkmeister (talk) 17:31, 20 November 2021 (UTC)[reply]

@Lucas Werkmeister: I apologize for this extremely slow response. November - December is often very busy for me. I'd like to ask: Why can't we store superlative forms? There are thousands of superlative words in Yorùbá: "dára", "rẹwà", "tóbi", "kéré" etc. But for the comparative, it makes sense for the adjective stem/root to be included in the form representation. For the verbs, I think I got confused at some point with placeholder [placeholder]. Please help me understand where a placeholder is expected to be for each of the forms. Let me cite some examples with verbs: "I am eating now" (present continuous). This is the same as "Mò ń jẹun lọ́wọ́" (present continuous). Where should the placeholder be for the English? With that, I will be able to know where the placeholder should be placed in Yorùbá. Let me give another example: "I read everyday" (simple present). This is the same as "Mò ń kàwé lójoojúmọ́". Where should the placeholder be placed in English? Let me give this final example: "Last night, I read a book authored by Barack Obama" (simple past). This is the same as "Ní àná, mo ka ìwé tí Barack Obama kọ" (simple past). Where should the placeholder be placed for English? T Cells (talk) 21:04, 28 December 2021 (UTC)
The placeholders would look like this in English:
  • I am [eating] now.
  • I [read] everyday.
  • Last night, I [read] a book authored by Barack Obama.
Users would then be expected to fill in the appropriate form of a different verb, resulting in sentences like:
  • I am sleeping now.
  • I sleep everyday.
  • Last night, I slept a book authored by Barack Obama.
(As you can tell from that last sentence, that wouldn’t be a great example – the example sentences should, as far as possible, work with a wide variety of different words in place of the placeholder.)
Regarding adjectives, we could store superlative forms, I’m trying to figure out if we should. Maybe you can show me some example sentence with positive and superlative adjectives, together with English translations, so I can understand better how it works? Lucas Werkmeister (talk) 20:27, 8 January 2022 (UTC)[reply]
@T Cells: (I forgot to ping you, sorry) --Lucas Werkmeister (talk) 20:27, 8 January 2022 (UTC)
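The placeholder substitution shown in the English examples above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation; the function name fill_example and the regular expression are assumptions.

```python
import re

# Hypothetical pattern: the placeholder is the bracketed part of the example.
PLACEHOLDER = re.compile(r"\[[^\[\]]+\]")


def fill_example(example: str, form: str) -> str:
    """Replace the single [placeholder] in an example with the user's form."""
    # A function is used as the replacement so that backslashes in the
    # user's input are never interpreted as regex escapes.
    return PLACEHOLDER.sub(lambda match: form, example, count=1)


templates = [
    "I am [eating] now.",
    "I [read] everyday.",
    "Last night, I [read] a book authored by Barack Obama.",
]
for example, form in zip(templates, ["sleeping", "sleep", "slept"]):
    print(fill_example(example, form))
```

Running the loop reproduces the three "sleep" sentences, including the awkward last one, which is a quick way to see whether an example sentence still reads naturally with a different word in the placeholder.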
@Lucas Werkmeister: thank you. This is helpful. An example of a superlative adjective is "Ẹsẹ̀ mí [tóbi ju lọ]" (my leg is the biggest). An example of a positive adjective: "mo ní ímọ̀ tí ó [péye] nípa ẹ̀rọ ayélujára" (I have adequate knowledge of telecommunications). T Cells (talk) 11:37, 9 January 2022 (UTC)
@T Cells I see, and is it possible to build a sentence in a way so that there are other words between tóbi and ju lọ? Lucas Werkmeister (talk) 16:40, 9 January 2022 (UTC)[reply]
@Lucas Werkmeister:, No. "jùlọ" is often used for superlative. No words should come between "tóbi" and "jùlọ". T Cells (talk) 20:32, 12 January 2022 (UTC)[reply]
@T Cells Alright, then let’s include the superlative if you want. But the placeholder still shouldn’t be only the [jùlọ], as it is right now, I think. Lucas Werkmeister (talk) 12:58, 23 January 2022 (UTC)[reply]
@Lucas Werkmeister: there are other ones such as "ràbàtà", "púpọ̀", "gidi" etc. Why do you think the placeholder still shouldn’t be only the [jùlọ] in that sentence? T Cells (talk) 13:26, 24 January 2022 (UTC)
@T Cells Because the part in brackets is the only part that ends up in the lexeme, and I thought that the [jùlọ] is always the same, so that would mean that all the lexemes would have the exact same form, which doesn’t seem useful.
Put another way – my impression was that e.g. Ẹsẹ̀ mí [tóbi ju lọ] is a bit like My leg is the [most big], but if it’s Ẹsẹ̀ mí tóbi [ju lọ], then that’s like My leg is the [most] big, which leaves out the “main” part of the adjective (“big”). Lucas Werkmeister (talk) 21:29, 24 January 2022 (UTC)[reply]
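The point that only the bracketed part ends up in the lexeme can be made concrete with a short Python sketch. It uses the English paraphrases from the comment above; the function name form_representation is hypothetical and this is not the tool's real code.

```python
import re

# Hypothetical pattern capturing the text inside the square brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def form_representation(example: str) -> str:
    """Return the bracketed part of an example: the only text that
    would be stored as the form representation on the lexeme."""
    matches = PLACEHOLDER.findall(example)
    if len(matches) != 1:
        raise ValueError("an example needs exactly one placeholder")
    return matches[0]


# Bracketing the whole expression keeps the adjective in the form:
print(form_representation("My leg is the [most big]"))

# Bracketing only the marker gives every adjective the same form:
print(form_representation("My leg is the [most] big"))
print(form_representation("My leg is the [most] small"))
```

With the second placement, "big" and "small" would both get the identical stored form, which is why the adjective itself has to be inside the brackets.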
@Lucas Werkmeister: You are correct. Fixed. T Cells (talk) 23:23, 24 January 2022 (UTC)
@T Cells Thanks! Then I think I just have one more question for the adjective template (which I apparently forgot to ask before): should the forms really be in the current order? I think the usual order is positive (Q63302092), comparative (Q14169499), superlative (Q1817208); also, the first form (currently superlative (Q1817208)) will be used for the lemma of the lexeme, so I feel like positive (Q63302092) makes more sense for that reason as well. Lucas Werkmeister (talk) 20:01, 27 January 2022 (UTC)[reply]
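The effect of form order on the lemma can be sketched as follows. This is a simplified Python illustration under assumptions: the dictionary keys and the function names order_forms and lemma_of are invented for the example, and English forms stand in for Yoruba ones. The item IDs are the ones named in the comment above.

```python
# Usual order of degrees: positive, comparative, superlative.
DEGREE_ORDER = ["Q63302092", "Q14169499", "Q1817208"]


def order_forms(forms):
    """Sort forms into the usual positive/comparative/superlative order."""
    return sorted(forms, key=lambda form: DEGREE_ORDER.index(form["feature"]))


def lemma_of(forms):
    """The first form in the template supplies the lemma of the lexeme."""
    return forms[0]["representation"]


# Forms listed with the superlative first, as in the template at that point.
forms = [
    {"feature": "Q1817208", "representation": "biggest"},
    {"feature": "Q63302092", "representation": "big"},
    {"feature": "Q14169499", "representation": "bigger"},
]

print(lemma_of(forms))               # lemma taken from the superlative
print(lemma_of(order_forms(forms)))  # lemma taken from the positive
```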
@Lucas Werkmeister: I apologize for responding after a year (too bad!). I believe I got distracted at some point. I want us to move forward with this. To answer your question, yes, you are right. This has been fixed. Thanks for pointing that out. I look forward to hearing from you. T Cells (talk) 18:47, 18 April 2023 (UTC)[reply]
@T Cells: Okay, but now the placeholder for the superlative form is again just [jùlọ]. I thought we agreed above that this wasn’t right. (I’m also not sure the [ju tì ẹ] for the comparative is right, but we didn’t discuss that as much.) Lucas Werkmeister (talk) 13:18, 22 April 2023 (UTC)[reply]
@Lucas Werkmeister: Yes, Lucas. We cannot leave the main part of the adjective out of the placeholder. I've fixed it now. I believe the comparative is also fine as it is. What do you think? T Cells (talk) 14:48, 22 April 2023 (UTC)[reply]
@T Cells: Alright, I’ve deployed the adjectives template now. Lucas Werkmeister (talk) 15:28, 23 April 2023 (UTC)
@Lucas Werkmeister: Thank you for your help. T Cells (talk) 00:25, 24 April 2023 (UTC)