User talk:So9q


Previous discussion was archived at User talk:So9q/Archive 1 on 2019-11-18.

AndrewTavis (talkcontribs)

Hi there! Just wanted to send along thanks for a query you wrote :) On Wikidata:SPARQL query service/queries/examples/ar I found a query from you that was for "German picture dictionary for young children". I'm working on an issue for Scribe Language Keyboards where we need to remove profane words from autosuggestions. Your query to filter them out will be the perfect base for the query to get them into an eventual Python workflow where we'll not allow a word to be an autosuggest option if it's in the result of the profanity query.


Thanks for creating that query! :)
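
A minimal sketch of the filtering step described above, assuming the result of the profanity query has already been loaded into a list of strings. The function name `filter_autosuggestions` and the sample words are hypothetical placeholders and are not taken from Scribe-Data.

```python
def filter_autosuggestions(candidates, profane_words):
    """Return the candidates that are not in the profanity list (case-insensitive)."""
    # casefold() gives a case-insensitive comparison that also handles German ß.
    blocked = {word.casefold() for word in profane_words}
    return [word for word in candidates if word.casefold() not in blocked]


# Hypothetical data: in practice profane_words would come from the WDQS query result.
profane_words = ["Badword", "worseword"]
candidates = ["Haus", "badword", "Baum", "Worseword"]
suggestions = filter_autosuggestions(candidates, profane_words)
```

Building the blocked set once keeps each lookup constant-time, which matters if the candidate list is derived from a full Wikipedia dump.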

So9q (talkcontribs)

Nice to hear the graph and query are of use to you 😀. I absolutely love this large knowledge base we are building together. ❤️

When the lexemes mature you will probably be able to use those as well in applications like this.

Would you like to link to your code/product so I can take a look?

AndrewTavis (talkcontribs)

Scribe is already Wikidata lexeme based for all its grammar needs 😊 I made the decision to open source the full project and base it off Wikidata, with the autosuggestions we're working on coming from Wikipedia dumps so we can derive which words often follow other words :) Obviously there are some major holes in the data now, but another overarching goal is to promote Wikidata so that more people get involved and the data grows.


Some useful links:

- Scribe on the App Store (as of now just for iOS, and give us two weeks and the autocomplete and autosuggest should be in there 😊)

- Scribe-iOS on GitHub

- Scribe-Data where we house the WDQS queries and Python scripts to run/format them all


Would be happy to get your feedback, and let me know if you have any questions!

AndrewTavis (talkcontribs)
Reply to "Thanks for the query :)"
So9q (talkcontribs)
Trilotat (talkcontribs)
So9q (talkcontribs)

Thanks for the report. I'll revert and add to the blocklist.

So9q (talkcontribs)
So9q (talkcontribs)
Misc (talkcontribs)
So9q (talkcontribs)

Thanks for the hint. I reverted my edit. I don't know how to make the logic correct either 😅 The intention was to have a warning on P31-Q96482904 which says defendant/plaintiff are needed. Do you know how to make a constraint for that?

Misc (talkcontribs)

I think defendant/plaintiff are needed for all law cases, so just saying that if you have one, you need the others would do the trick?

So9q (talkcontribs)

I was thinking that we should warn the users also if both defendant and plaintiff are missing. IDK how to achieve that.

Reply to "Constraint on {{P|P1591}}"
Richard Nevell (talkcontribs)
So9q (talkcontribs)

Reverted the job. IDK how that happened. I’m matching diseases and it must be an error in the graph that made it appear as a disease.

Richard Nevell (talkcontribs)

I think it probably was, I've seen a couple of unusual matches from the graph - I'm not sure how they make it in there in the first place. Thanks for sorting it out 🙂

Reply to "{{Q|94041556}} as {{P|921}}"
Jc3s5h (talkcontribs)

I have reverted this edit. I surmise it was made with some sort of script which looks for key words and infers that one item is the main topic of another.

For the journal article "Why the Greenwich Meridian moved" the main topic is not petrology. The meridian moved because the direction of gravity at the Royal Observatory, Greenwich, is not precisely toward the center of the Earth. The distribution of rocks in the Earth's crust is one of several factors that affect the direction of gravity, but rocks are certainly not the main topic of the article.

Is there a way to prevent the script from making the same error over and over?

So9q (talkcontribs)

Hi, I reverted your edit and suggested you deprecate the statement instead. How does that sound?

The script can be changed/fixed, but in this case I don't see that it failed to work as intended. It simply matches the subject from Crossref to one of our items and adds it as main subject. I don't know if Crossref has a way to report errors in their database.

Jc3s5h (talkcontribs)

After reviewing your contribution history, it appears you are either running a bot, or engaging in bot-like behavior. I have re-reverted your edit. Please link to your bot approval and explain why you are not doing these edits from a bot account.

So9q (talkcontribs)

The edits are semi-automated. That means if I manually approve, the edit is made.

Jc3s5h (talkcontribs)

As for the merit of your edit, what is Crossref? What does an item in Crossref look like? What are Crossref's criteria for creating an item?

According to its short description, main subject (P921) is "primary topic of a work". This is in line with the property proposal that led to the creation of the property.

If the criterion in Crossref is to name the main subject of a published work, adding the property and deprecating it might be appropriate, if Crossref is important enough to take notice of. If the criterion in Crossref is something else, then you shouldn't be doing this task.

So9q (talkcontribs)

These are very valid questions. You can read more about Crossref here: https://www.crossref.org/

I have not asked them about the subjects yet. Do you want to send them an email? If you find an edit that is wrong feel free to deprecate it and provide a reason.

See my new user script here for easy jumping to the source: User:So9q/crossref-link.js

So9q (talkcontribs)
So9q (talkcontribs)

I removed it now, feel free to undo any edits that seem bogus or wrong.

Jc3s5h (talkcontribs)

The plain link to https://www.crossref.org does not make it obvious where to find answers to my questions. It would be helpful if you would email them.

I am not on Telegram, and am not sure what Telegram is.

So9q (talkcontribs)

Hi again. See https://meta.wikimedia.org/wiki/Telegram for more information about Telegram and the different Wikimedia-related chats. There is a LOT going on in those groups every day, and I find you often get answers to questions very fast, along with a feeling of being part of the community (there are 23k active editors and about 700 of them are in Telegram). The WMDE Wikidata community office hour is also held in Telegram, and I highly recommend attending.

Regarding Crossref I recently made a userscript to easily jump from WD->Crossref API. See User:So9q#User scripts

Reply to "Reverted script-assisted edit"

More mistaken subjects (candidiasis Q273510)

Summary by So9q

fixed

Rdmpage (talkcontribs)

Argh! Another example of mistakes caused by over-reliance on simple string matching when tagging articles as being about something. The title for Pleistocene diversification and speciation of White-throated Thrush (Turdus assimilis; Aves: Turdidae) (Q110451187) includes the word "thrush", but this refers to the bird thrush (Q26050), not the fungal infection candidiasis (Q273510). It looks like any paper on thrushes (the birds) will be tagged incorrectly. It would be great if your tools were clever enough to avoid doing this sort of thing. For example, any term which has multiple meanings (e.g., "thrush") should be avoided. A simple query to Wikidata will reveal if a term has multiple meanings. Can you undo all instances of tagging with candidiasis (Q273510)?
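
The ambiguity check suggested above could be sketched roughly as follows. This assumes the labels and aliases have already been fetched from Wikidata into an in-memory index; the index contents, and the names `ambiguous_terms` and `safe_to_match`, are hypothetical and are not part of any existing tool.

```python
def ambiguous_terms(label_index):
    """Return every term that maps to more than one item."""
    return {term for term, qids in label_index.items() if len(qids) > 1}


def safe_to_match(term, label_index):
    """A term is safe for title matching only if it maps to exactly one item."""
    return len(label_index.get(term.casefold(), set())) == 1


# Hypothetical index with lowercased terms; the QIDs are the ones named in this thread.
label_index = {
    "thrush": {"Q26050", "Q273510"},  # assumed: the bird, and an alias of candidiasis
    "candidiasis": {"Q273510"},
}
```

With such an index, "thrush" would be skipped during matching, while "candidiasis" would still be tagged.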

So9q (talkcontribs)

Thanks for reporting. We talked about this in the WikiCite group. The batch in question is being reverted and a fix of the tool proposed by ArthurSmith is being implemented.

So9q (talkcontribs)
1234qwer1234qwer4 (talkcontribs)

Hello, please be careful.

So9q (talkcontribs)

I will! Thanks for fixing it.

Mistakenly making hydrochlorothiazide (Q423930) a subject

Summary by So9q

I added blocklist functionality to the tool and blocked alias matching for this QID https://github.com/dpriskorn/ItemSubjector/releases/tag/0.3-alpha2
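
As an illustration only, blocking alias matching for a single QID might look like the sketch below. It is not the ItemSubjector implementation; the function `match_subjects`, the data layout, and the assumption that "aquarius" is an alias of Q423930 (inferred from the report in this thread) are all hypothetical.

```python
import re


def match_subjects(title, items, alias_blocklist):
    """Return QIDs whose label, or alias unless blocklisted, occurs as a whole word in title."""
    matched = []
    for qid, names in items.items():
        terms = [names["label"]]
        # Aliases are only considered for items that are not on the blocklist.
        if qid not in alias_blocklist:
            terms.extend(names["aliases"])
        for term in terms:
            if re.search(rf"\b{re.escape(term)}\b", title, flags=re.IGNORECASE):
                matched.append(qid)
                break
    return matched


# Hypothetical item data; the alias is assumed from the discussion in this thread.
items = {"Q423930": {"label": "hydrochlorothiazide", "aliases": ["aquarius"]}}
title = "Aquarius philippinensis sp.n., a large endemic water strider"

without_blocklist = match_subjects(title, items, set())
with_blocklist = match_subjects(title, items, {"Q423930"})
```

The label still matches when the QID is blocklisted, so a paper that really mentions hydrochlorothiazide in its title would continue to be tagged.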

Rdmpage (talkcontribs)

Hi, I've noticed that you've made Q423930 the main subject of papers that include "aquarius" in the title, e.g. Q100953456 "Aquarius philippinensis sp.n., a large endemic water strider (Insecta: Heteroptera: Gerridae) from ancient crater lakes in South Luzon, Philippines" (there are others). In these papers Aquarius is Q2859194, a genus of insects. Is it possible to revert these edits? I guess this is the limitation of using simple text to determine subject, especially for items that have lots of synonyms that also match other items.

So9q (talkcontribs)
Rdmpage (talkcontribs)

I guess it's an occupational hazard of trying to determine what a paper is about based on its title. Taxonomic names can be a nightmare given all the possible name clashes. I wonder if what we need is something sophisticated enough to "know" whether a paper is likely to be on a species or a chemical compound, e.g. Q41799598.

So9q (talkcontribs)

Are you in the WikiCite Telegram group? There we talk about how to better categorize all this knowledge. This is a blunt tool; using AI to read the abstracts would be a huge improvement. @houcemeddine: works on that if I am not mistaken, but few abstracts are available as open data at this time, I'm afraid.

Rdmpage (talkcontribs)

Yes, I am in that group. Abstracts are one way to determine what a manuscript is about, but I wonder whether we can get useful information from the surrounding network of connections? Knowing something about the journals and the authors may often tell us whether it's likely that a string refers to a chemical compound or a taxon.

So9q (talkcontribs)