User talk:Magnus Manske



Previous discussion was archived at User talk:Magnus Manske/Archive 9 on 2015-08-10.

92.227.83.27 (talkcontribs)

1) "Unmatched -46 -1%"

2) Privacy issue: please change "/wd" to "/-" in the URL formatter.

Reply to "Property:P4931 Todotango.com person ID"
Gerwoman (talkcontribs)

seems to be stuck

Magnus Manske (talkcontribs)

Back now, I think

Succu (talkcontribs)
Reply to "Q272582"
Succu (talkcontribs)

Which ones, by whom, when, and so on (and more)?

Magnus Manske (talkcontribs)

I don't know either...

Succu (talkcontribs)

But you created the item. Five of the IDs do not match the stated species name. Regards

Reply to "Q55651080"
GZWDer (talkcontribs)

Please change the language to Chinese.

Magnus Manske (talkcontribs)

Done.

Thierry Caro (talkcontribs)
Magnus Manske (talkcontribs)

All taken care of.

QuickStatements strings in CSV don't work

Jc86035 (talkcontribs)

If you have time to fix this: a string value in a CSV command block shows up in the GUI as JSON like {type: unknown, value: "url"}, and the command doesn't work. I converted my commands to the V1 format with a regex, and they then worked fine.

Matěj Suchánek (talkcontribs)
Jc86035 (talkcontribs)

There are two posts on one of the QuickStatements talk pages from 2017 or so complaining about the same problem, but neither received a response.

Mbch331 (talkcontribs)
Magnus Manske (talkcontribs)

Please be aware that I didn't write the CSV importer, it came as a pull request. I'll look at it when I have time, but it's a bit low on my priority list, not least of all because I actually need to understand the code first ;-)

Lucas Werkmeister (talkcontribs)

So the CSV import uses the same format for values as the V1 import: strings need to be quoted with double quotes. However, this format is interpreted only after the line has been parsed as CSV (using PHP’s concept of CSV, which is totally different from how the rest of the world interprets CSV, because lol PHP), and the CSV parsing process also interprets double quotes.

Long story short: triple double quotes work, somehow. This edit was performed with the following code:

qid,P2536
Q4115189,"""12345"""

Please don’t ask me what to do if the value contains embedded quotes, I don’t know how they’re escaped by PHP’s CSV parser nor by Magnus’ QuickStatements V1 parser :)
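The double layer of quoting described above can be sketched as follows. This is only an illustration using Python's standard `csv` module, which follows the usual doubled-quote convention; it is not PHP's CSV parser, whose behaviour may differ, and the helper names (`parse_csv_line`, `to_csv_line`, `quickstatements_string`) are hypothetical. The sketch deliberately refuses values with embedded quotes, since their handling is unknown.

```python
import csv
import io


def parse_csv_line(line):
    """Parse one CSV line the way Python's csv module does (not PHP)."""
    return next(csv.reader([line]))


def to_csv_line(fields):
    """Serialise fields into one CSV line, quoting only where needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


def quickstatements_string(value):
    """Hypothetical helper: wrap a plain string in the double quotes
    that the V1 value format expects. Embedded quotes are rejected
    because their escaping rules are not known here."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return f'"{value}"'


# A single pair of quotes is consumed by the CSV layer, so the value
# that reaches the V1 layer is bare and is not recognised as a string.
print(parse_csv_line('Q4115189,"12345"'))

# Triple quotes survive CSV parsing as one pair of literal quotes.
print(parse_csv_line('Q4115189,"""12345"""'))

# Going the other way: quote for V1 first, then let the CSV writer escape.
print(to_csv_line(["Q4115189", quickstatements_string("12345")]))
```

Under these assumptions, generating the CSV with a CSV writer (rather than by hand) produces the triple-quoted form automatically.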

Next question, I guess: should we try to fix this (somehow), or merely document it better? (I notice that the new QuickStatements V2 UI has a lot more space for the format documentation than was available in the old dialog, yay!)

Mbch331 (talkcontribs)

For the short term, I would say document it; for the long term, fix it.

Reply to "QuickStatements strings in CSV don't work"
Jneubert (talkcontribs)
Magnus Manske (talkcontribs)

Do you mean the auto-description? It says "Daily newspaper, from 1844, until 1934" for me. That seems right...

Jneubert (talkcontribs)

YES. I could swear I saw it differently before I confirmed, but probably I was just too tired. Sorry! Joachim

Multichill (talkcontribs)
Multichill (talkcontribs)

Bump

Magnus Manske (talkcontribs)

Done.

Melderick (talkcontribs)

Hi Magnus, I am working on catalog #108 and I see there are a lot of duplicates.
For example, search for Alcath*: you'll see duplicates of the [1], [2], ... entries, as well as duplicates caused by diacritics. Another example, Arsi*, shows that Arsinoë [1] combines both problems and appears 3 times!
Do you think you could merge those duplicates?
Thank you

Magnus Manske (talkcontribs)

I did the search and saw the similar entries, but they all link to their own distinct page. So they are not duplicates.

Likewise, for Arsi*, the three top hits are not duplicates either, despite the identical name.

Could you

  • point me to two entries (click the # to get to the individual entry from search) that are actual duplicates
  • tell me how to detect them without me having to go through the entire catalog manually
Magnus Manske (talkcontribs)

Ah, now I see the ones with [1] down on the list.

Magnus Manske (talkcontribs)

OK, what happened is that Hederich IDs with spaces were also added with "+" instead. I have set the "+" ones to N/A.

Melderick (talkcontribs)

Yes, for Arsinoë, the 3 expected entries (Arsinoë, Arsinoë [1] and Arsinoë [2]) are at the top of the list.
Then, in the middle of the list, these 3 entries appear again (obviously because of the diacritic ë).
Finally, at the end of the list, Arsinoë [1] and Arsinoë [2] are back again (as you said, with the space replaced by a + in the link).

Melderick (talkcontribs)

Entries 6889318 and 29007192 are duplicates.
The second one has a catalog ID of Arsino%C3%AB, while the first one has Arsinoë.
So I guess any entry with a + or a % is suspicious. Ideally, you should convert each + into a space and each %XX sequence into the corresponding character (UTF-8, I guess), and check whether the resulting string already exists.
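A minimal sketch of that check, assuming catalog IDs are available as (entry number, catalog ID) pairs. The function names are hypothetical, the entry numbers 1, 2 and 3 are made up for illustration, and the Unicode NFC normalisation step is an extra assumption on my side to cover composed versus decomposed diacritics:

```python
import unicodedata
from collections import defaultdict
from urllib.parse import unquote_plus


def canonical_id(catalog_id):
    """Decode '+' to space and %XX sequences (as UTF-8), then apply
    NFC normalisation so equivalent spellings of a diacritic compare equal."""
    return unicodedata.normalize("NFC", unquote_plus(catalog_id))


def find_duplicates(entries):
    """Group entry numbers by canonical catalog ID and keep only the
    groups with more than one member."""
    groups = defaultdict(list)
    for entry_number, catalog_id in entries:
        groups[canonical_id(catalog_id)].append(entry_number)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


entries = [
    (6889318, "Arsino\u00eb"),    # plain form
    (29007192, "Arsino%C3%AB"),   # percent-encoded form
    (1, "Arsino\u00eb [1]"),      # hypothetical entry number
    (2, "Arsino\u00eb+[1]"),      # hypothetical entry number, '+' for space
    (3, "Alcathous"),             # hypothetical entry with no duplicate
]

for key, ids in find_duplicates(entries).items():
    print(key, ids)
```

One caveat: a catalog whose IDs legitimately contain a literal + or % would be mangled by this decoding, so the output is best treated as a list of candidates to review rather than as confirmed duplicates.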

Also, when an entry has duplicates, the description is set only on the duplicate entries, not on the main one. Again compare the description on entries 6889318 and 29007192.

Reply to "duplicate entries in catalog #108"