User talk:Magnus Manske

Previous discussion was archived at User talk:Magnus Manske/Archive 9 on 2015-08-10.

PKM (talkcontribs)
Magnus Manske (talkcontribs)
PKM (talkcontribs)

Thanks for the quick response!

Reply to "Can someone create a catalog?"

Offering help to admin 5958 catalog

8
Olea (talkcontribs)

Hi:

I don't have experience administering MnM catalogs, but I would like to help with 5958, if possible. Thanks.

Magnus Manske (talkcontribs)

Hi, I have made you a catalog admin (for all catalogs; this is how it works right now). What are you planning to do with the admin powers?

Olea (talkcontribs)

Thanks!

Just to learn how to fine-tune this catalog, and maybe another small one I'm also interested in.

Again, thanks :)

Magnus Manske (talkcontribs)

There is not much of an interface to change the actual catalog, I'm afraid; if you need to do something "in bulk", best to let me know so I can do it in the database.

Olea (talkcontribs)

Oh, I thought it would be possible to add some statements, particularly P17=Q29, because all elements are located in Spain.

Magnus Manske (talkcontribs)

Done.

Olea (talkcontribs)

BTW, is it possible to restrict matching to elements with exactly the same P131? This is the general case, and we could avoid hundreds of false positives caused by homonyms.

Magnus Manske (talkcontribs)

Not really, it's on my to-do list...
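To illustrate the request above, here is a minimal sketch of what such a restriction could look like. It is not how Mix'n'match works today; the function name, the dictionary layout and the item IDs are made up for the example.

```python
def filter_by_location(entry_p131, candidates):
    """Keep only candidate items whose P131 values include the entry's P131.

    `candidates` is a hypothetical list of dicts such as
    {"qid": "Q111", "P131": ["Q900"]}; this layout is an assumption,
    not the actual Mix'n'match data model.
    """
    return [c for c in candidates if entry_p131 in c.get("P131", [])]


# Two homonymous candidates in different administrative units (made-up IDs).
candidates = [
    {"qid": "Q111", "P131": ["Q900"]},
    {"qid": "Q222", "P131": ["Q901"]},
    {"qid": "Q333"},  # no P131 at all, so it is dropped
]
kept = filter_by_location("Q900", candidates)
```

Candidates without any P131 are dropped here; whether they should be kept for manual review instead is a design choice.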


Dynamic URL formatter for MnM catalogs

3
Bargioni (talkcontribs)

Hi, Magnus, when we were in Athens (and El Pireo too :-) ), the National Library of Greece released its Koha authority records catalog. We immediately updated the related property. But of course the corresponding MnM catalog still uses the old one.

What about using the value of P1630 (the preferred one, or the first one) for the external links of MnM entries? Thx a lot.

Magnus Manske (talkcontribs)

As to the case at hand, just to be on the same page, you mean Academy of Athens authority ID (P10141) and catalog 5139 respectively? Looks like the catalog is already using the preferred format, did I miss something?

As for using formatter URL (P1630), the idea behind Mix'n'match is that it can be used without a property, and that URLs are fixed with the entries. I can imagine a function to update the URLs for a catalog on demand, based on formatter URL (P1630) and the external IDs, though. Would that be sufficient?
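A rough sketch of what such an on-demand update might do, assuming the usual formatter URL convention where `$1` stands for the external ID. The field names `ext_id` and `ext_url` and the example URLs are assumptions for illustration, not the actual Mix'n'match schema.

```python
def build_entry_url(formatter_url, external_id):
    """Substitute an external ID into a formatter URL containing "$1"."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    return formatter_url.replace("$1", external_id)


def refresh_catalog_urls(entries, formatter_url):
    """Return copies of the entries with URLs rebuilt from the formatter URL.

    `entries` is a hypothetical list of dicts with "ext_id" and "ext_url" keys.
    """
    return [
        {**entry, "ext_url": build_entry_url(formatter_url, entry["ext_id"])}
        for entry in entries
    ]


# Hypothetical entry still pointing at an outdated URL.
entries = [{"ext_id": "123", "ext_url": "https://old.example.org/123"}]
updated = refresh_catalog_urls(entries, "https://example.org/authority/$1")
```

The sketch inserts the ID unchanged; whether IDs need URL-encoding would depend on the catalog.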

Epìdosis (talkcontribs)

Mix'n'match - a problem with Greek

2
Epìdosis (talkcontribs)
Epìdosis (talkcontribs)

Mix'n'match improvements - 2020 ideas

2
Epìdosis (talkcontribs)

Hi Magnus! In 2020, also due to the lockdowns, I've spent a lot of time on Wikidata, and on Mix'n'match in particular (also with the great collaboration of @Bargioni:), which has proven an extraordinary tool for synchronising different catalogs with Wikidata. I've had many occasions to disturb you (fortunately a lot fewer after you gave me the opportunity to perform many actions myself through "catalog editor") and I've always very much appreciated your kind answers and solutions.

This evening I've tried to explore (and close, if possible) some of the old threads open on this page, in order to keep track of which problems still need to be solved in the coming months, and I've found many interesting issues, more or less important/urgent, related to the functioning of Mix'n'match, which has already seen a lot of great improvements in recent years. I've tried to collect them in a single thread, so that we can have a better overview of the suggestions already made by different users.

Ideas collected from 2020 threads (I have surely missed something):

  1. it often happens that many IDs get automatched to one (generic) ID [from Topic:Vbnw3kqpnuj7lu5w] > there should be a way to un-automatch all IDs matched to a wrong item ✓ Done
  2. it often happens that all multiple automatches for an entry are wrong [from Topic:Vfxcovdfimsoezgo] > there should be a way to remove all multiple automatches from an entry, as there is a way for removing single automatches ✓ Done
  3. some parameters of the scrape may become outdated [from Topic:Vhwlvtqzghmg31og] > there should be a way to visualize the parameters of a catalog's scrape and change them
  4. references added when an item is created from MnM are plain URLs, all added in the same reference to the statement [from Topic:Vit29jh9tlbjpzt3] > different catalogs should constitute different references; references should not be plain URLs, but references with stated in (P248) X and Property ID ID ✓ Done
  5. at the moment years of birth/death (date of birth (P569)/date of death (P570)) and any other data (e.g. VIAF ID (P214) or ISNI (P213)) are extracted at a later stage from auxdata [from Topic:Vjgor5rtcz2kgqmh] > when a catalog is imported through the import tool, it would be very good to have a dedicated column to insert at least years of birth/death up front, and if possible also VIAF or ISNI
  6. for very big catalogs, the sync page doesn't load [from Topic:Vqzxlmdonip2vscc and Topic:Vu2hluqaescv78qy] > make it possible in some way to load the sync page for very big catalogs
  7. some catalogs, while having some merits, should not be used for sourcing key information such as birth/death dates [from Topic:Vqzxlmdonip2vscc] > there should be a way to indicate for single catalogs that they should not be used by bots to add references to statements
  8. some catalogs' entries have auxdata which are useful for the matching process but which should not be imported into new Wikidata items if they are created [from Topic:Vl8reqahkhkl4jq8] > there should be a way to indicate that the auxdata of a catalog should not be added to new Wikidata items if they are created
  9. there is no easy way to merge catalogs [from Topic:Vqzxlmdonip2vscc and Topic:Vucpec2813ab9h5f] > there should possibly be an easy way to merge catalogs
  10. Mix'n'match (and QuickStatements) sessions seem to expire quickly [from Topic:Vnojx2ve72jzm5wl] > check if it is possible to make these sessions longer, in order to avoid multiple logins
  11. the function "names in other catalogs" and "creation candidates" can be made more visible [from Topic:Vx91gnzs6zdvyf0v] ✓ Done
  12. detailed remarks from Tpalonen
  13. the Mix'n'match gadget adds IDs not considering if they are already present in the item, which sometimes causes duplication [from talk page; it has also been reported that it still links to "tools.wmflabs.org/mix-n-match" instead of "mix-n-match.toolforge.org"] > Mix'n'match gadget should avoid adding IDs which are already present in the item

Other ideas from me and Bargioni (I will add others in the next days, if we find new ones):

  1. when a catalog has a default type (e.g. human (Q5)), all automatchers should consider only items that are instance of (P31) the default type (this would avoid a lot of wrong automatches) ✓ Done
  2. when entries get matched to an item which is afterwards redirected, at the moment the catalog and Wikidata get out of sync, while there should be a button that allows adjusting all such matches by substituting the redirected item with the new item
  3. it would be convenient, in the pages of a catalog (manually matched entries, automatched entries, unmatched entries), to have the possibility to show: only entries with auxdata; only entries without auxdata; all entries
  4. it would be convenient, when searching Mix'n'match, to have the possibility to show one or more of the following categories of entries: manually matched entries; automatched entries; multiply-automatched entries; unmatched entries
  5. the automatic description of the items automatched to entries often doesn't load, being substituted by "Could not load description for X"
  6. the internal search sometimes is very slow (also when searching only in a single catalog), although nearly always finally succeeding in showing results
  7. the internal search at the moment searches only in the names of the entries; it would be useful having a second search-box searching only in the auxiliary data of the entries, in order to exploit both at the same time
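
As an illustration of idea 2 in the list above (matches pointing at redirected items), the adjustment could be sketched roughly as follows. The data layout is hypothetical: `matches` maps entry IDs to item IDs and `redirects` maps each redirected item to its target; neither reflects the real Mix'n'match database.

```python
def resolve_redirects(matches, redirects):
    """Replace redirected items in `matches` with their final targets.

    Follows chains of redirects and stops if it meets an item it has
    already visited, so a circular redirect cannot loop forever.
    """
    resolved = {}
    for entry_id, qid in matches.items():
        seen = set()
        while qid in redirects and qid not in seen:
            seen.add(qid)
            qid = redirects[qid]
        resolved[entry_id] = qid
    return resolved


# Made-up IDs: Q1 was redirected to Q2, which was later redirected to Q3.
matches = {"entry-1": "Q1", "entry-2": "Q5"}
redirects = {"Q1": "Q2", "Q2": "Q3"}
fixed = resolve_redirects(matches, redirects)
```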

I thank you again for the great work you do in maintaining Mix'n'match, QuickStatements and all the tools which make our work on Wikidata much easier and enjoyable and I'm looking forward to the next months, with a lot of new MnM catalogs to crossmatch. Good night!

Matlin (talkcontribs)

+ Creation candidates: a possibility to skip a candidate with a button at the top of the proposed items. Sometimes there are candidates with a lot of items (for example: David Jones) which I don't want to check. It would be less laggy and more comfortable to skip a candidate without scrolling down the entire page. ✓ Done


Wrong formatting of references

2
Vojtěch Dostál (talkcontribs)

Hello Magnus, thanks for the work you are doing, but I noticed that references in your newly-created items are merged into one reference. Such as here: Q84557233. Also, the sources are again not added as identifier properties but only as plain URLs...

Epìdosis (talkcontribs)

I agree, references added from MnM should be improved.

Longevity of QuickStatements/Mix'n'Match sessions

3
Summary by Epìdosis

seemingly solved, now sessions seem to last many days

GZWDer (talkcontribs)

I found that current QuickStatements/Mix'n'Match sessions expire in a few days, while in the previous version they could last for up to two weeks. This is annoying as I use an alternative account for all QuickStatements edits.

Epìdosis (talkcontribs)

Agree, same problem.

Matlin (talkcontribs)

For me they expire every time I close the browser.

Google-search broken on Mix'n'match

1
Epìdosis (talkcontribs)

The "Google-search" links for unmatched entries on Mix'n'match are broken. Could you fix them? Thanks!

Solidest (talkcontribs)

Hi. The import from file/link in MnM has not been working for about a week now. After pressing the Test button, it gets stuck with the text "Test is running..." and then nothing more appears. I've tried both file and sheet, but the result is the same. Meanwhile, the new catalogues on the start page were all created via the scraper.

Epìdosis (talkcontribs)

Hi! Sorry to disturb you, I have experienced the same problem too (also reported here by @Vojtěch Dostál: and @Gerwoman:). I have six catalogs ready for uploading and some would be useful for projects with Italian universities. Thank you in advance!

Magnus Manske (talkcontribs)

Should be fixed now.

Vojtěch Dostál (talkcontribs)

Hello, I think the problem is back. It looks exactly the same as before. I was able to do an import a few hours ago but can't do one now.

Knr5 (talkcontribs)

This is still a problem for me, even though I see that many other people have imported catalogs recently.

Vojtěch Dostál (talkcontribs)

I solved my problems after Epidosis pointed out that some rows in my dataset lacked the mandatory fields. After cleaning those rows, the import worked (yesterday).

Knr5 (talkcontribs)

That seems to have worked; it apparently just hangs on an empty value. Thanks!
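
For anyone hitting the same hang, a quick pre-check of the file before importing might look like the sketch below. The thread does not say which fields are mandatory, so the column names `id` and `name` are assumptions; adjust them to the import tool's actual requirements.

```python
import csv
import io


def find_incomplete_rows(csv_text, required=("id", "name")):
    """Return (row_number, missing_columns) for rows lacking required values.

    Row numbers count the header as row 1. The default `required` columns
    are an assumption, not the import tool's documented list.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    incomplete = []
    for row_number, row in enumerate(reader, start=2):
        missing = [col for col in required if not (row.get(col) or "").strip()]
        if missing:
            incomplete.append((row_number, missing))
    return incomplete


sample = "id,name,desc\n1,Alice,x\n2,,y\n,Bob,z\n"
problems = find_incomplete_rows(sample)
```

Rows reported by the check can then be fixed or removed before the file is uploaded.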

Mix'n'match not loading - Debian Stretch?

8
Epìdosis (talkcontribs)
Epìdosis (talkcontribs)

Something more important: Mix'n'match has often been unreachable in the last two or three days. Could you have a look? Thanks!

Epìdosis (talkcontribs)

The problems are ongoing: significant slowness in loading and frequent periods of some minutes in which Mix'n'match is not reachable at all.

Epìdosis (talkcontribs)

I'll try to give a more precise description of the problem:

  1. Mix'n'match seems to register matches and unmatches made by users only at intervals of 20 seconds (see https://mix-n-match.toolforge.org/?#/rc; in the past, it was also able to register them at much shorter intervals); if a user tries to make several matches and unmatches in quick succession, Mix'n'match queues them and goes into "loading" (becoming unreachable) until it has emptied the queue
  2. the function "creation_candidates/by_ext_name", which was very useful, no longer seems to work, probably as a result of the first problem
  3. @Reinheitsgebot: has stopped creating new reports and updating existing ones since December 15th; after that date it only updates the index page User:Magnus Manske/Mix'n'match report

Although my technical understanding is close to zero, might it be related to https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2021_Purge#In_use_mix-n-match? Another cause may be a problem with the "microsync" function: I see many catalogs microsyncing, and these operations may interfere with matches and unmatches made by users, slowing them down to one every 20 seconds.

Epìdosis (talkcontribs)

I would first like to thank you for your intervention on BitBucket this morning! However, at about 14 UTC Mix'n'match started an indefinite "loading..."; is this normal while MnM is absorbing today's changes, or is there something to be fixed? Thank you very much!

Epìdosis (talkcontribs)

It came back online today at 10 UTC, but the problems above (mainly the first: it registers matches and unmatches made by users only at intervals of 20 seconds) are still present.

Epìdosis (talkcontribs)
Epìdosis (talkcontribs)

Now problem 1, the worst one, has disappeared. Thank you very much!