Can someone create a MnM catalog for P10690? The General Multilingual Environmental Thesaurus (GEMET) has labels in many languages and lists of exact matches to other datasets; it would be a great tool for enriching Wikidata. Download links: https://www.eionet.europa.eu/gemet/en/exports/rdf/latest
There is not much of an interface to change the actual catalog, I'm afraid; if you need to do something "in bulk", it's best to let me know so I can do it in the database.
BTW, is it possible to restrict matching to elements with exactly the same P131? This is the general case, and we could avoid hundreds of false positives caused by homonyms.
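To make the P131 request concrete, here is a minimal sketch of the kind of filter I mean. It is not how Mix'n'match works internally; the function name, the data layout and the placeholder IDs are all assumptions for illustration.

```python
def filter_by_p131(entry_p131, candidates):
    """Keep only candidate items whose P131 values include the entry's P131.

    `candidates` maps a candidate item ID to the set of its P131 values
    (hypothetical layout, not the Mix'n'match data model).
    """
    return [qid for qid, p131_values in candidates.items()
            if entry_p131 in p131_values]


# Two homonymous candidates located in different administrative entities
# (all IDs are placeholders).
candidates = {
    "Q_CANDIDATE_A": {"Q_REGION_1"},
    "Q_CANDIDATE_B": {"Q_REGION_2"},
}
print(filter_by_p131("Q_REGION_1", candidates))
```

Only the candidate sharing the entry's P131 survives, so the homonym in the other region would never be proposed as an automatch.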
Hi, Magnus, when we were in Athens (and El Pireo too:-) ), the National Library of Greece released its Koha authority records catalog. We immediately updated the related property, but of course the corresponding MnM catalog still uses the old URL format.
What about using the value of P1630 (the preferred one, or the first one) for the external links of MnM entries? Thx a lot.
As to the case at hand, just to be on the same page, you mean Academy of Athens authority ID (P10141) and catalog 5139 respectively? It looks like the catalog is already using the preferred format; did I miss something?
As for using formatter URL (P1630), the idea behind Mix'n'match is that it can be used without a property, and that URLs are fixed with the entries. I can imagine a function to update the URLs for a catalog based on formatter URL (P1630) and the external IDs on demand, though. Would that be sufficient?
A function to update the URLs for a catalog based on formatter URL (P1630) and the external IDs on demand would be perfect! BTW, the catalog intended by @Bargioni: is https://mix-n-match.toolforge.org/#/catalog/5478. Thanks!
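For reference, the on-demand update discussed above could look roughly like the sketch below. It relies only on the fact that a formatter URL (P1630) value contains `$1` as the placeholder for the external ID; the function names, the entry fields (`ext_id`, `ext_url`) and the example.org URL are hypothetical, not the actual Mix'n'match schema.

```python
def build_url(formatter_url: str, external_id: str) -> str:
    """Substitute the external ID for the `$1` placeholder of a formatter URL."""
    return formatter_url.replace("$1", external_id)


def refresh_catalog_urls(entries, formatter_url):
    """Return copies of the entries with their URL rebuilt from the formatter URL.

    Each entry is a dict with a hypothetical `ext_id` key; the rebuilt URL
    is stored under a hypothetical `ext_url` key.
    """
    return [{**entry, "ext_url": build_url(formatter_url, entry["ext_id"])}
            for entry in entries]


entries = [{"ext_id": "12345", "ext_url": "http://old.example.org/record?id=12345"}]
updated = refresh_catalog_urls(entries, "https://example.org/authority/$1")
print(updated[0]["ext_url"])
```

Since the originals are left untouched and new dicts are returned, the old URLs stay available for comparison before anything is written back.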
Hi! If a catalog is set with language "el", then when a new item is created from one of its entries (e.g. https://www.wikidata.org/w/index.php?title=Q118520564&oldid=1899854216), the "el" label should not be copied to other languages. This is because the "el" label is in Greek script, while the other languages use Latin script. Thanks in advance! P.S. Could you also check whether the two problems reported in Topic:Xd0s9rlcnq2i6qt8 have been solved?
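One possible way to express the rule, as a rough sketch only: copy the label to the other languages only when every letter in it is Latin-script. The helper names are made up, and the sketch assumes the "other languages" are all Latin-script ones; it says nothing about how Mix'n'match currently decides.

```python
import unicodedata


def is_latin_script(label: str) -> bool:
    """True if the label has letters and all of them are Latin-script."""
    letters = [ch for ch in label if ch.isalpha()]
    return bool(letters) and all(
        "LATIN" in unicodedata.name(ch, "") for ch in letters
    )


def labels_for_new_item(label, catalog_lang, other_langs):
    """Build the label dict for a new item.

    Assumption: `other_langs` contains only Latin-script languages, so the
    label is copied to them only if it is itself written in Latin script.
    """
    labels = {catalog_lang: label}
    if is_latin_script(label):
        for lang in other_langs:
            labels[lang] = label
    return labels


print(labels_for_new_item("Αθήνα", "el", ["en", "it"]))
print(labels_for_new_item("Jane Doe", "en", ["it", "fr"]))
```

With this check a Greek-script label stays in "el" only, while a Latin-script name is still copied as before.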
Hi Magnus!
During 2020, also due to the lockdowns, I've spent a lot of time on Wikidata, and on Mix'n'match in particular (also with the great collaboration of @Bargioni:), which has proven an extraordinary tool for synchronising different catalogs with Wikidata. I've had many occasions to disturb you (fortunately far fewer after you gave me the opportunity to perform many actions myself through the "catalog editor") and I've always very much appreciated your kind answers and solutions.
This evening I've tried to go through (and close, if possible) some of the old threads open on this page, in order to keep track of which problems are still to be solved in the coming months. I've found many interesting issues, more or less important or urgent, related to the functioning of Mix'n'match, which has already seen a lot of great improvements in recent years. I've tried to collect them in a single thread, so that we can have a better overview of the suggestions already made by different users.
Ideas collected from 2020 threads (I have surely missed something):
it often happens that many IDs get automatched to one (generic) ID [from Topic:Vbnw3kqpnuj7lu5w] > there should be a way to un-automatch all IDs matched to a wrong item Done
it often happens that all multiple automatches for an entry are wrong [from Topic:Vfxcovdfimsoezgo] > there should be a way to remove all multiple automatches from an entry, as there is a way for removing single automatches Done
some parameters of the scrape may become outdated [from Topic:Vhwlvtqzghmg31og] > there should be a way to visualize the parameters of a catalog's scrape and change them
references added when an item is created from MnM are plain URLs, all added in the same reference to the statement [from Topic:Vit29jh9tlbjpzt3] > different catalogs should constitute different references; references should not be plain URLs, but references with stated in (P248) X and Property ID ID Done
at the moment years of birth/death (date of birth (P569)/date of death (P570)) and any other data (e.g. VIAF ID (P214) or ISNI (P213)) are extracted at a later stage from auxdata [from Topic:Vjgor5rtcz2kgqmh] > when a catalog is imported through the import tool, it would be very good to have a dedicated column for entering up front at least the years of birth/death, and if possible also VIAF or ISNI
for very big catalogs, the sync page doesn't load [from Topic:Vqzxlmdonip2vscc and Topic:Vu2hluqaescv78qy] > make it possible in some way to load the sync page for very big catalogs
some catalogs, while having some merits, should not be used for sourcing key information such as birth/death dates [from Topic:Vqzxlmdonip2vscc] > there should be a way to indicate for single catalogs that they should not be used by bots to add references to statements
some catalogs' entries have auxdata which are useful for the matching process but which should not be imported into new Wikidata items if they are created [from Topic:Vl8reqahkhkl4jq8] > there should be a way to indicate that the auxdata of a catalog should not be added to new Wikidata items if they are created
Mix'n'match (and QuickStatements) sessions seem to expire quickly [from Topic:Vnojx2ve72jzm5wl] > check if it is possible to make these sessions longer, in order to avoid multiple logins
the function "names in other catalogs" and "creation candidates" can be made more visible [from Topic:Vx91gnzs6zdvyf0v] Done
the Mix'n'match gadget adds IDs not considering if they are already present in the item, which sometimes causes duplication [from talk page; it has also been reported that it still links to "tools.wmflabs.org/mix-n-match" instead of "mix-n-match.toolforge.org"] > Mix'n'match gadget should avoid adding IDs which are already present in the item
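As an illustration of the last point (the gadget adding IDs that are already present), the check could be as simple as the following sketch. The function name and the shape of `existing_claims` are assumptions, not the gadget's real code, and the ID values are placeholders.

```python
def ids_to_add(existing_claims, prop, new_ids):
    """Return only the IDs not yet present on the item for the given property.

    `existing_claims` maps a property ID to the list of values the item
    already has (hypothetical layout). Duplicates inside `new_ids` are
    also dropped, keeping the first occurrence.
    """
    present = set(existing_claims.get(prop, []))
    result = []
    for ext_id in new_ids:
        if ext_id not in present:
            present.add(ext_id)
            result.append(ext_id)
    return result


existing = {"P214": ["123"]}
print(ids_to_add(existing, "P214", ["123", "456", "456"]))
```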
Other ideas from me and Bargioni (I will add others in the next days, if we find new ones):
when a catalog has a default type (e.g. human (Q5)), all automatchers should consider only items being instance of (P31)default type (this would avoid a lot of wrong automatches) Done
when entries get matched to an item which is afterwards redirected, at the moment the catalog and Wikidata get out of sync, while there should be a button that allows adjusting all such matches by substituting the redirected item with the new item
it would be convenient, in the pages of a catalog (manually matched entries, automatched entries, unmatched entries), to have the possibility to show: only entries with auxdata; only entries without auxdata; all entries
it would be convenient, when searching Mix'n'match, to have the possibility to show one or more of the following categories of entries: manually matched entries; automatched entries; multiply-automatched entries; unmatched entries
the automatic description of the items automatched to entries often doesn't load, being substituted by "Could not load description for X"
the internal search sometimes is very slow (also when searching only in a single catalog), although nearly always finally succeeding in showing results
the internal search at the moment searches only in the names of the entries; it would be useful to have a second search box searching only in the auxiliary data of the entries, in order to exploit both at the same time
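To illustrate the redirect point in the list above: the adjustment could follow each redirect to its final target and rewrite the match, roughly as in this sketch. The names and the data layout are hypothetical and the item IDs are placeholders; in practice the redirect map would have to come from Wikidata.

```python
def resolve_redirect(qid, redirects):
    """Follow a chain of redirects to the final item, guarding against loops."""
    seen = set()
    while qid in redirects and qid not in seen:
        seen.add(qid)
        qid = redirects[qid]
    return qid


def fix_matches(matches, redirects):
    """Replace every matched item that is a redirect with its target.

    `matches` maps an entry ID to the matched item; `redirects` maps a
    redirected item to its target (both layouts are assumptions).
    """
    return {entry: resolve_redirect(qid, redirects)
            for entry, qid in matches.items()}


redirects = {"Q_OLD": "Q_MID", "Q_MID": "Q_NEW"}
matches = {"entry1": "Q_OLD", "entry2": "Q_OTHER"}
print(fix_matches(matches, redirects))
```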
I thank you again for the great work you do in maintaining Mix'n'match, QuickStatements and all the tools which make our work on Wikidata much easier and more enjoyable, and I'm looking forward to the next months, with a lot of new MnM catalogs to crossmatch. Good night!
+ Creation candidates: a possibility to skip a candidate with a button at the top of the proposed items. Sometimes there are candidates with lots of items (for example: David Jones) which I don't want to check. It would be less laggy and more comfortable to skip a candidate without scrolling down the entire page. Done
Hello Magnus, thanks for the work you are doing, but I noticed that references in your newly-created items are merged into one reference. Such as here: Q84557233. Also, the sources are again not added as identifier properties but only as plain URLs...
I found that current QuickStatements/Mix'n'match sessions expire in a few days, while in the previous version they could last for up to two weeks. This is annoying, as I use an alternative account for all QuickStatements edits.
Hi. The import from file/link in MnM has not been working for about a week now. After pressing the Test button, it gets stuck with the text "Test is running..." and then nothing more appears. I've tried both file and sheet, but the result is the same. Meanwhile, the new catalogues on the start page were all created via the scraper.
Hi! Excuse me for disturbing, I have experienced the same problem too (also reported here by @Vojtěch Dostál: and @Gerwoman:). I have six catalogs ready for uploading, and some would be useful for projects with Italian universities. Thank you in advance!
I solved my problems when Epidosis advised that I had some rows in my dataset which lacked the mandatory fields. After cleaning those rows, the import worked (yesterday).
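In case it helps others, a quick pre-check like the sketch below can flag such rows before uploading. It assumes a tab-separated file with a header row and treats `id` and `name` as the mandatory columns; those column names are my assumption, so adjust them to whatever the import tool actually requires.

```python
import csv
import io

# Assumed mandatory columns; adjust to the import tool's real requirements.
MANDATORY = ("id", "name")


def split_rows(tsv_text):
    """Split rows into importable ones and the line numbers of faulty ones."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    good, bad_lines = [], []
    for row in reader:
        # Missing trailing fields come back as None, hence the `or ""`.
        if all((row.get(field) or "").strip() for field in MANDATORY):
            good.append(row)
        else:
            bad_lines.append(reader.line_num)
    return good, bad_lines


sample = "id\tname\tdesc\n1\tAlice\tpainter\n\tBob\t\n3\t\tsculptor\n"
good, bad_lines = split_rows(sample)
print(len(good), bad_lines)
```

The reported line numbers count the header as line 1, so they can be looked up directly in the file.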
Hi! https://mix-n-match.toolforge.org/#/catalog/4207 needs HTTP instead of HTTPS. Thanks! P.S. Would it be possible to give MnM admins the ability to change the formatter URL directly? It would probably be more convenient.
I'll try to give a more precise description of the problem:
Mix'n'match seems to register matches and unmatches made by users only at intervals of 20 seconds between each one (see https://mix-n-match.toolforge.org/?#/rc; in the past, it was also able to register matches and unmatches at much shorter intervals); if a user tries to do more matches and unmatches in quick succession, Mix'n'match puts them in a queue and goes into "loading" (becoming unreachable) until it has emptied the queue
the function "creation_candidates/by_ext_name", which was very useful, no longer seems to work, probably as a result of the first problem
Although my technical understanding is close to zero, could it be related to https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2021_Purge#In_use_mix-n-match? Another reason may be a problem with the "microsync" function: I see many catalogs microsyncing, and these operations may interfere with matches and unmatches made by users, slowing them down to 1 every 20 seconds.
I would first like to thank you for your intervention on BitBucket this morning! However, at about 14 UTC Mix'n'match started an indefinite "loading..."; is this normal while MnM is absorbing today's changes, or is there something to be fixed? Thank you very much!
It reappeared today at 10 UTC, but the problems above (mainly the first: it registers matches and unmatches made by users only at intervals of 20 seconds between each one) are still present.