Topic on User talk:Pintoch

OpenRefine author/title reconciliation

Jheald (talk | contribs)

Hi Pintoch! I've been doing a bit of author/title matching against VIAF, LoC, and ISNI using a nest of Perl scripts, to try to match authors from the MARC 100 field of a book's catalogue entry to VIAFs, LCNAFs, ISNIs, and Wikidata items. (This is for cases where the book does not currently have a Wikidata item, as a step towards creating one.)

There are various reconciliators that try to do author matching against these services, eg:

How big a job would it be to create an author/title reconciliator, rather than just an author reconciliator?

And also, to extend what these reconciliators do, so that they can retrieve foreign IDs from these services (e.g. LoC IDs from VIAF), in the way that the Wikidata reconciliator can add columns for the values of Wikidata properties based on a match?

Is there enough support in the community that this could be offered eg as a student project for a Digital Humanities student? Or would writing/adapting an OpenRefine reconciliator be rather too big an ask?

The British Library quite liked the rough samples from my Perl scripts, but they're a bit close to the metal; whereas an OpenRefine reconciliator could be something that anybody could use. What would be your instincts on this?

Pintoch (talk | contribs)

Hi Jheald,

That sounds like a great project! Currently, we badly need a solid implementation of the reconciliation API that can easily be configured for many data sources. Conciliator is designed just for that, but the author seems to be a bit short on time to update it. He has started to implement the data extension API (which is required for the "Add column from reconciled values" operation) but it is not ready for prime time yet. It would definitely be a very nice project for anyone who is not too daunted by Java - the scope should be manageable for a student.
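
To give a rough idea of what such a service would have to handle, here is a minimal Python sketch of the two request payloads involved: a batch of reconciliation queries that carry the title as an extra property alongside the author name, and a data extension request that asks for foreign IDs on already-matched entities. The overall JSON shape follows the reconciliation API as OpenRefine uses it, but the helper names, the "title" property ID, and the entity ID in the usage example are placeholders chosen for illustration; a real service would define its own property IDs. P214 and P244 are the Wikidata properties for VIAF and Library of Congress authority IDs.

```python
import json
from urllib.parse import urlencode


def build_reconcile_queries(rows, title_pid="title", type_id=None, limit=3):
    """Build one reconciliation query per (author, title) row.

    `title_pid` is a hypothetical property ID: the service decides which
    property IDs it understands and how it uses them for scoring.
    """
    queries = {}
    for i, (author, title) in enumerate(rows):
        query = {
            "query": author,  # main string to match (the author name)
            "limit": limit,   # maximum number of candidates wanted
            # extra column passed along so the service can check the title
            "properties": [{"pid": title_pid, "v": title}],
        }
        if type_id is not None:
            query["type"] = type_id
        queries[f"q{i}"] = query
    return queries


def build_extend_request(entity_ids, property_ids):
    """Build a data extension request for already-reconciled entities."""
    return {
        "ids": list(entity_ids),
        "properties": [{"id": pid} for pid in property_ids],
    }


def encode_form(param_name, payload):
    """Serialise a payload as a form-encoded parameter holding JSON."""
    return urlencode({param_name: json.dumps(payload)})


if __name__ == "__main__":
    queries = build_reconcile_queries(
        [("Austen, Jane, 1775-1817", "Pride and prejudice")]
    )
    print(encode_form("queries", queries))

    # "Q_PLACEHOLDER" stands in for an ID returned by a previous match.
    extend = build_extend_request(["Q_PLACEHOLDER"], ["P214", "P244"])
    print(encode_form("extend", extend))
```

The client side is the easy part. The real work in an author/title reconciliator would be on the server: deciding how a title match should affect the candidate scores, and mapping each upstream service's records onto the extension properties.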

I cannot work on this directly myself at the moment but I would be happy to help anyone if they have trouble finding their way in the current landscape.
