Wikidata:Property proposal/USHMM person ID

USHMM person ID

Originally proposed at Wikidata:Property proposal/Authority control

Description: identifier for a person in the USHMM database of Holocaust survivors and victims
Represents: Holocaust persons database (Q32321202)
Data type: External identifier
Domain: Holocaust survivors and victims
Allowed values: [1-9]\d*
Example:
Source: http://www.ushmm.org/online/hsv (Holocaust Survivors and Victims database)
Formatter URL: https://www.ushmm.org/online/hsv/person_view.php?PersonId=$1
See also: different from USHMM Holocaust Encyclopedia ID (P3724)
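
The allowed-values pattern and the formatter URL in the fields above combine in a simple way: a value is accepted only if the whole string matches the pattern, and the URL is built by substituting the value for `$1`. The following is a minimal Python sketch of that check, not part of the proposal. The helper names `is_valid_person_id` and `person_url` are hypothetical, and the sample ID `12345` is made up for illustration and is not a known record.

```python
import re

# Pattern and formatter URL copied from the proposal fields.
# re.ASCII restricts \d to the digits 0-9 (an assumption about the intent).
ALLOWED_VALUES = re.compile(r"[1-9]\d*", re.ASCII)
FORMATTER_URL = "https://www.ushmm.org/online/hsv/person_view.php?PersonId=$1"


def is_valid_person_id(value: str) -> bool:
    """Return True if the whole string matches the allowed-values pattern."""
    return ALLOWED_VALUES.fullmatch(value) is not None


def person_url(value: str) -> str:
    """Substitute a valid identifier for $1 in the formatter URL."""
    if not is_valid_person_id(value):
        raise ValueError(f"not a valid USHMM person ID: {value!r}")
    return FORMATTER_URL.replace("$1", value)


if __name__ == "__main__":
    # "12345" is a made-up identifier used only to show the substitution.
    print(person_url("12345"))
```

Under this pattern, values with a leading zero, an empty string, or any non-digit character are rejected.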
Motivation
Notified participants of WikiProject Authority control

Building a database of Holocaust victims is an exceedingly difficult problem. No such comprehensive database exists. Most of the victims were not famous, so they are not in VIAF or any other large LOD for people.

USHMM has a database of 3.2M public person records (plus about as many closed ones), with 1M additional names that provide aliases, related family members, etc., as well as historic places, dates and events (see https://www.slideshare.net/valexiev1/semantic-archive-integration-for-holocaust-research-the-ehri-research-infrastructure#slide=27 for details). Because these come from 15k different lists/sources, one person may have as many as 6-7 records. We've done some clustering and are thinking of crowdsourcing to confirm the deduplication of person records and to build up family networks (which often requires additional documentary research).

Yad Vashem has another database, JewishGen has several databases (and they've collaborated with USHMM on transcribing lists/sources), USC Shoah Foundation has about 1M person names from their 52k oral history interviews, etc. Connecting these disparate databases, and linking to respective archival sources is difficult, but I think we can make a start.

Vladimir Alexiev (talk) 14:56, 9 July 2017 (UTC)

Discussion
  •  Support I support this property, but I am curious whether this includes the Benjamin and Vladka Meed Registry of Holocaust Survivors created by the American Gathering of Jewish Holocaust Survivors and their Descendants -- and other existing resources that are available and used by Jewish genealogists who are tracing their family roots. In particular, JewishGen has a great locality database, for which I started a JewishGen Locality ID, that helps to collocate different town names as well as various town resources. There are also other established Holocaust resources at both JewishGen and JRI-Poland -- as well as the International Tracing Service. I am sure the USHMM is providing a great deal of information here, but I am concerned that the slides did not note any of the existing genealogical resources. I am probably wrong about this, but sometimes it seems like Wikidata is reinventing the wheel with datasets and/or isn't utilizing existing resources found in libraries and museums -- which have their own datasets, thesauri, etc. -- and is instead crunching data in isolation from these existing resources. There are also a lot of genealogists with significant resources that are not digitized, or are paywalled or copyrighted. I am curious whether those have been part of the scope of this project. -- Erika aka BrillLyle (talk) 10:07, 10 July 2017 (UTC)
    • I do see that you describe some of the resources, including JewishGen, above. I guess I am just confirming that these very rich resources are part of the project. I have done some primary research with a Holocaust survivor who was prolific in translating a massive amount of vital records -- and have worked on some Jewish genealogy-related BLPs on En Wikipedia, which has impacted my own Jewish genealogical research, so I am passionate and care very deeply about this project you will be working on. Thank you! -- Erika aka BrillLyle (talk) 10:16, 10 July 2017 (UTC)
  •  Support. This looks like a useful source of information to connect with Wikidata. YULdigitalpreservation (talk) 11:53, 10 July 2017 (UTC)

"a great deal of potential to build a valuable LOD research resource by ingesting this dataset": Building a global database of Holocaust victims is a very worthy goal but a very difficult one. Some USHMM persons have many duplicate records; merging those within one database is hard, and merging across databases is very hard. Also, the resources are distributed across various places and partly duplicated (as Erika wrote, some came from JewishGen, others from Ancestry, which USHMM uses for crowdsourced transcription...). I'm now hoping EHRI can get the nearly 1M people from USC Shoah Foundation interviews and integrate those, but we'll see.

  • @BrillLyle: that presentation isn't meant to review all available genealogical sources, just to present what we're doing in EHRI.
  • EHRI will execute some crowdsourcing use cases, and I proposed "judging whether a cluster of person records represents the same person" and "judging whether a guessed family relation actually holds". That requires examining historic documents, so it is more complex than a typical crowdsourcing task. If you can help with elaborating such cases, proposing simple-to-use sources, providing manpower, or evangelisation, please contact me --Vladimir Alexiev (talk) 16:06, 11 July 2017 (UTC)
  • Please comment in JewishGen Locality ID Discussion
  • All these discussions don't belong on this page. Any volunteers to start and organize a WikiProject Holocaust? --Vladimir Alexiev (talk) 16:06, 11 July 2017 (UTC)
  • About the "Benjamin and Vladka Meed Registry of Holocaust Survivors": It has only 200k, but I'd expect richer info, especially on family members; see the registration form. Data collection is run by USHMM. The FAQ says "The Registry of Holocaust Survivors is not made available over the Internet in order to maintain the privacy of the survivors and their families", so it's probably in the other 3M private records of USHMM.
    • Thanks for the response and detailed information, Vladimir. I really appreciate it. I did some library work for one of the people who was responsible for compiling the Registry, Miriam Weiner, who is still around though semi-retired, so if you need or want to talk with her please reach out -- I would be happy to introduce you if that's helpful. I suspect that, due to the relationships between the Jewish genealogical communities and Ancestry, the resources came from the Jewish genealogical communities and NOT Ancestry. Ancestry often just hosts the resources (due to a complicated history with providing server space during a time of great flux for the community). Probably more detail than you wanted to hear. I would be happy to provide any assistance and evangelism you might need. I am a "younger" generation of the more "old school" -- and less digitized -- Jewish genealogical community, but I know some of the players who have connections to USHMM, etc. but may not necessarily be part of those institutions -- and might be valuable resources. Best to you on this project. Feel free to ping me any time -- Erika aka BrillLyle (talk) 04:26, 12 July 2017 (UTC)
@Vladimir Alexiev, Sic19, Pigsonthewing, Jonathan Groß, Jneubert: @BrillLyle, YULdigitalpreservation: USHMM person ID (P4130) has been created. Pamputt (talk) 19:47, 17 July 2017 (UTC)