Digital Paleography

We all recognize the huge potential of handwriting recognition using artificial intelligence, since most archives consist largely of handwritten documents. We also recognize the huge challenges. Major initiatives like Transkribus are encouraging. Although Transkribus was tested at The National Archives (TNA), as reported in “TNA tests handwriting recognition in PROB 11 will collection”, I didn’t find any conclusion from that pilot project. Perhaps it wasn’t ready for prime time in 2018.

What’s new in 2022?

A 28 January 2022 news release from FamilySearch, “What to Expect from FamilySearch in 2022”, included the following.

FamilySearch Handwriting Recognition Artificial Intelligence

FamilySearch works with archives all over the world to help digitally preserve and expand online access to their genealogical records. In 2022, FamilySearch will be unveiling some exciting developments utilizing handwriting recognition artificial intelligence and records access technology to make impressive additions to the number of searchable ancestor names found in genealogical records at FamilySearch.org.

These emerging technologies will enable FamilySearch to make ancestral information hidden in its billions of digital images of historical records more discoverable—in a fraction of the time currently required.

“We’ve been implementing our new records access technologies in Spanish-speaking countries for the past year. We are very pleased with the results. We’ve been able to produce in one year what would have otherwise taken us a hundred years to do with previous technology.  We are excited to unveil it at RootsTech 2022 along with our expansion plans,” said John Alexander, a FamilySearch product manager for the emerging technology.

Alexander said the new handwriting recognition technology, coupled with the new online volunteer experiences and integration with a patron’s FamilySearch Family Tree, will dramatically increase personal discoveries and access to the world’s records.

On 27 January, Ancestry posted “Ancestry® to Apply Handwriting Recognition Artificial Intelligence to Create a Searchable Index of the 1950 U.S. Census”.

Using new, proprietary Artificial Intelligence (AI) handwriting recognition technology, Ancestry® today announced it will deliver a searchable index of the 1950 U.S. Census to customers faster than ever before.

The 1950 U.S. Census is set to be released to the public in early April. With handwriting recognition technology, what previously took years to index, now will only take weeks. Ancestry anticipates the indexing of the 1950 Census to be completed and available on Ancestry.com this summer, with states released in real time upon completion.

(paragraph omitted)

Cutting-Edge Technology to Power Discoveries

Ancestry developed machine learning algorithms to power our proprietary AI handwriting recognition technology. Ancestry created AI software that reads handwriting from historical documents and transcribes the data, enabling our community to easily and quickly search historical records. The technology uses a unique and iterative blend of machine and human evaluation which is based on an Ancestry-developed confidence score framework.

Given the unique nature of the 1950 U.S. Census and the unavailability of images in advance, Ancestry used a novel approach to simulate sample document images to ensure it is representative of anticipated variation in aged, inconsistent or damaged historical documents that may be encountered in order to train the AI. Employees recreated full-size census forms in a variety of handwriting styles before intentionally damaging some of these forms by ripping, burning, and pouring liquid on the forms in order to simulate the wear and tear that historical documents go through over time. Ancestry then re-scanned these forms, using them in our sample set to ensure our unique algorithms are prepared to support the anticipated condition of these 70-year-old historical documents.

Calling All Family History Buffs

Ancestry and FamilySearch volunteers are partnering to evaluate the handwriting recognition extraction and ensure a complete and accurate index. Those interested in volunteering to help should visit familysearch.org/1950Census to learn more.

Keep an eye out for additional details around the 1950s U.S. Census and the AI handwriting recognition technology at RootsTech 2022. To register, go to www.RootsTech.org.

Apparently, FamilySearch and Ancestry believe AI handwriting recognition technology is ready for an outing in 2022, perhaps not for prime time but as a starting point for human correction. I’ll be interested to attend the RootsTech presentation and to see how the community reacts to its experience with the 1950 U.S. census. If it succeeds, perhaps LAC will be adventurous enough to consider a similar initiative for the 1931 Canadian census when it is released next year.
