How accurate is Ancestry’s handwriting recognition technology?

Less than two weeks after LAC released the 1931 census of Canada images, Ancestry provided a name search capability. It used AI, specifically handwriting recognition (HWR) technology. Did you rush to use the index to find the person you sought? Chances are you found them. If you didn’t, perhaps the person wasn’t recorded, or the technology failed.

HWR isn’t perfect. It makes mistakes, just as you might when you’re unsure about a transcription and choose the wrong option!

To test this, I took a one-page sample of 50 lines from the 1931 Canadian census and compared Ancestry’s HWR interpretation to my own. I randomly selected Sub-district: 0022 – Brantford (City), Page Number 3. Where my interpretation differed from Ancestry’s, I looked for another record to establish the correct reading.

Of the 50 names, 11 had a given name, a last name, or both transcribed incorrectly. That’s 78% accuracy (39 of 50 correct).

Out of the 50, only one given name needed to be corrected: it was transcribed as Crogier rather than Crozier. That’s just a single-letter difference, and g and z are frequently confused, as shown in this listing.

It isn’t clear how the HWR technology works. AI is opaque about its methodology. Does it look at individual characters? If so, 78% accuracy for whole names is only possible if the character recognition accuracy is much greater. Assuming each character is read independently, a 99% character accuracy implies a 93% accuracy for a seven-letter word, and a 96.5% character accuracy implies 78%.
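
The arithmetic behind those figures can be checked with a short sketch. It assumes that character errors are independent and equally likely for every letter, which is a simplification of mine and not a description of how Ancestry’s HWR works. The function names are hypothetical.

```python
def word_accuracy(char_accuracy: float, word_length: int) -> float:
    """Chance that every character in a word is read correctly,
    assuming (hypothetically) independent per-character errors."""
    return char_accuracy ** word_length


def char_accuracy_needed(target_word_accuracy: float, word_length: int) -> float:
    """Per-character accuracy required to reach a given word accuracy."""
    return target_word_accuracy ** (1 / word_length)


# Observed in the one-page sample: 11 of 50 names had an error.
sample_accuracy = (50 - 11) / 50

for p in (0.99, 0.965):
    print(f"{p:.1%} per character -> {word_accuracy(p, 7):.0%} per seven-letter word")

print(f"Sample accuracy: {sample_accuracy:.0%}")
print(f"Character accuracy needed for that: {char_accuracy_needed(sample_accuracy, 7):.1%}")
```

Run as written, this reproduces the 93% and 78% figures for a seven-letter word. Shorter names would fare better and longer ones worse under the same assumption.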

More likely, however, the HWR recognizes the whole word rather than spelling out the letters, just as we do in reading. Problems then arise with unfamiliar words, which are much more common among last names than given names.

Most of the ten last-name differences involved the ditto mark, which the HWR usually recognizes and interprets to mean “use the name above.” In two of the ten cases, however, the ditto mark was not recognized, so no surname was provided.

The ditto mark also causes problems when the lead person’s name is misread, because the error is carried down to everyone listed below. In one case, the HWR interpreted Strohm as Strohn and applied it to the following two family members. There were two other similar cases.
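
To illustrate why one misread surname can count as several errors, here is a minimal sketch of ditto handling. It is my own illustration, not Ancestry’s method: the function `resolve_dittos`, the set of ditto spellings, and the use of an empty string for a line where nothing was recognized are all assumptions.

```python
# Hypothetical spellings an enumerator might use for "same as above".
DITTO_MARKS = {'"', "do", "ditto"}


def resolve_dittos(surnames: list[str]) -> list[str]:
    """Replace each ditto mark with the most recent real surname above it.

    An empty string stands for a line where nothing was recognized;
    it is left empty, as in the two unrecognized-ditto cases.
    """
    resolved = []
    previous = ""
    for raw in surnames:
        text = raw.strip()
        if text.lower() in DITTO_MARKS:
            text = previous  # copy the name from above, right or wrong
        resolved.append(text)
        if text:
            previous = text
    return resolved


# The Strohm household: the lead surname is misread as "Strohn".
as_read = ["Strohn", '"', '"']
correct = ["Strohm", "Strohm", "Strohm"]

resolved = resolve_dittos(as_read)
errors = sum(got != want for got, want in zip(resolved, correct))
print(resolved, errors)
```

A single wrong letter in the lead name becomes three wrong records, which is why the ditto mark accounts for so many of the last-name differences in a small sample.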

Because the technology is more accurate for first names than last names, if a full-name search isn’t successful, consider omitting the last name and adding an age, birth year, or place to limit the number of hits.

You may be interested in the YouTube video below explaining Ancestry’s HWR technology.

One Reply to “How accurate is Ancestry’s handwriting recognition technology?”

  1. Given the issues with the 1921 England & Wales census, indexed by humans, it seems the AI does at least as good a job, if not a better one.

    I’ve always found that using various combinations to search for people will yield better results. We have to think outside the box and try every possible combination of search terms.

