Most of the legal jobs coming my way are in image-only PDF format, not directly compatible with the CAT tool I prefer. Usually, it’s not a big deal—not much repetition, and a limited list of new-to-me terminology. But, very recently, I got a doozy of a legal summons that was about twice as long as it needed to be, full of repetition. Three days in, my hands were cramping and I was kicking myself for not knowing a better way to work.
Then, I worked on a long project as part of a team of translators. The word count I quoted was only a rough guide for the project manager, and when I received my section, there was no mention of just how many words it actually contained. How on earth was I supposed to write the invoice?! Kick myself once, eh, I’ll get to it later. Kick myself twice in the same couple of weeks, and I do something about it.
So, I bought ABBYY FineReader. Specifically, ABBYY FineReader Express Edition for Mac. First, I tried the 15-day free trial. Quick download, easy install process. I love the simple interface when you open the tool! All you do is choose:
- where to retrieve the image from (a file, or your TWAIN-driver scanner)
- which language to read (a very long list, though no Chinese; I suspect it only covers languages written in Latin-based alphabets)
- whether to convert the image to a text document, spreadsheet, HTML document, or searchable PDF.
There’s no decision fatigue here. One, two, three—go!
The document I was testing this on was quite long, 165 pages, and though it was technically French, it was written Cameroonian-style. I suspect these might be the reasons it took about 30 minutes for the whole conversion process.
Before the tool creates your final converted file, you have a chance to play with the images to optimize the accuracy of the reading. For instance, if your image-only PDF was saved sideways, you can rotate it to a standard reading orientation. This feature definitely came in handy for me! Once you’re done adjusting how you want the file read, you have just one more simple click to save the file. You can rename it and designate a folder location at this point, too.
Caution: the free trial will only save the first page of your test document! I was sorely disappointed by this, considering the length of my document and the amount of time I had already invested in working with my new tool. Not to be deterred, and pleased with the ease of use already, I shelled out the very reasonable sum to download the full version. I went through the same steps with my document, and I have to say it was worth it.
After adjusting the reading areas a bit, there was very little clean-up for me in the final file. I deleted the pages that weren’t assigned to me. I removed a few weird characters that had once been icons and fixed the spelling of a couple words. Find/replace came in handy for inserting spaces after every sentence. In all, I think it took me about an hour to prepare the file to get my word count. I’m very happy with this investment. It is going to be a very useful tool for future translation tasks!
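If you'd rather script that last bit of clean-up than do it by hand, here is a minimal sketch in Python. It assumes the OCR output has been saved as plain text, and the function names (`add_sentence_spaces`, `count_words`) are made up for illustration; they are not part of FineReader or any CAT tool. It adds a space wherever sentence-ending punctuation runs straight into a capital letter, then counts whitespace-separated words.

```python
import re

# Sentence-ending punctuation immediately followed by a capital letter,
# including the accented capitals common in French (assumption: the text
# uses a Latin-based alphabet).
MISSING_SPACE = re.compile(r"([.!?])(?=[A-ZÀ-ÖØ-Þ])")


def add_sentence_spaces(text: str) -> str:
    """Insert a space after ., ! or ? when a capital letter follows directly."""
    return MISSING_SPACE.sub(r"\1 ", text)


def count_words(text: str) -> int:
    """Rough word count: number of whitespace-separated chunks."""
    return len(text.split())


sample = "Le tribunal a statué.La partie adverse a fait appel!Était-ce prévu?Oui."
cleaned = add_sentence_spaces(sample)
print(cleaned)
print(count_words(cleaned))
```

Two caveats: the pattern will also split abbreviations such as "U.S.A." into "U. S. A.", so skim the result before trusting it, and a simple whitespace count may not match the figure your CAT tool or client reports, so treat it as an estimate for invoicing.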
* Note: These are not affiliate links and I wasn’t paid to write this post. I just thought I’d share my experience, since I had a hard time finding good info on an OCR tool to use for translation. Thanks also to my fellow NCATA members for their expert recommendations!