ABBYY FineReader recently introduced an upgrade for Mac users. We now have a Pro version of the OCR software. Hooray!
This upgrade is a fantastic improvement to optical character recognition for Mac users. The Pro version reads files much better than the Express version did—and I liked the Express version. I recently did a rush job on some heavily formatted personal documents using Express. I ran the same four documents through Pro and was blown away by the results.
My favorite improvement by far has to be the option to convert not only to DOC, but also to ODT, RTF, HTML, CSV, and a few other formats. Because I can now convert directly to ODT files, I save a major step in my standard workflow for heavily formatted documents. Instead of running them through ABBYY, opening them in NeoOffice, saving them as ODT files, and then putting them into OmegaT, I can go directly from ABBYY to my CAT tool. Major win!
Another plus in the Pro version of this OCR tool is that you have four options for how you save your files: you can save your scanned output as a single file, a set of files (one for each page in the original), one file per source file, or a set of files separated at each blank page that appears in your source. This is great, because my clients often send me personal documents, like driver’s licenses, that have a front and a back, which they scan and save as individual files. I like to process them as one large file to save time during the OCR stage, but until now I had to separate them out again when I was done. With this improvement to ABBYY, I can keep the time-saving workflow I’m used to and also save time when finalizing a project, because I no longer have to manually separate files for my client. Hallelujah!
As far as actually reading the text goes, I noticed a major improvement in how the Pro version of ABBYY reads tables. Express read the table fine, but decided to lump some rows together according to a set of rules I could not figure out. It wasn’t difficult to separate the smooshed-together rows, but it was annoying. Pro read all of the rows separately, and displayed them that way. No muss, no fuss.
One of the upgrades that didn’t work quite as well for me was the option to leave out images when saving the scanned file. I would love to use this for documents like transcripts and diplomas, since clients don’t like to see stamps and seals reproduced literally in the translation. Even though I selected this option, images still showed up in my scanned file. Unlike the image output from the Express version, however, these images were quick and easy to delete from my converted document. So, still a step up.
Now, obviously, with increased functionality, this software upgrade is not nearly as intuitive to use as the simpler Express version. You have to read your options a little more carefully and maybe even use the help/search feature to get started with the more advanced features.
You might also have to take a little more time massaging your output before you can save a usable file. On my first pass of these test documents using the auto-recognition setting, I ended up with three blank pages and one page with text output inferior to what the Express version produced. After less than one minute with the Pro image editing features, however, I was able to create a document whose quality surpassed the results from the Express software.
The Pro version doesn’t improve every detail. In my transcript table, the Pro version and the Express version both messed up a column of data that was blank except for a heading. Both versions of the software lumped the nearly blank column in with the next column. It was a pretty simple fix, but one I wish I no longer had to make.
[I know some of you have dismissed ABBYY in the past for those “dreaded” boxes that show up in its output. I’m sorry to say they’re still here, though they are more accurately placed. If you really need to remove them from your final file, I recommend cutting the text out, pasting it somewhere else for a second, and selecting the box as an object. It should be easy to delete that box and then put the text back in its proper place.]
I’m very happy to see such a professional piece of software for Mac users. It’s a tool I already use regularly to speed up the formatting time for documents, and now I expect that formatting time to drop even more. If you don’t already incorporate an OCR tool in at least some of your translation projects, I strongly encourage you to try ABBYY. It really is a huge time-saver.