Office Suite Delivers High-Caliber Document Conversion
I.R.I.S. three-piece text-recognition package has rough spots, but it puts serious scanning power at your fingertips.
William Van Winkle, special to PCWorld.com
In a perfect world, none of us would ever have to retype anything. I.R.I.S. Office Suite, although not a perfect package, makes the world a better place with its substantial optical character recognition and document conversion capabilities. The $149 software-hardware bundle won't help you organize your scanned files, but it ably turns documents into data.
Readiris: Some Training Required
The centerpiece of the suite is Readiris Pro 6, an OCR program that supports 56 languages and any TWAIN-compliant scanner. When we tested it, Readiris did an excellent job of preserving page formatting, including color graphics and tables, and its text recognition was top-notch. You can export documents directly into Microsoft Word, WordPad, or Excel, or save them in any of several popular generic formats, including RTF and HTML, but not Adobe's PDF. Readiris's OCR Wizard makes scanning and conversion a snap for novices, and the software is lightning-fast. A magazine page full of text and a sprinkling of graphics was converted into a Word document in only 6 seconds. In comparison, the same page took 13 seconds in TextBridge 9.5--and Readiris was better at OCR accuracy and preserving formatting.
While Readiris Pro 6 has the potential to perform well, you may not get impressive results out of the box. After seeing our first OCR attempts emerge as tilde-ridden gibberish, we learned that Readiris needs a fair amount of help to provide accurate results. Slight changes in scanning brightness and resolution can make a huge difference in document quality, but neither the bundled nor online documentation offers good advice on how to optimize your settings, which can vary considerably between scanners. The manual recommends scanning at 300 dpi or higher, but we found that 200 dpi yielded far better results for color work. Readiris does not let you preview while it's scanning, but it does let you set resolution, brightness, and color mode. Additional tools such as despeckling and deskewing are available once the image is captured.
As with other OCR packages, once you import a document from any TWAIN-compatible scanner or existing image file, Readiris analyzes the page, breaking it into text, table, and graphic zones, each zone type represented by its own color. Graphics containing text confuse the program, though, and our inability to select the scanning area in a preview meant that we had to capture the entire scanning bed--including the areas outside of the page--and not just the section of the page we wanted. Allowing Readiris to perform OCR using automatic analysis often resulted in a terrible jumble of nonsense. But the program allows you to manually readjust, create, and delete analysis zones. For example, you can delete a zone window overlaying a block of text you know you won't need in the final document, or you can drag a new graphic zone window over an object that Readiris zoned as text. We found we could typically rezone a complex page in about 2 minutes.
Our test documents included correspondence on letterhead, magazine articles, brochures, and books. While we never managed to obtain a "perfect" OCR conversion, the 2 minutes spent rezoning a page before conversion meant perhaps only another 2 minutes of fine-tuning fonts and formatting once the document popped up in Word. Better still, Readiris's OCR capabilities improve with use, thanks to a font-training routine that lets you correct OCR conversions. Your corrections are saved in the program's font dictionary and are used in subsequent conversions.
One Hit, One Strike
I.R.I.S. Office includes a pen scanner--something none of the other popular OCR packages offer. To use it, you just press the IRISPen's tip down on a page and run it across any line of text with a type size of 8 to 22 points. The characters and numbers are immediately recognized and dumped into your word processor. You can also scan small graphics, such as logos. The IRISPen software, fluent in 28 languages, even uses text-to-speech technology to read the conversion out loud so that you can listen for errors without looking up from the page.
Roughly the size of a toothpaste tube, the Pen fetches power from a bundled keyboard cable splitter and connects to your PC via a pass-through parallel adapter. (I.R.I.S. says a Universal Serial Bus version should be available early in 2001.) The device has a switch for three brightness levels and a large button that you can program to execute commands such as Enter and Tab. While the Office Suite version of the IRISPen software is not compatible with Windows NT or 2000, it worked fine with our 800-MHz Athlon test system running Windows Millennium Edition. It is also compatible with Windows 95 and 98. IRISPen Executive 2.6 is compatible with Windows NT and 2000.
IRISPen's deskewing capability is phenomenal. Despite how wobbly our scans were, IRISPen seemed to consistently yield accurate OCR conversions, provided that we had the brightness set properly and kept a steady scanning speed across the page. Attached to a laptop, this would be a killer product for any student or researcher scanning material in a library.
Not For Business Cards
While Readiris and IRISPen performed fine, the third component of the package--a database program called Cardiris, designed to catalog all your business cards--was a different story. The idea behind Cardiris is slick: Just scan a card and watch the program automatically identify contact fields such as name and address, and then insert the information into a database. Unfortunately, Cardiris is still based on 16-bit code from the Windows 3.x era, which might explain why it refused to recognize a Hewlett-Packard 6250 scanner on either our PIII-800 Windows 2000 system or our Athlon 800 Windows Me test machine. (The company does not claim that the software is compatible with Windows versions later than 98, but it does claim that Cardiris works with a wide range of scanners.)
Since I.R.I.S. Office does not include the card scanner described in Cardiris's manual, we used the HP 6250 and captured business cards as black-and-white TIFF files with HP PrecisionScan or Photoshop. Then we imported the graphics into Cardiris and ran the software's OCR--a lengthy, inconvenient process. You could use the IRISPen to scan business cards, but it doesn't directly support the Cardiris software; you would be forced to cut and paste the contact data from your word processor back into Cardiris's database, field by field.
Cardiris translated business cards with standard fonts and text layouts fairly well, but its main view doesn't show you field names such as Address or Company. (You have to switch into List View to see them.) Cards with uncommon fonts could hardly be converted at all. In the end, especially since Cardiris's export formats are rather limited, we decided that entering contact data into our PIM by hand was quicker.
Two Out of Three Still Wins
The frustrations and difficulties we experienced while using Cardiris do not negate the inherent value of this package, even though the three products have virtually no integrated functionality with each other. Considering that stand-alone versions of Readiris Pro and the IRISPen Standard retail for $79 and $199 respectively, the I.R.I.S. Office Suite is a bargain at just $149--especially when compared with packages such as TextBridge Millennium ($79) and OmniPage Pro 10 ($500). Readiris and IRISPen are fast and powerful, and they make short work of tough OCR jobs. If you take the time to learn the package's scanning optimization quirks, you'll find it will save you hours of retyping and editing.
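For readers who want to check the bargain claim, the arithmetic can be sketched in a few lines of Python. The three prices come from the review itself; the function name bundle_savings is hypothetical, and the comparison assumes the bundled Readiris and IRISPen are comparable to the stand-alone versions. Cardiris is left out because the review quotes no separate price for it.

```python
# Prices quoted in the review, in US dollars.
READIRIS_PRO = 79       # stand-alone Readiris Pro
IRISPEN_STANDARD = 199  # stand-alone IRISPen Standard
OFFICE_SUITE = 149      # I.R.I.S. Office Suite bundle


def bundle_savings(bundle_price, component_prices):
    """Return (separate total, dollars saved, percent saved) for a bundle.

    Hypothetical helper for illustration; it assumes the bundled
    components are equivalent to the stand-alone products.
    """
    total = sum(component_prices)
    saved = total - bundle_price
    percent = round(100 * saved / total, 1)
    return total, saved, percent


separate_total, saved, percent = bundle_savings(
    OFFICE_SUITE, [READIRIS_PRO, IRISPEN_STANDARD]
)
print(f"Bought separately: ${separate_total}")
print(f"Bundle saves: ${saved} ({percent}%)")
```

On these figures the two stand-alone products total $278, so the bundle comes in $129 lower, a little under half off.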


