Step 5 - What are your hardware and software requirements?
When you configure or engineer your document imaging system, you will need to select one or more scanners to convert your paper documents into digital images. This chapter covers the basics of scanners so that you can make an informed choice about which scanner(s) to purchase.
There is no one-size-fits-all scanner for businesses. Every business has multiple scanning needs, each with its own requirements and each best met by a particular type of scanner.
Scanners work like photocopiers, but instead of producing a paper copy of the document, they produce an electronic image.
Most scanners operate by moving the paper across a stationary optical assembly. The paper is fed into the scanner by an automatic document feeder (ADF). The automatic document feeder is a critical part of the scanner. Its ability to handle different paper sizes and weights is important for avoiding double feeds, paper jams, and skewed images.
The automatic document feeder allows you to process multiple pages of paper with little human intervention. Automatic document feeders come in varying sizes and allow you to stack anywhere from 25 pages to 500 pages into the feeder at a time.
It's important to understand that scanners fall into three types: production scanners, special format scanners, and graphic scanners.
Production scanners are high-volume scanners that image letter and legal sized documents. Most organizations will purchase a production scanner. These scanners are engineered to be very efficient at capturing black and white paper documents and converting them into sharp digital images.
Production scanners usually have an automatic document feeder so that they can scan multiple documents at a time. They are designed to handle single-sided and double-sided documents of varying weights and sizes.
Some models come with a flat-bed attached to handle special needs such as damaged documents that can't pass through the automatic document feeder. They can also handle books and magazines without disassembling them.
Special format scanners are engineered to handle special-needs documents. For example, wide format scanners are used to scan engineering drawings (blueprints) that exceed the scanning capabilities of a production scanner. There are also specialized scanners for bound materials such as books, manuals, and magazines.
Graphic scanners are engineered to capture as much detail as possible by scanning at high resolutions. They are not designed for speed, and are generally used for photographic work. Most of these scanners are built as flatbeds so that you can lay material on the scanning surface.
Regardless of type, scanners have many features in common. We will identify and discuss the features you need to understand in order to make an informed buying decision.
A scanner's speed largely determines its price. The scanner's rated speed is a benchmark determined under ideal conditions. It is usually calculated by setting the scanner at 200 dpi and scanning a stack of letter size paper. If you scan at 300 dpi, the scanner will run slower.
A common mistake is to multiply the rated pages per minute (ppm) by 60 to estimate how many pages can be scanned in an hour. In a production setting, actual throughput will be 25-40% lower than the rated speed because of time spent loading the ADF, removing scanned pages, clearing paper jams, and rescanning misfeeds.
The size of the document is also a factor. Smaller documents take less time to scan than larger documents such as legal size. The most accurate way to estimate how quickly you will finish a scan job is to run a representative batch of documents through the scanner.
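The derating described above is simple arithmetic, and a short Python sketch can make it concrete. The function names and the 60 ppm scanner in the usage note are illustrative assumptions; only the 25-40% range comes from the text.

```python
def effective_pages_per_hour(rated_ppm, derating=0.25):
    """Estimate real-world hourly volume from a scanner's rated speed.

    rated_ppm: the vendor's pages-per-minute benchmark figure.
    derating:  fraction of rated speed lost to paper handling
               (0.25 to 0.40 according to the text).
    """
    if not 0 <= derating < 1:
        raise ValueError("derating must be at least 0 and less than 1")
    return rated_ppm * 60 * (1 - derating)


def hours_for_job(pages, rated_ppm, derating=0.40):
    """Hours needed to scan a job; defaults to the pessimistic 40% loss."""
    return pages / effective_pages_per_hour(rated_ppm, derating)
```

For a hypothetical scanner rated at 60 ppm, the naive estimate is 3,600 pages per hour, while the derated estimate falls between 2,160 (40% loss) and 2,700 (25% loss). A test batch of your own documents remains the better guide.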
Virtual Re-scanning (VRS)
Once the image has been created, it can be enhanced by VRS software or hardware. VRS technology improves the readability of an image by removing random blemishes caused by dirt and noise (de-speckle); straightening the image (de-skew); removing borders (crop); and even dropping unwanted colors (color dropout).
Today many scanners come with special colored lamps that are used for dropping out colored ink or print. This feature is especially useful when scanning forms. By dropping out the colored background you will get clear, readable images. It is important that the lamp color match the color of the ink you wish to drop out of the form.
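To give a feel for what de-speckling means, here is a minimal Python sketch that removes isolated black pixels from a black and white image. It is a toy illustration only: the grid representation and the "no black neighbors" rule are assumptions made for this example, not a description of how any VRS product works.

```python
def despeckle(image):
    """Return a copy of a bitonal image with isolated black pixels removed.

    image: list of rows, each a list of 0 (white) or 1 (black).
    A black pixel is treated as a speck when none of its eight
    neighbors is black.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the surrounding pixels, clipped to the image edges.
            has_black_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbor:
                cleaned[y][x] = 0
    return cleaned
```

A lone dot of dirt disappears, while strokes of text, which are made of connected pixels, are left alone.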
Simplex vs. Duplex
Some documents are printed on only one side while others are printed on both sides of a piece of paper. Duplex scanners provide the functionality to scan both sides of a document in one pass. Simplex scanners only allow you to scan a single side of the document at a time.
The duty cycle of a scanner is generally stated as the number of pages per day, the number of pages per month, or the total number of pages over the life of the scanner. You will want to match the duty cycle to the number of pages that you will scan on a daily basis. If you are also going to scan a backlog of documents that currently reside in boxes and file cabinets, you will need to purchase a scanner with a higher duty cycle.
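One way to size the duty cycle is to add the daily share of the backlog to your normal daily volume. The Python sketch below assumes the backlog is spread evenly over a fixed number of working days; the function and parameter names are hypothetical.

```python
import math


def required_daily_duty_cycle(daily_pages, backlog_pages=0, backlog_days=1):
    """Pages per day the scanner must be rated to sustain.

    daily_pages:   new pages arriving each working day.
    backlog_pages: existing pages in boxes and file cabinets to convert.
    backlog_days:  working days over which the backlog will be scanned.
    """
    if backlog_days < 1:
        raise ValueError("backlog_days must be at least 1")
    # Round the backlog share up so the estimate never undershoots.
    return daily_pages + math.ceil(backlog_pages / backlog_days)
```

For example, 500 new pages a day plus a 120,000-page backlog converted over 120 working days calls for a scanner rated for at least 1,500 pages per day.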
Maintenance is required on all scanners and the trend today is for customer maintenance rather than a field service call. Lamp replacement and cleaning paper residue from the automatic document feeder are the most common activities that need to be done routinely to maintain the scanner.
Your scanned images will reside on a server or an optical disk. For simplicity's sake we will confine our discussion to keeping the images on a server because that is what 99% of you will do.
We promote the PDF file format over the TIFF format because it has become an international standard, does not need proprietary software to view, and is smaller than TIFF when compressed. We typically estimate that a single-page PDF file will be 30 KB in size and that the same page as a TIFF file will be 50 KB.
Multiply the number of pages you plan to scan into the system in a year by 30 KB to get the storage space, in kilobytes, required for one year's worth of scanned images. Dividing that figure by 1,048,576 (1,024 × 1,024) gives the number of gigabytes of hard drive space you will need.
For example, an 80 GB hard drive costing about $100 will store about 2,700,000 single-page documents.
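The storage estimate can be checked with a few lines of Python. This is a sketch under the text's assumptions (about 30 KB per PDF page, 50 KB per TIFF page) and uses binary units, where one gigabyte is 1,024 × 1,024 kilobytes; the function names are illustrative.

```python
KB_PER_GB = 1024 * 1024  # kilobytes in one gigabyte (binary units)


def storage_gb(pages_per_year, kb_per_page=30):
    """Gigabytes needed to hold one year's worth of scanned pages."""
    return pages_per_year * kb_per_page / KB_PER_GB


def pages_per_drive(drive_gb, kb_per_page=30):
    """Whole pages that fit on a drive of the given size."""
    return int(drive_gb * KB_PER_GB // kb_per_page)
```

With these figures, pages_per_drive(80) comes to 2,796,202 pages, in line with the "about 2,700,000" quoted above, and 500,000 pages a year needs a little over 14 GB. Actual file sizes vary with resolution, color depth, and compression, so measure a sample of your own scans before committing to a number.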
Hardware fails. Using top-of-the-line hardware minimizes the risk, but all hardware can fail.
The component that fails most often in a computer is the hard drive. If a hard drive fails, the data on that hard drive is usually LOST.
How do you protect yourself from hardware failure? There are several ways to do this; one is redundancy and another is near line backup. Best practices incorporate both of these methodologies.
1. Redundancy
Redundancy is a hardware solution that allows for uninterrupted service, or only a minimal interruption, and no data loss in the event of a hardware failure. The three methods of achieving redundancy are RAID 1, RAID 5, and failover technology.
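Redundancy costs disk space, which matters when you size your storage. The text does not go into capacity, so the Python sketch below relies on the standard definitions of the two RAID levels named above: RAID 1 mirrors the same data onto every disk, and RAID 5 gives up the equivalent of one disk to parity. It assumes identical disks, and the function name is hypothetical.

```python
def usable_capacity_gb(raid_level, disk_count, disk_gb):
    """Usable space of an array built from identical disks."""
    if raid_level == 1:
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds the same data, so capacity is one disk's worth.
        return disk_gb
    if raid_level == 5:
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space is consumed by parity.
        return (disk_count - 1) * disk_gb
    raise ValueError("only RAID 1 and RAID 5 are covered here")
```

Two 80 GB disks in RAID 1 give 80 GB of usable space, and three 80 GB disks in RAID 5 give 160 GB.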
2. Near Line Backup
Near line backup is a process where the data is copied to another device or computer on the network. This could be a separate server or workstation or it could be an external USB drive.
A near line backup can be very effective on a small network, especially because with our system you only need the documents themselves to have full functionality. If the main server had a catastrophic failure and the documents are in near line backup, they can be placed on a different computer and re-indexed. NOTE: The length of the service interruption depends on the number of documents to be indexed.
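A near line backup can be as simple as a scheduled script that copies new or changed files to the second device. The following Python sketch uses only the standard library; the function name and the folder paths in the usage note are hypothetical, and the two-second tolerance is an assumption that allows for backup drives with coarse file timestamps.

```python
import shutil
from pathlib import Path


def near_line_backup(source_dir, backup_dir):
    """Copy new or changed files from source_dir into backup_dir.

    Returns the relative paths of the files that were copied.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Skip files already backed up: same size and the source is
            # not more than 2 seconds newer than the backup copy.
            if s.st_size == d.st_size and s.st_mtime - d.st_mtime <= 2:
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the file timestamps
        copied.append(rel.as_posix())
    return copied
```

It could be run nightly with something like near_line_backup("D:/scans", "E:/scans_backup"), where both paths are placeholders for your own document folder and backup drive. This sketch never deletes anything from the backup, which suits scanned images that are written once and kept.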
Disasters are just that, disasters. They can include theft, fire, flood, hurricanes, tornadoes, and who knows what else. The only way to protect yourself from a disaster is to keep a copy of your data OFF SITE.
There are a number of ways to maintain a backup of your data off-site.
1. Tape backup
Data is backed up to a tape each night and that tape is taken off-site daily.
2. Other hardware backup
Similar to a tape backup, a system using USB drives could be used.
3. Off Site online backup service
Backups of your data can be made over the internet. This has the advantage of not requiring anyone to physically take tapes or disks off-site.