Three Kinds of Document Imaging Systems

By: Randy Van Ittersum & Erin Spalding, CDIA+ Instructor

Welcome to Day 2 of our 5-Day Mini-Course on document imaging technology. Today's lesson will focus on the three types of document imaging systems. We have included this topic because the wrong choice could cost you your job and even your business. Making the right choice in this area is critical to successful implementation of a document imaging system.

How a Search Engine Finds a PDF File

For search purposes, picture a PDF file as having three layers. This three-layer concept is essential to understanding how different document imaging systems search and retrieve electronic documents.

  1. The first layer is the image layer
    The first layer is a picture of the paper document and can't be searched. It contains no data that can be used in a search other than the file name. Document imaging systems that use this layer must enter the key search words into a database and hyperlink them to the image. Without that hyperlink, you cannot find your documents other than by file name.

    Advantages: accurate searches
    Disadvantages: high maintenance cost, high exit cost, and all the inherent database problems
    Recommended: only for large organizations that have an in-house database IT specialist

  2. The second layer is the text layer
    To get searchable data from a scanned image, you first need to create text. This is done through a process called Optical Character Recognition, or OCR (we will cover OCR in more detail on Day 3). Keep in mind that OCR is far from perfect and is subject to errors that must be corrected before a document can be searched accurately. For example, if the OCR process is 90% accurate, then on average one in every ten characters is wrong. Studies conclude that with everyday documents you will achieve about a 74% accuracy rate.

    You can use an index to find documents using this layer, but once you reach a large number of documents, this method becomes self-defeating. The reason is that systems relying on this method start to return large numbers of inaccurate search results. A search might return 1,000 results when only 10 documents actually satisfy your search criteria. Most people don't have the patience to open 1,000 documents to find the information they need.

    Advantages: uses an index and can be maintained by someone with basic computer skills
    Disadvantages: inaccuracy of OCR and inaccurate search results
    Recommended: for desktop systems or organizations that have fewer than 15,000 documents

  3. The third layer is the metadata layer
    This layer is searchable by search engines and produces very accurate search results. Our system embeds the key search words that you assign to the document into the metadata layer. By embedding the key search words into this layer, we have in effect made the documents portable. Everywhere the document goes, the key search words go with it. This enables a search engine to easily find a document even after it has been moved from one file folder to another.

    Advantages: accurate search results, uses an index, can be maintained by someone with basic computer skills, scalable, low maintenance costs, and low exit costs
    Disadvantages: none
    Recommended: for any size organization
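
To see why the OCR accuracy figures quoted for the text layer matter so much for searching, it helps to convert a per-character rate into a per-word rate, since a keyword search only matches when every character of the word was read correctly. The short Python sketch below does that arithmetic. It rests on two assumptions that are ours, not the lesson's: that character errors are independent of one another, and that the 74% figure is a character-level rate. The function name `word_accuracy` is purely illustrative.

```python
def word_accuracy(char_accuracy: float, word_length: int) -> float:
    """Chance that every character of a word is recognized correctly.

    Assumes character errors are independent, which is a simplification:
    real OCR errors tend to cluster on poor-quality regions of a scan.
    """
    if not 0.0 <= char_accuracy <= 1.0:
        raise ValueError("char_accuracy must be between 0 and 1")
    if word_length < 0:
        raise ValueError("word_length must not be negative")
    return char_accuracy ** word_length


# Compare the two character-level rates mentioned in the lesson
# for a short and a longer search word.
for rate in (0.90, 0.74):
    for length in (5, 8):
        chance = word_accuracy(rate, length)
        print(f"{rate:.0%} per character, {length}-letter word: "
              f"{chance:.0%} chance the word is fully correct")
```

Under these assumptions, a five-letter word survives a 90% accurate OCR pass only about 59% of the time, which is one way to picture why uncorrected OCR text produces unreliable search results as a collection grows.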


We are of the opinion that every organization will derive significant benefits from a document imaging system. The next step is choosing the best system for your organization. Should it be a:

  • Database system,

  • Full-text system, or a

  • Metadata system?
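
To make the difference between a database system and a metadata system more concrete, here is a toy Python model of what happens when a file is moved. It is only a sketch: it does not read or write real PDF files, and the names `Document`, `DatabaseIndex`, and `metadata_search`, the file paths, and the keywords are all hypothetical. The `keywords` field stands in for key search words embedded in the metadata layer, while `DatabaseIndex` stands in for an external database that links key search words to a file location.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Toy stand-in for a scanned PDF (not a real PDF parser)."""
    path: str
    # Models key search words embedded in the document's metadata layer.
    keywords: set[str] = field(default_factory=set)


class DatabaseIndex:
    """Models a database system: keywords are stored outside the file,
    linked to the location the file had when it was indexed."""

    def __init__(self) -> None:
        self._paths_by_keyword: dict[str, set[str]] = {}

    def add(self, path: str, keywords: set[str]) -> None:
        for keyword in keywords:
            self._paths_by_keyword.setdefault(keyword.lower(), set()).add(path)

    def search(self, keyword: str, documents: list[Document]) -> list[Document]:
        # A hit requires the stored link to still point at the file.
        linked = self._paths_by_keyword.get(keyword.lower(), set())
        return [doc for doc in documents if doc.path in linked]


def metadata_search(keyword: str, documents: list[Document]) -> list[Document]:
    """Models a metadata system: the keywords travel with each document."""
    wanted = keyword.lower()
    return [doc for doc in documents
            if wanted in {k.lower() for k in doc.keywords}]


# Hypothetical example document and keywords.
invoice = Document("scans/invoice_0042.pdf", {"invoice", "Acme"})
documents = [invoice]
database = DatabaseIndex()
database.add(invoice.path, invoice.keywords)

found_db_before = database.search("invoice", documents)

# Move the file to another folder without updating the database.
invoice.path = "archive/invoice_0042.pdf"

found_db_after = database.search("invoice", documents)
found_meta_after = metadata_search("invoice", documents)

print("database hits before move:", len(found_db_before))
print("database hits after move: ", len(found_db_after))
print("metadata hits after move: ", len(found_meta_after))
```

In this model the database lookup stops finding the document once the stored link is stale, while the metadata lookup still succeeds because the key search words moved with the file. A real database system can of course be kept in sync with file moves; the sketch only illustrates the maintenance burden that keeping those links current implies.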

Warmest regards,
Randy Van Ittersum &
Erin Spalding, CDIA+ Instructor

P.S. Check out our new healthcare product called LifeKey. LifeKey allows you to carry your medical records with you at all times on a USB Flash drive. It uses the first universal electronic medical record that can be viewed from virtually every computer in the world without the need for proprietary software or hardware. If you know someone with a serious medical condition, please share this information with them. It could save their life.

Copyright 2006 | Document Imaging Solutions, Inc.
