PDF File Format versus TIFF Image Format

By: Randy Van Ittersum & Erin Spalding, CDIA+ Instructor

Welcome to Day 4 of our 5-Day Mini-Course on document imaging technology. Today's lesson focuses on selecting the right document format: PDF or TIFF. Did you know that Adobe holds the copyright to the specifications for both of these document formats?

PDF versus TIFF as the Preferred Electronic Document Format

In the document imaging world, there are two dominant file formats to consider when storing scanned paper business documents as images: the Tagged Image File Format (TIFF) and the Portable Document Format (PDF).
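As a small illustration, the two formats can be told apart by the first few bytes of a file. The Python sketch below is only that, a sketch: the function names sniff_format and sniff_file are made up for this lesson, and it recognizes only classic TIFF (not BigTIFF) and PDF files whose header sits at the very start of the file.

```python
def sniff_format(header: bytes) -> str:
    """Classify a file from its leading bytes as 'tiff', 'pdf', or 'unknown'."""
    # Classic TIFF: a byte-order mark ("II" = little-endian, "MM" = big-endian)
    # followed by the number 42 written in that byte order.
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    # A well-formed PDF begins with a "%PDF-" version comment.
    if header[:5] == b"%PDF-":
        return "pdf"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file at 'path' to classify it."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(8))
```

Checking the leading bytes is more reliable than trusting a file extension, since a scanned image can be renamed without changing its contents.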

TIFF is an industry-standard file format and is "platform-independent" (e.g. Windows PC, Apple/Mac, UNIX, etc.). TIFF was originally developed by Aldus, with contributions from Microsoft, and the specification was owned by Aldus, which was acquired by Adobe Systems, Inc. in 1994. Consequently, Adobe Systems now holds the copyright for the TIFF specification and claims the trademark originally registered to Aldus. Although Adobe holds the copyrights to both the TIFF and PDF specifications, it has not released a new TIFF specification since version 6.0 in 1992, and is no longer investing in its development. Adobe's investment has gone into PDF instead.

TIFF is a raster image file format. It is very flexible and supports several color spaces (e.g. bitonal, grayscale, RGB, etc.) and compression schemes (e.g. raw uncompressed, CCITT Group 3 & 4, LZW, etc.). Because raster images can be easily altered in an image editing program (Windows Imaging, Photoshop, MS Paint, etc.), TIFF images need to be stored as read-only files on Write Once Read Many (WORM) media (e.g. Magneto-Optical, CD-R, DVD-R, etc.) immediately after creation and validation by authorized personnel to maintain legal compliance.

TIFF images can be stored on non-WORM storage media only if one of two conditions is met: (a) paper originals exist, or (b) micrographics (microfilm, microfiche, etc.) of the paper originals are maintained. TIFF images stored on magnetic media (e.g. DASD/hard drive, RAID, SAN, Network Attached Storage (NAS), DLT tape, etc.) without original paper or micrographic backups could spell legal woes if ever challenged in a court of law.
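On rewritable storage, one common way to at least notice that an image has been changed is to record a cryptographic hash of the file when it is validated and compare it later. The Python sketch below assumes SHA-256 and uses made-up function names (sha256_of, is_unaltered). It is an illustration that complements the safeguards described above, and it assumes the recorded digest itself is kept somewhere it cannot be edited.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large scans do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path: str, recorded_digest: str) -> bool:
    """True if the file still matches the digest recorded at validation time."""
    return sha256_of(path) == recorded_digest
```

A typical use would be to call sha256_of once when authorized personnel validate the scan, store the result with the document's index record, and call is_unaltered whenever the image is retrieved.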

The PDF file format was developed and released by Adobe Systems in 1993 and has become the de facto standard in electronic document distribution worldwide. It was designed so that a document looks the same on any computer or printer, whatever application created it. The U.S. federal government is among the largest users of the PDF file format. The federal courts are in the process of replacing their paper docket and case management system with a new system based on PDF technology. This new system accepts documents only in PDF.

The PDF file format can apply different compression methods to different kinds of content (text, line art, images) to create small, portable files. An innovative feature of PDF is linearization (Adobe calls it "Fast Web View"), in which a user can request a multi-page document across a network and receive the first requested page right away while the remaining pages download in the background, provided the server supports partial downloads. The PDF file format has several additional features including self-contained annotations, comments, security, document encryption and digital signatures, just to name a few. When a digitally signed PDF file is changed, the changes are saved as an addition to the file, and a viewer can show which version the signature covers and warn others that the document has changed since it was signed. In essence, you cannot alter a signed PDF document without leaving an electronic footprint, regardless of the storage media type. An unsigned PDF does not have this protection.
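The page-at-a-time delivery described above depends on the PDF having been saved in a linearized layout. The PDF specification requires a linearized file to carry a /Linearized entry within its first 1024 bytes, so a rough check is possible without any PDF library. The Python sketch below uses a made-up function name (looks_linearized) and is only a heuristic: it does not confirm that the rest of the file still follows the linearized layout, which later edits can break.

```python
def looks_linearized(head: bytes) -> bool:
    """Heuristic: True if the start of a PDF carries a /Linearized key.

    'head' should be the first 1024 bytes (or more) of the file.
    """
    return head.startswith(b"%PDF-") and b"/Linearized" in head[:1024]
```

To use it on a real file, open the file in binary mode, read the first 1024 bytes, and pass them in.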

Managing the storage and archival of scanned business documents with the PDF file format is a much easier process because the files can be maintained on any storage media. The PDF file format is both "platform-independent" and "storage media-independent". In other words, PDF documents can be viewed, downloaded, and printed, regardless of how they were created and without loss of document attributes.

When a PDF file is created from another electronic application, such as Microsoft Word, all of the original document's text and metadata are captured. The image, text and metadata are all integrated into one PDF document. In its simplest terms, metadata is hidden data about the document: date created, original author, date last modified, and so forth. Thus, PDF is more than just a picture of a document; PDF files retain all the content and electronic history of the original document.

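By contrast, a TIFF image is simply an image of a document, and its content cannot be searched directly. Because of this, the text needed to search a TIFF image must be produced by optical character recognition (OCR) and saved in a second, separate text file, which is in turn loaded into a database to enable searching. (A PDF made by scanning paper is also just an image until OCR adds a text layer, but that text layer is then stored inside the PDF file itself.)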
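The separate-text-plus-database arrangement described above can be sketched in a few lines. The Python example below uses the standard sqlite3 module; the table name page_text, its columns, and the function names are made up for illustration, and the text is assumed to have already been extracted from each image.

```python
import sqlite3


def build_index(pages):
    """Load (image_path, text) pairs into an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE page_text (image_path TEXT PRIMARY KEY, ocr_text TEXT)"
    )
    db.executemany("INSERT INTO page_text VALUES (?, ?)", pages)
    db.commit()
    return db


def search(db, term):
    """Return the image paths whose stored text contains 'term'.

    Uses a simple LIKE match, so '%' and '_' in 'term' act as wildcards.
    """
    rows = db.execute(
        "SELECT image_path FROM page_text WHERE ocr_text LIKE ? "
        "ORDER BY image_path",
        (f"%{term}%",),
    )
    return [path for (path,) in rows]
```

The point of the sketch is the dependency it exposes: the image and its searchable text live in different places, so the database record is the only thing tying them together.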

Another benefit of PDF is that each page is displayed as a separate page, but all of the pages are organized together within the same file. This is often not the case with TIFF images. The TIFF format can hold several pages in one file, but many scanning systems write each page of a document as a separate TIFF image; a ten-page paper document then ends up as ten individual TIFF images, tied together only by file names or a database record.
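Whether a particular TIFF file holds one page or several can be checked by walking its chain of image file directories (IFDs), the structure the TIFF 6.0 specification uses to describe each image in a file. The Python sketch below uses a made-up function name (count_tiff_pages), handles classic TIFF only (not BigTIFF), and counts every top-level IFD, so a thumbnail stored as its own IFD would be counted as a page.

```python
import struct


def count_tiff_pages(data: bytes) -> int:
    """Count the top-level image file directories (IFDs) in a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    # Bytes 2-3 hold the number 42; bytes 4-7 hold the first IFD's offset.
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    pages = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        # Each entry is 12 bytes; the 4-byte offset of the next IFD follows.
        (offset,) = struct.unpack_from(
            endian + "I", data, offset + 2 + 12 * entry_count
        )
        pages += 1
    return pages
```

A result greater than one means the file is a multi-page TIFF; a result of one for every file in a batch suggests the scanner wrote one image per page.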


In summary, the significant technology benefits of PDF over TIFF, coupled with the federal court system converting to a PDF-only system, make PDF the document format of choice.

