Document Imaging Solution's Software...The new standard in document imaging

Document Imaging
Solutions, Inc.


Ph: 888-855-7699


PDF Based
Document Imaging Software

















Document Imaging with browser search engine

Indexing Search Engine Spider Features
  • The Spider can index and search publicly available sites, secure content HTTPS sites, and password-accessible sites. The Spider also supports forms-based authentication.

  • A single search request can return fully-integrated search results, spanning local and remote content, including:


    • hit-highlighted display of Web-ready file types such as HTML, PDF and XML, including display of images, formatting and links.


    • conversion of other file types ("Office," Unicode, ZIP, etc.) to HTML for browser display with highlighted hits.


    • support for dynamically-generated content (ASP.NET, MS CMS, SharePoint, etc.) with highlighted hits.


  • The Spider can perform "vertical" searching of pages linked from a URL, as well as "horizontal" crawling of sites linked to a URL.


  • The Spider can limit indexed data by file size, file number, time on a Web site, etc.
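
The crawl limits described above can be sketched as a breadth-first crawl with a depth limit, a page-count limit, and a switch for following links to other sites. This is only an illustration, not the product's implementation: `crawl`, `get_links`, `max_depth`, `max_pages` and `follow_other_sites` are hypothetical names, and `get_links` stands in for whatever fetches a page and returns the URLs it links to.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, get_links, max_depth=2, max_pages=100, follow_other_sites=False):
    """Return the URLs visited, in breadth-first order.

    get_links(url) is a caller-supplied function (hypothetical) that returns
    the list of URLs linked from a page.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:  # limit on number of pages
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:  # limit on how far to follow links
            continue
        for link in get_links(url):
            if link in seen:
                continue
            # Stay on the starting site unless cross-site crawling is enabled.
            if not follow_other_sites and urlparse(link).netloc != start_host:
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return visited
```

Passing the link fetcher in as a function keeps the sketch testable without network access; limits on file size or time spent on a site would be further checks inside the same loop.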
How The Search Engine Works

Indexed vs. Unindexed Searching: Distributed Searching, Email Filtering, Security Classifications, Forensics

This software module can instantly search terabytes of text because it builds a search index that stores the location of words in documents.

Indexing is easy - simply select folders or entire drives to index and the software does the rest.
  • Once the software has built an index, it can automatically update it using the Windows Task Scheduler to reflect additions, deletions and modifications to your document collection.

  • Updating an index is even faster than building it, since the indexer checks each file and reindexes only files that have been added or changed.

  • The indexer automatically recognizes and supports all popular file formats, and never alters original files.
  • A single index can hold over a terabyte of text, and the software can create, and search with a single search request, an unlimited number of indexes.
  • Since you may sometimes want to search files that the software has not indexed, the software also supports unindexed and "combination" searching.
  • Searching and document display (like indexing) do not in any way affect original files.
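
As a rough illustration of an incremental update, the sketch below compares each file's modification time with the time recorded at the last indexing run and reports only the files that are new or changed. The function names and the `last_indexed` mapping are assumptions made for this example; the page does not say how the product detects changes.

```python
import os


def files_to_reindex(root, last_indexed):
    """Return paths under root that are new or modified.

    last_indexed maps path -> modification time recorded at the previous
    indexing run (a hypothetical bookkeeping structure).
    """
    changed = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if last_indexed.get(path) != os.path.getmtime(path):
                changed.append(path)
    return sorted(changed)


def deleted_files(last_indexed):
    """Return previously indexed paths that no longer exist on disk."""
    return sorted(path for path in last_indexed if not os.path.exists(path))
```

Both functions only read file metadata, so the original files are never altered.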


  • When the search engine does an indexed search, it searches directly on the index that it has built.


  • An unindexed search, in contrast, searches directly through the documents.


  • In either case, when the software displays a retrieved document, it refers to the original document, using information in the index to highlight hits.
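
The difference between the two search modes can be sketched with a small index that stores the location of every word. This is a simplified illustration, not the product's index format: `build_index`, `indexed_search`, `unindexed_search` and `highlight` are hypothetical names, and the "documents" are plain strings held in a dictionary.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to a list of (document name, character offset)."""
    index = defaultdict(list)
    for name, text in documents.items():
        for match in re.finditer(r"\w+", text):
            index[match.group().lower()].append((name, match.start()))
    return index


def indexed_search(index, word):
    """Look the word up in the index without touching the documents."""
    return index.get(word.lower(), [])


def unindexed_search(documents, word):
    """Scan every document directly for the word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [(name, m.start())
            for name, text in documents.items()
            for m in pattern.finditer(text)]


def highlight(text, offset, length):
    """Return a copy of the text with one hit wrapped in brackets."""
    return text[:offset] + "[" + text[offset:offset + length] + "]" + text[offset + length:]
```

Both searches return the same locations; the indexed version answers from a dictionary lookup, while the unindexed version has to read every document. Highlighting works on a copy, so the original text is left unchanged.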
The search engine can instantly search terabytes of text across a desktop, network, Internet or Intranet site.

The search engine also serves as a tool for publishing large document collections, with instant text searching, to Web sites or CD/DVDs.
  • over two dozen indexed, unindexed, fielded and full-text search options

  • highlights hits in HTML, XML and PDF, while displaying embedded links, formatting and images


  • converts other file types - word processor, database, spreadsheet, email and full-text of email attachments, ZIP, Unicode, etc. - to HTML for display with highlighted hits


  • Spider supports Web-based content (HTML, PDF, XML, etc.) as well as dynamically-generated content (ASP.NET, MS CMS, SharePoint, etc.)
Supported file formats
  • Adobe Acrobat (*.pdf)
  • Ami Pro (*.sam)
  • Ansi Text (*.txt)
  • ASCII Text (See note 3)
  • ASF media files (metadata only) (*.asf)
  • CSV (Comma-separated values) (*.csv)
  • DBF (*.dbf)
  • EBCDIC
  • EML files (emails saved by Outlook Express) (*.eml)
  • Enhanced Metafile Format (*.emf)
  • Eudora MBX message files (*.mbx)
  • GZIP (*.gz)
  • HTML (*.htm, *.html)
  • JPEG (*.jpg)
  • Lotus 1-2-3 (*.123, *.wk?)
  • MBOX email archives (including Thunderbird) (*.mbx)
  • MHT archives (HTML archives saved by Internet Explorer) (*.mht)
  • MIME messages
  • MSG files (emails saved by Outlook) (*.msg)
  • Microsoft Access MDB files (see note 1) (*.mdb)
  • Microsoft Document Imaging (*.mdi)
  • Microsoft Excel (*.xls)
  • Microsoft Excel 2003 XML (*.xml)
  • Microsoft Excel 2007 (*.xlsx)
  • Microsoft Outlook/Exchange (See note 2)
  • Microsoft Outlook Express 5 and 6 (*.dbx) message stores
  • Microsoft PowerPoint
  • Microsoft PowerPoint 2007 (*.pptx)
  • Microsoft Rich Text Format (*.rtf)
  • Microsoft Searchable Tiff (*.tiff)
  • Microsoft Word for DOS (*.doc)
  • Microsoft Word for Windows (*.doc)
  • Microsoft Word 2003 XML (*.xml)
  • Microsoft Word 2007 (*.docx)
  • Microsoft Works (*.wks)
  • MP3 (metadata only) (*.mp3)
  • Multimate Advantage II (*.dox)
  • Multimate version 4 (*.doc)
  • OpenOffice 2.x and 1.x documents, spreadsheets, and presentations (*.sxc, *.sxd, *.sxi, *.sxw, *.sxg, *.stc, *.sti, *.stw, *.stm, *.odt, *.ott, *.odg, *.otg, *.odp, *.otp, *.ods, *.ots, *.odf) (includes OASIS Open Document Format for Office Applications)
  • Quattro Pro (*.wb1, *.wb2, *.wb3, *.qpw)
  • TAR (*.tar)
  • TIFF (*.tif)
  • TNEF (winmail.dat files)
  • Treepad HJT files (*.hjt)
  • Unicode (UCS16, Mac or Windows byte order, or UTF-8)
  • Windows Metafile Format (*.wmf)
  • WMA media files (metadata only) (*.wma)
  • WMV video files (metadata only) (*.wmv)
  • WordPerfect 4.2 (See note 3) (*.wpd, *.wpf)
  • WordPerfect (5.0 and later) (*.wpd, *.wpf)
  • WordStar version 1, 2, 3 (See note 3) (*.ws)
  • WordStar versions 4, 5, 6 (*.ws)
  • WordStar 2000
  • Write (*.wri)
  • XBase (including FoxPro, dBase, and other XBase-compatible formats) (*.dbf)
  • XML (*.xml)
  • XML Paper Specification (*.xps) (version 7.40)
  • XSL
  • XyWrite (See note 3)
  • ZIP (*.zip)
Notes

[1] Databases. Using ODBC, the software can also index and display records in Access databases. Each record is treated as a separate document. XBase databases are indexed without using ODBC. For information on indexing SQL databases, click here.

[2] Outlook and Exchange. The software can index Outlook and Exchange message stores using MAPI.

[3] Older Word Processor Formats. The software can index and display, but cannot automatically recognize, documents in the following formats:
  • WordPerfect 4.2
  • WordStar versions before 4
  • XyWrite
  • ASCII Text
Other File Formats
  • This browser based search engine will index, search, and display other file formats, but they will be treated as binary file types. In other words, all binary codes, etc. will be displayed along with the text.


Image Formats

  • This browser based search engine can display images in the following formats:
    • BMP
    • EPSF
    • GIF
    • IMG
    • JPEG
    • PCX
    • PNG
    • TIFF
    • Targa
    • WMF
    • WPG (WPG version 1.0 only)
  • When viewing multipage images, use PgUp and PgDn to navigate between the pages. The image viewer also includes viewing options such as Zoom In, Zoom Out, Invert, Rotate, etc.
Search Types

  • All search options on this page work with indexed, unindexed and "combination" indexed/unindexed searching.
Basic Search Types
  • Phrase searching finds phrases like: due process of law.


  • Boolean operators like and/or/not can join words and phrases: due process of law and not (equal protection or civil rights).


  • Proximity searching finds a word or phrase within "n" words of another word or phrase: apple pie w/38 peach cobbler.


  • Directed Proximity searching finds a word or phrase "n" words before another word or phrase: apple pie pre/38 peach cobbler.


  • Phonic searching finds words that sound alike, like Smythe in a search for Smith.


  • Stemming finds variations on endings, like applies, applied, applying in a search for apply.


  • Numeric range searching finds any number between two numbers, such as between 6 and 36.


  • Macro capabilities make it easy to include frequently used items in a search request.


  • Wildcard support allows ? to hold a single letter place, and * to hold multiple letter places: apple* and not appl?sauce.
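
A minimal sketch of how proximity (w/n), directed proximity (pre/n) and wildcard matching might behave is shown below. It is an assumption-laden illustration rather than the product's query engine: the helper names are hypothetical, distance is measured between the starting words of the two phrases, and wildcards are delegated to Python's `fnmatch` module, which treats ? as one character and * as any run of characters.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_positions(tokens, phrase):
    """Return the word positions at which the phrase starts."""
    words = tokenize(phrase)
    n = len(words)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def within(tokens, left, right, n, directed=False):
    """True if `left` starts within n words of `right` (w/n).

    With directed=True, `left` must also come before `right` (pre/n).
    """
    for a in phrase_positions(tokens, left):
        for b in phrase_positions(tokens, right):
            distance = b - a
            if directed and distance <= 0:
                continue
            if abs(distance) <= n:
                return True
    return False


def wildcard_match(word, pattern):
    """Match one word against a pattern containing ? and * wildcards."""
    return fnmatchcase(word.lower(), pattern.lower())
```

For example, the request apple* and not appl?sauce corresponds to keeping the words for which `wildcard_match(word, "apple*")` is true and `wildcard_match(word, "appl?sauce")` is false.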
Fuzzy Searching
  • Fuzzy searching uses a proprietary algorithm to find search terms even if they are misspelled.


  • Search fuzziness adjusts from 0 to 10 so you can fine-tune fuzziness to the level of OCR or typographical errors in your files.


  • A search for alphabet with a fuzziness of 1 would find alphaqet; with a fuzziness of 3, it would find both alphaqet and alpkaqet.


  • Fuzziness is not built into the index, so you can vary fuzziness at the time of each search.
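
The product's fuzzy algorithm is proprietary, so the sketch below substitutes a well-known stand-in: Levenshtein edit distance, with the fuzziness setting treated as the maximum number of single-character edits allowed. That mapping is an assumption for illustration only, although it happens to agree with the alphabet example above. Because the comparison happens at search time, the fuzziness can differ on every call.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b)  # substitute or match
            ))
        previous = current
    return previous[-1]


def fuzzy_search(words, term, fuzziness):
    """Return the words within `fuzziness` edits of the search term."""
    return [word for word in words if edit_distance(word, term) <= fuzziness]
```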
Concept / Synonym / Thesaurus Searching
  • Concept searching lets you search for fast and also find synonyms such as quick and speedy.


  • The search engine offers variable levels of automatic synonym expansion based on a comprehensive semantic network of the English language.


  • You can also add your own thesaurus terms.
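
Synonym expansion can be pictured as a lookup that widens a search term into a set of related terms before matching. The sketch below is a toy illustration: the tiny `THESAURUS` dictionary stands in for the semantic network mentioned above, and `user_terms` represents thesaurus entries you add yourself. All names are hypothetical.

```python
# Hypothetical built-in thesaurus; the real semantic network is far larger.
THESAURUS = {"fast": {"quick", "speedy"}}


def expand_term(term, thesaurus, user_terms=None):
    """Return the term together with its built-in and user-defined synonyms."""
    synonyms = set(thesaurus.get(term, ()))
    if user_terms:
        synonyms |= set(user_terms.get(term, ()))
    return {term} | synonyms


def concept_search(documents, term, thesaurus, user_terms=None):
    """Return the names of documents containing the term or any synonym."""
    wanted = expand_term(term.lower(), thesaurus, user_terms)
    return sorted(name for name, text in documents.items()
                  if wanted & set(text.lower().split()))
```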
Combining Search Types
  • Nearly all search types are combinable.


  • You can make your search request as complex as you want.
Search Features - Relevancy-Ranking
  • The search engine can sort and instantly re-sort search results by relevancy, number of hits, file name, file date, etc.


  • Natural language algorithms provide automatic term weighting, following a "plain English" or unstructured indexed search request.


    • Automatic term weighting is based on the frequency and density of hits in your files.


    • For example, in the search request get me Sam's memo on the 1999 CorpX takeover, if 1999 appeared in 3,000 files, and Sam appeared in only two files, then Sam would get a much higher relevancy rating, taking you straight to the most "relevant" files.


  • A positional scoring option works with the search engine's natural language relevancy ranking to rank documents more highly when hits are near the top of a file, or otherwise clustered in a file.


  • It also includes variable term weighting options for both indexed and unindexed searches:


    • Positive term weighting can place extra emphasis on one or more words: soup:8 or recipe:3


    • Negative term weighting can assign negative emphasis to one or more words: red or green or yellow:-7


  • Variable term weighting can also apply to fields: (description:5 contains (apple and pear)) or (author:2 contains smith)
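
The two kinds of weighting above can be sketched as follows. For automatic weighting, the example uses inverse document frequency, a standard technique in which rarer terms receive higher weights; the page does not state the product's actual formula, so this is a stand-in. For variable weighting, it parses term:weight pairs joined by or and assumes a default weight of 1 for unweighted terms. All function names are hypothetical.

```python
import math


def idf_weight(total_files, files_with_term):
    """Inverse document frequency: the rarer the term, the higher the weight."""
    return math.log(total_files / files_with_term)


def parse_weighted_terms(request):
    """Parse a request such as 'soup:8 or recipe:3' into {term: weight}."""
    weights = {}
    for token in request.split():
        if token.lower() == "or":
            continue
        term, _, weight = token.partition(":")
        # Assumption: a term without an explicit weight counts as 1.
        weights[term.lower()] = int(weight) if weight else 1
    return weights


def score(text, weights):
    """Sum each term's weight multiplied by its number of occurrences."""
    words = text.lower().split()
    return sum(weight * words.count(term) for term, weight in weights.items())
```

With any fixed collection size, a term found in only two files gets a much higher `idf_weight` than one found in 3,000 files, which mirrors the Sam and 1999 example; a negative weight such as yellow:-7 pulls a document's score down.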




Copyright© 1998-2010 Document Imaging Solutions, Inc. All Rights Reserved.