Coding and OCR
Document coding and OCR (optical character recognition) are two services that enhance the effectiveness and searchability of a litigation database. OCR text is useful for keyword searches and for quickly identifying content within documents, while coding provides greater flexibility in identifying and organizing documents through targeted queries on field data such as document type and date range. Our services include:

OCR

For many forms of electronic evidence, such as email, searchable text can be automatically extracted and used to help search a database for useful documents. For scanned images, searchable text does not exist until the images are OCR’d. The resultant text files are essential to creating a database that can be effectively searched by keywords and phrases.

Objective Coding

Inventory Indexing: An inventory index is a less detailed form of coding, useful for cataloging the contents of a collection by one or two simple criteria, such as Bates range and file title.

Bibliographic Coding: Most commonly requested for scanned paper documents, for which metadata does not exist. The standard fields are usually a variation of the following: Begdoc, Enddoc, BegAttach, EndAttach, BegBates, EndBates, Docdate, Doctitle, Doctype, Author, Recipient, CC, Characteristics/Properties. Enhanced bibliographic options include issues or names mentioned in the text, as well as any number of custom fields.

Subjective Coding

Subjective coding can provide a summary of the document and an analysis of the issues. Such coding is flexible and specific to individual case requirements, and is ideal for Issue Coding, Responsive Categories or Document Summaries.

Document Unitization

Logical Document Determination (LDD): When documents are scanned, physical boundaries such as staples and paper clips are used to establish document boundaries for the images. These boundaries are often not ideal for the purposes of coding, since a single staple section may be composed of several logical documents: for example, a piece of correspondence, one or more invoices, a copy of an email and a handwritten note. If this staple section were coded as a single document, the data fields would reflect only the information from the first logical document, in this instance the piece of correspondence. The ability to query the other discrete logical documents would be compromised.

Logical Document Determination is the process by which, prior to coding, images are examined page by page to determine where document boundaries should be placed, in order to maximize the effectiveness of the coding results. In the example above, if the email buried in the middle of the staple section is relevant to a search of the Author field, it will be returned in the search results only if the documents have been logically unitized. It is generally recommended to have scanned images logically unitized prior to coding.
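The effect of logical unitization on a field search can be illustrated with a minimal sketch. Everything in it is hypothetical and for illustration only: the `CodedDocument` record, the `search_by_author` helper, the Bates numbers and the names are assumptions, not part of any actual coding database or load-file format. It uses a handful of the bibliographic fields listed above and compares an Author search against the same five pages coded once as a single staple section and once after LDD.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CodedDocument:
    """One coded record (hypothetical), using a few of the bibliographic fields."""
    begdoc: str                     # first page of the document
    enddoc: str                     # last page of the document
    doctype: str
    author: str
    docdate: Optional[date] = None  # left empty when the document is undated


def search_by_author(records: List[CodedDocument], author: str) -> List[CodedDocument]:
    """Return the records whose Author field matches, ignoring case."""
    wanted = author.casefold()
    return [r for r in records if r.author.casefold() == wanted]


# Five scanned pages coded as ONE document, following the staple boundary.
# Only the first logical document (the correspondence) is reflected in the fields.
physical = [
    CodedDocument("ABC0001", "ABC0005", "Correspondence", "J. Smith", date(2006, 3, 1)),
]

# The same five pages after logical unitization: four separately coded documents.
logical = [
    CodedDocument("ABC0001", "ABC0001", "Correspondence", "J. Smith", date(2006, 3, 1)),
    CodedDocument("ABC0002", "ABC0003", "Invoice", "Acme Supply", date(2006, 2, 15)),
    CodedDocument("ABC0004", "ABC0004", "Email", "R. Jones", date(2006, 2, 20)),
    CodedDocument("ABC0005", "ABC0005", "Handwritten Note", "Unknown"),
]

physical_hits = search_by_author(physical, "R. Jones")
logical_hits = search_by_author(logical, "R. Jones")

print(len(physical_hits))                             # 0
print([(r.begdoc, r.doctype) for r in logical_hits])  # [('ABC0004', 'Email')]
```

In this sketch the email's author never appears in the single-document record, so the search misses it; once each logical document carries its own fields, the same query returns the email.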
© 2007 eDiscovery Solutions, Inc. All rights reserved. Website design by moonrise design.