Optical character recognition and document image analysis have become very important areas with a fast-growing number of researchers in the field. This comprehensive handbook, with contributions by eminent experts, presents both the theoretical and practical aspects at an introductory level wherever possible.
The Handbook of Document Image Processing and Recognition is a comprehensive resource on the latest methods and techniques in document image processing and recognition. Each chapter provides a clear overview of the topic, followed by the state of the art in the techniques used, including elements of comparison between them, along with supporting references to archival publications for those interested in delving deeper into the topics addressed. Rather than favor a particular approach, the text enables readers to make an informed decision for their specific problems.
The objective of Document Analysis and Recognition (DAR) is to recognize the text and graphical components of a document and to extract information. This book is a collection of research papers and state-of-the-art reviews by leading researchers all over the world. It includes pointers to challenges and opportunities for future research directions. The main goal of the book is to identify good practices for the use of learning strategies in DAR.
In the field of document recognition and understanding, scanned paper documents were previously the only recognition target. Various new media such as camera-captured documents, videos, and natural scene images have recently started to attract attention because of the growth of the Internet/WWW and the rapid adoption of low-priced digital cameras and video devices. The keys to a breakthrough include character detection from complex backgrounds, discrimination of characters from non-characters, recognition of unique modern or ancient fonts, fast retrieval techniques for large-scale collections of scanned documents, multi-lingual OCR, and unconstrained handwriting recognition. This book aims to present recent advances, applications, and new ideas that are relevant to document recognition and understanding, from technical topics such as image processing, feature extraction, and classification to new applications like camera-based recognition and character-based natural scene analysis. The goal of this book is to highlight new trends and to serve as a reference source for academic researchers and for professionals working in the document recognition and understanding field.
This is the first comprehensive text on Optical Character Recognition for Indic scripts. It covers many topics and describes OCR systems for eight different scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil and Urdu.
Interest in the automatic processing and analysis of document images has been rapidly increasing during the past few years. This book addresses the different subfields of document image analysis, including preprocessing and segmentation, form processing, handwriting recognition, line drawing and map processing, and contextual processing.
The book focuses on one of the key issues in document image processing: graphical symbol recognition, which is a sub-field of the larger research domain of pattern recognition. It covers several approaches (statistical, structural and syntactic) and discusses their merits and demerits in context. Through comprehensive experiments, it also explores whether these approaches can be combined. The book presents research problems, state-of-the-art methods that convey basic steps as well as prominent techniques, evaluation metrics and protocols, and the research standpoints and directions associated with the field. It is not limited to the recognition of straightforward isolated graphics (visual patterns); it also addresses the recognition of complex and composite graphical symbols, which is motivated by real-world industrial problems.
Recently, there has been an increased interest in the research and development of techniques for components of complete document analysis systems. In recognition of this trend, a series of workshops on Document Analysis Systems commenced in 1994, under the leadership of Henry Baird. The first workshop, held in Kaiserslautern, Germany, in October 1994, was chaired by Andreas Dengel and Larry Spitz. The second workshop on Document Analysis Systems was held in Malvern, PA, USA, in October 1996, chaired by Jonathan J. Hull and Suzanne Liebowitz Taylor. The DAS workshop has been one of the most prestigious technical meetings, bringing together a large number of scientists and engineers from all over the world to express their innovative ideas and report on their latest achievements in the area of document analysis systems. The papers in this special book edition were rigorously selected from the Third IAPR Workshop on Document Analysis Systems (DAS’98), held in Nagano, Japan, on 4-6 November 1998. The papers were chosen for their original and substantial contributions to the workshop theme and to this special book edition. From among the 53 papers presented at DAS’98 by authors from 11 countries, we carefully selected 29 for this special book edition after critical reviews by at least three experts. Most of the contributions in this edition have been expanded or extensively revised to include helpful discussions, suggestions, or comments made during the workshop.
This book brings all the major and frontier topics in the field of document analysis together into a single volume, creating a unique reference source that will be invaluable to a large audience of researchers, lecturers and students working in this field. With chapters written by some of the most distinguished researchers active in this field, this book addresses recent advances in digital document processing research and development.