This book constitutes the thoroughly refereed post-workshop-proceedings of the 4th International Workshop on Camera-Based Document Analysis and Recognition, CBDAR 2011, held in Beijing, China, in September 2011. The 13 revised full papers presented were carefully selected during a second round of reviewing and improvement from numerous original submissions. Intended to give a snapshot of the state-of-the-art research in the field of camera based document analysis and recognition, the papers are organized in topical sections on text detection and recognition in scene images, camera-based systems, and datasets and evaluation.
The compendium presents the latest results of the most prominent competitions held in the field of Document Analysis and Text Recognition. It includes a description of the participating systems and the underlying methods on one hand and the datasets used together with evaluation metrics on the other hand. This volume also demonstrates with examples, how to organize a competition and how to make it successful. It will be an indispensable handbook to the document image analysis community.
This volume contains the papers of the 9th International Conference on Computing and Information Technology (IC2IT 2013) held at King Mongkut's University of Technology North Bangkok (KMUTNB), Bangkok, Thailand, on May 9th-10th, 2013. Traditionally, the conference is organized in conjunction with the National Conference on Computing and Information Technology, one of the leading Thai national events in the area of Computer Science and Engineering. The conference as well as this volume is structured into 3 main tracks on Data Networks/Communication, Data Mining/Machine Learning, and Human Interfaces/Image processing.
Document layout analysis (DLA) is a crucial step towards the development of an effective document image processing system. In the early days of document image processing, DLA was not considered as a complete and complex research problem, rather just a pre-processing step having some minor challenges. The main reason for that is the type of layout being considered for processing was simple. Researchers started paying attention to this complex problem as they come across a large variety of documents. This book presents a clear view of the past, present, and future of DLA, and it also discusses two recent methods developed to address the said problem.
This book constitutes the refereed proceedings of the 15th IAPR International Workshop on Document Analysis Systems, DAS 2022, held in La Rochelle, France, in May 2022. The full papers presented were carefully reviewed and selected from numerous submissions addressing key techniques of document analysis.
This is the first comprehensive text on Optical Character Recognition for Indic scripts. It covers many topics and describes OCR systems for eight different scripts—Bangla, Devanagari, Gurmukhi, Gujarti, Kannada, Malayalam, Tamil and Urdu.
The 8-volume set, comprising the LNCS books 13801 until 13809, constitutes the refereed proceedings of 38 out of the 60 workshops held at the 17th European Conference on Computer Vision, ECCV 2022. The conference took place in Tel Aviv, Israel, during October 23-27, 2022; the workshops were held hybrid or online. The 367 full papers included in this volume set were carefully reviewed and selected for inclusion in the ECCV 2022 workshop proceedings. They were organized in individual parts as follows: Part I: W01 - AI for Space; W02 - Vision for Art; W03 - Adversarial Robustness in the Real World; W04 - Autonomous Vehicle Vision Part II: W05 - Learning With Limited and Imperfect Data; W06 - Advances in Image Manipulation; Part III: W07 - Medical Computer Vision; W08 - Computer Vision for Metaverse; W09 - Self-Supervised Learning: What Is Next?; Part IV: W10 - Self-Supervised Learning for Next-Generation Industry-Level Autonomous Driving; W11 - ISIC Skin Image Analysis; W12 - Cross-Modal Human-Robot Interaction; W13 - Text in Everything; W14 - BioImage Computing; W15 - Visual Object-Oriented Learning Meets Interaction: Discovery, Representations, and Applications; W16 - AI for Creative Video Editing and Understanding; W17 - Visual Inductive Priors for Data-Efficient Deep Learning; W18 - Mobile Intelligent Photography and Imaging; Part V: W19 - People Analysis: From Face, Body and Fashion to 3D Virtual Avatars; W20 - Safe Artificial Intelligence for Automated Driving; W21 - Real-World Surveillance: Applications and Challenges; W22 - Affective Behavior Analysis In-the-Wild; Part VI: W23 - Visual Perception for Navigation in Human Environments: The JackRabbot Human Body Pose Dataset and Benchmark; W24 - Distributed Smart Cameras; W25 - Causality in Vision; W26 - In-Vehicle Sensing and Monitorization; W27 - Assistive Computer Vision and Robotics; W28 - Computational Aspects of Deep Learning; Part VII: W29 - Computer Vision for Civil and Infrastructure Engineering; W30 - AI-Enabled Medical Image Analysis: Digital Pathology and Radiology/COVID19; W31 - Compositional and Multimodal Perception; Part VIII: W32 - Uncertainty Quantification for Computer Vision; W33 - Recovering 6D Object Pose; W34 - Drawings and Abstract Imagery: Representation and Analysis; W35 - Sign Language Understanding; W36 - A Challenge for Out-of-Distribution Generalization in Computer Vision; W37 - Vision With Biased or Scarce Data; W38 - Visual Object Tracking Challenge.
This book presents an interactive multimodal approach for efficient transcription of handwritten text images. This approach, rather than full automation, assists the expert in the recognition and transcription process.Until now, handwritten text recognition (HTR) systems are far from being perfect and heavy human intervention is often required to check and correct the results of such systems. The interactive scenario studied in this book combines the efficiency of automatic handwriting recognition systems with the accuracy of the experts, leading to a cost-effective perfect transcription of the handwritten text images.The interactive system here allows the user to repeatedly interact with the system. Hence, the quality and ergonomy of the interactive process is crucial for the success of the system. Moreover, more ergonomic multimodal interfaces are used to obtain an easier and more comfortable human-machine interaction.