Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex, unstructured scenes. This work contributes to active vision for humanoid robots: a semantic model of the scene is created and then extended by successively changing the robot's viewpoint in order to explore the interaction possibilities the scene offers.
Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections (for example, the KITTI benchmark, which combines stereo and laser data) from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes, and satellites, will find this book very useful. The book: contains state-of-the-art developments in multi-modal computing; focuses on algorithms and applications; and presents novel deep learning topics on multi-sensor fusion and multi-modal deep learning.
This book constitutes the refereed proceedings of the Third International Conference on Dynamic Data Driven Application Systems, DDDAS 2020, held in Boston, MA, USA, in October 2020. The 21 full papers and 14 short papers presented in this volume were carefully reviewed and selected from 40 submissions. They cover topics such as: digital twins; environment cognizant adaptive-planning systems; energy systems; materials systems; physics-based systems analysis; imaging methods and systems; and learning systems.
In this paper, an active vision system based on an image-based control strategy is developed. The image-based control structure uses optical flow to detect the motion of an object in the visual scene. Because optical flow is very sensitive to changes in illumination and to video quality, median filtering and the morphological operations of erosion and dilation are used to reduce erroneous blobs in individual frames. Since the image coordinates of the object are subject to noise, Kalman filtering is adopted for robust estimation. A fuzzy controller based on the fuzzy condensed algorithm allows real-time operation on each captured frame. Finally, the proposed active vision system has been simulated in the Matlab/Simulink development and simulation environment.
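The abstract only names the building blocks of this pipeline, so the following is a minimal sketch of how such a motion-detection and tracking stage might be assembled with OpenCV in Python: dense optical flow for motion detection, median filtering with erosion and dilation to suppress spurious blobs, and a Kalman filter over the object's noisy image coordinates. The video file name, thresholds, and noise covariances are illustrative assumptions, and the fuzzy controller stage is omitted; this is not the authors' implementation.

```python
# Sketch: optical-flow motion detection + blob clean-up + Kalman smoothing.
import cv2
import numpy as np

# Constant-velocity Kalman filter: state (x, y, vx, vy), measurement (x, y).
kalman = cv2.KalmanFilter(4, 2)
kalman.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], dtype=np.float32)
kalman.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], dtype=np.float32)
kalman.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
kalman.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

cap = cv2.VideoCapture("scene.avi")   # assumed input video
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5), np.uint8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Dense optical flow (Farneback) between consecutive frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)

    # Threshold the motion magnitude, then suppress illumination/noise
    # artifacts with median filtering followed by erosion and dilation.
    mask = (magnitude > 1.0).astype(np.uint8) * 255
    mask = cv2.medianBlur(mask, 5)
    mask = cv2.erode(mask, kernel)
    mask = cv2.dilate(mask, kernel)

    # Centroid of the largest remaining blob is the (noisy) measurement.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    kalman.predict()
    if contours:
        (cx, cy), _ = cv2.minEnclosingCircle(max(contours, key=cv2.contourArea))
        estimate = kalman.correct(np.array([[cx], [cy]], dtype=np.float32))
        # Filtered object position that would be passed to the controller.
        position = (float(estimate[0, 0]), float(estimate[1, 0]))

    prev_gray = gray
```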
Computer Vision has now reached a level of maturity that allows us not only to perform research on individual methods but also to build fully integrated computer vision systems of a significant complexity. This opens up a number of new problems related to architectures, systems integration, validation of systems using benchmarking techniques, and so on. So far, the majority of vision conferences have focused on component technologies, which has motivated the organization of the First International Conference on Computer Vision Systems (ICVS). It is our hope that the conference will allow us not only to see a number of interesting new vision techniques and systems but hopefully also to define the research issues that need to be addressed to pave the way for more wide-scale use of computer vision in a diverse set of real-world applications. ICVS is organized as a single-track conference consisting of high-quality, previously unpublished, contributed papers on new and original research on computer vision systems. All contributions will be presented orally. A total of 65 papers were submitted for consideration by the conference. All papers were reviewed by three reviewers from the program committee. Thirty-two of the papers were selected for presentation. ICVS’99 is being held at the Alfredo Kraus Auditorium and Convention Centre, in Las Palmas, on the lovely Canary Islands, Spain. The setting is spring-like, which seems only appropriate as the basis for a new conference.
This volume is the Proceedings of the First International Conference on Advanced Multimedia Content Processing (AMCP ’98). With the remarkable advances made in computer and communication hardware/software system technologies, we can now easily obtain large volumes of multimedia data through advanced computer networks and store and handle them in our own personal hardware. Sophisticated and integrated multimedia content processing technologies, which are essential to building a highly advanced information-based society, are attracting ever increasing attention in various service areas, including broadcasting, publishing, medical treatment, entertainment, and communications. The prime concerns of these technologies are how to acquire multimedia content data from the real world, how to automatically organize and store these obtained data in databases for sharing and reuse, and how to generate and create new, attractive multimedia content using the stored data. This conference brings together researchers and practitioners from academia, industry, and public agencies to present and discuss recent advances in the acquisition, management, retrieval, creation, and utilization of large amounts of multimedia content. Artistic and innovative applications through the active use of multimedia content are also subjects of interest. The conference aims at covering the following particular areas: (1) Dynamic multimedia data modeling and intelligent structuring of content based on active, bottom-up, and self-organized strategies. (2) Access architecture, querying facilities, and distribution mechanisms for multimedia content.
Part of the Machine Perception and Artificial Intelligence series, this book covers subjects including the Harvard binocular head; heads, eyes, and head-eye systems; a binocular robot head with torsional eye movements; and escape and dodging behaviours for reactive control.
This book offers a comprehensive introduction to seven commonly used image understanding techniques in modern information technology. Readers of various levels can find suitable techniques to solve their practical problems and discover the latest developments in these specific domains. The techniques covered include camera models and calibration, stereo vision, generalized matching, scene analysis and semantic interpretation, multi-sensor image information fusion, content-based visual information retrieval, and understanding spatial-temporal behavior. The book progresses from an overview of essential concepts and basic principles to a detailed introduction and explanation of current methods and their practical techniques. It also discusses research trends and the latest results alongside new developments in technical methods. This is an excellent read for those who do not have a subject background in image technology but need to use these techniques to complete specific tasks. This essential information will also be useful for their further study in the relevant fields.