Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex unstructured scenes. This work contributes to active vision for humanoid robots. A semantic model of the scene is created, which is extended by successively changing the robot's view in order to explore interaction possibilities of the scene.
Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections, such as the KITTI benchmark (stereo + laser), from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes, and satellites, will find this book to be very useful. - Contains state-of-the-art developments on multi-modal computing - Focuses on algorithms and applications - Presents novel deep learning topics on multi-sensor fusion and multi-modal deep learning
This book constitutes the refereed proceedings of the Third International Conference on Dynamic Data Driven Application Systems, DDDAS 2020, held in Boston, MA, USA, in October 2020. The 21 full papers and 14 short papers presented in this volume were carefully reviewed and selected from 40 submissions. They cover topics such as: digital twins; environment cognizant adaptive-planning systems; energy systems; materials systems; physics-based systems analysis; imaging methods and systems; and learning systems.
In this paper, an active vision system based on an image-based control strategy is developed. The control structure uses an optical flow algorithm to detect the motion of an object in a visual scene. Because optical flow is very sensitive to changes in illumination and to video quality, median filtering and the morphological operations of erosion and dilation were applied to reduce erroneous blobs in individual frames. Since the image coordinates of the object are subject to noise, the Kalman filtering technique is adopted for robust estimation. A fuzzy controller based on the fuzzy condensed algorithm allows real-time processing of each captured frame. Finally, the proposed active vision system has been simulated in the Matlab/Simulink environment.
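The Kalman filtering step described in this abstract can be sketched as follows. This is a minimal, illustrative constant-velocity filter smoothing one noisy image coordinate of a tracked object; the class name, noise parameters, and sample measurements are assumptions for demonstration, not values taken from the paper.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one noisy image coordinate."""

    def __init__(self, q=1e-3, r=1.0):
        self.x = [0.0, 0.0]                 # state: [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q = q                          # process noise (assumed)
        self.r = r                          # measurement noise (assumed)

    def step(self, z, dt=1.0):
        # Predict: x = F x, P = F P F^T + Q, with F = [[1, dt], [0, 1]]
        px = self.x[0] + dt * self.x[1]
        pv = self.x[1]
        P = self.P
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        # Update with measurement z, observation model H = [1, 0]
        s = p00 + self.r                    # innovation covariance
        k0, k1 = p00 / s, p10 / s           # Kalman gain
        y = z - px                          # innovation (measurement residual)
        self.x = [px + k0 * y, pv + k1 * y]
        # Covariance update: P = (I - K H) P_pred
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.x[0]

# Feed noisy x-coordinates of a blob centroid moving at roughly 2 px/frame,
# as might come out of the optical-flow / morphology stage.
kf = Kalman1D()
measurements = [0.8, 2.3, 3.9, 6.2, 8.1, 9.8, 12.2, 14.1]
estimates = [kf.step(z) for z in measurements]
```

In a full pipeline, the two filter outputs (one per image axis) would feed the fuzzy controller as the smoothed object position; the velocity component of the state is what lets the filter track a moving blob without lagging far behind.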
Computer Vision has now reached a level of maturity that allows us not only to perform research on individual methods but also to build fully integrated computer vision systems of a significant complexity. This opens up a number of new problems related to architectures, systems integration, validation of systems using benchmarking techniques, and so on. So far, the majority of vision conferences have focused on component technologies, which has motivated the organization of the First International Conference on Computer Vision Systems (ICVS). It is our hope that the conference will allow us not only to see a number of interesting new vision techniques and systems but hopefully also to define the research issues that need to be addressed to pave the way for more wide-scale use of computer vision in a diverse set of real-world applications. ICVS is organized as a single-track conference consisting of high-quality, previously unpublished, contributed papers on new and original research on computer vision systems. All contributions will be presented orally. A total of 65 papers were submitted for consideration by the conference. All papers were reviewed by three reviewers from the program committee. Thirty-two of the papers were selected for presentation. ICVS’99 is being held at the Alfredo Kraus Auditorium and Convention Centre, in Las Palmas, on the lovely Canary Islands, Spain. The setting is spring-like, which seems only appropriate as the basis for a new conference.
This volume is the Proceedings of the First International Conference on Advanced Multimedia Content Processing (AMCP ’98). With the remarkable advances made in computer and communication hardware/software system technologies, we can now easily obtain large volumes of multimedia data through advanced computer networks and store and handle them in our own personal hardware. Sophisticated and integrated multimedia content processing technologies, which are essential to building a highly advanced information-based society, are attracting ever increasing attention in various service areas, including broadcasting, publishing, medical treatment, entertainment, and communications. The prime concerns of these technologies are how to acquire multimedia content data from the real world, how to automatically organize and store these obtained data in databases for sharing and reuse, and how to generate and create new, attractive multimedia content using the stored data. This conference brings together researchers and practitioners from academia, industry, and public agencies to present and discuss recent advances in the acquisition, management, retrieval, creation, and utilization of large amounts of multimedia content. Artistic and innovative applications through the active use of multimedia content are also subjects of interest. The conference aims at covering the following particular areas: (1) Dynamic multimedia data modeling and intelligent structuring of content based on active, bottom-up, and self-organized strategies. (2) Access architecture, querying facilities, and distribution mechanisms for multimedia content.
Cutting-edge research on the visual cognition of scenes, covering issues that include spatial vision, context, emotion, attention, memory, and neural mechanisms underlying scene representation. For many years, researchers have studied visual recognition with objects—single, clean, clear, and isolated objects, presented to subjects at the center of the screen. In our real environment, however, objects do not appear so neatly. Our visual world is a stimulating scenery mess; fragments, colors, occlusions, motions, eye movements, context, and distraction all affect perception. In this volume, pioneering researchers address the visual cognition of scenes from neuroimaging, psychology, modeling, electrophysiology, and computer vision perspectives. Building on past research—and accepting the challenge of applying what we have learned from the study of object recognition to the visual cognition of scenes—these leading scholars consider issues of spatial vision, context, rapid perception, emotion, attention, memory, and the neural mechanisms underlying scene representation. Taken together, their contributions offer a snapshot of our current knowledge of how we understand scenes and the visual world around us. Contributors Elissa M. Aminoff, Moshe Bar, Margaret Bradley, Daniel I. Brooks, Marvin M. Chun, Ritendra Datta, Russell A. Epstein, Michèle Fabre-Thorpe, Elena Fedorovskaya, Jack L. Gallant, Helene Intraub, Dhiraj Joshi, Kestutis Kveraga, Peter J. Lang, Jia Li, Xin Lu, Jiebo Luo, Quang-Tuan Luong, George L. Malcolm, Shahin Nasr, Soojin Park, Mary C. Potter, Reza Rajimehr, Dean Sabatinelli, Philippe G. Schyns, David L. Sheinberg, Heida Maria Sigurdardottir, Dustin Stansbury, Simon Thorpe, Roger Tootell, James Z. Wang
This title focuses on vision as an active process, rather than a passive activity and provides an integrated account of seeing and looking. The authors give a thorough description of basic details of the visual and oculomotor systems necessary to understand active vision.