This book deals with the problem of detecting and localizing multiple simultaneously active wideband acoustic sources by applying the notion of wavefield decomposition using circular and spherical microphone arrays. A rigorous derivation of modal array signal processing algorithms for unambiguous source detection and localization, as well as performance evaluations by means of measurements using an actual real-time capable implementation, are discussed.
This book deals with the problem of detecting and localizing multiple simultaneously active wideband acoustic sources by applying the notion of wavefield decomposition using circular and spherical microphone arrays. A rigorous derivation of modal array signal processing algorithms for unambiguous source detection and localization, as well as performance evaluations by means of measurements using an actual real-time capable implementation, are discussed.
Presents a unified framework of far-field and near-field array techniques for noise source identification and sound field visualization, from theory to application. Acoustic Array Systems: Theory, Implementation, and Application provides an overview of microphone array technology with applications in noise source identification and sound field visualization. In the comprehensive treatment of microphone arrays, the topics covered include an introduction to the theory, far-field and near-field array signal processing algorithms, practical implementations, and common applications: vehicles, computing and communications equipment, compressors, fans, and household appliances, and hands-free speech. The author concludes with other emerging techniques and innovative algorithms. Encompasses theoretical background, implementation considerations and application know-how Shows how to tackle broader problems in signal processing, control, and transudcers Covers both farfield and nearfield techniques in a balanced way Introduces innovative algorithms including equivalent source imaging (NESI) and high-resolution nearfield arrays Selected code examples available for download for readers to practice on their own Presentation slides available for instructor use A valuable resource for Postgraduates and researchers in acoustics, noise control engineering, audio engineering, and signal processing.
In consideration of the remarkable intensity of research in the field of Virtual Acoustics, including different areas such as sound field analysis and synthesis, spatial audio technologies, and room acoustical modeling and auralization, it seemed about time to organize a second international symposium following the model of the first EAA Auralization Symposium initiated in 2009 by the acoustics group of the former Helsinki University of Technology (now Aalto University). Additionally, research communities which are focused on different approaches to sound field synthesis such as Ambisonics or Wave Field Synthesis have, in the meantime, moved closer together by using increasingly consistent theoretical frameworks. Finally, the quality of virtual acoustic environments is often considered as a result of all processing stages mentioned above, increasing the need for discussions on consistent strategies for evaluation. Thus, it seemed appropriate to integrate two of the most relevant communities, i.e. to combine the 2nd International Auralization Symposium with the 5th International Symposium on Ambisonics and Spherical Acoustics. The Symposia on Ambisonics, initiated in 2009 by the Institute of Electronic Music and Acoustics of the University of Music and Performing Arts in Graz, were traditionally dedicated to problems of spherical sound field analysis and re-synthesis, strategies for the exchange of ambisonics-encoded audio material, and – more than other conferences in this area – the artistic application of spatial audio systems. This publication contains the official conference proceedings. It includes 29 manuscripts which have passed a 3-stage peer-review with a board of about 70 international reviewers involved in the process. Each contribution has already been published individually with a unique DOI on the DepositOnce digital repository of TU Berlin. Some conference contributions have been recommended for resubmission to Acta Acustica united with Acustica, to possibly appear in a Special Issue on Virtual Acoustics in late 2014. These are not published in this collection.
Learn the technology behind hearing aids, Siri, and Echo Audio source separation and speech enhancement aim to extract one or more source signals of interest from an audio recording involving several sound sources. These technologies are among the most studied in audio signal processing today and bear a critical role in the success of hearing aids, hands-free phones, voice command and other noise-robust audio analysis systems, and music post-production software. Research on this topic has followed three convergent paths, starting with sensor array processing, computational auditory scene analysis, and machine learning based approaches such as independent component analysis, respectively. This book is the first one to provide a comprehensive overview by presenting the common foundations and the differences between these techniques in a unified setting. Key features: Consolidated perspective on audio source separation and speech enhancement. Both historical perspective and latest advances in the field, e.g. deep neural networks. Diverse disciplines: array processing, machine learning, and statistical signal processing. Covers the most important techniques for both single-channel and multichannel processing. This book provides both introductory and advanced material suitable for people with basic knowledge of signal processing and machine learning. Thanks to its comprehensiveness, it will help students select a promising research track, researchers leverage the acquired cross-domain knowledge to design improved techniques, and engineers and developers choose the right technology for their target application scenario. It will also be useful for practitioners from other fields (e.g., acoustics, multimedia, phonetics, and musicology) willing to exploit audio source separation or speech enhancement as pre-processing tools for their own needs.
It gives me immense pleasure to introduce this timely handbook to the research/- velopment communities in the ?eld of signal processing systems (SPS). This is the ?rst of its kind and represents state-of-the-arts coverage of research in this ?eld. The driving force behind information technologies (IT) hinges critically upon the major advances in both component integration and system integration. The major breakthrough for the former is undoubtedly the invention of IC in the 50’s by Jack S. Kilby, the Nobel Prize Laureate in Physics 2000. In an integrated circuit, all components were made of the same semiconductor material. Beginning with the pocket calculator in 1964, there have been many increasingly complex applications followed. In fact, processing gates and memory storage on a chip have since then grown at an exponential rate, following Moore’s Law. (Moore himself admitted that Moore’s Law had turned out to be more accurate, longer lasting and deeper in impact than he ever imagined. ) With greater device integration, various signal processing systems have been realized for many killer IT applications. Further breakthroughs in computer sciences and Internet technologies have also catalyzed large-scale system integration. All these have led to today’s IT revolution which has profound impacts on our lifestyle and overall prospect of humanity. (It is hard to imagine life today without mobiles or Internets!) The success of SPS requires a well-concerted integrated approach from mul- ple disciplines, such as device, design, and application.
Automatic speech recognition (ASR) systems are finding increasing use in everyday life. Many of the commonplace environments where the systems are used are noisy, for example users calling up a voice search system from a busy cafeteria or a street. This can result in degraded speech recordings and adversely affect the performance of speech recognition systems. As the use of ASR systems increases, knowledge of the state-of-the-art in techniques to deal with such problems becomes critical to system and application engineers and researchers who work with or on ASR technologies. This book presents a comprehensive survey of the state-of-the-art in techniques used to improve the robustness of speech recognition systems to these degrading external influences. Key features: Reviews all the main noise robust ASR approaches, including signal separation, voice activity detection, robust feature extraction, model compensation and adaptation, missing data techniques and recognition of reverberant speech. Acts as a timely exposition of the topic in light of more widespread use in the future of ASR technology in challenging environments. Addresses robustness issues and signal degradation which are both key requirements for practitioners of ASR. Includes contributions from top ASR researchers from leading research units in the field
A comprehensive guide that addresses the theory and practice of spatial audio This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in. Parametric Time-frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming—covering the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. This book will teach readers the tools needed for such processing, and provides an overview to existing research. It also shows recent up-to-date projects and commercial applications built on top of the systems. Provides an in-depth presentation of the principles, past developments, state-of-the-art methods, and future research directions of spatial audio technologies Includes contributions from leading researchers in the field Offers MATLAB codes with selected chapters An advanced book aimed at readers who are capable of digesting mathematical expressions about digital signal processing and sound field analysis, Parametric Time-frequency Domain Spatial Audio is best suited for researchers in academia and in the audio industry.
This open access book provides a concise explanation of the fundamentals and background of the surround sound recording and playback technology Ambisonics. It equips readers with the psychoacoustical, signal processing, acoustical, and mathematical knowledge needed to understand the inner workings of modern processing utilities, special equipment for recording, manipulation, and reproduction in the higher-order Ambisonic format. The book comes with various practical examples based on free software tools and open scientific data for reproducible research. The book’s introductory section offers a perspective on Ambisonics spanning from the origins of coincident recordings in the 1930s to the Ambisonic concepts of the 1970s, as well as classical ways of applying Ambisonics in first-order coincident sound scene recording and reproduction that have been practiced since the 1980s. As, from time to time, the underlying mathematics become quite involved, but should be comprehensive without sacrificing readability, the book includes an extensive mathematical appendix. The book offers readers a deeper understanding of Ambisonic technologies, and will especially benefit scientists, audio-system and audio-recording engineers. In the advanced sections of the book, fundamentals and modern techniques as higher-order Ambisonic decoding, 3D audio effects, and higher-order recording are explained. Those techniques are shown to be suitable to supply audience areas ranging from studio-sized to hundreds of listeners, or headphone-based playback, regardless whether it is live, interactive, or studio-produced 3D audio material.
Presents current trends and potential future developments by leading researchers in immersive media production, delivery, rendering and interaction The underlying audio and video processing technology that is discussed in the book relates to areas such as 3D object extraction, audio event detection; 3D sound rendering and face detection, gesture analysis and tracking using video and depth information. The book will give an insight into current trends and developments of future media production, delivery and reproduction. Consideration of the complete production, processing and distribution chain will allow for a full picture to be presented to the reader. Production developments covered will include integrated workflows developed by researchers and industry practitioners as well as capture of ultra-high resolution panoramic video and 3D object based audio across a range of programme genres. Distribution developments will include script based format agnostic network delivery to a full range of devices from large scale public panoramic displays with wavefield synthesis and ambisonic audio reproduction to ’small screen’ mobile devices. Key developments at the consumer end of the chain apply to both passive and interactive viewing modes and will incorporate user interfaces such as gesture recognition and ‘second screen’ devices to allow manipulation of the audio visual content. Presents current trends and potential future developments by leading researchers in immersive media production, delivery, rendering and interaction. Considers the complete production, processing and distribution chain illustrating the dependencies and the relationship between different components. Proposes that a format-agnostic approach to the production and delivery of broadcast programmes will overcome the problems faced with the steadily growing number of production and delivery formats. Explains the fundamentals of media production in addition to the complete production chain, beyond current-state-of-the-art through to presenting novel approaches and technologies for future media production. Focuses on the technologies that will allow for the realization of an E2E media platform that supports flexible content representations and interactivity for users. An essential read for Researchers and developers of audio-visual technology in industry and academia, such as engineers in broadcast technology companies and students working toward a career in the rapidly changing area of broadcast both from a production and an engineering perspective.