Real-time Single and Dual-channel Speech Enhancement on Edge Devices for Hearing Applications

Author: Nikhil Shankar

Published: 2021

Speech Enhancement (SE) is an important module in the signal processing pipeline for hearing applications, and it helps improve listening comfort. Many single- and dual-microphone SE techniques have been developed by researchers over the last few decades. In this thesis, novel single- and dual-channel SE techniques are proposed and implemented on edge devices as an assistive tool for hearing applications. The smartphone is considered as the processing platform for real-time implementation and testing. In this work, both statistical signal processing and deep learning algorithms are proposed for SE. First, we compare different two-channel beamformers for SE. Later, the Minimum Variance Distortionless Response (MVDR) beamformer assisted by a voice activity detector (VAD) is used as a Signal-to-Noise Ratio (SNR) booster for the SE method. Deep neural network architectures comprising convolutional neural network (CNN) and recurrent neural network (RNN) layers are proposed in this thesis for real-time SE. Finally, to filter out background noise, the SE gain estimation for the noisy speech mixture is smoothed along the frequency axis by a Mel filter-bank, resulting in a Mel-warped frequency-domain gain estimation. In comparison with existing SE methods, objective and subjective assessments of the developed methods indicate substantial improvements in speech quality and intelligibility.
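To make the VAD-assisted MVDR idea in this abstract concrete, here is a minimal NumPy sketch of the textbook MVDR weight formula w = R^-1 d / (d^H R^-1 d) for a single frequency bin. It is an illustration under stated assumptions, not the thesis implementation: the function names `noise_covariance` and `mvdr_weights`, the diagonal loading term, the all-ones steering vector, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def noise_covariance(X, speech_active):
    """Average x x^H over the frames that the VAD marks as noise-only.

    X: complex array (n_frames, n_mics), one frequency bin per frame.
    speech_active: boolean array (n_frames,), True where the VAD detects speech.
    """
    noise = X[~speech_active]
    return noise.T @ noise.conj() / len(noise)


def mvdr_weights(noise_cov, steering, loading=1e-6):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency bin."""
    n_mics = noise_cov.shape[0]
    # Diagonal loading (an assumption of this sketch) keeps the solve stable.
    scale = np.trace(noise_cov).real / n_mics
    loaded = noise_cov + loading * scale * np.eye(n_mics)
    r_inv_d = np.linalg.solve(loaded, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)


# Toy usage: two microphones, synthetic noise, target arriving with zero delay.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2)) + 1j * rng.standard_normal((200, 2))
speech_active = np.zeros(200, dtype=bool)
speech_active[:50] = True            # pretend the VAD flagged these frames
R = noise_covariance(X, speech_active)
d = np.ones(2, dtype=complex)        # assumed steering vector
w = mvdr_weights(R, d)
beamformed = X @ w.conj()            # y = w^H x for every frame
```

The weights satisfy the distortionless constraint w^H d = 1, so a target matching the assumed steering vector passes unchanged while the output noise power is minimized. In a real two-microphone system the covariance would be tracked per frequency bin and updated recursively whenever the VAD reports noise-only frames.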


Digital Speech Transmission and Enhancement

Author: Peter Vary

Publisher: John Wiley & Sons

Published: 2023-11-23

Total Pages: 596

ISBN-13: 1119060982

Enables readers to understand the latest developments in speech enhancement/transmission due to advances in computational power and device miniaturization. The Second Edition of Digital Speech Transmission and Enhancement has been updated throughout to provide all the necessary details on the latest advances in the theory and practice in speech signal processing and its applications, including many new research results, standards, algorithms, and developments which have recently appeared and are on their way into state-of-the-art applications. Besides mobile communications, which constituted the main application domain of the first edition, speech enhancement for hearing instruments and man-machine interfaces has gained significantly more prominence in the past decade, and as such receives greater focus in this updated and expanded second edition. Readers can expect to find information and novel methods on: low-latency spectral analysis-synthesis, single-channel and dual-channel algorithms for noise reduction and dereverberation; multi-microphone processing methods, which are now widely used in applications such as mobile phones, hearing aids, and man-computer interfaces; algorithms for near-end listening enhancement, which provide a significantly increased speech intelligibility for users at the noisy receiving side of their mobile phone; and fundamentals of speech signal processing, estimation and machine learning, speech coding, error concealment by soft decoding, and artificial bandwidth extension of speech signals. Digital Speech Transmission and Enhancement is a single-source, comprehensive guide to the fundamental issues, algorithms, standards, and trends in speech signal processing and speech communication technology, and as such is an invaluable resource for engineers, researchers, academics, and graduate students in the areas of communications, electrical engineering, and information technology.


Dual-microphone and Binaural Noise Reduction Techniques for Improved Speech Intelligibility by Hearing Aid Users

Author: Nima Yousefian Jazi

Published: 2013

Total Pages: 218

Spatial filtering and directional discrimination have been shown to be an effective pre-processing approach for noise reduction in microphone array systems. In dual-microphone hearing aids, fixed and adaptive beamforming techniques are the most common solutions for enhancing the desired speech and rejecting unwanted signals captured by the microphones. In fact, beamformers are widely utilized in systems where the spatial properties of the target source (usually in front of the listener) are assumed to be known. In this dissertation, some dual-microphone coherence-based speech enhancement techniques applicable to hearing aids are proposed. All proposed algorithms operate in the frequency domain and (like traditional beamforming techniques) are based purely on the spatial properties of the desired speech source; they do not require any knowledge of noise statistics for calculating the noise reduction filter. This benefit gives our algorithms the ability to address adverse noise conditions, such as situations where one or more interfering talkers speak simultaneously with the target speaker. In such cases, (adaptive) beamformers lose their effectiveness in suppressing interference, since the noise channel (reference) cannot be built and updated accordingly. This difference is the main advantage of the proposed techniques over traditional adaptive beamformers. Furthermore, since the suggested algorithms are independent of noise estimation, they offer significant improvement in scenarios in which the power level of the interfering sources is much higher than that of the target speech. The dissertation also shows that the premise behind the proposed algorithms can be extended to binaural hearing aids. The main purpose of the investigated techniques is to enhance the intelligibility of speech, measured through subjective listening tests with normal-hearing and cochlear implant listeners. However, the improvement in the quality of the output speech achieved by the algorithms is also presented to show that the proposed methods can be potential candidates for future use in commercial hearing aids and cochlear implant devices.
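As a rough illustration of what a coherence-based gain can look like, the sketch below computes the generic magnitude-squared coherence (MSC) between two microphone signals per time-frequency bin and uses it directly as a suppression gain. This is not the dissertation's algorithm: the function names `frame_spectra` and `coherence_gain`, the frame length, the hop size, and the smoothing constant `alpha` are all assumptions chosen for the example.

```python
import numpy as np


def frame_spectra(x, frame_len=256, hop=128):
    """Hann-windowed short-time spectra, shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)


def coherence_gain(x1, x2, alpha=0.9, eps=1e-12):
    """Per-bin gain in [0, 1] given by the magnitude-squared coherence.

    Auto- and cross-spectra are tracked with first-order recursive smoothing,
    so the estimate needs a few frames to settle.
    """
    X1, X2 = frame_spectra(x1), frame_spectra(x2)
    n_bins = X1.shape[1]
    p11 = np.zeros(n_bins)
    p22 = np.zeros(n_bins)
    p12 = np.zeros(n_bins, dtype=complex)
    gains = np.empty(X1.shape)
    for t in range(X1.shape[0]):
        p11 = alpha * p11 + (1 - alpha) * np.abs(X1[t]) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(X2[t]) ** 2
        p12 = alpha * p12 + (1 - alpha) * X1[t] * np.conj(X2[t])
        gains[t] = np.abs(p12) ** 2 / (p11 * p22 + eps)
    return gains
```

A source that reaches both microphones coherently yields a gain near 1, while uncorrelated noise yields a low gain once the smoothed spectra have settled. Note that plain MSC cannot tell a coherent interferer from the target, which is one reason practical methods also exploit the direction of the desired source.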


Smartphone-based Single and Dual Microphone Speech Enhancement Algorithms for Hearing Study

Author: Gautam Shreedhar Bhat

Published: 2018

Speech Enhancement (SE) is essential in many real-world applications. In the last two decades, extensive studies have been carried out on single- and multi-channel SE techniques. In this thesis, three novel SE algorithms are proposed that can be used for hearing aid devices with a smartphone as their assistive device. The first SE method exploits information about formant locations to improve the speech quality and intelligibility of the Super-Gaussian Joint Maximum a Posteriori (SGJMAP) SE method. The second method extends this work to the Log Spectral Minimum Mean Square Error Amplitude Estimator (Log-MMSE), which is a well-known SE algorithm. The third method is a real-time Blind Source Separation (BSS) method based on Independent Vector Analysis (IVA) for convolutive mixtures. Objective and subjective evaluations of the developed techniques show substantial improvements in speech quality and intelligibility.


Speech Enhancement

Author: Philipos C. Loizou

Publisher: CRC Press

Published: 2013-02-25

Total Pages: 715

ISBN-13: 1466599227

With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic pr


Audio Source Separation and Speech Enhancement

Author: Emmanuel Vincent

Publisher: John Wiley & Sons

Published: 2018-07-24

Total Pages: 506

ISBN-13: 1119279887

Learn the technology behind hearing aids, Siri, and Echo. Audio source separation and speech enhancement aim to extract one or more source signals of interest from an audio recording involving several sound sources. These technologies are among the most studied in audio signal processing today and play a critical role in the success of hearing aids, hands-free phones, voice command and other noise-robust audio analysis systems, and music post-production software. Research on this topic has followed three convergent paths, starting with sensor array processing, computational auditory scene analysis, and machine-learning-based approaches such as independent component analysis, respectively. This book is the first one to provide a comprehensive overview by presenting the common foundations and the differences between these techniques in a unified setting. Key features: a consolidated perspective on audio source separation and speech enhancement; both historical perspective and latest advances in the field, e.g. deep neural networks; diverse disciplines, including array processing, machine learning, and statistical signal processing; coverage of the most important techniques for both single-channel and multichannel processing. This book provides both introductory and advanced material suitable for people with basic knowledge of signal processing and machine learning. Thanks to its comprehensiveness, it will help students select a promising research track, researchers leverage the acquired cross-domain knowledge to design improved techniques, and engineers and developers choose the right technology for their target application scenario. It will also be useful for practitioners from other fields (e.g., acoustics, multimedia, phonetics, and musicology) willing to exploit audio source separation or speech enhancement as pre-processing tools for their own needs.


Digital Speech Transmission and Enhancement

Author: Peter Vary

Publisher: John Wiley & Sons

Published: 2024-01-23

Total Pages: 596

ISBN-13: 1119060966

Enables readers to understand the latest developments in speech enhancement/transmission due to advances in computational power and device miniaturization. The Second Edition of Digital Speech Transmission and Enhancement has been updated throughout to provide all the necessary details on the latest advances in the theory and practice in speech signal processing and its applications, including many new research results, standards, algorithms, and developments which have recently appeared and are on their way into state-of-the-art applications. Besides mobile communications, which constituted the main application domain of the first edition, speech enhancement for hearing instruments and man-machine interfaces has gained significantly more prominence in the past decade, and as such receives greater focus in this updated and expanded second edition. In the Second Edition of Digital Speech Transmission and Enhancement, readers can expect to find information and novel methods on: low-latency spectral analysis-synthesis, single-channel and dual-channel algorithms for noise reduction and dereverberation; multi-microphone processing methods, which are now widely used in applications such as mobile phones, hearing aids, and man-computer interfaces; algorithms for near-end listening enhancement, which provide a significantly increased speech intelligibility for users at the noisy receiving side of their mobile phone; and fundamentals of speech signal processing, estimation and machine learning, speech coding, error concealment by soft decoding, and artificial bandwidth extension of speech signals. Digital Speech Transmission and Enhancement is a single-source, comprehensive guide to the fundamental issues, algorithms, standards, and trends in speech signal processing and speech communication technology, and as such is an invaluable resource for engineers, researchers, academics, and graduate students in the areas of communications, electrical engineering, and information technology.


Multi-channel Speech Enhancement by Regularized Optimization

Author: Meng Yu

Published: 2012

Total Pages: 123

ISBN-13: 9781267454515

Speech enhancement aims to eliminate noise and unexpected interferences that degrade speech quality and intelligibility in realistic listening situations. It is an indispensable technique in telecommunication and assistive listening devices such as hands-free mobile phones and hearing aids. Though a lot of research has been done in this area, only a limited number of methods are effective in both real-time and real-world conditions. Difficulties include noise types (incoherent, coherent, diffuse), an a priori unknown number of noise sources, mobility of source locations, room reverberation, and non-stationarity. In this thesis, we focus on speech enhancement by suppressing coherent noise and reverberation. Classical speech enhancement methods rely on data from a single microphone. Spectral estimation methods, such as spectral subtraction, Wiener filtering, and the subspace method, are most widely used. In recent years, microphone array techniques have been developed and recognized as more powerful and promising solutions. Cross-channel cancellation is incorporated in the thesis to resolve the spatial difference between channels, which helps to blindly identify channel impulse responses and also forms the constraints between channels. An L1-regularized minimization framework is incorporated into speech signal processing, with the regularization applied to the channel impulse responses and the speech spectrogram, respectively. The over-fitting problem in the filter and spectrogram estimation is overcome by the sparsity regularization. The split Bregman method is used to derive the updating rules for speech enhancement in the time domain, while in the spectral domain non-negativity is imposed on the spectrogram magnitude of the speech signal and impulse response. Therefore, the proposed speech dereverberation method is solved under a constrained non-negative matrix factorization (NMF) framework in the spectrogram magnitude domain. The thesis is organized as follows. In chapter 1, the mathematical frameworks of L1 minimization and NMF are introduced, respectively. Under the L1 minimization framework, chapters 2, 3, and 4 present the convex speech enhancement model, the musical noise reduction method, and the overlapping speech detection method, respectively. The multichannel speech dereverberation method is presented in chapter 5 under a constrained NMF framework. The thesis is concluded in chapter 6.
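The split Bregman iteration mentioned in this abstract can be sketched on the generic L1-regularized least-squares problem min_u 0.5 * ||A u - y||^2 + lam * ||u||_1. The code below is a minimal real-valued illustration of that standard scheme, not the thesis's speech model: the function names `shrink` and `split_bregman_l1`, the penalty parameter `mu`, and the iteration count are assumptions made for this example.

```python
import numpy as np


def shrink(v, threshold):
    """Soft-thresholding, the closed-form minimizer of the L1 subproblem."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)


def split_bregman_l1(A, y, lam, mu=1.0, n_iter=200):
    """Minimize 0.5 * ||A u - y||^2 + lam * ||u||_1 using the splitting d = u."""
    n = A.shape[1]
    d = np.zeros(n)                      # auxiliary copy of u that carries the L1 term
    b = np.zeros(n)                      # Bregman variable
    lhs = A.T @ A + mu * np.eye(n)       # fixed system matrix of the u-update
    Aty = A.T @ y
    for _ in range(n_iter):
        u = np.linalg.solve(lhs, Aty + mu * (d - b))   # quadratic subproblem
        d = shrink(u + b, lam / mu)                    # L1 subproblem
        b = b + u - d                                  # Bregman update
    return d
```

With `A` set to the identity, the problem reduces to plain soft-thresholding of `y`, which gives an easy sanity check: `split_bregman_l1(np.eye(3), np.array([3.0, 0.5, -2.0]), lam=1.0)` converges to approximately `[2, 0, -1]`. Splitting the variable lets the quadratic term be handled by a linear solve and the L1 term by a closed-form shrinkage, which is what makes the per-iteration update rules simple.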


Speech Enhancement

Author: Shoji Makino

Publisher: Springer Science & Business Media

Published: 2005-03-17

Total Pages: 432

ISBN-13: 9783540240396

We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise reduction but also dereverberation and separation of independent signals. These topics are also covered in this book. However, the general emphasis is on noise reduction because of the large number of applications that can benefit from this technology. The goal of this book is to provide a strong reference for researchers, engineers, and graduate students who are interested in the problem of signal and speech enhancement. To do so, we invited well-known experts to contribute chapters covering the state of the art in this focused field.