Real-time Single and Dual-channel Speech Enhancement on Edge Devices for Hearing Applications

Author: Nikhil Shankar

Publisher:

Published: 2021

Total Pages:

ISBN-13:

Speech Enhancement (SE) is an important module in the signal processing pipeline for hearing applications, and it improves listening comfort. Researchers have developed many single- and dual-microphone SE techniques over the last few decades. In this thesis, novel single- and dual-channel SE techniques are proposed and implemented on edge devices as an assistive tool for hearing applications. The smartphone serves as the processing platform for real-time implementation and testing. Both statistical signal processing and deep learning algorithms are proposed for SE. First, different two-channel beamformers are compared for SE. Next, the Minimum Variance Distortionless Response (MVDR) beamformer, assisted by a voice activity detector (VAD), is used as a Signal-to-Noise Ratio (SNR) booster for the SE method. Deep neural network architectures comprising convolutional neural network (CNN) and recurrent neural network (RNN) layers are also proposed for real-time SE. Finally, to filter out background noise, the SE gain estimate for the noisy speech mixture is smoothed along the frequency axis by a Mel filter-bank, resulting in a Mel-warped frequency-domain gain estimate. Compared with existing SE methods, objective and subjective results for the developed methods indicate substantial improvements in speech quality and intelligibility.


User Customizable Real-time Single and Dual Microphone Speech Enhancement and Blind Speech Separation for Smartphone Hearing Aid Applications

Author: Chandan Karadagur Ananda Reddy

Publisher:

Published: 2018

Total Pages:

ISBN-13:

Speech Enhancement (SE) is a vital algorithmic component in the Hearing Aid pipeline. Over the years, several algorithms have been developed to work in real time and to improve the quality and intelligibility of speech. However, suppressing noise with minimal distortion to speech remains a prime challenge. In this work, a new single-microphone SE method is introduced and implemented on a smartphone, which works as an assistive device to Hearing Aids via wireless connectivity. The uniqueness of the developed method lies in its tunable parameters, which let the smartphone user control the amount of noise suppression and speech distortion in real time and so customize the perceived audio to their preference. A super-Gaussian extension of this approach is explored and analyzed. The recent availability of two microphones on smartphones makes it possible to use a beamformer as a pre-filtering stage for the proposed single-microphone SE. A real-time blind speech separation technique is also proposed to yield superior speech quality. Objective and subjective results show that the developed methods outperform traditional SE techniques.


Smartphone-based Single and Dual Microphone Speech Enhancement Algorithms for Hearing Study

Author: Gautam Shreedhar Bhat

Publisher:

Published: 2018

Total Pages:

ISBN-13:

Speech Enhancement (SE) is fundamental to many real-world applications. In the last two decades, extensive studies have been carried out on single- and multi-channel SE techniques. In this thesis, three novel SE algorithms are proposed for Hearing Aid Devices that use a smartphone as their assistive device. The first SE method exploits formant location information to improve the speech quality and intelligibility of the Super-Gaussian Joint Maximum a Posteriori (SGJMAP) SE method. The second method extends this work to the Log Spectral Minimum Mean Square Error Amplitude Estimator (Log-MMSE), a well-known SE algorithm. The third method is a real-time Blind Source Separation (BSS) method based on Independent Vector Analysis (IVA) for convolutive mixtures. Objective and subjective evaluations of the developed techniques show substantial improvements in speech quality and intelligibility.


Digital Speech Transmission and Enhancement

Author: Peter Vary

Publisher: John Wiley & Sons

Published: 2023-11-23

Total Pages: 596

ISBN-13: 1119060982

Enables readers to understand the latest developments in speech enhancement and transmission due to advances in computational power and device miniaturization. The Second Edition of Digital Speech Transmission and Enhancement has been updated throughout to provide all the necessary details on the latest advances in the theory and practice of speech signal processing and its applications, including many new research results, standards, algorithms, and developments which have recently appeared and are on their way into state-of-the-art applications. Besides mobile communications, which constituted the main application domain of the first edition, speech enhancement for hearing instruments and man-machine interfaces has gained significantly more prominence in the past decade, and as such receives greater focus in this updated and expanded second edition. Readers can expect to find information and novel methods on: low-latency spectral analysis-synthesis, single-channel and dual-channel algorithms for noise reduction and dereverberation; multi-microphone processing methods, which are now widely used in applications such as mobile phones, hearing aids, and man-computer interfaces; algorithms for near-end listening enhancement, which provide a significantly increased speech intelligibility for users at the noisy receiving side of their mobile phone; and fundamentals of speech signal processing, estimation and machine learning, speech coding, error concealment by soft decoding, and artificial bandwidth extension of speech signals. Digital Speech Transmission and Enhancement is a single-source, comprehensive guide to the fundamental issues, algorithms, standards, and trends in speech signal processing and speech communication technology, and as such is an invaluable resource for engineers, researchers, academics, and graduate students in the areas of communications, electrical engineering, and information technology.


DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement

Author: Richard C. Hendriks

Publisher: Morgan & Claypool Publishers

Published: 2013-01-01

Total Pages: 84

ISBN-13: 1627051449

As speech processing devices like mobile phones, voice-controlled devices, and hearing aids have increased in popularity, people expect them to work anywhere and at any time without user intervention. However, the presence of acoustical disturbances limits the use of these applications, degrades their performance, or causes the user difficulties in understanding the conversation or appreciating the device. A common way to reduce the effects of such disturbances is through the use of single-microphone noise reduction algorithms for speech enhancement. The field of single-microphone noise reduction for speech enhancement comprises a history of more than 30 years of research. In this survey, we wish to demonstrate the significant advances that have been made during the last decade in the field of discrete Fourier transform domain-based single-channel noise reduction for speech enhancement. Furthermore, our goal is to provide a concise description of a state-of-the-art speech enhancement system, and demonstrate the relative importance of the various building blocks of such a system. This allows the non-expert DSP practitioner to judge the relevance of each building block and to implement a close-to-optimal enhancement system for the particular application at hand. Table of Contents: Introduction / Single Channel Speech Enhancement: General Principles / DFT-Based Speech Enhancement Methods: Signal Model and Notation / Speech DFT Estimators / Speech Presence Probability Estimation / Noise PSD Estimation / Speech PSD Estimation / Performance Evaluation Methods / Simulation Experiments with Single-Channel Enhancement Systems / Future Directions
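To make the building blocks named in the table of contents more concrete, here is a minimal sketch of one common DFT-domain chain: a Wiener gain driven by a decision-directed a priori SNR estimate. It is not taken from the book. The function name `wiener_gain_dd`, the smoothing constant `alpha`, the SNR floor `xi_min`, and the assumption that a noise PSD estimate is already available are all illustrative choices.

```python
import numpy as np

def wiener_gain_dd(noisy_psd, noise_psd, alpha=0.98, xi_min=10 ** (-25 / 10)):
    """Per-bin Wiener gains from noisy periodograms (illustrative sketch).

    noisy_psd: array (frames, bins) holding |Y|^2 for each STFT frame.
    noise_psd: array (bins,) holding an assumed-known noise PSD estimate.
    """
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise = np.maximum(np.asarray(noise_psd, dtype=float), 1e-12)
    n_frames, n_bins = noisy_psd.shape
    gains = np.empty((n_frames, n_bins))
    prev_speech_psd = np.zeros(n_bins)
    for t in range(n_frames):
        # A posteriori SNR: noisy power relative to noise power.
        gamma = noisy_psd[t] / noise
        # Decision-directed a priori SNR: blend the previous frame's
        # speech estimate with the current maximum-likelihood estimate.
        xi = alpha * prev_speech_psd / noise \
            + (1.0 - alpha) * np.maximum(gamma - 1.0, 0.0)
        xi = np.maximum(xi, xi_min)  # floor limits musical noise
        gains[t] = xi / (1.0 + xi)   # Wiener gain
        prev_speech_psd = gains[t] ** 2 * noisy_psd[t]
    return gains
```

In use, each row of `gains` would be multiplied with the corresponding noisy STFT frame before the inverse transform. The floor `xi_min` caps the maximum attenuation, which trades residual noise against speech distortion.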


DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement

Author: Richard C. Hendriks

Publisher: Springer Nature

Published: 2022-05-31

Total Pages: 70

ISBN-13: 3031025644

As speech processing devices like mobile phones, voice-controlled devices, and hearing aids have increased in popularity, people expect them to work anywhere and at any time without user intervention. However, the presence of acoustical disturbances limits the use of these applications, degrades their performance, or causes the user difficulties in understanding the conversation or appreciating the device. A common way to reduce the effects of such disturbances is through the use of single-microphone noise reduction algorithms for speech enhancement. The field of single-microphone noise reduction for speech enhancement comprises a history of more than 30 years of research. In this survey, we wish to demonstrate the significant advances that have been made during the last decade in the field of discrete Fourier transform domain-based single-channel noise reduction for speech enhancement. Furthermore, our goal is to provide a concise description of a state-of-the-art speech enhancement system, and demonstrate the relative importance of the various building blocks of such a system. This allows the non-expert DSP practitioner to judge the relevance of each building block and to implement a close-to-optimal enhancement system for the particular application at hand. Table of Contents: Introduction / Single Channel Speech Enhancement: General Principles / DFT-Based Speech Enhancement Methods: Signal Model and Notation / Speech DFT Estimators / Speech Presence Probability Estimation / Noise PSD Estimation / Speech PSD Estimation / Performance Evaluation Methods / Simulation Experiments with Single-Channel Enhancement Systems / Future Directions


Real-time Speech Processing Algorithms for Smartphone Based Hearing Aid Applications

Author: Gautam Shreedhar Bhat

Publisher:

Published: 2021

Total Pages:

ISBN-13:

Signal processing algorithms are extensively used in hearing aid applications to improve the quality and intelligibility of speech. The hearing aid device (HAD) signal processing pipeline consists of several key modules that help to improve perception for hearing-impaired listeners. In this dissertation, novel speech processing algorithms are proposed for use in a smartphone-based hearing aid (HA) setup. Each chapter of this dissertation concentrates on an individual module of the signal processing pipeline in HADs. The first algorithm is developed for speech enhancement (SE) to suppress background noise. A voice activity detector (VAD) is developed to classify the incoming signal as speech or noise. Signal enhancement techniques such as blind source separation and dereverberation are also developed. The algorithms use both conventional and supervised learning techniques. Objective and subjective evaluations of each proposed technique show substantial improvements in speech quality and intelligibility.


Audio Source Separation and Speech Enhancement

Author: Emmanuel Vincent

Publisher: John Wiley & Sons

Published: 2018-07-24

Total Pages: 628

ISBN-13: 1119279917

Learn the technology behind hearing aids, Siri, and Echo. Audio source separation and speech enhancement aim to extract one or more source signals of interest from an audio recording involving several sound sources. These technologies are among the most studied in audio signal processing today and bear a critical role in the success of hearing aids, hands-free phones, voice command and other noise-robust audio analysis systems, and music post-production software. Research on this topic has followed three convergent paths, starting with sensor array processing, computational auditory scene analysis, and machine learning based approaches such as independent component analysis, respectively. This book is the first one to provide a comprehensive overview by presenting the common foundations and the differences between these techniques in a unified setting. Key features: a consolidated perspective on audio source separation and speech enhancement; both historical perspective and latest advances in the field, e.g. deep neural networks; diverse disciplines, including array processing, machine learning, and statistical signal processing; and coverage of the most important techniques for both single-channel and multichannel processing. This book provides both introductory and advanced material suitable for people with basic knowledge of signal processing and machine learning. Thanks to its comprehensiveness, it will help students select a promising research track, researchers leverage the acquired cross-domain knowledge to design improved techniques, and engineers and developers choose the right technology for their target application scenario. It will also be useful for practitioners from other fields (e.g., acoustics, multimedia, phonetics, and musicology) willing to exploit audio source separation or speech enhancement as pre-processing tools for their own needs.


Audio Source Separation

Author: Shoji Makino

Publisher: Springer

Published: 2018-03-01

Total Pages: 389

ISBN-13: 3319730312

This book provides the first comprehensive overview of the fascinating topic of audio source separation based on non-negative matrix factorization, deep neural networks, and sparse component analysis. The first section of the book covers single channel source separation based on non-negative matrix factorization (NMF). After an introduction to the technique, two further chapters describe separation of known sources using non-negative spectrogram factorization, and temporal NMF models. In section two, NMF methods are extended to multi-channel source separation. Section three introduces deep neural network (DNN) techniques, with chapters on multichannel and single channel separation, and a further chapter on DNN based mask estimation for monaural speech separation. In section four, sparse component analysis (SCA) is discussed, with chapters on source separation using audio directional statistics modelling, multi-microphone MMSE-based techniques and diffusion map methods. The book brings together leading researchers to provide tutorial-like and in-depth treatments on major audio source separation topics, with the objective of becoming the definitive source for a comprehensive, authoritative, and accessible treatment. This book is written for graduate students and researchers who are interested in audio source separation techniques based on NMF, DNN and SCA.
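As a rough illustration of the non-negative matrix factorization that the first sections of this book build on, the sketch below factors a non-negative matrix V (for example, a magnitude spectrogram) into W @ H using the classic multiplicative updates for the squared Euclidean cost. It is not drawn from the book: the function name `nmf`, the iteration count, and the random initialization are assumptions for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor non-negative V (freq x time) as W @ H (illustrative sketch)."""
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps  # spectral templates
    H = rng.random((rank, V.shape[1])) + eps  # time activations
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy example: a synthetic rank-2 non-negative "spectrogram".
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(V, rank=2)
reconstruction_error = np.linalg.norm(V - W @ H)
```

In a separation setting, the columns of `W` would act as spectral templates and the rows of `H` as their activations over time; a single source can then be estimated from the subset of components assigned to it.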