Algorithms and Software for Predictive and Perceptual Modeling of Speech

Author: Venkatraman Atti

Publisher: Morgan & Claypool Publishers

Published: 2010-05-05

Total Pages: 124

ISBN-10: 160845388X

From the early pulse code modulation-based coders to recent multi-rate wideband speech coding standards, speech coding has made significant strides toward high-quality speech at the lowest possible bit rate. This book presents recent advances in linear prediction (LP)-based speech analysis that employ perceptual models for narrowband and wideband speech coding. The LP analysis-synthesis framework has been successful for speech coding because it fits the source-system paradigm for speech synthesis well. The limitations of conventional LP have been studied extensively, and several extensions to LP-based analysis-synthesis have been proposed, e.g., discrete all-pole modeling, perceptual LP, warped LP, LP with modified filter structures, IIR-based pure LP, all-pole modeling using the weighted sum of LSP polynomials, LP for low-frequency emphasis, and cascade-form LP. These extensions can be classified as algorithms that either attempt to improve the LP spectral envelope fit or embed perceptual models in the LP. The first half of the book reviews recent developments in predictive modeling of speech with the help of MATLAB™ simulation examples.

The advantages of integrating perceptual models into low-bit-rate speech coding depend on how accurately these models mimic human performance and, more importantly, on the achievable "coding gains" and the "computational overhead" associated with these physiological models. Methods that exploit the masking properties of the human ear in speech coding standards are, even today, largely based on concepts introduced by Schroeder and Atal in 1979. For example, a simple approach employed in speech coding standards is to use a perceptual weighting filter to shape the quantization noise according to the masking properties of the human ear. The second half of the book reviews recent developments in perceptual modeling of speech (e.g., masking threshold, psychoacoustic models, auditory excitation pattern, and loudness) with the help of MATLAB™ simulations. Supplementary material, including the MATLAB™ programs and simulation examples presented in the book, is also available.

Table of Contents: Introduction / Predictive Modeling of Speech / Perceptual Modeling of Speech


Intelligent Systems

Author: Amit Sheth

Publisher: Springer Nature

Published: 2021-07-21

Total Pages: 492

ISBN-10: 9811622485

This book presents the latest computational intelligence methodologies and applications. It is a collection of selected papers presented at the International Conference on Sustainable Computing and Intelligent Systems (SCIS 2021), held in Jaipur, India, on February 5–6, 2021. It includes novel and innovative work from experts, practitioners, scientists, and decision-makers in academia and industry. The papers cover artificial intelligence and intelligent systems, intelligent business systems, machine intelligence, computer vision, Web intelligence, big data analytics, swarm intelligence, and related topics.


Digital Speech Processing

Author: Sadaoki Furui

Publisher: CRC Press

Published: 2018-05-04

Total Pages: 319

ISBN-10: 1351990926

A study of digital speech processing, synthesis, and recognition. This second edition contains new sections on the international standardization of robust and flexible speech coding techniques, waveform unit concatenation-based speech synthesis, large-vocabulary continuous speech recognition based on statistical pattern recognition, and more.


Speech Synthesis and Recognition

Author: Wendy Holmes

Publisher: CRC Press

Published: 2002-09-11

Total Pages: 317

ISBN-10: 0203484681

With the growing impact of information technology on daily life, speech is becoming increasingly important as a natural means of communication between humans and machines. This extensively reworked and updated new edition of Speech Synthesis and Recognition is an easy-to-read introduction to current speech technology. Aimed at advanced undergraduates and graduates in electronic engineering, computer science, and information technology, the book is also relevant to professional engineers who need to understand enough about speech technology to apply it successfully and to work effectively with speech experts. No advanced mathematical ability is required, and no specialist prior knowledge of phonetics or of the properties of speech signals is assumed.