This book gives an overview of the research and application of speech technologies in different areas. One of the special characteristics of the book is that the authors take a broad view of the multiple research areas and take the multidisciplinary approach to the topics. One of the goals in this book is to emphasize the application. User experience, human factors and usability issues are the focus in this book.
This book addresses different aspects of the research field and a wide range of topics in speech signal processing, speech recognition and language processing. The chapters are divided in three different sections: Speech Signal Modeling, Speech Recognition and Applications. The chapters in the first section cover some essential topics in speech signal processing used for building speech recognition as well as for speech synthesis systems: speech feature enhancement, speech feature vector dimensionality reduction, segmentation of speech frames into phonetic segments. The chapters of the second part cover speech recognition methods and techniques used to read speech from various speech databases and broadcast news recognition for English and non-English languages. The third section of the book presents various speech technology applications used for body conducted speech recognition, hearing impairment, multimodal interfaces and facial expression recognition.
"This book provides practitioners and researchers with information that will allow them to better assist the speech disabled who wish to utilize computer synthesized speech (CSS) technology"--Provided by publisher.
This collection examines the promise and limitations for computer-assisted language learning of emerging speech technologies: speech recognition, text-to-speech synthesis, and acoustic visualization. Using pioneering research from contributors based in the US and Europe, this volume illustrates the uses of each technology for learning languages, the problems entailed in their use, and the solutions evolving in both technology and instructional design. To illuminate where these technologies stand on the path from research toward practice, the book chapters are organized to reflect five stages in the maturation of learning technologies: basic research, analysis of learners’ needs, adaptation of technologies to meet needs, development of prototypes to incorporate adapted technologies, and evaluation of prototypes. The volume demonstrates the progress in employing each class of speech technology while pointing up the effort that remains for effective, reliable application to language learning.
The book explores new ways to reconstruct and enhance speech that is compromised by various neuro-motor disorders – collectively known as “dysarthria.” The authors address some of the extant lacunae in speech research of dysarthric conditions: they show how new methods can improve speaker recognition when speech is impaired due to developmental or acquired pathologies; they present a novel multi-dimensional approach to help the speech system both assess dysarthric speech and to perform intelligibility improvement of the impaired speech; they display well-performing software solutions for developmental and acquired speech impairments, and for vocal injuries; and they examine non-acoustic signals and muted nonverbal sounds in relation to audible speech conversion.
This book addresses state-of-the-art systems and achievements in various topics in the research field of speech and language technologies. Book chapters are organized in different sections covering diverse problems, which have to be solved in speech recognition and language understanding systems. In the first section machine translation systems based on large parallel corpora using rule-based and statistical-based translation methods are presented. The third chapter presents work on real time two way speech-to-speech translation systems. In the second section two papers explore the use of speech technologies in language learning. The third section presents a work on language modeling used for speech recognition. The chapters in section Text-to-speech systems and emotional speech describe corpus-based speech synthesis and highlight the importance of speech prosody in speech recognition. In the fifth section the problem of speaker diarization is addressed. The last section presents various topics in speech technology applications like audio-visual speech recognition and lip reading systems.
This book draws on the recent remarkable advances in speech and language processing: advances that have moved speech technology beyond basic applications such as medical dictation and telephone self-service to increasingly sophisticated and clinically significant applications aimed at complex speech and language disorders. The book provides an introduction to the basic elements of speech and natural language processing technology, and illustrates their clinical potential by reviewing speech technology software currently in use for disorders such as autism and aphasia. The discussion is informed by the authors' own experiences in developing and investigating speech technology applications for these populations. Topics include detailed examples of speech and language technologies in both remediative and assistive applications, overviews of a number of current applications, and a checklist of criteria for selecting the most appropriate applications for particular user needs. This book will be of benefit to four audiences: application developers who are looking to apply these technologies; clinicians who are looking for software that may be of value to their clients; students of speech-language pathology and application development; and finally, people with speech and language disorders and their friends and family members.
The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered, but, even more importantly, a description of the resources (data and tools) for Dutch that have been created are now available for both academia and industry worldwide. The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summaries) generation, web mining, information extraction, and text to speech to name the most important ones. The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions of almost all the HLT research groups in Flanders and the Netherlands, hence offers a view of their recent research activities. Targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch. It concerns researchers active in larger networks such as the CLARIN, META-NET, FLaReNet and participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICling, LREC, CLIN and DIR ( both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters are interesting for human language technology policy makers and even for science policy makers in general.
Written by the world's top experts in the field, this multidisciplinary book explores all phases of speech technology. Topics covered include: Conversion of computerized (keyboarded) text into synthesized speech, aimed at developing "talking computers" Development of automatic speech recognition, allowing electronic devices to process verbal commands Speech training and the use of synthesized speech for the hearing- and speech-impaired In-depth discussions of specific speech technologies are included, as well as a treatment of the issues and challenges of human-computer interfaces. Oriented toward state-of-the-art applications, the book emphasizes the practical utilization of emerging technologies and includes numerous case studies.
Mathematical Models of Spoken Language presents the motivations for, intuitions behind, and basic mathematical models of natural spoken language communication. A comprehensive overview is given of all aspects of the problem from the physics of speech production through the hierarchy of linguistic structure and ending with some observations on language and mind. The author comprehensively explores the argument that these modern technologies are actually the most extensive compilations of linguistic knowledge available.Throughout the book, the emphasis is on placing all the material in a mathematically coherent and computationally tractable framework that captures linguistic structure. It presents material that appears nowhere else and gives a unification of formalisms and perspectives used by linguists and engineers. Its unique features include a coherent nomenclature that emphasizes the deep connections amongst the diverse mathematical models and explores the methods by means of which they capture linguistic structure. This contrasts with some of the superficial similarities described in the existing literature; the historical background and origins of the theories and models; the connections to related disciplines, e.g. artificial intelligence, automata theory and information theory; an elucidation of the current debates and their intellectual origins; many important little-known results and some original proofs of fundamental results, e.g. a geometric interpretation of parameter estimation techniques for stochastic models and finally the author's own unique perspectives on the future of this discipline. There is a vast literature on Speech Recognition and Synthesis however, this book is unlike any other in the field. Although it appears to be a rapidly advancing field, the fundamentals have not changed in decades. Most of the results are presented in journals from which it is difficult to integrate and evaluate all of these recent ideas. Some of the fundamentals have been collected into textbooks, which give detailed descriptions of the techniques but no motivation or perspective. The linguistic texts are mostly descriptive and pictorial, lacking the mathematical and computational aspects. This book strikes a useful balance by covering a wide range of ideas in a common framework. It provides all the basic algorithms and computational techniques and an analysis and perspective, which allows one to intelligently read the latest literature and understand state-of-the-art techniques as they evolve.