Neural Correlates of Unimodal and Multimodal Speech Perception in Cochlear Implant Users and Normal-hearing Listeners

Author: Hannah E. Shatzer

Publisher:

Published: 2020

Total Pages:

ISBN-13:

Spoken word recognition often involves the integration of both auditory and visual speech cues. The addition of visual cues is particularly useful for individuals with hearing loss and cochlear implants (CIs), as the auditory signal they perceive is degraded compared to that of individuals with normal hearing (NH). CI users generally benefit more from visual cues than NH perceivers; however, the underlying neural mechanisms affording them this benefit are not well understood. The current study sought to identify the neural mechanisms active during auditory-only and audiovisual speech processing in CI users and to determine how they differ from those of NH perceivers. Postlingually deaf, experienced CI users and age-matched NH adults completed syllable and word recognition tasks during EEG recording, and the neural data were analyzed for differences in event-related potentials and neural oscillations. The results showed that during phonemic processing in the syllable task, CI users exhibited stronger AV integration, shifting processing away from primary auditory cortex and weighting the visual signal more strongly. During whole-word processing in the word task, early acoustic processing was preserved and similar to that of NH perceivers, while robust AV integration was again evident. Lipreading ability also predicted suppression of early auditory processing across both CI and NH participants, suggesting that while some neural reorganization may have occurred in CI recipients to improve multisensory integrative processing, visual speech ability leads to reduced sensory processing in primary auditory cortex regardless of hearing status. These findings further support behavioral evidence for strong AV integration in CI users and the critical role of vision in improving speech perception.
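
The analysis pipeline behind the two EEG measures named above is not spelled out in the abstract. As a minimal sketch of what "event-related potentials and neural oscillations" typically involve, the Python snippet below averages epoched trials into an ERP and summarizes oscillatory power with a Welch spectrum; the array shapes, sampling rate, and alpha band are illustrative assumptions, not the study's parameters.

    # Minimal sketch of the two EEG measures named above (ERP + oscillatory power).
    # All names, shapes, and parameters are illustrative, not taken from the study.
    import numpy as np
    from scipy.signal import welch

    fs = 500                                     # assumed sampling rate (Hz)
    rng = np.random.default_rng(0)
    epochs = rng.standard_normal((100, fs))      # placeholder: 100 trials x 1 s of EEG

    # Event-related potential: average the time-locked trials at one electrode.
    erp = epochs.mean(axis=0)

    # Neural oscillations: per-trial Welch spectra, then mean power in the alpha band.
    freqs, psd = welch(epochs, fs=fs, nperseg=fs // 2, axis=-1)
    alpha_power = psd[:, (freqs >= 8) & (freqs <= 12)].mean()

    print(erp.shape, round(float(alpha_power), 4))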


Music and Speech Perception in Pre-lingually Deafened Young Listeners with Cochlear Implants

Author:

Publisher:

Published: 2022

Total Pages:

ISBN-13:

Timbre and pitch, though definitionally and physically distinct characteristics of sound, are attributes of all sound signals. A body of literature has shown that altering one characteristic can influence perception of the other: speech spoken with an atypical pitch contour can affect a listener's accuracy in identifying the words spoken, and conversely, whether a melodic contour is presented as a MIDI piano rendition or as sung speech can affect how accurately its pitch contour is identified. These interactions have been documented for normal-hearing children and adults, as well as for postlingually deafened adult cochlear implant users. Findings have differed in some respects between the two listener groups, attributed in part to the impoverished frequency resolution of signals delivered by CIs. This study examined prelingually deafened young cochlear implant users to determine whether these trends persist in a population that has only briefly, or never, experienced sound via acoustic auditory pathways. Additionally, demographic factors and cognitive measures (auditory working memory, nonverbal IQ, and receptive vocabulary) were examined for correlations with word identification and melodic contour identification (MCI) measures. Outcomes for this population largely aligned with the existing literature. Speech presented with atypical pitch contours reduced word identification accuracy; however, unlike adult CI users, who are more vulnerable than their NH counterparts to atypically contoured speech, the participants in this study showed a decrement comparable to that of their NH peers. When the frequency spacing between notes in a melodic contour was discriminable, these participants showed the same influence of timbre alteration as their NH peers. Lastly, auditory working memory correlated robustly with both MCI and word identification outcomes.
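
The correlation analysis mentioned above (cognitive measures against word identification and MCI scores) amounts to a standard bivariate test. The sketch below shows only that step, on made-up scores; the variable names and numbers are hypothetical, not the study's data.

    # Hypothetical sketch of correlating a cognitive measure with task accuracy;
    # the scores below are fabricated placeholders, not the study's data.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(1)
    working_memory = rng.normal(100, 15, size=20)                 # e.g., digit-span scores
    mci_acc = 0.40 + 0.003 * working_memory + rng.normal(0, 0.05, 20)
    word_id_acc = 0.50 + 0.002 * working_memory + rng.normal(0, 0.05, 20)

    for label, acc in (("MCI", mci_acc), ("word identification", word_id_acc)):
        r, p = pearsonr(working_memory, acc)
        print(f"{label}: r = {r:.2f}, p = {p:.3f}")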


The Effect of Top-down Compensation on Speech Perception Using Simulated Cochlear Implant Processing and Post-lingual Cochlear Implant Users

Author: Chhayakanta Patro

Publisher:

Published: 2016

Total Pages:

ISBN-13:

In suboptimal listening environments, when noise hinders the continuity of speech, the normal auditory-cognitive system perceptually integrates the available speech information and "fills in" missing information with help from higher-level feedback mechanisms. However, individuals with cochlear implants (CIs) find it difficult and effortful to understand interrupted speech compared to their normal-hearing (NH) counterparts. Little is known about CI listeners' ability to restore missing speech when they are exposed to challenging listening environments. In this dissertation, three experimental paradigms were used to evaluate the ability of NH individuals listening through simulated cochlear implant processing and of individuals with cochlear implants to utilize their acquired linguistic skills. In the first experiment, listeners' ability to use semantic context when speech was intact or interrupted was evaluated under various spectral resolution conditions. The results suggested that higher-level processing facilitates speech perception up to a point, but fails to facilitate speech understanding when the signals are severely degraded. In the second experiment, higher-level processing was investigated using the phonemic restoration effect, in which sentences were interrupted, with and without filler noise, at different interruption rates. Both groups failed to show top-down restoration, except that the CI users showed some higher-level processing at the lowest interruption rate. In the third experiment, a gated word recognition task was used, and listeners with CIs required comparatively more acoustic-phonetic information to recognize a word than the NH listeners. In the final experiment, when speech was presented in noise, both groups relied significantly on contextual cues to perceive the speech. Overall, the results across experiments indicated that CI users rely heavily on contextual cues when they are available. However, when they listen to severely degraded speech, they may not benefit from semantic context, because the incoming speech does not provide enough information to trigger top-down processes. If signal fidelity (spectral resolution) is improved, their benefit from higher-level linguistic feedback processes can be maximized.
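
"Simulated cochlear implant processing" in this literature usually means noise-band vocoding: the speech is split into a few frequency bands, each band's envelope is extracted, and the envelopes modulate band-limited noise. The sketch below is a generic version of that technique under assumed parameters (channel count, band edges, envelope cutoff), not the dissertation's actual processing chain.

    # Generic noise-band vocoder, a common way to simulate cochlear implant
    # processing; channel count and cutoffs are assumptions, not the study's values.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    def noise_vocode(signal, fs, n_channels=8, f_lo=100.0, f_hi=7000.0):
        """Replace each band's fine structure with envelope-modulated noise."""
        edges = np.geomspace(f_lo, f_hi, n_channels + 1)    # log-spaced band edges
        rng = np.random.default_rng(0)
        env_sos = butter(2, 160.0, btype="low", fs=fs, output="sos")
        out = np.zeros_like(signal)
        for low, high in zip(edges[:-1], edges[1:]):
            band_sos = butter(4, [low, high], btype="band", fs=fs, output="sos")
            band = sosfiltfilt(band_sos, signal)
            envelope = sosfiltfilt(env_sos, np.abs(band))   # rectify + low-pass
            carrier = sosfiltfilt(band_sos, rng.standard_normal(len(signal)))
            out += np.clip(envelope, 0.0, None) * carrier
        return out

    fs = 16000
    t = np.arange(fs) / fs
    demo = np.sin(2 * np.pi * 440.0 * t)      # placeholder input; use real speech here
    vocoded = noise_vocode(demo, fs)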


Multisensory and sensorimotor interactions in speech perception

Author: Kaisa Tiippana

Publisher: Frontiers Media SA

Published: 2015-06-26

Total Pages: 265

ISBN-13: 2889195481

Speech is multisensory since it is perceived through several senses. Audition is the most important one, as speech is mostly heard. The role of vision has long been acknowledged, since many articulatory gestures can be seen on the talker's face. Sometimes speech can even be felt by touching the face. The best-known multisensory illusion is the McGurk effect, where incongruent visual articulation changes the auditory percept. The interest in the McGurk effect arises from a major general question in multisensory research: How is information from different senses combined? Despite decades of research, a conclusive explanation for the illusion remains elusive. This is a good demonstration of the challenges in the study of multisensory integration. Speech is special in many ways. It is the main means of human communication, and a manifestation of a unique language system. It is a signal with which all humans have a lot of experience. We are exposed to it from birth, and learn it through development in face-to-face contact with others. It is a signal that we can both perceive and produce. The role of the motor system in speech perception has been debated for a long time. Despite very active current research, it is still unclear to what extent, and in what role, the motor system is involved in speech perception. Recent evidence shows that brain areas involved in speech production are activated when listening to speech and watching a talker's articulatory gestures. Speaking involves coordination of articulatory movements and monitoring their auditory and somatosensory consequences. How do auditory, visual, somatosensory, and motor brain areas interact during speech perception? How do these sensorimotor interactions contribute to speech perception? It is surprising that despite a vast amount of research, the secrets of speech perception have not yet been solved. The multisensory and sensorimotor approaches provide new opportunities for solving them. Contributions to the research topic are encouraged across a wide spectrum of research on speech perception in multisensory and sensorimotor contexts, including novel experimental findings ranging from psychophysics to brain imaging, as well as theories and models, reviews, and opinions.


Toward a Unified Theory of Audiovisual Integration in Speech Perception

Author: Nicholas Altieri

Publisher: Universal-Publishers

Published: 2010-09-09

Total Pages:

ISBN-13: 1599423618

Auditory and visual speech recognition unfolds in real time and occurs effortlessly for normal-hearing listeners. However, model-theoretic descriptions of the systems-level cognitive processes responsible for integrating auditory and visual speech information are currently lacking, primarily because they rely too heavily on accuracy rather than reaction-time predictions. Speech and language researchers have debated whether audiovisual integration occurs in a parallel or a coactive fashion, as well as the extent to which audiovisual integration occurs efficiently. The Double Factorial Paradigm introduced in Section 1 is an experimental paradigm equipped to address dynamic processing issues related to architecture (parallel vs. coactive processing) as well as efficiency (capacity). Experiment 1 employed a simple word discrimination task to assess both architecture and capacity in high-accuracy settings. Experiments 2 and 3 assessed these same issues using auditory and visual distractors in Divided Attention and Focused Attention tasks, respectively. Experiment 4 investigated audiovisual integration efficiency across different auditory signal-to-noise ratios. The results can be summarized as follows: integration typically occurs in parallel with an efficient stopping rule; integration occurs automatically in both focused and divided attention versions of the task; and audiovisual integration is only efficient (in the time domain) when the clarity of the auditory signal is relatively poor, although considerable individual differences were observed. In Section 3, these results were captured within the milieu of parallel linear dynamic processing models with cross-channel interactions. Finally, in Section 4, I discussed broader implications of this research, including applications for clinical research and neural-biological models of audiovisual convergence.
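
The abstract refers to processing "capacity" without defining its estimator. In the Double Factorial Paradigm literature this is usually the workload capacity coefficient C(t) = H_AV(t) / (H_A(t) + H_V(t)), where H(t) = -log S(t) is the cumulative hazard of the response-time distribution and values above 1 indicate efficient (super-capacity) integration. The sketch below implements that standard definition on placeholder response times, as an assumption about the measure rather than the author's exact estimator.

    # Workload capacity coefficient C(t) as commonly defined for the Double
    # Factorial Paradigm; the response times below are placeholders.
    import numpy as np

    def cumulative_hazard(rts, t_grid):
        """H(t) = -log S(t), with S(t) the empirical survivor function."""
        rts = np.asarray(rts, dtype=float)
        survivor = np.array([(rts > t).mean() for t in t_grid])
        return -np.log(np.clip(survivor, 1e-6, 1.0))    # avoid log(0) in the tail

    def capacity_coefficient(rt_av, rt_a, rt_v, t_grid):
        """C(t) > 1 suggests efficient (super-capacity) audiovisual integration."""
        h_av = cumulative_hazard(rt_av, t_grid)
        h_sum = cumulative_hazard(rt_a, t_grid) + cumulative_hazard(rt_v, t_grid)
        return h_av / np.clip(h_sum, 1e-6, None)

    rng = np.random.default_rng(2)
    rt_a = rng.gamma(9.0, 0.05, 200)      # auditory-only RTs (s), placeholder
    rt_v = rng.gamma(10.0, 0.05, 200)     # visual-only RTs (s), placeholder
    rt_av = rng.gamma(8.0, 0.05, 200)     # audiovisual RTs (s), placeholder
    t_grid = np.linspace(0.2, 1.0, 50)
    print(capacity_coefficient(rt_av, rt_a, rt_v, t_grid)[:5])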


Perception of Novel Sounds in the Presence of Background Noise

Author: Vahid Montazeri

Publisher:

Published: 2019

Total Pages:

ISBN-13:

The goal of this dissertation is to investigate how listeners and learning machines cope with the ambiguity caused by interfering multiple novel sound sources. Starting from an ambiguous auditory scene with competing sound sources, this dissertation investigates how a particular sound source draws listeners’ attention while the remaining sources lose their salience and become background (noise). Listeners’ perception of competing novel sounds is investigated in a series of experiments that varied in terms of listening conditions, simulating the difficulties experienced by hearing-impaired individuals in noise. In Chapter 1, the mechanisms behind listeners' perception of speech in the presence of competing sounds are reviewed. Chapter 2 describes three experiments that investigated the recognition of novel sounds in the presence of background noise. The chapter begins with a replication of a previous study, providing evidence that listeners can segregate a novel target sound from the competing distractor only if it repeats across different distractors. A subsequent experiment tested the hypothesis that listeners’ ability to detect change in a sound depends on their knowledge of its source, which is gained via repetition. It is concluded that listeners are able to perceptually learn patterns of the repeating target while suppressing the changes in the masker stream. Two neural network architectures previously employed to study mechanisms of learning, generalized Hebbian and anti-Hebbian, are evaluated. It is shown that the generalized Hebbian learning network produces similar results to those obtained from the listeners. Experiments in Chapter 3 provide evidence that recognition of a novel target sound becomes robust against new (unheard) distractors when listeners go through an exposure stage in which the target is presented repeatedly across multiple distractors. Chapter 3 concludes by reporting experiments 3-2 and 3-3 that investigated recognition of consonant-vowel-consonant-vowel (CVCV) words in the presence of novel distractors. Experiment 3-2 showed that upon exposing the listeners to target tokens across multiple distractors, the process of learning new CVCV tokens shifts from context-specificity to an adaptation-plus-prototype mechanism. The goal in experiment 3-3 was to investigate whether or not cochlear implant users, who have limited spectral resolution, would show the same behavior as listeners with normal hearing in experiment 3-2. The main goal in Chapter 4 is to investigate the extent to which the findings in experiment 3-2 can be replicated by recurrent neural networks (RNNs). This chapter begins with a brief introduction to RNNs and long short-term memories (LSTMs). In experiment 4-1 a recurrent LSTM auto-encoder was trained to reconstruct an input CVCV target when mixed with a distractor with or without the presence of a context sequence prior to the input. It was shown that the network could reconstruct the input with better accuracy when the context sequence contained the repeating CVCV target across multiple distractors. Furthermore, similar to the findings in experiment 3-2, the presence of such a context sequence improved the network’s generalizability to unseen data (novel distractors). Experiment 4-2 showed that the presence of the context sequence led to an improved semi-supervised speech enhancement algorithm that recovered the target CVCV tokens while suppressing the distractors.
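
The abstract names a "generalized Hebbian" learning network without giving its update rule. The textbook generalized Hebbian algorithm (Sanger's rule) extracts the leading principal components with the update dW = lr * (y x^T - LT[y y^T] W), where LT keeps the lower triangle; the sketch below implements that standard rule on synthetic data and should be read as a plausible stand-in, not the dissertation's exact architecture.

    # Textbook generalized Hebbian algorithm (Sanger's rule) on synthetic data;
    # a stand-in for the "generalized Hebbian" network named above, not the
    # dissertation's exact model.
    import numpy as np

    def gha_train(data, n_components=3, lr=1e-3, n_epochs=50, seed=0):
        """Learn the leading principal subspace of `data` (n_samples, n_features)."""
        rng = np.random.default_rng(seed)
        w = 0.01 * rng.standard_normal((n_components, data.shape[1]))
        for _ in range(n_epochs):
            for x in data:
                y = w @ x                                   # component outputs
                # Sanger's rule: dW = lr * (y x^T - lower_tri(y y^T) W)
                w += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ w)
        return w

    rng = np.random.default_rng(3)
    samples = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 10))
    samples -= samples.mean(axis=0)
    weights = gha_train(samples)
    print(weights.shape)    # rows approximate the leading principal components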


The Impact of Visual Input on the Ability of Bilateral and Bimodal Cochlear Implant Users to Accurately Perceive Words and Phonemes in Experimental Phrases

Author: Cimarron Ludwig

Publisher:

Published: 2015

Total Pages: 33

ISBN-13:

A multitude of individuals across the globe suffer from hearing loss, and that number continues to grow. Cochlear implants, while having limitations, provide electrical input that enables users to "hear" and interact more fully with their social environment. There has been a clinical shift toward bilateral placement of implants in both ears and toward bimodal fitting of a hearing aid in the contralateral ear when residual hearing is present. However, there is potentially more to speech perception for bilateral and bimodal cochlear implant users than the electric and acoustic input received via these modalities. For normal-hearing listeners, vision plays a role, and Rosenblum (2005) points out that it is a key feature of an integrated perceptual process. Logically, cochlear implant users should also benefit from integrated visual input. The question is how exactly vision benefits bilateral and bimodal users. Eight bilateral and five bimodal participants received randomized experimental phrases previously generated by Liss et al. (1998) in auditory and audiovisual conditions, and they recorded their perception of the input. The data were then analyzed for percent words correct, consonant errors, and lexical boundary error types. Overall, vision was found to improve speech perception for bilateral and bimodal cochlear implant participants. Each group showed a significant increase in percent words correct when visual input was added. With vision, bilateral participants reduced consonant place errors and demonstrated increased use of the syllabic stress cues used in lexical segmentation. These results suggest that vision might provide perceptual benefits for bilateral cochlear implant users by granting access to place information and by augmenting cues for syllabic stress in the absence of acoustic input. Vision did not, however, provide the bimodal participants significantly increased access to place and stress cues, so the exact mechanism by which bimodal implant users improved speech perception with the addition of vision is unknown. These results point to the complexities of audiovisual integration during speech perception and the need for continued research regarding the benefit vision provides to bilateral and bimodal cochlear implant users.


Neural Correlates of Auditory-visual Speech Perception in Noise

Author: Jaimie Gilbert

Publisher: ProQuest

Published: 2009

Total Pages: 173

ISBN-13: 9781109217896

Speech perception in noise may be facilitated by presenting the concurrent optic stimulus of observable speech gestures. Objective measures such as event-related potentials (ERPs) are crucial to understanding the processes underlying facilitation of auditory-visual speech perception. Previous research has demonstrated that in quiet acoustic conditions auditory-visual speech perception occurs faster (decreased latency) and with less neural activity (decreased amplitude) than auditory-only speech perception. These empirical observations support the construct of auditory-visual neural facilitation. Auditory-visual facilitation was quantified with response time and accuracy measures and the N1/P2 ERP waveform response as a function of changes in audibility (manipulation of the acoustic environment by testing a range of signal-to-noise ratios) and content of the optic cue (manipulation of the types of cues available, e.g., speech, non-speech static, or non-speech dynamic cues). Experiment 1 (Response Time Measures) evaluated participant responses in a speeded-response task investigating effects of both audibility and type of optic cue. Results revealed better accuracy and faster response times with visible speech gestures than with any non-speech cue. Experiment 2 (Audibility) investigated the influence of audibility on auditory-visual facilitation in response time measures and the N1/P2 response. ERP measures showed effects of reduced audibility (longer latency, decreased amplitude) for both types of facial motion, i.e., speech and non-speech dynamic facial optic cues, compared to measures in quiet conditions. Experiment 3 (Optic Cues) evaluated the influence of the type of optic cue on auditory-visual facilitation with response time measures and the N1/P2 response. N1 latency was shorter with both types of facial motion tested in this experiment, but N1 amplitude was decreased only with concurrent presentation of auditory and visual speech. The N1 ERP results of these experiments reveal that the effect of audibility alone does not explain auditory-visual facilitation in noise. The decreased N1 amplitude associated with the visible speech gesture and the concurrent auditory speech suggests that processing of the visible speech gesture either stimulates N1 generators or interacts with processing in those generators. A likely generator of the N1 response is the auditory cortex, which matures differently without auditory stimulation during a critical period. The impact of deprivation of auditory-visual integration on neural development and on the ability to make use of optic cues must also be investigated. Further scientific understanding of any maturational differences, or differences in processing due to such deprivation, is needed to promote utilization of auditory-visual facilitation of speech perception for individuals with auditory impairment. Research and (re)habilitation therapies for speech perception in noise must continue to emphasize the benefit of associating and integrating auditory and visual speech cues.
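
Audibility in these experiments is manipulated by testing a range of signal-to-noise ratios. One standard recipe for building such stimuli is to scale the noise so the mixture hits a target SNR relative to the speech; the sketch below illustrates that general recipe with placeholder signals and is not the study's stimulus-generation code.

    # Mixing speech and noise at a target SNR (dB), the usual way audibility is
    # manipulated in studies like this one; signals below are placeholders.
    import numpy as np

    def mix_at_snr(speech, noise, snr_db):
        """Scale `noise` so the mixture has the requested speech-to-noise ratio."""
        noise = noise[: len(speech)]
        rms_speech = np.sqrt(np.mean(speech ** 2))
        rms_noise = np.sqrt(np.mean(noise ** 2))
        gain = rms_speech / (rms_noise * 10 ** (snr_db / 20))
        return speech + gain * noise

    rng = np.random.default_rng(4)
    speech = 0.1 * rng.standard_normal(16000)   # placeholder for a speech token
    noise = rng.standard_normal(16000)
    mixtures = {snr: mix_at_snr(speech, noise, snr) for snr in (-5, 0, 5, 10)}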