1. Bsharat-Maalouf D, Degani T, Karawani H. The Involvement of Listening Effort in Explaining Bilingual Listening Under Adverse Listening Conditions. Trends Hear 2023; 27:23312165231205107. PMID: 37941413; PMCID: PMC10637154; DOI: 10.1177/23312165231205107.
Abstract
The current review examines listening effort to uncover how it is implicated in bilingual performance under adverse listening conditions. Various measures of listening effort, including physiological, behavioral, and subjective measures, have been employed to examine listening effort in bilingual children and adults. Adverse listening conditions, stemming from environmental factors, as well as factors related to the speaker or listener, have been examined. The existing literature, although relatively limited to date, points to increased listening effort among bilinguals in their nondominant second language (L2) compared to their dominant first language (L1) and relative to monolinguals. Interestingly, increased effort is often observed even when speech intelligibility remains unaffected. These findings emphasize the importance of considering listening effort alongside speech intelligibility. Building upon the insights gained from the current review, we propose that various factors may modulate the observed effects. These include the particular measure selected to examine listening effort, the characteristics of the adverse condition, as well as factors related to the particular linguistic background of the bilingual speaker. Critically, further research is needed to better understand the impact of these factors on listening effort. The review outlines avenues for future research that would promote a comprehensive understanding of listening effort in bilingual individuals.
Affiliations
- Dana Bsharat-Maalouf, Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Tamar Degani, Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Hanin Karawani, Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
2. Shan T, Wenner CE, Xu C, Duan Z, Maddox RK. Speech-In-Noise Comprehension is Improved When Viewing a Deep-Neural-Network-Generated Talking Face. Trends Hear 2022; 26:23312165221136934. PMID: 36384325; PMCID: PMC9677167; DOI: 10.1177/23312165221136934.
Abstract
Listening in a noisy environment is challenging, but many previous studies have demonstrated that comprehension of speech can be substantially improved by looking at the talker's face. We recently developed a deep neural network (DNN)-based system that generates movies of a talking face from speech audio and a single face image. In this study, we aimed to quantify the benefits that such a system can bring to speech comprehension, especially in noise. The target speech audio was masked with signal-to-noise ratios of -9, -6, -3, and 0 dB and was presented to subjects in three audio-visual (AV) stimulus conditions: (1) synthesized AV: audio with the synthesized talking-face movie; (2) natural AV: audio with the original movie from the corpus; and (3) audio-only: audio with a static image of the talker. Subjects were asked to type the sentences they heard in each trial, and keyword recognition was quantified for each condition. Overall, performance in the synthesized AV condition fell approximately halfway between the other two conditions, showing a marked improvement over the audio-only control but still falling short of the natural AV condition. Every subject showed some benefit from the synthetic AV stimulus. The results of this study support the idea that a DNN-based model that generates a talking face from speech audio can meaningfully enhance comprehension in noisy environments and has the potential to be used as a visual hearing aid.
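As a rough illustration of the masking procedure described above (not the authors' code, and the abstract does not specify their exact pipeline), mixing a target signal with noise at a prescribed signal-to-noise ratio amounts to rescaling the noise so the power ratio matches the target SNR. The helper name below is hypothetical.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then mix.

    Illustrative sketch only: SNR_dB = 10 * log10(P_speech / P_noise).
    """
    p_speech = np.mean(speech ** 2)  # average speech power
    p_noise = np.mean(noise ** 2)    # average noise power
    # Noise power required to hit the target SNR
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / p_noise)
    return speech + scaled_noise
```

For example, `mix_at_snr(speech, noise, -6.0)` yields a mixture in which the noise carries four times the power of the speech.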
Affiliations
- Tong Shan, Department of Biomedical Engineering; Del Monte Institute for Neuroscience; and Center for Visual Science, University of Rochester, Rochester, NY, USA
- Casper E. Wenner, Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA
- Chenliang Xu, Department of Computer Science, University of Rochester, Rochester, NY, USA
- Zhiyao Duan, Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA
- Ross K. Maddox, Department of Biomedical Engineering; Department of Neuroscience; Del Monte Institute for Neuroscience; and Center for Visual Science, University of Rochester, Rochester, NY, USA
3. Tai J, Forrester J, Sekuler R. Costs and benefits of audiovisual interactions. Perception 2022; 51:639-657. PMID: 35959630; DOI: 10.1177/03010066221111501.
Abstract
A strong temporal correlation promotes integration of concurrent sensory signals, whether within a single sensory modality or across different modalities. Although the benefits of such integration are well known, far less attention has been given to the possible costs incurred when concurrent sensory signals are uncorrelated. In two experiments, subjects categorized the rate at which a visual object modulated in size while trying to ignore a concurrent task-irrelevant broadband sound. Overall, the experiments showed that (i) losses in accuracy from mismatched auditory and visual rates were larger than gains from matched rates, and (ii) mismatched auditory and visual rates slowed responses more than matched rates sped them up. Experiment One showed that audiovisual interaction varied with the difference between the visual modulation rate and the modulation rate of a concurrent auditory stimulus. Experiment Two showed that audiovisual interaction depended upon the strength of the task-irrelevant auditory modulation. Although our experiments used abstract, low-dimensional stimuli rather than speech, the effects we observed parallel key findings on interference in multi-speaker settings.
Affiliations
- Jiayue Tai, Volen Center for Complex Systems, Brandeis University, Waltham, MA, USA
- Jack Forrester, Volen Center for Complex Systems, Brandeis University, Waltham, MA, USA
- Robert Sekuler, Volen Center for Complex Systems, Brandeis University, Waltham, MA, USA
4. Independent mechanisms of temporal and linguistic cue correspondence benefiting audiovisual speech processing. Atten Percept Psychophys 2022; 84:2016-2026. PMID: 35211849; DOI: 10.3758/s13414-022-02440-3.
Abstract
It is well established that, in order to comprehend speech in noisy environments, listeners use the face of the talker in conjunction with the auditory speech. Yet how listeners use audiovisual speech correspondences along the multisensory speech processing pathway is not known. We engaged listeners in a pair of experiments using face rotation to partially dissociate linguistic and temporal information, and two tasks to assess both overall integration and early integration specifically. In our first, exploratory experiment, listeners performed a speech-in-noise task to determine which face rotation maximally disrupts speech comprehension and thus overall audiovisual integration. Our second experiment involved a dual pitch-discrimination and visual catch task to test specifically for binding. The results showed that temporal coherence supports early integration, replicating the importance of temporal coherence seen for the binding of nonspeech stimuli. However, the benefit of temporal coherence was present in both upright and inverted positions, suggesting that binding is minimally affected by face rotation under these conditions. Together, our results suggest that different aspects of audiovisual speech are integrated at different stages of multisensory speech processing.
5. Carolan PJ, Heinrich A, Munro KJ, Millman RE. Quantifying the Effects of Motivation on Listening Effort: A Systematic Review and Meta-Analysis. Trends Hear 2022; 26:23312165211059982. PMID: 35077257; PMCID: PMC8793127; DOI: 10.1177/23312165211059982.
Abstract
Motivation influences the amount of listening effort (LE) exerted or experienced under challenging conditions, such as in high-noise environments. This systematic review and meta-analysis is the first to quantify the effects of motivation on LE. The review was pre-registered in PROSPERO and performed in accordance with PRISMA guidelines. Eligible studies examined the influence of motivation or individual traits (related to motivation) on LE in adults. Motivational factors, coded as independent variables, included financial reward, evaluative threat, perceived competence, feedback, and individual traits. LE outcomes were categorized as subjective, behavioral, or physiological. The quality of evidence was assessed using an adaptation of the Cochrane Collaboration Risk of Bias Tool. Nested random-effects meta-analyses were performed to quantify and compare the influence of motivational factors across LE outcomes. After assessing 3,532 records, 48 studies met the inclusion criteria and 43 were included in the meta-analyses. Risk of bias was high; for example, many studies lacked sample-size justification. Motivational factors had a small-to-medium effect (mean Cohen's d = 0.34, range: 0.11-0.72) on LE. When LE outcomes were considered collectively, an external manipulation of motivation (perceived competence) produced a larger mean effect size compared with individual traits. Some combinations of motivational factors and LE outcomes produced more robust effects than others, for example, evaluative threat and subjective LE outcomes. Although wide prediction intervals and high risk of bias mean that significant positive effects cannot be guaranteed, these findings provide useful guidance on the selection of motivational factors and LE outcomes for future research.
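For readers unfamiliar with the effect-size metric reported above, Cohen's d expresses a group difference in pooled-standard-deviation units. The sketch below is a minimal two-group illustration only; it does not reproduce the review's nested random-effects meta-analysis, and the function name is ours.

```python
import math

def cohens_d(m1: float, m2: float, sd1: float, sd2: float,
             n1: int, n2: int) -> float:
    """Cohen's d for two independent groups, using the pooled SD.

    d = (m1 - m2) / sqrt(((n1-1)*sd1^2 + (n2-1)*sd2^2) / (n1 + n2 - 2))
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)
```

On this convention, the review's mean d of 0.34 corresponds to group means about a third of a pooled standard deviation apart.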
Affiliations
- Peter J Carolan, Manchester Centre for Audiology and Deafness, School of Health Sciences, The University of Manchester, Manchester, UK; Manchester Academic Health Science Centre, Manchester University Hospitals NHS Foundation Trust, Manchester, UK
- Antje Heinrich, Manchester Centre for Audiology and Deafness, School of Health Sciences, The University of Manchester, Manchester, UK; Manchester Academic Health Science Centre, Manchester University Hospitals NHS Foundation Trust, Manchester, UK
- Kevin J Munro, Manchester Centre for Audiology and Deafness, School of Health Sciences, The University of Manchester, Manchester, UK; Manchester Academic Health Science Centre, Manchester University Hospitals NHS Foundation Trust, Manchester, UK
- Rebecca E Millman, Manchester Centre for Audiology and Deafness, School of Health Sciences, The University of Manchester, Manchester, UK; Manchester Academic Health Science Centre, Manchester University Hospitals NHS Foundation Trust, Manchester, UK
6. Fleming JT, Maddox RK, Shinn-Cunningham BG. Spatial alignment between faces and voices improves selective attention to audio-visual speech. J Acoust Soc Am 2021; 150:3085. PMID: 34717460; DOI: 10.1121/10.0006415.
Abstract
The ability to see a talker's face improves speech intelligibility in noise, provided that the auditory and visual speech signals are approximately aligned in time. However, the importance of spatial alignment between corresponding faces and voices remains unresolved, particularly in multi-talker environments. In a series of online experiments, we investigated this using a task that required participants to selectively attend a target talker in noise while ignoring a distractor talker. In experiment 1, we found improved task performance when the talkers' faces were visible, but only when corresponding faces and voices were presented in the same hemifield (spatially aligned). In experiment 2, we tested for possible influences of eye position on this result. In auditory-only conditions, directing gaze toward the distractor voice reduced performance, but this effect could not fully explain the cost of audio-visual (AV) spatial misalignment. Lowering the signal-to-noise ratio (SNR) of the speech from +4 to -4 dB increased the magnitude of the AV spatial alignment effect (experiment 3), but accurate closed-set lipreading caused a floor effect that influenced results at lower SNRs (experiment 4). Taken together, these results demonstrate that spatial alignment between faces and voices contributes to the ability to selectively attend AV speech.
Affiliations
- Justin T Fleming, Speech and Hearing Bioscience and Technology Program, Harvard University, 243 Charles Street, Boston, Massachusetts 02114, USA
- Ross K Maddox, Department of Biomedical Engineering, University of Rochester, 430 Elmwood Avenue, Rochester, New York 14620, USA
- Barbara G Shinn-Cunningham, Neuroscience Institute, Carnegie Mellon University, 4825 Frew Street, Pittsburgh, Pennsylvania 15213, USA
7. Text Captioning Buffers Against the Effects of Background Noise and Hearing Loss on Memory for Speech. Ear Hear 2021; 43:115-127. PMID: 34260436; DOI: 10.1097/aud.0000000000001079.
Abstract
OBJECTIVE: Everyday speech understanding frequently occurs in perceptually demanding environments, for example, due to background noise and normal age-related hearing loss. The resulting degraded speech signals increase listening effort, which gives rise to negative downstream effects on subsequent memory and comprehension, even when speech is intelligible. In two experiments, we explored whether the presentation of realistic assistive text-captioned speech offsets the negative effects of background noise and hearing impairment on multiple measures of speech memory.
DESIGN: In Experiment 1, young normal-hearing adults (N = 48) listened to sentences for immediate recall and delayed recognition memory. Speech was presented in quiet or in two levels of background noise. Sentences were presented either as speech only or as text-captioned speech. Thus, the experiment followed a 2 (caption vs. no caption) × 3 (no noise, +7 dB signal-to-noise ratio, +3 dB signal-to-noise ratio) within-subjects design. In Experiment 2, a group of older adults (age range: 61 to 80, N = 31) with varying levels of hearing acuity completed the same experimental task as in Experiment 1. For both experiments, immediate recall, recognition memory accuracy, and recognition memory confidence were analyzed via general(ized) linear mixed-effects models. In addition, we examined individual differences as a function of hearing acuity in Experiment 2.
RESULTS: In Experiment 1, we found that the presentation of realistic text-captioned speech to young normal-hearing listeners improved immediate recall and delayed recognition memory accuracy and confidence compared with speech alone. Moreover, text captions attenuated the negative effects of background noise on all speech memory outcomes. In Experiment 2, we replicated the same pattern of results in a sample of older adults with varying levels of hearing acuity. Moreover, we showed that the negative effects of hearing loss on speech memory in older adulthood were attenuated by the presentation of text captions.
CONCLUSIONS: Collectively, these findings strongly suggest that the simultaneous presentation of text can offset the negative effects of effortful listening on speech memory. Critically, captioning benefits extended from immediate word recall to long-term sentence recognition memory, a benefit observed not only for older adults with hearing loss but also for young normal-hearing listeners. These findings suggest that the text captioning benefit to memory is robust and has potentially wide applications for supporting speech listening in acoustically challenging environments.
8. Rohrer JM, Tierney W, Uhlmann EL, DeBruine LM, Heyman T, Jones B, Schmukle SC, Silberzahn R, Willén RM, Carlsson R, Lucas RE, Strand J, Vazire S, Witt JK, Zentall TR, Chabris CF, Yarkoni T. Putting the Self in Self-Correction: Findings From the Loss-of-Confidence Project. Perspect Psychol Sci 2021; 16:1255-1269. PMID: 33645334; PMCID: PMC8564260; DOI: 10.1177/1745691620964106.
Abstract
Science is often perceived to be a self-correcting enterprise. In principle, the assessment of scientific claims is supposed to proceed in a cumulative fashion, with the reigning theories of the day progressively approximating truth more accurately over time. In practice, however, cumulative self-correction tends to proceed less efficiently than one might naively suppose. Far from evaluating new evidence dispassionately and infallibly, individual scientists often cling stubbornly to prior findings. Here we explore the dynamics of scientific self-correction at an individual rather than collective level. In 13 written statements, researchers from diverse branches of psychology share why and how they have lost confidence in one of their own published findings. We qualitatively characterize these disclosures and explore their implications. A cross-disciplinary survey suggests that such loss-of-confidence sentiments are surprisingly common among members of the broader scientific population yet rarely become part of the public record. We argue that removing barriers to self-correction at the individual level is imperative if the scientific community as a whole is to achieve the ideal of efficient self-correction.
Affiliations
- Julia M Rohrer, International Max Planck Research School on the Life Course, Max Planck Institute for Human Development, Berlin; Department of Psychology, University of Leipzig
- Warren Tierney, Department of Organizational Behavior, INSEAD, Singapore
- Eric L Uhlmann, Department of Organizational Behavior, INSEAD, Singapore
- Lisa M DeBruine, Institute of Neuroscience and Psychology, University of Glasgow
- Tom Heyman, Laboratory of Experimental Psychology, KU Leuven; Institute of Psychology, Leiden University
- Benedict Jones, Institute of Neuroscience and Psychology, University of Glasgow
- Rebecca M Willén, Institute for Globally Distributed Open Research and Education (IGDORE)
- Simine Vazire, Melbourne School of Psychological Sciences, University of Melbourne
- Christopher F Chabris, Autism and Developmental Medicine Institute, Geisinger Health System, Danville, Pennsylvania
- Tal Yarkoni, Department of Psychology, University of Texas at Austin
9. Strand JF, Ray L, Dillman-Hasso NH, Villanueva J, Brown VA. Understanding Speech Amid the Jingle and Jangle: Recommendations for Improving Measurement Practices in Listening Effort Research. Audit Percept Cogn 2020; 3:169-188. PMID: 34240011; DOI: 10.1080/25742442.2021.1903293.
Abstract
The latent constructs psychologists study are typically not directly accessible, so researchers must design measurement instruments that are intended to provide insights about those constructs. Construct validation, assessing whether instruments measure what they intend to, is therefore critical for ensuring that the conclusions we draw actually reflect the intended phenomena. Insufficient construct validation can lead to the jingle fallacy (falsely assuming two instruments measure the same construct because the instruments share a name; Thorndike, 1904) and the jangle fallacy (falsely assuming two instruments measure different constructs because the instruments have different names; Kelley, 1927). In this paper, we examine construct validation practices in research on listening effort and identify patterns that strongly suggest the presence of jingle and jangle in the literature. We argue that the lack of construct validation for listening effort measures has led to inconsistent findings and hindered our understanding of the construct. We also provide specific recommendations for improving construct validation of listening effort instruments, drawing on the framework laid out in a recent paper on improving measurement practices (Flake & Fried, 2020). Although this paper addresses listening effort, the issues raised and recommendations presented are widely applicable to tasks used in research on auditory perception and cognitive psychology.
Affiliations
- Lucia Ray, Carleton College, Department of Psychology
- Violet A Brown, Washington University in St. Louis, Department of Psychological & Brain Sciences