1
O'Connell SR, Bissmeyer SRS, Gan H, Goldsworthy RL. How Switching Musical Instruments Affects Pitch Discrimination for Cochlear Implant Users. Ear Hear 2025:00003446-990000000-00431. PMID: 40325511. DOI: 10.1097/aud.0000000000001640.
Abstract
OBJECTIVES Cochlear implant (CI) users struggle with music perception. Generally, they have poorer pitch discrimination and timbre identification than peers with normal hearing, which reduces their overall music appreciation and quality of life. This study's primary aim was to characterize how the increased difficulty of comparing pitch changes across musical instruments affects CI users and their peers with no known hearing loss. The motivation is to better understand the challenges that CI users face with polyphonic music listening. The primary hypothesis was that CI users would be more affected by instrument switching than those with no known hearing loss. The rationale was that poorer pitch and timbre perception through a CI hinders the dissociation between pitch and timbre changes needed for this demanding task.
DESIGN Pitch discrimination was measured for piano and tenor saxophone, including conditions with pitch comparisons across instruments. Adult participants included 15 CI users and 15 peers with no known hearing loss. Pitch discrimination was measured for 4 note ranges centered on A2 (110 Hz), A3 (220 Hz), A4 (440 Hz), and A5 (880 Hz). The effect of instrument switching was quantified as the change in discrimination thresholds with and without instrument switching. Analysis of variance and Spearman's rank correlation were used to test group differences and relational outcomes, respectively.
RESULTS Although CI users had worse pitch discrimination, the additional difficulty of instrument switching did not significantly differ between groups. Discrimination thresholds in both groups were about two times worse with instrument switching than without. Further analyses, however, revealed that CI users were biased toward ranking tenor saxophone higher in pitch compared with piano, whereas those with no known hearing loss were not so biased. In addition, CI users were significantly more affected by instrument switching for the A5 note range.
CONCLUSIONS The magnitude of the effect of instrument switching on pitch resolution was similar for CI users and their peers with no known hearing loss. However, CI users were biased toward ranking tenor saxophone as higher in pitch and were significantly more affected by instrument switching for pitches near A5. These findings might reflect poorer temporal coding of fundamental frequency by CIs.
Affiliation(s)
- Samantha Reina O'Connell
- Caruso Department of Otolaryngology - Head and Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
2
Albera R, Urbanelli A, Lucisano S, Aprigliano A, Morando L, Amoroso A, Alexeev M, Albera A. Musical note recognition based on the upper adjacent harmonics without the presence of the fundamental frequency. Sci Rep 2025; 15:14295. PMID: 40274804. PMCID: PMC12022334. DOI: 10.1038/s41598-025-89454-7.
Abstract
Musical signals are complex periodic waveforms characterized by the sum of different frequencies. In a harmonic complex tone, the lowest frequency is called the fundamental frequency (f0), while the other frequencies, called harmonics, are integer multiples of the fundamental. The perceived pitch of a sound is correlated with the fundamental frequency, even though f0 itself may be inaudible in many situations. In these cases, it is possible to identify the pitch based on the upper consecutive harmonics. This study aimed to evaluate the identification of notes based on the presence of consecutive harmonics only and to determine the importance of their distance from the fundamental frequency. The study was carried out on 30 normally hearing amateur musicians without perfect pitch. The acoustic signals consisted of either four or two consecutive harmonics of notes from the middle region of the piano keyboard. The correct identification rate ranged between 8% and 100%, with better identification occurring when more harmonics and lower frequencies were present. The results confirm that it is possible to identify a note solely based on the presence of harmonics near the fundamental frequency, especially when the fundamental is under 2000 Hz.
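The integer-multiple relationship described in this abstract can be sketched numerically (a hypothetical illustration with made-up values, not code or data from the study): when only consecutive upper harmonics are present, their spacing still equals f0, which is why a note's pitch can be recovered even when the fundamental itself is absent.

```python
def estimate_f0(harmonics):
    """Estimate f0 from a list of consecutive harmonic frequencies (Hz).

    For consecutive harmonics n*f0, (n+1)*f0, ... the spacing between
    neighbours equals f0, even when f0 itself is missing from the signal.
    """
    if len(harmonics) < 2:
        raise ValueError("need at least two consecutive harmonics")
    spacings = [b - a for a, b in zip(harmonics, harmonics[1:])]
    return sum(spacings) / len(spacings)

# A4 represented only by its 3rd-6th harmonics; the fundamental is absent:
print(estimate_f0([1320.0, 1760.0, 2200.0, 2640.0]))  # 440.0
```

The same spacing rule underlies the study's finding that identification improves with more harmonics: each extra component contributes another estimate of the common spacing.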
Affiliation(s)
- Roberto Albera
- Otorhinolaryngology Unit, Department of Surgical Sciences, University of Turin, Via G. Verdi, 8, 10124, Turin, Italy
- Anastasia Urbanelli
- Otorhinolaryngology Unit, Department of Surgical Sciences, University of Turin, Via G. Verdi, 8, 10124, Turin, Italy
- Sergio Lucisano
- Otorhinolaryngology Unit, Department of Surgical Sciences, University of Turin, Via G. Verdi, 8, 10124, Turin, Italy
- Alessandra Aprigliano
- Otorhinolaryngology Unit, Department of Surgical Sciences, University of Turin, Via G. Verdi, 8, 10124, Turin, Italy
- Luca Morando
- Department of Physics, University of Turin, Turin, Italy
- Maxim Alexeev
- Department of Physics, University of Turin, Turin, Italy
- Andrea Albera
- Otorhinolaryngology Unit, Department of Surgical Sciences, University of Turin, Via G. Verdi, 8, 10124, Turin, Italy
3
Neamaalkassis H, Boubenec Y, Fiebach C, Muralikrishnan R, Tavano A. The fundamental frequencies of our own voice. R Soc Open Sci 2025; 12:241081. PMID: 39975656. PMCID: PMC11836694. DOI: 10.1098/rsos.241081.
Abstract
Our own actions send a corollary discharge (CD) signal, that is, a copy of the planned motor programme, to sensory-specific brain areas to suppress the anticipated sensory response, providing a neural basis for the sense of self. When we speak, the sensory consequences of the fundamental frequency (f0) of our own voice, generated by vocal fold vibrations, are suppressed. However, due to bone/air conduction filtering effects, the f0 we self-generate is measurably different from the f0 we subjectively perceive as defining our own voice. Using an auditory change deafness paradigm, we parametrically tested sensitivity to auditory change in the frequency neighbourhoods of objective and subjective own-voice pitches and found that participants experience change deafness for both to a similar extent, relative to a control pitch condition. We conclude that when we listen attentively, we are likely to filter out small pitch changes in the vicinity of our own objective and subjective voice f0, possibly as a long-term consequence of speaking-induced suppression mechanisms integrated with individual, perceptual bodily priors.
Affiliation(s)
- Hakam Neamaalkassis
- Department of Cognitive Psychology, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt a. M. 63122, Germany
- Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, 29 rue d’Ulm, Paris 75005, France
- Yves Boubenec
- Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, 29 rue d’Ulm, Paris 75005, France
- Christian Fiebach
- Department of Psychology, Goethe University Frankfurt, Theodor-W.-Adorno-Platz 1, Frankfurt a. M. 60323, Germany
- R. Muralikrishnan
- Department of Cognitive Psychology, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt a. M. 63122, Germany
- Alessandro Tavano
- Department of Cognitive Psychology, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt a. M. 63122, Germany
- Department of Psychology, Goethe University Frankfurt, Theodor-W.-Adorno-Platz 1, Frankfurt a. M. 60323, Germany
4
Lad M, Taylor JP, Griffiths TD. Reliable Web-Based Auditory Cognitive Testing: Observational Study. J Med Internet Res 2024; 26:e58444. PMID: 39652871. PMCID: PMC11667740. DOI: 10.2196/58444.
Abstract
BACKGROUND Web-based experimentation, accelerated by the COVID-19 pandemic, has enabled large-scale participant recruitment and data collection. Auditory testing on the web has shown promise but faces challenges such as uncontrolled environments and verifying headphone use. Prior studies have successfully replicated auditory experiments but often involved younger participants, limiting generalizability to older adults with varying hearing abilities. This study explores the feasibility of conducting reliable auditory cognitive testing using a web-based platform, especially among older adults.
OBJECTIVE This study aims to determine whether demographic factors such as age and hearing status influence participation in web-based auditory cognitive experiments and to assess the reproducibility of auditory cognitive measures, specifically speech-in-noise perception and auditory memory (AuM), between in-person and web-based settings. Additionally, this study aims to examine the relationship between musical sophistication, measured by the Goldsmiths Musical Sophistication Index (GMSI), and auditory cognitive measures across different testing environments.
METHODS A total of 153 participants aged 50 to 86 years were recruited from local registries and memory clinics; 58 of these returned for web-based follow-up assessments. An additional 89 participants from the PREVENT cohort were included in the web-based study, forming a combined sample. Participants completed speech-in-noise perception tasks (Digits-in-Noise and Speech-in-Babble), AuM tests for frequency and amplitude modulation rate, and the GMSI questionnaire. In-person testing was conducted in a soundproof room with standardized equipment, while web-based tests required participants to use headphones in a quiet room via a web-based app. The reproducibility of auditory measures was evaluated using Pearson and intraclass correlation coefficients, and statistical analyses assessed relationships between variables across settings.
RESULTS Older participants and those with severe hearing loss were underrepresented in the web-based follow-up. The GMSI questionnaire demonstrated the highest reproducibility (r=0.82), while the auditory cognitive tasks showed moderate reproducibility (Digits-in-Noise and Speech-in-Babble: r=0.55; AuM for frequency: r=0.75; AuM for amplitude modulation rate: r=0.44). There were no significant differences in the correlation between age and auditory measures across in-person and web-based settings (all P>.05). The study replicated previously reported associations between AuM and GMSI scores, as well as sentence-in-noise perception, indicating consistency across testing environments.
CONCLUSIONS Web-based auditory cognitive testing is feasible and yields results comparable to in-person testing, especially for questionnaire-based measures like the GMSI. While auditory tasks demonstrated moderate reproducibility, the consistent replication of key associations suggests that web-based testing is a viable alternative for auditory cognition research. However, the underrepresentation of older adults and those with severe hearing loss highlights a need to address barriers to web-based participation. Future work should explore methods to enhance inclusivity, such as remote guided testing, and address factors like digital literacy and equipment standards to improve the representativeness and quality of web-based auditory research.
Affiliation(s)
- Meher Lad
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
- John-Paul Taylor
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
- NIHR Newcastle Biomedical Research Centre, Newcastle University, Newcastle upon Tyne, United Kingdom
- Timothy David Griffiths
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
5
Vaziri PA, McDougle SD, Clark DA. Humans can use positive and negative spectrotemporal correlations to detect rising and falling pitch. bioRxiv [Preprint] 2024:2024.08.03.606481. PMID: 39131316. PMCID: PMC11312537. DOI: 10.1101/2024.08.03.606481.
Abstract
To discern speech or appreciate music, the human auditory system detects how pitch increases or decreases over time. However, the algorithms used to detect changes in pitch, or pitch motion, are incompletely understood. Here, using psychophysics, computational modeling, functional neuroimaging, and analysis of recorded speech, we ask whether humans can detect pitch motion using computations analogous to those used by the visual system. We adapted stimuli from studies of vision to create novel auditory correlated noise stimuli that elicited robust pitch motion percepts. Crucially, these stimuli are inharmonic and possess no persistent features across frequency or time, but do possess positive or negative local spectrotemporal correlations in intensity. In psychophysical experiments, we found clear evidence that humans can judge pitch direction based only on positive or negative spectrotemporal intensity correlations. The key behavioral result, robust sensitivity to the negative spectrotemporal correlations, is a direct analogue of illusory "reverse-phi" motion in vision and thus constitutes a new auditory illusion. Our behavioral results and computational modeling led us to hypothesize that human auditory processing may employ pitch direction opponency. fMRI measurements in auditory cortex supported this hypothesis. To link our psychophysical findings to real-world pitch perception, we analyzed recordings of English and Mandarin speech and found that pitch direction was robustly signaled by both positive and negative spectrotemporal correlations, suggesting that sensitivity to both types of correlations confers ecological benefits. Overall, this work reveals how motion detection algorithms sensitive to local correlations are deployed by the central nervous system across disparate modalities (vision and audition) and dimensions (space and frequency).
6
Madsen SMK, Oxenham AJ. Mistuning perception in music is asymmetric and relies on both beats and inharmonicity. Commun Psychol 2024; 2:91. PMID: 39358548. PMCID: PMC11447020. DOI: 10.1038/s44271-024-00141-1.
Abstract
An out-of-tune singer or instrument can ruin the enjoyment of music. However, there is disagreement on how we perceive mistuning in natural music settings. To address this question, we presented listeners with in-tune and out-of-tune passages of two-part music and manipulated the two primary candidate acoustic cues: beats (fluctuations caused by interactions between nearby frequency components) and inharmonicity (non-integer harmonic frequency relationships) across seven experiments (Exp 1: N = 101; Exp 2: N = 63; Exp 3a: N = 87; Exp 3b: N = 28; Exp 3c: N = 69; Exp 4: N = 160; Exp 5: N = 105). Mistuning detection worsened markedly when removing either beating or inharmonicity cues, suggesting important contributions from both. The relative importance of the two cues varied reliably between listeners but was unaffected by musical experience. Finally, a general asymmetry in sensitivity to mistuning was discovered, with compressed pitch differences being more easily detected than stretched ones, thereby demonstrating a generalization of the previously found stretched-octave effect. Overall, the results reveal the acoustic underpinnings of the critical perceptual phenomenon of dissonance through mistuning in natural music.
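The beating cue examined in this study can be illustrated with a minimal numerical sketch (hypothetical frequencies, not the study's stimuli): the beat rate between two nearby frequency components equals the absolute difference of their frequencies, so an in-tune interval whose harmonics coincide produces no beats, while a slightly mistuned one does.

```python
def beat_rate(f1, f2):
    """Beats per second produced by mixing pure tones at f1 and f2 (Hz).

    When two nearby components are mixed, the sum waveform fluctuates in
    amplitude at a rate equal to the difference of the two frequencies.
    """
    return abs(f1 - f2)

# In-tune fifth: the 3rd harmonic of 220 Hz and the 2nd harmonic of 330 Hz
# coincide at 660 Hz, so there is no beating.
print(beat_rate(3 * 220.0, 2 * 330.0))  # 0.0

# Mistune the upper note by 1%: its 2nd harmonic moves to ~666.6 Hz,
# producing roughly 6.6 beats per second against the 660 Hz harmonic.
print(beat_rate(3 * 220.0, 2 * 330.0 * 1.01))
```

This is only the beating cue; the inharmonicity cue manipulated in the experiments (non-integer harmonic ratios) is a separate property of each tone's spectrum.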
Affiliation(s)
- Sara M K Madsen
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- Hearing Systems Group, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
7
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024; 10:eadk0010. PMID: 38363839. PMCID: PMC10871532. DOI: 10.1126/sciadv.adk0010.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
8
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv [Preprint] 2023:2023.10.17.562771. PMID: 37905047. PMCID: PMC10614915. DOI: 10.1101/2023.10.17.562771.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex.
Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
9
Han Z, Zhu H, Shen Y, Tian X. Segregation and integration of sensory features by flexible temporal characteristics of independent neural representations. Cereb Cortex 2023; 33:9542-9553. PMID: 37344250. DOI: 10.1093/cercor/bhad225.
Abstract
Segregation and integration are two fundamental yet competing computations in cognition. For example, in serial speech processing, stable perception necessitates the sequential establishment of perceptual representations to remove irrelevant features for achieving invariance, whereas multiple features need to combine to create a coherent percept. How to simultaneously achieve the seemingly contradictory computations of segregation and integration in a serial process is unclear. To investigate their neural mechanisms, we used loudness and lexical tones as a research model and employed a novel multilevel oddball paradigm with electroencephalogram (EEG) recordings to explore the dynamics of mismatch negativity (MMN) responses to their deviants. When the two types of deviants were presented separately, distinct topographies of MMNs to loudness and tones were observed at different latencies (loudness earlier), supporting the sequential dynamics of independent representations for the two features. When they changed simultaneously, the latency of responses to tones became shorter and aligned with that to loudness, while the topographies remained independent, yielding a combined MMN that was a linear sum of the single MMNs to loudness and tones. These results suggest that neural dynamics can be temporally synchronized to distinct sensory features and balance the computational demands of segregation and integration, providing a basis for invariance and feature binding in serial processing.
Affiliation(s)
- Zhili Han
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Hao Zhu
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning; Division of Arts and Sciences, NYU Shanghai, Shanghai 200126, China
- Yunyun Shen
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Cognitive Neuroimaging Unit, INSERM, CEA, CNRS, Université Paris-Saclay, NeuroSpin Center, Gif-sur-Yvette 91191, France
- Xing Tian
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning; Division of Arts and Sciences, NYU Shanghai, Shanghai 200126, China