1
Kwon J, Hwang J, Sung JE, Im CH. Speech synthesis from three-axis accelerometer signals using conformer-based deep neural network. Comput Biol Med 2024; 182:109090. [PMID: 39232406 DOI: 10.1016/j.compbiomed.2024.109090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 08/23/2024] [Accepted: 08/29/2024] [Indexed: 09/06/2024]
Abstract
Silent speech interfaces (SSIs) have emerged as innovative non-acoustic communication methods, and our previous study demonstrated the significant potential of three-axis accelerometer-based SSIs to identify silently spoken words with high classification accuracy. The developed accelerometer-based SSI with only four accelerometers and a small training dataset outperformed a conventional surface electromyography (sEMG)-based SSI. In this study, motivated by the promising initial results, we investigated the feasibility of synthesizing spoken speech from three-axis accelerometer signals. This exploration aimed to assess the potential of accelerometer-based SSIs for practical silent communication applications. Nineteen healthy individuals participated in our experiments. Five accelerometers were attached to the face to acquire speech-related facial movements while the participants read 270 Korean sentences aloud. For the speech synthesis, we used a convolution-augmented Transformer (Conformer)-based deep neural network model to convert the accelerometer signals into a Mel spectrogram, from which an audio waveform was synthesized using HiFi-GAN. To evaluate the quality of the generated Mel spectrograms, ten-fold cross-validation was performed, and the Mel cepstral distortion (MCD) was chosen as the evaluation metric. As a result, an average MCD of 5.03 ± 0.65 was achieved using four optimized accelerometers based on our previous study. Furthermore, the quality of generated Mel spectrograms was significantly enhanced by adding one more accelerometer attached under the chin, achieving an average MCD of 4.86 ± 0.65 (p < 0.001, Wilcoxon signed-rank test). Although an objective comparison is difficult, these results surpass those obtained using conventional SSIs based on sEMG, electromagnetic articulography, and electropalatography with the fewest sensors and a similar or smaller number of sentences to train the model. 
Our proposed approach will contribute to the widespread adoption of accelerometer-based SSIs, leveraging the advantages of accelerometers like low power consumption, invulnerability to physiological artifacts, and high portability.
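For readers unfamiliar with the evaluation metric, the following is a minimal sketch of one common definition of Mel cepstral distortion. The abstract does not state which variant the authors used, so the formula (per-frame distortion in dB, 0th coefficient excluded, frames assumed to be already time-aligned) and the function name `mel_cepstral_distortion` are assumptions made for illustration only.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Mean MCD in dB between two time-aligned sequences of mel-cepstral frames.

    Each frame is a sequence of coefficients. Index 0 (often the energy term)
    is skipped; this is a common convention and an assumption here.
    """
    if len(ref_frames) != len(syn_frames) or not ref_frames:
        raise ValueError("need two non-empty sequences with the same number of frames")
    # Constant factor (10 / ln 10) * sqrt(2) from the usual MCD definition.
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        # Squared Euclidean distance over coefficients 1..D.
        squared = sum((r - s) ** 2 for r, s in zip(ref[1:], syn[1:]))
        total += scale * math.sqrt(squared)
    return total / len(ref_frames)
```

Lower values indicate that the synthesized spectrogram is closer to the reference; identical inputs give a distortion of zero.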
Affiliation(s)
- Jinuk Kwon
- Department of Electronic Engineering, Hanyang University, Seoul, South Korea.
- Jihun Hwang
- Department of Electronic Engineering, Hanyang University, Seoul, South Korea.
- Jee Eun Sung
- Department of Communication Disorders, Ewha Womans University, Seoul, South Korea.
- Chang-Hwan Im
- Department of Electronic Engineering, Hanyang University, Seoul, South Korea; Department of Biomedical Engineering, Hanyang University, Seoul, South Korea; Department of Artificial Intelligence, Hanyang University, Seoul, South Korea; Department of HY-KIST Bio-Convergence, Hanyang University, Seoul, South Korea.

2
Liu S, Fawden T, Zhu R, Malliaras GG, Bance M. A data-efficient and easy-to-use lip language interface based on wearable motion capture and speech movement reconstruction. Sci Adv 2024; 10:eado9576. [PMID: 38924408 PMCID: PMC11204283 DOI: 10.1126/sciadv.ado9576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 05/21/2024] [Indexed: 06/28/2024]
Abstract
Lip language recognition urgently needs wearable, easy-to-use interfaces for interference-free, high-fidelity acquisition of lip movements, together with data-efficient decoder-modeling methods. Existing solutions suffer from unreliable lip reading, are data hungry, and generalize poorly. Here, we propose a wearable lip language decoding technology that enables interference-free and high-fidelity acquisition of lip movements and data-efficient recognition of fluent lip language based on wearable motion capture and continuous lip speech movement reconstruction. The method allows us to artificially generate any desired continuous speech dataset from a very limited corpus of word samples from users. By using these artificial datasets to train the decoder, we achieve an average accuracy of 92.0% across individuals (n = 7) for actual continuous and fluent lip speech recognition for 93 English sentences, with no training burden on users because all training datasets are artificially generated. Our method greatly reduces users' training/learning load and presents a data-efficient and easy-to-use paradigm for lip language recognition.
Affiliation(s)
- Shiqiang Liu
- State Key Laboratory of Precision Measurement Technology and Instrument, Department of Precision Instrument, Tsinghua University, Beijing 100084, China
- Terry Fawden
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB23EB, UK
- Rong Zhu
- State Key Laboratory of Precision Measurement Technology and Instrument, Department of Precision Instrument, Tsinghua University, Beijing 100084, China
- George G. Malliaras
- Electrical Engineering Division, Department of Engineering, University of Cambridge, Cambridge CB3 0FA, UK
- Manohar Bance
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB23EB, UK

3
Ornelas G, Bueno Garcia H, Bracken DJ, Linnemeyer-Risser K, Coleman TP, Weissbrod PA. Differentiation of Bolus Texture During Deglutition via High-Density Surface Electromyography: A Pilot Study. Laryngoscope 2023; 133:2695-2703. [PMID: 36734335 DOI: 10.1002/lary.30589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Revised: 10/26/2022] [Accepted: 12/03/2022] [Indexed: 02/04/2023]
Abstract
OBJECTIVE: Swallowing is a complex neuromuscular task. There are limited spatiotemporal data on the normative surface electromyographic signal during swallowing, particularly across standard textures. We hypothesize that the pattern of the electromyographic signal of the anterior neck varies cranio-caudally, that laterality can be evaluated, and that bolus textures can be differentiated by high-density surface electromyography (HDsEMG) through signal analysis. METHODS: An HDsEMG grid of 20 electrodes captured electromyographic activity in eight healthy adult subjects across 240 total swallows. Participants swallowed five standard textures: saliva, thin liquid, puree, mixed consistency, and dry solid. Data were bandpass filtered, functionally aligned, and then evaluated with binary-classifier receiver operating characteristic (ROC) curves. Muscular activity was visualized by creating two-dimensional EMG heat maps. RESULTS: Signal analysis demonstrated a positive correlation between signal amplitude and bolus texture. Greater differences in amplitude in the cranial-most region of the array compared with the caudal-most region were noted in all subjects. Lateral comparison of the array revealed symmetric power levels across all subjects and textures. ROC curves demonstrated the ability to correctly classify textures within subjects in 6 of 10 texture comparisons. CONCLUSION: This pilot study suggests that HDsEMG during deglutition can noninvasively differentiate swallows of varying texture. This may prove useful in future diagnostic and behavioral swallow applications. LEVEL OF EVIDENCE: 4. Laryngoscope, 133:2695-2703, 2023.
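Five textures yield ten unordered pairs, which matches the ten texture comparisons reported, assuming the comparisons were pairwise. The sketch below enumerates those pairs and computes the area under an ROC curve for a two-class comparison using the rank-based (Mann-Whitney) identity. The function name `roc_auc` and any scores passed to it are hypothetical; this is not the study's data or its processing pipeline.

```python
from itertools import combinations

# The five standard textures named in the abstract.
TEXTURES = ["saliva", "thin liquid", "puree", "mixed consistency", "dry solid"]

# Every unordered pair of textures, i.e. one binary comparison per pair.
TEXTURE_PAIRS = list(combinations(TEXTURES, 2))


def roc_auc(scores_a, scores_b):
    """Area under the ROC curve when class A is treated as 'positive'.

    Equals the probability that a random A score exceeds a random B score,
    with ties counted as one half.
    """
    if not scores_a or not scores_b:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for a in scores_a:
        for b in scores_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_a) * len(scores_b))
```

An AUC near 0.5 means the two textures are not separable by the chosen score, while values near 1.0 (or 0.0) indicate clear separation.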
Affiliation(s)
- Gladys Ornelas
- Department of Bioengineering, University of California San Diego, La Jolla, California, U.S.A
- Hassler Bueno Garcia
- Department of Bioengineering, University of California San Diego, La Jolla, California, U.S.A
- David J Bracken
- Department of Otolaryngology, University of California San Francisco, San Francisco, California, U.S.A
- Todd P Coleman
- Department of Bioengineering, University of California San Diego, La Jolla, California, U.S.A
- Philip A Weissbrod
- Department of Otolaryngology, University of California San Diego, La Jolla, California, U.S.A

4
Ershad F, Patel S, Yu C. Wearable bioelectronics fabricated in situ on skins. NPJ Flex Electron 2023; 7:32. [PMID: 38665149 PMCID: PMC11041641 DOI: 10.1038/s41528-023-00265-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/05/2023] [Accepted: 07/04/2023] [Indexed: 04/28/2024]
Abstract
In recent years, wearable bioelectronics has rapidly expanded for diagnosing, monitoring, and treating various pathological conditions from the skin surface. Although the devices are typically prefabricated as soft patches for general usage, there is a growing need for devices that are customized in situ to provide accurate data and precise treatment. In this perspective, the state-of-the-art in situ fabricated wearable bioelectronics are summarized, focusing primarily on Drawn-on-Skin (DoS) bioelectronics and other in situ fabrication methods. The advantages and limitations of these technologies are evaluated and potential future directions are suggested for the widespread adoption of these technologies in everyday life.
Affiliation(s)
- Faheem Ershad
- Department of Biomedical Engineering, Pennsylvania State University, University Park, PA 16801 USA
- Shubham Patel
- Department of Engineering Science and Mechanics, Pennsylvania State University, University Park, PA 16801 USA
- Cunjiang Yu
- Department of Biomedical Engineering, Pennsylvania State University, University Park, PA 16801 USA
- Department of Engineering Science and Mechanics, Pennsylvania State University, University Park, PA 16801 USA
- Department of Materials Science and Engineering, Materials Research Institute, Pennsylvania State University, University Park, PA 16801 USA

5
Ershad F, Houston M, Patel S, Contreras L, Koirala B, Lu Y, Rao Z, Liu Y, Dias N, Haces-Garcia A, Zhu W, Zhang Y, Yu C. Customizable, reconfigurable, and anatomically coordinated large-area, high-density electromyography from drawn-on-skin electrode arrays. PNAS Nexus 2023; 2:pgac291. [PMID: 36712933 PMCID: PMC9837666 DOI: 10.1093/pnasnexus/pgac291] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/22/2022] [Accepted: 12/09/2022] [Indexed: 06/18/2023]
Abstract
Accurate anatomical matching for patient-specific electromyographic (EMG) mapping is crucial yet technically challenging in various medical disciplines. The fixed electrode construction of multielectrode arrays (MEAs) makes it nearly impossible to match an individual's unique muscle anatomy. This mismatch between the MEAs and target muscles leads to missing relevant muscle activity, highly redundant data, complicated electrode placement optimization, and inaccuracies in classification algorithms. Here, we present customizable and reconfigurable drawn-on-skin (DoS) MEAs as the first demonstration of high-density EMG mapping from in situ-fabricated electrodes with tunable configurations adapted to subject-specific muscle anatomy. The DoS MEAs show uniform electrical properties and can map EMG activity with high fidelity under skin deformation-induced motion, which stems from the unique and robust skin-electrode interface. They can be used to localize innervation zones (IZs), detect motor unit propagation, and capture EMG signals with consistent quality during large muscle movements. Reconfiguring the electrode arrangement of DoS MEAs to match and extend the coverage of the forearm flexors enables localization of the muscle activity and prevents missed information such as IZs. In addition, DoS MEAs customized to the specific anatomy of subjects produce highly informative data, leading to more accurate finger gesture detection and prosthetic control than conventional technology.
Affiliation(s)
- Faheem Ershad
- Department of Biomedical Engineering, Pennsylvania State University, University Park, PA, 16801, USA
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA
- Michael Houston
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA
- Shubham Patel
- Department of Engineering Science and Mechanics, Pennsylvania State University, University Park, PA, 16801, USA
- Department of Mechanical Engineering, University of Houston, Houston, TX, 77204, USA
- Luis Contreras
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA
- Bikram Koirala
- Department of Mechanical Engineering, University of Houston, Houston, TX, 77204, USA
- Department of Engineering Technology, University of Houston, Houston, TX, 77204, USA
- Yuntao Lu
- Department of Engineering Science and Mechanics, Pennsylvania State University, University Park, PA, 16801, USA
- Materials Science and Engineering Program, University of Houston, Houston, TX, 77204, USA
- Zhoulyu Rao
- Department of Engineering Science and Mechanics, Pennsylvania State University, University Park, PA, 16801, USA
- Materials Science and Engineering Program, University of Houston, Houston, TX, 77204, USA
- Yang Liu
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA
- Nicholas Dias
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA
- Arturo Haces-Garcia
- Department of Engineering Technology, University of Houston, Houston, TX, 77204, USA
- Department of Electrical and Computer Engineering, University of Houston, Houston, TX, 77204, USA
- Weihang Zhu
- Department of Mechanical Engineering, University of Houston, Houston, TX, 77204, USA
- Department of Engineering Technology, University of Houston, Houston, TX, 77204, USA
- Yingchun Zhang
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA

6
Programmable living assembly of materials by bacterial adhesion. Nat Chem Biol 2022; 18:289-294. [PMID: 34934187 DOI: 10.1038/s41589-021-00934-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 02/28/2021] [Accepted: 10/22/2021] [Indexed: 11/08/2022]
Abstract
The field of engineered living materials aims to construct functional materials with desirable properties of natural living systems. A recent study demonstrated the programmed self-assembly of bacterial populations by engineered adhesion. Here we use this strategy to engineer self-healing living materials with versatile functions. Bacteria displaying outer membrane-anchored nanobody-antigen pairs are cultured separately and, when mixed, adhere to each other to enable processing into functional materials, which we term living assembled material by bacterial adhesion (LAMBA). LAMBA is programmable and can be functionalized with extracellular moieties up to 545 amino acids. Notably, the adhesion between nanobody-antigen pairs in LAMBA leads to fast recovery under stretching or bending. By exploiting this feature, we fabricated wearable LAMBA sensors that can detect bioelectrical or biomechanical signals. Our work establishes a scalable approach to produce genetically editable and self-healable living functional materials that can be applied in biomanufacturing, bioremediation and soft bioelectronics assembly.