1
Zhang Z, Lang Z, Chen G, Zhou H, Zhou W. Development of generic metabolic Raman calibration models using solution titration in aqueous phase and data augmentation for in-line cell culture analysis. Biotechnol Bioeng 2024. [PMID: 38639160 DOI: 10.1002/bit.28717] [Received: 09/07/2023] [Revised: 02/29/2024] [Accepted: 04/08/2024] [Indexed: 04/20/2024]
Abstract
This study presents a novel approach for developing generic metabolic Raman calibration models for in-line cell culture analysis using glucose and lactate stock solution titration in an aqueous phase and data augmentation techniques. First, the titration method was set up by adding glucose or lactate solution at several different constant rates into the aqueous phase of a bench-top bioreactor. Subsequently, the in-line glucose and lactate concentrations were calculated and interpolated based on the rate of glucose and lactate addition, enabling data augmentation and enhancing the robustness of the metabolic calibration model. Nine different combinations of spectra pretreatment, wavenumber range selection, and number of latent variables were evaluated and optimized using aqueous titration data as the training set and a historical cell culture data set as the validation and prediction set. Finally, Raman spectroscopy data collected from 11 historical cell culture batches (spanning four culture modes and scales ranging from 3 to 200 L) were used to predict the corresponding glucose and lactate values. The results demonstrated high prediction accuracy, with average root mean square errors of prediction of 0.65 g/L for glucose and 0.48 g/L for lactate. This method establishes a generic metabolic calibration model, and its applicability can be extended to other metabolites, reducing the cost of deploying real-time cell culture monitoring using Raman spectroscopy in bioprocesses.
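The labeling step in this abstract, where in-line concentrations are calculated from the addition rate, can be illustrated with a simple mass balance. The following is a minimal sketch and not the authors' published procedure: it assumes ideal mixing, a constant feed rate, and no sampling or evaporation losses, and every function and parameter name is hypothetical.

```python
def titration_concentration(t_h, feed_rate_l_per_h, stock_g_per_l,
                            initial_volume_l, initial_conc_g_per_l=0.0):
    """Analyte concentration (g/L) after t_h hours of constant-rate feeding.

    Assumes ideal mixing and no volume losses (illustrative assumption).
    """
    added_volume_l = feed_rate_l_per_h * t_h
    total_mass_g = (initial_conc_g_per_l * initial_volume_l
                    + stock_g_per_l * added_volume_l)
    return total_mass_g / (initial_volume_l + added_volume_l)


def label_spectra(timestamps_h, **titration_kwargs):
    """Assign a calculated concentration to every spectrum timestamp."""
    return [titration_concentration(t, **titration_kwargs) for t in timestamps_h]


# Hypothetical run: 200 g/L glucose stock fed at 0.5 L/h into 1.5 L of water.
labels = label_spectra(
    [0.0, 0.5, 1.0],
    feed_rate_l_per_h=0.5, stock_g_per_l=200.0, initial_volume_l=1.5,
)
```

Because every spectrum acquired during the titration receives a calculated label, a single run yields many labeled training points without off-line reference measurements, which is the sense in which the interpolation acts as data augmentation.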
Affiliation(s)
- Zhijun Zhang
  - Cell Culture Process Development (CCPD), WuXi Biologics, Shanghai, China
- Zhe Lang
  - Cell Culture Process Development (CCPD), WuXi Biologics, Shanghai, China
- Gong Chen
  - Cell Culture Process Development (CCPD), WuXi Biologics, Shanghai, China
- Hang Zhou
  - Cell Culture Process Development (CCPD), WuXi Biologics, Shanghai, China
- Weichang Zhou
  - Global Biologics Development and Operations (GBDO), WuXi Biologics, Shanghai, China
2
Du X, Ding X, Xi M, Lv Y, Qiu S, Liu Q. A Data Augmentation Method for Motor Imagery EEG Signals Based on DCGAN-GP Network. Brain Sci 2024; 14:375. [PMID: 38672024 DOI: 10.3390/brainsci14040375] [Received: 03/20/2024] [Revised: 04/09/2024] [Accepted: 04/11/2024] [Indexed: 04/28/2024] [Open Access]
Abstract
Motor imagery electroencephalography (EEG) signals have garnered attention in brain-computer interface (BCI) research due to their potential in promoting motor rehabilitation and control. However, the limited availability of labeled data poses challenges for training robust classifiers. In this study, we propose a novel data augmentation method utilizing an improved Deep Convolutional Generative Adversarial Network with Gradient Penalty (DCGAN-GP) to address this issue. We transformed raw EEG signals into two-dimensional time-frequency maps and employed a DCGAN-GP network to generate synthetic time-frequency representations resembling real data. Validation experiments were conducted on the BCI IV 2b dataset, comparing the performance of classifiers trained with augmented and unaugmented data. Results demonstrated that classifiers trained with synthetic data exhibit enhanced robustness across multiple subjects and achieve higher classification accuracy. Our findings highlight the effectiveness of using DCGAN-GP-generated synthetic EEG data to improve classifier performance in distinguishing different motor imagery tasks. Thus, the proposed DCGAN-GP-based data augmentation method offers a promising avenue for enhancing BCI system performance, overcoming data scarcity, and bolstering classifier robustness, thereby supporting the broader adoption of BCI technology in real-world applications.
Affiliation(s)
- Xiuli Du
  - Communication and Network Laboratory, Dalian University, Dalian 116622, China
- Xiaohui Ding
  - Communication and Network Laboratory, Dalian University, Dalian 116622, China
- Meiling Xi
  - Communication and Network Laboratory, Dalian University, Dalian 116622, China
- Yana Lv
  - Communication and Network Laboratory, Dalian University, Dalian 116622, China
- Shaoming Qiu
  - Communication and Network Laboratory, Dalian University, Dalian 116622, China
- Qingli Liu
  - Communication and Network Laboratory, Dalian University, Dalian 116622, China
3
Norris ML, Obeid N, El-Emam K. Examining the role of artificial intelligence to advance knowledge and address barriers to research in eating disorders. Int J Eat Disord 2024. [PMID: 38597344 DOI: 10.1002/eat.24215] [Received: 11/17/2023] [Revised: 03/22/2024] [Accepted: 03/22/2024] [Indexed: 04/11/2024]
Abstract
OBJECTIVE: To provide a brief overview of artificial intelligence (AI) application within the field of eating disorders (EDs) and propose focused solutions for research. METHOD: An overview and summary of AI application pertinent to EDs is provided, with a focus on AI's ability to address issues relating to data sharing and pooling (and associated privacy concerns), data augmentation, and bias within datasets. RESULTS: In addition to clinical applications, AI can provide useful tools to help combat commonly encountered challenges in ED research, including low prevalence of specific subpopulations of patients, small overall sample sizes, and bias within datasets. DISCUSSION: There is tremendous potential to embed and utilize various facets of AI to help improve our understanding of EDs and to further evaluate and investigate questions that ultimately seek to improve outcomes. Beyond the technology, issues relating to regulation of AI, ethical guidelines for its application, and the trust of providers and patients must all be addressed for ultimate adoption and acceptance into ED practice. PUBLIC SIGNIFICANCE: AI offers significant potential within the realm of EDs and encompasses a broad set of techniques that offer utility in various facets of ED research and, by extension, delivery of clinical care. Beyond the technology, issues relating to regulation, ethical guidelines for application, and the trust of providers and patients must be addressed for the ultimate adoption and acceptance of AI into ED practice.
Affiliation(s)
- Mark L Norris
  - Department of Pediatrics, Children's Hospital of Eastern Ontario (CHEO), University of Ottawa, Ottawa, Ontario, Canada
  - CHEO Research Institute, Ottawa, Ontario, Canada
- Nicole Obeid
  - CHEO Research Institute, Ottawa, Ontario, Canada
  - Department of Psychiatry, University of Ottawa, Ottawa, Ontario, Canada
- Khaled El-Emam
  - CHEO Research Institute, Ottawa, Ontario, Canada
  - School of Epidemiology and Public Health, University of Ottawa, Ottawa, Ontario, Canada
4
You L, Liu X, Krischer J. A discrete approximation method for modeling interval-censored multistate data. Stat Med 2024. [PMID: 38599784 DOI: 10.1002/sim.10079] [Received: 03/31/2023] [Revised: 01/07/2024] [Accepted: 04/02/2024] [Indexed: 04/12/2024]
Abstract
Many longitudinal studies are designed to monitor participants for major events related to the progression of diseases. Data arising from such longitudinal studies are usually subject to interval censoring since the events are only known to occur between two monitoring visits. In this work, we propose a new method to handle interval-censored multistate data within a proportional hazards model framework where the hazard rate of events is modeled by a nonparametric function of time and the covariates affect the hazard rate proportionally. The main idea of this method is to simplify the likelihood functions of a discrete-time multistate model through an approximation and the application of data augmentation techniques, where the assumed presence of censored information facilitates a simpler parameterization. Then the expectation-maximization algorithm is used to estimate the parameters in the model. The performance of the proposed method is evaluated by numerical studies. Finally, the method is employed to analyze a dataset on tracking the advancement of coronary allograft vasculopathy following heart transplantation.
Affiliation(s)
- Lu You
  - Health Informatics Institute, University of South Florida, Tampa, Florida, USA
- Xiang Liu
  - Health Informatics Institute, University of South Florida, Tampa, Florida, USA
- Jeffrey Krischer
  - Health Informatics Institute, University of South Florida, Tampa, Florida, USA
5
Baek H, Yu S, Son S, Seo J, Chung Y. Automated Region of Interest-Based Data Augmentation for Fallen Person Detection in Off-Road Autonomous Agricultural Vehicles. Sensors (Basel) 2024; 24:2371. [PMID: 38610583 PMCID: PMC11014021 DOI: 10.3390/s24072371] [Received: 01/30/2024] [Revised: 03/18/2024] [Accepted: 04/05/2024] [Indexed: 04/14/2024]
Abstract
Due to the global population increase and the recovery of agricultural demand after the COVID-19 pandemic, the importance of agricultural automation and autonomous agricultural vehicles is growing. Fallen person detection is critical to preventing fatal accidents during autonomous agricultural vehicle operations. However, there is a challenge due to the relatively limited dataset for fallen persons in off-road environments compared to on-road pedestrian datasets. To enhance the generalization performance of fallen person detection off-road using object detection technology, data augmentation is necessary. This paper proposes a data augmentation technique called Automated Region of Interest Copy-Paste (ARCP) to address the issue of data scarcity. The technique involves copying real fallen person objects obtained from public source datasets and then pasting the objects onto a background off-road dataset. Segmentation annotations for these objects are generated using YOLOv8x-seg and Grounded-Segment-Anything, respectively. The proposed algorithm is then applied to automatically produce augmented data based on the generated segmentation annotations. The technique encompasses segmentation annotation generation, Intersection over Union-based segment setting, and Region of Interest configuration. When the ARCP technique is applied, significant improvements in detection accuracy are observed for two state-of-the-art object detectors: anchor-based YOLOv7x and anchor-free YOLOv8x, showing an increase of 17.8% (from 77.8% to 95.6%) and 12.4% (from 83.8% to 96.2%), respectively. This suggests high applicability for addressing the challenges of limited datasets in off-road environments and is expected to have a significant impact on the advancement of object detection technology in the agricultural industry.
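The copy-paste idea in this abstract can be sketched in a few lines. This is a generic illustration of mask-based copy-paste with an Intersection over Union (IoU) overlap check, not the ARCP algorithm itself: the function names, the nested-list image representation, and the `max_iou` threshold are assumptions made for the example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def can_paste(candidate_box, existing_boxes, max_iou=0.0):
    """Accept a paste location only if it barely overlaps existing objects."""
    return all(box_iou(candidate_box, b) <= max_iou for b in existing_boxes)


def paste_object(background, patch, mask, top, left):
    """Copy the masked pixels of patch onto a copy of background."""
    out = [row[:] for row in background]
    for i, (patch_row, mask_row) in enumerate(zip(patch, mask)):
        for j, (pixel, keep) in enumerate(zip(patch_row, mask_row)):
            if keep:
                out[top + i][left + j] = pixel
    return out


# Toy example: a 1x2 patch whose mask keeps only its first pixel.
background = [[0, 0, 0] for _ in range(3)]
augmented = paste_object(background, [[5, 6]], [[1, 0]], top=1, left=1)
```

Using the segmentation mask rather than the whole bounding box keeps the source background from being pasted along with the object.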
Affiliation(s)
- Hwapyeong Baek
  - Department of Computer Convergence Software, Korea University, Sejong 30019, Republic of Korea
- Seunghyun Yu
  - Department of Computer Convergence Software, Korea University, Sejong 30019, Republic of Korea
- Seungwook Son
  - Info Valley Korea Co., Ltd., Anyang 14067, Republic of Korea
- Jongwoong Seo
  - Department of Computer Convergence Software, Korea University, Sejong 30019, Republic of Korea
- Yongwha Chung
  - Department of Computer Convergence Software, Korea University, Sejong 30019, Republic of Korea
6
Ru Y, Wei Z, An G, Chen H. Combining data augmentation and deep learning for improved epilepsy detection. Front Neurol 2024; 15:1378076. [PMID: 38633533 PMCID: PMC11021591 DOI: 10.3389/fneur.2024.1378076] [Received: 01/29/2024] [Accepted: 03/18/2024] [Indexed: 04/19/2024] [Open Access]
Abstract
Introduction: In recent years, the use of EEG signals for seizure detection has gained widespread academic attention. To address the overfitting of deep learning models caused by the small amount of EEG data available for epilepsy detection, this paper proposes an epilepsy detection method that combines data augmentation and deep learning. Methods: First, the Adversarial and Mixup Data Augmentation (AMDA) method is used to augment the data, which effectively increases the number of training samples. To further improve the classification accuracy and robustness of epilepsy detection, this paper proposes an attention-based network model that combines a one-dimensional convolutional neural network and a gated recurrent unit (AM-1D CNN-GRU). Results and discussion: The experimental results show that epilepsy detection performance is significantly improved by using augmented data, with accuracy, sensitivity, and area under the receiver operating characteristic curve reaching 96.06%, 95.48%, and 0.9637, respectively. Compared with the non-augmented data, all indicators increased by more than 6.2%. The detection performance was also significantly better than that of other epilepsy detection methods. The results of this research can provide a reference for the clinical application of epilepsy detection.
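The mixup half of the augmentation named in this abstract follows a well-known recipe: two training samples and their labels are blended with the same random weight. The sketch below shows only standard mixup, under the assumption that samples are flat lists of floats and labels are one-hot lists. The adversarial component of AMDA is not reproduced, and the function signature is hypothetical.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, lam=None, rng=random):
    """Blend two samples and their one-hot labels with weight lam.

    If lam is not supplied, it is drawn from Beta(alpha, alpha).
    """
    if lam is None:
        lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy EEG segments with one-hot labels (seizure, non-seizure).
x_mix, y_mix, lam_used = mixup([0.0, 4.0], [1.0, 0.0],
                               [4.0, 0.0], [0.0, 1.0], lam=0.25)
```

Small values of `alpha` concentrate the weight near 0 or 1, so most mixed samples stay close to one of the two originals.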
Affiliation(s)
- Yandong Ru
  - School of Information Engineering, Zhejiang Ocean University, Zhoushan, China
  - Key Laboratory of Oceanographic Big Data Mining & Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan, China
- Zheng Wei
  - School of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin, China
- Gaoyang An
  - School of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin, China
- Hongming Chen
  - School of Information Engineering, Zhejiang Ocean University, Zhoushan, China
  - Key Laboratory of Oceanographic Big Data Mining & Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan, China
7
Tan EX, Tang J, Leong YX, Phang IY, Lee YH, Pun CS, Ling XY. Creating 3D Nanoparticle Structural Space via Data Augmentation to Bidirectionally Predict Nanoparticle Mixture's Purity, Size, and Shape from Extinction Spectra. Angew Chem Int Ed Engl 2024; 63:e202317978. [PMID: 38357744 DOI: 10.1002/anie.202317978] [Received: 11/24/2023] [Revised: 02/08/2024] [Accepted: 02/13/2024] [Indexed: 02/16/2024]
Abstract
Nanoparticle (NP) characterization is essential because diverse shapes, sizes, and morphologies inevitably occur in as-synthesized NP mixtures, profoundly impacting their properties and applications. Currently, the only technique to concurrently determine these structural parameters is electron microscopy, but it is time-intensive and tedious. Here, we create a three-dimensional (3D) NP structural space to concurrently determine the purity, size, and shape of 1000 sets of as-synthesized Ag nanocubes mixtures containing interfering nanospheres and nanowires from their extinction spectra, attaining low predictive errors at 2.7-7.9 %. We first use plasmonically-driven feature enrichment to extract localized surface plasmon resonance attributes from spectra and establish a lasso regressor (LR) model to predict purity, size, and shape. Leveraging the learned LR, we artificially generate 425,592 augmented extinction spectra to overcome data scarcity and create a comprehensive NP structural space to bidirectionally predict extinction spectra from structural parameters with <4 % error. Our interpretable NP structural space further elucidates the two higher-order combined electric dipole, quadrupole, and magnetic dipole as the critical structural parameter predictors. By incorporating other NP shapes and mixtures' extinction spectra, we anticipate our approach, especially the data augmentation, can create a fully generalizable NP structural space to drive on-demand, autonomous synthesis-characterization platforms.
Affiliation(s)
- Emily Xi Tan
  - Division of Chemistry and Biological Chemistry, School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, Singapore, 637371, Singapore
- Jingxiang Tang
  - Division of Mathematics, School of Physical and Mathematical Sciences Department, Nanyang Technological University, 21 Nanyang Link, Singapore, 637371, Singapore
- Yong Xiang Leong
  - Division of Chemistry and Biological Chemistry, School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, Singapore, 637371, Singapore
- In Yee Phang
  - Key Laboratory of Synthetic and Biological Colloids, Ministry of Education, International Joint Research Laboratory for Nano Energy Composites, School of Chemical and Material Engineering, Jiangnan University, Wuxi, 214122, People's Republic of China
- Yih Hong Lee
  - Division of Chemistry and Biological Chemistry, School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, Singapore, 637371, Singapore
- Chi Seng Pun
  - Division of Mathematics, School of Physical and Mathematical Sciences Department, Nanyang Technological University, 21 Nanyang Link, Singapore, 637371, Singapore
- Xing Yi Ling
  - Division of Chemistry and Biological Chemistry, School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, Singapore, 637371, Singapore
  - Key Laboratory of Synthetic and Biological Colloids, Ministry of Education, International Joint Research Laboratory for Nano Energy Composites, School of Chemical and Material Engineering, Jiangnan University, Wuxi, 214122, People's Republic of China
8
Suglia V, Palazzo L, Bevilacqua V, Passantino A, Pagano G, D’Addio G. A Novel Framework Based on Deep Learning Architecture for Continuous Human Activity Recognition with Inertial Sensors. Sensors (Basel) 2024; 24:2199. [PMID: 38610410 PMCID: PMC11014138 DOI: 10.3390/s24072199] [Received: 02/14/2024] [Revised: 02/26/2024] [Accepted: 03/08/2024] [Indexed: 04/14/2024]
Abstract
Frameworks for human activity recognition (HAR) can be applied in the clinical environment for monitoring patients' motor and functional abilities either remotely or within a rehabilitation program. Deep Learning (DL) models can be exploited to perform HAR by means of raw data, thus avoiding time-demanding feature engineering operations. Most works targeting HAR with DL-based architectures have tested the workflow performance on data related to a separate execution of the tasks. Hence, a paucity in the literature has been found with regard to frameworks aimed at recognizing continuously executed motor actions. In this article, the authors present the design, development, and testing of a DL-based workflow targeting continuous human activity recognition (CHAR). The model was trained on the data recorded from ten healthy subjects and tested on eight different subjects. Despite the limited sample size, the authors claim the capability of the proposed framework to accurately classify motor actions within a feasible time, thus making it potentially useful in a clinical scenario.
Affiliation(s)
- Vladimiro Suglia
  - Department of Electrical and Information Engineering (DEI), Polytechnic University of Bari, 70126 Bari, Italy
- Lucia Palazzo
  - Department of Electrical and Information Engineering (DEI), Polytechnic University of Bari, 70126 Bari, Italy
  - Scientific Clinical Institutes Maugeri SPA SB IRCCS, 70124 Bari, Italy
- Vitoantonio Bevilacqua
  - Department of Electrical and Information Engineering (DEI), Polytechnic University of Bari, 70126 Bari, Italy
  - Apulian Bioengineering S.R.L., Via delle Violette 14, 70026 Modugno, Italy
- Andrea Passantino
  - Scientific Clinical Institutes Maugeri SPA SB IRCCS, 70124 Bari, Italy
- Gaetano Pagano
  - Scientific Clinical Institutes Maugeri SPA SB IRCCS, 70124 Bari, Italy
- Giovanni D’Addio
  - Scientific Clinical Institutes Maugeri SPA SB IRCCS, 70124 Bari, Italy
9
Lee ZJ, Yang MR, Hwang BJ. A Sustainable Approach to Asthma Diagnosis: Classification with Data Augmentation, Feature Selection, and Boosting Algorithm. Diagnostics (Basel) 2024; 14:723. [PMID: 38611635 PMCID: PMC11011786 DOI: 10.3390/diagnostics14070723] [Received: 03/02/2024] [Revised: 03/21/2024] [Accepted: 03/26/2024] [Indexed: 04/14/2024] [Open Access]
Abstract
Asthma is a diverse disease that affects over 300 million individuals globally. The prevalence of asthma has increased by 50% every decade since the 1960s, making it a serious global health issue. In addition to its associated high mortality, asthma generates large economic losses due to the degradation of patients' quality of life and the impairment of their physical fitness. Asthma research has evolved in recent years to fully analyze why certain diseases develop based on a variety of data and observations of patients' performance. The advent of new techniques offers good opportunities and application prospects for the development of asthma diagnosis methods. Over the last few decades, techniques like data mining and machine learning have been utilized to diagnose asthma. Nevertheless, these traditional methods are unable to address all of the difficulties associated with improving a small dataset to increase its quantity, quality, and feature space complexity at the same time. In this study, we propose a sustainable approach to asthma diagnosis using advanced machine learning techniques. To be more specific, we use feature selection to find the most important features, data augmentation to improve the dataset's resilience, and the extreme gradient boosting algorithm for classification. Data augmentation in the proposed method involves generating synthetic samples to increase the size of the training dataset, which is then utilized to enhance the training data initially. This could lessen the phenomenon of imbalanced data related to asthma. Then, to improve diagnosis accuracy and prioritize significant features, the extreme gradient boosting technique is used. The outcomes indicate that the proposed approach performs better in terms of diagnostic accuracy than current techniques. Furthermore, five essential features are extracted to help physicians diagnose asthma.
Affiliation(s)
- Zne-Jung Lee
  - Department of Electronic and Information Engineering, School of Advanced Manufacturing, Fuzhou University, Quanzhou 362200, China
- Ming-Ren Yang
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 235, Taiwan
- Bor-Jiunn Hwang
  - College of Information Science, Ming Chuan University, Taoyuan 333, Taiwan
10
Wang W, Shang Z, Li C. Brain-inspired semantic data augmentation for multi-style images. Front Neurorobot 2024; 18:1382406. [PMID: 38596181 PMCID: PMC11002076 DOI: 10.3389/fnbot.2024.1382406] [Received: 02/05/2024] [Accepted: 03/04/2024] [Indexed: 04/11/2024] [Open Access]
Abstract
Data augmentation is an effective technique for automatically expanding training data in deep learning. Brain-inspired methods are approaches that draw inspiration from the functionality and structure of the human brain and apply these mechanisms and principles to artificial intelligence and computer science. When there is a large style difference between training data and testing data, common data augmentation methods cannot effectively enhance the generalization performance of the deep model. To solve this problem, we improve modeling Domain Shifts with Uncertainty (DSU) and propose a new brain-inspired computer vision image data augmentation method which consists of two key components, namely, using Robust statistics and controlling the Coefficient of variance for DSU (RCDSU) and Feature Data Augmentation (FeatureDA). RCDSU calculates feature statistics (mean and standard deviation) with robust statistics to weaken the influence of outliers, making the statistics close to the real values and improving the robustness of deep learning models. By controlling the coefficient of variance, RCDSU makes the feature statistics shift with semantic preservation and increases shift range. FeatureDA controls the coefficient of variance similarly to generate the augmented features with semantics unchanged and increase the coverage of augmented features. RCDSU and FeatureDA are proposed to perform style transfer and content transfer in the feature space, and improve the generalization ability of the model at the style and content level respectively. On Photo, Art Painting, Cartoon, and Sketch (PACS) multi-style classification task, RCDSU plus FeatureDA achieves competitive accuracy. After adding Gaussian noise to PACS dataset, RCDSU plus FeatureDA shows strong robustness against outliers. FeatureDA achieves excellent results on CIFAR-100 image classification task. 
RCDSU plus FeatureDA can be applied as a novel brain-inspired semantic data augmentation method with implicit robot automation which is suitable for datasets with large style differences between training and testing data.
Affiliation(s)
- Zhaowei Shang
  - College of Computer Science, Chongqing University, Chongqing, China
11
Hoang QT, Pham XH, Trinh XT, Le AV, Bui MV, Bui TT. An Efficient CNN-Based Method for Intracranial Hemorrhage Segmentation from Computerized Tomography Imaging. J Imaging 2024; 10:77. [PMID: 38667975 DOI: 10.3390/jimaging10040077] [Received: 01/22/2024] [Revised: 03/21/2024] [Accepted: 03/22/2024] [Indexed: 04/28/2024] [Open Access]
Abstract
Intracranial hemorrhage (ICH) resulting from traumatic brain injury is a serious issue, often leading to death or long-term disability if not promptly diagnosed. Currently, doctors primarily use Computerized Tomography (CT) scans to detect and precisely locate a hemorrhage, typically interpreted by radiologists. However, this diagnostic process heavily relies on the expertise of medical professionals. To address potential errors, computer-aided diagnosis systems have been developed. In this study, we propose a new method that enhances the localization and segmentation of ICH lesions in CT scans by using multiple images created through different data augmentation techniques. We integrate residual connections into a U-Net-based segmentation network to improve the training efficiency. Our experiments, based on 82 CT scans from traumatic brain injury patients, validate the effectiveness of our approach, achieving an IOU score of 0.807 ± 0.03 for ICH segmentation using 10-fold cross-validation.
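The IOU score reported in this abstract is the Intersection over Union between a predicted mask and a reference mask. A minimal sketch for binary masks stored as nested lists follows. The function name is hypothetical, and returning 1.0 when both masks are empty is a convention chosen for this example, not something the abstract specifies.

```python
def mask_iou(pred, target):
    """Intersection over Union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [1, 0]]
score = mask_iou(pred, target)  # one shared pixel out of three labeled pixels
```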
Affiliation(s)
- Quoc Tuan Hoang
  - Faculty of Mechanical Engineering, Hung Yen University of Technology and Education, 39Rd., Hung Yen 160000, Vietnam
- Xuan Hien Pham
  - Faculty of Mechanical Engineering, University of Transport and Communications, Hanoi 100000, Vietnam
- Xuan Thang Trinh
  - Faculty of Mechanical Engineering, Hung Yen University of Technology and Education, 39Rd., Hung Yen 160000, Vietnam
- Anh Vu Le
  - Communication and Signal Processing Research Group, Faculty of Electrical and Electronics Engineering, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam
- Minh V Bui
  - Faculty of Engineering and Technology, Nguyen Tat Thanh University, 300A, Nguyen Tat Thanh, Ward 13, District 4, Ho Chi Minh City 700000, Vietnam
- Trung Thanh Bui
  - Faculty of Mechanical Engineering, Hung Yen University of Technology and Education, 39Rd., Hung Yen 160000, Vietnam
12
Alsuradi H, Khattak A, Fakhry A, Eid M. Individual-finger motor imagery classification: a data-driven approach with Shapley-informed augmentation. J Neural Eng 2024; 21:026013. [PMID: 38479013 DOI: 10.1088/1741-2552/ad33b3] [Received: 09/22/2023] [Accepted: 03/13/2024] [Indexed: 03/23/2024]
Abstract
Objective. Classifying motor imagery (MI) tasks that involve fine motor control of the individual five fingers presents unique challenges when utilizing electroencephalography (EEG) data. In this paper, we systematically assess the classification of MI functions for the individual five fingers using single-trial time-domain EEG signals. This assessment encompasses both within-subject and cross-subject scenarios, supported by data-driven analysis that provides statistical validation of the neural correlate that could potentially discriminate between the five fingers. Approach. We present Shapley-informed augmentation, an informed approach to enhance within-subject classification accuracy. This method is rooted in insights gained from our data-driven analysis, which revealed inconsistent temporal features encoding the five fingers MI across sessions of the same subject. To evaluate its impact, we compare within-subject classification performance both before and after implementing this augmentation technique. Main results. Both the data-driven approach and the model explainability analysis revealed that the parietal cortex contains neural information that helps discriminate between the individual five fingers' MI. Shapley-informed augmentation successfully improved classification accuracy in sessions severely affected by inconsistent temporal features. The accuracy for sessions impacted by inconsistency in their temporal features increased by an average of 26.3 ± 6.70%, thereby enabling a broader range of subjects to benefit from brain-computer interaction (BCI) applications involving five-fingers MI classification. Conversely, non-impacted sessions experienced only a negligible average accuracy decrease of 2.01 ± 5.44%. The average classification accuracy achieved is around 60.0% (within-session), 50.0% (within-subject) and 40.0% (leave-one-subject-out). Significance. This research offers data-driven evidence of neural correlates that could discriminate between the individual five fingers MI and introduces a novel Shapley-informed augmentation method to address temporal variability of features, ultimately contributing to the development of personalized systems.
Affiliation(s)
- Haneen Alsuradi
- Engineering Division, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi 129188, United Arab Emirates
- Arshiya Khattak
- Engineering Division, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi 129188, United Arab Emirates
- Ali Fakhry
- Engineering Division, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi 129188, United Arab Emirates
- Mohamad Eid
- Engineering Division, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi 129188, United Arab Emirates
|
13
|
Wirnsberger G, Pritišanac I, Oberdorfer G, Gruber K. Flattening the curve-How to get better results with small deep-mutational-scanning datasets. Proteins 2024. [PMID: 38501649 DOI: 10.1002/prot.26686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 02/24/2024] [Accepted: 03/07/2024] [Indexed: 03/20/2024]
Abstract
Proteins are used in various biotechnological applications, often requiring the optimization of protein properties by introducing specific amino-acid exchanges. Deep mutational scanning (DMS) is an effective high-throughput method for evaluating the effects of these exchanges on protein function. DMS data can then inform the training of a neural network to predict the impact of mutations. Most approaches use some representation of the protein sequence for training and prediction. As proteins are characterized by complex structures and intricate residue interaction networks, directly providing structural information as input reduces the need to learn these features from the data. We introduce a method for encoding protein structures as stacked 2D contact maps, which capture residue interactions, their evolutionary conservation, and mutation-induced interaction changes. Furthermore, we explored techniques to augment neural network training performance on smaller DMS datasets. To validate our approach, we trained three neural network architectures originally used for image analysis on three DMS datasets, and we compared their performances with networks trained solely on protein sequences. The results confirm the effectiveness of the protein structure encoding in machine learning efforts on DMS data. Using structural representations as direct input to the networks, along with data augmentation and pretraining, significantly reduced demands on training data size and improved prediction performance, especially on smaller datasets, while performance on large datasets was on par with state-of-the-art sequence convolutional neural networks. The methods presented here have the potential to provide the same workflow as DMS without the experimental and financial burden of testing thousands of mutants. 
Additionally, we present an open-source, user-friendly software tool to make these data analysis techniques accessible, particularly to biotechnology and protein engineering researchers who wish to apply them to their mutagenesis data.
Affiliation(s)
- Iva Pritišanac
- Institute of Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- BioTechMed-Graz, Graz, Austria
- Gustav Oberdorfer
- BioTechMed-Graz, Graz, Austria
- Institute of Biochemistry, Graz University of Technology, Graz, Austria
- Karl Gruber
- Institute of Molecular Biosciences, University of Graz, Graz, Austria
- BioTechMed-Graz, Graz, Austria
- Field of Excellence BioHealth, University of Graz, Graz, Austria
|
14
|
Naeini SA, Simmatis L, Jafari D, Yunusova Y, Taati B. Improving Dysarthric Speech Segmentation With Emulated and Synthetic Augmentation. IEEE J Transl Eng Health Med 2024; 12:382-389. [PMID: 38606392 PMCID: PMC11008804 DOI: 10.1109/jtehm.2024.3375323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 02/21/2024] [Accepted: 03/02/2024] [Indexed: 04/13/2024]
Abstract
Acoustic features extracted from speech can help with the diagnosis of neurological diseases and monitoring of symptoms over time. Temporal segmentation of audio signals into individual words is an important pre-processing step needed prior to extracting acoustic features. Machine learning techniques could be used to automate speech segmentation via automatic speech recognition (ASR) and sequence to sequence alignment. While state-of-the-art ASR models achieve good performance on healthy speech, their performance significantly drops when evaluated on dysarthric speech. Fine-tuning ASR models on impaired speech can improve performance in dysarthric individuals, but it requires representative clinical data, which is difficult to collect and may raise privacy concerns. This study explores the feasibility of using two augmentation methods to increase ASR performance on dysarthric speech: 1) healthy individuals varying their speaking rate and loudness (as is often used in assessments of pathological speech); 2) synthetic speech with variations in speaking rate and accent (to ensure more diverse vocal representations and fairness). Experimental evaluations showed that fine-tuning a pre-trained ASR model with data from these two sources outperformed a model fine-tuned only on real clinical data and matched the performance of a model fine-tuned on the combination of real clinical data and synthetic speech. When evaluated on held-out acoustic data from 24 individuals with various neurological diseases, the best performing model achieved an average word error rate of 5.7% and a mean correct count accuracy of 94.4%. In segmenting the data into individual words, a mean intersection-over-union of 89.2% was obtained against manual parsing (ground truth). It can be concluded that emulated and synthetic augmentations can significantly reduce the need for real clinical data of dysarthric speech when fine-tuning ASR models and, in turn, for speech segmentation.
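The abstract above reports a word error rate and an intersection-over-union against manual word boundaries. As a rough illustration of how these two metrics are commonly computed, here is a minimal Python sketch. The function names and the (start, end) interval representation are assumptions made for this example and are not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def interval_iou(a, b):
    """Intersection-over-union of two (start, end) time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0
```

For example, `interval_iou((0.0, 1.0), (0.5, 1.5))` compares a predicted word boundary with a manually annotated one that overlaps it by half a second.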
Affiliation(s)
- Saeid Alavi Naeini
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada
- Leif Simmatis
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Deniz Jafari
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada
- Yana Yunusova
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Department of Speech Language Pathology, Rehabilitation Sciences Institute, University of Toronto, Toronto, ON M5G 1V7, Canada
- Hurvitz Brain Sciences Program, Sunnybrook Research Institute (SRI), Toronto, ON M4N 3M5, Canada
- Babak Taati
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
|
15
|
Oh Y. Data Augmentation Techniques for Accurate Action Classification in Stroke Patients with Hemiparesis. Sensors (Basel) 2024; 24:1618. [PMID: 38475154 DOI: 10.3390/s24051618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 02/29/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024]
Abstract
Stroke survivors with hemiparesis require extensive home-based rehabilitation. Deep learning-based classifiers can detect actions and provide feedback based on patient data; however, this is difficult owing to data sparsity and heterogeneity. In this study, we investigate data augmentation and model training strategies to address this problem. Three transformations are tested with varying data volumes to analyze the changes in the classification performance of individual data. Moreover, the impact of transfer learning relative to a pre-trained one-dimensional convolutional neural network (Conv1D) and training with an advanced InceptionTime model are estimated with data augmentation. In Conv1D, the joint training data of non-disabled (ND) participants and double rotationally augmented data of stroke patients is observed to outperform the baseline in terms of F1-score (60.9% vs. 47.3%). Transfer learning pre-trained with ND data exhibits 60.3% accuracy, whereas joint training with InceptionTime exhibits 67.2% accuracy under the same conditions. Our results indicate that rotational augmentation is more effective for individual data with initially lower performance and subset data with smaller numbers of participants than other techniques, suggesting that joint training on rotationally augmented ND and stroke data enhances classification performance, particularly in cases with sparse data and lower initial performance.
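The abstract does not say how its rotational augmentation is parameterised. Purely as an illustration of the general idea, the sketch below rotates a 3-axis sensor recording about the z-axis by a small random angle to create extra training copies. The choice of axis, the angle range, and the function names are all assumptions for this example, not details from the paper.

```python
import math
import random


def rotate_about_z(samples, angle_rad):
    """Rotate each (x, y, z) sample about the z-axis by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in samples]


def augment_with_rotations(samples, n_copies=2, max_angle_deg=15.0, seed=0):
    """Return n_copies rotated versions of one recording (hypothetical settings)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    copies = []
    for _ in range(n_copies):
        angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
        copies.append(rotate_about_z(samples, angle))
    return copies
```

Because a rotation preserves the magnitude of each sample, the augmented copies mimic a differently oriented sensor rather than a different movement.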
Affiliation(s)
- Youngmin Oh
- School of Computing, Gachon University, Seongnam 13120, Republic of Korea
|
16
|
Ma H, Geng M, Wang F, Zheng W, Ai Y, Zhang W. Data Augmentation of a Corrosion Dataset for Defect Growth Prediction of Pipelines Using Conditional Tabular Generative Adversarial Networks. Materials (Basel) 2024; 17:1142. [PMID: 38473613 DOI: 10.3390/ma17051142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 02/24/2024] [Accepted: 02/25/2024] [Indexed: 03/14/2024]
Abstract
Owing to the nature of corrosion, corrosion datasets suffer from data scarcity and uneven distribution, and collecting high-quality data is time-consuming and sometimes difficult. Therefore, this work introduces a novel data augmentation strategy using a conditional tabular generative adversarial network (CTGAN) for enhancing corrosion datasets of pipelines. First, the corrosion dataset is subjected to data cleaning and variable correlation analysis. The CTGAN is then used to generate external environmental factors as input variables for corrosion growth prediction, and a hybrid model based on machine learning is employed to generate corrosion depth as an output variable. The fake data are merged with the original data to form the synthetic dataset. Finally, the proposed data augmentation strategy is verified by analyzing the synthetic dataset using different visualization methods and evaluation indicators. The results show that the synthetic and original datasets have similar distributions, and the data augmentation strategy can learn the distribution of real corrosion data and sample fake data that are highly similar to the real data. Predictive models trained on the synthetic dataset perform better than predictive models trained using only the original dataset. In comparative tests, the proposed strategy outperformed other data generation methods.
Affiliation(s)
- Haonan Ma
- National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
- Mengying Geng
- National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
- Fan Wang
- National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
- Wenyue Zheng
- National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
- Yibo Ai
- National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
- Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 519082, China
- Weidong Zhang
- National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
|
17
|
Lin Z, Henson WH, Dowling L, Walsh J, Dall’Ara E, Guo L. Automatic segmentation of skeletal muscles from MR images using modified U-Net and a novel data augmentation approach. Front Bioeng Biotechnol 2024; 12:1355735. [PMID: 38456001 PMCID: PMC10919285 DOI: 10.3389/fbioe.2024.1355735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 02/05/2024] [Indexed: 03/09/2024] Open
Abstract
Rapid and accurate muscle segmentation is essential for the diagnosis and monitoring of many musculoskeletal diseases. As the gold standard, manual annotation is labor-intensive and suffers from high inter-operator reproducibility errors. In this study, deep learning (DL) based automatic muscle segmentation from MR scans is investigated for post-menopausal women, who normally experience a decline in muscle volume. The performance of four DL models was evaluated: U-Net and UNet++ and two modified U-Net networks, which combined feature fusion and attention mechanisms (Feature-Fusion-UNet, FFU, and Attention-Feature-Fusion-UNet, AFFU). The models were tested for automatic segmentation of 16 lower-limb muscles from MRI scans of two cohorts of post-menopausal women (11 subjects in PMW-1, 8 subjects in PMW-2; from two different studies and therefore considered independent datasets) and 10 obese post-menopausal women (PMW-OB). Furthermore, a novel data augmentation approach is proposed to enlarge the training dataset. The results were assessed and compared by using the Dice similarity coefficient (DSC), relative volume error (RVE), and Hausdorff distance (HD). The best performance among all four DL models was achieved by AFFU (PMW-1: DSC 0.828 ± 0.079, 1-RVE 0.859 ± 0.122, HD 29.9 mm ± 26.5 mm; PMW-2: DSC 0.833 ± 0.065, 1-RVE 0.873 ± 0.105, HD 25.9 mm ± 27.9 mm; PMW-OB: DSC 0.862 ± 0.048, 1-RVE 0.919 ± 0.076, HD 34.8 mm ± 46.8 mm). Furthermore, the augmentation of data significantly improved the DSC scores of U-Net and AFFU for all 16 tested muscles (between 0.23% and 2.17% (DSC), 1.6%-1.93% (1-RVE), and 9.6%-19.8% (HD) improvement). These findings highlight the feasibility of utilizing DL models for automatic segmentation of muscles in post-menopausal women and indicate that the proposed augmentation method can enhance the performance of models trained on small datasets.
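For readers unfamiliar with the overlap metrics named in this abstract, the following minimal Python sketch shows one common way to compute the Dice similarity coefficient and a relative volume error on flattened binary masks. The RVE definition used here (absolute volume difference divided by the ground-truth volume) is an assumption, since the abstract does not spell out the formula, and the function names are illustrative only.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient of two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 2.0 * inter / total if total else 1.0


def relative_volume_error(pred, truth):
    """|V_pred - V_truth| / V_truth, with volume counted in voxels (assumed definition)."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty")
    return abs(v_pred - v_truth) / v_truth
```

Note that two masks can have identical volumes (RVE of zero) and still overlap poorly, which is why both metrics are typically reported together.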
Affiliation(s)
- Zhicheng Lin
- Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, United Kingdom
- William H. Henson
- Department of Mechanical Engineering, University of Sheffield, Sheffield, United Kingdom
- Lisa Dowling
- Faculty of Health, University of Sheffield, Sheffield, United Kingdom
- Jennifer Walsh
- Faculty of Health, University of Sheffield, Sheffield, United Kingdom
- Enrico Dall’Ara
- Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Insigneo, University of Sheffield, Sheffield, United Kingdom
- Lingzhong Guo
- Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, United Kingdom
- Insigneo, University of Sheffield, Sheffield, United Kingdom
|
18
|
Wu X, Zhang D, Li G, Gao X, Metcalfe B, Chen L. Data augmentation for invasive brain-computer interfaces based on stereo-electroencephalography (SEEG). J Neural Eng 2024; 21:016026. [PMID: 38237174 DOI: 10.1088/1741-2552/ad200e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 01/18/2024] [Indexed: 02/23/2024]
Abstract
Objective. Deep learning is increasingly used for brain-computer interfaces (BCIs). However, the quantity of available data is sparse, especially for invasive BCIs. Data augmentation (DA) methods, such as generative models, can help to address this sparseness. However, all the existing studies on brain signals were based on convolutional neural networks and ignored the temporal dependence. This paper attempted to enhance generative models by capturing the temporal relationship from a time-series perspective. Approach. A conditional generative network (conditional transformer-based generative adversarial network (cTGAN)) based on the transformer model was proposed. The proposed method was tested using a stereo-electroencephalography (SEEG) dataset which was recorded from eight epileptic patients performing five different movements. Three other commonly used DA methods were also implemented: noise injection (NI), variational autoencoder (VAE), and conditional Wasserstein generative adversarial network with gradient penalty (cWGANGP). Using the proposed method, the artificial SEEG data was generated, and several metrics were used to compare the data quality, including visual inspection, cosine similarity (CS), Jensen-Shannon distance (JSD), and the effect on the performance of a deep learning-based classifier. Main results. Both the proposed cTGAN and the cWGANGP methods were able to generate realistic data, while NI and VAE outputted inferior samples when visualized as raw sequences and in a lower dimensional space. The cTGAN generated the best samples in terms of CS and JSD and outperformed cWGANGP significantly in enhancing the performance of a deep learning-based classifier (each of them yielding a significant improvement of 6% and 3.4%, respectively). Significance. This is the first time that DA methods have been applied to invasive BCIs based on SEEG. In addition, this study demonstrated the advantages of the model that preserves the temporal dependence from a time-series perspective.
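Cosine similarity and Jensen-Shannon distance, the two sample-quality metrics named in this abstract, can be sketched in a few lines of Python. This is a generic illustration rather than the authors' evaluation code. It assumes the signals are given as equal-length numeric sequences and that the distributions compared by the JSD are non-negative histograms. The base-2 logarithm is also a choice made here, so that the distance lies between 0 and 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base 2) of two histograms."""
    sum_p, sum_q = sum(p), sum(q)
    p = [x / sum_p for x in p]  # normalise counts to probabilities
    q = [x / sum_q for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))
```

In a comparison like the one described, a higher cosine similarity and a lower Jensen-Shannon distance between generated and real samples would both point to more realistic synthetic data.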
Affiliation(s)
- Xiaolong Wu
- The Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Dingguo Zhang
- The Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Guangye Li
- School of Mechanical Engineering, Shanghai Jiao Tong University, People's Republic of China
- Xin Gao
- The Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Benjamin Metcalfe
- The Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Liang Chen
- Huashan Hospital, Fudan University, People's Republic of China
|
19
|
Bang G, Lee J, Endo Y, Nishimori T, Nakao K, Kamijo S. Semantic and Geometric-Aware Day-to-Night Image Translation Network. Sensors (Basel) 2024; 24:1339. [PMID: 38400497 PMCID: PMC10891961 DOI: 10.3390/s24041339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 02/10/2024] [Accepted: 02/18/2024] [Indexed: 02/25/2024]
Abstract
Autonomous driving systems heavily depend on perception tasks for optimal performance. However, the prevailing datasets are primarily focused on scenarios with clear visibility (i.e., sunny and daytime). This concentration poses challenges in training deep-learning-based perception models for environments with adverse conditions (e.g., rainy and nighttime). In this paper, we propose an unsupervised network designed for the translation of images from day to night to solve the ill-posed problem of learning the mapping between domains with unpaired data. The proposed method involves extracting both semantic and geometric information from input images in the form of attention maps. We assume that the multi-task network can extract semantic and geometric information during the estimation of semantic segmentation and depth maps, respectively. The image-to-image translation network integrates the two distinct types of extracted information, employing them as spatial attention maps. We compare our method with related works both qualitatively and quantitatively. The proposed method shows both qualitative and quantitative improvements in visual presentation over related work.
Affiliation(s)
- Geonkyu Bang
- Emerging Design and Informatics Course, Graduate School of Interdisciplinary Information Studies, The University of Tokyo, 4 Chome-6-1 Komaba, Meguro-ku, Tokyo 153-0041, Japan
- Jinho Lee
- Emerging Design and Informatics Course, Graduate School of Interdisciplinary Information Studies, The University of Tokyo, 4 Chome-6-1 Komaba, Meguro-ku, Tokyo 153-0041, Japan
- Yuki Endo
- Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo, 7 Chome-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Toshiaki Nishimori
- Mitsubishi Heavy Industries Machinery Systems, Ltd., 1 Chome-1-1 Wadasaki-cho, Hyogo-ku, Kobe 652-8585, Japan
- Kenta Nakao
- Mitsubishi Heavy Industries, Ltd., 1 Chome-1-1 Wadasaki-cho, Hyogo-ku, Kobe 652-8585, Japan
- Shunsuke Kamijo
- Institute of Industrial Science, The University of Tokyo, 4 Chome-6-1 Komaba, Meguro-ku, Tokyo 153-0041, Japan
|
20
|
Dindorf C, Dully J, Konradi J, Wolf C, Becker S, Simon S, Huthwelker J, Werthmann F, Kniepert J, Drees P, Betz U, Fröhlich M. Enhancing biomechanical machine learning with limited data: generating realistic synthetic posture data using generative artificial intelligence. Front Bioeng Biotechnol 2024; 12:1350135. [PMID: 38419724 PMCID: PMC10899878 DOI: 10.3389/fbioe.2024.1350135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 01/22/2024] [Indexed: 03/02/2024] Open
Abstract
Objective: Biomechanical Machine Learning (ML) models, particularly deep-learning models, demonstrate the best performance when trained using extensive datasets. However, biomechanical data are frequently limited due to diverse challenges. Effective methods for augmenting data in developing ML models, specifically in the human posture domain, are scarce. Therefore, this study explored the feasibility of leveraging generative artificial intelligence (AI) to produce realistic synthetic posture data by utilizing three-dimensional posture data. Methods: Data were collected from 338 subjects through surface topography. A Variational Autoencoder (VAE) architecture was employed to generate and evaluate synthetic posture data, examining its distinguishability from real data by domain experts, ML classifiers, and Statistical Parametric Mapping (SPM). The benefits of incorporating augmented posture data into the learning process were exemplified by a deep autoencoder (AE) for automated feature representation. Results: Our findings highlight the challenge of differentiating synthetic data from real data for both experts and ML classifiers, underscoring the quality of synthetic data. This observation was also confirmed by SPM. By integrating synthetic data into AE training, the reconstruction error can be reduced compared to using only real data samples. Moreover, this study demonstrates the potential for reduced latent dimensions, while maintaining a reconstruction accuracy comparable to AEs trained exclusively on real data samples. Conclusion: This study emphasizes the prospects of harnessing generative AI to enhance ML tasks in the biomechanics domain.
Affiliation(s)
- Carlo Dindorf
- Department of Sports Science, University of Kaiserslautern-Landau, Kaiserslautern, Germany
- Jonas Dully
- Department of Sports Science, University of Kaiserslautern-Landau, Kaiserslautern, Germany
- Jürgen Konradi
- Institute of Physical Therapy, Prevention and Rehabilitation, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Claudia Wolf
- Institute of Physical Therapy, Prevention and Rehabilitation, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Stephan Becker
- Department of Sports Science, University of Kaiserslautern-Landau, Kaiserslautern, Germany
- Steven Simon
- Department of Sports Science, University of Kaiserslautern-Landau, Kaiserslautern, Germany
- Janine Huthwelker
- Institute of Physical Therapy, Prevention and Rehabilitation, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Frederike Werthmann
- Department of Orthopedics and Trauma Surgery, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Johanna Kniepert
- Department of Orthopedics and Trauma Surgery, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Philipp Drees
- Department of Orthopedics and Trauma Surgery, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Ulrich Betz
- Institute of Physical Therapy, Prevention and Rehabilitation, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- Michael Fröhlich
- Department of Sports Science, University of Kaiserslautern-Landau, Kaiserslautern, Germany
|
21
|
Yue P, Li Z, Zhou M, Wang X, Yang P. Wearable-Sensor-Based Weakly Supervised Parkinson's Disease Assessment with Data Augmentation. Sensors (Basel) 2024; 24:1196. [PMID: 38400357 PMCID: PMC10892773 DOI: 10.3390/s24041196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 01/23/2024] [Accepted: 01/30/2024] [Indexed: 02/25/2024]
Abstract
Parkinson's disease (PD) is the second most prevalent neurodegenerative disease in the world. Wearable technology has been useful in the computer-aided diagnosis and long-term monitoring of PD in recent years. The fundamental issue remains how to assess the severity of PD using wearable devices in an efficient and accurate manner. However, in the real-world free-living environment, there are two difficult issues, poor annotation and class imbalance, both of which could potentially impede the automatic assessment of PD. To address these challenges, we propose a novel framework for assessing the severity of PD in patients in a free-living environment. Specifically, we use clustering methods to learn latent categories from the same activities, while latent Dirichlet allocation (LDA) topic models are utilized to capture latent features from multiple activities. Then, to mitigate the impact of data imbalance, we augment bag-level data while retaining key instance prototypes. To comprehensively demonstrate the efficacy of our proposed framework, we collected a dataset containing wearable-sensor signals from 83 individuals in real-life free-living conditions. The experimental results show that our framework achieves 73.48% accuracy in the fine-grained (normal, mild, moderate, severe) classification of PD severity based on hand movements. Overall, this study contributes to more accurate PD self-diagnosis in the wild, allowing doctors to provide remote drug intervention guidance.
Affiliation(s)
- Peng Yue
- Department of Computer Science, University of Sheffield, Sheffield S10 2TN, UK
- AntData Ltd., Liverpool L16 2AE, UK
- Ziheng Li
- Department of Software, Yunnan University, Kunming 650106, China
- Menghui Zhou
- Department of Computer Science, University of Sheffield, Sheffield S10 2TN, UK
- Xulong Wang
- Department of Computer Science, University of Sheffield, Sheffield S10 2TN, UK
- Po Yang
- Department of Computer Science, University of Sheffield, Sheffield S10 2TN, UK
|
22
|
Lee S, Kim T, Shin J, Kim N, Choi Y. INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection. Sensors (Basel) 2024; 24:1168. [PMID: 38400326 PMCID: PMC10893488 DOI: 10.3390/s24041168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Revised: 02/02/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
Pedestrian detection is a critical task for safety-critical systems, but detecting pedestrians is challenging in low-light and adverse weather conditions. Thermal images can be used to improve robustness by providing complementary information to RGB images. Previous studies have shown that multi-modal feature fusion using convolution operation can be effective, but such methods rely solely on local feature correlations, which can degrade the performance capabilities. To address this issue, we propose an attention-based novel fusion network, referred to as INSANet (INtra-INter Spectral Attention Network), that captures global intra- and inter-information. It consists of intra- and inter-spectral attention blocks that allow the model to learn mutual spectral relationships. Additionally, we identified an imbalance in the multispectral dataset caused by several factors and designed an augmentation strategy that mitigates concentrated distributions and enables the model to learn the diverse locations of pedestrians. Extensive experiments demonstrate the effectiveness of the proposed methods, which achieve state-of-the-art performance on the KAIST dataset and LLVIP dataset. Finally, we conduct a regional performance evaluation to demonstrate the effectiveness of our proposed network in various regions.
Affiliation(s)
- Sangin Lee
- Department of Software, Sejong University, Seoul 05006, Republic of Korea
- Taejoo Kim
- Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
- Jeongmin Shin
- Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
- Namil Kim
- NAVER LABS, Seongnam 13561, Republic of Korea
- Yukyung Choi
- Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
|
23
|
Tang Z, Li S, Kim KS, Smith JS. Multi-Dimensional Wi-Fi Received Signal Strength Indicator Data Augmentation Based on Multi-Output Gaussian Process for Large-Scale Indoor Localization. Sensors (Basel) 2024; 24:1026. [PMID: 38339745 PMCID: PMC10857661 DOI: 10.3390/s24031026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 02/01/2024] [Accepted: 02/02/2024] [Indexed: 02/12/2024]
Abstract
Location fingerprinting using Received Signal Strength Indicators (RSSIs) has become a popular technique for indoor localization due to its use of existing Wi-Fi infrastructure and Wi-Fi-enabled devices. Artificial intelligence/machine learning techniques such as Deep Neural Networks (DNNs) have been adopted to make location fingerprinting more accurate and reliable for large-scale indoor localization applications. However, the success of DNNs for indoor localization depends on the availability of a large amount of pre-processed and labeled data for training, the collection of which could be time-consuming in large-scale indoor environments and even challenging during a pandemic situation like COVID-19. To address these issues in data collection, we investigate multi-dimensional RSSI data augmentation based on the Multi-Output Gaussian Process (MOGP), which, unlike the Single-Output Gaussian Process (SOGP), can exploit the correlation among the RSSIs from multiple access points in a single floor, neighboring floors, or a single building by collectively processing them. The feasibility of MOGP-based multi-dimensional RSSI data augmentation is demonstrated through experiments using the hierarchical indoor localization model based on a Recurrent Neural Network (RNN), one of the state-of-the-art multi-building and multi-floor localization models, and the publicly available UJIIndoorLoc multi-building and multi-floor indoor localization database. The RNN model trained with the UJIIndoorLoc database augmented with the augmentation mode of "by a single building", where an MOGP model is fitted based on the entire RSSI data of a building, outperforms the other two augmentation modes and results in a three-dimensional localization error of 8.42 m.
Affiliation(s)
- Zhe Tang
  - School of Advanced Technology, Xi’an Jiaotong-Liverpool University (XJTLU), Suzhou 215123, China
  - Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK
- Sihao Li
  - School of Advanced Technology, Xi’an Jiaotong-Liverpool University (XJTLU), Suzhou 215123, China
  - Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK
- Kyeong Soo Kim
  - School of Advanced Technology, Xi’an Jiaotong-Liverpool University (XJTLU), Suzhou 215123, China
- Jeremy S. Smith
  - Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK
|
24
|
Schiemer R, Rüdt M, Hubbuch J. Generative data augmentation and automated optimization of convolutional neural networks for process monitoring. Front Bioeng Biotechnol 2024; 12:1228846. [PMID: 38357704 PMCID: PMC10864647 DOI: 10.3389/fbioe.2024.1228846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 01/15/2024] [Indexed: 02/16/2024] Open
Abstract
Chemometric modeling for spectral data is considered a key technology in biopharmaceutical processing to realize real-time process control and release testing. Machine learning (ML) models have been shown to increase the accuracy of various spectral regression and classification tasks, remove challenging preprocessing steps for spectral data, and promise to improve the transferability of models when compared to commonly applied, linear methods. The training and optimization of ML models require large data sets which are not available in the context of biopharmaceutical processing. Generative methods to extend data sets with realistic in silico samples, so-called data augmentation, may provide the means to alleviate this challenge. In this study, we develop and implement a novel data augmentation method for generating in silico spectral data based on local estimation of pure component profiles for training convolutional neural network (CNN) models using four data sets. We simultaneously tune hyperparameters associated with data augmentation and the neural network architecture using Bayesian optimization. Finally, we compare the optimized CNN models with partial least-squares regression models (PLS) in terms of accuracy, robustness, and interpretability. The proposed data augmentation method is shown to produce highly realistic spectral data by adapting the estimates of the pure component profiles to the sampled concentration regimes. Augmenting CNNs with the in silico spectral data is shown to improve the prediction accuracy for the quantification of monoclonal antibody (mAb) size variants by up to 50% in comparison to single-response PLS models. Bayesian structure optimization suggests that multiple convolutional blocks are beneficial for model accuracy and enable transfer across different data sets. Model-agnostic feature importance methods and synthetic noise perturbation are used to directly compare the optimized CNNs with PLS models. This enables the identification of wavelength regions critical for model performance and suggests increased robustness against Gaussian white noise and wavelength shifts of the CNNs compared to the PLS models.
Affiliation(s)
- Robin Schiemer
  - Institute of Process Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Matthias Rüdt
  - Institute of Life Technologies, HES-SO Valais-Wallis, Sion, Switzerland
- Jürgen Hubbuch
  - Institute of Process Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
|
25
|
Li Z, Deng Z, Hao K, Zhao X, Jin Z. A Ship Detection Model Based on Dynamic Convolution and an Adaptive Fusion Network for Complex Maritime Conditions. Sensors (Basel) 2024; 24:859. [PMID: 38339576 PMCID: PMC10856874 DOI: 10.3390/s24030859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 12/08/2023] [Accepted: 01/25/2024] [Indexed: 02/12/2024]
Abstract
Ship detection is vital for maritime safety and vessel monitoring, but challenges like false and missed detections persist, particularly in complex backgrounds, multiple scales, and adverse weather conditions. This paper presents YOLO-Vessel, a ship detection model built upon YOLOv7, which incorporates several innovations to improve its performance. First, we devise a novel backbone network structure called Efficient Layer Aggregation Networks and Omni-Dimensional Dynamic Convolution (ELAN-ODConv). This architecture effectively addresses the complex background interference commonly encountered in maritime ship images, thereby improving the model's feature extraction capabilities. Additionally, we introduce a space-to-depth structure in the head network to address small ship targets that are difficult to detect. Furthermore, we introduce ASFFPredict, a predictive network structure that addresses scale variation among ship types and bolsters multiscale ship target detection. Experimental results demonstrate YOLO-Vessel's effectiveness, achieving a 78.3% mean average precision (mAP), surpassing YOLOv7 by 2.3% and Faster R-CNN by 11.6%. It maintains real-time detection at 8.0 ms/frame, meeting real-time ship detection needs. Evaluation in adverse weather conditions confirms YOLO-Vessel's superiority in ship detection, offering a robust solution to maritime challenges and enhancing marine safety and vessel monitoring.
Affiliation(s)
- Zhisheng Li
  - School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Zhihui Deng
  - School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Kun Hao
  - School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Xiaofang Zhao
  - School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Zhigang Jin
  - School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
|
26
|
Xu Q, Zan H, Ji S. A lightweight mixup-based short texts clustering for contrastive learning. Front Comput Neurosci 2024; 17:1334748. [PMID: 38348466 PMCID: PMC10860753 DOI: 10.3389/fncom.2023.1334748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 12/18/2023] [Indexed: 02/15/2024] Open
Abstract
Traditional text clustering based on distance struggles to distinguish between overlapping representations in medical data. By incorporating contrastive learning, the feature space can be optimized, and mixup is applied implicitly during the data augmentation phase to reduce the computational burden. Medical case text is prevalent in everyday life, and clustering is a fundamental method of identifying major categories of conditions within vast amounts of unlabeled text. Learning meaningful clustering scores in data relating to rare diseases is difficult due to their unique sparsity. To address this issue, we propose a contrastive clustering method based on mixup, which involves selecting a small batch of data to simulate the experimental environment of rare diseases. The contrastive learning module optimizes the feature space based on the fact that positive pairs share negative samples, and clustering is employed to group data with comparable semantic features. The module mitigates the issue of overlap in data, whilst mixup generates cost-effective virtual features, resulting in superior experimental scores even when using small batch data and reducing resource usage and time overhead. The proposed method achieves state-of-the-art results and is a promising strategy for unsupervised text clustering.
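As an illustration only (not the authors' code), the mixup step mentioned in this abstract can be sketched as a convex combination of two feature vectors. The function names, the example embeddings, and the Beta(0.2, 0.2) default are assumptions made for this sketch; the abstract does not give the actual settings.

```python
import random


def mixup(x_a, x_b, lam):
    """Return the convex combination lam * x_a + (1 - lam) * x_b."""
    if len(x_a) != len(x_b):
        raise ValueError("feature vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha); alpha is an assumed value."""
    return rng.betavariate(alpha, alpha)


# Two hypothetical text embeddings (made-up numbers).
emb_a = [1.0, 0.0, 2.0]
emb_b = [0.0, 4.0, 2.0]

# A fixed coefficient keeps the example deterministic.
virtual = mixup(emb_a, emb_b, lam=0.25)
```

Because the virtual feature is computed from embeddings that already exist, it adds no extra encoder passes, which is one plausible reading of the "cost-effective virtual features" wording.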
Affiliation(s)
- ShengWei Ji
  - School of Artificial Intelligence and Big Data, Hefei University, Hefei, Anhui, China
|
27
|
Esheh J, Affes S. Effectiveness of Data Augmentation for Localization in WSNs Using Deep Learning for the Internet of Things. Sensors (Basel) 2024; 24:430. [PMID: 38257522 DOI: 10.3390/s24020430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 12/14/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024]
Abstract
Wireless sensor networks (WSNs) have become widely popular and are extensively used for various sensor communication applications due to their flexibility and cost effectiveness, especially for applications where localization is a main challenge. The Dv-hop algorithm is a range-free localization algorithm commonly used in WSNs. Despite its simplicity and low hardware requirements, it suffers from limited localization accuracy. In this article, we develop an accurate Deep Learning (DL)-based range-free localization method for WSN applications in the Internet of Things (IoT). To improve the localization performance, we exploit a deep neural network (DNN) to correct the estimated distance between the unknown nodes (i.e., position-unaware) and the anchor nodes (i.e., position-aware) without burdening the IoT cost. DL needs large training data to yield accurate results, and the DNN is no exception. To address this challenge, we propose a Data Augmentation Strategy (DAS) that creates multiple virtual anchors around the existing real anchors. This process generates more training data and significantly increases the data size. We prove that DAS can provide the DNNs with sufficient training data, ultimately making it more feasible for WSNs and the IoT to fully benefit from low-cost DNN-aided localization. The simulation results indicate that the accuracy of the proposed method (Dv-hop with DNN correction) surpasses that of Dv-hop.
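The abstract does not say how the virtual anchors are placed. A minimal sketch, assuming they are spread evenly on a circle of fixed radius around each real anchor, could look like this; the placement rule, radius, count, and coordinates are all hypothetical.

```python
import math


def virtual_anchors(anchor, radius, count):
    """Place `count` virtual anchors evenly on a circle around a real anchor."""
    ax, ay = anchor
    points = []
    for k in range(count):
        angle = 2.0 * math.pi * k / count
        points.append((ax + radius * math.cos(angle),
                       ay + radius * math.sin(angle)))
    return points


# Hypothetical real anchor positions in metres.
real_anchors = [(10.0, 10.0), (40.0, 25.0)]

# Augmented anchor set: every real anchor plus its assumed virtual neighbours.
augmented = []
for anchor in real_anchors:
    augmented.append(anchor)
    augmented.extend(virtual_anchors(anchor, radius=2.0, count=4))
```

Each virtual anchor would then contribute additional (estimated distance, true distance) training pairs for the DNN, which is how the data size grows without deploying extra hardware.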
Affiliation(s)
- Jehan Esheh
  - EMT Centre (Energy, Materials and Telecommunications), INRS (Institut National de la Recherche Scientifique), Université du Québec, Montréal, QC H5A 1K6, Canada
- Sofiene Affes
  - EMT Centre (Energy, Materials and Telecommunications), INRS (Institut National de la Recherche Scientifique), Université du Québec, Montréal, QC H5A 1K6, Canada
|
28
|
Garcia C, Inoue S. Relabeling for Indoor Localization Using Stationary Beacons in Nursing Care Facilities. Sensors (Basel) 2024; 24:319. [PMID: 38257412 PMCID: PMC10818562 DOI: 10.3390/s24020319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 12/31/2023] [Accepted: 01/02/2024] [Indexed: 01/24/2024]
Abstract
In this study, we propose an augmentation method for machine learning based on relabeling data in caregiving and nursing staff indoor localization with Bluetooth Low Energy (BLE) technology. Indoor localization is used to monitor staff-to-patient assistance in caregiving and to gain insights into workload management. However, improving accuracy is challenging when there is a limited amount of data available for training. In this paper, we propose a data augmentation method to reuse the Received Signal Strength (RSS) from different beacons by relabeling to the locations with fewer samples, resolving data imbalance. Standard deviation and Kullback-Leibler divergence between minority and majority classes are used to measure signal patterns and find matching beacons to relabel. By matching beacons between classes, two variations of relabeling are implemented, specifically full and partial matching. The performance is evaluated using the real-world dataset we collected for five days in a nursing care facility installed with 25 BLE beacons. A Random Forest model is utilized for location recognition, and performance is compared using the weighted F1-score to account for class imbalance. By increasing the beacon data with our proposed relabeling method for data augmentation, we achieve a higher minority class F1-score compared to augmentation with Random Sampling, Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN). Our proposed method utilizes collected beacon data by leveraging majority class samples. Full matching demonstrated a 6 to 8% improvement over the original baseline in overall weighted F1-score.
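To make the Kullback-Leibler matching step concrete, here is a hedged sketch that compares smoothed RSS histograms of a minority-class beacon against two candidate majority-class beacons. The bin edges, add-one smoothing, sample values, and the "lowest divergence wins" rule are assumptions; the abstract only states that standard deviation and KL divergence are used.

```python
import math


def histogram(samples, lo, hi, n_bins, smoothing=1.0):
    """Smoothed, normalised histogram of RSS samples (dBm) over [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [smoothing] * n_bins  # smoothing keeps every bin non-zero
    for s in samples:
        idx = int((s - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp out-of-range readings
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical RSS readings in dBm.
minority = [-70, -72, -71, -69]
beacon_a = [-70, -71, -73, -68]   # similar signal pattern
beacon_b = [-90, -92, -88, -91]   # very different signal pattern

p_min = histogram(minority, -100, -40, 12)
scores = {
    "beacon_a": kl_divergence(p_min, histogram(beacon_a, -100, -40, 12)),
    "beacon_b": kl_divergence(p_min, histogram(beacon_b, -100, -40, 12)),
}
best_match = min(scores, key=scores.get)
```

Under these assumptions, the beacon with the lowest divergence is the one whose samples would be relabeled to the minority location.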
Affiliation(s)
- Christina Garcia
  - Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, 2-4 Hibikino, Wakamatsu Ward, Kitakyushu 808-0135, Japan
|
29
|
Lee JD, Tsai CM. Advancing Barrett's Esophagus Segmentation: A Deep-Learning Ensemble Approach with Data Augmentation and Model Collaboration. Bioengineering (Basel) 2024; 11:47. [PMID: 38247924 PMCID: PMC10813459 DOI: 10.3390/bioengineering11010047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Revised: 11/15/2023] [Accepted: 11/15/2023] [Indexed: 01/23/2024] Open
Abstract
This approach provides a thorough investigation of Barrett's esophagus segmentation using deep-learning methods. This study explores various U-Net model variants with different backbone architectures, focusing on how the choice of backbone influences segmentation accuracy. By employing rigorous data augmentation techniques and ensemble strategies, the goal is to achieve precise and robust segmentation results. Key findings include the superiority of DenseNet backbones, the importance of tailored data augmentation, and the adaptability of training U-Net models from scratch. Ensemble methods are shown to enhance segmentation accuracy, and a grid search is used to fine-tune ensemble weights. A comprehensive comparison with the popular Deeplabv3+ architecture emphasizes the role of dataset characteristics. Insights into training saturation help optimize resource utilization, and efficient ensembles consistently achieve high mean intersection over union (IoU) scores, approaching 0.94. This research marks a significant advancement in Barrett's esophagus segmentation.
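The ensemble weighting and IoU scoring described here can be illustrated with a toy two-model case. This is a sketch under stated assumptions, not the paper's pipeline: the probability maps are made-up flattened pixels, and the 0.5 threshold and 0.1 grid step are arbitrary choices.

```python
def iou(pred, truth):
    """Intersection over union of two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0


def ensemble(prob_a, prob_b, w, threshold=0.5):
    """Blend two probability maps with weight w on model A, then binarise."""
    return [1 if w * a + (1.0 - w) * b >= threshold else 0
            for a, b in zip(prob_a, prob_b)]


def grid_search_weight(prob_a, prob_b, truth, steps=10):
    """Try w = 0, 1/steps, ..., 1 and keep the first weight with the best IoU."""
    best_w, best_score = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        score = iou(ensemble(prob_a, prob_b, w), truth)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Made-up flattened masks and per-pixel probabilities from two models.
truth = [1, 1, 0, 0, 1]
model_a = [0.9, 0.4, 0.2, 0.6, 0.8]
model_b = [0.6, 0.7, 0.4, 0.3, 0.3]

best_w, best_score = grid_search_weight(model_a, model_b, truth)
```

In this toy case each model alone makes a different mistake, so a blended weight scores higher than either endpoint, which is the effect a grid search over ensemble weights is meant to find.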
Affiliation(s)
- Jiann-Der Lee
  - Department of Electrical Engineering, Chang Gung University, Taoyuan 33302, Taiwan
  - Department of Neurosurgery, Chang Gung Memorial Hospital at Linkou, Taoyuan 33305, Taiwan
  - Department of Electrical Engineering, Ming Chi University of Technology, New Taipei City 24330, Taiwan
- Chih Mao Tsai
  - Department of Electrical Engineering, Chang Gung University, Taoyuan 33302, Taiwan
|
30
|
Rajaram S, Mitchell CS. Data Augmentation with Cross-Modal Variational Autoencoders (DACMVA) for Cancer Survival Prediction. Information (Basel) 2024; 15:7. [PMID: 38665395 PMCID: PMC11044918 DOI: 10.3390/info15010007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/28/2024] Open
Abstract
The ability to translate Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) into different modalities and data types is essential to improve Deep Learning (DL) for predictive medicine. This work presents DACMVA, a novel framework to conduct data augmentation in a cross-modal dataset by translating between modalities and oversampling imputations of missing data. DACMVA was inspired by previous work on the alignment of latent spaces in Autoencoders. DACMVA is a DL data augmentation pipeline that improves the performance in a downstream prediction task. The unique DACMVA framework leverages a cross-modal loss to improve the imputation quality and employs training strategies to enable regularized latent spaces. Oversampling of augmented data is integrated into the prediction training. It is empirically demonstrated that the new DACMVA framework is effective in the often-neglected scenario of DL training on tabular data with continuous labels. Specifically, DACMVA is applied towards cancer survival prediction on tabular gene expression data where there is a portion of missing data in a given modality. DACMVA significantly (p << 0.001, one-sided Wilcoxon signed-rank test) outperformed the non-augmented baseline and competing augmentation methods with varying percentages of missing data (4%, 90%, 95% missing). As such, DACMVA provides significant performance improvements, even in very-low-data regimes, over existing state-of-the-art methods, including TDImpute and oversampling alone.
Affiliation(s)
- Sara Rajaram
  - Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Cassie S. Mitchell
  - Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
  - Center for Machine Learning at Georgia Tech, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
31
|
Gayatri E, Aarthy SL. Reduction of overfitting on the highly imbalanced ISIC-2019 skin dataset using deep learning frameworks. J Xray Sci Technol 2024; 32:53-68. [PMID: 38189730 DOI: 10.3233/xst-230204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
BACKGROUND: With the rapid growth of Deep Neural Networks (DNN) and Computer-Aided Diagnosis (CAD), more significant works have been analysed for cancer-related diseases. Skin cancer is the most hazardous type of cancer that cannot be diagnosed in the early stages. OBJECTIVE: The diagnosis of skin cancer is becoming a challenge to dermatologists as an abnormal lesion looks like an ordinary nevus at the initial stages. Therefore, early identification of lesions (origin of skin cancer) is essential and helpful for treating skin cancer patients effectively. The enormous development of automated skin cancer diagnosis systems significantly supports dermatologists. METHODS: This paper performs a classification of skin cancer by utilising various deep-learning frameworks after resolving the class imbalance problem in the ISIC-2019 dataset. A fine-tuned ResNet-50 model is used to evaluate performance on the original data, on augmented data, and after adding the focal loss. Focal loss is the best technique to solve overfitting problems by assigning weights to hard misclassified images. RESULTS: Finally, augmented data with focal loss gives good classification performance, with 98.85% accuracy, 95.52% precision, and 95.93% recall. Matthews Correlation Coefficient (MCC) is the best metric to evaluate the quality of multi-class images. It has given outstanding performance by using augmented data and focal loss.
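For readers unfamiliar with focal loss, the standard formulation scales the cross-entropy term by (1 - p_t)^gamma, so confidently correct samples contribute little and hard samples dominate. The sketch below shows that formulation for one sample; gamma = 2 and alpha = 1 are common defaults assumed here, not values reported in the abstract.

```python
import math


def cross_entropy(probs, target):
    """Standard cross-entropy for one sample given class probabilities."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss: cross-entropy scaled by alpha * (1 - p_t) ** gamma."""
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical softmax outputs for two samples whose true class is 0.
easy = [0.9, 0.05, 0.05]   # confidently correct
hard = [0.2, 0.5, 0.3]     # misclassified

easy_scale = focal_loss(easy, 0) / cross_entropy(easy, 0)
hard_scale = focal_loss(hard, 0) / cross_entropy(hard, 0)
```

With these numbers the easy sample keeps about 1% of its cross-entropy loss while the hard sample keeps 64%, which is the down-weighting of easy examples that the abstract describes as assigning weights to hard misclassified images.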
Affiliation(s)
- S L Aarthy
  - SCOPE, Vellore Institute of Technology, Vellore, Tamil Nadu, India
|
32
|
Liang B, Wang X, Zhao W, Wang X. High-Precision Carton Detection Based on Adaptive Image Augmentation for Unmanned Cargo Handling Tasks. Sensors (Basel) 2023; 24:12. [PMID: 38202874 PMCID: PMC10780547 DOI: 10.3390/s24010012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 12/04/2023] [Accepted: 12/15/2023] [Indexed: 01/12/2024]
Abstract
Unattended intelligent cargo handling is an important means to improve the efficiency and safety of port cargo trans-shipment, where high-precision carton detection is an unquestioned prerequisite. Therefore, this paper introduces an adaptive image augmentation method for high-precision carton detection. First, the imaging parameters of the images are clustered into various scenarios, and the imaging parameters and perspectives are adaptively adjusted to achieve the automatic augmenting and balancing of the carton dataset in each scenario, which reduces the interference of the scenarios on the carton detection precision. Then, the carton boundary features are extracted and stochastically sampled to synthesize new images, thus enhancing the detection performance of the trained model for dense cargo boundaries. Moreover, the weight function of the hyperparameters of the trained model is constructed to achieve their preferential crossover during genetic evolution to ensure the training efficiency of the augmented dataset. Finally, an intelligent cargo handling platform is developed and field experiments are conducted. The outcomes of the experiments reveal that the method attains a detection precision of 0.828. This technique significantly enhances the detection precision by 18.1% and 4.4% when compared to the baseline and other methods, which provides a reliable guarantee for intelligent cargo handling processes.
Affiliation(s)
- Bing Liang
  - Naval Architecture and Ocean Engineering College, Dalian Maritime University, Dalian 116026, China
|
33
|
Li J, Peng C. Weighted residual network for SAR automatic target recognition with data augmentation. Front Neurorobot 2023; 17:1298653. [PMID: 38169785 PMCID: PMC10758409 DOI: 10.3389/fnbot.2023.1298653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024] Open
Abstract
Introduction: Decades of research have been dedicated to overcoming the obstacles inherent in synthetic aperture radar (SAR) automatic target recognition (ATR). The rise of deep learning technologies has brought a wave of new possibilities, demonstrating significant progress in the field. However, challenges like the susceptibility of SAR images to noise, the requirement for large-scale training datasets, and the often protracted duration of model training still persist. Methods: This paper introduces a novel data augmentation strategy to address these issues. Our method involves the intentional addition and subsequent removal of speckle noise to artificially enlarge the scope of training data through noise perturbation. Furthermore, we propose a modified network architecture named weighted ResNet, which incorporates residual strain controls for enhanced performance. This network is designed to be computationally efficient and to minimize the amount of training data required. Results: Through rigorous experimental analysis, our research confirms that the proposed data augmentation method, when used in conjunction with the weighted ResNet model, significantly reduces the time needed for training. It also improves the SAR ATR capabilities. Discussion: Compared to existing models and methods tested, the combination of our data augmentation scheme and the weighted ResNet framework achieves higher computational efficiency and better recognition accuracy in SAR ATR applications. This suggests that our approach could be a valuable advancement in the field of SAR image analysis.
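A rough sketch of the "add then remove speckle" idea follows, assuming multiplicative gamma-distributed speckle with unit mean and a plain 3x3 box filter as the despeckling step. Both choices are stand-ins picked for illustration; the abstract does not name the noise model or the filter the authors use.

```python
import random


def add_speckle(image, looks=4, rng=None):
    """Multiply each pixel by unit-mean gamma noise (assumed speckle model)."""
    rng = rng or random.Random()
    return [[px * rng.gammavariate(looks, 1.0 / looks) for px in row]
            for row in image]


def box_filter(image):
    """3x3 mean filter with edge clamping, used here as a simple despeckler."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


rng = random.Random(0)
clean = [[1.0] * 32 for _ in range(32)]       # flat synthetic patch
noisy = add_speckle(clean, looks=4, rng=rng)  # perturbed copy
restored = box_filter(noisy)                  # despeckled copy
```

Under these assumptions, each original patch yields extra training variants (the perturbed copy and its filtered version) without collecting new SAR imagery.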
Affiliation(s)
- Cheng Peng
  - School of Electrical and Mechanical Engineering, Hefei Technology College, Hefei, China
|
34
|
Ellis CA, Miller RL, Calhoun VD. Evaluating Augmentation Approaches for Deep Learning-based Major Depressive Disorder Diagnosis with Raw Electroencephalogram Data. bioRxiv 2023:2023.12.15.571938. [PMID: 38187601 PMCID: PMC10769199 DOI: 10.1101/2023.12.15.571938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
While deep learning methods are increasingly applied in research contexts for neuropsychiatric disorder diagnosis, small dataset size limits their potential for clinical translation. Data augmentation (DA) could address this limitation, but the utility of EEG DA methods remains relatively underexplored in neuropsychiatric disorder diagnosis. In this study, we train a model for major depressive disorder diagnosis. We then evaluate the utility of 6 EEG DA approaches. Importantly, to remove the bias that could be introduced by comparing performance for models trained on larger augmented training sets to models trained on smaller baseline sets, we also introduce a new baseline trained on duplicate training data to better. We lastly examine the effects of the DA approaches upon representations learned by the model with a pair of explainability analyses. We find that while most approaches boost model performance, they do not improve model performance beyond that of simply using a duplicate training set without DA. The exception to this is channel dropout augmentation, which does improve model performance. These findings suggest the importance of comparing EEG DA methods to a baseline with a duplicate training set of equal size to the augmented training set. We also found that some DA methods increased model robustness to frequency (Fourier transform surrogates) and channel (channel dropout) perturbation. While our findings on EEG DA efficacy are restricted to our dataset and model, we hope that future studies on deep learning for small EEG datasets and on new EEG DA methods will find our findings helpful.
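Channel dropout, the one augmentation reported here to beat the duplicated-data baseline, can be sketched as zeroing whole EEG channels at random. The drop probability, the zero-fill choice, and the toy recording below are assumptions for illustration only.

```python
import random


def channel_dropout(eeg, p_drop=0.2, rng=None):
    """Return a copy of a channels-by-samples recording with random channels zeroed."""
    rng = rng or random.Random()
    augmented = []
    for channel in eeg:
        if rng.random() < p_drop:
            augmented.append([0.0] * len(channel))  # drop the whole channel
        else:
            augmented.append(list(channel))         # keep an untouched copy
    return augmented


# Toy recording: 3 channels, 4 time samples each (made-up values).
recording = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 1.1, 1.2, 1.3],
    [-0.5, -0.4, -0.3, -0.2],
]

augmented = channel_dropout(recording, p_drop=0.2, rng=random.Random(42))
```

Training on such copies discourages the model from relying on any single electrode, which is consistent with the increased robustness to channel perturbation reported in the abstract.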
Affiliation(s)
- Charles A Ellis
  - Center for Translational Research in Neuroimaging and Data Science Georgia State University, Emory University, Georgia Institute of Technology Atlanta, USA
- Robyn L Miller
  - Center for Translational Research in Neuroimaging and Data Science Georgia State University, Emory University, Georgia Institute of Technology Atlanta, USA
- Vince D Calhoun
  - Center for Translational Research in Neuroimaging and Data Science Georgia State University, Emory University, Georgia Institute of Technology Atlanta, USA
|
35
|
Daneshgar Rahbar M, Mousavi Mojab SZ. Enhanced U-Net with GridMask (EUGNet): A Novel Approach for Robotic Surgical Tool Segmentation. J Imaging 2023; 9:282. [PMID: 38132700 PMCID: PMC10744415 DOI: 10.3390/jimaging9120282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 12/13/2023] [Accepted: 12/15/2023] [Indexed: 12/23/2023] Open
Abstract
This study proposed enhanced U-Net with GridMask (EUGNet) image augmentation techniques focused on pixel manipulation, emphasizing GridMask augmentation. This study introduces EUGNet, which incorporates GridMask augmentation to address U-Net's limitations. EUGNet features a deep contextual encoder, residual connections, class-balancing loss, adaptive feature fusion, GridMask augmentation module, efficient implementation, and multi-modal fusion. These innovations enhance segmentation accuracy and robustness, making it well-suited for medical image analysis. The GridMask algorithm is detailed, demonstrating its distinct approach to pixel elimination, enhancing model adaptability to occlusions and local features. A comprehensive dataset of robotic surgical scenarios and instruments is used for evaluation, showcasing the framework's robustness. Specifically, there are improvements of 1.6 percentage points in balanced accuracy for the foreground, 1.7 points in intersection over union (IoU), and 1.7 points in mean Dice similarity coefficient (DSC). These improvements are highly significant and have a substantial impact on inference speed. The inference speed, which is a critical factor in real-time applications, has seen a noteworthy reduction. It decreased from 0.163 milliseconds for the U-Net without GridMask to 0.097 milliseconds for the U-Net with GridMask.
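The pixel-elimination pattern of GridMask can be sketched as a binary mask that zeroes one square block inside every grid cell. This simplified version omits the random rotation and offsets used in the original GridMask method, and the unit size and keep ratio are assumed values, not EUGNet's settings.

```python
def grid_mask(height, width, unit=4, keep_ratio=0.5, offset_y=0, offset_x=0):
    """Binary mask that is 0 inside a square block of every unit-by-unit grid cell."""
    keep = int(round(unit * keep_ratio))  # pixels kept at the start of each period
    mask = [[1] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            in_drop_rows = (y + offset_y) % unit >= keep
            in_drop_cols = (x + offset_x) % unit >= keep
            if in_drop_rows and in_drop_cols:
                mask[y][x] = 0
    return mask


def apply_mask(image, mask):
    """Element-wise product of an image and a binary mask."""
    return [[px * m for px, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]


mask = grid_mask(8, 8, unit=4, keep_ratio=0.5)
image = [[float(8 * y + x) for x in range(8)] for y in range(8)]  # toy image
masked = apply_mask(image, mask)
dropped = sum(row.count(0) for row in mask)
```

With an 8x8 image, a unit of 4, and a keep ratio of 0.5, each of the four grid cells loses one 2x2 block, so the dropped regions stay evenly spread instead of clustering as they can with purely random erasing.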
Affiliation(s)
- Mostafa Daneshgar Rahbar
  - Department of Electrical and Computer Engineering, Lawrence Technological University, Southfield, MI 48075, USA
|
36
|
Li X, Cai W, Xu B, Jiang Y, Qi M, Wang M. SEResUTer: a deep learning approach for accurate ECG signal delineation and atrial fibrillation detection. Physiol Meas 2023; 44:125005. [PMID: 37827168 DOI: 10.1088/1361-6579/ad02da] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Accepted: 10/12/2023] [Indexed: 10/14/2023]
Abstract
Objective. Accurate detection of electrocardiogram (ECG) waveforms is crucial for computer-aided diagnosis of cardiac abnormalities. This study introduces SEResUTer, an enhanced deep learning model designed for ECG delineation and atrial fibrillation (AF) detection. Approach. Built upon a U-Net architecture, SEResUTer incorporates ResNet modules and Transformer encoders to replace convolution blocks, resulting in improved optimization and encoding capabilities. A novel masking strategy is proposed to handle incomplete expert annotations. The model is trained on the QT database (QTDB) and evaluated on the Lobachevsky University Electrocardiography Database (LUDB) to assess its generalization performance. Additionally, the model's scope is extended to AF detection using the China Physiological Signal Challenge 2021 (CPSC2021) and the China Physiological Signal Challenge 2018 (CPSC2018) datasets. Main results. The proposed model surpasses existing traditional and deep learning approaches in ECG waveform delineation on the QTDB. It achieves remarkable average F1 scores of 99.14%, 98.48%, and 98.46% for P wave, QRS wave, and T wave delineation, respectively. Moreover, the model demonstrates exceptional generalization ability on the LUDB, achieving average SE, positive prediction rate, and F1 scores of 99.05%, 94.59%, and 94.62%, respectively. By analyzing RR interval differences and the existence of P waves, our method achieves AF identification with 99.20% accuracy on the CPSC2021 test set and demonstrates strong generalization on the CPSC2018 dataset. Significance. The proposed approach enables highly accurate ECG waveform delineation and AF detection, facilitating automated analysis of large-scale ECG recordings and improving the diagnosis of cardiac abnormalities.
Affiliation(s)
- Xinyue Li
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, People's Republic of China
- Wenjie Cai
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, People's Republic of China
- Bolin Xu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, People's Republic of China
- Yupeng Jiang
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, People's Republic of China
- Mengdi Qi
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, People's Republic of China
- Mingjie Wang
- Shanghai Key Laboratory of Bioactive Small Molecules, School of Basic Medical Science, Fudan University, Shanghai, 200032, People's Republic of China
37
Achenbach P, Laux S, Purdack D, Müller PN, Göbel S. Give Me a Sign: Using Data Gloves for Static Hand-Shape Recognition. Sensors (Basel) 2023; 23:9847. [PMID: 38139692 PMCID: PMC10747392 DOI: 10.3390/s23249847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 12/07/2023] [Accepted: 12/13/2023] [Indexed: 12/24/2023]
Abstract
Human-to-human communication via the computer is mainly carried out using a keyboard or microphone. In the field of virtual reality (VR), where the most immersive experience possible is desired, the use of a keyboard contradicts this goal, while the use of a microphone is not always desirable (e.g., silent commands during task-force training) or simply not possible (e.g., if the user has hearing loss). Data gloves help to increase immersion within VR, as they correspond to our natural interaction. At the same time, they offer the possibility of accurately capturing hand shapes, such as those used in non-verbal communication (e.g., thumbs up, okay gesture, …) and in sign language. In this paper, we present a hand-shape recognition system using Manus Prime X data gloves, including data acquisition, data preprocessing, and data classification to enable non-verbal communication within VR. We investigate the impact on accuracy and classification time of using an outlier detection and a feature selection approach in our data preprocessing. To obtain a more generalized approach, we also studied the impact of artificial data augmentation, i.e., we created new artificial data from the recorded and filtered data to augment the training data set. With our approach, 56 different hand shapes could be distinguished with an accuracy of up to 93.28%. With a reduced number of 27 hand shapes, an accuracy of up to 95.55% could be achieved. The voting meta-classifier (VL2) proved to be the most accurate, albeit slowest, classifier. A good alternative is random forest (RF), which was even able to achieve better accuracy values in a few cases and was generally somewhat faster. Outlier detection proved to be an effective approach, especially in improving the classification time. Overall, we have shown that our hand-shape recognition system using data gloves is suitable for communication within VR.
Affiliation(s)
- Philipp Achenbach
- Serious Games Group, Technical University of Darmstadt, 64289 Darmstadt, Germany (D.P.); (S.G.)
38
Zhang Z, Ma L, Wei C, Yang M, Qin S, Lv X, Zhang Z. Cotton Fusarium wilt diagnosis based on generative adversarial networks in small samples. Front Plant Sci 2023; 14:1290774. [PMID: 38162306 PMCID: PMC10754962 DOI: 10.3389/fpls.2023.1290774] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2023] [Accepted: 11/17/2023] [Indexed: 01/03/2024]
Abstract
This study aimed to explore the feasibility of applying Generative Adversarial Networks (GANs) for the diagnosis of Verticillium wilt disease in cotton and compared it with traditional data augmentation methods and transfer learning. By designing a model based on small-sample learning, we proposed an innovative cotton Verticillium wilt disease diagnosis system. The system uses Convolutional Neural Networks (CNNs) as feature extractors and applies trained GAN models for sample augmentation to improve classification accuracy. This study collected and processed a dataset of cotton Verticillium wilt disease images, including samples from normal and infected plants. Data augmentation techniques were used to expand the dataset and train the CNNs. Transfer learning using InceptionV3 was applied to train the CNNs on the dataset. The dataset was augmented using GAN algorithms and used to train CNNs. The performances of the data augmentation, transfer learning, and GANs were compared and analyzed. The results demonstrated that augmenting the cotton Verticillium wilt disease image dataset using GAN algorithms enhanced the diagnostic accuracy and recall rate of the CNNs. Compared to traditional data augmentation methods, GANs exhibited better performance and generated more representative and diverse samples. Unlike transfer learning, GANs ensured an adequate sample size. By visualizing the images generated, GANs were found to generate realistic cotton images of Verticillium wilt disease, highlighting their potential applications in agricultural disease diagnosis. This study has demonstrated the potential of GANs in the diagnosis of cotton Verticillium wilt disease, offering an effective approach for agricultural disease detection and providing insights into disease detection in other crops.
Affiliation(s)
- Zhenghang Zhang
- Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, Shihezi University College of Agriculture, Shihezi, China
- National-Local Joint Engineering Research Center of Xinjiang Production and Construction Corps XPCC's Agricultural Big Data, Shihezi, China
- Lulu Ma
- Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, Shihezi University College of Agriculture, Shihezi, China
- National-Local Joint Engineering Research Center of Xinjiang Production and Construction Corps XPCC's Agricultural Big Data, Shihezi, China
- Chunyue Wei
- Agricultural Development Service Center, Fifty-first Mission, Third Division, Tumushuke, China
- Mi Yang
- Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, Shihezi University College of Agriculture, Shihezi, China
- National-Local Joint Engineering Research Center of Xinjiang Production and Construction Corps XPCC's Agricultural Big Data, Shihezi, China
- Shizhe Qin
- Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, Shihezi University College of Agriculture, Shihezi, China
- National-Local Joint Engineering Research Center of Xinjiang Production and Construction Corps XPCC's Agricultural Big Data, Shihezi, China
- Xin Lv
- Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, Shihezi University College of Agriculture, Shihezi, China
- National-Local Joint Engineering Research Center of Xinjiang Production and Construction Corps XPCC's Agricultural Big Data, Shihezi, China
- Ze Zhang
- Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, Shihezi University College of Agriculture, Shihezi, China
- National-Local Joint Engineering Research Center of Xinjiang Production and Construction Corps XPCC's Agricultural Big Data, Shihezi, China
39
Koslovsky MD. A Bayesian zero-inflated Dirichlet-multinomial regression model for multivariate compositional count data. Biometrics 2023; 79:3239-3251. [PMID: 36896642 DOI: 10.1111/biom.13853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 02/23/2023] [Indexed: 03/11/2023]
Abstract
The Dirichlet-multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data generated by high-throughput sequencing technology in omics research due to its ability to accommodate the compositional structure of the data as well as overdispersion. A major limitation of the DM distribution is that it is unable to handle excess zeros typically found in practice which may bias inference. To fill this gap, we propose a novel Bayesian zero-inflated DM model for multivariate compositional count data with excess zeros. We then extend our approach to regression settings and embed sparsity-inducing priors to perform variable selection for high-dimensional covariate spaces. Throughout, modeling decisions are made to boost scalability without sacrificing interpretability or imposing limiting assumptions. Extensive simulations and an application to a human gut microbiome dataset are presented to compare the performance of the proposed method to existing approaches. We provide an accompanying R package with a user-friendly vignette to apply our method to other datasets.
Affiliation(s)
- Matthew D Koslovsky
- Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
40
Wang X, Chu Y, Wang Q, Cao L, Qiao L, Zhang L, Liu M. Unsupervised contrastive graph learning for resting-state functional MRI analysis and brain disorder detection. Hum Brain Mapp 2023; 44:5672-5692. [PMID: 37668327 PMCID: PMC10619386 DOI: 10.1002/hbm.26469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 07/08/2023] [Accepted: 08/11/2023] [Indexed: 09/06/2023] Open
Abstract
Resting-state functional magnetic resonance imaging (rs-fMRI) helps characterize regional interactions that occur in the human brain at a resting state. Existing research often attempts to explore fMRI biomarkers that best predict brain disease progression using machine/deep learning techniques. Previous fMRI studies have shown that learning-based methods usually require a large amount of labeled training data, limiting their utility in clinical practice where annotating data is often time-consuming and labor-intensive. To this end, we propose an unsupervised contrastive graph learning (UCGL) framework for fMRI-based brain disease analysis, in which a pretext model is designed to generate informative fMRI representations using unlabeled training data, followed by model fine-tuning to perform downstream disease identification tasks. Specifically, in the pretext model, we first design a bi-level fMRI augmentation strategy to increase the sample size by augmenting blood-oxygen-level-dependent (BOLD) signals, and then employ two parallel graph convolutional networks for fMRI feature extraction in an unsupervised contrastive learning manner. This pretext model can be optimized on large-scale fMRI datasets, without requiring labeled training data. This model is further fine-tuned on to-be-analyzed fMRI data for downstream disease detection in a task-oriented learning manner. We evaluate the proposed method on three rs-fMRI datasets for cross-site and cross-dataset learning tasks. Experimental results suggest that the UCGL outperforms several state-of-the-art approaches in automated diagnosis of three brain diseases (i.e., major depressive disorder, autism spectrum disorder, and Alzheimer's disease) with rs-fMRI data.
Affiliation(s)
- Xiaochuan Wang
- The School of Mathematics Science, Liaocheng University, Liaocheng, China
- Ying Chu
- The School of Mathematics Science, Liaocheng University, Liaocheng, China
- Qianqian Wang
- The Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Liang Cao
- Taian Tumor Prevention and Treatment Hospital, Taian, China
- Lishan Qiao
- The School of Mathematics Science, Liaocheng University, Liaocheng, China
- Limei Zhang
- School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Mingxia Liu
- The Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
41
Van Ee JJ, Hagen CA, Pavlacky DC Jr, Fricke KA, Koslovsky MD, Hooten MB. Melding wildlife surveys to improve conservation inference. Biometrics 2023; 79:3941-3953. [PMID: 37443410 DOI: 10.1111/biom.13903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 06/29/2023] [Indexed: 07/15/2023]
Abstract
Integrated models are a popular tool for analyzing species of conservation concern. Species of conservation concern are often monitored by multiple entities that generate several datasets. Individually, these datasets may be insufficient for guiding management due to low spatio-temporal resolution, biased sampling, or large observational uncertainty. Integrated models provide an approach for assimilating multiple datasets in a coherent framework that can compensate for these deficiencies. While conventional integrated models have been used to assimilate count data with surveys of survival, fecundity, and harvest, they can also assimilate ecological surveys that have differing spatio-temporal regions and observational uncertainties. Motivated by independent aerial and ground surveys of lesser prairie-chicken, we developed an integrated modeling approach that assimilates density estimates derived from surveys with distinct sources of observational error into a joint framework that provides shared inference on spatio-temporal trends. We model these data using a Bayesian Markov melding approach and apply several data augmentation strategies for efficient sampling. In a simulation study, we show that our integrated model improved predictive performance relative to models for analyzing the surveys independently. We use the integrated model to facilitate prediction of lesser prairie-chicken density at unsampled regions and perform a sensitivity analysis to quantify the inferential cost associated with reduced survey effort.
Affiliation(s)
- Justin J Van Ee
- Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
- Christian A Hagen
- Department of Fisheries, Wildlife, and Conservation Sciences, Oregon State University, Corvallis, Oregon, USA
- David C Pavlacky Jr
- Bird Conservancy of the Rockies, Brighton, Colorado, USA
- Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, Colorado, USA
- Kent A Fricke
- Kansas Department of Wildlife and Parks, Emporia, Kansas, USA
- Matthew D Koslovsky
- Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
- Mevin B Hooten
- Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, Texas, USA
42
Lu X, Hooten MB, Raiho AM, Swanson DK, Roland CA, Stehn SE. Latent trajectory models for spatio-temporal dynamics in Alaskan ecosystems. Biometrics 2023; 79:3664-3675. [PMID: 36715694 DOI: 10.1111/biom.13832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Accepted: 01/13/2023] [Indexed: 01/31/2023]
Abstract
The Alaskan landscape has undergone substantial changes in recent decades, most notably the expansion of shrubs and trees across the Arctic. We developed a Bayesian hierarchical model to quantify the impact of climate change on the structural transformation of ecosystems using remotely sensed imagery. We used latent trajectory processes to model dynamic state probabilities that evolve annually, from which we derived transition probabilities between ecotypes. Our latent trajectory model accommodates temporal irregularity in survey intervals and uses spatio-temporally heterogeneous climate drivers to infer rates of land cover transitions. We characterized multi-scale spatial correlation induced by plot and subplot arrangements in our study system. We also developed a Pólya-Gamma sampling strategy to improve computation. Our model facilitates inference on the response of ecosystems to shifts in the climate and can be used to predict future land cover transitions under various climate scenarios.
Affiliation(s)
- Xinyi Lu
- Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
- Mevin B Hooten
- Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, Texas, USA
- Ann M Raiho
- The National Aeronautics and Space Administration (NASA) Goddard Space Flight Center, Greenbelt, Maryland, USA
- Earth System Science Interdisciplinary Center, University of Maryland, College Park, Maryland, USA
- Carl A Roland
- Denali National Park and Preserve, Denali Park, Alaska, USA
- Central Alaska Network Inventory and Monitoring Program, Fairbanks, Alaska, USA
- Sarah E Stehn
- Denali National Park and Preserve, Denali Park, Alaska, USA
- Central Alaska Network Inventory and Monitoring Program, Fairbanks, Alaska, USA
43
Dong H, Zhang Y, Gu H, Konz N, Zhang Y, Mazurowski MA. SWSSL: Sliding Window-Based Self-Supervised Learning for Anomaly Detection in High-Resolution Images. IEEE Trans Med Imaging 2023; 42:3860-3870. [PMID: 37695965 PMCID: PMC10766076 DOI: 10.1109/tmi.2023.3314318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/13/2023]
Abstract
Anomaly detection (AD) aims to determine if an instance has properties different from those seen in normal cases. The success of this technique depends on how well a neural network learns from normal instances. We observe that the learning difficulty scales exponentially with the input resolution, making it infeasible to apply AD to high-resolution images. Resizing them to a lower resolution is a compromise and does not align with clinical practice, where the diagnosis could depend on image details. In this work, we propose to train the network and perform inference at the patch level, through the sliding window algorithm. This simple operation allows the network to receive high-resolution images but introduces additional training difficulties, including inconsistent image structure and higher variance. We address these concerns by setting the network's objective to learn augmentation-invariant features. We further study the augmentation function in the context of medical imaging. In particular, we observe that the resizing operation, a key augmentation in general computer vision literature, is detrimental to detection accuracy, and the inverting operation can be beneficial. We also propose a new module that encourages the network to learn from adjacent patches to boost detection performance. Extensive experiments are conducted on breast tomosynthesis and chest X-ray datasets, and our method improves AUC on image-level classification by 8.03% and 5.66%, respectively, over the current leading techniques. The experimental results demonstrate the effectiveness of our approach.
44
Pu S, Zhang F, Shu Y, Fu W. Microscopic image recognition of diatoms based on deep learning. J Phycol 2023; 59:1166-1178. [PMID: 37994558 DOI: 10.1111/jpy.13390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 09/03/2023] [Accepted: 09/05/2023] [Indexed: 11/24/2023]
Abstract
Diatoms are a crucial component in the study of aquatic ecosystems and ancient environmental records. However, traditional methods for identifying diatoms, such as morphological taxonomy and molecular detection, are costly, are time consuming, and have limitations. To address these issues, we developed an extensive collection of diatom images, consisting of 7983 images from 160 genera and 1042 species, which we expanded to 49,843 through preprocessing, segmentation, and data augmentation. Our study compared the performance of different algorithms, including backbones, batch sizes, dynamic data augmentation, and static data augmentation on experimental results. We determined that the ResNet152 network outperformed other networks, producing the most accurate results with top-1 and top-5 accuracies of 85.97% and 95.26%, respectively, in identifying 1042 diatom species. Additionally, we propose a method that combines model prediction and cosine similarity to enhance the model's performance in low-probability predictions, achieving an 86.07% accuracy rate in diatom identification. Our research contributes significantly to the recognition and classification of diatom images and has potential applications in water quality assessment, ecological monitoring, and detecting changes in aquatic biodiversity.
Affiliation(s)
- Siyue Pu
- College of Computer and Information Engineering (College of Artificial Intelligence), Nanjing Tech University, Nanjing, China
- Fan Zhang
- Ocean College, Zhejiang University, Zhoushan, China
- Kavli Institute for Astrophysics and Space Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Yuexuan Shu
- Ocean College, Zhejiang University, Zhoushan, China
- Weiqi Fu
- Ocean College, Zhejiang University, Zhoushan, China
- Center for Systems Biology and Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
45
Nasaev SS, Mukanov AR, Kuznetsov II, Veselovsky AV. AliNA - a deep learning program for RNA secondary structure prediction. Mol Inform 2023; 42:e202300113. [PMID: 37710142 DOI: 10.1002/minf.202300113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 09/13/2023] [Accepted: 09/14/2023] [Indexed: 09/16/2023]
Abstract
Numerous natural RNA variants participating in different cellular processes have now been discovered, alongside artificial RNAs such as aptamers and riboswitches. Investigating their functions, their mechanisms of influence on cells, and their interactions with targets requires the prediction of RNA secondary structures. Classic thermodynamics-based prediction algorithms do not consider the specificity of biological folding, and the deep learning methods designed to resolve this issue suffer from the problems of homology-based methods. Herein, we present AliNA (ALIgned Nucleic Acids), a deep learning method for RNA secondary structure prediction. Our method successfully predicts secondary structures for RNA families that are non-homologous to the training data, thanks to data augmentation techniques that extend existing datasets with easily accessible simulated data. The proposed method shows high prediction quality across different benchmarks, including pseudoknots. The method is freely available on GitHub (https://github.com/Arty40m/AliNA).
Affiliation(s)
- Shamsudin S Nasaev
- Institute of Biomedical Chemistry, 10, Pogodinskaya str., 119121, Moscow, Russia
- Artem R Mukanov
- A.M. Butlerov Institute of Chemistry, Kazan Federal University, 18, Kremlyovskaya str., 420008, Kazan, Russia
- Ivan I Kuznetsov
- Moscow University of Finance and Law, 10 block 1, Serpuhovsky val str., 115191, Moscow, Russia
46
Gouda MA, Hong W, Jiang D, Feng N, Zhou B, Li Z. Synthesis of sEMG Signals for Hand Gestures Using a 1DDCGAN. Bioengineering (Basel) 2023; 10:1353. [PMID: 38135944 PMCID: PMC10740493 DOI: 10.3390/bioengineering10121353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 11/18/2023] [Accepted: 11/20/2023] [Indexed: 12/24/2023] Open
Abstract
The emergence of modern prosthetics controlled by bio-signals has been facilitated by AI and microchip technology innovations. AI algorithms are trained using sEMG produced by muscles during contractions. The data acquisition procedure may result in discomfort and fatigue, particularly for amputees. Furthermore, prosthetic companies restrict sEMG signal exchange, limiting data-driven research and reproducibility. GANs present a viable solution to the aforementioned concerns. GANs can generate high-quality sEMG, which can be utilised for data augmentation, decrease the training time required by prosthetic users, enhance classification accuracy and ensure research reproducibility. This research proposes the utilisation of a one-dimensional deep convolutional GAN (1DDCGAN) to generate the sEMG of hand gestures. This approach involves the incorporation of dynamic time warping, fast Fourier transform and wavelets as discriminator inputs. Two datasets were utilised to validate the methodology, where five windows and increments were utilised to extract features to evaluate the synthesised sEMG quality. In addition to the traditional classification and augmentation metrics, two novel metrics, the Mantel test and the classifier two-sample test, were used for evaluation. The 1DDCGAN preserved the inter-feature correlations and generated high-quality signals, which resembled the original data. Additionally, the classification accuracy improved by an average of 1.21-5%.
Affiliation(s)
- Wang Hong
- Department of Mechanical Engineering and Automation, Northeastern University, Shenyang 110819, China; (M.A.G.); (D.J.); (N.F.); (B.Z.); (Z.L.)
47
Esmaeili F, Cassie E, Nguyen HPT, Plank NOV, Unsworth CP, Wang A. Utilizing Deep Learning Algorithms for Signal Processing in Electrochemical Biosensors: From Data Augmentation to Detection and Quantification of Chemicals of Interest. Bioengineering (Basel) 2023; 10:1348. [PMID: 38135939 PMCID: PMC10740562 DOI: 10.3390/bioengineering10121348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 11/14/2023] [Accepted: 11/21/2023] [Indexed: 12/24/2023] Open
Abstract
Nanomaterial-based aptasensors serve as useful instruments for detecting small biological entities. This work utilizes data gathered from three electrochemical aptamer-based sensors varying in receptors, analytes of interest, and lengths of signals. Our ultimate objective was the automatic detection and quantification of target analytes from a segment of the signal recorded by these sensors. Initially, we proposed a data augmentation method using conditional variational autoencoders to address data scarcity. Secondly, we employed recurrent-based networks for signal extrapolation, ensuring uniform signal lengths. In the third step, we developed seven deep learning classification models (GRU, unidirectional LSTM (ULSTM), bidirectional LSTM (BLSTM), ConvGRU, ConvULSTM, ConvBLSTM, and CNN) to identify and quantify specific analyte concentrations for six distinct classes, ranging from the absence of analyte to 10 μM. Finally, a second classification model was created to distinguish between abnormal and normal data segments, detect the presence or absence of analytes in the sample, and, if detected, identify the specific analyte and quantify its concentration. Evaluating the time series forecasting showed that the GRU-based network outperformed the ULSTM and BLSTM networks. Regarding the classification models, signal extrapolation turned out not to be effective in improving classification performance. Comparing the role of the network architectures in classification performance, the results showed that hybrid networks, including both convolutional and recurrent layers, and CNN networks achieved 82% to 99% accuracy across all three datasets. Utilizing the short-time Fourier transform (STFT) as the preprocessing technique improved performance on all datasets, with accuracies from 84% to 99%. These findings underscore the effectiveness of suitable data preprocessing methods in enhancing neural network performance, enabling automatic analyte identification and quantification from electrochemical aptasensor signals.
Affiliation(s)
- Fatemeh Esmaeili
  - Department of Engineering Science, University of Auckland, Auckland 1010, New Zealand
  - The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand
- Erica Cassie
  - The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand
  - School of Chemical and Physical Sciences, Victoria University of Wellington, Wellington 6021, New Zealand
- Hong Phan T. Nguyen
  - The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand
  - School of Chemical and Physical Sciences, Victoria University of Wellington, Wellington 6021, New Zealand
- Natalie O. V. Plank
  - The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand
  - School of Chemical and Physical Sciences, Victoria University of Wellington, Wellington 6021, New Zealand
- Charles P. Unsworth
  - Department of Engineering Science, University of Auckland, Auckland 1010, New Zealand
  - The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand
- Alan Wang
  - Auckland Bioengineering Institute, University of Auckland, Auckland 1010, New Zealand
  - Center for Medical Imaging, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1010, New Zealand
  - Centre for Brain Research, University of Auckland, Auckland 1010, New Zealand
48
Cai Z, Wong LM, Wong YH, Lee HL, Li KY, So TY. Dual-Level Augmentation Radiomics Analysis for Multisequence MRI Meningioma Grading. Cancers (Basel) 2023; 15:5459. [PMID: 38001719 PMCID: PMC10670283 DOI: 10.3390/cancers15225459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 11/07/2023] [Accepted: 11/14/2023] [Indexed: 11/26/2023]
Abstract
BACKGROUND: Preoperative, noninvasive prediction of meningioma grade is important for therapeutic planning and decision making. In this study, we propose a dual-level augmentation strategy incorporating image-level augmentation (IA) and feature-level augmentation (FA) to tackle class imbalance and improve the predictive performance of radiomics for meningioma grading on magnetic resonance imaging (MRI). METHODS: This study recruited 160 consecutive patients with pathologically proven meningioma (129 low-grade (WHO grade I) tumors; 31 high-grade (WHO grade II and III) tumors) with preoperative multisequence MRI. A dual-level augmentation strategy combining IA and FA was applied and evaluated in 100 repetitions of 3-, 5-, and 10-fold cross-validation. RESULTS: The best area under the receiver operating characteristic curve of our method in 100 repetitions was ≥0.78 in all cross-validations. The corresponding cross-validation sensitivities (specificities) were 0.72 (0.69), 0.76 (0.71), and 0.63 (0.82) in 3-, 5-, and 10-fold cross-validation, respectively. The proposed method achieved significantly better performance and distribution of results, outperforming single-level augmentation (IA or FA) and no augmentation in each cross-validation. CONCLUSIONS: The dual-level augmentation strategy using IA and FA significantly improves the performance of the radiomics model for meningioma grading on MRI, allowing better radiomics-based preoperative stratification and individualized treatment.
Affiliation(s)
- Tiffany Y. So
  - Department of Imaging and Interventional Radiology, The Chinese University of Hong Kong, Hong Kong SAR, China
49
Summerfield GI, De Freitas A, van Marle-Koster E, Myburgh HC. Automated Cow Body Condition Scoring Using Multiple 3D Cameras and Convolutional Neural Networks. Sensors (Basel) 2023; 23:9051. [PMID: 38005439 PMCID: PMC10675635 DOI: 10.3390/s23229051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 10/28/2023] [Accepted: 11/06/2023] [Indexed: 11/26/2023]
Abstract
Body condition scoring is an objective scoring method used to evaluate the health of a cow by determining its amount of subcutaneous fat. Automated body condition scoring is becoming vital to large commercial dairy farms, as it helps farmers score their cows more often and more consistently than manual scoring. A common approach to automated body condition scoring is to utilise a CNN-based model trained with data from a depth camera. The approaches presented in this paper make use of three depth cameras placed at different positions near the rear of a cow to train three independent CNNs. Ensemble modelling is used to combine the estimations of the three individual CNN models. The paper aims to test the performance impact of using ensemble modelling with the data from three separate depth cameras. The paper also looks at which of these three cameras, and which combinations thereof, provide a good balance between computational cost and performance. The results of this study show that utilising the data from three depth cameras to train three separate models merged through ensemble modelling yields significantly improved automated body condition scoring accuracy compared to an approach using a single depth camera and CNN model. This paper also explores the real-world performance of these models on embedded platforms by comparing the computational cost to the performance of the various models.
Affiliation(s)
- Gary I. Summerfield
  - Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria 0028, South Africa
- Allan De Freitas
  - Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria 0028, South Africa
- Herman C. Myburgh
  - Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria 0028, South Africa
50
Wang N, Zhang J, Song X. A Pipeline Defect Instance Segmentation System Based on SparseInst. Sensors (Basel) 2023; 23:9019. [PMID: 38005407 PMCID: PMC10675068 DOI: 10.3390/s23229019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 10/21/2023] [Accepted: 10/31/2023] [Indexed: 11/26/2023]
Abstract
Deep learning algorithms have achieved encouraging results for pipeline defect segmentation. However, existing defect segmentation methods may struggle to accurately segment the complex features of pipeline defects and suffer from low processing speeds. Therefore, in this study, we propose Pipe-Sparse-Net, a pipeline defect segmentation system that incorporates StyleGAN3 to segment the complex forms of underground drainage pipe defects. First, we introduce a data augmentation algorithm based on StyleGAN3 to enlarge the dataset. Next, we propose Pipe-Sparse-Net, a pipeline segmentation model based on SparseInst, to accurately predict the defect regions in drainage pipes. Experimental results demonstrate that the segmentation accuracy of this model can reach 91.4% with a processing speed of 56.7 frames per second (FPS). To validate the superiority of this method, comparative experiments were conducted against YOLACT, CondInst, and Mask R-CNN, and the model achieved a speed improvement of 45% while increasing the accuracy by more than 4%.
Affiliation(s)
- Niannian Wang
  - School of Water Conservancy and Transportation, Zhengzhou University, Zhengzhou 450001, China
- Jingzheng Zhang
  - School of Water Conservancy and Transportation, Zhengzhou University, Zhengzhou 450001, China
- Xiaotian Song
  - School of Engineering and Technology, China University of Geosciences (Beijing), Beijing 100083, China