1
Gupta M, Verma N, Sharma N, Singh SN, Brojen Singh RK, Sharma SK. Deep transfer learning hybrid techniques for precision in breast cancer tumor histopathology classification. Health Inf Sci Syst 2025; 13:20. [PMID: 39949707 PMCID: PMC11813847 DOI: 10.1007/s13755-025-00337-7] [Received: 06/30/2024] [Accepted: 01/07/2025] [Indexed: 02/16/2025]
Abstract
Breast cancer is one of the most prevalent causes of cancer-related death globally. Early diagnosis of breast cancer increases the patient's chances of survival. Breast cancer classification is a challenging problem due to dense tissue structures, subtle variations, cellular heterogeneity, artifacts, and variability. In this paper, we propose three hybrid deep-transfer learning models for breast cancer classification using histopathology images. These models use the Xception model as the base model, and we add seven more layers to fine-tune it. We also performed an extensive comparative analysis of five prominent machine-learning classifiers, namely Random Forest Classifier (RFC), Logistic Regression (LR), Support Vector Classifier (SVC), K-Nearest Neighbors (KNN), and AdaBoost. We incorporate the two best-performing classifiers, namely RFC and SVC, in the fine-tuned Xception model, and the resulting models are named Xception Random Forest (XRF) and Xception Support Vector (XSV), respectively. The fine-tuned Xception model with a softmax classifier is termed the Multi-layer Xception Classifier (MXC). These three models are evaluated on two publicly available datasets: BreakHis and Breast Histopathology Images Database (BHID). All three models perform better than the state-of-the-art methods. The XRF provides the best performance at the 40× magnification level on the BreakHis dataset, with an accuracy (ACC) of 94.44%, F1 score (F1) of 94.44%, area under the receiver operating characteristic curve (AUC) of 95.12%, Matthews correlation coefficient (MCC) of 88.98%, kappa (K) of 88.88%, and classification success index (CSI) of 89.23%. The MXC provides the best performance on the BHID dataset, with an ACC of 88.50%, F1 of 88.50%, AUC of 95.12%, MCC of 77.03%, K of 77.00%, and CSI of 79.13%. Further, to validate our models, we performed fivefold cross-validation on both datasets and obtained similar results.
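Several of the metrics reported in this abstract (ACC, F1, MCC, kappa) can be derived from a single binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function name `binary_metrics` and the counts are hypothetical, and the paper may use averaged (for example weighted) variants of F1.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1, MCC and Cohen's kappa from a 2x2 confusion matrix.

    Assumes every row and column of the matrix is non-empty,
    so that none of the denominators below is zero.
    """
    total = tp + fp + fn + tn
    acc = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (acc - p_chance) / (1 - p_chance)
    return {"acc": acc, "f1": f1, "mcc": mcc, "kappa": kappa}

# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=45, fp=5, fn=5, tn=45)
```

With balanced classes and symmetric errors, as in these made-up counts, MCC and kappa coincide, which is one reason the two values often sit close together in reported results.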
Affiliation(s)
- Muniraj Gupta
  - School of Computer & Systems Sciences, Jawaharlal Nehru University, New Delhi, 110067 India
- Nidhi Verma
  - Ramlal Anand College, University of Delhi, South Campus, Anand Niketan, New Delhi, 110021 India
- Naveen Sharma
  - Indian Council of Medical Research, New Delhi, 110029 India
- R. K. Brojen Singh
  - School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067 India
- Saurabh Kumar Sharma
  - School of Computer & Systems Sciences, Jawaharlal Nehru University, New Delhi, 110067 India
2
Xie Z, Chen Z, Yang Q, Ye Q, Li X, Xie Q, Liu C, Lin B, Han X, He Y, Wang X, Yang W, Zhao Y. Enhanced diagnosis of axial spondyloarthritis using machine learning with sacroiliac joint MRI: a multicenter study. Insights Imaging 2025; 16:91. [PMID: 40281350 PMCID: PMC12031678 DOI: 10.1186/s13244-025-01967-x] [Received: 11/07/2024] [Accepted: 04/01/2025] [Indexed: 04/29/2025]
Abstract
OBJECTIVES To develop a machine learning (ML)-based model using MRI and clinical risk factors to enhance diagnostic accuracy for axial spondyloarthritis (axSpA). METHODS We retrospectively analyzed datasets from four centers (A-D), focusing on patients with chronic low back pain. A subset from center A was used for prospective validation. A deep learning (DL) model based on ResNet50 was constructed using sacroiliac joint MRI. Clinical variables were integrated with DL scores in ML algorithms to distinguish axSpA from non-axSpA patients. Model performance was assessed by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. RESULTS The study included 1294 patients (median age 31 years [interquartile range 24-42]; 35.5% females). Clinical risk factors identified were age, sex, and human leukocyte antigen-B27 status. The MRI-based DL model demonstrated an AUC of 0.837, 0.636, 0.724, 0.710, and 0.812 on the internal test set, three external test sets, and the prospective validation set, respectively. The combined model, particularly the K-nearest-neighbors-11 algorithm, demonstrated superior performance across multiple test sets with AUCs ranging from 0.853 to 0.912. It surpassed the Assessment of SpondyloArthritis International Society criteria with better AUC (0.858 vs. 0.650, p < 0.001), sensitivity (87.8% vs. 42.4%, p < 0.001), and accuracy (78.7% vs. 56.9%, p < 0.001). CONCLUSION The ML method integrating MRI and clinical risk factors effectively identified axSpA, representing a promising tool for the diagnosis and management of axSpA. CLINICAL RELEVANCE STATEMENT The machine learning model combining MRI and clinical risk factors potentially enables earlier diagnosis and intervention for axial spondyloarthritis patients, reducing the delays commonly associated with traditional diagnostic approaches. 
KEY POINTS Axial spondyloarthritis (AxSpA) lacks definitive diagnostic criteria or markers, leading to diagnostic delay. MRI-based deep learning provided quantitative analysis of sacroiliac joint changes indicative of axSpA. A machine learning model combining sacroiliac joint MRI and clinical risk factors enhanced axSpA identification.
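The combined model described above feeds a deep-learning score and clinical variables into a K-nearest-neighbors classifier. The sketch below shows that idea in plain Python. It assumes that "K-nearest-neighbors-11" means k = 11, and the feature encoding (DL score, scaled age, sex, HLA-B27 status) and all data values are hypothetical placeholders, not the study's data or code.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=11):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors: [dl_score, age / 100, sex, hla_b27].
axspa = [[0.80 + 0.01 * i, 0.25, 1.0, 1.0] for i in range(12)]
non_axspa = [[0.20 + 0.01 * i, 0.45, 0.0, 0.0] for i in range(12)]
train_x = axspa + non_axspa
train_y = ["axSpA"] * 12 + ["non-axSpA"] * 12

pred_a = knn_predict(train_x, train_y, [0.85, 0.30, 1.0, 1.0], k=11)
pred_b = knn_predict(train_x, train_y, [0.22, 0.45, 0.0, 0.0], k=11)
```

Because KNN is distance based, features on very different scales would need normalization first; the toy vectors here are already on comparable scales.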
Affiliation(s)
- Zhuoyao Xie
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Zefeiyun Chen
  - Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou, China
- Qinmei Yang
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Qiang Ye
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Xin Li
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Qiuxia Xie
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Caolin Liu
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
  - Department of Radiology, The Sixth Affiliated Hospital of South China University of Technology, Nanhai, China
- Bomiao Lin
  - Department of Radiology, Zhujiang Hospital of Southern Medical University, Guangzhou, China
- Xinai Han
  - Department of Rheumatology and Immunology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Yi He
  - Department of Rheumatology and Immunology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
- Xiaohong Wang
  - Department of Radiology, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Wei Yang
  - Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou, China
- Yinghua Zhao
  - Department of Radiology, The Third Affiliated Hospital of Southern Medical University (Academy of Orthopedics Guangdong Province), Guangzhou, China
3
Zhu L, Wang R, Jin X, Li Y, Tian F, Cai R, Qian K, Hu X, Hu B, Yamamoto Y, Schuller BW. Explainable Depression Classification Based on EEG Feature Selection From Audio Stimuli. IEEE Trans Neural Syst Rehabil Eng 2025; 33:1411-1426. [PMID: 40173060 DOI: 10.1109/tnsre.2025.3557275] [Indexed: 04/04/2025]
Abstract
With the development of affective computing and Artificial Intelligence (AI) technologies, Electroencephalogram (EEG)-based depression detection methods have been widely proposed. However, existing studies have mostly focused on the accuracy of depression recognition, ignoring the association between features and models. Additionally, there is a lack of research on the contribution of different features to depression recognition. To this end, this study introduces an innovative approach to depression detection using EEG data, integrating Ant-Lion Optimization (ALO) and Multi-Agent Reinforcement Learning (MARL) for feature fusion analysis. The inclusion of Explainable Artificial Intelligence (XAI) methods enhances the explainability of the model's features. The Time-Delay Embedded Hidden Markov Model (TDE-HMM) is employed to infer internal brain states during depression, triggered by audio stimulation. The ALO-MARL algorithm, combined with hyper-parameter optimization of the XGBoost classifier, achieves high accuracy (93.69%), sensitivity (88.60%), specificity (97.08%), and F1-score (91.82%) on an auditory stimulus-evoked three-channel EEG dataset. The results suggest that this approach outperforms state-of-the-art feature selection methods for depression recognition on this dataset, and XAI elucidates the critical impact of the minimum value of Power Spectral Density (PSD), Sample Entropy (SampEn), and Rényi Entropy (Ren) on depression recognition. The study also explores dynamic brain state transitions revealed by audio stimuli, providing insights for the clinical application of AI algorithms in depression recognition.
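One of the features highlighted above, Rényi entropy, has a compact closed form: for a discrete distribution p and order α ≠ 1, H_α = log(Σ p_i^α) / (1 − α). The sketch below computes it from an amplitude histogram. The histogram step, the bin count, and the function names are illustrative assumptions; the abstract does not say how the authors estimate the distribution from the EEG signal.

```python
import math

def renyi_entropy(probs, alpha=2.0):
    """Rényi entropy of order alpha (natural log) of a discrete distribution."""
    if alpha <= 0 or alpha == 1.0:
        raise ValueError("alpha must be positive and different from 1")
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)

def histogram_probs(samples, bins=8):
    """Estimate a discrete distribution from samples with equal-width bins."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant signal
    counts = [0] * bins
    for s in samples:
        idx = min(int((s - lo) / width), bins - 1)  # clamp the maximum value
        counts[idx] += 1
    return [c / len(samples) for c in counts]

# Toy "signal" whose samples spread evenly over four bins.
probs = histogram_probs([0, 1, 2, 3, 4, 5, 6, 7], bins=4)
h2 = renyi_entropy(probs, alpha=2.0)
```

For a uniform distribution over n bins the result is log(n) for every valid α, which gives a quick sanity check on any implementation.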
4
Hu Y, Ngai CSB, Chen S. Automated Approaches to Screening Developmental Language Disorder: A Comprehensive Review and Future Prospects. Journal of Speech, Language, and Hearing Research: JSLHR 2025:1-21. [PMID: 40228046 DOI: 10.1044/2025_jslhr-24-00488] [Indexed: 04/16/2025]
Abstract
PURPOSE This study examines existing automatic screening methods for developmental language disorder (DLD), a neurodevelopmental language deficit without known biomedical etiologies, focusing on languages, data sets, extracted features, performance metrics, and classification methods. Additionally, it summarizes the strengths and weaknesses of current systems and explores future research opportunities and challenges. METHOD We conducted a systematic review, searching PubMed, Web of Science, Scopus, and PsycINFO for articles published in English before March 2024. We included studies that developed automated screening systems to classify DLD cases among children. RESULTS A total of 23 studies were thoroughly reviewed. We found that automatic screening models for DLD focused on five languages, namely, Czech, Italian, Mandarin, Spanish, and English, with various data sets employed. Most studies identified and used acoustic features, textural features, and combinations of speech and nonspeech features for model development. Traditional machine learning, artificial neural networks, convolutional neural networks, long short-term memory, and non-machine-learning classification methods were employed in model training. The need for larger, multilingual data sets and improved system sensitivity is noted. Future research opportunities include exploring the integration of combined features and algorithms; implementing new algorithms; and considering variations in age, gender, severity, and comorbidity in DLD. CONCLUSION This systematic review of existing automatic screening methods for DLD highlights significant advancements and suggests potential areas for future research on automatic DLD screening.
Affiliation(s)
- Yangna Hu
  - The Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hung Hom, Kowloon
- Cindy Sing Bik Ngai
  - The Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hung Hom, Kowloon
- Sihui Chen
  - The Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hung Hom, Kowloon
5
Fan X, Zhang Y, Peng Y, Li Q, Wei X, Wang J, Zou F. LO-MLPRNN: A Classification Algorithm for Multispectral Remote Sensing Images by Fusing Selective Convolution. Sensors (Basel, Switzerland) 2025; 25:2472. [PMID: 40285163 PMCID: PMC12030848 DOI: 10.3390/s25082472] [Received: 01/26/2025] [Revised: 03/28/2025] [Accepted: 04/14/2025] [Indexed: 04/29/2025]
Abstract
To address the limitation of traditional deep learning algorithms in fully utilizing contextual information in multispectral remote sensing (RS) images, this paper proposes an improved vegetation cover classification algorithm called LO-MLPRNN, which integrates Large Selective Kernel Network (LSK) and Omni-Dimensional Dynamic Convolution (ODC) with a Multi-Layer Perceptron Recurrent Neural Network (MLPRNN). The algorithm employs parallel-connected ODC and LSK modules to adaptively adjust convolution kernel parameters across multiple dimensions and dynamically optimize spatial receptive fields, enabling multi-perspective feature fusion for efficient processing of multispectral band information. The extracted features are mapped to a high-dimensional space through a Gated Recurrent Unit (GRU) and fully connected layers, with nonlinear characteristics enhanced by activation functions, ultimately achieving pixel-level land cover classification. Experiments conducted on GF-2 (0.75 m) and Sentinel-2 (10 m) multispectral RS images from Liucheng County, Liuzhou City, Guangxi Province, demonstrate that LO-MLPRNN achieves overall accuracies of 99.11% and 99.43%, outperforming Vision Transformer (ViT) by 2.61% and 3.98%, respectively. Notably, the classification accuracy for sugarcane reaches 99.70% and 99.67%, showcasing its superior performance.
Affiliation(s)
- Xiangsuo Fan
  - School of Automation, Guangxi University of Science and Technology, Liuzhou 545006, China
  - Guangxi Collaborative Innovation Centre for Earthmoving Machinery, Guangxi University of Science and Technology, Liuzhou 545006, China
  - Engineering Research Center of Advanced Engineering Equipment, University of Guangxi, Liuzhou 545006, China
- Yan Zhang
  - School of Automation, Guangxi University of Science and Technology, Liuzhou 545006, China
- Yong Peng
  - School of Automation, Guangxi University of Science and Technology, Liuzhou 545006, China
- Qi Li
  - School of Civil Engineering and Architecture, Guangxi University of Science and Technology, Liuzhou 545006, China
- Xianqiang Wei
  - Liuzhou Survey and Mapping Research Institute Co., Ltd., Liuzhou 545005, China
- Jiabin Wang
  - Liuzhou Survey and Mapping Research Institute Co., Ltd., Liuzhou 545005, China
- Fadong Zou
  - Liuzhou Survey and Mapping Research Institute Co., Ltd., Liuzhou 545005, China
6
Guo H, Ping D, Wang L, Zhang W, Wu J, Ma X, Xu Q, Lu Z. Fault Diagnosis Method of Rolling Bearing Based on 1D Multi-Channel Improved Convolutional Neural Network in Noisy Environment. Sensors (Basel, Switzerland) 2025; 25:2286. [PMID: 40218798 PMCID: PMC11991339 DOI: 10.3390/s25072286] [Received: 03/06/2025] [Revised: 04/01/2025] [Accepted: 04/02/2025] [Indexed: 04/14/2025]
Abstract
The vibration signal of mechanical equipment in operating environments is the key to describing fault characteristics, but due to the influence of equipment density and environmental interference, the accuracy of fault diagnosis is often affected by noise. In this paper, a fault diagnosis method based on a 1D Multi-Channel Improved Convolutional Neural Network (1DMCICNN) is proposed. By introducing BiLSTM, an attention mechanism and a local sparse structure of a two-channel Convolutional Neural Network, the feature information of the noisy timing signal is fully extracted at different scales while reducing the computational parameters. The model is verified through experiments under different signal-to-noise ratios and loads. The results show that the accuracy of 1DMCICNN is 98.67%, 99.71%, 99.04%, and 99.71% on different load and speed datasets. Meanwhile, compared with the unoptimized two-channel Convolutional Neural Network, the training parameters are reduced by 55.58%.
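Testing a model "under different signal-to-noise ratios" usually means corrupting a clean signal with noise scaled to a target SNR in decibels, where SNR_dB = 10·log10(P_signal / P_noise). The sketch below shows one common way to do this. The Gaussian noise model, the function names, and the toy sine wave are assumptions for illustration; the abstract does not specify the authors' noise procedure.

```python
import math
import random

def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)

def add_noise(signal, snr_db, seed=0):
    """Return (noisy_signal, noise), with the noise scaled to hit snr_db exactly."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_noise_power = power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / power(noise))
    noise = [scale * n for n in noise]
    return [s + n for s, n in zip(signal, noise)], noise

# Toy stand-in for a vibration signal: a 5-cycle sine wave over 1000 samples.
signal = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noisy, noise = add_noise(signal, snr_db=-4)
measured_snr_db = 10 * math.log10(power(signal) / power(noise))
```

Rescaling the drawn noise to the exact target power, rather than relying on the nominal variance, keeps the achieved SNR reproducible across short signals.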
Affiliation(s)
- Huijuan Guo
  - Department of Engineering, Huanghe Science and Technology University, Zhengzhou 450045, China
- Dongzhi Ping
  - School of Mechanical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
- Lijun Wang
  - School of Mechanical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
- Weijie Zhang
  - School of Mechanical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
- Junfeng Wu
  - Department of Engineering, Huanghe Science and Technology University, Zhengzhou 450045, China
- Xiao Ma
  - School of Mechanical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
- Qiang Xu
  - School of Computing and Engineering, University of Huddersfield, West Yorkshire HD1 3DH, UK
- Zhongyu Lu
  - The Glass Box, 6 Friendly Street, Huddersfield HD1 1RD, UK
7
Wu X, Shi S, Jiang J, Lin D, Song J, Wang Z, Huang W. Bionic Olfactory Neuron with In-Sensor Reservoir Computing for Intelligent Gas Recognition. Advanced Materials (Deerfield Beach, Fla.) 2025; 37:e2419159. [PMID: 39945055 DOI: 10.1002/adma.202419159] [Received: 12/06/2024] [Revised: 01/21/2025] [Indexed: 04/03/2025]
Abstract
Gas sensing and recognition are closely related to the sustainable development of human society. Current electronic noses (e-noses) typically focus on detecting specific gases, with only a few capable of recognizing complex odor mixtures. Further, these sensors often struggle to distinguish between isomers and homologs, as these compounds usually have similar physical and chemical properties, yielding nearly identical sensor responses. Even mammalian olfactory systems, which consist of a large variety of receptor cells and efficient neuron networks, sometimes fail in this task. The bottleneck stems from the inability to extract the fingerprints of these compounds and the inefficiency of signal processing. To address these limitations, a material-device-algorithm co-design strategy is proposed that integrates an organic field-effect transistor (OFET) array with in-sensor reservoir computing (RC) and the k-nearest neighbors (KNN) algorithm. Organic semiconductors diversify responses to different gases, while RC efficiently extracts spatiotemporal features with lower training costs and reduced energy overhead. This synergy achieves 100% classification accuracy for eight gases and 99.04% accuracy for a library of 26 gases, including mixtures, isomers, and homologs, which is among the highest reported accuracies. This work provides a groundbreaking hardware solution for bionic olfactory neurons with edge artificial intelligence (AI) functions, surpassing traditional e-noses.
Affiliation(s)
- Xiaosong Wu
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian, 350002, P. R. China
  - Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou, Fujian, 350002, P. R. China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, P. R. China
- Shuhui Shi
  - Department of Electrical and Electronic Engineering, University of Hong Kong, Pokfulam Road, Hong Kong SAR, P. R. China
- Jingyan Jiang
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, 518118, P. R. China
- Dedong Lin
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, 518118, P. R. China
- Jian Song
  - School of Microelectronics, Shanghai University, Shanghai, 201800, P. R. China
- Zhongrui Wang
  - Department of Electrical and Electronic Engineering, University of Hong Kong, Pokfulam Road, Hong Kong SAR, P. R. China
- Weiguo Huang
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian, 350002, P. R. China
  - Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou, Fujian, 350002, P. R. China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, P. R. China
8
Queipo M, Mateo J, Torres AM, Barbado J. The Effect of Naturally Acquired Immunity on Mortality Predictors: A Focus on Individuals with New Coronavirus. Biomedicines 2025; 13:803. [PMID: 40299374 DOI: 10.3390/biomedicines13040803] [Received: 02/18/2025] [Revised: 03/19/2025] [Accepted: 03/24/2025] [Indexed: 04/30/2025]
Abstract
Background/Objectives: The spread of the COVID-19 pandemic has spurred the development of advanced healthcare tools to effectively manage patient outcomes. This study aims to identify key predictors of mortality in hospitalized patients with some level of natural immunity, but not yet vaccinated, using machine learning techniques. Methods: A total of 363 patients with COVID-19 admitted to Río Hortega University Hospital in Spain between the second and fourth waves of the pandemic were included in this study. Key characteristics related to both the patient's previous status and hospital stay were screened using the Random Forest (RF) machine learning technique. Results: Of the 19 variables identified as having the greatest influence on predicting mortality, the most powerful ones could be identified at the time of hospital admission. These included the assessment of severity in community-acquired pneumonia (CURB-65) scale, age, the Glasgow Coma Scale (GCS), and comorbidities, as well as laboratory results. Some variables associated with hospitalization and intensive care unit (ICU) admission (acute renal failure, shock, PRONO sessions and the Acute Physiology and Chronic Health Evaluation [APACHE-II] scale) showed a certain degree of significance. The Random Forest (RF) method showed high accuracy, with a precision of >95%. Conclusions: This study shows that natural immunity generates significant changes in the evolution of the disease. As has been shown, machine learning models are an effective tool to improve personalized patient care in different periods.
Affiliation(s)
- Mónica Queipo
  - Autoimmunity and Inflammation Research Group, Río Hortega University Hospital, 47012 Valladolid, Spain
  - Cooperative Research Network Focused on Health Results-Advanced Therapies (RICORS TERAV), 28220 Madrid, Spain
- Jorge Mateo
  - Medical Analysis Expert Group, Institute of Technology, University of Castilla-La Mancha, 13001 Cuenca, Spain
  - Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Ana María Torres
  - Medical Analysis Expert Group, Institute of Technology, University of Castilla-La Mancha, 13001 Cuenca, Spain
  - Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Julia Barbado
  - Autoimmunity and Inflammation Research Group, Río Hortega University Hospital, 47012 Valladolid, Spain
  - Cooperative Research Network Focused on Health Results-Advanced Therapies (RICORS TERAV), 28220 Madrid, Spain
  - Internal Medicine, Río Hortega University Hospital, 47012 Valladolid, Spain
9
Wang Y, Pan Z, Cai H, Li S, Huang Y, Zhuang J, Liu X, Guan G. Prognostic model for log odds of negative lymph node in locally advanced rectal cancer via interpretable machine learning. Sci Rep 2025; 15:7924. [PMID: 40050297 PMCID: PMC11885450 DOI: 10.1038/s41598-025-90191-0] [Received: 08/28/2024] [Accepted: 02/11/2025] [Indexed: 03/09/2025]
Abstract
No studies have examined the prognostic value of the log odds of negative lymph nodes/T stage (LONT) in locally advanced rectal cancer (LARC) treated with neoadjuvant chemoradiotherapy (nCRT). We aimed to assess the prognostic value of LONT and develop a machine learning model to predict overall survival (OS) and disease-free survival (DFS) in LARC patients treated with nCRT. The study included 820 LARC patients who received nCRT between September 2010 and October 2017. Univariate and multivariate Cox regression analyses identified prognostic factors, which were then used to develop risk assessment models with 9 machine learning algorithms. Model hyperparameters were optimized using random search and 10-fold cross-validation. The models were evaluated using metrics such as the area under the receiver operating characteristic curves (AUC), decision curve analysis, calibration curves, and precision and accuracy for predicting OS and DFS. SHapley Additive exPlanations (SHAP) was also used for model interpretation. The study included 820 patients, identifying LONT as a significant independent prognostic factor for both OS and DFS. Nine machine learning algorithms were used to create predictive models based on these factors. The extreme gradient boosting (XGB) model showed the best performance, with a mean AUC of 0.89 for OS and 0.83 for DFS in 10-fold cross-validation. Additionally, the predictions generated by the XGB model were analyzed using SHAP. Finally, we developed an online web-based calculator utilizing the XGB model to enhance the model's generalizability and to provide improved support for physicians in their decision-making processes. The study developed an XGB model utilizing LONT to predict OS and DFS in patients with LARC undergoing nCRT. Furthermore, an online web calculator was constructed using the XGB model to facilitate the model's generalization and to enhance physician decision-making.
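The tuning procedure mentioned above, random search scored by 10-fold cross-validation, can be sketched without any machine learning library. In the sketch below, `score_fn` is a hypothetical callback that would train a model on the training indices and return a validation score; the search space, the toy scoring function, and all names are illustrative assumptions rather than the study's configuration.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices once and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_score(score_fn, params, n_samples, k=10):
    """Average score_fn over k train/test splits."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(score_fn(params, train_idx, test_idx))
    return sum(scores) / k

def random_search(score_fn, space, n_samples, n_iter=20, seed=0):
    """Sample n_iter random configurations and keep the best mean CV score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = cross_val_score(score_fn, params, n_samples)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space and a toy score that ignores the data entirely.
space = {"max_depth": [2, 4, 6, 8], "learning_rate": [0.01, 0.1, 0.3]}
toy_score = lambda p, train, test: -abs(p["max_depth"] - 4) - abs(p["learning_rate"] - 0.1)
folds = k_fold_indices(820, k=10)
best_params, best_score = random_search(toy_score, space, n_samples=820)
```

In a real pipeline `score_fn` would fit the model on `train_idx` and return, for example, the AUC on `test_idx`; the same fold split is reused for every configuration so that scores are comparable.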
Affiliation(s)
- Ye Wang
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Zhen Pan
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Huajun Cai
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Shoufeng Li
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Ying Huang
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Jinfu Zhuang
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Xing Liu
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Guoxian Guan
  - Department of Colorectal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
  - Department of Colorectal Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fuzhou, China
10
Suárez M, Torres AM, Blasco-Segura P, Mateo J. Application of the Random Forest Algorithm for Accurate Bipolar Disorder Classification. Life (Basel) 2025; 15:394. [PMID: 40141739 PMCID: PMC11943861 DOI: 10.3390/life15030394] [Received: 12/30/2024] [Revised: 02/16/2025] [Accepted: 02/28/2025] [Indexed: 03/28/2025]
Abstract
Bipolar disorder (BD) is a complex psychiatric condition characterized by alternating episodes of mania and depression, posing significant challenges for accurate and timely diagnosis. This study explores the use of the Random Forest (RF) algorithm as a machine learning approach to classify patients with BD and healthy controls based on electroencephalogram (EEG) data. A total of 330 participants, including euthymic BD patients and healthy controls, were analyzed. EEG recordings were processed to extract key features, including power in frequency bands and complexity metrics such as the Hurst Exponent, which measures the persistence or randomness of a time series, and Higuchi's Fractal Dimension, which is used to quantify the irregularity of brain signals. The RF model demonstrated robust performance, achieving an average accuracy of 93.41%, with recall and specificity exceeding 93%. These results highlight the algorithm's capacity to handle complex, noisy datasets while identifying key features relevant for classification. Importantly, the model provided interpretable insights into the physiological markers associated with BD, reinforcing the clinical value of EEG as a diagnostic tool. The findings suggest that RF is a reliable and accessible method for supporting the diagnosis of BD, complementing traditional clinical practices. Its ability to reduce diagnostic delays, improve classification accuracy, and optimize resource allocation makes it a promising tool for integrating artificial intelligence into psychiatric care. This study represents a significant step toward precision psychiatry, leveraging technology to improve the understanding and management of complex mental health disorders.
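Higuchi's Fractal Dimension, one of the complexity features named above, estimates how the measured length of a curve grows as the sampling step k shrinks: the dimension is the slope of log L(k) against log(1/k). The sketch below follows the standard Higuchi procedure in plain Python. The choice `k_max=8` and the function name are assumptions, the input is assumed to be non-constant, and this is not the authors' feature-extraction code.

```python
import math

def higuchi_fd(x, k_max=8):
    """Estimate Higuchi's fractal dimension of a non-constant 1-D sequence."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):  # starting offsets 0 .. k-1
            n_steps = (n - 1 - m) // k
            if n_steps < 1:
                continue
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, n_steps + 1))
            # Normalize the curve length for this offset and step size.
            lengths.append(dist * (n - 1) / (n_steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(sum(lengths) / len(lengths)))
    # Least-squares slope of log L(k) versus log(1/k).
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_len) / len(log_len)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den

# A straight line is a smooth curve, so its fractal dimension is 1.
fd_line = higuchi_fd([float(i) for i in range(200)])
```

Smooth signals give values near 1, while highly irregular signals such as white noise approach 2, which is why the measure is used as an irregularity index for EEG.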
Affiliation(s)
- Miguel Suárez
  - Virgen de la Luz Hospital, 16002 Cuenca, Spain
  - Medical Analysis Expert Group, Institute of Technology, University of Castilla-La Mancha, 13001 Cuenca, Spain
  - Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Ana M. Torres
  - Medical Analysis Expert Group, Institute of Technology, University of Castilla-La Mancha, 13001 Cuenca, Spain
  - Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Jorge Mateo
  - Medical Analysis Expert Group, Institute of Technology, University of Castilla-La Mancha, 13001 Cuenca, Spain
  - Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
11
Wu Y, Li P, Xie T, Yang R, Zhu R, Liu Y, Zhang S, Weng S. Enhanced quasi-meshing hotspot effect integrated embedded attention residual network for culture-free SERS accurate determination of Fusarium spores. Biosens Bioelectron 2025; 271:117053. [PMID: 39708494 DOI: 10.1016/j.bios.2024.117053] [Received: 08/24/2024] [Revised: 11/29/2024] [Accepted: 12/08/2024] [Indexed: 12/23/2024]
Abstract
Determination of Fusarium spores is essential for precision control of Fusarium head blight and ensuring agri-food safety. As a highly sensitive real-time detection technique with rich fingerprint information and little influence from water, surface-enhanced Raman spectroscopy (SERS) has been widely applied in the determination of microorganisms. However, fungi determination faces significant challenges, including low sensitivity and poor specificity. The enhanced quasi-meshing hotspot effect (EQMHE) integrated with the embedded attention residual network (EARNet) was proposed to realize label-free and accurate SERS determination of various Fusarium spores. The EQMHE can promote the binding of spores and nanoparticles to form numerous hotspots, significantly enhancing signal quality and improving the detection limit by at least four orders of magnitude. The key factors inducing EQMHE were validated through various characterization techniques. Moreover, due to its excellent feature extraction and recognition capabilities, EARNet successfully overcomes the limitations of spectral similarity, achieving determination accuracies of 100% in the training set, 98.33% in the validation set, and 100% in the prediction set for three Fusarium species from actual samples. EARNet requires only a small amount of training data and provides rapid and accurate diagnostics. Throughout the process, the spores do not require culturing or lysing, providing an effective method for the practical determination of mixed fungal spores. Overall, the proposed strategy effectively addresses challenges such as the need for fungal spore cultivation, difficulties in SERS hotspot formation, and suboptimal signal quality, and it holds significant promise for applications in disease control, food safety, and agricultural production.
Collapse
Affiliation(s)
- Yehang Wu
- National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei, 230601, Anhui, China; School of Electronics and Information Engineering, Anhui University, Hefei, 230601, Anhui, China
- Pan Li
- Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, Anhui, China
- Tao Xie
- Key Laboratory of Biomimetic Sensor and Detecting Technology of Anhui Province, School of Materials and Chemical Engineering, West Anhui University, Lu'an, 237012, Anhui, China
- Rui Yang
- National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei, 230601, Anhui, China; School of Electronics and Information Engineering, Anhui University, Hefei, 230601, Anhui, China
- Rui Zhu
- National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei, 230601, Anhui, China; School of Electronics and Information Engineering, Anhui University, Hefei, 230601, Anhui, China
- Yulong Liu
- National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei, 230601, Anhui, China; School of Electronics and Information Engineering, Anhui University, Hefei, 230601, Anhui, China
- Shengyu Zhang
- National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei, 230601, Anhui, China; School of Electronics and Information Engineering, Anhui University, Hefei, 230601, Anhui, China
- Shizhuang Weng
- National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei, 230601, Anhui, China; School of Electronics and Information Engineering, Anhui University, Hefei, 230601, Anhui, China.
|
12
|
Zhao Z, Ran X, Niu Y, Qiu M, Lv S, Zhu M, Wang J, Li M, Gao Z, Wang C, Xu Y, Ren W, Zhou X, Fan X, Song J, Qi M, Yu Y. Predicting Treatment Response of Repetitive Transcranial Magnetic Stimulation in Major Depressive Disorder Using an Explainable Machine Learning Model Based on Electroencephalography and Clinical Features. BIOLOGICAL PSYCHIATRY. COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2025:S2451-9022(25)00059-X. [PMID: 39978464 DOI: 10.1016/j.bpsc.2025.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 02/07/2025] [Accepted: 02/10/2025] [Indexed: 02/22/2025]
Abstract
Major depressive disorder (MDD) is highly heterogeneous in response to repetitive transcranial magnetic stimulation (rTMS), and identifying predictive biomarkers is essential for personalized treatment. However, most prior studies have used either electroencephalography (EEG) or clinical features alone, lacked interpretability, or had small sample sizes. This study included 74 patients with MDD who responded (responders) and 43 patients with MDD who did not respond (nonresponders) to rTMS. Eight baseline EEG metrics and clinical features were fed into 7 machine learning models to classify responders and nonresponders. Shapley additive explanations (SHAP) was used to interpret feature contributions. Combining phase locking value and clinical features with a support vector machine achieved optimal classification performance (accuracy = 97.33%). SHAP revealed that delta and beta band functional connectivity (F3-P7, F3-P4, P3-P8, T7-Cz) significantly influenced predictions and differed between groups. This study developed an explainable framework for predicting rTMS response in MDD, enhancing the accuracy of rTMS response prediction and supporting personalized treatment in MDD.
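To make the idea behind Shapley additive explanations concrete, the sketch below computes exact Shapley values by brute-force enumeration for a small toy model. It illustrates the underlying definition only: it does not use the SHAP library and is not the study's pipeline, and `shapley_values`, `toy_model`, the inputs, and the all-zero baseline are hypothetical names and values chosen for this example.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline value
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

def toy_model(z):
    # Hypothetical model: two linear terms plus one interaction term
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]

phi = shapley_values(toy_model, x=[1.0, 2.0, 4.0], baseline=[0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split equally between the two features involved. Exact enumeration scales exponentially with the number of features, which is why practical tools rely on approximations.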
Collapse
Affiliation(s)
- Zongya Zhao
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Second Affiliated Hospital of Xinxiang Medical University, Henan Collaborative Innovation Center of Prevention and Treatment of Mental Disorder, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Henan Engineering Research Center of Physical Diagnostics and Treatment Technology for Mental and Neurological Diseases, Henan, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China.
- Xiangying Ran
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Yanxiang Niu
- Institute of Disaster and Emergency Medicine, Tianjin University, Tianjin, China
- Mengyue Qiu
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Shiyang Lv
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Mingjie Zhu
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Junming Wang
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Mingcai Li
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Zhixian Gao
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Chang Wang
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Yongtao Xu
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Wu Ren
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Xuezhi Zhou
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Xiaofeng Fan
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
- Jinggui Song
- Henan Engineering Research Center of Physical Diagnostics and Treatment Technology for Mental and Neurological Diseases, Henan, China
- Mingchao Qi
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China
- Yi Yu
- School of Medical Engineering, School of Mathematical Medicine, Xinxiang Medical University, Xinxiang, China; Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China; Henan Engineering Research Center of Medical Virtual Reality Intelligent Sensing Feedback, Xinxiang, China; Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China.
|
13
|
Xia Y, Li D, Wang Y, Xi Q, Jiao T, Wei J, Chen X, Chen Q, Chen Q. Rapid identification of cod authenticity based on hyperspectral imaging technology. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2025; 326:125258. [PMID: 39388934 DOI: 10.1016/j.saa.2024.125258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2024] [Revised: 09/07/2024] [Accepted: 10/04/2024] [Indexed: 10/12/2024]
Abstract
The high economic value of Atlantic cod makes it prone to fraudulent activities in the market, so rapid and non-destructive identification of its authenticity has practical significance. This study investigated hyperspectral imaging (HSI) systems with Vis-NIR (400-1000 nm) and SWIR (900-1700 nm) spectral ranges for determining the authenticity of Atlantic cod fillets in two sample states, frozen and thawed. The results showed that models built on Vis-NIR data generally performed better than those built on SWIR data. Random forest (RF) and linear discriminant analysis (LDA) models of Vis-NIR data achieved 100 % accuracy. The variable screening algorithms successive projections algorithm (SPA) and variable combination population analysis-iteratively retaining informative variables (VCPA-IRIV) maintained the 100 % accuracy of the LDA model at Vis-NIR wavebands while reducing the data-processing burden. Overall, this study suggests that HSI is a promising solution for rapid and non-destructive detection of Atlantic cod authenticity.
Collapse
Affiliation(s)
- Yu Xia
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China
- Dong Li
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China
- Yilin Wang
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China
- Qibing Xi
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
- Tianhui Jiao
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China
- Jie Wei
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China
- Xiaomei Chen
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China
- Qingmin Chen
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China.
- Quansheng Chen
- College of Ocean Food and Biological Engineering, Jimei University, Xiamen 361021, China.
|
14
|
Lv Z, Wei M, Pei H, Peng S, Li M, Jiang L. PTSP-BERT: Predict the thermal stability of proteins using sequence-based bidirectional representations from transformer-embedded features. Comput Biol Med 2025; 185:109598. [PMID: 39708499 DOI: 10.1016/j.compbiomed.2024.109598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 12/16/2024] [Accepted: 12/17/2024] [Indexed: 12/23/2024]
Abstract
Thermophilic, mesophilic, and psychrophilic proteins have wide industrial applications, as enzymes with different optimal temperatures are often needed for different purposes. Convenient methods are needed to determine the optimal temperatures for proteins; however, laboratory methods for this purpose are time-consuming and laborious, and existing machine learning methods can only perform binary classification of thermophilic and non-thermophilic proteins, or psychrophilic and non-psychrophilic proteins. Here, we developed a deep learning model, PTSP-BERT, based on protein sequences that can directly perform three-class identification of thermophilic, mesophilic, and psychrophilic proteins. By comparing BERT-bfd with other deep learning models using five-fold cross-validation, we found that BERT-bfd-extracted features achieved the highest accuracy under six classifiers. Furthermore, to improve the model's accuracy, we used SMOTE (synthetic minority oversampling technique) to balance the dataset and a light gradient-boosting machine to rank BERT-bfd-extracted features according to their weights. We obtained the best-performing model with a five-fold cross-validation accuracy of 89.59 % and an independent test accuracy of 85.42 %. The performance of PTSP-BERT is significantly better than that of existing models in the three-class identification task. To compare with previous binary classification models, we used PTSP-BERT to perform binary classification of thermophilic versus non-thermophilic proteins and of psychrophilic versus non-psychrophilic proteins on an independent test set. PTSP-BERT achieved the highest accuracy on both binary classification tasks, with an accuracy of 93.33 % for thermophilic protein binary classification and 88.33 % for psychrophilic protein binary classification. After training and optimization on training sets with different sequence similarities, the independent test accuracy of the model ranges from 89.8 % to 92.9 %, and the prediction accuracy on new data can exceed 97 %. For the convenience of future researchers, we have uploaded the source code of PTSP-BERT to GitHub.
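The SMOTE step mentioned in the abstract can be sketched as follows, assuming the standard formulation in which each synthetic sample is interpolated between a minority-class point and one of its nearest minority-class neighbours. This is a simplified illustration rather than the authors' code: the function name `smote`, its parameters, and the toy points are hypothetical, and the minority set is assumed to hold at least two points.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical 2-D minority class: the corners of the unit square
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, oversampling fills in the region the minority class already occupies instead of duplicating records.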
Collapse
Affiliation(s)
- Zhibin Lv
- College of Biomedical Engineering, Sichuan University, Chengdu, 610065, China.
- Mingxuan Wei
- College of Biomedical Engineering, Sichuan University, Chengdu, 610065, China
- Hongdi Pei
- Department of Biomedical Engineering, Johns Hopkins University, MD, 21218, USA
- Shiyu Peng
- College of Biomedical Engineering, Sichuan University, Chengdu, 610065, China
- Mingxin Li
- College of Biomedical Engineering, Sichuan University, Chengdu, 610065, China
- Liangzhen Jiang
- College of Food and Biological Engineering, Chengdu University, Chengdu, 610106, China; Country Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Chengdu, 610106, China
|
15
|
Shen D, Sha L, Yang L, Gu X. Identification of multiple complications as independent risk factors associated with 1-, 3-, and 5-year mortality in hepatitis B-associated cirrhosis patients. BMC Infect Dis 2025; 25:151. [PMID: 39891059 PMCID: PMC11786570 DOI: 10.1186/s12879-025-10566-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2024] [Accepted: 01/28/2025] [Indexed: 02/03/2025] Open
Abstract
BACKGROUND Hepatitis B-associated cirrhosis (HBC) is associated with severe complications and adverse clinical outcomes. This study aimed to develop and validate a predictive model for the occurrence of multiple complications (three or more) in patients with HBC and to explore the effects of multiple complications on HBC prognosis. METHODS In this retrospective cohort study, data from 121 HBC patients treated at Nanjing Second Hospital from February 2009 to November 2019 were analysed. The maximum follow-up period was 10.75 years, with a median of 5.75 years. Eight machine learning techniques were employed to construct predictive models, including C5.0, linear discriminant analysis (LDA), least absolute shrinkage and selection operator (LASSO), k-nearest neighbour (KNN), gradient boosting decision tree (GBDT), support vector machine (SVM), generalised linear model (GLM) and naive Bayes (NB), utilising variables such as medical history, demographics, clinical signs, and laboratory test results. Model performance was evaluated via receiver operating characteristic (ROC) curve analysis, residual analysis, calibration curve analysis, and decision curve analysis (DCA). The influence of multiple complications on HBC survival time was assessed via Kaplan‒Meier curve analysis. Furthermore, LASSO and univariable and multivariable Cox regression analyses were conducted to identify independent prognostic factors for overall survival (OS) in patients with HBC, followed by ROC, C-index, calibration curve, and DCA curve analyses of the constructed prognostic nomogram model. This study utilized bootstrap resampling for internal validation and employed the Medical Information Mart for Intensive Care IV (MIMIC-IV) database for external validation. RESULTS The GBDT model exhibited the highest area under the curve (AUC) and emerged as the optimal model for predicting the occurrence of multiple complications. 
The key predictive factors included posthospitalisation fever (PHF), body mass index (BMI), retinol binding protein (RBP), total bilirubin (TB) levels, and eosinophils (EOS). Kaplan-Meier analysis revealed that patients with multiple complications had significantly worse OS than those with fewer complications. Additionally, multivariable Cox regression analysis, informed by LASSO selection, identified hepatocellular carcinoma (HCC), multiple complications, and lactate dehydrogenase (LDH) levels as independent prognostic factors for OS. The prognostic model demonstrated 1-year, 3-year, and 5-year OS ROC AUCs of 0.802, 0.793, and 0.817, respectively. For the internal validation cohort, the corresponding AUC values were 0.797, 0.832, and 0.835. In contrast, the external validation cohort yielded a 1-year ROC AUC of 0.707. Calibration curves indicated good consistency of the model, and DCA demonstrated the model's clinical utility, showing high net benefits within certain threshold ranges. The multivariable prognostic model showed higher ROC AUC values than the univariable models and also had the highest C-index. CONCLUSION The GBDT prediction model provides a reliable tool for the early identification of high-risk HBC patients prone to developing multiple complications. The concurrent occurrence of multiple complications is an independent prognostic factor for OS in patients with HBC. The constructed prognostic model demonstrated remarkable predictive performance and clinical applicability, indicating its crucial role in enhancing patient outcomes through timely and targeted interventions.
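The Kaplan-Meier analysis referred to above rests on the product-limit estimator, which the following plain-Python sketch illustrates. The function name `kaplan_meier` and the four follow-up records are hypothetical and are not taken from the study; an event flag of 1 marks a death and 0 marks a censored observation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Group all records that share the same follow-up time
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= deaths + censored
    return curve

# Hypothetical follow-up times in years; the second patient is censored
km_curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

Censored patients leave the risk set without lowering the curve, which is how the estimator uses incomplete follow-up instead of discarding it.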
Collapse
Affiliation(s)
- Duo Shen
- Department of Gastroenterology, The Second People's Hospital of Changzhou, the Third Affiliated Hospital of Nanjing Medical University, Changzhou, Jiangsu, China
- Ling Sha
- Department of Neurology, Nanjing Drum Tower Hospital, Affiliated to Nanjing University Medical School, Nanjing, Jiangsu, China
- Ling Yang
- Department of Central Laboratory, Jurong Hospital Affiliated to Jiangsu University, 66 Ersheng Road, Jurong, Zhenjiang, Jiangsu, 212400, China
- Xuefeng Gu
- Department of Central Laboratory, Jurong Hospital Affiliated to Jiangsu University, 66 Ersheng Road, Jurong, Zhenjiang, Jiangsu, 212400, China.
- Department of Infectious Diseases, Jurong Hospital Affiliated to Jiangsu University, 66 Ersheng Road, Jurong, Zhenjiang, Jiangsu, 212400, China.
|
16
|
Yan H, Wu Y, Bo Y, Han Y, Ren G. Study on the Impact of LDA Preprocessing on Pig Face Identification with SVM. Animals (Basel) 2025; 15:231. [PMID: 39858231 PMCID: PMC11759145 DOI: 10.3390/ani15020231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Revised: 12/21/2024] [Accepted: 01/14/2025] [Indexed: 01/27/2025] Open
Abstract
In this study, the implementation of traditional machine learning models in the intelligent management of swine is explored, focusing on the impact of LDA preprocessing on pig facial recognition using an SVM. Through experimental analysis, the kernel functions for two testing protocols, one utilizing an SVM exclusively and the other employing a combination of LDA and an SVM, were identified as polynomial and RBF, both with coefficients of 0.03. Individual identification tests conducted on 10 pigs demonstrated that the enhanced protocol improved identification accuracy from 83.66% to 86.30%. Additionally, the training and testing durations were reduced to 0.7% and 0.3% of the original times, respectively. These findings suggest that LDA preprocessing significantly enhances the efficiency of individual pig identification using an SVM, providing empirical evidence for the deployment of SVM classifiers in mobile and embedded systems.
Collapse
Affiliation(s)
- Hongwen Yan
- College of Information Science and Engineering, Shanxi Agricultural University, Jinzhong 030801, China; (Y.W.); (Y.B.); (Y.H.); (G.R.)
|
17
|
Jiang X, Wang B. Enhancing Clinical Decision Making by Predicting Readmission Risk in Patients With Heart Failure Using Machine Learning: Predictive Model Development Study. JMIR Med Inform 2024; 12:e58812. [PMID: 39740105 PMCID: PMC11706445 DOI: 10.2196/58812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 10/10/2024] [Accepted: 11/12/2024] [Indexed: 01/02/2025] Open
Abstract
Background Patients with heart failure frequently face the possibility of rehospitalization following an initial hospital stay, placing a significant burden on both patients and health care systems. Accurate predictive tools are crucial for guiding clinical decision-making and optimizing patient care. However, the effectiveness of existing models tailored specifically to the Chinese population is still limited. Objective This study aimed to formulate a predictive model for assessing the likelihood of readmission among patients diagnosed with heart failure. Methods In this study, we analyzed data from 1948 patients with heart failure in a hospital in Sichuan Province between 2016 and 2019. By applying 3 variable selection strategies, 29 relevant variables were identified. Subsequently, we constructed 6 predictive models using different algorithms: logistic regression, support vector machine, gradient boosting machine, Extreme Gradient Boosting, multilayer perceptron, and graph convolutional networks. Results The graph convolutional network model showed the highest prediction accuracy with an area under the receiver operating characteristic curve of 0.831, accuracy of 75%, sensitivity of 52.12%, and specificity of 90.25%. Conclusions The model developed in this study is effective in forecasting the likelihood of readmission among patients with heart failure, thus serving as a crucial reference for clinical decision-making.
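The accuracy, sensitivity, and specificity reported above are all derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of true readmissions detected
        "specificity": tn / (tn + fp),   # share of non-readmissions correctly cleared
    }

# Hypothetical counts: 100 readmitted and 100 non-readmitted patients
metrics = binary_metrics(tp=40, fp=10, tn=90, fn=60)
```

With these illustrative counts, specificity is high (0.90) while sensitivity is low (0.40), the same kind of imbalance seen in the reported results, so accuracy alone would hide how many readmissions are missed.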
Collapse
Affiliation(s)
- Xiangkui Jiang
- School of Automation, Xi’an University of Posts and Telecommunications, No. 563 Chang'an South Road, Yanta District, Xi’an, Shaanxi, 710121, China, 86 17810791125
- Bingquan Wang
- School of Automation, Xi’an University of Posts and Telecommunications, No. 563 Chang'an South Road, Yanta District, Xi’an, Shaanxi, 710121, China, 86 17810791125
|
18
|
Wang Y, Wang L, Li Y. Organophosphorus Pesticides Management Strategies: Prohibition and Restriction Multi-Category Multi-Class Models, Environmental Transformation Risks, and Special Attention List. TOXICS 2024; 13:16. [PMID: 39853016 PMCID: PMC11768814 DOI: 10.3390/toxics13010016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2024] [Revised: 12/18/2024] [Accepted: 12/24/2024] [Indexed: 01/26/2025]
Abstract
Organophosphorus pesticides (OPs) have become one of the most widely used classes of pesticides in Chinese agriculture; however, methods to identify potential restrictions on OP molecules are lacking. Therefore, this study retrieved the OPs restriction list and constructed eight multi-class, multi-category machine learning models for OPs restrictions. Among these, the random forest (RF) model demonstrated excellent predictive performance and was successfully validated and applied. Potential environmental transformation products of OPs were obtained using EAWAG-BBD software, while toxicity indicators for the parent OPs and their transformation products were predicted with ADMETlab 3.0 software. This study found that unrestricted OPs, such as phorate, parathion, and chlorpyrifos, exhibited a high probability of toxicity. Additionally, the environmental transformation products of OPs posed comprehensive toxicity risks similar to those of the parent compounds. A special attention list for OPs was created based on the toxicity risks of unrestricted parent OPs and their transformation products, using standard deviation classification. Phorate and parathion were identified as OPs requiring special attention. This paper aims to provide an effective method for identifying the potential restriction levels of OPs and to propose an evaluation system that comprehensively considers health risk, thereby supporting the improvement and optimization of management and usage strategies for OPs.
Collapse
Affiliation(s)
- Yingwei Wang
- Colleges of Forestry, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Lu Wang
- Jilin Province Ecological Environmental Monitoring Centre, 813 Pudong Road, Changchun 130011, China
- Yufei Li
- Colleges of Forestry, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
| |
Collapse
|
19
|
Wujieti B, Hao M, Liu E, Zhou L, Wang H, Zhang Y, Cui W, Chen B. Study on SHP2 Conformational Transition and Structural Characterization of Its High-Potency Allosteric Inhibitors by Molecular Dynamics Simulations Combined with Machine Learning. Molecules 2024; 30:14. [PMID: 39795072 PMCID: PMC11721961 DOI: 10.3390/molecules30010014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2024] [Revised: 12/20/2024] [Accepted: 12/20/2024] [Indexed: 01/13/2025] Open
Abstract
The src-homology 2 domain-containing phosphatase 2 (SHP2) is a human cytoplasmic protein tyrosine phosphatase that plays a crucial role in cellular signal transduction. Aberrant activation and mutations of SHP2 are associated with tumor growth and immune suppression, thus making it a potential target for cancer therapy. Initially, researchers sought to develop inhibitors targeting SHP2's catalytic site (protein tyrosine phosphatase domain, PTP). Due to limitations such as the high conservation of this site and poor membrane permeability, SHP2 was once considered a challenging drug target. Nevertheless, in-depth investigations into the conformational switch of SHP2 from its inactive to its active state, together with the emergence of various SHP2 allosteric inhibitors, have brought new hope for this target. In this study, we investigated the interaction models of various allosteric inhibitors with SHP2 using molecular dynamics simulations. Meanwhile, we explored the free energy landscape of SHP2 activation using an enhanced sampling technique (metadynamics simulations), which provided insights into its conformational changes and activation mechanism. Furthermore, to biophysically interpret high-dimensional simulation trajectories, we employed interpretable machine learning methods, specifically extreme gradient boosting (XGBoost) with Shapley additive explanations (SHAP), to comprehensively analyze the simulation data. This approach allowed us to identify and highlight key structural features driving SHP2 conformational dynamics and regulating the activity of the allosteric inhibitor. These studies not only enhance our understanding of SHP2's conformational switch mechanism but also offer crucial insights for designing potent allosteric SHP2 inhibitors and addressing drug resistance issues.
Affiliation(s)
- Wei Cui
- School of Chemical Sciences, University of Chinese Academy of Sciences, No. 19A, Yuquan Road, Beijing 100049, China; (B.W.); (M.H.); (E.L.); (L.Z.); (H.W.); (Y.Z.)
| | - Bozhen Chen
- School of Chemical Sciences, University of Chinese Academy of Sciences, No. 19A, Yuquan Road, Beijing 100049, China; (B.W.); (M.H.); (E.L.); (L.Z.); (H.W.); (Y.Z.)
| |
|
20
|
Shi J, Xiao Y. Research on the pathways to high-quality development of tourism SMEs: A perspective of value assigned by quality, standards and brand. Heliyon 2024; 10:e39772. [PMID: 39717572 PMCID: PMC11665349 DOI: 10.1016/j.heliyon.2024.e39772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 08/25/2024] [Accepted: 10/23/2024] [Indexed: 12/25/2024] Open
Abstract
With the advancement of the United Nations Sustainable Development Goals, the trend toward sustainable and high-quality development has become prominent for tourism enterprises. Tourism small and medium-sized enterprises (SMEs), which are numerous, constitute a significant part of the global tourism market. However, existing research predominantly focuses on the pathways for high-quality development of large tourism enterprises, leaving many gaps in understanding the corresponding processes for tourism SMEs. Because of their limited resources and capabilities, tourism SMEs have to selectively explore distinctive development pathways based on their core strengths to pursue high-quality development. To investigate these development pathways, this study draws on the perspective of value assigned by quality, standards, and brand, takes 181 tourism SMEs as the research object, and uses the fsQCA method to analyze the pathways towards high-quality development of tourism SMEs. The results show that three pathways to high-quality development of tourism SMEs can be distilled: "root-deepening", "pioneering and innovative", and "brand-prioritized". Conversely, "deficiency in innovation capacity" and "indiscriminate marketing of brands" hamper the high-quality development of tourism SMEs. Further examination discloses a substitutive relationship among the causal factors of the various pathways. This study delves into how tourism SMEs, under the objective constraints of resources and capabilities, integrate business elements and adjust strategic priorities to achieve high-quality development. It provides theoretical guidance for tourism SMEs worldwide to pursue high-quality and sustainable development in ways that are appropriate to different contexts.
Affiliation(s)
- Jianzhong Shi
- School of Management, Ocean University of China, Qingdao, People's Republic of China
| | - Yang Xiao
- School of Management, Ocean University of China, Qingdao, People's Republic of China
| |
|
21
|
Wang Y, Wang Z, Yu X, Wang X, Song J, Yu DJ, Ge F. MORE: a multi-omics data-driven hypergraph integration network for biomedical data classification and biomarker identification. Brief Bioinform 2024; 26:bbae658. [PMID: 39692449 DOI: 10.1093/bib/bbae658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Revised: 11/18/2024] [Accepted: 12/04/2024] [Indexed: 12/19/2024] Open
Abstract
High-throughput sequencing methods have brought about a huge change in omics-based biomedical study. Integrating various omics data can help identify correlations across data modalities, thus improving our understanding of the underlying biological mechanisms and complexity. Nevertheless, most existing graph-based feature extraction methods overlook the complementary information and correlations across modalities. Moreover, these methods tend to treat the features of each omics modality equally, which contradicts current biological principles. To solve these challenges, we introduce a novel approach for integrating multi-omics data termed Multi-Omics hypeRgraph integration nEtwork (MORE). MORE first constructs a comprehensive hyperedge group by extensively investigating the informative correlations within and across modalities. Subsequently, the multi-omics hypergraph encoding module is employed to learn the enriched omics-specific information. Afterward, the multi-omics self-attention mechanism adaptively aggregates valuable correlations across modalities for representation learning and for making the final prediction. We assess MORE's performance on datasets characterized by messenger RNA (mRNA) expression, deoxyribonucleic acid (DNA) methylation, and microRNA (miRNA) expression for Alzheimer's disease, invasive breast carcinoma, and glioblastoma. The results from three classification tasks highlight the competitive advantage of MORE over current state-of-the-art (SOTA) methods. Moreover, the results also show that MORE can identify a greater variety of disease-related biomarkers than existing methods, highlighting its advantages in biomedical data mining and interpretation. Overall, MORE can be investigated as a valuable tool for facilitating multi-omics analysis and novel biomarker discovery. Our code and data can be publicly accessed at https://github.com/Wangyuhanxx/MORE.
Affiliation(s)
- Yuhan Wang
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - Zhikang Wang
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Wellington Rd, Clayton, Melbourne, VIC 3800, Australia
| | - Xuan Yu
- Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong 999077, China
| | - Xiaoyu Wang
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Wellington Rd, Clayton, Melbourne, VIC 3800, Australia
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Wellington Rd, Clayton, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Wellington Rd, Clayton, Melbourne, VIC 3800, Australia
| | - Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - Fang Ge
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), Nanjing University of Posts & Telecommunications, 9 Wenyuan, Nanjing 210023, China
| |
|
22
|
Alemayehu MA. Machine learning algorithms for prediction of measles one vaccination dropout among 12-23 months children in Ethiopia. BMJ Open 2024; 14:e089764. [PMID: 39542486 PMCID: PMC11575239 DOI: 10.1136/bmjopen-2024-089764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024] Open
Abstract
INTRODUCTION Despite the availability of a safe and effective measles vaccine in Ethiopia, the country has experienced recurrent and significant measles outbreaks, with a nearly fivefold increase in confirmed cases from 2021 to 2023. The WHO has identified being unvaccinated against measles as a major factor driving this resurgence of cases and deaths. Consequently, this study aimed to apply robust machine learning algorithms to predict the key factors contributing to measles vaccination dropout. METHODS This study utilised data from the 2016 Ethiopian Demographic and Health Survey to evaluate measles vaccination dropout. Eight supervised machine learning algorithms were implemented: eXtreme Gradient Boosting (XGBoost), Random Forest, Gradient Boosting, Support Vector Machine, Decision Tree, Naïve Bayes, K-Nearest Neighbours and Logistic Regression. Data preprocessing and model development were performed using R language V.4.2.1. The predictive models were evaluated using accuracy, precision, recall, F1-score and area under the curve (AUC). Unlike previous studies, this research utilised Shapley values to interpret individual predictions made by the top-performing machine learning model. RESULTS The XGBoost algorithm surpassed all classifiers in predicting measles vaccination dropout (Accuracy and AUC values of 73.9% and 0.813, respectively). The Shapley Beeswarm plot displayed how each feature influenced the best model's predictions. The model predicted that the younger mother's age, religion-Jehovah/Adventist, husband with no and mother with primary education, unemployment of the mother, residence in the Oromia and Somali regions, large family size and older paternal age have a strong positive impact on the measles vaccination dropout. CONCLUSION The measles dropout rate in the country exceeded the recommended threshold of <10%. To tackle this issue, targeted interventions are crucial. 
Public awareness campaigns, regular health education and partnerships with religious institutions and health extension workers should be implemented, particularly in the identified underprivileged regions. These measures can help reduce measles vaccination dropout rates and enhance overall coverage.
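The evaluation metrics named in this abstract (accuracy, precision, recall, F1-score and AUC) can all be derived from true labels, predicted labels and predicted scores. The following is a minimal sketch in plain Python, not the study's R code; the names `binary_metrics` and `roc_auc` and the toy labels are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return BinaryMetrics((tp + tn) / len(pairs), precision, recall, f1)


def roc_auc(y_true, scores, positive=1):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = dropout, 0 = completed vaccination.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

The pairwise AUC is quadratic in the number of samples, which is fine for illustration; a rank-based formula would be the usual choice for survey-scale data.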
Affiliation(s)
- Meron Asmamaw Alemayehu
- Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Amhara, Ethiopia
| |
|
23
|
Ding T, Qu T, Zou Z, Ding C. A novel multi-model feature generation technique for suicide detection. PeerJ Comput Sci 2024; 10:e2301. [PMID: 39650449 PMCID: PMC11623287 DOI: 10.7717/peerj-cs.2301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 08/12/2024] [Indexed: 12/11/2024]
Abstract
Automated expert systems (AES) analyzing depression-related content on social media have piqued the interest of researchers. Depression, often linked to suicide, requires early prediction for potential life-saving interventions. In the conventional approach, psychologists conduct patient interviews or administer questionnaires to assess depression levels. However, this traditional method is plagued by limitations. Patients might not feel comfortable disclosing their true feelings to psychologists, and counselors may struggle to accurately predict situations due to limited data. In this context, social media emerges as a potentially valuable resource. Given the widespread use of social media in daily life, individuals often express their nature and mental state through their online posts. AES can efficiently analyze vast amounts of social media content to predict depression levels in individuals at an early stage. This study contributes to this endeavor by proposing an innovative approach for predicting suicide risks using social media content and machine learning techniques. A novel multi-model feature generation technique is employed to enhance the performance of machine learning models. This technique involves the use of a feature extraction method known as term frequency-inverse document frequency (TF-IDF), combined with two machine learning models: logistic regression (LR) and support vector machine (SVM). The proposed technique calculates probabilities for each sample in the dataset, resulting in a new feature set referred to as the probability-based feature set (ProBFS). This ProBFS is compact yet highly correlated with the target classes in the dataset. The utilization of concise and correlated features yields significant outcomes. The SVM model achieves an impressive accuracy score of 0.96 using ProBFS while maintaining a low computational time of 5.63 seconds even when dealing with extensive datasets. 
Furthermore, a comparison with state-of-the-art approaches is conducted to demonstrate the significance of the proposed method.
Affiliation(s)
- Ting Ding
- School of Earth Science, East China University of Technology, Nanchang, Jiangxi, China
- Urumqi Comprehensive Survey Center on Natural Resources, China Geological Survey, Urumqi, Xinjiang, China
| | - Tonghui Qu
- Hangzhou Hikvision Digital Technology, Hangzhou, China
| | - Zongliang Zou
- School of Earth Science, East China University of Technology, Nanchang, Jiangxi, China
| | - Cheng Ding
- Department of Biomedical Engineering, Emory University, Atlanta, GA, United States of America
| |
|
24
|
Zhang H, Ren R, Gao X, Wang H, Jiang W, Jiang X, Li Z, Pan J, Wang J, Wang S, Ding Y, Mu Y, Wang X, Du J, Li WT, Xiong Z, Zou J. Synchronous monitoring agricultural water qualities and greenhouse gas emissions based on low-cost Internet of Things and intelligent algorithms. WATER RESEARCH 2024; 268:122663. [PMID: 39467424 DOI: 10.1016/j.watres.2024.122663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 09/24/2024] [Accepted: 10/17/2024] [Indexed: 10/30/2024]
Abstract
This study addressed the challenges of cost and portability in the synchronous monitoring of water quality and greenhouse gas emissions in paddy-dominated regions by developing a novel Internet of Things (IoT)-based monitoring system (WG-IoT-MS). The system, equipped with low-cost sensors and integrated intelligent algorithms, enabled real-time monitoring of dissolved N2O concentrations. Combined with an air-water gas exchange model, the system achieved efficient monitoring and simulation of CO2 and N2O emissions from agricultural water bodies while reducing monitoring costs by approximately 60%. The proposed method was validated in paddy-dominated regions in Danyang, China. Results indicated that the dissolved N2O concentration model based on support vector regression performed well, giving accurate predictions within a concentration range of 2.003 to 13.247 μg/L. Notably, the model maintained acceptable predictive accuracy (R² > 0.70) even when some variables were partially absent (with the number of missing variables < 2 and the missing proportion (MP) ≤ 50%), compensating for the data loss caused by sensor malfunctions. Furthermore, the model performed well (R² > 0.80) when tested on data from paddy fields and lakes. Finally, CO2 and N2O emissions were successfully monitored, with the results validated using a floating chamber method (R² > 0.70). The method provides crucial technical support for quantitative assessment of water quality and greenhouse gas emissions in paddy-dominated regions, laying a foundation for formulating effective emission reduction strategies.
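An air-water gas exchange model, in its commonly used thin-boundary-layer form, estimates the diffusive flux as the gas transfer velocity multiplied by the difference between the dissolved concentration and the concentration in equilibrium with the atmosphere. The sketch below assumes that form; the abstract does not state which formulation the study used, and the transfer velocity and equilibrium concentration are hypothetical values chosen only for illustration.

```python
def diffusive_flux(k_m_per_day, c_water_ug_per_l, c_eq_ug_per_l):
    """Return the air-water flux in mg m^-2 d^-1 (positive = emission to air).

    Assumed thin-boundary-layer form: F = k * (C_water - C_eq).
    Concentrations are given in ug/L, which is numerically equal to mg/m^3,
    so multiplying by k in m/d yields mg m^-2 d^-1 directly.
    """
    return k_m_per_day * (c_water_ug_per_l - c_eq_ug_per_l)


# Hypothetical inputs: k = 1.0 m/d and C_eq = 0.3 ug/L are assumptions;
# 2.003 ug/L is the lower end of the dissolved N2O range quoted in the abstract.
flux = diffusive_flux(1.0, 2.003, 0.3)
```

A negative result indicates that the water is undersaturated and takes up gas from the atmosphere.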
Affiliation(s)
- Huazhan Zhang
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Rui Ren
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Xiang Gao
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China; Jiangsu Key Laboratory of Low Carbon Agriculture and GHGs Mitigation, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China.
| | - Housheng Wang
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Wei Jiang
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Xiaosan Jiang
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Zhaofu Li
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Jianjun Pan
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Jinyang Wang
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China; Jiangsu Key Laboratory of Low Carbon Agriculture and GHGs Mitigation, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Songhan Wang
- College of Agronomy, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Yanfeng Ding
- College of Agronomy, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Yue Mu
- Academy for advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Xuelei Wang
- Institute of Remote Sensing Applications of Chinese Academy of Sciences, Beijing 100875, PR China
| | - Jizeng Du
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, Beijing 100875, PR China
| | - Wen-Tao Li
- State Key Laboratory of Pollution Control and Resources Reuse, School of the Environment, Nanjing University 210023 Nanjing, PR China
| | - Zhengqin Xiong
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China; Jiangsu Key Laboratory of Low Carbon Agriculture and GHGs Mitigation, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Jianwen Zou
- Key Laboratory of Low-carbon and Green Agriculture in Southeastern China, Ministry of Agriculture and Rural Affairs, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China; Jiangsu Key Laboratory of Low Carbon Agriculture and GHGs Mitigation, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, PR China
| |
|
25
|
Asti V, Ablondi M, Molle A, Zanotti A, Vasini M, Sabbioni A. Inertial measurement unit technology for gait detection: a comprehensive evaluation of gait traits in two Italian horse breeds. Front Vet Sci 2024; 11:1459553. [PMID: 39479203 PMCID: PMC11521968 DOI: 10.3389/fvets.2024.1459553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Accepted: 09/30/2024] [Indexed: 11/02/2024] Open
Abstract
Introduction The shift of the horse breeding sector from agricultural to leisure and sports purposes led to a decrease in local breeds' population size due to the loss of their original breeding purposes. Most of the Italian breeds must adapt to modern market demands, and gait traits are suitable phenotypes to help this process. Inertial measurement unit (IMU) technology can be used to objectively assess them. This work aims to investigate, for IMU-recorded data, (i) the influence of environmental factors and biometric measurements, (ii) repeatability, (iii) the correlation with judge evaluations, and (iv) predictive value. Material and methods The Equisense Motion S® was used to collect phenotypes on 135 horses, Bardigiano (101) and Murgese (34), and the data analysis was conducted using R (v.4.1.2). Analysis of variance (ANOVA) was employed to assess the effects of biometric measurements and environmental and animal factors on the traits. Results and discussion Variations in several traits depending on the breed were identified, highlighting different abilities among Bardigiano and Murgese horses. Repeatability of horse performance was assessed on a subset of horses, with regularity and elevation at walk being the traits with the highest repeatability (0.63 and 0.72). The positive correlation between judge evaluations and sensor data indicates judges' ability to evaluate overall gait quality. Three different algorithms were employed to predict the judges' score from the IMU measurements: Support Vector Machine (SVM), Gradient Boosting Machine (GBM), and K-Nearest Neighbors (KNN). High variability was observed in the accuracy of the SVM model, ranging from 55 to 100%, while the other two models showed higher consistency, with accuracy ranging from 74 to 100% for the GBM and from 64 to 88% for the KNN. Overall, the GBM model exhibited the highest accuracy and the lowest error. 
In conclusion, integrating IMU technology into horse performance evaluation offers valuable insights, with implications for breeding and training.
Affiliation(s)
- Vittoria Asti
- Department of Veterinary Sciences, University of Parma, Parma, Italy
| | - Michela Ablondi
- Department of Veterinary Sciences, University of Parma, Parma, Italy
| | - Arnaud Molle
- Department of Veterinary Sciences, University of Parma, Parma, Italy
| | - Andrea Zanotti
- Department of Veterinary Sciences, University of Parma, Parma, Italy
| | - Matteo Vasini
- Italian Breeding Association for Equine and Donkey Breeds (ANAREAI), Roma, Italy
| | - Alberto Sabbioni
- Department of Veterinary Sciences, University of Parma, Parma, Italy
| |
|
26
|
Jia W, Li F, Cui Y, Wang Y, Dai Z, Yan Q, Liu X, Li Y, Chang H, Zeng Q. Deep Learning Radiomics Model of Contrast-Enhanced CT for Differentiating the Primary Source of Liver Metastases. Acad Radiol 2024; 31:4057-4067. [PMID: 38702214 DOI: 10.1016/j.acra.2024.04.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Revised: 04/05/2024] [Accepted: 04/11/2024] [Indexed: 05/06/2024]
Abstract
RATIONALE AND OBJECTIVES To develop and validate a deep learning radiomics (DLR) model based on contrast-enhanced computed tomography (CT) to identify the primary source of liver metastases. MATERIALS AND METHODS In total, 657 liver metastatic lesions, including breast cancer (BC), lung cancer (LC), colorectal cancer (CRC), gastric cancer (GC), and pancreatic cancer (PC), from 428 patients were collected at three clinical centers from January 2018 to October 2023. The lesions were randomly assigned to the training and validation sets in a 7:3 ratio. An additional 112 lesions from 61 patients at another clinical center served as an external test set. A DLR model based on contrast-enhanced CT of the liver was developed to distinguish the five pathological types of liver metastases. Stepwise classification was performed to improve the classification efficiency of the model. Lesions were first classified as digestive tract cancer (DTC) or non-digestive tract cancer (non-DTC). DTCs were then divided into CRC, GC, and PC, and non-DTCs were divided into LC and BC. To verify the feasibility of the DLR model, we trained classical machine learning (ML) models as comparison models. Model performance was evaluated using accuracy (ACC) and area under the receiver operating characteristic curve (AUC). RESULTS The classification model constructed by the DLR algorithm showed excellent performance in the classification task compared with the ML models. In the five-category task, the highest ACC and average AUC in the validation set were 0.563 and 0.796, respectively. In the DTC versus non-DTC and the LC versus BC classification tasks, AUC was 0.907 and 0.809 and ACC was 0.843 and 0.772, respectively. In the CRC, GC, and PC classification task, the highest ACC and average AUC were 0.714 and 0.811, respectively. CONCLUSION The DLR model is an effective method for identifying the primary source of liver metastases.
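The stepwise classification described in the methods routes each lesion through a first-stage DTC versus non-DTC decision and then through the sub-classifier for that group. The sketch below illustrates only this routing logic; the three classifiers are hypothetical threshold stubs standing in for trained models, and the function name `stepwise_predict` is an assumption, not the study's code.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def stepwise_predict(
    x: Features,
    stage1: Classifier,
    dtc_model: Classifier,
    non_dtc_model: Classifier,
) -> str:
    """Route a lesion through the group-level model, then the matching sub-model."""
    group = stage1(x)
    if group == "DTC":
        return dtc_model(x)      # expected to return CRC, GC or PC
    return non_dtc_model(x)      # expected to return LC or BC


# Hypothetical stand-ins for trained models (simple thresholds on toy features).
def toy_stage1(x: Features) -> str:
    return "DTC" if x[0] > 0.5 else "non-DTC"


def toy_dtc(x: Features) -> str:
    if x[1] < 0.33:
        return "CRC"
    return "GC" if x[1] < 0.66 else "PC"


def toy_non_dtc(x: Features) -> str:
    return "LC" if x[1] < 0.5 else "BC"


label = stepwise_predict([0.9, 0.1], toy_stage1, toy_dtc, toy_non_dtc)
```

One consequence of this design is that a first-stage error cannot be corrected later, so the five-class accuracy is bounded by the accuracy of the DTC versus non-DTC split.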
Affiliation(s)
- Wenjing Jia
- Department of Radiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China; Shandong First Medical University, Jinan, China.
| | - Fuyan Li
- Department of Radiology, Shandong Provincial Hospital affiliated to Shandong First Medical University, Jinan, China.
| | - Yi Cui
- Department of Radiology, Qilu Hospital of Shandong University, Jinan, China.
| | - Yong Wang
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China.
| | - Zhengjun Dai
- Scientific Research Department, Huiying Medical Technology Co., Ltd, Beijing, China.
| | - Qingqing Yan
- Department of Radiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China.
| | - Xinhui Liu
- Department of Radiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China.
| | - Yuting Li
- Department of Radiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China.
| | - Huan Chang
- Department of Radiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China.
| | - Qingshi Zeng
- Department of Radiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China.
| |
|
27
|
Murmu A, Győrffy B. Artificial intelligence methods available for cancer research. Front Med 2024; 18:778-797. [PMID: 39115792 DOI: 10.1007/s11684-024-1085-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 05/17/2024] [Indexed: 11/01/2024]
Abstract
Cancer is a heterogeneous and multifaceted disease with a significant global footprint. Despite substantial technological advancements for battling cancer, early diagnosis and selection of effective treatment remain a challenge. With the convenience of large-scale datasets including multiple levels of data, new bioinformatic tools are needed to transform this wealth of information into clinically useful decision-support tools. In this field, artificial intelligence (AI) technologies with their highly diverse applications are rapidly gaining ground. Machine learning methods, such as Bayesian networks, support vector machines, decision trees, random forests, gradient boosting, and K-nearest neighbors, as well as neural network models like deep learning, have proven valuable in predictive, prognostic, and diagnostic studies. Researchers have recently employed large language models to tackle new dimensions of problems. However, leveraging the opportunity to utilize AI in clinical settings will require overcoming significant obstacles; a major issue is that the available reporting guidelines are seldom used, which obstructs the reproducibility of published studies. In this review, we discuss the applications of AI methods and explore their benefits and limitations. We summarize the available guidelines for AI in healthcare and highlight the potential role and impact of AI models on future directions in cancer research.
Affiliation(s)
- Ankita Murmu
- Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, Budapest, 1117, Hungary
- National Laboratory for Drug Research and Development, Budapest, 1117, Hungary
- Department of Bioinformatics, Semmelweis University, Budapest, 1094, Hungary
| | - Balázs Győrffy
- Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, Budapest, 1117, Hungary.
- Department of Bioinformatics, Semmelweis University, Budapest, 1094, Hungary.
- Department of Biophysics, University of Pecs, Pecs, 7624, Hungary.
| |
|
28
|
Chen J, Chen X, Wang J. A novel binary data classification algorithm based on the modified reaction-diffusion predator-prey system with Holling-II function. CHAOS (WOODBURY, N.Y.) 2024; 34:103111. [PMID: 39361816 DOI: 10.1063/5.0219960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Accepted: 09/02/2024] [Indexed: 10/05/2024]
Abstract
In this study, we propose a modified reaction-diffusion prey-predator model with a Holling-II function for binary data classification. In the model, u and v represent the densities of prey and predators, respectively. We modify the original equation by substituting the term v with f-v to obtain a stable and clear nonlinear decision surface. By employing a finite difference method to solve the model numerically, we conduct various experiments in two-dimensional and three-dimensional spaces to validate the feasibility of the classifier. Additionally, with broad real-world applicability in mind, we perform classification experiments on electroencephalogram signals, demonstrating the effectiveness and robustness of the classifier in binary data classification.
Affiliation(s)
- Jialin Chen
- School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, China
| | - Xinlei Chen
- School of Teacher Education, Nanjing University of Information Science and Technology, Nanjing 210044, China
| | - Jian Wang
- School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Center for Applied Mathematics of Jiangsu Province, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Jiangsu International Joint Laboratory on System Modeling and Data Analysis, Nanjing University of Information Science and Technology, Nanjing 210044, China
| |
|
29
|
Mehta J, Chatterjee S, Shah M. Leveraging microbial synergy: Predicting the optimal consortium to enhance the performance of microbial fuel cell using Subspace-kNN. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 369:122252. [PMID: 39222584 DOI: 10.1016/j.jenvman.2024.122252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 06/12/2024] [Accepted: 08/17/2024] [Indexed: 09/04/2024]
Abstract
Microbial Fuel Cells (MFCs) are sophisticated and advanced systems that use exoelectrogenic microorganisms to generate bioenergy. Predicting performance outcomes under experimental settings is challenging because of the intricate interactions that occur in mixed-species bioelectrochemical reactors like MFCs. One of the key factors that limit the MFC's performance is the presence of a microbial consortium. Traditionally, multiple microbial consortia are implemented in MFCs to determine the best consortium. This approach is laborious, inefficient, and wasteful of time and resources. The increase in the availability of soft computational techniques has allowed for the development of alternative strategies like artificial intelligence (AI), despite the fact that a direct correlation between microbial strain, microbial consortium, and MFC performance has yet to be established. In this work, a novel generic AI model based on subspace k-Nearest Neighbour (SS-kNN) is developed to identify and forecast the best microbial consortium from the constituent microbes. The SS-kNN model is trained with thirty-five different microbial consortia sharing different effluent properties. Chemical oxygen demand (COD) reduction, voltage generation, exopolysaccharide (EPS) production, and standard deviation (SD) of voltage generation are used as input features to train the SS-kNN model. The proposed SS-kNN model offers an accuracy of 100% during the training period and 85.71% when tested with data obtained from existing literature. Implementing the selected consortium (as predicted by the SS-kNN model) improves the COD reduction capability of the MFC by 15.67% compared with its constituent microbes, which is experimentally verified. In addition, MFC technology provides clean and green electricity while helping to mitigate water pollution and the effects of climate change, thereby contributing to sustainable development goals (SDGs) 6, 7, and 13.
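A subspace kNN model is generally understood as a random-subspace ensemble: each member runs k-nearest-neighbour voting on a randomly chosen subset of the input features, and the ensemble takes a majority vote across members. The sketch below illustrates that general idea in plain Python; it is not the authors' implementation, and the ensemble size, subspace size, class labels and toy feature values are all hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Classify x by majority vote of its k nearest neighbours on the given features."""
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def subspace_knn_predict(train_X, train_y, x, n_members=15, subspace_size=2, k=3, seed=0):
    """Majority vote over kNN members, each restricted to a random feature subset."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    member_votes = Counter()
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        member_votes[knn_predict(train_X, train_y, x, k, features)] += 1
    return member_votes.most_common(1)[0][0]


# Hypothetical, normalised toy rows: [COD reduction, voltage, EPS, SD of voltage].
train_X = [
    [0.90, 0.80, 0.70, 0.10], [0.85, 0.90, 0.75, 0.15], [0.95, 0.85, 0.80, 0.12],
    [0.20, 0.10, 0.20, 0.60], [0.15, 0.20, 0.25, 0.70], [0.10, 0.15, 0.30, 0.65],
]
train_y = ["high", "high", "high", "low", "low", "low"]
prediction = subspace_knn_predict(train_X, train_y, [0.88, 0.82, 0.72, 0.11])
```

Restricting each member to a feature subset decorrelates the members, which can help when only a few dozen training samples are available, as with the thirty-five consortia mentioned above.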
Affiliation(s)
- Jimil Mehta
- Electrical Engineering Department, Institute of Technology, Nirma University, Sarkhej-Gandhinagar Highway, Ahmedabad, 382481, Gujarat, India
| | - Soumesh Chatterjee
- Electrical Engineering Department, Institute of Technology, Nirma University, Sarkhej-Gandhinagar Highway, Ahmedabad, 382481, Gujarat, India
| | - Manisha Shah
- Electrical Engineering Department, Institute of Technology, Nirma University, Sarkhej-Gandhinagar Highway, Ahmedabad, 382481, Gujarat, India.
|
30
|
Guo F, Hu H, Peng H, Liu J, Tang C, Zhang H. Research progress on machine algorithm prediction of liver cancer prognosis after intervention therapy. Am J Cancer Res 2024; 14:4580-4596. [PMID: 39417194 PMCID: PMC11477842 DOI: 10.62347/beao1926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Accepted: 09/13/2024] [Indexed: 10/19/2024] Open
Abstract
The treatment for liver cancer has transitioned from traditional surgical resection to interventional therapies, which have become increasingly popular among patients due to their minimally invasive nature and significant local efficacy. However, with advancements in treatment technologies, accurately assessing patient response and predicting long-term survival has become a crucial research topic. Over the past decade, machine algorithms have made remarkable progress in the medical field, particularly in hepatology and prognosis studies of hepatocellular carcinoma (HCC). Machine algorithms, including deep learning and machine learning, can identify prognostic patterns and trends by analyzing vast amounts of clinical data. Despite significant advancements, several issues remain unresolved in the prognosis prediction of liver cancer using machine algorithms. Key challenges and main controversies include effectively integrating multi-source clinical data to improve prediction accuracy, addressing data privacy and ethical concerns, and enhancing the transparency and interpretability of machine algorithm decision-making processes. This paper aims to systematically review and analyze the current applications and potential of machine algorithms in predicting the prognosis of patients undergoing interventional therapy for liver cancer, providing theoretical and empirical support for future research and clinical practice.
Affiliation(s)
- Feng Guo
- Department of Interventional Diagnosis and Treatment, Yongzhou Central Hospital, Yongzhou Clinical College, University of South China, Yongzhou 425000, Hunan, China
| | - Hao Hu
- Department of Gynecologic Oncology, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430079, Hubei, China
| | - Hao Peng
- Department of Abdominal Oncology, The Central Hospital of Enshi Tujia and Miao Autonomous Prefecture, Enshi 445000, Hubei, China
| | - Jia Liu
- Department of Oncology, The First People’s Hospital of Changde City, Changde 415003, Hunan, China
| | - Chengbo Tang
- Department of Interventional Diagnosis and Treatment, Yongzhou Central Hospital, Yongzhou Clinical College, University of South China, Yongzhou 425000, Hunan, China
| | - Hao Zhang
- Department of Interventional Vascular Surgery, First Affiliated Hospital of Hunan Normal University (Hunan Provincial People’s Hospital), Changsha 410000, Hunan, China
|
31
|
Mumenin N, Kabir Hossain ABM, Hossain MA, Debnath PP, Nusrat Della M, Hasan Rashed MM, Hossen A, Basar MR, Hossain MS. Screening depression among university students utilizing GHQ-12 and machine learning. Heliyon 2024; 10:e37182. [PMID: 39296063 PMCID: PMC11409111 DOI: 10.1016/j.heliyon.2024.e37182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 08/22/2024] [Accepted: 08/28/2024] [Indexed: 09/21/2024] Open
Abstract
The escalating incidence of depression has brought attention to the increasing concern regarding the mental well-being of university students in the current academic environment. Given the increasing mental health challenges faced by students, there is a critical need for efficient, scalable, and accurate screening methods. This study aims to address the issue by using the General Health Questionnaire-12 (GHQ-12), a well-recognized tool for evaluating psychological discomfort, in combination with machine learning (ML) techniques. Firstly, for effective screening of depression, a comprehensive questionnaire has been created with the help of an expert psychiatrist. The questionnaire includes the GHQ-12 as well as socio-demographic and job- and career-related inquiries. A total of 804 responses have been collected from various public and private universities across Bangladesh. The data have then been analyzed and preprocessed. It has been found that around 60% of the study population are suffering from depression. Lastly, 16 different ML models, including both traditional algorithms and ensemble methods, have been applied to the data to identify trends and predictors of depression in this demographic. The models' performance has been rigorously evaluated in order to ascertain their effectiveness in precisely identifying individuals who are at risk. Among the ML models, Extremely Randomized Tree (ET) has achieved the highest accuracy of 90.26%, showcasing its classification effectiveness. A thorough comparison of the models' performance, clarifying their possible relevance in the early detection of depression among university students, is presented in this paper. The findings shed light on the complex interplay among socio-demographic variables, stressors associated with one's profession, and mental well-being, which offers an original viewpoint on utilizing ML in psychological research.
Affiliation(s)
- Nasirul Mumenin
- Bangladesh Army University of Engineering and Technology, Rajshahi, Bangladesh
| | - A B M Kabir Hossain
- Bangladesh Army University of Engineering and Technology, Rajshahi, Bangladesh
| | - Md Arafat Hossain
- Bangladesh Army University of Engineering and Technology, Rajshahi, Bangladesh
| | | | | | | | - Afzal Hossen
- Bangladesh Army University of Engineering and Technology, Rajshahi, Bangladesh
| | - Md Rubel Basar
- Bangladesh Army University of Engineering and Technology, Rajshahi, Bangladesh
| | - Md Sejan Hossain
- Bangladesh Army University of Engineering and Technology, Rajshahi, Bangladesh
|
32
|
Hussain I, Qureshi M, Ismail M, Iftikhar H, Zywiołek J, López-Gonzales JL. Optimal features selection in the high dimensional data based on robust technique: Application to different health database. Heliyon 2024; 10:e37241. [PMID: 39296019 PMCID: PMC11408077 DOI: 10.1016/j.heliyon.2024.e37241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 08/28/2024] [Accepted: 08/29/2024] [Indexed: 09/21/2024] Open
Abstract
Bio-informatics and gene expression analysis face major hurdles when dealing with high-dimensional data, where the number of variables or genes much outweighs the number of samples. These difficulties are exacerbated, particularly in microarray data processing, by redundant genes that do not significantly contribute to the response variable. To address this issue, gene selection emerges as a feasible method for identifying the most important genes, hence reducing the generalization error of classification algorithms. This paper introduces a new hybrid approach for gene selection by combining the Signal-to-Noise Ratio (SNR) score with the robust Mood median test. The Mood median test is beneficial for reducing the impact of outliers in non-normal or skewed data since it may successfully identify genes with significant changes across groups. The SNR score measures the significance of a gene's classification by comparing the gap between class means and within-class variability. By integrating both of these approaches, the suggested approach aims to find genes that are significant for classification tasks. The major objective of this study is to evaluate the effectiveness of this combination approach in choosing the optimal genes. A significant P-value is consistently identified for each gene using the Mood median test and the SNR score. By dividing the SNR value of each gene by its significant P-value, the Md score is calculated. Genes with a high signal-to-noise ratio (SNR) have been considered favorable due to their minimal noise influence and significant classification importance. To verify the effectiveness of the selected genes, the study utilizes two dependable classification techniques: Random Forest and K-Nearest Neighbors (KNN). These algorithms were chosen due to their track record of successfully completing categorization-related tasks. The performance of the selected genes is evaluated using two metrics: error reduction and classification accuracy. 
These metrics offer an in-depth assessment of how well the selected genes improve classification accuracy and consistency. According to the findings, the hybrid approach put out here outperforms conventional gene selection methods in high-dimensional datasets and has lower classification error rates. There are considerable improvements in classification accuracy and error reduction when specific genes are exposed to the Random Forest and KNN classifiers. The outcomes demonstrate how this hybrid technique might be a helpful tool to improve gene selection processes in bioinformatics.
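The scoring step described in this abstract (an SNR value divided by a Mood median test P-value) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the SNR definition used here (absolute mean gap over the sum of class standard deviations), the two-class setting, the omission of a continuity correction, and the function names `snr`, `mood_median_p`, and `md_score` are all assumptions.

```python
import math
from statistics import mean, median, stdev

def snr(a, b):
    """Signal-to-noise ratio: gap between class means over summed within-class spread.

    Assumes at least one class has non-zero standard deviation.
    """
    return abs(mean(a) - mean(b)) / (stdev(a) + stdev(b))

def mood_median_p(a, b):
    """P-value of Mood's median test for two groups (no continuity correction)."""
    grand = median(a + b)
    # Count values above the pooled median in each group; the rest go in "below".
    above = [sum(x > grand for x in g) for g in (a, b)]
    below = [len(a) - above[0], len(b) - above[1]]
    if sum(above) == 0 or sum(below) == 0:
        return 1.0  # degenerate table: no evidence of a difference
    n = len(a) + len(b)
    chi2 = 0.0
    for row in (above, below):
        for observed, group_size in zip(row, (len(a), len(b))):
            expected = sum(row) * group_size / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def md_score(a, b):
    """SNR divided by the Mood median test P-value (floored to avoid division by zero)."""
    return snr(a, b) / max(mood_median_p(a, b), 1e-300)

# Hypothetical expression values of two genes in two classes of five samples each.
informative = md_score([1.0, 1.2, 0.9, 1.1, 1.0], [3.0, 3.2, 2.9, 3.1, 3.0])
noise = md_score([1.0, 3.0, 2.0, 1.5, 2.5], [2.0, 1.0, 3.0, 2.5, 1.5])
```

Under these assumptions, a gene whose classes are well separated receives a much larger score than one whose classes overlap, which is the ranking behaviour the abstract relies on.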
Affiliation(s)
- Ibrar Hussain
- Department of Statistics, Abdul Wali Khan University Mardan, Pakistan
| | - Moiz Qureshi
- Govt Boys Degree College Tandojam, Hyderabad, Sindh, Pakistan
- Department of Statistics, Quaid-i-Azam University, 45320, Islamabad, Pakistan
| | - Muhammad Ismail
- College of Statistical Sciences, University of the Punjab, Lahore, Pakistan
- Department of Statistics, Quaid-i-Azam University, 45320, Islamabad, Pakistan
| | - Hasnain Iftikhar
- Department of Statistics, Quaid-i-Azam University, 45320, Islamabad, Pakistan
- Escuela de Posgrado, Universidad Peruana Unión, Lima, Peru
| | - Justyna Zywiołek
- Faculty of Management, Czestochowa University of Technology, Czestochowa, 42-200, Poland
|
33
|
Deng L, Chen WS, Xiao M. Metafeature Selection via Multivariate Sparse-Group Lasso Learning for Automatic Hyperparameter Configuration Recommendation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:12540-12552. [PMID: 37037247 DOI: 10.1109/tnnls.2023.3263506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
The performance of classification algorithms is mainly governed by the hyperparameter settings deployed in applications, and the search for desirable hyperparameter configurations usually is quite challenging due to the complexity of datasets. Metafeatures are a group of measures that characterize the underlying dataset from various aspects, and the corresponding recommendation algorithm fully relies on the appropriate selection of metafeatures. Metalearning (MtL), aiming to improve the learning algorithm itself, requires development in integrating features, models, and algorithm learning to accomplish its goal. In this article, we develop a multivariate sparse-group Lasso (SGLasso) model embedded with MtL capacity in recommending suitable configurations via learning. The main idea is to select the principal metafeatures by removing those redundant or irregular ones, promoting both efficiency and performance in the hyperparameter configuration recommendation. To be specific, we first extract the metafeatures and classification performance of a set of configurations from the collection of historical datasets, and then, a metaregression task is established through SGLasso to capture the main characteristics of the underlying relationship between metafeatures and historical performance. For a new dataset, the classification performance of configurations can be estimated through the selected metafeatures so that the configuration with the highest predictive performance in terms of the new dataset can be generated. Furthermore, a general MtL architecture combined with our model is developed. Extensive experiments are conducted on 136 UCI datasets, demonstrating the effectiveness of the proposed approach. The empirical results on the well-known SVM show that our model can effectively recommend suitable configurations and outperform the existing MtL-based methods and the well-known search-based algorithms, such as random search, Bayesian optimization, and Hyperband.
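For orientation, the generic sparse-group Lasso objective from the wider literature can be written as below. The abstract does not give the article's multivariate formulation, so this is only an assumed general form; the symbols are illustrative: $X$ for the metafeature matrix of $n$ historical datasets, $Y$ for the observed performance of the configurations, $B$ for the coefficient matrix, $B_g$ for the rows of $B$ belonging to metafeature group $g$ of size $p_g$, and $\lambda_1, \lambda_2$ for tuning parameters.

```latex
\min_{B}\; \frac{1}{2n}\,\lVert Y - XB \rVert_F^2
  \;+\; \lambda_1 \sum_{g=1}^{G} \sqrt{p_g}\,\lVert B_g \rVert_F
  \;+\; \lambda_2 \lVert B \rVert_1
```

The group term can zero out whole groups of metafeatures, while the elementwise term removes individual redundant ones, which matches the abstract's goal of keeping only the principal metafeatures.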
|
34
|
Li D. A pre-averaged pseudo nearest neighbor classifier. PeerJ Comput Sci 2024; 10:e2247. [PMID: 39314691 PMCID: PMC11419614 DOI: 10.7717/peerj-cs.2247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 07/17/2024] [Indexed: 09/25/2024]
Abstract
The k-nearest neighbor algorithm is a powerful classification method. However, its classification performance is affected when samples are small and contain outliers. To address this issue, a pre-averaged pseudo nearest neighbor classifier (PAPNN) is proposed to improve classification performance. In the PAPNN rule, the pre-averaged categorical vectors are calculated by taking the average of any two points of the training sets in each class. Then, k-pseudo nearest neighbors are chosen from the preprocessed vectors of every class to determine the category of a query point. The pre-averaged vectors can reduce the negative impact of outliers to some degree. Extensive experiments are conducted on nineteen numerical real data sets and three high-dimensional real data sets, comparing PAPNN to twelve other classification methods. The experimental results demonstrate that the proposed PAPNN rule is effective for classification tasks in the case of small-size samples with existing outliers.
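The two steps named in this abstract (pre-averaging pairs of training points within each class, then choosing pseudo nearest neighbors per class) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the article's method: the abstract does not define the decision rule, so the code assumes the commonly used pseudo nearest neighbor scoring in which the i-th closest vector is weighted by 1/i and the class with the smallest weighted distance sum wins. The names `pre_average` and `papnn_predict`, the Euclidean distance, and the handling of small classes are likewise hypothetical.

```python
import math
from itertools import combinations

def pre_average(points):
    """Midpoints of every pair of distinct training points in one class."""
    pairs = list(combinations(points, 2))
    if not pairs:  # single-point class: keep the point itself (assumption)
        return [tuple(points[0])]
    return [tuple((a + b) / 2 for a, b in zip(p, q)) for p, q in pairs]

def papnn_predict(train, query, k=3):
    """train maps class label -> list of feature tuples; returns the predicted label."""
    averaged = {label: pre_average(pts) for label, pts in train.items()}
    # Use the same number of neighbors for every class so scores are comparable.
    k_eff = min(k, min(len(v) for v in averaged.values()))
    best_label, best_score = None, math.inf
    for label, vectors in averaged.items():
        dists = sorted(math.dist(query, v) for v in vectors)
        # Weighted sum over the k closest pre-averaged vectors, weight 1/i for rank i.
        score = sum(d / i for i, d in enumerate(dists[:k_eff], start=1))
        if score < best_score:
            best_label, best_score = label, score
    return best_label

# Toy data: class "b" contains one outlier at (50, 50).
train = {
    "a": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "b": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (50.0, 50.0)],
}
near_b = papnn_predict(train, (4.5, 4.5))
near_a = papnn_predict(train, (0.2, 0.2))
```

In this toy case the midpoints involving the outlier lie far from the query and are never among the closest vectors, which illustrates how averaging can dampen a single extreme point.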
Affiliation(s)
- Dapeng Li
- School of Software Engineering, Jinling Institute of Technology, Nanjing, China
|
35
|
Liu L, Zhou H, Wang X, Wen F, Zhang G, Yu J, Shen H, Huang R. Effects of environmental phenols on eGFR: machine learning modeling methods applied to cross-sectional studies. Front Public Health 2024; 12:1405533. [PMID: 39148651 PMCID: PMC11324456 DOI: 10.3389/fpubh.2024.1405533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Accepted: 07/26/2024] [Indexed: 08/17/2024] Open
Abstract
Purpose: Limited investigation is available on the correlation between environmental phenol exposure and estimated glomerular filtration rate (eGFR). Our aim was to establish a robust and explainable machine learning (ML) model that associates environmental phenol exposure with eGFR. Methods: Our datasets for constructing the associations between environmental phenols and eGFR were collected from the National Health and Nutrition Examination Survey (NHANES, 2013-2016). Five ML models were included and fine-tuned to regress eGFR on phenol exposure. Regression evaluation metrics were used to assess the limitations of the models. The most effective model was then utilized for regression, with interpretation of its features carried out using Shapley additive explanations (SHAP), a game-theoretic Python package, to represent the model's regression capacity. Results: The study identified the top-performing random forest (RF) regressor, with a mean absolute error of 0.621 and a coefficient of determination of 0.998 among 3,371 participants. Linear regression models of six environmental phenols against eGFR revealed that the concentrations of triclosan (TCS) and bisphenol S (BPS) in urine were positively correlated with eGFR, with correlation coefficients of β = 0.010 (p = 0.026) and β = 0.007 (p = 0.004), respectively. SHAP values indicate that BPS (1.38), bisphenol F (BPF) (0.97), 2,5-dichlorophenol (0.87), TCS (0.78), BP3 (0.60), bisphenol A (BPA) (0.59) and 2,4-dichlorophenol (0.47) in urine contributed to the model. Conclusion: The RF model was efficient in identifying a correlation between phenol exposure and eGFR among United States NHANES 2013-2016 participants. The findings indicate that BPA, BPF, and BPS are inversely associated with eGFR.
Affiliation(s)
- Lei Liu
- Department of Pathology, Affiliated Hospital of Nantong University, Nantong, China
| | - Hao Zhou
- Department of Thoracic Surgery, Affiliated Hospital of Nantong University, Nantong, China
| | - Xueli Wang
- Department of Pathology, Qingdao Eighth People's Hospital, Qingdao, China
| | - Fukang Wen
- Institute of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China
| | - Guibin Zhang
- College of Electronic and Information Engineering, Tongji University, Shanghai, China
| | - Jinao Yu
- Institute of Computer Science and Engineering, University of Wisconsin-Madison, Madison, WI, United States
| | - Hui Shen
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, United States
| | - Rongrong Huang
- Department of Pharmacy, Affiliated Hospital of Nantong University, Nantong, China
|
36
|
Sebro R, De la Garza-Ramos C. Can we screen opportunistically for low bone mineral density using CT scans of the shoulder and artificial intelligence? Br J Radiol 2024; 97:1450-1460. [PMID: 38837337 PMCID: PMC11256955 DOI: 10.1093/bjr/tqae109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 04/12/2023] [Accepted: 05/22/2024] [Indexed: 06/07/2024] Open
Abstract
OBJECTIVE To evaluate whether the CT attenuation of bones seen on shoulder CT scans could be used to predict low bone mineral density (BMD) (osteopenia/osteoporosis), and to compare the performance of two machine learning models to predict low BMD. METHODS In this study, we evaluated 194 patients aged 50 years or greater (69.2 ± 9.1 years; 170 females) who underwent unenhanced shoulder CT scans and dual-energy X-ray absorptiometry within 1 year of each other between January 1, 2010, and December 31, 2021. The CT attenuation of the humerus, glenoid, coracoid, acromion, clavicle, first, second, and third ribs was obtained using 3D-Slicer. Support vector machines (SVMs) and k-nearest neighbours (kNN) were used to predict low BMD. DeLong test was used to compare the areas under the curve (AUCs). RESULTS A CT attenuation of 195.4 Hounsfield Units of the clavicle had a sensitivity of 0.577, specificity of 0.781, and AUC of 0.701 to predict low BMD. In the test dataset, the SVM had sensitivity of 0.686, specificity of 1.00, and AUC of 0.857, while the kNN model had sensitivity of 0.966, specificity of 0.200, and AUC of 0.583. The SVM was superior to the CT attenuation of the clavicle (P = .003) but not better than the kNN model (P = .098). CONCLUSION The CT attenuation of the clavicle was best for predicting low BMD; however, a multivariable SVM was superior for predicting low BMD. ADVANCES IN KNOWLEDGE SVM utilizing the CT attenuations at many sites was best for predicting low BMD.
Affiliation(s)
- Ronnie Sebro
- Department of Orthopedic Surgery, Mayo Clinic, Jacksonville, FL 32224, United States
- Department of Radiology, Mayo Clinic, Jacksonville, FL 32224, United States
|
37
|
Tripathi MK, Shivendra. Improved deep belief network for estimating mango quality indices and grading: A computer vision-based neutrosophic approach. Network (Bristol, England) 2024; 35:249-277. [PMID: 38224325 DOI: 10.1080/0954898x.2023.2299851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 12/21/2023] [Indexed: 01/16/2024]
Abstract
This research introduces a revolutionary machine learning algorithm-based quality estimation and grading system. The suggested work is divided into four main parts: pre-processing, neutrosophic model transformation, feature extraction, and grading. The raw images are first pre-processed in five major stages: read, resize, noise removal, contrast enhancement via CLAHE, and smoothing via filtering. The pre-processed images are then converted into a neutrosophic domain for more effective mango grading. The image is transformed into the neutrosophic domain using a new geometric mean based neutrosophic approach. Finally, the prediction of TSS for the different chilling conditions is done by an Improved Deep Belief Network (IDBN), and based on this, the grading of mango is done automatically, as the model is already trained with it. Here, the prediction of TSS is carried out under the consideration of SSC, firmness, and TAC. A comparison between the proposed and traditional methods is carried out across various metrics to confirm the efficacy of the proposed method.
Affiliation(s)
- Mukesh Kumar Tripathi
- Department of Computer Science & Engineering, Vardhaman College of Engineering, Hyderabad, Telangana, India
|
38
|
Wu J, Wang R, Tan Y, Liu L, Chen Z, Zhang S, Lou X, Yun J. Hybrid machine learning model based predictions for properties of poly(2-hydroxyethyl methacrylate)-poly(vinyl alcohol) composite cryogels embedded with bacterial cellulose. J Chromatogr A 2024; 1727:464996. [PMID: 38763087 DOI: 10.1016/j.chroma.2024.464996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 05/10/2024] [Accepted: 05/13/2024] [Indexed: 05/21/2024]
Abstract
Supermacroporous composite cryogels with enhanced adjustable functionality have received extensive interest in bioseparation, tissue engineering, and drug delivery. However, variations in their components significantly impact final properties. This study presents a two-step hybrid machine learning approach for predicting the properties of innovative poly(2-hydroxyethyl methacrylate)-poly(vinyl alcohol) composite cryogels embedded with bacterial cellulose (pHEMA-PVA-BC) based on their compositions. By considering the ratios of HEMA (1.0-22.0 wt%), PVA (0.2-4.0 wt%), poly(ethylene glycol) diacrylate (1.0-4.5 wt%), BC (0.1-1.5 wt%), and water (68.0-96.0 wt%) as investigational variables, overlay sampling uniform design (OSUD) was employed to construct a high-quality dataset for model development. The random forest (RF) model was used to classify the preparation conditions. Then four models, namely artificial neural network, RF, gradient boosted regression trees (GBRT), and XGBoost, were developed to predict the basic properties of the composite cryogels. The results showed that the RF model achieved an accurate three-class classification of preparation conditions. Among the four models, the GBRT model exhibited the best predictive performance for the basic properties, with mean absolute percentage errors of 16.04%, 0.85%, and 2.44% for permeability, effective porosity, and height of theoretical plate (1.0 cm/min), respectively. Characterization results of the representative pHEMA-PVA-BC composite cryogel showed an effective porosity of 81.01%, a permeability of 1.20 × 10⁻¹² m², and a height of theoretical plate between 0.40 and 0.49 cm at flow velocities of 0.5-3.0 cm/min. These indicate that the pHEMA-PVA-BC cryogel was an excellent material with supermacropores, low flow resistance and high mass transfer efficiency. Furthermore, the model output demonstrates that altering the proportions of the PVA (0.2-3.5 wt%) and BC (0.1-1.5 wt%) components in composite cryogels resulted in significant changes in the basic properties of the material. This work represents an attempt to efficiently design and prepare target composite cryogels using machine learning and provides valuable insights for the efficient development of polymers.
Affiliation(s)
- Jiawei Wu
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China
| | - Ruobing Wang
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China
| | - Yan Tan
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China
| | - Lulu Liu
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China
| | - Zhihong Chen
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China
| | - Songhong Zhang
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China
| | - Xiaoling Lou
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China.
| | - Junxian Yun
- State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Chaowang Road 18, Hangzhou 310032, PR China.
|
39
|
Jeong JS, Kang TH, Ju H, Cho CH. Novel approach exploring the correlation between presepsin and routine laboratory parameters using explainable artificial intelligence. Heliyon 2024; 10:e33826. [PMID: 39027625 PMCID: PMC11255511 DOI: 10.1016/j.heliyon.2024.e33826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 06/27/2024] [Accepted: 06/27/2024] [Indexed: 07/20/2024] Open
Abstract
Although presepsin, a crucial biomarker for the diagnosis and management of sepsis, has gained prominence in contemporary medical research, its relationship with routine laboratory parameters, including demographic data and hospital blood test data, remains underexplored. This study integrates machine learning with explainable artificial intelligence (XAI) to provide insights into the relationship between presepsin and these parameters. Advanced machine learning classifiers provide a multilateral view of data and play an important role in highlighting the interrelationships between presepsin and other parameters. XAI enhances analysis by ensuring transparency in the model's decisions, especially in selecting key parameters that significantly enhance classification accuracy. Utilizing XAI, this study successfully identified critical parameters that increased the predictive accuracy for sepsis patients, achieving a remarkable ROC AUC of 0.97 and an accuracy of 0.94. This breakthrough is possibly attributed to the comprehensive utilization of XAI in refining parameter selection, thus leading to these significant predictive metrics. The presence of missing data in datasets is another concern; this study addresses it by employing Extreme Gradient Boosting (XGBoost) to manage missing data, effectively mitigating potential biases while preserving both the accuracy and relevance of the results. The perspective of examining data from higher dimensions using machine learning transcends traditional observation and analysis. The findings of this study hold the potential to enhance patient diagnoses and treatment, underscoring the value of merging traditional research methods with advanced analytical tools.
Affiliation(s)
- Jae-Seung Jeong
- Division of Artificial Intelligence Convergence Engineering, Sahmyook University, South Korea
| | - Tak Ho Kang
- Department of Laboratory Medicine, College of Medicine, Korea University Anam Hospital, South Korea
| | - Hyunsu Ju
- Post-Silicon Semiconductor Institute, Korea Institute of Science and Technology, South Korea
| | - Chi-Hyun Cho
- Department of Laboratory Medicine, College of Medicine, Korea University Ansan Hospital, South Korea
|
40
|
Zhou Z, Ai Q, Lou P, Hu J, Yan J. A Novel Method for Rolling Bearing Fault Diagnosis Based on Gramian Angular Field and CNN-ViT. Sensors (Basel, Switzerland) 2024; 24:3967. [PMID: 38931750 PMCID: PMC11207501 DOI: 10.3390/s24123967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 06/11/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024]
Abstract
Fault diagnosis is one of the important applications of edge computing in the Industrial Internet of Things (IIoT). To address the issue that traditional fault diagnosis methods often struggle to effectively extract fault features, this paper proposes a novel rolling bearing fault diagnosis method that integrates Gramian Angular Field (GAF), Convolutional Neural Network (CNN), and Vision Transformer (ViT). First, GAF is used to convert one-dimensional vibration signals from sensors into two-dimensional images, effectively retaining the fault features of the vibration signal. Then, the CNN branch is used to extract the local features of the image, which are combined with the global features extracted by the ViT branch to diagnose the bearing fault. The effectiveness of this method is validated with two datasets. Experimental results show that the proposed method achieves average accuracies of 99.79% and 99.63% on the CWRU and XJTU-SY rolling bearing fault datasets, respectively. Compared with several widely used fault diagnosis methods, the proposed method achieves higher accuracy for different fault classifications, providing reliable technical support for performing complex fault diagnosis on edge devices.
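The first step in this abstract, turning a one-dimensional vibration signal into a two-dimensional image with a Gramian Angular Field, can be sketched as follows. The abstract does not say which GAF variant the authors use, so this assumes the summation form (entry (i, j) equals cos(φ_i + φ_j) after min-max rescaling to [-1, 1]); the function name `gramian_angular_field` and the toy signal are hypothetical.

```python
import math

def gramian_angular_field(signal):
    """Return the Gramian Angular Summation Field of a 1-D signal as a list of rows."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must not be constant")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in signal]
    # Clamp against floating-point drift before taking arccos.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Entry (i, j) encodes the angular sum of samples i and j.
    return [[math.cos(a + b) for b in phi] for a in phi]

# A short toy signal; a real vibration record would be much longer.
image = gramian_angular_field([0.0, 1.0, 2.0, 1.0, 0.0])
```

The result is a symmetric N × N matrix for a signal of length N, which is the kind of image a CNN or ViT branch would then consume.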
Affiliation(s)
- Zijun Zhou
- School of Information, Wuhan University of Technology, Wuhan 430070, China; (Z.Z.); (Q.A.); (P.L.)
| | - Qingsong Ai
- School of Information, Wuhan University of Technology, Wuhan 430070, China; (Z.Z.); (Q.A.); (P.L.)
| | - Ping Lou
- School of Information, Wuhan University of Technology, Wuhan 430070, China; (Z.Z.); (Q.A.); (P.L.)
| | - Jianmin Hu
- School of Information Engineering, Hubei University of Economics, Wuhan 430205, China;
| | - Junwei Yan
- School of Information, Wuhan University of Technology, Wuhan 430070, China; (Z.Z.); (Q.A.); (P.L.)
|
41
|
Nakano FK, Dulfer K, Vanhorebeek I, Wouters PJ, Verbruggen SC, Joosten KF, Güiza Grandas F, Vens C, Van den Berghe G. Predicting adverse long-term neurocognitive outcomes after pediatric intensive care unit admission. Computer Methods and Programs in Biomedicine 2024; 250:108166. [PMID: 38614026 DOI: 10.1016/j.cmpb.2024.108166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 03/18/2024] [Accepted: 04/05/2024] [Indexed: 04/15/2024]
Abstract
BACKGROUND AND OBJECTIVE: Critically ill children may suffer from impaired neurocognitive functions years after intensive care unit (ICU) discharge. To assess neurocognitive functions, these children are subjected to a fixed sequence of tests. Undergoing all tests is, however, arduous for former pediatric ICU patients, resulting in interrupted evaluations in which several neurocognitive deficiencies remain undetected. As a solution, we propose using machine learning to predict the optimal order of tests for each child, reducing the number of tests required to identify the most severe neurocognitive deficiencies.
METHODS: We compared the current clinical approach against several machine learning methods, mainly multi-target regression and label ranking methods. We also proposed a new method that builds several multi-target predictive models and combines their outputs into a ranking that prioritizes the worst neurocognitive outcomes. We used data available at discharge from children who participated in the PEPaNIC-RCT trial (ClinicalTrials.gov NCT01536275), as well as data from a 2-year follow-up study. The institutional review boards at each participating site approved this follow-up study (ML8052; NL49708.078; Pro00038098).
RESULTS: Our proposed method outperformed the other machine learning methods as well as current clinical practice. Specifically, it reaches approximately 80% precision when considering the top-4 outcomes, compared with 65% and 78% obtained by current clinical practice and the state-of-the-art label ranking method, respectively.
CONCLUSIONS: Our experiments demonstrated that machine learning can be competitive with or even superior to the testing order currently employed in clinical practice, suggesting that our model can be used to substantially reduce the number of tests necessary for each child. Moreover, the results indicate that possible long-term adverse outcomes are already predictable as early as ICU discharge. Thus, our work can be seen as the first step toward more personalized follow-up after ICU discharge, leading to preventive rather than curative care.
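The "precision when considering top-4 outcomes" reported above can be illustrated with a small sketch. The abstract does not give the exact definition, so the following rests on two assumptions: a lower score means a worse outcome, and precision at k is the share of the k outcomes the model ranks worst that are truly among the k worst. The function name and the scores are hypothetical.

```python
def precision_at_k(predicted_scores, true_scores, k=4):
    """Share of the k outcomes ranked worst by the model that are truly among the k worst.

    Assumption: a lower score means a worse neurocognitive outcome.
    """
    def worst_first(scores):
        # Indices of the outcomes, ordered from worst (lowest) to best.
        return sorted(range(len(scores)), key=lambda i: scores[i])

    predicted_top = set(worst_first(predicted_scores)[:k])
    true_top = set(worst_first(true_scores)[:k])
    return len(predicted_top & true_top) / k

# Hypothetical scores for six tests of one child (illustrative values only).
predicted = [0.2, 0.9, 0.1, 0.5, 0.4, 0.8]
observed = [0.3, 0.7, 0.2, 0.9, 0.1, 0.6]
score = precision_at_k(predicted, observed, k=4)
```

Averaging this quantity over all children would give a cohort-level figure comparable in spirit to the percentages quoted in the abstract.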
Affiliation(s)
- Felipe Kenji Nakano
  - KU Leuven, Campus KULAK, Department of Public Health and Primary Care, Etienne Sabbelaan 53, Kortrijk, 8500, Belgium; Itec, imec research group at KU Leuven, Etienne Sabbelaan 53, Kortrijk, 8500, Belgium
- Karolijn Dulfer
  - Intensive Care Unit, Department of Paediatrics and Paediatric Surgery, Erasmus Medical Centre, Sophia Children's Hospital, Doctor Molewaterplein 40, Rotterdam, 3015 GD, the Netherlands
- Ilse Vanhorebeek
  - Clinical Division and Laboratory of Intensive Care Medicine, Department of Cellular and Molecular Medicine, UZ Herestraat 49, Leuven, 3000, Belgium
- Pieter J Wouters
  - Clinical Division and Laboratory of Intensive Care Medicine, Department of Cellular and Molecular Medicine, UZ Herestraat 49, Leuven, 3000, Belgium
- Sascha C Verbruggen
  - Intensive Care Unit, Department of Paediatrics and Paediatric Surgery, Erasmus Medical Centre, Sophia Children's Hospital, Doctor Molewaterplein 40, Rotterdam, 3015 GD, the Netherlands
- Koen F Joosten
  - Intensive Care Unit, Department of Paediatrics and Paediatric Surgery, Erasmus Medical Centre, Sophia Children's Hospital, Doctor Molewaterplein 40, Rotterdam, 3015 GD, the Netherlands
- Fabian Güiza Grandas
  - Clinical Division and Laboratory of Intensive Care Medicine, Department of Cellular and Molecular Medicine, UZ Herestraat 49, Leuven, 3000, Belgium
- Celine Vens
  - KU Leuven, Campus KULAK, Department of Public Health and Primary Care, Etienne Sabbelaan 53, Kortrijk, 8500, Belgium; Itec, imec research group at KU Leuven, Etienne Sabbelaan 53, Kortrijk, 8500, Belgium
- Greet Van den Berghe
  - Clinical Division and Laboratory of Intensive Care Medicine, Department of Cellular and Molecular Medicine, UZ Herestraat 49, Leuven, 3000, Belgium
42
|
Khan I, Zedadra O, Guerrieri A, Spezzano G. Occupancy Prediction in IoT-Enabled Smart Buildings: Technologies, Methods, and Future Directions. SENSORS (BASEL, SWITZERLAND) 2024; 24:3276. [PMID: 38894069 PMCID: PMC11174554 DOI: 10.3390/s24113276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 05/14/2024] [Accepted: 05/19/2024] [Indexed: 06/21/2024]
Abstract
In today's world, a significant amount of global energy is used in buildings. Unfortunately, a lot of this energy is wasted, because electrical appliances are not used properly or efficiently. One way to reduce this waste is by detecting, learning, and predicting when people are present in buildings. To do this, buildings need to become "smart" and "cognitive" and use modern technologies to sense when and how people are occupying the buildings. By leveraging this information, buildings can make smart decisions based on recently developed methods. In this paper, we provide a comprehensive overview of recent advancements in Internet of Things (IoT) technologies that have been designed and used for the monitoring of indoor environmental conditions within buildings. Using these technologies is crucial to gathering data about the indoor environment and determining the number and presence of occupants. Furthermore, this paper critically examines both the strengths and limitations of each technology in predicting occupant behavior. In addition, it explores different methods for processing these data and making future occupancy predictions. Moreover, we highlight some challenges, such as determining the optimal number and location of sensors and radars, and provide a detailed explanation and insights into these challenges. Furthermore, the paper explores possible future directions, including the security of occupants' data and the promotion of energy-efficient practices such as localizing occupants and monitoring their activities within a building. With respect to other survey works on similar topics, our work aims to both cover recent sensory approaches and review methods used in the literature for estimating occupancy.
Affiliation(s)
- Irfanullah Khan
  - ICAR-CNR, Institute for High Performance Computing and Networking, National Research Council of Italy, Via P. Bucci 8/9C, 87036 Rende, Italy
  - DIMES Department, University of Calabria, Via P. Bucci, 87036 Rende, Italy
- Ouarda Zedadra
  - LabSTIC Laboratory, Department of Computer Science, 8 Mai 1945 University, P.O. Box 401, Guelma 24000, Algeria
- Antonio Guerrieri
  - ICAR-CNR, Institute for High Performance Computing and Networking, National Research Council of Italy, Via P. Bucci 8/9C, 87036 Rende, Italy
- Giandomenico Spezzano
  - ICAR-CNR, Institute for High Performance Computing and Networking, National Research Council of Italy, Via P. Bucci 8/9C, 87036 Rende, Italy
43
|
Huang H, Zhou G, Zhao Q, He L, Xie S. Comprehensive Multiview Representation Learning via Deep Autoencoder-Like Nonnegative Matrix Factorization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:5953-5967. [PMID: 37672378 DOI: 10.1109/tnnls.2023.3304626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Learning a comprehensive representation from multiview data is crucial in many real-world applications. Multiview representation learning (MRL) based on nonnegative matrix factorization (NMF) has been widely adopted by projecting high-dimensional space into a lower order dimensional space with great interpretability. However, most prior NMF-based MRL techniques are shallow models that ignore hierarchical information. Although deep matrix factorization (DMF)-based methods have been proposed recently, most of them only focus on the consistency of multiple views and have cumbersome clustering steps. To address the above issues, in this article, we propose a novel model termed deep autoencoder-like NMF for MRL (DANMF-MRL), which obtains the representation matrix through the deep encoding stage and decodes it back to the original data. In this way, through a DANMF-based framework, we can simultaneously consider the multiview consistency and complementarity, allowing for a more comprehensive representation. We further propose a one-step DANMF-MRL, which learns the latent representation and final clustering labels matrix in a unified framework. In this approach, the two steps can negotiate with each other to fully exploit the latent clustering structure, avoid previous tedious clustering steps, and achieve optimal clustering performance. Furthermore, two efficient iterative optimization algorithms are developed to solve the proposed models both with theoretical convergence analysis. Extensive experiments on five benchmark datasets demonstrate the superiority of our approaches against other state-of-the-art MRL methods.
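For readers unfamiliar with the building block behind this model, the sketch below shows plain single-layer NMF with the classical multiplicative updates for the Frobenius objective. It is not the DANMF-MRL model from the abstract (there is no deep encoder, no multiview term, and no clustering step); the function name `nmf`, the rank, and the synthetic matrix are assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a nonnegative matrix V (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates preserve nonnegativity of both factors.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonnegative data with an exact rank-3 structure (illustrative only).
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf(V, rank=3)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Here the columns of `H` play the role of the low-dimensional representation; deep variants stack several such factorizations to capture hierarchical structure.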
44
|
Said A, Göker H. Spectral analysis and Bi-LSTM deep network-based approach in detection of mild cognitive impairment from electroencephalography signals. Cogn Neurodyn 2024; 18:597-614. [PMID: 38699612 PMCID: PMC11061085 DOI: 10.1007/s11571-023-10010-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 09/05/2023] [Accepted: 09/12/2023] [Indexed: 05/05/2024] Open
Abstract
Mild cognitive impairment (MCI) is a neuropsychological syndrome characterized by cognitive impairments. It typically affects adults 60 years of age and older. It is a noticeable decline in the cognitive function of the patient, and if left untreated it progresses to Alzheimer's disease (AD). For that reason, early diagnosis of MCI is important, as it slows the progression of the disease to AD. Early and accurate diagnosis of MCI requires recognition of the clinical characteristics of the disease, extensive testing, and long-term observations. These observations and tests can be subjective, expensive, incomplete, or inaccurate. Electroencephalography (EEG) is a powerful choice for diagnosis because it is non-invasive, based on findings, less costly, and yields results in a short time. In this study, a new EEG-based model is developed that can effectively detect MCI patients with higher accuracy. For this purpose, a dataset of EEG signals recorded from 34 subjects aged 40 to 77, 18 with MCI and 16 controls, was used. The EEG signals were denoised using Multiscale Principal Component Analysis (MSPCA), and Data Augmentation (DA) was performed to increase the size of the dataset. Tenfold cross-validation was used to validate the model. The power spectral density (PSD) of the EEG signals was extracted using three spectral analysis methods: the periodogram, Welch, and multitaper methods. The PSD graphs showed differences between the control and MCI groups, indicating that the signal power of MCI patients is lower than that of controls. To classify the subjects, the bi-directional long short-term memory (Bi-LSTM) network, one of the strongest deep learning classifiers, was used alongside several machine learning algorithms: decision tree (DT), support vector machine (SVM), and k-nearest neighbor (KNN). These algorithms were trained and tested using the feature vectors extracted from the control and MCI groups. Additionally, the values of the coefficient matrix of those algorithms were compared and evaluated with the performance evaluation matrix to determine which one performed best overall. According to the experimental results, the proposed model combining multitaper spectral analysis with the Bi-LSTM algorithm attained the highest number of correctly classified samples for diagnosing MCI patients and achieved higher accuracy than the other proposed models. The deep learning model achieved 98.97% accuracy, 98.34% sensitivity, 99.67% specificity, 99.70% precision, 99.02% F1 score, and 97.94% Matthews correlation coefficient (MCC).
Affiliation(s)
- Afrah Said
  - Department of Electrical Electronics Engineering, Faculty of Simav Technology, Dumlupınar University, 43500 Kütahya, Turkey
- Hanife Göker
  - Health Services Vocational College, Gazi University, 06830 Ankara, Turkey
45
|
Jiang Z, Liu L, Du L, Lv S, Liang F, Luo Y, Wang C, Shen Q. Machine learning for the early prediction of acute respiratory distress syndrome (ARDS) in patients with sepsis in the ICU based on clinical data. Heliyon 2024; 10:e28143. [PMID: 38533071 PMCID: PMC10963609 DOI: 10.1016/j.heliyon.2024.e28143] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/28/2023] [Revised: 02/28/2024] [Accepted: 03/12/2024] [Indexed: 03/28/2024] Open
Abstract
Background: Acute respiratory distress syndrome (ARDS) is a fatal outcome of severe sepsis. Machine learning models are helpful for accurately predicting ARDS in patients with sepsis at an early stage.
Objective: We aim to develop a machine-learning model for predicting ARDS in patients with sepsis in the intensive care unit (ICU).
Methods: The initial clinical data of patients with sepsis admitted to the hospital (including population characteristics, clinical diagnosis, complications, and laboratory tests) were used to predict ARDS and to screen out the crucial variables. Eight algorithms were compared, namely XGBoost, logistic regression, LightGBM, random forest, Gaussian naive Bayes (GaussianNB), complement naive Bayes (ComplementNB), support vector machine (SVM), and K-nearest neighbors (KNN), and the prediction model was rebuilt with the best one. When remodeling with the best algorithm, 10% of the data was randomly selected for testing, and the remainder was used for training with cross-validation. The model's performance was evaluated using the area under the curve (AUC), sensitivity, accuracy, specificity, positive and negative predictive values, F1 score, kappa value, and clinical decision curve. Finally, the application of the model was illustrated using the SHAP package.
Results: Ten critical features were screened using the lasso method, namely PaO2/PAO2, A-aDO2, PO2(T), CRP, gender, PO2, RDW, MCH, SG, and chlorine. The prior ranking of variables demonstrated that PaO2/PAO2 was the most significant variable. Among the eight algorithms, the performance of the GaussianNB algorithm was significantly better than that of the others. After remodeling with the best algorithm, the AUCs in the training and validation sets were 0.777 and 0.770, respectively, and the algorithm performed well in the test set (AUC = 0.781, accuracy = 78.6%, sensitivity = 82.4%, F1 score = 0.824). A comparison of the overlap factors with those of previous models revealed that the model we developed performs better.
Conclusion: Sepsis-associated ARDS can be accurately predicted early via a machine learning model based on existing clinical data. These findings are helpful for accurate identification and improvement of the prognosis in patients with sepsis-associated ARDS.
Affiliation(s)
- Zhenzhen Jiang
  - Department of Blood Transfusion, The Third Xiangya Hospital, Central South University, Changsha, China
- Leping Liu
  - Department of Pediatrics, The Third Xiangya Hospital, Central South University, Changsha, China
- Lin Du
  - Department of Blood Transfusion, The Third Xiangya Hospital, Central South University, Changsha, China
- Shanshan Lv
  - Department of Blood Transfusion, The Third Xiangya Hospital, Central South University, Changsha, China
- Fang Liang
  - Department of Hematology and Critical Care Medicine, The Third Xiangya Hospital, Central South University, Changsha, China
- Yanwei Luo
  - Department of Blood Transfusion, The Third Xiangya Hospital, Central South University, Changsha, China
- Chunjiang Wang
  - Department of Pharmacy, The Third Xiangya Hospital, Central South University, Changsha, China
- Qin Shen
  - Department of Radiology, The Second Xiangya Hospital, Central South University, Changsha, China
46
|
Shi J, Chen X, Xie Y, Zhang H, Sun Y. Delicately Reinforced k-Nearest Neighbor Classifier Combined With Expert Knowledge Applied to Abnormity Forecast in Electrolytic Cell. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:3027-3037. [PMID: 37494170 DOI: 10.1109/tnnls.2023.3280963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2023]
Abstract
As profit and safety requirements become increasingly stringent, it is increasingly necessary to realize an advanced intelligent analysis for abnormity forecast of the synthetical balance of material and energy (AF-SBME) on aluminum reduction cells (ARCs). AF-SBME can, without loss of generality, be treated as a classification problem, so its advanced intelligent analysis can be realized by high-performance data-driven classifiers. However, AF-SBME poses some difficulties, including a high requirement for interpretability of data-driven classifiers, a small number of training samples, and training-sample correctness that decreases over time. In this article, based on a preferable data-driven classifier called the reinforced k-nearest neighbor (R-KNN) classifier, a delicately reinforced KNN combined with expert knowledge (DR-KNN/CE) is proposed. It improves R-KNN in two ways: using expert knowledge as external assistance and enhancing its own ability to mine and synthesize data knowledge. Experiments on AF-SBME, with data sampled directly from practical production, demonstrate that the proposed DR-KNN/CE not only effectively improves on R-KNN but also outperforms other existing high-performance data-driven classifiers.
47
|
Liang JH, Wang SQ, Zhang WF, Guo Y, Zhang Y, Chen F, Zhang L, Yin WB, Xiao LT, Jia ST. Rapid and accurate identification of bacteria utilizing laser-induced breakdown spectroscopy. BIOMEDICAL OPTICS EXPRESS 2024; 15:1878-1891. [PMID: 38495706 PMCID: PMC10942702 DOI: 10.1364/boe.517213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 02/16/2024] [Accepted: 02/20/2024] [Indexed: 03/19/2024]
Abstract
Timely and accurate identification of harmful bacterial species in the environment is paramount for preventing the spread of diseases and ensuring food safety. In this study, laser-induced breakdown spectroscopy technology was utilized, combined with four machine learning methods - KNN, PCA-KNN, RF, and SVM, to conduct classification and identification research on 7 different types of bacteria, adhering to various substrate materials. The experimental results showed that despite the nearly identical elemental composition of these bacteria, differences in the intensity of elemental spectral lines provide crucial information for identification of bacteria. Under conditions of high-purity aluminum substrate, the identification rates of the four modeling methods reached 74.91%, 84.05%, 85.36%, and 96.07%, respectively. In contrast, under graphite substrate conditions, the corresponding identification rates reached 96.87%, 98.11%, 98.93%, and 100%. Graphite is found to be more suitable as a substrate material for bacterial classification, attributed to the fact that more characteristic spectral lines are excited in bacteria under graphite substrate conditions. Additionally, the emission spectral lines of graphite itself are relatively scarce, resulting in less interference with other elemental spectral lines of bacteria. Meanwhile, SVM exhibited the highest precision rate and recall rate, reaching up to 1, making it the most effective classification method in this experiment. This study provides a valuable approach for the rapid and accurate identification of bacterial species based on LIBS, as well as substrate selection, enhancing efficient microbial identification capabilities in fields related to social security and military applications.
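Of the four methods compared above, PCA-KNN is the easiest to illustrate compactly: spectra are projected onto a few principal components and then classified by majority vote among the nearest training spectra. The sketch below is a NumPy-only illustration under stated assumptions, not the authors' pipeline. The helper names, the choice of two components and k = 3, and the synthetic two-class "spectra" are all hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the leading principal directions."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Classify each test row by majority vote among its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for spectra: 2 classes, 50 channels (not real LIBS data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 50)), rng.normal(3.0, 1.0, (30, 50))])
y = np.repeat([0, 1], 30)
order = rng.permutation(len(y))
train, test = order[:40], order[40:]

# Fit PCA on the training spectra only, then project both sets.
mean, components = pca_fit(X[train], n_components=2)
train_Z = pca_transform(X[train], mean, components)
test_Z = pca_transform(X[test], mean, components)
accuracy = (knn_predict(train_Z, y[train], test_Z, k=3) == y[test]).mean()
```

Fitting PCA on the training split alone avoids leaking test information into the projection, which matters when reported identification rates are compared across methods.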
Affiliation(s)
- J. H. Liang
  - State Key Laboratory of Quantum Optics and Quantum Optics Devices, Institute of Laser Spectroscopy, Shanxi University, Taiyuan, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
- S. Q. Wang
  - SINOPEC Research Institute of Petroleum Processing Co., Ltd., Beijing, China
- W. F. Zhang
  - Shanxi Xinhua Chemical Defense Equipment Research Institute Co., Ltd., Taiyuan, China
- Y. Guo
  - Shanxi Xinhua Chemical Defense Equipment Research Institute Co., Ltd., Taiyuan, China
- Y. Zhang
  - School of Optoelectronic Engineering, Xi’an Technological University, Xi’an, China
- F. Chen
  - State Key Laboratory of Quantum Optics and Quantum Optics Devices, Institute of Laser Spectroscopy, Shanxi University, Taiyuan, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
- L. Zhang
  - State Key Laboratory of Quantum Optics and Quantum Optics Devices, Institute of Laser Spectroscopy, Shanxi University, Taiyuan, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
- W. B. Yin
  - State Key Laboratory of Quantum Optics and Quantum Optics Devices, Institute of Laser Spectroscopy, Shanxi University, Taiyuan, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
- L. T. Xiao
  - State Key Laboratory of Quantum Optics and Quantum Optics Devices, Institute of Laser Spectroscopy, Shanxi University, Taiyuan, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
- S. T. Jia
  - State Key Laboratory of Quantum Optics and Quantum Optics Devices, Institute of Laser Spectroscopy, Shanxi University, Taiyuan, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
48
|
Klaar ACR, Seman LO, Mariani VC, Coelho LDS. Random Convolutional Kernel Transform with Empirical Mode Decomposition for Classification of Insulators from Power Grid. SENSORS (BASEL, SWITZERLAND) 2024; 24:1113. [PMID: 38400271 PMCID: PMC10893376 DOI: 10.3390/s24041113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 02/05/2024] [Accepted: 02/06/2024] [Indexed: 02/25/2024]
Abstract
The electrical energy supply relies on the satisfactory operation of insulators. The ultrasound recorded from insulators in different conditions has a time series output, which can be used to classify faulty insulators. The random convolutional kernel transform (Rocket) algorithms use convolutional filters to extract various features from the time series data. This paper proposes a combination of Rocket algorithms, machine learning classifiers, and empirical mode decomposition (EMD) methods, such as complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), empirical wavelet transform (EWT), and variational mode decomposition (VMD). The results show that the EMD methods, combined with MiniRocket, significantly improve the accuracy of logistic regression in insulator fault diagnosis. The proposed strategy achieves an accuracy of 0.992 using CEEMDAN, 0.995 with EWT, and 0.980 with VMD. These results highlight the potential of incorporating EMD methods in insulator failure detection models to enhance the safety and dependability of power systems.
Affiliation(s)
- Laio Oriel Seman
  - Department of Automation and Systems Engineering, Federal University of Santa Catarina, Florianópolis 88040-535, Brazil
- Viviana Cocco Mariani
  - Mechanical Engineering Graduate Program, Pontifical Catholic University of Parana, Curitiba 80215-901, Brazil
  - Department of Electrical Engineering, Federal University of Parana, Curitiba 81530-000, Brazil
- Leandro dos Santos Coelho
  - Department of Electrical Engineering, Federal University of Parana, Curitiba 81530-000, Brazil
  - Industrial and Systems Engineering Graduate Program, Pontifical Catholic University of Parana, Curitiba 80215-901, Brazil
49
|
Xu L, Liang Y, Huang WE, Shang L, Chai L, Zhang X, Shi J, Li B, Wang Y, Xu Z, Lu Z. Rapid detection of six Oceanobacillus species in Daqu starter using single-cell Raman spectroscopy combined with machine learning. Microb Biotechnol 2024; 17:e14416. [PMID: 38381051 PMCID: PMC10880574 DOI: 10.1111/1751-7915.14416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 01/09/2024] [Accepted: 01/19/2024] [Indexed: 02/22/2024] Open
Abstract
Many traditional fermented food and beverage industries around the world require the addition of multi-species starter cultures. However, the microbial community in starter cultures is subject to fluctuations due to their exposure to an open environment during fermentation. A rapid detection approach to identify the microbial composition of starter culture is essential to ensure the quality of the final products. Here, we applied single-cell Raman spectroscopy (SCRS) combined with machine learning to monitor Oceanobacillus species in Daqu starter, which plays crucial roles in the production process of Chinese baijiu. First, a total of six Oceanobacillus species (O. caeni, O. kimchii, O. iheyensis, O. sojae, O. oncorhynchi subsp. oncorhynchi and O. profundus) were detected in 44 Daqu samples by amplicon sequencing and isolated by pure culture. Then, we created a reference database of these Oceanobacillus strains that correlated their taxonomic data with their single-cell Raman spectra. Based on the SCRS dataset, five machine-learning algorithms were used to classify Oceanobacillus strains, among which support vector machine (SVM) showed the highest accuracy. To validate the SVM-based model, we employed a synthetic microbial community composed of varying proportions of Oceanobacillus species and demonstrated a remarkable accuracy, with a mean error of less than 1% between the predicted and expected values. The relative abundance of six different Oceanobacillus species during Daqu fermentation was predicted within 60 min using this method, and the reliability of the method was proved by correlating the Raman spectrum with the amplicon sequencing profiles by partial least squares regression. Our study provides a rapid, non-destructive and label-free approach for identifying Oceanobacillus species in Daqu starter culture, contributing to real-time monitoring of the fermentation process and ensuring high-quality products.
Affiliation(s)
- Lei Xu
  - Key Laboratory of Industrial Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Yuan Liang
  - Key Laboratory of Industrial Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
- Wei E Huang
  - Oxford Suzhou Centre for Advanced Research, Suzhou, China
  - Department of Engineering Science, University of Oxford, Oxford, UK
- Lin‐Dong Shang
  - State Key Laboratory of Applied Optics, Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China
- Li‐Juan Chai
  - National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Xiao‐Juan Zhang
  - National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Jin‐Song Shi
  - School of Life Sciences and Health Engineering, Jiangnan University, Wuxi, China
- Bei Li
  - State Key Laboratory of Applied Optics, Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China
- Yun Wang
  - Oxford Suzhou Centre for Advanced Research, Suzhou, China
- Zheng‐Hong Xu
  - Key Laboratory of Industrial Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
  - National Engineering Research Center of Solid‐State Brewing, Luzhou, China
- Zhen‐Ming Lu
  - Key Laboratory of Industrial Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
  - National Engineering Research Center of Solid‐State Brewing, Luzhou, China
50
|
Tan P, Miles CE. Intrinsic statistical separation of subpopulations in heterogeneous collective motion via dimensionality reduction. Phys Rev E 2024; 109:014403. [PMID: 38366514 DOI: 10.1103/physreve.109.014403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 12/12/2023] [Indexed: 02/18/2024]
Abstract
Collective motion of locally interacting agents is found ubiquitously throughout nature. The inability to probe individuals has driven longstanding interest in the development of methods for inferring the underlying interactions. In the context of heterogeneous collectives, where the population consists of individuals driven by different interactions, existing approaches require some knowledge about the heterogeneities or underlying interactions. Here, we investigate the feasibility of identifying the identities in a heterogeneous collective without such prior knowledge. We numerically explore the behavior of a heterogeneous Vicsek model and find sufficiently long trajectories intrinsically cluster in a principal component analysis-based dimensionally reduced model-agnostic description of the data. We identify how heterogeneities in each parameter in the model (interaction radius, noise, population proportions) dictate this clustering. Finally, we show the generality of this phenomenon by finding similar behavior in a heterogeneous D'Orsogna model. Altogether, our results establish and quantify the intrinsic model-agnostic statistical disentanglement of identities in heterogeneous collectives.
Affiliation(s)
- Pei Tan
  - Mathematical, Computational, and Systems Biology Graduate Program, University of California, Irvine 92697, USA