1. Li D, Mei Q, Li G. scQA: A dual-perspective cell type identification model for single cell transcriptome data. Comput Struct Biotechnol J 2024; 23:520-536. [PMID: 38235363 PMCID: PMC10791572 DOI: 10.1016/j.csbj.2023.12.021] [Received: 10/13/2023] [Revised: 12/16/2023] [Accepted: 12/18/2023] [Indexed: 01/19/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) technologies have been pivotal in advancing the development of algorithms for clustering heterogeneous cell populations. Existing methods that use scRNA-seq data to identify cell types tend to neglect the beneficial impact of dropout events and cluster cells from a solely quantitative perspective. Here, we introduce a novel method named scQA, notable for its ability to concurrently identify cell types and cell type-specific key genes from both qualitative and quantitative perspectives. In contrast to other methods, scQA not only identifies cell types but also extracts key genes associated with these cell types, enabling bidirectional clustering of scRNA-seq data. Through an iterative process, our approach aims to minimize the number of landmarks to approximately a dozen while maximizing the inclusion of quasi-trend-preserved genes with dropouts both qualitatively and quantitatively. It then clusters cells with a label propagation strategy, obviating the requirement for a predetermined number of cell types. Validated on 20 publicly available scRNA-seq datasets, scQA consistently outperforms other salient tools. Furthermore, we confirm the effectiveness and potential biological significance of the identified key genes through both external and internal validation. In conclusion, scQA is a valuable tool for investigating cell heterogeneity owing to its distinctive fusion of qualitative and quantitative facets, along with its bidirectional clustering capabilities. It can also be seamlessly integrated into broader scRNA-seq analyses. The source code is publicly available at https://github.com/LD-Lyndee/scQA.
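The abstract contrasts a qualitative and a quantitative reading of the same expression data. As a minimal sketch of that distinction (not scQA's actual procedure, which the abstract does not detail), the Python below derives a binary detected/zero pattern and a log-transformed matrix from a hypothetical count matrix. The function names, the `log1p` transform, and the toy counts are all assumptions made for illustration.

```python
import math

def qualitative_view(counts):
    """Binary pattern: 1 if the gene was detected in the cell, 0 for a zero count."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]

def quantitative_view(counts):
    """Log-transformed expression, log(1 + count), a common scRNA-seq transform."""
    return [[math.log1p(c) for c in row] for row in counts]

def zero_rate_per_gene(counts):
    """Fraction of cells with a zero count, computed for each gene (column)."""
    n_cells = len(counts)
    n_genes = len(counts[0])
    return [sum(1 for row in counts if row[g] == 0) / n_cells
            for g in range(n_genes)]

# Hypothetical counts: rows are cells, columns are genes (invented values).
counts = [
    [0, 5, 12],
    [0, 7, 0],
    [3, 0, 9],
    [0, 6, 11],
]
binary = qualitative_view(counts)
logged = quantitative_view(counts)
rates = zero_rate_per_gene(counts)
```

Keeping the two views as separate matrices makes it easy to compare what each one says about a gene: the first gene here is zero in three of four cells, a pattern the binary view exposes directly.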
Affiliation(s)
- Di Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Qinglin Mei
  - MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China

2. Lv Z, Wei X, Hu S, Lin G, Qiu W. iSUMO-RsFPN: A predictor for identifying lysine SUMOylation sites based on multi-features and feature pyramid networks. Anal Biochem 2024; 687:115460. [PMID: 38191118 DOI: 10.1016/j.ab.2024.115460] [Received: 09/18/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/10/2024]
Abstract
SUMOylation is a protein post-translational modification that plays an essential role in cellular functions. For predicting SUMO sites, numerous researchers have proposed advanced methods based on ordinary machine learning algorithms. These reported methods have shown excellent predictive performance, but there is room for improvement. In this study, we constructed a novel deep neural network, the Residual Pyramid Network (RsFPN), and developed an ensemble deep learning predictor called iSUMO-RsFPN. Initially, three feature extraction methods were employed to extract features from samples. Following this, weak classifiers were trained based on RsFPN for each feature type. Ultimately, the weak classifiers were integrated to construct the final classifier. Moreover, the predictor underwent systematic testing on an independent test dataset, where the results demonstrated a significant improvement over the existing state-of-the-art predictors. The code of iSUMO-RsFPN is freely available at https://github.com/454170054/iSUMO-RsFPN.
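The abstract says one weak classifier is trained per feature type and the classifiers are then integrated, but it does not say how. One common option is soft voting, sketched below in Python: the positive-class probabilities are averaged and the mean is thresholded. The averaging rule, the 0.5 threshold, and the probability values are assumptions for illustration, not the published integration scheme.

```python
from statistics import mean

def soft_vote(probabilities, threshold=0.5):
    """Average per-classifier positive-class probabilities and threshold the mean."""
    if not probabilities:
        raise ValueError("need at least one classifier output")
    avg = mean(probabilities)
    return avg, avg >= threshold

# Hypothetical outputs of three classifiers, one per feature type,
# for a single candidate lysine site (values invented for illustration).
site_probs = [0.62, 0.48, 0.71]
avg_prob, is_sumo_site = soft_vote(site_probs)
```

Soft voting lets a confident classifier outweigh an uncertain one, which a plain majority vote over hard labels would not do.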
Affiliation(s)
- Zhe Lv
  - School of Mega Data, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
- Xin Wei
  - Business School, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
- Siqin Hu
  - School of Mega Data, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
- Gang Lin
  - School of Mega Data, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
- Wangren Qiu
  - Computer Department, Jingdezhen Ceramic University, 333403, Jingdezhen, Jiangxi, China

3. Ma D, Li C, Du T, Qiao L, Tang D, Ma Z, Shi L, Lu G, Meng Q, Chen Z, Grzegorzek M, Sun H. PHE-SICH-CT-IDS: A benchmark CT image dataset for evaluation semantic segmentation, object detection and radiomic feature extraction of perihematomal edema in spontaneous intracerebral hemorrhage. Comput Biol Med 2024; 173:108342. [PMID: 38522249 DOI: 10.1016/j.compbiomed.2024.108342] [Received: 11/13/2023] [Revised: 03/05/2024] [Accepted: 03/17/2024] [Indexed: 03/26/2024]
Abstract
BACKGROUND AND OBJECTIVE Intracerebral hemorrhage is one of the diseases with the highest mortality and poorest prognosis worldwide. Spontaneous intracerebral hemorrhage (SICH) typically presents acutely, so prompt radiological examination is crucial for diagnosis, localization, and quantification of the hemorrhage. Early detection and accurate segmentation of perihematomal edema (PHE) play a critical role in guiding appropriate clinical intervention and enhancing patient prognosis. However, the development and assessment of computer-aided diagnostic methods for PHE segmentation and detection are hampered by the scarcity of publicly accessible brain CT image datasets.
METHODS This study establishes a publicly available CT dataset named PHE-SICH-CT-IDS for perihematomal edema in spontaneous intracerebral hemorrhage. The dataset comprises 120 brain CT scans and 7,022 CT images, along with corresponding medical information of the patients. To demonstrate its effectiveness, classical algorithms for semantic segmentation, object detection, and radiomic feature extraction are evaluated. The experimental results confirm the suitability of PHE-SICH-CT-IDS for assessing the performance of segmentation, detection and radiomic feature extraction methods.
RESULTS This study conducts numerous experiments using classical machine learning and deep learning methods, demonstrating the differences between various segmentation and detection methods on PHE-SICH-CT-IDS. The highest precision achieved in semantic segmentation is 76.31%, while object detection attains a maximum precision of 97.62%. The experimental results on radiomic feature extraction and analysis confirm the suitability of PHE-SICH-CT-IDS for evaluating image features and highlight the predictive value of these features for the prognosis of SICH patients.
CONCLUSION To the best of our knowledge, this is the first publicly available dataset for PHE in SICH, comprising various data formats suitable for applications across diverse medical scenarios. We believe that PHE-SICH-CT-IDS will attract researchers to explore novel algorithms, providing valuable support for clinicians and patients in the clinical setting. PHE-SICH-CT-IDS is freely published for non-commercial purposes at https://figshare.com/articles/dataset/PHE-SICH-CT-IDS/23957937.
Affiliation(s)
- Deguo Ma
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Chen Li
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Tianming Du
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Lin Qiao
  - Shengjing Hospital, China Medical University, Shenyang, China
- Dechao Tang
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Zhiyu Ma
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Liyu Shi
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Guotao Lu
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Qingtao Meng
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Zhihao Chen
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China
- Marcin Grzegorzek
  - Institute of Medical Informatics, University of Luebeck, Luebeck, Germany
- Hongzan Sun
  - Shengjing Hospital, China Medical University, Shenyang, China

4. Yazdi M, Samaee M, Massicotte D. A Review on Automated Sleep Study. Ann Biomed Eng 2024:10.1007/s10439-024-03486-0. [PMID: 38493234 DOI: 10.1007/s10439-024-03486-0] [Received: 09/07/2023] [Accepted: 02/25/2024] [Indexed: 03/18/2024]
Abstract
In recent years, research on automated sleep analysis has witnessed significant growth, reflecting advancements in understanding sleep patterns and their impact on overall health. This review synthesizes findings from an exhaustive analysis of 87 papers, systematically retrieved from prominent databases such as Google Scholar, PubMed, IEEE Xplore, and ScienceDirect. The selection criteria prioritized studies focusing on methods employed, signal modalities utilized, and machine learning algorithms applied in automated sleep analysis. The overarching goal was to critically evaluate the strengths and weaknesses of the proposed methods, shedding light on the current landscape and future directions in sleep research. An in-depth exploration of the reviewed literature revealed a diverse range of methodologies and machine learning approaches employed in automated sleep studies. Notably, K-Nearest Neighbors (KNN), Ensemble Learning Methods, and Support Vector Machine (SVM) emerged as versatile and potent classifiers, exhibiting high accuracies in various applications. However, challenges such as performance variability and computational demands were observed, necessitating judicious classifier selection based on dataset intricacies. In addition, the integration of traditional feature extraction methods with deep structures and the combination of different deep neural networks were identified as promising strategies to enhance diagnostic accuracy in sleep-related studies. The reviewed literature emphasized the need for adaptive classifiers, cross-modality integration, and collaborative efforts to drive the field toward more accurate, robust, and accessible sleep-related diagnostic solutions. This comprehensive review serves as a solid foundation for researchers and practitioners, providing an organized synthesis of the current state of knowledge in automated sleep analysis. 
By highlighting the strengths and challenges of various methodologies, this review aims to guide future research toward more effective and nuanced approaches to sleep diagnostics.
Affiliation(s)
- Mehran Yazdi
  - Laboratory of Signal and System Integration, Department of Electrical and Computer Engineering, Université du Québec à Trois-Rivières, Trois-Rivières, Canada
  - Signal and Image Processing Laboratory, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
- Mahdi Samaee
  - Signal and Image Processing Laboratory, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
- Daniel Massicotte
  - Laboratory of Signal and System Integration, Department of Electrical and Computer Engineering, Université du Québec à Trois-Rivières, Trois-Rivières, Canada

5. Karimi-Rouzbahani H, McGonigal A. Generalisability of epileptiform patterns across time and patients. Sci Rep 2024; 14:6293. [PMID: 38491096 PMCID: PMC10942983 DOI: 10.1038/s41598-024-56990-7] [Received: 01/20/2024] [Accepted: 03/13/2024] [Indexed: 03/18/2024]
Abstract
The complexity of localising the epileptogenic zone (EZ) contributes to surgical resection failures in achieving seizure freedom. The distinct patterns of epileptiform activity during interictal and ictal phases, varying across patients, often lead to suboptimal localisation using electroencephalography (EEG) features. We posed two key questions: whether neural signals reflecting epileptogenicity generalise from interictal to ictal time windows within each patient, and whether epileptiform patterns generalise across patients. Utilising an intracranial EEG dataset from 55 patients, we extracted a large battery of simple to complex features from stereo-EEG (SEEG) and electrocorticographic (ECoG) neural signals during interictal and ictal windows. Our features (n = 34) quantified many aspects of the signals, including statistical moments, complexities, frequency-domain and cross-channel network attributes. Decision tree classifiers were then trained and tested on distinct time windows and patients to evaluate the generalisability of epileptogenic patterns across time and patients, respectively. Evidence strongly supported generalisability from interictal to ictal time windows across patients, particularly in signal power and high-frequency network-based features. Consistent patterns of epileptogenicity were observed across time windows within most patients, and signal features of epileptogenic regions generalised across patients, with higher generalisability in the ictal window. Signal complexity features contributed particularly to generalisation across patients. These findings offer insights into generalisable features of epileptic neural activity across time and patients, with implications for future automated approaches to supplement other EZ localisation methods.
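Training and testing on distinct patients is often organised as leave-one-patient-out cross-validation. The abstract does not state the exact splitting scheme, so the Python below is only a sketch of that common approach; the function name and the toy patient labels are hypothetical.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) for each patient."""
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test

# Hypothetical patient label for each recording contact (not the study's data).
contacts = ["P1", "P1", "P2", "P3", "P3", "P3"]
splits = list(leave_one_patient_out(contacts))
```

Splitting by patient rather than by sample keeps every contact from the held-out patient out of training, so the test score reflects generalisation to an unseen patient.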
Affiliation(s)
- Hamid Karimi-Rouzbahani
  - Neurosciences Centre, Mater Hospital, South Brisbane, 4101, Australia
  - Mater Research Institute, University of Queensland, South Brisbane, 4101, Australia
  - Queensland Brain Institute, University of Queensland, St Lucia, 4072, Australia
- Aileen McGonigal
  - Neurosciences Centre, Mater Hospital, South Brisbane, 4101, Australia
  - Mater Research Institute, University of Queensland, South Brisbane, 4101, Australia
  - Queensland Brain Institute, University of Queensland, St Lucia, 4072, Australia

6. Abubakar H, Al-Turjman F, Ameen ZS, Mubarak AS, Altrjman C. A hybridized feature extraction for COVID-19 multi-class classification on computed tomography images. Heliyon 2024; 10:e26939. [PMID: 38463848 PMCID: PMC10920381 DOI: 10.1016/j.heliyon.2024.e26939] [Received: 11/12/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 03/12/2024]
Abstract
COVID-19 has killed more than 5 million individuals worldwide within a short time. It is caused by SARS-CoV-2, which continuously mutates and produces new, more transmissible strains. It is therefore of great significance to diagnose COVID-19 early to curb its spread and reduce the death rate. Owing to the COVID-19 pandemic, traditional diagnostic methods such as reverse-transcription polymerase chain reaction (RT-PCR) are ineffective for diagnosis. Medical imaging, combined with machine learning and deep learning, is among the most effective techniques for detecting respiratory disorders. However, conventional machine learning methods depend on extracted and engineered features, whereby the optimum features influence the classifier's performance. In this study, Histogram of Oriented Gradient (HOG) and eight deep learning models were utilized for feature extraction, while K-Nearest Neighbour (KNN) and Support Vector Machines (SVM) were used for classification. A combined feature of HOG and deep learning features was proposed to improve the performance of the classifiers. VGG-16 + HOG achieved 99.4% overall accuracy with SVM. This indicates that our proposed concatenated feature can enhance the SVM classifier's performance in COVID-19 detection.
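The combined feature described above amounts to feature-level fusion: the HOG descriptor and the deep feature vector are concatenated before classification. The Python sketch below illustrates the idea with a small K-Nearest Neighbour classifier. The vectors are invented two-element stand-ins for real HOG and VGG-16 features, the class names are hypothetical, and `fuse` and `knn_predict` are illustrative helpers rather than the authors' code.

```python
import math
from collections import Counter

def fuse(hog_features, deep_features):
    """Feature-level fusion: concatenate the two descriptors into one vector."""
    return list(hog_features) + list(deep_features)

def knn_predict(train_vectors, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    nearest = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical (HOG, deep) feature pairs with invented values and labels.
train_vectors = [
    fuse([0.1, 0.2], [0.9, 0.8]),
    fuse([0.2, 0.1], [0.8, 0.9]),
    fuse([0.9, 0.8], [0.1, 0.2]),
    fuse([0.8, 0.9], [0.2, 0.1]),
]
train_labels = ["covid", "covid", "normal", "normal"]

query = fuse([0.15, 0.15], [0.85, 0.85])
prediction = knn_predict(train_vectors, train_labels, query, k=3)
```

In practice the two descriptors usually differ in scale and length, so they are typically normalised before concatenation; that step is omitted here for brevity.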
Affiliation(s)
- Hassana Abubakar
  - Biomedical Engineering Department, Faculty of Engineering, Near East University, Mersin 10, Turkey
- Fadi Al-Turjman
  - Artificial Intelligence Engineering Department, AI and Robotics Institute, Near East University, Mersin 10, Turkey
  - Research Center for AI and IoT, Faculty of Engineering, University of Kyrenia, Mersin 10, Turkey
- Zubaida S. Ameen
  - Operational Research Center in Healthcare, Near East University, Mersin 10, Turkey
- Auwalu S. Mubarak
  - Operational Research Center in Healthcare, Near East University, Mersin 10, Turkey
- Chadi Altrjman
  - Waterloo University, 200 University Avenue West, Waterloo, ON, Canada

7. Song H, Li G, Xiong X, Li M, Qin Q, Mitrouchev P. A novel data fusion based intelligent identification approach for working cycle stages of hydraulic excavators. ISA Trans 2024:S0019-0578(24)00110-1. [PMID: 38508952 DOI: 10.1016/j.isatra.2024.03.006] [Received: 07/14/2023] [Revised: 03/05/2024] [Accepted: 03/06/2024] [Indexed: 03/22/2024]
Abstract
Accurately identifying the stage of the excavator working cycle is a prerequisite for staged energy-saving control. However, current identification methods often overlook the influence of hydraulic system latency on identification results and depend on a single model, resulting in poor generalization performance. Moreover, an expert calibration system remains necessary for improving identification accuracy. To address these issues, a hybrid multi-scale feature extractor and decision-level data fusion classifier approach (HMSFE-DFC) is proposed to identify the working cycle stages of excavators. The input combines the main pump pressure signal with the control current of the proportional solenoid valve to reduce the response delay caused by using the main pump pressure signal alone. A hybrid multi-scale feature extractor is constructed using a convolutional neural network temporal self-attention feature extraction mechanism and a one-dimensional ResNet-50 architecture to extract multiscale features. To prevent overfitting, a decision-level data fusion classifier is used to fuse the decision information of numerous classifiers. The accuracy of stage identification for 10 consecutive working cycles reaches 95.21%, which verifies the effectiveness of the approach.
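Decision-level fusion combines the outputs of several classifiers rather than their input features. The abstract does not specify the fusion rule, so the Python below sketches the simplest variant, a majority vote over hard labels. The stage names and classifier outputs are hypothetical and chosen only for illustration.

```python
from collections import Counter

def fuse_decisions(decisions):
    """Return the label predicted by most classifiers (ties: first seen wins)."""
    if not decisions:
        raise ValueError("need at least one decision")
    return Counter(decisions).most_common(1)[0][0]

# Hypothetical stage labels from three classifiers for three time steps.
per_step_decisions = [
    ["dig", "dig", "lift"],
    ["lift", "lift", "lift"],
    ["dump", "lift", "dump"],
]
fused = [fuse_decisions(step) for step in per_step_decisions]
```

Because a single classifier's mistake is outvoted when the others agree, this kind of fusion tends to be less sensitive to the quirks of any one model.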
Affiliation(s)
- Haoju Song
  - Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
- Guiqin Li
  - Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
- Xin Xiong
  - Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
- Ming Li
  - Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
- Qiang Qin
  - XCMG Excavator Machinery Business Department, Xuzhou Construction Machinery Group Co Ltd, Xuzhou 221004, China

8. Pham TD, Holmes SB, Zou L, Patel M, Coulthard P. Diagnosis of pathological speech with streamlined features for long short-term memory learning. Comput Biol Med 2024; 170:107976. [PMID: 38219647 DOI: 10.1016/j.compbiomed.2024.107976] [Received: 09/04/2023] [Revised: 11/14/2023] [Accepted: 01/04/2024] [Indexed: 01/16/2024]
Abstract
BACKGROUND Pathological speech diagnosis is crucial for identifying and treating various speech disorders. Accurate diagnosis aids in developing targeted intervention strategies, improving patients' communication abilities, and enhancing their overall quality of life. With the rising global incidence of speech-related conditions, including those linked to oral health, the need for efficient and reliable diagnostic tools has become paramount, emphasizing the significance of advanced research in this field.
METHODS This paper introduces novel features for deep learning in the analysis of short voice signals. It proposes the incorporation of time-space and time-frequency features to accurately discern between two distinct groups: individuals exhibiting normal vocal patterns and those manifesting pathological voice conditions. These advancements aim to enhance the precision and reliability of diagnostic procedures, paving the way for more targeted treatment approaches.
RESULTS Utilizing a publicly available voice database, this study carried out training and validation using long short-term memory (LSTM) networks learning on the combined features, along with a data balancing strategy. The proposed approach yielded promising performance metrics: 90% accuracy, 93% sensitivity, 87% specificity, 88% precision, an F1 score of 0.90, and an area under the receiver operating characteristic curve of 0.96. The results surpassed those obtained by the networks trained using wavelet-time scattering coefficients, as well as several algorithms trained with alternative feature types.
CONCLUSIONS The incorporation of time-frequency and time-space features extracted from short segments of voice signals for LSTM learning demonstrates significant promise as an AI tool for the diagnosis of speech pathology. The proposed approach has the potential to enhance accuracy and allow for real-time pathological speech assessment, thereby facilitating more targeted and effective therapeutic interventions.
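The reported metrics are all functions of a binary confusion matrix. The Python sketch below shows how they relate, using hypothetical counts (100 pathological and 100 normal samples) chosen only because they reproduce the rounded figures in the abstract. They are not the study's actual confusion matrix, and the area under the ROC curve cannot be derived from these counts.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts, assuming a balanced set of 100 samples per class.
metrics = binary_metrics(tp=93, fn=7, tn=87, fp=13)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With these counts, precision is 93/106 and F1 is 186/206, which round to 0.88 and 0.90 respectively, matching the abstract's figures.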
Affiliation(s)
- Tuan D Pham
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Simon B Holmes
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Lifong Zou
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Mangala Patel
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Paul Coulthard
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK

9. Avola D, Cannistraci I, Cascio M, Cinque L, Fagioli A, Foresti GL, Rodolà E, Solito L. MV-MS-FETE: Multi-view multi-scale feature extractor and transformer encoder for stenosis recognition in echocardiograms. Comput Methods Programs Biomed 2024; 245:108037. [PMID: 38271793 DOI: 10.1016/j.cmpb.2024.108037] [Received: 02/13/2023] [Revised: 12/27/2023] [Accepted: 01/15/2024] [Indexed: 01/27/2024]
Abstract
BACKGROUND Aortic stenosis is a common heart valve disease that mainly affects older people in developed countries. Its early detection is crucial to prevent the irreversible disease progression and, eventually, death. A typical screening technique to detect stenosis uses echocardiograms; however, variations introduced by other tissues, camera movements, and uneven lighting can hamper the visual inspection, leading to misdiagnosis. To address these issues, effective solutions involve employing deep learning algorithms to assist clinicians in detecting and classifying stenosis by developing models that can predict this pathology from single heart views. Although promising, the visual information conveyed by a single image may not be sufficient for an accurate diagnosis, especially when using an automatic system; thus, this indicates that different solutions should be explored.
METHODOLOGY Following this rationale, this paper proposes a novel deep learning architecture, composed of a multi-view, multi-scale feature extractor, and a transformer encoder (MV-MS-FETE) to predict stenosis from parasternal long and short-axis views. In particular, starting from the latter, the designed model extracts relevant features at multiple scales along its feature extractor component and takes advantage of a transformer encoder to perform the final classification.
RESULTS Experiments were performed on the recently released Tufts medical echocardiogram public dataset, which comprises 27,788 images split into training, validation, and test sets. Due to the recent release of this collection, tests were also conducted on several state-of-the-art models to create multi-view and single-view benchmarks. For all models, standard classification metrics were computed (e.g., precision, F1-score). The obtained results show that the proposed approach outperforms other multi-view methods in terms of accuracy and F1-score and has more stable performance throughout the training procedure. Furthermore, the experiments also highlight that multi-view methods generally perform better than their single-view counterparts.
CONCLUSION This paper introduces a novel multi-view and multi-scale model for aortic stenosis recognition, as well as three benchmarks to evaluate it, effectively providing multi-view and single-view comparisons that fully highlight the model's effectiveness in aiding clinicians in performing diagnoses while also producing several baselines for the aortic stenosis recognition task.
Affiliation(s)
- Danilo Avola
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy
- Irene Cannistraci
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy
- Marco Cascio
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy
- Luigi Cinque
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy
- Alessio Fagioli
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy
- Gian Luca Foresti
  - Department of Mathematics, Computer Science and Physics, University of Udine, 33100 Udine, Italy
- Emanuele Rodolà
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy
- Luciana Solito
  - Department of Computer Science, Sapienza University, Via Salaria 113, 00185, Rome, Italy

10. Bal U, Bal A, Moral ÖT, Düzgün F, Gürbüz N. A deep learning feature extraction-based hybrid approach for detecting pediatric pneumonia in chest X-ray images. Phys Eng Sci Med 2024; 47:109-117. [PMID: 37991696 DOI: 10.1007/s13246-023-01347-z] [Received: 12/29/2021] [Accepted: 10/12/2023] [Indexed: 11/23/2023]
Abstract
Pneumonia is a disease caused by bacteria, viruses, and fungi that settle in the alveolar sacs of the lungs and can lead to serious health complications in humans. Early detection of pneumonia is necessary for early treatment to manage and cure the disease. Recently, machine learning-based pneumonia detection methods have focused on pneumonia in adults. Machine learning relies on manual feature engineering, whereas deep learning can automatically detect and extract features from data. This study proposes a deep learning feature extraction-based hybrid approach that combines deep learning and machine learning to detect pediatric pneumonia, which is difficult to standardize. The proposed hybrid approach enhances the accuracy of detecting pediatric pneumonia and simplifies the approach by eliminating the requirement for advanced feature extraction. The experiments indicate that the hybrid approach using a Medium Neural Network based on AlexNet feature extraction achieved a 97.9% accuracy rate and 98.0% sensitivity rate. The results show that the proposed approach achieved higher accuracy rates than state-of-the-art approaches.
Affiliation(s)
- Ufuk Bal
  - Osmaniye Korkut Ata University Electrical and Electronics Engineering Department, Osmaniye, Turkey
- Alkan Bal
  - Manisa Celal Bayar University Pediatrics Department, Manisa, Turkey
- Özge Taylan Moral
  - Vocational School of Technical Sciences, Electronics Technology, Istanbul University-Cerrahpasa, Istanbul, Turkey
- Fatih Düzgün
  - Department of Radiology, Manisa Celal Bayar University, Manisa, Turkey
- Nida Gürbüz
  - Manisa Celal Bayar University Pediatrics Department, Manisa, Turkey

11. Wen W, Zhang H, Wang Z, Gao X, Wu P, Lin J, Zeng N. Enhanced multi-label cardiology diagnosis with channel-wise recurrent fusion. Comput Biol Med 2024; 171:108210. [PMID: 38417383 DOI: 10.1016/j.compbiomed.2024.108210] [Received: 01/08/2024] [Revised: 02/08/2024] [Accepted: 02/25/2024] [Indexed: 03/01/2024]
Abstract
The timely detection of abnormal electrocardiogram (ECG) signals is vital for preventing heart disease. However, traditional automated cardiology diagnostic methods cannot simultaneously identify multiple diseases in a segment of ECG signals, and do not consider the potential correlations between the 12-lead ECG signals. To address these issues, this paper presents a novel network architecture, denoted as Branched Convolution and Channel Fusion Network (BCCF-Net), designed for the multi-label diagnosis of ECG cardiology to achieve simultaneous identification of multiple diseases. BCCF-Net incorporates the Channel-wise Recurrent Fusion (CRF) network, which is designed to enhance the ability to explore potential correlation information between the 12 leads. Furthermore, the squeeze and excitation (SE) attention mechanism is used to maximize the potential of the convolutional neural network (CNN). To efficiently capture complex patterns in space and time across various scales, a multi-branch convolution (MBC) module has been developed. Through extensive experiments on two public datasets with seven subtasks, the efficacy and robustness of the proposed ECG multi-label classification framework have been comprehensively evaluated. The results demonstrate the superior performance of BCCF-Net compared to other state-of-the-art algorithms. The developed framework holds practical application in clinical settings, allowing for the refined diagnosis of cardiac arrhythmias through ECG signal analysis.
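Multi-label diagnosis differs from ordinary multi-class classification in that each label receives its own independent probability, so several conditions can be flagged for the same ECG segment. The Python below is a minimal sketch of that output stage under the usual sigmoid-and-threshold assumption; it is not BCCF-Net itself, and the label names, logits, and 0.5 threshold are hypothetical.

```python
import math

def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, label_names, threshold=0.5):
    """Return every label whose independent sigmoid probability meets the threshold."""
    return [name for name, z in zip(label_names, logits)
            if sigmoid(z) >= threshold]

# Hypothetical label set and network outputs for one ECG segment.
label_names = ["AF", "LBBB", "RBBB", "PVC"]
logits = [2.1, -1.3, 0.4, -0.2]
predicted = predict_labels(logits, label_names)
```

A softmax output would force the labels to compete for a single winner, which is why independent sigmoids are the usual choice when diseases can co-occur.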
Affiliation(s)
- Weimin Wen
  - School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Hongyi Zhang
  - School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Zidong Wang
  - Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK
- Xingen Gao
  - School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Peishu Wu
  - Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Juqiang Lin
  - School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
- Nianyin Zeng
  - Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
|
12
|
Sabry AH, I. Dallal Bashi O, Nik Ali N, Mahmood Al Kubaisi Y. Lung disease recognition methods using audio-based analysis with machine learning. Heliyon 2024; 10:e26218. [PMID: 38420389 PMCID: PMC10900411 DOI: 10.1016/j.heliyon.2024.e26218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 12/11/2023] [Accepted: 02/08/2024] [Indexed: 03/02/2024] Open
Abstract
Computer-based automated approaches and improvements in lung sound recording techniques have made lung sound-based diagnostics better and less prone to subjectivity errors. Computer-based lung sound analysis makes it possible to evaluate lung sound features more thoroughly by analyzing changes in lung sound behavior, recording measurements, suppressing noise contamination, and producing graphical representations. This paper starts with a discussion of the need for this research area, providing an overview of the field and the motivations behind it. Following that, it details the survey methodology used in this work. It presents a discussion on the elements of sound-based lung disease classification using machine learning algorithms. This includes datasets commonly considered in prior work, feature extraction techniques, pre-processing methods, artifact removal methods, lung-heart sound separation, deep learning algorithms, and wavelet transform of lung audio signals. The study introduces studies that review lung screening, including a summary table of these references, and discusses the gaps in the existing literature. It is concluded that the use of sound-based machine learning in the classification of respiratory diseases has promising results. While we believe this material will prove valuable to physicians and researchers exploring sound-signal-based machine learning, large-scale investigations remain essential to solidify the findings and foster wider adoption within the medical community.
Affiliation(s)
- Ahmad H. Sabry
  - Department of Medical Instrumentation Engineering Techniques, Shatt Al-Arab University College, Basra, Iraq
- Omar I. Dallal Bashi
  - Medical Technical Institute, Northern Technical University, 95G2+P34, Mosul, 41002, Iraq
- N.H. Nik Ali
  - School of Electrical Engineering, College of Engineering, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
- Yasir Mahmood Al Kubaisi
  - Department of Sustainability Management, Dubai Academic Health Corporation, Dubai, 4545, United Arab Emirates
|
13
|
Han X, Li R, Wang B, Lin Z. Defect identification of bare printed circuit boards based on Bayesian fusion of multi-scale features. PeerJ Comput Sci 2024; 10:e1900. [PMID: 38435627 PMCID: PMC10909203 DOI: 10.7717/peerj-cs.1900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 01/31/2024] [Indexed: 03/05/2024]
Abstract
The aim of this article is to propose a defect identification method for bare printed circuit boards (PCB) based on multi-feature fusion. This article establishes a description method for the grayscale, texture, and deep semantic features of bare PCB images. First, the multi-scale directional projection feature, the multi-scale gray-level co-occurrence matrix feature, and the multi-scale gradient directional information entropy feature of the PCB were extracted to build the shallow features of defect images. Then, based on transfer learning, the feature extraction network of the pre-trained Visual Geometry Group 16 (VGG-16) convolutional neural network model was used to extract the deep semantic feature of the bare PCB images. A multi-feature fusion method based on principal component analysis and Bayesian theory was established. The shallow image feature was then fused with the deep semantic feature, which improved the ability of the feature vectors to characterize defects. Finally, the feature vectors were input as feature sequences to support vector machines for training, which completed the classification and recognition of bare PCB defects. Experimental results show that the algorithm integrating deep features and multi-scale shallow features had a high recognition rate for bare PCB defects, with an accuracy rate of over 99%.
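One of the shallow texture features mentioned above, the gray-level co-occurrence matrix (GLCM), can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' pipeline: the names `glcm` and `contrast`, the toy 4-level image, and the use of the pixel offset distance as the "scale" are all hypothetical choices.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Count how often gray level i is followed by gray level j at offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(matrix):
    """Contrast statistic: squared gray-level difference weighted by pair frequency."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
    )
    return weighted / total


if __name__ == "__main__":
    # Toy image already quantized to 4 gray levels (illustrative only).
    toy_image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    # Assumption: "multi-scale" is approximated here by varying the offset distance.
    for distance in (1, 2):
        matrix = glcm(toy_image, dx=distance)
        print(distance, matrix, round(contrast(matrix), 4))
```

A feature vector for a classifier would typically combine several such statistics over several offsets and directions; only contrast is shown here to keep the sketch short.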
Affiliation(s)
- Xixi Han
  - School of Electronic Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
- Renpeng Li
  - Anyang Iron and Steel Automation Software Co., Ltd, Zhengzhou, Henan, China
- Boqin Wang
  - School of Electronic Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
- Zhibo Lin
  - School of Electronic Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
|
14
|
Shafik W, Tufail A, De Silva Liyanage C, Apong RAAHM. Using transfer learning-based plant disease classification and detection for sustainable agriculture. BMC Plant Biol 2024; 24:136. [PMID: 38408925 PMCID: PMC10895770 DOI: 10.1186/s12870-024-04825-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 02/15/2024] [Indexed: 02/28/2024]
Abstract
Subsistence farmers and global food security depend on sufficient food production, which aligns with the UN's "Zero Hunger," "Climate Action," and "Responsible Consumption and Production" sustainable development goals. Available methods for early disease detection and classification face overfitting and fine-feature extraction complexities during training, and how early signs of green attacks can be identified or classified remains uncertain. Most pest and disease symptoms are seen in plant leaves and fruits, yet their diagnosis by experts in the laboratory is expensive, tedious, labor-intensive, and time-consuming. Notably, how plant pests and diseases can be appropriately detected and prevented in time, a hot topic in smart, sustainable agriculture, remains unknown. In recent years, deep transfer learning has demonstrated tremendous advances in the recognition accuracy of object detection and image classification systems, since these frameworks utilize previously acquired knowledge to solve similar problems more effectively and quickly. Therefore, in this research, we introduce two plant disease detection (PDDNet) models, early fusion (AE) and the lead voting ensemble (LVE), integrated with nine pre-trained convolutional neural networks (CNNs) and fine-tuned by deep feature extraction for efficient plant disease identification and classification. The experiments were carried out on 15 classes of the popular PlantVillage dataset, which has 54,305 image samples of different plant disease species in 38 categories. Hyperparameter fine-tuning was done with popular pre-trained models, including DenseNet201, ResNet101, ResNet50, GoogleNet, AlexNet, ResNet18, EfficientNetB7, NASNetMobile, and ConvNeXtSmall. We test these CNNs on the stated plant disease detection and classification problem, both independently and as part of an ensemble.
In the final phase, a logistic regression (LR) classifier is utilized to determine the performance of various CNN model combinations. A comparative analysis was also performed on classifiers, deep learning, the proposed model, and similar state-of-the-art studies. The experiments demonstrated that PDDNet-AE and PDDNet-LVE achieved 96.74% and 97.79%, respectively, compared to current CNNs when tested on several plant diseases, demonstrating their robustness and generalization capabilities and mitigating current concerns in plant disease detection and classification.
Affiliation(s)
- Wasswa Shafik
  - School of Digital Science, Universiti Brunei Darussalam, Tungku Link, Gadong, BE1410, Brunei
- Ali Tufail
  - School of Digital Science, Universiti Brunei Darussalam, Tungku Link, Gadong, BE1410, Brunei
|
15
|
Chowa SS, Azam S, Montaha S, Bhuiyan MRI, Jonkman M. Improving the Automated Diagnosis of Breast Cancer with Mesh Reconstruction of Ultrasound Images Incorporating 3D Mesh Features and a Graph Attention Network. J Imaging Inform Med 2024:10.1007/s10278-024-00983-5. [PMID: 38361007 DOI: 10.1007/s10278-024-00983-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 11/17/2023] [Accepted: 12/11/2023] [Indexed: 02/17/2024]
Abstract
This study proposes a novel approach for classifying breast tumors in ultrasound images as benign or malignant by converting the region of interest (ROI) of a 2D ultrasound image into a 3D representation using the point-e system, allowing for in-depth analysis of underlying characteristics. Instead of relying solely on 2D imaging features, this method extracts 3D mesh features that describe tumor patterns more precisely. Ten informative and medically relevant mesh features are extracted and assessed with two feature selection techniques. Additionally, a feature pattern analysis has been conducted to determine each feature's significance. A feature table with dimensions of 445 × 12 is generated and a graph is constructed, considering the rows as nodes and the relationships among the nodes as edges. The Spearman correlation coefficient method is employed to identify edges between the strongly connected nodes (with a correlation score greater than or equal to 0.7), resulting in a graph containing 56,054 edges and 445 nodes. A graph attention network (GAT) is proposed for the classification task and the model is optimized with an ablation study, resulting in the highest accuracy of 99.34%. The performance of the proposed model is compared with ten machine learning (ML) models and a one-dimensional convolutional neural network, whose test accuracies range from 73 to 91%. Our novel 3D mesh-based approach, coupled with the GAT, yields promising performance for breast tumor classification, outperforming traditional models, and has the potential to reduce the time and effort of radiologists by providing a reliable diagnostic system.
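The graph construction step described in this abstract (rows as nodes, an edge wherever the Spearman correlation between two rows is at least 0.7) can be sketched as below. This is a minimal illustration, not the authors' code: the function names, the toy 4 × 5 feature table, and the choice to treat edges as undirected pairs are assumptions.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def build_edges(rows, threshold=0.7):
    """Connect every pair of rows whose Spearman correlation meets the threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if spearman(rows[i], rows[j]) >= threshold
    ]


if __name__ == "__main__":
    # Toy feature table: 4 samples (nodes) x 5 features (illustrative values).
    table = [
        [1.0, 2.0, 3.0, 4.0, 5.0],
        [2.0, 4.0, 5.0, 8.0, 9.0],
        [5.0, 4.0, 3.0, 2.0, 1.0],
        [1.0, 3.0, 2.0, 5.0, 4.0],
    ]
    print(build_edges(table))
```

The resulting edge list is the kind of input a graph neural network library would expect alongside the node feature table; the attention network itself is outside the scope of this sketch.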
Affiliation(s)
- Sadia Sultana Chowa
  - Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
- Sami Azam
  - Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
- Sidratul Montaha
  - Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
- Md Rahad Islam Bhuiyan
  - Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
- Mirjam Jonkman
  - Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
|
16
|
Gao C, Fan Q, Zhao P, Sun C, Dang R, Feng Y, Hu B, Wang Q. Spectral encoder to extract the efficient features of Raman spectra for reliable and precise quantitative analysis. Spectrochim Acta A Mol Biomol Spectrosc 2024; 312:124036. [PMID: 38367343 DOI: 10.1016/j.saa.2024.124036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 02/04/2024] [Accepted: 02/10/2024] [Indexed: 02/19/2024]
Abstract
Raman spectroscopy has become a powerful analytical tool in high demand for applications such as microorganism sample analysis, food quality control, environmental science, and pharmaceutical analysis, owing to its non-invasiveness, simplicity, rapidity and ease of use. Among these applications, quantitative research using Raman spectroscopy is a crucial field of spectral analysis. However, the entire process of quantitative modeling largely relies on the extraction of effective spectral features, particularly for measurements on complex samples or in environments with poor spectral signal quality. In this paper, we propose a method that uses a spectral encoder to extract effective spectral features, which can significantly enhance the reliability and precision of quantitative analysis. We built a latent encoded feature regression model: while an autoencoder reconstructs the spectrometer output, the latent features are extracted from its intermediate bottleneck layer. These latent features are then fed into a deep regression model for component concentration prediction. Through detailed ablation and comparative experiments, our proposed model demonstrates superior performance to common methods on single-component and multi-component mixture datasets, remarkably improving regression precision without needing user-selected parameters and eliminating the interference of irrelevant and redundant information. Furthermore, in-depth analysis reveals that the latent encoded features possess strong nonlinear feature representation capabilities, low computational costs, wide adaptability, and robustness against noise interference. This highlights their effectiveness in spectral regression tasks and indicates their potential in other application fields.
Sufficient experimental results show that our proposed method provides a novel and effective feature extraction approach for spectral analysis, which is simple, suitable for various methods, and can meet the measurement needs of different real-world scenarios.
Affiliation(s)
- Chi Gao
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Qi Fan
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China
- Peng Zhao
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Chao Sun
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China
- Ruochen Dang
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Yutao Feng
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China
- Bingliang Hu
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China
- Quan Wang
  - Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, Shaanxi, 710076, China; The Key Laboratory of Biomedical Spectroscopy of Xi'an, Shaanxi, 710076, China
|
17
|
Sami A, El-Metwally S, Rashad MZ. MAC-ErrorReads: machine learning-assisted classifier for filtering erroneous NGS reads. BMC Bioinformatics 2024; 25:61. [PMID: 38321434 PMCID: PMC10848413 DOI: 10.1186/s12859-024-05681-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 01/29/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND The rapid advancement of next-generation sequencing (NGS) machines in terms of speed and affordability has led to the generation of a massive amount of biological data at the expense of data quality, as errors become more prevalent. This creates the need for different approaches to detect and filter errors, moving data quality assurance from the hardware space to the software preprocessing stages. RESULTS We introduce MAC-ErrorReads, a novel Machine learning-Assisted Classifier designed for filtering Erroneous NGS Reads. MAC-ErrorReads transforms the erroneous NGS read filtration process into a robust binary classification task, employing five supervised machine learning algorithms. These models are trained on features extracted through the computation of Term Frequency-Inverse Document Frequency (TF_IDF) values from various datasets such as E. coli, GAGE S. aureus, H. Chr14, Arabidopsis thaliana Chr1 and Metriaclima zebra. Notably, Naive Bayes demonstrated robust performance across various datasets, displaying high accuracy, precision, recall, F1-score, MCC, and ROC values. The MAC-ErrorReads NB model accurately classified S. aureus reads, surpassing most error correction tools with a 38.69% alignment rate. For H. Chr14, tools like Lighter, Karect, CARE, Pollux, and MAC-ErrorReads showed rates above 99%. BFC and RECKONER exceeded 98%, while Fiona had 95.78%. For the Arabidopsis thaliana Chr1, Pollux, Karect, RECKONER, and MAC-ErrorReads demonstrated good alignment rates of 92.62%, 91.80%, 91.78%, and 90.87%, respectively. For the Metriaclima zebra, Pollux achieved a high alignment rate of 91.23%, despite having the lowest number of mapped reads. MAC-ErrorReads, Karect, and RECKONER demonstrated good alignment rates of 83.76%, 83.71%, and 83.67%, respectively, while also producing reasonable numbers of mapped reads to the reference genome.
CONCLUSIONS This study demonstrates that machine learning approaches for filtering NGS reads effectively identify and retain the most accurate reads, significantly enhancing assembly quality and genomic coverage. The integration of genomics and artificial intelligence through machine learning algorithms holds promise for enhancing NGS data quality, advancing downstream data analysis accuracy, and opening new opportunities in genetics, genomics, and personalized medicine research.
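To make the TF_IDF feature step concrete, here is a minimal sketch of computing TF-IDF weights over sequencing reads. It rests on assumptions not stated in the abstract: reads are tokenized into overlapping k-mers (the `kmers` helper and `k=3` are hypothetical), term frequency is the k-mer count divided by the number of k-mers in the read, and inverse document frequency is the unsmoothed `log(N / df)`. The classifier that would consume these vectors is omitted.

```python
from collections import Counter
from math import log


def kmers(read, k=3):
    """Split a read into its overlapping k-mers (assumed tokenization)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def tfidf(reads, k=3):
    """Return one {k-mer: weight} dictionary per read."""
    documents = [Counter(kmers(read, k)) for read in reads]
    n_docs = len(documents)
    # Document frequency: number of reads that contain each k-mer at least once.
    doc_freq = Counter(term for doc in documents for term in doc)
    vectors = []
    for doc in documents:
        total = sum(doc.values())
        vectors.append(
            {term: (count / total) * log(n_docs / doc_freq[term])
             for term, count in doc.items()}
        )
    return vectors


if __name__ == "__main__":
    # Toy reads (illustrative only, not taken from any of the named datasets).
    toy_reads = ["ACGTAC", "ACGTTT", "GGGTAC"]
    for read, vector in zip(toy_reads, tfidf(toy_reads)):
        print(read, {term: round(weight, 3) for term, weight in vector.items()})
```

With this unsmoothed variant, a k-mer present in every read receives weight zero, so rare k-mers (which may reflect sequencing errors) dominate the feature vector.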
Affiliation(s)
- Amira Sami
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
- Sara El-Metwally
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
  - Biomedical Informatics Department, Faculty of Computer Science and Engineering, New Mansoura University, Gamasa, 35712, Egypt
- M Z Rashad
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
|
18
|
Chen M, Jin C, Ni Y, Yang T, Xu J. A dataset of the quality of soybean harvested by mechanization for deep-learning-based monitoring and analysis. Data Brief 2024; 52:109833. [PMID: 38370022 PMCID: PMC10873865 DOI: 10.1016/j.dib.2023.109833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 11/14/2023] [Accepted: 11/15/2023] [Indexed: 02/20/2024] Open
Abstract
Deep learning and machine vision technology are widely applied to detect the quality of mechanized soybean harvesting. A clean dataset is the foundation for constructing an online detection learning model for the quality of mechanized harvested soybeans. In pursuit of this objective, we established an image dataset for mechanized harvesting of soybeans. The photos were taken on October 9, 2018, at a soybean experimental field of Liangfeng Grain and Cotton Planting Professional Cooperative in Guanyi District, Liangshan, Shandong, China. The dataset contains 40 soybean images of different qualities. By scaling, rotating, flipping, filtering, and adding noise to enhance the data, we expanded the dataset to 800 frames. The dataset consists of three folders, which store images, label maps, and record files for partitioning the dataset into training, validation, and testing sets. In the initial stages, the author devised an online detection model for soybean crushing rate and impurity rate based on machine vision, and research outcomes affirm the efficacy of this dataset. The dataset can help researchers construct a quality prediction model for mechanized harvested soybeans using deep learning techniques.
Affiliation(s)
- Man Chen
  - Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing Jiangsu Province, 210014, China
- Chengqian Jin
  - Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing Jiangsu Province, 210014, China
- Youliang Ni
  - Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing Jiangsu Province, 210014, China
- Tengxiang Yang
  - Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing Jiangsu Province, 210014, China
- Jinshan Xu
  - Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing Jiangsu Province, 210014, China
|
19
|
Raj SS, Chandra SSV. Significance of Sequence Features in Classification of Protein-Protein Interactions Using Machine Learning. Protein J 2024; 43:72-83. [PMID: 38114669 DOI: 10.1007/s10930-023-10168-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/30/2023] [Indexed: 12/21/2023]
Abstract
Protein-protein interactions are crucial for the entry of viruses into the cell. Understanding the mechanism of interactions is essential in studying human-virus association, developing new biologics and drug candidates, as well as viral infections and antiviral responses. Experimental methods to analyze human-virus protein-protein interactions based on protein sequence data are time-consuming and labor-intensive, so machine learning models are being developed to predict interactions and determine large-scale interactomes between species. The present work highlights the importance of sequence features in classifying interacting and non-interacting proteins from the protein sequence data. Higher dimensional amino acid sequence features such as Amino Acid Composition (AAC), Dipeptide Composition (DPC), Grouped Amino Acid Composition (GAAC), and Pseudo-Amino Acid Composition (PAAC) are extracted. Following feature extraction, three datasets were created: Dataset 1 contains all of the extracted features, while Datasets 2 and 3 contain the most relevant features obtained through dimensionality reduction. To analyze the importance of high-dimensional features and their participation in protein-protein interactions, a random forest classifier is trained on the three datasets. With dimensionality reduction, the model exhibited exceptional accuracy, indicating that dimensionality reduction fails to capture the complexity of interactions and the underlying relationships between human and viral proteins. As a result of retaining high-dimensional features, it is possible to capture all the characteristics of protein-protein interactions that resemble host-pathogen associations, leading to the development of biologically meaningful models. Our proposed approach is a more realistic and comprehensive classification model, leading to deeper insights and better applications in virology and drug development.
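Two of the sequence features named above can be sketched directly from their standard definitions: AAC is the fraction of each of the 20 standard amino acids in a sequence, and DPC is the fraction of each of the 400 possible adjacent residue pairs. The sketch below is illustrative only; the function names and the toy peptide are assumptions, and non-standard residues are simply ignored.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino Acid Composition: 20 fractions, one per standard residue."""
    return [sequence.count(residue) / len(sequence) for residue in AMINO_ACIDS]


def dpc(sequence):
    """Dipeptide Composition: 400 fractions over overlapping residue pairs."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    return [
        pairs.count(first + second) / len(pairs)
        for first, second in product(AMINO_ACIDS, repeat=2)
    ]


if __name__ == "__main__":
    toy_peptide = "MKKLAV"  # hypothetical sequence, for illustration only
    composition = aac(toy_peptide)
    dipeptides = dpc(toy_peptide)
    print(len(composition), len(dipeptides))  # 20 and 400 features
    print(composition[AMINO_ACIDS.index("K")])
```

Concatenating such vectors for the human and viral protein of a candidate pair gives a high-dimensional feature row of the kind a random forest could be trained on.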
Affiliation(s)
- Sini S Raj
  - Machine Intelligence Research Lab, Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India
- S S Vinod Chandra
  - Machine Intelligence Research Lab, Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India
|
20
|
Aldayel M, Al-Nafjan A. A comprehensive exploration of machine learning techniques for EEG-based anxiety detection. PeerJ Comput Sci 2024; 10:e1829. [PMID: 38435618 PMCID: PMC10909191 DOI: 10.7717/peerj-cs.1829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 12/29/2023] [Indexed: 03/05/2024]
Abstract
The performance of electroencephalogram (EEG)-based systems depends on the proper choice of feature extraction and machine learning algorithms. This study highlights the significance of selecting appropriate feature extraction and machine learning algorithms for EEG-based anxiety detection. We explored different annotation/labeling, feature extraction, and classification algorithms. Two measurements, the Hamilton anxiety rating scale (HAM-A) and the self-assessment Manikin (SAM), were used to label anxiety states. For EEG feature extraction, we employed the discrete wavelet transform (DWT) and power spectral density (PSD). To improve the accuracy of anxiety detection, we compared ensemble learning methods such as random forest (RF), AdaBoost bagging, and gradient bagging with conventional classification algorithms including linear discriminant analysis (LDA), support vector machine (SVM), and k-nearest neighbor (KNN) classifiers. We also evaluated the performance of the classifiers using different labeling (SAM and HAM-A) and feature extraction algorithms (PSD and DWT). Our findings demonstrated that HAM-A labeling and DWT-based features consistently yielded superior results across all classifiers. Specifically, the RF classifier achieved the highest accuracy of 87.5%, followed by the AdaBoost bagging classifier with an accuracy of 79%. The RF classifier outperformed other classifiers in terms of accuracy, precision, and recall.
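As an illustration of DWT-based feature extraction, the sketch below decomposes a signal with the Haar wavelet and returns the energy in each sub-band. The abstract does not state which wavelet family, decomposition depth, or summary statistic was used, so the Haar wavelet, three levels, the energy statistic, and the names `haar_step` and `dwt_energies` are all assumptions.

```python
from math import sqrt


def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length signal."""
    approx = [(signal[i] + signal[i + 1]) / sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def dwt_energies(signal, levels=3):
    """Energy of each detail band (finest first), then the final approximation."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(value * value for value in detail))
    energies.append(sum(value * value for value in approx))
    return energies


if __name__ == "__main__":
    # Toy 8-sample segment standing in for one EEG channel (illustrative only).
    toy_segment = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
    print([round(energy, 3) for energy in dwt_energies(toy_segment)])
```

Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original segment when its length is divisible by 2 to the power of `levels`, which is a convenient sanity check on any implementation.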
Affiliation(s)
- Mashael Aldayel
  - Information Technology Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Abeer Al-Nafjan
  - Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
|
21
|
Qin Y, Li B, Wang W, Shi X, Wang H, Wang X. ETCNet: An EEG-based motor imagery classification model combining efficient channel attention and temporal convolutional network. Brain Res 2024; 1823:148673. [PMID: 37956749 DOI: 10.1016/j.brainres.2023.148673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 08/16/2023] [Accepted: 11/07/2023] [Indexed: 11/15/2023]
Abstract
Brain-computer interface (BCI) enables the control of external devices using signals from the brain, offering immense potential in assisting individuals with neuromuscular disabilities. Among the different paradigms of BCI systems, the motor imagery (MI) based electroencephalogram (EEG) signal is widely recognized as exceptionally promising. Deep learning (DL) has found extensive applications in the processing of MI signals, wherein convolutional neural networks (CNN) have demonstrated superior performance compared to conventional machine learning (ML) approaches. Nevertheless, challenges related to subject independence and subject dependence persist, while the inherent low signal-to-noise ratio of EEG signals remains a critical aspect that demands attention. Accurately deciphering intentions from EEG signals continues to present a formidable challenge. This paper introduces an advanced end-to-end network that effectively combines the efficient channel attention (ECA) and temporal convolutional network (TCN) components for the classification of motor imagery signals. We incorporated an ECA module prior to feature extraction in order to enhance the extraction of channel-specific features. A compact convolutional network model is used for feature extraction in the middle part. Finally, the temporal feature information is obtained by using the TCN. The results show that our network is lightweight, with few parameters and fast speed. Our network achieves an average accuracy of 80.71% on the BCI Competition IV-2a dataset.
Affiliation(s)
- Yuxin Qin
  - The School of Electrical Engineering, Shanghai Dianji University, Shanghai, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai, China
- Baojiang Li
  - The School of Electrical Engineering, Shanghai Dianji University, Shanghai, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai, China
- Wenlong Wang
  - The School of Electrical Engineering, Shanghai Dianji University, Shanghai, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai, China
- Xingbin Shi
  - The School of Electrical Engineering, Shanghai Dianji University, Shanghai, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai, China
- Haiyan Wang
  - The School of Electrical Engineering, Shanghai Dianji University, Shanghai, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai, China
- Xichao Wang
  - The School of Electrical Engineering, Shanghai Dianji University, Shanghai, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai, China
|
22
|
Yang J, Ma X, Guan H, Yang C, Zhang Y, Li G, Li Z, Lu Y. A quality detection method of corn based on spectral technology and deep learning model. Spectrochim Acta A Mol Biomol Spectrosc 2024; 305:123472. [PMID: 37788513 DOI: 10.1016/j.saa.2023.123472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 09/25/2023] [Accepted: 09/26/2023] [Indexed: 10/05/2023]
Abstract
Corn is an important food crop worldwide. With economic development and population growth, the nutritional quality of corn is of great significance to high-quality breeding, scientific cultivation and fine management. To address the cumbersome, time-consuming, laborious, and inaccurate procedures of current corn quality detection, this paper combines near-infrared (NIR) spectroscopy with deep learning to build a corn quality detection model based on a convolutional neural network (LeNet-5). The original spectral data were preprocessed by wavelet transform (WT) and multivariate scattering correction (MSC) to remove noise interference and spectral scattering information. The Competitive Adaptive Reweighted Sampling (CARS) algorithm was applied to optimize the characteristic wavenumbers and reduce redundant data. The optimized characteristic wavenumbers were input into the constructed corn quality detection model for simulation testing. On the test set, the average detection accuracy was 96.46%, the average precision was 95.42%, the average recall was 97.92%, the average F1 score was 96.64%, and the average recognition time was 51.95 s. Compared with traditional machine learning models such as BP neural network, K Nearest Neighbor (KNN), Support Vector Machine (SVM), Generalized Linear Model (GLM), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), the LeNet-5 deep learning model constructed in this paper improves average accuracy by 39.32%.
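The MSC preprocessing step named in this abstract (more commonly expanded as multiplicative scatter correction) can be sketched as follows. This is a generic textbook formulation, not the authors' code; the function name `msc` and the synthetic spectra are assumptions for illustration.

```python
import numpy as np

def msc(spectra, reference=None):
    """Scatter-correct spectra shaped (n_samples, n_wavenumbers)."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: the mean spectrum serves as the reference when none is given.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        # Least-squares fit of s ~ slope * ref + intercept.
        slope, intercept = np.polyfit(ref, s, deg=1)
        # Remove the additive offset and the multiplicative scaling.
        corrected[i] = (s - intercept) / slope
    return corrected

# Synthetic example: one underlying spectrum seen with three different scatter effects.
base = np.sin(np.linspace(0, 3, 50)) + 2.0
raw = np.stack([1.0 * base + 0.0, 1.5 * base + 0.3, 0.7 * base - 0.2])
fixed = msc(raw)
```

Because the three synthetic spectra differ only by scale and offset, the correction maps all of them onto the reference spectrum.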
Affiliation(s)
- Jiao Yang
- College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| | - Xiaodan Ma
- College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| | - Haiou Guan
- College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Da Qing 163319, China; Key Laboratory of Low-carbon Green Agriculture in North-eastern China, Ministry of Agriculture and Rural Affairs, Da qing 163319, China.
| | - Chen Yang
- College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| | - Yifei Zhang
- Key Laboratory of Low-carbon Green Agriculture in North-eastern China, Ministry of Agriculture and Rural Affairs, Da qing 163319, China; College of Agricultural, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| | - Guibin Li
- Key Laboratory of Low-carbon Green Agriculture in North-eastern China, Ministry of Agriculture and Rural Affairs, Da qing 163319, China; College of Agricultural, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| | - Zesong Li
- Key Laboratory of Low-carbon Green Agriculture in North-eastern China, Ministry of Agriculture and Rural Affairs, Da qing 163319, China; College of Agricultural, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| | - Yuxin Lu
- Key Laboratory of Low-carbon Green Agriculture in North-eastern China, Ministry of Agriculture and Rural Affairs, Da qing 163319, China; College of Agricultural, Heilongjiang Bayi Agricultural University, Da Qing 163319, China
| |
|
23
|
Ahmed AAM, Jui SJJ, Sharma E, Ahmed MH, Raj N, Bose A. An advanced deep learning predictive model for air quality index forecasting with remote satellite-derived hydro-climatological variables. Sci Total Environ 2024; 906:167234. [PMID: 37739083 DOI: 10.1016/j.scitotenv.2023.167234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Revised: 09/15/2023] [Accepted: 09/19/2023] [Indexed: 09/24/2023]
Abstract
Forecasting the air quality index (AQI) is a critical and pressing challenge for developing nations worldwide. With air pollution emerging as a significant threat to the environment, this study considers seven study sites in the sub-tropical region of Bangladesh and introduces a novel hybrid deep-learning model. The proposed model, denoted CLSTM-BiGRU, integrates a convolutional neural network (CNN), a long short-term memory (LSTM) network, and a bi-directional gated recurrent unit (BiGRU) network. Leveraging nineteen remotely sensed predictor variables and the grey wolf optimization (GWO) algorithm, the CLSTM-BiGRU model consistently outperforms the benchmark models in air quality forecasting, yielding lower forecasting errors and higher efficiency values (i.e., correlation coefficient ~1). Hence, this study underscores the feasibility and substantial potential of the hybrid deep learning model, which can provide precise forecasts of the air quality index and will be highly useful for relevant stakeholders and decision-makers. Furthermore, the adaptability and potential utility of this model may be ascertained for air quality monitoring and effective public health risk mitigation in urban environments.
Affiliation(s)
- Abul Abrar Masrur Ahmed
- Department of Infrastructure Engineering, University of Melbourne, Parkville, VIC 3010, Australia
| | - S Janifer Jabin Jui
- School of Mathematics Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia
| | - Ekta Sharma
- School of Mathematics Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia.
| | - Mohammad Hafez Ahmed
- Wadsworth Department of Civil and Environmental Engineering, West Virginia University, Morgantown, WV 26506-6103, United States.
| | - Nawin Raj
- School of Mathematics Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia.
| | - Aditi Bose
- School of Mathematics Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia.
| |
|
24
|
Pajanoja C, Kerosuo L. ShapeMetrics: A 3D Cell Segmentation Pipeline for Single-Cell Spatial Morphometric Analysis. Methods Mol Biol 2024; 2767:263-273. [PMID: 37219813 DOI: 10.1007/7651_2023_489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
There is a growing need for single-cell level data analysis in step with the advancement of microscopy techniques. Morphology-based statistics gathered from individual cells are essential for detecting and quantifying even subtle changes within complex tissues, yet the information available from high-resolution imaging is often sub-optimally utilized due to the lack of proper computational analysis software. Here we present ShapeMetrics, a 3D cell segmentation pipeline that we have developed to identify, analyze, and quantify single cells in an image. This MATLAB-based script enables users to extract morphological parameters, such as ellipticity, longest axis, cell elongation, or the ratio between cell volume and surface area. We have specifically invested in creating a user-friendly pipeline aimed at biologists with a limited computational background. Our pipeline is presented with detailed stepwise instructions, starting from the establishment of machine learning-based prediction files of immuno-labeled cell membranes, followed by the application of the 3D cell segmentation and parameter extraction script, and leading to the morphometric analysis and spatial visualization of cell clusters defined by their morphometric features.
Affiliation(s)
- Ceren Pajanoja
- Neural Crest Development and Disease Unit, National Institute of Dental and Craniofacial Research, Intramural Research Program, National Institutes of Health, Bethesda, MD, USA
- Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Laura Kerosuo
- Neural Crest Development and Disease Unit, National Institute of Dental and Craniofacial Research, Intramural Research Program, National Institutes of Health, Bethesda, MD, USA
| |
|
25
|
Li B, Zhang J, Wang Q, Li H, Wang Q. Three-dimensional spine reconstruction from biplane radiographs using convolutional neural networks. Med Eng Phys 2024; 123:104088. [PMID: 38365341 DOI: 10.1016/j.medengphy.2023.104088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Revised: 12/04/2023] [Accepted: 12/10/2023] [Indexed: 02/18/2024]
Abstract
PURPOSE The purpose of this study was to develop and evaluate a deep learning network for three-dimensional reconstruction of the spine from biplanar radiographs. METHODS The proposed approach focused on extracting similar features and multiscale features of bone tissue in biplanar radiographs. Bone tissue features were reconstructed for feature representation across dimensions to generate three-dimensional volumes. The number of feature maps was gradually reduced during reconstruction to transform the high-dimensional features into the three-dimensional image domain. We produced eight datasets and made them public to train and test the proposed network. Two evaluation metrics were proposed and combined with four classical evaluation metrics to measure the performance of the method. RESULTS In comparative experiments, the reconstruction results of this method achieved a Hausdorff distance of 1.85 mm, a surface overlap of 0.2 mm, a volume overlap of 0.9664, and an offset distance of only 0.21 mm from the vertebral body centroid. The results of this study indicate that the proposed method is reliable.
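For readers unfamiliar with the Hausdorff distance reported in this abstract, the following is a minimal sketch of the symmetric Hausdorff distance between two point sets. The function name and the toy coordinates are hypothetical, and the abstract does not say which variant (for example, a percentile-based one) the authors used.

```python
import numpy as np

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between point sets a (n, 3) and b (m, 3)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Farthest that any point of a lies from its nearest neighbor in b, and vice versa.
    forward = d.min(axis=1).max()
    backward = d.min(axis=0).max()
    return max(forward, backward)

# Toy surfaces (units arbitrary): the sets differ only in their last point.
recon = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
truth = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
hd = hausdorff_distance(recon, truth)  # 2.0: (0, 3, 0) is 2 units from its nearest recon point
```

The metric is driven by the single worst-matched point, which is why it is often reported alongside overlap measures.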
Affiliation(s)
- Bo Li
- Department of Electronic Engineering, Yunnan University, Kunming, China
| | - Junhua Zhang
- Department of Electronic Engineering, Yunnan University, Kunming, China.
| | - Qian Wang
- Department of Electronic Engineering, Yunnan University, Kunming, China
| | - Hongjian Li
- The First People's Hospital of Yunnan Province, China
| | - Qiyang Wang
- The First People's Hospital of Yunnan Province, China
| |
|
26
|
Kiran U, Bhat SN, Anitha H, Naik RR. Feature-based multimodal registration framework for vertebral pose estimation. Eur Spine J 2023:10.1007/s00586-023-08054-z. [PMID: 38104308 DOI: 10.1007/s00586-023-08054-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 08/21/2023] [Accepted: 11/12/2023] [Indexed: 12/19/2023]
Abstract
PURPOSE Reliable estimation of the vertebral body posture aids safe and effective spine surgery. The proposed work presents an MR to X-ray image registration method to assess the 3D pose of the vertebral body during spine surgery. The 3D assessment of vertebral pose assists in analyzing the position and orientation of the vertebral body to provide information in various clinical settings, such as curvature estimation and pedicle screw insertion surgery. METHODS The proposed feature-based registration framework extracted vertebral end plates to avoid the mismatch between the intensities of MR and X-ray images. Using the projection matrix, the segmented MRI is forward projected and then registered to the X-ray image using binary image matching similarity and the CMA-ES optimizer. RESULTS The proposed method estimated the vertebral pose by registering the simulated X-ray onto pre-operative MRI. To evaluate the efficacy of the proposed approach, a number of experiments were carried out on the simulated dataset. CONCLUSION The proposed method is a fast and accurate registration method that can provide 3D information about the vertebral body. This 3D information is useful for improving accuracy during various clinical diagnoses.
Affiliation(s)
- Usha Kiran
- Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
| | - Shyamasunder N Bhat
- Department of Orthopaedics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India.
| | - H Anitha
- Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India.
| | - Roshan Ramakrishna Naik
- Department of Electronics and Communication Engineering, St. Joseph Engineering College, Vamanjoor, Mangalore, Karnataka, 575028, India
| |
|
27
|
Wu J, Liu B, Zhang J, Wang Z, Li J. DL-PPI: a method on prediction of sequenced protein-protein interaction based on deep learning. BMC Bioinformatics 2023; 24:473. [PMID: 38097937 PMCID: PMC10722729 DOI: 10.1186/s12859-023-05594-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 12/01/2023] [Indexed: 12/17/2023] Open
Abstract
PURPOSE Sequenced Protein-Protein Interaction (PPI) prediction represents a pivotal area of study in biology, playing a crucial role in elucidating the mechanistic underpinnings of diseases and facilitating the design of novel therapeutic interventions. Conventional methods for extracting features through experimental processes have proven to be both costly and exceedingly complex. In light of these challenges, the scientific community has turned to computational approaches, particularly those grounded in deep learning methodologies. Despite the progress achieved by current deep learning technologies, their effectiveness diminishes when applied to larger, unfamiliar datasets. RESULTS This study introduces a novel deep learning framework, termed DL-PPI, for predicting PPIs based on sequence data. The proposed framework comprises two key components aimed at improving the accuracy of feature extraction from individual protein sequences and capturing relationships between proteins in unfamiliar datasets. 1. Protein Node Feature Extraction Module: To enhance the accuracy of feature extraction from individual protein sequences and facilitate the understanding of relationships between proteins in unknown datasets, we devised a novel protein node feature extraction module utilizing the Inception method. This module efficiently captures relevant patterns and representations within protein sequences, enabling more informative feature extraction. 2. Feature-Relational Reasoning Network (FRN): In the Global Feature Extraction module of our model, we developed a novel FRN that leverages Graph Neural Networks to determine interactions between pairs of input proteins. The FRN effectively captures the underlying relational information between proteins, contributing to improved PPI predictions. The DL-PPI framework demonstrates state-of-the-art performance in sequence-based PPI prediction.
Affiliation(s)
- Jiahui Wu
- Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| | - Bo Liu
- School of Mathematical and Computational Sciences, Massey University, Auckland, 0745, New Zealand.
| | - Jidong Zhang
- Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| | - Zhihan Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| | - Jianqiang Li
- Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
| |
|
28
|
Avram C, Gligor A, Roman D, Soylu A, Nyulas V, Avram L. Machine learning based assessment of preclinical health questionnaires. Int J Med Inform 2023; 180:105248. [PMID: 37866276 DOI: 10.1016/j.ijmedinf.2023.105248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 10/04/2023] [Accepted: 10/08/2023] [Indexed: 10/24/2023]
Abstract
BACKGROUND Within modern health systems, the possibility of accessing a large amount and a variety of data related to patients' health has increased significantly over the years. The source of this data could be mobile and wearable electronic systems used in everyday life, and specialized medical devices. In this study we aim to investigate the use of modern Machine Learning (ML) techniques for preclinical health assessment based on data collected from questionnaires filled out by patients. METHOD To identify the health conditions of pregnant women, we developed a questionnaire that was distributed in three maternity hospitals in the Mureș County, Romania. In this work we proposed and developed an ML model for pattern detection in common risk assessment based on data extracted from questionnaires. RESULTS Out of the 1278 women who answered the questionnaire, 381 smoked before pregnancy and only 216 quit smoking during the period in which they became pregnant. The performance of the model indicates the feasibility of the solution, with an accuracy of 98 % confirmed for the considered case study. CONCLUSION The proposed solution offers a simple and efficient way to digitize questionnaire data and to analyze the data through a reduced computational effort, both in terms of memory and computing power used.
Affiliation(s)
- Calin Avram
- George Emil Palade University of Medicine, Pharmacy, Science and Technology of Targu Mures, Romania.
| | - Adrian Gligor
- George Emil Palade University of Medicine, Pharmacy, Science and Technology of Targu Mures, Romania.
| | - Dumitru Roman
- SINTEF AS, Norway; OsloMet - Oslo Metropolitan University, Norway.
| | - Ahmet Soylu
- OsloMet - Oslo Metropolitan University, Norway.
| | - Victoria Nyulas
- George Emil Palade University of Medicine, Pharmacy, Science and Technology of Targu Mures, Romania.
| | - Laura Avram
- "Dimitrie Cantemir" University of Târgu-Mureș, Romania.
| |
|
29
|
Chowa SS, Azam S, Montaha S, Payel IJ, Bhuiyan MRI, Hasan MZ, Jonkman M. Graph neural network-based breast cancer diagnosis using ultrasound images with optimized graph construction integrating the medically significant features. J Cancer Res Clin Oncol 2023; 149:18039-18064. [PMID: 37982829 PMCID: PMC10725367 DOI: 10.1007/s00432-023-05464-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 10/06/2023] [Indexed: 11/21/2023]
Abstract
PURPOSE An automated computerized approach can aid radiologists in the early diagnosis of breast cancer. In this study, a novel method is proposed for classifying breast tumors as benign or malignant from ultrasound images, using a Graph Neural Network (GNN) model that utilizes clinically significant features. METHOD Ten informative features are extracted from the region of interest (ROI), based on the radiologists' diagnostic markers. The significance of the features is evaluated using density plots and t-test statistical analysis. A feature table is generated in which each row represents an individual image, treated as a node, and the edges between nodes are determined by calculating the Spearman correlation coefficient. A graph dataset is generated and fed into the GNN model. The model is configured through an ablation study and Bayesian optimization. The optimized model is then evaluated with different correlation thresholds to obtain the highest performance with a shallow graph. The performance consistency is validated with k-fold cross-validation. The impact of utilizing ROIs and handcrafted features for breast tumor classification is evaluated by comparing the model's performance with that obtained using Histogram of Oriented Gradients (HOG) descriptor features from the entire ultrasound image. Lastly, a clustering-based analysis is performed to generate a new filtered graph, considering weak and strong relationships between nodes based on their similarities. RESULTS The results indicate that with a threshold value of 0.95, the GNN model achieves the highest test accuracy of 99.48%, precision and recall of 100%, and F1 score of 99.28%, while reducing the number of edges by 85.5%. The GNN model's performance is 86.91% when no threshold is applied to the graph generated from HOG descriptor features. Different threshold values for the Spearman correlation score are tested and the performance is compared. No significant differences are observed between the previous graph and the filtered graph. CONCLUSION The proposed approach might aid radiologists in effective diagnosis and in learning the tumor patterns of breast cancer.
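The graph-construction idea in this abstract (images as nodes, edges from thresholded Spearman correlation between feature rows) can be sketched as below. This is an illustrative reading, not the authors' pipeline: the helper names, the five-feature toy table, the tie-free ranking shortcut, and the choice to keep edges with correlation at or above the threshold are all assumptions.

```python
import numpy as np

def rank_rows(x):
    """Rank the values within each row (0 = smallest). Assumes no ties."""
    return np.argsort(np.argsort(x, axis=1), axis=1).astype(float)

def spearman_adjacency(features, threshold):
    """Boolean adjacency matrix linking rows whose Spearman correlation meets the threshold."""
    ranks = rank_rows(np.asarray(features, dtype=float))
    # Pearson correlation of ranks equals Spearman correlation when there are no ties.
    corr = np.corrcoef(ranks)
    adj = corr >= threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

# Hypothetical feature table: 4 images (rows) by 5 features (columns).
features = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 6.3, 8.2, 9.9],  # same ordering as row 0
    [5.0, 4.0, 3.0, 2.0, 1.0],  # reversed ordering
    [1.0, 3.0, 2.0, 5.0, 4.0],  # partially shuffled, Spearman 0.8 with row 0
])
adj = spearman_adjacency(features, threshold=0.95)
edges = [(i, j) for i in range(len(adj)) for j in range(i + 1, len(adj)) if adj[i, j]]
```

Raising the threshold prunes weaker edges, which is the mechanism behind the edge reduction reported in the abstract.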
Affiliation(s)
- Sadia Sultana Chowa
- Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
| | - Sami Azam
- Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia.
| | - Sidratul Montaha
- Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
| | - Israt Jahan Payel
- Health Informatics Research Laboratory (HIRL), Department of Computer Science and Engineering, Daffodil International University, Dhaka, 1216, Bangladesh
| | - Md Rahad Islam Bhuiyan
- Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
| | - Md Zahid Hasan
- Health Informatics Research Laboratory (HIRL), Department of Computer Science and Engineering, Daffodil International University, Dhaka, 1216, Bangladesh
| | - Mirjam Jonkman
- Faculty of Science and Technology, Charles Darwin University, Casuarina, NT, 0909, Australia
| |
|
30
|
Sethurajan MR, K. N. An adept approach to ascertain and elude probable social bots attacks on twitter and twitch employing machine learning approach. MethodsX 2023; 11:102430. [PMID: 37867912 PMCID: PMC10585632 DOI: 10.1016/j.mex.2023.102430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 10/09/2023] [Indexed: 10/24/2023] Open
Abstract
There has been a tremendous increase in the popularity of social media such as blogs, Instagram, Twitter, and online websites. The increasing utilization of these platforms has enabled users to share information on a regular basis and to publicize social events. Nevertheless, most multimedia events are filled with social bots, which raises concerns about the authenticity of the information shared in these events. As social bots become more advanced, the complexity of detection and fact-checking also increases, mainly because of the similarity between authorized users and social bots. Several researchers have introduced different models for detecting social bots and fact-checking. However, these models suffer from various challenges. In most cases, the bots become indistinguishable from existing users, and it is challenging to extract their relevant attributes. In addition, it is challenging to collect large-scale data and label them for training bot detection models. The performance of existing traditional classifiers used for bot detection is not satisfactory. This paper presents:
- A machine learning based adaptive fuzzy neuro model integrated with a hist gradient boosting (HGB) classifier for identifying the persisting pattern of social bots for fake news detection.
- Harris Hawk optimization with Bi-LSTM for social bot prediction.
- Results that validate the efficacy of the HGB classifier, which achieves an accuracy of 95.64 % on the Twitter bot dataset and 98.98 % on the Twitch bot dataset.
Affiliation(s)
- Monikka Reshmi Sethurajan
- Research Scholar, Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST (Deemed to be University), Kengeri Campus, Bengaluru, Karnataka 560074, India
| | - Natarajan K.
- Associate Professor, Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST (Deemed to be University), Kengeri Campus, Bengaluru, Karnataka 560074, India
| |
|
31
|
García-Pavioni A, López B. Dimensionality reduction and features visual representation based on conditional probabilities applied to activity classification. Comput Biol Med 2023; 167:107595. [PMID: 37925905 DOI: 10.1016/j.compbiomed.2023.107595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 10/05/2023] [Accepted: 10/17/2023] [Indexed: 11/07/2023]
Abstract
A large part of the information emitted by contemporary technological devices comes in the form of time series. The massive commercialization of these kinds of devices has given the study of time series feature extraction techniques vital relevance in recent years. Two things are essential when applying feature extraction techniques to time series: reducing the dimensionality so that the data occupy as little storage as possible, and producing features that contain the relevant information regarding the nature of the data set and the goals to be achieved. For this purpose, we propose a new technique called the State Changes Representation for Time Series (SCRTS), which relies on the relevant data associated with the conditional probabilities of the time series (also known in the literature as Markov model features) and the distribution of its values. This method is length-independent, which means that it can be applied to time series of different lengths and yields the same number of features for each one. It also provides a visual representation of the input data, making it possible to interpret what distinguishes one time series from another. After explaining how it works, we apply it to 3 different wearable accelerometer data sets. The algorithm reduces the original dimension of the time series considerably (in the best case from 5499 values to 31) while achieving good classification performance (in the best case an accuracy of 98%).
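To make the length-independence property concrete, here is a rough sketch of Markov-style features built from conditional (transition) probabilities plus the distribution of values. It is not the SCRTS algorithm itself, whose details the abstract does not give; the function name, the equal-width binning, and the choice of four states are assumptions.

```python
import numpy as np

def state_change_features(series, n_states=4):
    """Fixed-length features: state transition probabilities plus state occupancy."""
    series = np.asarray(series, dtype=float)
    # Assumption: discretize values into equal-width bins between min and max.
    edges = np.linspace(series.min(), series.max(), n_states + 1)
    states = np.clip(np.digitize(series, edges[1:-1]), 0, n_states - 1)
    # Count transitions from each state to the next observed state.
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1)
    # Normalize each row into conditional probabilities P(next state | current state).
    row_sums = counts.sum(axis=1, keepdims=True)
    transitions = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    # Distribution of values over the states.
    occupancy = np.bincount(states, minlength=n_states) / len(states)
    return np.concatenate([transitions.ravel(), occupancy])

# Two synthetic series of very different lengths yield the same number of features.
f_short = state_change_features(np.sin(np.linspace(0, 20, 100)))
f_long = state_change_features(np.sin(np.linspace(0, 20, 5499)))
```

With four states the vector has 4 × 4 + 4 = 20 entries regardless of the input length, which is the property the abstract calls length-independent.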
Affiliation(s)
- Alihuén García-Pavioni
- Exit Grup, University of Girona, Carrer Universitat de Girona, 6, Girona, 17003, Girona, Spain.
| | - Beatriz López
- Exit Grup, University of Girona, Carrer Universitat de Girona, 6, Girona, 17003, Girona, Spain.
| |
|
32
|
Liu H, Guan F, Liu T, Yang L, Fan L, Liu X, Luo H, Wu N, Yao B, Tian J, Huang H. MECE: a method for enhancing the catalytic efficiency of glycoside hydrolase based on deep neural networks and molecular evolution. Sci Bull (Beijing) 2023; 68:2793-2805. [PMID: 37867059 DOI: 10.1016/j.scib.2023.09.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 07/14/2023] [Accepted: 09/25/2023] [Indexed: 10/24/2023]
Abstract
The demand for high-efficiency glycoside hydrolases (GHs) is on the rise due to their various industrial applications. However, improving the catalytic efficiency of an enzyme remains a challenge. This investigation showcases the capability of a deep neural network and the method for enhancing the catalytic efficiency (MECE) platform to predict mutations that improve catalytic activity in GHs. The MECE platform includes DeepGH, a deep learning model that is able to identify GH families and functional residues. This model was developed utilizing 119 GH family protein sequences obtained from the Carbohydrate-Active enZYmes (CAZy) database. After ten-fold cross-validation, the DeepGH models exhibited a predictive accuracy of 96.73%. Gradient-weighted class activation mapping (Grad-CAM) was used to help interpret the classification features, which in turn facilitated the design of enzyme mutants. The MECE platform was validated with the development of CHIS1754-MUT7, a mutant with seven amino acid substitutions. The kcat/Km of CHIS1754-MUT7 was found to be 23.53 times greater than that of the wild-type CHIS1754. Owing to its high computational efficiency and low experimental cost, this method offers significant advantages and presents a novel approach for the intelligent design of enzyme catalytic efficiency, and it holds great promise for a wide range of applications.
Affiliation(s)
- Hanqing Liu
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China; Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Feifei Guan
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
| | - Tuoyu Liu
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Lixin Yang
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Lingxi Fan
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Xiaoqing Liu
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Huiying Luo
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Ningfeng Wu
- Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Bin Yao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Jian Tian
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China; Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
| | - Huoqing Huang
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
| |
|
33
|
Bianchi A, Marchal PC, Martínez Gila DM, Mencarelli F, Gámez García J. Assessment of Fruity Aroma Intensity in Olive Oils from Different Spanish Regions Using a Portable Electronic Nose. J Sci Food Agric 2023. [PMID: 38017697 DOI: 10.1002/jsfa.13179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/22/2023] [Accepted: 11/24/2023] [Indexed: 11/30/2023]
Abstract
BACKGROUND The organoleptic profile of an olive oil is a fundamental quality parameter obtained by human sensory panels. In this work, a portable electronic nose was employed to predict the fruity aroma intensity of 199 olive oil samples from different Spanish regions and cultivar varieties (Picual, Arbequina and Cornicabra), with special emphasis on testing the robustness of the predictions against cultivar variety variability. The primary data given by the electronic nose were used to obtain two different feature vectors, which were employed to fit ridge and lasso regression models to two datasets: one consisting of all the samples and another of just the cv. Picual samples. RESULTS The results showed Mean Absolute Error (MAE) values below 0.88 in all cases, with an MAE of 0.67 for the Picual model. These MAE values and the similarities in the model parameters fitted for the different data folds are in agreement with the results obtained in previous works. CONCLUSION The large number of samples analyzed and the results obtained show the robustness of the approach and the applicability of the methods. The results also suggest that better performance can be obtained when specific models are fitted for particular cultivar varieties. Overall, the proposed methods are capable of providing useful information for fast screening of the fruity aroma intensity of olive oils.
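As a sketch of the modelling step described in this abstract, the snippet below fits a closed-form ridge regression and scores it with MAE, computed here as the mean absolute error. The data are synthetic stand-ins, not electronic-nose measurements; the six-feature layout, coefficients, noise level, and regularization strength `alpha` are all hypothetical, and only the sample count of 199 is taken from the abstract.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data, so the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) coef = Xc^T yc.
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def mean_absolute_error(y_true, y_pred):
    """Average magnitude of the prediction errors."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic data only: 199 samples, six made-up sensor features.
rng = np.random.default_rng(0)
X = rng.standard_normal((199, 6))
true_coef = np.array([0.8, -0.5, 0.0, 0.3, 0.0, 0.1])
y = X @ true_coef + 4.0 + 0.1 * rng.standard_normal(199)

coef, intercept = fit_ridge(X, y, alpha=1.0)
mae = mean_absolute_error(y, X @ coef + intercept)
```

Lasso has no closed-form solution and would need an iterative solver, so only the ridge case is shown.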
Affiliation(s)
- Alessandro Bianchi
- Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Pablo Cano Marchal
- University Institute of Research on Olive and Olive Oils (INUO), Robotics, Automation and Computer Vision Group, University of Jaén, 23071, Jaén, Spain
- Diego M Martínez Gila
- University Institute of Research on Olive and Olive Oils (INUO), Robotics, Automation and Computer Vision Group, University of Jaén, 23071, Jaén, Spain
- Fabio Mencarelli
- Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Javier Gámez García
- University Institute of Research on Olive and Olive Oils (INUO), Robotics, Automation and Computer Vision Group, University of Jaén, 23071, Jaén, Spain
|
34
|
Kabir S, Pippi Salle JL, Chowdhury MEH, Abbas TO. Quantification of vesicoureteral reflux using machine learning. J Pediatr Urol 2023:S1477-5131(23)00486-2. [PMID: 37980211 DOI: 10.1016/j.jpurol.2023.10.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 10/17/2023] [Accepted: 10/21/2023] [Indexed: 11/20/2023]
Abstract
INTRODUCTION The radiographic grading of voiding cystourethrogram (VCUG) images is often used to determine the clinical course and appropriate treatment in patients with vesicoureteral reflux (VUR). However, image-based evaluation of VUR remains highly subjective, so we developed a supervised machine learning model to automatically and objectively grade VCUG data. STUDY DESIGN A total of 113 VCUG images were gathered from public sources to compile the dataset for this study. For each image, VUR severity was graded by four pediatric radiologists and three pediatric urologists (low severity scored 1-3; high severity 4-5). Ground truth for each image was assigned based on the grade diagnosed by a majority of the expert assessors. Nine features were extracted from each VCUG image, then six machine learning models were trained, validated, and tested using 'leave-one-out' cross-validation. All features were compared and contrasted, with the highest-ranked then being used to train the final models. RESULTS F1-score is a metric that is often used to indicate performance accuracy of machine learning models. When using the highest-ranked VCUG image features, F1-scores for the support vector machine (SVM) and multi-layer perceptron (MLP) classifiers were 90.27 % and 91.14 %, respectively, indicating a high level of accuracy. When using all features combined, F1 scores were 89.37 % for SVM and 90.27 % for MLP. DISCUSSION These findings indicate that a distorted pattern of renal calyces is an accurate predictor of high-grade VUR. Machine learning protocols can be enhanced in future to improve objective grading of VUR.
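The labelling and scoring steps described in this abstract can be illustrated with a minimal sketch. It assumes binary labels (0 for low severity, 1 for high severity) and uses hypothetical helper names, `majority_vote` and `f1_score`, that do not come from the paper; it is not the authors' pipeline.

```python
from collections import Counter


def majority_vote(grades):
    """Return the label chosen by most assessors (hypothetical helper).

    With seven assessors and two labels there is always a strict majority.
    """
    return Counter(grades).most_common(1)[0][0]


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Seven assessors grade one image: four say "high" (1), three say "low" (0).
label = majority_vote([1, 1, 0, 1, 0, 0, 1])

# Toy predictions for four images, compared against their ground truth.
score = f1_score([1, 1, 0, 0], [1, 0, 0, 1])
```

In a leave-one-out setting, each image would be predicted by a model trained on all the others, and the F1-score would then be computed once over the pooled predictions.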
Affiliation(s)
- Saidul Kabir
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
- Tariq O Abbas
- Urology Division, Surgery Department, Sidra Medicine, Qatar.
|
35
|
Fatema K, Rony MAH, Azam S, Mukta MSH, Karim A, Hasan MZ, Jonkman M. Development of an automated optimal distance feature-based decision system for diagnosing knee osteoarthritis using segmented X-ray images. Heliyon 2023; 9:e21703. [PMID: 38027947 PMCID: PMC10665756 DOI: 10.1016/j.heliyon.2023.e21703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 10/25/2023] [Accepted: 10/26/2023] [Indexed: 12/01/2023] Open
Abstract
Knee Osteoarthritis (KOA) is a leading cause of disability and physical inactivity. It is a degenerative joint disease that affects the cartilage that cushions the bones and protects them from rubbing against each other during motion. If not treated early, it may lead to knee replacement, so early diagnosis of KOA is necessary for better treatment. However, manual KOA detection is time-consuming and error-prone for large data hubs, whereas an automated detection system helps the specialist diagnose KOA grades accurately and quickly. The main objective of this study is therefore to create an automated decision system that can analyze KOA and classify the severity grades using features extracted from segmented X-ray images. Two different datasets were collected from the Mendeley and Kaggle databases and combined to generate a large data hub containing five classes: Grade 0 (Healthy), Grade 1 (Doubtful), Grade 2 (Minimal), Grade 3 (Moderate), and Grade 4 (Severe). Several image processing techniques were employed to segment the region of interest (ROI). These included Gradient-weighted Class Activation Mapping (Grad-CAM) to detect the ROI, cropping the ROI portion, applying histogram equalization (HE) to improve contrast, brightness, and image quality, and noise reduction (using Otsu thresholding, inverting the image, and morphological closing). In addition, the focus filtering method was used to eliminate unwanted images. Then, six feature sets (morphological, GLCM, statistical, texture, LBP, and proposed features) were generated from the segmented ROIs. After evaluating the statistical significance of the features and the selection methods, the optimal feature set (six prominent distance features) was selected, and five machine learning (ML) models were employed. Additionally, a decision-making strategy based on the six optimal features is proposed. The XGB model outperformed the other models with 99.46 % accuracy using the six distance features, and the proposed decision-making strategy was validated by testing 30 images.
Affiliation(s)
- Kaniz Fatema
- Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Dhaka, 1341, Bangladesh
- Md Awlad Hossen Rony
- Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Dhaka, 1341, Bangladesh
- Sami Azam
- Faculty of Science and Technology, Charles Darwin University, Darwin, NT, 0909, Australia
- Md Saddam Hossain Mukta
- Department of Computer Science and Engineering, United International University, Dhaka, 1212, Bangladesh
- Asif Karim
- Faculty of Science and Technology, Charles Darwin University, Darwin, NT, 0909, Australia
- Md Zahid Hasan
- Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Dhaka, 1341, Bangladesh
- Mirjam Jonkman
- Faculty of Science and Technology, Charles Darwin University, Darwin, NT, 0909, Australia
|
36
|
Nemati M, Zhang H, Sloma M, Bekbolsynov D, Wang H, Stepkowski S, Xu KS. Predicting kidney transplant survival using multiple feature representations for HLAs. Artif Intell Med 2023; 145:102675. [PMID: 37925205 DOI: 10.1016/j.artmed.2023.102675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Revised: 06/14/2023] [Accepted: 10/02/2023] [Indexed: 11/06/2023]
Abstract
Kidney transplantation can significantly enhance living standards for people suffering from end-stage renal disease. A significant factor that affects graft survival time (the time until the transplant fails and the patient requires another transplant) for kidney transplantation is the compatibility of the Human Leukocyte Antigens (HLAs) between the donor and recipient. In this paper, we propose 4 new biologically-relevant feature representations for incorporating HLA information into machine learning-based survival analysis algorithms. We evaluate our proposed HLA feature representations on a database of over 100,000 transplants and find that they improve prediction accuracy by about 1%, modest at the patient level but potentially significant at a societal level. Accurate prediction of survival times can improve transplant survival outcomes, enabling better allocation of donors to recipients and reducing the number of re-transplants due to graft failure with poorly matched donors.
Affiliation(s)
- Mohammadreza Nemati
- Department of Electrical Engineering and Computer Science, University of Toledo, 2801 W Bancroft St, Toledo, 43606, OH, United States; Department of Computer and Data Sciences, Case Western Reserve University, 10900 Euclid Ave, Cleveland, 44106, OH, United States
- Haonan Zhang
- Department of Electrical Engineering and Computer Science, University of Toledo, 2801 W Bancroft St, Toledo, 43606, OH, United States
- Michael Sloma
- Department of Electrical Engineering and Computer Science, University of Toledo, 2801 W Bancroft St, Toledo, 43606, OH, United States
- Dulat Bekbolsynov
- Department of Medical Microbiology and Immunology, University of Toledo, United States
- Hong Wang
- Department of Engineering Technology, University of Toledo, United States
- Stanislaw Stepkowski
- Department of Medical Microbiology and Immunology, University of Toledo, United States
- Kevin S Xu
- Department of Electrical Engineering and Computer Science, University of Toledo, 2801 W Bancroft St, Toledo, 43606, OH, United States; Department of Computer and Data Sciences, Case Western Reserve University, 10900 Euclid Ave, Cleveland, 44106, OH, United States.
|
37
|
Zhou Q, Zhang Y, Wang S, Wu D. Drug-drug interaction prediction based on local substructure features and their complements. J Mol Graph Model 2023; 124:108557. [PMID: 37390789 DOI: 10.1016/j.jmgm.2023.108557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 04/27/2023] [Accepted: 06/17/2023] [Indexed: 07/02/2023]
Abstract
The properties of drugs may change when multiple drugs are co-administered to treat co-existing or complex diseases, potentially leading to unforeseen drug-drug interactions (DDIs). Therefore, predicting potential drug-drug interactions has been an important task in pharmaceutical research. However, the following challenges remain: (1) existing methods do not work very well in cold-start scenarios, and (2) the interpretability of existing methods is not satisfactory. To address these challenges, we proposed a multi-channel feature fusion method based on local substructure features of drugs and their complements (LSFC). The local substructure features are extracted from each drug, interacted with those of another drug, and then integrated with the global features of the two drugs for DDI prediction. We evaluated LSFC on two real-world DDI datasets in warm-start and cold-start scenarios. Comprehensive experiments demonstrate that LSFC consistently improved DDI prediction performance compared with state-of-the-art methods. Moreover, visual inspection results showed that LSFC can detect crucial substructures of drugs for DDIs, providing interpretable DDI prediction. The source codes and data are available at https://github.com/Zhang-Yang-ops/LSFC.
Affiliation(s)
- Qing Zhou
- College of Computer Science, Chongqing University, Chongqing 400044, China.
- Yang Zhang
- College of Computer Science, Chongqing University, Chongqing 400044, China.
- Siyuan Wang
- College of Computer Science, Chongqing University, Chongqing 400044, China.
- Dayu Wu
- College of Computer Science, Chongqing University, Chongqing 400044, China.
|
38
|
Zou F, Li N, Guo F, Cai Q, Cai X. Research and design of simulation and verification system of intelligent expressway based on ETC big data. Heliyon 2023; 9:e21532. [PMID: 38027738 PMCID: PMC10658287 DOI: 10.1016/j.heliyon.2023.e21532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 10/04/2023] [Accepted: 10/23/2023] [Indexed: 12/01/2023] Open
Abstract
The electronic toll collection (ETC) system records a large number of vehicle travel trajectories on expressways and has great potential application value. However, current simulation systems mainly focus on simulating the characteristics of traffic flow, while the real-time flow conditions of the road remain difficult to calculate and display quantitatively, and the overall optimization cost is substantial. There is currently no simulation system tailored to the ETC environment that addresses the challenges of real-time traffic flow computation and holistic optimization and meets the requirements of related research. Based on the topological structure of an actual provincial expressway network, this paper devises a framework for a simulation system that conforms to the current ETC environment. We solved the critical problem of generating simulation data by establishing a Feature Extraction Algorithm for spatio-temporal features derived from ETC transaction data (Edata). We then put forward a Traffic Control Strategy Algorithm for the ETC simulation system, which provides decision indicators for simulating expressway traffic flow control. In addition, we optimized an improved Multi-Task Scheduling Algorithm (ETC_MTS) for the real-time, parallel multi-task scenario on expressways; it provides better execution performance than current mainstream algorithms such as the Shortest Job First Scheduling Algorithm (SJFS), the Priority Scheduling Algorithm (Priority), the First Come First Serve Scheduling Algorithm (FCFS), and the Round Robin Scheduling Algorithm (RR).
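To give a feel for how the baseline schedulers named in this abstract differ, here is a minimal sketch comparing average waiting time under FCFS and non-preemptive SJF. It assumes all tasks arrive at time zero and run on a single processor; the function names and burst times are illustrative only, and the paper's ETC_MTS algorithm is not reproduced here.

```python
def average_waiting_time(burst_times):
    """Average time tasks wait before starting, when run in the given order."""
    waited = 0
    clock = 0
    for burst in burst_times:
        waited += clock   # this task waited for everything scheduled before it
        clock += burst    # the processor is busy until this task finishes
    return waited / len(burst_times)


def fcfs(burst_times):
    """First Come First Serve: run tasks in arrival order."""
    return average_waiting_time(burst_times)


def sjf(burst_times):
    """Shortest Job First (non-preemptive): run the shortest tasks first."""
    return average_waiting_time(sorted(burst_times))


bursts = [6, 8, 7, 3]          # hypothetical task durations
fcfs_wait = fcfs(bursts)       # waits are 0, 6, 14, 21
sjf_wait = sjf(bursts)         # waits are 0, 3, 9, 16
```

When every task is available at the start, SJF minimizes average waiting time among non-preemptive orderings, which is why it is a common baseline for scheduling comparisons.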
Affiliation(s)
- Fumin Zou
- Renewable Energy Technology Research Institute, Fujian University of Technology, Ningde 352100, China
- Fujian Provincial Key Laboratory of Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, Fujian, China
- Nan Li
- Renewable Energy Technology Research Institute, Fujian University of Technology, Ningde 352100, China
- Fujian Provincial Key Laboratory of Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, Fujian, China
- Feng Guo
- Renewable Energy Technology Research Institute, Fujian University of Technology, Ningde 352100, China
- Fujian Provincial Key Laboratory of Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, Fujian, China
- Qiqin Cai
- Renewable Energy Technology Research Institute, Fujian University of Technology, Ningde 352100, China
- School of Mechanical Engineering and Automation, Huaqiao University, Xiamen 361021, China
- Xinjian Cai
- Renewable Energy Technology Research Institute, Fujian University of Technology, Ningde 352100, China
|
39
|
Rai HM, Yoo J. A comprehensive analysis of recent advancements in cancer detection using machine learning and deep learning models for improved diagnostics. J Cancer Res Clin Oncol 2023; 149:14365-14408. [PMID: 37540254 DOI: 10.1007/s00432-023-05216-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 07/26/2023] [Indexed: 08/05/2023]
Abstract
PURPOSE Millions of people lose their lives to fatal diseases. Cancer is one of the most fatal; it may be caused by obesity, alcohol consumption, infections, ultraviolet radiation, smoking, and unhealthy lifestyles. Cancer is abnormal, uncontrolled tissue growth inside the body that may spread to parts of the body other than where it originated. Hence, it is essential to diagnose cancer at an early stage to provide correct and timely treatment. Moreover, manual diagnosis and diagnostic error may cause the death of many patients, so much research is under way on the automatic and accurate detection of cancer at an early stage. METHODS In this paper, we present a comparative analysis of the diagnosis of, and recent advances in the detection of, various cancer types using traditional machine learning (ML) and deep learning (DL) models. We include four types of cancer (brain, lung, skin, and breast) and their detection using ML and DL techniques. The review covers a total of 130 pieces of literature, of which 56 concern ML-based and 74 concern DL-based cancer detection techniques. Only peer-reviewed research papers published in the recent 5-year span (2018-2023) were included, and they were analyzed by year of publication, features utilized, best model, dataset/images utilized, and best accuracy. We reviewed ML- and DL-based techniques for cancer detection separately and used accuracy as the performance evaluation metric to maintain homogeneity while verifying classifier efficiency. RESULTS Among all the reviewed literature, DL techniques achieved the highest accuracy of 100%, while ML techniques achieved 99.89%. The lowest accuracies achieved using DL and ML approaches were 70% and 75.48%, respectively. The difference in accuracy between the highest- and lowest-performing models is about 28.8% for skin cancer detection. In addition, the key findings and challenges for each type of cancer detection using ML and DL techniques are presented. A comparative analysis of the best- and worst-performing models, along with overall key findings and challenges, is provided for future research. Although the analysis is based on accuracy as the performance metric and on various parameters, the results demonstrate significant scope for improvement in classification efficiency. CONCLUSION The paper concludes that both ML and DL techniques hold promise in the early detection of various cancer types. However, the study identifies specific challenges that need to be addressed for the widespread implementation of these techniques in clinical settings. The presented results offer valuable guidance for future research in cancer detection, emphasizing the need for continued advancements in ML and DL-based approaches to improve diagnostic accuracy and ultimately save more lives.
Affiliation(s)
- Hari Mohan Rai
- School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, 13120, Gyeonggi-do, Republic of Korea.
- Joon Yoo
- School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, 13120, Gyeonggi-do, Republic of Korea
|
40
|
Pan Z, Zhang Z, Meng Z, Wang Y. A novel fault classification feature extraction method for rolling bearing based on multi-sensor fusion technology and EB-1D-TP encoding algorithm. ISA Transactions 2023; 142:427-444. [PMID: 37573188 DOI: 10.1016/j.isatra.2023.07.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 06/28/2023] [Accepted: 07/14/2023] [Indexed: 08/14/2023]
Abstract
To improve the accuracy of bearing fault diagnosis in a multisensor monitoring environment, it is necessary to obtain more accurate and effective fault classification features for bearings. Accordingly, a bearing fault classification feature extraction method based on multisensor fusion technology and an enhanced binary one-dimensional ternary pattern (EB-1D-TP) algorithm were proposed in this study. First, an optimal equalization weighting algorithm was established to realize high-precision fusion of bearing signals by introducing an optimal equalization factor and determining the theoretical optimal equalization factor value. Second, an enhanced binary encoding method similar to balanced ternary encoding was developed, which increases the difference between the different fault features of the bearing. Finally, the new sequence obtained after encoding was used as the input to a support vector machine to classify and diagnose the faults of the rolling bearing. The experimental results show that the algorithm can significantly improve the accuracy and speed of rolling-bearing fault classification. Combining fusion-encoding features with other intelligent classification methods, the classification results were improved.
Affiliation(s)
- Zuozhou Pan
- College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou 310018, PR China
- Zhengyuan Zhang
- College of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore
- Zong Meng
- College of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei, 066004, PR China.
- Yuebing Wang
- College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou 310018, PR China.
|
41
|
Jiang S, Chen Z. Application of dynamic time warping optimization algorithm in speech recognition of machine translation. Heliyon 2023; 9:e21625. [PMID: 38027668 PMCID: PMC10651500 DOI: 10.1016/j.heliyon.2023.e21625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 10/07/2023] [Accepted: 10/25/2023] [Indexed: 12/01/2023] Open
Abstract
Speech recognition is the foundation of human-computer interaction technology and an important aspect of speech signal processing, with broad application prospects. At present, speech recognition suffers from problems such as low recognition rate, slow recognition speed, and severe interference from other factors. This paper studied speech recognition based on the dynamic time warping (DTW) algorithm. The specific steps of speech recognition were first introduced. Before recognition, the speech to be recognized is converted into a speech sequence using an acoustic model. Then, the DTW algorithm was used to preprocess the speech, mainly by sampling and windowing it. After preprocessing, speech feature extraction was carried out, followed by speech recognition. The experiments showed that the recognition rate of DTW-based speech recognition was very high. In a quiet environment, the recognition rate was above 93.85 %, and the average recognition rate of the 10 selected testers was 95.8 %. In a noisy environment, the recognition rate was above 91.4 %, and the average recognition rate of the 10 selected testers was 93 %. In addition to a high recognition rate, DTW-based speech recognition was also very fast at vocabulary recognition. Speech recognition based on the DTW algorithm therefore has both a high recognition rate and a fast recognition speed.
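For readers unfamiliar with DTW, the following minimal sketch shows the textbook dynamic-programming form of the distance and a nearest-template classifier built on it. It assumes one-dimensional sequences, where plain numbers stand in for per-frame speech features; the names `dtw_distance`, `recognize`, and the toy templates are illustrative and are not taken from the paper, whose optimized variant is not reproduced here.

```python
def dtw_distance(a, b):
    """Classic DTW: minimum cumulative cost of aligning sequence a with b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the stored template closest to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical reference templates, one per vocabulary word.
templates = {"yes": [1, 2, 3], "no": [3, 2, 1]}

# The query is a time-stretched "yes": DTW absorbs the repeated frame.
word = recognize([1, 2, 2, 3], templates)
```

Because the warping path may repeat frames, two utterances of the same word spoken at different speeds can still align at low cost, which is the property that makes DTW attractive for small-vocabulary recognition.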
Affiliation(s)
- Shaohua Jiang
- School of Humanities, Fujian University of Technology, Fuzhou 350118, China
- Krirk University, Bangkok 10220, Thailand
- Zheng Chen
- Concord University College, Fujian Normal University, China
|
42
|
Hireš M, Drotár P, Pah ND, Ngo QC, Kumar DK. On the inter-dataset generalization of machine learning approaches to Parkinson's disease detection from voice. Int J Med Inform 2023; 179:105237. [PMID: 37801807 DOI: 10.1016/j.ijmedinf.2023.105237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 09/20/2023] [Accepted: 09/24/2023] [Indexed: 10/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Parkinson's disease is the second-most-common neurodegenerative disorder that affects motor skills, cognitive processes, mood, and everyday tasks such as speaking and walking. The voices of people with Parkinson's disease may become weak, breathy, or hoarse and may sound emotionless, with slurred words and mumbling. Algorithms for computerized voice analysis have been proposed and have shown highly accurate results. However, these algorithms were developed on single, limited datasets, with participants possessing similar demographics. Such models are prone to overfitting and are unsuitable for generalization, which is essential in real-world applications. METHODS We evaluated the computerized Parkinson's disease diagnosis performance of various machine learning models and showed that these models degraded rapidly when used on different datasets. We evaluated two mainstream state-of-the-art approaches, one based on deep convolutional neural networks and another based on voice feature extraction followed by a shallow classifier (i.e., extreme gradient boosting (XGBoost)). RESULTS An investigation with four datasets (CzechPD, PC-GITA, ITA, and RMIT-PD) proved that even if the algorithms yielded excellent performance on a single dataset, the results obtained on new data or even a mix of datasets were very unsatisfactory. CONCLUSIONS More work needs to be done to make computerized voice analysis methods for Parkinson's disease diagnosis suitable for real-world applications.
Affiliation(s)
- Máté Hireš
- Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001 Kosice, Slovakia
- Peter Drotár
- Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001 Kosice, Slovakia.
- Nemuel Daniel Pah
- Biosignals Lab, RMIT University, Melbourne, Australia; Universitas Surabaya, Surabaya, Indonesia
|
43
|
Landry C, Mukkamala R. Current evidence suggests that estimating blood pressure from convenient ECG waveforms alone is not viable. J Electrocardiol 2023; 81:153-155. [PMID: 37708738 PMCID: PMC10872818 DOI: 10.1016/j.jelectrocard.2023.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 08/03/2023] [Accepted: 09/02/2023] [Indexed: 09/16/2023]
Abstract
Cuffless blood pressure (BP) measurement could improve hypertension awareness and control and is being widely pursued. Some have proposed to estimate BP from the electrocardiogram (ECG) alone despite little physiological basis. In this minireview, we extracted the most relevant articles related to ECG-based BP estimation. Our findings suggest that, as expected, estimating BP from ECG does not appear to be viable. Most notably, we have not found any evidence that ECG features can track BP changes. At best, certain ECG features may indicate heart disease and thus correlate with high BP, but this may not be clinically useful.
Affiliation(s)
- Cederick Landry
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Ramakrishna Mukkamala
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA; Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
|
44
|
韩 闯, 阙 文, 王 治, 王 松, 李 艳, 师 丽. [A review on intelligent auxiliary diagnosis methods based on electrocardiograms for myocardial infarction]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2023; 40:1019-1026. [PMID: 37879933 PMCID: PMC10600411 DOI: 10.7507/1001-5515.202212010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 08/22/2023] [Indexed: 10/27/2023]
Abstract
Myocardial infarction (MI) has the characteristics of high mortality rate, strong suddenness and invisibility. There are problems such as the delayed diagnosis, misdiagnosis and missed diagnosis in clinical practice. Electrocardiogram (ECG) examination is the simplest and fastest way to diagnose MI. The research on MI intelligent auxiliary diagnosis based on ECG is of great significance. On the basis of the pathophysiological mechanism of MI and characteristic changes in ECG, feature point extraction and morphology recognition of ECG, along with intelligent auxiliary diagnosis method of MI based on machine learning and deep learning are all summarized. The models, datasets, the number of ECG, the number of leads, input modes, evaluation methods and effects of different methods are compared. Finally, future research directions and development trends are pointed out, including data enhancement of MI, feature points and dynamic features extraction of ECG, the generalization and clinical interpretability of models, which are expected to provide references for researchers in related fields of MI intelligent auxiliary diagnosis.
Affiliation(s)
- 闯 韩
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, P. R. China
- 文戈 阙
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, P. R. China
- 治忠 王
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, P. R. China
- 松伟 王
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, P. R. China
- 艳婷 李
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, P. R. China
- 丽 师
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, P. R. China
|
45
|
Lin X, Gao Y, Lei F. An application of topological data analysis in predicting sumoylation sites. PeerJ 2023; 11:e16204. [PMID: 37846308 PMCID: PMC10576966 DOI: 10.7717/peerj.16204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Accepted: 09/08/2023] [Indexed: 10/18/2023] Open
Abstract
Sumoylation is a reversible post-translational modification that regulates certain significant biochemical functions in proteins. The protein alterations caused by sumoylation are associated with the incidence of some human diseases. Therefore, identifying the sites of sumoylation in proteins may provide a direction for mechanistic research and drug development. Here, we propose a new computational approach for identifying sumoylation sites using an encoding method based on topological data analysis. The features of our model captured the key physical and biological properties of proteins at multiple scales. In 10-fold cross-validation, our model achieved a sensitivity (Sn) of 96.45%, an accuracy (Acc) of 94.65%, a Matthews correlation coefficient (MCC) of 0.8946, and an area under the curve (AUC) of 0.99. The proposed predictor, using only topological features, achieves the best MCC and AUC in comparison with the other released methods. Our results suggest that topological information is an additional parameter that can assist in the prediction of sumoylation sites and provide a novel perspective for further research in protein sumoylation.
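The metrics reported in this abstract follow standard definitions over a binary confusion matrix. The sketch below shows those definitions with made-up counts; the function names and numbers are illustrative only and are not the paper's data or code.

```python
import math


def sensitivity(tp, fn):
    """Sn: fraction of true sites that are predicted as sites."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Acc: fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when any marginal count is zero
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts for a 200-sample test fold.
tp, tn, fp, fn = 90, 85, 15, 10
sn = sensitivity(tp, fn)
acc = accuracy(tp, tn, fp, fn)
score = mcc(tp, tn, fp, fn)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when positive and negative sites are imbalanced, which is a common reason for reporting it alongside Sn and Acc.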
Affiliation(s)
- Xiaoxi Lin
  - School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
- Yaru Gao
  - School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
- Fengchun Lei
  - School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
|
46
|
Ariotta V, Lehtonen O, Salloum S, Micoli G, Lavikka K, Rantanen V, Hynninen J, Virtanen A, Hautaniemi S. H&E image analysis pipeline for quantifying morphological features. J Pathol Inform 2023; 14:100339. [PMID: 37915837 PMCID: PMC10616375 DOI: 10.1016/j.jpi.2023.100339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 08/15/2023] [Accepted: 09/30/2023] [Indexed: 11/03/2023] Open
Abstract
Detecting cell types from histopathological images is essential for various digital pathology applications. However, the large number of cells in whole-slide images (WSIs) necessitates automated analysis pipelines for efficient cell type detection. Herein, we present the hematoxylin and eosin (H&E) Image Processing pipeline (HEIP) for automated analysis of scanned H&E-stained slides. HEIP is flexible, modular, open-source software that performs preprocessing, instance segmentation, and nuclei feature extraction. To evaluate the performance of HEIP, we applied it to extract cell types from WSIs of patients with ovarian high-grade serous carcinoma (HGSC). HEIP showed high precision in instance segmentation, particularly for neoplastic and epithelial cells. We also show that there is a significant correlation between genomic ploidy values and morphological features, such as the major axis of the nucleus.
Affiliation(s)
- Valeria Ariotta
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Oskari Lehtonen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Shams Salloum
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
  - Department of Pathology, University of Helsinki and HUS Diagnostic Center, Helsinki University Hospital, 00029 Helsinki, Finland
- Giulia Micoli
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Kari Lavikka
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Ville Rantanen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Johanna Hynninen
  - Department of Obstetrics and Gynaecology, University of Turku and Turku University Hospital, 200521 Turku, Finland
- Anni Virtanen
  - Department of Pathology, University of Helsinki and HUS Diagnostic Center, Helsinki University Hospital, 00029 Helsinki, Finland
- Sampsa Hautaniemi
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
|
47
|
Zhao J, Almodfer R, Wu X, Wang X. A dataset of pomegranate growth stages for machine learning-based monitoring and analysis. Data Brief 2023; 50:109468. [PMID: 37600594 PMCID: PMC10432946 DOI: 10.1016/j.dib.2023.109468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 07/24/2023] [Accepted: 07/27/2023] [Indexed: 08/22/2023] Open
Abstract
Machine learning and deep learning have grown very rapidly in recent years and are widely used in agriculture. Neat and clean datasets are a major requirement for building accurate and robust machine learning models and minimizing misclassification in real-time environments. To achieve this goal, we created a dataset of images of pomegranate growth stages. These images of pomegranate growth stages were taken from May to September from an orchard inside the Henan Institute of Science and Technology in China. The dataset contains 5857 images of pomegranates at different growth stages, which are labeled and classified into five periods: bud, flower, early-fruit, mid-growth and ripe. The dataset consists of four folders, which respectively store the images, two formats of annotation files, and the record files for the division of training, validation, and test sets. The authors have confirmed the usability of this dataset through previous research. The dataset may help researchers develop computer applications using machine learning and computer vision algorithms.
Affiliation(s)
- Jifei Zhao
  - School of Computer Science and Technology, Henan Institute of Science and Technology, Xinxiang, Henan Province, 453003, China
- Rolla Almodfer
  - School of Computer Science and Technology, Henan Institute of Science and Technology, Xinxiang, Henan Province, 453003, China
- Xiaoying Wu
  - School of Computer Science and Technology, Henan Institute of Science and Technology, Xinxiang, Henan Province, 453003, China
- Xinfa Wang
  - School of Computer Science and Technology, Henan Institute of Science and Technology, Xinxiang, Henan Province, 453003, China
|
48
|
Ngige O, Ayankoya F, Balogun J, Onuiri E, Agbonkhese C, Sanusi F. A dataset for predicting Supreme Court judgments in Nigeria. Data Brief 2023; 50:109483. [PMID: 37588617 PMCID: PMC10425661 DOI: 10.1016/j.dib.2023.109483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 08/03/2023] [Accepted: 08/04/2023] [Indexed: 08/18/2023] Open
Abstract
It has been widely argued among researchers that the application of big data analytics promises to reduce human bias and provide a scientific and evidence-based approach to the judicial process. In this dataset, historical data consisting of appeal cases presented at the Supreme Court of Nigeria (SCN) were collected from an online repository (Primsol Law Pavillion). A total of 5585 appeal cases brought before the SCN were collected from the archive. The dataset consisted of both criminal and civil appeal cases brought before the SCN. Variables that are related to court case proceedings were identified from related literature, verified by legal experts and used as a basis for generating an electronic structured version of the dataset stored as a spreadsheet file from the unstructured data. From the collected data, thirteen input variables were identified with one output/decision variable. The distribution of the numerical variables was presented as a descriptive statistical summary in terms of the minimum, maximum, mode, mean and standard deviation. The developed dataset can assist researchers to build predictive systems by training their models. Various feature extraction techniques can also be applied on the dataset to remove irrelevant or redundant features for increased performance of such classifiers that are needed to predict the outcome of legal cases.
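The descriptive summary mentioned in the abstract (minimum, maximum, mode, mean, and standard deviation) can be sketched with Python's standard library. The helper name `describe` and the sample values are hypothetical. Whether the authors used the sample or the population standard deviation is not stated, so the sample form is an assumption here.

```python
import statistics


def describe(values):
    """Summarize one numerical variable with the five statistics listed above."""
    return {
        "min": min(values),
        "max": max(values),
        "mode": statistics.mode(values),
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(values),
    }


# Hypothetical values standing in for one numerical variable of the dataset.
summary = describe([3, 5, 5, 7, 10])
print(summary)
```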
Affiliation(s)
- O.C. Ngige
  - Federal Institute of Industrial Research, Oshodi, Nigeria
- F.Y. Ayankoya
  - Department of Computer Science, Babcock University, Ilishan-Remo, Ogun State, Nigeria
- J.A. Balogun
  - Department of Computer Science and Mathematics, Mountain Top University, Nigeria
- E. Onuiri
  - Department of Computer Science, Babcock University, Ilishan-Remo, Ogun State, Nigeria
- C. Agbonkhese
  - Department of Computer Science and Mathematics, Mountain Top University, Nigeria
- F.A. Sanusi
  - Department of Computer Science and Mathematics, Mountain Top University, Nigeria
|
49
|
Xu C, Yi K, Jiang N, Li X, Zhong M, Zhang Y. MDFF-Net: A multi-dimensional feature fusion network for breast histopathology image classification. Comput Biol Med 2023; 165:107385. [PMID: 37633086 DOI: 10.1016/j.compbiomed.2023.107385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Revised: 07/23/2023] [Accepted: 08/14/2023] [Indexed: 08/28/2023]
Abstract
Breast cancer is a common malignancy, and its early detection and treatment are crucial. Computer-aided diagnosis (CAD) based on deep learning has significantly advanced medical diagnostics in recent years, enhancing accuracy and efficiency. Despite its convenience, this technology has certain limitations. When the morphological characteristics of the patient's pathological section are complex or not evident, certain small lesions or cells deep within the lesion cannot be recognized, and misdiagnosis is prone to occur. To address this, MDFF-Net, a CNN-based multidimensional feature fusion network, is proposed. The model consists of a one-dimensional feature extraction network, a two-dimensional feature extraction network, and a feature fusion classification network. The backbone of the two-dimensional feature extraction network is built by stacking modules that integrate multi-scale channel shuffling networks and channel attention modules. Furthermore, inspired by natural language processing, the model integrates a one-dimensional feature extraction network to extract detailed information from the image, avoiding misdiagnosis caused by insufficient extraction of information such as cell morphological characteristics and degree of differentiation. Finally, the extracted one-dimensional and two-dimensional features are fused in the feature fusion network and used for the final classification. The effectiveness of MDFF-Net and of classical classification models was evaluated on the BreakHis and BACH datasets. According to the experimental results, MDFF-Net achieves an accuracy of 98.86% on BreakHis and 86.25% on BACH. To further assess the effectiveness of the model in other classification tasks, colon cancer and lung cancer datasets were used for additional experiments, with a classification accuracy of 100% in both cases.
Affiliation(s)
- Cheng Xu
  - School of Information Engineering, East China Jiaotong University, Nanchang, 330013, China
- Ke Yi
  - School of Information Engineering, East China Jiaotong University, Nanchang, 330013, China
- Nan Jiang
  - School of Information Engineering, East China Jiaotong University, Nanchang, 330013, China
- Xiong Li
  - School of Software, East China Jiaotong University, Nanchang, 330013, China
- Meiling Zhong
  - School of Materials Science and Engineering, East China Jiaotong University, Nanchang, 330013, China
- Yuejin Zhang
  - School of Information Engineering, East China Jiaotong University, Nanchang, 330013, China
|
50
|
Ding X, Liu Y, Zhao J, Wang R, Li C, Luo Q, Shen C. A novel wavelet-transform-based convolution classification network for cervical lymph node metastasis of papillary thyroid carcinoma in ultrasound images. Comput Med Imaging Graph 2023; 109:102298. [PMID: 37769402 DOI: 10.1016/j.compmedimag.2023.102298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 08/29/2023] [Accepted: 08/29/2023] [Indexed: 09/30/2023]
Abstract
Accurate preoperative assessment of cervical lymph node metastasis (CLNM), covering both qualitative diagnosis and localization, is important for choosing the best treatment option for patients with papillary thyroid cancer. Non-destructive, non-invasive ultrasound is currently the imaging method of choice for assessing lymph node metastasis. Based on the characteristics of lymph nodes and ultrasound images, this paper proposes a multitask network framework for diagnosing metastatic lymph nodes in ultrasound images. Its localization module not only provides the location of lymph nodes, so that the network focuses on the lymph nodes themselves and their peripheral regions, but also provides structural features of lymph nodes to the subsequent classification module. In the classification module, we design a novel wavelet-transform-based convolution network. The wavelet transform is introduced into the deep learning convolution module to analyze ultrasound images in both the spatial and frequency domains, which enriches the feature information and improves the classification performance of the model without increasing the number of model parameters. We collected data from 510 patients (N = 1376) at Shanghai Sixth People's Hospital covering ultrasound images of lymph nodes in the neck, and also used three publicly available ultrasound datasets: SCUI2020 (N = 2914), DDTI (N = 480), and BUSI (N = 780). Compared with the best two-stage model, our model improves accuracy and AUC by 5.83% and 4%, respectively, outperforming the two-stage architectures and also surpassing the latest classification networks.
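To illustrate what a wavelet transform contributes here, the following is a minimal sketch of a single-level 2D Haar decomposition, which splits an image into one low-frequency sub-band and three detail sub-bands. The Haar wavelet is an assumption chosen for simplicity; the abstract does not state which wavelet the authors use or how it is wired into the convolution module. The function names are hypothetical, and sub-band naming conventions vary between libraries.

```python
def haar_1d(signal):
    """One level of the orthonormal Haar transform on an even-length sequence."""
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def _along_columns(block):
    """Apply haar_1d to every column of a 2D block; return (low, high) blocks."""
    cols = [haar_1d(list(col)) for col in zip(*block)]
    low = [list(row) for row in zip(*(c[0] for c in cols))]
    high = [list(row) for row in zip(*(c[1] for c in cols))]
    return low, high


def haar_2d(image):
    """One level of the 2D Haar transform: filter rows first, then columns."""
    row_low, row_high = zip(*(haar_1d(row) for row in image))
    ll, lh = _along_columns(row_low)    # low-pass rows: approximation + one detail band
    hl, hh = _along_columns(row_high)   # high-pass rows: two further detail bands
    return ll, lh, hl, hh


# Hypothetical 4x4 constant "image": all energy should land in the LL band.
image = [[1.0] * 4 for _ in range(4)]
ll, lh, hl, hh = haar_2d(image)
print(ll)
```

Each sub-band has half the spatial resolution of the input, and the four bands together hold exactly as many coefficients as the original image, which is why this kind of decomposition adds frequency information without adding learnable parameters.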
Affiliation(s)
- Xuehai Ding
  - School of Computer Engineering and Science, Shanghai University, Shangda Rd, Shanghai, 200444, China
- Yanting Liu
  - School of Computer Engineering and Science, Shanghai University, Shangda Rd, Shanghai, 200444, China
- Junjuan Zhao
  - School of Computer Engineering and Science, Shanghai University, Shangda Rd, Shanghai, 200444, China
- Ren Wang
  - Department of Ultrasound Medicine, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Yishan Rd, Shanghai, 200233, China
- Chengfan Li
  - School of Computer Engineering and Science, Shanghai University, Shangda Rd, Shanghai, 200444, China
- Quanyong Luo
  - Department of Nuclear Medicine, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Yishan Rd, Shanghai, 200233, China
- Chentian Shen
  - Department of Nuclear Medicine, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Yishan Rd, Shanghai, 200233, China
|