1
Ma Z, Wang H, Shan S, Zhu K, Yuan L. Effect of metformin on type 2 diabetes mellitus based on the volume of thyroid nodules tracked by artificial intelligence. Journal of Radiation Research and Applied Sciences 2023. [DOI: 10.1016/j.jrras.2023.100566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]

2
Kosolwattana T, Liu C, Hu R, Han S, Chen H, Lin Y. A self-inspected adaptive SMOTE algorithm (SASMOTE) for highly imbalanced data classification in healthcare. BioData Min 2023; 16:15. [PMID: 37098549 PMCID: PMC10131309 DOI: 10.1186/s13040-023-00330-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/12/2022] [Accepted: 03/09/2023] [Indexed: 04/27/2023] [Open Access]
Abstract
In many healthcare applications, datasets for classification are highly imbalanced because target events such as disease onset are rare. The SMOTE (Synthetic Minority Over-sampling Technique) algorithm is an effective resampling method for imbalanced data classification that oversamples the minority class. However, samples generated by SMOTE may be ambiguous, of low quality, and not separable from the majority class. To improve the quality of generated samples, we propose a novel self-inspected adaptive SMOTE (SASMOTE) model that uses an adaptive nearest-neighborhood selection algorithm to identify the "visible" nearest neighbors, which are then used to generate samples likely to fall into the minority class. To further improve sample quality, SASMOTE introduces uncertainty elimination via self-inspection, which filters out generated samples that are highly uncertain and inseparable from the majority class. The proposed algorithm is compared with existing SMOTE-based algorithms and demonstrated through two real-world healthcare case studies: risk gene discovery and fatal congenital heart disease prediction. By generating higher-quality synthetic samples, the proposed algorithm achieves better prediction performance (in terms of F1 score) on average than the other methods, which is promising for improving the usability of machine learning models on highly imbalanced healthcare data.
Affiliation(s)
- Chenang Liu
  - School of Industrial Engineering & Management, Oklahoma State University, Stillwater, USA
- Renjie Hu
  - Department of Information and Logistics Technology, University of Houston, Houston, USA
- Shizhong Han
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, USA
  - Lieber Institute for Brain Development, Baltimore, USA
- Hua Chen
  - Department of Pharmaceutical Health Outcomes and Policy, University of Houston, Houston, USA
- Ying Lin
  - Department of Industrial Engineering, University of Houston, Houston, USA
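
The abstract for this entry builds on SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The sketch below shows only that baseline interpolation step, in plain Python. It does not implement SASMOTE's "visible" neighbor selection or its self-inspection filter, and the function name `smote`, its parameters, and the toy data are illustrative assumptions rather than anything taken from the paper.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Baseline SMOTE-style oversampling (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two minority samples, it always stays inside the bounding box of the minority class. That property is also why plain SMOTE can place points in regions that overlap the majority class, which is the problem the abstract says SASMOTE targets.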

3
R R, Gobalakrishnan N, Chokkalingam A. Detection of Turner syndrome using hand X-ray using anchor based links segmentation method. Proc Inst Mech Eng H 2022; 236:9544119221075496. [PMID: 35118910 DOI: 10.1177/09544119221075496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Turner Syndrome (TS) is a chromosomal disorder that affects growth in females. TS causes a range of developmental and medical issues, including immature ovaries, short stature, and heart abnormalities. TS may be detected before birth, during infancy, or in early childhood. In girls with mild signs and symptoms, diagnosis is sometimes delayed until adolescence or young adulthood. This study presents an algorithm to segment digital hand X-ray images of children with TS. Image segmentation is a demanding and crucial task in medical image analysis and computer vision. Despite years of study, prevailing segmentation algorithms still suffer from common problems, including under-segmentation, over-segmentation, and spurious or non-closed edges. In this paper, an Anchor Based Link (ABL) segmentation approach is proposed to detect TS from the fourth metacarpal bone in left-hand X-ray images. The proposed approach is compared with existing watershed segmentation and the Gaussian-Mixture-Model-based Hidden-Markov-Random-Field (GMM-HMRF) method. It attains better segmentation, and the height-to-width ratio of the left fourth finger, measured from the edge pixels of the segmented metacarpal bone, is analyzed for normal children and children with TS. The suggested method is verified on fifty (50) sample X-ray hand images of carpal bones, yielding an average Dice coefficient of 0.60 ± 0.02.
Affiliation(s)
- Ramachandran R
  - Research Scholar, Anna University, Chennai, Tamil Nadu, India
- N Gobalakrishnan
  - Department of Information Technology, Sri Venkateswara College of Engineering, Sriperumbudur, Chennai, Tamil Nadu, India
- Arun Chokkalingam
  - Department of ECE, R.M.K College of Engineering and Technology, Chennai, Tamil Nadu, India
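
The abstract for this entry reports segmentation quality as an average Dice coefficient. For a predicted mask A and a reference mask B, the Dice coefficient is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). The sketch below computes it for small binary masks given as nested lists. The function name and the toy masks are illustrative assumptions and are unrelated to the paper's data.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    pred_on = {(r, c) for r, row in enumerate(pred) for c, v in enumerate(row) if v}
    truth_on = {(r, c) for r, row in enumerate(truth) for c, v in enumerate(row) if v}
    if not pred_on and not truth_on:
        return 1.0  # convention: two empty masks agree perfectly
    return 2 * len(pred_on & truth_on) / (len(pred_on) + len(truth_on))


# Hypothetical 2x3 masks: 3 foreground pixels each, 2 of them shared
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

An average Dice coefficient over a test set, as quoted in the abstract, would be the mean of this per-image score.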

4
Huang Z, Yang C, Chen X, Huang K, Xie Y. Adaptive over-sampling method for classification with application to imbalanced datasets in aluminum electrolysis. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04208-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]

5
Liu Z, Ma C, Gu J, Yu M. Potential biomarkers of acute myocardial infarction based on weighted gene co-expression network analysis. Biomed Eng Online 2019; 18:9. [PMID: 30683112 PMCID: PMC6347746 DOI: 10.1186/s12938-019-0625-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/26/2018] [Accepted: 03/01/2018] [Indexed: 12/31/2022] [Open Access]
Abstract
BACKGROUND Acute myocardial infarction (AMI) is a common cause of mortality in developed countries. The feasibility of whole-genome gene expression analysis to identify outcome-related genes and dysregulated pathways remains unknown. Molecular markers such as BNP, CRP and other serum inflammatory markers have attracted attention. However, these biomarkers are also elevated in patients with thyroid disease, renal failure and congestive heart failure. In this study, three groups of microarray data sets (GSE66360, GSE48060, GSE29532) were collected from GEO, with a total of 99, 52 and 55 samples, respectively. Weighted gene co-expression network analysis (WGCNA) was performed to obtain a classifier composed of related genes that best characterize AMI. RESULTS This study obtained three groups of microarray data sets (GSE66360, GSE48060, GSE29532) on AMI blood samples, with a total of 99, 52 and 24 samples, respectively. In all, 4672, 3185 and 3660 genes were identified in the GSE66360, GSE48060 and GSE60993 modules, respectively. We performed WGCNA, GO and KEGG pathway enrichment analysis on these three data sets and found functional enrichment of the differentially expressed genes in inflammation and immune response. Transcriptome analysis was performed in AMI patients at four time points and compared with CAD patients with no history of MI, to determine gene expression profiles and their possible changes during recovery from myocardial infarction. CONCLUSIONS The results suggest that three genes overlapping between two modules (FGFBP2, GFOD1 and MLC1) could serve as potential biomarkers for the diagnosis of AMI.
Affiliation(s)
- Zhihua Liu
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Beijing Yuqiu Medical Research Institute, Beijing, 100022, China; Shenzhen Yuqiu Biological Big Data Research Institute, Shenzhen, 518033, China; Nanjing Yuqiu Biotechnology Co., Ltd., Nanjing, 210009, China
- Chenguang Ma
  - Tsinghua University, Beijing, 100084, China; Beijing Yuqiu Medical Research Institute, Beijing, 100022, China; Shenzhen Yuqiu Biological Big Data Research Institute, Shenzhen, 518033, China; Nanjing Yuqiu Biotechnology Co., Ltd., Nanjing, 210009, China
- Junhua Gu
  - Shenzhen Yuqiu Biological Big Data Research Institute, Shenzhen, 518033, China; Nanjing Yuqiu Biotechnology Co., Ltd., Nanjing, 210009, China; Hebei University of Technology, Tianjin, 300130, China
- Ming Yu
  - Shenzhen Yuqiu Biological Big Data Research Institute, Shenzhen, 518033, China; Nanjing Yuqiu Biotechnology Co., Ltd., Nanjing, 210009, China; Hebei University of Technology, Tianjin, 300130, China

6
Dorado-Moreno M, Gutiérrez PA, Cornejo-Bueno L, Prieto L, Salcedo-Sanz S, Hervás-Martínez C. Ordinal Multi-class Architecture for Predicting Wind Power Ramp Events Based on Reservoir Computing. Neural Process Lett 2018. [DOI: 10.1007/s11063-018-9922-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

7
Medical image analysis of knee joint lipoma arborescens and arthroscopic treatment. Comput Med Imaging Graph 2018; 66:66-72. [PMID: 29567561 DOI: 10.1016/j.compmedimag.2018.01.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/29/2017] [Revised: 01/03/2018] [Accepted: 01/08/2018] [Indexed: 11/22/2022]
Abstract
OBJECTIVE Arthroscopy is a minimally invasive surgical procedure in which a joint is examined and treated using a surgical device known as an arthroscope. Lipoma arborescens (LA), an infrequent intra-articular lesion, originates from mature adipose cells under subsynovial tissue. The synovial membrane is pale yellow with large villous projections. It is caused by various underlying factors. We found many patients with LA and treated them appropriately. This research investigated the therapeutic effect of semi-automated arthroscopic diagnosis and treatment of the knee joint. METHODS We used a Stryker arthroscope, 4 mm in diameter with a 30° angle, in surgery. Patients were chosen by biomechanical analysis and scanning mode. All patients underwent radiographic imaging, Magnetic Resonance Imaging (MRI), and assessment with the Lysholm score and Visual Analogue Scale (VAS). Arthroscopic limited synovectomy was carried out on these patients. RESULTS The wounds of all patients healed. Follow-up covered chief complaints, range of motion of the knee joint, VAS and Lysholm score. No swelling or effusion of the affected knee was found in any patient during follow-up. Postoperative symptoms were markedly alleviated in fourteen patients and partially alleviated in one. All patients were satisfied with the therapeutic effect. CONCLUSION We performed biomechanical analysis based on slight flexion and extension of the knee. Arthroscopy is an endoscopic technique for the diagnosis and treatment of joint diseases. Semi-automated arthroscopic debridement is effective for early and mid-term osteoarthritis with lipoma arborescens.

8
Shang L, Liu C, Tomiura Y, Hayashi K. Machine-Learning-Based Olfactometer: Prediction of Odor Perception from Physicochemical Features of Odorant Molecules. Anal Chem 2017; 89:11999-12005. [PMID: 29027463 DOI: 10.1021/acs.analchem.7b02389] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 01/23/2023]
Abstract
Gas chromatography/olfactometry (GC/O) has been used in various fields as a valuable method to identify odor-active components in a complex mixture. Because human assessors serve as detectors to obtain the olfactory perception of separated odorants, the GC/O technique is limited by its subjectivity, variability, and the high cost of trained panelists. Here, we present a proof-of-concept model in which odor information is obtained by machine-learning-based prediction from molecular parameters (MPs) of odorant molecules. The odor prediction models were established using a database of flavors and fragrances containing 1026 odorants and their corresponding verbal odor descriptors (ODs). Physicochemical parameters of the odorant molecules were acquired with molecular calculation software (DRAGON). Ten representative ODs were selected to build the prediction models based on their high frequency of occurrence in the database. Features of the MPs were extracted via either unsupervised (principal component analysis) or supervised (Boruta, BR) approaches and then used as input to calibrate machine-learning models. Predictions were performed by various machine-learning approaches such as support vector machine (SVM), random forest, and extreme learning machine. All models were optimized via parameter tuning and their prediction accuracies were compared. An SVM model combined with feature extraction by BR-C (confirmed only) was found to give the best results, with an accuracy of 97.08%. The models were validated by comparing predicted and measured results for the GC/O data of an apple sample. The prediction models can be used as an auxiliary tool in existing GC/O by suggesting possible OD candidates to the panelists, thus helping them reach more objective and correct judgments. In addition, a machine-based GC/O in which the panelist is no longer needed might be expected after further development of the proposed odor prediction technique.
Affiliation(s)
- Chuanjun Liu
  - Research Laboratory, U.S.E. Company, Limited, Tokyo 150-0013, Japan

9
Adaptive Swarm Balancing Algorithms for rare-event prediction in imbalanced healthcare data. PLoS One 2017; 12:e0180830. [PMID: 28753613 PMCID: PMC5533448 DOI: 10.1371/journal.pone.0180830] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 03/27/2017] [Accepted: 06/04/2017] [Indexed: 11/19/2022] [Open Access]
Abstract
Clinical data analysis and forecasting have made substantial contributions to disease control, prevention and detection. However, such data usually suffer from highly imbalanced class distributions. In this paper, we aim to formulate effective methods to rebalance binary imbalanced datasets in which the positive samples are the minority. We investigate two meta-heuristic algorithms, particle swarm optimization and the bat algorithm, and apply them to enhance the effect of the synthetic minority over-sampling technique (SMOTE) in pre-processing the datasets. One approach processes the full dataset as a whole. The other splits the dataset and adaptively processes it one segment at a time. The experimental results show that the performance improvements obtained by the former approach do not scale to larger datasets. The latter methods, which we call Adaptive Swarm Balancing Algorithms, lead to significant efficiency and effectiveness improvements on large datasets where the first approach fails. We also find them more consistent with how typical large imbalanced medical datasets are handled in practice. We further use the meta-heuristic algorithms to optimize two key parameters of SMOTE. The proposed methods yield more credible classifier performance and shorter run times than the brute-force method.

10
Elitist Binary Wolf Search Algorithm for Heuristic Feature Selection in High-Dimensional Bioinformatics Datasets. Sci Rep 2017; 7:4354. [PMID: 28659577 PMCID: PMC5489518 DOI: 10.1038/s41598-017-04037-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 11/21/2016] [Accepted: 05/09/2017] [Indexed: 12/02/2022] [Open Access]
Abstract
To address the high dimensionality of bioinformatics datasets, we propose a new method based on the Wolf Search Algorithm (WSA) for optimising feature selection. The proposed approach uses the natural strategy established by Charles Darwin; that is, ‘It is not the strongest of the species that survives, but the most adaptable’. This means that in the evolution of a swarm, the elitists are motivated to quickly obtain more and better resources. The memory function helps the proposed method avoid repeated searches of the worst positions, which makes the search more effective, while the binary strategy reduces the feature selection problem to a similar function optimisation problem. Furthermore, the wrapper strategy combines these strengthened wolves with an extreme learning machine classifier to find a sub-dataset with a reasonable number of features that maximises the accuracy of global classification models. Experimental results on six public high-dimensional bioinformatics datasets demonstrate that the proposed method can outperform some conventional feature selection methods by up to 29% in classification accuracy, and outperform previous WSAs by up to 99.81% in computational time.
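
Three ideas from the abstract above can be illustrated with a much simpler stand-in: a binary mask encodes which features are selected, a wrapper-style fitness function scores each mask, and a memory of visited masks avoids repeated evaluations. The sketch below is a single-solution elitist bit-flip search, not the Wolf Search Algorithm, and its toy fitness function replaces the extreme learning machine classifier the paper uses. All names (`elitist_binary_search`, `toy_fitness`, `INFORMATIVE`) are hypothetical.

```python
import random


def elitist_binary_search(n_features, fitness, n_iter=200, seed=0):
    """Elitist bit-flip search over binary feature masks (illustrative only)."""
    rng = random.Random(seed)
    best = tuple(rng.randint(0, 1) for _ in range(n_features))
    best_score = fitness(best)
    visited = {best}  # memory: masks that have already been evaluated
    for _ in range(n_iter):
        i = rng.randrange(n_features)
        candidate = best[:i] + (1 - best[i],) + best[i + 1:]  # flip one bit
        if candidate in visited:
            continue  # skip repeated evaluations
        visited.add(candidate)
        score = fitness(candidate)
        if score > best_score:  # elitism: keep only strict improvements
            best, best_score = candidate, score
    return best, best_score


# Toy stand-in for a classifier-based wrapper score (assumption):
# reward selecting the informative features, penalise mask size.
INFORMATIVE = {0, 3}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_score = elitist_binary_search(6, toy_fitness)
```

In a real wrapper method, `toy_fitness` would be replaced by training and validating a classifier on the selected feature subset, which is why avoiding repeated evaluations matters for run time.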