1
Hennebelle A, Ismail L, Materwala H, Al Kaabi J, Ranjan P, Janardhanan R. Secure and privacy-preserving automated machine learning operations into end-to-end integrated IoT-edge-artificial intelligence-blockchain monitoring system for diabetes mellitus prediction. Comput Struct Biotechnol J 2024; 23:212-233. [PMID: 38169966 PMCID: PMC10758733 DOI: 10.1016/j.csbj.2023.11.038] [Received: 07/11/2023] [Revised: 11/20/2023] [Accepted: 11/20/2023] [Indexed: 01/05/2024]
Abstract
Diabetes Mellitus, one of the leading causes of death worldwide, has no cure to date and can lead to severe health complications, such as retinopathy, limb amputation, cardiovascular diseases, and neuronal disease, if left untreated. Consequently, it becomes crucial to be able to monitor and predict the incidence of diabetes. Machine learning approaches have been proposed and evaluated in the literature for diabetes prediction. This paper proposes an IoT-edge-Artificial Intelligence (AI)-blockchain system for diabetes prediction based on risk factors. The proposed system is underpinned by blockchain to obtain a cohesive view of the risk factors data from patients across different hospitals and ensure security and privacy of the user's data. We provide a comparative analysis of different medical sensors, devices, and methods to measure and collect the risk factors values in the system. Numerical experiments and comparative analysis were carried out within our proposed system, using the most accurate random forest (RF) model, and the two most used state-of-the-art machine learning approaches, Logistic Regression (LR) and Support Vector Machine (SVM), using three real-life diabetes datasets. The results show that the proposed system predicts diabetes using RF with 4.57% more accuracy on average in comparison with the other models LR and SVM, with 2.87 times more execution time. Data balancing without feature selection does not show significant improvement. When using feature selection, the performance is improved by 1.14% for PIMA Indian and 0.02% for Sylhet datasets, while it is reduced by 0.89% for MIMIC III.
Affiliation(s)
- Alain Hennebelle
  - School of Computing and Information Systems, The University of Melbourne, Australia
- Leila Ismail
  - School of Computing and Information Systems, The University of Melbourne, Australia
  - Intelligent Distributed Computing and Systems Lab, Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, United Arab Emirates
  - National Water and Energy Center, United Arab Emirates University, United Arab Emirates
- Huned Materwala
  - Intelligent Distributed Computing and Systems Lab, Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, United Arab Emirates
  - National Water and Energy Center, United Arab Emirates University, United Arab Emirates
- Juma Al Kaabi
  - College of Medicine and Health Sciences, Department of Internal Medicine, United Arab Emirates University, United Arab Emirates
  - Tawam and Mediclinic Hospitals, Al Ain, Abu Dhabi, United Arab Emirates
- Priya Ranjan
  - School of Computer Science, Internet of Things Center of Excellence, University of Petroleum and Energy Studies, India
- Rajiv Janardhanan
  - Faculty of Medical & Health Sciences, SRM Institute of Science & Technology, India
2
Moezzi SMM, Mohammadi M, Mohammadi M, Saloglu D, Sheikholeslami R. Machine learning insights into PM2.5 changes during COVID-19 lockdown: LSTM and RF analysis in Mashhad. Environ Monit Assess 2024; 196:453. [PMID: 38619639 DOI: 10.1007/s10661-024-12567-5] [Received: 12/23/2023] [Accepted: 03/23/2024] [Indexed: 04/16/2024]
Abstract
This study seeks to investigate the impact of COVID-19 lockdown measures on air quality in the city of Mashhad employing two strategies. We initiated our research using basic statistical methods such as paired sample t-tests to compare hourly PM2.5 data in two scenarios: before and during quarantine, and pre- and post-lockdown. This initial analysis provided a broad understanding of potential changes in air quality. Notably, a low reduction of 2.40% in PM2.5 was recorded when compared to air quality prior to the lockdown period. This finding highlights the wide range of factors that impact the levels of particulate matter in urban settings, with the transportation sector often being widely recognized as one of the principal causes of this issue. Nevertheless, throughout the period after the quarantine, a remarkable decrease in air quality was observed characterized by distinct seasonal patterns, in contrast to previous years. This finding demonstrates a significant correlation between changes in human mobility patterns and their influence on the air quality of urban areas. It also emphasizes the need to use air pollution modeling as a fundamental tool to evaluate and understand these linkages to support long-term plans for reducing air pollution. To obtain a more quantitative understanding, we then employed cutting-edge machine learning methods, such as random forest and long short-term memory algorithms, to accurately determine the effect of the lockdown on PM2.5 levels. Our models' results demonstrated remarkable efficacy in assessing the pollutant concentration in Mashhad during lockdown measures. The test set yielded an R-squared value of 0.82 for the long short-term memory network model, whereas the random forest model showed a calculated cross-validation R-squared of 0.78. The required computational cost for training the LSTM and the RF models across all data was 25 min and 3 s, respectively. 
In summary, through the integration of statistical methods and machine learning, this research attempts to provide a comprehensive understanding of the impact of human interventions on air quality dynamics.
Affiliation(s)
- Mitra Mohammadi
  - Department of Environmental Science, Kheradgarayan Motahar Institute of Higher Education, Mashhad, Iran
- Didem Saloglu
  - Department of Disaster and Emergency Management, Disaster Management Institute, Istanbul Technical University, Istanbul, Turkey
- Razi Sheikholeslami
  - Department of Civil Engineering, Sharif University of Technology, Tehran, Iran
3
Gupta P, Shukla DP. Demi-decadal land use land cover change analysis of Mizoram, India, with topographic correction using machine learning algorithm. Environ Sci Pollut Res Int 2024:10.1007/s11356-024-33094-3. [PMID: 38609681 DOI: 10.1007/s11356-024-33094-3] [Received: 04/20/2023] [Accepted: 03/22/2024] [Indexed: 04/14/2024]
Abstract
Mizoram (India) is part of UNESCO's biodiversity hotspots in India and is primarily populated by tribes who engage in shifting agriculture. Hence, the land use land cover (LULC) pattern of the state is frequently changing. We have used Landsat 5 and 8 satellite images to prepare LULC maps from 2000 to 2020 at 5-year intervals. The atmospherically corrected images were pre-processed for removal of cloud cover and then classified into six classes: waterbodies, farmland, settlement, open forest, dense forest, and bare land. We applied four machine learning (ML) algorithms for classification, namely, random forest (RF), classification and regression tree (CART), minimum distance (MD), and support vector machine (SVM), to the images from 2000 to 2020. With 80% training and 20% testing data, we found that the RF classifier achieved higher accuracy than the other classifiers. The average overall accuracy (OA) and Kappa coefficient (KC) from 2000 to 2020 were 84.00% and 0.79 when the RF classifier was used. When using SVM, CART, and MD, the average OA and KC were 78.06%, 0.73; 78.60%, 0.72; and 73.32%, 0.65, respectively. We utilised three methods of topographic correction, namely, C-correction, SCS (sun canopy sensor) correction, and SCS + C correction, to reduce the misclassification due to shadow effects. SCS + C correction worked best for this region; hence, we prepared LULC maps on SCS + C corrected satellite images. We therefore used the RF classifier to prepare demi-decadal LULC maps from 2000 to 2020. The OA for 2000, 2005, 2010, 2015, and 2020 was found to be 84%, 81%, 81%, 85%, and 89%, respectively, using RF. Dense forest decreased from 2000 to 2020 with an increase in open forest, settlement, and agriculture; nevertheless, when farmland was low, there was an increase in barren land. The results were significantly improved with the topographic correction, and misclassification was considerably reduced.
Affiliation(s)
- Priyanka Gupta
  - DExtER Lab, School of Civil and Environmental Engineering, A-11 Building, North Campus, IIT Mandi, Mandi, Himachal Pradesh, India, 175075
- Dericks Praise Shukla
  - DExtER Lab, School of Civil and Environmental Engineering, A-11 Building, North Campus, IIT Mandi, Mandi, Himachal Pradesh, India, 175075
4
Habib N, Saqib M, Najeh T, Gamil Y. Eco-Transformation of construction: Harnessing machine learning and SHAP for crumb rubber concrete sustainability. Heliyon 2024; 10:e26927. [PMID: 38463877 PMCID: PMC10920364 DOI: 10.1016/j.heliyon.2024.e26927] [Received: 09/13/2023] [Revised: 02/14/2024] [Accepted: 02/21/2024] [Indexed: 03/12/2024]
Abstract
Researchers have focused their efforts on investigating the integration of crumb rubber as a substitute for conventional aggregates and cement in concrete. Nevertheless, the manufacture of crumb rubber concrete (CRC) has been linked to the release of noxious pollutants, hence presenting potential environmental hazards. Rather than developing novel CRC formulations, the primary objective of this work is to construct an extensive database by leveraging prior research efforts. The study places particular emphasis on two crucial concrete properties: compressive strength (fc') and tensile strength (fts). The database includes a total of 456 data points for fc' and 358 data points for fts, focusing on nine essential characteristics that have a substantial impact on both attributes. The research employs several machine learning algorithms, including both individual and ensemble methods, to undertake a comprehensive analysis of the created databases for fc' and fts. In order to ascertain the correctness of the models, a comparative analysis of machine learning techniques, namely decision tree (DT) and random forest (RF), is conducted using statistical evaluation. Cross-validation approaches are used in order to address the possible issues of overfitting. Furthermore, the Shapley additive explanations (SHAP) approach is used to investigate the influence of input parameters and their interrelationships. The findings demonstrate that the RF methodology has superior performance compared to other ensemble techniques, as shown by its lower error rates and higher coefficient of determination (R²) of 0.87 and 0.85 for fc' and fts respectively. When comparing ensemble approaches, it can be seen that AdaBoost outperforms bagging by 6% for both outcome models and individual decision tree learners by 17% and 21% for fc' and fts respectively in terms of performance. The average accuracy of AdaBoost algorithm for both the models is 84%.
Significantly, the age and the inclusion of crumb rubber in CRC are identified as the primary criteria that have a substantial influence on the mechanical properties of this particular kind of concrete.
Affiliation(s)
- Nudrat Habib
  - Department of Computer Science, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, Pakistan
- Muhammad Saqib
  - Department of Civil Engineering, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, Pakistan
- Taoufik Najeh
  - Operation and Maintenance, Operation, Maintenance and Acoustics, Department of Civil, Environmental and Natural Resources Engineering, Lulea University of Technology, Sweden
- Yaser Gamil
  - Department of Civil Engineering, School of Engineering, Monash University Malaysia, Jalan Lagoon Selatan, 47500 Bandar Sunway, Selangor, Malaysia
5
Xing Y, Jin Y, Liu Y. Construction and comparison of short-term prognosis prediction model based on machine learning in acute ischemic stroke. Heliyon 2024; 10:e24232. [PMID: 38234895 PMCID: PMC10792580 DOI: 10.1016/j.heliyon.2024.e24232] [Received: 04/18/2023] [Revised: 11/25/2023] [Accepted: 01/04/2024] [Indexed: 01/19/2024]
Abstract
Objective To construct and compare short-term prognosis prediction models of acute ischemic stroke (AIS) using machine learning (ML). Methods Retrospective study. The group W (mRS≤3) was clustered, and combined with group P (mRS>3) to form the post-clustering dataset for modeling. The "glmnet", "rpart", "xgboost", "randomForest", and "neuralnet" packages were used to construct ML models. The accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the models were compared. Four external clinical datasets were used for external clinical validation. The optimal prediction model was determined by variable screening ability, model visualization, and external clinical validation performance. Results The post-clustering dataset contains 139 patients (group W) and 122 patients (group P). The neutrophil multiplied by D-dimer (NDM) has predictive value in all ML prediction models in this study. In the decision tree model, NDMQ occupies the first tree node. When NDM≤5.62 and age<74.5, the probability of poor prognosis of AIS is less than 20%. When NDM>5.62 and accompanied by pneumonia, the incidence of poor prognosis of AIS is about 90%. In the Random Forest (RF) model, NDMQ had the highest Gini index. The variable combination screened by the RF model had the best performance in the neural network, and the accuracy, sensitivity, specificity, PPV, and NPV of the external validation were 0.800, 0.774, 0.833, 0.857, and 0.741, respectively. The RF model had the best performance in the external clinical validation datasets, with accuracies of 0.646, 0.697, 0.695, and 0.713, respectively. Conclusions NDM shows predictive value for AIS short-term prognosis in all ML models in this study. The RF model was optimal both in screening characteristic variables and in performance on the external clinical datasets.
In the analysis of medical data with small sample size and outcome as categorical variables, RF could be used as the main algorithm to build a model.
Affiliation(s)
- Yinting Xing
  - Department of Clinical Laboratory, The First Affiliated Hospital of Harbin Medical University, Harbin City, Heilongjiang Province, China
- Yingyu Jin
  - Department of Clinical Laboratory, The First Affiliated Hospital of Harbin Medical University, Harbin City, Heilongjiang Province, China
- Yanhong Liu
  - Department of Clinical Laboratory, The Second Affiliated Hospital of Harbin Medical University, Harbin City, Heilongjiang Province, China
6
Wang Y, Shi F, Yao P, Sheng Y, Zhao C. Assessing the evolution and attribution of watershed resilience in arid inland river basins, Northwest China. Sci Total Environ 2024; 906:167534. [PMID: 37797763 DOI: 10.1016/j.scitotenv.2023.167534] [Received: 06/28/2023] [Revised: 09/28/2023] [Accepted: 09/30/2023] [Indexed: 10/07/2023]
Abstract
Water scarcity significantly limits the sustainable development of oasis economies in arid inland river basins. Quantifying watershed resilience and its drivers is a major focus in the fields of hydrology and water resources. In this study, the resilience indicator pi represents watershed resilience, while meteorological, hydrological, socioeconomic, and ecological factors are used to investigate the spatial and temporal patterns of resilience and important driving factors in the Hotan River Basin from 1958 to 2020 by combining principal component analysis and random forest model. Results show that the overall resilience of the Hotan River Basin is low, decreasing from the upper (upstream) to the middle and lower (downstream) reaches, and that the intensity of human activities has a negative impact on resilience. Rivers are more likely to reach maximum resilience after experiencing periods of wet and dry conditions, although there is a lag in this progress. The random forest machine learning algorithm was used to accurately predict the resilience levels of the two upstream tributaries Yurungkash and Karakash Rivers, and the downstream Hotan River, with classification accuracies of 84.2 %, 71.4 %, and 87 %, respectively. The factors affecting the resilience of the Yurungkash River are the 30-day maximum, base flow index, low pulse duration, median streamflow in May, median streamflow in August, median streamflow in October, and 7-day maximum. The set of factors used to classify the resilience of the Karakash River include the 7-day maximum, 1-day maximum, median streamflow in June, 30-day maximum, 3-day maximum, median streamflow in February, and autumn temperature. The factors affecting the resilience of the Hotan River are the watershed inflow, Xiaota station runoff, population growth rate, and effective irrigated area. 
The findings of this study provide a theoretical basis for integrated water resource management and the sustainable development of the oasis economy in the Hotan River Basin.
Affiliation(s)
- Yuehui Wang
  - State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China; Key Laboratory of Surficial Geochemistry, Ministry of Education, Department of Hydrosciences, School of Earth Sciences and Engineering, Nanjing University, Nanjing 210023, China
- Fengzhi Shi
  - State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China; Akesu National Station of Observation and Research for Oasis Agro-ecosystem, Akesu 843017, Xinjiang, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Peng Yao
  - State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China; Akesu National Station of Observation and Research for Oasis Agro-ecosystem, Akesu 843017, Xinjiang, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yu Sheng
  - State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China; Akesu National Station of Observation and Research for Oasis Agro-ecosystem, Akesu 843017, Xinjiang, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Chengyi Zhao
  - School of Geographical Sciences, Nanjing University of Information Science and Technology, Nanjing 210044, China
7
SHEN JUAN, ZHANG WEIYU, JIN QINQIN, GONG FUYU, ZHANG HEPING, XU HONGLIANG, LI JIEJIE, YAO HUI, JIANG XIYA, YANG YINTING, HONG LIN, MEI JIE, SONG YANG, ZHOU SHUGUANG. Polo-like kinase 1 as a biomarker predicts the prognosis and immunotherapy of breast invasive carcinoma patients. Oncol Res 2023; 32:339-351. [PMID: 38186570 PMCID: PMC10765123 DOI: 10.32604/or.2023.030887] [Received: 04/30/2023] [Accepted: 08/03/2023] [Indexed: 01/09/2024]
Abstract
Background Invasive breast carcinoma (BRCA) is associated with poor prognosis and high risk of mortality. Therefore, it is critical to identify novel biomarkers for the prognostic assessment of BRCA. Methods The expression data of polo-like kinase 1 (PLK1) in BRCA and the corresponding clinical information were extracted from TCGA and GEO databases. PLK1 expression was validated in diverse breast cancer cell lines by quantitative real-time polymerase chain reaction (qRT-PCR) and western blotting. Single sample gene set enrichment analysis (ssGSEA) was performed to evaluate immune infiltration in the BRCA microenvironment, and the random forest (RF) and support vector machine (SVM) algorithms were used to screen for the hub infiltrating cells and calculate the immunophenoscore (IPS). The RF algorithm and COX regression model were applied to calculate survival risk scores based on the PLK1 expression and immune cell infiltration. Finally, a prognostic nomogram was constructed with the risk score and pathological stage, and its clinical potential was evaluated by plotting calibration charts and DCA curves. The application of the nomogram was further validated in an immunotherapy cohort. Results PLK1 expression was significantly higher in the tumor samples in TCGA-BRCA cohort. Furthermore, PLK1 expression level, age and stage were identified as independent prognostic factors of BRCA. While the IPS was unaffected by PLK1 expression, the TMB and MATH scores were higher in the PLK1-high group, and the TIDE scores were higher for the PLK1-low patients. We also identified 6 immune cell types with high infiltration, along with 11 immune cell types with low infiltration in the PLK1-high tumors. A risk score was devised using PLK1 expression and hub immune cells, which predicted the prognosis of BRCA patients. In addition, a nomogram was constructed based on the risk score and pathological staging, and showed good predictive performance. 
Conclusions PLK1 expression and immune cell infiltration can predict post-immunotherapy prognosis of BRCA patients.
Affiliation(s)
- JUAN SHEN
  - School of Big Data and Artificial Intelligence, Anhui Xinhua University, Hefei, 230088, China
- WEIYU ZHANG
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- QINQIN JIN
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- FUYU GONG
  - Departments of Breast Surgery, Fuyang Women and Children’s Hospital, Fuyang, 236000, China
- HEPING ZHANG
  - Departments of Pathology, Anhui Province Maternity and Child Health Hospital, Hefei, 230001, China
- HONGLIANG XU
  - Departments of Pathology, Anhui Province Maternity and Child Health Hospital, Hefei, 230001, China
- JIEJIE LI
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- HUI YAO
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- XIYA JIANG
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- YINTING YANG
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- LIN HONG
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- JIE MEI
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
- YANG SONG
  - Department of Pain, The First Affiliated Hospital of Anhui Medical University, Hefei, 230032, China
- SHUGUANG ZHOU
  - Department of Gynecology and Obstetrics, Maternity and Child Healthcare Hospital Affiliated to Anhui Medical University, Anhui Province Maternity and Child Healthcare Hospital, Hefei, 230001, China
  - Department of Gynecology and Obstetrics, The Fifth Clinical College of Anhui Medical University, Hefei, 230032, China
  - Department of Gynecology and Obstetrics, Linquan Maternity and Child Healthcare Hospital, Fuyang, 236400, China
8
Inqiad WB, Siddique MS, Alarifi SS, Butt MJ, Najeh T, Gamil Y. Comparative analysis of various machine learning algorithms to predict 28-day compressive strength of Self-compacting concrete. Heliyon 2023; 9:e22036. [PMID: 38045144 PMCID: PMC10692774 DOI: 10.1016/j.heliyon.2023.e22036] [Received: 10/10/2023] [Revised: 11/02/2023] [Accepted: 11/02/2023] [Indexed: 12/05/2023]
Abstract
The construction industry is indirectly the largest source of CO₂ emissions in the atmosphere, due to the use of cement in concrete. These emissions can be reduced by using industrial waste materials in place of cement. Self-Compacting Concrete (SCC) is a promising material to enhance the use of industrial wastes in concrete. However, there are very few methods available for accurate prediction of its strength. Therefore, reliable models for estimating the 28-day Compressive Strength (C-S) of SCC are developed in the current study using three Machine Learning (ML) algorithms: Multi Expression Programming (MEP), Extreme Gradient Boosting (XGB), and Random Forest (RF). The ML models were developed using a dataset of 231 points collected from internationally published literature, considering the seven most influential parameters (cement content, quantities of fly ash and silica fume, water content, coarse aggregate, fine aggregate, and superplasticizer dosage) to predict C-S. The developed models were evaluated using different statistical error measures including Root Mean Square Error (RMSE), Mean Absolute Error (MAE), coefficient of determination (R²), etc. The results showed that the XGB model outperformed the MEP and RF models in terms of accuracy, with R² = 0.998 compared to 0.923 for MEP and 0.986 for RF. A similar trend was observed for the other error metrics. Thus, XGB is the most accurate model for estimating C-S of SCC. However, it is pertinent to mention that it does not give its output in the form of an empirical equation as the MEP model does. The construction of these empirical models will help to efficiently estimate C-S of SCC for practical purposes.
Affiliation(s)
- Waleed Bin Inqiad
  - Military College of Engineering (MCE), National University of Science and Technology (NUST), Islamabad 44000, Pakistan
- Muhammad Shahid Siddique
  - Military College of Engineering (MCE), National University of Science and Technology (NUST), Islamabad 44000, Pakistan
- Saad S. Alarifi
  - Department of Geology and Geophysics, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
- Taoufik Najeh
  - Operation and Maintenance, Operation, Maintenance and Acoustics, Department of Civil, Environmental and Natural Resources Engineering, Lulea University of Technology, Sweden
- Yaser Gamil
  - Department of Civil Engineering, School of Engineering, Monash University Malaysia, Jalan Lagoon Selatan, 47500 Bandar Sunway, Selangor, Malaysia
9
Jiang Z, Yang S, Luo S. Source analysis and health risk assessment of heavy metals in agricultural land of multi-mineral mining and smelting area in the Karst region - a case study of Jichangpo Town, Southwest China. Heliyon 2023; 9:e17246. [PMID: 37456041 PMCID: PMC10338313 DOI: 10.1016/j.heliyon.2023.e17246] [Received: 02/21/2023] [Revised: 06/10/2023] [Accepted: 06/12/2023] [Indexed: 07/18/2023]
Abstract
In the Karst region of Southwest China, soil heavy metal content is generally high because of the geological background. Southwest China is also rich in mineral resources, and extensive mining and smelting discharge heavy metals into the surrounding soil, causing superimposed pollution that has drawn widespread concern. Because the coefficients of variation of soil heavy metals in the Karst region are large, selecting appropriate analysis methods is essential. In this paper, Jichangpo in Puding County, a Karst area with multi-mineral mining and smelting, is selected as the study area, and a total of 368 agricultural topsoil samples are collected. The pollution level of heavy metals in agricultural soil is evaluated by the geo-accumulation index (Igeo) and the enrichment factor (EF). Absolute principal component score/multiple linear regression (APCS/MLR), geographic information system (GIS), self-organizing map (SOM), and random forest (RF) are used for the source apportionment of soil heavy metals. Finally, APCS/MLR is combined with a health risk assessment model to evaluate the risks of the heavy metal sources and determine the priority-control source. The results show that the average values of soil heavy metals in the study area (Cd, Hg, As, Pb, Cr, Cu, Zn, and Ni) exceed the background values of the corresponding elements in Guizhou Province. Three sources of heavy metals are identified by combining APCS/MLR, GIS, SOM, and RF. Zn (63.47%), Pb (55.77%), Cd (58.98%), Hg (32.17%), Cu (14.41%), and As (5.99%) are related to lead-zinc mining and smelting; Cr (98.14%), Ni (90.64%), Cu (76.93%), Pb (43.02%), Zn (35.22%), Cd (28.97%), Hg (22.44%), and As (5.84%) are from mixed sources (natural and agricultural sources); As (88.17%), Hg (45.39%), Cd (12.04%), Cu (8.66%), and Ni (6.72%) are related to the mining and smelting of coal and iron.
The results of the health risk assessment show that only As poses a non-carcinogenic risk to human health: As poses non-carcinogenic risks to adults at 3.31% of the sampling points and to children at 10.22%. In terms of carcinogenic risks, As, Pb, and Cr pose carcinogenic risks to adults and children. Combining APCS/MLR with the health risk assessment model identifies the mining and smelting of coal and iron as the priority-control pollution source. This paper provides a comprehensive method for studying the distribution of heavy metal sources in areas of the Karst region where the coefficients of variation of soil heavy metals are large. Furthermore, it offers a theoretical basis for the management and assessment of heavy metal pollution in agricultural land in the study area, which can help researchers make strategic decisions on food security when selecting agricultural land.
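The two pollution indices named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions, Igeo = log2(C / (1.5 × B)) and EF = (C_metal / C_ref)_sample / (B_metal / B_ref)_background; the abstract does not state which variants or reference element the authors used, and the function names and sample values below are hypothetical.

```python
import math


def geoaccumulation_index(concentration, background, k=1.5):
    """Igeo = log2(C / (k * B)); k = 1.5 is the conventional correction factor."""
    return math.log2(concentration / (k * background))


def enrichment_factor(c_metal, c_ref, b_metal, b_ref):
    """EF = (C_metal / C_ref) in the sample over (B_metal / B_ref) in the background."""
    return (c_metal / c_ref) / (b_metal / b_ref)


# Hypothetical concentrations in mg/kg, for illustration only (not from the paper).
igeo = geoaccumulation_index(concentration=3.0, background=1.0)
ef = enrichment_factor(c_metal=3.0, c_ref=40.0, b_metal=1.0, b_ref=40.0)
print(round(igeo, 3), round(ef, 3))
```

A sample at three times the background concentration gives an Igeo of 1 under these assumptions, because the factor 1.5 absorbs part of the natural variability in the background.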
Affiliation(s)
- Zaiju Jiang
  - Guizhou Coal Mine Geological Engineering Advisory and Geological Environment Monitoring Center, Guiyang, 550081, China
- Shaozhang Yang
  - Guizhou Coal Mine Geological Engineering Advisory and Geological Environment Monitoring Center, Guiyang, 550081, China
  - Guizhou Rongyuan Environmental Protection Technology Co. LTD, Guiyang, 550081, China
- Sha Luo
  - Guizhou Coal Mine Geological Engineering Advisory and Geological Environment Monitoring Center, Guiyang, 550081, China
  - Guizhou Rongyuan Environmental Protection Technology Co. LTD, Guiyang, 550081, China
|
10
|
Wang Z, Wang J, Yu D, Chen K. The potential evaluation of groundwater by integrating rank sum ratio (RSR) and machine learning algorithms in the Qaidam Basin. Environ Sci Pollut Res Int 2023; 30:63991-64005. [PMID: 37059956 DOI: 10.1007/s11356-023-26961-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 04/08/2023] [Indexed: 04/16/2023]
Abstract
Groundwater is a vital resource in arid areas that sustains local industrial development and environmental preservation. Mapping groundwater potential zones and determining high-potential regions are essential for the responsible use of the local groundwater resource. When machine learning or deep learning algorithms are used to forecast groundwater potential in arid areas, a shortage of borehole samples can lead to inaccurate or overfitted predictions. In this study, a database of groundwater conditioning factors with a size of 275,157 × 9 was created in the Qaidam Basin, and 85 known borehole samples were collected. The groundwater potential was evaluated using a combination of rank sum ratio (RSR), projection pursuit regression (PPR) and random forest (RF) algorithms, resulting in four models: PPR, RSR-PPR, RSR-RF, and RF. Results indicated that the groundwater potential was higher in mountainous regions surrounding the Qaidam Basin and decreased progressively towards the central and northwestern regions where most industries and facilities are located. The two primary factors, according to the PPR and RF models, were evapotranspiration (0.246, 0.225) and landform (0.176, 0.294). In terms of their ability to accurately forecast the borehole samples, the four models ranked as follows: RF > RSR-RF > RSR-PPR > PPR. The accuracy of the four models in the low-potential area was 0.73 (PPR), 0.60 (RSR-PPR), 0.87 (RSR-RF), and 0.80 (RF). However, the RF model showed overfitting due to a lack of samples, especially in high-potential regions, which limits its applicability. The RSR-RF method was applied directly to evaluate the entire factor database, avoiding the risk of overfitting caused by a limited number of training samples. The results demonstrate that the RSR-RF model is effective for classifying groundwater potential types in samples and mapping groundwater potential of the study area.
This research presents a novel approach for groundwater potential predictions in areas with insufficient sample sizes, providing a reference for policymakers and researchers.
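The rank sum ratio step can be sketched as follows. This is a minimal illustration that assumes the basic unweighted RSR with every factor treated as benefit-type (larger is better) and ties sharing the mean rank; the abstract does not say which RSR variant, weights, or factor directions the authors used, and the function names and the toy table are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_ratio(table):
    """table[i][j] is the value of factor j at sample i; returns one RSR per sample."""
    n, m = len(table), len(table[0])
    columns = [[row[j] for row in table] for j in range(m)]
    ranked = [average_ranks(col) for col in columns]
    return [sum(ranked[j][i] for j in range(m)) / (m * n) for i in range(n)]


# Hypothetical table: 3 samples x 2 conditioning factors (not from the paper).
scores = rank_sum_ratio([[0.2, 10.0], [0.5, 30.0], [0.9, 20.0]])
print(scores)
```

Because the RSR is computed from ranks over the whole factor table rather than fitted to labelled boreholes, it can be evaluated for every cell of the database, which is consistent with the abstract's remark that RSR-RF avoids dependence on a small training set.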
Affiliation(s)
- Zitao Wang
  - Key Laboratory of Comprehensive and Highly Efficient Utilization of Salt Lake Resources, Qinghai Institute of Salt Lakes, Chinese Academy of Sciences, Xining, 810008, China
  - Qinghai Provincial Key Laboratory of Geology and Environment of Salt Lakes, Xining, 810008, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Jianping Wang
  - Key Laboratory of Comprehensive and Highly Efficient Utilization of Salt Lake Resources, Qinghai Institute of Salt Lakes, Chinese Academy of Sciences, Xining, 810008, China
  - Qinghai Provincial Key Laboratory of Geology and Environment of Salt Lakes, Xining, 810008, China
- Dongmei Yu
  - Key Laboratory of Comprehensive and Highly Efficient Utilization of Salt Lake Resources, Qinghai Institute of Salt Lakes, Chinese Academy of Sciences, Xining, 810008, China
  - Qinghai Provincial Key Laboratory of Geology and Environment of Salt Lakes, Xining, 810008, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Kai Chen
  - School of Earth and Environment, Anhui University of Science and Technology, Huainan, 232001, China
|
11
|
Shi W, Wu W, Zhang L, Jia Q, Tan J, Zheng W, Li N, Xu K, Meng Z. Prognosis of thyroid carcinoma patients with osseous metastases: an SEER-based study with machine learning. Ann Nucl Med 2023. [PMID: 36867400 DOI: 10.1007/s12149-023-01826-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2022] [Accepted: 02/09/2023] [Indexed: 03/04/2023]
Abstract
OBJECTIVE Bone is the second most common site of distant metastasis in thyroid cancer, and osseous metastasis (OM) carries a poor prognosis. Accurate prognostic estimation for OM has clinical significance. This study aimed to ascertain the risk factors for survival and to develop an effective model to predict 3-year and 5-year overall survival (OS) and cancer-specific survival (CSS) for thyroid cancer patients with OM. METHODS We retrieved the information of patients with OMs between 2010 and 2016 from the Surveillance, Epidemiology, and End Results (SEER) Program. The Chi-square test and univariate and multivariate Cox regression analyses were performed. Four machine learning (ML) algorithms that are most commonly used in this field were applied. RESULTS A total of 579 patients with OMs were eligible. Advanced age, tumor size ≥ 40 mm, and the presence of other distant metastases were associated with worse OS in differentiated thyroid cancer (DTC) patients with OMs. Radioactive iodine (RAI) significantly improved CSS in both males and females. Among the four ML models [logistic regression, support vector machines, extreme gradient boosting, and random forest (RF)], RF had the best performance [area under the receiver-operating characteristic curve: 0.9378 for 3-year CSS, 0.9105 for 5-year CSS, 0.8787 for 3-year OS, 0.8909 for 5-year OS]. The accuracy and specificity of RF were also the best. CONCLUSIONS An RF model can be used to establish an accurate prognostic model for thyroid cancer patients with OM, not only from the SEER cohort but also intended for all thyroid cancer patients in the general population, which may be applicable in clinical practice in the future.
|
12
|
Elbeltagi A, Pande CB, Kumar M, Tolche AD, Singh SK, Kumar A, Vishwakarma DK. Prediction of meteorological drought and standardized precipitation index based on the random forest (RF), random tree (RT), and Gaussian process regression (GPR) models. Environ Sci Pollut Res Int 2023; 30:43183-43202. [PMID: 36648725 DOI: 10.1007/s11356-023-25221-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Accepted: 01/05/2023] [Indexed: 06/17/2023]
Abstract
Agricultural, meteorological, and hydrological droughts are natural hazards that affect ecosystems in Maharashtra state in central India. Because limited historical data are available there for drought monitoring and forecasting, machine learning (ML) algorithms could allow future drought events to be predicted. In this paper, we focus on the accuracy of meteorological drought prediction in this semi-arid region based on the standardized precipitation index (SPI), using the random forest (RF), random tree (RT), and Gaussian process regression (GPR-PUK kernel) models. Different combinations of machine learning models and input variables were evaluated for forecasting meteorological drought based on the 6- and 12-month SPI (SPI-6 and SPI-12). Models were developed using monthly rainfall data for the period 2000-2019 at two meteorological stations, Karanjali and Gangawdi, each representing a geographical region of the Upper Godavari river basin in Maharashtra, which frequently experiences droughts. SPI data from 2000 to 2013 were used to train the machine learning models, and the remaining data from 2014 to 2019 were used for testing the SPI and meteorological drought forecasts. The mean square error (MSE), root mean square error (RMSE), adjusted R2, Mallows' Cp, Akaike's information criterion (AIC), Schwarz's Bayesian criterion (SBC), and Amemiya's PC were used to identify the best input combination and the best subset regression model for SPI-6 and SPI-12 at both stations. The correlation coefficient ([Formula: see text]), mean absolute error (MAE), RMSE, relative absolute error (RAE), and root relative squared error (RRSE) were used to evaluate the RF, RT, and GPR-PUK kernel models for SPI-6 and SPI-12 at both stations in the training and testing scenarios.
The results for the testing phase showed that RF was the best model for forecasting droughts at Gangawdi station, with [Formula: see text], MAE, RMSE, RAE (%), and RRSE (%) of 0.856, 0.551, 0.718, 74.778, and 54.019, respectively, for SPI-6 and 0.961, 0.361, 0.538, 34.926, and 28.262, respectively, for SPI-12. At Karanjali station, the corresponding values were 0.913 and 0.966, 0.541 and 0.386, 0.604 and 0.589, 52.592 and 36.959, and 42.315 and 31.394 for the PUK kernel and RT models, respectively, for SPI-6 and SPI-12. Machine learning models are promising drought warning techniques because they take less time, require fewer inputs, and are less complex than dynamic or scientific models.
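The five evaluation measures listed in the abstract can be computed as in the sketch below. It assumes the usual definitions, in which RAE and RRSE compare the model's errors with those of a predictor that always outputs the mean of the observed values; the abstract does not give the formulas, the key name "r" for the correlation coefficient is a placeholder for the elided symbol, and the sample series is hypothetical.

```python
import math


def evaluation_metrics(actual, predicted):
    """Return r, MAE, RMSE, RAE (%) and RRSE (%) for paired observations."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Errors of the baseline that always predicts the mean of the observations.
    abs_dev = sum(abs(a - mean_a) for a in actual)
    sq_dev = sum((a - mean_a) ** 2 for a in actual)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    sq_dev_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "r": cov / math.sqrt(sq_dev * sq_dev_p),
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE_%": 100 * abs_err / abs_dev,
        "RRSE_%": 100 * math.sqrt(sq_err / sq_dev),
    }


# Hypothetical SPI values, for illustration only (not from the paper).
metrics = evaluation_metrics(
    actual=[-1.0, 0.0, 1.0, 2.0],
    predicted=[-0.5, 0.0, 0.5, 2.0],
)
print(metrics)
```

Under these definitions, an RAE or RRSE below 100% means the model beats the mean-only baseline, which is one way to read the percentages reported for the two stations.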
Affiliation(s)
- Ahmed Elbeltagi
  - Agricultural Engineering Department, Faculty of Agriculture, Mansoura University, Mansoura, 35516, Egypt
- Chaitanya B Pande
  - Indian Institute of Tropical Meteorology, Pune, India
  - Universiti Tenaga Nasional (UNITEN), Kajang, Malaysia
- Manish Kumar
  - College of Agricultural Engineering and Technology, Dr. R.P.C.A.U, Pusa-Bihar, 848125, India
- Abebe Debele Tolche
  - Haramaya Institute of Technology, School of Water Resources and Environmental Engineering, Haramaya University, P.O. Box 138, Dire Dawa, Ethiopia
- Sudhir Kumar Singh
  - K. Banerjee Centre of Atmospheric and Ocean Studies, IIDS, Nehru Science Centre, University of Allahabad, 211002, Prayagraj, India
- Akshay Kumar
  - Environmental Science and Engineering Department (ESED), Indian Institute of Technology, Bombay, Maharashtra, India
- Dinesh Kumar Vishwakarma
  - Department of Irrigation and Drainage Engineering, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, 263145, India
|
13
|
Pourhashemi S, Asadi MAZ, Boroughani M, Azadi H. Mapping of dust source susceptibility by remote sensing and machine learning techniques (case study: Iran-Iraq border). Environ Sci Pollut Res Int 2023; 30:27965-27979. [PMID: 36394809 DOI: 10.1007/s11356-022-23982-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/12/2022] [Accepted: 10/30/2022] [Indexed: 06/16/2023]
Abstract
A dust storm is a major environmental problem affecting many arid regions worldwide. The novel contribution of this study is the combination of indicators extracted from remote sensing (RS) with statistics-based predictive models for spatial mapping of land susceptibility to dust emissions in a very important dust source area on the border of Iran and Iraq (Khuzestan province in Iran and Al-Basrah and Maysan provinces in Iraq). RS techniques and machine learning techniques, including multivariate adaptive regression spline (MARS), random forest (RF), and logistic regression (LR), were used for dust source identification and susceptibility map preparation. To this end, 152 dust source areas (DSA) for the period 2005-2020 were identified in the study area. Of these DSA data, 70% were assigned to dust source susceptibility mapping (DSSM) as the training dataset and 30% to model validation. Six factors (soil, lithology, slope, normalized difference vegetation index (NDVI), geomorphology, and land use units) were prepared as the independent variables affecting DSA. The results of all three models indicated that land use had the greatest impact on DSA. Validation of these models with the test data yielded areas under the curve (AUC) of 0.92, 0.86, and 0.76 for the RF, MARS, and LR models, respectively. The results also showed that the RF model outperformed the MARS (AUC = 0.89) and LR (AUC = 0.78) methods. In all three models, the high and very high susceptibility classes generally covered a large percentage of the study area. The highest percentage of dust source points also fell in these susceptibility classes. Overall, the results of this study can be useful for planners and managers seeking to control and reduce the risk of negative dust consequences.
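The validation design described above, a 70/30 split of the 152 dust source records followed by an AUC comparison, can be sketched in plain Python. This is an illustration under assumptions: the abstract does not say how the split was drawn or rounded, so a seeded random shuffle is used here, and the labels and scores are hypothetical values rather than model output from the study.

```python
import random


def split_70_30(n_items, seed=0):
    """Shuffle item indices and return (training, validation) index lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * n_items)
    return indices[:cut], indices[cut:]


def roc_auc(labels, scores):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


train_idx, test_idx = split_70_30(152)

# Hypothetical susceptibility scores for six validation points.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(len(train_idx), len(test_idx), auc)
```

The pairwise formulation makes the meaning of the reported values concrete: an AUC of 0.92 says that a randomly chosen dust source point receives a higher susceptibility score than a randomly chosen non-source point about 92% of the time.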
Affiliation(s)
- Sima Pourhashemi
  - Department of Geography, Hakim Sabzevari University, Sabzevar, Iran
- Mahdi Boroughani
  - Research Center for Geosciences and Social Studies, Hakim Sabzevari University, Sabzevar, Iran
- Hossein Azadi
  - Department of Geography, Ghent University, Ghent, Belgium
|
14
|
Ampadi Ramachandran R, Chi SW, Srinivasa Pai P, Foucher K, Ozevin D, Mathew MT. Artificial intelligence and machine learning as a viable solution for hip implant failure diagnosis-Review of literature and in vitro case study. Med Biol Eng Comput 2023. [PMID: 36701013 DOI: 10.1007/s11517-023-02779-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 01/09/2023] [Indexed: 01/27/2023]
Abstract
The digital health industry is experiencing fast-paced research that can provide digital care programs and technologies to enhance the competence of healthcare delivery. Orthopedic literature also confirms the applicability of artificial intelligence (AI) and machine learning (ML) models to medical diagnosis and clinical decision-making. However, implant monitoring after primary surgery often happens only at a wellness visit or when a patient reports a complaint. Even setting aside implant design and other technical errors, unmonitored circumstances and the lack of post-surgery monitoring may ultimately lead to failure of the implant system, leaving high-risk revision surgery as the only option. Preventive maintenance seems to be a good choice for identifying the onset of an irreversible prosthesis failure. Considering all these aspects of hip implant monitoring, this paper reviews existing studies linking ML models and intelligent systems to hip implant diagnosis. It also explores the feasibility of an alternative continuous monitoring technique for post-surgery implant monitoring, backed by an in vitro ML case study. Tribocorrosion and acoustic emission (AE) data are considered based on their efficacy in detecting irreversible alteration of implant material so that total failures can be prevented. This study also highlights the relevance of developing an artificially intelligent implant monitoring methodology that can function during daily patient activities and how it can influence digital orthopedic diagnosis. AI-based non-invasive hip implant monitoring system enabling point-of-care testing.
|
15
|
Shi T, Zhang J, Shen W, Wang J, Li X. Machine learning can identify the sources of heavy metals in agricultural soil: A case study in northern Guangdong Province, China. Ecotoxicol Environ Saf 2022; 245:114107. [PMID: 36152430 DOI: 10.1016/j.ecoenv.2022.114107] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/23/2022] [Revised: 09/06/2022] [Accepted: 09/19/2022] [Indexed: 06/16/2023]
Abstract
Source tracing of heavy metals in agricultural soils is of critical importance for effective pollution control and targeted policies. Identifying and apportioning the complex sources of soil heavy metal pollution is a great challenge. In this study, a traditional analysis method, positive matrix factorization (PMF), and three machine learning methods, self-organizing map (SOM), conditional inference tree (CIT) and random forest (RF), were used to identify and apportion the sources of heavy metals in agricultural soils from Lianzhou, Guangdong Province, China. Based on PMF, the contributions to the total loadings of heavy metals in soil were 19.3% for atmospheric deposition, 65.5% for anthropogenic and geogenic sources, and 15.2% for soil parent materials. Based on the SOM model, As, Cd, Hg, Pb and Zn were attributed to mining and geogenic sources, while Cr, Cu and Ni were derived from geogenic sources. Based on the CIT results, the influence of altitude on soil Cr, Cu, Hg, Ni and Zn, as well as that of soil pH on Cd, indicated their primary origin from natural processes, whereas As and Pb were related to agricultural practices and traffic emissions, respectively. The RF model further quantified the importance of variables and identified potential control factors (altitude, soil pH, soil organic carbon) in heavy metal accumulation in soil. This study provides an integrated approach for heavy metal source apportionment with clear potential for application in other similar regions, and it provides a theoretical basis for the management and assessment of soil heavy metal pollution.
Affiliation(s)
- Taoran Shi
  - School of Applied Meteorology, Nanjing University of Information Science & Technology, Nanjing 210044, China
- Jingru Zhang
  - Guangdong Province Academic of Environmental Science, Guangzhou 510045, China
- Wenjie Shen
  - School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai 519000, China
  - Guangdong Key Laboratory of Geological Process and Mineral Resources Exploration, Zhuhai 519000, China
- Jun Wang
  - Guangdong Province Academic of Environmental Science, Guangzhou 510045, China
- Xingyuan Li
  - College of Earth and Environmental Sciences, Lanzhou University, 730000, China
|
16
|
Chengaiyan S, Anandan K. Effect of functional and effective brain connectivity in identifying vowels from articulation imagery procedures. Cogn Process 2022. [PMID: 35794496 DOI: 10.1007/s10339-022-01103-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2020] [Accepted: 06/15/2022] [Indexed: 11/03/2022]
Abstract
Articulation imagery, a form of mental imagery, refers to the activity of imagining or speaking to oneself mentally without any articulation movement. It is a useful research domain for neural disorders that impair speech, as speech imagination closely resembles real voice communication. This work employs electroencephalography (EEG) signals acquired during articulation and articulation imagery to identify the vowel being imagined during different tasks. EEG signals from chosen electrodes are decomposed using the empirical mode decomposition (EMD) method into a series of intrinsic mode functions. Brain connectivity estimators and entropy measures have been computed to analyze the functional cooperation and causal dependence between different cortical regions as well as the regularity in the signals. The vowels have been classified using machine learning techniques, namely multiclass support vector machine (MSVM) and random forest (RF). Three different training and testing protocols (Articulation-AR, Articulation imagery-AI and Articulation vs Articulation imagery-AR vs AI) were employed for identifying the vowel whose articulation was being imagined. An overall classification accuracy of 80% was obtained for the articulation imagery protocol, which was higher than for the other two protocols. Also, MSVM outperformed RF in terms of classification accuracy. Brain connectivity estimators combined with machine learning techniques appear to be reliable for identifying the vowel from the subjects' thoughts, thereby assisting people with speech impairment.
|
17
|
Lee J, Lee S, Street WN, Polgreen LA. Machine learning approaches to predict the 1-year-after-initial-AMI survival of elderly patients. BMC Med Inform Decis Mak 2022; 22:115. [PMID: 35488291 PMCID: PMC9052482 DOI: 10.1186/s12911-022-01854-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/30/2021] [Accepted: 04/11/2022] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND While multiple randomized controlled trials (RCTs) are available, their results may not be generalizable to older, unhealthier or less-adherent patients. Observational data can be used to predict outcomes and evaluate treatments; however, exactly which strategy should be used to analyze the outcomes of treatment using observational data is currently unclear. This study aimed to determine the most accurate machine learning technique to predict 1-year-after-initial-acute-myocardial-infarction (AMI) survival of elderly patients and to identify the association of angiotensin-converting-enzyme inhibitors and angiotensin-receptor blockers (ACEi/ARBs) with survival. METHODS We built a cohort of 124,031 Medicare beneficiaries who experienced an AMI in 2007 or 2008. For analytical purposes, all variables were categorized into nine different groups: ACEi/ARB use, demographics, cardiac events, comorbidities, complications, procedures, medications, insurance, and healthcare utilization. Our outcome of interest was 1-year-post-AMI survival. To solve this classification task, we used lasso logistic regression (LLR) and random forest (RF), and compared their performance depending on category selection, sampling methods, and hyper-parameter selection. Nested 10-fold cross-validation was implemented to obtain an unbiased estimate of performance. We used the area under the receiver operating characteristic curve (AUC) as our primary measure for evaluating the performance of predictive algorithms. RESULTS LLR consistently showed the best AUC results throughout the experiments, closely followed by RF. The best prediction was obtained with LLR based on the combination of demographics, comorbidities, procedures, and utilization. The coefficients from the final LLR model showed that AMI patients with many comorbidities, older ages, or living in a low-income area have a higher risk of mortality 1-year after an AMI.
In addition, treating the AMI patients with ACEi/ARBs increases the 1-year-after-initial-AMI survival rate of the patients. CONCLUSIONS Given the many features we examined, ACEi/ARBs were associated with increased 1-year survival among elderly patients after an AMI. We found LLR to be the best-performing model over RF to predict 1-year survival after an AMI. LLR greatly improved the generalization of the model by feature selection, which implicitly indicates the association between AMI-related variables and survival can be defined by a relatively simple model with a small number of features. Some comorbidities were associated with a greater risk of mortality, such as heart failure and chronic kidney disease, but others were associated with survival such as hypertension, hyperlipidemia, and diabetes. In addition, patients who live in urban areas and areas with large numbers of immigrants have a higher probability of survival. Machine learning methods are helpful to determine outcomes when RCT results are not available.
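The nested 10-fold cross-validation mentioned in the methods can be outlined with a small pure-Python sketch. It only shows the structure: the outer folds estimate performance, and the inner folds, drawn from the outer training portion alone, choose the hyper-parameter. The helper names, the toy threshold "model", and the synthetic data are hypothetical stand-ins and are not the authors' LLR or RF pipeline.

```python
def k_folds(indices, k):
    """Split indices into k interleaved folds (no shuffling, for clarity)."""
    return [indices[i::k] for i in range(k)]


def nested_cv(indices, k_outer, k_inner, candidates, fit, score):
    """Return one held-out score per outer fold.

    fit(train_idx, param) -> model, and score(model, test_idx) -> float.
    """
    outer_scores = []
    for outer_test in k_folds(indices, k_outer):
        held_out = set(outer_test)
        outer_train = [i for i in indices if i not in held_out]

        def inner_score(param):
            # Mean validation score of one candidate over the inner folds.
            total = 0.0
            for inner_val in k_folds(outer_train, k_inner):
                val = set(inner_val)
                inner_train = [i for i in outer_train if i not in val]
                total += score(fit(inner_train, param), inner_val)
            return total / k_inner

        best = max(candidates, key=inner_score)
        # The outer test fold is touched only once, after selection is done.
        outer_scores.append(score(fit(outer_train, best), outer_test))
    return outer_scores


# Synthetic one-feature data: the label is 1 when x >= 0.5.
xs = [i / 20 for i in range(20)]
ys = [1 if x >= 0.5 else 0 for x in xs]


def fit(train_idx, threshold):
    """Toy 'training': the model is just the candidate threshold."""
    return threshold


def score(threshold, idx):
    """Accuracy of the rule 'predict 1 when x >= threshold' on the given indices."""
    return sum((xs[i] >= threshold) == bool(ys[i]) for i in idx) / len(idx)


scores = nested_cv(list(range(20)), 10, 10, [0.25, 0.5, 0.75], fit, score)
print(scores)
```

Keeping hyper-parameter selection inside the outer training portion is what makes the outer scores an unbiased estimate, since no outer test record influences the chosen setting.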
Affiliation(s)
- Jisoo Lee
  - Department of Business Analytics, University of Iowa, Iowa City, USA
- Sulyun Lee
  - Interdisciplinary Graduate Program in Informatics, University of Iowa, Iowa City, USA
- W Nick Street
  - Department of Business Analytics, University of Iowa, Iowa City, USA
- Linnea A Polgreen
  - Department of Pharmacy Practice and Science, University of Iowa, Iowa City, USA
|
18
|
Wang X, Zhang C, Wang C, Liu G, Wang H. GIS-based for prediction and prevention of environmental geological disaster susceptibility: From a perspective of sustainable development. Ecotoxicol Environ Saf 2021; 226:112881. [PMID: 34634737 DOI: 10.1016/j.ecoenv.2021.112881] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/24/2021] [Revised: 09/21/2021] [Accepted: 10/06/2021] [Indexed: 06/13/2023]
Abstract
Geological disasters seriously threaten the safety of human life, property, ecological resources, and the environment. Effective control of geological disasters is central to achieving sustainable social development. Helong City (Jilin Province, China) was selected as the case study. Combined with GIS technology, a new integrated prediction model of geological disaster susceptibility was developed to improve the accuracy of geological disaster assessment, reduce the cost of geological disaster treatment, and ensure the sustainable development of the ecological environment. The results showed that elevation and the normalized difference vegetation index (NDVI) were the key factors affecting susceptibility. Compared with the conventional model, the accuracy of the integrated models FR-DT and FR-RF improved by more than 6%, and the disaster points were more concentrated in the high susceptibility zone. Statistical results of disaster treatment cost estimation and gross domestic product (GDP) value showed that the integrated model can save about 10% of the treatment cost, and that its ratio of total GDP to disaster governance cost was higher. The integrated models FR-DT and FR-RF had obvious advantages over the conventional model in terms of prediction accuracy, prevention pertinence, and prevention cost. These research results promote the advancement of geological disaster prevention and control technology, ensure the safety of the geological environment, and are of great significance to the sustainable development of the regional economy.
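The abstract does not expand "FR" in FR-DT and FR-RF. Assuming it denotes the frequency ratio commonly used in susceptibility mapping, the sketch below shows how such a ratio is usually computed for the classes of one conditioning factor: the share of disaster points falling in a class divided by the share of the study area that the class occupies. The function name, class labels, counts, and areas are all hypothetical.

```python
def frequency_ratio(disaster_counts, class_areas):
    """Per-class ratio of the disaster share to the area share."""
    total_disasters = sum(disaster_counts.values())
    total_area = sum(class_areas.values())
    return {
        name: (disaster_counts[name] / total_disasters)
        / (class_areas[name] / total_area)
        for name in class_areas
    }


# Hypothetical elevation classes: disaster point counts and areas in km^2.
counts = {"<500 m": 30, "500-800 m": 15, ">800 m": 5}
areas = {"<500 m": 200.0, "500-800 m": 300.0, ">800 m": 500.0}
ratios = frequency_ratio(counts, areas)
print(ratios)
```

Under this reading, a ratio above 1 marks a class where disasters are over-represented relative to its area, and such ratios could serve as inputs to a decision tree (DT) or random forest (RF) in an integrated model.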
Affiliation(s)
- Xuedong Wang
  - College of Mining, Liaoning Technical University, Fuxin 123000, China
- Chaobiao Zhang
  - College of Mining, Liaoning Technical University, Fuxin 123000, China
- Cui Wang
  - College of Mining, Liaoning Technical University, Fuxin 123000, China
- Guangwei Liu
  - College of Mining, Liaoning Technical University, Fuxin 123000, China
- Hanxi Wang
  - School of Geographical Sciences, Harbin Normal University, Harbin 150025, China
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration/School of Environment, Northeast Normal University, Jingyue Street 2555, Changchun 130017, China
|
19
|
Alghamdi W, Alzahrani E, Ullah MZ, Khan YD. 4mC-RF: Improving the prediction of 4mC sites using composition and position relative features and statistical moment. Anal Biochem 2021; 633:114385. [PMID: 34571005 DOI: 10.1016/j.ab.2021.114385] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/17/2020] [Revised: 09/09/2021] [Accepted: 09/13/2021] [Indexed: 01/28/2023]
Abstract
N4-methylcytosine (4mC) is an important epigenetic modification that occurs enzymatically by the action of DNA methyltransferases. 4mC sites exist in prokaryotes and eukaryotes and play a vital role in regulating gene expression, DNA replication, and the cell cycle. Efficient and accurate prediction of 4mC sites is important for gaining insight into the biological properties and functions of 4mC. Therefore, a sequence-based predictor, 4mC-RF, is proposed for identifying 4mC sites through the integration of statistical moments with position- and composition-dependent features. Relative and absolute position-based features are computed to extract optimal features. Random Forest, a popular machine learning classifier, was used for training the model. Validation results were obtained through rigorous self-consistency testing, 10-fold cross-validation, independent set testing, and jackknife testing, yielding accuracies of 95.1%, 95.2%, 97.0%, and 94.7%, respectively. The proposed model achieves the highest prediction accuracies compared with existing models. Subsequently, the developed 4mC-RF model was deployed as a web server. A more accurate predictor of 4mC sites helps experimental scientists obtain faster, more efficient, and more cost-effective results.
Affiliation(s)
- Wajdi Alghamdi
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, P. O. Box 80221, Jeddah 21589, Saudi Arabia
- Ebraheem Alzahrani
  - Department of Mathematics, Faculty of Science, King Abdulaziz University, P. O. Box 80203, Jeddah 21589, Saudi Arabia
- Malik Zaka Ullah
  - Department of Mathematics, Faculty of Science, King Abdulaziz University, P. O. Box 80203, Jeddah 21589, Saudi Arabia
- Yaser Daanial Khan
  - Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan
|
20
|
Aher RB, Sarkar D. 2D-QSAR modeling and two-fold classification of 1,2,4-triazole derivatives for antitubercular potency against the dormant stage of Mycobacterium tuberculosis. Mol Divers 2021. [PMID: 34347229 DOI: 10.1007/s11030-021-10254-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Accepted: 06/14/2021] [Indexed: 10/20/2022]
Abstract
The dormant or latent form of Mycobacterium tuberculosis (MTB) is not killed by conventional antitubercular drugs. Treating latent TB is essential to reduce the period of treatment as well as the incidence of drug resistance. Against this background, we attempted to develop quantitative structure-activity relationship (QSAR) models, both regression and classification based, against the dormant form of MTB and then used the developed classifier models (linear discriminant analysis (LDA) and random forest (RF)) for two-fold classification. The rationale for applying two-fold classification to MTB modeling is to increase the confidence of correct classification. The 2D-QSAR modeling suggested the contribution of Burden eigenvalue, edge adjacency, van der Waals (vdW) surface area, topological charge, and pharmacophoric indices in predicting antitubercular activity against dormant MTB. The prediction qualities of the training and test sets were found to be moderate and good, according to the mean absolute error (MAE)-based criteria. The LDA and RF models revealed the importance of Burden eigenvalue, edge adjacency, Geary autocorrelation, and drug-like indices as discriminating features for separating the antitubercular compounds into higher and lower activity groups. The LDA model showed classification accuracies of 85.14% and 87.10% for the training and test sets, while the RF model showed accuracies of 100.00% and 80.65% for the same two sets. The descriptors selected in the final models are only two-dimensional (2D); they are easy to compute and do not require the computationally expensive steps of structure conversion, optimization, and energy minimization that are needed before 3D descriptors can be computed. These models could be used for identifying and selecting higher-activity compounds against the dormant form of MTB.
|
21
|
Quddus A, Shahidi Zandi A, Prest L, Comeau FJE. Using long short term memory and convolutional neural networks for driver drowsiness detection. Accid Anal Prev 2021; 156:106107. [PMID: 33848710 DOI: 10.1016/j.aap.2021.106107] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/23/2019] [Revised: 07/19/2020] [Accepted: 03/27/2021] [Indexed: 06/12/2023]
Abstract
Fatigue negatively affects the safety and performance of drivers on the road. In fact, drowsiness and fatigue are the cause of a substantial number of motor vehicle accidents. Drowsiness among drivers can be detected using a variety of modalities, including electroencephalogram (EEG), eye movement, and vehicle driving dynamics. Among these, EEG is highly accurate but very intrusive and cumbersome. Vehicle driving dynamics, on the other hand, are very easy to acquire but not very accurate. An eye-movement-based approach offers an attractive balance between these two extremes. However, eye-movement-based techniques normally require an eye-tracking device consisting of a high-speed camera and a sophisticated algorithm to extract eye-movement parameters such as blinking, eye closure, saccades, and fixation. This makes eye-tracking-based drowsiness detection difficult to implement as a practical system, especially on an embedded platform. In this paper, the authors propose using eye images from a camera directly, without the need for an expensive eye-tracking system. Here, eye-related movements are captured by a Recurrent Neural Network (RNN) to detect drowsiness. Long Short Term Memory (LSTM) is a class of RNN that has several advantages over vanilla RNNs. In this work, an array of LSTM cells is used to model the eye movements. Two types of LSTM were employed: a 1-D LSTM (R-LSTM), used as a baseline, and a convolutional LSTM (C-LSTM), which allows 2-D images to be used directly. Patches of size 48 × 48 around each eye were extracted from 38 subjects participating in a simulated driving experiment. The state of vigilance of the subjects was independently assessed by power spectral analysis of multichannel EEG signals recorded simultaneously, and binary labels of alert and drowsy (baseline) were generated. Results show high efficacy of the proposed system. The R-LSTM-based approach achieved an accuracy of around 82%, and the C-LSTM-based approach achieved an accuracy in the range of 95%-97%. A comparison with a recently published eye-tracking-based approach is also provided, showing that the proposed LSTM technique outperforms it by a wide margin.
Affiliation(s)
| | - Ali Shahidi Zandi
- Alcohol Countermeasure Systems Corp. (ACS), 60 International Boulevard, Toronto, ON, Canada.
| | - Laura Prest
- Alcohol Countermeasure Systems Corp. (ACS), 60 International Boulevard, Toronto, ON, Canada.
| | - Felix J E Comeau
- Alcohol Countermeasure Systems Corp. (ACS), 60 International Boulevard, Toronto, ON, Canada.
|
22
|
Abstract
The acetylcholinesterase (AChE) enzyme is responsible for the degradation of acetylcholine and is an important drug target for the treatment of Alzheimer's disease. When this enzyme is inhibited, more acetylcholine is available for use in the synaptic cleft, which leads to enhanced memory and cognitive ability. The aim of the present work is to create machine learning models that distinguish AChE inhibitors from non-inhibitors using algorithms such as support vector machine (SVM), k-nearest neighbor (k-NN) and random forest (RF). The developed models were evaluated by 10-fold cross-validation and on an external dataset. Descriptor analysis was performed to identify the most important features for the activity of molecules. Descriptors identified as important include maxssCH2, minHssNH, SaasC, minssCH2, bit 128 MACCS key, bit 104 MACCS key, bit 24 estate fingerprint and bit 18 estate fingerprint. The fingerprint-based model built with the random forest algorithm produced better results than the other models. The overall accuracy of the best model on the test set was 85.38 percent. The developed model is available at http://14.139.57.41/achepredictor/ .
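The 10-fold cross-validation mentioned above can be sketched as an index-splitting routine: every sample is held out exactly once, and the model is trained on the remaining nine folds. This is a minimal illustration rather than the authors' pipeline; the function name `k_fold_indices` and the sample count are hypothetical, and in practice the data would be shuffled (and often stratified by class) first.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical dataset of 25 molecules split into 10 folds.
folds = list(k_fold_indices(25, k=10))
for train, test in folds[:2]:
    print(len(train), len(test))
```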
Affiliation(s)
- Hardeep Sandhu
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Sector-67, S.A.S. Nagar, Mohali, Punjab, 160062, India
| | - Rajaram Naresh Kumar
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Sector-67, S.A.S. Nagar, Mohali, Punjab, 160062, India
| | - Prabha Garg
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Sector-67, S.A.S. Nagar, Mohali, Punjab, 160062, India.
|
23
|
Islam ARMT, Hasanuzzaman M, Shammi M, Salam R, Bodrud-Doza M, Rahman MM, Mannan MA, Huq S. Are meteorological factors enhancing COVID-19 transmission in Bangladesh? Novel findings from a compound Poisson generalized linear modeling approach. Environ Sci Pollut Res Int 2021; 28:11245-11258. [PMID: 33118070 PMCID: PMC7594949 DOI: 10.1007/s11356-020-11273-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/28/2020] [Accepted: 10/15/2020] [Indexed: 05/06/2023]
Abstract
The novel coronavirus (SARS-CoV-2), which causes COVID-19, has given rise to a pandemic. Since other epidemic viral infections are closely associated with environmental factors, this study aims to reveal meteorological effects on the outbreak of COVID-19 across eight divisions of Bangladesh from March to April 2020. A compound Poisson generalized linear model (CPGLM), along with a Monte-Carlo method and a random forest (RF) model, was employed to explore how meteorological factors affect COVID-19 transmission in Bangladesh. Results showed that a subtropical climate (mean temperature about 26.6 °C, mean relative humidity (MRH) 64%, and rainfall approximately 3 mm) enhanced COVID-19 onset. The CPGLM revealed that every 1 mm increase in rainfall elevated COVID-19 cases by 30.99% (95% CI 77.18%, -15.20%), while a 1 °C increase in diurnal temperature (TDN) changed the confirmed cases by -14.2% (95% CI 9.73%, -38.13%), on lag 1 and lag 2, respectively. In addition, NRH and MRH showed the highest increases in COVID-19 cases (17.98% (95% CI 22.5%, 13.42%) and 19.92% (95% CI 25.71%, 14.13%)) at lag 4. The results of the RF model indicated that TDN and AH (absolute humidity) influence COVID-19 cases the most. In the Dhaka division, MRH is the most vital meteorological factor affecting COVID-19 deaths. This study indicates that humidity and rainfall are crucial factors affecting COVID-19 cases, which is contrary to many previous studies in other countries. These outcomes can inform policy formulation for the suppression of the COVID-19 outbreak in Bangladesh.
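Percentage effects like those quoted above are commonly derived from the coefficients of a log-link model: a coefficient β for a one-unit change in a predictor corresponds to a change of (exp(β) − 1) × 100% in the expected count. The abstract does not state the link function, so the log link assumed below, the function name `percent_change`, and the back-calculated coefficient are illustrative assumptions only.

```python
import math


def percent_change(beta: float) -> float:
    """Percent change in the expected count per unit increase, assuming a log link."""
    return (math.exp(beta) - 1.0) * 100.0


# Hypothetical coefficient chosen so that it reproduces a 30.99% increase.
beta_rain = math.log(1.3099)
print(round(percent_change(beta_rain), 2))

# A negative coefficient corresponds to a percentage decrease.
print(percent_change(-0.1) < 0)
```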
Affiliation(s)
| | - Md Hasanuzzaman
- Department of Disaster Management, Begum Rokeya University, Rangpur, 5400, Bangladesh
| | - Mashura Shammi
- Department of Environmental Sciences, Jahangirnagar University, Dhaka, 1342, Bangladesh
| | - Roquia Salam
- Department of Disaster Management, Begum Rokeya University, Rangpur, 5400, Bangladesh
| | | | - Md Mostafizur Rahman
- Department of Environmental Sciences, Jahangirnagar University, Dhaka, 1342, Bangladesh.
| | - Md Abdul Mannan
- Bangladesh Meteorological Department, Meteorological Complex Agargaon, Dhaka, 1207, Bangladesh
| | - Saleemul Huq
- ICCCAD, Independent University Bangladesh, Dhaka, Bangladesh
|
24
|
Wang H, Qin Z, Yan A. Classification models and SAR analysis on CysLT1 receptor antagonists using machine learning algorithms. Mol Divers 2021; 25:1597-1616. [PMID: 33534023 DOI: 10.1007/s11030-020-10165-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/26/2020] [Accepted: 11/27/2020] [Indexed: 12/21/2022]
Abstract
The cysteinyl leukotriene 1 (CysLT1) receptor is a promising drug target for rhinitis and other allergic diseases. In our study, we built classification models to predict the bioactivities of CysLT1 receptor antagonists. We built a dataset of 503 CysLT1 receptor antagonists, which were divided into two groups: highly active molecules (IC50 < 1000 nM) and weakly active molecules (IC50 ≥ 1000 nM). The molecules were characterized by several representations, including CORINA descriptors, MACCS fingerprints, Morgan fingerprints and molecular SMILES. For the CORINA descriptors and the two types of fingerprints, we used random forests (RF) and deep neural networks (DNN) to build models. For molecular SMILES, we used recurrent neural networks (RNN) with self-attention to build models. The test-set accuracies of all models reached 85%, and the accuracy of the best model (Model 2C) was 93%. In addition, we performed structure-activity relationship (SAR) analyses on CysLT1 receptor antagonists, based on the output of the random forest models and the RNN model. Highly active antagonists were found to usually contain common substructures such as tetrazoles, indoles and quinolines. These substructures may improve the bioactivity of CysLT1 receptor antagonists.
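The two-class labelling rule stated above (IC50 below 1000 nM is highly active, otherwise weakly active) can be written down directly. The sketch below only illustrates that threshold; the function name `activity_class` and the example IC50 values are hypothetical and not taken from the dataset.

```python
def activity_class(ic50_nm: float, threshold_nm: float = 1000.0) -> str:
    """Label a molecule by its IC50 in nM using the stated 1000 nM cut-off."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return "highly active" if ic50_nm < threshold_nm else "weakly active"


# Hypothetical IC50 values in nM.
for value in (12.5, 999.9, 1000.0, 25000.0):
    print(value, activity_class(value))
```

Note that a molecule at exactly 1000 nM falls in the weakly active group, matching the "≥" in the abstract.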
Affiliation(s)
- Hongzhao Wang
- State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, University of Chemical Technology, Beijing, People's Republic of China
| | - Zijian Qin
- State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, University of Chemical Technology, Beijing, People's Republic of China
| | - Aixia Yan
- State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, University of Chemical Technology, Beijing, People's Republic of China.
|
25
|
Kwarteng EVS, Andam-Akorful SA, Kwarteng A, Asare DCB, Quaye-Ballard JA, Osei FB, Duker AA. Spatial variation in lymphatic filariasis risk factors of hotspot zones in Ghana. BMC Public Health 2021; 21:230. [PMID: 33509140 PMCID: PMC7841995 DOI: 10.1186/s12889-021-10234-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/22/2020] [Accepted: 01/13/2021] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND Lymphatic Filariasis (LF), a parasitic nematode infection, poses a huge economic burden to affected countries. LF endemicity is localized and its prevalence is spatially heterogeneous. In Ghana, LF prevalence and the multiplicity of symptoms differ between the country's northern and southern parts. Species distribution models (SDMs) have been utilized to explore the suite of risk factors that influence the transmission of LF in these geographically distinct regions. METHODS Presence-absence records of microfilaria (mf) cases were stratified into northern and southern zones and used to run SDMs, while climate, socioeconomic, and land cover variables provided explanatory information. Generalized Linear Model (GLM), Generalized Boosted Model (GBM), Artificial Neural Network (ANN), Surface Range Envelope (SRE), Multivariate Adaptive Regression Splines (MARS), and Random Forests (RF) algorithms were run for both study zones and also for the entire country for comparison. RESULTS The best model quality was obtained with the RF and GBM algorithms, which had the highest Area under the Curve (AUC) values of 0.98 and 0.95, respectively. The models predicted highly suitable environments for LF transmission in the short grass savanna (northern) and coastal (southern) areas of Ghana. Land cover and socioeconomic variables, such as proximity to inland water bodies and population density, were the main factors that uniquely influenced LF transmission in the south, whereas poor housing was a distinctive risk factor in the north. Precipitation, temperature, slope, and poverty were common risk factors but with subtle variations in response values, which were confirmed by the countrywide model. CONCLUSIONS This study has demonstrated that different variable combinations influence the occurrence of lymphatic filariasis in northern and southern Ghana. Thus, an understanding of the geographic distinctness in risk factors is required to inform the development of area-specific transmission control systems towards LF elimination in Ghana and internationally.
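The AUC values reported above summarize how well a model ranks presence sites above absence sites. A minimal sketch of the metric, using the pairwise (Mann-Whitney) formulation, is given below; the function name `roc_auc` and the example labels and scores are hypothetical and unrelated to the study data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (presence, absence) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical presence (1) / absence (0) records with predicted suitability.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic loop is fine for a sketch but slow for large survey datasets.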
Affiliation(s)
| | - Samuel Ato Andam-Akorful
- Department of Geomatic Engineering, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
| | - Alexander Kwarteng
- Department of Biochemistry and Biotechnology, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
| | - Da-Costa Boakye Asare
- Department of Geomatic Engineering, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
| | | | - Frank Badu Osei
- Department of Earth Observation Science, University of Twente, Enschede, Netherlands
| | - Alfred Allan Duker
- Department of Geomatic Engineering, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
|
26
|
Idakwo G, Thangapandian S, Luttrell J, Li Y, Wang N, Zhou Z, Hong H, Yang B, Zhang C, Gong P. Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets. J Cheminform 2020; 12:66. [PMID: 33372637 PMCID: PMC7592558 DOI: 10.1186/s13321-020-00468-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 12/13/2019] [Accepted: 10/13/2020] [Indexed: 12/14/2022] Open
Abstract
The specificity of toxicant-target biomolecule interactions contributes to the highly imbalanced nature of many toxicity datasets, causing poor performance in Structure–Activity Relationship (SAR)-based chemical classification. Undersampling and oversampling are representative techniques for handling such an imbalance challenge. However, removing inactive chemical compound instances from the majority class using an undersampling technique can result in information loss, whereas increasing active toxicant instances in the minority class by interpolation tends to introduce artificial minority instances that often cross into the majority class space, giving rise to class overlapping and a higher false prediction rate. In this study, in order to improve the prediction accuracy of imbalanced learning, we employed SMOTEENN, a combination of Synthetic Minority Over-sampling Technique (SMOTE) and Edited Nearest Neighbor (ENN) algorithms, to oversample the minority class by creating synthetic samples, followed by cleaning the mislabeled instances. We chose the highly imbalanced Tox21 dataset, which consisted of 12 in vitro bioassays for >10,000 chemicals that were distributed unevenly between binary classes. With Random Forest (RF) as the base classifier and bagging as the ensemble strategy, we applied four hybrid learning methods, i.e., RF without imbalance handling (RF), RF with Random Undersampling (RUS), RF with SMOTE (SMO), and RF with SMOTEENN (SMN). The performance of the four learning methods was compared using nine evaluation metrics, among which F1 score, Matthews correlation coefficient and Brier score provided a more consistent assessment of the overall performance across the 12 datasets. The Friedman’s aligned ranks test and the subsequent Bergmann-Hommel post hoc test showed that SMN significantly outperformed the other three methods. 
We also found that a strong negative correlation existed between the prediction accuracy and the imbalance ratio (IR), which is defined as the number of inactive compounds divided by the number of active compounds. SMN became less effective when IR exceeded a certain threshold (e.g., > 28). The ability to separate the few active compounds from the vast amounts of inactive ones is of great importance in computational toxicology. This work demonstrates that the performance of SAR-based, imbalanced chemical toxicity classification can be significantly improved through the use of data rebalancing.
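Two quantities from the abstract lend themselves to a short illustration: the imbalance ratio (inactive count divided by active count) and the interpolation step by which SMOTE creates a synthetic minority sample between a point and one of its minority-class neighbours. The sketch below is not the authors' code. The helper names are hypothetical, and the neighbour is supplied by hand rather than found by a k-nearest-neighbour search as in full SMOTE.

```python
import random
from typing import List, Sequence


def imbalance_ratio(labels: Sequence[int]) -> float:
    """IR = number of inactive (0) compounds / number of active (1) compounds."""
    active = sum(1 for y in labels if y == 1)
    inactive = sum(1 for y in labels if y == 0)
    if active == 0:
        raise ValueError("at least one active compound is required")
    return inactive / active


def smote_point(x: Sequence[float], neighbor: Sequence[float], rng: random.Random) -> List[float]:
    """Create one synthetic sample on the segment between x and a minority neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


# Hypothetical assay with 28 inactive compounds per active compound.
labels = [0] * 28 + [1]
print(imbalance_ratio(labels))

rng = random.Random(0)
print(smote_point([0.0, 0.0], [1.0, 2.0], rng))
```

Because the synthetic point always lies on the line segment between two minority samples, it can land inside the majority region when the classes overlap, which is the motivation the abstract gives for the ENN cleaning step.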
Affiliation(s)
- Gabriel Idakwo
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Sundar Thangapandian
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
| | - Joseph Luttrell
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Yan Li
- Bennett Aerospace Inc, Cary, NC, 27518, USA
| | - Nan Wang
- Department of Computer Science, New Jersey City University, Jersey City, NJ, 07305, USA
| | - Zhaoxian Zhou
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Centre for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Bei Yang
- School of Information & Engineering, Zhengzhou University, Zhengzhou, 450000, China
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA.
| | - Ping Gong
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA.
|
27
|
Zhao SS, Feng XL, Hu YC, Han Y, Tian Q, Sun YZ, Zhang J, Ge XW, Cheng SC, Li XL, Mao L, Shen SN, Yan LF, Cui GB, Wang W. Better efficacy in differentiating WHO grade II from III oligodendrogliomas with machine-learning than radiologist's reading from conventional T1 contrast-enhanced and fluid attenuated inversion recovery images. BMC Neurol 2020; 20:48. [PMID: 32033580 PMCID: PMC7007642 DOI: 10.1186/s12883-020-1613-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/18/2019] [Accepted: 01/13/2020] [Indexed: 12/13/2022] Open
Abstract
Background Using medical imaging to differentiate World Health Organization (WHO) grade II (ODG2) from grade III (ODG3) oligodendrogliomas remains a challenge. We investigated whether combining machine learning with radiomics from conventional T1 contrast-enhanced (T1 CE) and fluid attenuated inversion recovery (FLAIR) magnetic resonance imaging (MRI) offered superior efficacy. Methods Thirty-six patients with histologically confirmed ODGs who underwent T1 CE imaging (33 of whom also underwent FLAIR MR examination) before any intervention between January 2015 and July 2017 were retrospectively recruited for the current study. Volumes of interest (VOIs) covering the whole tumor enhancement were manually drawn slice by slice on the T1 CE and FLAIR images using ITK-SNAP, and a total of 1072 features were extracted from each VOI using 3D Slicer software. A random forest (RF) algorithm was applied to differentiate ODG2 from ODG3, and its efficacy was tested with 5-fold cross-validation. The diagnostic efficacy of radiomics-based machine learning and of radiologists' assessment were also compared. Results Nineteen ODG2 and 17 ODG3 were included in this study, and ODG3 tended to present with prominent necrosis and nodular/ring-like enhancement (P < 0.05). The AUC, ACC, sensitivity, and specificity of radiomics were 0.798, 0.735, 0.672, and 0.789 for T1 CE; 0.774, 0.689, 0.700, and 0.683 for FLAIR; and 0.861, 0.781, 0.778, and 0.783 for the combination, respectively. The AUCs of radiologists 1, 2 and 3 were 0.700, 0.687, and 0.714, respectively. The efficacy of machine learning based on radiomics was superior to the radiologists’ assessment. Conclusions Machine learning based on radiomics of T1 CE and FLAIR offered superior efficacy to that of radiologists in differentiating ODG2 from ODG3.
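The accuracy (ACC), sensitivity, and specificity figures above all derive from a confusion matrix. The sketch below shows how they relate, assuming for illustration that ODG3 is the positive class (1) and ODG2 the negative class (0). The abstract does not state which class is positive, and the function name and example labels are hypothetical.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (recall of class 1) and specificity (recall of class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical grades: 1 = ODG3 (assumed positive), 0 = ODG2.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
print(confusion_metrics(y_true, y_pred))
```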
Affiliation(s)
- Sha-Sha Zhao
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Xiu-Long Feng
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Yu-Chuan Hu
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Yu Han
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Qiang Tian
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Ying-Zhi Sun
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Jie Zhang
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Xiang-Wei Ge
- Student Brigade, Air Force Medical University, Xi'an, 710032, Shaanxi, China
| | - Si-Chao Cheng
- Student Brigade, Air Force Medical University, Xi'an, 710032, Shaanxi, China
| | - Xiu-Li Li
- Deepwise AI Lab, Deepwise Inc, No.8 Haidian avenue, Sinosteel International Plaza, Beijing, 100080, China
| | - Li Mao
- Deepwise AI Lab, Deepwise Inc, No.8 Haidian avenue, Sinosteel International Plaza, Beijing, 100080, China
| | - Shu-Ning Shen
- Department of Stomatology, PLA 984 Hospital, Beijing, China
| | - Lin-Feng Yan
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Guang-Bin Cui
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China
| | - Wen Wang
- Department of Radiology & Functional and Molecular Imaging Key Lab of Shaanxi Province, Tangdu Hospital, Air Force Medical University, 569 Xinsi Road, Xi'an, 710038, Shaanxi, People's Republic of China.
|
28
|
Fang CH, Theera-Ampornpunt N, Roth MA, Grama A, Chaterji S. AIKYATAN: mapping distal regulatory elements using convolutional learning on GPU. BMC Bioinformatics 2019; 20:488. [PMID: 31590652 PMCID: PMC6781298 DOI: 10.1186/s12859-019-3049-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2019] [Accepted: 08/22/2019] [Indexed: 12/02/2022] Open
Abstract
Background The data deluge can leverage sophisticated ML techniques for functionally annotating the regulatory non-coding genome. The challenge lies in selecting the appropriate classifier for the specific functional annotation problem, within the bounds of the hardware constraints and the model’s complexity. In our system Aikyatan, we annotate distal epigenomic regulatory sites, e.g., enhancers. Specifically, we develop a binary classifier that classifies genome sequences as distal regulatory regions or not, given their histone modifications’ combinatorial signatures. This problem is challenging because the regulatory regions are distal to the genes, with diverse signatures across classes (e.g., enhancers and insulators) and even within each class (e.g., different enhancer sub-classes). Results We develop a suite of ML models, under the banner Aikyatan, including SVM models, random forest variants, and deep learning architectures, for distal regulatory element (DRE) detection. We demonstrate, with strong empirical evidence, that deep learning approaches have a computational advantage. In addition, convolutional neural networks (CNN) provide the best-in-class accuracy, superior to the vanilla variant. With the human embryonic cell line H1, the CNN achieves an accuracy of 97.9% and an order of magnitude lower runtime than the kernel SVM. Running on a GPU, the training time is sped up 21x and 30x (over CPU) for DNN and CNN, respectively. Finally, our CNN model enjoys superior prediction performance vis-à-vis the competition. Specifically, Aikyatan-CNN achieved a 40% higher validation rate than CSIANN and the same accuracy as RFECS. Conclusions Our exhaustive experiments using an array of ML tools validate the need for a model that is not only expressive but can scale with increasing data volumes and diversity. In addition, a subset of these datasets has image-like properties and benefits from spatial pooling of features. Our Aikyatan suite leverages diverse epigenomic datasets that can then be modeled using CNNs with optimized activation and pooling functions. The goal is to capture the salient features of the integrated epigenomic datasets for deciphering the distal (non-coding) regulatory elements, which have been found to be associated with functional variants. Our source code will be made publicly available at: https://bitbucket.org/cellsandmachines/aikyatan. Electronic supplementary material The online version of this article (10.1186/s12859-019-3049-1) contains supplementary material, which is available to authorized users.
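The remark that some datasets benefit from spatial pooling of features can be illustrated with the three basic CNN operations on a one-dimensional signal: a sliding-window filter, a ReLU activation, and max pooling. This is a toy sketch in plain Python, not the Aikyatan architecture; the signal values, the filter, and the function names are hypothetical.

```python
from typing import List, Sequence


def conv1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Valid-mode sliding dot product (cross-correlation, as used in CNN layers)."""
    n = len(signal) - len(kernel) + 1
    if n <= 0:
        raise ValueError("kernel must not be longer than the signal")
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel))) for i in range(n)]


def relu(values: Sequence[float]) -> List[float]:
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values: Sequence[float], size: int) -> List[float]:
    """Keep the maximum of each non-overlapping window of length `size`."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical histone-mark signal along a genomic window, with two local peaks.
signal = [0, 1, 3, 1, 0, 2, 4, 2]
peak_filter = [-1, 2, -1]  # responds to local maxima
features = max_pool1d(relu(conv1d(signal, peak_filter)), size=2)
print(features)
```

Pooling keeps the strongest response in each window, so a peak is still detected if it shifts by a position or two, which is the kind of tolerance that helps with image-like inputs.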
Affiliation(s)
- Chih-Hao Fang
- Department of Ag. and Biological Engineering, Purdue University, West Lafayette, IN, USA
| | | | | | - Ananth Grama
- Department of Ag. and Biological Engineering, Purdue University, West Lafayette, IN, USA
| | - Somali Chaterji
- Department of Ag. and Biological Engineering, Purdue University, Purdue University, IN, USA.
|
29
|
Abstract
Gender recognition is an important research field for studying evidence about personal characteristics in the information and data society. However, some current traditional methods, such as those based on vision and sound, have exposed their own security weaknesses. Recently, biometric gender recognition based on electroencephalography (EEG) signals has been widely used in the information safety and medical fields. It is necessary to explore the potential of EEG to deliver more robust and accurate results with larger training data and sophisticated machine learning approaches. In this contribution, we present an automated gender recognition system built on a hybrid model and resting-state EEG data from twenty-eight subjects. These data are useful and convenient for gaining insight into gender differences. To achieve good performance and strong robustness, the system uses a hybrid model that combines random forest and logistic regression, and employs four common entropy measures to analyze the non-stationary EEG signals. The results suggest improved recognition performance, with an accuracy of 0.9982 and an AUC of 0.9926 based on a nested tenfold cross-validation loop, implying that the proposed approach has significant potential applicability and is capable of recognizing personal gender.
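The abstract does not name the four entropy measures it uses. As an illustration of what an entropy feature for a signal segment looks like, the sketch below computes the Shannon entropy of a histogram of amplitudes. The choice of Shannon entropy, the bin count, the function name, and the sample values are all assumptions made for this example.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(samples: Sequence[float], n_bins: int = 8) -> float:
    """Shannon entropy (in bits) of the amplitude histogram of a signal segment."""
    if not samples:
        raise ValueError("samples must not be empty")
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant signal carries no amplitude uncertainty
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    bins = Counter(min(int((x - lo) / width), n_bins - 1) for x in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


# Hypothetical amplitude samples from one EEG channel.
segment = [0.0, 1.0, 2.0, 3.0]
print(shannon_entropy(segment, n_bins=4))
```

A feature vector for a classifier could then hold one such value per channel and per entropy measure.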
Affiliation(s)
- Ping Wang
- The Center of Collaboration and Innovation, Jiangxi University of Technology, Nanchang, 330098 China
| | - Jianfeng Hu
- The Center of Collaboration and Innovation, Jiangxi University of Technology, Nanchang, 330098 China
|
30
|
Eneanya OA, Cano J, Dorigatti I, Anagbogu I, Okoronkwo C, Garske T, Donnelly CA. Environmental suitability for lymphatic filariasis in Nigeria. Parasit Vectors 2018; 11:513. [PMID: 30223860 PMCID: PMC6142334 DOI: 10.1186/s13071-018-3097-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 05/16/2018] [Accepted: 09/04/2018] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Lymphatic filariasis (LF) is a mosquito-borne parasitic disease and a major cause of disability worldwide. It is one of the neglected tropical diseases identified by the World Health Organization for elimination as a public health problem by 2020. Maps displaying disease distribution are helpful tools to identify high-risk areas and target scarce control resources. METHODS We used pre-intervention site-level occurrence data from 1192 survey sites collected during extensive mapping surveys by the Nigeria Ministry of Health. Using an ensemble of machine learning modelling algorithms (generalised boosted models and random forest), we mapped the ecological niche of LF at a spatial resolution of 1 km². By overlaying gridded estimates of population density, we estimated the human population living in LF risk areas on a 100 × 100 m scale. RESULTS Our maps demonstrate that there is a heterogeneous distribution of LF risk areas across Nigeria, with large portions of northern Nigeria having more environmentally suitable conditions for the occurrence of LF. We estimated that approximately 110 million individuals live in areas at risk of LF transmission. CONCLUSIONS Machine learning and ensemble modelling are powerful tools to map disease risk and are known to yield more accurate predictive models with less uncertainty than single models. The resulting map provides a geographical framework to target control efforts and assess their potential impacts.
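The two steps described above, combining model outputs into an ensemble and overlaying population on the resulting risk map, can be sketched on a handful of grid cells. The unweighted average, the 0.5 suitability threshold, the function names, and all values below are assumptions for illustration. The abstract does not give the ensemble weights or the cut-off, and a real analysis would first align the two grids to a common resolution.

```python
from typing import List, Sequence


def ensemble_mean(gbm: Sequence[float], rf: Sequence[float]) -> List[float]:
    """Unweighted mean of two models' suitability predictions, cell by cell."""
    if len(gbm) != len(rf):
        raise ValueError("prediction grids must have the same number of cells")
    return [(a + b) / 2.0 for a, b in zip(gbm, rf)]


def population_at_risk(suitability: Sequence[float], population: Sequence[float],
                       threshold: float = 0.5) -> float:
    """Sum the population of cells whose suitability meets the threshold."""
    return sum(p for s, p in zip(suitability, population) if s >= threshold)


# Hypothetical predictions for three grid cells and their population counts.
gbm = [0.2, 0.8, 0.6]
rf = [0.4, 0.6, 0.3]
population = [100, 250, 400]

suitability = ensemble_mean(gbm, rf)
print(population_at_risk(suitability, population))
```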
Affiliation(s)
- Obiora A. Eneanya
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, UK
| | - Jorge Cano
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, UK
| | - Ilaria Dorigatti
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, UK
| | | | | | - Tini Garske
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, UK
| | - Christl A. Donnelly
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, UK
- Department of Statistics, University of Oxford, Oxford, UK
|
31
|
Zhang HH, Yang L, Liu Y, Wang P, Yin J, Li Y, Qiu M, Zhu X, Yan F. Classification of Parkinson's disease utilizing multi-edit nearest-neighbor and ensemble learning algorithms with speech samples. Biomed Eng Online 2016; 15:122. [PMID: 27852279 PMCID: PMC5112697 DOI: 10.1186/s12938-016-0242-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 07/17/2016] [Accepted: 11/07/2016] [Indexed: 11/10/2022] Open
Abstract
Background The use of speech-based data in the classification of Parkinson's disease (PD) has been shown in recent years to provide an effective, non-invasive mode of classification. Thus, there has been increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classification is reducing noise within the collected speech samples, which is needed for better classification accuracy and stability. While the currently used methods are effective, instance selection has seldom been examined. Methods In this study, a PD classification algorithm that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm was proposed and examined. First, the MENN algorithm is applied iteratively to select optimal training speech samples, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is trained on the selected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. The proposed method was examined using a recently deposited public dataset and compared against other currently used algorithms for validation. Results Experimental results showed that the proposed algorithm achieved the largest improvement in classification accuracy (29.44%) compared with the other algorithm examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit higher stability, particularly when combining the MENN and RF algorithms.
Conclusions This study showed that the proposed method can improve PD classification when using speech data and can be applied in future studies seeking to improve PD classification methods.
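The instance-selection step can be sketched as follows. This is a minimal illustration of the classic multi-edit nearest-neighbour procedure (split the samples into subsets, classify each subset with 1-NN against the next subset, discard misclassified samples, and repeat until several rounds pass without removals); the abstract does not give the authors' exact variant, so the function names, the parameters `n_subsets` and `patience`, and the toy data are all assumptions. The ensemble classifier (RF or DNNE) would then be trained on the retained samples and is not shown.

```python
import random


def nearest_label(x, ref):
    """Label of the reference sample closest to x (squared Euclidean distance)."""
    best = min(ref, key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s[0])))
    return best[1]


def multi_edit(samples, n_subsets=3, patience=3, seed=0):
    """Multi-edit nearest-neighbour instance selection (illustrative sketch).

    samples: list of (feature_tuple, label) pairs. Returns the retained pairs.
    """
    rng = random.Random(seed)
    kept = list(samples)
    quiet_rounds = 0  # consecutive rounds in which nothing was discarded
    while quiet_rounds < patience and len(kept) > n_subsets:
        rng.shuffle(kept)
        subsets = [kept[i::n_subsets] for i in range(n_subsets)]
        survivors = []
        for i, subset in enumerate(subsets):
            ref = subsets[(i + 1) % n_subsets]  # next subset acts as reference
            survivors.extend(s for s in subset if nearest_label(s[0], ref) == s[1])
        quiet_rounds = quiet_rounds + 1 if len(survivors) == len(kept) else 0
        kept = survivors
    return kept


# Toy data: a large class-0 cluster, a small distant class-1 cluster,
# and one mislabelled sample sitting inside the class-0 cluster.
class0 = [((i * 0.1, (i % 5) * 0.1), 0) for i in range(30)]
class1 = [((10 + i * 0.1, 10.0), 1) for i in range(5)]
noise = ((1.05, 0.25), 1)
cleaned = multi_edit(class0 + class1 + [noise])
```

Editing can also discard correctly labelled samples, for example minority-class samples whose reference subset happens to contain no neighbour of the same class, so the retained set should be checked for class balance before training.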
Affiliation(s)
- He-Hua Zhang
- Institute of Surgery Research, Daping Hospital, Third Military Medical University, Chongqing, 400042, China
- Liuyang Yang
- College of Communication Engineering, Chongqing University, Chongqing, 400044, China
- Yuchuan Liu
- College of Communication Engineering, Chongqing University, Chongqing, 400044, China
- Pin Wang
- College of Communication Engineering, Chongqing University, Chongqing, 400044, China
- Jun Yin
- Institute of Surgery Research, Daping Hospital, Third Military Medical University, Chongqing, 400042, China
- Yongming Li
- College of Communication Engineering, Chongqing University, Chongqing, 400044, China; Department of Medical Image, College of Biomedical Engineering, Third Military Medical University, Chongqing, 400038, China
- Mingguo Qiu
- Department of Medical Image, College of Biomedical Engineering, Third Military Medical University, Chongqing, 400038, China
- Xueru Zhu
- College of Communication Engineering, Chongqing University, Chongqing, 400044, China
- Fang Yan
- College of Communication Engineering, Chongqing University, Chongqing, 400044, China
32
Biscarini F, Nazzicari N, Broccanello C, Stevanato P, Marini S. "Noisy beets": impact of phenotyping errors on genomic predictions for binary traits in Beta vulgaris. Plant Methods 2016; 12:36. [PMID: 27437026 PMCID: PMC4949885 DOI: 10.1186/s13007-016-0136-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/01/2016] [Accepted: 07/06/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND Noise (errors) in scientific data is endemic and may have a detrimental effect on statistical analyses and experimental results. The effects of noisy data have been assessed in genome-wide association studies for case-control experiments in human medicine. Little is known, however, about the impact of noisy data on genomic predictions, a widely used statistical application in plant and animal breeding. RESULTS In this study, the sensitivity to noise in the data of five classification methods (K-nearest neighbours (KNN), random forest (RF), ridge logistic regression (LR), and support vector machines with linear or radial basis function kernels) was investigated. A sugar beet population of 123 plants phenotyped for a binary trait and genotyped for 192 SNP (single nucleotide polymorphism) markers was used. Labels (0/1 phenotype) were randomly sampled to generate noise. Starting from the base scenario without errors in the labels, increasing proportions of noisy labels (up to 50 %) were generated and introduced into the data. CONCLUSIONS Local classification methods (KNN and RF) showed higher tolerance to noisy labels than methods that leverage global data properties (LR and the two SVM models). In particular, KNN outperformed all other classifiers, with an AUC (area under the ROC curve) higher than 0.95 with up to 20 % noisy labels. The runner-up method, RF, had an AUC of 0.941 with 20 % noise.
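The noise-injection step can be sketched as below. The abstract says only that labels were "randomly sampled to generate noise"; this sketch takes one plausible reading, flipping the 0/1 label of a randomly chosen fixed proportion of individuals, and the function name `flip_labels` and the toy label vector are hypothetical.

```python
import random


def flip_labels(labels, proportion, seed=0):
    """Return a copy of binary labels with a given proportion flipped (0 <-> 1).

    Flipping is an assumed noise model, not necessarily the authors' scheme.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)
    n_noisy = round(proportion * len(labels))
    noisy = list(labels)
    # Pick distinct individuals so exactly n_noisy labels end up wrong.
    for i in rng.sample(range(len(labels)), n_noisy):
        noisy[i] = 1 - noisy[i]
    return noisy


# Toy phenotype vector with the same size as the study population (123 plants).
clean = [i % 2 for i in range(123)]
noisy = flip_labels(clean, 0.20)
```

Repeating the call over a grid of proportions (for example 0 to 0.5) and refitting each classifier on the noisy labels would reproduce the general shape of the experiment.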
Affiliation(s)
- Filippo Biscarini
- Department of Bioinformatics and Biostatistics, PTP Science Park, Via Einstein - Loc. Cascina Codazza, 26900 Lodi, Italy
- Nelson Nazzicari
- Council for Agricultural Research and Economics (CREA), Research Centre for Fodder Crops and Dairy Productions, Lodi, Italy
- Simone Marini
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
33
Roy PK, Bhuiyan A, Janke A, Desmond PM, Wong TY, Abhayaratna WP, Storey E, Ramamohanarao K. Automatic white matter lesion segmentation using contrast enhanced FLAIR intensity and Markov Random Field. Comput Med Imaging Graph 2015; 45:102-11. [PMID: 26398564 DOI: 10.1016/j.compmedimag.2015.08.005] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 11/22/2014] [Revised: 08/08/2015] [Accepted: 08/18/2015] [Indexed: 11/24/2022]
Abstract
White matter lesions (WMLs) are small groups of dead cells that clump together in the white matter of the brain. In this paper, we propose a reliable method to automatically segment WMLs. Our method uses a novel filter to enhance the intensity of WMLs. A feature set containing enhanced intensity, anatomical and spatial information is then used to train a random forest classifier for the initial segmentation of WMLs. Following that, a reliable and robust Markov Random Field (MRF) based on an edge potential function is proposed to obtain the final segmentation by removing false positive WMLs. Quantitative evaluation of the proposed method is performed on 24 subjects of the ENVISion study. The segmentation results are validated against manual segmentation performed under the supervision of an expert neuroradiologist. The results show a Dice similarity index of 0.76 for severe lesion load, 0.73 for moderate lesion load and 0.61 for mild lesion load. In addition, we have compared our method with three state-of-the-art methods on 20 subjects of the Medical Image Computing and Computer Aided Intervention Society's (MICCAI's) MS lesion challenge dataset, where our method shows better segmentation accuracy than the state-of-the-art methods. These results indicate that the proposed method can assist neuroradiologists in assessing WMLs in clinical practice.
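For reference, the Dice similarity index reported above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between an automatic segmentation A and a manual segmentation B. A minimal sketch on flattened binary masks follows; the function name `dice_index`, and the convention of returning 1.0 when both masks are empty, are assumptions made for this illustration.

```python
def dice_index(pred, truth):
    """Dice similarity index 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly; treat that case as a score of 1.0.
    return 1.0 if size == 0 else 2.0 * overlap / size


# One shared lesion voxel, two predicted, one in the manual mask: 2*1/(2+1).
score = dice_index([1, 1, 0, 0], [1, 0, 0, 0])
```

Because the index is normalised by total lesion size, a few misclassified voxels cost more when lesions are small, which is consistent with the lower score reported for mild lesion load.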
34
Cao DS, Zhang LX, Tan GS, Xiang Z, Zeng WB, Xu QS, Chen AF. Computational Prediction of Drug-Target Interactions Using Chemical, Biological, and Network Features. Mol Inform 2014; 33:669-81. [PMID: 27485302 DOI: 10.1002/minf.201400009] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 01/23/2014] [Accepted: 04/22/2014] [Indexed: 02/02/2023]
Abstract
Drug-target interactions (DTIs) are central to current drug discovery processes. Efforts have been devoted to the development of methodology for predicting DTIs and drug-target interaction networks. Most existing methods focus mainly on information about drug or protein structure features. In the present work, we propose a computational method for DTI prediction that combines information from chemical, biological and network properties. The method is based on a learning algorithm, random forest (RF), combined with integrated features for predicting DTIs. Four classes of drug-target interaction networks in humans, involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are independently used for establishing predictive models. The RF models gave prediction accuracies of 93.52 %, 94.84 %, 89.68 % and 84.72 % for the four pharmaceutically useful datasets, respectively. The prediction ability of our approach is comparable to or even better than that of other DTI prediction methods. These comparative results demonstrate the relevance of network topology as a source of information for predicting DTIs. Further analysis confirmed that among our top-ranked predictions, several DTIs are supported by databases, while the others represent novel potential DTIs. We believe that our proposed approach can help to limit the search space of DTIs and provide a new way towards repositioning old drugs and identifying targets.
Affiliation(s)
- Dong-Sheng Cao
- School of Pharmaceutical Sciences, Central South University, Changsha, 410013, P.R. China
- Liu-Xia Zhang
- The 163rd Hospital of The Chinese People's Liberation Army, Changsha 410003, P.R. China
- Gui-Shan Tan
- School of Pharmaceutical Sciences, Central South University, Changsha, 410013, P.R. China
- Zheng Xiang
- School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou 325035, P.R. China
- Wen-Bin Zeng
- School of Pharmaceutical Sciences, Central South University, Changsha, 410013, P.R. China
- Qing-Song Xu
- School of Mathematics and Statistics, Central South University, Changsha 410083, P.R. China
- Alex F Chen
- School of Pharmaceutical Sciences, Central South University, Changsha, 410013, P.R. China