51
Azam N, Ahmad T, Ul Haq N. Automatic emotion recognition in healthcare data using supervised machine learning. PeerJ Comput Sci 2021; 7:e751. [PMID: 35036528 PMCID: PMC8725656 DOI: 10.7717/peerj-cs.751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2020] [Accepted: 09/28/2021] [Indexed: 06/14/2023]
Abstract
Human feelings are fundamental to understanding the conduct and state of mind of an individual. A healthy emotional state is an important factor in improving personal satisfaction, whereas poor emotional health can lead to social or psychological well-being issues. Recognizing or detecting feelings in online health care data gives important and helpful information regarding the emotional state of patients. Detecting a patient's emotion in relation to a specific disease using text from online sources is a challenging task. In this paper, we propose a method for the automatic detection of patients' emotions in healthcare data using supervised machine learning approaches. For this purpose, we created a new dataset named EmoHD, comprising 4,202 text samples across eight disease classes and six emotion classes, gathered from different online resources. We used six different supervised machine learning models based on different feature engineering techniques. We also performed a detailed comparison of the six chosen machine learning algorithms using different feature vectors on our dataset. We achieved the highest accuracy, 87%, using a MultiLayer Perceptron as compared to other state-of-the-art models. Moreover, we use the emotional guidance scale to show that there is a link between negative emotion and psychological health issues. Our proposed work will be helpful to automatically detect a patient's emotion during disease and to avoid extreme acts like suicide, mental disorders, or psychological health issues. The implementation details are made publicly available at the given link: https://bit.ly/2NQeGET.
Affiliation(s)
- Nazish Azam
  - Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Pakistan
- Tauqir Ahmad
  - Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Pakistan
- Nazeef Ul Haq
  - School of Electrical Engineering and Computer Science, National University of Sciences and Technology, Islamabad, Pakistan

52
Chen J, Girard M, Wang S, Kisfalvi K, Lirio R. Using supervised machine learning approach to predict treatment outcomes of vedolizumab in ulcerative colitis patients. J Biopharm Stat 2021; 32:330-345. [PMID: 34882518 DOI: 10.1080/10543406.2021.2009500] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/19/2022]
Abstract
With recent advances in machine learning, we demonstrated the use of supervised machine learning to optimize the prediction of treatment outcomes of vedolizumab through iterative optimization using VARSITY and VISIBLE 1 data in patients with moderate-to-severe ulcerative colitis. The analysis was carried out using elastic net regularized regression following a 2-stage training process. The model performance was assessed through AUROC, specificity, sensitivity, and accuracy. The generalizable predictive patterns suggest that easily obtained baseline and medical history variables may be able to predict therapeutic response to vedolizumab with clinically meaningful accuracy, implying a potential for individualized prescription of vedolizumab.
Affiliation(s)
- Jingjing Chen
  - Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
- Song Wang
  - Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
- Richard Lirio
  - Takeda Pharmaceuticals, Cambridge, Massachusetts, USA

53
Clermont J, Woodward-Gagné S, Berteaux D. Digging into the behaviour of an active hunting predator: arctic fox prey caching events revealed by accelerometry. Mov Ecol 2021; 9:58. [PMID: 34838144 PMCID: PMC8626921 DOI: 10.1186/s40462-021-00295-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/22/2021] [Accepted: 11/14/2021] [Indexed: 05/29/2023]
Abstract
BACKGROUND Biologging now allows detailed recording of animal movement, thus informing behavioural ecology in ways unthinkable just a few years ago. In particular, combining GPS and accelerometry allows spatially explicit tracking of various behaviours, including predation events in large terrestrial mammalian predators. Specifically, identification of location clusters resulting from prey handling allows efficient location of killing events. For small predators with short prey handling times, however, identifying predation events through technology remains unresolved. We propose that a promising avenue emerges when specific foraging behaviours generate diagnostic acceleration patterns. One such example is the caching behaviour of the arctic fox (Vulpes lagopus), an active hunting predator strongly relying on food storage when living in proximity to bird colonies. METHODS We equipped 16 arctic foxes from Bylot Island (Nunavut, Canada) with GPS and accelerometers, yielding 23 fox-summers of movement data. Accelerometers recorded tri-axial acceleration at 50 Hz while we obtained a sample of simultaneous video recordings of fox behaviour. Multiple supervised machine learning algorithms were tested to classify accelerometry data into 4 behaviours: motionless, running, walking and digging, the latter being associated with food caching. Finally, we assessed the spatio-temporal concordance of fox digging and greater snow goose (Anser caerulescens atlanticus) nesting, to test the ecological relevance of our behavioural classification in a well-known study system dominated by top-down trophic interactions. RESULTS The random forest model yielded the best behavioural classification, with accuracies for each behaviour over 96%. Overall, arctic foxes spent 49% of the time motionless, 34% running, 9% walking, and 8% digging. The probability of digging increased with goose nest density and this result held during both goose egg incubation and brooding periods.
CONCLUSIONS Accelerometry combined with GPS allowed us to track across space and time a critical foraging behaviour from a small active hunting predator, informing on spatio-temporal distribution of predation risk in an Arctic vertebrate community. Our study opens new possibilities for assessing the foraging behaviour of terrestrial predators, a key step to disentangle the subtle mechanisms structuring many predator-prey interactions and trophic networks.
Affiliation(s)
- Jeanne Clermont
  - Canada Research Chair On Northern Biodiversity, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC, G5L 3A1, Canada
  - Center for Northern Studies, Quebec, Canada
  - Quebec Center for Biodiversity Science, Montreal, Canada
- Sasha Woodward-Gagné
  - Canada Research Chair On Northern Biodiversity, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC, G5L 3A1, Canada
- Dominique Berteaux
  - Canada Research Chair On Northern Biodiversity, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC, G5L 3A1, Canada
  - Center for Northern Studies, Quebec, Canada
  - Quebec Center for Biodiversity Science, Montreal, Canada

54
Kibbey TCG, Jabrzemski R, O'Carroll DM. Predicting the relationship between PFAS component signatures in water and non-water phases through mathematical transformation: Application to machine learning classification. Chemosphere 2021; 282:131097. [PMID: 34119734 DOI: 10.1016/j.chemosphere.2021.131097] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/19/2021] [Revised: 05/27/2021] [Accepted: 06/01/2021] [Indexed: 06/12/2023]
Abstract
Per- and polyfluoroalkyl substances (PFAS) are widespread in the environment, as a result of decades of use across a range of applications. While PFAS contamination often enters the environment in the aqueous phase, PFAS is regularly detected in a range of different phases, including soils, sediments and biota. Although PFAS at a given site may originate from the same sources, the compositions observed in different phases are nearly always different, a fact that can complicate source allocation efforts. This paper presents a quantitative method for prediction of the relative composition of PFAS in different phases for components for which differences in behavior are primarily driven by hydrophobicity. The derived equations suggest that under these conditions, the relative compositions in different phases in contact with water should be independent of overall affinity for the phase, and as such should be the same for all non-water phases. This result is illustrated with data from individual samples, as well as from site-wide evaluations for a range of different phases. The results of the work provide a useful tool to reconcile PFAS composition differences in different phases, and provide a baseline for recognizing cases where hydrophobicity is not the primary driver of differences in distribution between phases. Furthermore, the results may be useful in forensic applications for classification of PFAS across different phases. The use of the resulting equations to transform water data to train a supervised learning algorithm for forensic analysis of PFAS in non-water phases is illustrated.
Affiliation(s)
- Tohren C G Kibbey
  - School of Civil Engineering and Environmental Science, University of Oklahoma, Norman, OK, 73019, USA
- Rafal Jabrzemski
  - School of Computer Science, University of Oklahoma, Norman, OK, 73019, USA
- Denis M O'Carroll
  - School of Civil and Environmental Engineering, UNSW Sydney, Manly Vale, NSW, 2093, Australia

55
Huber F, van der Burg S, van der Hooft JJJ, Ridder L. MS2DeepScore: a novel deep learning similarity measure to compare tandem mass spectra. J Cheminform 2021; 13:84. [PMID: 34715914 PMCID: PMC8556919 DOI: 10.1186/s13321-021-00558-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 07/09/2021] [Accepted: 09/25/2021] [Indexed: 11/18/2022]
Abstract
Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity, but this approach is strongly limited by a generally poor correlation between the two metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of > 100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model's prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. Furthermore, the prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. We also demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra.
Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.
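The two quantities this abstract reports, Tanimoto scores and root mean squared error, can be illustrated with a minimal sketch. It assumes molecular fingerprints are represented as sets of "on" bit positions; the fingerprints, predicted scores, and function names below are hypothetical toy values for illustration and are not taken from MS2DeepScore or its dataset.

```python
from math import sqrt


def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not bits_a and not bits_b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def rmse(predicted, actual):
    """Root mean squared error between predicted and reference scores."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical fingerprints: 2 shared bits out of 5 distinct bits.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8}
score = tanimoto(fp_a, fp_b)

# Hypothetical model outputs compared against reference Tanimoto scores.
predicted_scores = [0.5, 0.9]
reference_scores = [0.4, 1.0]
error = rmse(predicted_scores, reference_scores)

print(f"Tanimoto score: {score:.2f}")
print(f"RMSE: {error:.2f}")
```

In practice, fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the arithmetic is the same.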
Affiliation(s)
- Florian Huber
  - Netherlands eScience Center, 1098 XG, Amsterdam, The Netherlands
- Lars Ridder
  - Netherlands eScience Center, 1098 XG, Amsterdam, The Netherlands

56
Casali M, Malchiodi D, Spada C, Zanaboni AM, Cotroneo R, Furci D, Sommariva A, Genovese U, Blandino A. A pilot study for investigating the feasibility of supervised machine learning approaches for the classification of pedestrians struck by vehicles. J Forensic Leg Med 2021; 84:102256. [PMID: 34678617 DOI: 10.1016/j.jflm.2021.102256] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/17/2021] [Revised: 09/17/2021] [Accepted: 09/27/2021] [Indexed: 12/23/2022]
Abstract
This research focuses on the application of Artificial Intelligence (AI) methodologies to the problem of classifying vehicles involved in lethal pedestrian collisions. Specifically, the vehicle type is predicted on the basis of traumatic injury suffered by casualties, exploiting machine learning algorithms. In the present study, AI-assisted diagnosis yielded a correct prediction about 70% of the time. In pedestrians struck by trucks, more severe injuries were observed in the facial skeleton, lungs, major airways, liver, and spleen as well as in the sternum/clavicle/rib complex, whereas the lower extremities were more affected by fractures in pedestrians struck by cars. Although the distinction of the striking vehicle should develop beyond autopsy evidence alone, the presented approach, which is novel in the realm of forensic science, is shown to be effective in building automated decision support systems. Outcomes from this system can provide valuable information after the execution of autopsy examinations, supporting the forensic investigation. Preliminary results from the application of machine learning algorithms to real-world datasets seem to highlight the efficacy of the proposed approach, which could be used for further studies concerning this topic.

57
Bleker J, Yakar D, van Noort B, Rouw D, de Jong IJ, Dierckx RAJO, Kwee TC, Huisman H. Single-center versus multi-center biparametric MRI radiomics approach for clinically significant peripheral zone prostate cancer. Insights Imaging 2021; 12:150. [PMID: 34674058 PMCID: PMC8531183 DOI: 10.1186/s13244-021-01099-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/14/2021] [Accepted: 09/11/2021] [Indexed: 01/06/2023]
Abstract
Objectives To investigate a previously developed radiomics-based biparametric magnetic resonance imaging (bpMRI) approach for discrimination of clinically significant peripheral zone prostate cancer (PZ csPCa) using multi-center, multi-vendor (McMv) and single-center, single-vendor (ScSv) datasets.
Methods This study's starting point was a previously developed ScSv algorithm for PZ csPCa whose performance was demonstrated in a single-center dataset. A McMv dataset was collected, and 262 PZ PCa lesions (9 centers, 2 vendors) were selected to develop a multi-center algorithm in an identical way. The single-center algorithm was then applied to the multi-center dataset (single–multi-validation), and the McMv algorithm was applied to both the multi-center dataset (multi–multi-validation) and the previously used single-center dataset (multi–single-validation). The areas under the curve (AUCs) of the validations were compared using bootstrapping.
Results Previously, the single–single validation achieved an AUC of 0.82 (95% CI 0.71–0.92); the single–multi-validation AUC of 0.59 (95% CI 0.51–0.68) represents a significant performance reduction of 27.2%. The new multi-center model achieved a multi–multi-validation AUC of 0.75 (95% CI 0.64–0.84). Compared to the multi–single-validation AUC of 0.66 (95% CI 0.56–0.75), the performance did not decrease significantly (p value: 0.114). Bootstrapped comparison showed similar single-center performances and a significantly different multi-center performance (p values: 0.03, 0.012).
Conclusions A single-center trained radiomics-based bpMRI model does not generalize to multi-center data. Multi-center trained radiomics-based bpMRI models do generalize, have equal single-center performance and perform better on multi-center data.
Supplementary Information The online version contains supplementary material available at 10.1186/s13244-021-01099-y.
Affiliation(s)
- Jeroen Bleker
  - Departments of Radiology, Nuclear Medicine and Molecular Imaging, Medical Imaging Center, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700 RB, Groningen, The Netherlands
  - Meditech Building, Room n305, L.J. Zielstraweg 1, 9713 GX, Groningen, The Netherlands
- Derya Yakar
  - Departments of Radiology, Nuclear Medicine and Molecular Imaging, Medical Imaging Center, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700 RB, Groningen, The Netherlands
- Bram van Noort
  - Departments of Radiology, Nuclear Medicine and Molecular Imaging, Medical Imaging Center, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700 RB, Groningen, The Netherlands
- Dennis Rouw
  - Department of Radiology, Martini Hospital Groningen, Van Swietenplein 1, 9728 NT, Groningen, The Netherlands
- Igle Jan de Jong
  - Department of Urology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700 RB, Groningen, The Netherlands
- Rudi A J O Dierckx
  - Departments of Radiology, Nuclear Medicine and Molecular Imaging, Medical Imaging Center, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700 RB, Groningen, The Netherlands
- Thomas C Kwee
  - Departments of Radiology, Nuclear Medicine and Molecular Imaging, Medical Imaging Center, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700 RB, Groningen, The Netherlands
- Henkjan Huisman
  - Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, The Netherlands

58
Ku EJ, Lee C, Shim J, Lee S, Kim KA, Kim SW, Rhee Y, Kim HJ, Lim JS, Chung CH, Chun SW, Yoo SJ, Ryu OH, Cho HC, Hong AR, Ahn CH, Kim JH, Choi MH. Metabolic Subtyping of Adrenal Tumors: Prospective Multi-Center Cohort Study in Korea. Endocrinol Metab (Seoul) 2021; 36:1131-1141. [PMID: 34674508 PMCID: PMC8566125 DOI: 10.3803/enm.2021.1149] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/11/2021] [Accepted: 09/10/2021] [Indexed: 12/27/2022]
Abstract
BACKGROUND Conventional diagnostic approaches for adrenal tumors require multi-step processes, including imaging studies and dynamic hormone tests. Therefore, this study aimed to discriminate adrenal tumors from a single blood sample based on the combination of liquid chromatography-mass spectrometry (LC-MS) and machine learning algorithms in serum profiling of adrenal steroids. METHODS The LC-MS-based steroid profiling was applied to serum samples obtained from patients with nonfunctioning adenoma (NFA, n=73), Cushing's syndrome (CS, n=30), and primary aldosteronism (PA, n=40) in a prospective multicenter study of adrenal disease. The decision tree (DT), random forest (RF), and extreme gradient boost (XGBoost) algorithms were used to categorize the subtypes of adrenal tumors. RESULTS The CS group showed higher serum levels of 11-deoxycortisol than the NFA group, and increased levels of tetrahydrocortisone (THE), 20α-dihydrocortisol, and 6β-hydroxycortisol were found in the PA group. However, the CS group showed lower levels of dehydroepiandrosterone (DHEA) and its sulfate derivative (DHEA-S) than both the NFA and PA groups. Patients with PA expressed higher serum 18-hydroxycortisol and DHEA but lower THE than NFA patients. The balanced accuracies of DT, RF, and XGBoost for classifying each type were 78%, 96%, and 97%, respectively. In receiver operating characteristic (ROC) analysis for CS, XGBoost and RF showed significantly greater diagnostic power than DT. However, in ROC analysis for PA, only RF exhibited better diagnostic performance than DT. CONCLUSION The combination of LC-MS-based steroid profiling with machine learning algorithms could be a promising one-step diagnostic approach for the classification of adrenal tumor subtypes.
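Balanced accuracy, the metric reported above, is commonly defined as the mean of per-class recall, which keeps a large class (here NFA) from dominating the score. The sketch below uses that common definition with hypothetical toy labels; it does not reproduce the study's data or its exact evaluation procedure.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels: an imbalanced three-class problem.
y_true = ["NFA"] * 6 + ["CS"] * 2 + ["PA"] * 2
y_pred = ["NFA"] * 6 + ["CS", "NFA"] + ["PA", "NFA"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_accuracy = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {bal_accuracy:.3f}")
```

On this toy example the plain accuracy is higher than the balanced accuracy because every error falls in the two small classes.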
Affiliation(s)
- Eu Jeong Ku
  - Department of Internal Medicine, Chungbuk National University Hospital, Chungbuk National University College of Medicine, Cheongju, Korea
- Chaelin Lee
  - Molecular Recognition Research Center, Korea Institute of Science and Technology, Seoul, Korea
- Jaeyoon Shim
  - Molecular Recognition Research Center, Korea Institute of Science and Technology, Seoul, Korea
- Sihoon Lee
  - Department of Internal Medicine, Gachon University College of Medicine, Incheon, Korea
- Kyoung-Ah Kim
  - Department of Internal Medicine, Dongguk University Ilsan Hospital, Dongguk University College of Medicine, Goyang, Korea
- Sang Wan Kim
  - Department of Internal Medicine, Seoul Metropolitan Government Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul, Korea
- Yumie Rhee
  - Department of Internal Medicine, Yonsei University College of Medicine, Seoul, Korea
- Hyo-Jeong Kim
  - Department of Internal Medicine, Nowon Eulji Medical Center, Eulji University, Seoul, Korea
- Jung Soo Lim
  - Department of Internal Medicine, Yonsei University Wonju College of Medicine, Wonju, Korea
- Choon Hee Chung
  - Department of Internal Medicine, Yonsei University Wonju College of Medicine, Wonju, Korea
- Sung Wan Chun
  - Department of Internal Medicine, Soonchunhyang University Cheonan Hospital, Soonchunhyang University College of Medicine, Cheonan, Korea
- Soon-Jib Yoo
  - Division of Endocrinology and Metabolism, Department of Internal Medicine, Bucheon St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Bucheon, Korea
- Ohk-Hyun Ryu
  - Department of Internal Medicine, Hallym University Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Korea
- Ho Chan Cho
  - Department of Internal Medicine, Keimyung University School of Medicine, Daegu, Korea
- A Ram Hong
  - Department of Internal Medicine, Chonnam National University Medical School, Gwangju, Korea
- Chang Ho Ahn
  - Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Jung Hee Kim
  - Department of Internal Medicine, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Man Ho Choi
  - Molecular Recognition Research Center, Korea Institute of Science and Technology, Seoul, Korea

59
Tideman LEM, Migas LG, Djambazova KV, Patterson NH, Caprioli RM, Spraggins JM, Van de Plas R. Automated biomarker candidate discovery in imaging mass spectrometry data through spatially localized Shapley additive explanations. Anal Chim Acta 2021; 1177:338522. [PMID: 34482894 PMCID: PMC10124144 DOI: 10.1016/j.aca.2021.338522] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/12/2021] [Revised: 04/04/2021] [Accepted: 04/11/2021] [Indexed: 01/09/2023]
Abstract
The search for molecular species that are differentially expressed between biological states is an important step towards discovering promising biomarker candidates. In imaging mass spectrometry (IMS), performing this search manually is often impractical due to the large size and high-dimensionality of IMS datasets. Instead, we propose an interpretable machine learning workflow that automatically identifies biomarker candidates by their mass-to-charge ratios, and that quantitatively estimates their relevance to recognizing a given biological class using Shapley additive explanations (SHAP). The task of biomarker candidate discovery is translated into a feature ranking problem: given a classification model that assigns pixels to different biological classes on the basis of their mass spectra, the molecular species that the model uses as features are ranked in descending order of relative predictive importance such that the top-ranking features have a higher likelihood of being useful biomarkers. Besides providing the user with an experiment-wide measure of a molecular species' biomarker potential, our workflow delivers spatially localized explanations of the classification model's decision-making process in the form of a novel representation called SHAP maps. SHAP maps deliver insight into the spatial specificity of biomarker candidates by highlighting in which regions of the tissue sample each feature provides discriminative information and in which regions it does not. SHAP maps also enable one to determine whether the relationship between a biomarker candidate and a biological state of interest is correlative or anticorrelative. 
Our automated approach to estimating a molecular species' potential for characterizing a user-provided biological class, combined with the untargeted and multiplexed nature of IMS, allows for the rapid screening of thousands of molecular species and the obtention of a broader biomarker candidate shortlist than would be possible through targeted manual assessment. Our biomarker candidate discovery workflow is demonstrated on mouse-pup and rat kidney case studies.
Affiliation(s)
- Leonoor E M Tideman
  - Delft Center for Systems and Control, Delft University of Technology, Delft, Netherlands
- Lukasz G Migas
  - Delft Center for Systems and Control, Delft University of Technology, Delft, Netherlands
- Katerina V Djambazova
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN, USA; Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Nathan Heath Patterson
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN, USA; Department of Biochemistry, Vanderbilt University, Nashville, TN, USA
- Richard M Caprioli
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN, USA; Department of Biochemistry, Vanderbilt University, Nashville, TN, USA; Department of Chemistry, Vanderbilt University, Nashville, TN, USA; Department of Pharmacology, Vanderbilt University, Nashville, TN, USA; Department of Medicine, Vanderbilt University, Nashville, TN, USA
- Jeffrey M Spraggins
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN, USA; Department of Biochemistry, Vanderbilt University, Nashville, TN, USA; Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Raf Van de Plas
  - Delft Center for Systems and Control, Delft University of Technology, Delft, Netherlands; Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN, USA; Department of Biochemistry, Vanderbilt University, Nashville, TN, USA

60
Jiao Y, Lesueur F, Azencott CA, Laurent M, Mebirouk N, Laborde L, Beauvallet J, Dondon MG, Eon-Marchais S, Laugé A, Noguès C, Andrieu N, Stoppa-Lyonnet D, Caputo SM. A new hybrid record linkage process to make epidemiological databases interoperable: application to the GEMO and GENEPSO studies involving BRCA1 and BRCA2 mutation carriers. BMC Med Res Methodol 2021; 21:155. [PMID: 34325649 PMCID: PMC8320036 DOI: 10.1186/s12874-021-01299-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/09/2020] [Accepted: 04/29/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Linking independent sources of data describing the same individuals enables innovative epidemiological and health studies but requires a robust record linkage approach. We describe a hybrid record linkage process to link databases from two independent ongoing French national studies, GEMO (Genetic Modifiers of BRCA1 and BRCA2), which focuses on the identification of genetic factors modifying cancer risk of BRCA1 and BRCA2 mutation carriers, and GENEPSO (prospective cohort of BRCAx mutation carriers), which focuses on environmental and lifestyle risk factors. METHODS To identify as many as possible of the individuals participating in the two studies but not registered by a shared identifier, we combined probabilistic record linkage (PRL) and supervised machine learning (ML). This approach (named "PRL + ML") combined the candidate matches identified by both approaches. We built the ML model using the gold standard on a first version of the two databases as a training dataset. This gold standard was obtained from PRL-derived matches verified by an exhaustive manual review. RESULTS The Random Forest (RF) algorithm showed the highest recall (0.985) among six widely used ML algorithms: RF, Bagged trees, AdaBoost, Support Vector Machine, Neural Network. Therefore, RF was selected to build the ML model since our goal was to identify the maximum number of true matches. Our combined linkage PRL + ML showed a higher recall (range 0.988-0.992) than either PRL (range 0.916-0.991) or ML (0.981) alone. It identified 1995 individuals participating in both GEMO (6375 participants) and GENEPSO (4925 participants). CONCLUSIONS Our hybrid linkage process represents an efficient tool for linking GEMO and GENEPSO. It may be generalizable to other epidemiological studies involving other databases and registries.
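The combination step described above, pooling the candidate matches from both approaches, can be sketched as a set union, with recall measured against a gold standard. This is a minimal illustration under the assumption that a candidate match is a pair of record identifiers; the identifiers, set contents, and function name are hypothetical and do not come from the GEMO or GENEPSO databases.

```python
def recall(candidates, gold_standard):
    """Fraction of true matches that appear among the candidate matches."""
    return len(candidates & gold_standard) / len(gold_standard)


# Hypothetical gold standard: true (GEMO id, GENEPSO id) pairs.
gold = {("g1", "p1"), ("g2", "p2"), ("g3", "p3"), ("g4", "p4")}

# Hypothetical outputs of the two linkage approaches.
prl_matches = {("g1", "p1"), ("g2", "p2"), ("g9", "p9")}  # includes one false match
ml_matches = {("g2", "p2"), ("g3", "p3")}

# The union keeps every pair proposed by either approach.
combined_matches = prl_matches | ml_matches

recall_prl = recall(prl_matches, gold)
recall_ml = recall(ml_matches, gold)
recall_combined = recall(combined_matches, gold)

print(recall_prl, recall_ml, recall_combined)
```

Because the union can only add candidates, its recall is never lower than that of either approach alone; false matches are pooled as well, so precision has to be checked separately.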
Collapse
Affiliation(s)
- Yue Jiao
- Department of Genetics, Institut Curie, PSL Research University, Paris, France.,Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Fabienne Lesueur
- Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Chloé-Agathe Azencott
- Inserm, U900, Paris, France.,Mines ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
| | - Maïté Laurent
- Department of Genetics, Institut Curie, PSL Research University, Paris, France
| | - Noura Mebirouk
- Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Lilian Laborde
- Institut Paoli-Calmettes, Centre de Traitement des Données IPC-PACA, Département de la Recherche Clinique et de l'Innovation, Marseille, France
| | - Juana Beauvallet
- Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Marie-Gabrielle Dondon
- Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Séverine Eon-Marchais
- Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Anthony Laugé
- Department of Genetics, Institut Curie, PSL Research University, Paris, France
| | - Catherine Noguès
- Institut Paoli-Calmettes, Département d'Anticipation et de Suivi du Cancer, Oncogénétique clinique, Marseille France Inserm, U830, Université Paris Descartes, Paris, France.,Aix Marseille Univ, INSERM, IRD, SESSTIM, Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale, Marseille, France
| | - Nadine Andrieu
- Inserm, U900, Paris, France.,Institut Curie, PSL Research University, Mines ParisTech, Paris, France
| | - Dominique Stoppa-Lyonnet
- Department of Genetics, Institut Curie, PSL Research University, Paris, France.,Paris University, Paris, France.,Inserm, U830, Paris, France
| | - Sandrine M Caputo
- Department of Genetics, Institut Curie, PSL Research University, Paris, France.
|
61
|
Dalal S, Hombal V, Weng WH, Mankovich G, Mabotuwana T, Hall CS, Fuller J 3rd, Lehnert BE, Gunn ML. Determining Follow-Up Imaging Study Using Radiology Reports. J Digit Imaging 2020; 33:121-30. [PMID: 31452006 DOI: 10.1007/s10278-019-00260-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
Abstract
Radiology reports often contain follow-up imaging recommendations. Failure to comply with these recommendations in a timely manner can lead to delayed treatment, poor patient outcomes, complications, unnecessary testing, lost revenue, and legal liability. The objective of this study was to develop a scalable approach to automatically identify the completion of a follow-up imaging study recommended by a radiologist in a preceding report. We selected imaging reports containing 559 follow-up imaging recommendations and all subsequent reports from a multi-hospital academic practice. Three radiologists identified appropriate follow-up examinations among the subsequent reports for the same patient, if any, to establish a ground-truth dataset. We then trained an Extremely Randomized Trees classifier that uses recommendation attributes, study metadata, and text similarity of the radiology reports to determine the most likely follow-up examination for a preceding recommendation. Pairwise inter-annotator F-score ranged from 0.853 to 0.868; the corresponding F-score of the classifier in identifying follow-up exams was 0.807. Our study describes a methodology to automatically determine the most likely follow-up exam after a follow-up imaging recommendation. The accuracy of the algorithm suggests that automated methods can be integrated into a follow-up management application to improve adherence to follow-up imaging recommendations. Radiology administrators could use such a system to monitor follow-up compliance rates and proactively send reminders to primary care providers and/or patients to improve adherence.
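The pairwise F-score reported above can be illustrated with a short sketch. It assumes each annotator's decisions are represented as a set of hypothetical "recommendation:exam" identifiers; the abstract does not describe how the study's score was actually computed, so the function `f_score` and the data are illustrative only.

```python
# Illustrative only: identifiers are invented, not taken from the study.
def f_score(selected, reference):
    """Harmonic mean of precision and recall of `selected` against `reference`."""
    true_positives = len(selected & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(selected)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Follow-up exams chosen by two annotators for the same three recommendations.
annotator_a = {"rec1:exam7", "rec2:exam3", "rec3:exam9"}
annotator_b = {"rec1:exam7", "rec2:exam4", "rec3:exam9"}

agreement = f_score(annotator_a, annotator_b)
```

Because swapping the two sets swaps precision and recall, the score is symmetric, which makes it usable as an agreement measure between annotators as well as between a classifier and a reference.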
|
62
|
Levin AD, Ragazzi A, Szot SL, Ning T. Extraction and assessment of diagnosis-relevant features for heart murmur classification. Methods 2021; 202:110-116. [PMID: 34245871 DOI: 10.1016/j.ymeth.2021.07.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/27/2021] [Revised: 06/10/2021] [Accepted: 07/02/2021] [Indexed: 12/21/2022]
Abstract
This paper presents a heart murmur detection and multi-class classification approach via machine learning. We extracted heart sound and murmur features that are of diagnostic importance and developed 16 additional features that are not perceivable by human ears but are valuable for improving murmur classification accuracy. We examined and compared the classification performance of supervised machine learning with k-nearest neighbor (KNN) and support vector machine (SVM) algorithms. We assembled a test repertoire of more than 450 heart sound and murmur episodes to evaluate the performance of murmur classification using cross-validation with 80-20 and 90-10 splits. In our evaluation, the specific set of features chosen in our study resulted in classification accuracy consistently exceeding 90% for both classifiers.
Affiliation(s)
- Alisa D Levin
- Department of Engineering, Trinity College, Hartford, Connecticut, United States.
| | - Anthony Ragazzi
- Department of Engineering, Trinity College, Hartford, Connecticut, United States.
| | - Skyler L Szot
- Department of Engineering, Trinity College, Hartford, Connecticut, United States.
| | - Taikang Ning
- Department of Engineering, Trinity College, Hartford, Connecticut, United States.
|
63
|
Langenbucher A, Häfner L, Eppig T, Seitz B, Szentmáry N, Flockerzi E. [Keratoconus detection and classification from parameters of the Corvis®ST: A study based on algorithms of machine learning]. Ophthalmologe 2021; 118:697-706. [PMID: 32970190 PMCID: PMC8260544 DOI: 10.1007/s00347-020-01231-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/20/2020] [Revised: 08/23/2020] [Accepted: 08/24/2020] [Indexed: 01/31/2023]
Abstract
BACKGROUND AND OBJECTIVE In recent decades, a growing number of artificial intelligence systems have been established in medicine to identify diseases or pathologies or to discriminate them from complementary diseases. Up to now the Corvis®ST (Corneal Visualization Scheimpflug Technology, Corvis®ST, Oculus, Wetzlar, Germany) yielded a binary index for classifying keratoconus but did not enable staging. The purpose of this study was to develop a prediction model that mimics the topographic keratoconus classification index (TKC) of the Pentacam high resolution (HR, Oculus) using measurement parameters extracted from the Corvis®ST. PATIENTS AND METHODS This study included 60 measurements from normal subjects (TKC 0) and 379 eyes with keratoconus (TKC 1-4). After measurement with the Pentacam HR (target parameter TKC), a measurement with the Corvis®ST device was performed. From this device, 6 dynamic response parameters were extracted that are included in the Corvis biomechanical index (CBI) provided by the Corvis®ST (ARTh, SP-A1, DA ratio 1 mm, DA ratio 2 mm, A1 velocity, max. deformation amplitude). In addition to the TKC as the target, the binarized TKC (1: TKC 1-4, 0: TKC 0) was modelled. The performance of the model was validated with accuracy as an indicator of correct classification by the algorithm. Misclassifications in the modelling were penalized by the number of stages of deviation between the modelled and measured TKC values. RESULTS A total of 24 different models of supervised machine learning from 6 different families were tested. For modelling of the TKC stages 0-4, the algorithm based on a support vector machine (SVM) with linear kernel showed the best performance, with an accuracy of 65.1% correct classifications. For modelling of the binarized TKC, a decision tree with a coarse resolution showed superior performance, with an accuracy of 95.2% correct classifications, followed by the SVM with linear or quadratic kernel and a nearest neighbor classifier with cubic kernel (94.5% each). CONCLUSION This study aimed to show the principle of supervised machine learning applied to a set-up for the modelled classification of keratoconus staging. Preprocessed measurement data extracted from the Corvis®ST device were used to mimic the TKC provided by the Pentacam device with a series of different machine learning algorithms.
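The difference between staged and binarized evaluation can be sketched as follows. The abstract does not give the exact penalty formula, so the mean absolute stage deviation used here is an assumption, and the stage values are invented for illustration.

```python
# Illustrative only: stage values are invented; the penalty form is an assumption.
def binarize_tkc(stage):
    """Map TKC 1-4 to 1 (keratoconus) and TKC 0 to 0 (normal)."""
    return 1 if stage >= 1 else 0


def accuracy(measured, modelled):
    """Share of cases where the modelled value equals the measured value."""
    hits = sum(1 for m, p in zip(measured, modelled) if m == p)
    return hits / len(measured)


def mean_stage_deviation(measured, modelled):
    """Average number of stages between modelled and measured TKC."""
    return sum(abs(m - p) for m, p in zip(measured, modelled)) / len(measured)


measured = [0, 1, 2, 3, 4, 2]   # TKC from the reference device
modelled = [0, 1, 3, 3, 2, 2]   # TKC predicted by a hypothetical model

staged_accuracy = accuracy(measured, modelled)
staged_penalty = mean_stage_deviation(measured, modelled)
binary_accuracy = accuracy(
    [binarize_tkc(s) for s in measured],
    [binarize_tkc(s) for s in modelled],
)
```

In this toy data the staging errors all fall within the keratoconus group, so they vanish after binarization; this is one reason a binary target can score much higher than a five-stage target on the same predictions.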
Affiliation(s)
- Achim Langenbucher
- Institut für Experimentelle Ophthalmologie, Universität des Saarlandes, Kirrberger Str., Gebäude 22, 66421, Homburg, Deutschland.
| | - Larissa Häfner
- Klinik für Augenheilkunde, Universitätsklinikum des Saarlandes, Kirrberger Str., Gebäude 22, 66421, Homburg, Deutschland
| | - Timo Eppig
- Institut für Experimentelle Ophthalmologie, Universität des Saarlandes, Kirrberger Str., Gebäude 22, 66421, Homburg, Deutschland
| | - Berthold Seitz
- Klinik für Augenheilkunde, Universitätsklinikum des Saarlandes, Kirrberger Str., Gebäude 22, 66421, Homburg, Deutschland
| | - Nóra Szentmáry
- Dr. Rolf M. Schwiete Zentrum für Limbusstammzellforschung und kongenitale Aniridie, Universität des Saarlandes, Kirrberger Str., Gebäude 22, 66421, Homburg, Deutschland
| | - Elias Flockerzi
- Klinik für Augenheilkunde, Universitätsklinikum des Saarlandes, Kirrberger Str., Gebäude 22, 66421, Homburg, Deutschland
|
64
|
Nakagami G, Yokota S, Kitamura A, Takahashi T, Morita K, Noguchi H, Ohe K, Sanada H. Supervised machine learning-based prediction for in-hospital pressure injury development using electronic health records: A retrospective observational cohort study in a university hospital in Japan. Int J Nurs Stud 2021; 119:103932. [PMID: 33975074 DOI: 10.1016/j.ijnurstu.2021.103932] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/11/2020] [Revised: 02/23/2021] [Accepted: 03/17/2021] [Indexed: 02/02/2023]
Abstract
BACKGROUND In hospitals, nurses are responsible for pressure injury risk assessment using several kinds of risk assessment scales. However, the predictive validity of these scales is insufficient to initiate a targeted preventive strategy for each patient. The use of electronic health records with machine learning techniques is a promising strategy for providing an automated clinical decision-making aid. OBJECTIVE The purpose of this study was to construct a predictive model for pressure injury development that included feature variables that can be collected on the first day of hospitalization by nurses who routinely input the data into electronic health records. DESIGN Retrospective observational cohort study. SETTING This study was conducted at a university hospital in Japan. PARTICIPANTS This study used electronic health records, which include entry/discharge records, basic nursing records, and pressure injury management documents (N = 75,353). METHODS The outcome measure was pressure injuries that developed outside of an operating theatre and appeared on the body parts at high risk of pressure injury development. We utilized four major classifiers: logistic regression, random forest, linear support vector machine, and extreme gradient boosting (XGBoost), with 5-fold cross-validation. The area under the receiver operating characteristic curve (AUC) was used for evaluating predictive performance. RESULTS The proportion of hospital-acquired pressure injuries was 0.52%. The receiver operating characteristic curves revealed the best predictive performance for the XGBoost model, which achieved the highest sensitivity (0.78±0.03) and AUC (0.80±0.02) among the four classifiers. Variables related to difficulty in activities of daily living, anorexia, and respiratory or cardiac disorders were extracted as important features. 
CONCLUSIONS Our findings suggest that routinely collected health data by nurses on the first day of patient admission have the potential to help determine high-risk patients for pressure injury development. Tweetable abstract: Machine learning models on routinely collected electronic health records data successfully predict pressure injury development during hospitalization. FUNDING This work was supported by a JSPS KAKENHI Grant-in-Aid for Exploratory Research (16K15865).
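To show why AUC rather than plain accuracy suits a rare outcome like this one, the sketch below computes AUC from its pairwise-ranking definition on a tiny invented data set. The labels and scores are hypothetical, and the study itself presumably used a library routine rather than this hand-written function.

```python
# Illustrative only: labels and risk scores are invented.
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.02, 0.08, 0.60, 0.70, 0.25]

auc = roc_auc(labels, scores)

# A model that always predicts "no injury" still looks accurate on rare outcomes.
accuracy_always_negative = labels.count(0) / len(labels)
```

With an outcome as rare as the 0.52% reported above, the always-negative baseline would exceed 99% accuracy while identifying no patients at risk, whereas its AUC would stay at 0.5.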
|
65
|
Meppelink CS, Hendriks H, Trilling D, van Weert JCM, Shao A, Smit ES. Reliable or not? An automated classification of webpages about early childhood vaccination using supervised machine learning. Patient Educ Couns 2021; 104:1460-1466. [PMID: 33243581 DOI: 10.1016/j.pec.2020.11.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/01/2020] [Revised: 10/02/2020] [Accepted: 11/10/2020] [Indexed: 06/11/2023]
Abstract
OBJECTIVE To investigate the applicability of supervised machine learning (SML) to classify health-related webpages as 'reliable' or 'unreliable' in an automated way. METHODS We collected the textual content of 468 different Dutch webpages about early childhood vaccination. Webpages were manually coded as 'reliable' or 'unreliable' based on their alignment with evidence-based vaccination guidelines. Four SML models were trained on part of the data, whereas the remaining data was used for model testing. RESULTS All models appeared to be successful in the automated identification of unreliable (F1 scores: 0.54-0.86) and reliable information (F1 scores: 0.82-0.91). Typical words for unreliable information are 'dr', 'immune system', and 'vaccine damage', whereas 'measles', 'child', and 'immunization rate', were frequent in reliable information. Our best performing model was also successful in terms of out-of-sample prediction, tested on a dataset about HPV vaccination. CONCLUSION Automated classification of online content in terms of reliability, using basic classifiers, performs well and is particularly useful to identify reliable information. PRACTICE IMPLICATIONS The classifiers can be used as a starting point to develop more complex classifiers, but also warning tools which can help people evaluate the content they encounter online.
Affiliation(s)
- Corine S Meppelink
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands.
| | - Hanneke Hendriks
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands
| | - Damian Trilling
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands
| | - Julia C M van Weert
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands
| | - Anqi Shao
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands; Life Sciences Communication, University of Wisconsin-Madison, United States
| | - Eline S Smit
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands
|
66
|
Eberhard FE, Klimpel S, Guarneri AA, Tobias NJ. Metabolites as predictive biomarkers for Trypanosoma cruzi exposure in triatomine bugs. Comput Struct Biotechnol J 2021; 19:3051-3057. [PMID: 34136103 PMCID: PMC8178018 DOI: 10.1016/j.csbj.2021.05.027] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/18/2021] [Revised: 05/10/2021] [Accepted: 05/19/2021] [Indexed: 11/17/2022]
Abstract
Trypanosoma cruzi, the causative agent of Chagas disease (American trypanosomiasis), colonizes the intestinal tract of triatomines. Triatomine bugs act as vectors in the life cycle of the parasite and transmit infective parasite stages to animals and humans. Contact of the vector with T. cruzi alters its intestinal microbial composition, which may also affect the associated metabolic patterns of the insect. Earlier studies suggest that the complexity of the triatomine fecal metabolome may play a role in vector competence for different T. cruzi strains. Using high-resolution mass spectrometry and supervised machine learning, we aimed to detect differences in the intestinal metabolome of the triatomine Rhodnius prolixus and predict whether the insect had been exposed to T. cruzi or not based solely upon their metabolic profile. We were able to predict the exposure status of R. prolixus to T. cruzi with accuracies of 93.6%, 94.2% and 91.8% using logistic regression, a random forest classifier and a gradient boosting machine model, respectively. We extracted the most important features in producing the models and identified the major metabolites which assist in positive classification. This work highlights the complex interactions between triatomine vector and parasite including effects on the metabolic signature of the insect.
Affiliation(s)
- Fanny E. Eberhard
- Institute for Ecology, Evolution and Diversity, Goethe University Frankfurt, Frankfurt/Main, Germany
| | - Sven Klimpel
- Institute for Ecology, Evolution and Diversity, Goethe University Frankfurt, Frankfurt/Main, Germany
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE TBG), Frankfurt/Main, Germany
- Senckenberg Gesellschaft für Naturforschung, Senckenberg Biodiversity and Climate Research Centre, Frankfurt/Main, Germany
| | - Alessandra A. Guarneri
- Vector Behaviour and Pathogen Interaction Group, Instituto René Rachou, Avenida Augusto de Lima,1715, Belo Horizonte, MG CEP 30190-009, Brazil
| | - Nicholas J. Tobias
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE TBG), Frankfurt/Main, Germany
- Senckenberg Gesellschaft für Naturforschung, Senckenberg Biodiversity and Climate Research Centre, Frankfurt/Main, Germany
- Corresponding author at: LOEWE Centre for Translational Biodiversity Genomics (LOEWE TBG), Frankfurt/Main, Germany.
|
67
|
García-Terriza L, Risco-Martín JL, Roselló GR, Ayala JL. Predictive and diagnosis models of stroke from hemodynamic signal monitoring. Med Biol Eng Comput 2021; 59:1325-37. [PMID: 33987805 DOI: 10.1007/s11517-021-02354-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2020] [Accepted: 03/19/2021] [Indexed: 10/21/2022]
Abstract
This work presents a novel and promising approach to the clinical management of acute stroke. Using machine learning techniques, our research has succeeded in developing accurate real-time diagnosis and prediction models from hemodynamic data. These models are able to diagnose stroke subtype with 30 min of monitoring, to predict exitus (death) during the first 3 h of monitoring, and to predict stroke recurrence in just 15 min of monitoring. Patients with difficult access to a CT scan and all patients who arrive at the stroke unit of a specialized hospital will benefit from these positive results. The results obtained from the real-time models are the following: stroke diagnosis with around 98% precision (97.8% sensitivity, 99.5% specificity), exitus prediction with 99.8% precision (99.8% sensitivity, 99.9% specificity), and stroke recurrence prediction with 98% precision (98% sensitivity, 99% specificity). Graphical abstract: the complete process, from the monitoring of a patient to the use of the collected data to generate models.
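Since the abstract quotes precision alongside sensitivity and specificity, a brief sketch of how the three relate to confusion-matrix counts may help. The counts below are invented and are not derived from the study; the function name `diagnostic_metrics` is hypothetical.

```python
# Illustrative only: the confusion-matrix counts are invented.
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and precision from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all actual positives
        "specificity": tn / (tn + fp),  # true negatives among all actual negatives
        "precision": tp / (tp + fp),    # true positives among all predicted positives
    }


metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

Sensitivity and specificity depend only on the classifier's behaviour within each true class, whereas precision also shifts with the proportion of positive cases in the evaluated sample.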
|
68
|
El-Manzalawy Y, Abbas M, Hoaglund I, Cerna AU, Morland TB, Haggerty CM, Hall ES, Fornwalt BK. OASIS +: leveraging machine learning to improve the prognostic accuracy of OASIS severity score for predicting in-hospital mortality. BMC Med Inform Decis Mak 2021; 21:156. [PMID: 33985483 PMCID: PMC8118103 DOI: 10.1186/s12911-021-01517-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/01/2021] [Accepted: 05/05/2021] [Indexed: 11/10/2022]
Abstract
Background Severity scores assess the acuity of critical illness by penalizing for the deviation of physiologic measurements from normal and aggregating these penalties (also called “weights” or “subscores”) into a final score (or probability) for quantifying the severity of critical illness (or the likelihood of in-hospital mortality). Although these simple additive models are human readable and interpretable, their predictive performance needs to be further improved. Methods We present OASIS +, a variant of the Oxford Acute Severity of Illness Score (OASIS) in which an ensemble of 200 decision trees is used to predict in-hospital mortality based on the same 10 clinical variables as OASIS. Results Using a test set of 9566 admissions extracted from the MIMIC-III database, we show that OASIS + outperforms nine previously developed severity scoring methods (including OASIS) in predicting in-hospital mortality. Furthermore, our results show that the supervised learning algorithms considered in our experiments demonstrated higher predictive performance when trained using the observed clinical variables as opposed to OASIS subscores. Conclusions Our results suggest that there is room for improving the prognostic accuracy of the OASIS severity scores by replacing the simple linear additive scoring function with more sophisticated non-linear machine learning models such as random forests (RF) and extreme gradient boosting (XGB). Supplementary Information The online version contains supplementary material available at 10.1186/s12911-021-01517-7.
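The additive scoring scheme described in the Background can be sketched in a few lines. The variables, bands, and penalties below are entirely hypothetical and are not the OASIS weights; the sketch only shows the general shape of a subscore lookup followed by a sum.

```python
# Illustrative only: these bands and penalties are invented, not OASIS values.
# Each band is (lower bound inclusive, upper bound exclusive, penalty).
BANDS = {
    "heart_rate": [(0, 40, 4), (40, 100, 0), (100, 130, 2), (130, float("inf"), 5)],
    "temperature_c": [(0, 35.0, 3), (35.0, 38.5, 0), (38.5, float("inf"), 2)],
}


def subscore(variable, value):
    """Look up the penalty for one measurement."""
    for low, high, penalty in BANDS[variable]:
        if low <= value < high:
            return penalty
    raise ValueError(f"{variable}={value} is outside every band")


def severity_score(measurements):
    """Sum the subscores of all measurements into one additive score."""
    return sum(subscore(name, value) for name, value in measurements.items())


normal_patient = severity_score({"heart_rate": 70, "temperature_c": 37.0})
unwell_patient = severity_score({"heart_rate": 120, "temperature_c": 39.0})
```

A scheme of this kind is easy to audit by hand, but it cannot represent interactions between variables, which is the limitation that tree ensembles trained on the raw measurements are meant to address.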
Affiliation(s)
- Yasser El-Manzalawy
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, 17822, USA.
| | - Mostafa Abbas
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, 17822, USA
| | - Ian Hoaglund
- College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA
| | - Alvaro Ulloa Cerna
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, 17822, USA
| | - Thomas B Morland
- Department of General Internal Medicine, Geisinger, Danville, PA, 17822, USA
| | - Christopher M Haggerty
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, 17822, USA
| | - Eric S Hall
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, 17822, USA
| | - Brandon K Fornwalt
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, 17822, USA.,Department of Radiology, Geisinger, Danville, PA, 17822, USA
|
69
|
Balestra N, Sharma G, Riek LM, Busza A. Automatic Identification of Upper Extremity Rehabilitation Exercise Type and Dose Using Body-Worn Sensors and Machine Learning: A Pilot Study. Digit Biomark 2021; 5:158-166. [PMID: 34414353 PMCID: PMC8339513 DOI: 10.1159/000516619] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/01/2021] [Accepted: 04/19/2021] [Indexed: 01/19/2023]
Abstract
BACKGROUND Prior studies suggest that participation in rehabilitation exercises improves motor function poststroke; however, studies on optimal exercise dose and timing have been limited by the technical challenge of quantifying exercise activities over multiple days. OBJECTIVES The objectives of this study were to assess the feasibility of using body-worn sensors to track rehabilitation exercises in the inpatient setting and investigate which recording parameters and data analysis strategies are sufficient for accurately identifying and counting exercise repetitions. METHODS MC10 BioStampRC® sensors were used to measure accelerometer and gyroscope data from upper extremities of healthy controls (n = 13) and individuals with upper extremity weakness due to recent stroke (n = 13) while the subjects performed 3 preselected arm exercises. Sensor data were then labeled by exercise type and this labeled data set was used to train a machine learning classification algorithm for identifying exercise type. The machine learning algorithm and a peak-finding algorithm were used to count exercise repetitions in non-labeled data sets. RESULTS We achieved a repetition counting accuracy of 95.6% overall, and 95.0% in patients with upper extremity weakness due to stroke when using both accelerometer and gyroscope data. Accuracy was decreased when using fewer sensors or using accelerometer data alone. CONCLUSIONS Our exploratory study suggests that body-worn sensor systems are technically feasible, well tolerated in subjects with recent stroke, and may ultimately be useful for developing a system to measure total exercise "dose" in poststroke patients during clinical rehabilitation or clinical trials.
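As a rough illustration of the peak-finding idea used for counting repetitions, the sketch below counts local maxima in a synthetic one-axis signal. The threshold, the minimum gap between peaks, and the sine-wave signal are assumptions made for this example; the study's actual algorithm and parameters are not described in the abstract.

```python
import math


# Illustrative only: the signal is synthetic and the parameters are assumptions.
def count_peaks(signal, threshold, min_gap):
    """Count local maxima at or above `threshold` that are `min_gap` samples apart."""
    count = 0
    last_peak = -min_gap  # lets the very first peak pass the gap check
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= threshold and i - last_peak >= min_gap:
            count += 1
            last_peak = i
    return count


# Five repetitions modelled as five sine cycles of 40 samples each.
samples_per_rep = 40
signal = [math.sin(2 * math.pi * i / samples_per_rep) for i in range(5 * samples_per_rep)]

repetitions = count_peaks(signal, threshold=0.5, min_gap=samples_per_rep // 2)
```

The minimum-gap check is what keeps a noisy real recording from producing several counts per movement; on clean synthetic data like this it has no effect.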
Affiliation(s)
- Noah Balestra
- Department of Neurology, University of Rochester, Rochester, New York, USA
| | - Gaurav Sharma
- Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York, USA
- Department of Computer Science, University of Rochester, Rochester, New York, USA
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, USA
| | - Linda M. Riek
- Department of Physical Therapy, Nazareth College, Rochester, New York, USA
| | - Ania Busza
- Department of Neurology, University of Rochester, Rochester, New York, USA
|
70
|
Moreira GDA, Andrade IDS, Cacheffo A, Yoshida AC, Gomes AA, Silva JJD, Lopes FJDS, Landulfo E. COVID-19 outbreak and air quality: Analyzing the influence of physical distancing and the resumption of activities in São Paulo municipality. Urban Clim 2021; 37:100813. [PMID: 35756397 PMCID: PMC9212973 DOI: 10.1016/j.uclim.2021.100813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2020] [Revised: 01/22/2021] [Accepted: 02/18/2021] [Indexed: 06/15/2023]
Affiliation(s)
- Gregori de Arruda Moreira
- Federal Institute of São Paulo (IFSP), Campus Registro. Avenida Clara Gianotti de Souza, 5180, Agrochá - CEP 11900-000, Registro, São Paulo, Brazil
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
| | - Izabel da Silva Andrade
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
| | - Alexandre Cacheffo
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
- Institute of Exact and Natural Sciences of Pontal (ICENP), Federal University of Uberlândia (UFU), Campus Pontal. Rua Vinte, 1600, Bloco C, Tupã - CEP 38304-402, Ituiutaba, Minas Gerais, Brazil
| | - Alexandre Calzavara Yoshida
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
- Institute of Exact and Natural Sciences of Pontal (ICENP), Federal University of Uberlândia (UFU), Campus Pontal. Rua Vinte, 1600, Bloco C, Tupã - CEP 38304-402, Ituiutaba, Minas Gerais, Brazil
| | - Antonio Arleques Gomes
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
| | - Jonatan João da Silva
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
- Center for Exact Sciences and Technologies (CCET), Federal University of Western Bahia (UFOB), Campus Barreiras. Rua da Prainha, 1326, Morada Nobre - CEP 47810-047, Barreiras, Bahia, Brazil
| | - Fábio Juliano da Silva Lopes
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
- Department of Environmental Sciences, Institute of Environmental, Chemical and Pharmaceutical Sciences (ICAQF), Federal University of São Paulo (UNIFESP), Campus Diadema. Rua São Nicolau, 210, Centro - CEP 09913-030, Diadema, São Paulo, Brazil
| | - Eduardo Landulfo
- Center for Lasers and Applications (CELAP), Institute of Energy and Nuclear Research (IPEN), Avenida Lineu Prestes, 2242, Setor E5, Cidade Universitária - CEP 05508-000, São Paulo, São Paulo, Brazil
|
71
|
Razaghi-Moghadam Z, Sokolowska EM, Sowa MA, Skirycz A, Nikoloski Z. Combination of network and molecule structure accurately predicts competitive inhibitory interactions. Comput Struct Biotechnol J 2021; 19:2170-2178. [PMID: 34136091 PMCID: PMC8172118 DOI: 10.1016/j.csbj.2021.04.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/06/2021] [Revised: 04/01/2021] [Accepted: 04/03/2021] [Indexed: 11/30/2022]
Abstract
Mining of metabolite-protein interaction networks facilitates the identification of design principles underlying the regulation of different cellular processes. However, identification and characterization of the regulatory role that metabolites play in interactions with proteins on a genome-scale level remains a pressing task. Based on availability of high-quality metabolite-protein interaction networks and genome-scale metabolic networks, here we propose a supervised machine learning approach, called CIRI, that determines whether or not a metabolite is involved in a competitive inhibitory regulatory interaction with an enzyme. First, we show that CIRI outperforms the naive approach based on a structural similarity threshold for a putative competitive inhibitor and the substrates of a metabolic reaction. We also validate the performance of CIRI on several unseen data sets and databases of metabolite-protein interactions not used in the training, and demonstrate that the classifier can be effectively used to predict competitive inhibitory interactions. Finally, we show that CIRI can be employed to refine predictions about metabolite-protein interactions from a recently proposed PROMIS approach that employs metabolomics and proteomics profiles from size exclusion chromatography in E. coli to predict metabolite-protein interactions. Altogether, CIRI fills a gap in cataloguing metabolite-protein interactions and can be used in directing future machine learning efforts to categorize the regulatory type of these interactions.
Affiliation(s)
- Zahra Razaghi-Moghadam
- Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany.,Bioinformatics Group, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
| | - Ewelina M Sokolowska
- Department of Molecular Physiology, Max Planck Institute for Molecular Plant Physiology, 14476 Potsdam, Germany
| | - Marcin A Sowa
- Institute for Cardiovascular and Metabolic Research, School of Biological Sciences, University of Reading, Reading, United Kingdom
| | - Aleksandra Skirycz
- Department of Molecular Physiology, Max Planck Institute for Molecular Plant Physiology, 14476 Potsdam, Germany.,Boyce Thompson Institute, Ithaca, NY, USA
| | - Zoran Nikoloski
- Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany.,Bioinformatics Group, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
| |
|
72
|
Singer A, Kosowan L, Loewen S, Spitoff S, Greiver M, Lynch J. Who is asked about alcohol consumption? A retrospective cohort study using a national repository of Electronic Medical Records. Prev Med Rep 2021; 22:101346. [PMID: 33767948 PMCID: PMC7980052 DOI: 10.1016/j.pmedr.2021.101346] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/10/2020] [Revised: 01/05/2021] [Accepted: 02/20/2021] [Indexed: 11/17/2022] Open
Abstract
Documentation of alcohol use in the electronic medical record (EMR) informs interventions to reduce alcohol-related morbidity and mortality. This retrospective cohort study explored EMR data from 960 primary care providers participating in the Canadian Primary Care Sentinel Surveillance Network to describe documentation of alcohol use (e.g. none, current or past use) in the EMR. Included providers represented 700,620 adult patients from across Canada with an encounter between 2015 and 2018. Bivariate comparisons characterized the patients with, and without, documentation of alcohol use. Multivariate generalized estimating equation models with logit function assessed patient and provider characteristics associated with (1) documentation of alcohol use and (2) patients with heightened risk for alcohol-related problems. Forty percent of patients had alcohol use documentation in the EMR. Light alcohol consumption was recorded for 43.6% of these patients. Patients who were male (OR 1.09, CI 1.07-1.12), were older (OR 1.26, CI 1.23-1.30), had more frequent visits to their provider (OR 1.11, CI 1.09-1.13), or had hypertension (OR 1.07, CI 1.06-1.09) or depression (OR 1.07, CI 1.09-1.14) had higher odds of alcohol documentation. A record indicating heightened risk for alcohol-related problems was present for 4.7% of patients. Male sex (OR 3.27, CI 3.14-3.4), depression (OR 2.01, CI 1.93-2.1) and rural residency (OR 1.35, CI 1.29-1.42) were associated with risk for alcohol-related problems. Heavy alcohol consumption is associated with an increased risk of negative health outcomes, particularly for patients with certain chronic conditions. However, these patients do not have alcohol use consistently documented in the EMR. Strategies should be designed and implemented to support more consistent alcohol screening among high-risk patients.
Affiliation(s)
- Alexander Singer
- Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Corresponding author at: Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, D009-780 Bannatyne Ave., Winnipeg, Manitoba R3T2N2, Canada.
| | - Leanne Kosowan
- Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Shilpa Loewen
- Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Sheryl Spitoff
- Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Michelle Greiver
- Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Joanna Lynch
- Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
| |
|
73
|
Ranjith CP, Puzhakkal N, Arunkrishnan MP, Vysakh R, Irfad MP, Vijayagopal KS, Jayashanker S. Mean parotid dose prediction model using machine learning regression method for intensity-modulated radiotherapy in head and neck cancer. Med Dosim 2021; 46:283-288. [PMID: 33744079 DOI: 10.1016/j.meddos.2021.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Revised: 12/21/2020] [Accepted: 02/11/2021] [Indexed: 10/21/2022]
Abstract
Parotids are considered one of the major organs at risk in Head and Neck (HN) intensity-modulated radiotherapy (IMRT). Achieving proper target coverage with reduced mean parotid dose demands an elaborate, time-consuming IMRT plan optimization. A parotid mean dose prediction model based on machine-learning linear regression was developed and validated in this study. The model was developed using independent variables such as the parotid to PTV overlapping volume, the dose coverage of the overlapping PTV, the ratio of overlapping parotid volume to total parotid volume, and the volume of parotid overlapping with isotropically expanded PTV contours. The Pearson correlation coefficients between these independent variables and the mean parotid dose were calculated. Multicollinearity of the independent variables was checked by calculating the Variance Inflation Factor (VIF). All variables with a VIF less than ten were included in the model. Fifty IMRT patient plans were used to develop the model. The mean parotid dose predicted by the model was in good agreement with the obtained mean parotid dose. The model has a Root Mean Square Error (RMSE) of 2.89 Gy and an R-square of 0.7695. The model was successfully validated using the fivefold cross-validation method, resulting in an R-square value of 0.6179 and an RMSE of 2.93 Gy. The normality of the model's residuals was tested using a Quantile-Quantile (Q-Q) plot and the Shapiro-Wilk test (p = 0.996, for the null hypothesis "residuals were normally distributed"). The data points in the Q-Q plot fall approximately along the reference line. This model can be used in clinics to help the planner in the preplanning phase for efficient plan optimization.
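The quantities reported in the abstract above (VIF, RMSE, R-square) can be illustrated with a small sketch. It is not the authors' code: the study used four predictors, while this sketch simplifies to the two-predictor case, where the R-square of one predictor regressed on the other equals their squared Pearson correlation, so VIF = 1 / (1 - r^2). All function names and data are hypothetical.

```python
import math

# Illustrative sketch only; names and numbers are hypothetical.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for the two-predictor case: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Nearly collinear predictors give a VIF far above the cutoff of ten.
print(vif_two_predictors([1, 2, 3, 4], [1, 2, 3, 5]))
```

With more than two predictors, each VIF would instead come from regressing one predictor on all the others; the cutoff of ten stated in the abstract applies to that general definition.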
Affiliation(s)
- C P Ranjith
- MVR Cancer Centre and Research Institute, Calicut, India.
| | | | | | - R Vysakh
- MVR Cancer Centre and Research Institute, Calicut, India
| | - M P Irfad
- MVR Cancer Centre and Research Institute, Calicut, India
| | | | - S Jayashanker
- MVR Cancer Centre and Research Institute, Calicut, India
| |
|
74
|
Molony C, King D, Di Luca M, Kitching M, Olayinka A, Hakimjavadi R, Julius LAN, Fitzpatrick E, Gusti Y, Burtenshaw D, Healy K, Finlay EK, Kernan D, Llobera A, Liu W, Morrow D, Redmond EM, Ducrée J, Cahill PA. Disease-Relevant Single Cell Photonic Signatures Identify S100β Stem Cells and their Myogenic Progeny in Vascular Lesions. Stem Cell Rev Rep 2021. [PMID: 33730327 DOI: 10.1007/s12015-021-10125-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/20/2021] [Indexed: 10/31/2022]
Abstract
A hallmark of subclinical atherosclerosis is the accumulation of vascular smooth muscle cell (SMC)-like cells leading to intimal thickening and lesion formation. While medial SMCs contribute to vascular lesions, the involvement of resident vascular stem cells (vSCs) remains unclear. We evaluated single cell photonics as a discriminator of cell phenotype in vitro before the presence of vSC within vascular lesions was assessed ex vivo using supervised machine learning and further validated using lineage tracing analysis. Using a novel lab-on-a-Disk(Load) platform, label-free single cell photonic emissions from normal and injured vessels ex vivo were interrogated and compared to freshly isolated aortic SMCs, cultured Movas SMCs, macrophages, B-cells, S100β+ mVSc, bone marrow derived mesenchymal stem cells (MSC) and their respective myogenic progeny across five broadband light wavelengths (λ465 - λ670 ± 20 nm). We found that profiles were of sufficient coverage, specificity, and quality to clearly distinguish medial SMCs from different vascular beds (carotid vs aorta), discriminate normal carotid medial SMCs from lesional SMC-like cells ex vivo following flow restriction, and identify SMC differentiation of a series of multipotent stem cells following treatment with transforming growth factor beta 1 (TGF-β1), the Notch ligand Jagged1, and Sonic Hedgehog using multivariate analysis, in part due to photonic emissions from enhanced collagen III and elastin expression. Supervised machine learning supported genetic lineage tracing analysis of S100β+ vSCs and identified the presence of S100β+ vSC-derived myogenic progeny within vascular lesions. We conclude that disease-relevant photonic signatures may have predictive value for vascular disease.
|
75
|
Zhang X, Schlögl A, Vandael D, Jonas P. MOD: A novel machine-learning optimal-filtering method for accurate and efficient detection of subthreshold synaptic events in vivo. J Neurosci Methods 2021; 357:109125. [PMID: 33711356 DOI: 10.1016/j.jneumeth.2021.109125] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/18/2020] [Revised: 03/02/2021] [Accepted: 03/07/2021] [Indexed: 01/08/2023]
Abstract
BACKGROUND: To understand information coding in single neurons, it is necessary to analyze subthreshold synaptic events, action potentials (APs), and their interrelation in different behavioral states. However, detecting excitatory postsynaptic potentials (EPSPs) or currents (EPSCs) in behaving animals remains challenging, because of unfavorable signal-to-noise ratio, high frequency, fluctuating amplitude, and variable time course of synaptic events. NEW METHOD: We developed a method for synaptic event detection, termed MOD (Machine-learning Optimal-filtering Detection-procedure), which combines concepts of supervised machine learning and optimal Wiener filtering. Experts were asked to manually score short epochs of data. The algorithm was trained to obtain the optimal filter coefficients of a Wiener filter and the optimal detection threshold. Scored and unscored data were then processed with the optimal filter, and events were detected as peaks above threshold. RESULTS: We challenged MOD with EPSP traces in vivo in mice during spatial navigation and EPSC traces in vitro in slices under conditions of enhanced transmitter release. The area under the curve (AUC) of the receiver operating characteristics (ROC) curve was, on average, 0.894 for in vivo and 0.969 for in vitro data sets, indicating high detection accuracy and efficiency. COMPARISON WITH EXISTING METHODS: When benchmarked using a (1 - AUC)^-1 metric, MOD outperformed previous methods (template-fit, deconvolution, and Bayesian methods) by an average factor of 3.13 for in vivo data sets, but showed comparable (template-fit, deconvolution) or higher (Bayesian) computational efficacy. CONCLUSIONS: MOD may become an important new tool for large-scale, real-time analysis of synaptic activity.
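The detection step described above (filter the trace, then take peaks above a threshold) and the (1 - AUC)^-1 benchmark can be sketched in a few lines. This is a rough illustration, not the MOD implementation: the filter coefficients and threshold below are hypothetical placeholders rather than values trained from expert scoring, and the helper names are invented for this sketch.

```python
# Illustrative sketch only. In MOD the filter coefficients and threshold are
# learned from expert-scored data; here they are hypothetical constants.

def fir_filter(signal, coeffs):
    """Causal FIR filter: y[n] = sum_k coeffs[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out


def peaks_above(trace, threshold):
    """Indices of local maxima above threshold (first sample of a plateau)."""
    return [
        i
        for i in range(1, len(trace) - 1)
        if trace[i] > threshold and trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]
    ]


def inverse_error_score(auc):
    """The (1 - AUC)^-1 metric: larger values mean fewer ranking errors."""
    return 1.0 / (1.0 - auc)


trace = [0, 0, 1, 3, 1, 0, 0, 2, 5, 2, 0]       # hypothetical recording
filtered = fir_filter(trace, [0.5, 0.5])        # hypothetical coefficients
print(peaks_above(filtered, threshold=1.5))     # detected event indices
print(inverse_error_score(0.894), inverse_error_score(0.969))
```

The inverse metric stretches differences near AUC = 1: the reported average AUCs of 0.894 and 0.969 correspond to scores of roughly 9.4 and 32.3.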
Affiliation(s)
- Xiaomin Zhang
- IST Austria (Institute of Science and Technology Austria), Am Campus 1, A-3400, Klosterneuburg, Austria
| | - Alois Schlögl
- IST Austria (Institute of Science and Technology Austria), Am Campus 1, A-3400, Klosterneuburg, Austria
| | - David Vandael
- IST Austria (Institute of Science and Technology Austria), Am Campus 1, A-3400, Klosterneuburg, Austria
| | - Peter Jonas
- IST Austria (Institute of Science and Technology Austria), Am Campus 1, A-3400, Klosterneuburg, Austria.
| |
|
76
|
Li C, Yu H, Sun Y, Zeng X, Zhang W. Identification of the hub genes in gastric cancer through weighted gene co-expression network analysis. PeerJ 2021; 9:e10682. [PMID: 33717664 PMCID: PMC7938783 DOI: 10.7717/peerj.10682] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/11/2020] [Accepted: 12/09/2020] [Indexed: 02/05/2023] Open
Abstract
Background: Gastric cancer is one of the most lethal tumors and is characterized by poor prognosis and a lack of effective diagnostic or therapeutic biomarkers. The aim of this study was to find hub genes serving as biomarkers in gastric cancer diagnosis and therapy. Methods: GSE66229 from Gene Expression Omnibus (GEO) was used as the training set. Genes with the top 25% standard deviations across all samples in the training set were subjected to systematic weighted gene co-expression network analysis (WGCNA) to find candidate genes. Then, hub genes were further screened using "least absolute shrinkage and selection operator" (LASSO) logistic regression. Finally, hub genes were validated in the GSE54129 dataset from GEO using a supervised learning method, the artificial neural network (ANN) algorithm. Results: Twelve modules with strong preservation were identified using WGCNA in the training set. Of these, five modules significantly related to gastric cancer were selected as clinically significant modules, and 713 candidate genes were identified from these five modules. Then, ADIPOQ, ARHGAP39, ATAD3A, C1orf95, CWH43, GRIK3, INHBA, RDH12, SCNN1G, SIGLEC11 and LYVE1 were screened as the hub genes. These hub genes successfully differentiated the tumor samples from the healthy tissues in an independent testing set through the artificial neural network algorithm, with an area under the receiver operating characteristic curve of 0.946. Conclusions: These hub genes have diagnostic and therapeutic value, and our results may provide a novel prospect for the diagnosis and treatment of gastric cancer in the future.
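The first filtering step in the Methods above (keeping the genes with the top 25% standard deviations before network analysis) might look roughly like the sketch below. It is an assumption-laden illustration: the data layout (a dict from gene name to per-sample values), the function name, and the toy values are hypothetical, and the abstract does not say whether the sample or population standard deviation was used.

```python
import statistics

# Illustrative sketch only; gene names and expression values are hypothetical.

def top_variable_genes(expression, fraction=0.25):
    """Return the genes whose standard deviation ranks in the top `fraction`.

    expression: dict mapping gene name -> list of values across samples.
    """
    sds = {gene: statistics.stdev(values) for gene, values in expression.items()}
    n_keep = max(1, int(len(sds) * fraction))   # always keep at least one gene
    ranked = sorted(sds, key=sds.get, reverse=True)
    return ranked[:n_keep]


toy_expression = {
    "geneA": [1.0, 1.0, 1.0],   # constant, no variability
    "geneB": [1.0, 5.0, 9.0],   # most variable
    "geneC": [2.0, 3.0, 4.0],
    "geneD": [0.0, 0.0, 1.0],
}
print(top_variable_genes(toy_expression))
```

Filtering on variability before WGCNA reduces the network to genes that can actually co-vary, which is presumably why the authors apply it ahead of module detection.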
Affiliation(s)
- Chunyang Li
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Cheng, China.,Medical Big Data Center, Sichuan University, Chengdu, China
| | - Haopeng Yu
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Cheng, China.,Medical Big Data Center, Sichuan University, Chengdu, China
| | - Yajing Sun
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Cheng, China.,Medical Big Data Center, Sichuan University, Chengdu, China
| | - Xiaoxi Zeng
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Cheng, China.,Medical Big Data Center, Sichuan University, Chengdu, China
| | - Wei Zhang
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Cheng, China.,Medical Big Data Center, Sichuan University, Chengdu, China
| |
|
77
|
Christ NM, Elhai JD, Forbes CN, Gratz KL, Tull MT. A machine learning approach to modeling PTSD and difficulties in emotion regulation. Psychiatry Res 2021; 297:113712. [PMID: 33548858 DOI: 10.1016/j.psychres.2021.113712] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/07/2020] [Accepted: 01/03/2021] [Indexed: 12/25/2022]
Abstract
Despite evidence for the association between emotion regulation difficulties and posttraumatic stress disorder (PTSD), less is known about the specific emotion regulation abilities that are most relevant to PTSD severity. This study examined both item-level and subscale-level models of difficulties in emotion regulation in relation to PTSD severity using supervised machine learning in a sample of U.S. adults (N=570). Participants were recruited via Amazon's Mechanical Turk (MTurk) and completed self-report measures of emotion regulation difficulties and PTSD severity. We used five different machine learning algorithms separately to train each statistical model. Using ridge and elastic net regression results in the testing sample, emotion regulation predictor variables accounted for approximately 28% and 27% of the variance in PTSD severity in the item- and subscale-level models, respectively. In the item-level model, four predictor variables had notable relative importance values for PTSD severity. These items captured secondary emotional responding, experiencing emotions as out-of-control, difficulties modulating emotional arousal, and low emotional granularity. In the subscale-level model, lack of access to effective emotion regulation strategies, lack of emotional clarity, and emotional nonacceptance subscales had the highest relative importance to PTSD severity. Results from analyses modeling a probable diagnosis of PTSD based on DERS items and subscales are presented in supplemental findings. Findings have implications for developing more efficient, targeted emotion regulation interventions for PTSD.
Affiliation(s)
- Nicole M Christ
- Department of Psychology, University of Toledo, 2801 W. Bancroft St., Toledo, Ohio 43606, USA
| | - Jon D Elhai
- Department of Psychology, University of Toledo, 2801 W. Bancroft St., Toledo, Ohio 43606, USA.
| | - Courtney N Forbes
- Department of Psychology, University of Toledo, 2801 W. Bancroft St., Toledo, Ohio 43606, USA
| | - Kim L Gratz
- Department of Psychology, University of Toledo, 2801 W. Bancroft St., Toledo, Ohio 43606, USA
| | - Matthew T Tull
- Department of Psychology, University of Toledo, 2801 W. Bancroft St., Toledo, Ohio 43606, USA
| |
|
78
|
Tao X, Chi O, Delaney PJ, Li L, Huang J. Detecting depression using an ensemble classifier based on Quality of Life scales. Brain Inform 2021; 8:2. [PMID: 33590388 PMCID: PMC7884545 DOI: 10.1186/s40708-021-00125-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/30/2020] [Accepted: 10/23/2020] [Indexed: 11/10/2022] Open
Abstract
Major depressive disorder (MDD) is an issue that affects 350 million people worldwide. Traditional approaches have been to identify depressive symptoms in datasets, but recently, research has begun to explore the association between psychosocial factors, such as those on the quality of life scale, and mental well-being, which will lead to earlier diagnosis and prediction of MDD. In this research, an ensemble binary classifier is proposed to analyse health survey data against ground truth from the SF-20 Quality of Life scales. The classifier aims to improve the performance of machine learning techniques on large datasets and identify depressed cases based on associations between items on the QoL scale and mental illness by increasing predictive performance. In the experimental evaluation on the National Health and Nutrition Examination Survey (NHANES), the classifier demonstrated an F1 score of 0.976 in the prediction, without any incorrectly identified depression instances. Only about 4% of instances had been mistakenly classified into depressed cases, with a significant accuracy of 95.4% compared with the result from the PHQ-9 mental screen inventory. The presented ensemble binary classifier performed comparatively better than each baseline algorithm in all measures and all experiments. We trained the ensemble model on the processed NHANES dataset, tested and evaluated the results of its performance against the mental screen inventory, and discussed the comparable predictions. Finally, we provided future research directions.
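For readers less familiar with the measures quoted above, the following sketch shows how F1 score and accuracy are derived from a binary confusion matrix. The counts are hypothetical and are not taken from the NHANES evaluation; the function name is likewise invented for illustration.

```python
# Illustrative sketch only; the confusion-matrix counts are hypothetical.

def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from binary confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts: 8 true positives, 2 false positives,
# 2 false negatives, 8 true negatives.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Because F1 ignores true negatives, it can differ noticeably from accuracy on imbalanced survey data, which is why abstracts such as this one usually report both.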
Affiliation(s)
- Xiaohui Tao
- School of Sciences, University of Southern Queensland, Toowoomba, Australia.
| | - Oliver Chi
- Advanced Analytics Institute, University of Technology, Sydney, Australia
| | - Patrick J Delaney
- School of Sciences, University of Southern Queensland, Toowoomba, Australia
| | - Lin Li
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, 430070, China
| | - Jiajin Huang
- International WIC Institute, Beijing University of Technology, Beijing, 100124, China
| |
|
79
|
Higaki S, Darhan H, Suzuki C, Suda T, Sakurai R, Yoshioka K. An attempt at estrus detection in cattle by continuous measurements of ventral tail base surface temperature with supervised machine learning. J Reprod Dev 2021; 67:67-71. [PMID: 33041266 PMCID: PMC7902215 DOI: 10.1262/jrd.2020-075] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
We aimed to determine the effectiveness of estrus detection based on continuous measurements of the ventral tail base surface temperature (ST) with supervised machine learning in cattle. ST data were obtained through 51 estrus cycles on 11 female cattle (six Holsteins and five Japanese Blacks) using the tail-attached sensor. Three estrus detection models were constructed with the training data (n = 17) using machine learning techniques (random forest, artificial neural network, and support vector machine) based on 13 features extracted from sensing data (indicative of estrus-associated ST changes). Estrus detection abilities of the three models on test data (n = 34) were not statistically different among models in terms of sensitivity and precision (range 50.0% to 58.8% and 60.6% to 73.1%, respectively). The relatively poor performance of the models might indicate the difficulty of separating estrus-associated ST changes from estrus-independent fluctuations in ST.
Affiliation(s)
- Shogo Higaki
- National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki 305-0856, Japan
| | - Hongyu Darhan
- National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki 305-0856, Japan
| | - Chie Suzuki
- National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki 305-0856, Japan
| | - Tomoko Suda
- National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki 305-0856, Japan
| | - Reina Sakurai
- National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki 305-0856, Japan
| | - Koji Yoshioka
- National Institute of Animal Health, National Agriculture and Food Research Organization, Ibaraki 305-0856, Japan
| |
|
80
|
Eelbode T, Sinonquel P, Maes F, Bisschops R. Pitfalls in training and validation of deep learning systems. Best Pract Res Clin Gastroenterol 2021; 52-53:101712. [PMID: 34172245 DOI: 10.1016/j.bpg.2020.101712] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/13/2020] [Accepted: 11/30/2020] [Indexed: 01/31/2023]
Abstract
The number of publications in endoscopic journals that present deep learning applications has risen tremendously over the past years. Deep learning has shown great promise for automated detection, diagnosis and quality improvement in endoscopy. However, the interdisciplinary nature of these works has undoubtedly made it more difficult to estimate their value and applicability. In this review, the pitfalls and common misconducts when training and validating deep learning systems are discussed and some practical guidelines are proposed that should be taken into account when acquiring data and handling it to ensure an unbiased system that will generalize for application in routine clinical practice. Finally, some considerations are presented to ensure correct validation and comparison of AI systems.
|
81
|
Langer T, Favarato M, Giudici R, Bassi G, Garberi R, Villa F, Gay H, Zeduri A, Bragagnolo S, Molteni A, Beretta A, Corradin M, Moreno M, Vismara C, Perno CF, Buscema M, Grossi E, Fumagalli R. Development of machine learning models to predict RT-PCR results for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in patients with influenza-like symptoms using only basic clinical data. Scand J Trauma Resusc Emerg Med 2020; 28:113. [PMID: 33261629 PMCID: PMC7705856 DOI: 10.1186/s13049-020-00808-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/26/2020] [Accepted: 11/06/2020] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND: Reverse Transcription-Polymerase Chain Reaction (RT-PCR) for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) diagnosis currently takes a long time. A quicker and more efficient diagnostic tool in emergency departments could improve management during this global crisis. Our main goal was to assess the accuracy of artificial intelligence in predicting the results of RT-PCR for SARS-CoV-2, using basic information at hand in all emergency departments. METHODS: This is a retrospective study carried out between February 22, 2020 and March 16, 2020 in one of the main hospitals in Milan, Italy. We screened for eligibility all patients admitted with influenza-like symptoms and tested for SARS-CoV-2. Patients under 12 years old and patients in whom the leukocyte formula was not performed in the ED were excluded. Input data for the artificial intelligence models were made up of a combination of clinical, radiological and routine laboratory data upon hospital admission. Different Machine Learning algorithms available on WEKA data mining software and on Semeion Research Centre depository were trained using both the Training and Testing and the K-fold cross-validation protocols. RESULTS: Among 199 patients included in the study (median [interquartile range] age 65 [46-78] years; 127 [63.8%] men), 124 [62.3%] tested positive for SARS-CoV-2. The best Machine Learning System reached an accuracy of 91.4% with 94.1% sensitivity and 88.7% specificity. CONCLUSION: Our study suggests that properly trained artificial intelligence algorithms may be able to predict correct results in RT-PCR for SARS-CoV-2, using basic clinical data. If confirmed in a larger-scale study, this approach could have important clinical and organizational implications.
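The K-fold cross-validation protocol mentioned in the Methods above can be sketched as an index-splitting routine. This is a generic illustration, not the WEKA or Semeion procedure: the abstract does not state the value of K, so K = 10 below is an assumption, and the function name and seed are hypothetical.

```python
import random

# Illustrative sketch only. K = 10 and the random seed are assumptions;
# the abstract reports 199 patients but not the number of folds.

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]    # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_samples=199, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each patient lands in exactly one test fold, so every prediction used to estimate accuracy, sensitivity, and specificity comes from a model that never saw that patient during training.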
Affiliation(s)
- Thomas Langer
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy.
- Department of Anaesthesia and Intensive Care Medicine, Niguarda Ca' Granda, Milan, Italy.
| | - Martina Favarato
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
- Department of Anaesthesia and Intensive Care Medicine, Niguarda Ca' Granda, Milan, Italy
| | - Riccardo Giudici
- Department of Anaesthesia and Intensive Care Medicine, Niguarda Ca' Granda, Milan, Italy
| | - Gabriele Bassi
- Department of Anaesthesia and Intensive Care Medicine, Niguarda Ca' Granda, Milan, Italy
| | - Roberta Garberi
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
| | - Fabiana Villa
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
| | - Hedwige Gay
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
- Department of Anaesthesia and Intensive Care Medicine, Niguarda Ca' Granda, Milan, Italy
| | - Anna Zeduri
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
| | - Sara Bragagnolo
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
| | - Alberto Molteni
- Department of General oncologic and mini-invasive Surgery, Niguarda Ca'Granda, Milan, Italy
| | - Andrea Beretta
- Department of Emergency Medicine, Niguarda Ca' Granda, Milan, Italy
| | | | - Mauro Moreno
- Medical Department, Niguarda Ca' Granda, Milan, Italy
| | - Chiara Vismara
- Department of Laboratory Medicine, ASST Niguarda Hospital, University of Milan, Milan, Italy
| | - Carlo Federico Perno
- Department of Laboratory Medicine, ASST Niguarda Hospital, University of Milan, Milan, Italy
| | - Massimo Buscema
- Semeion Research Center of Sciences of Communication, Rome, Italy
- Department of Mathematical and Statistical Sciences, University of Colorado at Denver, Denver, CO, USA
| | - Enzo Grossi
- Centro Diagnostico Italiano, Milan, Italy
- Villa Santa Maria Foundation, Tavernerio, Italy
| | - Roberto Fumagalli
- Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy
- Department of Anaesthesia and Intensive Care Medicine, Niguarda Ca' Granda, Milan, Italy
| |
|
82
|
Lanka P, Rangaprakash D, Dretsch MN, Katz JS, Denney TS, Deshpande G. Supervised machine learning for diagnostic classification from large-scale neuroimaging datasets. Brain Imaging Behav 2020; 14:2378-2416. [PMID: 31691160 PMCID: PMC7198352 DOI: 10.1007/s11682-019-00191-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 12/22/2022]
Abstract
There are growing concerns about the generalizability of machine learning classifiers in neuroimaging. In order to evaluate this aspect across relatively large heterogeneous populations, we investigated four disorders: Autism spectrum disorder (N = 988), Attention deficit hyperactivity disorder (N = 930), Post-traumatic stress disorder (N = 87) and Alzheimer's disease (N = 132). We applied 18 different machine learning classifiers (based on diverse principles) wherein the training/validation and the hold-out test data belonged to samples with the same diagnosis but differing in either the age range or the acquisition site. Our results indicate that overfitting can be a huge problem in heterogeneous datasets, especially with fewer samples, leading to inflated measures of accuracy that fail to generalize well to the general clinical population. Further, different classifiers tended to perform well on different datasets. In order to address this, we propose a consensus-classifier by combining the predictive power of all 18 classifiers. The consensus-classifier was less sensitive to unmatched training/validation and hold-out test data. Finally, we combined feature importance scores obtained from all classifiers to infer the discriminative ability of connectivity features. The functional connectivity patterns thus identified were robust to the classification algorithm used, age and acquisition site differences, and had diagnostic predictive ability in addition to univariate statistically significant group differences between the groups. A MATLAB toolbox called Machine Learning in NeuroImaging (MALINI), which implements all 18 classifiers along with the consensus classifier, is available from Lanka et al. (2019). The toolbox can also be found at the following URL: https://github.com/pradlanka/malini.
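One simple way to combine several classifiers into a consensus, as the abstract above describes, is majority voting over their predicted labels. The sketch below assumes majority voting purely for illustration; the abstract does not specify how the 18 classifiers are combined, and this is not the MALINI implementation (which is in MATLAB). The function name and toy predictions are hypothetical.

```python
from collections import Counter

# Illustrative sketch only. Majority voting is an ASSUMED combination rule;
# the function name and the toy predictions are hypothetical.

def consensus_predict(predictions):
    """Return the majority label per sample.

    predictions: list of per-classifier label lists, all of the same length.
    Ties are resolved in favor of the label encountered first among the votes.
    """
    n_samples = len(predictions[0])
    consensus = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        consensus.append(votes.most_common(1)[0][0])
    return consensus


# Three hypothetical classifiers voting on three samples.
toy_predictions = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
]
print(consensus_predict(toy_predictions))
```

Voting dampens the idiosyncratic errors of any single model, which fits the reported finding that the consensus was less sensitive to mismatched training and hold-out data.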
Affiliation(s)
- Pradyumna Lanka
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, 560 Devall Dr., Suite 266D, Auburn, AL, 36849, USA
  - Department of Psychological Sciences, University of California Merced, Merced, CA, USA
- D Rangaprakash
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, 560 Devall Dr., Suite 266D, Auburn, AL, 36849, USA
  - Departments of Radiology and Biomedical Engineering, Northwestern University, Chicago, IL, USA
- Michael N Dretsch
  - U.S. Army Aeromedical Research Laboratory, Fort Rucker, AL, USA
  - US Army Medical Research Directorate-West, Walter Reed Army Institute for Research, Joint Base Lewis-McChord, WA, USA
  - Department of Psychology, Auburn University, Auburn, AL, USA
- Jeffrey S Katz
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, 560 Devall Dr., Suite 266D, Auburn, AL, 36849, USA
  - Department of Psychology, Auburn University, Auburn, AL, USA
  - Alabama Advanced Imaging Consortium, Birmingham, AL, USA
  - Center for Neuroscience, Auburn University, Auburn, AL, USA
- Thomas S Denney
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, 560 Devall Dr., Suite 266D, Auburn, AL, 36849, USA
  - Department of Psychology, Auburn University, Auburn, AL, USA
  - Alabama Advanced Imaging Consortium, Birmingham, AL, USA
  - Center for Neuroscience, Auburn University, Auburn, AL, USA
- Gopikrishna Deshpande
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, 560 Devall Dr., Suite 266D, Auburn, AL, 36849, USA
  - Department of Psychology, Auburn University, Auburn, AL, USA
  - Alabama Advanced Imaging Consortium, Birmingham, AL, USA
  - Center for Neuroscience, Auburn University, Auburn, AL, USA
  - Center for Health Ecology and Equity Research, Auburn University, Auburn, AL, USA
  - Department of Psychiatry, National Institute of Mental Health and Neurosciences, Bangalore, India
|
83
|
Maarseveen TD, Meinderink T, Reinders MJT, Knitza J, Huizinga TWJ, Kleyer A, Simon D, van den Akker EB, Knevel R. Machine Learning Electronic Health Record Identification of Patients with Rheumatoid Arthritis: Algorithm Pipeline Development and Validation Study. JMIR Med Inform 2020; 8:e23930. [PMID: 33252349 PMCID: PMC7735897 DOI: 10.2196/23930] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/28/2020] [Revised: 10/18/2020] [Accepted: 10/24/2020] [Indexed: 11/18/2022] Open
Abstract
Background: Financial codes are often used to extract diagnoses from electronic health records. This approach is prone to false positives. Alternatively, queries are constructed, but these are highly center and language specific. A tantalizing alternative is the automatic identification of patients by employing machine learning on format-free text entries. Objective: The aim of this study was to develop an easily implementable workflow that builds a machine learning algorithm capable of accurately identifying patients with rheumatoid arthritis from format-free text fields in electronic health records. Methods: Two electronic health record data sets were employed: Leiden (n=3000) and Erlangen (n=4771). Using a portion of the Leiden data (n=2000), we compared 6 different machine learning methods and a naïve word-matching algorithm using 10-fold cross-validation. Performances were compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPRC), and F1 score was used as the primary criterion for selecting the best method to build a classifying algorithm. We selected the optimal threshold of positive predictive value for case identification based on the output of the best method in the training data. This validation workflow was subsequently applied to a portion of the Erlangen data (n=4293). For testing, the best performing methods were applied to remaining data (Leiden n=1000; Erlangen n=478) for an unbiased evaluation. Results: For the Leiden data set, the word-matching algorithm demonstrated mixed performance (AUROC 0.90; AUPRC 0.33; F1 score 0.55), and 4 methods significantly outperformed word-matching, with support vector machines performing best (AUROC 0.98; AUPRC 0.88; F1 score 0.83). Applying this support vector machine classifier to the test data resulted in a similarly high performance (F1 score 0.81; positive predictive value [PPV] 0.94), and with this method, we could identify 2873 patients with rheumatoid arthritis in less than 7 seconds out of the complete collection of 23,300 patients in the Leiden electronic health record system. For the Erlangen data set, gradient boosting performed best (AUROC 0.94; AUPRC 0.85; F1 score 0.82) in the training set, and applied to the test data, resulted once again in good results (F1 score 0.67; PPV 0.97). Conclusions: We demonstrate that machine learning methods can extract the records of patients with rheumatoid arthritis from electronic health record data with high precision, allowing research on very large populations for limited costs. Our approach is language and center independent and could be applied to any type of diagnosis. We have developed our pipeline into a universally applicable and easy-to-implement workflow to equip centers with their own high-performing algorithm. This allows the creation of observational studies of unprecedented size covering different countries for low cost from already available data in electronic health record systems.
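The test-set results are reported as F1 score and PPV only. Assuming the F1 score here is the usual harmonic mean of PPV (precision) and recall for the positive class, the recall implied by each reported pair can be back-calculated. The sketch below does that arithmetic; the function name is hypothetical and the derived recall values are not stated in the abstract.

```python
def recall_from_f1_and_ppv(f1, ppv):
    """Solve F1 = 2 * ppv * recall / (ppv + recall) for recall."""
    denom = 2 * ppv - f1
    if denom <= 0:
        raise ValueError("F1 and PPV are inconsistent")
    return f1 * ppv / denom


# Test-set figures quoted in the abstract.
leiden_recall = recall_from_f1_and_ppv(0.81, 0.94)    # roughly 0.71
erlangen_recall = recall_from_f1_and_ppv(0.67, 0.97)  # roughly 0.51

print(f"Leiden implied recall:   {leiden_recall:.2f}")
print(f"Erlangen implied recall: {erlangen_recall:.2f}")
```

Under that assumption, the high PPV values come with noticeably lower recall, which is consistent with a threshold chosen to favor precision in case identification.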
Affiliation(s)
- Tjardo D Maarseveen
  - Department of Rheumatology, Leiden University Medical Center, Leiden, Netherlands
- Timo Meinderink
  - Department of Internal Medicine 3, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen, Germany
  - Deutsches Zentrum für Immuntherapie, Erlangen-Nuremberg and Universitätsklinikum, Erlangen, Germany
- Marcel J T Reinders
  - Leiden Computational Biology Centre, Leiden University Medical Center, Leiden, Netherlands
  - Molecular Epidemiology, Leiden University Medical Center, Leiden, Netherlands
- Johannes Knitza
  - Department of Internal Medicine 3, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen, Germany
  - Deutsches Zentrum für Immuntherapie, Erlangen-Nuremberg and Universitätsklinikum, Erlangen, Germany
- Tom W J Huizinga
  - Department of Rheumatology, Leiden University Medical Center, Leiden, Netherlands
- Arnd Kleyer
  - Department of Internal Medicine 3, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen, Germany
  - Deutsches Zentrum für Immuntherapie, Erlangen-Nuremberg and Universitätsklinikum, Erlangen, Germany
- David Simon
  - Department of Internal Medicine 3, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen, Germany
  - Deutsches Zentrum für Immuntherapie, Erlangen-Nuremberg and Universitätsklinikum, Erlangen, Germany
- Erik B van den Akker
  - Leiden Computational Biology Centre, Leiden University Medical Center, Leiden, Netherlands
  - Molecular Epidemiology, Leiden University Medical Center, Leiden, Netherlands
- Rachel Knevel
  - Department of Rheumatology, Leiden University Medical Center, Leiden, Netherlands
  - Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
|
84
|
Wirries A, Geiger F, Hammad A, Oberkircher L, Blümcke I, Jabari S. Artificial intelligence facilitates decision-making in the treatment of lumbar disc herniations. Eur Spine J 2021; 30:2176-84. [PMID: 33048249 DOI: 10.1007/s00586-020-06613-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/25/2020] [Revised: 09/05/2020] [Accepted: 09/22/2020] [Indexed: 10/23/2022]
Abstract
PURPOSE: Apart from patients with severe neurological deficits, it is not clear whether surgical or conservative treatment of lumbar disc herniations is superior for the individual patient. We investigated whether deep learning techniques can predict the outcome of patients with lumbar disc herniation after 6 months of treatment. METHODS: The data of 60 patients were used to train and test a deep learning algorithm with the aim of achieving an accurate prediction of the ODI 6 months after surgery or the start of conservative therapy. We developed an algorithm that predicts the ODI of 6 randomly selected test patients in tenfold cross-validation. RESULTS: A 100% accurate prediction of an ODI range could be achieved by dividing the ODI scale into 12% sections. A maximum absolute difference of only 3.4% between individually predicted and actual ODI after 6 months of a given therapy was achieved with our most powerful model. The application of artificial intelligence as shown in this work also made it possible to compare the actual patient values after 6 months with the prediction for the alternative therapy, showing deviations of up to 18.8%. CONCLUSION: Deep learning in the supervised form applied here can identify patients at an early stage who would benefit from conservative therapy and, conversely, avoid painful and unnecessary delays for patients who would profit from surgical therapy. In addition, this approach can be used in many other areas of medicine as an effective tool for decision-making when choosing between opposing treatment options, despite small patient groups.
|
85
|
Vazquez J, Abdelrahman S, Byrne LM, Russell M, Harris P, Facelli JC. Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch. J Clin Transl Sci 2020; 5:e42. [PMID: 33948264 DOI: 10.1017/cts.2020.535] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022] Open
Abstract
Introduction: Lack of participation in clinical trials (CTs) is a major barrier for the evaluation of new pharmaceuticals and devices. Here we report the results of the analysis of a dataset from ResearchMatch, an online clinical registry, using supervised machine learning approaches and a deep learning approach to discover characteristics of individuals more likely to show an interest in participating in CTs. Methods: We trained six supervised machine learning classifiers (Logistic Regression (LR), Decision Tree (DT), Gaussian Naïve Bayes (GNB), K-Nearest Neighbor Classifier (KNC), Adaboost Classifier (ABC) and a Random Forest Classifier (RFC)), as well as a deep learning method, Convolutional Neural Network (CNN), using a dataset of 841,377 instances and 20 features, including demographic data, geographic constraints, medical conditions and ResearchMatch visit history. Our outcome variable consisted of responses showing specific participant interest when presented with specific clinical trial opportunity invitations (‘yes’ or ‘no’). Furthermore, we created four subsets from this dataset based on top self-reported medical conditions and gender, which were separately analysed. Results: The deep learning model outperformed the machine learning classifiers, achieving an area under the curve (AUC) of 0.8105. Conclusions: The results show sufficient evidence that there are meaningful correlations amongst predictor variables and outcome variable in the datasets analysed using the supervised machine learning classifiers. These approaches show promise in identifying individuals who may be more likely to participate when offered an opportunity for a clinical trial.
|
86
|
Jacques J, Martin-Huyghe H, Lemtiri-Florek J, Taillard J, Jourdan L, Dhaenens C, Delerue D, Hansske A, Leclercq V. The detection of hospitalized patients at risk of testing positive to multi-drug resistant bacteria using MOCA-I, a rule-based "white-box" classification algorithm for medical data. Int J Med Inform 2020; 142:104242. [PMID: 32853975 DOI: 10.1016/j.ijmedinf.2020.104242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/31/2020] [Revised: 07/19/2020] [Accepted: 07/25/2020] [Indexed: 10/23/2022]
Abstract
BACKGROUND: Multi-drug resistant (MDR) bacteria are a major health concern. In this retrospective study, a rule-based classification algorithm, MOCA-I (Multi-Objective Classification Algorithm for Imbalanced data), is used to identify hospitalized patients at risk of testing positive for MDR bacteria, including Methicillin-resistant Staphylococcus aureus (MRSA), before or during their stay. METHODS: Applied to a data set of 48,945 hospital stays (including known cases of carriage) with up to 16,325 attributes per stay, MOCA-I generated alert rules for risk of carriage or infection. A risk score was then computed for each stay according to the triggered rules. Recall and precision curves were plotted. RESULTS: The classification can be focused on specifically detecting high risk of having a positive test, or on identifying large numbers of at-risk patients, by modulating the risk score cut-off level. For a risk score above 0.85, recall (sensitivity) is 62% with 69% precision (confidence) for MDR bacteria, and 58% with 88% precision for MRSA. In addition, MOCA-I identifies 38 and 21 cases of previously unknown MDR and MRSA, respectively. CONCLUSIONS: MOCA-I generates medically pertinent alert rules. This classification algorithm can be used to detect patients with high risk of testing positive for MDR bacteria (including MRSA). Classification can be modulated by appropriately setting the risk score cut-off level to favor specific detection of small numbers of patients at very high risk or identification of large numbers of patients at risk. MOCA-I can thus contribute to more adapted treatments and preventive measures from admission, depending on the clinical setting or management strategy.
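The trade-off described here, where raising or lowering the risk score cut-off shifts the balance between precision and recall, can be illustrated with a short Python sketch. The scores and labels below are invented toy data, and the function is a generic precision/recall calculation, not the MOCA-I algorithm itself.

```python
def precision_recall_at_cutoff(scores, labels, cutoff):
    """Precision and recall when every stay with score > cutoff raises an alert."""
    flagged = [label for score, label in zip(scores, labels) if score > cutoff]
    true_pos = sum(flagged)
    total_pos = sum(labels)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


# Hypothetical risk scores for eight stays; 1 = tested positive, 0 = did not.
scores = [0.95, 0.90, 0.88, 0.70, 0.60, 0.40, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

high = precision_recall_at_cutoff(scores, labels, 0.85)  # strict cut-off
low = precision_recall_at_cutoff(scores, labels, 0.30)   # permissive cut-off
print("cut-off 0.85:", high)
print("cut-off 0.30:", low)
```

On this toy data the strict cut-off flags fewer stays, all of them true positives, while the permissive cut-off catches every positive stay at the cost of some false alerts.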
Affiliation(s)
- Julie Jacques
  - Lille Catholic University, Faculté de Gestion, Economie et Sciences, France; Univ. Lille, CNRS, Centrale Lille, UMR 9189, CRIStAL, Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France
- Hélène Martin-Huyghe
  - Lille Catholic Hospitals, Infection Control Department, Lille Catholic University, KASHMIR, Lille, France; CH Arras, Pharmacy Department, Arras, France
- Justine Lemtiri-Florek
  - Lille Catholic Hospitals, Infection Control Department, Lille Catholic University, KASHMIR, Lille, France; CH Valenciennes, Intensive Care Department, F-59322 Valenciennes, France
- Laetitia Jourdan
  - Univ. Lille, CNRS, Centrale Lille, UMR 9189, CRIStAL, Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France
- Clarisse Dhaenens
  - Univ. Lille, CNRS, Centrale Lille, UMR 9189, CRIStAL, Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France
- Arnaud Hansske
  - Lille Catholic Hospitals, IT System Department, Lille Catholic University, KASHMIR, Lille, France
- Valérie Leclercq
  - Lille Catholic Hospitals, Infection Control Department, Lille Catholic University, KASHMIR, Lille, France
|
87
|
Sekercioglu N, Fu R, Kim SJ, Mitsakakis N. Machine learning for predicting long-term kidney allograft survival: a scoping review. Ir J Med Sci 2021; 190:807-17. [PMID: 32761550 DOI: 10.1007/s11845-020-02332-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/10/2020] [Accepted: 07/26/2020] [Indexed: 12/24/2022]
Abstract
Supervised machine learning (ML) is a class of algorithms that "learn" from existing input-output pairs, which is gaining popularity in pattern recognition for classification and prediction problems. In this scoping review, we examined the use of supervised ML algorithms for the prediction of long-term allograft survival in kidney transplant recipients. Data sources included PubMed, the Cumulative Index to Nursing and Allied Health Literature, and the Institute for Electrical and Electronics Engineers (IEEE) Xplore libraries from inception to November 2019. We screened titles and abstracts and potentially eligible full-text reports to select studies and subsequently abstracted the data. Eleven studies were identified. Decision trees were the most commonly used method (n = 8), followed by artificial neural networks (ANN) (n = 4) and Bayesian belief networks (n = 2). The area under receiver operating curve (AUC) was the most common measure of discrimination (n = 7), followed by sensitivity (n = 5) and specificity (n = 4). Model calibration examining the reliability in risk prediction was performed using either the Pearson r or the Hosmer-Lemeshow test in four studies. One study showed that logistic regression had comparable performance to ANN, while another study demonstrated that ANN performed better in terms of sensitivity, specificity, and accuracy, as compared with a Cox proportional hazards model. We synthesized the evidence related to the comparison of ML techniques with traditional statistical approaches for prediction of long-term allograft survival in patients with a kidney transplant. The methodological and reporting quality of included studies was poor. Our study also demonstrated mixed results in terms of the predictive potential of the models.
|
88
|
Bowen ME. Monitoring functional status using a wearable real-time locating technology. Nurs Outlook 2020; 68:727-733. [PMID: 32546324 DOI: 10.1016/j.outlook.2020.04.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/17/2020] [Revised: 04/19/2020] [Accepted: 04/26/2020] [Indexed: 12/19/2022]
Abstract
Sensor technologies enable real-time, continuous, and objective monitoring of activity and functioning in later life. In long-term care, timely assessment of functional status is needed to prevent falls and other acute events. However, the electronic forms and paper and pencil tools currently used are time-consuming and conducted too infrequently (e.g., every 6 months) to provide the sensitivity and specificity required. Staff are also unable to detect subtle changes in functioning through observation alone. The purpose of this paper is to discuss the use of a wearable real-time locating system that utilizes ultra wideband radio technology to continuously and objectively measure activity and aspects of functional status. This paper discusses the associated conceptualization and development of the scoring algorithms, raw data transformation, use of traditional paper and pencil tools and electronic health record data to validate sensor data, and other tips for those interested in this type of wearable sensor technology.
Affiliation(s)
- Mary Elizabeth Bowen
  - School of Nursing, University of Delaware, Newark, DE; Cpl Michael J. Crescenz VA Medical Center, 3900 Woodland Avenue, Philadelphia, PA 19104
|
89
|
Bertsatos A, Chovalopoulou ME, Brůžek J, Bejdová Š. Advanced procedures for skull sex estimation using sexually dimorphic morphometric features. Int J Legal Med 2020; 134:1927-1937. [PMID: 32504147 DOI: 10.1007/s00414-020-02334-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/28/2020] [Accepted: 05/29/2020] [Indexed: 11/30/2022]
Abstract
This paper introduces an automated method for estimating sex from cranial sex diagnostic traits by extracting and evaluating specialized morphometric features from the glabella, the supraorbital ridge, the occipital protuberance, and the mastoid process. The proposed method was developed and evaluated using two European population samples, a Czech sample comprising 170 crania reconstructed from anonymized CT scans and a Greek sample of 156 crania from the Athens Collection. It is based on a fully automatic algorithm applied on 3D models for extracting sex diagnostic morphometric features which are further processed by computer vision and machine learning algorithms. Classification accuracy was evaluated in a population specific and a population generic 2-way cross-validation scheme. Population-specific accuracy for individual morphometric features ranged from 78.5 to 96.7%, whereas population generic correct classification ranged from 71.7 to 90.8%. Combining all sex diagnostic traits in multi-feature sex estimation yielded correct classification performance in excess of 91% for the entire sample, whereas the sex of about three fourths of the sample could be determined with 100% accuracy according to posterior probability estimates. The proposed method provides an efficient and reliable way to estimate sex from cranial remains, and it offers significant advantages over existing methods. The proposed method can be readily implemented with the skullanalyzer computer program and the estimate_sex.m GNU Octave function, which are freely available under a suitable license.
Affiliation(s)
- Andreas Bertsatos
  - Department of Animal and Human Physiology, Faculty of Biology, School of Sciences, University of Athens, Panepistimiopolis, GR 157 01, Athens, Greece
- Maria-Eleni Chovalopoulou
  - Science and Technology in Archaeology and Culture Research Center, The Cyprus Institute, 2121 Aglantzia, Nicosia, Cyprus
- Jaroslav Brůžek
  - Department of Anthropology and Human Genetics, Faculty of Science, Charles University, Viničná 7, 128 44, Prague 2, Czech Republic
- Šárka Bejdová
  - Department of Anthropology and Human Genetics, Faculty of Science, Charles University, Viničná 7, 128 44, Prague 2, Czech Republic
|
90
|
Obmann MM, Cosentino A, Cyriac J, Hofmann V, Stieltjes B, Boll DT, Yeh BM, Benz MR. Quantitative enhancement thresholds and machine learning algorithms for the evaluation of renal lesions using single-phase split-filter dual-energy CT. Abdom Radiol (NY) 2020; 45:1922-1928. [PMID: 31451887 DOI: 10.1007/s00261-019-02195-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 02/05/2023]
Abstract
PURPOSE: To establish thresholds for contrast enhancement-based attenuation (CM) and iodine concentration (IOD) for the quantitative evaluation of enhancement in renal lesions on single-phase split-filter dual-energy CT (tbDECT) and combine measurements in a machine learning algorithm to potentially improve performance. MATERIAL: 126 patients with incidental renal cysts (both hypo- and hyperdense cysts) or high suspicion for renal cell carcinoma (312 total lesions) undergoing abdominal, portal venous phase tbDECT were initially included in this retrospective study. Gold standard was pathological confirmation or follow-up imaging (MRI or multiphasic CT). CM, IOD, and ROI size were recorded. Thresholds for CM and IOD were identified using Youden-Index of the empirical ROC curves. Decision tree (DTC) and random forest classifier (RFC) were trained. Sensitivities, specificities, and AUCs were compared using McNemar and DeLong test. RESULTS: The final study cohort comprised 40 enhancing and 113 non-enhancing renal lesions. Optimal thresholds for quantitative iodine measurements and contrast enhancement-based attenuation were 1.0 ± 0.0 mg/ml and 23.6 ± 0.3 HU, respectively. Single DECT parameters (IOD, CM) showed similar overall performance with an AUC of 0.894 and 0.858 (p = 0.541) (sensitivity 90 and 80%, specificity 88 and 92%, respectively). While overall performance for the DTC (AUC 0.944) was higher than RFC (AUC 0.886), this difference (p = 0.409) and comparison to CM (p = 0.243) and IOD (p = 0.353) was not statistically significant. CONCLUSIONS: Enhancement in incidental renal lesions on single-phase tbDECT can be classified with up to 87.5% sensitivity and 94.6% specificity. Algorithms combining DECT parameters did not increase overall performance.
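The Youden index used for threshold selection is J = sensitivity + specificity - 1, maximized over candidate cut-offs on the ROC curve. For the sensitivity/specificity pairs reported above, that arithmetic gives roughly 0.78 for IOD (0.90 + 0.88 - 1) and 0.72 for CM (0.80 + 0.92 - 1). The Python sketch below shows how such a threshold could be picked from raw measurements; the iodine values and labels are invented toy data, and the function name is hypothetical.

```python
def youden_threshold(values, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A lesion is called enhancing when its value is >= threshold.
    On ties in J, the lowest threshold is kept.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (None, -1.0)
    for threshold in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= threshold and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < threshold and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best[1]:
            best = (threshold, j)
    return best


# Hypothetical iodine concentrations (mg/ml); 1 = enhancing, 0 = non-enhancing.
iodine = [0.2, 0.4, 0.6, 0.9, 1.1, 1.3, 1.8, 2.5]
enhancing = [0, 0, 0, 1, 0, 1, 1, 1]

threshold, j = youden_threshold(iodine, enhancing)
print(threshold, j)
```

The tie-breaking rule (keep the lowest threshold) is an arbitrary choice for this sketch; the abstract does not describe how ties were handled.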
Affiliation(s)
- Markus M Obmann
  - Clinic of Radiology and Nuclear Medicine, University Hospital Basel, University of Basel, Petersgraben 4, 4031, Basel, Switzerland
  - Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, CA, USA
- Aurelio Cosentino
  - Department of Surgical Sciences, Radiology Unit, University of Turin, Turin, Italy
- Joshy Cyriac
  - Clinic of Radiology and Nuclear Medicine, University Hospital Basel, University of Basel, Petersgraben 4, 4031, Basel, Switzerland
- Verena Hofmann
  - Clinic of Radiology and Nuclear Medicine, University Hospital Basel, University of Basel, Petersgraben 4, 4031, Basel, Switzerland
- Bram Stieltjes
  - Clinic of Radiology and Nuclear Medicine, University Hospital Basel, University of Basel, Petersgraben 4, 4031, Basel, Switzerland
- Daniel T Boll
  - Clinic of Radiology and Nuclear Medicine, University Hospital Basel, University of Basel, Petersgraben 4, 4031, Basel, Switzerland
- Benjamin M Yeh
  - Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, CA, USA
- Matthias R Benz
  - Clinic of Radiology and Nuclear Medicine, University Hospital Basel, University of Basel, Petersgraben 4, 4031, Basel, Switzerland
|
91
|
Mercier MR, Cappe C. The interplay between multisensory integration and perceptual decision making. Neuroimage 2020; 222:116970. [PMID: 32454204 DOI: 10.1016/j.neuroimage.2020.116970] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/31/2019] [Revised: 03/23/2020] [Accepted: 05/15/2020] [Indexed: 01/15/2023] Open
Abstract
Facing perceptual uncertainty, the brain combines information from different senses to make optimal perceptual decisions and to guide behavior. However, decision making has been investigated mostly in unimodal contexts. Thus, how the brain integrates multisensory information during decision making is still unclear. Two opposing, but not mutually exclusive, scenarios are plausible: either the brain thoroughly combines the signals from different modalities before starting to build a supramodal decision, or unimodal signals are integrated during decision formation. To answer this question, we devised a paradigm mimicking naturalistic situations where human participants were exposed to continuous cacophonous audiovisual inputs containing an unpredictable signal cue in one or two modalities and had to perform a signal detection task or a cue categorization task. First, model-based analyses of behavioral data indicated that multisensory integration takes place alongside perceptual decision making. Next, using supervised machine learning on concurrently recorded EEG, we identified neural signatures of two processing stages: sensory encoding and decision formation. Generalization analyses across experimental conditions and time revealed that multisensory cues were processed faster during both stages. We further established that acceleration of neural dynamics during sensory encoding and decision formation was directly linked to multisensory integration. Our results were consistent across both signal detection and categorization tasks. Taken together, the results revealed a continuous dynamic interplay between multisensory integration and decision making processes (mixed scenario), with integration of multimodal information taking place both during sensory encoding as well as decision formation.
|
92
|
Ye C, Li J, Hao S, Liu M, Jin H, Zheng L, Xia M, Jin B, Zhu C, Alfreds ST, Stearns F, Kanov L, Sylvester KG, Widen E, McElhinney D, Ling XB. Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm. Int J Med Inform 2020; 137:104105. [PMID: 32193089 DOI: 10.1016/j.ijmedinf.2020.104105] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 12/30/2019] [Revised: 02/15/2020] [Accepted: 02/27/2020] [Indexed: 01/09/2023]
Abstract
OBJECTIVE: Predicting the risk of falls in advance can benefit the quality of care and potentially reduce mortality and morbidity in the older population. The aim of this study was to construct and validate an electronic health record-based fall risk predictive tool to identify elders at a higher risk of falls. METHODS: The one-year fall prediction model was developed using the machine-learning-based algorithm XGBoost and tested on an independent validation cohort. The data were collected from electronic health records (EHR) of Maine from 2016 to 2018, comprising 265,225 older patients (≥65 years of age). RESULTS: This model attained a validated C-statistic of 0.807, with 50% of the identified high-risk true positives confirmed to have fallen within the first 94 days of the following year. The model also captured in advance 58.01% and 54.93% of falls that occurred within the first 30 days and within days 30-60 of the following year, respectively. Patients identified as being at high risk of falls showed severe disease comorbidities, an enrichment of fall-increasing cardiovascular and mental medication prescriptions and increased historical clinical utilization, revealing the complexity of the underlying fall etiology. The XGBoost algorithm incorporated 157 impactful predictors into the final predictive model, where cognitive disorders, abnormalities of gait and balance, Parkinson's disease, fall history and osteoporosis were identified as the top-5 strongest predictors of the future fall event. CONCLUSIONS: By using the EHR data, this risk assessment tool attained an improved discriminative ability and can be immediately deployed in the health system to provide automatic early warnings to older adults with increased fall risk and identify their personalized risk factors to facilitate customized fall interventions.
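For a binary outcome, the C-statistic quoted above is equivalent to the area under the ROC curve: the fraction of (faller, non-faller) pairs in which the faller received the higher predicted risk, with ties counted as half. The following Python sketch computes it by brute force on invented toy data; it illustrates the metric only and makes no claim about how the authors computed theirs.

```python
def c_statistic(scores, outcomes):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    concordant = 0.0
    pairs = 0
    for s_pos, y_pos in zip(scores, outcomes):
        if y_pos != 1:
            continue
        for s_neg, y_neg in zip(scores, outcomes):
            if y_neg != 0:
                continue
            pairs += 1
            if s_pos > s_neg:
                concordant += 1.0
            elif s_pos == s_neg:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return concordant / pairs


# Hypothetical predicted fall risks; 1 = fell during follow-up, 0 = did not.
risk = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
fell = [1, 0, 1, 1, 0, 0]

auc = c_statistic(risk, fell)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of patients, so a cohort of the size described here would call for a rank-based computation instead; the brute-force form is used only because it mirrors the definition directly.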
Collapse
Affiliation(s)
- Chengyin Ye
  - Department of Health Management, Hangzhou Normal University, Hangzhou, China.
- Jinmei Li
  - Department of Health Management, Hangzhou Normal University, Hangzhou, China.
- Shiying Hao
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States; Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States.
- Modi Liu
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Hua Jin
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Le Zheng
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States; Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States.
- Minjie Xia
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Bo Jin
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Chunqing Zhu
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Laura Kanov
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Karl G Sylvester
  - Department of Surgery, Stanford University, Stanford, CA, United States.
- Eric Widen
  - HBI Solutions Inc., Palo Alto, CA, United States.
- Doff McElhinney
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States; Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States.
- Xuefeng Bruce Ling
  - Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States; Department of Surgery, Stanford University, Stanford, CA, United States.
93
Suvorov A, Hochuli J, Schrider DR. Accurate Inference of Tree Topologies from Multiple Sequence Alignments Using Deep Learning. Syst Biol 2020; 69:221-233. [PMID: 31504938 PMCID: PMC8204903 DOI: 10.1093/sysbio/syz060] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/07/2019] [Accepted: 08/28/2019] [Indexed: 11/13/2022] Open
Abstract
Reconstructing the phylogenetic relationships between species is one of the most formidable tasks in evolutionary biology. Multiple methods exist to reconstruct phylogenetic trees, each with its own strengths and weaknesses. Both simulation and empirical studies have identified several "zones" of parameter space where the accuracy of some methods can plummet, even for four-taxon trees. Further, some methods can have undesirable statistical properties such as statistical inconsistency and/or the tendency to be positively misleading (i.e. to assert strong support for the incorrect tree topology). Recently, deep learning techniques have made inroads on a number of both new and longstanding problems in biological research. In this study, we designed a deep convolutional neural network (CNN) to infer quartet topologies from multiple sequence alignments. This CNN can readily be trained to make inferences using both gapped and ungapped data. We show that our approach is highly accurate on simulated data, often outperforming traditional methods, and is remarkably robust to bias-inducing regions of parameter space such as the Felsenstein zone and the Farris zone. We also demonstrate that the confidence scores produced by our CNN can more accurately assess support for the chosen topology than bootstrap and posterior probability scores from traditional methods. Although numerous practical challenges remain, these findings suggest that deep learning approaches such as ours have the potential to produce more accurate phylogenetic inferences.
Affiliation(s)
- Anton Suvorov
  - Department of Genetics, University of North Carolina at Chapel Hill, 120 Mason Farm Road, UNC-Chapel Hill, Chapel Hill, NC 27599-7264, USA
- Joshua Hochuli
  - Biological and Biomedical Sciences Program, University of North Carolina at Chapel Hill, 130 Mason Farm Road, UNC-Chapel Hill, Chapel Hill, NC 27599-7264, USA
- Daniel R Schrider
  - Biological and Biomedical Sciences Program, University of North Carolina at Chapel Hill, 130 Mason Farm Road, UNC-Chapel Hill, Chapel Hill, NC 27599-7264, USA
94
Lanka P, Rangaprakash D, Gotoor SSR, Dretsch MN, Katz JS, Denney TS, Deshpande G. MALINI (Machine Learning in NeuroImaging): A MATLAB toolbox for aiding clinical diagnostics using resting-state fMRI data. Data Brief 2020; 29:105213. [PMID: 32090157 PMCID: PMC7025186 DOI: 10.1016/j.dib.2020.105213] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/09/2019] [Revised: 01/20/2020] [Accepted: 01/23/2020] [Indexed: 12/26/2022] Open
Abstract
Resting-state functional Magnetic Resonance Imaging (rs-fMRI) has been extensively used for diagnostic classification because it does not require task compliance and makes it easier to pool data from multiple imaging sites, thereby increasing the sample size. A MATLAB-based toolbox called Machine Learning in NeuroImaging (MALINI) for feature extraction and disease classification is presented. The MALINI toolbox extracts functional and effective connectivity features from preprocessed rs-fMRI data and performs classification between healthy and disease groups using any of 18 widely used machine learning algorithms that are based on diverse principles. A consensus classifier combining the power of multiple classifiers is also presented. The utility of the toolbox is illustrated by accompanying data consisting of resting-state functional connectivity features from healthy controls and subjects with various brain-based disorders: autism spectrum disorder from the Autism Brain Imaging Data Exchange (ABIDE), Alzheimer's disease and mild cognitive impairment from the Alzheimer's Disease Neuroimaging Initiative (ADNI), attention deficit hyperactivity disorder from ADHD-200, and post-traumatic stress disorder and post-concussion syndrome acquired in-house. Results of classification performed on the above datasets can be obtained from the main article titled “Supervised machine learning for diagnostic classification from large-scale neuroimaging datasets” [1]. The data were divided into homogeneous and heterogeneous splits, such that 80% could be used for training, model building and cross-validation, while the remaining 20% could be used as independent hold-out test data for replication of the classification performance, to ensure the robustness of the classifiers to population variance in image acquisition site and age of the sample.
Affiliation(s)
- Pradyumna Lanka
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA; Department of Psychological Sciences, University of California Merced, Merced, CA, USA
- D Rangaprakash
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA; Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Division of Health Science and Technology, Massachusetts Institute of Technology, Boston, MA, USA
- Sai Sheshan Roy Gotoor
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA
- Michael N Dretsch
  - U.S. Army Aeromedical Research Laboratory, Fort Rucker, AL, USA; US Army Medical Research Directorate-West, Walter Reed Army Institute for Research, Joint Base Lewis-McChord, WA, USA; Department of Psychology, Auburn University, Auburn, AL, USA
- Jeffrey S Katz
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA; Department of Psychology, Auburn University, Auburn, AL, USA; Alabama Advanced Imaging Consortium, Birmingham, AL, USA; Center for Neuroscience, Auburn University, Auburn, AL, USA
- Thomas S Denney
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA; Department of Psychology, Auburn University, Auburn, AL, USA; Alabama Advanced Imaging Consortium, Birmingham, AL, USA; Center for Neuroscience, Auburn University, Auburn, AL, USA
- Gopikrishna Deshpande
  - AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA; Department of Psychology, Auburn University, Auburn, AL, USA; Alabama Advanced Imaging Consortium, Birmingham, AL, USA; Center for Health Ecology and Equity Research, Auburn University, Auburn, AL, USA; Center for Neuroscience, Auburn University, Auburn, AL, USA; Department of Psychiatry, National Institute of Mental and Neurosciences, Bangalore, India; School of Psychology, Capital Normal University, Beijing, China; Key Laboratory for Learning and Cognition, Capital Normal University, Beijing, China
95
Torada L, Lorenzon L, Beddis A, Isildak U, Pattini L, Mathieson S, Fumagalli M. ImaGene: a convolutional neural network to quantify natural selection from genomic data. BMC Bioinformatics 2019; 20:337. [PMID: 31757205 PMCID: PMC6873651 DOI: 10.1186/s12859-019-2927-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 05/21/2019] [Accepted: 05/31/2019] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND The genetic bases of many complex phenotypes are still largely unknown, mostly due to the polygenic nature of the traits and the small effect of each associated mutation. An alternative to classic association studies for determining such genetic bases is an evolutionary framework. As sites targeted by natural selection are likely to harbor important functionalities for the carrier, the identification of selection signatures in the genome has the potential to unveil the genetic mechanisms underpinning human phenotypes. Popular methods of detecting such signals rely on compressing genomic information into summary statistics, resulting in the loss of information. Furthermore, few methods are able to quantify the strength of selection. Here we explored the use of deep learning in evolutionary biology and implemented a program, called ImaGene, to apply convolutional neural networks to population genomic data for the detection and quantification of natural selection. RESULTS ImaGene enables genomic information from multiple individuals to be represented as abstract images. Each image is created by stacking aligned genomic data and encoding distinct alleles into separate colors. To detect and quantify signatures of positive selection, ImaGene implements a convolutional neural network which is trained using simulations. We show how the method implemented in ImaGene can be affected by data manipulation and learning strategies. In particular, we show how sorting images by row and column leads to accurate predictions. We also demonstrate how misspecification of the demographic model used to produce training data can influence the quantification of positive selection. We finally illustrate an approach to estimate the selection coefficient, a continuous variable, using multiclass classification techniques.
CONCLUSIONS While the use of deep learning in evolutionary genomics is in its infancy, here we demonstrated its potential to detect informative patterns from large-scale genomic data. We implemented methods to process genomic data for deep learning in a user-friendly program called ImaGene. The joint inference of the evolutionary history of mutations and their functional impact will facilitate mapping studies and provide novel insights into the molecular mechanisms associated with human phenotypes.
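The image representation described in this abstract can be illustrated with a small sketch. The abstract states only that images are sorted by row and column; the specific ordering used here (columns by derived-allele count, rows by how often an identical haplotype occurs) is an assumption made for illustration, and `sort_genomic_image` is a hypothetical helper, not part of the ImaGene API.

```python
from collections import Counter


def sort_genomic_image(haplotypes):
    """Reorder a binary haplotype matrix before feeding it to a CNN.

    haplotypes: list of equal-length lists of 0/1 alleles
                (rows = sampled haplotypes, columns = aligned sites).
    Assumed ordering: columns by decreasing derived-allele count,
    then rows by decreasing frequency of the identical haplotype.
    """
    n_cols = len(haplotypes[0])
    # Count derived alleles (1s) per site and order sites by that count.
    col_counts = [sum(row[j] for row in haplotypes) for j in range(n_cols)]
    col_order = sorted(range(n_cols), key=lambda j: -col_counts[j])
    reordered = [[row[j] for j in col_order] for row in haplotypes]
    # Order rows so that the most common haplotypes come first;
    # ties are broken by the row contents to keep the result deterministic.
    freq = Counter(tuple(row) for row in reordered)
    return sorted(reordered, key=lambda row: (-freq[tuple(row)], row))


# Hypothetical toy alignment: 4 haplotypes, 3 sites.
toy_image = sort_genomic_image([[0, 1, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0]])
```

Sorting makes the representation invariant to the arbitrary order in which haplotypes were sampled, which is one plausible reason such preprocessing would help a convolutional network.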
Affiliation(s)
- Luis Torada
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
- Lucrezia Lorenzon
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, piazza Leonardo da Vinci 32, Milan, 20133 Italy
- Alice Beddis
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
- Ulas Isildak
  - Department of Biological Sciences, Middle East Technical University, METU Üniversiteler Mah. Dumlupınar Blv. No:1, Ankara, 06800 Çankaya Turkey
- Linda Pattini
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, piazza Leonardo da Vinci 32, Milan, 20133 Italy
- Sara Mathieson
  - Department of Computer Science, Swarthmore College, 500 College Ave, Swarthmore, 19081 PA USA
- Matteo Fumagalli
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
96
Mwanga EP, Minja EG, Mrimi E, Jiménez MG, Swai JK, Abbasi S, Ngowo HS, Siria DJ, Mapua S, Stica C, Maia MF, Olotu A, Sikulu-Lord MT, Baldini F, Ferguson HM, Wynne K, Selvaraj P, Babayan SA, Okumu FO. Detection of malaria parasites in dried human blood spots using mid-infrared spectroscopy and logistic regression analysis. Malar J 2019; 18:341. [PMID: 31590669 PMCID: PMC6781347 DOI: 10.1186/s12936-019-2982-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/08/2019] [Accepted: 09/28/2019] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Epidemiological surveys of malaria currently rely on microscopy, polymerase chain reaction (PCR) assays or rapid diagnostic test kits (RDTs) for Plasmodium infections. This study investigated whether mid-infrared (MIR) spectroscopy coupled with supervised machine learning could constitute an alternative method for rapid malaria screening, directly from dried human blood spots. METHODS Filter papers containing dried blood spots (DBS) were obtained from a cross-sectional malaria survey in 12 wards in southeastern Tanzania in 2018/19. The DBS were scanned using an attenuated total reflection-Fourier transform infrared (ATR-FTIR) spectrometer to obtain high-resolution MIR spectra in the range 4000 cm⁻¹ to 500 cm⁻¹. The spectra were cleaned to compensate for atmospheric water vapour and CO₂ interference bands and used to train different classification algorithms to distinguish between malaria-positive and malaria-negative DBS papers, with PCR test results as the reference. The analysis considered 296 individuals, including 123 PCR-confirmed malaria positives and 173 negatives. Model training was done using 80% of the dataset, after which the best-fitting model was optimized by bootstrapping of 80/20 train/test-stratified splits. The trained models were evaluated by predicting Plasmodium falciparum positivity in the 20% validation set of DBS. RESULTS Logistic regression was the best-performing model. Considering PCR as reference, the models attained overall accuracies of 92% for predicting P. falciparum infections (specificity = 91.7%; sensitivity = 92.8%) and 85% for predicting mixed infections of P. falciparum and Plasmodium ovale (specificity = 85%, sensitivity = 85%) in the field-collected specimens. CONCLUSION These results demonstrate that mid-infrared spectroscopy coupled with supervised machine learning (MIR-ML) could be used to screen for malaria parasites in human DBS. The approach could have potential for rapid and high-throughput screening of Plasmodium in both non-clinical settings (e.g., field surveys) and clinical settings (diagnosis to aid case management). However, before the approach can be used, we need additional field validation in other study sites with different parasite populations, and in-depth evaluation of the biological basis of the MIR signals. Improving the classification algorithms and training models on larger datasets could also improve specificity and sensitivity. The MIR-ML spectroscopy system is physically robust, low-cost, and requires minimum maintenance.
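The accuracy, sensitivity and specificity figures quoted in this abstract are all derived from a comparison of model predictions against the PCR reference. A minimal sketch of that arithmetic follows; the function name `screening_metrics` and the toy vectors are hypothetical and do not reproduce the study's data.

```python
def screening_metrics(reference, predicted):
    """Compare binary predictions against a reference test.

    reference: list of 0/1 reference results (1 = positive by the reference test).
    predicted: list of 0/1 model predictions, same length.
    Returns sensitivity, specificity and overall accuracy.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed infections
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # true negatives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false alarms
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical toy data: 4 reference positives, 6 reference negatives.
toy_reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
toy_metrics = screening_metrics(toy_reference, toy_predicted)
```

Because sensitivity and specificity are computed within the positive and negative groups separately, they do not depend on how common infection is in the sample, whereas overall accuracy does.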
Affiliation(s)
- Emmanuel P Mwanga
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania.
- Elihaika G Minja
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
- Emmanuel Mrimi
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
- Johnson K Swai
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
- Said Abbasi
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
- Halfan S Ngowo
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Doreen J Siria
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
- Salum Mapua
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
  - School of Life Sciences, University of Keele, Keele, Staffordshire, ST5 5BG, UK
- Caleb Stica
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania
- Marta F Maia
  - KEMRI Wellcome Trust Research Programme, P.O. Box 230, Kilifi, 80108, Kenya
  - Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Old Road Campus Roosevelt Drive, Oxford, OX3 7FZ, UK
- Ally Olotu
  - KEMRI Wellcome Trust Research Programme, P.O. Box 230, Kilifi, 80108, Kenya
  - Interventions and Clinical Trials Department, Ifakara Health Institute, Bagamoyo, Tanzania
- Maggy T Sikulu-Lord
  - School of Public Health, University of Queensland, Saint Lucia, Australia
  - Department of Mathematics, Statistics and Computer Science, Marquette University, Wisconsin, USA
- Francesco Baldini
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Heather M Ferguson
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Klaas Wynne
  - School of Chemistry, University of Glasgow, Glasgow, G12 8QQ, UK
- Simon A Babayan
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Fredros O Okumu
  - Environmental Health and Ecological Sciences Department, Ifakara Health Institute, Morogoro, Tanzania.
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK.
  - School of Public Health, University of Witwatersrand, Johannesburg, South Africa.
97
Baltussen EJM, Brouwer de Koning SG, Sanders J, Aalbers AGJ, Kok NFM, Beets GL, Hendriks BHW, Sterenborg HJCM, Kuhlmann KFD, Ruers TJM. Tissue diagnosis during colorectal cancer surgery using optical sensing: an in vivo study. J Transl Med 2019; 17:333. [PMID: 31578153 PMCID: PMC6775650 DOI: 10.1186/s12967-019-2083-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/10/2019] [Accepted: 09/23/2019] [Indexed: 01/20/2023] Open
Abstract
Background In colorectal cancer surgery there is a delicate balance between complete removal of the tumor and sparing as much healthy tissue as possible. Especially in rectal cancer, intraoperative tissue recognition could be of great benefit in preventing positive resection margins and sparing as much healthy tissue as possible. To better guide the surgeon, we evaluated the accuracy of diffuse reflectance spectroscopy (DRS) for tissue characterization during colorectal cancer surgery and determined the added value of DRS when compared to clinical judgement. Methods DRS spectra were obtained from fat, healthy colorectal wall and tumor tissue during colorectal cancer surgery, and results were compared to histopathology examination of the measurement locations. All spectra were first normalized at 800 nm; thereafter, two support vector machines (SVMs) were trained using tenfold cross-validation. The first SVM separated fat from healthy colorectal wall and tumor tissue; the second SVM distinguished healthy colorectal wall from tumor tissue. Results Patients were included based on preoperative imaging indicating advanced local stage colorectal cancer. Based on the measurement results of 32 patients, the classification resulted in a mean accuracy for fat, healthy colorectal wall and tumor of 0.92, 0.89 and 0.95, respectively. If the classification threshold was adjusted such that no false negatives were allowed, the percentage of false positive measurement locations by DRS was 25% compared to 69% by clinical judgement. Conclusion This study shows the potential of DRS for tissue classification during colorectal cancer surgery. In particular, the low false positive rate obtained at a false negative rate of zero shows the added value for surgeons. Trial registration This trial was performed under approval from the internal review board committee (Dutch Trial Register NTR5315), registered on 04/13/2015, https://www.trialregister.nl/trial/5175.
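Two steps in this abstract lend themselves to a short illustration: normalizing each spectrum at 800 nm, and moving the classification threshold until no tumor location is missed. The sketch below is not the authors' pipeline. It assumes that each measurement location already has a classifier score where higher means more tumor-like, and it defines the false positive rate as the share of non-tumor locations flagged as tumor; both the helper names and that denominator are assumptions.

```python
def normalize_at(wavelengths, intensities, reference_nm=800):
    """Divide a spectrum by its intensity at the sampled wavelength closest to reference_nm."""
    idx = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - reference_nm))
    reference = intensities[idx]
    if reference == 0:
        raise ValueError("intensity at the reference wavelength is zero")
    return [value / reference for value in intensities]


def zero_false_negative_threshold(scores, is_tumor):
    """Pick the highest threshold that still flags every tumor location.

    scores:   classifier scores, higher = more tumor-like (assumed convention).
    is_tumor: matching list of booleans from the reference standard.
    Returns (threshold, false_positive_rate among non-tumor locations).
    """
    tumor_scores = [s for s, t in zip(scores, is_tumor) if t]
    other_scores = [s for s, t in zip(scores, is_tumor) if not t]
    threshold = min(tumor_scores)  # every tumor score is >= this value
    false_positives = sum(1 for s in other_scores if s >= threshold)
    return threshold, false_positives / len(other_scores)


# Hypothetical toy values, for illustration only.
toy_spectrum = normalize_at([700, 800, 900], [2.0, 4.0, 1.0])
toy_threshold, toy_fpr = zero_false_negative_threshold(
    [0.9, 0.6, 0.7, 0.2, 0.65, 0.1],
    [True, True, False, False, False, False],
)
```

Fixing the false negative rate at zero and then comparing false positive rates gives a like-for-like comparison between two raters, which is how the abstract contrasts DRS with clinical judgement.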
Affiliation(s)
- E J M Baltussen
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands.
- S G Brouwer de Koning
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands
- J Sanders
  - Department of Pathology, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands
- A G J Aalbers
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands
- N F M Kok
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands
- G L Beets
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands
- B H W Hendriks
  - Department of In-body Systems, Philips Research, Eindhoven, The Netherlands; Department of Biomechanical Engineering, Delft University of Technology, Delft, The Netherlands
- H J C M Sterenborg
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands; Department of Biomedical Engineering and Physics, Amsterdam University Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
- K F D Kuhlmann
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands
- T J M Ruers
  - Department of Surgery, Antoni van Leeuwenhoek Hospital - The Netherlands Cancer Institute, Amsterdam, The Netherlands; Faculty TNW, Group Nanobiophysics, Twente University, Enschede, The Netherlands
98
Liu M, Ylanko J, Weekman E, Beckett T, Andrews D, McLaurin J. Utilizing supervised machine learning to identify microglia and astrocytes in situ: implications for large-scale image analysis and quantification. J Neurosci Methods 2019; 328:108424. [PMID: 31494186 DOI: 10.1016/j.jneumeth.2019.108424] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/17/2019] [Revised: 08/14/2019] [Accepted: 09/04/2019] [Indexed: 12/24/2022]
Abstract
BACKGROUND The evaluation of histological tissue samples plays a crucial role in deciphering preclinical disease and injury mechanisms. High-resolution images can be obtained quickly; however, data acquisition is often bottlenecked by manual analysis methodologies. NEW METHOD We describe and validate a pipeline for a novel machine learning-based analytical method, using the Opera High-Content Screening system and Harmony software, allowing for detailed image analysis of cellular markers in histological samples. RESULTS To validate the machine learning pipeline, analyses of single proteins in mouse brain sections were utilized. To demonstrate adaptability of the pipeline for multiple cell types and epitopes, the percent brain coverage of microglial cells, identified by ionized calcium-binding adaptor molecule 1 (Iba1), and of astrocytes, identified by glial fibrillary acidic protein (GFAP), was compared and showed no significant differences between automated and manual analysis protocols. Further, to examine the robustness of this protocol for multiple proteins, simultaneous labeling of rat brain sections was utilized; co-localization of astrocytic endfeet on blood vessels, using aquaporin-4 and tomato lectin respectively, was efficiently identified and quantified by the novel pipeline and was not significantly different between the two analysis protocols. COMPARISON WITH EXISTING METHODS The automated platform maintained the sensitivity and accuracy of manual analysis, while accomplishing the analyses in 1/200th of the time. CONCLUSIONS We demonstrate the benefits and potential of adapting an automated high-throughput machine-learning analytical approach for the analysis of in situ tissue samples, and show its effectiveness across different animal models, while reducing analysis time and increasing productivity.
99
Takamatsu M, Yamamoto N, Kawachi H, Chino A, Saito S, Ueno M, Ishikawa Y, Takazawa Y, Takeuchi K. Prediction of early colorectal cancer metastasis by machine learning using digital slide images. Comput Methods Programs Biomed 2019; 178:155-161. [PMID: 31416544 DOI: 10.1016/j.cmpb.2019.06.022] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 02/10/2019] [Revised: 06/08/2019] [Accepted: 06/21/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND AND OBJECTIVES Prediction of lymph node metastasis (LNM) for early colorectal cancer (CRC) is critical for determining treatment strategies after endoscopic resection. Some histologic parameters for predicting LNM have been established, but evaluator error and inter-observer disagreement are unsolved issues. Here we describe an LNM prediction algorithm for submucosal invasive (T1) CRC based on machine learning. METHODS We conducted a retrospective single-institution study of 397 T1 CRCs. Several morphologic parameters were extracted from whole slide images of cytokeratin immunohistochemistry using ImageJ. A random forest algorithm was trained on a training dataset (n = 277) and used to predict LNM for the test dataset (n = 120). The results were compared with conventional histologic evaluation of hematoxylin-eosin staining. RESULTS Machine learning showed better LNM predictive ability than the conventional method on some datasets. Cross validation revealed no significant difference between the methods. Machine learning resulted in fewer false-negative cases than the conventional method. CONCLUSIONS Machine learning on whole slide images is a potential alternative for determining treatment strategies for T1 CRC.
Affiliation(s)
- Manabu Takamatsu
  - Division of Pathology, The Cancer Institute; Department of Pathology, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan.
- Noriko Yamamoto
  - Division of Pathology, The Cancer Institute; Department of Pathology, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Hiroshi Kawachi
  - Division of Pathology, The Cancer Institute; Department of Pathology, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Akiko Chino
  - Department of Endoscopy, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Shoichi Saito
  - Department of Endoscopy, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Masashi Ueno
  - Department of Colorectal Surgery, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Yuichi Ishikawa
  - Division of Pathology, The Cancer Institute; Department of Pathology, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Yutaka Takazawa
  - Division of Pathology, The Cancer Institute; Department of Pathology, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Kengo Takeuchi
  - Division of Pathology, The Cancer Institute; Department of Pathology, The Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
100
Yera A, Muguerza J, Arbelaitz O, Perona I, Keers RN, Ashcroft DM, Williams R, Peek N, Jay C, Vigo M. Modelling the interactive behaviour of users with a medication safety dashboard in a primary care setting. Int J Med Inform 2019; 129:395-403. [PMID: 31445283 DOI: 10.1016/j.ijmedinf.2019.07.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/13/2019] [Revised: 06/24/2019] [Accepted: 07/20/2019] [Indexed: 10/26/2022]
Abstract
OBJECTIVE To characterise the use of an electronic medication safety dashboard by exploring and contrasting interactions from primary users (i.e. pharmacists) who were leading the intervention and secondary users (i.e. non-pharmacist staff) who used the dashboard to engage in safe prescribing practices. MATERIALS AND METHODS We conducted a 10-month observational study in which 35 health professionals used an instrumented medication safety dashboard for audit and feedback purposes in clinical practice as part of a wider intervention study. We modelled user interaction by computing features representing exploration and dwell time from user interface events that were logged on a remote database. We applied supervised learning algorithms to classify primary against secondary users. RESULTS We observed values for accuracy above 0.8, indicating that we were able to distinguish a primary user from a secondary user more than 80% of the time. In particular, the Multilayer Perceptron (MLP) yielded the highest values of precision (0.88), recall (0.86) and F-measure (0.86). The behaviour of primary users was distinctive in that they spent less time between mouse clicks (lower dwell time) on the screens showing the overview of the practice and trends. Secondary users exhibited a higher dwell time and more visual search activity (higher exploration) on the screens displaying patients at risk and visualisations. DISCUSSION AND CONCLUSION We were able to distinguish the interactive behaviour of primary and secondary users of a medication safety dashboard in primary care using timestamped mouse events. Primary users were more competent at population health monitoring activities, while secondary users struggled with activities involving a detailed breakdown of the safety of patients. Informed by these findings, we propose workflows that group these activities and adaptive nudges to increase user engagement.
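The dwell-time feature mentioned in this abstract (time between mouse clicks on a given screen) can be sketched from a timestamped event log. The event format, the screen names and the rule that each gap is attributed to the screen of the earlier click are all assumptions for illustration; the abstract does not specify how the study's features were computed.

```python
from collections import defaultdict
from statistics import mean


def mean_dwell_by_screen(events):
    """Average time between consecutive clicks, grouped by screen.

    events: list of (timestamp_seconds, screen_name) tuples for one user.
    Each gap between two consecutive clicks is attributed to the screen
    on which the earlier click happened (an assumed convention).
    """
    ordered = sorted(events)
    gaps = defaultdict(list)
    for (t0, screen), (t1, _) in zip(ordered, ordered[1:]):
        gaps[screen].append(t1 - t0)
    return {screen: mean(values) for screen, values in gaps.items()}


# Hypothetical click log for a single user.
toy_events = [
    (0.0, "overview"),
    (2.0, "overview"),
    (5.0, "patients"),
    (15.0, "patients"),
    (16.0, "overview"),
]
toy_dwell = mean_dwell_by_screen(toy_events)
```

Per-screen averages of this kind could then serve as input features to a supervised classifier that separates primary from secondary users.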
Affiliation(s)
- Ainhoa Yera
  - Faculty of Informatics, University of the Basque Country UPV/EHU, Donostia/San Sebastián, Spain
- Javier Muguerza
  - Faculty of Informatics, University of the Basque Country UPV/EHU, Donostia/San Sebastián, Spain
- Olatz Arbelaitz
  - Faculty of Informatics, University of the Basque Country UPV/EHU, Donostia/San Sebastián, Spain
- Iñigo Perona
  - Faculty of Informatics, University of the Basque Country UPV/EHU, Donostia/San Sebastián, Spain
- Richard N Keers
  - Division of Pharmacy and Optometry, University of Manchester, Manchester, United Kingdom; NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- Darren M Ashcroft
  - Division of Pharmacy and Optometry, University of Manchester, Manchester, United Kingdom; NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- Richard Williams
  - Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom; NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- Niels Peek
  - Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom; NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- Caroline Jay
  - School of Computer Science, University of Manchester, Manchester, United Kingdom
- Markel Vigo
  - School of Computer Science, University of Manchester, Manchester, United Kingdom.