201
Shu Z, Liu W, Wu H, Xiao M, Wu D, Cao T, Ren M, Tao J, Zhang C, He T, Li X, Zhang R, Zhou X. Symptom-based network classification identifies distinct clinical subgroups of liver diseases with common molecular pathways. Computer Methods and Programs in Biomedicine 2019; 174:41-50. [PMID: 29502851 DOI: 10.1016/j.cmpb.2018.02.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/23/2017] [Accepted: 02/22/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Liver disease is a multifactorial, complex disease with high global prevalence and poor long-term clinical efficacy, and liver disease patients with different comorbidities often present multiple phenotypes in the clinic. Thus, there is a pressing need to better understand the complexity of the clinical liver disease population so that more accurate disease subtypes can be defined for personalized treatment. METHODS The individualized treatment practiced in traditional Chinese medicine (TCM) provides a theoretical basis for the personalized classification of complex diseases. Using the TCM clinical electronic medical records (EMRs) of 6475 liver inpatient cases, we built a liver disease comorbidity network (LDCN) to show the complicated associations between liver diseases and their comorbidities, and then constructed a patient similarity network based on shared symptoms (PSN). Finally, we identified liver patient subgroups using community detection methods and performed enrichment analyses to find the distinct clinical and molecular characteristics (using phenotype-genotype associations and interactome networks) of these patient subgroups. RESULTS From the comorbidity network, we found that clinical liver patients have a wide range of disease comorbidities, among which the basic liver diseases (e.g. hepatitis B, decompensated liver cirrhosis) and the common chronic diseases (e.g. hypertension, type 2 diabetes) have a high degree of comorbidity. In addition, we identified 303 patient modules (representing the liver patient subgroups) from the PSN, of which the 6 largest modules cover 51.68% of all cases and 251 modules contain 10 or fewer cases, indicating the diversity of manifestations of liver diseases.
Finally, we found that the patient subgroups have distinct symptom phenotypes, disease comorbidity characteristics and underlying molecular pathways, which could be used to understand novel disease subtypes of liver conditions. For example, three patient subgroups, namely Module 6 (M6, n = 638), M2 (n = 623) and M1 (n = 488), were associated with common chronic liver disease conditions (hepatitis, cirrhosis, hepatocellular carcinoma). Meanwhile, patient subgroups M30 (n = 36) and M36 (n = 37) were mostly related to acute gastroenteritis and upper respiratory infection, respectively, which reflects the individual comorbidity characteristics of liver subgroups. Furthermore, we identified the distinct genes and pathways of the patient subgroups and of the basic liver diseases (hepatitis B and cirrhosis), respectively. The high degree of overlapping pathways between them (e.g. M36 with 93.33% shared enriched pathways) indicates the underlying molecular network mechanisms of each patient subgroup. CONCLUSIONS Our results demonstrate the utility and comprehensiveness of disease classification based on community detection in patient networks built from shared TCM symptom phenotypes; the approach can be applied to other, more complex diseases.
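As a rough illustration of the pipeline this abstract describes, the sketch below builds a small patient similarity network from shared symptoms and then groups the patients. It is a minimal sketch under stated assumptions, not the authors' method: the Jaccard index, the 0.5 edge threshold, the toy patients and all function names are hypothetical, and connected components stand in for the community detection step, which the abstract does not specify.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of symptoms two patients have in common (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_psn(patients: dict, threshold: float = 0.5) -> dict:
    """Return weighted edges between patients whose similarity meets the threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    edges = {}
    for p, q in combinations(sorted(patients), 2):
        weight = jaccard(patients[p], patients[q])
        if weight >= threshold:
            edges[(p, q)] = weight
    return edges


def connected_groups(patients: dict, edges: dict) -> list:
    """Group patients by connected component (a crude stand-in for community detection)."""
    adjacency = {p: set() for p in patients}
    for p, q in edges:
        adjacency[p].add(q)
        adjacency[q].add(p)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical toy records: patient id -> set of recorded symptoms.
patients = {
    "p1": {"fatigue", "jaundice", "abdominal distension"},
    "p2": {"fatigue", "jaundice"},
    "p3": {"cough", "fever"},
    "p4": {"cough", "fever", "sore throat"},
}
groups = connected_groups(patients, build_psn(patients))
print(groups)  # [['p1', 'p2'], ['p3', 'p4']]
```

On real data a dedicated community detection algorithm would replace `connected_groups`, since connected components cannot split a densely linked network into modules.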
Affiliation(s)
- Zixin Shu
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; The clinical medical college of Traditional Chinese Medicine, Hubei University of Traditional Chinese Medicine, Wuhan 430065, China
- Wenwen Liu
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
- Huikun Wu
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Mingzhong Xiao
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Deng Wu
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Ting Cao
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Meng Ren
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Junxiu Tao
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Chuhua Zhang
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Tangqing He
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Xiaodong Li
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan 430061, China; Hubei Province Academy of Traditional Chinese Medicine, Wuhan 430061, China
- Runshun Zhang
  - Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing 100053, China
- Xuezhong Zhou
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
202
Bland JS. What is Evidence-Based Functional Medicine in the 21st Century? Integr Med (Encinitas) 2019; 18:14-18. [PMID: 32549804 PMCID: PMC7217393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
The 21st century has already demonstrated itself to be an era of change for medicine and science. There is a new openness: to ideas, to a shift in perspectives, to a redefinition of evidence and the many ways it can be gathered. Real-world data and patient-experience information have also become increasingly important contributors to the evaluation of treatment effectiveness. It is a fertile time on many fronts, including an expanded reach for a systems biology formalism and the Functional Medicine movement.
203
Dennis JM, Shields BM, Henley WE, Jones AG, Hattersley AT. Disease progression and treatment response in data-driven subgroups of type 2 diabetes compared with models based on simple clinical features: an analysis using clinical trial data. Lancet Diabetes Endocrinol 2019; 7:442-451. [PMID: 31047901 PMCID: PMC6520497 DOI: 10.1016/s2213-8587(19)30087-7] [Citation(s) in RCA: 266] [Impact Index Per Article: 44.3] [Received: 01/09/2019] [Revised: 02/20/2019] [Accepted: 02/21/2019] [Indexed: 02/08/2023]
Abstract
BACKGROUND Research using data-driven cluster analysis has proposed five subgroups of diabetes with differences in diabetes progression and risk of complications. We aimed to compare the clinical utility of this subgroup-based approach for predicting patient outcomes with an alternative strategy of developing models for each outcome using simple patient characteristics. METHODS We identified five clusters in the ADOPT trial (n=4351) using the same data-driven cluster analysis as reported by Ahlqvist and colleagues. Differences between clusters in glycaemic and renal progression were investigated and contrasted with stratification using simple continuous clinical features (age at diagnosis for glycaemic progression and baseline renal function for renal progression). We compared the effectiveness of a strategy of selecting glucose-lowering therapy using clusters with one combining simple clinical features (sex, BMI, age at diagnosis, baseline HbA1c) in an independent trial cohort (RECORD [n=4447]). FINDINGS Clusters identified in trial data were similar to those described in the original study by Ahlqvist and colleagues. Clusters showed differences in glycaemic progression, but a model using age at diagnosis alone explained a similar amount of variation in progression. We found differences in incidence of chronic kidney disease between clusters; however, estimated glomerular filtration rate at baseline was a better predictor of time to chronic kidney disease. Clusters differed in glycaemic response, with a particular benefit for thiazolidinediones in patients in the severe insulin-resistant diabetes cluster and for sulfonylureas in patients in the mild age-related diabetes cluster. However, simple clinical features outperformed clusters to select therapy for individual patients. 
INTERPRETATION The proposed data-driven clusters differ in diabetes progression and treatment response, but models that are based on simple continuous clinical features are more useful to stratify patients. This finding suggests that precision medicine in type 2 diabetes is likely to have most clinical utility if it is based on an approach of using specific phenotypic measures to predict specific outcomes, rather than assigning patients to subgroups. FUNDING UK Medical Research Council.
Affiliation(s)
- John M Dennis
  - Institute of Biomedical and Clinical Science, Royal Devon and Exeter Hospital, University of Exeter Medical School, Exeter, UK
- Beverley M Shields
  - Institute of Biomedical and Clinical Science, Royal Devon and Exeter Hospital, University of Exeter Medical School, Exeter, UK
- William E Henley
  - Health Statistics Group, Institute of Health Research, Royal Devon and Exeter Hospital, University of Exeter Medical School, Exeter, UK
- Angus G Jones
  - Institute of Biomedical and Clinical Science, Royal Devon and Exeter Hospital, University of Exeter Medical School, Exeter, UK
- Andrew T Hattersley
  - Institute of Biomedical and Clinical Science, Royal Devon and Exeter Hospital, University of Exeter Medical School, Exeter, UK
204
Wang X, Zhang Y, Hao S, Zheng L, Liao J, Ye C, Xia M, Wang O, Liu M, Weng CH, Duong SQ, Jin B, Alfreds ST, Stearns F, Kanov L, Sylvester KG, Widen E, McElhinney DB, Ling XB. Prediction of the 1-Year Risk of Incident Lung Cancer: Prospective Study Using Electronic Health Records from the State of Maine. J Med Internet Res 2019; 21:e13260. [PMID: 31099339 PMCID: PMC6542253 DOI: 10.2196/13260] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 12/31/2018] [Revised: 04/18/2019] [Accepted: 04/23/2019] [Indexed: 02/05/2023]
Abstract
BACKGROUND Lung cancer is the leading cause of cancer death worldwide. Early detection of individuals at risk of lung cancer is critical to reduce the mortality rate. OBJECTIVE The aim of this study was to develop and validate a prospective risk prediction model to identify patients at risk of new incident lung cancer within the next 1 year in the general population. METHODS Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. The study population consisted of patients with at least one EHR between April 1, 2016, and March 31, 2018, who had no history of lung cancer. A retrospective cohort (N=873,598) and a prospective cohort (N=836,659) were formed for model construction and validation. An Extreme Gradient Boosting (XGBoost) algorithm was adopted to build the model. It assigned a score to each individual to quantify the probability of a new incident lung cancer diagnosis from October 1, 2016, to September 30, 2017. The model was trained with the clinical profile in the retrospective cohort from the preceding 6 months and validated with the prospective cohort to predict the risk of incident lung cancer from April 1, 2017, to March 31, 2018. RESULTS The model had an area under the curve (AUC) of 0.881 (95% CI 0.873-0.889) in the prospective cohort. Two thresholds of 0.0045 and 0.01 were applied to the predictive scores to stratify the population into low-, medium-, and high-risk categories. The incidence of lung cancer in the high-risk category (579/53,922, 1.07%) was 7.7 times higher than that in the overall cohort (1167/836,659, 0.14%). Age, a history of pulmonary diseases and other chronic diseases, medications for mental disorders, and social disparities were found to be associated with new incident lung cancer. CONCLUSIONS We retrospectively developed and prospectively validated an accurate risk prediction model of new incident lung cancer occurring in the next 1 year.
Through statistical learning from the statewide EHR data in the preceding 6 months, our model was able to identify statewide high-risk patients, which will benefit the population health through establishment of preventive interventions or more intensive surveillance.
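The stratification and the 7.7-fold figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the two cutoffs and the case counts come from the abstract, while the function names and the handling of scores that fall exactly on a cutoff are assumptions.

```python
LOW_MEDIUM_CUTOFF = 0.0045   # lower threshold quoted in the abstract
MEDIUM_HIGH_CUTOFF = 0.01    # upper threshold quoted in the abstract


def risk_category(score: float) -> str:
    """Map a predictive score to a risk band.

    Assumption: a score exactly on a cutoff goes to the higher band;
    the abstract does not say how ties are handled.
    """
    if score >= MEDIUM_HIGH_CUTOFF:
        return "high"
    if score >= LOW_MEDIUM_CUTOFF:
        return "medium"
    return "low"


def incidence(cases: int, population: int) -> float:
    """Fraction of the population diagnosed within the prediction window."""
    return cases / population


high_risk = incidence(579, 53_922)     # high-risk category
overall = incidence(1_167, 836_659)    # whole prospective cohort
relative = high_risk / overall

print(f"{high_risk:.2%} {overall:.2%} {relative:.1f}")  # 1.07% 0.14% 7.7
```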
Affiliation(s)
- Xiaofang Wang
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
  - Department of Surgery, Stanford University, Stanford, CA, United States
- Yan Zhang
  - Department of Oncology, The First Hospital of Shijiazhuang, Shijiazhuang, China
- Shiying Hao
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States
  - Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States
- Le Zheng
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States
  - Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States
- Jiayu Liao
  - Department of Bioengineering, University of California, Riverside, CA, United States
  - West China-California Multiomics Research Center, West China Hospital, Sichuan University, Chengdu, China
- Chengyin Ye
  - Department of Health Management, Hangzhou Normal University, Hangzhou, China
- Minjie Xia
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Oliver Wang
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Modi Liu
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Ching Ho Weng
  - Department of Surgery, Stanford University, Stanford, CA, United States
- Son Q Duong
  - Lucile Packard Children's Hospital, Palo Alto, CA, United States
- Bo Jin
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Frank Stearns
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Laura Kanov
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Karl G Sylvester
  - Department of Surgery, Stanford University, Stanford, CA, United States
- Eric Widen
  - Healthcare Business Intelligence Solutions Inc, Palo Alto, CA, United States
- Doff B McElhinney
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States
  - Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States
- Xuefeng B Ling
  - Department of Surgery, Stanford University, Stanford, CA, United States
  - Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Palo Alto, CA, United States
205
Seetharam K, Shrestha S, Sengupta PP. Artificial Intelligence in Cardiovascular Medicine. Current Treatment Options in Cardiovascular Medicine 2019; 21:25. [PMID: 31089906 PMCID: PMC7561035 DOI: 10.1007/s11936-019-0728-1] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 12/12/2022]
Abstract
PURPOSE OF REVIEW The ripples of artificial intelligence are being felt in various sectors of human life. Machine learning, a subset of artificial intelligence, extracts information from large databases and is gaining traction in various fields of cardiology. In this review, we highlight noteworthy examples of machine learning utilization in echocardiography, nuclear cardiology, computed tomography, and magnetic resonance imaging over the past year. RECENT FINDINGS In the past year, machine learning (ML) has expanded its boundaries in cardiology with several positive results. Some studies have integrated clinical and imaging information to further augment the accuracy of these ML algorithms. All the studies mentioned in this review have demonstrated superior results of ML relative to conventional approaches for identifying obstructions or predicting major adverse events. As the influx of data arriving from gradually evolving technologies in health care and wearable devices continues to grow more complex, ML may serve as the bridge across the gap between health care and patients in the future. To facilitate a seamless transition between the two, a few issues must be resolved for a successful implementation of ML in health care.
Affiliation(s)
- Karthik Seetharam
  - WVU Heart & Vascular Institute, 1 Medical Center Drive, Morgantown, WV, 26506, USA
- Sirish Shrestha
  - WVU Heart & Vascular Institute, 1 Medical Center Drive, Morgantown, WV, 26506, USA
- Partho P Sengupta
  - WVU Heart & Vascular Institute, 1 Medical Center Drive, Morgantown, WV, 26506, USA
206
Liao X, Kerr D, Morales J, Duncan I. Application of Machine Learning to Identify Clustering of Cardiometabolic Risk Factors in U.S. Adults. Diabetes Technol Ther 2019; 21:245-253. [PMID: 30969131 DOI: 10.1089/dia.2018.0390] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/22/2023]
Abstract
Aims: The aim of this study is to compare several machine learning methods with traditional parametric statistical analysis using logistic regression to investigate the relationship of risk factors for diabetes and cardiovascular disease (cardiometabolic risk) in U.S. adults, using cross-sectional data from participants in a wellness improvement program. Methods: Logistic regression was used to find the relationship between individual risk factors (predictors) and cardiometabolic risk. Supervised machine learning methods were used to predict risk and produce a ranking of variables' importance. A clustering method was used to identify subpopulations of interest. Predictors were divided into those that are nonmodifiable and those that are modifiable. Results: The population comprised 217,254 adults, of whom 8.1% had diabetes. Using logistic regression, six variables were found to be negatively related and eleven positively related to cardiometabolic risk. Three supervised machine learning classifiers (random forest, gradient boosting, and bagging) were applied, with an average AUC of 0.806. Each classifier also produced a ranking of variables' importance. Four subgroups were identified with a k-medoid clustering algorithm, which were mainly distinguished by gender and diabetes status. Conclusions: The study illustrates that machine learning is an important addition to traditional logistic regression for identifying important cardiometabolic risk factors, ranking their importance, and indicating the potential for interventions based on lifestyle and medications at an individual level.
Affiliation(s)
- Xiyue Liao
  - Department of Statistics and Applied Probability, University of California Santa Barbara, Santa Barbara, California
- David Kerr
  - Sansum Diabetes Research Institute, Santa Barbara, California
- Ian Duncan
  - Department of Statistics and Applied Probability, University of California Santa Barbara, Santa Barbara, California
207
Jia Z, Lu X, Duan H, Li H. Using the distance between sets of hierarchical taxonomic clinical concepts to measure patient similarity. BMC Med Inform Decis Mak 2019; 19:91. [PMID: 31023325 PMCID: PMC6485152 DOI: 10.1186/s12911-019-0807-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 12/07/2017] [Accepted: 04/01/2019] [Indexed: 11/10/2022]
Abstract
Background Many clinical concepts are standardized under a categorical and hierarchical taxonomy such as ICD-10 or ATC. These taxonomic clinical concepts provide insight into semantic meaning and similarity among clinical concepts and have been applied to patient similarity measures. However, the effects of diverse set sizes of taxonomic clinical concepts contributing to similarity at the patient level have not been well studied. Methods In this paper the most widely used taxonomic clinical concept system, ICD-10, was studied as a representative taxonomy. The distance between ICD-10-coded diagnosis sets is an integrated estimation of the information content of each concept, the similarity between each pair of concepts and the similarity between the sets of concepts. We proposed a novel set-level similarity method to calculate the distance between sets of hierarchical taxonomic clinical concepts to measure patient similarity. A real-world clinical dataset with ICD-10-coded diagnoses and hospital length of stay (HLOS) information was used to evaluate the performance of various algorithms and their combinations in predicting whether a patient needs long-term hospitalization or not. Four subpopulation prototypes, defined based on age and HLOS and with different diagnosis set sizes, were used as the target for similarity analysis. The F-score was used to evaluate the performance of different algorithms by controlling other factors. We also evaluated the effect of prototype set size on prediction precision. Results The results identified the strengths and weaknesses of different algorithms to compute information content, code-level similarity and set-level similarity under different contexts, such as set size and concept set background. The minimum weighted bipartite matching approach, which has not been fully recognized previously, showed unique advantages in measuring concept-based patient similarity.
Conclusions This study provides a systematic benchmark evaluation of previous algorithms and novel algorithms used in taxonomic concepts-based patient similarity, and it provides the basis for selecting appropriate methods under different clinical scenarios. Electronic supplementary material The online version of this article (10.1186/s12911-019-0807-y) contains supplementary material, which is available to authorized users.
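To make the set-level idea concrete, the sketch below computes a distance between two sets of ICD-10 codes by finding the cheapest one-to-one pairing of codes. It is a minimal sketch under stated assumptions rather than the paper's algorithm: the prefix-based `code_distance`, the penalty of 1.0 for each unmatched code, the normalization and both function names are hypothetical, and the matching is brute-forced with `itertools.permutations`, which is only practical for small sets.

```python
from itertools import permutations


def code_distance(a: str, b: str) -> float:
    """Toy code-level distance: 1 minus the shared-prefix fraction of the longer code.

    Assumption: codes that share a longer prefix sit closer in the hierarchy.
    """
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return 1.0 - shared / max(len(a), len(b))


def set_distance(codes_a, codes_b) -> float:
    """Set-level distance via a minimum-cost one-to-one matching of codes.

    Every code in the smaller set is paired with a distinct code in the
    larger set; each leftover code costs 1.0 (an assumed penalty). The
    total is divided by the size of the larger set to stay within [0, 1].
    """
    small, large = sorted((list(codes_a), list(codes_b)), key=len)
    if not large:
        return 0.0  # two empty diagnosis sets are treated as identical
    best = min(
        sum(code_distance(a, b) for a, b in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    unmatched = len(large) - len(small)
    return (best + unmatched) / len(large)


print(set_distance(["I10", "E11"], ["E11", "I10"]))  # 0.0
print(set_distance(["I10"], ["I10", "E11"]))         # 0.5
```

For realistic set sizes, an assignment solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force search.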
Affiliation(s)
- Zheng Jia
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Xudong Lu
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Huilong Duan
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Haomin Li
  - The Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - The Institute of Translational Medicine, Zhejiang University, Hangzhou, China
208
Virtual Care 2.0—a Vision for the Future of Data-Driven Technology-Enabled Healthcare. Current Treatment Options in Cardiovascular Medicine 2019; 21:21. [DOI: 10.1007/s11936-019-0727-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 10/27/2022]
209
Liang Y, Kelemen A. Dynamic modeling and network approaches for omics time course data: overview of computational approaches and applications. Brief Bioinform 2019; 19:1051-1068. [PMID: 28430854 DOI: 10.1093/bib/bbx036] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/21/2016] [Indexed: 12/23/2022]
Abstract
Inferring networks and dynamics of genes, proteins, cells and other biological entities from high-throughput biological omics data is a central and challenging issue in computational and systems biology. This is essential for understanding the complexity of human health, disease susceptibility and pathogenesis for Predictive, Preventive, Personalized and Participatory (P4) systems and precision medicine. The delineation of the possible interactions of all genes/proteins in a genome/proteome is a task for which conventional experimental techniques are ill suited. Urgently needed are rapid and inexpensive computational and statistical methods that can identify, out of thousands, interacting candidate disease genes or drug targets that can be further investigated or validated by experimentation. Moreover, identifying biological dynamic systems while simultaneously estimating the important kinetic, structural and functional parameters, which may not be experimentally accessible, could be an important direction for drug-disease-gene network studies. In this article, we present an overview and comparison of recent developments in dynamic modeling and network approaches for time-course omics data, and their applications to various biological systems, health conditions and disease statuses. Moreover, various data reduction and analytical schemes, ranging from mathematical to computational to statistical methods, are compared, including their merits, drawbacks and limitations. The most recent software, associated web resources and other potentials for the compared methods are also presented and discussed in detail.
Affiliation(s)
- Yulan Liang
  - Department of Family and Community Health, University of Maryland, Baltimore, MD, USA
- Arpad Kelemen
  - Department of Family and Community Health, University of Maryland, Baltimore, MD, USA
210
Dahl A, Cai N, Ko A, Laakso M, Pajukanta P, Flint J, Zaitlen N. Reverse GWAS: Using genetics to identify and model phenotypic subtypes. PLoS Genet 2019; 15:e1008009. [PMID: 30951530 PMCID: PMC6469799 DOI: 10.1371/journal.pgen.1008009] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 10/05/2018] [Revised: 04/17/2019] [Accepted: 02/07/2019] [Indexed: 12/16/2022]
Abstract
Recent and classical work has revealed biologically and medically significant subtypes in complex diseases and traits. However, relevant subtypes are often unknown, unmeasured, or actively debated, making automated statistical approaches to subtype definition valuable. We propose reverse GWAS (RGWAS) to identify and validate subtypes using genetics and multiple traits: while GWAS seeks the genetic basis of a given trait, RGWAS seeks to define trait subtypes with distinct genetic bases. Unlike existing approaches relying on off-the-shelf clustering methods, RGWAS uses a novel decomposition, MFMR, to model covariates, binary traits, and population structure. We use extensive simulations to show that modelling these features can be crucial for power and calibration. We validate RGWAS in practice by recovering a recently discovered stress subtype in major depression. We then show the utility of RGWAS by identifying three novel subtypes of metabolic traits. We biologically validate these metabolic subtypes with SNP-level tests and a novel polygenic test: the former recover known metabolic GxE SNPs; the latter suggests subtypes may explain substantial missing heritability. Crucially, statins, which are widely prescribed and theorized to increase diabetes risk, have opposing effects on blood glucose across metabolic subtypes, suggesting the subtypes have potential translational value.
Affiliation(s)
- Andy Dahl
  - Department of Medicine, UCSF, San Francisco, California, United States of America
- Na Cai
  - Wellcome Sanger Institute, Cambridge, United Kingdom
  - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Arthur Ko
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, California, United States of America
- Markku Laakso
  - Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
  - Kuopio University Hospital, Kuopio, Finland
- Päivi Pajukanta
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, California, United States of America
- Jonathan Flint
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, UCLA, Los Angeles, California, United States of America
- Noah Zaitlen
  - Department of Medicine, UCSF, San Francisco, California, United States of America
211
Dey D, Slomka PJ, Leeson P, Comaniciu D, Shrestha S, Sengupta PP, Marwick TH. Artificial Intelligence in Cardiovascular Imaging: JACC State-of-the-Art Review. J Am Coll Cardiol 2019; 73:1317-1335. [PMID: 30898208 PMCID: PMC6474254 DOI: 10.1016/j.jacc.2018.12.054] [Citation(s) in RCA: 383] [Impact Index Per Article: 63.8] [Received: 10/31/2018] [Accepted: 12/13/2018] [Indexed: 12/11/2022]
Abstract
Data science is likely to lead to major changes in cardiovascular imaging. Problems with timing, efficiency, and missed diagnoses occur at all stages of the imaging chain. The application of artificial intelligence (AI) is dependent on robust data; the application of appropriate computational approaches and tools; and validation of its clinical application to image segmentation, automated measurements, and eventually, automated diagnosis. AI may reduce cost and improve value at the stages of image acquisition, interpretation, and decision-making. Moreover, the precision now possible with cardiovascular imaging, combined with "big data" from the electronic health record and pathology, is likely to better characterize disease and personalize therapy. This review summarizes recent promising applications of AI in cardiology and cardiac imaging, which potentially add value to patient care.
Affiliation(s)
- Damini Dey
- Departments of Biomedical Sciences and Medicine, Cedars-Sinai Medical Center, Biomedical Imaging Research Institute, Los Angeles, California
| | - Piotr J Slomka
- Departments of Biomedical Sciences and Medicine, Cedars-Sinai Medical Center, Biomedical Imaging Research Institute, Los Angeles, California
| | - Paul Leeson
- Oxford Cardiovascular Clinical Research Facility, Radcliffe Department of Medicine, University of Oxford, Oxford, United Kingdom
| | | | - Sirish Shrestha
- Section of Cardiology, West Virginia University, Morgantown, West Virginia
| | - Partho P Sengupta
- Section of Cardiology, West Virginia University, Morgantown, West Virginia
| | - Thomas H Marwick
- Baker Heart and Diabetes Research Institute, Melbourne, Australia.
| |
|
212
|
A Comparative Review of Manifold Learning Techniques for Hyperspectral and Polarimetric SAR Image Fusion. REMOTE SENSING 2019. [DOI: 10.3390/rs11060681] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In remote sensing, hyperspectral and polarimetric synthetic aperture radar (PolSAR) images are the two most versatile data sources for a wide range of applications such as land use land cover classification. However, the fusion of these two data sources has received less attention than many others, because of their scarce data availability and the relatively challenging fusion task caused by their distinct imaging geometries. Among the existing fusion methods, including manifold learning-based, kernel-based, ensemble-based, and matrix factorization methods, manifold learning is one of the most celebrated techniques for the fusion of heterogeneous data. Therefore, this paper aims to promote research in hyperspectral and PolSAR data fusion by providing a comprehensive comparison of existing manifold learning-based fusion algorithms. We conducted experiments on 16 state-of-the-art manifold learning algorithms that embrace two important research questions in manifold learning-based fusion of hyperspectral and PolSAR data: (1) in which domain should the data be aligned, the data domain or the manifold domain; and (2) how to make use of existing labeled data when formulating a graph to represent a manifold: supervised, semi-supervised, or unsupervised. The performance of the algorithms was evaluated via multiple accuracy metrics of land use land cover classification over two data sets. Results show that the algorithms based on manifold alignment generally outperform those based on data alignment (data concatenation). Semi-supervised manifold alignment fusion algorithms perform the best among all. Experiments using multiple classifiers show that they outperform the benchmark data alignment-based algorithms by ca. 3% in terms of the overall classification accuracy.
|
213
|
Pai S, Hui S, Isserlin R, Shah MA, Kaka H, Bader GD. netDx: interpretable patient classification using integrated patient similarity networks. Mol Syst Biol 2019; 15:e8497. [PMID: 30872331 PMCID: PMC6423721 DOI: 10.15252/msb.20188497] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
Patient classification has widespread biomedical and clinical applications, including diagnosis, prognosis, and treatment response prediction. A clinically useful prediction algorithm should be accurate, generalizable, be able to integrate diverse data types, and handle sparse data. A clinical predictor based on genomic data needs to be interpretable to drive hypothesis‐driven research into new treatments. We describe netDx, a novel supervised patient classification framework based on patient similarity networks, which meets these criteria. In a cancer survival benchmark dataset integrating up to six data types in four cancer types, netDx significantly outperforms most other machine‐learning approaches across most cancer types. Compared to traditional machine‐learning‐based patient classifiers, netDx results are more interpretable, visualizing the decision boundary in the context of patient similarity space. When patient similarity is defined by pathway‐level gene expression, netDx identifies biological pathways important for outcome prediction, as demonstrated in breast cancer and asthma. netDx can serve as a patient classifier and as a tool for discovery of biological features characteristic of disease. We provide a free software implementation of netDx with automation workflows.
Affiliation(s)
- Shraddha Pai
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Affiliate Scientist, The Centre for Addiction and Mental Health, Toronto, ON, Canada
| | - Shirley Hui
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
| | - Ruth Isserlin
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
| | - Muhammad A Shah
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
| | - Hussam Kaka
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
| | - Gary D Bader
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada
| |
|
215
|
Cho SB, Kim SC, Chung MG. Identification of novel population clusters with different susceptibilities to type 2 diabetes and their impact on the prediction of diabetes. Sci Rep 2019; 9:3329. [PMID: 30833619 PMCID: PMC6399283 DOI: 10.1038/s41598-019-40058-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2018] [Accepted: 02/05/2019] [Indexed: 01/10/2023] Open
Abstract
Type 2 diabetes is one of the subtypes of diabetes. However, previous studies have revealed its heterogeneous features. Here, we hypothesized that there would be heterogeneity in its development, resulting in higher susceptibility in some populations. We performed risk-factor based clustering (RFC), which is a hierarchical clustering of the population with profiles of five known risk factors for type 2 diabetes (age, gender, body mass index, hypertension, and family history of diabetes). The RFC identified six population clusters with significantly different prevalence rates of type 2 diabetes in the discovery data (N = 10,023), ranging from 0.09 to 0.44 (Chi-square test, P < 0.001). The machine learning method identified six clusters in the validation data (N = 215,083), which also showed the heterogeneity of prevalence between the clusters (P < 0.001). In addition to the prevalence of type 2 diabetes, the clusters showed different clinical features including biochemical profiles and prediction performance with the risk factors. Our results seem to implicate a heterogeneous mechanism in the development of type 2 diabetes. These results will provide new insights for the development of more precise management strategies for type 2 diabetes.
Affiliation(s)
- Seong Beom Cho
- Division of Biomedical Informatics, National Institute of Health, KCDC, Cheongju-si, Chungcheongbuk-do, 28159, Republic of Korea.
| | - Sang Cheol Kim
- Division of Biomedical Informatics, National Institute of Health, KCDC, Cheongju-si, Chungcheongbuk-do, 28159, Republic of Korea
| | - Myung Guen Chung
- Division of Biomedical Informatics, National Institute of Health, KCDC, Cheongju-si, Chungcheongbuk-do, 28159, Republic of Korea
| |
|
216
|
Abstract
PURPOSE OF REVIEW The purpose of this review was to summarize recent advances in the genomics of type 2 diabetes (T2D) and to highlight current initiatives to advance precision health. RECENT FINDINGS Generation of multi-omic data to measure each of the "biologic layers," developments in describing genomic function and annotation in T2D relevant tissue, along with the increasing recognition that T2D is a heterogeneous disease, and large-scale collaborations have all contributed to advancing our understanding of the molecular basis of T2D. Substantial advances have been made in understanding the molecular basis of T2D pathogenesis, such that precision health diabetes is increasingly becoming a reality. For precision diabetes to become a routine in clinical and public health, additional large-scale multi-omic initiatives are needed along with better assessment of our environment to delineate an individual's diabetes subtype for improved detection and management.
Affiliation(s)
- Yuan Lin
- Department of Epidemiology, Richard M. Fairbanks School of Public Health, Indiana University, Indianapolis, IN, USA
| | - Jennifer Wessel
- Department of Epidemiology, Richard M. Fairbanks School of Public Health, Indiana University, Indianapolis, IN, USA.
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA.
- Diabetes Translational Research Center, Indiana University School of Medicine, Indianapolis, IN, USA.
| |
|
217
|
Elksnis A, Martinell M, Eriksson O, Espes D. Heterogeneity of Metabolic Defects in Type 2 Diabetes and Its Relation to Reactive Oxygen Species and Alterations in Beta-Cell Mass. Front Physiol 2019; 10:107. [PMID: 30837889 PMCID: PMC6383038 DOI: 10.3389/fphys.2019.00107] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2018] [Accepted: 01/28/2019] [Indexed: 12/21/2022] Open
Abstract
Type 2 diabetes (T2D) is a complex and heterogeneous disease which affects millions of people worldwide. The classification of diabetes is at an interesting turning point and there have been several recent reports on sub-classification of T2D based on phenotypical and metabolic characteristics. An important, and perhaps so far underestimated, factor in the pathophysiology of T2D is the role of oxidative stress and reactive oxygen species (ROS). There are multiple pathways for excessive ROS formation in T2D and in addition, beta-cells have an inherent deficit in the capacity to cope with oxidative stress. ROS formation could be causal, but also contribute to a large number of the metabolic defects in T2D, including beta-cell dysfunction and loss. Currently, our knowledge on beta-cell mass is limited to autopsy studies and based on comparisons with healthy controls. The combined evidence suggests that beta-cell mass is unaltered at onset of T2D but that it declines progressively. In order to better understand the pathophysiology of T2D, to identify and evaluate novel treatments, there is a need for in vivo techniques able to quantify beta-cell mass. Positron emission tomography holds great potential for this purpose and can in addition map metabolic defects, including ROS activity, in specific tissue compartments. In this review, we highlight the different phenotypical features of T2D and how metabolic defects impact oxidative stress and ROS formation. In addition, we review the literature on alterations of beta-cell mass in T2D and discuss potential techniques to assess beta-cell mass and metabolic defects in vivo.
Affiliation(s)
- Andris Elksnis
- Department of Medical Cell Biology, Uppsala University, Uppsala, Sweden
| | - Mats Martinell
- Department of Public Health and Caring Sciences, Uppsala University, Uppsala, Sweden
| | - Olof Eriksson
- Science for Life Laboratory, Department of Medicinal Chemistry, Uppsala University, Uppsala, Sweden
| | - Daniel Espes
- Department of Medical Cell Biology, Uppsala University, Uppsala, Sweden
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
| |
|
218
|
Network Tomography for Understanding Phenotypic Presentations in Aortic Stenosis. JACC Cardiovasc Imaging 2019; 12:236-248. [DOI: 10.1016/j.jcmg.2018.11.025] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/16/2018] [Revised: 11/15/2018] [Accepted: 11/28/2018] [Indexed: 11/19/2022]
|
219
|
Tozzi A. The multidimensional brain. Phys Life Rev 2019; 31:86-103. [PMID: 30661792 DOI: 10.1016/j.plrev.2018.12.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2017] [Revised: 05/17/2018] [Accepted: 12/27/2018] [Indexed: 01/24/2023]
Abstract
Brain activity takes place in three spatial-plus time dimensions. This rather obvious claim has been recently questioned by papers that, taking into account the big data outburst and novel available computational tools, are starting to unveil a more intricate state of affairs. Indeed, various brain activities and their correlated mental functions can be assessed in terms of trajectories embedded in phase spaces of dimensions higher than the canonical ones. In this review, I show how further dimensions may not just represent a convenient methodological tool that allows a better mathematical treatment of otherwise elusive cortical activities, but may also reflect genuine functional or anatomical relationships among real nervous functions. I then describe how to extract hidden multidimensional information from real or artificial neurodata series, and make clear how our mind dilutes, rather than concentrates as currently believed, inputs coming from the environment. Finally, I argue that the principle "the higher the dimension, the greater the information" may explain the occurrence of mental activities and elucidate the mechanisms of human diseases associated with dimensionality reduction.
Affiliation(s)
- Arturo Tozzi
- Center for Nonlinear Science, University of North Texas, 1155 Union Circle, #311427 Denton, TX 76203-5017, USA.
| |
|
220
|
Pendergrass SA, Crawford DC. Using Electronic Health Records To Generate Phenotypes For Research. CURRENT PROTOCOLS IN HUMAN GENETICS 2019; 100:e80. [PMID: 30516347 PMCID: PMC6318047 DOI: 10.1002/cphg.80] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Electronic health records contain patient-level data collected during and for clinical care. Data within the electronic health record include diagnostic billing codes, procedure codes, vital signs, laboratory test results, clinical imaging, and physician notes. With repeated clinic visits, these data are longitudinal, providing important information on disease development, progression, and response to treatment or intervention strategies. The near universal adoption of electronic health records nationally has the potential to provide population-scale real-world clinical data accessible for biomedical research, including genetic association studies. For this research potential to be realized, high-quality research-grade variables must be extracted from these clinical data warehouses. We describe here common and emerging electronic phenotyping approaches applied to electronic health records, as well as current limitations of both the approaches and the biases associated with these clinically collected data that impact their use in research. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Sarah A. Pendergrass
- Biomedical and Translational Informatics Institute, Geisinger Research, Rockville MD
| | - Dana C. Crawford
- Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH
| |
|
221
|
Dagliati A, Geifman N, Peek N, Holmes JH, Sacchi L, Sajjadi SE, Tucker A. Inferring Temporal Phenotypes with Topological Data Analysis and Pseudo Time-Series. Artif Intell Med 2019. [DOI: 10.1007/978-3-030-21642-9_50] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
222
|
Enhanced Molecular Appreciation of Psychiatric Disorders Through High-Dimensionality Data Acquisition and Analytics. Methods Mol Biol 2019; 2011:671-723. [PMID: 31273728 DOI: 10.1007/978-1-4939-9554-7_39] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
The initial diagnosis, molecular investigation, treatment, and posttreatment care of major psychiatric disorders (schizophrenia and bipolar depression) are all still significantly hindered by the current inability to define these disorders in an explicit molecular signaling manner. High-dimensionality data analytics, using large datastreams from transcriptomic, proteomic, or metabolomic investigations, will likely advance both the appreciation of the molecular nature of major psychiatric disorders and simultaneously enhance our ability to more efficiently diagnose and treat these debilitating conditions. High-dimensionality data analysis in psychiatric research has been heterogeneous in aims and methods and limited by insufficient sample sizes, poorly defined case definitions, methodological inhomogeneity, and confounding results. All of these issues combine to constrain the conclusions that can be extracted from them. Here, we discuss possibilities for overcoming methodological challenges through the implementation of transcriptomic, proteomic, or metabolomics signatures in psychiatric diagnosis and offer an outlook for future investigations. To fulfill the promise of intelligent high-dimensionality data-based differential diagnosis in mental disease diagnosis and treatment, future research will need large, well-defined cohorts in combination with state-of-the-art technologies.
|
223
|
Delma MI. The Quest for Type 2 Diabetes Subgroups Identification: Literature Review for a New Subtype Proposal. Cureus 2018; 10:e3770. [PMID: 30820389 PMCID: PMC6389034 DOI: 10.7759/cureus.3770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2018] [Accepted: 12/24/2018] [Indexed: 11/05/2022] Open
Abstract
Type 2 diabetes is typically considered a heterogeneous disease that encompasses potentially different subtypes with distinct pathophysiological mechanisms and/or susceptibility to complications. Some authors have succeeded in identifying some of these subgroups, but a lot of work remains to be completed. Given the effects of the sympathetic innervation via alpha 1 adrenoceptors on diabetes target organs and the interindividual variability of this receptor's sensitivity, the existence of a subtype of type 2 diabetes with hyperactivation of alpha 1 adrenoceptors (HA1A) is proposed. Based on the literature review, the potential characteristics of this phenotype and its susceptibility to certain complications have been identified in this article.
|
224
|
Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform 2018; 19:1236-1246. [PMID: 28481991 PMCID: PMC6455466 DOI: 10.1093/bib/bbx044] [Citation(s) in RCA: 883] [Impact Index Per Article: 126.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2016] [Revised: 02/19/2017] [Indexed: 02/07/2023] Open
Abstract
Gaining knowledge and actionable insights from complex, high-dimensional and heterogeneous biomedical data remains a key challenge in transforming health care. Various types of data have been emerging in modern biomedical research, including electronic health records, imaging, -omics, sensor data and text, which are complex, heterogeneous, poorly annotated and generally unstructured. Traditional data mining and statistical learning approaches typically need to first perform feature engineering to obtain effective and more robust features from those data, and then build prediction or clustering models on top of them. Both steps pose many challenges when the data are complicated and sufficient domain knowledge is lacking. The latest advances in deep learning technologies provide new effective paradigms to obtain end-to-end learning models from complex data. In this article, we review the recent literature on applying deep learning technologies to advance the health care domain. Based on the analyzed work, we suggest that deep learning approaches could be the vehicle for translating big biomedical data into improved human health. However, we also note limitations and needs for improved methods development and applications, especially in terms of ease-of-understanding for domain experts and citizen scientists. We discuss such challenges and suggest developing holistic and meaningful interpretable architectures to bridge deep learning models and human interpretability.
Affiliation(s)
- Riccardo Miotto
- Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences at the Icahn School of Medicine at Mount Sinai, New York, NY
| | - Fei Wang
- Division of Health Informatics, Department of Healthcare Policy and Research at Weill Cornell Medicine at Cornell University, New York, NY
| | - Shuang Wang
- Department of Biomedical Informatics at the University of California San Diego, La Jolla, CA
| | - Xiaoqian Jiang
- Department of Biomedical Informatics at the University of California San Diego, La Jolla, CA
| | - Joel T Dudley
- Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences at the Icahn School of Medicine at Mount Sinai, New York, NY
| |
|
225
|
Li J, Akil O, Rouse SL, McLaughlin CW, Matthews IR, Lustig LR, Chan DK, Sherr EH. Deletion of Tmtc4 activates the unfolded protein response and causes postnatal hearing loss. J Clin Invest 2018; 128:5150-5162. [PMID: 30188326 DOI: 10.1172/jci97498] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2017] [Accepted: 08/30/2018] [Indexed: 12/16/2022] Open
Abstract
Hearing loss is a significant public health concern, affecting over 250 million people worldwide. Both genetic and environmental etiologies are linked to hearing loss, but in many cases the underlying cellular pathophysiology is not well understood, highlighting the importance of further discovery. We found that inactivation of the gene Tmtc4 (transmembrane and tetratricopeptide repeat 4), which was broadly expressed in the mouse cochlea, caused acquired hearing loss in mice. Our data showed Tmtc4 enriched in the endoplasmic reticulum, and that it functioned by regulating Ca2+ dynamics and the unfolded protein response (UPR). Given this genetic linkage of the UPR to hearing loss, we demonstrated a direct link between the more common noise-induced hearing loss (NIHL) and the UPR. These experiments suggested a novel approach to treatment. We demonstrated that the small-molecule UPR and stress response modulator ISRIB (integrated stress response inhibitor), which activates eIF2B, prevented NIHL in a mouse model. Moreover, in an inverse genetic complementation approach, we demonstrated that mice with homozygous inactivation of both Tmtc4 and Chop had less hearing loss than knockout of Tmtc4 alone. This study implicated a novel mechanism for hearing impairment, highlighting a potential treatment approach for a broad range of human hearing loss disorders.
Affiliation(s)
| | - Omar Akil
- Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco (UCSF), San Francisco, California, USA
| | - Stephanie L Rouse
- Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco (UCSF), San Francisco, California, USA
| | - Conor W McLaughlin
- Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco (UCSF), San Francisco, California, USA
| | - Ian R Matthews
- Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco (UCSF), San Francisco, California, USA
| | - Lawrence R Lustig
- Department of Otolaryngology - Head and Neck Surgery, College of Physicians and Surgeons, Columbia University and New York Presbyterian Hospital, New York, New York, USA
| | - Dylan K Chan
- Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco (UCSF), San Francisco, California, USA
| | - Elliott H Sherr
- Department of Neurology and Department of Pediatrics, Institute of Human Genetics, Weill Institute for Neurosciences, UCSF, San Francisco, California, USA
| |
|
226
|
Identify and monitor clinical variation using machine intelligence: a pilot in colorectal surgery. J Clin Monit Comput 2018; 33:725-731. [PMID: 30251058 DOI: 10.1007/s10877-018-0200-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2018] [Accepted: 09/17/2018] [Indexed: 11/27/2022]
Abstract
Standardized clinical pathways are a useful tool to reduce variation in clinical management and may improve quality of care. However, the evidence supporting a specific clinical pathway for a patient or patient population is often imperfect, limiting the adoption and efficacy of clinical pathways. Machine intelligence can potentially identify clinical variation and may provide useful insights to create and optimize clinical pathways. In this quality improvement project, we analyzed the inpatient care of 1786 patients undergoing colorectal surgery from 2015 to 2016 across multiple Ohio hospitals in the Cleveland Clinic System. Data from four information subsystems were loaded into the Clinical Variation Management (CVM) application (Ayasdi, Inc., Menlo Park, CA). The CVM application uses machine intelligence and topological data analysis methods to identify groups of similar patients based on the treatment received. We defined "favorable performance" as groups with lower direct variable cost, lower length of stay, and lower 30-day readmissions. The software auto-generated 9 distinct groups of patients based on similarity analysis. Overall, favorable performance was seen with ketorolac use, lower intra-operative fluid use (< 2000 cc) and surgery for cancer. Multiple sub-groups were easily created and analyzed. Adherence reporting tools were easy to use, enabling almost real-time monitoring. Machine intelligence provided useful insights to create and monitor care pathways, with several advantages over traditional analytic approaches including: (1) analysis across disparate data sets, (2) unsupervised discovery, (3) speed and auto-generation of clinical pathways, (4) ease of use by team members, and (5) adherence reporting.
|
227
|
Pai S, Bader GD. Patient Similarity Networks for Precision Medicine. J Mol Biol 2018; 430:2924-2938. [PMID: 29860027 PMCID: PMC6097926 DOI: 10.1016/j.jmb.2018.05.037] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2018] [Revised: 05/24/2018] [Accepted: 05/29/2018] [Indexed: 02/08/2023]
Abstract
Clinical research and practice in the 21st century is poised to be transformed by analysis of computable electronic medical records and population-level genome-scale patient profiles. Genomic data capture genetic and environmental state, providing information on heterogeneity in disease and treatment outcome, but genomic-based clinical risk scores are limited. Achieving the goal of routine precision medicine that takes advantage of these rich genomics data will require computational methods that support heterogeneous data, have excellent predictive performance, and ideally, provide biologically interpretable results. Traditional machine-learning approaches excel at performance, but often have limited interpretability. Patient similarity networks are an emerging paradigm for precision medicine, in which patients are clustered or classified based on their similarities in various features, including genomic profiles. This strategy is analogous to standard medical diagnosis, has excellent performance, is interpretable, and can preserve patient privacy. We review new methods based on patient similarity networks, including Similarity Network Fusion for patient clustering and netDx for patient classification. While these methods are already useful, much work is required to improve their scalability for contemporary genetic cohorts, optimize parameters, and incorporate a wide range of genomics and clinical data. The coming 5 years will provide an opportunity to assess the utility of network-based algorithms for precision medicine.
Affiliation(s)
- Shraddha Pai
- The Donnelly Centre, University of Toronto, Toronto, Canada
| | - Gary D Bader
- The Donnelly Centre, University of Toronto, Toronto, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Canada; Department of Computer Science, University of Toronto, Toronto, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada.
| |
|
228
|
Udler MS, Kim J, von Grotthuss M, Bonàs-Guarch S, Cole JB, Chiou J, Anderson CD (on behalf of METASTROKE and the ISGC), Boehnke M, Laakso M, Atzmon G, Glaser B, Mercader JM, Gaulton K, Flannick J, Getz G, Florez JC. Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: A soft clustering analysis. PLoS Med 2018; 15:e1002654. [PMID: 30240442 PMCID: PMC6150463 DOI: 10.1371/journal.pmed.1002654] [Citation(s) in RCA: 347] [Impact Index Per Article: 49.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/08/2018] [Accepted: 08/17/2018] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Type 2 diabetes (T2D) is a heterogeneous disease for which (1) disease-causing pathways are incompletely understood and (2) subclassification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper, we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four separate subsets of individuals with T2D. METHODS AND FINDINGS In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization (bNMF) clustering to genome-wide association study (GWAS) results for 94 independent T2D genetic variants and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta cell function, differing from each other by high versus low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity mediated (high body mass index [BMI] and waist circumference [WC]), "lipodystrophy-like" fat distribution (low BMI, adiponectin, and high-density lipoprotein [HDL] cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease (CAD), and stroke. 
We evaluated the potential for clinical impact of these clusters in four studies containing individuals with T2D (Metabolic Syndrome in Men Study [METSIM], N = 487; Ashkenazi, N = 509; Partners Biobank, N = 2,065; UK Biobank [UKBB], N = 14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with approximately 30% of all individuals assigned to just one cluster top decile. Limitations of this study include that the genetic variants used in the cluster analysis were restricted to those associated with T2D in populations of European ancestry. CONCLUSION Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports the use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.
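The soft clustering step described in this abstract can be illustrated with a minimal sketch of plain nonnegative matrix factorization. This is not the Bayesian NMF (bNMF) used in the study: it applies standard multiplicative updates with a fixed number of clusters. The toy z-scores, the splitting of each trait into a positive and a negative column, and all function names are assumptions made for this example.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def sq_error(V, W, H):
    """Squared reconstruction error ||V - WH||^2."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))

def nmf(V, k, iters=300, seed=0, eps=1e-9):
    """Factor a nonnegative n x m matrix V as W (n x k) times H (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def split_signs(Z):
    """Turn signed z-scores into nonnegative columns: [trait+, trait-] per trait."""
    return [[x for z in row for x in (max(z, 0.0), max(-z, 0.0))] for row in Z]

# Hypothetical z-scores: rows are variants, columns are two traits.
Z = [[3.0, 0.0], [2.5, -0.5], [0.0, 3.0], [-0.5, 2.0]]
V = split_signs(Z)
W, H = nmf(V, k=2)
for name, weights in zip(["var1", "var2", "var3", "var4"], W):
    total = sum(weights) or 1.0
    # Normalized weights act as soft cluster memberships for each variant.
    print(name, [round(w / total, 2) for w in weights])
```

Each row of `W` gives a variant's weight in every cluster, so a variant can belong to more than one cluster, which is the sense in which the clustering is soft.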
Affiliation(s)
- Miriam S. Udler
- Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
- Jaegil Kim
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Marcin von Grotthuss
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Sílvia Bonàs-Guarch
- Barcelona Supercomputing Center (BSC), Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona, Spain
- Joanne B. Cole
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Joshua Chiou
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
- Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Markku Laakso
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland and Kuopio University Hospital, Kuopio, Finland
- Gil Atzmon
- Faculty of Natural Sciences, University of Haifa, Haifa, Israel
- Department of Medicine; Albert Einstein College of Medicine, Bronx, New York, United States of America
- Department of Genetics, Institute for Aging Research, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Benjamin Glaser
- Endocrinology and Metabolism Service, Hadassah-Hebrew University Medical Center, Jerusalem, Israel
- Josep M. Mercader
- Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Barcelona Supercomputing Center (BSC), Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona, Spain
- Kyle Gaulton
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
- Jason Flannick
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Genetics, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Jose C. Florez
- Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
|
229
|
Huang C, Yang W, Wang J, Zhou Y, Geng B, Kararigas G, Yang J, Cui Q. The DrugPattern tool for drug set enrichment analysis and its prediction for beneficial effects of oxLDL on type 2 diabetes. J Genet Genomics 2018; 45:389-397. [PMID: 30054214 DOI: 10.1016/j.jgg.2018.07.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/08/2018] [Revised: 05/18/2018] [Accepted: 07/04/2018] [Indexed: 01/01/2023]
Abstract
Enrichment analysis methods, e.g., gene set enrichment analysis, represent one class of important bioinformatics resources for mining patterns in biomedical datasets. However, tools for inferring patterns and rules of a list of drugs are limited. In this study, we developed a web-based tool, DrugPattern, for drug set enrichment analysis. We first collected and curated 7019 drug sets, including indications, adverse reactions, targets, pathways, etc. from public databases. For a list of drugs of interest, DrugPattern then evaluates the significance of the enrichment of these drugs in each of the 7019 drug sets. To validate DrugPattern, we employed it for the prediction of the effects of oxidized low-density lipoprotein (oxLDL), a factor expected to be deleterious. We predicted that oxLDL has beneficial effects on some diseases, most of which were supported by evidence in the literature. Because DrugPattern predicted the potential beneficial effects of oxLDL in type 2 diabetes (T2D), animal experiments were then performed to further verify this prediction. As a result, the experimental evidence validated the DrugPattern prediction that oxLDL indeed has beneficial effects on T2D in the case of energy restriction. These data confirmed the prediction accuracy of our approach and revealed unexpected protective roles for oxLDL in various diseases. This study provides a tool to infer patterns and rules in biomedical datasets based on drug set enrichment analysis. DrugPattern is available at http://www.cuilab.cn/drugpattern.
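To make the idea of drug set enrichment concrete, the following is a minimal sketch that scores the overlap between a query drug list and each curated drug set with a one-sided hypergeometric test. The abstract does not state which statistical test DrugPattern uses, so the choice of test is an assumption, and the drug names, set names, and function names are hypothetical placeholders.

```python
from math import comb

def enrichment_p(n_universe, set_size, query_size, overlap):
    """One-sided hypergeometric tail probability P(X >= overlap): the chance
    of seeing at least `overlap` members of a drug set among `query_size`
    drugs drawn without replacement from a universe of `n_universe` drugs."""
    upper = min(set_size, query_size)
    total = comb(n_universe, query_size)
    tail = sum(comb(set_size, i) * comb(n_universe - set_size, query_size - i)
               for i in range(overlap, upper + 1))
    return tail / total

def enrich(query, drug_sets, universe):
    """Return (set name, overlap, p-value) tuples sorted by p-value."""
    query = set(query) & set(universe)
    results = []
    for name, members in drug_sets.items():
        members = set(members) & set(universe)
        overlap = len(query & members)
        p = enrichment_p(len(universe), len(members), len(query), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])

# Hypothetical universe of 20 drugs and two hypothetical drug sets.
universe = [f"drug_{i}" for i in range(20)]
drug_sets = {
    "indication_X": ["drug_0", "drug_1", "drug_2", "drug_3"],
    "target_Y": ["drug_10", "drug_11", "drug_12"],
}
query = ["drug_0", "drug_1", "drug_2", "drug_15"]
for name, overlap, p in enrich(query, drug_sets, universe):
    print(name, overlap, f"{p:.4g}")
```

With thousands of drug sets tested at once, as in the tool described above, the resulting p-values would normally need a multiple-testing correction; that step is omitted here.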
Affiliation(s)
- Chuanbo Huang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Center for Non-Coding RNA Medicine, Peking University, Beijing 100191, China; School of Mathematics Sciences, Huaqiao University, Quanzhou 362021, China
- Weili Yang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Center for Non-Coding RNA Medicine, Peking University, Beijing 100191, China
- Junpei Wang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Center for Non-Coding RNA Medicine, Peking University, Beijing 100191, China
- Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Center for Non-Coding RNA Medicine, Peking University, Beijing 100191, China
- Bin Geng
- Hypertension Center, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, State Key Laboratory of Cardiovascular Disease, National Center for Cardiovascular Diseases, Beijing 100037, China
- Georgios Kararigas
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Institute of Gender in Medicine and Center for Cardiovascular Research, DZHK (German Centre for Cardiovascular Research), 10115 Berlin, Germany
- Jichun Yang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Center for Non-Coding RNA Medicine, Peking University, Beijing 100191, China.
- Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Center for Non-Coding RNA Medicine, Peking University, Beijing 100191, China; Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
|
230
|
Abstract
Biomedical data science has experienced an explosion of new data over the past decade. Abundant genetic and genomic data are increasingly available in large, diverse data sets due to the maturation of modern molecular technologies. Along with these molecular data, dense, rich phenotypic data are also available on comprehensive clinical data sets from health care provider organizations, clinical trials, population health registries, and epidemiologic studies. The methods and approaches for interrogating these large genetic/genomic and clinical data sets continue to evolve rapidly, as our understanding of the questions and challenges continues to emerge. In this review, the state-of-the-art methodologies for genetic/genomic analysis along with complex phenomics will be discussed. This field is changing and adapting to the novel data types made available, as well as technological advances in computation and machine learning. Thus, I will also discuss the future challenges in this exciting and innovative space. The promises of precision medicine rely heavily on the ability to marry complex genetic/genomic data with clinical phenotypes in meaningful ways.
Affiliation(s)
- Marylyn D. Ritchie
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
231
|
Patient representation learning and interpretable evaluation using clinical notes. J Biomed Inform 2018; 84:103-113. [PMID: 29966746 DOI: 10.1016/j.jbi.2018.06.016] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/16/2018] [Revised: 06/07/2018] [Accepted: 06/28/2018] [Indexed: 11/22/2022]
Abstract
We have three contributions in this work: 1. We explore the utility of a stacked denoising autoencoder and a paragraph vector model to learn task-independent dense patient representations directly from clinical notes. To analyze if these representations are transferable across tasks, we evaluate them in multiple supervised setups to predict patient mortality, primary diagnostic and procedural category, and gender. We compare their performance with sparse representations obtained from a bag-of-words model. We observe that the learned generalized representations significantly outperform the sparse representations when we have few positive instances to learn from, and there is an absence of strong lexical features. 2. We compare the model performance of the feature set constructed from a bag of words to that obtained from medical concepts. In the latter case, concepts represent problems, treatments, and tests. We find that concept identification does not improve the classification performance. 3. We propose novel techniques to facilitate model interpretability. To understand and interpret the representations, we explore the best encoded features within the patient representations obtained from the autoencoder model. Further, we calculate feature sensitivity across two networks to identify the most significant input features for different classification tasks when we use these pretrained representations as the supervised input. We successfully extract the most influential features for the pipeline using this technique.
|
232
|
Zhang H, Zhu F, Dodge HH, Higgins GA, Omenn GS, Guan Y, the Alzheimer's Disease Neuroimaging Initiative. A similarity-based approach to leverage multi-cohort medical data on the diagnosis and prognosis of Alzheimer's disease. Gigascience 2018; 7:5052206. [PMID: 30010762 PMCID: PMC6054197 DOI: 10.1093/gigascience/giy085] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/20/2017] [Revised: 04/15/2018] [Accepted: 06/28/2018] [Indexed: 01/17/2023] Open
Abstract
Motivation Heterogeneous diseases such as Alzheimer's disease (AD) manifest a variety of phenotypes among populations. Early diagnosis and effective treatment offer cost benefits. Many studies on biochemical and imaging markers have shown potential promise in improving diagnosis, yet establishing quantitative diagnostic criteria for ancillary tests remains challenging. Results We have developed a similarity-based approach that matches individuals to subjects with similar conditions. We modeled the disease with a Gaussian process, and tested the method in the Alzheimer's Disease Big Data DREAM Challenge. Ranked the highest among submitted methods, our diagnostic model predicted cognitive impairment scores in an independent dataset test with a correlation score of 0.573. It differentiated AD patients from control subjects with an area under the receiver operating curve of 0.920. Without knowing longitudinal information about subjects, the model predicted patients who are vulnerable to conversion from mild-cognitive impairment to AD through the similarity network. This diagnostic framework can be applied to other diseases with clinical heterogeneity, such as Parkinson's disease.
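The combination of subject similarity and a Gaussian process mentioned in this abstract can be sketched as Gaussian process regression, in which a kernel measures how similar two subjects are and the prediction for a new subject is a similarity-weighted combination of known outcomes. This is only an illustrative sketch: the RBF kernel, the hyperparameters, the toy features, and the function names are assumptions, and the abstract does not describe the authors' actual model at this level of detail.

```python
from math import exp

def rbf(a, b, length=1.0):
    """Similarity between two feature vectors (1.0 when identical)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-d2 / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(X, y, X_new, length=1.0, noise=0.1):
    """Posterior mean of a GP with an RBF kernel and a constant mean
    equal to the average training outcome; `noise` is a variance."""
    n = len(X)
    mean_y = sum(y) / n
    K = [[rbf(X[i], X[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, [v - mean_y for v in y])
    return [mean_y + sum(rbf(x, X[i], length) * alpha[i] for i in range(n))
            for x in X_new]

# Hypothetical standardized features for four subjects and their scores.
X = [[0.0, 0.1], [0.2, 0.0], [1.5, 1.4], [1.6, 1.7]]
y = [28.0, 27.0, 20.0, 19.0]
print([round(p, 2) for p in gp_predict(X, y, [[0.1, 0.1], [1.5, 1.5]])])
```

A new subject who resembles no one in the training data receives a prediction close to the training mean, which is the expected behaviour of a stationary kernel such as the RBF.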
Affiliation(s)
- Hongjiu Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 2017G Palmer Commons, 100 Washtenaw Avenue, Ann Arbor, MI, USA 48109
- Fan Zhu
- Department of Computational Medicine and Bioinformatics, University of Michigan, 2017G Palmer Commons, 100 Washtenaw Avenue, Ann Arbor, MI, USA 48109
- Chongqing Key Laboratory of Big Data and Intelligent Computing, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, 266 Fangzheng Avenue, Shuitu Hi-tech Industrial Park, Shuitu Town, Beibei District, Chongqing, China 400714
- Hiroko H Dodge
- Michigan Alzheimer's Disease Center, University of Michigan, 2101 Commonwealth Blvd, Ann Arbor, MI, USA 48105
- Department of Neurology, University of Michigan, 1500 E. Medical Center Dr., 1914 Taubman Center SPC 5316, Ann Arbor, MI, USA 48109
- Layton Aging and Alzheimer's Disease Center and Department of Neurology, Oregon Health & Science University, 3181 S.W. Sam Jackson Park Road, L226, Portland, OR, USA 97239
- Gerald A Higgins
- Department of Computational Medicine and Bioinformatics, University of Michigan, 2017G Palmer Commons, 100 Washtenaw Avenue, Ann Arbor, MI, USA 48109
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 2017G Palmer Commons, 100 Washtenaw Avenue, Ann Arbor, MI, USA 48109
- Department of Internal Medicine, University of Michigan, 3110 Taubman Center, SPC 5368, 1500 East Medical Center Drive, Ann Arbor, MI, USA 48109
- Department of Human Genetics, University of Michigan, 4909 Buhl Building, 1241 E. Catherine St., Ann Arbor, MI, USA 48109
- School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, USA 48109
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, 2017G Palmer Commons, 100 Washtenaw Avenue, Ann Arbor, MI, USA 48109
- Department of Internal Medicine, University of Michigan, 3110 Taubman Center, SPC 5368, 1500 East Medical Center Drive, Ann Arbor, MI, USA 48109
- Department of Electronic Engineering and Computer Science, Bob and Betty Beyster Building, 2260 Hayward Street, University of Michigan, Ann Arbor, MI, USA 48109
|
233
|
Safai N, Ali A, Rossing P, Ridderstråle M. Stratification of type 2 diabetes based on routine clinical markers. Diabetes Res Clin Pract 2018; 141:275-283. [PMID: 29782936 DOI: 10.1016/j.diabres.2018.05.014] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 12/22/2017] [Revised: 04/15/2018] [Accepted: 05/08/2018] [Indexed: 12/16/2022]
Abstract
AIMS We hypothesized that patients with dysregulated type 2 diabetes may be stratified based on routine clinical markers. METHODS In this retrospective cohort study, diabetes-related clinical measures, including age at onset, diabetes duration, HbA1c, BMI, HOMA2-β, HOMA2-IR and GAD65 autoantibodies, were used for sub-grouping patients by K-means clustering and for adjusting. Probabilities of diabetes complications (with 95% confidence intervals) were calculated using logistic regression. RESULTS Based on baseline data from patients with type 2 diabetes (n = 2290), the cluster analysis suggested up to five sub-groups. These were primarily characterized by autoimmune β-cell failure (3%), insulin resistance with short disease duration (21%), non-autoimmune β-cell failure (22%), insulin resistance with long disease duration (32%), and presence of metabolic syndrome (22%), respectively. Retinopathy was more common in the sub-group characterized by non-autoimmune β-cell failure (52% (47.7-56.8)) compared to other sub-groups (22% (20.1-24.1)), adj. p < 0.001. The prevalence of cardiovascular disease, nephropathy and neuropathy also differed between sub-groups, but significance was lost after adjustment. CONCLUSIONS Patients with type 2 diabetes cluster into clinically relevant sub-groups based on routine clinical markers. The prevalence of diabetes complications seems to be sub-group specific. Our data suggest the need for a tailored strategy for the treatment of type 2 diabetes.
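The sub-grouping step named in this abstract, K-means clustering on clinical measures, can be sketched as follows. The sketch standardizes each marker and then runs Lloyd's algorithm; the two toy markers, their values, the random initialization, and the function names are assumptions for illustration and do not reproduce the study's data or settings.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so markers on different scales are comparable."""
    cols = list(zip(*rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]  # constant columns are left at 0
    return [[(v - m) / s for v, m, s in zip(r, mu, sd)] for r in rows]

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=100, seed=0):
    """Lloyd's algorithm: assign each row to the nearest center, then move
    each center to the mean of its members, until the centers stop moving."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(r, centers[c]))
                  for r in rows]
        new_centers = []
        for c in range(k):
            members = [r for r, l in zip(rows, labels) if l == c]
            new_centers.append([mean(col) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical patients: (HbA1c in mmol/mol, BMI in kg/m^2).
patients = [[50.0, 24.0], [52.0, 25.0], [51.0, 23.5],
            [85.0, 34.0], [88.0, 36.0], [83.0, 35.0]]
labels, centers = kmeans(standardize(patients), k=2)
print(labels)
```

K-means results depend on the initial centers and on the chosen number of clusters, so in practice several restarts and a cluster-number criterion would be used; both are left out of this sketch.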
Affiliation(s)
- Narges Safai
- Steno Diabetes Center Copenhagen, Patient Care, Niels Steensens Vej 2-4, DK-2820 Gentofte, Denmark.
- Ashfaq Ali
- Steno Diabetes Center Copenhagen, Systems Medicine, Niels Steensens Vej 2-4, DK-2820 Gentofte, Denmark.
- Peter Rossing
- Steno Diabetes Center Copenhagen, Complication Research, Niels Steensens Vej 2-4, DK-2820 Gentofte, Denmark; University of Copenhagen, Department of Clinical Medicine, Copenhagen, Denmark.
- Martin Ridderstråle
- Steno Diabetes Center Copenhagen, Patient Care, Niels Steensens Vej 2-4, DK-2820 Gentofte, Denmark.
|
234
|
Langenberg C, Lotta LA. Genomic insights into the causes of type 2 diabetes. Lancet 2018; 391:2463-2474. [PMID: 29916387 DOI: 10.1016/s0140-6736(18)31132-2] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Received: 03/06/2018] [Revised: 04/30/2018] [Accepted: 05/15/2018] [Indexed: 01/05/2023]
Abstract
Genome-wide association studies have implicated around 250 genomic regions in predisposition to type 2 diabetes, with evidence for causal variants and genes emerging for several of these regions. Understanding of the underlying mechanisms, including the interplay between β-cell failure, insulin sensitivity, appetite regulation, and adipose storage has been facilitated by the integration of multidimensional data for diabetes-related intermediate phenotypes, detailed genomic annotations, functional experiments, and now multiomic molecular features. Studies in diverse ethnic groups and examples from population isolates have shown the value and need for a broad genomic approach to this global disease. Transethnic discovery efforts and large-scale biobanks in diverse populations and ancestries could help to address some of the Eurocentric bias. Despite rapid progress in the discovery of the highly polygenic architecture of type 2 diabetes, dominated by common alleles with small, cumulative effects on disease risk, these insights have been of little clinical use in terms of disease prediction or prevention, and have made only small contributions to subtype classification or stratified approaches to treatment. Successful development of academia-industry partnerships for exome or genome sequencing in large biobanks could help to deliver economies of scale, with implications for the future of genomics-focused research.
Affiliation(s)
- Luca A Lotta
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
|
235
|
Kopan C, Tucker T, Alexander M, Mohammadi MR, Pone EJ, Lakey JRT. Approaches in Immunotherapy, Regenerative Medicine, and Bioengineering for Type 1 Diabetes. Front Immunol 2018; 9:1354. [PMID: 29963051 PMCID: PMC6011033 DOI: 10.3389/fimmu.2018.01354] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/28/2018] [Accepted: 05/31/2018] [Indexed: 12/12/2022] Open
Abstract
Recent advances on using immune and stem cells as two-pronged approaches for type 1 diabetes mellitus (T1DM) treatment show promise for advancement into clinical practice. As T1DM is thought to arise from autoimmune attack destroying pancreatic β-cells, increasing treatments that use biologics and cells to manipulate the immune system are achieving better results in pre-clinical and clinical studies. Increasingly, focus has shifted from small molecule drugs that suppress the immune system nonspecifically to more complex biologics that show enhanced efficacy due to their selectivity for specific types of immune cells. Approaches that seek to inhibit only autoreactive effector T cells or enhance the suppressive regulatory T cell subset are showing remarkable promise. These modern immune interventions are also enabling the transplantation of pancreatic islets or β-like cells derived from stem cells. While complete immune tolerance and body acceptance of grafted islets and cells is still challenging, bioengineering approaches that shield the implanted cells are also advancing. Integrating immunotherapy, stem cell-mediated β-cell or islet production and bioengineering to interface with the patient is expected to lead to a durable cure or pave the way for a clinical solution for T1DM.
Affiliation(s)
- Christopher Kopan
- Department of Surgery, University of California Irvine, Irvine, CA, United States
- Tori Tucker
- Department of Cell and Molecular Biosciences, University of California Irvine, Irvine, CA, United States
- Michael Alexander
- Department of Surgery, University of California Irvine, Irvine, CA, United States
- M. Rezaa Mohammadi
- Department of Chemical Engineering and Materials Science, University of California Irvine, Irvine, CA, United States
- Egest J. Pone
- Department of Pharmaceutical Sciences, University of California Irvine, Irvine, CA, United States
- Jonathan Robert Todd Lakey
- Department of Surgery, University of California Irvine, Irvine, CA, United States
- Department of Biomedical Engineering, University of California Irvine, Irvine, CA, United States
|
236
|
Tranchevent LC, Nazarov PV, Kaoma T, Schmartz GP, Muller A, Kim SY, Rajapakse JC, Azuaje F. Predicting clinical outcome of neuroblastoma patients using an integrative network-based approach. Biol Direct 2018; 13:12. [PMID: 29880025 PMCID: PMC5992838 DOI: 10.1186/s13062-018-0214-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/12/2017] [Accepted: 05/04/2018] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND One of the main current challenges in computational biology is to make sense of the huge amounts of multidimensional experimental data that are being produced. For instance, large cohorts of patients are often screened using different high-throughput technologies, effectively producing multiple patient-specific molecular profiles for hundreds or thousands of patients. RESULTS We propose and implement a network-based method that integrates such patient omics data into Patient Similarity Networks. Topological features derived from these networks were then used to predict relevant clinical features. As part of the 2017 CAMDA challenge, we have successfully applied this strategy to a neuroblastoma dataset, consisting of genomic and transcriptomic data. In particular, we observe that models built on our network-based approach perform at least as well as state of the art models. We furthermore explore the effectiveness of various topological features and observe, for instance, that redundant centrality metrics can be combined to build more powerful models. CONCLUSION We demonstrate that the networks inferred from omics data contain clinically relevant information and that patient clinical outcomes can be predicted using only network topological data. REVIEWERS This article was reviewed by Yang-Yu Liu, Tomislav Smuc and Isabel Nepomuceno.
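The pipeline outlined in this abstract, building a patient similarity network from omics profiles and then deriving topological features from it, can be sketched in a few lines. The similarity measure (Pearson correlation), the edge threshold, the choice of degree centrality as the feature, and the toy profiles are all assumptions for this example; the abstract does not specify the authors' exact choices.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equally long profiles."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va and vb else 0.0

def build_psn(profiles, threshold=0.5):
    """Connect two patients when their profile correlation reaches the threshold."""
    ids = list(profiles)
    adj = {p: set() for p in ids}
    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            if pearson(profiles[p], profiles[q]) >= threshold:
                adj[p].add(q)
                adj[q].add(p)
    return adj

def degree_centrality(adj):
    """Fraction of the other patients each patient is connected to."""
    n = len(adj)
    return {p: (len(nb) / (n - 1) if n > 1 else 0.0) for p, nb in adj.items()}

# Hypothetical expression profiles (one value per gene) for four patients.
profiles = {
    "pt1": [1.0, 2.0, 3.0, 4.0],
    "pt2": [2.0, 4.0, 6.0, 8.5],
    "pt3": [4.0, 3.0, 2.0, 1.0],
    "pt4": [1.0, 2.5, 2.9, 4.2],
}
network = build_psn(profiles)
print(degree_centrality(network))
```

The resulting per-patient centrality values could then serve as input features for any standard classifier that predicts a clinical outcome.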
Affiliation(s)
- Léon-Charles Tranchevent
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Petr V. Nazarov
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Tony Kaoma
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Georges P. Schmartz
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Bioinformatics bachelor program, Universität des Saarlandes, Saarbrücken, Germany
- Arnaud Muller
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Sang-Yoon Kim
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Jagath C. Rajapakse
- Bioinformatics Research Center, School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Francisco Azuaje
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
|
237
|
Parimbelli E, Marini S, Sacchi L, Bellazzi R. Patient similarity for precision medicine: A systematic review. J Biomed Inform 2018; 83:87-96. [PMID: 29864490 DOI: 10.1016/j.jbi.2018.06.001] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Received: 01/31/2018] [Revised: 05/16/2018] [Accepted: 06/01/2018] [Indexed: 12/19/2022]
Abstract
Evidence-based medicine is the most prevalent paradigm adopted by physicians. Clinical practice guidelines typically define a set of recommendations together with eligibility criteria that restrict their applicability to a specific group of patients. The ever-growing size and availability of health-related data is currently challenging the broad definitions of guideline-defined patient groups. Precision medicine leverages genetic, phenotypic, or psychosocial characteristics to provide precise identification of patient subsets for treatment targeting. Defining a patient similarity measure is thus an essential step to allow stratification of patients into clinically-meaningful subgroups. The present review investigates the use of patient similarity as a tool to enable precision medicine. 279 articles were analyzed along four dimensions: data types considered, clinical domains of application, data analysis methods, and translational stage of findings. Cancer-related research employing molecular profiling and standard data analysis techniques such as clustering constitute the majority of the retrieved studies. Chronic and psychiatric diseases follow as the second most represented clinical domains. Interestingly, almost one quarter of the studies analyzed presented a novel methodology, with the most advanced employing data integration strategies and being portable to different clinical domains. Integration of such techniques into decision support systems constitutes an interesting trend for future research.
Affiliation(s)
- E Parimbelli
- Telfer School of Management, University of Ottawa, Ottawa, Canada; Interdepartmental Centre for Health Technologies, University of Pavia, Italy.
- S Marini
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, USA; Interdepartmental Centre for Health Technologies, University of Pavia, Italy
- L Sacchi
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; Interdepartmental Centre for Health Technologies, University of Pavia, Italy
- R Bellazzi
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; Interdepartmental Centre for Health Technologies, University of Pavia, Italy; RCCS ICS Maugeri, Pavia, Italy
|
238
|
Machine learning: novel bioinformatics approaches for combating antimicrobial resistance. Curr Opin Infect Dis 2018; 30:511-517. [PMID: 28914640 DOI: 10.1097/qco.0000000000000406] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 01/17/2023]
Abstract
PURPOSE OF REVIEW Antimicrobial resistance (AMR) is a threat to global health and new approaches to combating AMR are needed. Use of machine learning in addressing AMR is in its infancy but has made promising steps. We reviewed the current literature on the use of machine learning for studying bacterial AMR. RECENT FINDINGS The advent of large-scale data sets provided by next-generation sequencing and electronic health records makes applying machine learning to the study and treatment of AMR possible. To date, it has been used for antimicrobial susceptibility genotype/phenotype prediction, development of AMR clinical decision rules, novel antimicrobial agent discovery and antimicrobial therapy optimization. SUMMARY Application of machine learning to studying AMR is feasible but remains limited. Implementation of machine learning in clinical settings faces barriers to uptake with concerns regarding model interpretability and data quality. Future applications of machine learning to AMR are likely to be laboratory-based, such as antimicrobial susceptibility phenotype prediction.
|
239
|
Burg D, Schofield JPR, Brandsma J, Staykova D, Folisi C, Bansal A, Nicholas B, Xian Y, Rowe A, Corfield J, Wilson S, Ward J, Lutter R, Fleming L, Shaw DE, Bakke PS, Caruso M, Dahlen SE, Fowler SJ, Hashimoto S, Horváth I, Howarth P, Krug N, Montuschi P, Sanak M, Sandström T, Singer F, Sun K, Pandis I, Auffray C, Sousa AR, Adcock IM, Chung KF, Sterk PJ, Djukanović R, Skipp PJ, The U-Biopred Study Group. Large-Scale Label-Free Quantitative Mapping of the Sputum Proteome. J Proteome Res 2018; 17:2072-2091. [PMID: 29737851 DOI: 10.1021/acs.jproteome.8b00018] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/26/2022]
Abstract
Analysis of induced sputum supernatant is a minimally invasive approach to study the epithelial lining fluid and, thereby, provide insight into normal lung biology and the pathobiology of lung diseases. We present here a novel proteomics approach to sputum analysis developed within the U-BIOPRED (unbiased biomarkers predictive of respiratory disease outcomes) international project. We present practical and analytical techniques to optimize the detection of robust biomarkers in proteomic studies. The normal sputum proteome was derived using data-independent HDMSE applied to 40 healthy nonsmoking participants, which provides an essential baseline from which to compare modulation of protein expression in respiratory diseases. The "core" sputum proteome (proteins detected in ≥40% of participants) was composed of 284 proteins, and the extended proteome (proteins detected in ≥3 participants) contained 1666 proteins. Quality control procedures were developed to optimize the accuracy and consistency of measurement of sputum proteins and analyze the distribution of sputum proteins in the healthy population. The analysis showed that quantitation of proteins by HDMSE is influenced by several factors, with some proteins being measured in all participants' samples and with low measurement variance between samples from the same patient. The measurement of some proteins is highly variable between repeat analyses, susceptible to sample processing effects, or difficult to accurately quantify by mass spectrometry. Other proteins show high interindividual variance. We also highlight that the sputum proteome of healthy individuals is related to sputum neutrophil levels, but not gender or allergic sensitization. We illustrate the importance of design and interpretation of disease biomarker studies considering such protein population and technical measurement variance.
Affiliation(s)
- Dominic Burg
- Centre for Proteomic Research, Biological Sciences, University of Southampton, Southampton SO17 1BJ, U.K.; NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - James P R Schofield
- Centre for Proteomic Research, Biological Sciences, University of Southampton, Southampton SO17 1BJ, U.K.; NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Joost Brandsma
- NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Doroteya Staykova
- Centre for Proteomic Research, Biological Sciences, University of Southampton, Southampton SO17 1BJ, U.K.
| | - Caterina Folisi
- Centre for Proteomic Research, Biological Sciences, University of Southampton, Southampton SO17 1BJ, U.K.
| | | | - Ben Nicholas
- NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Yang Xian
- Data Science Institute, Imperial College London, London SW7 2AZ, U.K.
| | - Anthony Rowe
- Janssen Research & Development, Buckinghamshire HP12 4DP, U.K.
| | | | - Susan Wilson
- NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Jonathan Ward
- NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Rene Lutter
- AMC, Department of Experimental Immunology, University of Amsterdam, 1012 WX Amsterdam, The Netherlands; AMC, Department of Respiratory Medicine, University of Amsterdam, 1012 WX Amsterdam, The Netherlands
| | - Louise Fleming
- Airways Disease, National Heart and Lung Institute, Imperial College, London & Royal Brompton NIHR Biomedical Research Unit, London SW7 2AZ, United Kingdom
| | - Dominick E Shaw
- Respiratory Research Unit, University of Nottingham, Nottingham NG7 2RD, U.K.
| | - Per S Bakke
- Institute of Medicine, University of Bergen, 5007 Bergen, Norway
| | - Massimo Caruso
- Department of Clinical and Experimental Medicine Hospital University, University of Catania, 95124 Catania, Italy
| | - Sven-Erik Dahlen
- The Centre for Allergy Research, The Institute of Environmental Medicine, Karolinska Institutet, SE-171 77 Stockholm, Sweden
| | - Stephen J Fowler
- Respiratory and Allergy Research Group, University of Manchester, Manchester M13 9PL, U.K.
| | - Simone Hashimoto
- Department of Respiratory Medicine, Academic Medical Centre, University of Amsterdam, 1012 WX Amsterdam, The Netherlands
| | - Ildikó Horváth
- Department of Pulmonology, Semmelweis University, Budapest 1085, Hungary
| | - Peter Howarth
- NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Norbert Krug
- Fraunhofer Institute for Toxicology and Experimental Medicine Hannover, 30625 Hannover, Germany
| | - Paolo Montuschi
- Faculty of Medicine, Catholic University of the Sacred Heart, 00168 Rome, Italy
| | - Marek Sanak
- Laboratory of Molecular Biology and Clinical Genetics, Medical College, Jagiellonian University, 31-007 Krakow, Poland
| | - Thomas Sandström
- Department of Medicine, Department of Public Health and Clinical Medicine Respiratory Medicine Unit, Umeå University, 901 87 Umeå, Sweden
| | - Florian Singer
- University Children's Hospital Zurich, 8032 Zurich, Switzerland
| | - Kai Sun
- Data Science Institute, Imperial College London, London SW7 2AZ, U.K.
| | - Ioannis Pandis
- Data Science Institute, Imperial College London, London SW7 2AZ, U.K.
| | - Charles Auffray
- European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL-INSERM, Université de Lyon, 69007 Lyon, France
| | - Ana R Sousa
- Respiratory Therapeutic Unit, GSK, Stockley Park, Uxbridge UB11 1BT, U.K.
| | - Ian M Adcock
- Cell and Molecular Biology Group, Airways Disease Section, National Heart and Lung Institute, Imperial College London, Dovehouse Street, London SW3 6LR, U.K.
| | - Kian Fan Chung
- Airways Disease, National Heart and Lung Institute, Imperial College, London & Royal Brompton NIHR Biomedical Research Unit, London SW7 2AZ, United Kingdom
| | - Peter J Sterk
- AMC, Department of Experimental Immunology, University of Amsterdam, 1012 WX Amsterdam, The Netherlands
| | - Ratko Djukanović
- NIHR Southampton Biomedical Research Centre, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, U.K.
| | - Paul J Skipp
- Centre for Proteomic Research, Biological Sciences, University of Southampton, Southampton SO17 1BJ, U.K.
|
240
|
Sladek R. The many faces of diabetes: addressing heterogeneity of a complex disease. Lancet Diabetes Endocrinol 2018; 6:348-349. [PMID: 29503171 DOI: 10.1016/s2213-8587(18)30070-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/15/2018] [Accepted: 02/15/2018] [Indexed: 01/05/2023]
Affiliation(s)
- Rob Sladek
- McGill University and Génome Québec Innovation Centre, Montréal, Québec H3A 0G1, Canada; Department of Human Genetics and Department of Medicine, McGill University, Montréal, Québec, Canada.
|
241
|
Ahlqvist E, Storm P, Käräjämäki A, Martinell M, Dorkhan M, Carlsson A, Vikman P, Prasad RB, Aly DM, Almgren P, Wessman Y, Shaat N, Spégel P, Mulder H, Lindholm E, Melander O, Hansson O, Malmqvist U, Lernmark Å, Lahti K, Forsén T, Tuomi T, Rosengren AH, Groop L. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol 2018; 6:361-369. [PMID: 29503172 DOI: 10.1016/s2213-8587(18)30051-2] [Citation(s) in RCA: 1345] [Impact Index Per Article: 192.1] [Received: 01/23/2018] [Revised: 01/30/2018] [Accepted: 01/31/2018] [Indexed: 12/14/2022]
Abstract
BACKGROUND Diabetes is presently classified into two main forms, type 1 and type 2 diabetes, but type 2 diabetes in particular is highly heterogeneous. A refined classification could provide a powerful tool to individualise treatment regimens and identify individuals with increased risk of complications at diagnosis. METHODS We did data-driven cluster analysis (k-means and hierarchical clustering) in patients with newly diagnosed diabetes (n=8980) from the Swedish All New Diabetics in Scania cohort. Clusters were based on six variables (glutamate decarboxylase antibodies, age at diagnosis, BMI, HbA1c, and homoeostatic model assessment 2 estimates of β-cell function and insulin resistance), and were related to prospective data from patient records on development of complications and prescription of medication. Replication was done in three independent cohorts: the Scania Diabetes Registry (n=1466), All New Diabetics in Uppsala (n=844), and Diabetes Registry Vaasa (n=3485). Cox regression and logistic regression were used to compare time to medication, time to reaching the treatment goal, and risk of diabetic complications and genetic associations. FINDINGS We identified five replicable clusters of patients with diabetes, which had significantly different patient characteristics and risk of diabetic complications. In particular, individuals in cluster 3 (most resistant to insulin) had significantly higher risk of diabetic kidney disease than individuals in clusters 4 and 5, but had been prescribed similar diabetes treatment. Cluster 2 (insulin deficient) had the highest risk of retinopathy. In support of the clustering, genetic associations in the clusters differed from those seen in traditional type 2 diabetes. INTERPRETATION We stratified patients into five subgroups with differing disease progression and risk of diabetic complications. 
This new substratification might eventually help to tailor and target early treatment to patients who would benefit most, thereby representing a first step towards precision medicine in diabetes. FUNDING Swedish Research Council, European Research Council, Vinnova, Academy of Finland, Novo Nordisk Foundation, Scania University Hospital, Sigrid Juselius Foundation, Innovative Medicines Initiative 2 Joint Undertaking, Vasa Hospital district, Jakobstadsnejden Heart Foundation, Folkhälsan Research Foundation, Ollqvist Foundation, and Swedish Foundation for Strategic Research.
Affiliation(s)
- Emma Ahlqvist
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Petter Storm
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Annemari Käräjämäki
- Department of Primary Health Care, Vaasa Central Hospital, Vaasa, Finland; Diabetes Center, Vaasa Health Care Center, Vaasa, Finland
| | - Mats Martinell
- Department of Public Health and Caring Sciences, Uppsala University, Uppsala, Sweden
| | - Mozhgan Dorkhan
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Annelie Carlsson
- Lund University Diabetes Centre, Department of Clinical Sciences, Skåne University Hospital, Lund University, Lund, Sweden
| | - Petter Vikman
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Rashmi B Prasad
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Dina Mansour Aly
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Peter Almgren
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Ylva Wessman
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Nael Shaat
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Peter Spégel
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden; Department of Chemistry, Centre for Analysis and Synthesis, Lund University, Lund, Sweden
| | - Hindrik Mulder
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Eero Lindholm
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Olle Melander
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Ola Hansson
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Ulf Malmqvist
- Clinical Research and Trial Center, Lund University Hospital, Sweden
| | - Åke Lernmark
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Kaj Lahti
- Department of Primary Health Care, Vaasa Central Hospital, Vaasa, Finland; Diabetes Center, Vaasa Health Care Center, Vaasa, Finland
| | - Tom Forsén
- Folkhälsan Research Center, Helsinki, Finland
| | - Tiinamaija Tuomi
- Folkhälsan Research Center, Helsinki, Finland; Abdominal Center, Endocrinology, Helsinki University Central Hospital, Research Program for Diabetes and Obesity, University of Helsinki, Helsinki, Finland; Finnish Institute for Molecular Medicine, University of Helsinki, Helsinki, Finland
| | - Anders H Rosengren
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden; Department of Neuroscience and Physiology, Wallenberg Center for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
| | - Leif Groop
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden; Finnish Institute for Molecular Medicine, University of Helsinki, Helsinki, Finland.
|
242
|
Abstract
Data, including information generated from them by processing and analysis, are an asset with measurable value. The assets that biological research funding produces are the data generated, the information derived from these data, and, ultimately, the discoveries and knowledge these lead to. From the time when Henry Oldenburg published the first scientific journal in 1665 (Proceedings of the Royal Society) to the founding of the United States National Library of Medicine in 1879 to the present, there has been a sustained drive to improve how researchers can record and discover what is known. Researchers’ experimental work builds upon years and (collectively) billions of dollars’ worth of earlier work. Today, researchers are generating data at ever-faster rates because of advances in instrumentation and technology, coupled with decreases in production costs. Unfortunately, the ability of researchers to manage and disseminate their results has not kept pace, so their work cannot achieve its maximal impact. Strides have recently been made, but more awareness is needed of the essential role that biological data resources, including biocuration, play in maintaining and linking this ever-growing flood of data and information. The aim of this paper is to describe the nature of data as an asset, the role biocurators play in increasing its value, and consistent, practical means to measure effectiveness that can guide planning and justify costs in biological research information resources’ development and management.
|
243
|
Siddiqui S, Shikotra A, Richardson M, Doran E, Choy D, Bell A, Austin CD, Eastham-Anderson J, Hargadon B, Arron JR, Wardlaw A, Brightling CE, Heaney LG, Bradding P. Airway pathological heterogeneity in asthma: Visualization of disease microclusters using topological data analysis. J Allergy Clin Immunol 2018; 142:1457-1468. [PMID: 29550052 DOI: 10.1016/j.jaci.2017.12.982] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 04/28/2017] [Revised: 11/16/2017] [Accepted: 12/04/2017] [Indexed: 12/27/2022]
Abstract
BACKGROUND Asthma is a complex chronic disease underpinned by pathological changes within the airway wall. How variations in structural airway pathology and cellular inflammation contribute to the expression and severity of asthma are poorly understood. OBJECTIVES Therefore we evaluated pathological heterogeneity using topological data analysis (TDA) with the aim of visualizing disease clusters and microclusters. METHODS A discovery population of 202 adult patients (142 asthmatic patients and 60 healthy subjects) and an external replication population (59 patients with severe asthma) were evaluated. Pathology and gene expression were examined in bronchial biopsy samples. TDA was applied by using pathological variables alone to create pathology-driven visual networks. RESULTS In the discovery cohort TDA identified 4 groups/networks with multiple microclusters/regions of interest that were masked by group-level statistics. Specifically, TDA group 1 consisted of a high proportion of healthy subjects, with a microcluster representing a topological continuum connecting healthy subjects to patients with mild-to-moderate asthma. Three additional TDA groups with moderate-to-severe asthma (Airway Smooth Muscle-High, Reticular Basement Membrane-High, and Remodeling-Low groups) were identified and contained numerous microclusters with varying pathological and clinical features. Mutually exclusive TH2 and TH17 tissue gene expression signatures were identified in all pathological groups. Discovery and external replication applied to the severe asthma subgroup identified only highly similar "pathological data shapes" through analyses of persistent homology. CONCLUSIONS We have identified and replicated novel pathological phenotypes of asthma using TDA. Our methodology is applicable to other complex chronic diseases.
Affiliation(s)
- Salman Siddiqui
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom.
| | - Aarti Shikotra
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom
| | - Matthew Richardson
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom
| | | | | | - Alex Bell
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom; Department of Mathematics, University of Leicester, Leicester, United Kingdom
| | | | | | - Beverley Hargadon
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom
| | | | - Andrew Wardlaw
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom
| | - Christopher E Brightling
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom
| | - Liam G Heaney
- Centre for Infection and Immunity, Health Sciences Building, Queens University Belfast, Belfast, United Kingdom
| | - Peter Bradding
- Department of Infection Immunity and Inflammation, Institute for Lung Health, University of Leicester, Glenfield Hospital, Leicester, United Kingdom
|
244
|
Denny JC, Van Driest SL, Wei WQ, Roden DM. The Influence of Big (Clinical) Data and Genomics on Precision Medicine and Drug Development. Clin Pharmacol Ther 2018; 103:409-418. [PMID: 29171014 PMCID: PMC5805632 DOI: 10.1002/cpt.951] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 10/06/2017] [Revised: 11/15/2017] [Accepted: 11/19/2017] [Indexed: 12/30/2022]
Abstract
Drug development continues to be costly and slow, with medications failing due to lack of efficacy or presence of toxicity. The promise of pharmacogenomic discovery includes tailoring therapeutics based on an individual's genetic makeup, rational drug development, and repurposing medications. Rapid growth of large research cohorts, linked to electronic health record (EHR) data, fuels discovery of new genetic variants predicting drug action, supports Mendelian randomization experiments to show drug efficacy, and suggests new indications for existing medications. New biomedical informatics and machine-learning approaches advance the ability to interpret clinical information, enabling identification of complex phenotypes and subpopulations of patients. We review the recent history of use of "big data" from EHR-based cohorts and biobanks supporting these activities. Future studies using EHR data, other information sources, and new methods will promote a foundation for discovery to more rapidly advance precision medicine.
Affiliation(s)
- Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University Medical Center
- Department of Medicine, Vanderbilt University Medical Center
| | - Sara L. Van Driest
- Department of Medicine, Vanderbilt University Medical Center
- Department of Pediatrics, Vanderbilt University Medical Center
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center
| | - Dan M. Roden
- Department of Biomedical Informatics, Vanderbilt University Medical Center
- Department of Medicine, Vanderbilt University Medical Center
- Department of Pharmacology, Vanderbilt University Medical Center
|
245
|
Hripcsak G, Albers DJ. High-fidelity phenotyping: richness and freedom from bias. J Am Med Inform Assoc 2018; 25:289-294. [PMID: 29040596 PMCID: PMC7282504 DOI: 10.1093/jamia/ocx110] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 06/16/2017] [Revised: 08/07/2017] [Accepted: 09/06/2017] [Indexed: 01/14/2023] Open
Abstract
Electronic health record phenotyping is the use of raw electronic health record data to assert characterizations about patients. Researchers have been doing it since the beginning of biomedical informatics, under different names. Phenotyping will benefit from an increasing focus on fidelity, both in the sense of increasing richness, such as measured levels, degree or severity, timing, probability, or conceptual relationships, and in the sense of reducing bias. Research agendas should shift from merely improving binary assignment to studying and improving richer representations. The field is actively researching new temporal directions and abstract representations, including deep learning. The field would benefit from research in nonlinear dynamics, in combining mechanistic models with empirical data, including data assimilation, and in topology. The health care process produces substantial bias, and studying that bias explicitly rather than treating it as merely another source of noise would facilitate addressing it.
Affiliation(s)
- George Hripcsak
- Department of Biomedical Informatics, Columbia University Medical Center, New York, NY, USA
| | - David J Albers
- Department of Biomedical Informatics, Columbia University Medical Center, New York, NY, USA
|
246
|
Basile AO, Ritchie MD. Informatics and machine learning to define the phenotype. Expert Rev Mol Diagn 2018; 18:219-226. [PMID: 29431517 DOI: 10.1080/14737159.2018.1439380] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 12/19/2022]
Abstract
INTRODUCTION For the past decade, the focus of complex disease research has been the genotype. From technological advancements to the development of analysis methods, great progress has been made. However, advances in our definition of the phenotype have remained stagnant. Phenotype characterization has recently emerged as an exciting area of informatics and machine learning. The copious amounts of diverse biomedical data that have been collected may be leveraged with data-driven approaches to elucidate trait-related features and patterns. Areas covered: In this review, the authors discuss the phenotype in traditional genetic associations and the challenges this has imposed. Approaches for phenotype refinement that can aid in more accurate characterization of traits are also discussed. Further, the authors highlight promising machine learning approaches for establishing a phenotype and the challenges of electronic health record (EHR)-derived data. Expert commentary: The authors hypothesize that through unsupervised machine learning, data-driven approaches can be used to define phenotypes rather than relying on expert clinician knowledge. Through the use of machine learning and an unbiased set of features extracted from clinical repositories, researchers will have the potential to further understand complex traits and identify patient subgroups. This knowledge may lead to more preventative and precise clinical care.
Affiliation(s)
- Anna Okula Basile
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA, USA
| | - Marylyn DeRiggi Ritchie
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA, USA; Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
|
247
|
Duponchel L. Exploring hyperspectral imaging data sets with topological data analysis. Anal Chim Acta 2018; 1000:123-131. [PMID: 29289301 DOI: 10.1016/j.aca.2017.11.029] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 10/02/2017] [Revised: 11/16/2017] [Accepted: 11/17/2017] [Indexed: 11/15/2022]
Affiliation(s)
- Ludovic Duponchel
- LASIR CNRS UMR 8516, Université Lille 1, Sciences et Technologies, 59655 Villeneuve d'Ascq Cedex, France.
|
248
|
Ye C, Fu T, Hao S, Zhang Y, Wang O, Jin B, Xia M, Liu M, Zhou X, Wu Q, Guo Y, Zhu C, Li YM, Culver DS, Alfreds ST, Stearns F, Sylvester KG, Widen E, McElhinney D, Ling X. Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning. J Med Internet Res 2018; 20:e22. [PMID: 29382633 PMCID: PMC5811646 DOI: 10.2196/jmir.9268] [Citation(s) in RCA: 141] [Impact Index Per Article: 20.1] [Received: 10/26/2017] [Revised: 12/05/2017] [Accepted: 12/06/2017] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND As a high-prevalence health condition, hypertension is clinically costly, difficult to manage, and often leads to severe and life-threatening diseases such as cardiovascular disease (CVD) and stroke. OBJECTIVE The aim of this study was to develop and validate prospectively a risk prediction model of incident essential hypertension within the following year. METHODS Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was adopted in the process of feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual. RESULTS The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories, with 4526 out of 381,544 patients (1.19%) in the lowest risk category (score 0-0.05) and 21,050 out of 41,329 patients (50.93%) in the highest risk category (score 0.4-1) receiving a diagnosis of incident hypertension in the following 1 year. Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were recognized as driving or associated features of incident essential hypertension. The very high risk population mainly comprised elderly (age>50 years) individuals with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants, including some community-level factors associated with higher risk and others that were protective against hypertension. 
CONCLUSIONS With statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, providing implications in interventions for hypertension and related diseases and hopefully enhancing hypertension care.
Affiliation(s)
- Chengyin Ye
- Department of Health Management, Hangzhou Normal University, Hangzhou, China; Department of Surgery, Stanford University, Stanford, CA, United States
| | - Tianyun Fu
- HBI Solutions Inc, Palo Alto, CA, United States
| | - Shiying Hao
- Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States; Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, United States
| | - Yan Zhang
- Department of Oncology, The First Hospital of Shijiazhuang, Shijiazhuang, China
| | - Oliver Wang
- HBI Solutions Inc, Palo Alto, CA, United States
| | - Bo Jin
- HBI Solutions Inc, Palo Alto, CA, United States
| | - Minjie Xia
- HBI Solutions Inc, Palo Alto, CA, United States
| | - Modi Liu
- HBI Solutions Inc, Palo Alto, CA, United States
| | - Xin Zhou
- Tianjin Key Laboratory of Cardiovascular Remodeling and Target Organ Injury, Pingjin Hospital Heart Center, Tianjin, China
| | - Qian Wu
- China Electric Power Research Institute, Beijing, China
| | - Yanting Guo
- Department of Surgery, Stanford University, Stanford, CA, United States; School of Management, Zhejiang University, Hangzhou, China
| | | | - Yu-Ming Li
- Tianjin Key Laboratory of Cardiovascular Remodeling and Target Organ Injury, Pingjin Hospital Heart Center, Tianjin, China
| | | | | | | | - Karl G Sylvester
- Department of Surgery, Stanford University, Stanford, CA, United States
| | - Eric Widen
- HBI Solutions Inc, Palo Alto, CA, United States
| | - Doff McElhinney
- Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, United States; Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, United States
| | - Xuefeng Ling
- Department of Surgery, Stanford University, Stanford, CA, United States.,Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, United States.,Health Care Big Data Center, School of Public Health, Zhejiang University, Hangzhou, China
| |
|
249
|
Verma SS, Ritchie MD. Another Round of "Clue" to Uncover the Mystery of Complex Traits. Genes (Basel) 2018; 9:E61. [PMID: 29370075 PMCID: PMC5852557 DOI: 10.3390/genes9020061] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/01/2017] [Revised: 12/19/2017] [Accepted: 01/15/2018] [Indexed: 12/13/2022] Open
Abstract
A plethora of genetic association analyses have identified several genetic risk loci. Technological and statistical advancements have now led to the identification of not only common genetic variants, but also low-frequency variants, structural variants, and environmental factors, as well as multi-omics variations that affect the phenotypic variance of complex traits in a population, thus referred to as complex trait architecture. The concept of heritability, or the proportion of phenotypic variance due to genetic inheritance, has been studied for several decades, but its application is mainly in addressing the narrow sense heritability (or additive genetic component) from Genome-Wide Association Studies (GWAS). In this commentary, we reflect on our perspective on the complexity of understanding heritability for human traits in comparison to model organisms, highlighting another round of clues beyond GWAS and an alternative approach, investigating these clues comprehensively to help in elucidating the genetic architecture of complex traits.
Affiliation(s)
- Shefali Setia Verma
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Marylyn D Ritchie
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
|
250
|
Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP. Machine learning in cardiovascular medicine: are we there yet? Heart 2018; 104:1156-1164. [PMID: 29352006 DOI: 10.1136/heartjnl-2017-311198] [Citation(s) in RCA: 245] [Impact Index Per Article: 35.0] [Received: 07/10/2017] [Revised: 12/19/2017] [Accepted: 12/21/2017] [Indexed: 12/11/2022] Open
Abstract
Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strength and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine.
Affiliation(s)
- Khader Shameer
- Departments of Medical Informatics and Research Informatics, Northwell Health, Great Neck, New York, USA; Institute for Next Generation Healthcare, Mount Sinai Health System, New York City, New York, USA; Icahn Institute for Genomics and Multiscale Biology, Mount Sinai Health System, New York City, New York, USA; Department of Genetics and Genomic Sciences, Mount Sinai Health System, New York City, New York, USA; Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA; Center for Research Informatics and Innovation, Northwell Health, New Hyde Park, NY, USA
- Kipp W Johnson
- Institute for Next Generation Healthcare, Mount Sinai Health System, New York City, New York, USA; Icahn Institute for Genomics and Multiscale Biology, Mount Sinai Health System, New York City, New York, USA; Department of Genetics and Genomic Sciences, Mount Sinai Health System, New York City, New York, USA; Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
- Benjamin S Glicksberg
- Institute for Next Generation Healthcare, Mount Sinai Health System, New York City, New York, USA; Icahn Institute for Genomics and Multiscale Biology, Mount Sinai Health System, New York City, New York, USA; Department of Genetics and Genomic Sciences, Mount Sinai Health System, New York City, New York, USA; Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA; Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, California, USA
- Joel T Dudley
- Institute for Next Generation Healthcare, Mount Sinai Health System, New York City, New York, USA; Icahn Institute for Genomics and Multiscale Biology, Mount Sinai Health System, New York City, New York, USA; Department of Genetics and Genomic Sciences, Mount Sinai Health System, New York City, New York, USA; Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, New York, USA
- Partho P Sengupta
- Division of Cardiology, West Virginia Heart and Vascular Institute, Morgantown, West Virginia, USA
|