1
Abulibdeh R, Tu K, Butt DA, Train A, Crampton N, Sejdić E. Assessing the capture of sociodemographic information in electronic medical records to inform clinical decision making. PLoS One 2025; 20:e0317599. [PMID: 39823404 PMCID: PMC11741650 DOI: 10.1371/journal.pone.0317599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 01/01/2025] [Indexed: 01/19/2025] Open
Abstract
There is a growing need to document sociodemographic factors in electronic medical records to produce representative cohorts for medical research and to perform focused research for potentially vulnerable populations. The objective of this work was to assess the content of family physicians' electronic medical records and characterize the quality of the documentation of sociodemographic characteristics. Descriptive statistics were reported for each sociodemographic characteristic. The association between the completeness rates of the sociodemographic data and the various clinics, electronic medical record vendors, and physician characteristics was analyzed. Supervised machine learning models were used to determine the absence or presence of each characteristic for all adult patients over the age of 18 in the database. Documentation rates for marital status (51.0%) and occupation (47.2%) were significantly higher than those for the remaining variables. Race (1.4%), sexual orientation (2.5%), and gender identity (0.8%) had the lowest documentation rates, with missingness rates of 97.5% or higher. The correlation analysis for vendor type demonstrated significant variation in the availability of marital status and occupation information between vendors (χ2 > 6.0, P < 0.05). Variability in documentation between clinics indicated that the majority of characteristics exhibited high variation in completeness rates, with the highest variation for occupation (median: 47.2%, interquartile range: 60.6%) and marital status (median: 45.6%, interquartile range: 59.7%). Finally, physician sex, years since a physician graduated, and whether a physician was a foreign vs a Canadian medical graduate were significantly associated with documentation rates of place of birth, citizenship status, occupation, and education in the electronic medical records. Our findings suggest a crucial need to implement better documentation strategies for sociodemographic information in the healthcare setting.
To improve completeness rates, healthcare systems should monitor, encourage, enforce, or incentivize sociodemographic data collection standards.
Affiliation(s)
- Rawan Abulibdeh
  - Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - North York General Hospital, Toronto, Ontario, Canada
  - Toronto Western Hospital Family Health Team, University Health Network, Toronto, Ontario, Canada
- Debra A. Butt
  - Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Family and Community Medicine, Scarborough Health Network, Scarborough, Ontario, Canada
- Anthony Train
  - Department of Family Medicine, Queen’s University, Kingston, Ontario, Canada
- Noah Crampton
  - Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Ervin Sejdić
  - Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada
  - North York General Hospital, Toronto, Ontario, Canada
2
Saavedra-Moreno C, Hurtado R, Velasco N, Ramírez A. Identification of population multimorbidity patterns in 3.9 million patients from Bogota in 2018. Global Epidemiology 2024; 8:100171. [PMID: 39498214 PMCID: PMC11533067 DOI: 10.1016/j.gloepi.2024.100171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 10/08/2024] [Accepted: 10/11/2024] [Indexed: 11/07/2024] Open
Abstract
Background Multimorbidity has emerged as a challenge for health systems due to its association with adverse clinical outcomes. Given the limited information available on multimorbidity, particularly in low- and middle-income countries, this study characterizes multimorbidity patterns in the population of Bogotá, Colombia in 2018. Methods In a cross-sectional study, we analyzed 16 million medical consultation records from Bogotá reported in the National Service Delivery Records in 2018. Using network analysis, we quantified the prevalence of multimorbidity in the population and identified the most common associations between diagnoses, with data stratified by age, sex, and socioeconomic status. Results The study found that the prevalence of multimorbidity in the population was 44.2 %, increased with age, and was higher in women and in people affiliated to the contributory health scheme. Allergies and vasomotor rhinitis with asthma were common in young people. In women aged 19-39 years, obesity with hypothyroidism was common, while men in the same age group had obesity with dyslipidemia. In people aged 60 years and older, essential hypertension with dyslipidemia was the most common. In addition, some associations between diagnoses showed a higher association in people affiliated to the subsidized health scheme, with notable associations with trauma, especially in men. Conclusion Overall, the results provide valuable insights into multimorbidity in the population and highlight inequalities based on sociodemographic factors. Future research should investigate whether the lower prevalence of multimorbidity in vulnerable groups is related to biases in data collection or to underlying inequalities in healthcare access.
Affiliation(s)
- Carolina Saavedra-Moreno
  - Faculty of Engineering, Universidad Nacional de Colombia, Bogotá, Colombia
  - Faculty of Engineering, Universidad de Ibagué, Ibagué, Colombia
- Rafael Hurtado
  - Science Faculty, Universidad Nacional de Colombia, Bogotá, Colombia
- Nubia Velasco
  - School of Management, Universidad de los Andes, Bogotá, Colombia
- Andrea Ramírez
  - Department of Epidemiology, UTHealth Science Center at Houston, School of Public Health, Center for Pediatric Population Health, Department of Pediatrics at McGovern Medical School, Houston, USA
3
Tang AS, Woldemariam SR, Miramontes S, Norgeot B, Oskotsky TT, Sirota M. Harnessing EHR data for health research. Nat Med 2024; 30:1847-1855. [PMID: 38965433 DOI: 10.1038/s41591-024-03074-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 05/17/2024] [Indexed: 07/06/2024]
Abstract
With the increasing availability of rich, longitudinal, real-world clinical data recorded in electronic health records (EHRs) for millions of patients, there is a growing interest in leveraging these records to improve the understanding of human health and disease and translate these insights into clinical applications. However, there is also a need to consider the limitations of these data due to various biases and to understand the impact of missing information. Recognizing and addressing these limitations can inform the design and interpretation of EHR-based informatics studies that avoid confusing or incorrect conclusions, particularly when applied to population or precision medicine. Here we discuss key considerations in the design, implementation and interpretation of EHR-based informatics studies, drawing from examples in the literature across hypothesis generation, hypothesis testing and machine learning applications. We outline the growing opportunities for EHR-based informatics studies, including association studies and predictive modeling, enabled by evolving AI capabilities-while addressing limitations and potential pitfalls to avoid.
Affiliation(s)
- Alice S Tang
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Sarah R Woldemariam
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Silvia Miramontes
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Tomiko T Oskotsky
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Marina Sirota
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
4
Miller M, Jorm L, Partyka C, Burns B, Habig K, Oh C, Immens S, Ballard N, Gallego B. Identifying prehospital trauma patients from ambulance patient care records; comparing two methods using linked data in New South Wales, Australia. Injury 2024; 55:111570. [PMID: 38664086 DOI: 10.1016/j.injury.2024.111570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 04/11/2024] [Accepted: 04/14/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND Linked datasets for trauma system monitoring should ideally follow patients from the prehospital scene to hospital admission and post-discharge. Having a well-defined cohort when using administrative datasets is essential because they must capture the representative population. Unlike hospital electronic health records (EHR), ambulance patient-care records lack access to sources beyond immediate clinical notes. Relying on a limited set of variables to define a study population might result in missed patient inclusion. We aimed to compare two methods of identifying prehospital trauma patients: one using only those documented under a trauma protocol and another incorporating additional data elements from ambulance patient care records. METHODS We analyzed data from six routinely collected administrative datasets from 2015 to 2018, including ambulance patient-care records, aeromedical data, emergency department visits, hospitalizations, rehabilitation outcomes, and death records. Three prehospital trauma cohorts were created: an Extended-T-protocol cohort (patients transported under a trauma protocol and/or patients with prespecified criteria from structured data fields), a T-protocol cohort (only patients documented as transported under a trauma protocol), and a non-T-protocol cohort (the extended-T-protocol population not in the T-protocol cohort). Patient-encounter characteristics, mortality, clinical and post-hospital discharge outcomes were compared. A conservative p-value of 0.01 was considered significant. RESULTS Of 1 038 263 patient-encounters included in the extended-T-population, 814 729 (78.5 %) were transported, with 438 893 (53.9 %) documented as a T-protocol patient. Half (49.6 %) of the non-T-protocol sub-cohort had an International Classification of Diseases, 10th edition injury or external cause code, indicating 79 644 missed patients when a T-protocol-only definition was used.
The non-T-protocol sub-cohort also identified additional patients with intubation, prehospital blood transfusion and positive eFAST. A higher proportion of non-T protocol patients than T-protocol patients were admitted to the ICU (4.6% vs 3.6 %), ventilated (1.8% vs 1.3 %), received in-hospital transfusion (7.9 vs 6.8 %) or died (1.8% vs 1.3 %). Urgent trauma surgery was similar between groups (1.3% vs 1.4 %). CONCLUSION The extended-T-population definition identified 50 % more admitted patients with an ICD-10-AM code consistent with an injury, including patients with severe trauma. Developing an EHR phenotype incorporating multiple data fields of ambulance-transported trauma patients for use with linked data may avoid missing these patients.
Affiliation(s)
- Matthew Miller
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Anesthesia, St George Hospital, Kogarah, NSW 2217 Australia; Centre for Big Data Research in Health at UNSW Sydney, Kensington, NSW 2052, Australia
- Louisa Jorm
  - Foundation Director of the Centre for Big Data Research in Health at UNSW Sydney, Kensington 2052, Australia
- Chris Partyka
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Emergency Medicine, Royal North Shore Hospital, St Leonards, NSW 2065, Australia
- Brian Burns
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Royal North Shore Hospital, St Leonards, NSW 2065, Australia; Faculty of Medicine & Health, University of Sydney, Camperdown, NSW 2050, Australia
- Karel Habig
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia
- Carissa Oh
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Emergency Medicine, St George Hospital, Kogarah, NSW 2217 Australia
- Sam Immens
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia
- Neil Ballard
  - Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Paediatric Emergency Medicine, Sydney Children's Hospital, Randwick, NSW 2031, Australia; Department of Emergency Medicine, Royal Prince Alfred Hospital, Camperdown, NSW 2050, Australia
- Blanca Gallego
  - Clinical analytics and machine learning unit, Centre for Big Data Research in Health at UNSW Sydney, Kensington 2052, Australia
5
Caruana A, Bandara M, Musial K, Catchpoole D, Kennedy PJ. Machine learning for administrative health records: A systematic review of techniques and applications. Artif Intell Med 2023; 144:102642. [PMID: 37783537 DOI: 10.1016/j.artmed.2023.102642] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/26/2022] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 10/04/2023]
Abstract
Machine learning provides many powerful and effective techniques for analysing heterogeneous electronic health records (EHR). Administrative Health Records (AHR) are a subset of EHR collected for administrative purposes, and the use of machine learning on AHRs is a growing subfield of EHR analytics. Existing reviews of EHR analytics emphasise that the data-modality of the EHR limits the breadth of suitable machine learning techniques, and pursuable healthcare applications. Despite emphasising the importance of data modality, the literature fails to analyse which techniques and applications are relevant to AHRs. AHRs contain uniquely well-structured, categorically encoded records which are distinct from other data-modalities captured by EHRs, and they can provide valuable information pertaining to how patients interact with the healthcare system. This paper systematically reviews AHR-based research, analysing 70 relevant studies and spanning multiple databases. We identify and analyse which machine learning techniques are applied to AHRs and which health informatics applications are pursued in AHR-based research. We also analyse how these techniques are applied in pursuit of each application, and identify the limitations of these approaches. We find that while AHR-based studies are disconnected from each other, the use of AHRs in health informatics research is substantial and accelerating. Our synthesis of these studies highlights the utility of AHRs for pursuing increasingly complex and diverse research objectives despite a number of pervading data- and technique-based limitations. Finally, through our findings, we propose a set of future research directions that can enhance the utility of AHR data and machine learning techniques for health informatics research.
Affiliation(s)
- Adrian Caruana
  - Australian Artificial Intelligence Institute, Faculty of Engineering and IT, University of Technology Sydney, Australia
- Madhushi Bandara
  - Australian Artificial Intelligence Institute, Faculty of Engineering and IT, University of Technology Sydney, Australia
- Katarzyna Musial
  - Complex Adaptive Systems Lab, Data Science Institute, Faculty of Engineering and IT, University of Technology Sydney, Australia
- Daniel Catchpoole
  - Australian Artificial Intelligence Institute, Faculty of Engineering and IT, University of Technology Sydney, Australia; Biospecimen Research Services, The Children's Cancer Research Unit, The Children's Hospital at Westmead, Australia
- Paul J Kennedy
  - Australian Artificial Intelligence Institute, Faculty of Engineering and IT, University of Technology Sydney, Australia; Joint Research Centre in AI for Health and Wellness, University of Technology Sydney, Australia, and Ontario Tech University, Canada
6
Rudd J, Igbrude C. A global perspective on data powering responsible AI solutions in health applications. AI and Ethics 2023:1-11. [PMID: 37360149 PMCID: PMC10231277 DOI: 10.1007/s43681-023-00302-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 05/18/2023] [Indexed: 06/28/2023]
Abstract
Healthcare AI solutions have the potential to transform access, quality of care, and improve outcomes for patients globally. This review suggests consideration of a more global perspective, with a particular focus on marginalized communities, during the development of healthcare AI solutions. The review focuses on one aspect (medical applications) to allow technologists to build solutions in today's environment with an understanding of the challenges they face. The following sections explore and discuss the current challenges in the underlying data and AI technology design on healthcare solutions for global deployment. We highlight some of the factors that lead to gaps in data, gaps around regulations for the healthcare sector, and infrastructural challenges in power and network connectivity, as well as lack of social systems for healthcare and education, which pose challenges to the potential universal impacts of such technologies. We recommend using these considerations in developing prototype healthcare AI solutions to better capture the needs of a global population.
7
Woldemariam SR, Tang AS, Oskotsky TT, Yaffe K, Sirota M. Similarities and differences in Alzheimer's dementia comorbidities in racialized populations identified from electronic medical records. Communications Medicine 2023; 3:50. [PMID: 37031271 PMCID: PMC10082816 DOI: 10.1038/s43856-023-00280-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 03/24/2023] [Indexed: 04/10/2023] Open
Abstract
BACKGROUND Alzheimer's dementia (AD) is a neurodegenerative disease that is disproportionately prevalent in racially marginalized individuals. However, due to research underrepresentation, the spectrum of AD-associated comorbidities that increase AD risk or suggest AD treatment disparities in these individuals is not completely understood. We leveraged electronic medical records (EMR) to explore AD-associated comorbidities and disease networks in racialized individuals identified as Asian, Non-Latine Black, Latine, or Non-Latine White. METHODS We performed low-dimensional embedding, differential analysis, and disease network-based analyses of 5664 patients with AD and 11,328 demographically matched controls across two EMR systems and five medical centers, with equal representation of Asian-, Non-Latine Black-, Latine-, and Non-Latine White-identified individuals. For low-dimensional embedding and disease network comparisons, Mann-Whitney U tests or Kruskal-Wallis tests followed by Dunn's tests were used to compare categories. Fisher's exact or chi-squared tests were used for differential analysis. Spearman's rank correlation coefficients were used to compare results between the two EMR systems. RESULTS Here we show that primarily established AD-associated comorbidities, such as essential hypertension and major depressive disorder, are generally similar across racialized populations. However, a few comorbidities, including respiratory diseases, may be significantly associated with AD in Black- and Latine- identified individuals. CONCLUSIONS Our study revealed similarities and differences in AD-associated comorbidities and disease networks between racialized populations. Our approach could be a starting point for hypothesis-driven studies that can further explore the relationship between these comorbidities and AD in racialized populations, potentially identifying interventions that can reduce AD health disparities.
Affiliation(s)
- Sarah R Woldemariam
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Alice S Tang
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
  - School of Medicine, University of California San Francisco, San Francisco, California, USA
- Tomiko T Oskotsky
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
  - Department of Pediatrics, University of California San Francisco, San Francisco, California, USA
- Kristine Yaffe
  - Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, California, USA
- Marina Sirota
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
  - Department of Pediatrics, University of California San Francisco, San Francisco, California, USA
8
Gu JH, Li WQ, Chen CJ. A retrospective cohort study evaluating the improvement of medical records management based on whole-process control. Technol Health Care 2023; 31:1901-1910. [PMID: 37393450 DOI: 10.3233/thc-220863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2023]
Abstract
BACKGROUND Whole-process management is a novel approach widely applied in industry and commerce; however, it is not widely used in the management of medical records in hospitals. OBJECTIVE The purpose of this study is to investigate the application of whole-process control in the administration of a hospital's medical records department to achieve refined management of medical records. METHODS Whole-process control is a management measure that begins with process conception and implementation and includes control over all processes. The control group included medical records that were created prior to the implementation of whole-process control, i.e., those created between June 1 and December 31, 2020. The observation group included medical records that were created after the implementation of whole-process control. The behavior of the medical records staff (in terms of medical record collection, sorting, entry, inquiry, and supply) and the final quality of the medical records (the number of grade-A medical records and their front-page quality) were compared between the two groups, and subjective judgments related to staff satisfaction were reviewed. RESULTS The implementation of whole-process control improved the behavior of the medical records staff. The final quality of the medical records was also improved, as was the job satisfaction of the medical records staff. CONCLUSION Implementing whole-process control improved the management of medical records and quality of medical records.
Affiliation(s)
- Jun-Hua Gu
  - Department of Medical Records and Statistics, Taizhou People's Hospital, Taizhou, China
- Wen-Qi Li
  - Department of Quality and Safety Management Office, Taizhou People's Hospital, Taizhou, China
- Chuan-Jun Chen
  - Department of Burn and Plastic Surgery, Taizhou People's Hospital, Taizhou, China
9
Kuan V, Denaxas S, Patalay P, Nitsch D, Mathur R, Gonzalez-Izquierdo A, Sofat R, Partridge L, Roberts A, Wong ICK, Hingorani M, Chaturvedi N, Hemingway H, Hingorani AD. Identifying and visualising multimorbidity and comorbidity patterns in patients in the English National Health Service: a population-based study. Lancet Digit Health 2023; 5:e16-e27. [PMID: 36460578 DOI: 10.1016/s2589-7500(22)00187-x] [Citation(s) in RCA: 68] [Impact Index Per Article: 34.0] [Received: 03/01/2022] [Revised: 09/10/2022] [Accepted: 09/19/2022] [Indexed: 12/03/2022]
Abstract
BACKGROUND Globally, there is a paucity of multimorbidity and comorbidity data, especially for minority ethnic groups and younger people. We estimated the frequency of common disease combinations and identified non-random disease associations for all ages in a multiethnic population. METHODS In this population-based study, we examined multimorbidity and comorbidity patterns stratified by ethnicity or race, sex, and age for 308 health conditions using electronic health records from individuals included on the Clinical Practice Research Datalink linked with the Hospital Episode Statistics admitted patient care dataset in England. We included individuals who were older than 1 year and who had been registered for at least 1 year in a participating general practice during the study period (between April 1, 2010, and March 31, 2015). We identified the most common combinations of conditions and comorbidities for index conditions. We defined comorbidity as the accumulation of additional conditions to an index condition over an individual's lifetime. We used network analysis to identify conditions that co-occurred more often than expected by chance. We developed online interactive tools to explore multimorbidity and comorbidity patterns overall and by subgroup based on ethnicity, sex, and age. FINDINGS We collected data for 3 872 451 eligible patients, of whom 1 955 700 (50·5%) were women and girls, 1 916 751 (49·5%) were men and boys, 2 666 234 (68·9%) were White, 155 435 (4·0%) were south Asian, and 98 815 (2·6%) were Black. We found that a higher proportion of boys aged 1-9 years (132 506 [47·8%] of 277 158) had two or more diagnosed conditions than did girls in the same age group (106 982 [40·3%] of 265 179), but more women and girls were diagnosed with multimorbidity than were boys aged 10 years and older and men (1 361 232 [80·5%] of 1 690 521 vs 1 161 308 [70·8%] of 1 639 593). 
White individuals (2 097 536 [78·7%] of 2 666 234) were more likely to be diagnosed with two or more conditions than were Black (59 339 [60·1%] of 98 815) or south Asian individuals (93 617 [60·2%] of 155 435). Depression commonly co-occurred with anxiety, migraine, obesity, atopic conditions, deafness, soft-tissue disorders, and gastrointestinal disorders across all subgroups. Heart failure often co-occurred with hypertension, atrial fibrillation, osteoarthritis, stable angina, myocardial infarction, chronic kidney disease, type 2 diabetes, and chronic obstructive pulmonary disease. Spinal fractures were most strongly non-randomly associated with malignancy in Black individuals, but with osteoporosis in White individuals. Hypertension was most strongly associated with kidney disorders in those aged 20-29 years, but with dyslipidaemia, obesity, and type 2 diabetes in individuals aged 40 years and older. Breast cancer was associated with different comorbidities in individuals from different ethnic groups. Asthma was associated with different comorbidities between males and females. Bipolar disorder was associated with different comorbidities in younger age groups compared with older age groups. INTERPRETATION Our findings and interactive online tools are a resource for: patients and their clinicians, to prevent and detect comorbid conditions; research funders and policy makers, to redesign service provision, training priorities, and guideline development; and biomedical researchers and manufacturers of medicines, to provide leads for research into common or sequential pathways of disease and inform the design of clinical trials. FUNDING UK Research and Innovation, Medical Research Council, National Institute for Health and Care Research, Department of Health and Social Care, Wellcome Trust, British Heart Foundation, and The Alan Turing Institute.
Affiliation(s)
- Valerie Kuan
- Institute of Health Informatics, University College London, London, UK; Health Data Research UK, University College London, London, UK.
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK; Health Data Research UK, University College London, London, UK; UCL BHF Research Accelerator, University College London, London, UK; Alan Turing Institute, London, UK; University College London Hospitals NIHR Biomedical Research Centre, London, UK; British Heart Foundation Data Science Centre, HDR UK, London, UK
| | - Praveetha Patalay
- Centre for Longitudinal Studies, University College London, London, UK; MRC Unit for Lifelong Health and Ageing at UCL, University College London, London, UK
| | - Dorothea Nitsch
- Department of Non-Communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
| | - Rohini Mathur
- Department of Non-Communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK; Centre for Primary Care, Wolfson Institute of Primary Care, Queen Mary University of London, London, UK
| | - Arturo Gonzalez-Izquierdo
- Institute of Health Informatics, University College London, London, UK; Health Data Research UK, University College London, London, UK
| | - Reecha Sofat
- Department of Pharmacology and Therapeutics, University of Liverpool, Liverpool, UK; British Heart Foundation Data Science Centre, HDR UK, London, UK
| | - Linda Partridge
- Institute of Healthy Ageing, Department of Genetics, Evolution and Environment, University College London, London, UK; Max Planck Institute for Biology of Ageing, Cologne, Germany
| | - Amanda Roberts
- Nottingham Support Group for Carers of Children with Eczema, Nottingham, UK
| | - Ian C K Wong
- School of Pharmacy, University College London, London, UK; Centre for Safe Medication Practice and Research, Department of Pharmacology and Pharmacy, LKS Faculty of Medicine, University of Hong Kong, Hong Kong Special Administrative Region, China; Aston Pharmacy School, Aston University, Birmingham, UK
| | | | - Nishi Chaturvedi
- MRC Unit for Lifelong Health and Ageing at UCL, University College London, London, UK
| | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK; Health Data Research UK, University College London, London, UK; University College London Hospitals NIHR Biomedical Research Centre, London, UK
| | - Aroon D Hingorani
- UCL BHF Research Accelerator, University College London, London, UK; Institute of Cardiovascular Science, University College London, London, UK; University College London Hospitals NIHR Biomedical Research Centre, London, UK
| |
|
10
|
Teagle WL, Norris ET, Rishishwar L, Nagar SD, Jordan IK, Mariño-Ramírez L. Comorbidities and ethnic health disparities in the UK biobank. JAMIA Open 2022; 5:ooac057. [PMID: 36313969 PMCID: PMC9272510 DOI: 10.1093/jamiaopen/ooac057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 06/15/2022] [Accepted: 06/24/2022] [Indexed: 11/15/2022] Open
Abstract
Objective: The goal of this study was to investigate the relationship between comorbidities and ethnic health disparities in a diverse, cosmopolitan population. Materials and Methods: We used the UK Biobank (UKB), a large prospective cohort study of the UK population. Study participants self-identified with 1 of 5 ethnic groups and participant comorbidities were characterized using the 31 disease categories captured by the Elixhauser Comorbidity Index. Ethnic disparities in comorbidities were quantified as the extent to which disease prevalence within categories varies across ethnic groups and the extent to which pairs of comorbidities co-occur within ethnic groups. Disease-risk factor comorbidity pairs were identified where one comorbidity is known to be a risk factor for a co-occurring comorbidity. Results: The Asian ethnic group shows the greatest average number of comorbidities, followed by the Black and then White groups. The Chinese group shows the lowest average number of comorbidities. Comorbidity prevalence varies significantly among the ethnic groups for almost all disease categories, with diabetes and hypertension showing the largest differences across groups. Diabetes and hypertension both show ethnic-specific comorbidities that may contribute to the observed disease prevalence disparities. Discussion: These results underscore the extent to which comorbidities vary among ethnic groups and reveal group-specific disease comorbidities that may underlie ethnic health disparities. Conclusion: The study of comorbidity distributions across ethnic groups can be used to inform targeted group-specific interventions to reduce ethnic health disparities.
Affiliation(s)
- Whitney L Teagle
- National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, Maryland, USA
| | - Emily T Norris
- National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, Maryland, USA; Applied Bioinformatics Laboratory, Atlanta, Georgia, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA; PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
| | - Lavanya Rishishwar
- National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, Maryland, USA; Applied Bioinformatics Laboratory, Atlanta, Georgia, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA; PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
| | - Shashwat Deepali Nagar
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA; PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
| | - I King Jordan
- Applied Bioinformatics Laboratory, Atlanta, Georgia, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA; PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
| | - Leonardo Mariño-Ramírez
- National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, Maryland, USA; PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
| |
|
11
|
Monchka BA, Leung CK, Nickel NC, Lix LM. The effect of disease co-occurrence measurement on multimorbidity networks: a population-based study. BMC Med Res Methodol 2022; 22:165. [PMID: 35676621 PMCID: PMC9175465 DOI: 10.1186/s12874-022-01607-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/02/2022] [Accepted: 04/15/2022] [Indexed: 11/29/2022] Open
Abstract
Background: Network analysis, a technique for describing relationships, can provide insights into patterns of co-occurring chronic health conditions. The effect that co-occurrence measurement has on disease network structure and resulting inferences has not been well studied. The purpose of the study was to compare structural differences among multimorbidity networks constructed using different co-occurrence measures. Methods: A retrospective cohort study was conducted using four fiscal years of administrative health data (2015/16 – 2018/19) from the province of Manitoba, Canada (population 1.5 million). Chronic conditions were identified using diagnosis codes from electronic records of physician visits, surgeries, and inpatient hospitalizations, and grouped into categories using the Johns Hopkins Adjusted Clinical Group (ACG) System. Pairwise disease networks were separately constructed using each of seven co-occurrence measures: lift, relative risk, phi, Jaccard, cosine, Kulczynski, and joint prevalence. Centrality analysis was limited to the top 20 central nodes, with degree centrality used to identify potentially influential chronic conditions. Community detection was used to identify disease clusters. Similarities in community structure between networks were measured using the adjusted Rand index (ARI). Network edges were described using disease prevalence categorized as low (< 1%), moderate (1 to < 7%), and high (≥7%). Network complexity was measured using network density and frequencies of nodes and edges. Results: Relative risk and lift highlighted co-occurrences between pairs of low prevalence health conditions. Kulczynski emphasized relationships between high and low prevalence conditions. Joint prevalence focused on highly prevalent conditions. Phi, Jaccard, and cosine emphasized associations involving moderately prevalent conditions. Co-occurrence measurement differences significantly affected the number and structure of identified disease clusters. When limiting the number of edges to produce visually interpretable graphs, networks had significant dissimilarity in the percentage of co-occurrence relationships in common, and in their selection of the highest-degree nodes. Conclusions: Multimorbidity network analyses are sensitive to disease co-occurrence measurement. Co-occurrence measures should be selected considering their intrinsic properties, research objectives, and the health condition prevalence relationships of greatest interest. Researchers should consider conducting sensitivity analyses using different co-occurrence measures. Supplementary Information: The online version contains supplementary material available at 10.1186/s12874-022-01607-8.
Affiliation(s)
- Barret A Monchka
- Department of Community Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada; George and Fay Yee Centre for Healthcare Innovation, University of Manitoba, 3rd Floor, 753 McDermot Ave, Winnipeg, Manitoba, R3E 0T6, Canada
| | - Carson K Leung
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Nathan C Nickel
- Department of Community Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada; Manitoba Centre for Health Policy, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Lisa M Lix
- Department of Community Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada; George and Fay Yee Centre for Healthcare Innovation, University of Manitoba, 3rd Floor, 753 McDermot Ave, Winnipeg, Manitoba, R3E 0T6, Canada
| |
|
12
|
Tang AS, Oskotsky T, Havaldar S, Mantyh WG, Bicak M, Solsberg CW, Woldemariam S, Zeng B, Hu Z, Oskotsky B, Dubal D, Allen IE, Glicksberg BS, Sirota M. Deep phenotyping of Alzheimer's disease leveraging electronic medical records identifies sex-specific clinical associations. Nat Commun 2022; 13:675. [PMID: 35115528 PMCID: PMC8814236 DOI: 10.1038/s41467-022-28273-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/15/2021] [Accepted: 01/18/2022] [Indexed: 12/14/2022] Open
Abstract
Alzheimer's Disease (AD) is a neurodegenerative disorder that is still not fully understood. Sex modifies AD vulnerability, but the reasons for this are largely unknown. We utilize two independent electronic medical record (EMR) systems across 44,288 patients to perform deep clinical phenotyping and network analysis to gain insight into clinical characteristics and sex-specific clinical associations in AD. Embeddings and network representation of patient diagnoses demonstrate greater comorbidity interactions in AD in comparison to matched controls. Enrichment analysis identifies multiple known and new diagnostic, medication, and lab result associations across the whole cohort and in a sex-stratified analysis. With this data-driven method of phenotyping, we can represent AD complexity and generate hypotheses of clinical factors that can be followed up for further diagnostic and predictive analyses, mechanistic understanding, or drug repurposing and therapeutic approaches.
Affiliation(s)
- Alice S Tang
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Graduate Program in Bioengineering, UCSF, San Francisco, CA, USA
- School of Medicine, UCSF, San Francisco, CA, USA
| | - Tomiko Oskotsky
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Shreyas Havaldar
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - William G Mantyh
- Department of Neurology, University of Minnesota School of Medicine, Minneapolis, MN, USA
| | - Mesude Bicak
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Caroline Warly Solsberg
- Pharmaceutical Sciences and Pharmacogenomics, UCSF, San Francisco, CA, USA
- Department of Neurology and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94158, USA
- Memory and Aging Center, UCSF, San Francisco, CA, USA
| | - Sarah Woldemariam
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Billy Zeng
- School of Medicine, UCSF, San Francisco, CA, USA
| | - Zicheng Hu
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Boris Oskotsky
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Dena Dubal
- Department of Neurology and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - Isabel E Allen
- Department of Epidemiology and Biostatistics, UCSF, San Francisco, CA, USA
| | - Benjamin S Glicksberg
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Marina Sirota
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| |
|
13
|
Yu J, Li Y, Zheng Z, Jia H, Cao P, Qiangba Y, Yu X. Analysis of multimorbidity networks associated with different factors in Northeast China: a cross-sectional analysis. BMJ Open 2021; 11:e051050. [PMID: 34732482 PMCID: PMC8572406 DOI: 10.1136/bmjopen-2021-051050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVES: This study aimed to identify and study the associations and co-occurrence of multimorbidity, and assessed the associations of diseases with sex, age and hospitalisation duration. DESIGN: Cross-sectional. SETTING: 15 general hospitals in Jilin Province, China. PARTICIPANTS: A total of 431 295 inpatients were enrolled through a cross-sectional study in Jilin Province, China. PRIMARY OUTCOME MEASURES: The complex relationships of multimorbidity were presented as weighted networks. RESULTS: The distributions of the numbers of diseases differed significantly by sex, age and hospitalisation duration (p<0.001). Cerebrovascular diseases (CD), hypertensive diseases (HyD), ischaemic heart diseases (IHD) and other forms of heart disease (OFHD) showed the highest weights in the multimorbidity networks. The connections between different sexes or hospitalisation duration and diseases were similar, while those between different age groups and diseases were different. CONCLUSIONS: CD, HyD, IHD and OFHD were the central points of disease clusters and directly or indirectly related to other diseases or factors. Thus, effective interventions for these diseases should be adopted. Furthermore, different intervention strategies should be developed according to multimorbidity patterns in different age groups.
Affiliation(s)
- Jianxing Yu
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| | - Yingying Li
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| | - Zhou Zheng
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| | - Huanhuan Jia
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| | - Peng Cao
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| | - Yuzhen Qiangba
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| | - Xihe Yu
- Social Medicine and Health Service Management, School of Public Health, Jilin University, Changchun, China
| |
|
14
|
Affiliation(s)
- Alice Tang
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA; Bioengineering Graduate Program, University of California San Francisco, San Francisco, CA, USA
| | - Tomiko Oskotsky
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Marina Sirota
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| |
|
15
|
Hong JC, Hauser ER, Redding TS, Sims KJ, Gellad ZF, O'Leary MC, Hyslop T, Madison AN, Qin X, Weiss D, Bullard AJ, Williams CD, Sullivan BA, Lieberman D, Provenzale D. Characterizing chronological accumulation of comorbidities in healthy veterans: a computational approach. Sci Rep 2021; 11:8104. [PMID: 33854078 PMCID: PMC8046765 DOI: 10.1038/s41598-021-85546-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2020] [Accepted: 12/14/2020] [Indexed: 12/13/2022] Open
Abstract
Understanding patient accumulation of comorbidities can facilitate healthcare strategy and personalized preventative care. We applied a directed network graph to electronic health record (EHR) data and characterized comorbidities in a cohort of healthy veterans undergoing screening colonoscopy. The Veterans Affairs Cooperative Studies Program #380 was a prospective longitudinal study of screening and surveillance colonoscopy. We identified initial instances of three-digit ICD-9 diagnoses for participants with at least 5 years of linked EHR history (October 1999 to December 2015). For diagnoses affecting at least 10% of patients, we calculated pairwise chronological relative risk (RR). iGraph was used to produce directed graphs of comorbidities with RR > 1, as well as summary statistics, key diseases, and communities. A directed graph based on 2210 patients visualized longitudinal development of comorbidities. Top hub (preceding) diseases included ischemic heart disease, inflammatory and toxic neuropathy, and diabetes. Top authority (subsequent) diagnoses were acute kidney failure and hypertensive chronic kidney failure. Four communities of correlated comorbidities were identified. Close analysis of top hub and authority diagnoses demonstrated known relationships, correlated sequelae, and novel hypotheses. Directed network graphs portray chronologic comorbidity relationships. We identified relationships between comorbid diagnoses in this aging veteran cohort. This may direct healthcare prioritization and personalized care.
Affiliation(s)
- Julian C Hong
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Elizabeth R Hauser
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
| | - Thomas S Redding
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA
| | - Kellie J Sims
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA
| | - Ziad F Gellad
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Medicine, Duke University, Durham, NC, USA
| | - Meghan C O'Leary
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA
| | - Terry Hyslop
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
| | - Ashton N Madison
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA
| | - Xuejun Qin
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
| | - David Weiss
- Cooperative Studies Program Coordinating Center, Perry Point VA Medical Center, Perry Point, MD, USA
| | - A Jasmine Bullard
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA
| | - Christina D Williams
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Medicine, Duke University, Durham, NC, USA
| | - Brian A Sullivan
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Medicine, Duke University, Durham, NC, USA
| | - David Lieberman
- VA Portland Health Care System, Portland, OR, USA; Oregon Health and Science University, Portland, OR, USA
| | - Dawn Provenzale
- Cooperative Studies Program Epidemiology Center-Durham, Durham VA Health Care System, Durham, NC, USA; Department of Medicine, Duke University, Durham, NC, USA
| |
|
16
|
Webster AJ, Gaitskell K, Turnbull I, Cairns BJ, Clarke R. Characterisation, identification, clustering, and classification of disease. Sci Rep 2021; 11:5405. [PMID: 33686097 PMCID: PMC7940639 DOI: 10.1038/s41598-021-84860-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/21/2020] [Accepted: 02/17/2021] [Indexed: 12/25/2022] Open
Abstract
The importance of quantifying the distribution and determinants of multimorbidity has prompted novel data-driven classifications of disease. Applications have included improved statistical power and refined prognoses for a range of respiratory, infectious, autoimmune, and neurological diseases, with studies using molecular information, age of disease incidence, and sequences of disease onset ("disease trajectories") to classify disease clusters. Here we consider whether easily measured risk factors such as height and BMI can effectively characterise diseases in UK Biobank data, combining established statistical methods in new but rigorous ways to provide clinically relevant comparisons and clusters of disease. Over 400 common diseases were selected for analysis using clinical and epidemiological criteria, and conventional proportional hazards models were used to estimate associations with 12 established risk factors. Several diseases had strongly sex-dependent associations of disease risk with BMI. Importantly, a large proportion of diseases affecting both sexes could be identified by their risk factors, and equivalent diseases tended to cluster adjacently. These included 10 diseases presently classified as "Symptoms, signs, and abnormal clinical and laboratory findings, not elsewhere classified". Many clusters are associated with a shared, known pathogenesis, others suggest likely but presently unconfirmed causes. The specificity of associations and shared pathogenesis of many clustered diseases provide a new perspective on the interactions between biological pathways, risk factors, and patterns of disease such as multimorbidity.
Affiliation(s)
- A J Webster
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - K Gaitskell
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Nuffield Division of Clinical Laboratory Sciences, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - I Turnbull
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - B J Cairns
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - R Clarke
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| |
|
17
|
Caufield JH, Sigdel D, Fu J, Choi H, Guevara-Gonzalez V, Wang D, Ping P. Cardiovascular Informatics: building a bridge to data harmony. Cardiovasc Res 2021; 118:732-745. [PMID: 33751044 DOI: 10.1093/cvr/cvab067] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/01/2020] [Accepted: 03/03/2021] [Indexed: 12/11/2022] Open
Abstract
The search for new strategies for better understanding cardiovascular disease is a constant one, spanning multitudinous types of observations and studies. A comprehensive characterization of each disease state and its biomolecular underpinnings relies upon insights gleaned from extensive information collection of various types of data. Researchers and clinicians in cardiovascular biomedicine repeatedly face questions regarding which types of data may best answer their questions, how to integrate information from multiple datasets of various types, and how to adapt emerging advances in machine learning and/or artificial intelligence to their needs in data processing. Frequently lauded as a field with great practical and translational potential, the interface between biomedical informatics and cardiovascular medicine is challenged with staggeringly massive datasets. Successful application of computational approaches to decode these complex and gigantic amounts of information becomes an essential step toward realizing the desired benefits. In this review, we examine recent efforts to adapt informatics strategies to cardiovascular biomedical research: automated information extraction and unification of multifaceted -omics data. We discuss how and why this interdisciplinary space of Cardiovascular Informatics is particularly relevant to and supportive of current experimental and clinical research. We describe in detail how open data sources and methods can drive discovery while demanding few initial resources, an advantage afforded by widespread availability of cloud computing-driven platforms. Subsequently, we provide examples of how interoperable computational systems facilitate exploration of data from multiple sources, including both consistently-formatted structured data and unstructured data. Taken together, these approaches for achieving data harmony enable molecular phenotyping of cardiovascular (CV) diseases and unification of cardiovascular knowledge.
Affiliation(s)
- J Harry Caufield
- NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Los Angeles, CA, 90095, USA; Departments of Physiology at UCLA School of Medicine, Los Angeles, CA, 90095, USA
| | - Dibakar Sigdel
- NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Los Angeles, CA, 90095, USA; Departments of Physiology at UCLA School of Medicine, Los Angeles, CA, 90095, USA
| | - John Fu
- NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Los Angeles, CA, 90095, USA
| | - Howard Choi
- NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Los Angeles, CA, 90095, USA
| | - Vladimir Guevara-Gonzalez
- NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Los Angeles, CA, 90095, USA
| | - Ding Wang
- Departments of Physiology at UCLA School of Medicine, Los Angeles, CA, 90095, USA
| | - Peipei Ping
- NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Los Angeles, CA, 90095, USA; Departments of Physiology at UCLA School of Medicine, Los Angeles, CA, 90095, USA; Department of Medicine (Cardiology) at UCLA School of Medicine, Los Angeles, CA, 90095, USA; Bioinformatics and Medical Informatics, Los Angeles, CA, 90095, USA; Scalable Analytics Institute (ScAi) at UCLA School of Engineering, Los Angeles, CA, 90095, USA
| |
|
18
|
Kalgotra P, Sharda R, Croff JM. Examining multimorbidity differences across racial groups: a network analysis of electronic medical records. Sci Rep 2020; 10:13538. [PMID: 32782346 PMCID: PMC7419498 DOI: 10.1038/s41598-020-70470-8] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 03/13/2020] [Accepted: 07/22/2020] [Indexed: 02/07/2023] Open
Abstract
Health disparities across ethnic or racial groups are typically examined through a single behavior at a time. Syndemics and multimorbidity health disparities have not been well examined by race. In this study, we examine health disparities by identifying the networks of multimorbidities among individuals from seven population groups based on race, including White, African American, Asian, Hispanic, Native American, Bi- or Multi-racial and Pacific Islander. We examined a large electronic medical record (EMR) containing health records of more than 18.7 million patients and created multimorbidity networks considering their lifetime history from medical records in order to compare the network properties among seven population groups. In addition, the networks at organ system level depicting the relationship among disorders belonging to different organ systems are also compared. Our macro analysis at the organ level indicates that African Americans have a stronger multimorbidity network, followed by Whites and Native Americans. The networks of Asians and Hispanics are sparse. Specifically, the relationship of infectious and parasitic disorders with respiratory, circulatory and genitourinary system disorders is stronger among African Americans than others. On the other hand, the relationship of mental disorders with respiratory, musculoskeletal system and connective tissue disorders is more prevalent in Whites. Other similar disparities are discussed. Recognition and explanation of such differences in multimorbidities can inform public health policies as well as clinical decisions. Our multimorbidity network analysis identifies specific differences in diagnoses among different population groups, and presents questions for biological, behavioral, clinical, social science, and policy research.
Affiliation(s)
- Pankush Kalgotra
- Raymond J. Harbert College of Business, Auburn University, Auburn, AL, USA
| | - Ramesh Sharda
- Spears School of Business, Oklahoma State University, Stillwater, OK, USA
| | - Julie M Croff
- Center for Health Sciences, National Center for Wellness and Recovery, Oklahoma State University, Tulsa, USA
| |
|
19
|
Crowley RJ, Tan YJ, Ioannidis JPA. Empirical assessment of bias in machine learning diagnostic test accuracy studies. J Am Med Inform Assoc 2020; 27:1092-1101. [PMID: 32548642 PMCID: PMC7647361 DOI: 10.1093/jamia/ocaa075] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/17/2020] [Revised: 04/12/2020] [Accepted: 04/24/2020] [Indexed: 12/29/2022] Open
Abstract
OBJECTIVE: Machine learning (ML) diagnostic tools have significant potential to improve health care. However, methodological pitfalls may affect diagnostic test accuracy studies used to appraise such tools. We aimed to evaluate the prevalence and reporting of design characteristics within the literature. Further, we sought to empirically assess whether design features may be associated with different estimates of diagnostic accuracy. MATERIALS AND METHODS: We systematically retrieved 2 × 2 tables (n = 281) describing the performance of ML diagnostic tools, derived from 114 publications in 38 meta-analyses, from PubMed. Data extracted included test performance, sample sizes, and design features. A mixed-effects metaregression was run to quantify the association between design features and diagnostic accuracy. RESULTS: Participant ethnicity and blinding in test interpretation were unreported in 90% and 60% of studies, respectively. Reporting was occasionally lacking for rudimentary characteristics such as study design (28% unreported). Internal validation without appropriate safeguards was used in 44% of studies. Several design features were associated with larger estimates of accuracy, including having unreported (relative diagnostic odds ratio [RDOR], 2.11; 95% confidence interval [CI], 1.43-3.1) or case-control study designs (RDOR, 1.27; 95% CI, 0.97-1.66), and recruiting participants for the index test (RDOR, 1.67; 95% CI, 1.08-2.59). DISCUSSION: Significant underreporting of experimental details was present. Study design features may affect estimates of diagnostic performance in the ML diagnostic test accuracy literature. CONCLUSIONS: The present study identifies pitfalls that threaten the validity, generalizability, and clinical value of ML diagnostic tools and provides recommendations for improvement.
Affiliation(s)
- Ryan J Crowley
  - Meta-Research Innovation Center at Stanford, Stanford University, Stanford, California, USA
  - Department of Bioengineering, Stanford School of Engineering, Stanford University, Stanford, California, USA
- Yuan Jin Tan
  - Meta-Research Innovation Center at Stanford, Stanford University, Stanford, California, USA
  - Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California, USA
- John P A Ioannidis
  - Meta-Research Innovation Center at Stanford, Stanford University, Stanford, California, USA
  - Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California, USA
  - Stanford Prevention Research Center, Department of Medicine, Stanford Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford Medicine, Stanford University, Stanford, California, USA
  - Department of Statistics, School of Humanities and Science, Stanford University, Stanford, California, USA

20
Brunson JC, Agresta TP, Laubenbacher RC. Sensitivity of comorbidity network analysis. JAMIA Open 2020; 3:94-103. [PMID: 32607491 PMCID: PMC7309234 DOI: 10.1093/jamiaopen/ooz067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/30/2019] [Revised: 11/12/2019] [Accepted: 12/10/2019] [Indexed: 01/10/2023] Open
Abstract
OBJECTIVES Comorbidity network analysis (CNA) is a graph-theoretic approach to systems medicine based on associations revealed from disease co-occurrence data. Researchers have used CNA to explore epidemiological patterns, differentiate populations, characterize disorders, and more; but these techniques have not been comprehensively evaluated. Our objectives were to assess the stability of common CNA techniques. MATERIALS AND METHODS We obtained seven co-occurrence data sets, most from previous CNAs, coded using several ontologies. We constructed comorbidity networks under various modeling procedures and calculated summary statistics and centrality rankings. We used regression, ordination, and rank correlation to assess these properties' sensitivity to the source of data and construction parameters. RESULTS Most summary statistics were robust to variation in link determination but somewhat sensitive to the association measure. Some discriminated more effectively than others among networks constructed from different data sets. Centrality rankings, especially among hubs, were somewhat sensitive to link determination and highly sensitive to ontology. As multivariate models incorporated additional effects, comorbid associations among low-prevalence disorders weakened while those between high-prevalence disorders shifted negative. DISCUSSION Pairwise CNA techniques are generally robust, but some analyses are highly sensitive to certain parameters. Multivariate approaches expose additional conceptual and technical limitations to the usual pairwise approach. CONCLUSION We conclude with a set of recommendations we believe will help CNA researchers improve the robustness of results and the potential of follow-up research.
Affiliation(s)
- Jason Cory Brunson
  - Center for Quantitative Medicine, UConn Health, 263 Farmington Ave, Farmington, Connecticut 06030-6033, USA
- Thomas P Agresta
  - Center for Quantitative Medicine, UConn Health, 263 Farmington Ave, Farmington, Connecticut 06030-6033, USA
  - Department of Family Medicine, UConn Health, 263 Farmington Ave, Farmington, Connecticut 06030-6033, USA
- Reinhard C Laubenbacher
  - Center for Quantitative Medicine, UConn Health, 263 Farmington Ave, Farmington, Connecticut 06030-6033, USA
  - The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr, Farmington, CT 06032, USA

21
Glicksberg BS, Amadori L, Akers NK, Sukhavasi K, Franzén O, Li L, Belbin GM, Ayers KL, Shameer K, Badgeley MA, Johnson KW, Readhead B, Darrow BJ, Kenny EE, Betsholtz C, Ermel R, Skogsberg J, Ruusalepp A, Schadt EE, Dudley JT, Ren H, Kovacic JC, Giannarelli C, Li SD, Björkegren JLM, Chen R. Integrative analysis of loss-of-function variants in clinical and genomic data reveals novel genes associated with cardiovascular traits. BMC Med Genomics 2019; 12:108. [PMID: 31345219 PMCID: PMC6657044 DOI: 10.1186/s12920-019-0542-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/24/2022] Open
Abstract
Background Genetic loss-of-function variants (LoFs) associated with disease traits are increasingly recognized as critical evidence for the selection of therapeutic targets. We integrated the analysis of genetic and clinical data from 10,511 individuals in the Mount Sinai BioMe Biobank to identify genes with LoFs significantly associated with cardiovascular disease (CVD) traits, and used RNA-sequence data of seven metabolic and vascular tissues isolated from 600 CVD patients in the Stockholm-Tartu Atherosclerosis Reverse Network Engineering Task (STARNET) study for validation. We also carried out in vitro functional studies of several candidate genes, and in vivo studies of one gene. Results We identified LoFs in 433 genes significantly associated with at least one of 10 major CVD traits. Next, we used RNA-sequence data from the STARNET study to validate 115 of the 433 LoF-harboring genes in that their expression levels were concordantly associated with corresponding CVD traits. Together with the documented hepatic lipid-lowering gene, APOC3, the expression levels of six additional liver LoF-genes were positively associated with levels of plasma lipids in STARNET. Candidate LoF-genes were subjected to gene silencing in HepG2 cells with marked overall effects on cellular LDLR, levels of triglycerides and on secreted APOB100 and PCSK9. In addition, we identified novel LoFs in DGAT2 associated with lower plasma cholesterol and glucose levels in BioMe that were also confirmed in STARNET, and showed that a selective DGAT2 inhibitor in C57BL/6 mice not only significantly lowered fasting glucose levels but also affected body weight. Conclusion In sum, by integrating genetic and electronic medical record data, and leveraging one of the world’s largest human RNA-sequence datasets (STARNET), we identified known and novel CVD-trait related genes that may serve as targets for CVD therapeutics and as such merit further investigation.
Affiliation(s)
- Benjamin S Glicksberg
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, 94158, CA, USA
- Letizia Amadori
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Cardiovascular Research Center and Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Nicholas K Akers
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Katyayani Sukhavasi
  - Department of Pathophysiology, Institute of Biomedicine and Translation Medicine, University of Tartu, Biomeedikum, Ravila 19, 50411, Tartu, Estonia
- Oscar Franzén
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Clinical Gene Networks AB, Jungfrugatan 10, 114 44, Stockholm, Sweden
  - Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Novum, 14157, Huddinge, Sweden
- Li Li
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Gillian M Belbin
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Kristin L Ayers
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Sema4, a Mount Sinai venture, Stamford, CT, 06902, USA
- Khader Shameer
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Marcus A Badgeley
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Kipp W Johnson
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Ben Readhead
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Bruce J Darrow
  - Cardiovascular Research Center and Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Eimear E Kenny
  - Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Christer Betsholtz
  - Department of Immunology, Genetics and Pathology, Uppsala University, 751 85, Uppsala, Sweden
- Raili Ermel
  - Department of Cardiac Surgery, Tartu University Hospital, 1a Ludwig Puusepa Street, 50406, Tartu, Estonia
- Josefin Skogsberg
  - Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset Huddinge, 141 86, Stockholm, Sweden
- Arno Ruusalepp
  - Clinical Gene Networks AB, Jungfrugatan 10, 114 44, Stockholm, Sweden
  - Department of Immunology, Genetics and Pathology, Uppsala University, 751 85, Uppsala, Sweden
- Eric E Schadt
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Clinical Gene Networks AB, Jungfrugatan 10, 114 44, Stockholm, Sweden
  - Sema4, a Mount Sinai venture, Stamford, CT, 06902, USA
- Joel T Dudley
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - The Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Department of Health Policy and Research, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Hongxia Ren
  - Department of Pediatrics, Herman B Wells Center for Pediatric Research, Center for Diabetes and Metabolic Diseases, Stark Neurosciences Research Institute, Indiana University, 635 Barnhill Dr., MS2049, Indianapolis, IN, 46202, USA
- Jason C Kovacic
  - Cardiovascular Research Center and Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Chiara Giannarelli
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Cardiovascular Research Center and Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Shuyu D Li
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Sema4, a Mount Sinai venture, Stamford, CT, 06902, USA
- Johan L M Björkegren
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Department of Pathophysiology, Institute of Biomedicine and Translation Medicine, University of Tartu, Biomeedikum, Ravila 19, 50411, Tartu, Estonia
  - Clinical Gene Networks AB, Jungfrugatan 10, 114 44, Stockholm, Sweden
  - Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset Huddinge, 141 86, Stockholm, Sweden
- Rong Chen
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
  - Sema4, a Mount Sinai venture, Stamford, CT, 06902, USA

22
Lademann M, Lademann M, Boeck Jensen A, Brunak S. Incorporating symptom data in longitudinal disease trajectories for more detailed patient stratification. Int J Med Inform 2019; 129:107-113. [PMID: 31445244 DOI: 10.1016/j.ijmedinf.2019.06.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/27/2018] [Revised: 03/13/2019] [Accepted: 06/04/2019] [Indexed: 11/26/2022]
Abstract
OBJECTIVE Use symptoms to stratify temporal disease trajectories. MATERIALS AND METHODS We use data from the Danish National Patient Registry to stratify temporal disease pairs by the symptom distributions they are associated with. The underlying data comprise 6.6 million patients collectively assigned 7.5 million symptoms from chapter XVIII of the WHO International Classification of Diseases version 10 terminology. RESULTS We stratify 33 disease pairs into 67 temporal disease-symptom-disease trajectories from three main diagnoses (two diabetes subtypes and COPD), where the symptom significantly changes the risk of developing the subsequent diseases. We combine these trajectories into three temporal disease networks, one for each main diagnosis. We confirm apparent relations between diseases and symptoms and discover that multiple symptoms decrease the risk for diabetes progression. CONCLUSION Symptoms can be used to stratify disease trajectories, and we suggest that this approach can be applied to temporal disease trajectories systematically using structured claims data. The method can be extended to also use text-mined symptoms from unstructured data in health records.
Affiliation(s)
- Martin Lademann
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, DK-2200, Copenhagen, Denmark
  - Department of Pulmonary and Infection Medicine, Nordsjællands Hospital, DK-3400, Hillerød, Denmark
- Mette Lademann
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, DK-2200, Copenhagen, Denmark
- Anders Boeck Jensen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, DK-2200, Copenhagen, Denmark
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029-5674, USA
- Søren Brunak
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, DK-2200, Copenhagen, Denmark

23
Glicksberg BS, Johnson KW, Dudley JT. The next generation of precision medicine: observational studies, electronic health records, biobanks and continuous monitoring. Hum Mol Genet 2019; 27:R56-R62. [PMID: 29659828 DOI: 10.1093/hmg/ddy114] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 03/13/2018] [Accepted: 03/27/2018] [Indexed: 02/06/2023] Open
Abstract
Precision medicine can utilize new techniques in order to more effectively translate research findings into clinical practice. In this article, we first explore the limitations of traditional study designs, which stem from (to name a few): massive cost for the assembly of large patient cohorts; non-representative patient data; and the astounding complexity of human biology. Second, we propose that harnessing electronic health records and mobile device biometrics coupled to longitudinal data may prove to be a solution to many of these problems by capturing a 'real world' phenotype. We envision that future biomedical research utilizing more precise approaches to patient care will utilize continuous and longitudinal data sources.
Affiliation(s)
- Benjamin S Glicksberg
  - Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, NY 10029, USA
  - Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Kipp W Johnson
  - Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, NY 10029, USA
- Joel T Dudley
  - Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, NY 10029, USA

24
Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential Biases in Machine Learning Algorithms Using Electronic Health Record Data. JAMA Intern Med 2018; 178:1544-1547. [PMID: 30128552 PMCID: PMC6347576 DOI: 10.1001/jamainternmed.2018.3763] [Citation(s) in RCA: 636] [Impact Index Per Article: 90.9] [Indexed: 12/21/2022]
Abstract
A promise of machine learning in health care is the avoidance of biases in diagnosis and treatment; a computer algorithm could objectively synthesize and interpret the data in the medical record. Integration of machine learning with clinical decision support tools, such as computerized alerts or diagnostic support, may offer physicians and others who provide health care targeted and timely information that can improve clinical decisions. Machine learning algorithms, however, may also be subject to biases. The biases include those related to missing data and patients not identified by algorithms, sample size and underestimation, and misclassification and measurement error. There is concern that biases and deficiencies in the data used by machine learning algorithms may contribute to socioeconomic disparities in health care. This Special Communication outlines the potential biases that may be introduced into machine learning-based clinical decision support tools that use electronic health record data and proposes potential solutions to the problems of overreliance on automation, algorithms based on biased data, and algorithms that do not provide information that is clinically meaningful. Existing health care disparities should not be amplified by thoughtless or excessive reliance on machines.
Affiliation(s)
- Milena A Gianfrancesco
  - Division of Rheumatology, Department of Medicine, University of California, San Francisco
- Suzanne Tamang
  - Center for Population Health Sciences, Stanford University, Palo Alto, California
- Jinoos Yazdany
  - Division of Rheumatology, Department of Medicine, University of California, San Francisco
- Gabriela Schmajuk
  - Division of Rheumatology, Department of Medicine, University of California, San Francisco
  - Veterans Affairs Medical Center, San Francisco, California

25
Chami GF, Kabatereine NB, Tukahebwa EM, Dunne DW. Precision global health and comorbidity: a population-based study of 16 357 people in rural Uganda. J R Soc Interface 2018; 15:20180248. [PMID: 30381343 PMCID: PMC6228477 DOI: 10.1098/rsif.2018.0248] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/12/2018] [Accepted: 10/09/2018] [Indexed: 12/13/2022] Open
Abstract
In low-income countries, complex comorbidities and weak health systems confound disease diagnosis and treatment. Yet, data-driven approaches have not been applied to develop better diagnostic strategies or to tailor treatment delivery for individuals within rural poor communities. We observed symptoms/diseases reported within three months by 16 357 individuals aged 1+ years in 17 villages of Mayuge District, Uganda. Symptoms were mapped to the Human Phenotype Ontology. Comorbidity networks were constructed. An edge between two symptoms/diseases was generated if the relative risk was greater than 1, the ϕ correlation was greater than 0, and the local false discovery rate was less than 0.05. We studied how network structure and flagship symptom profiles varied against biosocial factors. 88.05% of individuals (14 402/16 357) reported at least one symptom/disease. Young children and individuals in worse-off households (low socioeconomic status; poor water, sanitation, and hygiene; and poor medical care) had dense network structures with the highest comorbidity burden and/or were conducive to the onset of new comorbidities from existing flagship symptoms, such as fever. Flagship symptom profiles for fever revealed self-misdiagnoses of fever as malaria and sexually transmitted infections as a potentially missed cause of fever in individuals of reproductive age. Network analysis may inform the development of new diagnostic and treatment strategies for flagship symptoms used to characterize syndromes/diseases of global concern.
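Note: the edge rule stated in this abstract can be sketched in Python as follows. The sketch assumes the observed-to-expected definition of relative risk that is common in comorbidity network work, which may differ from the authors' exact formula, and it takes the local false discovery rate as a precomputed input. All function and parameter names and the example counts are hypothetical.

```python
import math


def relative_risk(n11, n10, n01, n00):
    """Observed co-occurrence divided by the count expected under independence.

    n11: both symptoms reported, n10: only A, n01: only B, n00: neither.
    Assumption: this definition may not match the one used in the paper.
    """
    total = n11 + n10 + n01 + n00
    count_a = n11 + n10
    count_b = n11 + n01
    return (n11 * total) / (count_a * count_b)


def phi_correlation(n11, n10, n01, n00):
    """Pearson correlation between two binary indicators (phi coefficient)."""
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return numerator / denominator


def has_edge(n11, n10, n01, n00, local_fdr, fdr_cutoff=0.05):
    """Apply the three thresholds quoted in the abstract."""
    return (
        relative_risk(n11, n10, n01, n00) > 1
        and phi_correlation(n11, n10, n01, n00) > 0
        and local_fdr < fdr_cutoff
    )


# Hypothetical counts for one pair of symptoms among 1000 respondents.
print(relative_risk(30, 70, 50, 850))             # 3.75
print(has_edge(30, 70, 50, 850, local_fdr=0.01))  # True
```

Under this definition the relative-risk and ϕ conditions always agree in sign, so in the sketch the local false discovery rate threshold is the condition that actually filters out weakly supported pairs.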
Affiliation(s)
- Goylette F Chami
  - Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QP, UK
- Narcis B Kabatereine
  - Schistosomiasis Control Initiative, Imperial College London, Norfolk Place, London W2 1PG, UK
  - Bilharzia and Worm Control Programme, Vector Control Division, Ministry of Health, 15 Bombo Road, Kampala, Uganda
- Edridah M Tukahebwa
  - Bilharzia and Worm Control Programme, Vector Control Division, Ministry of Health, 15 Bombo Road, Kampala, Uganda
- David W Dunne
  - Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QP, UK

26
Kruse CS, Stein A, Thomas H, Kaur H. The use of Electronic Health Records to Support Population Health: A Systematic Review of the Literature. J Med Syst 2018; 42:214. [PMID: 30269237 PMCID: PMC6182727 DOI: 10.1007/s10916-018-1075-6] [Citation(s) in RCA: 173] [Impact Index Per Article: 24.7] [Received: 01/12/2018] [Accepted: 09/19/2018] [Indexed: 12/16/2022]
Abstract
Electronic health records (EHRs) have emerged among health information technology as "meaningful use" to improve the quality and efficiency of healthcare, and health disparities in population health. In other instances, they have also shown lack of interoperability, functionality and many medical errors. With proper implementation and training, are electronic health records a viable source in managing population health? The primary objective of this systematic review is to assess the relationship between electronic health record use and population health through the identification and analysis of facilitators and barriers to their adoption for this purpose. Authors searched Cumulative Index of Nursing and Allied Health Literature (CINAHL) and MEDLINE (PubMed), 10/02/2012-10/02/2017, core clinical/academic journals, MEDLINE full text, English only, human species and evaluated the articles that were germane to our research objective. Each article was analyzed by multiple reviewers. Group members recognized common facilitators and barriers associated with EHRs' effect on population health. A final list of articles was selected by the group after three consensus meetings (n = 55). Among a total of 26 factors identified, 63% (147/232) of those were facilitators and 37% (85/232) barriers. About 70% of the facilitators consisted of productivity/efficiency in EHRs occurring 33 times, increased quality and data management each occurring 19 times, surveillance occurring 17 times, and preventative care occurring 15 times. About 70% of the barriers consisted of missing data occurring 24 times, no standards (interoperability) occurring 13 times, productivity loss occurring 12 times, and technology too complex occurring 10 times. The analysis identified more facilitators than barriers to the use of the EHR to support public health.
Wider adoption of the EHR and more comprehensive standards for interoperability will only enhance the ability for the EHR to support this important area of surveillance and disease prevention. This review identifies more facilitators than barriers to using the EHR to support public health, which implies a certain level of usability and acceptance to use the EHR in this manner. The public-health industry should combine their efforts with the interoperability projects to make the EHR both fully adopted and fully interoperable. This will greatly increase the availability, accuracy, and comprehensiveness of data across the country, which will enhance benchmarking and disease surveillance/prevention capabilities.
Affiliation(s)
- Clemens Scott Kruse
  - Texas State University, 601 University Dr, Encino 250, San Marcos, TX, 78666, USA
- Anna Stein
  - Texas State University, 601 University Dr, Encino 250, San Marcos, TX, 78666, USA
- Heather Thomas
  - Texas State University, 601 University Dr, Encino 250, San Marcos, TX, 78666, USA
- Harmander Kaur
  - Texas State University, 601 University Dr, Encino 250, San Marcos, TX, 78666, USA

27
Shameer K, Perez-Rodriguez MM, Bachar R, Li L, Johnson A, Johnson KW, Glicksberg BS, Smith MR, Readhead B, Scarpa J, Jebakaran J, Kovatch P, Lim S, Goodman W, Reich DL, Kasarskis A, Tatonetti NP, Dudley JT. Pharmacological risk factors associated with hospital readmission rates in a psychiatric cohort identified using prescriptome data mining. BMC Med Inform Decis Mak 2018; 18:79. [PMID: 30255805 PMCID: PMC6156906 DOI: 10.1186/s12911-018-0653-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Worldwide, over 14% of individuals hospitalized for psychiatric reasons are readmitted to hospital within 30 days after discharge. Predicting patients at risk and leveraging accelerated interventions can reduce the rates of early readmission, a negative clinical outcome (i.e., a treatment failure) that affects patients' quality of life. To implement individualized interventions, it is necessary to predict those individuals at highest risk for 30-day readmission. In this study, our aim was to conduct a data-driven investigation to find the pharmacological factors influencing 30-day all-cause, intra- and interdepartmental readmissions after an index psychiatric admission, using the compendium of prescription data (prescriptome) from electronic medical records (EMR). METHODS The data scientists in the project received a deidentified database from the Mount Sinai Data Warehouse, which was used to perform all analyses. Data was stored in a secured MySQL database, normalized and indexed using a unique hexadecimal identifier associated with the data for psychiatric illness visits. We used Bayesian logistic regression models to evaluate the association of prescription data with 30-day readmission risk. We constructed individual models and compiled results after adjusting for covariates, including drug exposure, age, and gender. We also performed a digital comorbidity survey using EMR data combined with the estimation of shared genetic architecture using genomic annotations to disease phenotypes. RESULTS Using an automated, data-driven approach, we identified prescription medications, side effects (primary side effects), and drug-drug interaction-induced side effects (secondary side effects) associated with readmission risk in a cohort of 1275 patients using prescriptome analytics. In our study, we identified 28 drugs associated with risk for readmission among psychiatric patients.
Based on prescription data, Pravastatin had the highest risk of readmission (OR = 13.10; 95% CI (2.82, 60.8)). We also identified enrichment of primary side effects (n = 4006) and secondary side effects (n = 36) induced by prescription drugs in the subset of readmitted patients (n = 89) compared to the non-readmitted subgroup (n = 1186). Digital comorbidity analyses and shared genetic analyses further reveal that cardiovascular disease and psychiatric conditions are comorbid and share functional gene modules (cardiomyopathy and anxiety disorder: shared genes (n = 37; P = 1.06815E-06)). CONCLUSIONS Large scale prescriptome data is now available from EMRs and accessible for analytics that could improve healthcare outcomes. Such analyses could also drive hypothesis and data-driven research. In this study, we explored the utility of prescriptome data to identify factors driving readmission in a psychiatric cohort. Converging digital health data from EMRs and systems biology investigations reveal that a subset of patient populations with significant cardiovascular comorbidities is more likely to be readmitted. Further, the genetic architecture of psychiatric illness also suggests overlap with cardiovascular diseases. In summary, assessing medications, side effects, and drug-drug interactions in a clinical setting, together with genomic information, using a data mining approach could help to find factors that lower readmission rates in patients with mental illness.
Affiliation(s)
- Khader Shameer
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Roy Bachar
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
  - Hackensack Meridian Health Hackensack University Medical Center, Hackensack, NJ, USA
- Li Li
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Amy Johnson
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
- Kipp W Johnson
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Benjamin S Glicksberg
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Milo R Smith
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Ben Readhead
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Joseph Scarpa
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Patricia Kovatch
  - Mount Sinai Data Warehouse, Mount Sinai Health System, New York, NY, USA
- Sabina Lim
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
- Wayne Goodman
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
- David L Reich
  - Department of Anesthesiology, Mount Sinai Health System, New York, NY, USA
- Andrew Kasarskis
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Nicholas P Tatonetti
  - Departments of Biomedical Informatics, Systems Biology and Medicine, Columbia University, New York, NY, USA
- Joel T Dudley
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
  - Department of Population Health Science and Policy; Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY, USA
28
Shameer K, Glicksberg BS, Hodos R, Johnson KW, Badgeley MA, Readhead B, Tomlinson MS, O’Connor T, Miotto R, Kidd BA, Chen R, Ma’ayan A, Dudley JT. Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning. Brief Bioinform 2018; 19:656-678. [PMID: 28200013 PMCID: PMC6192146 DOI: 10.1093/bib/bbw136] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 08/19/2016] [Revised: 11/29/2016] [Indexed: 12/22/2022]
Abstract
Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug-resistant pathogens, drug-resistant cancers (cisplatin-resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, solves a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. 
The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta-analyses could augment therapeutic development.
Affiliation(s)
- Khader Shameer
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
- Benjamin S Glicksberg
  - Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY, USA
- Rachel Hodos
  - Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY, USA
  - New York University, New York, NY, USA
- Kipp W Johnson
  - Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY, USA
- Marcus A Badgeley
  - Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY, USA
- Ben Readhead
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
- Max S Tomlinson
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
- Riccardo Miotto
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
- Brian A Kidd
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
- Rong Chen
  - Clinical Genome Informatics, Icahn Institute of Genetics and Multiscale Biology, Mount Sinai Health System, New York, NY
- Avi Ma’ayan
  - Mount Sinai Center for Bioinformatics, Mount Sinai Health System, New York, NY
- Joel T Dudley
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Mount Sinai Health System, New York, NY, USA
  - Department of Population Health Science and Policy, Mount Sinai Health System, New York, NY, USA
  - Director of Biomedical Informatics, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY
29
A Network-Biology Informed Computational Drug Repositioning Strategy to Target Disease Risk Trajectories and Comorbidities of Peripheral Artery Disease. AMIA Joint Summits on Translational Science Proceedings 2018; 2017:108-117. [PMID: 29888052 PMCID: PMC5961807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Currently, drug discovery approaches focus on the design of therapies that alleviate an index symptom by reengineering the underlying biological mechanism in agonistic or antagonistic fashion. For example, medicines are routinely developed to target an essential gene that drives the disease mechanism. Therapeutic overloading where patients get multiple medications to reduce the primary and secondary side effect burden is standard practice. This single-symptom based approach may not be scalable, as we understand that diseases are more connected than random and molecular interactions drive disease comorbidities. In this work, we present a proof-of-concept drug discovery strategy by combining network biology, disease comorbidity estimates, and computational drug repositioning, by targeting the risk factors and comorbidities of peripheral artery disease, a vascular disease associated with high morbidity and mortality. Individualized risk estimation and recommending disease sequelae based therapies may help to lower the mortality and morbidity of peripheral artery disease.
30
Kalantari A, Kamsin A, Shamshirband S, Gani A, Alinejad-Rokny H, Chronopoulos AT. Computational intelligence approaches for classification of medical data: State-of-the-art, future challenges and research directions. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.01.126] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Indexed: 01/13/2023]
31
Brunson JC, Laubenbacher RC. Applications of network analysis to routinely collected health care data: a systematic review. J Am Med Inform Assoc 2018; 25:210-221. [PMID: 29025116 PMCID: PMC6664849 DOI: 10.1093/jamia/ocx052] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 01/18/2017] [Revised: 04/18/2017] [Accepted: 04/23/2017] [Indexed: 01/21/2023]
Abstract
Objective To survey network analyses of datasets collected in the course of routine operations in health care settings and identify driving questions, methods, needs, and potential for future research. Materials and Methods A search strategy was designed to find studies that applied network analysis to routinely collected health care datasets and was adapted to 3 bibliographic databases. The results were grouped according to a thematic analysis of their settings, objectives, data, and methods. Each group received a methodological synthesis. Results The search found 189 distinct studies reported before August 2016. We manually partitioned the sample into 4 groups, which investigated institutional exchange, physician collaboration, clinical co-occurrence, and workplace interaction networks. Several robust and ongoing research programs were discerned within (and sometimes across) the groups. Little interaction was observed between these programs, despite conceptual and methodological similarities. Discussion We use the literature sample to inform a discussion of good practice at this methodological interface, including the concordance of motivations, study design, data, and tools and the validation and standardization of techniques. We then highlight instances of positive feedback between methodological development and knowledge domains and assess the overall cohesion of the sample.
32
Shameer K, Johnson KW, Yahi A, Miotto R, Li LI, Ricks D, Jebakaran J, Kovatch P, Sengupta PP, Gelijns S, Moskovitz A, Darrow B, David DL, Kasarskis A, Tatonetti NP, Pinney S, Dudley JT. Predictive modeling of hospital readmission rates using electronic medical record-wide machine learning: a case-study using Mount Sinai heart failure cohort. Pacific Symposium on Biocomputing 2017; 22:276-287. [PMID: 27896982 DOI: 10.1142/9789813207813_0027] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Indexed: 01/11/2023]
Abstract
Reduction of preventable hospital readmissions that result from chronic or acute conditions like stroke, heart failure, myocardial infarction and pneumonia remains a significant challenge for improving the outcomes and decreasing the cost of healthcare delivery in the United States. Patient readmission rates are relatively high for conditions like heart failure (HF) despite the implementation of high-quality healthcare delivery operation guidelines created by regulatory authorities. Multiple predictive models are currently available to evaluate potential 30-day readmission rates of patients. Most of these models are hypothesis driven and repetitively assess the predictive abilities of the same set of biomarkers as predictive features. In this manuscript, we discuss our attempt to develop a data-driven, electronic-medical record-wide (EMR-wide) feature selection approach and subsequent machine learning to predict readmission probabilities. We have assessed a large repertoire of variables from electronic medical records of heart failure patients in a single center. The cohort included 1,068 patients, of whom 178 were readmitted within a 30-day interval (16.66% readmission rate). A total of 4,205 variables were extracted from EMR including diagnosis codes (n=1,763), medications (n=1,028), laboratory measurements (n=846), surgical procedures (n=564) and vital signs (n=4). We designed a multistep modeling strategy using the Naïve Bayes algorithm. In the first step, we created individual models to classify the cases (readmitted) and controls (non-readmitted). In the second step, features contributing to predictive risk from independent models were combined into a composite model using a correlation-based feature selection (CFS) method. All models were trained and tested using a 5-fold cross-validation method, with 70% of the cohort used for training and the remaining 30% for testing.
Compared to existing predictive models for HF readmission rates (AUCs in the range of 0.6-0.7), results from our EMR-wide predictive model (AUC=0.78; Accuracy=83.19%) and phenome-wide feature selection strategies are encouraging and reveal the utility of such data-driven machine learning. Fine tuning of the model, replication using multi-center cohorts and a prospective clinical trial to evaluate the clinical utility would help the adoption of the model as a clinical decision system for evaluating readmission status.
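The cohort arithmetic and the idea of a stratified 5-fold split can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the `stratified_folds` helper, its seed, and the round-robin assignment are assumptions made for the example, and no Naïve Bayes model or CFS step is fitted here.

```python
import random

# Counts quoted in the abstract above.
N_PATIENTS, N_READMITTED = 1068, 178
FEATURE_COUNTS = {
    "diagnosis codes": 1763,
    "medications": 1028,
    "laboratory measurements": 846,
    "surgical procedures": 564,
    "vital signs": 4,
}


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, preserving the class ratio.

    Hypothetical helper: each class is shuffled, then dealt round-robin
    across the folds so cases and controls are spread evenly.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# 1 = readmitted within 30 days (case), 0 = not readmitted (control).
labels = [1] * N_READMITTED + [0] * (N_PATIENTS - N_READMITTED)
folds = stratified_folds(labels)

print(f"readmission rate: {N_READMITTED / N_PATIENTS:.2%}")
print(f"total features: {sum(FEATURE_COUNTS.values())}")
for n, fold in enumerate(folds, start=1):
    cases = sum(labels[i] for i in fold)
    print(f"fold {n}: {len(fold)} patients, {cases} readmitted")
```

In each round of cross-validation one fold would be held out for testing and the other four used for training. With only 178 cases, stratifying keeps roughly 35 to 36 readmitted patients in every fold, so each test fold has enough positives for the AUC to be estimable.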
Affiliation(s)
- Khader Shameer
  - Department of Genetics and Genomics, Icahn Institute of Genomics and Multiscale Biology, New York, NY, USA
  - Institute of Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
33
Johnson KW, Shameer K, Glicksberg BS, Readhead B, Sengupta PP, Björkegren JLM, Kovacic JC, Dudley JT. Enabling Precision Cardiology Through Multiscale Biology and Systems Medicine. JACC Basic Transl Sci 2017; 2:311-327. [PMID: 30062151 PMCID: PMC6034501 DOI: 10.1016/j.jacbts.2016.11.010] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 11/01/2016] [Revised: 11/29/2016] [Accepted: 11/30/2016] [Indexed: 12/20/2022]
Abstract
The traditional paradigm of cardiovascular disease research derives insight from large-scale, broadly inclusive clinical studies of well-characterized pathologies. These insights are then put into practice according to standardized clinical guidelines. However, stagnation in the development of new cardiovascular therapies and variability in therapeutic response imply that this paradigm is insufficient for reducing the cardiovascular disease burden. In this state-of-the-art review, we examine 3 interconnected ideas we put forth as key concepts for enabling a transition to precision cardiology: 1) precision characterization of cardiovascular disease with machine learning methods; 2) the application of network models of disease to embrace disease complexity; and 3) using insights from the previous 2 ideas to enable pharmacology and polypharmacology systems for more precise drug-to-patient matching and patient-disease stratification. We conclude by exploring the challenges of applying a precision approach to cardiology, which arise from a deficit of the required resources and infrastructure, and emerging evidence for the clinical effectiveness of this nascent approach.
Affiliation(s)
- Kipp W Johnson
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, New York
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York
- Khader Shameer
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, New York
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York
- Benjamin S Glicksberg
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, New York
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York
- Ben Readhead
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, New York
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York
- Partho P Sengupta
  - The Zena and Michael A. Wiener Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, New York, New York
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York
- Johan L M Björkegren
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York
  - Department of Medical Biochemistry and Biophysics Vascular Biology Unit, Karolinska Institutet, Stockholm, Sweden
- Jason C Kovacic
  - The Zena and Michael A. Wiener Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, New York, New York
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York
- Joel T Dudley
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, New York
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York
34
Accelerators: Sparking Innovation and Transdisciplinary Team Science in Disparities Research. International Journal of Environmental Research and Public Health 2017; 14:ijerph14030225. [PMID: 28241508 PMCID: PMC5369061 DOI: 10.3390/ijerph14030225] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 12/27/2016] [Revised: 02/15/2017] [Accepted: 02/16/2017] [Indexed: 12/20/2022]
Abstract
Development and implementation of effective, sustainable, and scalable interventions that advance equity could be propelled by innovative and inclusive partnerships. Readied catalytic frameworks that foster communication, collaboration, a shared vision, and transformative translational research across scientific and non-scientific divides are needed to foster rapid generation of novel solutions to address and ultimately eliminate disparities. To achieve this, we transformed and expanded a community-academic board into a translational science board with members from public, academic and private sectors. Rooted in team science, diverse board experts formed topic-specific "accelerators", tasked with collaborating to rapidly generate new ideas, questions, approaches, and projects comprising patients, advocates, clinicians, researchers, funders, public health and industry leaders. We began with four accelerators-digital health, big data, genomics and environmental health-and were rapidly able to respond to funding opportunities, transform new ideas into clinical and community programs, generate new, accessible, actionable data, and more efficiently and effectively conduct research. This innovative model has the power to maximize research quality and efficiency, improve patient care and engagement, optimize data democratization and dissemination among target populations, contribute to policy, and lead to systems changes needed to address the root causes of disparities.