1
Bick AG, Metcalf GA, Mayo KR, Lichtenstein L, Rura S, Carroll RJ, Musick A, Linder JE, Jordan IK, Nagar SD, Sharma S, Meller R, Basford M, Boerwinkle E, Cicek MS, Doheny KF, Eichler EE, Gabriel S, Gibbs RA, Glazer D, Harris PA, Jarvik GP, Philippakis A, Rehm HL, Roden DM, Thibodeau SN, Topper S, Blegen AL, Wirkus SJ, Wagner VA, Meyer JG, Cicek MS, Muzny DM, Venner E, Mawhinney MZ, Griffith SML, Hsu E, Ling H, Adams MK, Walker K, Hu J, Doddapaneni H, Kovar CL, Murugan M, Dugan S, Khan Z, Boerwinkle E, Lennon NJ, Austin-Tse C, Banks E, Gatzen M, Gupta N, Henricks E, Larsson K, McDonough S, Harrison SM, Kachulis C, Lebo MS, Neben CL, Steeves M, Zhou AY, Smith JD, Frazar CD, Davis CP, Patterson KE, Wheeler MM, McGee S, Lockwood CM, Shirts BH, Pritchard CC, Murray ML, Vasta V, Leistritz D, Richardson MA, Buchan JG, Radhakrishnan A, Krumm N, Ehmen BW, Schwartz S, Aster MMT, Cibulskis K, Haessly A, Asch R, Cremer A, Degatano K, Shergill A, Gauthier LD, Lee SK, Hatcher A, Grant GB, Brandt GR, Covarrubias M, Banks E, Able A, Green AE, Carroll RJ, Zhang J, Condon HR, Wang Y, Dillon MK, Albach CH, Baalawi W, Choi SH, Wang X, Rosenthal EA, Ramirez AH, Lim S, Nambiar S, Ozenberger B, Wise AL, Lunt C, Ginsburg GS, Denny JC. Genomic data in the All of Us Research Program. Nature 2024; 627:340-346. [PMID: 38374255 PMCID: PMC10937371 DOI: 10.1038/s41586-023-06957-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 12/08/2023] [Indexed: 02/21/2024]
Abstract
Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics [1-4]. The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health [5,6]. Here we describe the programme's genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.
2
Wei WQ, Guardo C, Gandireddy S, Yan C, Ong H, Kerchberger V, Dickson A, Pfaff E, Master H, Basford M, Tran N, Mancuso S, Syed T, Zhao Z, Feng Q, Haendel M, Lunt C, Ginsburg G, Chute C, Denny J, Roden D. Genetic and Survey Data Improves Performance of Machine Learning Model for Long COVID. Res Sq 2023:rs.3.rs-3749510. [PMID: 38196610 PMCID: PMC10775401 DOI: 10.21203/rs.3.rs-3749510/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
Over 200 million SARS-CoV-2 patients have developed or will develop persistent symptoms (long COVID). Given this pressing research priority, the National COVID Cohort Collaborative (N3C) developed a machine learning model using only electronic health record data to identify potential patients with long COVID. We hypothesized that additional data from health surveys, mobile devices, and genotypes could improve prediction ability. In a cohort of SARS-CoV-2 infected individuals (n=17,755) in the All of Us program, we applied and expanded upon the N3C long COVID prediction model, testing machine learning infrastructures, assessing model performance, and identifying factors that contributed most to the prediction models. For the survey/mobile device information and the genetic data, extreme gradient boosting and a convolutional neural network, respectively, delivered the best performance for predicting long COVID. Combined survey, genetic, and mobile data increased specificity and the area under the receiver operating characteristic curve (AUROC) versus the original N3C model.
Affiliation(s)
- Chao Yan
- Vanderbilt University Medical Center
- Henry Ong
- Vanderbilt University Medical Center
- Melissa Basford
- Vanderbilt Institute of Clinical and Translational Research/Vanderbilt University Medical Center
- QiPing Feng
- Department of Medicine, Vanderbilt University Medical Center
- Joshua Denny
- All of Us Research Program, National Institutes of Health
- Dan Roden
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
3
Pfaff ER, Girvin AT, Crosskey M, Gangireddy S, Master H, Wei WQ, Kerchberger VE, Weiner M, Harris PA, Basford M, Lunt C, Chute CG, Moffitt RA, Haendel M. De-black-boxing health AI: demonstrating reproducible machine learning computable phenotypes using the N3C-RECOVER Long COVID model in the All of Us data repository. J Am Med Inform Assoc 2023; 30:1305-1312. [PMID: 37218289 PMCID: PMC10280348 DOI: 10.1093/jamia/ocad077] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/18/2023] [Revised: 03/28/2023] [Accepted: 04/24/2023] [Indexed: 05/24/2023]
Abstract
Machine learning (ML)-driven computable phenotypes are among the most challenging to share and reproduce. Despite this difficulty, the urgent public health considerations around Long COVID make it especially important to ensure the rigor and reproducibility of Long COVID phenotyping algorithms such that they can be made available to a broad audience of researchers. As part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, researchers with the National COVID Cohort Collaborative (N3C) devised and trained an ML-based phenotype to identify patients highly probable to have Long COVID. Supported by RECOVER, N3C and NIH's All of Us study partnered to reproduce the output of N3C's trained model in the All of Us data enclave, demonstrating model extensibility in multiple environments. This case study in ML-based phenotype reuse illustrates how open-source software best practices and cross-site collaboration can de-black-box phenotyping algorithms, prevent unnecessary rework, and promote open science in informatics.
Affiliation(s)
- Emily R Pfaff
- Department of Medicine, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, North Carolina, USA
- Srushti Gangireddy
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Hiral Master
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- V Eric Kerchberger
- Department of Medicine, Division of Allergy, Pulmonary & Critical Care Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Mark Weiner
- Department of Medicine, Weill Cornell Medicine, New York, USA
- Paul A Harris
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Melissa Basford
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Chris Lunt
- National Institutes of Health, Bethesda, Maryland, USA
- Christopher G Chute
- Johns Hopkins Schools of Medicine, Public Health, and Nursing, Baltimore, Maryland, USA
- Richard A Moffitt
- Departments of Hematology and Medical Oncology and Biomedical Informatics, Emory University, Atlanta, Georgia, USA
- Melissa Haendel
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Denver, Colorado, USA
4
Xia W, Basford M, Carroll R, Clayton EW, Harris P, Kantacioglu M, Liu Y, Nyemba S, Vorobeychik Y, Wan Z, Malin BA. Managing re-identification risks while providing access to the All of Us research program. J Am Med Inform Assoc 2023; 30:907-914. [PMID: 36809550 PMCID: PMC10114067 DOI: 10.1093/jamia/ocad021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Revised: 01/23/2023] [Accepted: 02/09/2023] [Indexed: 02/23/2023]
Abstract
OBJECTIVE: The All of Us Research Program makes individual-level data available to researchers while protecting the participants' privacy. This article describes the protections embedded in the multistep access process, with a particular focus on how the data were transformed to meet generally accepted re-identification risk levels.
METHODS: At the time of the study, the resource consisted of 329,084 participants. Systematic amendments were applied to the data to mitigate re-identification risk (eg, generalization of geographic regions, suppression of public events, and randomization of dates). We computed the re-identification risk for each participant using a state-of-the-art adversarial model, specifically assuming that it is known that someone is a participant in the program. We confirmed that the expected risk is no greater than 0.09, a threshold that is consistent with guidelines from various US state and federal agencies. We further investigated how risk varied as a function of participant demographics.
RESULTS: The results indicated that the 95th percentile of the re-identification risk across all participants is below current thresholds. At the same time, we observed that risk levels were higher for certain races, ethnicities, and genders.
CONCLUSIONS: While the re-identification risk was sufficiently low, this does not imply that the system is devoid of risk. Rather, All of Us uses a multipronged data protection strategy that includes strong authentication practices, active monitoring of data misuse, and penalization mechanisms for users who violate terms of service.
Affiliation(s)
- Weiyi Xia
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Melissa Basford
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Robert Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Ellen Wright Clayton
- Law School, Vanderbilt University, Nashville, Tennessee, USA
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Health Policy, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Paul Harris
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, USA
- Murat Kantacioglu
- Department of Computer Science, University of Texas at Dallas, Dallas, Texas, USA
- Yongtai Liu
- Department of Computer Science, Vanderbilt University, Nashville, Tennessee, USA
- Steve Nyemba
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Yevgeniy Vorobeychik
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
- Zhiyu Wan
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Bradley A Malin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Computer Science, Vanderbilt University, Nashville, Tennessee, USA
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
5
Master H, Kouame A, Marginean K, Basford M, Harris P, Holko M. How Fitbit data are being made available to registered researchers in All of Us Research Program. Pac Symp Biocomput 2023; 28:19-30. [PMID: 36540961 PMCID: PMC9811842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Abstract
The National Institutes of Health's (NIH) All of Us Research Program aims to enroll at least one million US participants from diverse backgrounds; collect electronic health record (EHR) data, survey data, physical measurements, biospecimens for genomics and other assays, and digital health data; and create a researcher database and tools to enable precision medicine research [1]. Since inception, digital health technologies (DHT) have been envisioned as essential to achieving the goals of the program [2]. A "bring your own device" (BYOD) study for collecting Fitbit data from participants' devices was developed with integration of additional DHTs planned in the future [3]. Here we describe how participants can consent to share their digital health technology data, how the data are collected, how the data set is parsed, and how researchers can access the data.
Affiliation(s)
- Hiral Master
- Vanderbilt University Medical Center, Nashville, TN, USA
6
Yang Y, Rodriguez K, Basford M, Nambiar S, Berman L, Kho A. Ancillary Data Record Linkage to characterize the completeness of data for the All of Us Research Program. Int J Popul Data Sci 2022. [DOI: 10.23889/ijpds.v7i3.2090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
Abstract
Objective: The All of Us Research Program (AoURP) is an ambitious effort to gather health data from one million Americans to accelerate research. We linked electronic health record (EHR) and insurance claims data to characterize the degree to which ancillary datasets can improve data completeness for care received by AoURP participants.
Approach: We sought to link EHR data for 400,000 consented AoURP participants with insurance claims data provided by IPM.AI (Swoop Analytics), a commercial analytics company that holds insurance claims data for 300M (over 90% of) Americans. We used a HIPAA-compliant privacy-preserving record linkage method (tokenization, provided by Datavant) to match patients between datasets. We evaluated match fidelity and the degree of overlap between AoURP EHRs and IPM.AI claims data. We characterized the association of patient- and organization-level factors (demographics, healthcare provider organization, reporting site) with match performance.
Results: As of submission of this abstract, 41% of AoURP EHRs matched with IPM.AI claims. We compared patient healthcare encounters, diagnosis codes (DX), procedure codes (PX), and national drug codes (NDC) for matched patients by month. The union of AoURP and IPM.AI data greatly increased data completeness in matched patients. On a monthly level, only 20% of healthcare encounters appeared in both AoURP and IPM.AI data, while 25% were unique to AoURP EHRs and 55% were unique to IPM.AI claims. The number of diagnosis events is roughly equal between AoURP and IPM.AI (AoURP +6%), while procedure events are elevated in claims data (23%) and drug counts are greatly elevated in AoURP EHR data (71%). We found that matched patients had more healthcare encounters than unmatched patients.
Conclusion: To our knowledge, this is the first effort to address challenges in AoURP data completeness through complementary data linkage. Our results suggest that supplementary data linkage can improve data completeness in a large national research initiative. We identified several patient factors that require further investigation to improve match fidelity.
7
Kanakaraj P, Ramadass K, Bao S, Basford M, Jones LM, Lee HH, Xu K, Schilling KG, Carr JJ, Terry JG, Huo Y, Sandler KL, Netwon AT, Landman BA. Workflow Integration of Research AI Tools into a Hospital Radiology Rapid Prototyping Environment. J Digit Imaging 2022; 35:1023-1033. [PMID: 35266088 PMCID: PMC9485498 DOI: 10.1007/s10278-022-00601-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2021] [Revised: 01/14/2022] [Accepted: 01/23/2022] [Indexed: 11/25/2022]
Abstract
The field of artificial intelligence (AI) in medical imaging is undergoing explosive growth, and Radiology is a prime target for innovation. The American College of Radiology Data Science Institute has identified more than 240 specific use cases where AI could be used to improve clinical practice. In this context, thousands of potential methods are developed by research labs and industry innovators. Deploying AI tools within a clinical enterprise, even on limited retrospective evaluation, is complicated by security and privacy concerns. Thus, innovation must be weighed against the substantive resources required for local clinical evaluation. To reduce barriers to AI validation while maintaining rigorous security and privacy standards, we developed the AI Imaging Incubator. The AI Imaging Incubator serves as a DICOM storage destination within a clinical enterprise where images can be directed for novel research evaluation under Institutional Review Board approval. AI Imaging Incubator is controlled by a secure HIPAA-compliant front end and provides access to a menu of AI procedures captured within network-isolated containers. Results are served via a secure website that supports research and clinical data formats. Deployment of new AI approaches within this system is streamlined through a standardized application programming interface. This manuscript presents case studies of the AI Imaging Incubator applied to randomizing lung biopsies on chest CT, liver fat assessment on abdomen CT, and brain volumetry on head MRI.
Affiliation(s)
- Shunxing Bao
- Computer Science, Vanderbilt University, Nashville, TN, USA
- Melissa Basford
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
- Laura M. Jones
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
- Ho Hin Lee
- Computer Science, Vanderbilt University, Nashville, TN, USA
- Kaiwen Xu
- Computer Science, Vanderbilt University, Nashville, TN, USA
- Kurt G. Schilling
- Vanderbilt University Institute of Imaging Science, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- John Jeffrey Carr
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- James Gregory Terry
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Yuankai Huo
- Computer Science, Vanderbilt University, Nashville, TN, USA
- Data Science Institute, Vanderbilt University, Nashville, TN, USA
- Kim Lori Sandler
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Allen T. Netwon
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Bennett A. Landman
- Computer Science, Vanderbilt University, Nashville, TN, USA
- Vanderbilt University Institute of Imaging Science, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Electrical Engineering, Vanderbilt University, Nashville, TN, USA
- Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
- Data Science Institute, Vanderbilt University, Nashville, TN, USA
8
Ramirez AH, Sulieman L, Schlueter DJ, Halvorson A, Qian J, Ratsimbazafy F, Loperena R, Mayo K, Basford M, Deflaux N, Muthuraman KN, Natarajan K, Kho A, Xu H, Wilkins C, Anton-Culver H, Boerwinkle E, Cicek M, Clark CR, Cohn E, Ohno-Machado L, Schully SD, Ahmedani BK, Argos M, Cronin RM, O’Donnell C, Fouad M, Goldstein DB, Greenland P, Hebbring SJ, Karlson EW, Khatri P, Korf B, Smoller JW, Sodeke S, Wilbanks J, Hentges J, Mockrin S, Lunt C, Devaney SA, Gebo K, Denny JC, Carroll RJ, Glazer D, Harris PA, Hripcsak G, Philippakis A, Roden DM, Ahmedani B, Cole Johnson CD, Ahsan H, Antoine-LaVigne D, Singleton G, Anton-Culver H, Topol E, Baca-Motes K, Steinhubl S, Wade J, Begale M, Jain P, Sutherland S, Lewis B, Korf B, Behringer M, Gharavi AG, Goldstein DB, Hripcsak G, Bier L, Boerwinkle E, Brilliant MH, Murali N, Hebbring SJ, Farrar-Edwards D, Burnside E, Drezner MK, Taylor A, Channamsetty V, Montalvo W, Sharma Y, Chinea C, Jenks N, Cicek M, Thibodeau S, Holmes BW, Schlueter E, Collier E, Winkler J, Corcoran J, D’Addezio N, Daviglus M, Winn R, Wilkins C, Roden D, Denny J, Doheny K, Nickerson D, Eichler E, Jarvik G, Funk G, Philippakis A, Rehm H, Lennon N, Kathiresan S, Gabriel S, Gibbs R, Gil Rico EM, Glazer D, Grand J, Greenland P, Harris P, Shenkman E, Hogan WR, Igho-Pemu P, Pollan C, Jorge M, Okun S, Karlson EW, Smoller J, Murphy SN, Ross ME, Kaushal R, Winford E, Wallace F, Khatri P, Kheterpal V, Ojo A, Moreno FA, Kron I, Peterson R, Menon U, Lattimore PW, Leviner N, Obedin-Maliver J, Lunn M, Malik-Gagnon L, Mangravite L, Marallo A, Marroquin O, Visweswaran S, Reis S, Marshall G, McGovern P, Mignucci D, Moore J, Munoz F, Talavera G, O'Connor GT, O'Donnell C, Ohno-Machado L, Orr G, Randal F, Theodorou AA, Reiman E, Roxas-Murray M, Stark L, Tepp R, Zhou A, Topper S, Trousdale R, Tsao P, Weidman L, Weiss ST, Wellis D, Whittle J, Wilson A, Zuchner S, Zwick ME. The All of Us Research Program: Data quality, utility, and diversity. Patterns 2022; 3:100570. 
[PMID: 36033590 PMCID: PMC9403360 DOI: 10.1016/j.patter.2022.100570] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Received: 03/23/2022] [Revised: 03/30/2022] [Accepted: 07/14/2022] [Indexed: 11/05/2022]
Abstract
The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races. Replication findings include medication usage pattern differences by race in depression and type 2 diabetes, validation of known cancer associations with smoking, and calculation of cardiovascular risk scores by reported race effects. The cloud-based Researcher Workbench represents an important advance in enabling secure access for a broad range of researchers to this large resource and analytical tools.
The All of Us Research Program has released data for over 315,000 participants. Demonstration projects support the utility and validity of the All of Us dataset. The cloud-based Researcher Workbench provides secure, low-cost compute power.
The engagement of participants in the research process and broad availability of data to diverse researchers are essential elements in building precision medicine equitably available for all. The NIH has established the ambitious All of Us Research Program to build one of the most diverse health databases in history with tools to support research to improve human health. Here, we present the initial launch of the Researcher Workbench with data types including surveys, physical measurements, and electronic health record data with validation studies to support researcher use of this novel platform. Broad access for researchers to data like these is a critical step in returning value to participants seeking to support the advancement of precision medicine and improved health for all.
9
Zouk H, Venner E, Lennon NJ, Muzny DM, Abrams D, Adunyah S, Albertson-Junkans L, Ames DC, Appelbaum P, Aronson S, Aufox S, Babb LJ, Balasubramanian A, Bangash H, Basford M, Bastarache L, Baxter S, Behr M, Benoit B, Bhoj E, Bielinski SJ, Bland HT, Blout C, Borthwick K, Bottinger EP, Bowser M, Brand H, Brilliant M, Brodeur W, Caraballo P, Carrell D, Carroll A, Almoguera B, Castillo L, Castro V, Chandanavelli G, Chiang T, Chisholm RL, Christensen KD, Chung W, Chute CG, City B, Cobb BL, Connolly JJ, Crane P, Crew K, Crosslin D, De Andrade M, De la Cruz J, Denson S, Denny J, DeSmet T, Dikilitas O, Friedrich C, Fullerton SM, Funke B, Gabriel S, Gainer V, Gharavi A, Glazer AM, Glessner JT, Goehringer J, Gordon AS, Graham C, Green RC, Gundelach JH, Dayal J, Hain HS, Hakonarson H, Harden MV, Harley J, Harr M, Hartzler A, Hayes MG, Hebbring S, Henrikson N, Hershey A, Hoell C, Holm I, Howell KM, Hripcsak G, Hu J, Jarvik GP, Jayaseelan JC, Jiang Y, Joo YY, Jose S, Josyula NS, Justice AE, Kalla SE, Kalra D, Karlson E, Kelly MA, Keating BJ, Kenny EE, Key D, Kiryluk K, Kitchner T, Klanderman B, Klee E, Kochan DC, Korchina V, Kottyan L, Kovar C, Kudalkar E, Kullo IJ, Lammers P, Larson EB, Lebo MS, Leduc M, Lee MT(M, Leppig KA, Leslie ND, Li R, Liang WH, Lin CF, Linder J, Lindor NM, Lingren T, Linneman JG, Liu C, Liu W, Liu X, Lynch J, Lyon H, Macbeth A, Mahadeshwar H, Mahanta L, Malin B, Manolio T, Marasa M, Marsolo K, Dinsmore MJ, Dodge S, Hynes ED, Dunlea P, Edwards TL, Eng CM, Fasel D, Fedotov A, Feng Q, Fleharty M, Foster A, Freimuth R, McGowan ML, McNally E, Meldrim J, Mentch F, Mosley J, Mukherjee S, Mullen TE, Muniz J, Murdock DR, Murphy S, Murugan M, Myers MF, Namjou B, Ni Y, Obeng AO, Onofrio RC, Taylor CO, Person TN, Peterson JF, Petukhova L, Pisieczko CJ, Pratap S, Prows CA, Puckelwartz MJ, Rahm AK, Raj R, Ralston JD, Ramaprasan A, Ramirez A, Rasmussen L, Rasmussen-Torvik L, Rasouly HM, Raychaudhuri S, Ritchie MD, Rives C, Riza B, Roden D, Rosenthal EA, Santani A, 
Schaid D, Scherer S, Scott S, Scrol A, Sengupta S, Shang N, Sharma H, Sharp RR, Singh R, Sleiman PM, Slowik K, Smith JC, Smith ME, Smoller JW, Sohn S, Stanaway IB, Starren J, Stroud M, Su J, Tolwinski K, Van Driest SL, Vargas SM, Varugheese M, Veenstra D, Verbitsky M, Vicente G, Wagner M, Walker K, Walunas T, Wang L, Wang Q, Wei WQ, Weiss ST, Wiesner GL, Wells Q, Weng C, White PS, Wiley KL, Williams JL, Williams MS, Wilson MW, Witkowski L, Woods LA, Woolf B, Wu TJ, Wynn J, Yang Y, Yi V, Zhang G, Zhang L, Rehm HL, Gibbs RA. Harmonizing Clinical Sequencing and Interpretation for the eMERGE III Network. Am J Hum Genet 2019; 105:588-605. [PMID: 31447099 PMCID: PMC6731372 DOI: 10.1016/j.ajhg.2019.07.018] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 10/26/2018] [Accepted: 07/26/2019] [Indexed: 12/25/2022]
Abstract
The advancement of precision medicine requires new methods to coordinate and deliver genetic data from heterogeneous sources to physicians and patients. The eMERGE III Network enrolled >25,000 participants from biobank and prospective cohorts of predominantly healthy individuals for clinical genetic testing to determine clinically actionable findings. The network developed protocols linking together the 11 participant collection sites and 2 clinical genetic testing laboratories. DNA capture panels targeting 109 genes were used for testing of DNA, and sample collection, data generation, interpretation, reporting, delivery, and storage were each harmonized. A compliant and secure network enabled ongoing review and reconciliation of clinical interpretations, while maintaining communication and data sharing between clinicians and investigators. A total of 202 individuals had positive diagnostic findings relevant to the indication for testing, and 1,294 had additional/secondary findings of medical significance deemed to be returnable, establishing data return rates for other testing endeavors. This study accomplished integration of structured genomic results into multiple electronic health record (EHR) systems, setting the stage for clinical decision support to enable genomic medicine. Further, the established processes enable different sequencing sites to harmonize technical and interpretive aspects of sequencing tests, a critical achievement toward global standardization of genomic testing. The eMERGE protocols and tools are available for widespread dissemination.
10
Smith ME, Sanderson SC, Brothers KB, Myers MF, McCormick J, Aufox S, Shrubsole MJ, Garrison NA, Mercaldo ND, Schildcrout JS, Clayton EW, Antommaria AHM, Basford M, Brilliant M, Connolly JJ, Fullerton SM, Horowitz CR, Jarvik GP, Kaufman D, Kitchner T, Li R, Ludman EJ, McCarty C, McManus V, Stallings S, Williams JL, Holm IA. Conducting a large, multi-site survey about patients' views on broad consent: challenges and solutions. BMC Med Res Methodol 2016; 16:162. [PMID: 27881091 PMCID: PMC5122167 DOI: 10.1186/s12874-016-0263-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/13/2016] [Accepted: 11/11/2016] [Indexed: 11/28/2022]
Abstract
Background: As biobanks play an increasing role in the genomic research that will lead to precision medicine, input from diverse and large populations of patients in a variety of health care settings will be important in order to successfully carry out such studies. One important topic is participants’ views towards consent and data sharing, especially since the 2011 Advanced Notice of Proposed Rulemaking (ANPRM), and subsequently the 2015 Notice of Proposed Rulemaking (NPRM), were issued by the Department of Health and Human Services (HHS) and Office of Science and Technology Policy (OSTP). These notices required that participants consent to research uses of their de-identified tissue samples and most clinical data, and allowed such consent to be obtained in a one-time, open-ended or “broad” fashion. Conducting a survey across multiple sites provides clear advantages over either a single-site survey or a large online database, and is a potentially powerful way of understanding the views of diverse populations on this topic.
Methods: A workgroup of the Electronic Medical Records and Genomics (eMERGE) Network, a national consortium of 9 sites (13 separate institutions, 11 clinical centers) supported by the National Human Genome Research Institute (NHGRI) that combines DNA biorepositories with electronic medical record (EMR) systems for large-scale genetic research, conducted a survey to understand patients’ views on consent, sample and data sharing for future research, biobank governance, data protection, and return of research results.
Results: Working across 9 sites to design and conduct a national survey presented challenges in organization, meeting human subjects guidelines at each institution, and survey development and implementation. The challenges were met through a committee structure to address each aspect of the project with representatives from all sites. Each committee’s output was integrated into the overall survey plan. A number of site-specific issues were successfully managed, allowing the survey to be developed and implemented uniformly across 11 clinical centers.
Conclusions: Conducting a survey across a number of institutions with different cultures and practices is a methodological and logistical challenge. With a clear infrastructure, collaborative attitudes, excellent lines of communication, and the right expertise, this can be accomplished successfully.
Affiliation(s)
- Maureen E Smith
  - Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, 645 N. Michigan Avenue, Chicago, IL, 60611, USA
- Saskia C Sanderson
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA; University College London, London, UK
- Kyle B Brothers
  - University of Louisville School of Medicine, Louisville, KY, USA
- Melanie F Myers
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Sharon Aufox
  - Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, 645 N. Michigan Avenue, Chicago, IL, 60611, USA
- Martha J Shrubsole
  - Vanderbilt University Medical Center and Vanderbilt University, Nashville, TN, USA
- Nathaniel D Mercaldo
  - Vanderbilt University Medical Center and Vanderbilt University, Nashville, TN, USA
- Ellen Wright Clayton
  - Vanderbilt University Medical Center and Vanderbilt University, Nashville, TN, USA
- Melissa Basford
  - Vanderbilt University Medical Center and Vanderbilt University, Nashville, TN, USA
- John J Connolly
  - The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Dave Kaufman
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Terri Kitchner
  - Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Rongling Li
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sarah Stallings
  - Vanderbilt University Medical Center and Vanderbilt University, Nashville, TN, USA
- Ingrid A Holm
  - Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
|
11
|
Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, Pacheco JA, Tromp G, Pathak J, Carrell DS, Ellis SB, Lingren T, Thompson WK, Savova G, Haines J, Roden DM, Harris PA, Denny JC. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc 2016; 23:1046-1052. [PMID: 27026615 PMCID: PMC5070514 DOI: 10.1093/jamia/ocv202] [Citation(s) in RCA: 213] [Impact Index Per Article: 26.6] [Received: 08/19/2015] [Revised: 10/27/2015] [Accepted: 11/25/2015] [Indexed: 01/29/2023] Open
Abstract
OBJECTIVE Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems. MATERIALS AND METHODS We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http://phekb.org), an online environment supporting the workflow of building, sharing, and validating electronic phenotype algorithms. We analyze the most frequent components used in algorithms and their performance at authoring institutions and secondary implementation sites. RESULTS As of June 2015, PheKB contained 30 finalized phenotype algorithms and 62 algorithms in development spanning a range of traits and diseases. Phenotypes have had over 3500 unique views in a 6-month period and have been reused by other institutions. International Classification of Disease codes were the most frequently used component, followed by medications and natural language processing. Among algorithms with published performance data, the median PPV was nearly identical when evaluated at the authoring institutions (n = 44; case 96.0%, control 100%) compared to implementation sites (n = 40; case 97.5%, control 100%). DISCUSSION These results demonstrate that a broad range of algorithms to mine electronic health record data from different health systems can be developed with high PPV, and algorithms developed at one site are generally transportable to others. CONCLUSION By providing a central repository, PheKB enables improved development, transportability, and validity of algorithms for research-grade phenotypes using health care generated data.
Affiliation(s)
- Peter Speltz
  - Vanderbilt University Medical Center, Nashville, TN, USA
- Luke V Rasmussen
  - Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
- Omri Gottesman
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Todd Lingren
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Will K Thompson
  - Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
- Guergana Savova
  - Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Dan M Roden
  - Vanderbilt University Medical Center, Nashville, TN, USA
- Paul A Harris
  - Vanderbilt University Medical Center, Nashville, TN, USA
- Joshua C Denny
  - Vanderbilt University Medical Center, Nashville, TN, USA
|
12
|
Abstract
The Mid-South Clinical Data Research Network (CDRN) encompasses three large health systems: (1) Vanderbilt Health System (VU) with electronic medical records for over 2 million patients, (2) the Vanderbilt Healthcare Affiliated Network (VHAN) which currently includes over 40 hospitals, hundreds of ambulatory practices, and over 3 million patients in the Mid-South, and (3) Greenway Medical Technologies, with access to 24 million patients nationally. Initial goals of the Mid-South CDRN include: (1) expansion of our VU data network to include the VHAN and Greenway systems, (2) developing data integration/interoperability across the three systems, (3) improving our current tools for extracting clinical data, (4) optimization of tools for collection of patient-reported data, and (5) expansion of clinical decision support. By 18 months, we anticipate our CDRN will robustly support projects in comparative effectiveness research, pragmatic clinical trials, and other key research areas and have the capacity to share data and health information technology tools nationally.
Affiliation(s)
- S Trent Rosenbloom
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Paul Harris
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jill Pulley
  - Office of Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA; Office of Personalized Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Melissa Basford
  - Office of Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jason Grant
  - Vanderbilt Health Affiliated Network, Nashville, Tennessee, USA
- Russell L Rothman
  - Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA; Center for Health Services Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
|
13
|
Shirey-Rice J, Mapes B, Basford M, Zufelt A, Wehbe F, Harris P, Alcorn M, Allen D, Arnim M, Autry S, Briggs MS, Carnegie A, Chavis-Keeling D, De La Pena C, Dworschak D, Earnest J, Grieb T, Guess M, Hafer N, Johnson T, Kasper A, Kopp J, Lockie T, Lombardo V, McHale L, Minogue A, Nunnally B, O'Quinn D, Peck K, Pemberton K, Perry C, Petrie G, Pontello A, Posner R, Rehman B, Roth D, Sacksteder P, Scahill S, Schieri L, Simpson R, Skinner A, Toussant K, Turner A, Van der Put E, Wasser J, Webb CD, Williams M, Wiseman L, Yasko L, Pulley J. The CTSA Consortium's Catalog of Assets for Translational and Clinical Health Research (CATCHR). Clin Transl Sci 2014; 7:100-7. [PMID: 24456567 DOI: 10.1111/cts.12144] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022] Open
Abstract
The 61 CTSA Consortium sites are home to valuable programs and infrastructure supporting translational science and all are charged with ensuring that such investments translate quickly to improved clinical care. Catalog of Assets for Translational and Clinical Health Research (CATCHR) is the Consortium's effort to collect and make available information on programs and resources to maximize efficiency and facilitate collaborations. By capturing information on a broad range of assets supporting the entire clinical and translational research spectrum, CATCHR aims to provide the necessary infrastructure and processes to establish and maintain an open-access, searchable database of consortium resources to support multisite clinical and translational research studies. Data are collected using rigorous, defined methods, with the resulting information made visible through an integrated, searchable Web-based tool. Additional easy-to-use Web tools assist resource owners in validating and updating resource information over time. In this paper, we discuss the design and scope of the project, data collection methods, current results, and future plans for development and sustainability. With increasing pressure on research programs to avoid redundancy, CATCHR aims to make available information on programs and core facilities to maximize efficient use of resources.
|
14
|
Newton KM, Peissig PL, Kho AN, Bielinski SJ, Berg RL, Choudhary V, Basford M, Chute CG, Kullo IJ, Li R, Pacheco JA, Rasmussen LV, Spangler L, Denny JC. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc 2013; 20:e147-54. [PMID: 23531748 PMCID: PMC3715338 DOI: 10.1136/amiajnl-2012-000896] [Citation(s) in RCA: 270] [Impact Index Per Article: 24.5] [Received: 02/12/2012] [Accepted: 03/05/2013] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Genetic studies require precise phenotype definitions, but electronic medical record (EMR) phenotype data are recorded inconsistently and in a variety of formats. OBJECTIVE To present lessons learned about validation of EMR-based phenotypes from the Electronic Medical Records and Genomics (eMERGE) studies. MATERIALS AND METHODS The eMERGE network created and validated 13 EMR-derived phenotype algorithms. Network sites are Group Health, Marshfield Clinic, Mayo Clinic, Northwestern University, and Vanderbilt University. RESULTS By validating EMR-derived phenotypes we learned that: (1) multisite validation improves phenotype algorithm accuracy; (2) targets for validation should be carefully considered and defined; (3) specifying time frames for review of variables eases validation time and improves accuracy; (4) using repeated measures requires defining the relevant time period and specifying the most meaningful value to be studied; (5) patient movement in and out of the health plan (transience) can result in incomplete or fragmented data; (6) the review scope should be defined carefully; (7) particular care is required in combining EMR and research data; (8) medication data can be assessed using claims, medications dispensed, or medications prescribed; (9) algorithm development and validation work best as an iterative process; and (10) validation by content experts or structured chart review can provide accurate results. CONCLUSIONS Despite the diverse structure of the five EMRs of the eMERGE sites, we developed, validated, and successfully deployed 13 electronic phenotype algorithms. Validation is a worthwhile process that not only measures phenotype performance but also strengthens phenotype algorithm definitions and enhances their inter-institutional sharing.
|
15
|
McGuire AL, Basford M, Dressler LG, Fullerton SM, Koenig BA, Li R, McCarty CA, Ramos E, Smith ME, Somkin CP, Waudby C, Wolf WA, Clayton EW. Ethical and practical challenges of sharing data from genome-wide association studies: the eMERGE Consortium experience. Genome Res 2011; 21:1001-7. [PMID: 21632745 DOI: 10.1101/gr.120329.111] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022]
Abstract
In 2007, the National Human Genome Research Institute (NHGRI) established the Electronic MEdical Records and GEnomics (eMERGE) Consortium (www.gwas.net) to develop, disseminate, and apply approaches to research that combine DNA biorepositories with electronic medical record (EMR) systems for large-scale, high-throughput genetic research. One of the major ethical and administrative challenges for the eMERGE Consortium has been complying with existing data-sharing policies. This paper discusses the challenges of sharing genomic data linked to health information in the electronic medical record (EMR) and explores the issues as they relate to sharing both within a large consortium and in compliance with the National Institutes of Health (NIH) data-sharing policy. We use the eMERGE Consortium experience to explore data-sharing challenges from the perspective of multiple stakeholders (i.e., research participants, investigators, and research institutions), provide recommendations for researchers and institutions, and call for clearer guidance from the NIH regarding ethical implementation of its data-sharing policy.
Affiliation(s)
- Amy L McGuire
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA.
|
16
|
Pathak J, Wang J, Kashyap S, Basford M, Li R, Masys DR, Chute CG. Mapping clinical phenotype data elements to standardized metadata repositories and controlled terminologies: the eMERGE Network experience. J Am Med Inform Assoc 2011; 18:376-86. [PMID: 21597104 DOI: 10.1136/amiajnl-2010-000061] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.2] [Indexed: 11/04/2022] Open
Abstract
BACKGROUND Systematic study of clinical phenotypes is important for a better understanding of the genetic basis of human diseases and more effective gene-based disease management. A key aspect in facilitating such studies requires standardized representation of the phenotype data using common data elements (CDEs) and controlled biomedical vocabularies. In this study, the authors analyzed how a limited subset of phenotypic data is amenable to common definition and standardized collection, as well as how their adoption in large-scale epidemiological and genome-wide studies can significantly facilitate cross-study analysis. METHODS The authors mapped phenotype data dictionaries from five different eMERGE (Electronic Medical Records and Genomics) Network sites studying multiple diseases such as peripheral arterial disease and type 2 diabetes. For mapping, standardized terminological and metadata repository resources, such as the caDSR (Cancer Data Standards Registry and Repository) and SNOMED CT (Systematized Nomenclature of Medicine), were used. The mapping process comprised both lexical (via searching for relevant pre-coordinated concepts and data elements) and semantic (via post-coordination) techniques. Where feasible, new data elements were curated to enhance the coverage during mapping. A web-based application was also developed to uniformly represent and query the mapped data elements from different eMERGE studies. RESULTS Approximately 60% of the target data elements (95 out of 157) could be mapped using simple lexical analysis techniques on pre-coordinated terms and concepts before any additional curation of terminology and metadata resources was initiated by eMERGE investigators. After curation of 54 new caDSR CDEs and nine new NCI thesaurus concepts and using post-coordination, the authors were able to map the remaining 40% of data elements to caDSR and SNOMED CT. A web-based tool was also implemented to assist in semi-automatic mapping of data elements. 
CONCLUSION This study emphasizes the requirement for standardized representation of clinical research data using existing metadata and terminology resources and provides simple techniques and software for data element mapping using experiences from the eMERGE Network.
Affiliation(s)
- Jyotishman Pathak
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota 55905, USA.
|
17
|
Fenner KM, Basford M. Get yourself a dog: a strategy for avoiding acquisition. Trustee 1997; 50:28, 30. [PMID: 10173678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
|