1
Suver C, Thorogood A, Doerr M, Wilbanks J, Knoppers B. Bringing Code to Data: Do Not Forget Governance. J Med Internet Res 2020; 22:e18087. [PMID: 32540846 PMCID: PMC7420687 DOI: 10.2196/18087] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/02/2020] [Revised: 05/21/2020] [Accepted: 06/11/2020] [Indexed: 11/13/2022]
Abstract
Developing or independently evaluating algorithms in biomedical research is difficult because of restrictions on access to clinical data. Access is restricted because of privacy concerns, the proprietary treatment of data by institutions (fueled in part by the cost of data hosting, curation, and distribution), concerns over misuse, and the complexities of applicable regulatory frameworks. The use of cloud technology and services can address many of the barriers to data sharing. For example, researchers can access data in high performance, secure, and auditable cloud computing environments without the need for copying or downloading. An alternative path to accessing data sets requiring additional protection is the model-to-data approach. In model-to-data, researchers submit algorithms to run on secure data sets that remain hidden. Model-to-data is designed to enhance security and local control while enabling communities of researchers to generate new knowledge from sequestered data. Model-to-data has not yet been widely implemented, but pilots have demonstrated its utility when technical or legal constraints preclude other methods of sharing. We argue that model-to-data can make a valuable addition to our data sharing arsenal, with 2 caveats. First, model-to-data should only be adopted where necessary to supplement rather than replace existing data-sharing approaches given that it requires significant resource commitments from data stewards and limits scientific freedom, reproducibility, and scalability. Second, although model-to-data reduces concerns over data privacy and loss of local control when sharing clinical data, it is not an ethical panacea. Data stewards will remain hesitant to adopt model-to-data approaches without guidance on how to do so responsibly. To address this gap, we explored how commitments to open science, reproducibility, security, respect for data subjects, and research ethics oversight must be re-evaluated in a model-to-data context.
Affiliation(s)
- Adrian Thorogood
- Centre of Genomics and Policy, McGill University, Montreal, QC, Canada
- Megan Doerr
- Sage Bionetworks, Seattle, WA, United States
- Bartha Knoppers
- Centre of Genomics and Policy, McGill University, Montreal, QC, Canada
2
Sun C, Li H, Mills RE, Guan Y. Prognostic model for multiple myeloma progression integrating gene expression and clinical features. Gigascience 2019; 8:giz153. [PMID: 31886876 PMCID: PMC6936209 DOI: 10.1093/gigascience/giz153] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/11/2019] [Revised: 12/05/2019] [Accepted: 12/06/2019] [Indexed: 01/18/2023]
Abstract
BACKGROUND: Multiple myeloma (MM) is a hematological cancer caused by abnormal accumulation of monoclonal plasma cells in bone marrow. With the increase in treatment options, risk-adapted therapy is becoming more and more important. Survival analysis is commonly applied to study progression or other events of interest and stratify the risk of patients.
RESULTS: In this study, we present the current state-of-the-art model for MM prognosis and the molecular biomarker set for stratification: the winning algorithm in the 2017 Multiple Myeloma DREAM Challenge, Sub-Challenge 3. Specifically, we built a non-parametric complete hazard ranking model to map the right-censored data into a linear space, where commonplace machine learning techniques, such as Gaussian process regression and random forests, can play their roles. Our model integrated both the gene expression profile and clinical features to predict the progression of MM. Compared with conventional models, such as Cox model and random survival forests, our model achieved higher accuracy in 3 within-cohort predictions. In addition, it showed robust predictive power in cross-cohort validations. Key molecular signatures related to MM progression were identified from our model, which may function as the core determinants of MM progression and provide important guidance for future research and clinical practice. Functional enrichment analysis and mammalian gene-gene interaction network revealed crucial biological processes and pathways involved in MM progression. The model is dockerized and publicly available at https://www.synapse.org/#!Synapse:syn11459638. Both data and reproducible code are included in the docker.
CONCLUSIONS: We present the current state-of-the-art prognostic model for MM integrating gene expression and clinical features validated in an independent test set.
Affiliation(s)
- Chen Sun
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Hongyang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Ryan E Mills
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, 1241 East Catherine Street, Ann Arbor, MI 48109, USA
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Department of Internal Medicine, Nephrology Division, University of Michigan, 1150 West Medical Center Drive, Ann Arbor, MI 48109, USA
3
White BS, Lanc I, O'Neal J, Gupta H, Fulton RS, Schmidt H, Fronick C, Belter EA, Fiala M, King J, Ahmann GJ, DeRome M, Mardis ER, Vij R, DiPersio JF, Levy J, Auclair D, Tomasson MH. A multiple myeloma-specific capture sequencing platform discovers novel translocations and frequent, risk-associated point mutations in IGLL5. Blood Cancer J 2018; 8:35. [PMID: 29563506 PMCID: PMC5862875 DOI: 10.1038/s41408-018-0062-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 07/21/2017] [Revised: 12/10/2017] [Accepted: 12/18/2017] [Indexed: 12/28/2022]
Abstract
Multiple myeloma (MM) is a disease of copy number variants (CNVs), chromosomal translocations, and single-nucleotide variants (SNVs). To enable integrative studies across these diverse mutation types, we developed a capture-based sequencing platform to detect their occurrence in 465 genes altered in MM and used it to sequence 95 primary tumor-normal pairs to a mean depth of 104×. We detected cases of hyperdiploidy (23%), deletions of 1p (8%), 6q (21%), 8p (17%), 14q (16%), 16q (22%), and 17p (4%), and amplification of 1q (19%). We also detected IGH and MYC translocations near expected frequencies and non-silent SNVs in NRAS (24%), KRAS (21%), FAM46C (17%), TP53 (9%), DIS3 (9%), and BRAF (3%). We discovered frequent mutations in IGLL5 (18%) that were mutually exclusive of RAS mutations and associated with increased risk of disease progression (p = 0.03), suggesting that IGLL5 may be a stratifying biomarker. We identified novel IGLL5/IGH translocations in two samples. We subjected 15 of the pairs to ultra-deep sequencing (1259×) and found that although depth correlated with number of mutations detected (p = 0.001), depth past ~300× added little. The platform provides cost-effective genomic analysis for research and may be useful in individualizing treatment decisions in clinical settings.
Affiliation(s)
- Brian S White
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Sage Bionetworks, Seattle, WA, 91809, USA
- Irena Lanc
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Julie O'Neal
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Harshath Gupta
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Robert S Fulton
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, 63108, MO, USA
- Heather Schmidt
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, 63108, MO, USA
- Catrina Fronick
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, 63108, MO, USA
- Edward A Belter
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, 63108, MO, USA
- Mark Fiala
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Justin King
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Greg J Ahmann
- Division of Hematology-Oncology, Mayo Clinic, Rochester, 55905, MN, USA
- Mary DeRome
- Multiple Myeloma Research Foundation, Norwalk, CT, 06851, USA
- Elaine R Mardis
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, 63108, MO, USA
- Genomics Institute, Nationwide Children's Hospital, Columbus, OH, 43205, USA
- Ravi Vij
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- John F DiPersio
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Joan Levy
- Multiple Myeloma Research Foundation, Norwalk, CT, 06851, USA
- Chordoma Foundation, Durham, NC, 27702, USA
- Daniel Auclair
- Multiple Myeloma Research Foundation, Norwalk, CT, 06851, USA
- Michael H Tomasson
- Department of Medicine, Washington University School of Medicine, St. Louis, 63110, MO, USA
- Division of Hematology, Oncology and Bone Marrow Transplantation, 5204 MERF, University of Iowa, Iowa City, IA, 52242, USA