1
Panja S, Rahem S, Chu CJ, Mitrofanova A. Big Data to Knowledge: Application of Machine Learning to Predictive Modeling of Therapeutic Response in Cancer. Curr Genomics 2021; 22:244-266. [PMID: 35273457 PMCID: PMC8822229 DOI: 10.2174/1389202921999201224110101] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/02/2020] [Revised: 09/16/2020] [Accepted: 09/30/2020] [Indexed: 11/22/2022] Open
Abstract
Background: In recent years, the availability of high-throughput technologies, the establishment of large molecular patient data repositories, and advances in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, require a sophisticated human-machine interaction that allows effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective: In this review, we discuss machine learning techniques used for modeling treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We outline their mathematical foundations and discuss their limitations and alternative approaches in light of their application to therapeutic response modeling in cancer. Conclusion: We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will define even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.
Affiliation(s)
- Antonina Mitrofanova
- Address correspondence to this author at the Department of Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, Newark, NJ 07107, USA; E-mail:
2
Robinson T, Harkin J, Shukla P. Hardware Acceleration of Genomics Data Analysis: Challenges and Opportunities. Bioinformatics 2021; 37:1785-1795. [PMID: 34037688 PMCID: PMC8317111 DOI: 10.1093/bioinformatics/btab017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/17/2020] [Revised: 11/03/2020] [Accepted: 05/24/2021] [Indexed: 12/11/2022] Open
Abstract
The significant decline in the cost of genome sequencing has dramatically changed the typical bioinformatics pipeline for analysing sequencing data. The computational challenge of sequencing, traditionally the primary concern, is now secondary to genomic data analysis. Short read alignment (SRA) is a ubiquitous process within every modern bioinformatics pipeline in the field of genomics and is often regarded as the principal computational bottleneck. Many hardware and software approaches have been proposed to address the challenge of acceleration. However, previous attempts to increase throughput using many-core processing strategies have enjoyed limited success, mainly due to a dependence on global memory for each computational block. The limited scalability and high energy costs of many-core SRA implementations pose a significant constraint in maintaining acceleration. The Networks-On-Chip (NoC) hardware interconnect mechanism has advanced the scalability of many-core computing systems and, more recently, has demonstrated potential in SRA implementations by integrating multiple computational blocks such as pre-alignment filtering and sequence alignment efficiently, while minimising memory latency and global memory access. This paper provides a state-of-the-art review of current hardware acceleration strategies for genomic data analysis, and it establishes the challenges and opportunities of utilising NoCs as a critical building block in next-generation sequencing (NGS) technologies for advancing the speed of analysis.
Affiliation(s)
- Tony Robinson
- School of Computing, Engineering and Intelligent Systems, Ulster University, Magee Campus, Derry/Londonderry, BT48 7JL, UK
- Jim Harkin
- School of Computing, Engineering and Intelligent Systems, Ulster University, Magee Campus, Derry/Londonderry, BT48 7JL, UK
- Priyank Shukla
- Northern Ireland Centre for Stratified Medicine, Biomedical Sciences Research Institute, Ulster University, C-TRIC Building, Altnagelvin Area Hospital, Derry/Londonderry, BT47 6SB, UK
3
Diossy M, Sztupinszki Z, Krzystanek M, Borcsok J, Eklund AC, Csabai I, Pedersen AG, Szallasi Z. Strand Orientation Bias Detector to determine the probability of FFPE sequencing artifacts. Brief Bioinform 2021; 22:6278604. [PMID: 34015811 DOI: 10.1093/bib/bbab186] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/04/2021] [Revised: 03/11/2021] [Accepted: 04/22/2021] [Indexed: 12/20/2022] Open
Abstract
Formalin-fixed paraffin-embedded tissue, the most common tissue specimen stored in clinical practice, presents analytical challenges due to formalin-induced artifacts. Here, we present Strand Orientation Bias Detector (SOBDetector), a flexible computational platform compatible with all the common somatic SNV-calling pipelines, designed to estimate the probability that a given detected mutation is an artifact. The underlying predictor mechanism is based on the posterior distribution of a Bayesian logistic regression model trained on The Cancer Genome Atlas whole exomes. SOBDetector is a freely available cross-platform program, implemented in Java 1.8.
Affiliation(s)
- Judit Borcsok
- University of Copenhagen and at the Danish Cancer Society, Copenhagen, Denmark
- István Csabai
- Department of Complex Physics, Eotvos Lorand University, Budapest, Hungary
- Zoltan Szallasi
- Boston Children's Hospital and Harvard Medical School, Boston, USA
4
Camp SY, Kofman E, Reardon B, Moore ND, Al-Rubaish AM, Aljumaan M, Al-Ali AK, Van Allen EM, Taylor-Weiner A, AlDubayan SH. Evaluating the molecular diagnostic yield of joint genotyping-based approach for detecting rare germline pathogenic and putative loss-of-function variants. Genet Med 2021; 23:918-926. [PMID: 33531667 PMCID: PMC8720416 DOI: 10.1038/s41436-020-01074-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/24/2020] [Revised: 12/08/2020] [Accepted: 12/15/2020] [Indexed: 12/30/2022] Open
Abstract
PURPOSE: Cohort-based germline variant characterization is the standard approach for pathogenic variant discovery in clinical and research samples. However, the impact of cohort size on the molecular diagnostic yield of joint genotyping is largely unknown. METHODS: Head-to-head comparison of the molecular diagnostic yield of joint genotyping in two cohorts of 239 cancer patients in the absence and then in the presence of 100 additional germline exomes. RESULTS: In 239 testicular cancer patients, 4 (7.4%, 95% confidence interval [CI]: 2.1-17.9) of 54 pathogenic variants in the cancer predisposition and American College of Medical Genetics and Genomics (ACMG) genes were missed by one or both computational runs of joint genotyping. Similarly, 8 (12.1%, 95% CI: 5.4-22.5) of 66 pathogenic variants in these genes were undetected by joint genotyping in another independent cohort of 239 breast cancer patients. An exome-wide analysis of putative loss-of-function (pLOF) variants in the testicular cancer cohort showed that 162 (8.2%, 95% CI: 7.1-9.6) pLOF variants were only detected in one analysis run but not the other, while 433 (22.0%, 95% CI: 20.2-23.9) pLOF variants were filtered out by both analyses despite having sufficient sequencing coverage. CONCLUSION: Our analysis of the standard germline variant detection method highlighted a substantial impact of concurrently analyzing additional genomic data sets on the ability to detect clinically relevant germline pathogenic variants.
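The proportions and confidence intervals quoted above (for example, 4 of 54 variants missed) can be checked with a short calculation. The abstract does not say which interval method the authors used; the sketch below assumes an exact binomial (Clopper-Pearson) interval, and the function names `binom_cdf` and `clopper_pearson` are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for the proportion x/n.

    Assumption: the source abstract does not name its interval method;
    Clopper-Pearson is used here only as a plausible choice.
    """

    def bisect(pred):
        # pred is True for small p and False for large p; find the switch point.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if pred(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x | p) reaches alpha / 2.
    lower = 0.0 if x == 0 else bisect(lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= x | p) falls to alpha / 2.
    upper = 1.0 if x == n else bisect(lambda p: binom_cdf(x, n, p) > alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Counts taken from the abstract: missed pathogenic variants / total.
    for missed, total in [(4, 54), (8, 66)]:
        lo, hi = clopper_pearson(missed, total)
        print(f"{missed}/{total} = {missed / total:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

Exact intervals are conservative for small counts, which is why they are a common default when only a handful of events is observed; other methods (Wilson, for instance) would give slightly different bounds.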
Affiliation(s)
- Sabrina Y Camp
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric Kofman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Brendan Reardon
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nathanael D Moore
- Internal Medicine Residency Program, University of Cincinnati, Cincinnati, OH, USA
- Mohammed Aljumaan
- College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Amein K Al-Ali
- College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Eliezer M Van Allen
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Saud H AlDubayan
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
- College of Medicine, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
5
Rehder C, Bean LJH, Bick D, Chao E, Chung W, Das S, O'Daniel J, Rehm H, Shashi V, Vincent LM. Next-generation sequencing for constitutional variants in the clinical laboratory, 2021 revision: a technical standard of the American College of Medical Genetics and Genomics (ACMG). Genet Med 2021; 23:1399-1415. [PMID: 33927380 DOI: 10.1038/s41436-021-01139-4] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 02/25/2021] [Revised: 02/25/2021] [Accepted: 02/26/2021] [Indexed: 12/17/2022] Open
Abstract
Next-generation sequencing (NGS) technologies are now established in clinical laboratories as a primary testing modality in genomic medicine. These technologies have reduced the cost of large-scale sequencing by several orders of magnitude. It is now cost-effective to analyze an individual with disease-targeted gene panels, exome sequencing, or genome sequencing to assist in the diagnosis of a wide array of clinical scenarios. While clinical validation and use of NGS in many settings is established, there are continuing challenges as technologies and the associated informatics evolve. To assist clinical laboratories with the validation of NGS methods and platforms, the ongoing monitoring of NGS testing to ensure quality results, and the interpretation and reporting of variants found using these technologies, the American College of Medical Genetics and Genomics (ACMG) has developed the following technical standards.
Affiliation(s)
- Lora J H Bean
- Department of Human Genetics, Emory University, Atlanta, GA, USA
- David Bick
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Elizabeth Chao
- Division of Genetics and Genomics, Department of Pediatrics, University of California, Irvine, CA, USA
- Wendy Chung
- Departments of Pediatrics and Medicine, Columbia University, New York, NY, USA
- Soma Das
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Julianne O'Daniel
- Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Heidi Rehm
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Vandana Shashi
- Department of Pediatrics, Duke University, Durham, NC, USA
- Lisa M Vincent
- Division of Pathology & Laboratory Medicine, Children's National Health System, Washington, DC, USA; Departments of Pathology and Pediatrics, George Washington University, Washington, DC, USA
6
DNA Sequencing Flow Cells and the Security of the Molecular-Digital Interface. PROCEEDINGS ON PRIVACY ENHANCING TECHNOLOGIES 2021. [DOI: 10.2478/popets-2021-0054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
DNA sequencing is the molecular-to-digital conversion of DNA molecules, which are made up of a linear sequence of bases (A,C,G,T), into digital information. Central to this conversion are specialized fluidic devices, called sequencing flow cells, that distribute DNA onto a surface where the molecules can be read. As more computing becomes integrated with physical systems, we set out to explore how sequencing flow cell architecture can affect the security and privacy of the sequencing process and downstream data analysis. In the course of our investigation, we found that the unusual nature of molecular processing and flow cell design contributes to two security and privacy issues. First, DNA molecules are ‘sticky’ and stable for long periods of time. In a manner analogous to data recovery from discarded hard drives, we hypothesized that residual DNA attached to used flow cells could be collected and re-sequenced to recover a significant portion of the previously sequenced data. In experiments we were able to recover over 23.4% of a previously sequenced genome sample and perfectly decode image files encoded in DNA, suggesting that flow cells may be at risk of data recovery attacks. Second, we hypothesized that methods used to simultaneously sequence separate DNA samples together to increase sequencing throughput (multiplex sequencing), which incidentally leaks small amounts of data between samples, could cause data corruption and allow samples to adversarially manipulate sequencing data. We find that a maliciously crafted synthetic DNA sample can be used to alter targeted genetic variants in other samples using this vulnerability. Such a sample could be used to corrupt sequencing data or even be spiked into tissue samples, whenever untrusted samples are sequenced together. 
Taken together, these results suggest that, like many computing boundaries, the molecular-to-digital interface raises potential issues that should be considered in future sequencing and molecular sensing systems, especially as they become more ubiquitous.
7
To portray clonal evolution in blood cancer, count your stem cells. Blood 2021; 137:1862-1870. [PMID: 33512426 DOI: 10.1182/blood.2020008407] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/27/2020] [Accepted: 12/05/2020] [Indexed: 12/18/2022] Open
Abstract
Clonal evolution, the process of expansion and diversification of mutated cells, plays an important role in cancer development, resistance, and relapse. Although clonal evolution is most often conceived of as driven by natural selection, recent studies uncovered that neutral evolution shapes clonal evolution in a significant proportion of solid cancers. In hematological malignancies, the interplay between neutral evolution and natural selection is also disputed. Because natural selection selects cells with a greater fitness, providing a growth advantage to some cells relative to others, the architecture of clonal evolution serves as indirect evidence to distinguish natural selection from neutral evolution and has been associated with different prognoses for the patient. Linear architecture, when the new mutant clone grows within the previous one, is characteristic of hematological malignancies and is typically interpreted as being driven by natural selection. Here, we discuss the role of natural selection and neutral evolution in the production of linear clonal architectures in hematological malignancies. Although it is tempting to attribute linear evolution to natural selection, we argue that a lower number of contributing stem cells accompanied by genetic drift can also result in a linear pattern of evolution, as illustrated by simulations of clonal evolution in hematopoietic stem cells. The number of stem cells contributing to long-term clonal evolution is not known in the pathological context, and we advocate that estimating these numbers in the context of cancer and aging is crucial to parsing out neutral evolution from natural selection, 2 processes that require different therapeutic strategies.
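The drift argument can be illustrated with a toy simulation. This is a minimal sketch and not the authors' simulation: it assumes a Wright-Fisher model with a fixed pool size and no fitness differences, and the names `neutral_drift` and `fixation_fraction` and the pool sizes are hypothetical. Under these assumptions a single neutral mutant takes over a pool of N stem cells with probability 1/N, so complete sweeps without any selection become common when N is small.

```python
import random


def neutral_drift(n_stem_cells, rng, max_generations=100_000):
    """Follow one neutral mutant clone in a fixed-size stem cell pool.

    Returns (fixed, generations): whether the clone took over the pool and
    how many generations passed before it either fixed or went extinct.
    """
    mutants = 1
    for generation in range(1, max_generations + 1):
        freq = mutants / n_stem_cells
        # Wright-Fisher step: every cell of the next generation draws its
        # parent at random, with no advantage for mutant or wild type.
        mutants = sum(rng.random() < freq for _ in range(n_stem_cells))
        if mutants == 0 or mutants == n_stem_cells:
            return mutants == n_stem_cells, generation
    return False, max_generations


def fixation_fraction(n_stem_cells, runs=2000, seed=0):
    """Fraction of independent runs in which the neutral clone fixed."""
    rng = random.Random(seed)
    fixed = sum(neutral_drift(n_stem_cells, rng)[0] for _ in range(runs))
    return fixed / runs


if __name__ == "__main__":
    # Hypothetical pool sizes; the true numbers are what the article asks to be measured.
    for n in (5, 50):
        print(f"N={n}: neutral clone fixed in {fixation_fraction(n):.1%} of runs "
              f"(theory: {1 / n:.1%})")
```

Each neutral sweep leaves every surviving cell carrying the swept mutation, so a later mutation necessarily arises inside the earlier clone, which is the nested pattern read as linear architecture.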
8
Harrington F, Greenslade M, Talaulikar D, Corboy G. Genomic characterisation of diffuse large B-cell lymphoma. Pathology 2021; 53:367-376. [PMID: 33642095 DOI: 10.1016/j.pathol.2020.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/29/2020] [Revised: 12/19/2020] [Accepted: 12/23/2020] [Indexed: 02/09/2023]
Abstract
Diffuse large B-cell lymphoma (DLBCL) is a genomically heterogeneous disease comprising many subtypes that display significantly different clinical outcomes in the context of treatment with conventional immunochemotherapy. Poor clinical outcomes in some subtypes, and imperfect identification of high-risk individuals in otherwise low-risk subgroups, demonstrate there is room for improvement in the subclassification and risk stratification of DLBCL. In addition, more comprehensive profiling may lead to improved treatment selection guided by molecular testing. Existing characterisation and risk stratification strategies, such as division of DLBCL into activated B-cell (ABC) and germinal centre B-cell (GCB) subtypes, although prognostically useful, may oversimplify the underlying biology and have proven to be less useful in improving therapy selection. Several groups have proposed more predictive prognostic models based on molecular testing, with potentially more relevance to therapy choice. These alternative approaches use more resource-intensive comprehensive genomic profiling strategies, which are challenging to implement in diagnostic laboratories. The addition of genomic testing to the subclassification of DLBCL shows promise, but laboratories must identify testing strategies relevant to clinical practice. A consensus on optimal molecular profiling techniques is yet to be achieved. In this article we review various next-generation sequencing-based analytical techniques and molecular classification models proposed recently. Emerging therapeutics where molecular profiling may guide patient selection are also reviewed. The potential utility of genomic testing in DLBCL is discussed, in addition to practical considerations when introducing genomics into the diagnostic laboratory.
Affiliation(s)
- Mark Greenslade
- Diagnostic Genetics, LabPlus, Auckland City Hospital, Grafton, New Zealand
- Dipti Talaulikar
- Department of Haematology, Canberra Hospital, ACT, Australia; College of Health and Medicine, Australian National University, Canberra, ACT, Australia
- Greg Corboy
- Diagnostic Genetics, LabPlus, Auckland City Hospital, Grafton, New Zealand; Department of Molecular Medicine and Pathology, Faculty of Medical and Health Sciences, The University of Auckland, Auckland, New Zealand; School of Clinical Sciences, Monash University, Clayton, Vic, Australia; Department of Clinical Pathology, The University of Melbourne, Parkville, Vic, Australia
9
Investigating the importance of individual mitochondrial genotype in susceptibility to drug-induced toxicity. Biochem Soc Trans 2021; 48:787-797. [PMID: 32453388 PMCID: PMC7329340 DOI: 10.1042/bst20190233] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/24/2020] [Revised: 04/30/2020] [Accepted: 05/01/2020] [Indexed: 12/13/2022]
Abstract
The mitochondrion is an essential organelle responsible for generating cellular energy. Additionally, mitochondria are a source of inter-individual variation as they contain their own genome. Evidence has revealed that mitochondrial DNA (mtDNA) variation can confer differences in mitochondrial function and importantly, these differences may be a factor underlying the idiosyncrasies associated with unpredictable drug-induced toxicities. Thus far, preclinical and clinical data are limited but have revealed evidence in support of an association between mitochondrial haplogroup and susceptibility to specific adverse drug reactions. In particular, clinical studies have reported associations between mitochondrial haplogroup and antiretroviral therapy, chemotherapy and antibiotic-induced toxicity, although study limitations and conflicting findings mean that the importance of mtDNA variation to toxicity remains unclear. Several studies have used transmitochondrial cybrid cells as personalised models with which to study the impact of mitochondrial genetic variation. Cybrids allow the effects of mtDNA to be assessed against a stable nuclear background and thus the in vitro elucidation of the fundamental mechanistic basis of such differences. Overall, the current evidence supports the tenet that mitochondrial genetics represent an exciting area within the field of personalised medicine and drug toxicity. However, further research effort is required to confirm its importance. In particular, efforts should focus upon translational research to connect preclinical and clinical data that can inform whether mitochondrial genetics can be useful to identify at risk individuals or inform risk assessment during drug development.
10
AlDubayan SH, Conway JR, Camp SY, Witkowski L, Kofman E, Reardon B, Han S, Moore N, Elmarakeby H, Salari K, Choudhry H, Al-Rubaish AM, Al-Sulaiman AA, Al-Ali AK, Taylor-Weiner A, Van Allen EM. Detection of Pathogenic Variants With Germline Genetic Testing Using Deep Learning vs Standard Methods in Patients With Prostate Cancer and Melanoma. JAMA 2020; 324:1957-1969. [PMID: 33201204 PMCID: PMC7672519 DOI: 10.1001/jama.2020.20457] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 11/22/2019] [Accepted: 10/06/2020] [Indexed: 12/15/2022]
Abstract
Importance: Less than 10% of patients with cancer have detectable pathogenic germline alterations, which may be partially due to incomplete pathogenic variant detection. Objective: To evaluate if deep learning approaches identify more germline pathogenic variants in patients with cancer. Design, Setting, and Participants: A cross-sectional study of a standard germline detection method and a deep learning method in 2 convenience cohorts with prostate cancer and melanoma enrolled in the US and Europe between 2010 and 2017. The final date of clinical data collection was December 2017. Exposures: Germline variant detection using standard or deep learning methods. Main Outcomes and Measures: The primary outcomes included pathogenic variant detection performance in 118 cancer-predisposition genes estimated as sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The secondary outcomes were pathogenic variant detection performance in 59 genes deemed actionable by the American College of Medical Genetics and Genomics (ACMG) and 5197 clinically relevant mendelian genes. True sensitivity and true specificity could not be calculated due to lack of a criterion reference standard, but were estimated as the proportion of true-positive variants and true-negative variants, respectively, identified by each method in a reference variant set that consisted of all variants judged to be valid from either approach. Results: The prostate cancer cohort included 1072 men (mean [SD] age at diagnosis, 63.7 [7.9] years; 857 [79.9%] with European ancestry) and the melanoma cohort included 1295 patients (mean [SD] age at diagnosis, 59.8 [15.6] years; 488 [37.7%] women; 1060 [81.9%] with European ancestry).
The deep learning method identified more patients with pathogenic variants in cancer-predisposition genes than the standard method (prostate cancer: 198 vs 182; melanoma: 93 vs 74); sensitivity (prostate cancer: 94.7% vs 87.1% [difference, 7.6%; 95% CI, 2.2% to 13.1%]; melanoma: 74.4% vs 59.2% [difference, 15.2%; 95% CI, 3.7% to 26.7%]), specificity (prostate cancer: 64.0% vs 36.0% [difference, 28.0%; 95% CI, 1.4% to 54.6%]; melanoma: 63.4% vs 36.6% [difference, 26.8%; 95% CI, 17.6% to 35.9%]), PPV (prostate cancer: 95.7% vs 91.9% [difference, 3.8%; 95% CI, -1.0% to 8.4%]; melanoma: 54.4% vs 35.4% [difference, 19.0%; 95% CI, 9.1% to 28.9%]), and NPV (prostate cancer: 59.3% vs 25.0% [difference, 34.3%; 95% CI, 10.9% to 57.6%]; melanoma: 80.8% vs 60.5% [difference, 20.3%; 95% CI, 10.0% to 30.7%]). For the ACMG genes, the sensitivity of the 2 methods was not significantly different in the prostate cancer cohort (94.9% vs 90.6% [difference, 4.3%; 95% CI, -2.3% to 10.9%]), but the deep learning method had a higher sensitivity in the melanoma cohort (71.6% vs 53.7% [difference, 17.9%; 95% CI, 1.82% to 34.0%]). The deep learning method had higher sensitivity in the mendelian genes (prostate cancer: 99.7% vs 95.1% [difference, 4.6%; 95% CI, 3.0% to 6.3%]; melanoma: 91.7% vs 86.2% [difference, 5.5%; 95% CI, 2.2% to 8.8%]). Conclusions and Relevance: Among a convenience sample of 2 independent cohorts of patients with prostate cancer and melanoma, germline genetic testing using deep learning, compared with the current standard genetic testing method, was associated with higher sensitivity and specificity for detection of pathogenic variants. Further research is needed to understand the relevance of these findings with regard to clinical outcomes.
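The four performance measures, and the way they were estimated without an external reference standard, can be sketched as follows. This is one reading of the abstract's description, not the authors' code: the reference set is taken to be every variant reported by either method, each of which has been judged valid or invalid. The function names and the variant identifiers (`v1` to `v6`) are hypothetical.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard definitions of the four measures from confusion-matrix counts."""
    return {
        "sensitivity": ratio(tp, tp + fn),  # valid variants that were called
        "specificity": ratio(tn, tn + fp),  # invalid variants that were not called
        "ppv": ratio(tp, tp + fp),          # calls that were valid
        "npv": ratio(tn, tn + fn),          # non-calls that were truly invalid
    }


def evaluate_against_union(calls_a, calls_b, valid):
    """Score two callers against a reference built from both call sets.

    Assumption: `valid` holds the variants judged to be real after review,
    and the reference universe is everything reported by either method.
    """
    reference = calls_a | calls_b
    results = {}
    for name, calls in (("a", calls_a), ("b", calls_b)):
        tp = len(calls & valid)                  # called and judged valid
        fp = len(calls - valid)                  # called but judged invalid
        fn = len((reference & valid) - calls)    # valid, found only by the other method
        tn = len((reference - valid) - calls)    # invalid, and not called by this method
        results[name] = diagnostic_metrics(tp, fp, fn, tn)
    return results


if __name__ == "__main__":
    # Hypothetical call sets, for illustration only.
    method_a = {"v1", "v2", "v3", "v4"}
    method_b = {"v1", "v5", "v6"}
    judged_valid = {"v1", "v2", "v3", "v5"}
    for name, metrics in evaluate_against_union(method_a, method_b, judged_valid).items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With a reference built this way, an invalid variant counts as a true negative for one method only if the other method called it, which is why the estimates are described as approximations rather than true sensitivity and specificity.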
Affiliation(s)
- Saud H. AlDubayan
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Division of Genetics, Brigham and Women’s Hospital, Boston, Massachusetts
- College of Medicine, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Jake R. Conway
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Division of Medical Sciences, Department of Biomedical Informatics, Harvard University, Boston, Massachusetts
- Sabrina Y. Camp
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Leora Witkowski
- Genetics Training Program, Harvard Medical School, Harvard University, Boston, Massachusetts
- Eric Kofman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Brendan Reardon
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Seunghun Han
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Program in Biological and Biomedical Sciences, Division of Medical Sciences, Harvard University, Boston, Massachusetts
- Nicholas Moore
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Haitham Elmarakeby
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Department of System and Computer Engineering, Al-Azhar University, Cairo, Egypt
- Keyan Salari
- Department of Urology, Massachusetts General Hospital, Boston
- Hani Choudhry
- Department of Biochemistry, Cancer Metabolism and Epigenetic Unit, Faculty of Science, Cancer and Mutagenesis Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Amein K. Al-Ali
- College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Amaro Taylor-Weiner
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
- Division of Medical Sciences, Department of Biomedical Informatics, Harvard University, Boston, Massachusetts
- Eliezer M. Van Allen
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge
11
Abstract
The sequencing and assembly of a reference genome for the horse has been revolutionary for investigation of horse health and performance. Next-generation sequencing (NGS) methods represent a second revolution in equine genomics. Researchers can align and compare DNA and RNA sequencing data to the reference genome to explore variation that may contribute or be attributed to disease. NGS has also facilitated the translation of research discovery to clinically relevant applications. This article discusses the history and development of NGS, details some of the available sequencing platforms, and describes currently available applications in the context of both discovery and clinical settings.
12
Dual Deep Sequencing Improves the Accuracy of Low-Frequency Somatic Mutation Detection in Cancer Gene Panel Testing. Int J Mol Sci 2020; 21:ijms21103530. [PMID: 32429412 PMCID: PMC7278996 DOI: 10.3390/ijms21103530] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/01/2020] [Revised: 05/14/2020] [Accepted: 05/14/2020] [Indexed: 02/07/2023] Open
Abstract
Cancer gene panel testing requires accurate detection of somatic mosaic mutations, as the test sample consists of a mixture of cancer cells and normal cells; each minor clone in the tumor also has different somatic mutations. Several studies have shown that the different types of software used for variant calling in next-generation sequencing (NGS) can detect low-frequency somatic mutations. However, the accuracy of these somatic variant callers is unknown. We performed cancer gene panel testing in duplicate experiments using three different high-fidelity DNA polymerases in the pre-capture amplification steps and analyzed the results with three different variant callers: Strelka2, Mutect2, and LoFreq. We selected six somatic variants that were detected in both experiments with more than two polymerases and by at least one variant caller. Among them, five single nucleotide variants were verified by CEL nuclease-mediated heteroduplex incision with polyacrylamide gel electrophoresis and silver staining (CHIPS) and Sanger sequencing. In silico analysis indicated that the FBXW7 and MAP3K1 missense mutations cause damage at the protein level. Comparing the three somatic variant callers, we found that Strelka2 detected more variants than Mutect2 and LoFreq. We conclude that dual sequencing with Strelka2 analysis is useful for accurate detection of somatic mutations in cancer gene panel testing.
|
13
|
Hernández-Lemus E, Reyes-Gopar H, Espinal-Enríquez J, Ochoa S. The Many Faces of Gene Regulation in Cancer: A Computational Oncogenomics Outlook. Genes (Basel) 2019; 10:E865. [PMID: 31671657 PMCID: PMC6896122 DOI: 10.3390/genes10110865] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/31/2019] [Revised: 10/16/2019] [Accepted: 10/24/2019] [Indexed: 12/16/2022] Open
Abstract
Cancer is a complex disease at many different levels. The molecular phenomenology of cancer is also quite rich. The mutational and genomic origins of cancer and their downstream effects on processes such as the reprogramming of the gene regulatory control and the molecular pathways depending on such control have been recognized as central to the characterization of the disease. More important though is the understanding of their causes, prognosis, and therapeutics. There is a multitude of factors associated with anomalous control of gene expression in cancer. Many of these factors are now amenable to be studied comprehensively by means of experiments based on diverse omic technologies. However, characterizing each dimension of the phenomenon individually has proven to fall short in presenting a clear picture of expression regulation as a whole. In this review article, we discuss some of the more relevant factors affecting gene expression control both under normal conditions and in tumor settings. We describe the different omic approaches that we can use as well as the computational genomic analysis needed to track down these factors. Then we present theoretical and computational frameworks developed to integrate the amount of diverse information provided by such single-omic analyses. We contextualize this within a systems biology-based multi-omic regulation setting, aimed at better understanding the complex interplay of gene expression deregulation in cancer.
Affiliation(s)
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City 14610, Mexico.
- Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
- Helena Reyes-Gopar
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City 14610, Mexico.
- Jesús Espinal-Enríquez
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City 14610, Mexico.
- Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
- Soledad Ochoa
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City 14610, Mexico.
|
14
|
Meerzaman D, Dunn BK. Value of Collaboration among Multi-Domain Experts in Analysis of High-Throughput Genomics Data. Cancer Res 2019; 79:5140-5145. [PMID: 31337654 DOI: 10.1158/0008-5472.can-19-0769] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/11/2019] [Revised: 05/30/2019] [Accepted: 07/11/2019] [Indexed: 12/18/2022]
Abstract
The recent explosion and ease of access to large-scale genomics data is intriguing. However, serious obstacles exist to the optimal management of the entire spectrum from data production in the laboratory through bioinformatic analysis to statistical evaluation and ultimately clinical interpretation. Beyond the multitude of technical issues, what stands out the most is the absence of adequate communication among the specialists in these domains. Successful interdisciplinary collaborations along the genomics pipeline extending from laboratory experiments to bioinformatic analyses to clinical application are notable in large scale, well managed projects such as The Cancer Genome Atlas. However, in certain settings in which the various experts perform their specialized research activities in isolation, the siloed approach to their research contributes to the generation of questionable genomic interpretations. Such situations are particularly concerning when the ultimate endpoint involves genetic/genomic interpretations that are intended for clinical applications. In spite of the fact that clinicians express interest in gaining a better understanding of clinical genomic applications, the lack of communication from upstream experts leaves them with a serious level of discomfort in applying such genomic knowledge to patient care. This discomfort is especially evident among healthcare providers who are not trained as geneticists, in particular primary care physicians. We offer some initiatives that have potential to address this problem, with emphasis on improved and ongoing communication among all the experts in these fields, constituting a comprehensive genomic "pipeline" from laboratory to patient.
Affiliation(s)
- Daoud Meerzaman
- NCI, Center for Biomedical Informatics and Information Technology, Bethesda, Maryland.
|
15
|
Zeng J, Johnson A, Shufean MA, Kahle M, Yang D, Woodman SE, Vu T, Moorthy S, Holla V, Meric-Bernstam F. Operationalization of Next-Generation Sequencing and Decision Support for Precision Oncology. JCO Clin Cancer Inform 2019; 3:1-12. [PMID: 31550176 PMCID: PMC6874004 DOI: 10.1200/cci.19.00089] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Accepted: 07/24/2019] [Indexed: 12/18/2022] Open
Abstract
Genomic testing has become a part of routine oncology care and plays critical roles in diagnosis, prognostic assessment, and treatment selection. Thus, in parallel, the variety of genomic testing providers and sequencing platforms has grown exponentially. Selection of the best-fit panel for each case can be daunting, with many factors to consider. Among them is whether alteration interpretation and therapy/clinical trial matching are included and/or sufficient. In this article, we review some common commercially available sequencing platforms for the genes and types of alterations tested, samples needed, and reporting content provided. We review publicly available resources for a do-it-yourself approach to alteration interpretation when it is not provided or when supplemental research is needed, along with resources to identify genomically matched treatment options that are approved and/or investigational. However, with both commercially provided interpretation and publicly available resources, there are still caveats and limitations that can stem from insufficient or ambiguous nomenclature as well as from the presentation of information. Use cases in which clinical decision making was affected are discussed. After treatment options are identified, it is important to assess the level of evidence for use within the patient's tumor type and molecular profile. However, numerous level-of-evidence scales have been published in recent years, so we provide a publicly available tool to facilitate interoperability. The level of evidence, along with other factors, such as allelic frequency and copy number, can be used to prioritize treatment options when multiple are identified.
Affiliation(s)
- Jia Zeng
- The University of Texas MD Anderson Cancer Center, Houston, TX
- Amber Johnson
- The University of Texas MD Anderson Cancer Center, Houston, TX
- Md Abu Shufean
- The University of Texas MD Anderson Cancer Center, Houston, TX
- Michael Kahle
- The University of Texas MD Anderson Cancer Center, Houston, TX
- Dong Yang
- The University of Texas MD Anderson Cancer Center, Houston, TX
- Thuy Vu
- The University of Texas MD Anderson Cancer Center, Houston, TX
- Shhyam Moorthy
- The University of Texas MD Anderson Cancer Center, Houston, TX
|