1
Schmitz EG, Griffith M, Griffith OL, Cooper MA. Identifying genetic errors of immunity due to mosaicism. J Exp Med 2025; 222:e20241045. [PMID: 40232243 PMCID: PMC11998702 DOI: 10.1084/jem.20241045] [Received: 01/08/2025] [Revised: 02/24/2025] [Accepted: 03/24/2025] [Indexed: 04/16/2025]
Abstract
Inborn errors of immunity are monogenic disorders of the immune system that lead to immune deficiency and/or dysregulation in patients. Identification of precise genetic causes of disease aids diagnosis and advances our understanding of the human immune system; however, a significant portion of patients lack a molecular diagnosis. Somatic mosaicism, genetic changes in a subset of cells, is emerging as an important mechanism of immune disease in both young and older patients. Here, we review the current landscape of somatic genetic errors of immunity and methods for the detection and validation of somatic variants.
Affiliation(s)
- Elizabeth G. Schmitz
- Division of Rheumatology/Immunology, Department of Pediatrics, Washington University in St. Louis, St. Louis, MO, USA
- Malachi Griffith
- Division of Oncology, Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Obi L. Griffith
- Division of Oncology, Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Megan A. Cooper
- Division of Rheumatology/Immunology, Department of Pediatrics, Washington University in St. Louis, St. Louis, MO, USA
2
Bannister MH, Peng XP. Clinical Genetics and Genomics for the Immunologist: A Primer. Immunol Allergy Clin North Am 2025; 45:153-171. [PMID: 40287166 DOI: 10.1016/j.iac.2025.01.002] [Indexed: 04/29/2025]
Abstract
We are just beginning to understand the architectures, landscapes, and paradigms underlying genetically driven immune disorders (GIDs), though we have already benefited greatly from the evolution of increasingly sophisticated sequencing technologies. Genetic diagnostic strategies are chosen by matching the most appropriate molecular assays and analytical tools to the relevant genetic and genomic features of a patient's differential. This review provides a practical guide for such decision-making. The authors review GID-specific paradigms, compare available and emerging genomic technologies and assays, delineate a typical clinical genomic diagnostic process, and discuss the implications of the current variant classification framework for GIDs.
Affiliation(s)
- Maxwell H Bannister
- Medical Scientist Training Program, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Xiao P Peng
- Genetics of Blood and Immunity, Montefiore Einstein; New York Center for Rare Diseases; Division of Pediatric Genetic Medicine, Department of Pediatrics, The Children's Hospital at Montefiore, The University Hospital for Albert Einstein College of Medicine, 3411 Wayne Avenue, 9th Floor, Bronx, NY 10467, USA.
3
Abdelwahab O, Torkamaneh D. Artificial intelligence in variant calling: a review. Front Bioinform 2025; 5:1574359. [PMID: 40337525 PMCID: PMC12055765 DOI: 10.3389/fbinf.2025.1574359] [Received: 02/10/2025] [Accepted: 04/08/2025] [Indexed: 05/09/2025]
Abstract
Artificial intelligence (AI) has revolutionized numerous fields, including genomics, where it has significantly impacted variant calling, a crucial process in genomic analysis. Variant calling involves the detection of genetic variants such as single nucleotide polymorphisms (SNPs), insertions/deletions (InDels), and structural variants from high-throughput sequencing data. Traditionally, statistical approaches have dominated this task, but the advent of AI led to the development of sophisticated tools that promise higher accuracy, efficiency, and scalability. This review explores the state-of-the-art AI-based variant calling tools, including DeepVariant, DNAscope, DeepTrio, Clair, Clairvoyante, Medaka, and HELLO. We discuss their underlying methodologies, strengths, limitations, and performance metrics across different sequencing technologies, alongside their computational requirements, focusing primarily on SNP and InDel detection. By comparing these AI-driven techniques with conventional methods, we highlight the transformative advancements AI has introduced and its potential to further enhance genomic research.
Affiliation(s)
- Omar Abdelwahab
- Département de Phytologie, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- Centre de recherche et d’innovation sur les végétaux (CRIV), Université Laval, Québec City, QC, Canada
- Institut intelligence et données (IID), Université Laval, Québec City, QC, Canada
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- Centre de recherche et d’innovation sur les végétaux (CRIV), Université Laval, Québec City, QC, Canada
- Institut intelligence et données (IID), Université Laval, Québec City, QC, Canada
4
Satas G, Myers MA, McPherson A, Shah SP. Inferring active mutational processes in cancer using single cell sequencing and evolutionary constraints. bioRxiv 2025:2025.02.24.639589. [PMID: 40060559 PMCID: PMC11888314 DOI: 10.1101/2025.02.24.639589] [Indexed: 03/17/2025]
Abstract
Ongoing mutagenesis in cancer drives genetic diversity throughout the natural history of cancers. As the activities of mutational processes are dynamic throughout evolution, distinguishing the mutational signatures of 'active' and 'historical' processes has important implications for studying how tumors evolve. This can aid in understanding mutagenic states at the time of presentation, and in associating active mutational processes with therapeutic resistance. As bulk sequencing primarily captures historical mutational processes, we studied whether ultra-low-coverage single-cell whole-genome sequencing (scWGS), which measures the distribution of mutations across hundreds or thousands of individual cells, could enable the distinction between historical and active mutational processes. While technical challenges and data sparsity have limited mutation analysis in scWGS, we show that these data contain valuable information about dynamic mutational processes. To robustly interpret single nucleotide variants (SNVs) in scWGS, we introduce ArtiCull, a method to identify and remove SNV artifacts by leveraging evolutionary constraints, enabling reliable detection of mutations for signature analysis. Applying this approach to scWGS data from pancreatic ductal adenocarcinoma (PDAC), triple-negative breast cancer (TNBC), and high-grade serous ovarian cancer (HGSOC), we uncover temporal and spatial patterns in mutational processes. In PDAC, we observe a temporal increase in mismatch repair deficiency (MMRd). In cisplatin-treated TNBC patient-derived xenografts, we identify therapy-induced mutagenesis and inactivation of APOBEC3 activity. In HGSOC, we show distinct patterns of APOBEC3 mutagenesis, including late tumor-wide activation in one case and clade-specific enrichment in another. Additionally, we detect a clone-specific increase in SBS17 activity, in a clone previously linked to recurrence.
Our findings establish ultra-low-coverage scWGS as a powerful approach for studying active mutational processes that may influence ongoing clonal evolution and therapeutic resistance.
Affiliation(s)
- Gryte Satas
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Matthew A. Myers
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Andrew McPherson
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Sohrab P. Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- The Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
5
Novakova A, Morris SA, Vaiarelli L, Frank S. Manufacturing and Financial Evaluation of Peptide-Based Neoantigen Cancer Vaccines for Triple-Negative Breast Cancer in the United Kingdom: Opportunities and Challenges. Vaccines (Basel) 2025; 13:144. [PMID: 40006691 PMCID: PMC11860436 DOI: 10.3390/vaccines13020144] [Received: 12/21/2024] [Revised: 01/21/2025] [Accepted: 01/21/2025] [Indexed: 02/27/2025]
Abstract
This review evaluates the financial burden of current treatments for triple-negative breast cancer (TNBC) and projects potential financial scenarios to assess the feasibility of introducing a peptide-based neoantigen cancer vaccine (NCV) targeting the disease, using the UK as a healthcare system model. TNBC, the most aggressive breast cancer subtype, is associated with poor prognosis, worsened by the lack of personalised treatment options. Neoantigen cancer vaccine therapies present a personalised alternative with the potential to enhance T-cell responses independently of genetic factors, unlike approved immunotherapies for TNBC. Through a systematic literature review, the underlying science and manufacturing processes of NCVs are explored, the direct medical costs of existing TNBC treatments are enumerated, and two contrasting pricing scenarios for NCV clinical adoption are evaluated. The findings indicate that limited immunogenicity is the main scientific barrier to NCV clinical advancement, alongside production inefficiencies. Financial analysis shows that the UK spends approximately GBP 230 million annually on TNBC treatments, ranging from GBP 2200 to GBP 54,000 per patient. A best-case pricing model involving government-sponsored NCV therapy appears financially viable, while a worst-case, privately funded model exceeds the National Institute for Health and Care Excellence (NICE) cost thresholds. This study concludes that while NCVs show potential clinical benefits for TNBC, uncertainties about their standalone efficacy make their widespread adoption in the UK unlikely without further clinical research.
Affiliation(s)
- Ludovica Vaiarelli
- Department of Biochemical Engineering, University College London, Bernard Katz Building, Gower Street, London WC1E 6BT, UK; (A.N.); (S.A.M.)
- Stefanie Frank
- Department of Biochemical Engineering, University College London, Bernard Katz Building, Gower Street, London WC1E 6BT, UK; (A.N.); (S.A.M.)
6
Xu H, Bierman R, Akey D, Koers C, Comi T, McWhite C, Akey JM. Landscape of human protein-coding somatic mutations across tissues and individuals. bioRxiv 2025:2025.01.07.631808. [PMID: 39829890 PMCID: PMC11741334 DOI: 10.1101/2025.01.07.631808] [Indexed: 01/22/2025]
Abstract
Although somatic mutations are fundamentally important to human biology, disease, and aging, many outstanding questions remain about their rates, spectrum, and determinants in apparently healthy tissues. Here, we performed high-coverage exome sequencing on 265 samples from 14 GTEx donors sampled for a median of 17.5 tissues per donor (spanning 46 total tissues). Using a novel probabilistic method tailored to the unique structure of our data, we identified 8,470 somatic variants. We leverage our compendium of somatic mutations to quantify the burden of deleterious somatic variants among tissues and individuals, identify molecular features such as chromatin accessibility that exhibit significantly elevated somatic mutation rates, provide novel biological insights into mutational mechanisms, and infer developmental trajectories based on patterns of multi-tissue somatic mosaicism. Our data provides a high-resolution portrait of somatic mutations across genes, tissues, and individuals.
Affiliation(s)
- Huixin Xu
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Rob Bierman
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Princeton Research Computing, Princeton University, Princeton NJ. 08540, USA
- Dayna Akey
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Cooper Koers
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Troy Comi
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Princeton Research Computing, Princeton University, Princeton NJ. 08540, USA
- Claire McWhite
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Joshua M. Akey
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ. 08540, USA
- Lead Contact
7
Guille A, Adélaïde J, Finetti P, Andre F, Birnbaum D, Mamessier E, Bertucci F, Chaffanet M. A benchmarking study of individual somatic variant callers and voting-based ensembles for whole-exome sequencing. Brief Bioinform 2024; 26:bbae697. [PMID: 39828270 PMCID: PMC11790059 DOI: 10.1093/bib/bbae697] [Received: 08/20/2024] [Revised: 11/22/2024] [Indexed: 01/22/2025]
Abstract
By identifying somatic mutations, whole-exome sequencing (WES) has become a technology of choice for diagnosis and for guiding treatment decisions in many cancers. Despite advances in the field of somatic variant detection and the emergence of sophisticated tools incorporating machine learning, accurately identifying somatic variants remains challenging. Each new somatic variant caller is often accompanied by claims of superior performance compared to predecessors. Furthermore, most comparative studies focus on a limited set of tools and reference datasets, leading to inconsistent results and making it difficult for laboratories to select the optimal solution. Our study comprehensively evaluated 20 somatic variant callers across four reference WES datasets. We subsequently assessed the performance of ensemble approaches by exploring all possible combinations of these callers, generating 8178 and 1013 combinations for single-nucleotide variants (SNVs) and indels, respectively, with varying voting thresholds. Our analysis identified five high-performing individual somatic variant callers: Muse, Mutect2, Dragen, TNScope, and NeuSomatic. For somatic SNVs, an ensemble combining LoFreq, Muse, Mutect2, SomaticSniper, Strelka, and Lancet outperformed the top-performing caller (Dragen) by >3.6% (mean F1 score = 0.927). Similarly, for somatic indels, an ensemble of Mutect2, Strelka, VarScan2, and Pindel outperformed the best individual caller (NeuSomatic) by >3.5% (mean F1 score = 0.867). By considering the computational costs of each combination, we were able to identify an optimal solution involving four somatic variant callers (Muse, Mutect2, and Strelka for SNVs; Mutect2, Strelka, and VarScan2 for indels), enabling accurate and cost-effective somatic variant detection in whole-exome sequencing data.
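The voting-based ensemble idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the caller names, the toy variants, and the helper functions `vote_ensemble` and `f1_score` are all hypothetical, and variants are assumed to be already normalized to comparable `(chrom, pos, ref, alt)` tuples.

```python
from collections import Counter

def vote_ensemble(calls_by_caller, min_votes):
    """Keep variants reported by at least `min_votes` callers (hypothetical helper)."""
    votes = Counter()
    for variants in calls_by_caller.values():
        votes.update(set(variants))  # each caller contributes one vote per variant
    return {variant for variant, n in votes.items() if n >= min_votes}

def f1_score(called, truth):
    """F1 = harmonic mean of precision and recall against a truth set."""
    true_positives = len(called & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(called)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)

# Toy truth set and per-caller call sets (invented for illustration only).
v1 = ("chr1", 100, "A", "T")
v2 = ("chr1", 200, "G", "C")
v3 = ("chr2", 50, "C", "T")
fp1 = ("chr3", 10, "G", "A")   # false positive private to caller_a
fp2 = ("chr4", 77, "T", "C")   # false positive private to caller_c

truth = {v1, v2, v3}
calls = {
    "caller_a": {v1, v2, fp1},
    "caller_b": {v1, v2, v3},
    "caller_c": {v1, v3, fp2},
}

ensemble = vote_ensemble(calls, min_votes=2)
ensemble_f1 = f1_score(ensemble, truth)
caller_a_f1 = f1_score(calls["caller_a"], truth)
```

In this toy case the two-vote threshold discards both privately called false positives, so the ensemble scores higher than `caller_a` alone. Real benchmarks also need variant normalization and separate handling of SNVs and indels, which this sketch omits.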
Affiliation(s)
- Arnaud Guille
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- José Adélaïde
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Pascal Finetti
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Fabrice Andre
- Department of Medical Oncology, Gustave Roussy, University Paris-Saclay, 94805 Villejuif, France
- Daniel Birnbaum
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Emilie Mamessier
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- François Bertucci
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
- Medical Oncology, Institut Paoli-Calmettes, 13009, Marseille, France
- Max Chaffanet
- Predictive Oncology Laboratory, Marseille Research Cancer Center, INSERM U1068, CNRS U7258, Institut Paoli-Calmettes, Aix-Marseille University, Equipe labellisée « Ligue Nationale Contre le Cancer », 13009 Marseille, France
8
Lin Y, Rasmussen MH, Christensen MH, Frydendahl A, Maretty L, Andersen CL, Besenbacher S. Evaluating Bioinformatics Processing of Somatic Variant Detection in cfDNA Using Targeted Sequencing with UMIs. Int J Mol Sci 2024; 25:11439. [PMID: 39518990 PMCID: PMC11546253 DOI: 10.3390/ijms252111439] [Received: 10/01/2024] [Revised: 10/15/2024] [Accepted: 10/19/2024] [Indexed: 11/16/2024]
Abstract
Circulating tumor DNA (ctDNA) is a promising cancer biomarker, but accurately detecting tumor mutations in cell-free DNA (cfDNA) is challenging due to their low frequency and sequencing errors. Our study benchmarked Mutect2, VarScan2, shearwater, and DREAMS-vc using deep targeted sequencing of cfDNA with Unique Molecular Identifiers (UMIs) from 111 colorectal cancer patients. Performance was assessed at both the mutation level (distinguishing tumor variants from errors) and the sample level (detecting whether an individual has cancer). Additionally, we investigated the effects of various UMI grouping and consensus strategies. The shearwater-AND variant calling method demonstrated the highest precision in detecting tumor-derived mutations from plasma, and reached the highest ROC-AUC of 0.984 for sample classification in tumor-informed cfDNA analyses. DREAMS-vc exhibited the highest ROC-AUC of 0.808 for sample classification in tumor-agnostic studies. We also found that sequencing depth differences in PBMCs could lead to false positives, particularly with VarScan2 and Mutect2, which was addressed by downsampling to equivalent mean depths. Additionally, network-based UMI grouping methods outperformed those using identical UMIs when all reads were retained. Our findings emphasize that the optimal variant caller depends on the study context: whether the analysis focuses on mutation or sample classification, and whether it is conducted under tumor-informed or tumor-agnostic conditions.
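The difference between identical-UMI grouping and network-based grouping can be sketched in Python as follows. This is an illustrative assumption rather than the method benchmarked in the study: "network-based" is interpreted here as connected components of a graph linking UMIs within one mismatch, all reads are assumed to share the same mapping position, and the function names and toy UMIs are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))

def group_identical(umis):
    """One group per distinct UMI sequence (exact matching only)."""
    return [{u} for u in sorted(set(umis))]

def group_network(umis, max_dist=1):
    """Connected components of the graph linking UMIs within `max_dist` mismatches."""
    unique = sorted(set(umis))
    unvisited = set(unique)
    groups = []
    for seed in unique:
        if seed not in unvisited:
            continue  # already absorbed into an earlier component
        unvisited.discard(seed)
        component, stack = set(), [seed]
        while stack:
            current = stack.pop()
            component.add(current)
            neighbours = [v for v in unvisited if hamming(current, v) <= max_dist]
            for v in neighbours:
                unvisited.discard(v)
                stack.append(v)
        groups.append(component)
    return groups

# Toy UMIs at a single position: ACGA and ACCA are plausible error copies of ACGT.
umis = ["ACGT", "ACGT", "ACGA", "TTTT", "ACCA"]
identical_groups = group_identical(umis)
network_groups = group_network(umis)
```

Exact matching yields four read families here, while the network approach merges `ACCA`, `ACGA`, and `ACGT` into one family through chained single-base differences, leaving `TTTT` on its own. Production tools typically also weigh UMI read counts when deciding which links to follow, which this sketch does not attempt.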
Affiliation(s)
- Yixin Lin
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Mads Heilskov Rasmussen
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Mikkel Hovden Christensen
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Amanda Frydendahl
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Lasse Maretty
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Claus Lindbjerg Andersen
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Søren Besenbacher
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; (Y.L.); (C.L.A.)
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus, Denmark
- Bioinformatics Research Centre, Department of Molecular Biology and Genetics, Aarhus University, 8000 Aarhus, Denmark
9
Sandran NG, Fornarino DL, Corbett MA, Kroes T, Gardner AE, MacLennan AH, Gécz J, van Eyk CL. Application of multiple mosaic callers improves post-zygotic mutation detection from exome sequencing data. Genet Med 2024; 26:101220. [PMID: 39041334 DOI: 10.1016/j.gim.2024.101220] [Received: 11/21/2023] [Revised: 07/15/2024] [Accepted: 07/16/2024] [Indexed: 07/24/2024]
Abstract
PURPOSE The gold standard for identification of post-zygotic variants (PZVs) is droplet digital polymerase chain reaction or high-depth sequencing across multiple tissue types. These approaches are yet to be systematically implemented for monogenic disorders. We developed PZV detection pipelines for correct classification of de novo variants. METHOD Our pipelines detect PZVs in parents (gonosomal mosaicism [pGoM]) and children (somatic mosaicism, "M3"). We applied them to research exome sequencing (ES) data from the Australian Cerebral Palsy Biobank (n = 145 trios) and Simons Simplex Collection (n = 405 families). Candidate mosaic variants were validated using deep amplicon sequencing or droplet digital polymerase chain reaction. RESULTS 69.2% (M3trio), 63.9% (M3single), and 92.7% (pGoM) of detected variants were validated, with 48.6%, 56.7%, and 26.2% of variants, respectively, meeting strict criteria for mosaicism. In the Australian Cerebral Palsy Biobank, 16.6% of probands and 20.7% of parents had at least 1 true-positive somatic or pGoM variant, respectively. A large proportion of PZVs detected in Simons Simplex Collection parents (79.8%) and children (94.5%) were not previously reported. We reclassified 3.7% to 8.0% of germline de novo variants as mosaic. CONCLUSION Many PZVs were incorrectly classified as germline variants or missed by previous approaches. Systematic application of our pipelines could increase genetic diagnostic rate, improve estimates of recurrence risk in families, and benefit novel disease gene identification.
Affiliation(s)
- Nandini G Sandran
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia; Australian Collaborative Cerebral Palsy Research Group, Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia
- Dani L Fornarino
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia; Australian Collaborative Cerebral Palsy Research Group, Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia
- Mark A Corbett
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia; Australian Collaborative Cerebral Palsy Research Group, Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia
- Thessa Kroes
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
- Alison E Gardner
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
- Alastair H MacLennan
- Australian Collaborative Cerebral Palsy Research Group, Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia
- Jozef Gécz
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia; Australian Collaborative Cerebral Palsy Research Group, Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia; South Australian Health and Medical Research Institute, Adelaide, SA, Australia.
- Clare L van Eyk
- Neurogenetics Research Program, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia; Australian Collaborative Cerebral Palsy Research Group, Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia
10
Maruzani R, Brierley L, Jorgensen A, Fowler A. Benchmarking UMI-aware and standard variant callers for low frequency ctDNA variant detection. BMC Genomics 2024; 25:827. [PMID: 39227777 PMCID: PMC11370058 DOI: 10.1186/s12864-024-10737-w] [Received: 11/14/2023] [Accepted: 08/22/2024] [Indexed: 09/05/2024]
Abstract
BACKGROUND Circulating tumour DNA (ctDNA) is a subset of cell free DNA (cfDNA) released by tumour cells into the bloodstream. Circulating tumour DNA has shown great potential as a biomarker to inform treatment in cancer patients. Collecting ctDNA is minimally invasive and reflects the entire genetic makeup of a patient's cancer. ctDNA variants in NGS data can be difficult to distinguish from sequencing and PCR artefacts due to low abundance, particularly in the early stages of cancer. Unique Molecular Identifiers (UMIs) are short sequences ligated to the sequencing library before amplification. These sequences are useful for filtering out low frequency artefacts. The utility of ctDNA as a cancer biomarker depends on accurate detection of cancer variants. RESULTS In this study, we benchmarked six variant calling tools, including two UMI-aware callers for their ability to call ctDNA variants. The standard variant callers tested included Mutect2, bcftools, LoFreq and FreeBayes. The UMI-aware variant callers benchmarked were UMI-VarCal and UMIErrorCorrect. We used both datasets with known variants spiked in at low frequencies, and datasets containing ctDNA, and generated synthetic UMI sequences for these datasets. Variant callers displayed different preferences for sensitivity and specificity. Mutect2 showed high sensitivity, while returning more privately called variants than any other caller in data without synthetic UMIs - an indicator of false positive variant discovery. In data encoded with synthetic UMIs, UMI-VarCal detected fewer putative false positive variants than all other callers in synthetic datasets. Mutect2 showed a balance between high sensitivity and specificity in data encoded with synthetic UMIs. CONCLUSIONS Our results indicate UMI-aware variant callers have potential to improve sensitivity and specificity in calling low frequency ctDNA variants over standard variant calling tools. 
There is a growing need for further development of UMI-aware variant calling tools if effective early detection methods for cancer using ctDNA samples are to be realised.
Affiliation(s)
- Rugare Maruzani
- Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK.
- Liam Brierley
- Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
- MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Garscube Campus, 464 Bearsden Road, Glasgow, G61 1QH, UK
- Andrea Jorgensen
- Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
- Anna Fowler
- Department of Health Data Science, Institute of Population Health, University of Liverpool, Waterhouse Building, Block F, Brownlow Street, Liverpool, L69 3GF, UK
11
Furtado LV, Bifulco C, Dolderer D, Hsiao SJ, Kipp BR, Lindeman NI, Ritterhouse LL, Temple-Smolkin RL, Zehir A, Nowak JA. Recommendations for Tumor Mutational Burden Assay Validation and Reporting: A Joint Consensus Recommendation of the Association for Molecular Pathology, College of American Pathologists, and Society for Immunotherapy of Cancer. J Mol Diagn 2024; 26:653-668. [PMID: 38851389 DOI: 10.1016/j.jmoldx.2024.05.002] [Received: 12/01/2023] [Revised: 04/05/2024] [Accepted: 05/07/2024] [Indexed: 06/10/2024]
Abstract
Tumor mutational burden (TMB) has been recognized as a predictive biomarker for immunotherapy response in several tumor types. Several laboratories offer TMB testing, but there is significant variation in how TMB is calculated, reported, and interpreted among laboratories. TMB standardization efforts are underway, but no published guidance for TMB validation and reporting is currently available. Recognizing the current challenges of clinical TMB testing, the Association for Molecular Pathology convened a multidisciplinary collaborative working group with representation from the American Society of Clinical Oncology, the College of American Pathologists, and the Society for Immunotherapy of Cancer to review the laboratory practices surrounding TMB and develop recommendations for the analytical validation and reporting of TMB testing based on survey data, literature review, and expert consensus. These recommendations encompass pre-analytical, analytical, and post-analytical factors of TMB analysis, and they emphasize the relevance of comprehensive methodological descriptions to allow comparability between assays.
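As a point of reference for the calculation this abstract refers to, TMB is commonly expressed as eligible somatic mutations per megabase of sequenced territory. The Python sketch below shows only that arithmetic; the function name and the example numbers are hypothetical, and which mutations count as eligible (for example, whether synonymous variants are included) is exactly the kind of laboratory-specific choice the recommendations ask assays to document.

```python
def tumor_mutational_burden(n_mutations, covered_bases):
    """Mutations per megabase, given an eligible mutation count and covered region size in bases."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    if n_mutations < 0:
        raise ValueError("n_mutations must be non-negative")
    return n_mutations * 1_000_000 / covered_bases

# Invented example: 12 eligible mutations over a 1.2 Mb panel footprint.
tmb = tumor_mutational_burden(n_mutations=12, covered_bases=1_200_000)
```

The same mutation count gives a different TMB when the covered region differs, which is one reason values from panels of different sizes are not directly interchangeable without stating the denominator.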
Affiliation(s)
- Larissa V Furtado
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, St. Jude Children's Research Hospital, Memphis, Tennessee.
| | - Carlo Bifulco
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Providence Portland Medical Center, Portland, Oregon
| | - Daniel Dolderer
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Jupiter Medical Center, Jupiter, Florida
| | - Susan J Hsiao
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology and Cell Biology, Columbia University Medical Center, New York, New York
| | - Benjamin R Kipp
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
| | - Neal I Lindeman
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Weill Cornell Medicine, New York, New York
| | - Lauren L Ritterhouse
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts
| | | | - Ahmet Zehir
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Memorial Sloan Kettering Cancer Center, New York, New York
| | - Jonathan A Nowak
- The Tumor Mutational Burden Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts
| |
Collapse
|
12
|
Atzeni R, Massidda M, Pieroni E, Rallo V, Pisu M, Angius A. A Novel Affordable and Reliable Framework for Accurate Detection and Comprehensive Analysis of Somatic Mutations in Cancer. Int J Mol Sci 2024; 25:8044. [PMID: 39125613 PMCID: PMC11311285 DOI: 10.3390/ijms25158044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 07/11/2024] [Accepted: 07/22/2024] [Indexed: 08/12/2024]
Abstract
Accurate detection and analysis of somatic variants in cancer involve multiple third-party tools with complex dependencies and configurations, leading to laborious, error-prone, and time-consuming data conversions. This approach lacks accuracy, reproducibility, and portability, limiting clinical application. Musta was developed to address these issues as an end-to-end pipeline for detecting, classifying, and interpreting cancer mutations. Musta is based on a Python command-line tool designed to manage tumor-normal samples for precise somatic mutation analysis. The core is a Snakemake-based workflow that covers all key cancer genomics steps, including variant calling, mutational signature deconvolution, variant annotation, driver gene detection, pathway analysis, and tumor heterogeneity estimation. Musta is easy to install on any system via Docker, with a Makefile handling installation, configuration, and execution, allowing for full or partial pipeline runs. Musta has been validated at the CRS4-NGS Core facility and tested on large datasets from The Cancer Genome Atlas and the Beijing Institute of Genomics. Musta has proven robust and flexible for somatic variant analysis in cancer. It is user-friendly, requiring no specialized programming skills, and enables data processing with a single command line. Its reproducibility ensures consistent results across users following the same protocol.
Affiliation(s)
- Rossano Atzeni
  - Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09050 Pula, Italy
- Matteo Massidda
  - Department of Medical, Surgical and Experimental Sciences, University of Sassari, 07100 Sassari, Italy
- Enrico Pieroni
  - Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09050 Pula, Italy
- Vincenzo Rallo
  - Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cittadella Universitaria di Cagliari, 09042 Monserrato, Italy
- Massimo Pisu
  - Center for Advanced Studies, Research and Development in Sardinia (CRS4), 09050 Pula, Italy
- Andrea Angius
  - Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cittadella Universitaria di Cagliari, 09042 Monserrato, Italy
13
Pastò B, Buzzatti G, Schettino C, Malapelle U, Bergamini A, De Angelis C, Musacchio L, Dieci MV, Kuhn E, Lambertini M, Passarelli A, Toss A, Farolfi A, Roncato R, Capoluongo E, Vida R, Pignata S, Callari M, Baldassarre G, Bartoletti M, Gerratana L, Puglisi F. Unlocking the potential of Molecular Tumor Boards: from cutting-edge data interpretation to innovative clinical pathways. Crit Rev Oncol Hematol 2024; 199:104379. [PMID: 38718940 DOI: 10.1016/j.critrevonc.2024.104379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/02/2024] [Accepted: 05/01/2024] [Indexed: 05/22/2024]
Abstract
The emerging era of precision medicine is characterized by an increasing availability of targeted anticancer therapies and by the parallel development of techniques to obtain more refined molecular data, whose interpretation may not always be straightforward. Molecular tumor boards gather various professional figures, in order to leverage the analysis of molecular data and provide prognostic and predictive insights for clinicians. In addition to healthcare development, they could also become a tool to promote knowledge and research spreading. A growing body of evidence on the application of molecular tumor boards to clinical practice is forming and positive signals are emerging, although a certain degree of heterogeneity exists. This work analyzes molecular tumor boards' potential workflows, figures involved, data sources, sample matrices and eligible patients, as well as available evidence and learning examples. The emerging concept of multi-institutional, disease-specific molecular tumor boards is also considered by presenting two ongoing nationwide experiences.
Affiliation(s)
- Brenno Pastò
  - Department of Medicine (DMED), University of Udine, Udine 33100, Italy; Department of Medical Oncology, Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, Aviano 33081, Italy
- Giulia Buzzatti
  - Department of Medical Oncology, U.O. Clinica di Oncologia Medica, IRCCS Ospedale Policlinico San Martino, Genova 16132, Italy
- Clorinda Schettino
  - Clinical Trials Unit, Istituto Nazionale Tumori, IRCCS, Fondazione G. Pascale, Napoli 80131, Italy
- Umberto Malapelle
  - Department of Public Health, University of Naples Federico II, Napoli 80131, Italy
- Alice Bergamini
  - Faculty of Medicine and Surgery, Vita-Salute San Raffaele University, Milano 20132, Italy; Unit of Obstetrics and Gynaecology, IRCCS San Raffaele Scientific Institute, Milano 20132, Italy
- Carmine De Angelis
  - Oncology Unit - Department of Clinical Medicine and Surgery, University of Naples Federico II, Napoli 80131, Italy
- Lucia Musacchio
  - Department of Women and Child Health, Division of Gynaecologic Oncology, Fondazione Policlinico Universitario "A. Gemelli" IRCCS, Roma 00168, Italy
- Maria Vittoria Dieci
  - Department of Surgery, Oncology and Gastroenterology, University of Padova, Padova 35122, Italy; Oncology 2, Veneto Institute of Oncology IOV-IRCCS, Padova 35128, Italy
- Elisabetta Kuhn
  - Department of Biomedical, Surgical and Dental Sciences, University of Milan, Milano 20122, Italy; Pathology Unit, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milano 20122, Italy
- Matteo Lambertini
  - Department of Medical Oncology, U.O. Clinica di Oncologia Medica, IRCCS Ospedale Policlinico San Martino, Genova 16132, Italy; Department of Internal Medicine and Medical Specialties (DiMI), School of Medicine, University of Genova, Genova 16132, Italy
- Anna Passarelli
  - Department of Urology and Gynaecology, Istituto Nazionale Tumori IRCCS "Fondazione G. Pascale", Napoli 80131, Italy
- Angela Toss
  - Department of Oncology and Hematology, Azienda Ospedaliero-Universitaria di Modena, Modena 41124, Italy; Department of Medical and Surgical Sciences, University of Modena and Reggio Emilia, Modena 41124, Italy
- Alberto Farolfi
  - Department of Medical Oncology, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) "Dino Amadori", Meldola 47014, Italy
- Rossana Roncato
  - Department of Medicine (DMED), University of Udine, Udine 33100, Italy; Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Aviano 33081, Italy
- Ettore Capoluongo
  - Department of Molecular Medicine and Medical Biotechnologies, University of Naples Federico II, Napoli 80131, Italy; Clinical Pathology Unit, Azienda Ospedaliera San Giovanni Addolorata, Roma 00184, Italy
- Riccardo Vida
  - Department of Medicine (DMED), University of Udine, Udine 33100, Italy; Department of Medical Oncology, Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, Aviano 33081, Italy
- Sandro Pignata
  - Department of Urology and Gynaecology, Istituto Nazionale Tumori IRCCS "Fondazione G. Pascale", Napoli 80131, Italy
- Gustavo Baldassarre
  - Molecular Oncology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Aviano 33081, Italy
- Michele Bartoletti
  - Department of Medical Oncology, Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, Aviano 33081, Italy
- Lorenzo Gerratana
  - Department of Medicine (DMED), University of Udine, Udine 33100, Italy; Department of Medical Oncology, Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, Aviano 33081, Italy
- Fabio Puglisi
  - Department of Medicine (DMED), University of Udine, Udine 33100, Italy; Department of Medical Oncology, Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, Aviano 33081, Italy
14
Tang G, Liu X, Cho M, Li Y, Tran DH, Wang X. Pan-cancer discovery of somatic mutations from RNA sequencing data. Commun Biol 2024; 7:619. [PMID: 38783092 PMCID: PMC11116503 DOI: 10.1038/s42003-024-06326-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 05/14/2024] [Indexed: 05/25/2024]
Abstract
Identification of somatic mutations (SMs) is essential for characterizing cancer genomes. While DNA-seq is the prevalent method for identifying SMs, RNA-seq provides an alternative strategy to discover tumor mutations in the transcribed genome. Here, we have developed a machine learning based pipeline to discover SMs based on RNA-seq data (designated as RNA-SMs). Subsequently, we have conducted a pan-cancer analysis to systematically identify RNA-SMs from over 8,000 tumors in The Cancer Genome Atlas (TCGA). In this way, we have identified over 105,000 novel SMs that had not been reported in previous TCGA studies. These novel SMs have significant clinical implications in designing targeted therapy for improved patient outcomes. Further, we have combined the SMs identified by both RNA-seq and DNA-seq analyses to depict an updated mutational landscape across 32 cancer types. This new online SM atlas, OncoDB ( https://oncodb.org ), offers a more complete view of gene mutations that underline the development and progression of various cancers.
Affiliation(s)
- Gongyu Tang
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, IL, USA
  - Department of Mechanical Engineering and Materials Science, Washington University in St. Louis, St. Louis, MO, USA
- Xinyi Liu
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, IL, USA
- Minsu Cho
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, IL, USA
- Yuanxiang Li
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, IL, USA
- Dan-Ho Tran
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, IL, USA
- Xiaowei Wang
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, IL, USA
  - University of Illinois Cancer Center, Chicago, IL, USA
15
Ji S, Zhu T, Sethia A, Wang W. Accelerated somatic mutation calling for whole-genome and whole-exome sequencing data from heterogenous tumor samples. Genome Res 2024; 34:633-641. [PMID: 38589250 PMCID: PMC11146589 DOI: 10.1101/gr.278456.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 04/03/2024] [Indexed: 04/10/2024]
Abstract
Accurate detection of somatic mutations in DNA sequencing data is a fundamental prerequisite for cancer research. Previous analytical challenges were overcome by consensus mutation calling from four to five popular callers. This, however, increases the already nontrivial computing time from individual callers. Here, we launch MuSE 2, powered by multistep parallelization and efficient memory allocation, to resolve the computing time bottleneck. MuSE 2 speeds up 50 times more than MuSE 1 and eight to 80 times more than other popular callers. Our benchmark study suggests combining MuSE 2 and the recently accelerated Strelka2 achieves high efficiency and accuracy in analyzing large cancer genomic data sets.
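The consensus calling mentioned in this abstract can be pictured as a simple vote across callers. The sketch below is a minimal illustration under that assumption: a variant is kept when at least `min_support` callers report it. The function name, the tuple representation of a variant, and the caller labels in the usage example are hypothetical, and real pipelines typically also normalize variant representations and apply per-caller filters first; this is not the MuSE 2 or Strelka2 algorithm.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# A variant is represented here as (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(
    calls_by_caller: Dict[str, Iterable[Variant]], min_support: int = 2
) -> List[Variant]:
    """Return variants reported by at least `min_support` callers, sorted."""
    support: Counter = Counter()
    for calls in calls_by_caller.values():
        # set() ensures each caller contributes at most one vote per variant.
        support.update(set(calls))
    return sorted(v for v, n in support.items() if n >= min_support)


# Hypothetical call sets from three callers.
calls = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T")],
    "caller_c": [("chr2", 200, "G", "C"), ("chr3", 300, "C", "G")],
}
print(consensus_calls(calls, min_support=2))
```

Because every caller must be run before the vote, the total compute time is at least the sum of the individual runs, which is the bottleneck the abstract describes.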
Affiliation(s)
- Shuangxi Ji
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Tong Zhu
  - NVIDIA Corporation, Santa Clara, California 95051, USA
- Ankit Sethia
  - NVIDIA Corporation, Santa Clara, California 95051, USA
- Wenyi Wang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
16
Simpson JT. Detecting Somatic Mutations Without Matched Normal Samples Using Long Reads. bioRxiv 2024:2024.02.26.582089. [PMID: 38464143 PMCID: PMC10925087 DOI: 10.1101/2024.02.26.582089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
DNA sequencing of tumours to identify somatic mutations has become a critical tool to guide the type of treatment given to cancer patients. The gold standard for mutation calling is comparing sequencing data from the tumour to a matched normal sample to avoid mis-classifying inherited SNPs as mutations. This procedure works extremely well, but in certain situations only a tumour sample is available. While approaches have been developed to find mutations without a matched normal, they have limited accuracy or require specific types of input data (e.g. ultra-deep sequencing). Here we explore the application of single molecule long read sequencing to calling somatic mutations without matched normal samples. We develop a simple theoretical framework to show how haplotype phasing is an important source of information for determining whether a variant is a somatic mutation. We then use simulations to assess the range of experimental parameters (tumour purity, sequencing depth) where this approach is effective. These ideas are developed into a prototype somatic mutation caller, smrest, and its use is demonstrated on two highly mutated cancer cell lines. Finally, we argue that this approach has potential to measure clinically important biomarkers that are based on the genome-wide distribution of mutations: tumour mutation burden and mutation signatures.
Affiliation(s)
- Jared T. Simpson
  - Ontario Institute for Cancer Research, Toronto, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
17
Li Z, Lan J, Shi X, Lu T, Hu X, Liu X, Chen Y, He Z. Whole-Genome Sequencing Reveals Rare Off-Target Mutations in MC1R-Edited Pigs Generated by Using CRISPR-Cas9 and Somatic Cell Nuclear Transfer. CRISPR J 2024; 7:29-40. [PMID: 38353621 DOI: 10.1089/crispr.2023.0034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2024]
Abstract
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has been widely used to create animal models for biomedical and agricultural use owing to its low cost and easy handling. However, the occurrence of erroneous cleavage (off-targeting) may raise certain concerns for the practical application of the CRISPR-Cas9 system. In this study, we created a melanocortin 1 receptor (MC1R)-edited pig model through somatic cell nuclear transfer (SCNT) by using porcine kidney cells modified by the CRISPR-Cas9 system. We then carried out whole-genome sequencing of two MC1R-edited pigs and two cloned wild-type siblings, together with the donor cells, to assess the genome-wide presence of single-nucleotide variants and small insertions and deletions (indels) and found only one candidate off-target indel in both MC1R-edited pigs. In summary, our study indicates that the minimal off-targeting effect induced by CRISPR-Cas9 may not be a major concern in gene-edited pigs created by SCNT.
Affiliation(s)
- Zhenyang Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Jin Lan
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Xuan Shi
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Tong Lu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Xiaoli Hu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Xiaohong Liu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Yaosheng Chen
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
- Zuyong He
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, People's Republic of China
18
Karimnezhad A, Perkins TJ. Empirical Bayes single nucleotide variant-calling for next-generation sequencing data. Sci Rep 2024; 14:1550. [PMID: 38233494 PMCID: PMC10794290 DOI: 10.1038/s41598-024-51958-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Accepted: 01/11/2024] [Indexed: 01/19/2024]
Abstract
One of the fundamental computational problems in cancer genomics is the identification of single nucleotide variants (SNVs) from DNA sequencing data. Many statistical models and software implementations for SNV calling have been developed in the literature, yet, they still disagree widely on real datasets. Based on an empirical Bayesian approach, we introduce a local false discovery rate (LFDR) estimator for germline SNV calling. Our approach learns model parameters without prior information, and simultaneously accounts for information across all sites in the genomic regions of interest. We also propose another LFDR-based algorithm that reliably prioritizes a given list of mutations called by any other variant-calling algorithm. We use a suite of gold-standard cell line data to compare our LFDR approach against a collection of widely used, state of the art programs. We find that our LFDR approach approximately matches or exceeds the performance of all of these programs, despite some very large differences among them. Furthermore, when prioritizing other algorithms' calls by our LFDR score, we find that by manipulating the type I-type II tradeoff we can select subsets of variant calls with minimal loss of sensitivity but dramatic increases in precision.
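To make the local false discovery rate concrete, the sketch below evaluates it for a single site under a two-component mixture: reads at a non-variant site show the alternate allele only through sequencing error, while a heterozygous germline site shows it in about half the reads. The LFDR is then the posterior probability that the site is non-variant given its read counts. All parameter values (`error_rate`, `variant_fraction`, `prior_null`) are fixed assumptions chosen for illustration; the published approach learns its parameters from the data across sites, which this sketch does not attempt.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def local_fdr(
    alt_reads: int,
    depth: int,
    error_rate: float = 0.01,       # assumed per-read error rate (null model)
    variant_fraction: float = 0.5,  # assumed alt-allele fraction at a het site
    prior_null: float = 0.999,      # assumed prior probability of no variant
) -> float:
    """Posterior probability that a site is NOT a variant, given its counts."""
    f0 = binom_pmf(alt_reads, depth, error_rate)        # likelihood under null
    f1 = binom_pmf(alt_reads, depth, variant_fraction)  # likelihood under variant
    null_mass = prior_null * f0
    return null_mass / (null_mass + (1.0 - prior_null) * f1)


# Sites with few alternate reads get an LFDR near 1, balanced sites near 0.
for alt in (0, 2, 15):
    print(alt, local_fdr(alt, depth=30))
```

Ranking another caller's output by such a score and moving the cut-off is one way to trade sensitivity against precision, as the abstract describes.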
Affiliation(s)
- Ali Karimnezhad
  - Department of Mathematics and Statistics, University of Ottawa, Ottawa, K1N 9A7, Canada
  - Biostatistics and Risk Modelling Division, Bureau of Food Surveillance and Science Integration, Food Directorate, Health Products and Food Branch, Health Canada, Ottawa, K1A 0K9, Canada
- Theodore J Perkins
  - Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, K1H 8L6, Canada
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, K1H 8M5, Canada
19
Dhanushkumar T, M E S, Selvam PK, Rambabu M, Dasegowda KR, Vasudevan K, George Priya Doss C. Advancements and hurdles in the development of a vaccine for triple-negative breast cancer: A comprehensive review of multi-omics and immunomics strategies. Life Sci 2024; 337:122360. [PMID: 38135117 DOI: 10.1016/j.lfs.2023.122360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 12/15/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023]
Abstract
Triple-Negative Breast Cancer (TNBC) presents a significant challenge in oncology due to its aggressive behavior and limited therapeutic options. This review explores the potential of immunotherapy, particularly vaccine-based approaches, in addressing TNBC. It delves into the role of immunoinformatics in creating effective vaccines against TNBC. The review first underscores the distinct attributes of TNBC and the importance of tumor antigens in vaccine development. It then elaborates on antigen detection techniques such as exome sequencing, HLA typing, and RNA sequencing, which are instrumental in identifying TNBC-specific antigens and selecting vaccine candidates. The discussion then shifts to the in-silico vaccine development process, encompassing antigen selection, epitope prediction, and rational vaccine design. This process merges computational simulations with immunological insights. The role of Artificial Intelligence (AI) in expediting the prediction of antigens and epitopes is also emphasized. The review concludes by encapsulating how Immunoinformatics can augment the design of TNBC vaccines, integrating tumor antigens, advanced detection methods, in-silico strategies, and AI-driven insights to advance TNBC immunotherapy. This could potentially pave the way for more targeted and efficacious treatments.
Affiliation(s)
- T Dhanushkumar
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Santhosh M E
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Prasanna Kumar Selvam
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Majji Rambabu
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- K R Dasegowda
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Karthick Vasudevan
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- C George Priya Doss
  - Laboratory of Integrative Genomics, Department of Integrative Biology, School of BioSciences and Technology, Vellore Institute of Technology (VIT), Vellore, India
20
Abdelwahab O, Belzile F, Torkamaneh D. Performance analysis of conventional and AI-based variant callers using short and long reads. BMC Bioinformatics 2023; 24:472. [PMID: 38097928 PMCID: PMC10720095 DOI: 10.1186/s12859-023-05596-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/30/2023] [Accepted: 12/04/2023] [Indexed: 12/18/2023]
Abstract
BACKGROUND: The accurate detection of variants is essential for genomics-based studies. Currently, there are various tools designed to detect genomic variants, however, it has always been a challenge to decide which tool to use, especially when various major genome projects have chosen to use different tools. Thus far, most of the existing tools were mainly developed to work on short-read data (i.e., Illumina); however, other sequencing technologies (e.g. PacBio, and Oxford Nanopore) have recently shown that they can also be used for variant calling. In addition, with the emergence of artificial intelligence (AI)-based variant calling tools, there is a pressing need to compare these tools in terms of efficiency, accuracy, computational power, and ease of use. RESULTS: In this study, we evaluated five of the most widely used conventional and AI-based variant calling tools (BCFTools, GATK4, Platypus, DNAscope, and DeepVariant) in terms of accuracy and computational cost using both short-read and long-read data derived from three different sequencing technologies (Illumina, PacBio HiFi, and ONT) for the same set of samples from the Genome In A Bottle project. The analysis showed that AI-based variant calling tools supersede conventional ones for calling SNVs and INDELs using both long and short reads in most aspects. In addition, we demonstrate the advantages and drawbacks of each tool while ranking them in each aspect of these comparisons. CONCLUSION: This study provides best practices for variant calling using AI-based and conventional variant callers with different types of sequencing data.
Affiliation(s)
- Omar Abdelwahab
  - Département de Phytologie, Université Laval, Québec, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
  - Centre de recherche et d'innovation sur les végétaux (CRIV), Université Laval, Québec, Canada
  - Institut intelligence et données (IID), Université Laval, Québec, Canada
- François Belzile
  - Département de Phytologie, Université Laval, Québec, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
  - Centre de recherche et d'innovation sur les végétaux (CRIV), Université Laval, Québec, Canada
- Davoud Torkamaneh
  - Département de Phytologie, Université Laval, Québec, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
  - Centre de recherche et d'innovation sur les végétaux (CRIV), Université Laval, Québec, Canada
  - Institut intelligence et données (IID), Université Laval, Québec, Canada
21
Cabello-Aguilar S, Vendrell JA, Solassol J. A Bioinformatics Toolkit for Next-Generation Sequencing in Clinical Oncology. Curr Issues Mol Biol 2023; 45:9737-9752. [PMID: 38132454 PMCID: PMC10741970 DOI: 10.3390/cimb45120608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 11/28/2023] [Accepted: 12/02/2023] [Indexed: 12/23/2023]
Abstract
Next-generation sequencing (NGS) has taken on major importance in clinical oncology practice. With the advent of targeted therapies capable of effectively targeting specific genomic alterations in cancer patients, the development of bioinformatics processes has become crucial. Thus, bioinformatics pipelines play an essential role not only in the detection and in identification of molecular alterations obtained from NGS data but also in the analysis and interpretation of variants, making it possible to transform raw sequencing data into meaningful and clinically useful information. In this review, we aim to examine the multiple steps of a bioinformatics pipeline as used in current clinical practice, and we also provide an updated list of the necessary bioinformatics tools. This resource is intended to assist researchers and clinicians in their genetic data analyses, improving the precision and efficiency of these processes in clinical research and patient care.
Affiliation(s)
- Simon Cabello-Aguilar
  - Montpellier BioInformatics for Clinical Diagnosis (MOBIDIC), Molecular Medicine and Genomics Platform (PMMG), CHU Montpellier, 34295 Montpellier, France
  - Laboratoire de Biologie des Tumeurs Solides, Département de Pathologie et Oncobiologie, CHU Montpellier, Université de Montpellier, 34295 Montpellier, France
- Julie A. Vendrell
  - Laboratoire de Biologie des Tumeurs Solides, Département de Pathologie et Oncobiologie, CHU Montpellier, Université de Montpellier, 34295 Montpellier, France
- Jérôme Solassol
  - Laboratoire de Biologie des Tumeurs Solides, Département de Pathologie et Oncobiologie, CHU Montpellier, Université de Montpellier, 34295 Montpellier, France
22
Beeler JS, Bolton KL. How low can you go?: Methodologic considerations in clonal hematopoiesis variant calling. Leuk Res 2023; 135:107419. [PMID: 37956474 DOI: 10.1016/j.leukres.2023.107419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023]
Abstract
Clonal hematopoiesis (CH) is defined by the presence of an expanded clonal hematopoietic cell population due to an acquired mutation conferring a selective growth advantage and is known to predispose to hematologic malignancy. In this review, we discuss sequencing methods for CH detection in bulk sequencing data and corresponding bioinformatic approaches for variant calling, filtering, and curation. We detail practical recommendations for CH calling. Finally, we discuss how improvements in CH sequencing and bioinformatic approaches will enable the characterization of CH trajectories, its impact on human health, and therapeutic approaches to mitigate its adverse effects.
Affiliation(s)
- J Scott Beeler
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Kelly L Bolton
  - Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
23
Wang D, Wang S, Zhang Y, Cheng X, Huang X, Han Y, Chen Z, Liu C, Li J, Zhang R. Validation and benchmarking of targeted panel sequencing for cancer genomic profiling. Am J Clin Pathol 2023; 160:507-523. [PMID: 37477357 DOI: 10.1093/ajcp/aqad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 06/22/2023] [Indexed: 07/22/2023]
Abstract
OBJECTIVES: To validate a large next-generation sequencing (NGS) panel for comprehensive genomic profiling and improve patient access to more effective precision oncology treatment strategies. METHODS: OncoPanScan was designed by targeting 825 cancer-related genes to detect a broad range of genomic alterations. A practical validation strategy was used to evaluate the assay's analytical performance, involving 97 tumor specimens with 25 paired blood specimens, 10 engineered cell lines, and 121 artificial reference DNA samples. RESULTS: Overall, 1107 libraries were prepared and the sequencing failure rate was 0.18%. Across alteration classes, sensitivity ranged from 0.938 to more than 0.999, specificity ranged from 0.889 to more than 0.999, positive predictive value ranged from 0.867 to more than 0.999, repeatability ranged from 0.908 to more than 0.999, and reproducibility ranged from 0.832 to more than 0.999. The limit of detection for variants was established based on variant frequency, while for tumor mutation burden and microsatellite instability, it was based on tumor content, resulting in a minimum requirement of 20% tumor content. Benchmarking variant calls against validated NGS assays revealed that variations in the dry-bench processes were the primary cause of discordances. CONCLUSIONS: This study presents a detailed validation framework and empirical recommendations for large panel validation and elucidates the sources of discordant alteration calls by comparing with "gold standard measures."
Affiliation(s)
- Duo Wang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | | | - Yuanfeng Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | | | - Xin Huang
- Genetron Health (Beijing), Beijing, China
| | - Yanxi Han
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | | | - Cong Liu
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| |
|
24
|
Majidian S, Agustinho DP, Chin CS, Sedlazeck FJ, Mahmoud M. Genomic variant benchmark: if you cannot measure it, you cannot improve it. Genome Biol 2023; 24:221. [PMID: 37798733 PMCID: PMC10552390 DOI: 10.1186/s13059-023-03061-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/28/2022] [Accepted: 09/18/2023] [Indexed: 10/07/2023] Open
Abstract
Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity.
Affiliation(s)
- Sina Majidian
- Department of Computational Biology, University of Lausanne, 1015, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
| | | | | | - Fritz J Sedlazeck
- Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX, 77030, USA.
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, USA.
| | - Medhat Mahmoud
- Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX, 77030, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
| |
|
25
|
Bzikadze AV, Pevzner PA. UniAligner: a parameter-free framework for fast sequence alignment. Nat Methods 2023; 20:1346-1354. [PMID: 37580559 DOI: 10.1038/s41592-023-01970-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 07/05/2023] [Indexed: 08/16/2023]
Abstract
Even though the recent advances in 'complete genomics' revealed the previously inaccessible genomic regions, analysis of variations in centromeres and other extra-long tandem repeats (ETRs) faces an algorithmic challenge since there are currently no tools for accurate sequence comparison of ETRs. Counterintuitively, the classical alignment approaches, such as the Smith-Waterman algorithm, fail to construct biologically adequate alignments of ETRs. We present UniAligner, the parameter-free sequence alignment algorithm with sequence-dependent alignment scoring that automatically changes for any pair of compared sequences. UniAligner prioritizes matches of rare substrings that are more likely to be relevant to the evolutionary relationship between two sequences. We apply UniAligner to estimate the mutation rates in human centromeres, and quantify the extremely high rate of large duplications and deletions in centromeres. This high rate suggests that centromeres may represent some of the most rapidly evolving regions of the human genome with respect to their structural organization.
Affiliation(s)
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA.
| |
|
26
|
Wang Y, Du H, Dai W, Bao C, Zhang X, Hu Y, Xie Z, Zhao X, Li C, Zhang W, Wu R. Diagnostic Potential of Endometrial Cancer DNA from Pipelle, Pap-Brush, and Swab Sampling. Cancers (Basel) 2023; 15:3522. [PMID: 37444632 DOI: 10.3390/cancers15133522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 06/29/2023] [Accepted: 07/03/2023] [Indexed: 07/15/2023] Open
Abstract
Endometrial cancer (EC) is a major gynecological malignancy with rising morbidity and mortality worldwide. The aim of this study was to explore a safe and readily available sample and a sensitive and effective detection method and its biomarkers for early diagnosis of EC, which is critical for patient prognosis. This study designed a panel targeting variants for EC-related genes, assessed its technical performance by comparing it with whole-exome sequencing, and explored the diagnostic potential of endometrial biopsies using the Pipelle aspirator, cervical samples using the Pap brush, and vaginal specimens using the swab from 38 EC patients and 208 women with risk factors for EC by applying targeted panel sequencing (TPS). TPS produced high-quality data (Q30 > 85% and mapping ratios > 99.35%) and was found to have strong consistency with whole-exome sequencing (WES) in detecting pathogenic mutations (92.11%), calculating homologous recombination deficiency (HRD) scores (r = 0.65), and assessing the microsatellite instability (MSI) status of EC (100%). The sensitivity of TPS in detection of EC is slightly better than that of WES (86.84% vs. 84.21%). Of the three types of samples detected using TPS, endometrial biopsy using the Pipelle aspirator had the highest sensitivity in detection of pathogenic mutations (81.87%) and the best consistency with surgical tumor specimens in MSI (85.16%). About 84% of EC patients contained pathogenic mutations in PIK3CA, PTEN, TP53, ARID1A, CTNNB1, KRAS, and MTOR, suggesting that this small gene set can achieve an excellent pathogenic mutation detection rate in Chinese EC patients. The custom panel combined with ultra-deep sequencing serves as a sensitive method for detecting genetic lesions from endometrial biopsy using the Pipelle aspirator.
Affiliation(s)
- Yinan Wang
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| | - Hui Du
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| | - Wenkui Dai
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| | - Cuijun Bao
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| | - Xi Zhang
- Department of Clinical Medicine, Xi'an Jiaotong University, Xi'an 710049, China
| | - Yan Hu
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| | - Zhiyu Xie
- Department of Clinical Medicine, Xi'an Jiaotong University, Xi'an 710049, China
| | - Xin Zhao
- China National GeneBank, BGI-Shenzhen, Shenzhen 518116, China
| | - Changzhong Li
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| | - Wenyong Zhang
- School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, China
| | - Ruifang Wu
- Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518036, China
- Institute of Obstetrics and Gynecology, Shenzhen PKU-HKUST Medical Center, Shenzhen 518036, China
- Shenzhen Key Laboratory on Technology for Early Diagnosis of Major Gynecologic Diseases, Shenzhen 518036, China
| |
|
27
|
Ji S, Zhu T, Sethia A, Wang W. Accelerated somatic mutation calling for whole-genome and whole-exome sequencing data from heterogenous tumor samples. bioRxiv 2023:2023.07.04.547569. [PMID: 37461467 PMCID: PMC10350007 DOI: 10.1101/2023.07.04.547569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2023]
Abstract
Accurate detection of somatic mutations in DNA sequencing data is a fundamental prerequisite for cancer research. Previous analytical challenges were overcome by consensus mutation calling across four to five popular callers. This, however, increases the already nontrivial computing time of the individual callers. Here, we launch MuSE2.0, powered by multi-step parallelization and efficient memory allocation, to resolve the computing time bottleneck. MuSE2.0 is 50 times faster than MuSE1.0 and 8-80 times faster than other popular callers. Our benchmark study suggests combining MuSE2.0 and the recently expedited Strelka2 can achieve high efficiency and accuracy in analyzing large cancer genomic datasets.
Affiliation(s)
- Shuangxi Ji
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Tong Zhu
- NVIDIA Corporation, Santa Clara, CA, USA
| | | | - Wenyi Wang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| |
|
28
|
Trevarton AJ, Chang JT, Symmans WF. Simple combination of multiple somatic variant callers to increase accuracy. Sci Rep 2023; 13:8463. [PMID: 37231022 DOI: 10.1038/s41598-023-34925-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 05/10/2023] [Indexed: 05/27/2023] Open
Abstract
Publications comparing variant caller algorithms present discordant results with contradictory rankings. Caller performances are inconsistent and wide ranging, and dependent upon input data, application, parameter settings, and evaluation metric. With no single variant caller emerging as a superior standard, combinations or ensembles of variant callers have appeared in the literature. In this study, a whole genome somatic reference standard was used to derive principles to guide strategies for combining variant calls. Then, manually annotated variants called from the whole exome sequencing of a tumor were used to corroborate these general principles. Finally, we examined the ability of these principles to reduce noise in targeted sequencing.
Affiliation(s)
- Alexander J Trevarton
- School of Biological Sciences, Faculty of Science, University of Auckland, Auckland, New Zealand.
| | - Jeffrey T Chang
- Department of Integrative Biology and Pharmacology, The University of Texas Health Sciences Center, Houston, USA
| | - W Fraser Symmans
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, USA
| |
|
29
|
Liu Y, Wang S, Wang Y, Li Y, Zhu X, Lai X, Zhang X, Li X, Xiao X, Wang J. What makes TMB an ambivalent biomarker for immunotherapy? A subtle mismatch between the sample-based design of variant callers and real clinical cohort. Front Immunol 2023; 14:1151224. [PMID: 37304296 PMCID: PMC10248171 DOI: 10.3389/fimmu.2023.1151224] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/25/2023] [Accepted: 05/15/2023] [Indexed: 06/13/2023] Open
Abstract
Tumor mutation burden (TMB) is a widely recognized biomarker for predicting the efficacy of immunotherapy. However, its use still remains highly controversial. In this study, we examine the underlying causes of this controversy based on clinical needs. By tracing the source of the TMB errors and analyzing the design philosophy behind variant callers, we identify the conflict between the incompleteness of biostatistics rules and the variety of clinical samples as the critical issue that renders TMB an ambivalent biomarker. A series of experiments were conducted to illustrate the challenges of mutation detection in clinical practice. Additionally, we also discuss potential strategies for overcoming these conflict issues to enable the application of TMB in guiding decision-making in real clinical settings.
Affiliation(s)
- Yuqian Liu
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Shenjie Wang
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Yixuan Wang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China
| | - Yifei Li
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Xiaoyan Zhu
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Xin Lai
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Xuanping Zhang
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Xuqi Li
- Department of General Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
| | - Xiao Xiao
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Geneplus Shenzhen, Shenzhen, China
| | - Jiayin Wang
- School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, Shaanxi, China
| |
|
30
|
Li S, Hu R, Small C, Kang TY, Liu CC, Zhou XJ, Li W. cfSNV: a software tool for the sensitive detection of somatic mutations from cell-free DNA. Nat Protoc 2023; 18:1563-1583. [PMID: 36849599 PMCID: PMC10411976 DOI: 10.1038/s41596-023-00807-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/25/2022] [Accepted: 11/24/2022] [Indexed: 03/01/2023]
Abstract
Cell-free DNA (cfDNA) in blood, viewed as a surrogate for tumor biopsy, has many clinical applications, including diagnosing cancer, guiding cancer treatment and monitoring treatment response. All these applications depend on an indispensable, yet underdeveloped task: detecting somatic mutations from cfDNA. The task is challenging because of the low tumor fraction in cfDNA. Recently, we developed the computational method cfSNV, the first method that comprehensively considers the properties of cfDNA for the sensitive detection of mutations from cfDNA. cfSNV vastly outperformed the conventional methods that were developed primarily for calling mutations from solid tumor tissues. cfSNV can accurately detect mutations in cfDNA even with medium-coverage (e.g., ≥200×) sequencing, which makes whole-exome sequencing (WES) of cfDNA a viable option for various clinical utilities. Here, we present a user-friendly cfSNV package that exhibits fast computation and convenient user options. We also built a Docker image of it, which is designed to enable researchers and clinicians with a limited computational background to easily carry out analyses on both high-performance computing platforms and local computers. Mutation calling from a standard preprocessed WES dataset (~250× and ~70 million base pair target size) can be carried out in 3 h on a server with eight virtual CPUs and 32 GB of random access memory.
Affiliation(s)
- Shuo Li
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, CA, USA
| | - Ran Hu
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Graduate Program, University of California at Los Angeles, Los Angeles, CA, USA
- Institute for Quantitative & Computational Biosciences, University of California at Los Angeles, Los Angeles, CA, USA
| | - Colin Small
- Institute for Quantitative & Computational Biosciences, University of California at Los Angeles, Los Angeles, CA, USA
| | | | - Chun-Chi Liu
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, CA, USA
- EarlyDiagnostics Inc., Los Angeles, CA, USA
| | - Xianghong Jasmine Zhou
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, CA, USA.
- Institute for Quantitative & Computational Biosciences, University of California at Los Angeles, Los Angeles, CA, USA.
- EarlyDiagnostics Inc., Los Angeles, CA, USA.
| | - Wenyuan Li
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, CA, USA.
- EarlyDiagnostics Inc., Los Angeles, CA, USA.
| |
|
31
|
Vaisband M, Schubert M, Gassner FJ, Geisberger R, Greil R, Zaborsky N, Hasenauer J. Validation of genetic variants from NGS data using deep convolutional neural networks. BMC Bioinformatics 2023; 24:158. [PMID: 37081386 PMCID: PMC10116675 DOI: 10.1186/s12859-023-05255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 03/27/2023] [Indexed: 04/22/2023] Open
Abstract
Accurate somatic variant calling from next-generation sequencing data is one of the most important tasks in personalised cancer therapy. The sophistication of the available technologies is ever-increasing, yet manual candidate refinement is still a necessary step in state-of-the-art processing pipelines. This limits reproducibility and introduces a bottleneck with respect to scalability. We demonstrate that the validation of genetic variants can be improved using a machine learning approach resting on a Convolutional Neural Network, trained using existing human annotation. In contrast to existing approaches, we introduce a way in which contextual data from sequencing tracks can be included in the automated assessment. A rigorous evaluation shows that the resulting model is robust and performs on par with trained researchers following a published standard operating procedure.
Affiliation(s)
- Marc Vaisband
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
| | - Maria Schubert
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Franz Josef Gassner
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Roland Geisberger
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Richard Greil
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Nadja Zaborsky
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
| | - Jan Hasenauer
- Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
| |
|
32
|
Xia H, McMichael J, Becker-Hapak M, Onyeador OC, Buchli R, McClain E, Pence P, Supabphol S, Richters MM, Basu A, Ramirez CA, Puig-Saus C, Cotto KC, Freshour SL, Hundal J, Kiwala S, Goedegebuure SP, Johanns TM, Dunn GP, Ribas A, Miller CA, Gillanders WE, Fehniger TA, Griffith OL, Griffith M. Computational prediction of MHC anchor locations guides neoantigen identification and prioritization. Sci Immunol 2023; 8:eabg2200. [PMID: 37027480 PMCID: PMC10450883 DOI: 10.1126/sciimmunol.abg2200] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/17/2020] [Accepted: 03/16/2023] [Indexed: 04/09/2023]
Abstract
Neoantigens are tumor-specific peptide sequences resulting from sources such as somatic DNA mutations. Upon loading onto major histocompatibility complex (MHC) molecules, they can trigger recognition by T cells. Accurate neoantigen identification is thus critical for both designing cancer vaccines and predicting response to immunotherapies. Neoantigen identification and prioritization relies on correctly predicting whether the presenting peptide sequence can successfully induce an immune response. Because most somatic mutations are single-nucleotide variants, changes between wild-type and mutated peptides are typically subtle and require cautious interpretation. A potentially underappreciated variable in neoantigen prediction pipelines is the mutation position within the peptide relative to its anchor positions for the patient's specific MHC molecules. Whereas a subset of peptide positions are presented to the T cell receptor for recognition, others are responsible for anchoring to the MHC, making these positional considerations critical for predicting T cell responses. We computationally predicted anchor positions for different peptide lengths for 328 common HLA alleles and identified unique anchoring patterns among them. Analysis of 923 tumor samples shows that 6 to 38% of neoantigen candidates are potentially misclassified and can be rescued using allele-specific knowledge of anchor positions. A subset of anchor results were orthogonally validated using protein crystallography structures. Representative anchor trends were experimentally validated using peptide-MHC stability assays and competition binding assays. By incorporating our anchor prediction results into neoantigen prediction pipelines, we hope to formalize, streamline, and improve the identification process for relevant clinical studies.
Affiliation(s)
- Huiming Xia
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Joshua McMichael
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Michelle Becker-Hapak
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Onyinyechi C. Onyeador
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Rico Buchli
- Pure Protein LLC, Oklahoma City, OK 73104, USA
| | - Ethan McClain
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Patrick Pence
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Suangson Supabphol
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
- The Center of Excellence in Systems Biology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
| | - Megan M. Richters
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Anamika Basu
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Cody A. Ramirez
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Cristina Puig-Saus
- Division of Hematology/Oncology, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, Los Angeles, CA, USA
- Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
| | - Kelsy C. Cotto
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Sharon L. Freshour
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Jasreet Hundal
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Susanna Kiwala
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - S. Peter Goedegebuure
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
| | - Tanner M. Johanns
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Gavin P. Dunn
- Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO, USA
| | - Antoni Ribas
- Division of Hematology/Oncology, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, Los Angeles, CA, USA
- Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
| | - Christopher A. Miller
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
| | - William E. Gillanders
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
| | - Todd A. Fehniger
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Obi L. Griffith
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - Malachi Griffith
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| |
|
33
|
Morazán-Fernández D, Mora J, Molina-Mora JA. In Silico Pipeline to Identify Tumor-Specific Antigens for Cancer Immunotherapy Using Exome Sequencing Data. PHENOMICS (CHAM, SWITZERLAND) 2023; 3:130-137. [PMID: 37197645 PMCID: PMC10110822 DOI: 10.1007/s43657-022-00084-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Revised: 11/09/2022] [Accepted: 11/15/2022] [Indexed: 05/19/2023]
Abstract
Tumor-specific antigens or neoantigens are peptides that are expressed only in cancer cells and not in healthy cells. Some of these molecules can induce an immune response, and therefore, their use in immunotherapeutic strategies based on cancer vaccines has been extensively explored. Studies based on these approaches have been triggered by the current high-throughput DNA sequencing technologies. However, there is no universal or straightforward bioinformatic protocol to discover neoantigens using DNA sequencing data. Thus, we propose a bioinformatic protocol to detect tumor-specific antigens associated with single nucleotide variants (SNVs) or "mutations" in tumoral tissues. For this purpose, we used publicly available data to build our model, including exome sequencing data from colorectal cancer and healthy cells obtained from a single case, as well as frequent human leukocyte antigen (HLA) class I alleles in a specific population. HLA data from the Costa Rican Central Valley population were selected as an example. The strategy included three main steps: (1) pre-processing of sequencing data; (2) variant calling analysis to detect tumor-specific SNVs in comparison with healthy tissue; and (3) prediction and characterization of peptides (protein fragments, the tumor-specific antigens) derived from the variants, in the context of their affinity with frequent alleles of the selected population. In our model data, we found 28 non-silent SNVs, present in 17 genes on chromosome 1. The protocol yielded 23 strong-binder peptides derived from the SNVs for frequent HLA class I alleles for the Costa Rican population. Although the analyses were performed as an example to implement the pipeline, to our knowledge, this is the first study of an in silico cancer vaccine using DNA sequencing data in the context of the HLA alleles. It is concluded that the standardized protocol was not only able to identify neoantigens in a specific but also provides a complete pipeline for the eventual design of cancer vaccines using the best bioinformatic practices. Supplementary Information: The online version contains supplementary material available at 10.1007/s43657-022-00084-9.
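As an illustration of step (3) in the abstract above, the sketch below enumerates the candidate peptides that span a single amino-acid substitution. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `mutant_peptides`, the 0-based position convention, the default 9-mer length, and the toy protein sequence are all hypothetical, and binding-affinity prediction against HLA alleles is not attempted here.

```python
def mutant_peptides(protein: str, pos: int, alt: str, k: int = 9) -> list[str]:
    """Return every k-mer of the mutated protein that contains the substituted residue.

    protein: wild-type amino-acid sequence (one-letter codes)
    pos:     0-based index of the substituted residue (assumed convention)
    alt:     single-letter amino acid introduced by the SNV
    k:       peptide length; 9 is a common choice for HLA class I
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    if len(alt) != 1:
        raise ValueError("alt must be a single amino-acid letter")

    # Apply the substitution to obtain the mutated protein.
    mutated = protein[:pos] + alt + protein[pos + 1:]

    # A window starting at s covers indices s..s+k-1, so it contains pos
    # when pos-k+1 <= s <= pos; clamp the range to the sequence bounds.
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(mutated) - k)
    return [mutated[s:s + k] for s in range(first_start, last_start + 1)]


if __name__ == "__main__":
    # Toy 20-residue sequence, used only to exercise the function.
    toy_protein = "ACDEFGHIKLMNPQRSTVWY"
    for peptide in mutant_peptides(toy_protein, pos=10, alt="W"):
        print(peptide)
```

Each returned peptide would then be scored against the HLA alleles of interest with a dedicated binding predictor, which is outside the scope of this sketch.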
Affiliation(s)
| | - Javier Mora
- Centro de Investigación de Enfermedades Tropicales, Centro de Investigación en Cirugía y Cáncer, and Facultad de Microbiología, Universidad de Costa Rica, San José, 2060 Costa Rica
| | - Jose Arturo Molina-Mora
- Centro de Investigación de Enfermedades Tropicales, Centro de Investigación en Cirugía y Cáncer, and Facultad de Microbiología, Universidad de Costa Rica, San José, 2060 Costa Rica
| |
|
34
|
Evaluation of ctDNA in the Prediction of Response to Neoadjuvant Therapy and Prognosis in Locally Advanced Rectal Cancer Patients: A Prospective Study. Pharmaceuticals (Basel) 2023; 16:ph16030427. [PMID: 36986526 PMCID: PMC10057108 DOI: 10.3390/ph16030427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Revised: 02/28/2023] [Accepted: 03/03/2023] [Indexed: 03/18/2023] Open
Abstract
“Watch and wait” is becoming a common treatment option for patients with locally advanced rectal cancer (LARC) submitted to neoadjuvant treatment. However, currently, no clinical modality has an acceptable accuracy for predicting pathological complete response (pCR). The aim of this study was to assess the clinical utility of circulating tumor DNA (ctDNA) in predicting the response and prognosis in these patients. We prospectively enrolled a cohort from three Iberian centers between January 2020 and December 2021 and analyzed the association of ctDNA with the main response outcomes and disease-free survival (DFS). The rate of pCR in the total sample was 15.3%. A total of 24 plasma samples from 18 patients were analyzed by next-generation sequencing. At baseline, mutations were detected in 38.9%, with the most common being TP53 and KRAS. The combination of positive magnetic resonance imaging (MRI) extramural venous invasion (mrEMVI) and ctDNA increased the risk of poor response (p = 0.021). Also, patients with two mutations vs. those with fewer than two mutations had a worse DFS (p = 0.005). Although these results should be read carefully due to sample size, this study suggests that baseline ctDNA combined with mrEMVI could potentially help to predict the response, and that the baseline number of ctDNA mutations might allow the discrimination of groups with different DFS. Further studies are needed to clarify the role of ctDNA as an independent tool in the selection and management of LARC patients.
|
35
|
Larson NB, Oberg AL, Adjei AA, Wang L. A Clinician's Guide to Bioinformatics for Next-Generation Sequencing. J Thorac Oncol 2023; 18:143-157. [PMID: 36379355 PMCID: PMC9870988 DOI: 10.1016/j.jtho.2022.11.006] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2021] [Revised: 10/31/2022] [Accepted: 11/05/2022] [Indexed: 11/15/2022]
Abstract
Next-generation sequencing (NGS) technologies are high-throughput methods for DNA sequencing and have become a widely adopted tool in cancer research. The sheer amount and variety of data generated by NGS assays require sophisticated computational methods and bioinformatics expertise. In this review, we provide background details of NGS technology and basic bioinformatics concepts for the clinician investigator interested in cancer research applications, with a focus on DNA-based approaches. We introduce the general principles of presequencing library preparation, postsequencing alignment, and variant calling. We also highlight the common variant annotations and NGS applications for other molecular data types. Finally, we briefly discuss the revealed utility of NGS methods in NSCLC research and study design considerations for research studies that aim to leverage NGS technologies for clinical care.
Affiliation(s)
- Nicholas Bradley Larson
- Division of Clinical Trials and Biostatistics, Department of Quantitative Health Sciences, Mayo Clinic College of Medicine and Science, Rochester, Minnesota.
| | - Ann L Oberg
- Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic College of Medicine and Science, Rochester, Minnesota
| | - Alex A Adjei
- Taussig Cancer Institute, Cleveland Clinic, Cleveland, Ohio
| | - Liguo Wang
- Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic College of Medicine and Science, Rochester, Minnesota
| |
|
36
|
Dhanda SK, Mahajan S, Manoharan M. Neoepitopes prediction strategies: an integration of cancer genomics and immunoinformatics approaches. Brief Funct Genomics 2023; 22:1-8. [PMID: 36398967 DOI: 10.1093/bfgp/elac041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 09/28/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022] Open
Abstract
A major near-term medical impact of the genomic technology revolution will be the elucidation of mechanisms of cancer pathogenesis, leading to improvements in the diagnosis of cancer and the selection of cancer treatment. Next-generation sequencing technologies have accelerated the characterization of a tumor, leading to the comprehensive discovery of all the major alterations in a given cancer genome, followed by the translation of this information using computational and immunoinformatics approaches to cancer diagnostics and therapeutic efforts. In the current article, we review various components of cancer immunoinformatics applied to a series of fields of cancer research, including computational tools for cancer mutation detection, cancer mutation and immunological databases, and computational vaccinology.
Affiliation(s)
- Sandeep Kumar Dhanda
- Department of Oncology, St Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Swapnil Mahajan
- DeepKnomics Labs Private Limited, 7014 Prestige Garden Bay, IVRI Road, Avalahalli, Behind CRPF Campus, Yelahanka, Bangalore 560064, India
| | - Malini Manoharan
- DeepKnomics Labs Private Limited, 7014 Prestige Garden Bay, IVRI Road, Avalahalli, Behind CRPF Campus, Yelahanka, Bangalore 560064, India
| |
|
37
|
Vilov S, Heinig M. DeepSom: a CNN-based approach to somatic variant calling in WGS samples without a matched normal. Bioinformatics 2023; 39:6986966. [PMID: 36637201 PMCID: PMC9843587 DOI: 10.1093/bioinformatics/btac828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2022] [Revised: 12/19/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023] Open
Abstract
MOTIVATION: Somatic mutations are usually called by analyzing the DNA sequence of a tumor sample in conjunction with a matched normal. However, a matched normal is not always available, for instance, in retrospective analysis or diagnostic settings. For such cases, tumor-only somatic variant calling tools need to be designed. Previously proposed approaches demonstrate inferior performance on whole-genome sequencing (WGS) samples. RESULTS: We present the convolutional neural network-based approach called DeepSom for detecting somatic single nucleotide polymorphism and short insertion and deletion variants in tumor WGS samples without a matched normal. We validate DeepSom by reporting its performance on five different cancer datasets. We also demonstrate that on WGS samples DeepSom outperforms previously proposed methods for tumor-only somatic variant calling. AVAILABILITY AND IMPLEMENTATION: DeepSom is available as a GitHub repository at https://github.com/heiniglab/DeepSom. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sergey Vilov
- Institute of Computational Biology, Computational Health Center, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), 85764 Neuherberg, Germany
| | | |
|
38
|
Seillier L, Peifer M. Reconstructing Phylogenetic Relationship in Bladder Cancer: A Methodological Overview. Methods Mol Biol 2023; 2684:113-132. [PMID: 37410230 DOI: 10.1007/978-1-0716-3291-8_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/07/2023]
Abstract
Bladder cancer (BC) expresses itself as a highly heterogeneous disease both at the histological and molecular level, often occurring as synchronous or metachronous multifocal disease with high risk of recurrence and potential to metastasize. Multiple sequencing studies focusing on both non-muscle-invasive bladder cancer (NMIBC) and muscle-invasive bladder cancer (MIBC) gave insights into the extent of both inter- and intrapatient heterogeneity, but many questions on clonal evolution in BC remain unanswered. In this review article, we provide an overview of the technical and theoretical concepts linked to reconstructing evolutionary trajectories in BC and propose a set of tools and established software for phylogenetic analysis.
Affiliation(s)
| | - Martin Peifer
- Department of Translational Genomics, University of Cologne, Cologne, Germany
| |
|
39
|
Craven KE, Fischer CG, Jiang L, Pallavajjala A, Lin MT, Eshleman JR. Optimizing Insertion and Deletion Detection Using Next-Generation Sequencing in the Clinical Laboratory. J Mol Diagn 2022; 24:1217-1231. [PMID: 36162758 PMCID: PMC9808503 DOI: 10.1016/j.jmoldx.2022.08.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2022] [Revised: 07/18/2022] [Accepted: 08/31/2022] [Indexed: 01/13/2023] Open
Abstract
Detection of insertions and deletions (InDels) by short-read next-generation sequencing (NGS) technology can be challenging because of frequent misaligned reads. A systematic analysis of short InDels (1 to 30 bases) and fms-related receptor tyrosine kinase 3 (FLT3) internal tandem duplications (ITDs; 6 to 183 bases) from 46 clinical cases of solid or hematologic malignancy processed with a clinical NGS assay identified misaligned reads in every case, ranging from 3% to 100% of reads with the InDel showing mismapped bases. Mismaps also increased with InDel size. As a consequence, the clinical NGS bioinformatics pipeline undercalled the variant allele frequency by 1% to 84%, incorrectly called simultaneous single-base substitutions along with InDels, or did not report an FLT3 ITD that had been detected by capillary electrophoresis. To improve the ability of the pipeline to better detect and quantify InDels, we utilized a software program called Assembly-Based ReAligner (ABRA2) to more accurately remap reads. ABRA2 was able to correct 41% to 100% of the reads with mismapped bases and led to absolute increases in the variant allele frequency from 1% to 61% along with correction of all of the single-base substitutions except for two cases. ABRA2 could also detect multiple FLT3 ITD clones except for one 183-base ITD. Our analysis has found that ABRA2 performs well on short InDels as well as FLT3 ITDs that are <100 bases.
Affiliation(s)
- Kelly E Craven
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Catherine G Fischer
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland; Division of Cancer Prevention, National Cancer Institute, Rockville, Maryland
| | - LiQun Jiang
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Aparna Pallavajjala
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Ming-Tseh Lin
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - James R Eshleman
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland; The Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.
| |
|
40
|
Muñoz-Barrera A, Rubio-Rodríguez LA, Díaz-de Usera A, Jáspez D, Lorenzo-Salazar JM, González-Montelongo R, García-Olivares V, Flores C. From Samples to Germline and Somatic Sequence Variation: A Focus on Next-Generation Sequencing in Melanoma Research. Life (Basel) 2022; 12:1939. [PMID: 36431075 PMCID: PMC9695713 DOI: 10.3390/life12111939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2022] [Revised: 11/12/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022] Open
Abstract
Next-generation sequencing (NGS) applications have flourished in the last decade, permitting the identification of cancer driver genes and profoundly expanding the possibilities of genomic studies of cancer, including melanoma. Here we aimed to present a technical review across many of the methodological approaches brought by the use of NGS applications with a focus on assessing germline and somatic sequence variation. We provide cautionary notes and discuss key technical details involved in library preparation, the most common problems with the samples, and guidance to circumvent them. We also provide an overview of the sequence-based methods for cancer genomics, exposing the pros and cons of targeted sequencing vs. exome or whole-genome sequencing (WGS), the fundamentals of the most common commercial platforms, and a comparison of throughputs and key applications. Details of the steps and the main software involved in the bioinformatics processing of the sequencing results, from preprocessing to variant prioritization and filtering, are also provided in the context of the full spectrum of genetic variation (SNVs, indels, CNVs, structural variation, and gene fusions). Finally, we put the emphasis on selected bioinformatic pipelines behind (a) short-read WGS identification of small germline and somatic variants, (b) detection of gene fusions from transcriptomes, and (c) de novo assembly of genomes from long-read WGS data. Overall, we provide comprehensive guidance across the main methodological procedures involved in obtaining sequencing results for the most common short- and long-read NGS platforms, highlighting key applications in melanoma research.
Affiliation(s)
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
| | - David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain
| |
|
41
|
Batlle-Masó L, Garcia-Prat M, Parra-Martínez A, Franco-Jarava C, Aguiló-Cucurull A, Velasco P, Antolín M, Rivière JG, Martín-Nalda A, Soler-Palacín P, Martínez-Gallo M, Colobran R. Detection and evolutionary dynamics of somatic FAS variants in autoimmune lymphoproliferative syndrome: Diagnostic implications. Front Immunol 2022; 13:1014984. [PMID: 36466883 PMCID: PMC9716137 DOI: 10.3389/fimmu.2022.1014984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Accepted: 10/24/2022] [Indexed: 11/21/2022] Open
Abstract
Autoimmune lymphoproliferative syndrome (ALPS) is a rare primary immune disorder characterized by impaired apoptotic homeostasis. The clinical characteristics include lymphoproliferation, autoimmunity (mainly cytopenia), and an increased risk of lymphoma. A distinctive biological feature is accumulation (>2.5%) of an abnormal cell subset composed of TCRαβ+ CD4-CD8- T cells (DNTs). The most common genetic causes of ALPS are monoallelic pathogenic variants in the FAS gene followed by somatic FAS variants, mainly restricted to DNTs. Identification of somatic FAS variants has been typically addressed by Sanger sequencing in isolated DNTs. However, this approach can be costly and technically challenging, and may not be successful in patients with normal DNT counts receiving immunosuppressive treatment. In this study, we identified a novel somatic mutation in FAS (c.718_719insGTCG) by Sanger sequencing on purified CD3+ cells. We then followed the evolutionary dynamics of the variant over time with an NGS-based approach involving deep amplicon sequencing (DAS) at high coverage (20,000-30,000x). Over five years of clinical follow-up, we obtained six blood samples for molecular study from the pre-treatment (DNTs>7%) and treatment (DNTs<2%) periods. DAS enabled detection of the somatic variant in all samples, even the one obtained after five years of immunosuppressive treatment (DNTs: 0.89%). The variant allele frequency (VAF) range was 4%-5% in pre-treatment samples and <1.5% in treatment samples, and there was a strong positive correlation between DNT counts and VAF (Pearson’s R: 0.98, p=0.0003). We then explored whether the same approach could be used in a discovery setting. In the last follow-up sample (DNT: 0.89%) we performed somatic variant calling on the FAS exon 9 DAS data from whole blood and purified CD3+ cells using VarScan 2. The c.718_719insGTCG variant was identified in both samples and showed the highest VAF (0.67% blood, 1.58% CD3+ cells) among >400 variants called. In summary, our study illustrates the evolutionary dynamics of a somatic FAS mutation before and during immunosuppressive treatment. The results show that pathogenic somatic FAS variants can be identified with the use of DAS in whole blood of ALPS patients regardless of their DNT counts.
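The two quantities reported in the abstract above, variant allele frequency and Pearson's correlation coefficient, can be sketched as follows. This is a minimal illustration and not the study's analysis code: the function names `vaf` and `pearson_r` are hypothetical, VAF is assumed to be the fraction of reads supporting the variant at a site, and the read counts and DNT percentages in the usage section are invented placeholders rather than study data.

```python
from math import sqrt


def vaf(alt_reads: int, depth: int) -> float:
    """Variant allele frequency: variant-supporting reads divided by total depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    return alt_reads / depth


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Placeholder (alt_reads, depth) pairs and DNT percentages, NOT study data.
    read_counts = [(1200, 25000), (1000, 24000), (300, 26000), (150, 25000)]
    dnt_percent = [8.0, 7.2, 1.8, 0.9]
    vafs = [vaf(alt, depth) for alt, depth in read_counts]
    print([f"{v:.2%}" for v in vafs])
    print(f"r = {pearson_r(dnt_percent, vafs):.3f}")
```

At coverage in the tens of thousands, even a sub-1% VAF corresponds to hundreds of supporting reads, which is what makes such low-frequency variants countable in this kind of calculation.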
Affiliation(s)
- Laura Batlle-Masó
- Infection in Immunocompromised Pediatric Patients Research Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Marina Garcia-Prat
- Infection in Immunocompromised Pediatric Patients Research Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Alba Parra-Martínez
- Infection in Immunocompromised Pediatric Patients Research Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Clara Franco-Jarava
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
- Translational Immunology Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Immunology Division, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
| | - Aina Aguiló-Cucurull
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
- Translational Immunology Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Immunology Division, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
| | - Pablo Velasco
- Pediatric Oncology and Hematology Department, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
| | - María Antolín
- Department of Clinical and Molecular Genetics, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
| | - Jacques G. Rivière
- Infection in Immunocompromised Pediatric Patients Research Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Andrea Martín-Nalda
- Infection in Immunocompromised Pediatric Patients Research Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Pere Soler-Palacín
- Infection in Immunocompromised Pediatric Patients Research Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Mónica Martínez-Gallo
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
- Translational Immunology Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Immunology Division, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Department of Cell Biology, Autonomous University of Barcelona (UAB), Physiology and Immunology, Bellaterra, Spain
| | - Roger Colobran
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
- Translational Immunology Group, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain
- Immunology Division, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Department of Clinical and Molecular Genetics, Vall d’Hebron University Hospital (HUVH), Barcelona, Spain
- Department of Cell Biology, Autonomous University of Barcelona (UAB), Physiology and Immunology, Bellaterra, Spain
- *Correspondence: Roger Colobran,
| |
|
42
|
Genestet C, Refrégier G, Hodille E, Zein-Eddine R, Le Meur A, Hak F, Barbry A, Westeel E, Berland JL, Engelmann A, Verdier I, Lina G, Ader F, Dray S, Jacob L, Massol F, Venner S, Dumitrescu O. Mycobacterium tuberculosis genetic features associated with pulmonary tuberculosis severity. Int J Infect Dis 2022; 125:74-83. [PMID: 36273524 DOI: 10.1016/j.ijid.2022.10.026] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2022] [Revised: 10/13/2022] [Accepted: 10/15/2022] [Indexed: 11/06/2022] Open
Abstract
OBJECTIVES: Mycobacterium tuberculosis (Mtb) infections result in a wide spectrum of clinical presentations but without proven Mtb genetic determinants. Herein, we hypothesized that the genetic features of Mtb clinical isolates, such as specific polymorphisms or microdiversity, may be linked to tuberculosis (TB) severity. METHODS: A total of 234 patients with pulmonary TB (including 193 drug-susceptible and 14 monoresistant cases diagnosed between 2017 and 2020 and 27 multidrug-resistant cases diagnosed between 2010 and 2020) were stratified according to TB disease severity, and Mtb genetic features were explored using whole genome sequencing, including heterologous single-nucleotide polymorphism (SNP) calling to explore microdiversity. Finally, we performed a structural equation modeling analysis to relate TB severity to Mtb genetic features. RESULTS: The clinical isolates from patients with mild TB carried mutations in genes associated with host-pathogen interaction, whereas those from patients with moderate/severe TB carried mutations associated with regulatory mechanisms. Genome-wide association study identified an SNP in the promoter of the gene coding for the virulence regulator espR, statistically associated with moderate/severe disease. Structural equation modeling and model comparisons indicated that TB severity was associated with the detection of Mtb microdiversity within clinical isolates and with the espR SNP. CONCLUSION: Taken together, these results provide new insight into TB pathophysiology and could provide a new prognostic tool for pulmonary TB severity.
Affiliation(s)
- Charlotte Genestet
- CIRI - Centre International de Recherche en Infectiologie, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon-1, Rhône-Alpes, Lyon, France; Hospices Civils de Lyon, Institut des Agents Infectieux, Laboratoire de bactériologie, Rhône-Alpes, Lyon, France.
| | - Guislaine Refrégier
- Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique et Evolution, Île-de-France, Orsay, France; Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Sud, Université Paris-Saclay, Île-de-France, Gif-sur-Yvette, France
| | - Elisabeth Hodille
- CIRI - Centre International de Recherche en Infectiologie, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon-1, Rhône-Alpes, Lyon, France; Hospices Civils de Lyon, Institut des Agents Infectieux, Laboratoire de bactériologie, Rhône-Alpes, Lyon, France
| | - Rima Zein-Eddine
- Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique et Evolution, Île-de-France, Orsay, France; Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Sud, Université Paris-Saclay, Île-de-France, Gif-sur-Yvette, France; Laboratory of Optics and Biosciences, CNRS-INSERM-Ecole Polytechnique, Île-de-France, Palaiseau, France
| | - Adrien Le Meur
- Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique et Evolution, Île-de-France, Orsay, France; Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Sud, Université Paris-Saclay, Île-de-France, Gif-sur-Yvette, France
| | - Fiona Hak
- Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique et Evolution, Île-de-France, Orsay, France; Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Sud, Université Paris-Saclay, Île-de-France, Gif-sur-Yvette, France
| | - Alexia Barbry
- CIRI - Centre International de Recherche en Infectiologie, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon-1, Rhône-Alpes, Lyon, France; Hospices Civils de Lyon, Institut des Agents Infectieux, Laboratoire de bactériologie, Rhône-Alpes, Lyon, France
| | - Emilie Westeel
- Fondation Mérieux, Emerging Pathogens Laboratory, Rhône-Alpes, Lyon, France
| | - Jean-Luc Berland
- Fondation Mérieux, Emerging Pathogens Laboratory, Rhône-Alpes, Lyon, France
| | - Astrid Engelmann
- Centre Hospitalier Fleyriat, Rhône-Alpes, Bourg-en-Bresse, France
| | - Isabelle Verdier
- Centre Hospitalier Fleyriat, Rhône-Alpes, Bourg-en-Bresse, France
| | - Gérard Lina
- CIRI - Centre International de Recherche en Infectiologie, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon-1, Rhône-Alpes, Lyon, France; Hospices Civils de Lyon, Institut des Agents Infectieux, Laboratoire de bactériologie, Rhône-Alpes, Lyon, France; Université Lyon 1, Facultés de Médecine et de Pharmacie de Lyon, Rhône-Alpes, Lyon, France
| | - Florence Ader
- CIRI - Centre International de Recherche en Infectiologie, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon-1, Rhône-Alpes, Lyon, France; Hospices Civils de Lyon, Service des Maladies infectieuses et tropicales, Rhône-Alpes, Lyon, France
| | - Stéphane Dray
- Biometrics and Evolutionary Biology Laboratory, CNRS UMR 5558, Université Lyon 1, Rhône-Alpes, Villeurbanne, France
| | - Laurent Jacob
- Biometrics and Evolutionary Biology Laboratory, CNRS UMR 5558, Université Lyon 1, Rhône-Alpes, Villeurbanne, France
| | - François Massol
- UMR 8198 Evo-Eco-Paleo, SPICI Group, University of Lille, Hauts-de-France, Lille, France; CNRS, CHU Lille, Institut Pasteur de Lille, U1019-UMR 9017-CIIL-Center for Infection and Immunity of Lille, University of Lille, Hauts-de-France, Lille, France
| | - Samuel Venner
- Biometrics and Evolutionary Biology Laboratory, CNRS UMR 5558, Université Lyon 1, Rhône-Alpes, Villeurbanne, France
| | - Oana Dumitrescu
- CIRI - Centre International de Recherche en Infectiologie, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon-1, Rhône-Alpes, Lyon, France; Hospices Civils de Lyon, Institut des Agents Infectieux, Laboratoire de bactériologie, Rhône-Alpes, Lyon, France; Université Lyon 1, Facultés de Médecine et de Pharmacie de Lyon, Rhône-Alpes, Lyon, France
| | | |
|
43
|
Czech L, Exposito-Alonso M. grenepipe: a flexible, scalable and reproducible pipeline to automate variant calling from sequence reads. Bioinformatics 2022; 38:4809-4811. [PMID: 36053180 PMCID: PMC10424805 DOI: 10.1093/bioinformatics/btac600] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2022] [Revised: 07/27/2022] [Accepted: 09/05/2022] [Indexed: 11/14/2022]
Abstract
SUMMARY We developed grenepipe, an all-in-one Snakemake workflow to streamline the data processing from raw high-throughput sequencing data of individuals or populations to genotype variant calls. Our pipeline offers a range of popular software tools within a single configuration file, automatically installs software dependencies, is highly optimized for scalability in cluster environments and runs with a single command. AVAILABILITY AND IMPLEMENTATION grenepipe is published under the GPLv3 and freely available at github.com/moiexpositoalonsolab/grenepipe.
Affiliation(s)
- Lucas Czech
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
| | - Moises Exposito-Alonso
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
- Department of Global Ecology, Carnegie Institution for Science, Stanford, CA 94305, USA
- Department of Biology, Stanford University, Stanford, CA 94305, USA
| |
|
44
|
Neoantigens in precision cancer immunotherapy: from identification to clinical applications. Chin Med J (Engl) 2022; 135:1285-1298. [PMID: 35838545 PMCID: PMC9433083 DOI: 10.1097/cm9.0000000000002181] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Abstract
Immunotherapies targeting cancer neoantigens are safe, effective, and precise. Neoantigens can be identified mainly by genomic techniques such as next-generation sequencing and high-throughput single-cell sequencing; proteomic techniques such as mass spectrometry; and bioinformatics tools based on high-throughput sequencing data, mass spectrometry data, and biological databases. Neoantigen-related therapies are widely used in clinical practice and include neoantigen vaccines, neoantigen-specific CD8+ and CD4+ T cells, and neoantigen-pulsed dendritic cells. In addition, neoantigens can be used as biomarkers to assess immunotherapy response, resistance, and prognosis. Therapies based on neoantigens are an important and promising branch of cancer immunotherapy. Unremitting efforts are needed to unravel the comprehensive role of neoantigens in anti-tumor immunity and to extend their clinical application. This review aimed to summarize the progress in neoantigen research and to discuss its opportunities and challenges in precision cancer immunotherapy.
|
45
|
Long Q, Yuan Y, Li M. RNA-SSNV: A Reliable Somatic Single Nucleotide Variant Identification Framework for Bulk RNA-Seq Data. Front Genet 2022; 13:865313. [PMID: 35846154 PMCID: PMC9279659 DOI: 10.3389/fgene.2022.865313] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022]
Abstract
The use of expressed somatic mutations may have a unique advantage in identifying active cancer driver mutations. However, accurately calling mutations from RNA-seq data is difficult due to confounding factors such as RNA editing, reverse transcription, and gap alignment. In the present study, we propose a framework (named RNA-SSNV, https://github.com/pmglab/RNA-SSNV) to call somatic single nucleotide variants (SSNV) from tumor bulk RNA-seq data. Based on a comprehensive multi-filtering strategy and a machine-learning classification model trained with comprehensively curated features, RNA-SSNV achieved the best precision–recall rate (0.880–0.884) in a testing dataset and robustly retained 0.94 AUC for the precision–recall curve in three validation adult-based TCGA (The Cancer Genome Atlas) datasets. We further showed that the somatic mutations called by RNA-SSNV tended to have a higher functional impact and therapeutic power in known driver genes. Furthermore, VAF (variant allele fraction) analysis revealed that subclones harboring expressed mutations had an evolutionary selective advantage and that RNA had higher detection power to rescue mutations missed in DNA. In sum, RNA-SSNV will be a useful approach to accurately call expressed somatic mutations for a more insightful analysis of cancer driver genes and carcinogenic mechanisms.
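As a brief illustration of the VAF quantity mentioned in this abstract: variant allele fraction is conventionally the share of reads at a site that support the variant allele. The sketch below assumes only two read counts per site; the function name and the example counts are hypothetical and are not taken from RNA-SSNV.

```python
def variant_allele_fraction(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads at a site that carry the variant (alt) allele.

    Assumption: depth is simply alt + ref reads; real pipelines may also
    filter by base or mapping quality before counting.
    """
    depth = alt_reads + ref_reads
    if depth == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / depth


# Hypothetical site: 12 reads support the variant, 48 support the reference.
vaf = variant_allele_fraction(alt_reads=12, ref_reads=48)
print(f"VAF = {vaf:.2f}")
```

A low VAF at a well-covered site is one common hint that a variant is subclonal rather than present in every cell.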
Affiliation(s)
- Qihan Long
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
| | - Yangyang Yuan
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
| | - Miaoxin Li
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
- Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangzhou, China
- *Correspondence: Miaoxin Li,
| |
|
46
|
Zhang L, Zhou X, Sha H, Xie L, Liu B. Recent Progress on Therapeutic Vaccines for Breast Cancer. Front Oncol 2022; 12:905832. [PMID: 35734599 PMCID: PMC9207208 DOI: 10.3389/fonc.2022.905832] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/27/2022] [Accepted: 05/11/2022] [Indexed: 11/13/2022]
Abstract
Breast cancer remains the most frequently diagnosed malignancy worldwide. Advanced breast cancer is still an incurable disease, mainly because of its heterogeneity and limited immunogenicity. The great success of cancer immunotherapy is paving the way for a new era in cancer treatment, and therapeutic cancer vaccination is an area of interest. Vaccine targets include tumor-associated antigens and tumor-specific antigens. Immune responses differ across vaccine delivery platforms. Next-generation sequencing technologies and computational analysis have recently made personalized vaccination possible. However, only a few cases benefiting from neoantigen-based treatment have been reported in breast cancer, and more attention has been given to overexpressed antigen-based treatment, especially human epidermal growth factor receptor 2-derived peptide vaccines. Here, we discuss recent advancements in therapeutic vaccines for breast cancer and highlight near-term opportunities for moving forward.
Affiliation(s)
- Lianru Zhang
- The Comprehensive Cancer Centre of Drum Tower Hospital, Medical School of Nanjing University & Clinical Cancer Institute of Nanjing University, Nanjing, China
| | - Xipeng Zhou
- Department of Oncology, Yizheng People's Hospital, Yangzhou, China
| | - Huizi Sha
- The Comprehensive Cancer Centre of Drum Tower Hospital, Medical School of Nanjing University & Clinical Cancer Institute of Nanjing University, Nanjing, China
| | - Li Xie
- The Comprehensive Cancer Centre of Drum Tower Hospital, Medical School of Nanjing University & Clinical Cancer Institute of Nanjing University, Nanjing, China
| | - Baorui Liu
- The Comprehensive Cancer Centre of Drum Tower Hospital, Medical School of Nanjing University & Clinical Cancer Institute of Nanjing University, Nanjing, China
| |
|
47
|
Quazi S. Artificial intelligence and machine learning in precision and genomic medicine. Med Oncol 2022; 39:120. [PMID: 35704152 PMCID: PMC9198206 DOI: 10.1007/s12032-022-01711-1] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 03/06/2022] [Accepted: 03/14/2022] [Indexed: 10/28/2022]
Abstract
The advancement of precision medicine in medical care has left behind the conventional symptom-driven treatment process by allowing early risk prediction of disease through improved diagnostics and customization of more effective treatments. To take the most appropriate path toward precision medicine, it is necessary to scrutinize overall patient data alongside broad factors in order to observe and differentiate between ill and relatively healthy people, resulting in an improved view of biological indicators that can signal health changes. Precision and genomic medicine combined with artificial intelligence have the potential to improve patient healthcare. Patients with less common therapeutic responses or unique healthcare demands are using genomic medicine technologies. AI provides insights through advanced computation and inference, enabling the system to reason and learn while enhancing physician decision making. Many cell characteristics, including gene up-regulation, proteins binding to nucleic acids, and splicing, can be measured at high throughput and used as training objectives for predictive models. Researchers can create a new era of effective genomic medicine with the improved availability of a broad range of datasets and modern computer techniques such as machine learning. This review article elucidates the contributions of ML algorithms in precision and genomic medicine.
Affiliation(s)
- Sameer Quazi
- GenLab Biosolutions Private Limited, Bangalore, Karnataka, 560043, India.
- Department of Biomedical Sciences, School of Life Sciences, Anglia Ruskin University, Cambridge, UK.
| |
|
49
|
Dodani DD, Nguyen MH, Morin RD, Marra MA, Corbett RD. Combinatorial and Machine Learning Approaches for Improved Somatic Variant Calling From Formalin-Fixed Paraffin-Embedded Genome Sequence Data. Front Genet 2022; 13:834764. [PMID: 35571031 PMCID: PMC9092826 DOI: 10.3389/fgene.2022.834764] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2021] [Accepted: 03/18/2022] [Indexed: 11/13/2022]
Abstract
Formalin fixation of paraffin-embedded tissue samples is a well-established method for preserving tissue and is routinely used in clinical settings. Although formalin-fixed, paraffin-embedded (FFPE) tissues are deemed crucial for research and clinical applications, the fixation process results in molecular damage to nucleic acids, thus confounding their use in genome sequence analysis. Methods to improve genomic data quality from FFPE tissues have emerged, but there remains significant room for improvement. Here, we use whole-genome sequencing (WGS) data from matched Fresh Frozen (FF) and FFPE tissue samples to optimize a sensitive and precise FFPE single nucleotide variant (SNV) calling approach. We present methods to reduce the prevalence of false-positive SNVs by applying combinatorial techniques to five publicly available variant callers. We also introduce FFPolish, a novel variant classification method that efficiently classifies FFPE-specific false-positive variants. Our combinatorial and statistical techniques improve precision and F1 scores compared to the results of publicly available tools when tested individually.
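The precision and F1 scores reported in this abstract can be illustrated with a minimal sketch. It assumes that calls from the matched Fresh Frozen sample serve as the truth set and that each variant is reduced to a hashable key; the function name and the toy variant keys are hypothetical and do not come from FFPolish.

```python
def precision_recall_f1(called: set, truth: set) -> tuple:
    """Score a call set against a truth set of variant keys."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical variant keys ("chrom:pos:ref>alt"); not real data.
truth_ff = {"1:100:A>T", "2:200:G>C", "3:300:C>A", "4:400:T>G"}
called_ffpe = {"1:100:A>T", "2:200:G>C", "3:300:C>A", "5:500:C>T", "6:600:G>A"}

p, r, f1 = precision_recall_f1(called_ffpe, truth_ff)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Removing FFPE-specific false positives lowers the false-positive count, which raises precision and, as long as true calls are kept, F1 as well.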
Affiliation(s)
- Dollina D Dodani
- The Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Matthew H Nguyen
- The Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Ryan D Morin
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada; Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
| | - Marco A Marra
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Richard D Corbett
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Provincial Health Services Authority, Vancouver, BC, Canada
| |
|
50
|
Garcia-Prieto CA, Martínez-Jiménez F, Valencia A, Porta-Pardo E. Detection of oncogenic and clinically actionable mutations in cancer genomes critically depends on variant calling tools. Bioinformatics 2022; 38:3181-3191. [PMID: 35512388 PMCID: PMC9191211 DOI: 10.1093/bioinformatics/btac306] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/03/2021] [Revised: 02/09/2022] [Accepted: 05/01/2022] [Indexed: 11/22/2022]
Abstract
Motivation: The analysis of cancer genomes provides fundamental information about their etiology, the processes driving cell transformation, and potential treatments. While researchers and clinicians are often only interested in the identification of oncogenic mutations, actionable variants or mutational signatures, the first crucial step in the analysis of any tumor genome is the identification of somatic variants in cancer cells (i.e. those that have been acquired during their evolution). For that purpose, a wide range of computational tools have been developed in recent years to detect somatic mutations in sequencing data from tumor samples. While there have been some efforts to benchmark somatic variant calling tools and strategies, the extent to which variant calling decisions impact the results of downstream analyses of tumor genomes remains unknown. Results: Here, we quantify the impact of variant calling decisions by comparing the results obtained in three important analyses of cancer genomics data (identification of cancer driver genes, quantification of mutational signatures and detection of clinically actionable variants) when changing the somatic variant caller (MuSE, MuTect2, SomaticSniper and VarScan2) or the strategy to combine them (Consensus of two, Consensus of three and Union) across all 33 cancer types from The Cancer Genome Atlas. Our results show that variant calling decisions have a significant impact on these analyses, creating important differences that could even impact treatment decisions for some patients. Moreover, the Consensus of three calling strategy to combine the output of multiple variant calling tools, a strategy very widely used by the research community, can lead to the loss of some cancer driver genes and actionable mutations.
Overall, our results highlight the limitations of widespread practices within the cancer genomics community and point to important differences in critical analyses of tumor sequencing data depending on variant calling, affecting even the identification of clinically actionable variants. Availability and implementation: Code is available at https://github.com/carlosgarciaprieto/VariantCallingClinicalBenchmark. Supplementary information: Supplementary data are available at Bioinformatics online.
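The combination strategies named in this abstract (Union, Consensus of two, Consensus of three) can be sketched as a simple vote over per-caller call sets. This is a minimal illustration, not the authors' code: it assumes "Consensus of k" means a variant reported by at least k of the four callers, and the call sets below are invented.

```python
from collections import Counter

# Hypothetical call sets: caller name -> set of (chrom, pos, ref, alt) variants.
calls = {
    "MuSE": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "MuTect2": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
                ("chr3", 300, "C", "A")},
    "SomaticSniper": {("chr1", 100, "A", "T")},
    "VarScan2": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "A"),
                 ("chr4", 400, "T", "G")},
}


def combine(calls: dict, min_callers: int) -> set:
    """Keep variants reported by at least `min_callers` callers."""
    support = Counter(v for variants in calls.values() for v in variants)
    return {v for v, n in support.items() if n >= min_callers}


union = combine(calls, 1)         # any caller
consensus_two = combine(calls, 2)
consensus_three = combine(calls, 3)

print(len(union), len(consensus_two), len(consensus_three))
```

Raising the threshold shrinks the call set, which is how a stricter consensus can drop variants that only one or two callers detect, including potentially actionable ones.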
Affiliation(s)
- Carlos A Garcia-Prieto
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain; Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Francisco Martínez-Jiménez
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Alfonso Valencia
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Eduard Porta-Pardo
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain; Barcelona Supercomputing Center (BSC), Barcelona, Spain
| |
|