1
Vestergaard L, Lopacinska‐Jørgensen J, Høgdall E. CANVAR: A Tool for Clinical Annotation of Variants Using ClinVar Databases. Mol Genet Genomic Med 2024; 12:e70020. [PMID: 39382066 PMCID: PMC11462301 DOI: 10.1002/mgg3.70020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 08/20/2024] [Accepted: 09/25/2024] [Indexed: 10/10/2024]
Abstract
BACKGROUND: Genomic medicine has transformed clinical genetics by using high-throughput sequencing technologies to analyze genetic variants associated with diseases. Accurate variant classification is crucial for diagnosis and treatment decisions, and various tools often used in clinical settings, such as the Ion Reporter Software and the Illumina Nirvana Software, draw on the ClinVar database/archive to aid variant interpretation. However, these existing annotation tools may lack access to the latest ClinVar data, necessitating manual variant inspection. AIMS: To address this gap by developing a tool that provides the latest ClinVar data for variant annotation in clinical and research settings. MATERIALS AND METHODS: We introduce CANVAR, a Python-based script that efficiently annotates variants identified by next-generation sequencing in a clinical or research context, offering comprehensive information from the latest ClinVar database. RESULTS: CANVAR provides accurate, up-to-date variant annotations, streamlining variant analysis. DISCUSSION: The rise in genomic data requires accurate variant annotation for clinical decision-making. Misclassification poses risks, and current tools may not always access the latest data, which complicates variant interpretation. CONCLUSION: CANVAR enhances variant annotation by offering comprehensive information from the latest ClinVar database for genetic variants identified through next-generation sequencing.
Affiliation(s)
- Lau K. Vestergaard
- Molecular Unit, Department of Pathology, Herlev Hospital, University of Copenhagen, Herlev, Denmark
- Estrid V. Høgdall
- Molecular Unit, Department of Pathology, Herlev Hospital, University of Copenhagen, Herlev, Denmark
2
Yu H, Jiang L, Li CI, Ness S, Piccirillo SGM, Guo Y. Somatic mutation effects diffused over microRNA dysregulation. Bioinformatics 2023; 39:btad520. [PMID: 37624931 PMCID: PMC10474951 DOI: 10.1093/bioinformatics/btad520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/02/2023] [Revised: 07/14/2023] [Accepted: 08/23/2023] [Indexed: 08/27/2023]
Abstract
MOTIVATION: As an important player in transcriptome regulation, microRNAs may effectively diffuse somatic mutation impacts to broad cellular processes and ultimately manifest disease and dictate prognosis. Previous studies that tried to correlate mutation with gene expression dysregulation neglected to adjust for the disparate multitudes of false positives associated with unequal sample sizes and uneven class balancing scenarios. RESULTS: To properly address this issue, we developed a statistical framework to rigorously assess the extent of mutation impact on microRNAs in relation to a permutation-based null distribution of a matching sample structure. Carrying out the framework in a pan-cancer study, we ascertained 9008 protein-coding genes with statistically significant mutation impacts on miRNAs. Of these, the collective miRNA expression for 83 genes showed significant prognostic power in nine cancer types. For example, in lower-grade glioma, 10 genes' mutations broadly impacted miRNAs, all of which showed prognostic value with the corresponding miRNA expression. Our framework was further validated with functional analysis and augmented with rich features including the ability to analyze miRNA isoforms; aggregative prognostic analysis; advanced annotations such as mutation type, regulator alteration, somatic motif, and disease association; and instructive visualization such as mutation OncoPrint, Ideogram, and interactive mRNA-miRNA network. AVAILABILITY AND IMPLEMENTATION: The data underlying this article are available in MutMix, at http://innovebioinfo.com/Database/TmiEx/MutMix.php.
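The idea of judging a mutation effect against a permutation-based null distribution with a matching sample structure can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `permutation_p_value`, the difference-in-means statistic, and the toy expression values are all assumptions made for illustration. Mutation labels are shuffled while the number of mutated samples stays fixed, so each null draw has the same sample size and class balance as the observed comparison.

```python
import random
from statistics import mean


def permutation_p_value(expr, mutated, n_perm=2000, seed=0):
    """Empirical p-value for |mean(mutated) - mean(wild-type)|.

    Hypothetical helper: expr is a list of miRNA expression values,
    mutated is a list of booleans of the same length, and both groups
    must be non-empty.
    """
    rng = random.Random(seed)

    def stat(labels):
        mut = [e for e, m in zip(expr, labels) if m]
        wt = [e for e, m in zip(expr, labels) if not m]
        return abs(mean(mut) - mean(wt))

    observed = stat(mutated)
    labels = list(mutated)
    hits = 0
    for _ in range(n_perm):
        # Shuffling keeps the mutated/wild-type counts unchanged,
        # so the null matches the observed sample structure.
        rng.shuffle(labels)
        if stat(labels) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (assumed): four wild-type and four mutated samples.
toy_expr = [5.1, 5.3, 4.9, 5.0, 8.2, 8.0, 7.9, 8.4]
toy_mut = [False, False, False, False, True, True, True, True]
toy_p = permutation_p_value(toy_expr, toy_mut)
```

Because the null is rebuilt for each gene with its own group sizes, a gene mutated in few samples is compared against a correspondingly wide null rather than a fixed threshold.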
Affiliation(s)
- Hui Yu
- Department of Public Health, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, U.S.A
- Limin Jiang
- Department of Public Health, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, U.S.A
- Chung-I Li
- Department of Statistics, National Cheng Kung University, Tainan 701401, Taiwan
- Scott Ness
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, United States
- Sara G M Piccirillo
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, United States
- Yan Guo
- Department of Public Health, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, U.S.A
3
Vihinen M. Systematic errors in annotations of truncations, loss-of-function and synonymous variants. Front Genet 2023; 14:1015017. [PMID: 36713076 PMCID: PMC9880313 DOI: 10.3389/fgene.2023.1015017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 01/03/2023] [Indexed: 01/15/2023]
Abstract
Description of genetic phenomena and variations requires exact language and concepts. Vast amounts of variation data are produced with next-generation sequencing pipelines. The obtained variations are automatically annotated, e.g., for their functional consequences. These tools and pipelines, along with systematic nomenclature, mainly work well, but there are still some problems in nomenclature, organization of some databases, misuse of concepts and certain practices. Therefore, systematic errors prevent correct annotation and often preclude further analysis of certain variation types. Problems and solutions are described for presumed protein truncations, variants that are claimed to be of loss-of-function based on the type of variation, and synonymous variants that are not synonymous and lead to sequence changes or to missing protein.
4
Halim-Fikri H, Syed-Hassan SNRK, Wan-Juhari WK, Assyuhada MGSN, Hernaningsih Y, Yusoff NM, Merican AF, Zilfalil BA. Central resources of variant discovery and annotation and its role in precision medicine. ASIAN BIOMED 2022; 16:285-298. [PMID: 37551357 PMCID: PMC10392146 DOI: 10.2478/abm-2022-0032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2023]
Abstract
Rapid technological advancement in high-throughput genomics, microarray, and deep sequencing technologies has made more complex precision medicine research possible, using large amounts of heterogeneous health-related data from patients, including genomic variants. Genomic variants can be identified and annotated against the reference human genome, either within the sequence as a whole or within a putative functional genomic element. The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly created standards and guidelines for the evaluation of evidence to increase consistency and transparency in clinical variant interpretation. Efforts toward precision medicine have been facilitated by many national and international public databases that classify and annotate genomic variation. In the present study, several resources for the identification and sharing of data on clinically important genetic variations are highlighted.
Affiliation(s)
- Hashim Halim-Fikri
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Wan-Khairunnisa Wan-Juhari
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Human Genome Centre, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Mat Ghani Siti Nor Assyuhada
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Yetti Hernaningsih
- Department of Clinical Pathology, Faculty of Medicine Universitas Airlangga, Dr. Soetomo Academic General Hospital, Surabaya, Indonesia
- Narazah Mohd Yusoff
- Department of Clinical Pathology, Faculty of Medicine Universitas Airlangga, Dr. Soetomo Academic General Hospital, Surabaya, Indonesia
- Clinical Diagnostic Laboratory, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang 13200, Malaysia
- Amir Feisal Merican
- Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur 50603, Malaysia
- Center of Research for Computational Sciences and Informatics in Biology, Bio Industry, Environment, Agriculture and Healthcare (CRYSTAL), University of Malaya, Kuala Lumpur 50603, Malaysia
- Bin Alwi Zilfalil
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Human Genome Centre, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
5
Xi E, Bai J, Zhang K, Yu H, Guo Y. Genomic variants disrupt miRNA-mRNA regulation. Chem Biodivers 2022; 19:e202200623. [PMID: 35985010 DOI: 10.1002/cbdv.202200623] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/29/2022] [Accepted: 08/17/2022] [Indexed: 11/09/2022]
Abstract
MicroRNA (miRNA) and its regulatory effect on messenger RNA (mRNA) gene expression are a major focus in cancer research. Disruption of the normal miRNA-mRNA regulation network can result in serious cascading biological repercussions. In this study, we curated miRNA-related variants from major genomic consortia and thoroughly evaluated how these variants could exert their effects by cross-validating with independent functional knowledge bases. Nearly all known variants (more than 664 million), categorized by type (germline, somatic, epigenetic), were mapped to the genomic regions involved in miRNA-mRNA binding (miRNA seeds and miRNA-mRNA 3'-UTR binding sequence). Subsets of miRNA-related variants supported by additional functional evidence, such as expression Quantitative Trait Loci (eQTL) and Genome-Wide Association Study (GWAS), were identified and scrutinized. Our results show that variants in miRNA seeds can substantially alter the composition of an miRNA's target mRNA set. Various functional analyses converged to reveal a complex post-transcriptional regulatory network in which miRNA, eQTL, and RNA-binding proteins intertwine to disseminate the impact of genomic variants. These results may explain how certain variants affect disease/trait risks in genome-wide association studies.
Affiliation(s)
- Ellie Xi
- University of New Mexico - Albuquerque: The University of New Mexico, Internal Medicine, 100A Cancer Research Facility, 87131, Albuquerque, United States
- Judy Bai
- University of New Mexico - Albuquerque: The University of New Mexico, Internal Medicine, 100A Cancer Research Facility, 87131, Albuquerque, United States
- Klaira Zhang
- University of New Mexico - Albuquerque: The University of New Mexico, Internal Medicine, 100A Cancer Research Facility, 87131, Albuquerque, United States
- Hui Yu
- University of New Mexico - Albuquerque: The University of New Mexico, Internal Medicine, 100A Cancer Research Facility, Albuquerque, United States
- Yan Guo
- University of New Mexico, Cancer Research Facility 100A, 87131, Albuquerque, United States
6
Tuteja S, Kadri S, Yap KL. A performance evaluation study: Variant annotation tools - The enigma of clinical next generation sequencing (NGS) based genetic testing. J Pathol Inform 2022; 13:100130. [PMID: 36268089 PMCID: PMC9577137 DOI: 10.1016/j.jpi.2022.100130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Revised: 07/25/2022] [Accepted: 07/25/2022] [Indexed: 12/03/2022]
Abstract
Dramatically expanding our ability for clinical genetic testing for inherited conditions and complex diseases such as cancer, next generation sequencing (NGS) technologies are allowing for rapid interrogation of thousands of genes and identification of millions of variants. Variant annotation, the process of assigning functional information to DNA variants based on the standardized Human Genome Variation Society (HGVS) nomenclature, is a fundamental challenge in the analysis of NGS data that has led to the development of many bioinformatic algorithms. In this study, we evaluated the performance of 3 variant annotation tools: Alamut® Batch, Ensembl Variant Effect Predictor (VEP), and ANNOVAR, benchmarked by a manually curated ground-truth set of 298 variants from the medical exome database at the Molecular Diagnostics Laboratory at Lurie Children's Hospital. Of the 3 tools, VEP produces the most accurate variant annotations (HGVS nomenclature for 297 of the 298 variants) due to usage of updated gene transcript versions within the algorithm. Alamut® Batch called 296 of the 298 variants correctly; strikingly, ANNOVAR exhibited the greatest number of discrepancies (20 of the 298 variants, 93.3% concordance with ground-truth set). Adoption of validated methods of variant annotation is critical in post-analytical phases of clinical testing.
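The concordance figure quoted for ANNOVAR follows directly from the counts in the abstract, as the short check below shows. The function name `concordance_percent` is hypothetical, and the percentages for VEP and Alamut® Batch are derived here from the stated counts rather than quoted from the abstract.

```python
def concordance_percent(total, discrepancies):
    """Percentage of variants annotated in agreement with the ground truth."""
    return 100.0 * (total - discrepancies) / total


TOTAL_VARIANTS = 298  # size of the curated ground-truth set

# Discrepancy counts implied by the abstract: ANNOVAR 20,
# Alamut Batch 298 - 296 = 2, VEP 298 - 297 = 1.
annovar = concordance_percent(TOTAL_VARIANTS, 20)
alamut = concordance_percent(TOTAL_VARIANTS, 2)
vep = concordance_percent(TOTAL_VARIANTS, 1)

print(f"ANNOVAR: {annovar:.1f}%")       # 93.3%, as stated in the abstract
print(f"Alamut Batch: {alamut:.1f}%")   # 99.3% (derived)
print(f"VEP: {vep:.1f}%")               # 99.7% (derived)
```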
Affiliation(s)
- Sachleen Tuteja
- Illinois Mathematics and Science Academy, 1500 Sullivan Road, Aurora, IL 60506, USA
- Sabah Kadri
- Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, 225 E. Chicago Ave, Chicago, IL 60611, USA
- Department of Pathology, Northwestern University Feinberg School of Medicine, 420 E. Superior St, Chicago, IL 60611, USA
- Kai Lee Yap
- Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, 225 E. Chicago Ave, Chicago, IL 60611, USA
- Department of Pathology, Northwestern University Feinberg School of Medicine, 420 E. Superior St, Chicago, IL 60611, USA
- Corresponding author at: Molecular Diagnostics, Department of Pathology & Laboratory Medicine, Ann & Robert H. Lurie Children's Hospital of Chicago, Northwestern Feinberg School of Medicine, 225 E. Chicago Ave, Box 82, Chicago, IL 60611, USA.