1. Malcikova J, Pavlova S, Baliakas P, Chatzikonstantinou T, Tausch E, Catherwood M, Rossi D, Soussi T, Tichy B, Kater AP, Niemann CU, Davi F, Gaidano G, Stilgenbauer S, Rosenquist R, Stamatopoulos K, Ghia P, Pospisilova S. ERIC recommendations for TP53 mutation analysis in chronic lymphocytic leukemia-2024 update. Leukemia 2024:10.1038/s41375-024-02267-x. [PMID: 38755420 DOI: 10.1038/s41375-024-02267-x] [Received: 01/04/2024] [Revised: 04/24/2024] [Accepted: 04/25/2024] [Indexed: 05/18/2024]
Abstract
In chronic lymphocytic leukemia (CLL), analysis of TP53 aberrations (deletion and/or mutation) is a crucial part of treatment decision-making algorithms. Technological and treatment advances have resulted in the need for an update of the last recommendations for TP53 analysis in CLL, published by ERIC, the European Research Initiative on CLL, in 2018. Based on the current knowledge of the relevance of low-burden TP53-mutated clones, a specific variant allele frequency (VAF) cut-off for reporting TP53 mutations is no longer recommended, but instead, the need for thorough method validation by the reporting laboratory is emphasized. The result of TP53 analyses should always be interpreted within the context of available laboratory and clinical information, treatment indication, and therapeutic options. Methodological aspects of introducing next-generation sequencing (NGS) in routine practice are discussed with a focus on reliable detection of low-burden clones. Furthermore, potential interpretation challenges are presented, and a simplified algorithm for the classification of TP53 variants in CLL is provided, representing a consensus based on previously published guidelines. Finally, the reporting requirements are highlighted, including a template for clinical reports of TP53 aberrations. These recommendations are intended to assist diagnosticians in the correct assessment of TP53 mutation status, but also physicians in the appropriate understanding of the lab reports, thus decreasing the risk of misinterpretation and incorrect management of patients in routine practice whilst also leading to improved stratification of patients with CLL in clinical trials.
Affiliation(s)
- Jitka Malcikova
- Department of Internal Medicine, Hematology and Oncology, and Institute of Medical Genetics and Genomics, University Hospital Brno and Medical Faculty, Masaryk University, Brno, Czech Republic
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Sarka Pavlova
- Department of Internal Medicine, Hematology and Oncology, and Institute of Medical Genetics and Genomics, University Hospital Brno and Medical Faculty, Masaryk University, Brno, Czech Republic
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Panagiotis Baliakas
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Eugen Tausch
- Division of CLL, Department of Internal Medicine III, Ulm University, Ulm, Germany
- Mark Catherwood
- Haematology Department, Belfast Health and Social Care Trust, Belfast, United Kingdom
- Davide Rossi
- Hematology, Oncology Institute of Southern Switzerland and Institute of Oncology Research, Università della Svizzera Italiana, Bellinzona, Switzerland
- Thierry Soussi
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Hematopoietic and Leukemic Development, UMRS_938, Sorbonne University, Paris, France
- Boris Tichy
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Arnon P Kater
- Department of Hematology, Cancer Center Amsterdam, Amsterdam University Medical Centers, Amsterdam, the Netherlands
- Frederic Davi
- Sorbonne Université, Paris, France
- Department of Hematology, Hôpital Pitié-Salpêtrière, AP-HP, Paris, France
- Gianluca Gaidano
- Division of Haematology, Department of Translational Medicine, University of Eastern Piedmont, Novara, Italy
- Stephan Stilgenbauer
- Division of CLL, Department of Internal Medicine III, Ulm University, Ulm, Germany
- Richard Rosenquist
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Clinical Genetics and Genomics, Karolinska University Hospital, Stockholm, Sweden
- Kostas Stamatopoulos
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece
- Paolo Ghia
- Università Vita-Salute San Raffaele, Milan, Italy.
- Strategic Research Program on CLL, Division of Experimental Oncology, IRCCS Ospedale San Raffaele, Milan, Italy.
- Sarka Pospisilova
- Department of Internal Medicine, Hematology and Oncology, and Institute of Medical Genetics and Genomics, University Hospital Brno and Medical Faculty, Masaryk University, Brno, Czech Republic.
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic.
2. Minkin I, Salzberg SL. Conservation assessment of human splice site annotation based on a 470-genome alignment. bioRxiv 2024:2023.12.01.569581. [PMID: 38076842 PMCID: PMC10705407 DOI: 10.1101/2023.12.01.569581] [Indexed: 12/17/2023]
Abstract
Despite many improvements over the years, the annotation of the human genome remains imperfect, and different annotations of the human reference genome sometimes contradict one another. The use of evolutionarily conserved sequences provides a strategy for selecting a high-confidence subset of the annotation that is more likely to be related to biological functions, and the rapidly growing number of genomes from other species increases its power. Using the latest whole genome alignment, we found that splice sites from protein-coding genes in the high-quality MANE annotation are consistently conserved across more than 400 species. We also studied splice sites from the RefSeq, GENCODE, and CHESS databases that are not present in MANE. We trained a logistic regression classifier to distinguish between the conservation exhibited by sites from MANE versus sites chosen randomly from neutrally evolving sequence. We found that splice sites classified by our model as conserved have lower SNP rates and better transcriptomic support. We then computed a subset of transcripts only using either "conserved" splice sites or ones from MANE. This subset is enriched in high-confidence transcripts of the major gene catalogs that appear to be under purifying selection and are more likely to be correct and functionally relevant.
3. Hoh C, Salzberg SL. Discovering Intron Gain Events in Humans through Large-Scale Evolutionary Comparisons. bioRxiv 2024:2024.05.02.592247. [PMID: 38746259 PMCID: PMC11092651 DOI: 10.1101/2024.05.02.592247] [Indexed: 05/16/2024]
Abstract
The rapid growth in the number of sequenced genomes makes it possible to search for the appearance of entirely new introns in the human lineage. In this study, we compared the genomic sequences for 19,120 human protein-coding genes to a collection of 3493 vertebrate genomes, mapping the patterns of intron alignments onto a phylogenetic tree. This mapping allowed us to trace many intron gain events to precise locations in the tree, corresponding to distinct points in evolutionary history. We discovered 584 intron gain events, all of them relatively recent, in 514 distinct human genes. Among these events, we explored the hypothesis that intronization was the mechanism responsible for intron gain. Intronization events were identified by locating instances where human introns correspond to exonic sequences in homologous vertebrate genes. Although apparently rare, we found three compelling cases of intronization, and for each of those we compared the human protein sequence and structure to homologous genes that lack the introns.
4. Ryu J, Barkal S, Yu T, Jankowiak M, Zhou Y, Francoeur M, Phan QV, Li Z, Tognon M, Brown L, Love MI, Bhat V, Lettre G, Ascher DB, Cassa CA, Sherwood RI, Pinello L. Joint genotypic and phenotypic outcome modeling improves base editing variant effect quantification. Nat Genet 2024; 56:925-937. [PMID: 38658794 DOI: 10.1038/s41588-024-01726-6] [Received: 09/07/2023] [Accepted: 03/21/2024] [Indexed: 04/26/2024]
Abstract
CRISPR base editing screens enable analysis of disease-associated variants at scale; however, variable efficiency and precision confounds the assessment of variant-induced phenotypes. Here, we provide an integrated experimental and computational pipeline that improves estimation of variant effects in base editing screens. We use a reporter construct to measure guide RNA (gRNA) editing outcomes alongside their phenotypic consequences and introduce base editor screen analysis with activity normalization (BEAN), a Bayesian network that uses per-guide editing outcomes provided by the reporter and target site chromatin accessibility to estimate variant impacts. BEAN outperforms existing tools in variant effect quantification. We use BEAN to pinpoint common regulatory variants that alter low-density lipoprotein (LDL) uptake, implicating previously unreported genes. Additionally, through saturation base editing of LDLR, we accurately quantify missense variant pathogenicity that is consistent with measurements in UK Biobank patients and identify underlying structural mechanisms. This work provides a widely applicable approach to improve the power of base editing screens for disease-associated variant characterization.
Affiliation(s)
- Jayoung Ryu
- Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Sam Barkal
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Tian Yu
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Martin Jankowiak
- Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Yunzhuo Zhou
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Matthew Francoeur
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Quang Vinh Phan
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Zhijian Li
- Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Manuel Tognon
- Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Computer Science Department, University of Verona, Verona, Italy
- Lara Brown
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Michael I Love
- Department of Genetics, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Vineel Bhat
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Guillaume Lettre
- Montreal Heart Institute, Montréal, Quebec, Canada
- Faculté de Médecine, Université de Montréal, Montréal, Quebec, Canada
- David B Ascher
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Christopher A Cassa
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Richard I Sherwood
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Luca Pinello
- Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA.
- Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Department of Pathology, Harvard Medical School, Boston, MA, USA.
5. McDevitt T, Durkie M, Arnold N, Burghel GJ, Butler S, Claes KBM, Logan P, Robinson R, Sheils K, Wolstenholme N, Hanson H, Turnbull C, Hume S. EMQN best practice guidelines for genetic testing in hereditary breast and ovarian cancer. Eur J Hum Genet 2024; 32:479-488. [PMID: 38443545 PMCID: PMC11061103 DOI: 10.1038/s41431-023-01507-5] [Received: 06/08/2023] [Revised: 11/07/2023] [Accepted: 11/21/2023] [Indexed: 03/07/2024] Open
Abstract
Hereditary Breast and Ovarian Cancer (HBOC) is a genetic condition associated with increased risk of cancers. The past decade has brought about significant changes to hereditary breast and ovarian cancer (HBOC) diagnostic testing with new treatments, testing methods and strategies, and evolving information on genetic associations. These best practice guidelines have been produced to assist clinical laboratories in effectively addressing the complexities of HBOC testing, while taking into account advancements since the last guidelines were published in 2007. These guidelines summarise cancer risk data from recent studies for the most commonly tested high and moderate risk HBOC genes for laboratories to refer to as a guide. Furthermore, recommendations are provided for somatic and germline testing services with regards to clinical referral, laboratory analyses, variant interpretation, and reporting. The guidelines present recommendations where 'must' is assigned to advocate that the recommendation is essential; and 'should' is assigned to advocate that the recommendation is highly advised but may not be universally applicable. Recommendations are presented in the form of shaded italicised statements throughout the document, and in the form of a table in supplementary materials (Table S4). Finally, for the purposes of encouraging standardisation and aiding implementation of recommendations, example report wording covering the essential points to be included is provided for the most common HBOC referral and reporting scenarios. These guidelines are aimed primarily at genomic scientists working in diagnostic testing laboratories.
Affiliation(s)
- Trudi McDevitt
- Department of Clinical Genetics, Children's Health Ireland at Crumlin, Dublin, Ireland.
- Miranda Durkie
- Sheffield Diagnostic Genetics Service, North East and Yorkshire Genomic Laboratory Hub, Sheffield Children's NHS Foundation Trust Western Bank, Sheffield, UK
- Norbert Arnold
- UKSH Campus Kiel, Gynecology and Obstetrics, Institut of Clinical Chemistry, Institut of Clinical Molecular Biology, Kiel, Germany
- George J Burghel
- Manchester University NHS Foundation Trust, North West Genomic Laboratory Hub, Manchester, UK
- Samantha Butler
- Central and South Genomic Laboratory Hub, West Midlands Regional Genetics Laboratory, Birmingham Women's and Children's NHS Foundation Trust, Birmingham, UK
- Peter Logan
- HSCNI / Belfast Trust Laboratories, Regional Molecular Diagnostics Service, Belfast, Northern Ireland
- Rachel Robinson
- Leeds Teaching Hospitals NHS Trust, Genetics Department, Leeds, UK
- Helen Hanson
- St George's University Hospitals NHS Foundation Trust, Clinical Genetics, London, UK
- Stacey Hume
- University of British Columbia, Pathology and Laboratory Medicine, Vancouver, British Columbia, Canada
6. Platton S, Baker P, Bowyer A, Keenan C, Lawrence C, Lester W, Riddell A, Sutherland M. Guideline for laboratory diagnosis and monitoring of von Willebrand disease: A joint guideline from the United Kingdom Haemophilia Centre Doctors' Organisation and the British Society for Haematology. Br J Haematol 2024; 204:1714-1731. [PMID: 38532595 DOI: 10.1111/bjh.19385] [Received: 11/28/2023] [Revised: 02/20/2024] [Accepted: 02/23/2024] [Indexed: 03/28/2024]
Affiliation(s)
- Sean Platton
- Royal London Hospital Haemophilia Centre, London, UK
- Peter Baker
- Oxford Haemophilia and Thrombosis Centre, Nuffield Orthopaedic Hospital, Oxford, UK
- Annette Bowyer
- Department of Coagulation, Royal Hallamshire Hospital, Sheffield, UK
- Catriona Keenan
- Department of Haematology & the National Coagulation Centre, St. James's Hospital, Dublin, Ireland
- Will Lester
- Haemophilia Unit, University Hospitals, Birmingham, UK
- Anne Riddell
- Katharine Dormandy Haemophilia Centre, Royal Free Hospital, London, UK
- Megan Sutherland
- North West Genomic Laboratory Hub, Manchester University NHS Foundation Trust, Manchester, UK
7. Matsui H, Hirata M. Evaluation of the pathogenic potential of germline DDX41 variants in hematopoietic neoplasms using the ACMG/AMP guidelines. Int J Hematol 2024; 119:552-563. [PMID: 38492200 DOI: 10.1007/s12185-024-03728-w] [Received: 11/20/2023] [Revised: 01/30/2024] [Accepted: 02/01/2024] [Indexed: 03/18/2024]
Abstract
Clinical use of gene panel testing for hematopoietic neoplasms in areas, such as diagnosis, prognosis prediction, and exploration of treatment options, has increased in recent years. The keys to interpreting gene variants detected in gene panel testing are to distinguish between germline and somatic variants and accurately determine whether the detected variants are pathogenic. If a variant is suspected to be a pathogenic germline variant, it is essential to confirm its consistency with the disease phenotype and gather a thorough family history. Donor eligibility must also be considered, especially if the patient's variant is also detected in the expected donor for hematopoietic stem cell transplantation. However, determining the pathogenicity of gene variants is often complicated, given the current limited availability of databases covering germline variants of hematopoietic neoplasms. This means that hematologists will frequently need to interpret gene variants themselves. Here, we outline how to assess the pathogenicity of germline variants according to criteria from the American College of Medical Genetics and Genomics/Association for Molecular Pathology standards and guidelines for the interpretation of variants using DDX41, a gene recently shown to be closely associated with myeloid neoplasms with a germline predisposition, as an example.
Affiliation(s)
- Hirotaka Matsui
- Department of Laboratory Medicine, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-Ku, Tokyo, 104-0045, Japan.
- Department of Medical Oncology and Translational Research, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan.
- Makoto Hirata
- Department of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
8. Farris J, Khanna C, Smadbeck JB, Johnson SH, Bothun E, Kaplan T, Hoffman F, Polonis K, Oliver G, Reis LM, Semina EV, Rust L, Hoppman NL, Vasmatzis G, Marcou CA, Schimmenti LA, Klee EW. Complex balanced intrachromosomal rearrangement involving PITX2 identified as a cause of Axenfeld-Rieger Syndrome. Am J Med Genet A 2024; 194:e63542. [PMID: 38234180 PMCID: PMC11003841 DOI: 10.1002/ajmg.a.63542] [Received: 08/28/2023] [Revised: 12/15/2023] [Accepted: 01/07/2024] [Indexed: 01/19/2024]
Abstract
Axenfeld-Rieger Syndrome (ARS) type 1 is a rare autosomal dominant condition characterized by anterior chamber anomalies, umbilical defects, dental hypoplasia, and craniofacial anomalies, with Meckel's diverticulum in some individuals. Here, we describe a clinically ascertained female of childbearing age with ARS for whom clinical targeted sequencing and deletion/duplication analysis followed by clinical exome and genome sequencing resulted in no pathogenic variants or variants of unknown significance in PITX2 or FOXC1. Advanced bioinformatic analysis of the genome data identified a complex, balanced rearrangement disrupting PITX2. This case is the first reported intrachromosomal rearrangement leading to ARS, illustrating that for patients with compelling clinical phenotypes but negative genomic testing, additional bioinformatic analyses are essential to identify subtle genomic abnormalities in target genes.
Affiliation(s)
- Joseph Farris
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Cheryl Khanna
- Department of Ophthalmology, Mayo Clinic, Rochester, Minnesota, USA
- James B Smadbeck
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Sarah H Johnson
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Erick Bothun
- Department of Ophthalmology, Mayo Clinic, Rochester, Minnesota, USA
- Tyler Kaplan
- Department of Ophthalmology, Mayo Clinic, Rochester, Minnesota, USA
- Francis Hoffman
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA
- Katarzyna Polonis
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri, USA
- Gavin Oliver
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Linda M Reis
- Department of Pediatrics and Children's Research Institute, Medical College of Wisconsin and Children's Wisconsin, Milwaukee, Wisconsin, USA
- Elena V Semina
- Department of Pediatrics and Children's Research Institute, Medical College of Wisconsin and Children's Wisconsin, Milwaukee, Wisconsin, USA
- Department of Ophthalmology, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Laura Rust
- Department of Clinical Genomics, Mayo Clinic, Rochester, Minnesota, USA
- Nicole L Hoppman
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA
- George Vasmatzis
- Department of Molecular Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
- Cherisse A Marcou
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA
- Lisa A Schimmenti
- Department of Clinical Genomics, Mayo Clinic, Rochester, Minnesota, USA
- Eric W Klee
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
9. Wieder N, D'Souza EN, Martin-Geary AC, Lassen FH, Talbot-Martin J, Fernandes M, Chothani SP, Rackham OJL, Schafer S, Aspden JL, MacArthur DG, Davies RW, Whiffin N. Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes. Genome Biol 2024; 25:111. [PMID: 38685090 PMCID: PMC11057154 DOI: 10.1186/s13059-024-03248-0] [Received: 05/17/2023] [Accepted: 04/15/2024] [Indexed: 05/02/2024] Open
Abstract
BACKGROUND: Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity.
RESULTS: We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated.
CONCLUSIONS: Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.
Affiliation(s)
- Nechama Wieder
- Big Data Institute, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Elston N D'Souza
- Big Data Institute, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Alexandra C Martin-Geary
- Big Data Institute, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Frederik H Lassen
- Big Data Institute, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Maria Fernandes
- Big Data Institute, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Sonia P Chothani
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore, 169857, Singapore
- Owen J L Rackham
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore, 169857, Singapore
- School of Biological Sciences, University of Southampton, Southampton, UK
- Sebastian Schafer
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore, 169857, Singapore
- Julie L Aspden
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
- LeedsOmics, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Astbury Centre of Structural Molecular Biology, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Daniel G MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Centre for Population Genomics, Garvan Institute of Medical Research, and UNSW Sydney, Sydney, NSW, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
- Robert W Davies
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Department of Statistics, University of Oxford, Oxford, UK
- Nicola Whiffin
- Big Data Institute, University of Oxford, Oxford, UK.
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
10. Livesey BJ, Badonyi M, Dias M, Frazer J, Kumar S, Lindorff-Larsen K, McCandlish DM, Orenbuch R, Shearer CA, Muffley L, Foreman J, Glazer AM, Lehner B, Marks DS, Roth FP, Rubin AF, Starita LM, Marsh JA. Guidelines for releasing a variant effect predictor. ArXiv 2024:arXiv:2404.10807v1. [PMID: 38699161 PMCID: PMC11065047] [Indexed: 05/05/2024]
Abstract
Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released to date, and there is tremendous variability in their underlying algorithms and outputs, and in the ways in which the methodologies and predictions are shared. This leads to considerable challenges for end users in knowing which VEPs to use and how to use them. Here, to address these issues, we provide guidelines and recommendations for the release of novel VEPs. Emphasising open-source availability, transparent methodologies, clear variant effect score interpretations, standardised scales, accessible predictions, and rigorous training data disclosure, we aim to improve the usability and interpretability of VEPs, and promote their integration into analysis and evaluation pipelines. We also provide a large, categorised list of currently available VEPs, aiming to facilitate the discovery and encourage the usage of novel methods within the scientific community.
Affiliation(s)
- Benjamin J. Livesey
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Mihaly Badonyi
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Mafalda Dias
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jonathan Frazer
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Sushant Kumar
- Department of Medical Biophysics, University of Toronto; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- David M. McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Rose Orenbuch
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Lara Muffley
- Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Julia Foreman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ben Lehner
- Wellcome Sanger Institute, Cambridge, UK; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Debora S. Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Boston, MA, USA
- Frederick P. Roth
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Alan F. Rubin
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research; Department of Medical Biology, University of Melbourne, Parkville, Australia
- Lea M. Starita
- Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Joseph A. Marsh
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
11
Forrest IS, Duffy Á, Park JK, Vy HMT, Pasquale LR, Nadkarni GN, Cho JH, Do R. Genome-first evaluation with exome sequence and clinical data uncovers underdiagnosed genetic disorders in a large healthcare system. Cell Rep Med 2024:101518. [PMID: 38642551 DOI: 10.1016/j.xcrm.2024.101518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 05/01/2023] [Accepted: 03/26/2024] [Indexed: 04/22/2024]
Abstract
Population-based genomic screening may help diagnose individuals with disease-risk variants. Here, we perform a genome-first evaluation for nine disorders in 29,039 participants with linked exome sequences and electronic health records (EHRs). We identify 614 individuals with 303 pathogenic/likely pathogenic or predicted loss-of-function (P/LP/LoF) variants, yielding 644 observations; 487 observations (76%) lack a corresponding clinical diagnosis in the EHR. Upon further investigation, 75 clinically undiagnosed observations (15%) have evidence of symptomatic untreated disease, including familial hypercholesterolemia (3 of 6 [50%] undiagnosed observations with disease evidence) and breast cancer (23 of 106 [22%]). These genetic findings enable targeted phenotyping that reveals new diagnoses in previously undiagnosed individuals. Disease yield is greater with variants in penetrant genes for which disease is observed in carriers in an independent cohort. The prevalence of P/LP/LoF variants exceeds that of clinical diagnoses, and some clinically undiagnosed carriers are discovered to have disease. These results highlight the potential of population-based genomic screening.
Affiliation(s)
- Iain S Forrest
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Medical Scientist Training Program, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Áine Duffy
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Joshua K Park
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Medical Scientist Training Program, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Ha My T Vy
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Center for Genomic Data Analytics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Louis R Pasquale
- Department of Ophthalmology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Eye and Vision Research Institute, New York Eye and Ear Infirmary of Mount Sinai, New York, NY 10003, USA
- Girish N Nadkarni
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Division of Data-driven and Digital Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Judy H Cho
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Ron Do
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Center for Genomic Data Analytics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
12
Lambourne L, Mattioli K, Santoso C, Sheynkman G, Inukai S, Kaundal B, Berenson A, Spirohn-Fitzgerald K, Bhattacharjee A, Rothman E, Shrestha S, Laval F, Yang Z, Bisht D, Sewell JA, Li G, Prasad A, Phanor S, Lane R, Campbell DM, Hunt T, Balcha D, Gebbia M, Twizere JC, Hao T, Frankish A, Riback JA, Salomonis N, Calderwood MA, Hill DE, Sahni N, Vidal M, Bulyk ML, Fuxman Bass JI. Widespread variation in molecular interactions and regulatory properties among transcription factor isoforms. bioRxiv 2024:2024.03.12.584681. [PMID: 38617209 PMCID: PMC11014633 DOI: 10.1101/2024.03.12.584681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation. Relative to reference isoforms, two-thirds of alternative TF isoforms exhibit differences in one or more molecular activities, which often could not be predicted from sequence. We observed two primary categories of alternative TF isoforms: "rewirers" and "negative regulators", both of which were associated with differentiation and cancer. Our results support a model wherein the relative expression levels of, and interactions involving, TF isoforms add an understudied layer of complexity to gene regulatory networks, demonstrating the importance of isoform-aware characterization of TF functions and providing a rich resource for further studies.
Affiliation(s)
- Luke Lambourne
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Kaia Mattioli
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Clarissa Santoso
- Department of Biology, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Gloria Sheynkman
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sachi Inukai
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Babita Kaundal
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anna Berenson
- Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
- Kerstin Spirohn-Fitzgerald
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Anukana Bhattacharjee
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Elisabeth Rothman
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Florent Laval
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Zhipeng Yang
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Deepa Bisht
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jared A Sewell
- Department of Biology, Boston University, Boston, MA, USA
- Guangyuan Li
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anisa Prasad
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard College, Cambridge, MA, USA
- Sabrina Phanor
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ryan Lane
- Department of Biology, Boston University, Boston, MA, USA
- Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dawit Balcha
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Marinella Gebbia
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
- Jean-Claude Twizere
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Adam Frankish
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Josh A Riback
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Nathan Salomonis
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- David E Hill
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nidhi Sahni
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Juan I Fuxman Bass
- Department of Biology, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
13
Gurwitz D, Shomron N. Artificial intelligence utility for drug development: ChatGPT and beyond. Drug Dev Res 2024; 85:e22121. [PMID: 37815084 DOI: 10.1002/ddr.22121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 09/20/2023] [Accepted: 10/02/2023] [Indexed: 10/11/2023]
Affiliation(s)
- David Gurwitz
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv, Israel
- Noam Shomron
- Sagol School of Neuroscience, Tel Aviv, Israel
- Department of Cell and Developmental Biology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Edmond J Safra Center for Bioinformatics, Tel Aviv University, Tel Aviv, Israel
- Tel Aviv University Innovation Labs (TILabs), Tel Aviv, Israel
- Djerassi Institute of Oncology, Tel Aviv University, Tel Aviv, Israel
14
Seaby EG, Leggatt G, Cheng G, Thomas NS, Ashton JJ, Stafford I, Baralle D, Rehm HL, O'Donnell-Luria A, Ennis S. A gene pathogenicity tool "GenePy" identifies missed biallelic diagnoses in the 100,000 Genomes Project. Genet Med 2024; 26:101073. [PMID: 38245859 DOI: 10.1016/j.gim.2024.101073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 01/22/2024] Open
Abstract
PURPOSE: The 100,000 Genomes Project diagnosed a quarter of affected participants, but 26% of diagnoses were not on the applied gene panel(s), with many being de novo variants. Assessing biallelic variants without a gene panel is more challenging. METHODS: We sought to identify missed biallelic diagnoses using GenePy, which incorporates allele frequency, zygosity, and a user-defined deleterious metric, generating an aggregate GenePy score per gene, per participant. We calculated GenePy scores for 2862 recessive disease genes in 78,216 100,000 Genomes Project participants. For each gene, we ranked participant GenePy scores and scrutinized affected participants without a diagnosis whose scores ranked among the top 5 for each gene. In cases in which participant phenotypes overlapped with the disease gene of interest, we extracted rare variants and applied phase, ClinVar, and ACMG classification. RESULTS: In total, 3184 affected individuals without a molecular diagnosis had a top-5-ranked GenePy score and 682 of 3184 (21%) had phenotypes overlapping with a top-ranking gene. In 122 of 669 (18%) phenotype-matched cases (excluding 13 withdrawn participants), we identified a putative missed diagnosis (2.2% of all undiagnosed participants). A further 334 of 669 (50%) cases have a possible missed diagnosis but require functional validation. CONCLUSION: Applying GenePy at scale has identified 456 potential diagnoses, demonstrating the value of novel diagnostic strategies.
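The per-gene ranking step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GenePy implementation: the function name `top_ranked_undiagnosed`, the dictionary layout of the scores, the toy identifiers, and the assumption that a higher score indicates a greater deleterious burden are all hypothetical choices made for the example.

```python
from collections import defaultdict


def top_ranked_undiagnosed(scores, undiagnosed, top_n=5):
    """Flag undiagnosed participants whose score is among the top_n for a gene.

    scores:      {(gene, participant): score}  (hypothetical layout)
    undiagnosed: set of affected participants without a molecular diagnosis
    Assumption: a higher score means a greater deleterious burden.
    """
    by_gene = defaultdict(list)
    for (gene, participant), score in scores.items():
        by_gene[gene].append((score, participant))

    hits = {}
    for gene, entries in by_gene.items():
        # Rank descending by score; break ties by participant ID for determinism.
        ranked = sorted(entries, key=lambda e: (-e[0], e[1]))[:top_n]
        flagged = [p for _, p in ranked if p in undiagnosed]
        if flagged:
            hits[gene] = flagged
    return hits


# Toy data: six participants scored for GENE_A, one for GENE_B.
toy_scores = {
    ("GENE_A", "p1"): 9.0, ("GENE_A", "p2"): 8.0, ("GENE_A", "p3"): 7.0,
    ("GENE_A", "p4"): 6.0, ("GENE_A", "p5"): 5.0, ("GENE_A", "p6"): 4.0,
    ("GENE_B", "p6"): 3.0,
}
print(top_ranked_undiagnosed(toy_scores, undiagnosed={"p2", "p6"}))
# → {'GENE_A': ['p2'], 'GENE_B': ['p6']}
```

In this toy run, participant p6 ranks sixth for GENE_A and so is not flagged there, but is flagged for GENE_B, where p6 is the only scored participant. Phenotype matching and variant review, as the abstract describes, would follow as separate steps.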
Affiliation(s)
- Eleanor G Seaby
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA; Paediatric Infectious Diseases, Imperial College London, London, United Kingdom.
- Gary Leggatt
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom
- Guo Cheng
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom
- N Simon Thomas
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom; Wessex Regional Genomics Laboratory, Salisbury NHS Foundation Trust, Salisbury, United Kingdom
- James J Ashton
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom
- Diana Baralle
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom
- Heidi L Rehm
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Anne O'Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Sarah Ennis
- Human Development and Health, Faculty of Medicine, University Hospital Southampton, Southampton, Hampshire, United Kingdom
15
Ji HJ, Salzberg SL. Upstream open reading frames may contain hundreds of novel human exons. bioRxiv 2024:2024.03.22.586333. [PMID: 38562894 PMCID: PMC10983949 DOI: 10.1101/2024.03.22.586333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Several recent studies have presented evidence that the human gene catalogue should be expanded to include thousands of short open reading frames (ORFs) appearing upstream or downstream of existing protein-coding genes, each of which would comprise an additional bicistronic transcript in humans. Here we explore an alternative hypothesis that would explain the translational and evolutionary evidence for these upstream ORFs without the need to create novel genes or bicistronic transcripts. We examined 2,199 upstream ORFs that have been proposed as high-quality candidates for novel genes, to determine if they could instead represent protein-coding exons that can be added to existing genes. We checked for the conservation of these ORFs in four recently sequenced, high-quality human genomes, and found a large majority (87.8%) to be conserved in all four, as expected. We then looked for splicing evidence that would connect each upstream ORF to the downstream protein-coding gene at the same locus, thus creating a novel splicing variant using the upstream ORF as its first exon. These protein-coding exon candidates were further evaluated using protein structure predictions of the protein sequences that included the proposed new exons. We determined that 582 out of 2,199 upstream ORFs have strong evidence that they can form protein-coding exons that are part of an existing gene, and that the resulting protein is predicted to have similar or better structural quality than the currently annotated isoform.
Affiliation(s)
- Hyun Joo Ji
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD
- Department of Computer Science, Johns Hopkins University; Baltimore, MD
- Steven L Salzberg
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD
- Department of Computer Science, Johns Hopkins University; Baltimore, MD
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD
- Department of Biostatistics, Johns Hopkins University; Baltimore, MD
16
Zhao Y, Chukanova M, Kentistou KA, Fairhurst-Hunter Z, Siegert AM, Jia RY, Dowsett GKC, Gardner EJ, Lawler K, Day FR, Kaisinger LR, Tung YCL, Lam BYH, Chen HJC, Wang Q, Berumen-Campos J, Kuri-Morales P, Tapia-Conyer R, Alegre-Diaz J, Barroso I, Emberson J, Torres JM, Collins R, Saleheen D, Smith KR, Paul DS, Merkle F, Farooqi IS, Wareham NJ, Petrovski S, O'Rahilly S, Ong KK, Yeo GSH, Perry JRB. Protein-truncating variants in BSN are associated with severe adult-onset obesity, type 2 diabetes and fatty liver disease. Nat Genet 2024; 56:579-584. [PMID: 38575728 PMCID: PMC11018524 DOI: 10.1038/s41588-024-01694-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 02/21/2024] [Indexed: 04/06/2024]
Abstract
Obesity is a major risk factor for many common diseases and has a substantial heritable component. To identify new genetic determinants, we performed exome-sequence analyses for adult body mass index (BMI) in up to 587,027 individuals. We identified rare loss-of-function variants in two genes (BSN and APBA1) with effects substantially larger than those of well-established obesity genes such as MC4R. In contrast to most other obesity-related genes, rare variants in BSN and APBA1 were not associated with normal variation in childhood adiposity. Furthermore, BSN protein-truncating variants (PTVs) magnified the influence of common genetic variants associated with BMI, with a common variant polygenic score exhibiting an effect twice as large in BSN PTV carriers as in noncarriers. Finally, we explored the plasma proteomic signatures of BSN PTV carriers as well as the functional consequences of BSN deletion in human induced pluripotent stem cell-derived hypothalamic neurons. Collectively, our findings implicate degenerative processes in synaptic function in the etiology of adult-onset obesity.
Affiliation(s)
- Yajie Zhao
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Maria Chukanova
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Katherine A Kentistou
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Zammy Fairhurst-Hunter
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Anna Maria Siegert
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Raina Y Jia
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Georgina K C Dowsett
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Eugene J Gardner
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Katherine Lawler
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Felix R Day
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Lena R Kaisinger
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Yi-Chun Loraine Tung
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Brian Yee Hong Lam
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Hsiao-Jou Cortina Chen
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Quanli Wang
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Jaime Berumen-Campos
- Experimental Medicine Research Unit, Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Mexico City, Mexico
- Pablo Kuri-Morales
- Experimental Medicine Research Unit, Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Mexico City, Mexico
- Instituto Tecnológico de Estudios Superiores de Monterrey, Tecnológico, Monterrey, Mexico
- Roberto Tapia-Conyer
- Experimental Medicine Research Unit, Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Mexico City, Mexico
- Jesus Alegre-Diaz
- Experimental Medicine Research Unit, Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Mexico City, Mexico
- Inês Barroso
- Exeter Centre of Excellence for Diabetes Research (EXCEED), University of Exeter Medical School, Exeter, UK
- Jonathan Emberson
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Jason M Torres
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Rory Collins
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Danish Saleheen
- Center for Non-Communicable Diseases, Karachi, Pakistan
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Katherine R Smith
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Dirk S Paul
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Florian Merkle
- Institute of Metabolic Science and Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- I Sadaf Farooqi
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Nick J Wareham
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Slavé Petrovski
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Stephen O'Rahilly
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Ken K Ong
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Giles S H Yeo
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK
- John R B Perry
- MRC Epidemiology Unit and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK.
- Metabolic Research Laboratories, MRC Metabolic Diseases Unit and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK.
17
Nanni A, Titus-McQuillan J, Bankole KS, Pardo-Palacios F, Signor S, Vlaho S, Moskalenko O, Morse A, Rogers RL, Conesa A, McIntyre LM. Nucleotide-level distance metrics to quantify alternative splicing implemented in TranD. Nucleic Acids Res 2024; 52:e28. [PMID: 38340337 PMCID: PMC10954468 DOI: 10.1093/nar/gkae056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 11/29/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024] Open
Abstract
Advances in affordable transcriptome sequencing combined with better exon and gene prediction have motivated many to compare transcription across the tree of life. We develop a mathematical framework to calculate complexity and compare transcript models. Structural features, i.e. intron retention (IR), donor/acceptor site variation, alternative exon cassettes, and alternative 5'/3' UTRs, are compared and the distance between transcript models is calculated with nucleotide-level precision. All metrics are implemented in a PyPI package, TranD, and output can be used to summarize splicing patterns for a transcriptome (1GTF) and between transcriptomes (2GTF). TranD output enables quantitative comparisons between: annotations augmented by empirical RNA-seq data and the original transcript models; transcript model prediction tools for long-read RNA-seq (e.g. FLAIR versus Isoseq3); alternate annotations for a species (e.g. RefSeq vs Ensembl); and closely related species. In C. elegans, Z. mays, D. melanogaster, D. simulans and H. sapiens, alternative exons were observed more frequently in combination with an alternative donor/acceptor than alone. Transcript models in RefSeq and Ensembl are linked and both have unique transcript models with empirical support. D. melanogaster and D. simulans share many transcript models, and long-read RNA-seq data suggest that both species are under-annotated. We recommend combined references.
Affiliation(s)
- Adalena Nanni
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - James Titus-McQuillan
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA
| | - Kinfeosioluwa S Bankole
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | | | - Sarah Signor
- Department of Biological Sciences, North Dakota State University, Fargo, ND, USA
| | - Srna Vlaho
- Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
| | - Oleksandr Moskalenko
- University of Florida Research Computing, University of Florida, Gainesville, FL 32611, USA
| | - Alison M Morse
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Rebekah L Rogers
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA
| | - Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Spain
| | - Lauren M McIntyre
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
|
18
|
Venkatesh SS, Wittemans LBL, Palmer DS, Baya NA, Ferreira T, Hill B, Lassen FH, Parker MJ, Reibe S, Elhakeem A, Banasik K, Bruun MT, Erikstrup C, Jensen BA, Juul A, Mikkelsen C, Nielsen HS, Ostrowski SR, Pedersen OB, Rohde PD, Sorensen E, Ullum H, Westergaard D, Haraldsson A, Holm H, Jonsdottir I, Olafsson I, Steingrimsdottir T, Steinthorsdottir V, Thorleifsson G, Figueredo J, Karjalainen MK, Pasanen A, Jacobs BM, Hubers N, Lippincott M, Fraser A, Lawlor DA, Timpson NJ, Nyegaard M, Stefansson K, Magi R, Laivuori H, van Heel DA, Boomsma DI, Balasubramanian R, Seminara SB, Chan YM, Laisk T, Lindgren CM. Genome-wide analyses identify 21 infertility loci and over 400 reproductive hormone loci across the allele frequency spectrum. medRxiv 2024:2024.03.19.24304530. [PMID: 38562841 PMCID: PMC10984039 DOI: 10.1101/2024.03.19.24304530] [Indexed: 04/04/2024]
Abstract
Genome-wide association studies (GWASs) may help inform treatments for infertility, whose causes remain unknown in many cases. Here we present GWAS meta-analyses across six cohorts for male and female infertility in up to 41,200 cases and 687,005 controls. We identified 21 genetic risk loci for infertility (P≤5E-08), of which 12 have not been reported for any reproductive condition. We found positive genetic correlations between endometriosis and all-cause female infertility (rg=0.585, P=8.98E-14), and between polycystic ovary syndrome and anovulatory infertility (rg=0.403, P=2.16E-03). The evolutionary persistence of female infertility-risk alleles in EBAG9 may be explained by recent directional selection. We additionally identified up to 269 genetic loci associated with follicle-stimulating hormone (FSH), luteinising hormone, oestradiol, and testosterone through sex-specific GWAS meta-analyses (N=6,095-246,862). While hormone-associated variants near FSHB and ARL14EP colocalised with signals for anovulatory infertility, we found no rg between female infertility and reproductive hormones (P>0.05). Exome sequencing analyses in the UK Biobank (N=197,340) revealed that women carrying testosterone-lowering rare variants in GPC2 were at higher risk of infertility (OR=2.63, P=1.25E-03). Taken together, our results suggest that while individual genes associated with hormone regulation may be relevant for fertility, there is limited genetic evidence for correlation between reproductive hormones and infertility at the population level. We provide the first comprehensive view of the genetic architecture of infertility across multiple diagnostic criteria in men and women, and characterise its relationship to other health conditions.
Affiliation(s)
- Samvida S Venkatesh
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
| | - Laura B L Wittemans
- Novo Nordisk Research Centre Oxford, Oxford, United Kingdom
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, United Kingdom
| | - Duncan S Palmer
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
| | - Nikolas A Baya
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
| | - Teresa Ferreira
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
| | - Barney Hill
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
| | - Frederik Heymann Lassen
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
| | - Melody J Parker
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Clinical Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
| | - Saskia Reibe
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
| | - Ahmed Elhakeem
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Karina Banasik
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Department of Obstetrics and Gynecology, Copenhagen University Hospital, Hvidovre, Copenhagen, Denmark
| | - Mie T Bruun
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Health, Aarhus University, Aarhus, Denmark
| | - Bitten A Jensen
- Department of Clinical Immunology, Aalborg University Hospital, Aalborg, Denmark
| | - Anders Juul
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Growth and Reproduction, Copenhagen University Hospital-Rigshospitalet, Copenhagen, Denmark
| | - Christina Mikkelsen
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Science, Copenhagen University, Copenhagen, Denmark
| | - Henriette S Nielsen
- Department of Obstetrics and Gynecology, The Fertility Clinic, Hvidovre University Hospital, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Sisse R Ostrowski
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Ole B Pedersen
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
| | - Palle D Rohde
- Genomic Medicine, Department of Health Science and Technology, Aalborg University, Aalborg, Denmark
| | - Erik Sorensen
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | | | - David Westergaard
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Department of Obstetrics and Gynecology, Copenhagen University Hospital, Hvidovre, Copenhagen, Denmark
| | - Asgeir Haraldsson
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Children's Hospital Iceland, Landspitali University Hospital, Reykjavik, Iceland
| | - Hilma Holm
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
| | - Ingileif Jonsdottir
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
| | - Isleifur Olafsson
- Department of Clinical Biochemistry, Landspitali University Hospital, Reykjavik, Iceland
| | - Thora Steingrimsdottir
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Department of Obstetrics and Gynecology, Landspitali University Hospital, Reykjavik, Iceland
| | | | | | - Jessica Figueredo
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Minna K Karjalainen
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Research Unit of Population Health, Faculty of Medicine, University of Oulu, Finland
- Northern Finland Birth Cohorts, Arctic Biobank, Infrastructure for Population Studies, Faculty of Medicine, University of Oulu, Oulu, Finland
| | - Anu Pasanen
- Research Unit of Clinical Medicine, Medical Research Center Oulu, University of Oulu, and Department of Children and Adolescents, Oulu University Hospital, Oulu, Finland
| | - Benjamin M Jacobs
- Centre for Preventive Neurology, Wolfson Institute of Population Health, Queen Mary University London, London, EC1M 6BQ, United Kingdom
| | - Nikki Hubers
- Department of Biological Psychology, Netherlands Twin Register, Vrije Universiteit, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development Institute, Amsterdam, The Netherlands
| | - Margaret Lippincott
- Harvard Reproductive Sciences Center and Reproductive Endocrine Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
| | - Abigail Fraser
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Deborah A Lawlor
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Nicholas J Timpson
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Mette Nyegaard
- Genomic Medicine, Department of Health Science and Technology, Aalborg University, Aalborg, Denmark
| | - Kari Stefansson
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
| | - Reedik Magi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Hannele Laivuori
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Medical and Clinical Genetics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Department of Obstetrics and Gynecology, Tampere University Hospital, Finland
- Center for Child, Adolescent, and Maternal Health Research, Faculty of Medicine and Health Technology, Tampere University, Finland
| | - David A van Heel
- Blizard Institute, Queen Mary University London, London, E1 2AT, United Kingdom
| | - Dorret I Boomsma
- Department of Biological Psychology, Netherlands Twin Register, Vrije Universiteit, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development Institute, Amsterdam, The Netherlands
| | - Ravikumar Balasubramanian
- Harvard Reproductive Sciences Center and Reproductive Endocrine Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
| | - Stephanie B Seminara
- Harvard Reproductive Sciences Center and Reproductive Endocrine Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
| | - Yee-Ming Chan
- Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Endocrinology, Department of Pediatrics, Boston Children's Hospital, Boston, Massachusetts, United States of America
| | - Triin Laisk
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Cecilia M Lindgren
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, United Kingdom
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
|
19
|
Rui Y, Zhou J, Zhen X, Zhang J, Liu S, Gao Y. TBX5 genetic variants and SCD-CAD susceptibility: insights from Chinese Han cohorts. PeerJ 2024; 12:e17139. [PMID: 38525280 PMCID: PMC10959103 DOI: 10.7717/peerj.17139] [Received: 01/22/2024] [Accepted: 02/28/2024] [Indexed: 03/26/2024]
Abstract
Background: The prevention and prediction of sudden cardiac death (SCD) present persistent challenges, prompting exploration into common genetic variations for potential insights. T-box 5 (TBX5), a critical cardiac transcription factor, plays a pivotal role in cardiovascular development and function. This study systematically examined variants within the 500-bp region downstream of the TBX5 gene, focusing on their potential impact on susceptibility to SCD associated with coronary artery disease (SCD-CAD) in four different Chinese Han populations. Methods: In a comprehensive case-control analysis, we explored the association between rs11278315 and SCD-CAD susceptibility using a cohort of 553 controls and 201 SCD-CAD cases. Dual luciferase reporter assays, genotype-phenotype correlation studies using human cardiac tissue samples, and integrated in silico analysis were applied to explore the underlying mechanism. Results: Binary logistic regression results underscored a significantly reduced risk of SCD-CAD in individuals harboring the deletion allele (odds ratio = 0.70, 95% CI [0.55-0.88], p = 0.0019). Consistent with the lower transcriptional activity of the deletion allele observed in dual luciferase reporter assays, genotype-phenotype correlation studies on human cardiac tissue samples affirmed lower expression levels associated with the deletion allele at both mRNA and protein levels. Furthermore, sQTL analysis suggested a role for rs11278315 in TBX5 alternative splicing, which may contribute to alterations in its ultimate functional effects. Gene ontology analysis and functional annotation further underscored the potential involvement of TBX5 in alternative splicing and cardiac-related transcriptional regulation.
Conclusions: In summary, our current dataset points to a plausible correlation between rs11278315 and susceptibility to SCD-CAD, emphasizing the potential of rs11278315 as a genetic risk marker for aiding in molecular diagnosis and risk stratification of SCD-CAD.
Affiliation(s)
- Yukun Rui
- Department of Forensic Medicine, Medical College of Soochow University, Suzhou, China
| | - Ju Zhou
- Medical College of Soochow University, Suzhou, China
| | - Xiaoyuan Zhen
- Department of Forensic Medicine, Medical College of Soochow University, Suzhou, China
| | - Jianhua Zhang
- Shanghai Key Laboratory of Forensic Medicine, Institute of Forensic Sciences, Ministry of Justice, Shanghai, China
| | - Shiquan Liu
- Institute of Evidence Law and Forensic Science, China University of Political Science and Law, Beijing, China
| | - Yuzhen Gao
- Department of Forensic Medicine, Medical College of Soochow University, Suzhou, China
|
20
|
Degalez F, Charles M, Foissac S, Zhou H, Guan D, Fang L, Klopp C, Allain C, Lagoutte L, Lecerf F, Acloque H, Giuffra E, Pitel F, Lagarrigue S. Enriched atlas of lncRNA and protein-coding genes for the GRCg7b chicken assembly and its functional annotation across 47 tissues. Sci Rep 2024; 14:6588. [PMID: 38504112 PMCID: PMC10951430 DOI: 10.1038/s41598-024-56705-y] [Received: 07/31/2023] [Accepted: 03/09/2024] [Indexed: 03/21/2024]
Abstract
Gene atlases for livestock are steadily improving thanks to new genome assemblies and new expression data that improve gene annotation. However, gene content varies across databases due to differences in RNA sequencing data and bioinformatics pipelines, especially for long non-coding RNAs (lncRNAs), which have higher tissue and developmental specificity and are harder to identify consistently than protein-coding genes (PCGs). As done previously in 2020 for chicken assemblies galgal5 and GRCg6a, we provide a new lncRNA-enriched gene atlas for the latest GRCg7b chicken assembly, integrating the "NCBI RefSeq" and "EMBL-EBI Ensembl/GENCODE" reference annotations and other resources such as FAANG and NONCODE. As a result, the number of PCGs increases from 18,022 (RefSeq) and 17,007 (Ensembl) to 24,102, and that of lncRNAs from 5789 (RefSeq) and 11,944 (Ensembl) to 44,428. Using 1400 public RNA-seq transcriptomes representing 47 tissues, we provided expression evidence for 35,257 (79%) lncRNAs and 22,468 (93%) PCGs, supporting the relevance of this atlas. Further characterization, including tissue specificity, sex-differential expression and gene configurations, is provided. We also identified conserved miRNA-hosting genes with human counterparts, suggesting common function. The annotated atlas is available at gega.sigenae.org.
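The expression-support percentages quoted in this abstract can be re-derived from the counts it reports. The following is a minimal sketch in Python; the variable and function names are illustrative and are not taken from the atlas or its pipeline.

```python
# Counts as reported in the abstract (GRCg7b enriched atlas).
atlas_lncrnas = 44_428      # lncRNA genes in the enriched atlas
atlas_pcgs = 24_102         # protein-coding genes in the enriched atlas
expressed_lncrnas = 35_257  # lncRNAs with expression evidence
expressed_pcgs = 22_468     # PCGs with expression evidence


def support_pct(expressed: int, total: int) -> int:
    """Share of genes with expression evidence, as a whole-number percentage."""
    return round(100 * expressed / total)


lnc_pct = support_pct(expressed_lncrnas, atlas_lncrnas)
pcg_pct = support_pct(expressed_pcgs, atlas_pcgs)
print(f"lncRNAs with expression evidence: {lnc_pct}%")
print(f"PCGs with expression evidence: {pcg_pct}%")
```

Both values agree with the rounded figures given in the abstract (79% and 93%).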
Affiliation(s)
- Fabien Degalez
- PEGASE, INRAE, Institut Agro, 35590, Saint Gilles, France
| | - Mathieu Charles
- INRAE, BioinfOmics, GenoToul Bioinformatics facility, Sigenae, Université Fédérale de Toulouse, 31326, Castanet-Tolosan, France
- INRAE, AgroParisTech, GABI, Paris-Saclay University, 78350, Jouy-en-Josas, France
| | - Sylvain Foissac
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet-Tolosan, France
| | | | - Dailu Guan
- University of California Davis, Davis, USA
| | | | - Christophe Klopp
- INRAE, BioinfOmics, GenoToul Bioinformatics facility, Sigenae, Université Fédérale de Toulouse, 31326, Castanet-Tolosan, France
| | - Coralie Allain
- PEGASE, INRAE, Institut Agro, 35590, Saint Gilles, France
| | | | | | - Hervé Acloque
- INRAE, AgroParisTech, GABI, Paris-Saclay University, 78350, Jouy-en-Josas, France
| | - Elisabetta Giuffra
- INRAE, AgroParisTech, GABI, Paris-Saclay University, 78350, Jouy-en-Josas, France
| | - Frédérique Pitel
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet-Tolosan, France
|
21
|
Einson J, Minaeva M, Rafi F, Lappalainen T. The impact of genetically controlled splicing on exon inclusion and protein structure. PLoS One 2024; 19:e0291960. [PMID: 38478511 PMCID: PMC10936842 DOI: 10.1371/journal.pone.0291960] [Received: 01/05/2023] [Accepted: 09/08/2023] [Indexed: 03/17/2024]
Abstract
Common variants affecting mRNA splicing are typically identified through splicing quantitative trait locus (sQTL) mapping and have been shown to be enriched for GWAS signals to a similar degree as eQTLs. However, the specific splicing changes induced by these variants have been difficult to characterize, making it more complicated to analyze the effect size and direction of sQTLs and to determine downstream splicing effects on protein structure. In this study, we catalogue sQTLs using exon percent spliced in (PSI) scores as a quantitative phenotype. PSI is an interpretable metric for identifying exon skipping events and has some advantages over other methods for quantifying splicing from short-read RNA sequencing. In our set of sQTL variants, we find evidence of selective effects based on splicing effect size and effect direction, as well as exon symmetry. Additionally, we utilize AlphaFold2 to predict changes in protein structure associated with sQTLs overlapping GWAS traits, highlighting a potential new use case for this technology in interpreting genetic effects on traits and disorders.
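For orientation, PSI for a cassette exon is commonly estimated as the fraction of informative reads that support exon inclusion. The sketch below uses that simplified definition; it is an assumption for illustration and not necessarily the exact computation used in the study, and the function name and read counts are hypothetical.

```python
from typing import Optional


def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Simplified PSI: inclusion reads / (inclusion reads + exclusion reads).

    Returns None when no informative reads cover the exon, because PSI is
    undefined in that case. No length normalization is applied here.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


# Hypothetical counts: 30 reads support inclusion, 10 support skipping.
example_psi = percent_spliced_in(30, 10)
print(example_psi)
```

A value near 1 means the exon is almost always included, and a value near 0 means it is almost always skipped, which is what makes the metric easy to interpret as an sQTL phenotype.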
Affiliation(s)
- Jonah Einson
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, United States of America
- New York Genome Center, New York, NY, United States of America
| | - Mariia Minaeva
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Faiza Rafi
- New York Genome Center, New York, NY, United States of America
- Department of Biotechnology, The City College of New York, New York, NY, United States of America
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY, United States of America
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, United States of America
|
22
|
Stenton SL, Pejaver V, Bergquist T, Biesecker LG, Byrne AB, Nadeau E, Greenblatt MS, Harrison S, Tavtigian S, Radivojac P, Brenner SE, O’Donnell-Luria A. Assessment of the evidence yield for the calibrated PP3/BP4 computational recommendations. medRxiv 2024:2024.03.05.24303807. [PMID: 38496501 PMCID: PMC10942508 DOI: 10.1101/2024.03.05.24303807] [Indexed: 03/19/2024]
Abstract
Purpose: To investigate the number of rare missense variants observed in human genome sequences by ACMG/AMP PP3/BP4 evidence strength, following the calibrated PP3/BP4 computational recommendations. Methods: Missense variants from the genome sequences of 300 probands from the Rare Genomes Project with suspected rare disease were analyzed using computational prediction tools able to reach PP3_Strong and BP4_Moderate evidence strengths (BayesDel, MutPred2, REVEL, and VEST4). The numbers of variants at each evidence strength were analyzed across disease-associated genes and genome-wide. Results: From a median of 75.5 rare (≤1% allele frequency) missense variants in disease-associated genes per proband, a median of one reached PP3_Strong, 3-5 PP3_Moderate, and 3-5 PP3_Supporting. Most were allocated BP4 evidence (median 41-49 per proband) or were indeterminate (median 17.5-19 per proband). Extending the analysis to all protein-coding genes genome-wide, the number of PP3_Strong variants increased approximately 2.6-fold compared to disease-associated genes, with a median per proband of 1-3 PP3_Strong, 8-16 PP3_Moderate, and 10-17 PP3_Supporting. Conclusion: A small number of variants per proband reached PP3_Strong and PP3_Moderate in 3,424 disease-associated genes, and though not the intended use of the recommendations, also genome-wide. Use of PP3/BP4 evidence as recommended from calibrated computational prediction tools in the clinical diagnostic laboratory is unlikely to inappropriately contribute to the classification of an excessive number of variants as Pathogenic or Likely Pathogenic by ACMG/AMP rules.
Affiliation(s)
- Sarah L. Stenton
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Vikas Pejaver
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Timothy Bergquist
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Leslie G. Biesecker
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Alicia B. Byrne
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Emily Nadeau
- Department of Medicine and University of Vermont Cancer Center, University of Vermont, Larner College of Medicine, Burlington, VT 05405, USA
| | - Marc S. Greenblatt
- Department of Medicine and University of Vermont Cancer Center, University of Vermont, Larner College of Medicine, Burlington, VT 05405, USA
| | - Steven Harrison
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Ambry Genetics, Aliso Viejo, CA, USA
| | - Sean Tavtigian
- Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Steven E. Brenner
- Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Anne O’Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, USA
|
23
|
Wu H, Lin JH, Tang XY, Marenne G, Zou WB, Schutz S, Masson E, Génin E, Fichou Y, Le Gac G, Férec C, Liao Z, Chen JM. Combining full-length gene assay and SpliceAI to interpret the splicing impact of all possible SPINK1 coding variants. Hum Genomics 2024; 18:21. [PMID: 38414044 PMCID: PMC10898081 DOI: 10.1186/s40246-024-00586-9] [Received: 11/09/2023] [Accepted: 02/13/2024] [Indexed: 02/29/2024]
Abstract
BACKGROUND: Single-nucleotide variants (SNVs) within gene coding sequences can significantly impact pre-mRNA splicing, bearing profound implications for pathogenic mechanisms and precision medicine. In this study, we aim to harness the well-established full-length gene splicing assay (FLGSA) in conjunction with SpliceAI to prospectively interpret the splicing effects of all potential coding SNVs within the four-exon SPINK1 gene, a gene associated with chronic pancreatitis. RESULTS: Our study began with a retrospective analysis of 27 SPINK1 coding SNVs previously assessed using FLGSA, proceeded with a prospective analysis of 35 new FLGSA-tested SPINK1 coding SNVs, followed by data extrapolation, and ended with further validation. In total, we analyzed 67 SPINK1 coding SNVs, which account for 9.3% of the 720 possible coding SNVs. Among these 67 FLGSA-analyzed SNVs, 12 were found to impact splicing. Through detailed comparison of FLGSA results and SpliceAI predictions, we inferred that the remaining 653 untested coding SNVs in the SPINK1 gene are unlikely to significantly affect splicing. Of the 12 splice-altering events, nine produced both normally spliced and aberrantly spliced transcripts, while the remaining three only generated aberrantly spliced transcripts. These splice-impacting SNVs were found solely in exons 1 and 2, notably at the first and/or last coding nucleotides of these exons. Among the 12 splice-altering events, 11 were missense variants (2.17% of 506 potential missense variants), and one was synonymous (0.61% of 164 potential synonymous variants). Notably, adjusting the SpliceAI cut-off to 0.30 instead of the conventional 0.20 would improve specificity without reducing sensitivity. CONCLUSIONS: By integrating FLGSA with SpliceAI, we have determined that less than 2% (1.67%) of all possible coding SNVs in SPINK1 significantly influence splicing outcomes. Our findings emphasize the critical importance of conducting splicing analysis within the broader genomic sequence context of the study gene and highlight the inherent uncertainties associated with intermediate SpliceAI scores (0.20 to 0.80). This study contributes to the field by being the first to prospectively interpret all potential coding SNVs in a disease-associated gene with a high degree of accuracy, representing a meaningful attempt at shifting from retrospective to prospective variant analysis in the era of exome and genome sequencing.
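The proportions quoted in this abstract follow directly from the variant counts it reports. As a quick consistency check, here is a minimal Python sketch; the helper name `pct` and the variable names are illustrative only and do not come from the study.

```python
# Counts as reported in the abstract for SPINK1 coding SNVs.
possible_snvs = 720        # all possible coding SNVs
tested_snvs = 67           # SNVs analyzed by FLGSA
splice_altering = 12       # tested SNVs found to impact splicing
possible_missense = 506    # potential missense variants
possible_synonymous = 164  # potential synonymous variants


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


untested_snvs = possible_snvs - tested_snvs
tested_share = pct(tested_snvs, possible_snvs)
splice_share = pct(splice_altering, possible_snvs)
missense_share = pct(11, possible_missense)     # 11 splice-altering missense SNVs
synonymous_share = pct(1, possible_synonymous)  # 1 splice-altering synonymous SNV

print(untested_snvs, tested_share, splice_share, missense_share, synonymous_share)
```

The results match the abstract: 653 untested SNVs, about 9.3% of possible SNVs tested, 1.67% splice-altering overall, 2.17% of missense and 0.61% of synonymous variants.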
Affiliation(s)
- Hao Wu
- Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
- Shanghai Institute of Pancreatic Diseases, Shanghai, China
| | - Jin-Huan Lin
- Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
- Shanghai Institute of Pancreatic Diseases, Shanghai, China
| | - Xin-Ying Tang
- Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Department of Prevention and Health Care, Eastern Hepatobiliary Surgery Hospital, Naval Medical University, Shanghai, China
| | - Gaëlle Marenne
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
| | - Wen-Bin Zou
- Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
- Shanghai Institute of Pancreatic Diseases, Shanghai, China
| | - Sacha Schutz
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Service de Génétique Médicale et de Biologie de La Reproduction, CHRU Brest, Brest, France
| | - Emmanuelle Masson
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Service de Génétique Médicale et de Biologie de La Reproduction, CHRU Brest, Brest, France
| | | | - Yann Fichou
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
| | - Gerald Le Gac
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Service de Génétique Médicale et de Biologie de La Reproduction, CHRU Brest, Brest, France
| | - Claude Férec
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
| | - Zhuan Liao
- Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China.
- Shanghai Institute of Pancreatic Diseases, Shanghai, China.
| | - Jian-Min Chen
- Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France.
| |
|
24
|
Hoge C, de Manuel M, Mahgoub M, Okami N, Fuller Z, Banerjee S, Baker Z, McNulty M, Andolfatto P, Macfarlan TS, Schumer M, Tzika AC, Przeworski M. Patterns of recombination in snakes reveal a tug-of-war between PRDM9 and promoter-like features. Science 2024; 383:eadj7026. [PMID: 38386752 DOI: 10.1126/science.adj7026] [Received: 07/12/2023] [Accepted: 01/04/2024] [Indexed: 02/24/2024]
Abstract
In some mammals, notably humans, recombination occurs almost exclusively where the protein PRDM9 binds, whereas in vertebrates lacking an intact PRDM9, such as birds and canids, recombination rates are elevated near promoter-like features. To determine whether PRDM9 directs recombination in nonmammalian vertebrates, we focused on an exemplar species with a single, intact PRDM9 ortholog, the corn snake (Pantherophis guttatus). Analyzing historical recombination rates along the genome and crossovers in pedigrees, we found evidence that PRDM9 specifies the location of recombination events, but we also detected a separable effect of promoter-like features. These findings reveal that the uses of PRDM9 and promoter-like features need not be mutually exclusive and instead reflect a tug-of-war that is more even in some species than others.
Affiliation(s)
- Carla Hoge
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Marc de Manuel
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Mohamed Mahgoub
- The Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
| | - Naima Okami
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Zachary Fuller
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Shreya Banerjee
- Department of Biology, Stanford University, Stanford, CA, USA
| | - Zachary Baker
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Morgan McNulty
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Peter Andolfatto
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Todd S Macfarlan
- The Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
| | - Molly Schumer
- Department of Biology, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford, CA, USA
| | - Athanasia C Tzika
- Laboratory of Artificial & Natural Evolution (LANE), Department of Genetics & Evolution, University of Geneva, Geneva, Switzerland
| | - Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| |
|
25
|
Nakamura T, Ueda J, Mizuno S, Honda K, Kazuno AA, Yamamoto H, Hara T, Takata A. Topologically associating domains define the impact of de novo promoter variants on autism spectrum disorder risk. Cell Genom 2024; 4:100488. [PMID: 38280381 PMCID: PMC10879036 DOI: 10.1016/j.xgen.2024.100488] [Received: 04/14/2023] [Revised: 08/24/2023] [Accepted: 01/02/2024] [Indexed: 01/29/2024]
Abstract
Whole-genome sequencing (WGS) studies of autism spectrum disorder (ASD) have demonstrated the roles of rare promoter de novo variants (DNVs). However, most promoter DNVs in ASD are not located immediately upstream of known ASD genes. In this study analyzing WGS data of 5,044 ASD probands, 4,095 unaffected siblings, and their parents, we show that promoter DNVs within topologically associating domains (TADs) containing ASD genes are significantly and specifically associated with ASD. An analysis considering TADs as functional units identified specific TADs enriched for promoter DNVs in ASD and indicated that common variants in these regions also confer ASD heritability. Experimental validation using human induced pluripotent stem cells (iPSCs) showed that likely deleterious promoter DNVs in ASD can influence multiple genes within the same TAD, resulting in overall dysregulation of ASD-associated genes. These results highlight the importance of TADs and gene-regulatory mechanisms in better understanding the genetic architecture of ASD.
Affiliation(s)
- Takumi Nakamura
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Junko Ueda
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
| | - Shota Mizuno
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Kurara Honda
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - An-A Kazuno
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Hirona Yamamoto
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan
| | - Tomonori Hara
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Department of Organ Anatomy, Tohoku University Graduate School of Medicine, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
| | - Atsushi Takata
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Research Institute for Diseases of Old Age, Juntendo University Graduate School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan.
| |
|
26
|
Omenn GS, Lane L, Overall CM, Lindskog C, Pineau C, Packer NH, Cristea IM, Weintraub ST, Orchard S, Roehrl MHA, Nice E, Guo T, Van Eyk JE, Liu S, Bandeira N, Aebersold R, Moritz RL, Deutsch EW. The 2023 Report on the Proteome from the HUPO Human Proteome Project. J Proteome Res 2024; 23:532-549. [PMID: 38232391 PMCID: PMC11026053 DOI: 10.1021/acs.jproteome.3c00591] [Indexed: 01/19/2024]
Abstract
Since 2010, the Human Proteome Project (HPP), the flagship initiative of the Human Proteome Organization (HUPO), has pursued two goals: (1) to credibly identify the protein parts list and (2) to make proteomics an integral part of multiomics studies of human health and disease. The HPP relies on international collaboration, data sharing, standardized reanalysis of MS data sets by PeptideAtlas and MassIVE-KB using HPP Guidelines for quality assurance, integration and curation of MS and non-MS protein data by neXtProt, plus extensive use of antibody profiling carried out by the Human Protein Atlas. According to the neXtProt release 2023-04-18, protein expression has now been credibly detected (PE1) for 18,397 of the 19,778 neXtProt predicted proteins coded in the human genome (93%). Of these PE1 proteins, 17,453 were detected with mass spectrometry (MS) in accordance with HPP Guidelines and 944 by a variety of non-MS methods. The number of neXtProt PE2, PE3, and PE4 missing proteins now stands at 1381. Achieving the unambiguous identification of 93% of predicted proteins encoded from across all chromosomes represents remarkable experimental progress on the Human Proteome parts list. Meanwhile, there are several categories of predicted proteins that have proved resistant to detection regardless of protein-based methods used. Additionally there are some PE1-4 proteins that probably should be reclassified to PE5, specifically 21 LINC entries and ∼30 HERV entries; these are being addressed in the present year. Applying proteomics in a wide array of biological and clinical studies ensures integration with other omics platforms as reported by the Biology and Disease-driven HPP teams and the antibody and pathology resource pillars. Current progress has positioned the HPP to transition to its Grand Challenge Project focused on determining the primary function(s) of every protein itself and in networks and pathways within the context of human health and disease.
Affiliation(s)
- Gilbert S. Omenn
- University of Michigan, Ann Arbor, Michigan 48109, United States
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, 1015 Lausanne, Switzerland
| | - Christopher M. Overall
- University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Yonsei University, Republic of Korea
| | | | - Charles Pineau
- University Rennes, Inserm U1085, Irset, 35042 Rennes, France
| | | | | | - Susan T. Weintraub
- University of Texas Health Science Center-San Antonio, San Antonio, Texas 78229-3900, United States
| | | | - Michael H. A. Roehrl
- Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, United States
| | | | - Tiannan Guo
- Westlake Center for Intelligent Proteomics, Westlake Laboratory, Westlake University, Hangzhou 310024, Zhejiang Province, China
| | - Jennifer E. Van Eyk
- Advanced Clinical Biosystems Research Institute, Smidt Heart Institute, Cedars-Sinai Medical Center, 127 South San Vicente Boulevard, Pavilion, 9th Floor, Los Angeles, CA, 90048, United States
| | - Siqi Liu
- BGI Group, Shenzhen 518083, China
| | - Nuno Bandeira
- University of California, San Diego, La Jolla, CA, 92093, United States
| | - Ruedi Aebersold
- Institute of Molecular Systems Biology in ETH Zurich, 8092 Zurich, Switzerland
- University of Zurich, 8092 Zurich, Switzerland
| | - Robert L. Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| |
|
27
|
Lei H, Li J, Zhao B, Kou SH, Xiao F, Chen T, Wang SM. Evolutionary origin of germline pathogenic variants in human DNA mismatch repair genes. Hum Genomics 2024; 18:5. [PMID: 38287404 PMCID: PMC10823654 DOI: 10.1186/s40246-024-00573-0] [Received: 05/27/2023] [Accepted: 01/17/2024] [Indexed: 01/31/2024] Open
Abstract
BACKGROUND The mismatch repair (MMR) system is evolutionarily conserved for genome stability maintenance. Germline pathogenic variants (PVs) in MMR genes that lead to MMR functional deficiency are associated with high cancer risk. Knowing the evolutionary origin of germline PVs in human MMR genes will facilitate understanding the biological basis of MMR deficiency in cancer. However, systematic knowledge on this issue is lacking. In this study, we performed a comprehensive analysis to determine the evolutionary origin of human MMR PVs. METHODS We retrieved MMR gene variants from the ClinVar database. The genomes of 100 vertebrates were collected from the UCSC genome browser and ancient human sequencing data were obtained through comprehensive data mining. Cross-species conservation analysis was performed based on the phylogenetic relationship among 100 vertebrates. Rescaled ancient sequencing data were used to perform variant calling for archeological analysis. RESULTS Using the phylogenetic approach, we traced the 3369 MMR PVs identified in modern humans in 99 non-human vertebrate genomes but found no evidence for cross-species conservation as the source for human MMR PVs. Using the archeological approach, we searched for the human MMR PVs in over 5000 ancient human genomes dated from 45,045 to 100 years before present and identified a group of MMR PVs shared between modern and ancient humans mostly within 10,000 years with similar quantitative patterns. CONCLUSION Our study reveals that MMR PVs in modern humans arose within recent human evolutionary history.
Affiliation(s)
- Huijun Lei
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310018, Zhejiang, China
- Department of Cancer Prevention, Zhejiang Cancer Hospital, Hangzhou, 310022, Zhejiang, China
| | - Jiaheng Li
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
| | - Bojin Zhao
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
| | - Si Hoi Kou
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
| | - Fengxia Xiao
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
| | - Tianhui Chen
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310018, Zhejiang, China.
- Department of Cancer Prevention, Zhejiang Cancer Hospital, Hangzhou, 310022, Zhejiang, China.
| | - San Ming Wang
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China.
| |
|
28
|
Yu K, Deuitch N, Merguerian M, Cunningham L, Davis J, Bresciani E, Diemer J, Andrews E, Young A, Donovan F, Sood R, Craft K, Chong S, Chandrasekharappa S, Mullikin J, Liu PP. Genomic landscape of patients with germline RUNX1 variants and familial platelet disorder with myeloid malignancy. Blood Adv 2024; 8:497-511. [PMID: 38019014 PMCID: PMC10837196 DOI: 10.1182/bloodadvances.2023011165] [Received: 07/10/2023] [Revised: 11/07/2023] [Accepted: 11/07/2023] [Indexed: 11/30/2023] Open
Abstract
Familial platelet disorder with associated myeloid malignancies (FPDMM) is caused by germline RUNX1 mutations and characterized by thrombocytopenia and increased risk of hematologic malignancies. We recently launched a longitudinal natural history study for patients with FPDMM. Among 27 families with research genomic data by the end of 2021, 26 different germline RUNX1 variants were detected. Besides missense mutations enriched in the Runt homology domain and loss-of-function mutations distributed throughout the gene, splice-region mutations and large deletions were detected in 6 and 7 families, respectively. In 25 of 51 (49%) patients without hematologic malignancy, somatic mutations were detected in at least 1 of the clonal hematopoiesis of indeterminate potential (CHIP) genes or acute myeloid leukemia (AML) driver genes. BCOR was the most frequently mutated gene (in 9 patients), and multiple BCOR mutations were identified in 4 patients. Mutations in 6 other CHIP- or AML-driver genes (TET2, DNMT3A, KRAS, LRP1B, IDH1, and KMT2C) were also found in ≥2 patients without hematologic malignancy. Moreover, 3 unrelated patients (1 with myeloid malignancy) carried somatic mutations in NFE2, which regulates erythroid and megakaryocytic differentiation. Sequential sequencing data from 19 patients demonstrated dynamic changes of somatic mutations over time, and stable clones were more frequently found in older adult patients. In summary, there are diverse types of germline RUNX1 mutations and a high frequency of somatic mutations related to clonal hematopoiesis in patients with FPDMM. Monitoring changes in somatic mutations and clinical manifestations prospectively may reveal mechanisms for malignant progression and inform clinical management. This trial was registered at www.clinicaltrials.gov as #NCT03854318.
Affiliation(s)
- Kai Yu
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Natalie Deuitch
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Matthew Merguerian
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Lea Cunningham
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Immune Deficiency Cellular Therapy Program, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Joie Davis
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Erica Bresciani
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Jamie Diemer
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Elizabeth Andrews
- Immune Deficiency Cellular Therapy Program, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Frank Donovan
- Genomics Core, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Raman Sood
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Kathleen Craft
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Shawn Chong
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Settara Chandrasekharappa
- Genomics Core, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Jim Mullikin
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Paul P. Liu
- Oncogenesis and Development Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| |
|
29
|
Roy G, Syed R, Lazaro O, Robertson S, McCabe SD, Rodriguez D, Mawla AM, Johnson TS, Kalwat MA. Identification of type 2 diabetes- and obesity-associated human β-cells using deep transfer learning. bioRxiv 2024:2024.01.18.576260. [PMID: 38328172 PMCID: PMC10849510 DOI: 10.1101/2024.01.18.576260] [Indexed: 02/09/2024]
Abstract
Diabetes affects >10% of adults worldwide and is caused by impaired production or response to insulin, resulting in chronic hyperglycemia. Pancreatic islet β-cells are the sole source of endogenous insulin and our understanding of β-cell dysfunction and death in type 2 diabetes (T2D) is incomplete. Single-cell RNA-seq data supports heterogeneity as an important factor in β-cell function and survival. However, it is difficult to identify which β-cell phenotypes are critical for T2D etiology and progression. Our goal was to prioritize specific disease-related β-cell subpopulations to better understand T2D pathogenesis and identify relevant genes for targeted therapeutics. To address this, we applied a deep transfer learning tool, DEGAS, which maps disease associations onto single-cell RNA-seq data from bulk expression data. Independent runs of DEGAS using T2D or obesity status identified distinct β-cell subpopulations. A singular cluster of T2D-associated β-cells was identified; however, β-cells with high obese-DEGAS scores contained two subpopulations derived largely from either non-diabetic or T2D donors. The obesity-associated non-diabetic cells were enriched for translation and unfolded protein response genes compared to T2D cells. We selected DLK1 for validation by immunostaining in human pancreas sections from healthy and T2D donors. DLK1 was heterogeneously expressed among β-cells and appeared depleted from T2D islets. In conclusion, DEGAS has the potential to advance our holistic understanding of the β-cell transcriptomic phenotypes, including features that distinguish β-cells in obese non-diabetic or lean T2D states. Future work will expand this approach to additional human islet omics datasets to reveal the complex multicellular interactions driving T2D.
|
30
|
Barbitoff YA, Ushakov MO, Lazareva TE, Nasykhova YA, Glotov AS, Predeus AV. Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges. Brief Bioinform 2024; 25:bbad508. [PMID: 38271481 PMCID: PMC10810331 DOI: 10.1093/bib/bbad508] [Received: 08/09/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.
Affiliation(s)
- Yury A Barbitoff
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
| | - Mikhail O Ushakov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Tatyana E Lazareva
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Yulia A Nasykhova
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Andrey S Glotov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Alexander V Predeus
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
| |
|
31
|
Hofman DA, Ruiz-Orera J, Yannuzzi I, Murugesan R, Brown A, Clauser KR, Condurat AL, van Dinter JT, Engels SAG, Goodale A, van der Lugt J, Abid T, Wang L, Zhou KN, Vogelzang J, Ligon KL, Phoenix TN, Roth JA, Root DE, Hubner N, Golub TR, Bandopadhayay P, van Heesch S, Prensner JR. Translation of non-canonical open reading frames as a cancer cell survival mechanism in childhood medulloblastoma. Mol Cell 2024; 84:261-276.e18. [PMID: 38176414 PMCID: PMC10872554 DOI: 10.1016/j.molcel.2023.12.003] [Received: 05/16/2023] [Revised: 08/30/2023] [Accepted: 12/01/2023] [Indexed: 01/06/2024]
Abstract
A hallmark of high-risk childhood medulloblastoma is the dysregulation of RNA translation. Currently, it is unknown whether medulloblastoma dysregulates the translation of putatively oncogenic non-canonical open reading frames (ORFs). To address this question, we performed ribosome profiling of 32 medulloblastoma tissues and cell lines and observed widespread non-canonical ORF translation. We then developed a stepwise approach using multiple CRISPR-Cas9 screens to elucidate non-canonical ORFs and putative microproteins implicated in medulloblastoma cell survival. We determined that multiple lncRNA-ORFs and upstream ORFs (uORFs) exhibited selective functionality independent of main coding sequences. A microprotein encoded by one of these ORFs, ASNSD1-uORF or ASDURF, was upregulated, associated with MYC-family oncogenes, and promoted medulloblastoma cell survival through engagement with the prefoldin-like chaperone complex. Our findings underscore the fundamental importance of non-canonical ORF translation in medulloblastoma and provide a rationale to include these ORFs in future studies seeking to define new cancer targets.
Affiliation(s)
- Damon A Hofman
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, the Netherlands
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
| | - Ian Yannuzzi
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Adam Brown
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Alexandra L Condurat
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Jip T van Dinter
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, the Netherlands
| | - Sem A G Engels
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, the Netherlands
| | - Amy Goodale
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jasper van der Lugt
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, the Netherlands
| | - Tanaz Abid
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Li Wang
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kevin N Zhou
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Jayne Vogelzang
- Department of Pathology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA; Department of Pathology, Brigham and Women's Hospital, Boston, MA 02215, USA
| | - Keith L Ligon
- Department of Pathology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA; Department of Pathology, Brigham and Women's Hospital, Boston, MA 02215, USA; Department of Pathology, Boston Children's Hospital, Boston MA 02115, USA
| | - Timothy N Phoenix
- Division of Pharmaceutical Sciences, James L. Winkle College of Pharmacy, University of Cincinnati, Cincinnati, OH 45229, USA
| | - Jennifer A Roth
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - David E Root
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Norbert Hubner
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany; German Centre for Cardiovascular Research, Partner Site Berlin, 13347 Berlin, Germany
| | - Todd R Golub
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
| | - Pratiti Bandopadhayay
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, the Netherlands.
| | - John R Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology and Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
| |
|
32
|
Dueñas Rey A, Del Pozo Valero M, Bouckaert M, Wood KA, Van den Broeck F, Daich Varela M, Thomas HB, Van Heetvelde M, De Bruyne M, Van de Sompele S, Bauwens M, Lenaerts H, Mahieu Q, Josifova D, Rivolta C, O'Keefe RT, Ellingford J, Webster AR, Arno G, Ayuso C, De Zaeytijd J, Leroy BP, De Baere E, Coppieters F. Combining a prioritization strategy and functional studies nominates 5'UTR variants underlying inherited retinal disease. Genome Med 2024; 16:7. [PMID: 38184646 PMCID: PMC10771650 DOI: 10.1186/s13073-023-01277-1] [Received: 06/16/2023] [Accepted: 12/15/2023] [Indexed: 01/08/2024] Open
Abstract
BACKGROUND: 5' untranslated regions (5'UTRs) are essential modulators of protein translation. Predicting the impact of 5'UTR variants is challenging and rarely performed in routine diagnostics. Here, we present a combined approach of a comprehensive prioritization strategy and functional assays to evaluate 5'UTR variation in two large cohorts of patients with inherited retinal diseases (IRDs).
METHODS: We performed an isoform-level re-analysis of retinal RNA-seq data to identify the protein-coding transcripts of 378 IRD genes with highest expression in retina. We evaluated the coverage of their 5'UTRs by different whole exome sequencing (WES) kits. The selected 5'UTRs were analyzed in whole genome sequencing (WGS) and WES data from IRD sub-cohorts from the 100,000 Genomes Project (n = 2397 WGS) and an in-house database (n = 1682 WES), respectively. Identified variants were annotated for 5'UTR-relevant features and classified into seven categories based on their predicted functional consequence. We developed a variant prioritization strategy by integrating population frequency, specific criteria for each category, and family and phenotypic data. A selection of candidate variants underwent functional validation using diverse approaches.
RESULTS: Isoform-level re-quantification of retinal gene expression revealed 76 IRD genes with a non-canonical retina-enriched isoform, of which 20 display a fully distinct 5'UTR compared to that of their canonical isoform. Depending on the probe design, 3-20% of IRD genes have 5'UTRs fully captured by WES. After analyzing these regions in both cohorts, we prioritized 11 (likely) pathogenic variants in 10 genes (ARL3, MERTK, NDP, NMNAT1, NPHP4, PAX6, PRPF31, PRPF4, RDH12, RD3), of which 7 were novel. Functional analyses further supported the pathogenicity of three variants. Mis-splicing was demonstrated for the PRPF31:c.-9+1G>T variant. The MERTK:c.-125G>A variant, overlapping a transcriptional start site, was shown to significantly reduce both luciferase mRNA levels and activity. The RDH12:c.-123C>T variant was found in cis with the hypomorphic RDH12:c.701G>A (p.Arg234His) variant in 11 patients. This 5'UTR variant, predicted to introduce an upstream open reading frame, was shown to result in reduced RDH12 protein but unaltered mRNA levels.
CONCLUSIONS: This study demonstrates the importance of 5'UTR variants implicated in IRDs and provides a systematic approach for 5'UTR annotation and validation that is applicable to other inherited diseases.
Affiliation(s)
- Alfredo Dueñas Rey
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Marta Del Pozo Valero
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Department of Genetics, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz, University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
- Manon Bouckaert
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Katherine A Wood
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester, UK
- Filip Van den Broeck
- Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium
- Department of Head & Skin, Ghent University, Ghent, Belgium
- Malena Daich Varela
- UCL Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
- Huw B Thomas
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester, UK
- Mattias Van Heetvelde
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Marieke De Bruyne
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Stijn Van de Sompele
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Miriam Bauwens
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Hanne Lenaerts
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Quinten Mahieu
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Carlo Rivolta
- Department of Ophthalmology, University of Basel, Basel, Switzerland
- Institute of Molecular and Clinical Ophthalmology Basel (IOB), Basel, Switzerland
- Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
- Raymond T O'Keefe
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester, UK
- Jamie Ellingford
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester, UK
- Genomics England, London, UK
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, UK
- Andrew R Webster
- UCL Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
- Gavin Arno
- UCL Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
- Carmen Ayuso
- Department of Genetics, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz, University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
- Julie De Zaeytijd
- Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium
- Department of Head & Skin, Ghent University, Ghent, Belgium
- Bart P Leroy
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium
- Department of Head & Skin, Ghent University, Ghent, Belgium
- Division of Ophthalmology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Elfride De Baere
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium
- Frauke Coppieters
- Center for Medical Genetics Ghent (CMGG), Ghent University Hospital, Ghent, Belgium.
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, Ghent, 9000, Belgium.
- Department of Pharmaceutics, Ghent University, Ghent, Belgium.
33
Behera S, Catreux S, Rossi M, Truong S, Huang Z, Ruehle M, Visvanath A, Parnaby G, Roddey C, Onuchic V, Cameron DL, English A, Mehtalia S, Han J, Mehio R, Sedlazeck FJ. Comprehensive and accurate genome analysis at scale using DRAGEN accelerated algorithms. bioRxiv 2024:2024.01.02.573821. [PMID: 38260545 PMCID: PMC10802302 DOI: 10.1101/2024.01.02.573821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Research and medical genomics require comprehensive and scalable solutions to drive the discovery of novel disease targets, evolutionary drivers, and genetic markers with clinical significance. This necessitates a framework to identify all types of variants independent of their size (e.g., SNV/SV) or location (e.g., repeats). Here we present DRAGEN that utilizes novel methods based on multigenomes, hardware acceleration, and machine learning based variant detection to provide novel insights into individual genomes with ~30min computation time (from raw reads to variant detection). DRAGEN outperforms all other state-of-the-art methods in speed and accuracy across all variant types (SNV, indel, STR, SV, CNV) and further incorporates specialized methods to obtain key insights in medically relevant genes (e.g., HLA, SMN, GBA). We showcase DRAGEN across 3,202 genomes and demonstrate its scalability, accuracy, and innovations to further advance the integration of comprehensive genomics for research and medical applications.
Affiliation(s)
- Sairam Behera
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Adam English
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, TX, USA
- Department of Computer Science, Rice University, TX, USA
34
Harrison PW, Amode MR, Austine-Orimoloye O, Azov A, Barba M, Barnes I, Becker A, Bennett R, Berry A, Bhai J, Bhurji SK, Boddu S, Branco Lins PR, Brooks L, Ramaraju S, Campbell L, Martinez MC, Charkhchi M, Chougule K, Cockburn A, Davidson C, De Silva N, Dodiya K, Donaldson S, El Houdaigui B, Naboulsi T, Fatima R, Giron CG, Genez T, Grigoriadis D, Ghattaoraya G, Martinez JG, Gurbich T, Hardy M, Hollis Z, Hourlier T, Hunt T, Kay M, Kaykala V, Le T, Lemos D, Lodha D, Marques-Coelho D, Maslen G, Merino G, Mirabueno L, Mushtaq A, Hossain S, Ogeh D, Sakthivel MP, Parker A, Perry M, Piližota I, Poppleton D, Prosovetskaia I, Raj S, Pérez-Silva J, Salam A, Saraf S, Saraiva-Agostinho N, Sheppard D, Sinha S, Sipos B, Sitnik V, Stark W, Steed E, Suner MM, Surapaneni L, Sutinen K, Tricomi FF, Urbina-Gómez D, Veidenberg A, Walsh TA, Ware D, Wass E, Willhoft N, Allen J, Alvarez-Jarreta J, Chakiachvili M, Flint B, Giorgetti S, Haggerty L, Ilsley G, Keatley J, Loveland J, Moore B, Mudge J, Naamati G, Tate J, Trevanion S, Winterbottom A, Frankish A, Hunt SE, Cunningham F, Dyer S, Finn R, Martin F, Yates A. Ensembl 2024. Nucleic Acids Res 2024; 52:D891-D899. [PMID: 37953337 PMCID: PMC10767893 DOI: 10.1093/nar/gkad1049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/20/2023] [Accepted: 10/24/2023] [Indexed: 11/14/2023]
Abstract
Ensembl (https://www.ensembl.org) is a freely available genomic resource that has produced high-quality annotations, tools, and services for vertebrates and model organisms for more than two decades. In recent years, there has been a dramatic shift in the genomic landscape, with a large increase in the number and phylogenetic breadth of high-quality reference genomes, alongside major advances in the pan-genome representations of higher species. In order to support these efforts and accelerate downstream research, Ensembl continues to focus on scaling for the rapid annotation of new genome assemblies, developing new methods for comparative analysis, and expanding the depth and quality of our genome annotations. This year we have continued our expansion to support global biodiversity research, doubling the number of annotated genomes we support on our Rapid Release site to over 1700, driven by our close collaboration with biodiversity projects such as Darwin Tree of Life. We have also strengthened support for key agricultural species, including the first regulatory builds for farmed animals, and have updated key tools and resources that support the global scientific community, notably the Ensembl Variant Effect Predictor. Ensembl data, software, and tools are freely available.
Affiliation(s)
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- M Ridwan Amode
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Olanrewaju Austine-Orimoloye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrey G Azov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Matthieu Barba
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- If Barnes
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Arne Becker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Ruth Bennett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrew Berry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jyothish Bhai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Simarpreet Kaur Bhurji
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sanjay Boddu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Paulo R Branco Lins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Lucy Brooks
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Shashank Budhanuru Ramaraju
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Lahcen I Campbell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Manuel Carbajo Martinez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Mehrnaz Charkhchi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Kapeel Chougule
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
- Alexander Cockburn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Claire Davidson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Nishadi H De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Kamalkumar Dodiya
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sarah Donaldson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Bilal El Houdaigui
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Tamara El Naboulsi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Reham Fatima
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Carlos Garcia Giron
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Thiago Genez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Dionysios Grigoriadis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Gurpreet S Ghattaoraya
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jose Gonzalez Martinez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Tatiana A Gurbich
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Matthew Hardy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Zoe Hollis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Mike Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Vinay Kaykala
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Tuan Le
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Diana Lemos
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Disha Lodha
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Diego Marques-Coelho
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Gareth Maslen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Gabriela Alejandra Merino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Louisse Paola Mirabueno
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Aleena Mushtaq
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Syed Nakib Hossain
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Denye N Ogeh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Manoj Pandian Sakthivel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Malcolm Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Ivana Piližota
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Daniel Poppleton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Irina Prosovetskaia
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Shriya Raj
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- José G Pérez-Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Ahamed Imran Abdul Salam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Shradha Saraf
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Nuno Saraiva-Agostinho
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Dan Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Swati Sinha
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Botond Sipos
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Vasily Sitnik
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- William Stark
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Emily Steed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Marie-Marthe Suner
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Likhitha Surapaneni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Kyösti Sutinen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Francesca Floriana Tricomi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- David Urbina-Gómez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andres Veidenberg
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Thomas A Walsh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Doreen Ware
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
- Elizabeth Wass
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Natalie L Willhoft
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jamie Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jorge Alvarez-Jarreta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Marc Chakiachvili
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Bethany Flint
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Stefano Giorgetti
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Garth R Ilsley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jon Keatley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jane E Loveland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- John Tate
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrea Winterbottom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
35
Sayers E, Beck J, Bolton E, Brister J, Chan J, Comeau D, Connor R, DiCuccio M, Farrell C, Feldgarden M, Fine A, Funk K, Hatcher E, Hoeppner M, Kane M, Kannan S, Katz K, Kelly C, Klimke W, Kim S, Kimchi A, Landrum M, Lathrop S, Lu Z, Malheiro A, Marchler-Bauer A, Murphy T, Phan L, Prasad A, Pujar S, Sawyer A, Schmieder E, Schneider V, Schoch C, Sharma S, Thibaud-Nissen F, Trawick B, Venkatapathi T, Wang J, Pruitt K, Sherry S. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2024; 52:D33-D43. [PMID: 37994677 PMCID: PMC10767890 DOI: 10.1093/nar/gkad1044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/24/2023]
Abstract
The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.
Affiliation(s)
- Eric W Sayers, Jeff Beck, Evan E Bolton, J Rodney Brister, Jessica Chan, Donald C Comeau, Ryan Connor, Michael DiCuccio, Catherine M Farrell, Michael Feldgarden, Anna M Fine, Kathryn Funk, Eneida Hatcher, Marilu Hoeppner, Megan Kane, Sivakumar Kannan, Kenneth S Katz, Christopher Kelly, William Klimke, Sunghwan Kim, Avi Kimchi, Melissa Landrum, Stacy Lathrop, Zhiyong Lu, Adriana Malheiro, Aron Marchler-Bauer, Terence D Murphy, Lon Phan, Arjun B Prasad, Shashikant Pujar, Amanda Sawyer, Erin Schmieder, Valerie A Schneider, Conrad L Schoch, Shobha Sharma, Françoise Thibaud-Nissen, Barton W Trawick, Thilakam Venkatapathi, Jiyao Wang, Kim D Pruitt, Stephen T Sherry
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
36
|
Varadi M, Bertoni D, Magana P, Paramval U, Pidruchna I, Radhakrishnan M, Tsenkov M, Nair S, Mirdita M, Yeo J, Kovalevskiy O, Tunyasuvunakool K, Laydon A, Žídek A, Tomlinson H, Hariharan D, Abrahamson J, Green T, Jumper J, Birney E, Steinegger M, Hassabis D, Velankar S. AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Res 2024; 52:D368-D375. [PMID: 37933859 PMCID: PMC10767828 DOI: 10.1093/nar/gkad1011] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/29/2023] [Revised: 10/13/2023] [Accepted: 10/18/2023] [Indexed: 11/08/2023] Open
Abstract
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.
Affiliation(s)
- Mihaly Varadi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Damian Bertoni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Paulyna Magana
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Urmila Paramval
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Ivanna Pidruchna
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Maxim Tsenkov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Sreenath Nair
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Milot Mirdita
  - School of Biological Sciences, Seoul National University, Seoul, South Korea
- Jingi Yeo
  - School of Biological Sciences, Seoul National University, Seoul, South Korea
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Martin Steinegger
  - School of Biological Sciences, Seoul National University, Seoul, South Korea
- Sameer Velankar
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
37
|
Song Y, Zhang C, Omenn GS, O’Meara MJ, Welch JD. Predicting the Structural Impact of Human Alternative Splicing. bioRxiv 2023:2023.12.21.572928. [PMID: 38187531 PMCID: PMC10769328 DOI: 10.1101/2023.12.21.572928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Protein structure prediction with neural networks is a powerful new method for linking protein sequence, structure, and function, but structures have generally been predicted for only a single isoform of each gene, neglecting splice variants. To investigate the structural implications of alternative splicing, we used AlphaFold2 to predict the structures of more than 11,000 human isoforms. We employed multiple metrics to identify splicing-induced structural alterations, including template matching score, secondary structure composition, surface charge distribution, radius of gyration, accessibility of post-translational modification sites, and structure-based function prediction. We identified examples of how alternative splicing induced clear changes in each of these properties. Structural similarity between isoforms largely correlated with degree of sequence identity, but we identified a subset of isoforms with low structural similarity despite high sequence similarity. Exon skipping and alternative last exons tended to increase the surface charge and radius of gyration. Splicing also buried or exposed numerous post-translational modification sites, most notably among the isoforms of BAX. Functional prediction nominated numerous functional differences among isoforms of the same gene, with loss of function compared to the reference predominating. Finally, we used single-cell RNA-seq data from the Tabula Sapiens to determine the cell types in which each structure is expressed. Our work represents an important resource for studying the structure and function of splice isoforms across the cell types of the human body.
Affiliation(s)
- Yuxuan Song
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Gilbert S. Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Matthew J. O’Meara
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Joshua D. Welch
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA
38
|
Smith C, Kitzman JO. Benchmarking splice variant prediction algorithms using massively parallel splicing assays. Genome Biol 2023; 24:294. [PMID: 38129864 PMCID: PMC10734170 DOI: 10.1186/s13059-023-03144-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 12/13/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND: Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes. RESULTS: We benchmark eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compare experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms' concordance with MPSA measurements, and with each other, is lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieve the best overall performance at distinguishing disruptive and neutral variants, and controlling for overall call rate genome-wide, SpliceAI and Pangolin have superior sensitivity. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation, and we suggest strategies for optimal splice effect prediction in the face of these issues. CONCLUSION: SpliceAI and Pangolin show the best overall performance among predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
Affiliation(s)
- Cathy Smith
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Jacob O Kitzman
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
39
|
Horste EL, Fansler MM, Cai T, Chen X, Mitschka S, Zhen G, Lee FCY, Ule J, Mayr C. Subcytoplasmic location of translation controls protein output. Mol Cell 2023; 83:4509-4523.e11. [PMID: 38134885 DOI: 10.1016/j.molcel.2023.11.025] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/30/2023] [Revised: 08/15/2023] [Accepted: 11/21/2023] [Indexed: 12/24/2023]
Abstract
The cytoplasm is highly compartmentalized, but the extent and consequences of subcytoplasmic mRNA localization in non-polarized cells are largely unknown. We determined mRNA enrichment in TIS granules (TGs) and the rough endoplasmic reticulum (ER) through particle sorting and isolated cytosolic mRNAs by digitonin extraction. When focusing on genes that encode non-membrane proteins, we observed that 52% have transcripts enriched in specific compartments. Compartment enrichment correlates with a combinatorial code based on mRNA length, exon length, and 3' UTR-bound RNA-binding proteins. Compartment-biased mRNAs differ in the functional classes of their encoded proteins: TG-enriched mRNAs encode low-abundance proteins with strong enrichment of transcription factors, whereas ER-enriched mRNAs encode large and highly expressed proteins. Compartment localization is an important determinant of mRNA and protein abundance, which is supported by reporter experiments showing that redirecting cytosolic mRNAs to the ER increases their protein expression. In summary, the cytoplasm is functionally compartmentalized by local translation environments.
Affiliation(s)
- Ellen L Horste
  - Gerstner Sloan Kettering Graduate School of Biomedical Sciences, New York, NY 10065, USA; Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Mervin M Fansler
  - Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA; Tri-Institutional Training Program in Computational Biology and Medicine, Weill-Cornell Graduate College, New York, NY 10021, USA
- Ting Cai
  - Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Xiuzhen Chen
  - Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Sibylle Mitschka
  - Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Gang Zhen
  - Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Flora C Y Lee
  - UK Dementia Research Institute, King's College London, London SE5 9NU, UK; The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Jernej Ule
  - UK Dementia Research Institute, King's College London, London SE5 9NU, UK; The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Christine Mayr
  - Gerstner Sloan Kettering Graduate School of Biomedical Sciences, New York, NY 10065, USA; Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA; Tri-Institutional Training Program in Computational Biology and Medicine, Weill-Cornell Graduate College, New York, NY 10021, USA
40
|
Ma JG, O’Neill MJ, Richardson E, Thomson KL, Ingles J, Muhammad A, Solus JF, Davogustto G, Anderson KC, Benjamin Shoemaker M, Stergachis AB, Floyd BJ, Dunn K, Parikh VN, Chubb H, Perrin MJ, Roden DM, Vandenberg JI, Ng CA, Glazer AM. Multi-site validation of a functional assay to adjudicate SCN5A Brugada Syndrome-associated variants. medRxiv 2023:2023.12.19.23299592. [PMID: 38196587 PMCID: PMC10775332 DOI: 10.1101/2023.12.19.23299592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
Brugada Syndrome (BrS) is an inheritable arrhythmia condition that is associated with rare, loss-of-function variants in the cardiac sodium channel gene, SCN5A. Interpreting the pathogenicity of SCN5A missense variants is challenging and ~79% of SCN5A missense variants in ClinVar are currently classified as Variants of Uncertain Significance (VUS). An in vitro SCN5A-BrS automated patch clamp (APC) assay was generated for high-throughput functional studies of NaV1.5. The assay was independently studied at two separate research sites - Vanderbilt University Medical Center and Victor Chang Cardiac Research Institute - revealing strong correlations, including peak INa density (R² = 0.86). The assay was calibrated according to ClinGen Sequence Variant Interpretation recommendations using high-confidence variant controls (n=49). Normal and abnormal ranges of function were established based on the distribution of benign variant assay results. The assay accurately distinguished benign controls (24/25) from pathogenic controls (23/24). Odds of Pathogenicity values derived from the experimental results yielded 0.042 for normal function (BS3 criterion) and 24.0 for abnormal function (PS3 criterion), resulting in up to strong evidence for both ACMG criteria. The calibrated assay was then used to study SCN5A VUS observed in four families with BrS and other arrhythmia phenotypes associated with SCN5A loss-of-function. The assay revealed loss-of-function for three of four variants, enabling reclassification to likely pathogenic. This validated APC assay provides clinical-grade functional evidence for the reclassification of current VUS and will aid future SCN5A-BrS variant classification.
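The Odds of Pathogenicity figures quoted above can be approximated from the control counts. The sketch below is not the authors' calculation: it assumes the OddsPath formula from the ClinGen Sequence Variant Interpretation functional-evidence framework, OddsPath = [P2 × (1 − P1)] / [(1 − P2) × P1], with P1 the proportion of pathogenic variants among all controls and P2 the proportion of pathogenic variants within one readout group. It also assumes the one misclassified control of each class fell into the opposite readout group, with no indeterminate results, and it takes the "strong" evidence thresholds (above 18.7 for PS3, below 0.053 for BS3) from that framework as recalled here.

```python
def odds_path(prior_pathogenic: float, posterior_pathogenic: float) -> float:
    """Odds of Pathogenicity from prior (P1) and posterior (P2) pathogenic proportions."""
    p1, p2 = prior_pathogenic, posterior_pathogenic
    return (p2 * (1 - p1)) / ((1 - p2) * p1)


# Control counts from the abstract: 25 benign and 24 pathogenic controls (n = 49).
benign_total, pathogenic_total = 25, 24
benign_normal, pathogenic_abnormal = 24, 23  # correctly classified controls

# Assumption: the two misclassified controls landed in the opposite readout group.
benign_abnormal = benign_total - benign_normal            # 1
pathogenic_normal = pathogenic_total - pathogenic_abnormal  # 1

prior = pathogenic_total / (benign_total + pathogenic_total)

# Proportion of pathogenic controls within each readout group.
p2_abnormal = pathogenic_abnormal / (pathogenic_abnormal + benign_abnormal)
p2_normal = pathogenic_normal / (pathogenic_normal + benign_normal)

odds_abnormal = odds_path(prior, p2_abnormal)
odds_normal = odds_path(prior, p2_normal)

print(f"OddsPath (abnormal function): {odds_abnormal:.2f}")
print(f"OddsPath (normal function):   {odds_normal:.4f}")
```

Under these assumptions the abnormal-function value comes out at about 24, in line with the reported 24.0. The normal-function value comes out slightly above the reported 0.042, which suggests the authors' exact counts or corrections differ from the simplification used here. Both values still fall within the assumed "strong" ranges.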
Affiliation(s)
- Joanne G. Ma
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Clinical Medicine, UNSW Sydney, Darlinghurst, NSW, Australia
- Ebony Richardson
  - Clinical Genomics Laboratory, Centre for Population Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia and Murdoch Children Research Institute, Melbourne, Australia
- Kate L. Thomson
  - Oxford Genetics Laboratories, Churchill Hospital, Oxford, UK
- Jodie Ingles
  - Clinical Genomics Laboratory, Centre for Population Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia and Murdoch Children Research Institute, Melbourne, Australia
- Ayesha Muhammad
  - Vanderbilt University School of Medicine, Nashville, TN, USA
- Joseph F. Solus
  - Vanderbilt Center for Arrhythmia Research and Therapeutics (VanCART), Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Giovanni Davogustto
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Katherine C. Anderson
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- M. Benjamin Shoemaker
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Andrew B. Stergachis
  - University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
- Brendan J. Floyd
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Kyla Dunn
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Victoria N. Parikh
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Henry Chubb
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Mark J. Perrin
  - Department of Genomic Medicine, Royal Melbourne Hospital, Victoria, Australia
- Dan M. Roden
  - Vanderbilt Center for Arrhythmia Research and Therapeutics (VanCART), Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Jamie I. Vandenberg
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Clinical Medicine, UNSW Sydney, Darlinghurst, NSW, Australia
- Chai-Ann Ng
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Clinical Medicine, UNSW Sydney, Darlinghurst, NSW, Australia
- Andrew M. Glazer
  - Vanderbilt Center for Arrhythmia Research and Therapeutics (VanCART), Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
41
|
McCarley SC, Murphy DA, Thompson J, Shovlin CL. Pharmacogenomic Considerations for Anticoagulant Prescription in Patients with Hereditary Haemorrhagic Telangiectasia. J Clin Med 2023; 12:7710. [PMID: 38137783 PMCID: PMC10744266 DOI: 10.3390/jcm12247710] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2023] [Revised: 12/10/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023] Open
Abstract
Hereditary haemorrhagic telangiectasia (HHT) is a vascular dysplasia that commonly results in bleeding but with frequent indications for therapeutic anticoagulation. Our aims were to advance the understanding of drug-specific intolerance and evaluate if there was an indication for pharmacogenomic testing. Genes encoding proteins involved in the absorption, distribution, metabolism, and excretion of warfarin, heparin, and direct oral anticoagulants (DOACs) apixaban, rivaroxaban, edoxaban, and dabigatran were identified and examined. Linkage disequilibrium with HHT genes was excluded, before variants within these genes were examined following whole genome sequencing of general and HHT populations. The 44 genes identified included 5/17 actionable pharmacogenes with guidelines. The 76,156 participants in the Genome Aggregation Database v3.1.2 had 28,446 variants, including 9668 missense substitutions and 1076 predicted loss-of-function (frameshift, nonsense, and consensus splice site) variants, i.e., approximately 1 in 7.9 individuals had a missense substitution, and 1 in 71 had a loss-of-function variant. Focusing on the 17 genes relevant to usually preferred DOACs, similar variant profiles were identified in HHT patients. With HHT patients at particular risk of haemorrhage when undergoing anticoagulant treatment, we explore how pre-emptive pharmacogenomic testing, alongside HHT gene testing, may prove beneficial in reducing the risk of bleeding and conclude that HHT patients are well placed to be at the vanguard of personalised prescribing.
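The "1 in 7.9" and "1 in 71" figures in the abstract follow from dividing the cohort size by the number of variants in each class. A minimal check of that arithmetic, using only the counts stated above (the variable names are illustrative):

```python
# Counts quoted in the abstract (Genome Aggregation Database v3.1.2).
participants = 76_156
missense_variants = 9_668
loss_of_function_variants = 1_076

# Participants per variant in each class, i.e. the "1 in N" figures.
per_missense = participants / missense_variants
per_loss_of_function = participants / loss_of_function_variants

print(f"1 in {per_missense:.1f} for missense substitutions")
print(f"1 in {per_loss_of_function:.0f} for loss-of-function variants")
```

These ratios relate variant counts to cohort size; they are not a count of distinct carriers, since one person can carry several variants and one variant can occur in several people.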
Affiliation(s)
- Sarah C. McCarley
  - National Heart and Lung Institute, Imperial College London, London W12 0NN, UK
- Daniel A. Murphy
  - Pharmacy Department, Imperial College Healthcare NHS Trust, London W2 1NY, UK
  - Social, Genetic and Environmental Determinants of Health Theme, NIHR Imperial Biomedical Research Centre, London W2 1NY, UK
- Jack Thompson
  - National Heart and Lung Institute, Imperial College London, London W12 0NN, UK
- Claire L. Shovlin
  - National Heart and Lung Institute, Imperial College London, London W12 0NN, UK
  - Social, Genetic and Environmental Determinants of Health Theme, NIHR Imperial Biomedical Research Centre, London W2 1NY, UK
  - Specialist Medicine, Hammersmith Hospital, Imperial College Healthcare NHS Trust, London W12 0HS, UK
42
|
Radford EJ, Tan HK, Andersson MHL, Stephenson JD, Gardner EJ, Ironfield H, Waters AJ, Gitterman D, Lindsay S, Abascal F, Martincorena I, Kolesnik-Taylor A, Ng-Cordell E, Firth HV, Baker K, Perry JRB, Adams DJ, Gerety SS, Hurles ME. Saturation genome editing of DDX3X clarifies pathogenicity of germline and somatic variation. Nat Commun 2023; 14:7702. [PMID: 38057330 PMCID: PMC10700591 DOI: 10.1038/s41467-023-43041-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/20/2022] [Accepted: 10/30/2023] [Indexed: 12/08/2023] Open
Abstract
Loss-of-function of DDX3X is a leading cause of neurodevelopmental disorders (NDD) in females. DDX3X is also a somatically mutated cancer driver gene proposed to have tumour promoting and suppressing effects. We perform saturation genome editing of DDX3X, testing in vitro the functional impact of 12,776 nucleotide variants. We identify 3432 functionally abnormal variants, in three distinct classes. We train a machine learning classifier to identify functionally abnormal variants of NDD-relevance. This classifier has at least 97% sensitivity and 99% specificity to detect variants pathogenic for NDD, substantially out-performing in silico predictors, and resolving up to 93% of variants of uncertain significance. Moreover, functionally-abnormal variants can account for almost all of the excess nonsynonymous DDX3X somatic mutations seen in DDX3X-driven cancers. Systematic maps of variant effects generated in experimentally tractable cell types have the potential to transform clinical interpretation of both germline and somatic disease-associated variation.
Affiliation(s)
- Elizabeth J Radford
  - Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
  - Department of Paediatrics, University of Cambridge, Level 8, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
- Hong-Kee Tan
  - Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
- Eugene J Gardner
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
- Elise Ng-Cordell
  - MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
  - Department of Psychology, University of British Columbia, Vancouver, Canada
- Helen V Firth
  - Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
  - Department of Medical Genetics, University of Cambridge, Cambridge, UK
- Kate Baker
  - MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
  - Department of Medical Genetics, University of Cambridge, Cambridge, UK
- John R B Perry
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
43
|
Zhang Q, Shao M. Transcript assembly and annotations: Bias and adjustment. PLoS Comput Biol 2023; 19:e1011734. [PMID: 38127855 PMCID: PMC10769104 DOI: 10.1371/journal.pcbi.1011734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 01/05/2024] [Accepted: 12/04/2023] [Indexed: 12/23/2023] Open
Abstract
Transcript annotations play a critical role in gene expression analysis as they serve as a reference for quantifying isoform-level expression. The two main sources of annotations are RefSeq and Ensembl/GENCODE, but discrepancies between their methodologies and information resources can lead to significant differences. It has been demonstrated that the choice of annotation can have a significant impact on gene expression analysis. Furthermore, transcript assembly is closely linked to annotations, as assembling large-scale available RNA-seq data is an effective data-driven way to construct annotations, and annotations often serve as benchmarks to evaluate the accuracy of assembly methods. However, the influence of different annotations on transcript assembly is not yet fully understood. We investigate the impact of annotations on transcript assembly. Surprisingly, we observe that opposite conclusions can arise when evaluating assemblers with different annotations. To understand this striking phenomenon, we compare the structural similarity of annotations at various levels and find that the primary structural difference across annotations occurs at the intron-chain level. Next, we examine the biotypes of annotated and assembled transcripts and uncover a significant bias towards annotating and assembling transcripts with intron retentions, which explains the contradictory conclusions above. We develop a standalone tool, available at https://github.com/Shao-Group/irtool, that can be combined with an assembler to generate an assembly without intron retentions. We evaluate the performance of such a pipeline and offer guidance on selecting appropriate assembly tools for different application scenarios.
Affiliation(s)
- Qimin Zhang
- Department of Computer Science and Engineering, School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Mingfu Shao
- Department of Computer Science and Engineering, School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
|
44
|
Dondi A, Lischetti U, Jacob F, Singer F, Borgsmüller N, Coelho R, Heinzelmann-Schwarz V, Beisel C, Beerenwinkel N. Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing in ovarian cancer. Nat Commun 2023; 14:7780. [PMID: 38012143 PMCID: PMC10682465 DOI: 10.1038/s41467-023-43387-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 11/07/2023] [Indexed: 11/29/2023] Open
Abstract
Understanding the complex background of cancer requires genotype-phenotype information in single-cell resolution. Here, we perform long-read single-cell RNA sequencing (scRNA-seq) on clinical samples from three ovarian cancer patients presenting with omental metastasis and increase the PacBio sequencing depth to 12,000 reads per cell. Our approach captures 152,000 isoforms, of which over 52,000 were not previously reported. Isoform-level analysis accounting for non-coding isoforms reveals 20% overestimation of protein-coding gene expression on average. We also detect cell type-specific isoform and poly-adenylation site usage in tumor and mesothelial cells, and find that mesothelial cells transition into cancer-associated fibroblasts in the metastasis, partly through the TGF-β/miR-29/Collagen axis. Furthermore, we identify gene fusions, including an experimentally validated IGF2BP2::TESPA1 fusion, which is misclassified as high TESPA1 expression in matched short-read data, and call mutations confirmed by targeted NGS cancer gene panel results. With these findings, we envision long-read scRNA-seq to become increasingly relevant in oncology and personalized medicine.
Affiliation(s)
- Arthur Dondi
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- Ulrike Lischetti
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- Francis Jacob
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- Franziska Singer
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- ETH Zurich, NEXUS Personalized Health Technologies, Wagistrasse 18, 8952, Schlieren, Switzerland
- Nico Borgsmüller
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- Ricardo Coelho
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- Viola Heinzelmann-Schwarz
- University Hospital Basel and University of Basel, Ovarian Cancer Research, Department of Biomedicine, Hebelstrasse 20, 4031, Basel, Switzerland
- University Hospital Basel, Gynecological Cancer Center, Spitalstrasse 21, 4031, Basel, Switzerland
- Christian Beisel
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- Niko Beerenwinkel
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
|
45
|
Sanchez-Mete L, Mosciatti L, Casadio M, Vittori L, Martayan A, Stigliano V. MUTYH-associated polyposis: Is it time to change upper gastrointestinal surveillance? A single-center case series and a literature overview. World J Gastrointest Oncol 2023; 15:1891-1899. [DOI: 10.4251/wjgo.v15.i11.1891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 05/28/2023] [Accepted: 06/13/2023] [Indexed: 11/15/2023] Open
Abstract
BACKGROUND The presence of Spigelman stage (SS) IV duodenal polyposis is considered the most significant risk factor for duodenal cancer in patients with MUTYH-associated polyposis (MAP). However, advanced SS disease is rarely reported in MAP patients, and no clear recommendations on small bowel (SB) surveillance have been proposed in this patient setting.
AIM To investigate this further, because case reports of duodenal cancers in MAP suggest that they may develop in the absence of advanced benign SS disease and often involve the distal portion of the duodenum.
METHODS We describe a series of MAP patients followed up at the Regina Elena National Cancer Institute of Rome (Italy). A literature overview on previously reported SB cancers in MAP is also provided.
RESULTS We identified two (6%) SB adenocarcinomas with no previous history of duodenal polyposis. Our observations, supported by literature evidence, suggest that the formula for staging duodenal polyposis and predicting risk factors for distal duodenum and jejunal cancer may need to be adjusted to take this into account rather than focusing solely on the presence or absence of SS IV disease.
CONCLUSION Our study emphasizes the need for further studies to define appropriate upper gastrointestinal surveillance programs in MAP patients.
Affiliation(s)
- Lupe Sanchez-Mete
- Gastroenterology and Digestive Endoscopy, Regina Elena National Cancer Institute, IRCCS, Rome 00144, Italy
- Lorenzo Mosciatti
- Gastroenterology and Digestive Endoscopy, Regina Elena National Cancer Institute, IRCCS, Rome 00144, Italy
- Marco Casadio
- Gastroenterology and Digestive Endoscopy, Regina Elena National Cancer Institute, IRCCS, Rome 00144, Italy
- Luigi Vittori
- Department of Radiological, Oncological and Pathological Sciences, Regina Elena National Cancer Institute, IRCCS, Rome 00144, Italy
- Aline Martayan
- Gastroenterology and Digestive Endoscopy, Regina Elena National Cancer Institute, IRCCS, Rome 00144, Italy
- Vittoria Stigliano
- Gastroenterology and Digestive Endoscopy, Regina Elena National Cancer Institute, IRCCS, Rome 00144, Italy
|
46
|
Zhang P, Chaldebas M, Ogishi M, Al Qureshah F, Ponsin K, Feng Y, Rinchai D, Milisavljevic B, Han JE, Moncada-Vélez M, Keles S, Schröder B, Stenson PD, Cooper DN, Cobat A, Boisson B, Zhang Q, Boisson-Dupuis S, Abel L, Casanova JL. Genome-wide detection of human intronic AG-gain variants located between splicing branchpoints and canonical splice acceptor sites. Proc Natl Acad Sci U S A 2023; 120:e2314225120. [PMID: 37931111 PMCID: PMC10655562 DOI: 10.1073/pnas.2314225120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Accepted: 10/02/2023] [Indexed: 11/08/2023] Open
Abstract
Human genetic variants that introduce an AG into the intronic region between the branchpoint (BP) and the canonical splice acceptor site (ACC) of protein-coding genes can disrupt pre-mRNA splicing. Using our genome-wide BP database, we delineated the BP-ACC segments of all human introns and found extreme depletion of AG/YAG in the [BP+8, ACC-4] high-risk region. We developed AGAIN as a genome-wide computational approach to systematically and precisely pinpoint intronic AG-gain variants within the BP-ACC regions. AGAIN identified 350 AG-gain variants from the Human Gene Mutation Database, all of which alter splicing and cause disease. Among them, 74% created new acceptor sites, whereas 31% resulted in complete exon skipping. AGAIN also predicts the protein-level products resulting from these two consequences. We applied AGAIN to our exome/genome database of patients with severe infectious diseases but without known genetic etiology and identified a private homozygous intronic AG-gain variant in the antimycobacterial gene SPPL2A in a patient with mycobacterial disease. For this variant, AGAIN also predicts the retention of six intronic nucleotides that encode an in-frame stop codon, turning AG-gain into stop-gain. This allele was then confirmed experimentally to lead to loss of function by disrupting splicing. We further showed, through two case studies in the genes STAT1 and IRF7, that AG-gain variants inside the high-risk region led to misspliced products, while those outside the region did not. We finally evaluated AGAIN on our 14 paired exome-RNAseq samples and found that 82% of AG-gain variants in high-risk regions showed evidence of missplicing. AGAIN is publicly available from https://hgidsoft.rockefeller.edu/AGAIN and https://github.com/casanova-lab/AGAIN.
Affiliation(s)
- Peng Zhang
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Matthieu Chaldebas
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Masato Ogishi
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Fahd Al Qureshah
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Khoren Ponsin
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Yi Feng
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Darawan Rinchai
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Baptiste Milisavljevic
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Ji Eun Han
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Marcela Moncada-Vélez
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Sevgi Keles
- Division of Pediatric Allergy and Immunology, Necmettin Erbakan University, Meram Medical Faculty, Konya 42080, Turkey
- Bernd Schröder
- Institute of Physiological Chemistry, Technische Universität Dresden, Dresden 01307, Germany
- Peter D. Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff CF14 4XN, United Kingdom
- David N. Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff CF14 4XN, United Kingdom
- Aurélie Cobat
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France
- Paris Cité University, Imagine Institute, Paris 75015, France
- Bertrand Boisson
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France
- Paris Cité University, Imagine Institute, Paris 75015, France
- Qian Zhang
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France
- Paris Cité University, Imagine Institute, Paris 75015, France
- Stéphanie Boisson-Dupuis
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France
- Paris Cité University, Imagine Institute, Paris 75015, France
- Laurent Abel
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France
- Paris Cité University, Imagine Institute, Paris 75015, France
- Jean-Laurent Casanova
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France
- Paris Cité University, Imagine Institute, Paris 75015, France
- Department of Pediatrics, Necker Hospital for Sick Children, Paris 75015, France
- HHMI, New York, NY 10065
|
47
|
Shinder I, Hu R, Ji HJ, Chao KH, Pertea M. EASTR: Identifying and eliminating systematic alignment errors in multi-exon genes. Nat Commun 2023; 14:7223. [PMID: 37940654 PMCID: PMC10632439 DOI: 10.1038/s41467-023-43017-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 10/30/2023] [Indexed: 11/10/2023] Open
Abstract
Accurate alignment of transcribed RNA to reference genomes is a critical step in the analysis of gene expression, which in turn has broad applications in biomedical research and in the basic sciences. We reveal that widely used splice-aware aligners, such as STAR and HISAT2, can introduce erroneous spliced alignments between repeated sequences, leading to the inclusion of falsely spliced transcripts in RNA-seq experiments. In some cases, the 'phantom' introns resulting from these errors make their way into widely-used genome annotation databases. To address this issue, we present EASTR (Emending Alignments of Spliced Transcript Reads), a software tool that detects and removes falsely spliced alignments or transcripts from alignment and annotation files. EASTR improves the accuracy of spliced alignments across diverse species, including human, maize, and Arabidopsis thaliana, by detecting sequence similarity between intron-flanking regions. We demonstrate that applying EASTR before transcript assembly substantially reduces false positive introns, exons, and transcripts, improving the overall accuracy of assembled transcripts. Additionally, we show that EASTR's application to reference annotation databases can detect and correct likely cases of mis-annotated transcripts.
Affiliation(s)
- Ida Shinder
- Cross Disciplinary Graduate Program in Biomedical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Richard Hu
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Hyun Joo Ji
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Kuan-Hao Chao
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Mihaela Pertea
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
|
48
|
Chen X, Fansler MM, Janjoš U, Ule J, Mayr C. The FXR1 network acts as signaling scaffold for actomyosin remodeling. bioRxiv 2023:2023.11.05.565677. [PMID: 37961296 PMCID: PMC10635158 DOI: 10.1101/2023.11.05.565677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
It is currently not known whether mRNAs fulfill structural roles in the cytoplasm. Here, we report the FXR1 network, an mRNA-protein (mRNP) network present throughout the cytoplasm: FXR1 packages exceptionally long mRNAs that serve as an underlying network scaffold and concentrate FXR1 molecules, which have multiple protein binding sites. The proximity of FXR1 molecules makes the FXR1 network a hub for transient interactions of proteins lacking RNA-binding domains. We show that the FXR1 network is necessary for RhoA signaling-induced actomyosin reorganization to provide spatial proximity between kinases and their substrates. A point mutation in FXR1, which is found in its FMR1 homolog and causes Fragile X syndrome, disrupts the network. FXR1 network disruption prevents actomyosin remodeling, an essential and ubiquitous process for the regulation of cell shape, migration, and synaptic function. These findings uncover a structural role for cytoplasmic mRNA and show how the FXR1 RNA-binding protein, as part of the FXR1 network, acts as an organizer of signaling reactions.
Affiliation(s)
- Xiuzhen Chen
- Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Mervin M. Fansler
- Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
- Urška Janjoš
- National Institute of Chemistry, Hajdrihova 19, 1001 Ljubljana, Slovenia
- Biosciences PhD Program, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Jernej Ule
- National Institute of Chemistry, Hajdrihova 19, 1001 Ljubljana, Slovenia
- UK Dementia Research Institute at King’s College London, London, SE5 9NU, UK
- Christine Mayr
- Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, NY 10065, USA
|
49
|
Sun KY, Bai X, Chen S, Bao S, Kapoor M, Zhang C, Backman J, Joseph T, Maxwell E, Mitra G, Gorovits A, Mansfield A, Boutkov B, Gokhale S, Habegger L, Marcketta A, Locke A, Kessler MD, Sharma D, Staples J, Bovijn J, Gelfman S, Gioia AD, Rajagopal V, Lopez A, Varela JR, Alegre J, Berumen J, Tapia-Conyer R, Kuri-Morales P, Torres J, Emberson J, Collins R, Cantor M, Thornton T, Kang HM, Overton J, Shuldiner AR, Cremona ML, Nafde M, Baras A, Abecasis G, Marchini J, Reid JG, Salerno W, Balasubramanian S. A deep catalog of protein-coding variation in 985,830 individuals. bioRxiv 2023:2023.05.09.539329. [PMID: 37214792 PMCID: PMC10197621 DOI: 10.1101/2023.05.09.539329] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this Exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. 
We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.
Affiliation(s)
- Siying Chen
- Regeneron Genetics Center, Tarrytown, NY, USA
- Suying Bao
- Regeneron Genetics Center, Tarrytown, NY, USA
- Adam Locke
- Regeneron Genetics Center, Tarrytown, NY, USA
- Jesus Alegre
- Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Jaime Berumen
- Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Roberto Tapia-Conyer
- Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Pablo Kuri-Morales
- Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Jason Torres
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Jonathan Emberson
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Rory Collins
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Mona Nafde
- Regeneron Genetics Center, Tarrytown, NY, USA
- Aris Baras
- Regeneron Genetics Center, Tarrytown, NY, USA
|
50
|
Cross NCP, Ernst T, Branford S, Cayuela JM, Deininger M, Fabarius A, Kim DDH, Machova Polakova K, Radich JP, Hehlmann R, Hochhaus A, Apperley JF, Soverini S. European LeukemiaNet laboratory recommendations for the diagnosis and management of chronic myeloid leukemia. Leukemia 2023; 37:2150-2167. [PMID: 37794101 PMCID: PMC10624636 DOI: 10.1038/s41375-023-02048-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 09/13/2023] [Accepted: 09/20/2023] [Indexed: 10/06/2023]
Abstract
From the laboratory perspective, effective management of patients with chronic myeloid leukemia (CML) requires accurate diagnosis, assessment of prognostic markers, sequential assessment of levels of residual disease and investigation of possible reasons for resistance, relapse or progression. Our scientific and clinical knowledge underpinning these requirements continues to evolve, as do laboratory methods and technologies. The European LeukemiaNet convened an expert panel to critically consider the current status of genetic laboratory approaches to help diagnose and manage CML patients. Our recommendations focus on current best practice and highlight the strengths and pitfalls of commonly used laboratory tests.
Affiliation(s)
- Thomas Ernst
- Klinik für Innere Medizin II, Universitätsklinikum Jena, Jena, Germany
- Susan Branford
- Centre for Cancer Biology and SA Pathology, Adelaide, SA, Australia
- Jean-Michel Cayuela
- Laboratory of Hematology, University Hospital Saint-Louis, AP-HP and EA3518, Université Paris Cité, Paris, France
- Alice Fabarius
- III. Medizinische Klinik, Medizinische Fakultät Mannheim, Universität Heidelberg, Mannheim, Germany
- Dennis Dong Hwan Kim
- Department of Medical Oncology and Hematology, Princess Margaret Cancer Centre, University Health Network, University of Toronto, Toronto, Canada
- Rüdiger Hehlmann
- III. Medizinische Klinik, Medizinische Fakultät Mannheim, Universität Heidelberg, Mannheim, Germany
- ELN Foundation, Weinheim, Germany
- Andreas Hochhaus
- Klinik für Innere Medizin II, Universitätsklinikum Jena, Jena, Germany
- Jane F Apperley
- Centre for Haematology, Imperial College London, London, UK
- Department of Clinical Haematology, Imperial College Healthcare NHS Trust, London, UK
- Simona Soverini
- Department of Medical and Surgical Sciences, Institute of Hematology "Lorenzo e Ariosto Seràgnoli", University of Bologna, Bologna, Italy
|