151
Abstract
Whole Exome Sequencing (WES) is used for querying DNA variants using the protein coding parts of genomes (exomes). However, WES analysis can be challenging because of the complexity of the data. Here, we describe a consolidated protocol for unbiased WES analysis. The protocol uses three variant callers (HaplotypeCaller, FreeBayes, and DeepVariant), which have different underlying models. We provide detailed execution steps, as well as basic variant filtering, annotation, visualization, and consolidation aspects.
- Protocol to enable whole exome data analysis in an unbiased approach
- A protocol for unbiased analysis using 3 variant callers with different underlying models
- From raw data to filtered, consolidated, and annotated DNA variant calls
Publisher’s note: Undertaking any experimental protocol requires adherence to local institutional guidelines for laboratory safety and ethics.
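The idea of consolidating calls from several variant callers can be illustrated with a minimal sketch. The abstract does not say how the protocol merges the three callsets, so the majority-vote rule, the function name `consolidate_calls`, and the `min_callers` threshold below are all assumptions made for illustration, not the authors' method. Variants are represented as plain tuples rather than parsed from VCF files.

```python
# Hypothetical sketch: keep variants supported by at least `min_callers` callers.
# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def consolidate_calls(
    callsets: dict[str, set[Variant]], min_callers: int = 2
) -> dict[Variant, list[str]]:
    """Map each sufficiently supported variant to the sorted list of callers reporting it."""
    support: dict[Variant, list[str]] = {}
    for caller, variants in callsets.items():
        for variant in variants:
            support.setdefault(variant, []).append(caller)
    return {
        variant: sorted(callers)
        for variant, callers in support.items()
        if len(callers) >= min_callers
    }


if __name__ == "__main__":
    # Toy callsets; the coordinates and alleles are made up for the example.
    calls = {
        "HaplotypeCaller": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")},
        "FreeBayes": {("chr1", 100, "A", "G")},
        "DeepVariant": {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")},
    }
    for variant, callers in consolidate_calls(calls).items():
        print(variant, callers)
```

Raising `min_callers` to 3 would keep only variants on which all callers agree, trading sensitivity for precision; a real pipeline would also need to normalize variant representation before comparing callsets.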
152
Vrbová E, Noda AA, Grillová L, Rodríguez I, Forsyth A, Oppelt J, Šmajs D. Whole genome sequences of Treponema pallidum subsp. endemicum isolated from Cuban patients: The non-clonal character of isolates suggests a persistent human infection rather than a single outbreak. PLoS Negl Trop Dis 2022; 16:e0009900. [PMID: 35687593 PMCID: PMC9223347 DOI: 10.1371/journal.pntd.0009900] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/13/2021] [Revised: 06/23/2022] [Accepted: 04/21/2022] [Indexed: 11/18/2022]
Abstract
Bejel (endemic syphilis) is a neglected non-venereal disease caused by Treponema pallidum subsp. endemicum (TEN). Although it is mostly present in hot, dry climates, a few cases have been found outside of these areas. The aim of this work was the sequencing and analysis of TEN isolates obtained from “syphilis patients” in Cuba, which is not considered an endemic area for bejel. Genomes were obtained by pool segment genome sequencing or direct sequencing methods, and the bioinformatics analysis was performed according to an established pipeline. We obtained four genomes with 100%, 81.7%, 52.6%, and 21.1% breadth of coverage, respectively. The sequenced genomes revealed a non-clonal character, with nucleotide variability ranging from 0.2 to 10.3 nucleotide substitutions per 100 kbp among the TEN isolates. Nucleotide changes affected 27 genes, and the analysis of the completely sequenced genome also showed a recombination event between tprC and tprI, in TP0488 as well as in the intergenic region between TP0127–TP0129. Despite limitations in the quality of samples affecting breadth of sequencing coverage, the determined non-clonal character of the isolates suggests a persistent infection in the Cuban population rather than a single outbreak caused by an imported case.
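Breadth of coverage, as reported above, is conventionally the percentage of reference positions covered by sequencing reads. The following is a minimal sketch of that calculation, assuming per-position read depths are already available as a list; the function name and the `min_depth` threshold of 1 are assumptions, since the abstract does not state which depth cutoff the authors used.

```python
def breadth_of_coverage(depths: list[int], min_depth: int = 1) -> float:
    """Return the percentage of positions whose read depth is at least `min_depth`."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return 100.0 * covered / len(depths)


if __name__ == "__main__":
    # Toy per-position depths for an 8-position reference (made-up values).
    toy_depths = [0, 3, 5, 0, 12, 1, 0, 7]
    print(breadth_of_coverage(toy_depths))
    print(breadth_of_coverage(toy_depths, min_depth=5))
```

A stricter `min_depth` lowers the reported breadth, so values are only comparable between samples when the same cutoff is used.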
Affiliation(s)
- Eliška Vrbová
  - Department of Biology, Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Angel A. Noda
  - Department of Mycology-Bacteriology, Institute of Tropical Medicine “Pedro Kourí”, Havana, Cuba
- Linda Grillová
  - Department of Biology, Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Islay Rodríguez
  - Department of Mycology-Bacteriology, Institute of Tropical Medicine “Pedro Kourí”, Havana, Cuba
- Allyn Forsyth
  - GeneticPrime Dx, Inc., La Jolla, California, United States of America
  - San Diego State University, San Diego, California, United States of America
- Jan Oppelt
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, United States of America
- David Šmajs
  - Department of Biology, Faculty of Medicine, Masaryk University, Brno, Czech Republic
153
A complete, telomere-to-telomere human genome sequence presents new opportunities for evolutionary genomics. Nat Methods 2022; 19:635-638. [PMID: 35689027 DOI: 10.1038/s41592-022-01512-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Indexed: 11/09/2022]
154
Dong W, Wong KHY, Liu Y, Levy-Sakin M, Hung WC, Li M, Li B, Jin SC, Choi J, Lopez-Giraldez F, Vaka D, Poon A, Chu C, Lao R, Balamir M, Movsesyan I, Malloy MJ, Zhao H, Kwok PY, Kane JP, Lifton RP, Pullinger CR. Whole-exome sequencing reveals damaging gene variants associated with hypoalphalipoproteinemia. J Lipid Res 2022; 63:100209. [PMID: 35460704 PMCID: PMC9126845 DOI: 10.1016/j.jlr.2022.100209] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2021] [Revised: 04/12/2022] [Accepted: 04/13/2022] [Indexed: 12/02/2022]
Abstract
Low levels of high density lipoprotein-cholesterol (HDL-C) are associated with an elevated risk of arteriosclerotic coronary heart disease. Heritability of HDL-C levels is high. In this research discovery study, we used whole-exome sequencing to identify damaging gene variants that may play significant roles in determining HDL-C levels. We studied 204 individuals with a mean HDL-C level of 27.8 ± 6.4 mg/dl (range: 4-36 mg/dl). Data were analyzed by statistical gene burden testing and by filtering against candidate gene lists. We found 120 occurrences of probably damaging variants (116 heterozygous; four homozygous) among 45 of 104 recognized HDL candidate genes. Those with the highest prevalence of damaging variants were ABCA1 (n = 20), STAB1 (n = 9), OSBPL1A (n = 8), CPS1 (n = 8), CD36 (n = 7), LRP1 (n = 6), ABCA8 (n = 6), GOT2 (n = 5), AMPD3 (n = 5), WWOX (n = 4), and IRS1 (n = 4). Binomial analysis for damaging missense or loss-of-function variants identified the ABCA1 and LDLR genes at genome-wide significance. In conclusion, whole-exome sequencing of individuals with low HDL-C showed the burden of damaging rare variants in the ABCA1 and LDLR genes is particularly high and revealed numerous occurrences in HDL candidate genes, including many genes identified in genome-wide association study reports. Many of these genes are involved in cancer biology, which accords with epidemiologic findings of the association of HDL deficiency with increased risk of cancer, thus presenting a new area of interest in HDL genomics.
Affiliation(s)
- Weilai Dong
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Karen H Y Wong
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA
- Youbin Liu
  - Department of Cardiology, The Guangzhou Eighth People's Hospital, Guangzhou Medical University, Guangzhou, China
- Michal Levy-Sakin
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA
- Wei-Chien Hung
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Mo Li
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Boyang Li
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Sheng Chih Jin
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Jungmin Choi
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; Department of Biomedical Sciences, Korea University College of Medicine, Seoul, Korea
- Dedeepya Vaka
  - Institute for Human Genetics, University of California, San Francisco, CA, USA
- Annie Poon
  - Institute for Human Genetics, University of California, San Francisco, CA, USA
- Catherine Chu
  - Institute for Human Genetics, University of California, San Francisco, CA, USA
- Richard Lao
  - Institute for Human Genetics, University of California, San Francisco, CA, USA
- Melek Balamir
  - Department of Internal Medicine, Istanbul University, Istanbul, Turkey
- Irina Movsesyan
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA
- Mary J Malloy
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Pediatrics, University of California, San Francisco, CA, USA
- Hongyu Zhao
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Pui-Yan Kwok
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Dermatology, University of California, San Francisco, CA, USA
- John P Kane
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA
- Richard P Lifton
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Clive R Pullinger
  - Cardiovascular Research Institute, University of California, San Francisco, CA, USA; Physiological Nursing, University of California, San Francisco, CA, USA.
155
Feng YCA, Stanaway IB, Connolly JJ, Denny JC, Luo Y, Weng C, Wei WQ, Weiss ST, Karlson EW, Smoller JW. Psychiatric manifestations of rare variation in medically actionable genes: a PheWAS approach. BMC Genomics 2022; 23:385. [PMID: 35590255 PMCID: PMC9121574 DOI: 10.1186/s12864-022-08600-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/21/2021] [Accepted: 04/22/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: As genomic sequencing moves closer to clinical implementation, there has been an increasing acceptance of returning incidental findings to research participants and patients for mutations in highly penetrant, medically actionable genes. A curated list of genes has been recommended by the American College of Medical Genetics and Genomics (ACMG) for return of incidental findings. However, the pleiotropic effects of these genes are not fully known. Such effects could complicate genetic counseling when returning incidental findings. In particular, there has been no systematic evaluation of psychiatric manifestations associated with rare variation in these genes.
RESULTS: Here, we leveraged a targeted sequence panel and real-world electronic health records from the eMERGE network to assess the burden of rare variation in the ACMG-56 genes and two psychiatric-associated genes (CACNA1C and TCF4) across common mental health conditions in 15,181 individuals of European descent. As a positive control, we showed that this approach replicated the established association between rare mutations in LDLR and hypercholesterolemia with no visible inflation from population stratification. However, we did not identify any genes significantly enriched with rare deleterious variants that confer risk for common psychiatric disorders after correction for multiple testing. Suggestive associations were observed between depression and rare coding variation in PTEN (P = 1.5 × 10⁻⁴), LDLR (P = 3.6 × 10⁻⁴), and CACNA1S (P = 5.8 × 10⁻⁴). We also observed nominal associations between rare variants in KCNQ1 and substance use disorders (P = 2.4 × 10⁻⁴), and APOB and tobacco use disorder (P = 1.1 × 10⁻³).
CONCLUSIONS: Our results do not support an association between psychiatric disorders and incidental findings in medically actionable gene mutations, but power was limited with the available sample sizes. Given the phenotypic and genetic complexity of psychiatric phenotypes, future work will require a much larger sequencing dataset to determine whether incidental findings in these genes have implications for risk of psychopathology.
Affiliation(s)
- Yen-Chen A Feng
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
  - Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, USA.
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan.
  - Master of Public Health Program, National Taiwan University, Taipei, Taiwan.
- Ian B Stanaway
  - Division of Nephrology, School of Medicine, Kidney Research Institute, University of Washington, Seattle, WA, USA
- John J Connolly
  - Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - All of Us Research Program, National Institutes of Health, Bethesda, MD, USA
- Yuan Luo
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Scott T Weiss
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Elizabeth W Karlson
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Boston, MA, USA
- Jordan W Smoller
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
  - Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, USA.
  - Center for Precision Psychiatry, Massachusetts General Hospital, Boston, MA, USA.
156
Contribution of rare whole-genome sequencing variants to plasma protein levels and the missing heritability. Nat Commun 2022; 13:2532. [PMID: 35534486 PMCID: PMC9085767 DOI: 10.1038/s41467-022-30208-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/15/2021] [Accepted: 04/21/2022] [Indexed: 12/03/2022]
Abstract
Despite the success of genome-wide association studies, much of the genetic contribution to complex traits remains unexplained. Here, we analyse high coverage whole-genome sequencing data to evaluate the contribution of rare genetic variants to 414 plasma proteins. The frequency distribution of genetic variants is skewed towards the rare spectrum, and damaging variants are more often rare. We estimate that less than 4.3% of the narrow-sense heritability is expected to be explained by rare variants in our cohort. Using a gene-based approach, we identify cis-associations for 237 of the proteins, slightly more than with a GWAS (N = 213), and we identify 34 associated loci in trans. Several associations are driven by rare variants, which have larger effects, on average. We therefore conclude that rare variants could be of importance for precision medicine applications, but have a more limited contribution to the missing heritability of complex diseases.
Here, the authors identify effects by rare variants on plasma proteins, and estimate the contribution of rare variants to the heritability.
157
Abstract
Distilling biologically meaningful information from cancer genome sequencing data requires comprehensive identification of somatic alterations using rigorous computational methods. As the amount and complexity of sequencing data have increased, so has the number of tools for analysing them. Here, we describe the main steps involved in the bioinformatic analysis of cancer genomes, review key algorithmic developments and highlight popular tools and emerging technologies. These tools include those that identify point mutations, copy number alterations, structural variations and mutational signatures in cancer genomes. We also discuss issues in experimental design, the strengths and limitations of sequencing modalities and methodological challenges for the future.
158
Estimating bonobo (Pan paniscus) and chimpanzee (Pan troglodytes) evolutionary history from nucleotide site patterns. Proc Natl Acad Sci U S A 2022; 119:e2200858119. [PMID: 35452306 PMCID: PMC9170072 DOI: 10.1073/pnas.2200858119] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
There is genomic evidence of widespread admixture in deep time between many closely related species, including humans. Our closest living relatives, bonobos and chimpanzees, may also exhibit such patterns. However, assessing the exact degree of interbreeding remains challenging because previous studies have resulted in multiple inconsistent demographic models. We use an approach that addresses these gaps by analyzing all lineages, simultaneously estimating parameters, and comparing previously proposed models. We find evidence of considerable introgression from western into eastern chimpanzees. We also show more breeding females than males and evidence of male-biased dispersal in western chimpanzees. These findings highlight the extent of admixture in bonobo and chimpanzee evolutionary history and are consistent with substantial differences between past and present chimpanzee biogeography.
Admixture appears increasingly ubiquitous in the evolutionary history of various taxa, including humans. Such gene flow likely also occurred among our closest living relatives: bonobos (Pan paniscus) and chimpanzees (Pan troglodytes). However, our understanding of their evolutionary history has been limited by studies that do not consider all Pan lineages or do not analyze all lineages simultaneously, resulting in conflicting demographic models. Here, we investigate this gap in knowledge using nucleotide site patterns calculated from whole-genome sequences from the autosomes of 71 bonobos and chimpanzees, representing all five extant Pan lineages. We estimated demographic parameters and compared all previously proposed demographic models for this clade. We further considered sex bias in Pan evolutionary history by analyzing the site patterns from the X chromosome. We show that 1) 21% of autosomal DNA in eastern chimpanzees derives from western chimpanzee introgression and that 2) all four chimpanzee lineages share a common ancestor about 987,000 y ago, much earlier than previous estimates. In addition, we suggest that 3) there was male reproductive skew throughout Pan evolutionary history and find evidence of 4) male-biased dispersal from western to eastern chimpanzees. Collectively, these results offer insight into bonobo and chimpanzee evolutionary history and suggest considerable differences between current and historic chimpanzee biogeography.
159
Ng AWT, Contino G, Killcoyne S, Devonshire G, Hsu R, Abbas S, Su J, Redmond AM, Weaver JMJ, Eldridge MD, Tavaré S, Edwards PAW, Fitzgerald RC. Rearrangement processes and structural variations show evidence of selection in oesophageal adenocarcinomas. Commun Biol 2022; 5:335. [PMID: 35396535 PMCID: PMC8993906 DOI: 10.1038/s42003-022-03238-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 02/25/2022] [Indexed: 11/26/2022]
Abstract
Oesophageal adenocarcinoma (OAC) provides an ideal case study to characterize large-scale rearrangements. Using whole genome short-read sequencing of 383 cases, of which 214 had matched whole transcriptomes, we observed structural variations (SV) with a predominance of deletions, tandem duplications and inter-chromosome junctions that could be identified as LINE-1 mobile element (ME) insertions. Complex clusters of rearrangements resembling breakage-fusion-bridge cycles or extrachromosomal circular DNA accounted for 22% of complex SVs affecting known oncogenes. Counting SV events affecting known driver genes substantially increased the recurrence rates of these drivers. After excluding fragile sites, we identified 51 candidate new drivers in genomic regions disrupted by SVs, including ETV5, KAT6B and CLTC. RUNX1 was the most recurrently altered gene (24%), with many deletions inactivating the RUNT domain but preserving the reading frame, suggesting an altered protein product. These findings underscore the importance of identifying SV events in OAC, with implications for targeted therapies.
Affiliation(s)
- Alvin Wei Tian Ng
  - Medical Research Council Cancer Unit, Hutchison/Medical Research Council Research Centre, University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Gianmarco Contino
  - Institute of Cancer and Genomic Sciences, College of Medical & Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2GW, UK
- Sarah Killcoyne
  - Medical Research Council Cancer Unit, Hutchison/Medical Research Council Research Centre, University of Cambridge, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute EMBL-EBI, Hinxton, UK
- Ginny Devonshire
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Ray Hsu
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Sujath Abbas
  - Medical Research Council Cancer Unit, Hutchison/Medical Research Council Research Centre, University of Cambridge, Cambridge, UK
- Jing Su
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Aisling M Redmond
  - Medical Research Council Cancer Unit, Hutchison/Medical Research Council Research Centre, University of Cambridge, Cambridge, UK
- Jamie M J Weaver
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
  - Department of Medical Oncology, The Christie NHS Foundation Trust, Manchester, UK
  - Department of Pathology, University of Cambridge, Cambridge, UK
- Matthew D Eldridge
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Simon Tavaré
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
  - Irving Institute for Cancer Dynamics, Columbia University, New York, USA
  - Department of Statistics, Columbia University, New York, USA
  - Department of Biological Sciences, Columbia University, New York, USA
- Paul A W Edwards
  - Medical Research Council Cancer Unit, Hutchison/Medical Research Council Research Centre, University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
  - Department of Pathology, University of Cambridge, Cambridge, UK
- Rebecca C Fitzgerald
  - Medical Research Council Cancer Unit, Hutchison/Medical Research Council Research Centre, University of Cambridge, Cambridge, UK.
160
Jiang Y, Hu X, Yuan Y, Guo X, Chase MW, Ge S, Li J, Fu J, Li K, Hao M, Wang Y, Jiao Y, Jiang W, Jin X. The Gastrodia menghaiensis (Orchidaceae) genome provides new insights of orchid mycorrhizal interactions. BMC Plant Biology 2022; 22:179. [PMID: 35392808 PMCID: PMC8988336 DOI: 10.1186/s12870-022-03573-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/13/2022] [Accepted: 04/01/2022] [Indexed: 06/01/2023]
Abstract
BACKGROUND: To illustrate the molecular mechanism of mycoheterotrophic interactions between orchids and fungi, we assembled a chromosome-level reference genome of Gastrodia menghaiensis (Orchidaceae) and analyzed the genomes of two species of Gastrodia.
RESULTS: Our analyses indicated that the genomes of Gastrodia are globally diminished in comparison to autotrophic orchids, even compared to Cuscuta (a plant parasite). Genes involved in arbuscular mycorrhizae colonization were found in genomes of Gastrodia, and many of the genes involved in biological interactions between Gastrodia and symbiotic microbionts are more numerous than in photosynthetic orchids. The highly expressed genes for fatty acid and ammonium root transporters suggest that fungi receive material from orchids, although most raw materials flow from the fungi. Many nuclear genes (e.g. biosynthesis of aromatic amino acid L-tryptophan) supporting plastid functions are expanded compared to photosynthetic orchids, an indication of the importance of plastids even in totally mycoheterotrophic species.
CONCLUSION: Gastrodia menghaiensis has the smallest proteome thus far among angiosperms. Many of the genes involved in biological interactions between Gastrodia and symbiotic microbionts are more numerous than in photosynthetic orchids.
Affiliation(s)
- Yan Jiang
  - Institute of Botany, Chinese Academy of Sciences, Xiangshan, Haidian, Beijing, 100093, China
- Xiaodi Hu
  - Novogene Bioinformatics Institute, Beijing, 100083, China
- Yuan Yuan
  - National Resource Center for Chinese Materia Medica, Chinese Academy of Chinese Medical Sciences, Chaoyang, Beijing, 100700, China
- Xuelian Guo
  - Institute of Botany, Chinese Academy of Sciences, Xiangshan, Haidian, Beijing, 100093, China
- Mark W Chase
  - Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, TW9 3DS, Surrey, UK
  - Department of Environment and Agriculture, Curtin University, Perth, WA, Australia
- Song Ge
  - Institute of Botany, Chinese Academy of Sciences, Xiangshan, Haidian, Beijing, 100093, China
- Jianwu Li
  - Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan, China
- Jinlong Fu
  - Novogene Bioinformatics Institute, Beijing, 100083, China
- Kui Li
  - Novogene Bioinformatics Institute, Beijing, 100083, China
- Meng Hao
  - Novogene Bioinformatics Institute, Beijing, 100083, China
- Yiming Wang
  - Novogene Bioinformatics Institute, Beijing, 100083, China
- Yuannian Jiao
  - Institute of Botany, Chinese Academy of Sciences, Xiangshan, Haidian, Beijing, 100093, China
- Wenkai Jiang
  - Novogene Bioinformatics Institute, Beijing, 100083, China
- Xiaohua Jin
  - Institute of Botany, Chinese Academy of Sciences, Xiangshan, Haidian, Beijing, 100093, China.
161
Noyes MD, Harvey WT, Porubsky D, Sulovari A, Li R, Rose NR, Audano PA, Munson KM, Lewis AP, Hoekzema K, Mantere T, Graves-Lindsay TA, Sanders AD, Goodwin S, Kramer M, Mokrab Y, Zody MC, Hoischen A, Korbel JO, McCombie WR, Eichler EE. Familial long-read sequencing increases yield of de novo mutations. Am J Hum Genet 2022; 109:631-646. [PMID: 35290762 PMCID: PMC9069071 DOI: 10.1016/j.ajhg.2022.02.014] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 09/29/2021] [Accepted: 02/16/2022] [Indexed: 12/11/2022]
Abstract
Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.
Collapse
Affiliation(s)
- Michelle D Noyes
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Ruiyang Li
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Nicholas R Rose
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Peter A Audano
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Tuomo Mantere
  - Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands
  - Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit and Biocenter Oulu, University of Oulu, 90220 Oulu, Finland
- Ashley D Sanders
  - European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- Sara Goodwin
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Melissa Kramer
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Younes Mokrab
  - Department of Human Genetics, Sidra Medicine, PO Box 26999, Doha, Qatar
  - Weill Cornell Medicine, PO Box 24144, Doha, Qatar
  - College of Health and Life Sciences, Hamad Bin Khalifa University, PO Box 34110, Doha, Qatar
- Alexander Hoischen
  - Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands
  - Radboud Institute of Medical Life Sciences and Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, 6500 Nijmegen, the Netherlands
- Jan O Korbel
  - European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- W Richard McCombie
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
162
An Amish founder population reveals rare-population genetic determinants of the human lipidome. Commun Biol 2022; 5:334. [PMID: 35393526 PMCID: PMC8989972 DOI: 10.1038/s42003-022-03291-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Accepted: 03/17/2022] [Indexed: 12/02/2022] Open
Abstract
Identifying the genetic determinants of inter-individual variation in lipid species (the lipidome) may provide deeper insight into the mechanistic effect of complex lipidomic pathways on cardiovascular disease (CVD) risk and progression, beyond simple traditional lipids. Previous studies have been largely population based and thus only powered to discover associations with common genetic variants. Founder populations represent a powerful resource to accelerate discovery of previously unknown biology associated with rare population alleles that have risen to higher frequency due to genetic drift. We performed a genome-wide association scan of 355 lipid species, including 127 not previously tested, in 650 individuals from the Amish founder population. To the best of our knowledge, we report for the first time the lipid species associated with two rare-population but Amish-enriched lipid variants: APOB_rs5742904 and APOC3_rs76353203. We also identified novel associations of 3 rare-population, Amish-enriched loci with several sphingolipids, with a proposed potential functional/causal variant in each locus: GLTPD2_rs536055318, CERS5_rs771033566, and AKNA_rs531892793. We replicated 7 previously known common loci, including novel associations with two sterols: androstenediol with the UGT locus and estriol with the SLC22A8/A24 locus. Our results show the double power of founder populations and a detailed lipidome to discover novel trait-associated variants. A GWAS of 355 lipid species in the Old Order Amish founder population reveals associations between Amish-enriched loci and several sphingolipids.
163
Yi X, Liu J, Chen S, Wu H, Liu M, Xu Q, Lei L, Lee S, Zhang B, Kudrna D, Fan W, Wing RA, Wang X, Zhang M, Zhang J, Yang C, Chen N. Genome assembly of the JD17 soybean provides a new reference genome for comparative genomics. G3 (BETHESDA, MD.) 2022; 12:jkac017. [PMID: 35188189 PMCID: PMC8982393 DOI: 10.1093/g3journal/jkac017] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/11/2021] [Accepted: 01/11/2022] [Indexed: 12/30/2022]
Abstract
Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of the elite soybean cultivar Jidou 17 (JD17) with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis of the JD17 reference genome against 3 published soybean genomes (WM82, ZH13, and W05), which identified 5 large inversions and 2 large translocations specific to JD17, as well as 20,984-46,912 presence-absence variations spanning 13.1-46.9 Mb in size. A total of 1,695,741-3,664,629 SNPs and 446,689-800,489 indels were identified and annotated between JD17 and these three genomes. Symbiotic nitrogen fixation genes were identified and the effects of these variants on them were further evaluated. The coding sequences of 9 nitrogen fixation-related genes were found to be greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.
Affiliation(s)
- Xinxin Yi
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
- Jing Liu
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
  - State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng 475004, Henan, China
- Shengcai Chen
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
  - State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng 475004, Henan, China
- Hao Wu
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
- Min Liu
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
- Qing Xu
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
- Lingshan Lei
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
- Seunghee Lee
  - Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Bao Zhang
  - State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng 475004, Henan, China
- Dave Kudrna
  - Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Wei Fan
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
  - State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng 475004, Henan, China
- Rod A Wing
  - Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Xuelu Wang
  - State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng 475004, Henan, China
- Mengchen Zhang
  - Institute of Food and Oil Crops, Hebei Academy of Agricultural and Forestry Sciences, Shijiazhuang 050031, Hebei, China
- Jianwei Zhang
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
  - Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Chunyan Yang
  - Institute of Food and Oil Crops, Hebei Academy of Agricultural and Forestry Sciences, Shijiazhuang 050031, Hebei, China
- Nansheng Chen
  - National Key Laboratory of Crop Genetic Improvement, Center of Integrative Biology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, Hubei, China
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
164
Ozerov M, Noreikiene K, Kahar S, Huss M, Huusko A, Kõiv T, Sepp M, López M, Gårdmark A, Gross R, Vasemägi A. Whole-genome sequencing illuminates multifaceted targets of selection to humic substances in Eurasian perch. Mol Ecol 2022; 31:2367-2383. [PMID: 35202502 PMCID: PMC9314028 DOI: 10.1111/mec.16409] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Revised: 02/10/2022] [Accepted: 02/17/2022] [Indexed: 11/30/2022]
Abstract
Extreme environments are inhospitable to the majority of species, but some organisms are able to survive in such hostile conditions due to evolutionary adaptations. For example, modern bony fishes have colonized various aquatic environments, including perpetually dark, hypoxic, hypersaline and toxic habitats. Eurasian perch (Perca fluviatilis) is among the few fish species of northern latitudes that is able to live in very acidic humic lakes. Such lakes represent almost "nocturnal" environments; they contain high levels of dissolved organic matter, which in addition to creating a challenging visual environment, also affects a large number of other habitat parameters and biotic interactions. To reveal the genomic targets of humic-associated selection, we performed whole-genome sequencing of perch originating from 16 humic and 16 clear-water lakes in northern Europe. We identified over 800,000 single nucleotide polymorphisms, of which >10,000 were identified as potential candidates under selection (associated with >3000 genes) using multiple outlier approaches. Our findings suggest that adaptation to the humic environment may involve hundreds of regions scattered across the genome. Putative signals of adaptation were detected in genes and gene families with diverse functions, including organism development and ion transportation. The observed excess of variants under selection in regulatory regions highlights the importance of adaptive evolution via regulatory elements, rather than via protein sequence modification. Our study demonstrates the power of whole-genome analysis to illuminate the multifaceted nature of humic adaptation and provides the foundation for further investigation of causal mutations underlying phenotypic traits of ecological and evolutionary importance.
Affiliation(s)
- Mikhail Ozerov
  - Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, Sweden
  - Department of Biology, University of Turku, Turku, Finland
  - Biodiversity Unit, University of Turku, Turku, Finland
- Kristina Noreikiene
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Siim Kahar
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Magnus Huss
  - Department of Aquatic Resources, Swedish University of Agricultural Sciences, Öregrund, Sweden
- Ari Huusko
  - Natural Resources Institute Finland (Luke), Paltamo, Finland
- Toomas Kõiv
  - Chair of Hydrobiology and Fishery, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Margot Sepp
  - Chair of Hydrobiology and Fishery, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Tartu, Estonia
- María-Eugenia López
  - Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, Sweden
- Anna Gårdmark
  - Department of Aquatic Resources, Swedish University of Agricultural Sciences, Öregrund, Sweden
- Riho Gross
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
- Anti Vasemägi
  - Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, Sweden
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, Estonia
165
Marin M, Vargas R, Harris M, Jeffrey B, Epperson LE, Durbin D, Strong M, Salfinger M, Iqbal Z, Akhundova I, Vashakidze S, Crudu V, Rosenthal A, Farhat MR. Benchmarking the empirical accuracy of short-read sequencing across the M. tuberculosis genome. Bioinformatics 2022; 38:1781-1787. [PMID: 35020793 PMCID: PMC8963317 DOI: 10.1093/bioinformatics/btac023] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/09/2021] [Revised: 12/23/2021] [Accepted: 01/07/2022] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION: Short-read whole-genome sequencing (WGS) is a vital tool for clinical applications and basic research. Genetic divergence from the reference genome, repetitive sequences and sequencing bias reduce the performance of variant calling using short-read alignment, but the loss in recall and specificity has not been adequately characterized. To benchmark short-read variant calling, we used 36 diverse clinical Mycobacterium tuberculosis (Mtb) isolates dually sequenced with Illumina short-reads and PacBio long-reads. We systematically studied the short-read variant calling accuracy and the influence of sequence uniqueness, reference bias and GC content.
RESULTS: Reference-based Illumina variant calling demonstrated a maximum recall of 89.0% and minimum precision of 98.5% across parameters evaluated. The approach that maximized variant recall while still maintaining high precision (>99%) was tuning the mapping quality filtering threshold, i.e. confidence of the read mapping (recall = 85.8%, precision = 99.1%, MQ ≥ 40). Additional masking of repetitive sequence content is an alternative conservative approach to variant calling that increases precision at a cost to recall (recall = 70.2%, precision = 99.6%, MQ ≥ 40). Of the genomic positions typically excluded for Mtb, 68% are accurately called using Illumina WGS including 52/168 PE/PPE genes (34.5%). From these results, we present a refined list of low confidence regions across the Mtb genome, which we found to frequently overlap with regions with structural variation, low sequence uniqueness and low sequencing coverage. Our benchmarking results have broad implications for the use of WGS in the study of Mtb biology, inference of transmission in public health surveillance systems and more generally for WGS applications in other organisms.
AVAILABILITY AND IMPLEMENTATION: All relevant code is available at https://github.com/farhat-lab/mtb-illumina-wgs-evaluation.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
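The recall and precision figures quoted in this abstract can be read as set comparisons between a short-read callset and a truth set. The following is only a minimal sketch of that arithmetic, not the authors' benchmarking code (which is in the repository linked above). The function name `precision_recall`, the `(position, ref, alt)` variant encoding and the toy variants are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical encoding of a variant: (position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) of a callset against a truth set."""
    true_pos = len(called & truth)    # called and present in the truth set
    false_pos = len(called - truth)   # called but absent from the truth set
    false_neg = len(truth - called)   # in the truth set but missed by the caller
    precision = true_pos / (true_pos + false_pos) if called else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    return precision, recall


# Toy data (invented for illustration): 3 of 4 true variants are recovered,
# and 1 of 4 calls is spurious.
truth_set = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (900, "T", "C")}
call_set = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (555, "A", "C")}

precision, recall = precision_recall(call_set, truth_set)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Raising a filter such as the mapping quality threshold typically removes calls from `call_set`, which tends to trade recall for precision, as the reported figures illustrate.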
Affiliation(s)
- Maximillian Marin
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Roger Vargas
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Michael Harris
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20894, USA
- Brendan Jeffrey
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20894, USA
- L Elaine Epperson
  - Center for Genes, Environment, and Health, National Jewish Health, Denver, CO 80206, USA
- David Durbin
  - Mycobacteriology Reference Laboratory, Advanced Diagnostic Laboratories, National Jewish Health, Denver, CO 80206, USA
- Michael Strong
  - Center for Genes, Environment, and Health, National Jewish Health, Denver, CO 80206, USA
- Max Salfinger
  - College of Public Health and Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Zamin Iqbal
  - EMBL-EBI, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Irada Akhundova
  - Scientific Research Institute of Lung Diseases, Ministry of Health, Baku AZ1014, Azerbaijan
- Sergo Vashakidze
  - Department of Medicine, The University of Georgia, Tbilisi 0171, Georgia
  - National Center for Tuberculosis and Lung Diseases, Ministry of Health, Tbilisi 0171, Georgia
- Valeriu Crudu
  - Phthisiopneumology Institute, Ministry of Health, Chisinau 2025, Republic of Moldova
- Alex Rosenthal
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20894, USA
- Maha Reda Farhat
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
  - Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
166
Giacopuzzi E, Popitsch N, Taylor JC. GREEN-DB: a framework for the annotation and prioritization of non-coding regulatory variants from whole-genome sequencing data. Nucleic Acids Res 2022; 50:2522-2535. [PMID: 35234913 PMCID: PMC8934622 DOI: 10.1093/nar/gkac130] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/26/2021] [Revised: 02/02/2022] [Accepted: 02/14/2022] [Indexed: 11/25/2022] Open
Abstract
Non-coding variants have long been recognized as important contributors to common disease risks, but with the expansion of clinical whole genome sequencing, examples of rare, high-impact non-coding variants are also accumulating. Despite recent advances in the study of regulatory elements and the availability of specialized data collections, the systematic annotation of non-coding variants from genome sequencing remains challenging. Here, we propose a new framework for the prioritization of non-coding regulatory variants that integrates information about regulatory regions with prediction scores and HPO-based prioritization. Firstly, we created a comprehensive collection of annotations for regulatory regions including a database of 2.4 million regulatory elements (GREEN-DB) annotated with controlled gene(s), tissue(s) and associated phenotype(s) where available. Secondly, we calculated a variation constraint metric and showed that constrained regulatory regions associate with disease-associated genes and essential genes from mouse knock-outs. Thirdly, we compared 19 non-coding impact prediction scores providing suggestions for variant prioritization. Finally, we developed a VCF annotation tool (GREEN-VARAN) that can integrate all these elements to annotate variants for their potential regulatory impact. In our evaluation, we show that GREEN-DB can capture previously published disease-associated non-coding variants as well as identify additional candidate disease genes in trio analyses.
Affiliation(s)
- Edoardo Giacopuzzi
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - National Institute for Health Research Oxford Biomedical Research Centre, Oxford OX4 2PG, UK
- Niko Popitsch
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Max Perutz Labs, University of Vienna, Dr. Bohr-Gasse 9, 1030 Vienna, Austria
- Jenny C Taylor
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - National Institute for Health Research Oxford Biomedical Research Centre, Oxford OX4 2PG, UK
167
Niu Y, Teng X, Zhou H, Shi Y, Li Y, Tang Y, Zhang P, Luo H, Kang Q, Xu T, He S. Characterizing mobile element insertions in 5675 genomes. Nucleic Acids Res 2022; 50:2493-2508. [PMID: 35212372 PMCID: PMC8934628 DOI: 10.1093/nar/gkac128] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/26/2021] [Revised: 02/07/2022] [Accepted: 02/11/2022] [Indexed: 12/30/2022] Open
Abstract
Mobile element insertions (MEIs) are a major class of structural variants (SVs) and have been linked to many human genetic disorders, including hemophilia, neurofibromatosis, and various cancers. However, human MEI resources from large-scale genome sequencing are still lacking compared to those for SNPs and SVs. Here, we report a comprehensive map of 36 699 non-reference MEIs constructed from 5675 genomes, comprising 2998 Chinese samples (∼26.2×, NyuWa) and 2677 samples from the 1000 Genomes Project (∼7.4×, 1KGP). We discovered that LINE-1 insertions were highly enriched in centromere regions, implying the role of chromosome context in retroelement insertion. After functional annotation, we estimated that MEIs are responsible for about 9.3% of all protein-truncating events per genome. Finally, we built a companion database named HMEID for public use. This resource represents the latest and largest genomewide study on MEIs and will have broad utility for exploration of human MEI findings.
Affiliation(s)
- Yiwei Niu
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Xueyi Teng
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Honghong Zhou
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yirong Shi
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yanyan Li
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yiheng Tang
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Peng Zhang
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Huaxia Luo
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Quan Kang
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Tao Xu
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Shunmin He
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
168
Assessment of linkage disequilibrium patterns between structural variants and single nucleotide polymorphisms in three commercial chicken populations. BMC Genomics 2022; 23:193. [PMID: 35264116 PMCID: PMC8908679 DOI: 10.1186/s12864-022-08418-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/31/2021] [Accepted: 02/24/2022] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND: Structural variants (SVs) are causative for some prominent phenotypic traits of livestock, such as different comb types in chickens or color patterns in pigs. Their effects on production traits are also increasingly studied. Nevertheless, accurately calling SVs remains challenging. It is therefore of interest whether nearby single nucleotide polymorphisms (SNPs) are in strong linkage disequilibrium (LD) with SVs and can serve as markers. The literature comes to different conclusions on whether SVs are in LD with SNPs at the same level as SNPs are with other SNPs. The present study aimed to generate a precise SV callset from whole-genome short-read sequencing (WGS) data for three commercial chicken populations and to evaluate LD patterns between the called SVs and surrounding SNPs. It is thereby the first study to assess LD between SVs and SNPs in chickens.
RESULTS: The final callset consisted of 12,294,329 biallelic SNPs, 4,301 deletions (DEL), 224 duplications (DUP), 218 inversions (INV) and 117 translocation breakpoints (BND). While average LD between DELs and SNPs was at the same level as between SNPs and SNPs, LD between other SVs and SNPs was strongly reduced (DUP: 40%, INV: 27%, BND: 19% of between-SNP LD). A main factor in the reduced LD was the presence of local minor allele frequency differences, which accounted for 50% of the difference between SNP-SNP and DUP-SNP LD. This was potentially accompanied by lower genotyping accuracies for DUP, INV and BND compared with SNPs and DELs. An evaluation of the presence of tag SNPs (the SNP in highest LD with the variant of interest) further revealed DELs to be slightly less well tagged by WGS SNPs than WGS SNPs are by other SNPs. This difference, however, was no longer present when the pool of potential tag SNPs was reduced to SNPs located on four different chicken genotyping arrays.
CONCLUSIONS: The results imply that genomic variance due to DELs in the chicken populations studied can be captured by different SNP marker sets as well as variance from WGS SNPs, whereas separate SV calling might be advisable for DUP, INV, and BND effects.
169
Futas J, Oppelt J, Vychodilova L, Burger P, Horin P. The deadly face of felid killer cells: the cytotoxic proteins and their genes. HLA 2022; 100:37-51. [PMID: 35263044 DOI: 10.1111/tan.14595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 02/08/2022] [Accepted: 03/03/2022] [Indexed: 11/29/2022]
Abstract
Natural killer cells and cytotoxic T lymphocytes are the main cell populations of the immune system able to directly kill target cells via cytotoxic granules. Mammalian species may differ in specific features of their pore-forming protein (perforin) and granule-bound serine proteases (granzymes). One perforin gene (PRF1) and four genes encoding granzymes A, B, H, and K (GZMA, GZMB, GZMH, GZMK) were identified in the reference genomes of felids. The objective of this work was to characterize the genes PRF1, GZMA and GZMB in a panel of 17 felid species by next-generation re-sequencing. A search of available felid genomes (17 species) retrieved the coding sequences of these genes for comparison to our data. Both sets of sequences or their combinations (23 species) were used for phylogenetic and selection analyses. Nucleotide PRF1, GZMA and GZMB sequences showed high similarities between felid species (over 95% identity). All trees derived from coding sequences, except that for GZMA, expressed phylogenetic relationships corresponding to the zoological taxonomy of the Felidae. No effects of positive selection were detected in the genes studied; however, effects of purifying selection were observed for PRF1 and GZMA. The conservation of PRF1 is in agreement with its critical biological function. The differentiation observed between granzyme sub-families may reflect an adaptation to pathogen variation. The need to maintain important gene functions and at the same time cope with various pathogens may lead to an equilibrium between positive and negative selective pressures acting on GZMB. The within-species variability in wild felid populations merits further investigation.
Affiliation(s)
- Jan Futas
  - Department of Animal Genetics, Faculty of Veterinary Medicine, University of Veterinary Sciences Brno (VETUNI), Brno, Czech Republic
  - Research Group Animal Immunogenomics, CEITEC VETUNI, Brno, Czech Republic
- Jan Oppelt
  - Research Group Animal Immunogenomics, CEITEC VETUNI, Brno, Czech Republic
- Leona Vychodilova
  - Department of Animal Genetics, Faculty of Veterinary Medicine, University of Veterinary Sciences Brno (VETUNI), Brno, Czech Republic
- Pamela Burger
  - Research Institute of Wildlife Ecology, University of Veterinary Medicine Vienna (VETMEDUNI), Vienna, Austria
- Petr Horin
  - Department of Animal Genetics, Faculty of Veterinary Medicine, University of Veterinary Sciences Brno (VETUNI), Brno, Czech Republic
  - Research Group Animal Immunogenomics, CEITEC VETUNI, Brno, Czech Republic
170
Sandoval-Castillo J, Beheregaray LB, Wellenreuther M. Genomic prediction of growth in a commercially, recreationally, and culturally important marine resource, the Australian snapper (Chrysophrys auratus). G3 (BETHESDA, MD.) 2022; 12:jkac015. [PMID: 35100370 PMCID: PMC8896003 DOI: 10.1093/g3journal/jkac015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/05/2021] [Accepted: 01/07/2022] [Indexed: 06/14/2023]
Abstract
Growth is one of the most important traits of an organism. For exploited species, this trait has ecological and evolutionary consequences as well as economic and conservation significance. Rapid changes in growth rate associated with anthropogenic stressors have been reported for several marine fishes, but little is known about the genetic basis of growth traits in teleosts. We used reduced genome representation data and genome-wide association approaches to identify growth-related genetic variation in the commercially, recreationally, and culturally important Australian snapper (Chrysophrys auratus, Sparidae). Based on 17,490 high-quality single-nucleotide polymorphisms and 363 individuals representing extreme growth phenotypes from 15,000 fish of the same age and reared under identical conditions in a sea pen, we identified 100 unique candidates that were annotated to 51 proteins. We documented a complex polygenic nature of growth in the species that included several loci with small effects and a few loci with larger effects. Overall heritability was high (75.7%), reflected in the high accuracy of the genomic prediction for the phenotype (small vs large). Although the single-nucleotide polymorphisms were distributed across the genome, most candidates (60%) clustered on chromosome 16, which also explains the largest proportion of heritability (16.4%). This study demonstrates that reduced genome representation single-nucleotide polymorphisms and the right bioinformatic tools provide a cost-efficient approach to identifying growth-related loci and describing the genomic architectures of complex quantitative traits. Our results help to inform captive aquaculture breeding programs and are relevant to monitoring growth-related evolutionary shifts in wild populations in response to anthropogenic pressures.
Affiliation(s)
- Jonathan Sandoval-Castillo
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Bedford Park, SA 5042, Australia
| | - Luciano B Beheregaray
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Bedford Park, SA 5042, Australia
| | - Maren Wellenreuther
- School of Biological Sciences, The New Zealand Institute for Plant and Food Research Limited, Nelson 7010, New Zealand
- Seafood Production Group, The School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
| |
Collapse
|
171
|
Genome-wide scan for selection signatures and genes related to heat tolerance in domestic chickens in the tropical and temperate regions in Asia. Poult Sci 2022; 101:101821. [PMID: 35537342 PMCID: PMC9118144 DOI: 10.1016/j.psj.2022.101821] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/01/2021] [Revised: 02/02/2022] [Accepted: 02/28/2022] [Indexed: 11/24/2022] Open
Abstract
Heat stress is one of the major environmental stressors challenging the global poultry industry. Identifying the genes responsible for heat tolerance is fundamentally important for direct breeding programs. To uncover the genetic basis underlying the ambient temperature adaptation of chickens, we analyzed a total of 59 whole genomes from indigenous chickens that inhabit South Asian tropical regions and temperate regions of northern China. We applied FST and π-ratio statistics to scan for selective sweeps and identified 34 genes with a signature of positive selection in chickens from tropical regions. Several of these genes are functionally implicated in metabolism (FABP2, RAMP3, SUGCT, and TSHR) and vascular smooth muscle contractility (CAMK2), and they may be associated with adaptation to tropical regions. In particular, we found a missense mutation in thyroid-stimulating hormone receptor (41020238:G>A) that shows significant differences in allele frequency between the chicken populations of the two regions. To evaluate whether the missense mutation in TSHR could enhance the heat tolerance of chickens, we constructed segregated chicken populations and conducted heat stress experiments using homozygous mutant (AA) and wild-type (GG) chickens. We found that GG chickens exhibited significantly higher concentrations of alanine aminotransferase, lactate dehydrogenase, and creatine kinase than AA chickens under heat stress (35 ± 1°C) conditions (P < 0.05). These results suggest that TSHR (41020238:G>A) can facilitate heat tolerance and adaptation to higher ambient temperature conditions in tropical climates. Overall, our results provide potential candidate genes for molecular breeding of heat-tolerant chickens.
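The FST and π-ratio scan mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which FST estimator, window size, or ratio direction was used, so everything here is an assumption: Hudson's estimator is chosen only for brevity, the ratio is taken as π(population 1) / π(population 2), and the function names and allele frequencies are hypothetical.

```python
from typing import Sequence, Tuple


def hudson_terms(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Numerator and denominator of Hudson's FST for one biallelic site.

    p1, p2 are alternate-allele frequencies; n1, n2 are the numbers of
    sampled chromosomes in each population (assumed to be > 1).
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def site_pi(p: float, n: int) -> float:
    """Unbiased per-site nucleotide diversity for a biallelic site."""
    return 2 * p * (1 - p) * n / (n - 1)


def window_scan(freqs1: Sequence[float], n1: int,
                freqs2: Sequence[float], n2: int) -> Tuple[float, float]:
    """Return (FST, pi ratio) for one window of sites.

    FST is the ratio of summed numerators to summed denominators;
    the pi ratio is pi(pop1) / pi(pop2) (the direction is an assumption).
    """
    terms = [hudson_terms(a, n1, b, n2) for a, b in zip(freqs1, freqs2)]
    num = sum(t[0] for t in terms)
    den = sum(t[1] for t in terms)
    fst = num / den if den > 0 else 0.0
    pi1 = sum(site_pi(p, n1) for p in freqs1)
    pi2 = sum(site_pi(p, n2) for p in freqs2)
    ratio = pi1 / pi2 if pi2 > 0 else float("inf")
    return fst, ratio


# Hypothetical window with one fixed difference and one shared polymorphism.
fst, ratio = window_scan([1.0, 0.5], 10, [0.0, 0.5], 10)
print(f"FST={fst:.3f}, pi ratio={ratio:.3f}")
```

In a scan of this kind, windows that combine high FST with an extreme π ratio are the usual sweep candidates; the cut-offs are a choice made by the analyst and are not given in the abstract.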
|
172
|
Couvin D, Dereeper A, Meyer DF, Noroy C, Gaete S, Bhakkan B, Poullet N, Gaspard S, Bezault E, Marcelino I, Pruneau L, Segretier W, Stattner E, Cazenave D, Garnier M, Pot M, Tressières B, Deloumeaux J, Breurec S, Ferdinand S, Gonzalez-Rizzo S, Reynaud Y. KaruBioNet: a network and discussion group for a better collaboration and structuring of bioinformatics in Guadeloupe (French West Indies). BIOINFORMATICS ADVANCES 2022; 2:vbac010. [PMID: 36699379 PMCID: PMC9710593 DOI: 10.1093/bioadv/vbac010] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/27/2021] [Revised: 01/24/2022] [Accepted: 02/09/2022] [Indexed: 01/28/2023]
Abstract
Summary: Sequencing and other biological data are now more frequently available and at a lower price. Shared tools and strategies are needed to analyze the huge amount of heterogeneous data generated by several research teams and devices. Bioinformatics represents a growing field in the scientific community globally. This multidisciplinary field provides a large number of tools and methods that can be used to conduct scientific studies in a more strategic way. Coordinated actions and collaborations are needed to find more innovative and accurate methods for a better understanding of real-life data. A wide variety of organizations are contributing to KaruBioNet in Guadeloupe (French West Indies), a Caribbean archipelago. The purpose of this group is to foster collaboration and mutual aid among people from different disciplines using a 'one health' approach, for a better comprehension and surveillance of human, plant, and animal health and disease. The KaruBioNet network particularly aims to help researchers in their studies related to 'omics' data, but also with more general aspects of biological data analysis. This transdisciplinary network is a platform for discussion, sharing, training and support between scientists interested in bioinformatics and related fields. Starting from a small archipelago in the Caribbean, we envision facilitating exchange with other Caribbean partners in the future, knowing that the Caribbean is a region with non-negligible biodiversity which should be preserved and protected. Joining forces with other Caribbean countries or territories would strengthen scientific collaborative impact in the region. Information related to this network can be found at: http://www.pasteur-guadeloupe.fr/karubionet.html. Furthermore, a dedicated 'Galaxy KaruBioNet' platform is available at: http://calamar.univ-ag.fr/c3i/galaxy_karubionet.html.
Availability and implementation: Information about KaruBioNet is available at: http://www.pasteur-guadeloupe.fr/karubionet.html.
Contact: dcouvin@pasteur-guadeloupe.fr.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- David Couvin
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
  - To whom correspondence should be addressed
- Alexis Dereeper
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
- Damien F Meyer
  - CIRAD, UMR ASTRE, Petit-Bourg, Guadeloupe 97170, France
  - ASTRE, Univ Montpellier, CIRAD, INRAE, Montpellier 34000, France
- Christophe Noroy
  - Développement, Analyse, Transfert et Application (DATA), Lamentin, Guadeloupe 97129, France
- Stanie Gaete
  - Karubiotec Centre de Ressources Biologiques-UF 0216, CHU de la Guadeloupe, Pointe-à-Pitre 97110, France
- Bernard Bhakkan
  - Registre des cancers de Guadeloupe, CHU de la Guadeloupe, Pointe-à-Pitre 97110, France
- Nausicaa Poullet
  - URZ Recherches Zootechniques, INRAE, Petit-Bourg, Guadeloupe 97170, France
- Sarra Gaspard
  - Laboratoire COVACHIMM2E EA3592, Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Etienne Bezault
  - UMR BOREA (MNHN, CNRS-7208, IRD-207, Sorbonne Université, UCN, UA), Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Isabel Marcelino
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
- Ludovic Pruneau
  - Équipe « Biologie de la mangrove » UMR7205 « ISYEB » MNHN-CNRS-Sorbonne Université-EPHE-UA, UFR SEN Département de Biologie, Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Wilfried Segretier
  - Laboratoire de Mathématiques Informatique et Applications (LAMIA), Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Erick Stattner
  - Laboratoire de Mathématiques Informatique et Applications (LAMIA), Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Damien Cazenave
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
- Maëlle Garnier
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
- Matthieu Pot
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
- Benoît Tressières
  - Centre d’Investigation Clinique Antilles Guyane, Inserm CIC 1424, Les Abymes, Pointe-à-Pitre, Guadeloupe 97110, France
- Jacqueline Deloumeaux
  - Karubiotec Centre de Ressources Biologiques-UF 0216, CHU de la Guadeloupe, Pointe-à-Pitre 97110, France
  - Registre des cancers de Guadeloupe, CHU de la Guadeloupe, Pointe-à-Pitre 97110, France
- Sébastien Breurec
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
  - Centre d’Investigation Clinique Antilles Guyane, Inserm CIC 1424, Les Abymes, Pointe-à-Pitre, Guadeloupe 97110, France
  - Faculté de Médecine Hyacinthe Bastaraud, Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Séverine Ferdinand
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
- Silvina Gonzalez-Rizzo
  - Équipe « Biologie de la mangrove » UMR7205 « ISYEB » MNHN-CNRS-Sorbonne Université-EPHE-UA, UFR SEN Département de Biologie, Université des Antilles, Pointe-à-Pitre, Guadeloupe 97110, France
- Yann Reynaud
  - Unité Transmission, Réservoir et Diversité des Pathogènes, Institut Pasteur de Guadeloupe, Les Abymes, Guadeloupe 97139, France
|
173
|
Yamaguchi M, Nakaoka H, Suda K, Yoshihara K, Ishiguro T, Yachida N, Saito K, Ueda H, Sugino K, Mori Y, Yamawaki K, Tamura R, Revathidevi S, Motoyama T, Tainaka K, Verhaak RGW, Inoue I, Enomoto T. Spatiotemporal dynamics of clonal selection and diversification in normal endometrial epithelium. Nat Commun 2022; 13:943. [PMID: 35177608 PMCID: PMC8854701 DOI: 10.1038/s41467-022-28568-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 05/19/2021] [Accepted: 02/02/2022] [Indexed: 12/15/2022] Open
Abstract
It has become evident that somatic mutations in cancer-associated genes accumulate in the normal endometrium, but spatiotemporal understanding of the evolution and expansion of mutant clones is limited. To elucidate the timing and mechanism of the clonal expansion of somatic mutations in cancer-associated genes in the normal endometrium, we sequence 1311 endometrial glands from 37 women. By collecting endometrial glands from different parts of the endometrium, we show that multiple glands with the same somatic mutations occupy substantial areas of the endometrium. We demonstrate that “rhizome structures”, in which the basal glands run horizontally along the muscular layer and multiple vertical glands rise from the basal gland, originate from the same ancestral clone. Moreover, mutant clones detected in the vertical glands diversify by acquiring additional mutations. These results suggest that clonal expansions through the rhizome structures are involved in the mechanism by which mutant clones extend their territories. Furthermore, we show clonal expansions and copy neutral loss-of-heterozygosity events occur early in life, suggesting such events can be tolerated many years in the normal endometrium. Our results of the evolutionary dynamics of mutant clones in the human endometrium will lead to a better understanding of the mechanisms of endometrial regeneration during the menstrual cycle and the development of therapies for the prevention and treatment of endometrium-related diseases. Through regeneration, the endometrium accumulates somatic mutations that can lead to diseases like endometriosis and cancer. Here, the authors use genomics to analyse normal endometrial glands from different patient cohorts, detect rhizome structures with common clonal ancestors and infer clonal expansion dynamics.
Affiliation(s)
- Manako Yamaguchi
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Hirofumi Nakaoka
  - Human Genetics Laboratory, National Institute of Genetics, Mishima, 411-8540, Japan
  - Department of Cancer Genome Research, Sasaki Institute, Sasaki Foundation, Chiyoda-ku, 101-0062, Japan
- Kazuaki Suda
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Kosuke Yoshihara
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Tatsuya Ishiguro
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Nozomi Yachida
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Kyota Saito
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Haruka Ueda
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Kentaro Sugino
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Yutaro Mori
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Kaoru Yamawaki
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Ryo Tamura
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Teiichi Motoyama
  - Department of Molecular and Diagnostic Pathology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
- Kazuki Tainaka
  - Department of System Pathology for Neurological Disorders, Brain Research Institute, Niigata University, Niigata, 951-8585, Japan
  - Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, 565-5241, Japan
- Roel G W Verhaak
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Department of Neurosurgery, Cancer Center Amsterdam, Amsterdam UMC, VU University Medical Center (VUmc), 1081 HV, Amsterdam, The Netherlands
- Ituro Inoue
  - Human Genetics Laboratory, National Institute of Genetics, Mishima, 411-8540, Japan
- Takayuki Enomoto
  - Department of Obstetrics and Gynecology, Niigata University Graduate School of Medical and Dental Sciences, Niigata, 951-8510, Japan
|
174
|
Handayani ND, Lestari P, van As W, Holterman M, van den Elsen S, Dikin A, Bert W, Helder J, Van Steenbrugge JJM. Genomic Reconstruction of the Introduction and Diversification of Golden Potato Cyst Nematode Populations in Indonesia. PHYTOPATHOLOGY 2022; 112:396-403. [PMID: 34129357 DOI: 10.1094/phyto-04-21-0150-r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Potato cyst nematodes (PCNs), the umbrella term for Globodera rostochiensis and G. pallida, coevolved with their Solanaceous hosts in the Andean Mountain region. From there, PCN proliferated worldwide to virtually all potato production areas. PCN is a major factor limiting the potato production in Indonesia. In our survey, only G. rostochiensis was found. Fourteen field populations were collected on Java and Sumatra, and unique variants were called by mapping resequencing data on a G. rostochiensis reference genome. A phylogenetic tree based on 1.4 million unique variants showed a genotypic separation between the outgroup, a Scottish Ro1 population, and all Indonesian populations. This separation was comparable in size with the genotypic distinction between the Javanese and the Sumatran PCN populations. Next, variants within PCN effector gene families SPRYSEC, 1106, 4D06, and venom allergen-like protein (VAL) that all interfere with the host innate immune system were compared. Distinct selective pressures acted on these effector families; while SPRYSECs (4,341 single-nucleotide polymorphisms [SNPs]/insertions or deletions of bases [indels]) behaved like neutral genes, the phylogenetic trees of 1106, 4D06, and VAL proteins (235, 790, and 150 SNPs/indels, respectively) showed deviating topologies. Our data suggest that PCN was introduced on Java not too long after the introduction of potato in the middle of the eighteenth century. Soon thereafter, the pathogen established on Sumatra and started to diversify independently. This scenario was corroborated by diversification patterns of the effector families 1106, 4D06, and VAL. Our data demonstrate how genome resequencing data from a nonindigenous pathogen can be used to reconstruct the introduction and diversification process.
Affiliation(s)
- Nurul Dwi Handayani
  - Indonesian Agricultural Quarantine Agency, Ministry of Agriculture, Ragunan, Jakarta 12550, Indonesia
  - Laboratory of Nematology, Wageningen University, 6708 PB Wageningen, The Netherlands
  - Nematology Research Unit, Department of Biology, Ghent University, 9000 Ghent, Belgium
- Prabowo Lestari
  - Indonesian Agricultural Quarantine Agency, Ministry of Agriculture, Ragunan, Jakarta 12550, Indonesia
  - Laboratory of Nematology, Wageningen University, 6708 PB Wageningen, The Netherlands
  - Nematology Research Unit, Department of Biology, Ghent University, 9000 Ghent, Belgium
- Wouter van As
  - Laboratory of Nematology, Wageningen University, 6708 PB Wageningen, The Netherlands
- Martijn Holterman
  - Laboratory of Nematology, Wageningen University, 6708 PB Wageningen, The Netherlands
  - Solynta, 6703 HA Wageningen, The Netherlands
- Sven van den Elsen
  - Laboratory of Nematology, Wageningen University, 6708 PB Wageningen, The Netherlands
- Antarjo Dikin
  - Directorate General of Estate Crops, Ministry of Agriculture, Ragunan, Jakarta 12550, Indonesia
- Wim Bert
  - Nematology Research Unit, Department of Biology, Ghent University, 9000 Ghent, Belgium
- Johannes Helder
  - Laboratory of Nematology, Wageningen University, 6708 PB Wageningen, The Netherlands
|
175
|
Uranga C, Nelson KE, Edlund A, Baker JL. Tetramic Acids Mutanocyclin and Reutericyclin A, Produced by Streptococcus mutans Strain B04Sm5 Modulate the Ecology of an in vitro Oral Biofilm. FRONTIERS IN ORAL HEALTH 2022; 2:796140. [PMID: 35048077 PMCID: PMC8757879 DOI: 10.3389/froh.2021.796140] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/16/2021] [Accepted: 11/29/2021] [Indexed: 01/04/2023] Open
Abstract
The human oral microbiome consists of diverse microbes actively communicating and interacting through a variety of biochemical mechanisms. Dental caries is a major public health issue caused by fermentable carbohydrate consumption that leads to dysbiosis of the oral microbiome. Streptococcus mutans is a known major contributor to caries pathogenesis, due to its exceptional ability to form biofilms in the presence of sucrose, as well as its acidophilic lifestyle. S. mutans can also kill competing bacteria, which are typically health associated, through the production of bacteriocins and other small molecules. A subset of S. mutans strains encode the muc biosynthetic gene cluster (BGC), which was recently shown to produce the tetramic acids mutanocyclin and reutericyclins A, B, and C. Reutericyclin A displayed strong antimicrobial activity and mutanocyclin appeared to be anti-inflammatory; however, the effect of these compounds, and of the carriage of muc by S. mutans, on the ecology of the oral microbiota is not known, and was examined here using a previously developed in vitro biofilm model derived from human saliva. While reutericyclin significantly inhibited in vitro biofilm formation and acid production at sub-nanomolar concentrations, mutanocyclin did not present any activity until the high micromolar range. 16S rRNA gene sequencing revealed that reutericyclin drastically altered the biofilm community composition, while mutanocyclin showed a more specific effect, reducing the relative abundance of cariogenic Limosilactobacillus fermentum. Mutanocyclin or reutericyclin produced by the S. mutans strains amended to the community did not appear to affect the community in the same way as the purified compounds, although the results were somewhat confounded by the differing growth rates of the S. mutans strains. Regardless of the strain added, the addition of S. mutans to the in vitro community significantly increased the abundance of only S. mutans and Veillonella infantium. Overall, this study illustrates that reutericyclin A and mutanocyclin do impact the ecology of a complex in vitro oral biofilm; however, further research is needed to determine the extent to which the production of these compounds affects the virulence of S. mutans.
Affiliation(s)
- Carla Uranga
  - Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA, United States
- Karen E Nelson
  - Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA, United States
- Anna Edlund
  - Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA, United States
  - Department of Pediatrics, UC San Diego School of Medicine, San Diego, CA, United States
- Jonathon L Baker
  - Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA, United States
  - Department of Pediatrics, UC San Diego School of Medicine, San Diego, CA, United States
|
176
|
De Angelis F, Romboni M, Veltre V, Catalano P, Martínez-Labarga C, Gazzaniga V, Rickards O. First Glimpse into the Genomic Characterization of People from the Imperial Roman Community of Casal Bertone (Rome, First–Third Centuries AD). Genes (Basel) 2022; 13:genes13010136. [PMID: 35052476 PMCID: PMC8774527 DOI: 10.3390/genes13010136] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/24/2021] [Revised: 01/04/2022] [Accepted: 01/08/2022] [Indexed: 02/01/2023] Open
Abstract
This paper aims to provide a first glimpse into the genomic characterization of individuals buried in Casal Bertone (Rome, first–third centuries AD) to gain preliminary insight into the genetic makeup of people who lived near a tannery workshop (fullonica). Therefore, we explored the genetic characteristics of individuals who were putatively recruited as fuller workers outside the Roman population. Moreover, we identified the microbial communities associated with humans to detect microbes associated with the unhealthy environment supposed for such a workshop. We examined five individuals from Casal Bertone for ancient DNA analysis through whole-genome sequencing via a shotgun approach. We conducted multiple investigations to unveil the genetic components featured in the samples studied and their associated microbial communities. We generated reliable whole-genome data for three samples surviving the quality controls. The individuals were descendants of people from North Africa and the Near East, two of the main foci for tannery and dyeing activity in the past. Our evaluation of the microbes associated with the skeletal samples showed microbes growing in soils with waste products used in the tannery process, indicating that people lived, died, and were buried around places where they worked. In that perspective, the results represent the first genomic characterization of fullers from the past. This analysis broadens our knowledge about the presence of multiple ancestries in Imperial Rome, marking a starting point for future data integration as part of interdisciplinary research on human mobility and the bio-cultural characteristics of people employed in dedicated workshops.
Affiliation(s)
- Flavio De Angelis
  - Centre of Molecular Anthropology for Ancient DNA Studies, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
  - Correspondence: Tel.: +39-0672594350
- Marco Romboni
  - Department of Biology, University of Pisa, 56121 Pisa, Italy
- Virginia Veltre
  - Centre of Molecular Anthropology for Ancient DNA Studies, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
  - PhD Program in Evolutionary Biology and Ecology, Department of Biology, University of Rome Tor Vergata, 00133 Roma, Italy
- Paola Catalano
  - Former Servizio di Antropologia, Soprintendenza Speciale Archeologia, Belle Arti e Paesaggio di Roma, 00185 Roma, Italy
- Cristina Martínez-Labarga
  - Centre of Molecular Anthropology for Ancient DNA Studies, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
- Valentina Gazzaniga
  - Unità di Storia della Medicina e Bioetica, Sapienza University of Rome, 00185 Roma, Italy
- Olga Rickards
  - Centre of Molecular Anthropology for Ancient DNA Studies, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
|
177
|
Bergeron LA, Besenbacher S, Turner T, Versoza CJ, Wang RJ, Price AL, Armstrong E, Riera M, Carlson J, Chen HY, Hahn MW, Harris K, Kleppe AS, López-Nandam EH, Moorjani P, Pfeifer SP, Tiley GP, Yoder AD, Zhang G, Schierup MH. The mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. eLife 2022; 11:73577. [PMID: 35018888 PMCID: PMC8830884 DOI: 10.7554/elife.73577] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 09/02/2021] [Accepted: 01/11/2022] [Indexed: 11/13/2022] Open
Abstract
In the past decade, several studies have estimated the human per-generation germline mutation rate using large pedigrees. More recently, estimates for various nonhuman species have been published. However, methodological differences among studies in detecting germline mutations and estimating mutation rates make direct comparisons difficult. Here, we describe the many different steps involved in estimating pedigree-based mutation rates, including sampling, sequencing, mapping, variant calling, filtering, and appropriately accounting for false-positive and false-negative rates. For each step, we review the different methods and parameter choices that have been used in the recent literature. Additionally, we present the results from a ‘Mutationathon,’ a competition organized among five research labs to compare germline mutation rate estimates for a single pedigree of rhesus macaques. We report almost a twofold variation in the final estimated rate among groups using different post-alignment processing, calling, and filtering criteria, and provide details into the sources of variation across studies. Though the difference among estimates is not statistically significant, this discrepancy emphasizes the need for standardized methods in mutation rate estimations and the difficulty in comparing rates from different studies. Finally, this work aims to provide guidelines for computational and statistical benchmarks for future studies interested in identifying germline mutations from pedigrees.
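The steps listed in the abstract (counting candidate mutations, defining the callable genome, and correcting for false positives and false negatives) are commonly combined into a single per-site, per-generation rate. The sketch below shows one widely used form of that calculation; the abstract does not give the formula or any of the numbers, so the function name, parameters, and inputs are all hypothetical.

```python
def mutation_rate(n_candidates: int, callable_sites: int,
                  fdr: float = 0.0, fnr: float = 0.0) -> float:
    """Per-site, per-generation germline mutation rate for one trio.

    n_candidates   : candidate de novo mutations after filtering
    callable_sites : haploid genome positions where a mutation could be called
    fdr            : estimated false-discovery rate among candidates
    fnr            : estimated false-negative rate within callable sites

    The factor 2 accounts for the two parental haplotypes of a diploid
    offspring. This form of the correction is an assumption, not a formula
    taken from the abstract.
    """
    if not 0 <= fdr < 1 or not 0 <= fnr < 1:
        raise ValueError("fdr and fnr must be in [0, 1)")
    return n_candidates * (1 - fdr) / (2 * callable_sites * (1 - fnr))


# Two hypothetical pipelines applied to the same trio: different filters
# change both the candidate count and the callable genome size.
strict = mutation_rate(20, 2_000_000_000, fdr=0.05, fnr=0.10)
lenient = mutation_rate(45, 2_400_000_000, fdr=0.15, fnr=0.02)
print(f"strict={strict:.2e}, lenient={lenient:.2e}")
```

Because the candidate count, the callable genome, and both error rates are set by pipeline choices, two analyses of the same pedigree can plausibly arrive at noticeably different estimates, which is the kind of discrepancy the abstract reports.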
Affiliation(s)
- Lucie A Bergeron
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Søren Besenbacher
  - Department of Molecular Medicine (MOMA), Aarhus University, Aarhus N, Denmark
- Tychele Turner
  - Department of Genetics, Washington University in St. Louis, Saint Louis, United States
- Cyril J Versoza
  - Center for Evolution and Medicine, Arizona State University, Tempe, United States
- Richard J Wang
  - Department of Biology, Indiana University, Bloomington, United States
- Alivia Lee Price
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Ellie Armstrong
  - Department of Biology, Stanford University, Stanford, United States
- Meritxell Riera
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Jedidiah Carlson
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Hwei-Yen Chen
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, United States
- Kelley Harris
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Priya Moorjani
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Susanne P Pfeifer
  - School of Life Sciences, Arizona State University, Tempe, United States
- George P Tiley
  - Department of Biology, Duke University, Durham, United States
- Anne D Yoder
  - Department of Biology, Duke University, Durham, United States
- Guojie Zhang
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
178
|
Adams PE, Crist AB, Young EM, Willis JH, Phillips PC, Fierst JL. Slow Recovery from Inbreeding Depression Generated by the Complex Genetic Architecture of Segregating Deleterious Mutations. Mol Biol Evol 2022; 39:msab330. [PMID: 34791426 PMCID: PMC8789292 DOI: 10.1093/molbev/msab330] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/30/2022] Open
Abstract
The deleterious effects of inbreeding have been of extreme importance to evolutionary biology, but it has been difficult to characterize the complex interactions between genetic constraints and selection that lead to fitness loss and recovery after inbreeding. Haploid organisms and selfing organisms like the nematode Caenorhabditis elegans are capable of rapid recovery from the fixation of novel deleterious mutations; however, the potential for recovery and genomic consequences of inbreeding in diploid, outcrossing organisms are not well understood. We sought to answer two questions: 1) Can a diploid, outcrossing population recover from inbreeding via standing genetic variation and new mutation? and 2) How does allelic diversity change during recovery? We inbred C. remanei, an outcrossing relative of C. elegans, through brother-sister mating for 30 generations followed by recovery at large population size. Inbreeding reduced fitness but, surprisingly, recovery from inbreeding at large population sizes generated only very moderate fitness recovery after 300 generations. We found that 65% of ancestral single nucleotide polymorphisms (SNPs) were fixed in the inbred population, far fewer than the theoretical expectation of ∼99%. Under recovery, 36 SNPs across 30 genes involved in alimentary, muscular, nervous, and reproductive systems changed reproducibly across replicates, indicating that strong selection for fitness recovery does exist. Our results indicate that recovery from inbreeding depression via standing genetic variation and mutation is likely to be constrained by the large number of segregating deleterious variants present in natural populations, limiting the capacity for recovery of small populations.
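A neutral expectation of this kind can be illustrated with the textbook recursion for the inbreeding coefficient under repeated full-sib mating, F(t) = 1/4 · (1 + 2F(t−1) + F(t−2)). The abstract does not say how its ∼99% figure was derived, so the sketch below is only one plausible illustration; it assumes unrelated, non-inbred founders and no selection, and the function name is hypothetical.

```python
def sib_mating_inbreeding(generations: int) -> float:
    """Inbreeding coefficient F after repeated full-sib mating.

    Uses F_t = 1/4 * (1 + 2*F_{t-1} + F_{t-2}) with F = 0 for the
    founders (assumed unrelated and non-inbred) and no selection.
    """
    f_prev2, f_prev1 = 0.0, 0.0
    for _ in range(generations):
        f_prev2, f_prev1 = f_prev1, 0.25 * (1 + 2 * f_prev1 + f_prev2)
    return f_prev1


# The first generations follow the classic series 0.25, 0.375, 0.5, ...
for t in (1, 2, 3, 30):
    print(t, round(sib_mating_inbreeding(t), 4))
```

Under these assumptions F exceeds 0.99 well before generation 30, so nearly all ancestral heterozygosity is expected to be lost by neutral drift alone. A much lower observed fixation fraction, such as the 65% reported, is therefore notable.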
Affiliation(s)
- Paula E Adams
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL, USA
- Anna B Crist
  - Department of Genomes and Genetics, Institut Pasteur, Paris, France
- Ellen M Young
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- John H Willis
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- Patrick C Phillips
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- Janna L Fierst
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL, USA
|
179
|
Pan B, Ren L, Onuchic V, Guan M, Kusko R, Bruinsma S, Trigg L, Scherer A, Ning B, Zhang C, Glidewell-Kenney C, Xiao C, Donaldson E, Sedlazeck FJ, Schroth G, Yavas G, Grunenwald H, Chen H, Meinholz H, Meehan J, Wang J, Yang J, Foox J, Shang J, Miclaus K, Dong L, Shi L, Mohiyuddin M, Pirooznia M, Gong P, Golshani R, Wolfinger R, Lababidi S, Sahraeian SME, Sherry S, Han T, Chen T, Shi T, Hou W, Ge W, Zou W, Guo W, Bao W, Xiao W, Fan X, Gondo Y, Yu Y, Zhao Y, Su Z, Liu Z, Tong W, Xiao W, Zook JM, Zheng Y, Hong H. Assessing reproducibility of inherited variants detected with short-read whole genome sequencing. Genome Biol 2022; 23:2. [PMID: 34980216 PMCID: PMC8722114 DOI: 10.1186/s13059-021-02569-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 11/22/2021] [Accepted: 12/06/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. RESULTS: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. CONCLUSIONS: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.
Affiliation(s)
- Bohu Pan
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Len Trigg
  - Real Time Genomics, Hamilton, New Zealand
- Andreas Scherer
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
  - EATRIS ERIC- European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
- Baitang Ning
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Chaoyang Zhang
  - School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
- Chunlin Xiao
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Eric Donaldson
  - Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, 20993, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Gokhan Yavas
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Joe Meehan
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Jing Wang
  - Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Jonathan Foox
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
- Jun Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Lianhua Dong
  - Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Mehdi Pirooznia
  - Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Ping Gong
  - Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
- Samir Lababidi
  - Office of Health Informatics, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, 20993, USA
- Steve Sherry
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Tao Han
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Tao Chen
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Tieliu Shi
  - The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Weigong Ge
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Wen Zou
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Wenjing Guo
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Wenjun Bao
  - SAS Institute Inc., Cary, NC, 27513, USA
- Wenzhong Xiao
  - Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA, 94305, USA
- Xiaohui Fan
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Yoichi Gondo
  - Department of Molecular Life Sciences, Tokai University School of Medicine, 143 Shimokasuya, Isehara, 259-1193, Japan
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Yongmei Zhao
  - CCR-SF Bioinformatics Group, Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science, Frederick National Laboratory for Cancer Research, Frederick, MD, 21701, USA
- Zhenqiang Su
  - Takeda Pharmaceuticals, Cambridge, MA, 02139, USA
- Zhichao Liu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Wenming Xiao
  - Division of Molecular Genetics and Pathology, Center for Device and Radiological Health, US Food and Drug Administration, Silver Spring, MD, 20993, USA
- Justin M Zook
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
|
180
|
Chang TC, Xu K, Cheng Z, Wu G. Somatic and Germline Variant Calling from Next-Generation Sequencing Data. Advances in Experimental Medicine and Biology 2022; 1361:37-54. [DOI: 10.1007/978-3-030-91836-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
|
181
|
Wang MS, Thakur M, Jhala Y, Wang S, Srinivas Y, Dai SS, Liu ZX, Chen HM, Green RE, Koepfli KP, Shapiro B. OUP accepted manuscript. Genome Biol Evol 2022; 14:6524629. [PMID: 35137061 PMCID: PMC8841465 DOI: 10.1093/gbe/evac012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 01/13/2022] [Indexed: 11/14/2022] Open
Affiliation(s)
- Ming-Shan Wang (corresponding author)
  - Howard Hughes Medical Institute, University of California Santa Cruz, USA
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, USA
- Mukesh Thakur (corresponding author)
  - Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Sheng Wang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yellapu Srinivas
  - Wildlife Institute of India, Chandrabani, Dehradun, Uttarakhand, India
- Shan-Shan Dai
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Zheng-Xi Liu
  - College of Animal Science, Jilin University, Changchun, China
- Hong-Man Chen
  - College of Animal Science and Technology, Yunnan Agricultural University, Kunming, China
- Richard E Green
  - Department of Biomolecular Engineering, University of California Santa Cruz, USA
- Klaus-Peter Koepfli (corresponding author)
  - Smithsonian-Mason School of Conservation, George Mason University, USA
  - Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, District of Columbia, USA
  - Computer Technologies Laboratory, ITMO University, St. Petersburg, Russia
- Beth Shapiro (corresponding author)
  - Howard Hughes Medical Institute, University of California Santa Cruz, USA
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, USA
|
182
|
Host Lung Environment Limits Aspergillus fumigatus Germination through an SskA-Dependent Signaling Response. mSphere 2021; 6:e0092221. [PMID: 34878292 PMCID: PMC8653827 DOI: 10.1128/msphere.00922-21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/28/2022] Open
Abstract
Aspergillus fumigatus isolates display significant heterogeneity in growth, virulence, pathology, and inflammatory potential in multiple murine models of invasive aspergillosis. Previous studies have linked the initial germination of a fungal isolate in the airways to the inflammatory and pathological potential, but the mechanism(s) regulating A. fumigatus germination in the airways is unresolved. To explore the genetic basis for divergent germination phenotypes, we utilized a serial passaging strategy in which we cultured a slow germinating strain (AF293) in a murine-lung-based medium for multiple generations. Through this serial passaging approach, a strain emerged with an increased germination rate that induces more inflammation than the parental strain (herein named LH-EVOL for lung homogenate evolved). We identified a potential loss-of-function allele of Afu5g08390 (sskA) in the LH-EVOL strain. The LH-EVOL strain had a decreased ability to induce the SakA-dependent stress pathway, similar to AF293 ΔsskA and CEA10. In support of the whole-genome variant analyses, sskA, sakA, or mpkC loss-of-function strains in the AF293 parental strain increased germination both in vitro and in vivo. Since the airway surface liquid of the lungs contains low glucose levels, the effect of low glucose concentration on germination of these mutant AF293 strains was examined; interestingly, in low glucose conditions, the sakA pathway mutants exhibited an enhanced germination rate. In conclusion, A. fumigatus germination in the airways is regulated by SskA through the SakA mitogen-activated protein kinase (MAPK) pathway and drives enhanced disease initiation and inflammation in the lungs. IMPORTANCE Aspergillus fumigatus is an important human fungal pathogen, particularly in immunocompromised individuals. Initiation of growth by A. fumigatus in the lung is important for its pathogenicity in murine models. However, our understanding of what regulates fungal germination in the lung environment is lacking. Through a serial passage experiment using lung-based medium, we identified a new strain of A. fumigatus that has increased germination potential and inflammation in the lungs. Using this serially passaged strain, we found it had a decreased ability to mediate signaling through the osmotic stress response pathway. This finding was confirmed using genetic null mutants demonstrating that the osmotic stress response pathway is critical for regulating growth in the murine lungs. Our results contribute to the understanding of A. fumigatus adaptation and growth in the host lung environment.
|
183
|
Çelik G, Tuncalı T. ROHMM-A flexible hidden Markov model framework to detect runs of homozygosity from genotyping data. Hum Mutat 2021; 43:158-168. [PMID: 34923717 DOI: 10.1002/humu.24316] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2021] [Revised: 11/29/2021] [Accepted: 12/15/2021] [Indexed: 11/05/2022]
Abstract
Runs of long homozygous (ROH) stretches are considered to be the result of consanguinity and usually contain recessive deleterious disease-causing mutations. Several algorithms have been developed to detect ROHs. Here, we developed a simple alternative strategy by examining the X chromosome non-pseudoautosomal region to detect ROHs from next-generation sequencing data, utilizing the genotype probabilities and the hidden Markov model algorithm as a tool, namely ROHMM. It is implemented purely in Java and contains both a command line and a graphical user interface. We tested ROHMM on simulated data as well as real population data from the 1000G Project and a clinical sample. Our results have shown that ROHMM can perform robustly, producing highly accurate homozygosity estimations under all conditions, thereby meeting and even exceeding the performance of its natural competitors.
Affiliation(s)
- Gökalp Çelik
  - Health Sciences Institute, Department of Medical Genetics, Ankara Yildirim Beyazit University, Ankara, Turkey
- Timur Tuncalı
  - Department of Medical Genetics, Ankara University School of Medicine, Ankara, Turkey
|
184
|
Lin LH, Chou CH, Cheng HW, Chang KW, Liu CJ. Precise Identification of Recurrent Somatic Mutations in Oral Cancer Through Whole-Exome Sequencing Using Multiple Mutation Calling Pipelines. Front Oncol 2021; 11:741626. [PMID: 34912705 PMCID: PMC8666431 DOI: 10.3389/fonc.2021.741626] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/15/2021] [Accepted: 11/11/2021] [Indexed: 01/18/2023] Open
Abstract
Understanding the genomic alterations in oral carcinogenesis remains crucial for the appropriate diagnosis and treatment of oral squamous cell carcinoma (OSCC). To unveil the mutational spectrum, in this study, we conducted whole-exome sequencing (WES), using six mutation calling pipelines and multiple filtering criteria applied to 50 paired OSCC samples. The tumor mutation burden extracted from the data set of somatic variations was significantly associated with age, tumor staging, and survival. Several genes (MUC16, MUC19, KMT2D, TTN, HERC2) with a high frequency of false positive mutations were identified. Moreover, known (TP53, FAT1, EPHA2, NOTCH1, CASP8, and PIK3CA) and novel (HYDIN, ALPK3, ASXL1, USP9X, SKOR2, CPLANE1, STARD9, and NSD2) genes have been found to be significantly and frequently mutated in OSCC. Further analysis of gene alteration status with clinical parameters revealed that canonical pathways, including clathrin-mediated endocytotic signaling, NFκB signaling, PEDF signaling, and calcium signaling were associated with OSCC prognosis. Defining a catalog of targetable genomic alterations showed that 58% of the tumors carried at least one aberrant event that may potentially be targeted by approved therapeutic agents. We found molecular OSCC subgroups which were correlated with etiology and prognosis while defining the landscape of major altered events in the coding regions of OSCC genomes. These findings provide information that will be helpful in the design of clinical trials on targeted therapies and in the stratification of patients with OSCC according to therapeutic efficacy.
Affiliation(s)
- Li-Han Lin
  - Department of Medical Research, MacKay Memorial Hospital, Taipei, Taiwan
- Chung-Hsien Chou
  - Institute of Oral Biology, School of Dentistry, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Hui-Wen Cheng
  - Department of Medical Research, MacKay Memorial Hospital, Taipei, Taiwan
- Kuo-Wei Chang
  - Institute of Oral Biology, School of Dentistry, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department of Stomatology, Taipei Veterans General Hospital, Taipei, Taiwan
- Chung-Ji Liu
  - Department of Medical Research, MacKay Memorial Hospital, Taipei, Taiwan
  - Department of Oral and Maxillofacial Surgery, Taipei MacKay Memorial Hospital, Taipei, Taiwan
|
185
|
Zhao YW, Pan HX, Liu Z, Wang Y, Zeng Q, Fang ZH, Luo TF, Xu K, Wang Z, Zhou X, He R, Li B, Zhao G, Xu Q, Sun QY, Yan XX, Tan JQ, Li JC, Guo JF, Tang BS. The Association Between Lysosomal Storage Disorder Genes and Parkinson's Disease: A Large Cohort Study in Chinese Mainland Population. Front Aging Neurosci 2021; 13:749109. [PMID: 34867278 PMCID: PMC8634711 DOI: 10.3389/fnagi.2021.749109] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/29/2021] [Accepted: 09/29/2021] [Indexed: 12/20/2022] Open
Abstract
Background: Recent years have witnessed an increasing number of studies indicating an essential role of lysosomal dysfunction in Parkinson’s disease (PD) at the genetic, biochemical, and cellular pathway levels. In this study, we investigated the association between rare variants in lysosomal storage disorder (LSD) genes and PD in the Chinese mainland population. Methods: We explored the association between rare variants of 69 LSD genes and PD in 3,879 patients and 2,931 controls from the Parkinson’s Disease & Movement Disorders Multicenter Database and Collaborative Network in China (PD-MDCNC) using next-generation sequencing; the variants were analyzed using the optimized sequence kernel association test. Results: We identified a significant burden of rare putative LSD gene variants in Chinese mainland patients with PD. This association was robust in familial or sporadic early-onset patients after excluding the GBA variants but not in sporadic late-onset patients. The burden analysis of variant sets in genes of LSD subgroups revealed a suggestive significant association between variant sets in genes of sphingolipidosis deficiency disorders and familial or sporadic early-onset patients. In contrast, variant sets in genes of sphingolipidoses, mucopolysaccharidoses, and post-translational modification defect disorders were suggestively associated with sporadic late-onset patients. Then, SMPD1 and four other novel genes (i.e., GUSB, CLN6, PPT1, and SCARB2) were suggestively associated with sporadic early-onset or familial patients, whereas GALNS and NAGA were suggestively associated with late-onset patients. Conclusion: Our findings supported the association between LSD genes and PD and revealed several novel risk genes in Chinese mainland patients with PD, which confirmed the importance of lysosomal mechanisms in PD pathogenesis. Moreover, we identified genetic heterogeneity in early-onset and late-onset patients with PD, which may provide valuable suggestions for treatment.
Affiliation(s)
- Yu-Wen Zhao
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Hong-Xu Pan
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Zhenhua Liu
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Yige Wang
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Qian Zeng
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Zheng-Huan Fang
  - Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Teng-Fei Luo
  - Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Kun Xu
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Zheng Wang
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Xun Zhou
  - Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Runcheng He
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Bin Li
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Guihu Zhao
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Qian Xu
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Qi-Ying Sun
  - Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Xin-Xiang Yan
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Jie-Qiong Tan
  - Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Jin-Chen Li
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
  - Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
  - Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Ji-Feng Guo
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
  - Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
  - Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
- Bei-Sha Tang
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
  - Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
  - Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
|
186
|
Montasser ME, Van Hout CV, Miloscio L, Howard AD, Rosenberg A, Callaway M, Shen B, Li N, Locke AE, Verweij N, De T, Ferreira MA, Lotta LA, Baras A, Daly TJ, Hartford SA, Lin W, Mao Y, Ye B, White D, Gong G, Perry JA, Ryan KA, Fang Q, Tzoneva G, Pefanis E, Hunt C, Tang Y, Lee L, Sztalryd-Woodle C, Mitchell BD, Healy M, Streeten EA, Taylor SI, O'Connell JR, Economides AN, Della Gatta G, Shuldiner AR. Genetic and functional evidence links a missense variant in B4GALT1 to lower LDL and fibrinogen. Science 2021; 374:1221-1227. [PMID: 34855475 DOI: 10.1126/science.abe0348] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/14/2023]
Abstract
[Figure: see text].
Affiliation(s)
- May E Montasser
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Cristopher V Van Hout
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
  - Laboratorio Internacional de Investigatión sobre el Genoma Humano, Campus Juriquilla de la Universidad Nacional Autónoma de México, Querétaro, Querétaro 76230, México
- Alicia D Howard
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20993, USA
- Biao Shen
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Ning Li
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Adam E Locke
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Niek Verweij
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Tanima De
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Luca A Lotta
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Aris Baras
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Thomas J Daly
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Wei Lin
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Yuan Mao
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Bin Ye
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Derek White
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Guochun Gong
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- James A Perry
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Kathleen A Ryan
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Qing Fang
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Gannie Tzoneva
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
- Charleen Hunt
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Yajun Tang
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Lynn Lee
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
- Carole Sztalryd-Woodle
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - US Department of Veterans Affairs, Washington, DC 20420 USA
- Braxton D Mitchell
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Geriatrics Research and Education Clinical Center, VA Medical Center, Baltimore, MD 21201, USA
- Elizabeth A Streeten
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Division of Genetics, Department of Pediatrics, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Simeon I Taylor
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Jeffrey R O'Connell
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Aris N Economides
  - Regeneron Genetics Center, LLC, Tarrytown, NY 10591, USA
  - Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA
|
187
|
Zhang Z, Bai H, Blumenfeld J, Ramnauth AB, Barash I, Prince M, Tan AY, Michaeel A, Liu G, Chicos I, Rennert L, Giannakopoulos S, Larbi K, Hughes S, Salvatore SP, Robinson BD, Kapur S, Rennert H. Detection of PKD1 and PKD2 Somatic Variants in Autosomal Dominant Polycystic Kidney Cyst Epithelial Cells by Whole-Genome Sequencing. J Am Soc Nephrol 2021; 32:3114-3129. [PMID: 34716216 PMCID: PMC8638386 DOI: 10.1681/asn.2021050690] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/25/2021] [Accepted: 09/03/2021] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: Autosomal dominant polycystic kidney disease (ADPKD) is a genetic disorder characterized by the development of multiple cysts in the kidneys. It is often caused by pathogenic mutations in PKD1 and PKD2 genes that encode polycystin proteins. Although the molecular mechanisms for cystogenesis are not established, concurrent inactivating germline and somatic mutations in PKD1 and PKD2 have been previously observed in renal tubular epithelium (RTE). METHODS: To further investigate the cellular recessive mechanism of cystogenesis in RTE, we conducted whole-genome DNA sequencing analysis to identify germline variants and somatic alterations in RTE of 90 unique kidney cysts obtained during nephrectomy from 24 unrelated participants. RESULTS: Kidney cysts were overall genomically stable, with low burdens of somatic short mutations or large-scale structural alterations. Pathogenic somatic "second hit" alterations disrupting PKD1 or PKD2 were identified in 93% of the cysts. Of these, 77% of cysts acquired short mutations in PKD1 or PKD2; specifically, 60% resulted in protein truncations (nonsense, frameshift, or splice site) and 17% caused non-truncating mutations (missense, in-frame insertions, or deletions). Another 18% of cysts acquired somatic chromosomal loss of heterozygosity (LOH) events encompassing PKD1 or PKD2 ranging from 2.6 to 81.3 Mb. 14% of these cysts harbored copy number neutral LOH events, while the other 3% had hemizygous chromosomal deletions. LOH events frequently occurred at chromosomal fragile sites, or in regions comprising chromosome microdeletion diseases/syndromes. Almost all somatic "second hit" alterations occurred at the same germline mutated PKD1/2 gene. CONCLUSIONS: These findings further support a cellular recessive mechanism for cystogenesis in ADPKD primarily caused by inactivating germline and somatic variants of PKD1 or PKD2 genes in kidney cyst epithelium.
Affiliation(s)
- Zhengmao Zhang
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | - Hanwen Bai
- Vertex Pharmaceuticals Inc., Boston, Massachusetts
| | - Jon Blumenfeld
- Department of Medicine, Weill Cornell Medicine, New York, New York
- The Rogosin Institute, New York, New York
| | - Andrew B. Ramnauth
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | - Irina Barash
- Department of Medicine, Weill Cornell Medicine, New York, New York
- The Rogosin Institute, New York, New York
| | - Martin Prince
- Department of Radiology, Weill Cornell Medicine, New York, New York
| | - Adrian Y. Tan
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
- Department of Medicine, Weill Cornell Medicine, New York, New York
| | - Alber Michaeel
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | - Genyan Liu
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | | | - Lior Rennert
- Department of Public Health Sciences, Clemson University, Clemson, South Carolina
| | | | - Karen Larbi
- Vertex Pharmaceuticals Inc., Oxford, United Kingdom
| | | | - Steven P. Salvatore
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | - Brian D. Robinson
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | - Sandip Kapur
- Department of Surgery, Weill Cornell Medicine, New York, New York
| | - Hanna Rennert
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
|
188
|
Tellaetxe-Abete M, Calvo B, Lawrie C. Ideafix: a decision tree-based method for the refinement of variants in FFPE DNA sequencing data. NAR Genom Bioinform 2021; 3:lqab092. [PMID: 34729472 PMCID: PMC8557387 DOI: 10.1093/nargab/lqab092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2020] [Revised: 09/14/2021] [Accepted: 09/29/2021] [Indexed: 12/16/2022] Open
Abstract
Increasingly, treatment decisions for cancer patients are being made from next-generation sequencing results generated from formalin-fixed and paraffin-embedded (FFPE) biopsies. However, this material is prone to sequence artefacts that cannot be easily identified. In order to address this issue, we designed a machine learning-based algorithm to identify these artefacts using data from >1 600 000 variants from 27 paired FFPE and fresh-frozen breast cancer samples. Using these data, we assembled a series of variant features and evaluated the classification performance of five machine learning algorithms. Using leave-one-sample-out cross-validation, we found that XGBoost (extreme gradient boosting) and random forest obtained AUC (area under the receiver operating characteristic curve) values >0.86. Performance was further tested using two independent datasets that resulted in AUC values of 0.96, whereas a comparison with previously published tools resulted in a maximum AUC value of 0.92. The most discriminating features were read pair orientation bias, genomic context and variant allele frequency. In summary, our results show a promising future for the use of these samples in molecular testing. We built the algorithm into an R package called Ideafix (DEAmination FIXing) that is freely available at https://github.com/mmaitenat/ideafix.
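The evaluation scheme named in the abstract, leave-one-sample-out cross-validation scored by AUC, can be sketched as follows. This is a minimal illustration and not the Ideafix implementation: the records, the single variant-allele-frequency feature, and the midpoint-threshold model (`fit_midpoint`, `score`) are hypothetical stand-ins for the paper's feature set and its XGBoost and random forest classifiers.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_midpoint(train):
    """Toy stand-in model (assumption): midpoint between class means, plus artefact side.

    Assumes every training fold contains both classes.
    """
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    direction = 1.0 if mean_pos > mean_neg else -1.0
    return (mean_pos + mean_neg) / 2, direction


def score(model, x):
    """Higher score means 'more likely an artefact' under the toy model."""
    midpoint, direction = model
    return direction * (x - midpoint)


def leave_one_sample_out_auc(records):
    """records: (sample_id, feature, label). Each fold holds out all variants of one sample."""
    by_sample = defaultdict(list)
    for sample_id, x, y in records:
        by_sample[sample_id].append((x, y))
    labels, scores = [], []
    for held_out, rows in by_sample.items():
        train = [xy for s, r in by_sample.items() if s != held_out for xy in r]
        model = fit_midpoint(train)
        for x, y in rows:
            labels.append(y)
            scores.append(score(model, x))
    # Pool the held-out predictions from all folds and score them once.
    return auc(labels, scores)


# Hypothetical data: (sample, variant allele frequency, 1 = artefact, 0 = real variant).
RECORDS = [
    ("S1", 0.03, 1), ("S1", 0.45, 0),
    ("S2", 0.05, 1), ("S2", 0.38, 0),
    ("S3", 0.08, 1), ("S3", 0.50, 0),
]

if __name__ == "__main__":
    print(leave_one_sample_out_auc(RECORDS))
```

Splitting by sample rather than by variant keeps variants from the same biopsy out of both the training and test folds, which is the point of the leave-one-sample-out design.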
Affiliation(s)
| | - Borja Calvo
- Intelligent Systems Group, Computer Science Faculty, University of the Basque Country, Paseo Manuel Lardizabal, 20018 Donostia/San Sebastian, Spain
| | - Charles Lawrie
- Correspondence may also be addressed to Charles Lawrie. Tel: +34 943 006138;
|
189
|
Mercier A, Simon A, Lapalu N, Giraud T, Bardin M, Walker AS, Viaud M, Gladieux P. Population Genomics Reveals Molecular Determinants of Specialization to Tomato in the Polyphagous Fungal Pathogen Botrytis cinerea in France. PHYTOPATHOLOGY 2021; 111:2355-2366. [PMID: 33829853 DOI: 10.1094/phyto-07-20-0302-fi] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Many fungal plant pathogens encompass multiple populations specialized on different plant species. Understanding the factors underlying pathogen adaptation to their hosts is a major challenge of evolutionary microbiology, and it should help to prevent the emergence of new specialized pathogens on novel hosts. Previous studies have shown that French populations of the gray mold pathogen Botrytis cinerea parasitizing tomato and grapevine are differentiated from each other and are more aggressive on their host of origin than on other hosts, indicating some degree of host specialization in this polyphagous pathogen. Here, we aimed to identify the genomic features underlying the specialization of B. cinerea populations to tomato and grapevine. Based on whole genome sequences of 32 isolates, we confirmed the subdivision of B. cinerea pathogens into two genetic clusters on grapevine and another, single cluster on tomato. Levels of genetic variation in the different clusters were similar, suggesting that the tomato-specific cluster has not recently emerged following a bottleneck. Using genome scans for selective sweeps and divergent selection, tests of positive selection based on polymorphism and divergence at synonymous and nonsynonymous sites, and analyses of presence and absence variation, we identified several candidate genes that represent possible determinants of host specialization in the tomato-associated population. This work deepens our understanding of the genomic changes underlying the specialization of fungal pathogen populations.
Affiliation(s)
- Alex Mercier
- Université Paris-Saclay, Institut National de la Recherche Agronomique (INRAE), AgroParisTech, UMR BIOGER, 78850 Thiverval-Grignon, France
- Université Paris-Saclay, Orsay, France
| | - Adeline Simon
- Université Paris-Saclay, Institut National de la Recherche Agronomique (INRAE), AgroParisTech, UMR BIOGER, 78850 Thiverval-Grignon, France
| | - Nicolas Lapalu
- Université Paris-Saclay, Institut National de la Recherche Agronomique (INRAE), AgroParisTech, UMR BIOGER, 78850 Thiverval-Grignon, France
| | - Tatiana Giraud
- Ecologie Systématique Evolution, CNRS, Université Paris-Saclay, AgroParisTech, 91400 Orsay, France
| | - Marc Bardin
- UR0407 Pathologie Végétale, INRAE, 84143 Montfavet, France
| | - Anne-Sophie Walker
- Université Paris-Saclay, Institut National de la Recherche Agronomique (INRAE), AgroParisTech, UMR BIOGER, 78850 Thiverval-Grignon, France
| | - Muriel Viaud
- Université Paris-Saclay, Institut National de la Recherche Agronomique (INRAE), AgroParisTech, UMR BIOGER, 78850 Thiverval-Grignon, France
| | - Pierre Gladieux
- PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
|
190
|
Kreiner JM, Caballero A, Wright SI, Stinchcombe JR. Selective ancestral sorting and de novo evolution in the agricultural invasion of Amaranthus tuberculatus. Evolution 2021; 76:70-85. [PMID: 34806764 DOI: 10.1111/evo.14404] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2021] [Revised: 10/12/2021] [Accepted: 10/24/2021] [Indexed: 12/16/2022]
Abstract
The relative role of hybridization, de novo evolution, and standing variation in weed adaptation to agricultural environments is largely unknown. In Amaranthus tuberculatus, a widespread North American agricultural weed, adaptation is likely influenced by recent secondary contact and admixture of two previously isolated lineages. We characterized the extent of adaptation and phenotypic differentiation accompanying the spread of A. tuberculatus into agricultural environments and the contribution of ancestral divergence. We generated phenotypic and whole-genome sequence data from a manipulative common garden experiment, using paired samples from natural and agricultural populations. We found strong latitudinal, longitudinal, and sex differentiation in phenotypes, and subtle differences between agricultural and natural environments that were further resolved with ancestry inference. The transition into agricultural environments has favored southwestern var. rudis ancestry that leads to higher biomass and treatment-specific phenotypes: increased biomass and earlier flowering under reduced water availability, and reduced plasticity in fitness-related traits. We also detected de novo adaptation in individuals from agricultural habitats independent of ancestry effects, including marginally higher biomass, later flowering, and treatment-dependent divergence in time to germination. Therefore, the invasion of A. tuberculatus into agricultural environments has drawn on adaptive variation across multiple timescales, through both preadaptation via the preferential sorting of var. rudis ancestry and de novo local adaptation.
Affiliation(s)
- Julia M Kreiner
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, V6T 1Z4, Canada; Current Address: Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada; Current Address: Biodiversity Research Centre, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - Amalia Caballero
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
| | - Stephen I Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, V6T 1Z4, Canada
| | - John R Stinchcombe
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, V6T 1Z4, Canada; Koffler Scientific Reserve, University of Toronto, King City, ON, L7B 1K5, Canada
|
191
|
Peng MS, Li JB, Cai ZF, Liu H, Tang X, Ying R, Zhang JN, Tao JJ, Yin TT, Zhang T, Hu JY, Wu RN, Zhou ZY, Zhang ZG, Yu L, Yao YG, Shi ZL, Lu XM, Lu J, Zhang YP. The high diversity of SARS-CoV-2-related coronaviruses in pangolins alerts potential ecological risks. Zool Res 2021; 42:834-844. [PMID: 34766482 PMCID: PMC8645874 DOI: 10.24272/j.issn.2095-8137.2021.334] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Accepted: 11/09/2021] [Indexed: 11/07/2022] Open
Abstract
Understanding the zoonotic origin and evolutionary history of SARS-CoV-2 will provide critical insights for anticipating and preventing future outbreaks. A significant gap remains regarding the possible role of pangolins as a reservoir of SARS-CoV-2 related coronaviruses (SC2r-CoVs). Here, we screened for SC2r-CoVs in 172 samples from 163 pangolin individuals of four species, and detected positive signals in muscles of four Manis javanica and, for the first time, one M. pentadactyla. Phylogeographic analysis of pangolin mitochondrial DNA traced their origins to Southeast Asia. Using in-solution hybridization capture sequencing, we assembled a partial pangolin SC2r-CoV (pangolin-CoV) genome sequence of 22 895 bp (MP20) from the M. pentadactyla sample. Phylogenetic analyses revealed that MP20 was very closely related to pangolin-CoVs identified in M. javanica seized by Guangxi Customs. A genetic contribution of bat coronavirus to pangolin-CoVs via recombination was indicated. Our analysis revealed that the genetic diversity of pangolin-CoVs is substantially higher than previously anticipated. Given the potential infectivity of pangolin-CoVs, their high genetic diversity signals an ecological risk of zoonotic evolution and transmission of pathogenic SC2r-CoVs.
Affiliation(s)
- Min-Sheng Peng
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China. E-mail:
| | - Jian-Bo Li
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
| | - Zheng-Fei Cai
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, Yunnan 650091, China
| | - Hang Liu
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
| | - Xiaolu Tang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
| | - Ruochen Ying
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
| | - Jia-Nan Zhang
- Molbreeding Biotechnology Co., Ltd., Shijiazhuang, Hebei 050035, China
| | - Jia-Jun Tao
- Molbreeding Biotechnology Co., Ltd., Shijiazhuang, Hebei 050035, China
| | - Ting-Ting Yin
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
| | - Tao Zhang
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, Yunnan 650091, China
| | - Jing-Yang Hu
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, Yunnan 650091, China
| | - Ru-Nian Wu
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
| | - Zhong-Yin Zhou
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
| | - Zhi-Gang Zhang
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, Yunnan 650091, China
| | - Li Yu
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, Yunnan 650091, China
| | - Yong-Gang Yao
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences & Yunnan Province, Kunming Institute of Zoology, Kunming, Yunnan 650201, China
| | - Zheng-Li Shi
- CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, Hubei 430071, China
| | - Xue-Mei Lu
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
| | - Jian Lu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China. E-mail:
| | - Ya-Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution, Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, Yunnan 650091, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650201, China. E-mail:
|
192
|
Yang Y, Li S, Xing Y, Zhang Z, Liu T, Ao W, Bao G, Zhan Z, Zhao R, Zhang T, Zhang D, Song Y, Bian C, Xu L, Kang T. The first high-quality chromosomal genome assembly of a medicinal and edible plant Arctium lappa. Mol Ecol Resour 2021; 22:1493-1507. [PMID: 34758188 DOI: 10.1111/1755-0998.13547] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2021] [Revised: 08/11/2021] [Accepted: 09/07/2021] [Indexed: 12/20/2022]
Abstract
Arctium lappa has a long history of medicinal and edible use and is of great economic importance. Here, we present the first high-quality chromosome-level draft genome of A. lappa, assembled from Illumina and PacBio sequencing data. The assembled genome was approximately 1.79 Gb with a contig N50 of 6.88 Mb. Approximately 1.70 Gb (95.4%) of the contig sequences were anchored onto 18 chromosomes using Hi-C data, which improved the scaffold N50 to 91.64 Mb. Furthermore, we obtained 1.12 Gb (68.46%) of repetitive sequences and 32,771 protein-coding genes; 616 positively selected candidate genes were identified. Among candidate genes related to lignan biosynthesis, the following were found to be highly correlated with the accumulation of arctiin: 4-coumarate-CoA ligase (4CL), dirigent protein (DIR), and hydroxycinnamoyl transferase (HCT). Additionally, we compared the transcriptomes of A. lappa roots at three different developmental stages and identified 8,943 differentially expressed genes (DEGs) in these tissues. These data can be used to identify genes related to A. lappa quality or to provide a basis for molecular identification and comparative genomics among related species.
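The contig and scaffold N50 values quoted in the abstract follow the standard definition: the length L such that sequences of length L or longer together cover at least half of the assembly. A minimal sketch of that calculation is given here; the function name `n50` and the example lengths are illustrative and do not come from the A. lappa assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is the
    length at which the running total first reaches half of the overall total.
    """
    if not lengths:
        raise ValueError("n50 requires at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths (arbitrary units), not data from the paper.
    print(n50([2, 3, 4, 5, 6]))
```

Anchoring contigs into chromosomes joins them into longer scaffolds, which is why the scaffold N50 reported after Hi-C anchoring is much larger than the contig N50.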
Affiliation(s)
- Yanyun Yang
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Shengnan Li
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Yanping Xing
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | | | - Tao Liu
- School of Pharmacy, China Medical University, Shenyang, China
| | - Wuliji Ao
- School of Mongol Medicine, Inner Mongolia University for Nationalities, Tongliao, China
| | - Guihua Bao
- School of Mongol Medicine, Inner Mongolia University for Nationalities, Tongliao, China
| | - Zhilai Zhan
- Traditional Chinese Medicine Resource Center, Chinese Academy of Traditional Chinese Medicine, Beijing, China
| | - Rong Zhao
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Tingting Zhang
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Dachuan Zhang
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Yueyue Song
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Che Bian
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Liang Xu
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
| | - Tingguo Kang
- School of Pharmacy, Liaoning University of Traditional Chinese Medicine, Dalian, China
|
193
|
Little P, Jo H, Hoyle A, Mazul A, Zhao X, Salazar AH, Farquhar D, Sheth S, Masood M, Hayward MC, Parker JS, Hoadley KA, Zevallos J, Hayes DN. UNMASC: tumor-only variant calling with unmatched normal controls. NAR Cancer 2021; 3:zcab040. [PMID: 34632388 PMCID: PMC8494212 DOI: 10.1093/narcan/zcab040] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2021] [Revised: 09/07/2021] [Accepted: 10/04/2021] [Indexed: 12/11/2022] Open
Abstract
Despite years of progress, mutation detection in cancer samples continues to require significant manual review as a final step. Expert review is particularly challenging in cases where tumors are sequenced without matched normal control DNA. Attempts have been made to call somatic point mutations without a matched normal sample by removing well-known germline variants, utilizing unmatched normal controls, and constructing decision rules to classify sequencing errors and private germline variants. With budgetary constraints related to computational and sequencing costs, finding the appropriate number of controls is a crucial step in identifying somatic variants. Our approach utilizes public databases of canonical somatic variants as well as germline variants and leverages information gathered about nearby positions in the normal controls. Drawing from our cohort of targeted capture panel sequencing of tumor and normal samples with varying tumor types and demographics, we benchmarked our tumor-only variant calling pipeline to observe how our ability to correctly classify variants depends on the number of unmatched normals. With our benchmarked samples, approximately ten normal controls were needed to maintain 94% sensitivity, 99% specificity and 76% positive predictive value, far outperforming comparable methods. Our approach, called UNMASC, also serves as a supplement to traditional tumor with matched normal variant calling workflows and can potentially extend to other concerns arising from analyzing next generation sequencing data.
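The three benchmark figures in the abstract are all derived from the same confusion matrix of called versus true somatic variants. The sketch below shows the arithmetic; the function name `classification_metrics` and the counts are hypothetical, chosen only so the rates land near those reported, and are not taken from the UNMASC benchmark.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and positive predictive value from confusion counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # true somatic variants that were called
        "specificity": tn / (tn + fp),  # non-somatic positions correctly rejected
        "ppv": tp / (tp + fp),          # calls that are truly somatic
    }


if __name__ == "__main__":
    # Hypothetical counts (assumption): 100 true somatic variants, 3000 other candidates.
    metrics = classification_metrics(tp=94, fp=30, tn=2970, fn=6)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because non-somatic candidates vastly outnumber true somatic variants in such an example, even a 1% false-positive rate produces enough false calls to pull the positive predictive value well below the specificity.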
Affiliation(s)
- Paul Little
- Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
| | - Heejoon Jo
- Center for Cancer Research, University of Tennessee Health Science Center, 19 South Manassas, Memphis, TN 38163, USA
| | - Alan Hoyle
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Angela Mazul
- Otolaryngology Head and Neck Surgery, Washington University School of Medicine, 660 South Euclid Avenue, Campus Box 8115, St. Louis, MO 63110, USA
| | - Xiaobei Zhao
- Center for Cancer Research, University of Tennessee Health Science Center, 19 South Manassas, Memphis, TN 38163, USA
| | - Ashley H Salazar
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Douglas Farquhar
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Siddharth Sheth
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Maheer Masood
- Otolaryngology, University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS 66160, USA
| | - Michele C Hayward
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Joel S Parker
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Katherine A Hoadley
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 101 Manning Drive Chapel Hill, NC 27514, USA
| | - Jose Zevallos
- Otolaryngology Head and Neck Surgery, Washington University School of Medicine, 660 South Euclid Avenue, Campus Box 8115, St. Louis, MO 63110, USA
| | - D Neil Hayes
- Center for Cancer Research, University of Tennessee Health Science Center, 19 South Manassas, Memphis, TN 38163, USA
|
194
|
Wang M, Lee-Kim VS, Atri DS, Elowe NH, Yu J, Garvie CW, Won HH, Hadaya JE, MacDonald BT, Trindade K, Melander O, Rader DJ, Natarajan P, Kathiresan S, Kaushik VK, Khera AV, Gupta RM. Rare, Damaging DNA Variants in CORIN and Risk of Coronary Artery Disease: Insights From Functional Genomics and Large-Scale Sequencing Analyses. CIRCULATION-GENOMIC AND PRECISION MEDICINE 2021; 14:e003399. [PMID: 34592835 DOI: 10.1161/circgen.121.003399] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Abstract
BACKGROUND Corin is a protease expressed in cardiomyocytes that plays a key role in salt handling and intravascular volume homeostasis via activation of natriuretic peptides. It is unknown if Corin loss-of-function (LOF) is causally associated with risk of coronary artery disease (CAD). METHODS We analyzed all coding CORIN variants in an Italian case-control study of CAD. We functionally tested all 64 rare missense mutations in Western Blot and Mass Spectroscopy assays for proatrial natriuretic peptide cleavage. An expanded rare variant association analysis for Corin LOF mutations was conducted in whole exome sequencing data from 37 799 CAD cases and 212 184 controls. RESULTS We observed LOF variants in CORIN in 8 of 1803 (0.4%) CAD cases versus 0 of 1725 controls (P=0.007). Of 64 rare missense variants profiled, 21 (33%) demonstrated <30% of wild-type activity and were deemed damaging in the 2 functional assays for Corin activity. In a rare variant association study that aggregated rare LOF and functionally validated damaging missense variants from the Italian study, we observed no association with CAD: 21 of 1803 CAD cases versus 12 of 1725 controls, with an adjusted odds ratio of 1.61 ([95% CI, 0.79-3.29]; P=0.17). In the expanded sequencing dataset, no relationship between rare LOF variants and CAD was observed either (odds ratio, 1.15 [95% CI, 0.89-1.49]; P=0.30). Consistent with the genetic analysis, we observed no relationship between circulating Corin concentrations and incident CAD events among 4744 participants of a prospective cohort study: sex-stratified hazard ratio per SD increment of 0.96 ([95% CI, 0.87-1.07], P=0.48). CONCLUSIONS Functional testing of missense mutations improved the accuracy of rare variant association analysis. Despite compelling pathophysiology and a preliminary observation suggesting association, we observed no relationship between rare damaging variants in CORIN or circulating Corin concentrations and risk of CAD.
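As a rough check on the aggregated result, a crude odds ratio with a Wald 95% confidence interval can be computed directly from the carrier counts given in the abstract (21 of 1803 cases, 12 of 1725 controls). This sketch is unadjusted, so it will not reproduce the adjusted odds ratio of 1.61 exactly; the function name `crude_odds_ratio` and the choice of a Wald interval are assumptions for illustration, not the authors' stated method.

```python
import math


def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    a = case_carriers                     # carriers among cases
    b = case_total - case_carriers        # non-carriers among cases
    c = control_carriers                  # carriers among controls
    d = control_total - control_carriers  # non-carriers among controls
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when any cell is zero")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the square root of the summed reciprocal cells.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    estimate, lower, upper = crude_odds_ratio(21, 1803, 12, 1725)
    print(f"OR {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The crude estimate comes out at about 1.68 with an interval that spans 1, in line with the reported lack of association. The LOF-only comparison (8 versus 0 carriers) has an empty cell, so this function raises an error for it; an exact test would be needed there.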
Affiliation(s)
- Minxian Wang
- Program in Medical and Population Genetics (M.W., J.E.H., P.N., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Center for Genomic Medicine (M.W., P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston
| | - Vivian S Lee-Kim
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Divisions of Genetics and Cardiovascular Medicine, Brigham and Women's Hospital, Boston, MA (V.S.L.-K., D.S.A.)
| | - Deepak S Atri
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Divisions of Genetics and Cardiovascular Medicine, Brigham and Women's Hospital, Boston, MA (V.S.L.-K., D.S.A.)
| | - Nadine H Elowe
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - John Yu
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - Colin W Garvie
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - Hong-Hee Won
- Samsung Advanced Institute for Health Sciences and Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Gyeonggi, South Korea (H.-H.W.)
| | - Joseph E Hadaya
- Program in Medical and Population Genetics (M.W., J.E.H., P.N., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - Bryan T MacDonald
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - Kevin Trindade
- Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia (K.T., D.J.R.)
| | - Olle Melander
- Department of Clinical Sciences, Lund University, Malmö, Skåne, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.)
| | - Daniel J Rader
- Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia (K.T., D.J.R.)
| | - Pradeep Natarajan
- Program in Medical and Population Genetics (M.W., J.E.H., P.N., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Center for Genomic Medicine (M.W., P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston; Division of Cardiology (P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston
| | - Sekar Kathiresan
- Center for Genomic Medicine (M.W., P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston; Division of Cardiology (P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston; Verve Therapeutics, Cambridge, MA (S.K.)
| | - Virendar K Kaushik
- Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - Amit V Khera
- Program in Medical and Population Genetics (M.W., J.E.H., P.N., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Center for Genomic Medicine (M.W., P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston; Division of Cardiology (P.N., S.K., A.V.K.), Massachusetts General Hospital, Boston
| | - Rajat M Gupta
- Program in Medical and Population Genetics (M.W., J.E.H., P.N., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA; Cardiovascular Disease Initiative (M.W., V.S.L.-K., D.S.A., N.H.E., J.Y., C.W.G., B.T.M., P.N., V.K.K., A.V.K., R.M.G.), Broad Institute of MIT and Harvard, Cambridge, MA
|
195
|
Biswas P, Villanueva AL, Soto-Hermida A, Duncan JL, Matsui H, Borooah S, Kurmanov B, Richard G, Khan SY, Branham K, Huang B, Suk J, Bakall B, Goldberg JL, Gabriel L, Khan NW, Raghavendra PB, Zhou J, Devalaraja S, Huynh A, Alapati A, Zawaydeh Q, Weleber RG, Heckenlively JR, Hejtmancik JF, Riazuddin S, Sieving PA, Riazuddin SA, Frazer KA, Ayyagari R. Deciphering the genetic architecture and ethnographic distribution of IRD in three ethnic populations by whole genome sequence analysis. PLoS Genet 2021; 17:e1009848. [PMID: 34662339 PMCID: PMC8589175 DOI: 10.1371/journal.pgen.1009848] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2021] [Revised: 11/12/2021] [Accepted: 09/29/2021] [Indexed: 12/12/2022] Open
Abstract
Patients with inherited retinal dystrophies (IRDs) were recruited from two understudied populations, Mexico and Pakistan, as well as a third well-studied population of European Americans, to define the genetic architecture of IRD by performing whole-genome sequencing (WGS). Whole-genome analysis was performed on 409 individuals from 108 unrelated pedigrees with IRDs. All patients underwent an ophthalmic evaluation to establish the retinal phenotype. Although the 108 pedigrees in this study had previously been examined for mutations in known IRD genes using a wide range of methodologies including targeted gene(s) or mutation(s) screening, linkage analysis and exome sequencing, the gene mutations responsible for IRD in these 108 pedigrees had not been determined. WGS was performed on these pedigrees using Illumina X10 at a minimum of 30X depth. The sequence reads were mapped against hg19 followed by variant calling using GATK. The genome variants were annotated using SnpEff, PolyPhen2, and CADD score; the structural variants (SVs) were called using GenomeSTRiP and LUMPY. We identified potential causative sequence alterations in 61 pedigrees (57%), including 39 novel and 54 reported variants in IRD genes. For 57 of these pedigrees the observed genotype was consistent with the initial clinical diagnosis; the remaining 4 had the clinical diagnosis reclassified based on our findings. In seven pedigrees (12%) we observed atypical causal variants, i.e. unexpected genotype(s), including 4 pedigrees with causal variants in more than one IRD gene within all affected family members, one pedigree with intrafamilial genetic heterogeneity (different affected family members carrying causal variants in different IRD genes), one pedigree carrying a dominant causative variant present in pseudo-recessive form due to consanguinity and one pedigree with a de novo variant in the affected family member. Combined atypical and large structural variants contributed to about 20% of cases. Among the novel mutations, 75% were detected in Mexican and 50% found in European American pedigrees and have not been reported in any other population while only 20% were detected in Pakistani pedigrees and were not previously reported. The remaining novel IRD causative variants were listed in gnomAD but were found to be very rare and population specific. Mutations in known IRD-associated genes contributed to pathology in 63% of Mexican, 60% of Pakistani and 45% of European American pedigrees analyzed. Overall, the contribution of known IRD gene variants to disease pathology in these three populations was similar to that observed in other populations worldwide. This study revealed a spectrum of mutations contributing to IRD in three populations, identified a large proportion of novel potentially causative variants that are specific to the corresponding population or not reported in gnomAD, and shed light on the genetic architecture of IRD in these diverse global populations.
Affiliation(s)
- Pooja Biswas
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
- School of Biotechnology, REVA University, Bengaluru, Karnataka, India
| | - Adda L. Villanueva
- Retina and Genomics Institute, Yucatán, México
- Laboratoire de Diagnostic Moleculaire, Hôpital Maisonneuve Rosemont, Montreal, Quebec, Canada
| | - Angel Soto-Hermida
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Jacque L. Duncan
- Ophthalmology, University of California San Francisco, San Francisco, California, United States of America
| | - Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, California, United States of America
| | - Shyamanga Borooah
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Berzhan Kurmanov
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | | | - Shahid Y. Khan
- The Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| | - Kari Branham
- Ophthalmology & Visual Science, University of Michigan Kellogg Eye Center, Ann Arbor, Michigan, United States of America
| | - Bonnie Huang
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - John Suk
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Benjamin Bakall
- Ophthalmology, University of Arizona College of Medicine Phoenix, Phoenix, Arizona, United States of America
| | - Jeffrey L. Goldberg
- Byers Eye Institute, Stanford, Palo Alto, California, United States of America
| | - Luis Gabriel
- Genetics and Ophthalmology, Genelabor, Goiânia, Brazil
| | - Naheed W. Khan
- Ophthalmology & Visual Science, University of Michigan Kellogg Eye Center, Ann Arbor, Michigan, United States of America
| | - Pongali B. Raghavendra
- School of Biotechnology, REVA University, Bengaluru, Karnataka, India
- School of Regenerative Medicine, Manipal University, Bengaluru, Karnataka, India
| | - Jason Zhou
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Sindhu Devalaraja
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Andrew Huynh
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Akhila Alapati
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Qais Zawaydeh
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| | - Richard G. Weleber
- Casey Eye Institute, Oregon Health & Science University, Portland, Oregon, United States of America
| | - John R. Heckenlively
- Ophthalmology & Visual Science, University of Michigan Kellogg Eye Center, Ann Arbor, Michigan, United States of America
| | - J. Fielding Hejtmancik
- Ophthalmic Genetics and Visual Function Branch, National Eye Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Sheikh Riazuddin
- National Centre of Excellence in Molecular Biology, University of the Punjab, Lahore, Pakistan
- Allama Iqbal Medical College, University of Health Sciences, Lahore, Pakistan
| | - Paul A. Sieving
- National Eye Institute, Bethesda, Maryland, United States of America
- Ophthalmology & Vision Science, UC Davis Medical Center, California, United States of America
| | - S. Amer Riazuddin
- The Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| | - Kelly A. Frazer
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, California, United States of America
- Department of Pediatrics, Rady Children’s Hospital, Division of Genome Information Sciences, San Diego, California, United States of America
| | - Radha Ayyagari
- Shiley Eye Institute, University of California San Diego, La Jolla, California, United States of America
| |
|
196
|
Mahadevaiah C, Appunu C, Aitken K, Suresha GS, Vignesh P, Mahadeva Swamy HK, Valarmathi R, Hemaprabha G, Alagarasan G, Ram B. Genomic Selection in Sugarcane: Current Status and Future Prospects. FRONTIERS IN PLANT SCIENCE 2021; 12:708233. [PMID: 34646284 PMCID: PMC8502939 DOI: 10.3389/fpls.2021.708233] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/11/2021] [Accepted: 08/24/2021] [Indexed: 05/18/2023]
Abstract
Sugarcane is a C4 and agro-industry-based crop with a high potential for biomass production. It serves as raw material for the production of sugar, ethanol, and electricity. Modern sugarcane varieties are derived from the interspecific and intergeneric hybridization between Saccharum officinarum, Saccharum spontaneum, and other wild relatives. Sugarcane breeding programmes are broadly categorized into germplasm collection and characterization, pre-breeding and genetic base-broadening, and varietal development programmes. Varietal identification through the classic breeding programme requires a minimum of 12–14 years. Precise phenotyping in sugarcane is extremely tedious due to the high propensity for lodging and suckering owing to the influence of environmental factors and crop management practices. This kind of phenotyping requires data from both plant crop and ratoon experiments conducted over locations and seasons. In this review, we explore the feasibility of genomic selection schemes for various breeding programmes in sugarcane. Genetic diversity analysis using genome-wide markers helps in the formation of a core set of germplasm representing the total genomic diversity present in the Saccharum gene bank. Genome-wide association studies and genomic prediction in the Saccharum gene bank help to identify the complete genomic resources for cane yield, commercial cane sugar, tolerances to biotic and abiotic stresses, and other agronomic traits. The implementation of genomic selection in pre-breeding and genetic base-broadening programmes assists in the precise introgression of specific genes, and recurrent selection schemes increase the frequency of favorable alleles in the population with a considerable reduction in breeding cycles and population size. The integration of environmental covariates and genomic prediction in multi-environment trials assists in the prediction of varietal performance for different agro-climatic zones. This review also focuses on enhancing the genetic gain over time, cost, and resource allocation at various stages of breeding programmes.
Affiliation(s)
| | - Chinnaswamy Appunu
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | - Karen Aitken
- CSIRO (Commonwealth Scientific and Industrial Research Organization), St. Lucia, QLD, Australia
| | | | - Palanisamy Vignesh
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | | | | | - Govind Hemaprabha
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | - Ganesh Alagarasan
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | - Bakshi Ram
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| |
|
197
|
Ferré Q, Chèneby J, Puthier D, Capponi C, Ballester B. Anomaly detection in genomic catalogues using unsupervised multi-view autoencoders. BMC Bioinformatics 2021; 22:460. [PMID: 34563116 PMCID: PMC8467021 DOI: 10.1186/s12859-021-04359-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2020] [Revised: 06/04/2021] [Accepted: 08/09/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Accurate identification of Transcriptional Regulator binding locations is essential for analysis of genomic regions, including Cis Regulatory Elements. The customary NGS approaches, predominantly ChIP-Seq, can be obscured by data anomalies and biases which are difficult to detect without supervision. Results: Here, we develop a method to leverage the usual combinations between many experimental series to mark such atypical peaks. We use deep learning to perform a lossy compression of the genomic regions’ representations with multiview convolutions. Using artificial data, we show that our method correctly identifies groups of correlating series and evaluates CRE according to group completeness. It is then applied to the ReMap database’s large volume of curated ChIP-seq data. We show that peaks lacking known biological correlators are singled out and less confirmed in real data. We propose normalization approaches useful in interpreting black-box models. Conclusion: Our approach detects peaks that are less corroborated than average. It can be extended to other similar problems, and can be interpreted to identify correlation groups. It is implemented in an open-source tool called atyPeak. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04359-2.
Affiliation(s)
- Quentin Ferré
- INSERM, TAGC, Aix Marseille University, Marseille, France; Université de Toulon, CNRS, LIS, Aix Marseille University, Marseille, France
| | - Jeanne Chèneby
- INSERM, TAGC, Aix Marseille University, Marseille, France
| | - Denis Puthier
- INSERM, TAGC, Aix Marseille University, Marseille, France
| | - Cécile Capponi
- Université de Toulon, CNRS, LIS, Aix Marseille University, Marseille, France.
| | | |
|
198
|
Boso G, Lam O, Bamunusinghe D, Oler AJ, Wollenberg K, Liu Q, Shaffer E, Kozak CA. Patterns of Coevolutionary Adaptations across Time and Space in Mouse Gammaretroviruses and Three Restrictive Host Factors. Viruses 2021; 13:v13091864. [PMID: 34578445 PMCID: PMC8472935 DOI: 10.3390/v13091864] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/01/2021] [Revised: 09/04/2021] [Accepted: 09/15/2021] [Indexed: 10/26/2022] Open
Abstract
The classical laboratory mouse strains are genetic mosaics of three Mus musculus subspecies that occupy distinct regions of Eurasia. These strains and subspecies carry infectious and endogenous mouse leukemia viruses (MLVs) that can be pathogenic and mutagenic. MLVs evolved in concert with restrictive host factors with some under positive selection, including the XPR1 receptor for xenotropic/polytropic MLVs (X/P-MLVs) and the post-entry restriction factor Fv1. Since positive selection marks host-pathogen genetic conflicts, we examined MLVs for counter-adaptations at sites that interact with XPR1, Fv1, and the CAT1 receptor for ecotropic MLVs (E-MLVs). Results describe different co-adaptive evolutionary paths within the ranges occupied by these virus-infected subspecies. The interface of CAT1, and the otherwise variable E-MLV envelopes, is highly conserved; antiviral protection is afforded by the Fv4 restriction factor. XPR1 and X/P-MLVs variants show coordinate geographic distributions, with receptor critical sites in envelope, under positive selection but with little variation in envelope and XPR1 in mice carrying P-ERVs. The major Fv1 target in the viral capsid is under positive selection, and the distribution of Fv1 alleles is subspecies-correlated. These data document adaptive, spatial and temporal, co-evolutionary trajectories at the critical interfaces of MLVs and the host factors that restrict their replication.
Affiliation(s)
- Guney Boso
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (G.B.); (O.L.); (D.B.); (Q.L.); (E.S.)
| | - Oscar Lam
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (G.B.); (O.L.); (D.B.); (Q.L.); (E.S.)
| | - Devinka Bamunusinghe
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (G.B.); (O.L.); (D.B.); (Q.L.); (E.S.)
| | - Andrew J. Oler
- Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (A.J.O.); (K.W.)
| | - Kurt Wollenberg
- Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (A.J.O.); (K.W.)
| | - Qingping Liu
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (G.B.); (O.L.); (D.B.); (Q.L.); (E.S.)
| | - Esther Shaffer
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (G.B.); (O.L.); (D.B.); (Q.L.); (E.S.)
| | - Christine A. Kozak
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA; (G.B.); (O.L.); (D.B.); (Q.L.); (E.S.)
- Correspondence:
| |
|
199
|
Liu S, Westbury MV, Dussex N, Mitchell KJ, Sinding MHS, Heintzman PD, Duchêne DA, Kapp JD, von Seth J, Heiniger H, Sánchez-Barreiro F, Margaryan A, André-Olsen R, De Cahsan B, Meng G, Yang C, Chen L, van der Valk T, Moodley Y, Rookmaaker K, Bruford MW, Ryder O, Steiner C, Bruins-van Sonsbeek LGR, Vartanyan S, Guo C, Cooper A, Kosintsev P, Kirillova I, Lister AM, Marques-Bonet T, Gopalakrishnan S, Dunn RR, Lorenzen ED, Shapiro B, Zhang G, Antoine PO, Dalén L, Gilbert MTP. Ancient and modern genomes unravel the evolutionary history of the rhinoceros family. Cell 2021; 184:4874-4885.e16. [PMID: 34433011 DOI: 10.1016/j.cell.2021.07.032] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 03/26/2021] [Revised: 06/16/2021] [Accepted: 07/23/2021] [Indexed: 12/27/2022]
Abstract
Only five species of the once-diverse Rhinocerotidae remain, making the reconstruction of their evolutionary history a challenge to biologists since Darwin. We sequenced genomes from five rhinoceros species (three extinct and two living), which we compared to existing data from the remaining three living species and a range of outgroups. We identify an early divergence between extant African and Eurasian lineages, resolving a key debate regarding the phylogeny of extant rhinoceroses. This early Miocene (∼16 million years ago [mya]) split post-dates the land bridge formation between the Afro-Arabian and Eurasian landmasses. Our analyses also show that while rhinoceros genomes in general exhibit low levels of genome-wide diversity, heterozygosity is lowest and inbreeding is highest in the modern species. These results suggest that while low genetic diversity is a long-term feature of the family, it has been particularly exacerbated recently, likely reflecting recent anthropogenic-driven population declines.
Affiliation(s)
- Shanlin Liu
- Department of Entomology, College of Plant Protection, China Agricultural University, Beijing 100193, China; The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark.
| | - Michael V Westbury
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Nicolas Dussex
- Centre for Palaeogenetics, Svante Arrhenius vag 20C, Stockholm 10691, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm 10405, Sweden; Department of Zoology, Stockholm University, Stockholm 10691, Sweden
| | - Kieren J Mitchell
- Australian Centre for Ancient DNA, School of Biological Sciences, University of Adelaide, Adelaide 5005, Australia
| | - Mikkel-Holger S Sinding
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Peter D Heintzman
- The Arctic University Museum of Norway, UiT The Arctic University of Norway, Tromsø 9037, Norway
| | - David A Duchêne
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Joshua D Kapp
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Johanna von Seth
- Centre for Palaeogenetics, Svante Arrhenius vag 20C, Stockholm 10691, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm 10405, Sweden; Department of Zoology, Stockholm University, Stockholm 10691, Sweden
| | - Holly Heiniger
- Australian Centre for Ancient DNA, School of Biological Sciences, University of Adelaide, Adelaide 5005, Australia
| | - Fátima Sánchez-Barreiro
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Ashot Margaryan
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Remi André-Olsen
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 17121 Solna, Sweden
| | - Binia De Cahsan
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Guanliang Meng
- China National Genebank, BGI Shenzhen, Shenzhen 518083, China
| | - Chentao Yang
- China National Genebank, BGI Shenzhen, Shenzhen 518083, China
| | - Lei Chen
- Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, Xi'an 710072, China
| | - Tom van der Valk
- Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Yoshan Moodley
- Department of Zoology, University of Venda, Thohoyandou 0950, Republic of South Africa
| | - Kees Rookmaaker
- Editor of the Rhino Resource Center, Utrecht, the Netherlands
| | - Michael W Bruford
- School of Biosciences, Sir Martin Evans Building, Cardiff University, Cardiff CF10 3AX, UK; Sustainable Places Research Institute, Cardiff University, Cardiff CF10 3BA, UK
| | - Oliver Ryder
- San Diego Zoo Wildlife Alliance, Beckman Center for Conservation Research, San Diego, CA 92027, USA
| | - Cynthia Steiner
- San Diego Zoo Wildlife Alliance, Beckman Center for Conservation Research, San Diego, CA 92027, USA
| | | | - Sergey Vartanyan
- N.A. Shilo North-East Interdisciplinary Scientific Research Institute, Far East Branch, Russian Academy of Sciences (NEISRI FEB RAS), Magadan 685000, Russia
| | - Chunxue Guo
- China National Genebank, BGI Shenzhen, Shenzhen 518083, China
| | - Alan Cooper
- South Australian Museum, Adelaide, SA 5000, Australia
| | - Pavel Kosintsev
- Institute of Plant and Animal Ecology, Ural Branch of the Russian Academy of Sciences, Yekaterinburg, Russia; Ural Federal University, Yekaterinburg, Russia
| | - Irina Kirillova
- Institute of Geography, Russian Academy of Sciences, Moscow 119017, Russia
| | - Adrian M Lister
- Department of Earth Sciences, Natural History Museum, London, UK
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), Barcelona, Spain; Centre Nacional d'Anàlisi Genòmica, Centre for Genomic Regulation (CNAG-CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain; Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Shyam Gopalakrishnan
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Robert R Dunn
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark; Department of Applied Ecology, North Carolina State University, Raleigh, NC, USA
| | - Eline D Lorenzen
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA 96050, USA
| | - Guojie Zhang
- China National Genebank, BGI Shenzhen, Shenzhen 518083, China; Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
| | - Pierre-Olivier Antoine
- Institut des Sciences de l'Évolution, Université Montpellier, CNRS, IRD, EPHE, Montpellier 34095, France
| | - Love Dalén
- Centre for Palaeogenetics, Svante Arrhenius vag 20C, Stockholm 10691, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm 10405, Sweden; Department of Zoology, Stockholm University, Stockholm 10691, Sweden.
| | - M Thomas P Gilbert
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, 1353 Copenhagen, Denmark; Norwegian University of Science and Technology (NTNU) University Museum, Trondheim 7012, Norway.
| |
|
200
|
Taylor RS, Manseau M, Klütsch CFC, Polfus JL, Steedman A, Hervieux D, Kelly A, Larter NC, Gamberg M, Schwantje H, Wilson PJ. Population dynamics of caribou shaped by glacial cycles before the last glacial maximum. Mol Ecol 2021; 30:6121-6143. [PMID: 34482596 PMCID: PMC9293238 DOI: 10.1111/mec.16166] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/30/2020] [Revised: 08/18/2021] [Accepted: 08/23/2021] [Indexed: 12/04/2022]
Abstract
Pleistocene glacial cycles influenced the diversification of high‐latitude wildlife species through recurrent periods of range contraction, isolation, divergence, and expansion from refugia and subsequent admixture of refugial populations. We investigate population size changes and the introgressive history of caribou (Rangifer tarandus) in western Canada using 33 whole genome sequences coupled with larger‐scale mitochondrial data. We found that a major population expansion of caribou occurred starting around 110,000 years ago (110 kya), the start of the last glacial period. Additionally, we found effective population sizes of some caribou reaching ~700,000 to 1,000,000 individuals, one of the highest recorded historical effective population sizes for any mammal species thus far. Mitochondrial analyses dated introgression events prior to the last glacial maximum (LGM) to 20–30 kya, and even more ancient events to 60 kya, coinciding with colder periods with extensive ice coverage, further demonstrating the importance of glacial cycles and events prior to the LGM in shaping demographic history. Reconstructing the origins and differential introgressive history has implications for predictions on species responses under climate change. Our results have implications for other whole genome analyses using pairwise sequentially Markovian coalescent (PSMC) analyses, as well as highlighting the need to investigate pre‐LGM demographic patterns to fully reconstruct the origin of species diversity, especially for high‐latitude species.
Affiliation(s)
- Rebecca S Taylor
- Biology Department, Trent University, Peterborough, Ontario, Canada
| | - Micheline Manseau
- Biology Department, Trent University, Peterborough, Ontario, Canada; Landscape Science and Technology, Environment and Climate Change Canada, Ottawa, Ontario, Canada
| | | | - Jean L Polfus
- Biology Department, Trent University, Peterborough, Ontario, Canada
| | - Audrey Steedman
- Parks Canada, Government of Canada, Winnipeg, Manitoba, Canada
| | - Dave Hervieux
- Department of Environment and Parks, Government of Alberta, Grande Prairie, Alberta, Canada
| | - Allicia Kelly
- Department of Environment and Natural Resources, Government of the Northwest Territories, Fort Smith, Northwest Territories, Canada
| | - Nicholas C Larter
- Department of Environment and Natural Resources, Government of the Northwest Territories, Fort Simpson, Northwest Territories, Canada
| | | | - Helen Schwantje
- BC Ministry of Forest, Lands, Natural Resource Operations, and Rural Development, Nanaimo, British Columbia, Canada
| | - Paul J Wilson
- Biology Department, Trent University, Peterborough, Ontario, Canada
| |
|