1
Lehner R, Blazek L, Minoche AE, Dohm JC, Himmelbauer H. Assembly and characterization of the genome of chard (Beta vulgaris ssp. vulgaris var. cicla). J Biotechnol 2021; 333:67-76. [PMID: 33932500 DOI: 10.1016/j.jbiotec.2021.04.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/11/2021] [Revised: 04/09/2021] [Accepted: 04/25/2021] [Indexed: 10/21/2022]
Abstract
Chard (Beta vulgaris ssp. vulgaris var. cicla) is a member of one of four different cultigroups of beets. While the genome of sugar beet, the most prominent beet crop, has been studied extensively, molecular data on other beet cultivars is scant. Here, we present a genome assembly of chard, a vegetable crop grown for its fleshy leaves. We report a de novo genome assembly of 604 Mbp, slightly larger than sugar beet assemblies presented so far. About 57 % of the assembly was annotated as repetitive sequence, of which LTR retrotransposons were the most abundant. Based on the presence of conserved genes, the chard assembly was estimated to be at least 96 % complete regarding its gene space. We predicted 34,521 genes of which 27,582 genes were supported by evidence from transcriptomic sequencing reads, and 5503 of the evidence-supported genes had multiple isoforms. We compared the chard gene set with gene sets from sugar beet and two wild beets (i.e. Beta vulgaris ssp. maritima and Beta patula) to find orthology relationships and identified genome-wide syntenic regions between chard and sugar beet. Lastly, we determined genomic variants that distinguish sugar beet and chard. Assessing the variation distribution along the chard chromosomes, we found extensive haplotype sharing between the two cultivars. In summary, our work provides a foundation for the molecular analysis of Beta vulgaris cultigroups as a basis for chard genomics and to unravel the domestication history of beet crops.
Affiliation(s)
- Reinhard Lehner
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria
- Lisa Blazek
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria
- André E Minoche
- Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, NSW 2010, Australia
- Juliane C Dohm
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria.
- Heinz Himmelbauer
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria.
2
Dittami SM, Corre E, Brillet-Guéguen L, Lipinska AP, Pontoizeau N, Aite M, Avia K, Caron C, Cho CH, Collén J, Cormier A, Delage L, Doubleau S, Frioux C, Gobet A, González-Navarrete I, Groisillier A, Hervé C, Jollivet D, KleinJan H, Leblanc C, Liu X, Marie D, Markov GV, Minoche AE, Monsoor M, Pericard P, Perrineau MM, Peters AF, Siegel A, Siméon A, Trottier C, Yoon HS, Himmelbauer H, Boyen C, Tonon T. The genome of Ectocarpus subulatus - A highly stress-tolerant brown alga. Mar Genomics 2020; 52:100740. [PMID: 31937506 DOI: 10.1016/j.margen.2020.100740] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 12/11/2019] [Accepted: 01/01/2020] [Indexed: 11/20/2022]
Abstract
Brown algae are multicellular photosynthetic stramenopiles that colonize marine rocky shores worldwide. Ectocarpus sp. Ec32 has been established as a genomic model for brown algae. Here we present the genome and metabolic network of the closely related species, Ectocarpus subulatus Kützing, which is characterized by high abiotic stress tolerance. Since their separation, both strains show new traces of viral sequences and the activity of large retrotransposons, which may also be related to the expansion of a family of chlorophyll-binding proteins. Further features suspected to contribute to stress tolerance include an expanded family of heat shock proteins, the reduction of genes involved in the production of halogenated defence compounds, and the presence of fewer cell wall polysaccharide-modifying enzymes. Overall, E. subulatus has mainly lost members of gene families down-regulated in low salinities, and conserved those that were up-regulated in the same condition. However, 96% of genes that differed between the two examined Ectocarpus species, as well as all genes under positive selection, were found to encode proteins of unknown function. This underlines the uniqueness of brown algal stress tolerance mechanisms as well as the significance of establishing E. subulatus as a comparative model for future functional studies.
Affiliation(s)
- Simon M Dittami
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France.
- Erwan Corre
- CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Loraine Brillet-Guéguen
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France; CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Agnieszka P Lipinska
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Noé Pontoizeau
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France; CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Meziane Aite
- Univ Rennes, Inria, CNRS, IRISA, 35000 Rennes, France
- Komlan Avia
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France; Université de Strasbourg, INRA, SVQV UMR-A 1131, F-68000 Colmar, France
- Christophe Caron
- CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Chung Hyun Cho
- Department of Biological Sciences, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Jonas Collén
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Alexandre Cormier
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Ludovic Delage
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Sylvie Doubleau
- IRD, UMR DIADE, 911 Avenue Agropolis, BP 64501, 34394 Montpellier, France
- Angélique Gobet
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Irene González-Navarrete
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Agnès Groisillier
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Cécile Hervé
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Didier Jollivet
- Sorbonne Université, CNRS, Adaptation and Diversity in the Marine Environment (ADME), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Hetty KleinJan
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Catherine Leblanc
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Xi Liu
- CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Dominique Marie
- Sorbonne Université, CNRS, Adaptation and Diversity in the Marine Environment (ADME), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Gabriel V Markov
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- André E Minoche
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Misharl Monsoor
- CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Pierre Pericard
- CNRS, Sorbonne Université, FR2424, ABiMS platform, Station Biologique de Roscoff, 29680 Roscoff, France
- Marie-Mathilde Perrineau
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France; Scottish Association for Marine Science, Scottish Marine Institute, Oban PA37 1QA, United Kingdom
- Anne Siegel
- Univ Rennes, Inria, CNRS, IRISA, 35000 Rennes, France
- Amandine Siméon
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Camille Trottier
- Univ Rennes, Inria, CNRS, IRISA, 35000 Rennes, France; Laboratory of Digital Sciences of Nantes (LS2N) - University of Nantes, France
- Hwan Su Yoon
- Department of Biological Sciences, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Heinz Himmelbauer
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany; Department of Biotechnology, University of Natural Resources and Life Sciences (BOKU), Vienna, 1190 Vienna, Austria
- Catherine Boyen
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France
- Thierry Tonon
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, 29680 Roscoff, France; Centre for Novel Agricultural Products, Department of Biology, University of York, York YO10 5DD, United Kingdom
3
Frazier AE, Compton AG, Kishita Y, Hock DH, Welch AE, Amarasekera SSC, Rius R, Formosa LE, Imai-Okazaki A, Francis D, Wang M, Lake NJ, Tregoning S, Jabbari JS, Lucattini A, Nitta KR, Ohtake A, Murayama K, Amor DJ, McGillivray G, Wong FY, van der Knaap MS, Jeroen Vermeulen R, Wiltshire EJ, Fletcher JM, Lewis B, Baynam G, Ellaway C, Balasubramaniam S, Bhattacharya K, Freckmann ML, Arbuckle S, Rodriguez M, Taft RJ, Sadedin S, Cowley MJ, Minoche AE, Calvo SE, Mootha VK, Ryan MT, Okazaki Y, Stroud DA, Simons C, Christodoulou J, Thorburn DR. Fatal perinatal mitochondrial cardiac failure caused by recurrent de novo duplications in the ATAD3 locus. Med (N Y) 2020; 2:49-73. [PMID: 33575671 DOI: 10.1016/j.medj.2020.06.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/12/2022]
Abstract
Background: In about half of all patients with a suspected monogenic disease, genomic investigations fail to identify the diagnosis. A contributing factor is the difficulty with repetitive regions of the genome, such as those generated by segmental duplications. The ATAD3 locus is one such region, in which recessive deletions and dominant duplications have recently been reported to cause lethal perinatal mitochondrial diseases characterized by pontocerebellar hypoplasia or cardiomyopathy, respectively. Methods: Whole exome, whole genome and long-read DNA sequencing techniques combined with studies of RNA and quantitative proteomics were used to investigate 17 subjects from 16 unrelated families with suspected mitochondrial disease. Findings: We report six different de novo duplications in the ATAD3 gene locus causing a distinctive presentation including lethal perinatal cardiomyopathy, persistent hyperlactacidemia, and frequently corneal clouding or cataracts and encephalopathy. The recurrent 68 kb ATAD3 duplications are identifiable from genome and exome sequencing but usually missed by microarrays. The ATAD3 duplications result in the formation of identical chimeric ATAD3A/ATAD3C proteins, altered ATAD3 complexes and a striking reduction in mitochondrial oxidative phosphorylation complex I and its activity in heart tissue. Conclusions: ATAD3 duplications appear to act in a dominant-negative manner and the de novo inheritance implies a low recurrence risk for families, unlike most pediatric mitochondrial diseases. More than 350 genes underlie mitochondrial diseases. In our experience the ATAD3 locus is now one of the five most common causes of nuclear-encoded pediatric mitochondrial disease but the repetitive nature of the locus means ATAD3 diagnoses may be frequently missed by current genomic strategies. Funding: Australian NHMRC, US Department of Defense, Japanese AMED and JSPS agencies, Australian Genomics Health Alliance and Australian Mito Foundation.
Affiliation(s)
- Ann E Frazier
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; These authors contributed equally: A.E. Frazier, A.G. Compton
- Alison G Compton
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; These authors contributed equally: A.E. Frazier, A.G. Compton
- Yoshihito Kishita
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan
- Daniella H Hock
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Melbourne, VIC 3052, Australia
- AnneMarie E Welch
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
- Sumudu S C Amarasekera
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia
- Rocio Rius
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia
- Luke E Formosa
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Atsuko Imai-Okazaki
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan; Division of Genomic Medicine Research, Medical Genomics Center, National Center for Global Health and Medicine, Tokyo 162-8655, Japan
- David Francis
- Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
- Min Wang
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
- Nicole J Lake
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Simone Tregoning
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
- Jafar S Jabbari
- Australian Genome Research Facility Ltd, Victorian Comprehensive Cancer Centre, Melbourne VIC 3052, Australia
- Alexis Lucattini
- Australian Genome Research Facility Ltd, Victorian Comprehensive Cancer Centre, Melbourne VIC 3052, Australia
- Kazuhiro R Nitta
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan
- Akira Ohtake
- Department of Pediatrics & Clinical Genomics, Saitama Medical University Hospital, Saitama, 350-0495, Japan
- Kei Murayama
- Department of Metabolism, Chiba Children's Hospital, Chiba, 266-0007, Japan
- David J Amor
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia
- George McGillivray
- Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
- Flora Y Wong
- Ritchie Centre, Hudson Institute of Medical Research; Department of Paediatrics, Monash University; and Monash Newborn, Monash Children's Hospital, Melbourne, VIC 3168, Australia
- Marjo S van der Knaap
- Child Neurology, Emma Children's Hospital, Amsterdam University Medical Centers, Vrije Universiteit and Amsterdam Neuroscience, 1081 HV Amsterdam, The Netherlands; Functional Genomics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit and Amsterdam Neuroscience, 1081 HV Amsterdam, The Netherlands
- R Jeroen Vermeulen
- Department of Neurology, Maastricht University Medical Center, 6229 HX, Maastricht, The Netherlands
- Esko J Wiltshire
- Department of Paediatrics and Child Health, University of Otago Wellington and Capital and Coast District Health Board, Wellington 6021, New Zealand
- Janice M Fletcher
- Department of Genetics and Molecular Pathology, SA Pathology, Adelaide, SA 5000, Australia
- Barry Lewis
- Department of Clinical Biochemistry, PathWest Laboratory Medicine Western Australia, Nedlands, WA 6009, Australia
- Gareth Baynam
- Western Australian Register of Developmental Anomalies and Genetic Services of Western Australia and King Edward Memorial Hospital for Women Perth, Subiaco, WA 6008, Australia; Telethon Kids Institute and School of Paediatrics and Child Health, The University of Western Australia, Perth, WA 6009, Australia
- Carolyn Ellaway
- Genetic Metabolic Disorders Service, Sydney Children's Hospital Network, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia; Disciplines of Genomic Medicine and Child and Adolescent Health, Sydney Medical School, University of Sydney, NSW 2145, Australia
- Shanti Balasubramaniam
- Genetic Metabolic Disorders Service, Sydney Children's Hospital Network, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia
- Kaustuv Bhattacharya
- Genetic Metabolic Disorders Service, Sydney Children's Hospital Network, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia; Disciplines of Genomic Medicine and Child and Adolescent Health, Sydney Medical School, University of Sydney, NSW 2145, Australia
- Susan Arbuckle
- Department of Histopathology, The Children's Hospital at Westmead, Sydney Children's Hospital Network, Sydney, NSW 2145, Australia
- Michael Rodriguez
- Discipline of Pathology, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Simon Sadedin
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
- Mark J Cowley
- Children's Cancer Institute, Kensington, NSW 2750, Australia; St Vincent's Clinical School, UNSW Sydney, Darlinghurst, NSW 2010, Australia; Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- André E Minoche
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- Sarah E Calvo
- Broad Institute, Cambridge, MA 02142, USA; Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02446, USA
- Vamsi K Mootha
- Broad Institute, Cambridge, MA 02142, USA; Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02446, USA
- Michael T Ryan
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Yasushi Okazaki
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan
- David A Stroud
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Melbourne, VIC 3052, Australia
- Cas Simons
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072 Australia
- John Christodoulou
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Disciplines of Genomic Medicine and Child and Adolescent Health, Sydney Medical School, University of Sydney, NSW 2145, Australia
- David R Thorburn
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Lead contact
4
Rodríguez del Río Á, Minoche AE, Zwickl NF, Friedrich A, Liedtke S, Schmidt T, Himmelbauer H, Dohm JC. Genomes of the wild beets Beta patula and Beta vulgaris ssp. maritima. Plant J 2019; 99:1242-1253. [PMID: 31104348 PMCID: PMC9546096 DOI: 10.1111/tpj.14413] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/01/2019] [Revised: 04/23/2019] [Accepted: 05/02/2019] [Indexed: 05/04/2023]
Abstract
We present draft genome assemblies of Beta patula, a critically endangered wild beet endemic to the Madeira archipelago, and of the closely related Beta vulgaris ssp. maritima (sea beet). Evidence-based reference gene sets for B. patula and sea beet were generated, consisting of 25 127 and 27 662 genes, respectively. The genomes and gene sets of the two wild beets were compared with their cultivated sister taxon B. vulgaris ssp. vulgaris (sugar beet). Large syntenic regions were identified, and a display tool for automatic genome-wide synteny image generation was developed. Phylogenetic analysis based on 9861 genes showing 1:1:1 orthology supported the close relationship of B. patula to sea beet and sugar beet. A comparative analysis of the Rz2 locus, responsible for rhizomania resistance, suggested that the sequenced B. patula accession was rhizomania susceptible. Reference karyotypes for the two wild beets were established, and genomic rearrangements were detected. We consider our data as highly valuable and comprehensive resources for wild beet studies, B. patula conservation management, and sugar beet breeding research.
Affiliation(s)
- Álvaro Rodríguez del Río
- University of Natural Resources and Life Sciences (BOKU), 1190 Vienna, Austria
- Present address: Centro de Biotecnología y Genómica de Plantas, UPM – INIA, 28223 Madrid, Spain
- André E. Minoche
- Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- Nikolaus F. Zwickl
- University of Natural Resources and Life Sciences (BOKU), 1190 Vienna, Austria
- Anja Friedrich
- University of Natural Resources and Life Sciences (BOKU), 1190 Vienna, Austria
- Present address: FH Campus Wien, University of Applied Sciences, 1030 Vienna, Austria
- Heinz Himmelbauer
- University of Natural Resources and Life Sciences (BOKU), 1190 Vienna, Austria
- Juliane C. Dohm
- University of Natural Resources and Life Sciences (BOKU), 1190 Vienna, Austria
5
Bagnall RD, Ingles J, Dinger ME, Cowley MJ, Ross SB, Minoche AE, Lal S, Turner C, Colley A, Rajagopalan S, Berman Y, Ronan A, Fatkin D, Semsarian C. Whole Genome Sequencing Improves Outcomes of Genetic Testing in Patients With Hypertrophic Cardiomyopathy. J Am Coll Cardiol 2018; 72:419-429. [DOI: 10.1016/j.jacc.2018.04.078] [Citation(s) in RCA: 109] [Impact Index Per Article: 18.2] [Received: 12/20/2017] [Revised: 03/18/2018] [Accepted: 04/24/2018] [Indexed: 11/24/2022]
6
Kumar KR, Wali GM, Kamate M, Wali G, Minoche AE, Puttick C, Pinese M, Gayevskiy V, Dinger ME, Roscioli T, Sue CM, Cowley MJ. Defining the genetic basis of early onset hereditary spastic paraplegia using whole genome sequencing. Neurogenetics 2016; 17:265-270. [PMID: 27679996 PMCID: PMC5061846 DOI: 10.1007/s10048-016-0495-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 07/04/2016] [Accepted: 09/12/2016] [Indexed: 12/20/2022]
Abstract
We performed whole genome sequencing (WGS) in nine families from India with early-onset hereditary spastic paraplegia (HSP). We obtained a genetic diagnosis in 4/9 (44 %) families within known HSP genes (DDHD2 and CYP2U1), as well as peroxisomal biogenesis disorders (PEX16) and GM1 gangliosidosis (GLB1). In the remaining patients, no candidate structural variants, copy number variants or predicted splice variants affecting an extended candidate gene list were identified. Our findings demonstrate the efficacy of using WGS for diagnosing early-onset HSP, particularly in consanguineous families (4/6 diagnosed), highlighting that two of the diagnoses would not have been made using a targeted approach.
Affiliation(s)
- Kishore R Kumar
- Department of Neurogenetics, Kolling Institute of Medical Research, Royal North Shore Hospital and University of Sydney, St Leonards, 2065, Australia; Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- G M Wali
- Neurospecialities Centre, Belgaum, India
- Mahesh Kamate
- Department of Paediatrics, KLE University's Jawaharlal Nehru J N Medical College, Belgaum, India
- Gautam Wali
- Department of Neurogenetics, Kolling Institute of Medical Research, Royal North Shore Hospital and University of Sydney, St Leonards, 2065, Australia
- André E Minoche
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Clare Puttick
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Mark Pinese
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Velimir Gayevskiy
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Marcel E Dinger
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, Australia
- Tony Roscioli
- Department of Neurogenetics, Kolling Institute of Medical Research, Royal North Shore Hospital and University of Sydney, St Leonards, 2065, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, Australia; Department of Medical Genetics, Sydney Children's Hospital, Randwick, Australia
- Carolyn M Sue
- Department of Neurogenetics, Kolling Institute of Medical Research, Royal North Shore Hospital and University of Sydney, St Leonards, 2065, Australia
- Mark J Cowley
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, Australia
7
Vlasova A, Capella-Gutiérrez S, Rendón-Anaya M, Hernández-Oñate M, Minoche AE, Erb I, Câmara F, Prieto-Barja P, Corvelo A, Sanseverino W, Westergaard G, Dohm JC, Pappas GJ, Saburido-Alvarez S, Kedra D, Gonzalez I, Cozzuto L, Gómez-Garrido J, Aguilar-Morón MA, Andreu N, Aguilar OM, Garcia-Mas J, Zehnsdorf M, Vázquez MP, Delgado-Salinas A, Delaye L, Lowy E, Mentaberry A, Vianello-Brondani RP, García JL, Alioto T, Sánchez F, Himmelbauer H, Santalla M, Notredame C, Gabaldón T, Herrera-Estrella A, Guigó R. Genome and transcriptome analysis of the Mesoamerican common bean and the role of gene duplications in establishing tissue and temporal specialization of genes. Genome Biol 2016; 17:32. [PMID: 26911872 PMCID: PMC4766624 DOI: 10.1186/s13059-016-0883-6] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 08/22/2015] [Accepted: 01/22/2016] [Indexed: 01/20/2023]
Abstract
BACKGROUND: Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. RESULTS: We report the genome and the transcription atlas of coding and non-coding genes of a Mesoamerican genotype of common bean (Phaseolus vulgaris L., BAT93). Using a comprehensive phylogenomics analysis, we assessed the past and recent evolution of common bean, and traced the diversification of patterns of gene expression following duplication. We find that successive rounds of gene duplications in legumes have shaped tissue and developmental expression, leading to increased levels of specialization in larger gene families. We also find that many long non-coding RNAs are preferentially expressed in germ-line-related tissues (pods and seeds), suggesting that they play a significant role in fruit development. Our results also suggest that most bean-specific gene family expansions, including resistance gene clusters, predate the split of the Mesoamerican and Andean gene pools. CONCLUSIONS: The genome and transcriptome data herein generated for a Mesoamerican genotype represent a counterpart to the genomic resources already available for the Andean gene pool. Altogether, this information will allow the genetic dissection of the characters involved in the domestication and adaptation of the crop, and their further implementation in breeding strategies for this important crop.
Affiliation(s)
- Anna Vlasova
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- Salvador Capella-Gutiérrez
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Yeast and Basidiomycete Research Group, CBS Fungal Biodiversity Centre, Uppsalalaan 8, 3584 LT, Utrecht, The Netherlands
- Martha Rendón-Anaya
  - Laboratorio Nacional de Genómica para la Biodiversidad, Cinvestav-Irapuato, CP 36821, Irapuato, Guanajuato, Mexico
- Miguel Hernández-Oñate
  - Laboratorio Nacional de Genómica para la Biodiversidad, Cinvestav-Irapuato, CP 36821, Irapuato, Guanajuato, Mexico
- André E Minoche
  - Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW, 2010, Australia
- Ionas Erb
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- Francisco Câmara
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- Pablo Prieto-Barja
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- André Corvelo
  - New York Genome Center, 101 Avenue of the Americas, New York, NY, 10013, USA
- Walter Sanseverino
  - IRTA, Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, 08193 Bellaterra, Barcelona, Catalonia, Spain
- Gastón Westergaard
  - Instituto de Agrobiotecnología Rosario (INDEAR), Rosario, Santa Fe, 2000, Argentina
- Juliane C Dohm
  - Department of Biotechnology, University of Natural Resources and Life Sciences (BOKU), Muthgasse 18, 1190, Vienna, Austria
- Georgios J Pappas
  - Department of Cellular Biology, University of Brasilia, Biological Science Institute, Brasília, DF, 70790-160, Brazil
- Soledad Saburido-Alvarez
  - Laboratorio Nacional de Genómica para la Biodiversidad, Cinvestav-Irapuato, CP 36821, Irapuato, Guanajuato, Mexico
- Darek Kedra
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- Irene Gonzalez
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Genomics Unit, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- Luca Cozzuto
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- Jessica Gómez-Garrido
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
- María A Aguilar-Morón
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Genomics Unit, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- Nuria Andreu
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Genomics Unit, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- O Mario Aguilar
  - Instituto de Biotecnología y Biología Molecular (IBBM), UNLP-CONICET, 1900, La Plata, Argentina
- Jordi Garcia-Mas
  - IRTA, Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, 08193 Bellaterra, Barcelona, Catalonia, Spain
- Maik Zehnsdorf
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Genomics Unit, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- Martín P Vázquez
  - Instituto de Agrobiotecnología Rosario (INDEAR), Rosario, Santa Fe, 2000, Argentina
- Alfonso Delgado-Salinas
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
- Luis Delaye
  - Departamento de Ingeniería Genética, Unidad Irapuato, Cinvestav, 36821, Irapuato, Guanajuato, Mexico
- Ernesto Lowy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Alejandro Mentaberry
  - Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA), C1428EGA, Buenos Aires, Argentina
- José Luís García
  - Environmental Biology Department, Centro de Investigaciones Biológicas, (CSIC), 28040, Madrid, Spain
- Tyler Alioto
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
- Federico Sánchez
  - Depto. de Biología Molecular de Plantas, Instituto Biotecnología, Universidad Nacional Autónoma de México, 62210, Cuernavaca, Morelos, Mexico
- Heinz Himmelbauer
  - Department of Biotechnology, University of Natural Resources and Life Sciences (BOKU), Muthgasse 18, 1190, Vienna, Austria
- Marta Santalla
  - Mision Biológica de Galicia (MBG)-National Spanish Research Council (CSIC), 36080, Pontevedra, Spain
- Cedric Notredame
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
- Toni Gabaldón
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010, Barcelona, Spain
- Alfredo Herrera-Estrella
  - Laboratorio Nacional de Genómica para la Biodiversidad, Cinvestav-Irapuato, CP 36821, Irapuato, Guanajuato, Mexico
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003, Barcelona, Spain
  - IMIM (Hospital del Mar Medical Research Institute), 08003, Barcelona, Spain
|
8
|
Minoche AE, Dohm JC, Schneider J, Holtgräwe D, Viehöver P, Montfort M, Sörensen TR, Weisshaar B, Himmelbauer H. Exploiting single-molecule transcript sequencing for eukaryotic gene prediction. Genome Biol 2015; 16:184. [PMID: 26328666 PMCID: PMC4556409 DOI: 10.1186/s13059-015-0729-7] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Received: 03/01/2015] [Accepted: 07/22/2015] [Indexed: 12/20/2022] Open
Abstract
We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings, together with mRNA-seq noise reduction of the assisting Illumina reads, result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes.
Affiliation(s)
- André E Minoche
  - Max Planck Institute for Molecular Genetics, Berlin, Germany; Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Juliane C Dohm
  - Max Planck Institute for Molecular Genetics, Berlin, Germany; Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; University of Natural Resources and Life Sciences (BOKU), Muthgasse 18, 1190, Vienna, Austria
- Jessica Schneider
  - Department of Biology/Center for Biotechnology, Bielefeld University, 33615, Bielefeld, Germany
- Daniela Holtgräwe
  - Department of Biology/Center for Biotechnology, Bielefeld University, 33615, Bielefeld, Germany
- Prisca Viehöver
  - Department of Biology/Center for Biotechnology, Bielefeld University, 33615, Bielefeld, Germany
- Magda Montfort
  - Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Thomas Rosleff Sörensen
  - Department of Biology/Center for Biotechnology, Bielefeld University, 33615, Bielefeld, Germany
- Bernd Weisshaar
  - Department of Biology/Center for Biotechnology, Bielefeld University, 33615, Bielefeld, Germany
- Heinz Himmelbauer
  - Max Planck Institute for Molecular Genetics, Berlin, Germany; Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; University of Natural Resources and Life Sciences (BOKU), Muthgasse 18, 1190, Vienna, Austria
|
9
|
Bellmunt J, Selvarajah S, Rodig S, Salido M, de Muga S, Costa I, Bellosillo B, Werner L, Mullane S, Fay AP, O'Brien R, Barretina J, Minoche AE, Signoretti S, Montagut C, Himmelbauer H, Berman DM, Kantoff P, Choueiri TK, Rosenberg JE. Identification of ALK gene alterations in urothelial carcinoma. PLoS One 2014; 9:e103325. [PMID: 25083769 PMCID: PMC4118868 DOI: 10.1371/journal.pone.0103325] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/15/2014] [Accepted: 06/26/2014] [Indexed: 12/18/2022] Open
Abstract
Background: Anaplastic lymphoma kinase (ALK) genomic alterations have emerged as a potent predictor of benefit from treatment with ALK inhibitors in several cancers. Currently, there is no information about ALK gene alterations in urothelial carcinoma (UC) and its correlation with clinical or pathologic features and outcome. Methods: Samples from patients with advanced UC and correlative clinical data were collected. Genomic imbalances were investigated by array comparative genomic hybridization (aCGH). ALK gene status was evaluated by fluorescence in situ hybridization (FISH). ALK expression was assessed by immunohistochemistry (IHC) and high-throughput mutation analysis with Oncomap 3 platform. Next generation sequencing was performed using Illumina Genome Analyzer IIx, and Illumina HiSeq 2000 in the FISH positive case. Results: 70 of 96 patients had tissue available for all the tests performed. Arm level copy number gains at chromosome 2 were identified in 17 (24%) patients. Minor copy number alterations (CNAs) in the proximity of ALK locus were found in 3 patients by aCGH. By FISH analysis, one of these samples had a deletion of the 5′ALK. Whole genome next generation sequencing was inconclusive to confirm the deletion at the level of the ALK gene at the coverage level used. We did not observe an association between ALK CNA and overall survival, ECOG PS, or development of visceral disease. Conclusions: ALK genomic alterations are rare and probably without prognostic implications in UC. The potential for testing ALK inhibitors in UC merits further investigation but might be restricted to the identification of an enriched population.
Affiliation(s)
- Joaquim Bellmunt
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
  - Hospital del Mar Research Institute-IMIM, Barcelona, Spain
- Shamini Selvarajah
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Scott Rodig
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Marta Salido
  - Department of Pathology, Hospital del Mar Research Institute-IMIM, Barcelona, Spain
- Silvia de Muga
  - Department of Pathology, Hospital del Mar Research Institute-IMIM, Barcelona, Spain
- Beatriz Bellosillo
  - Department of Pathology, Hospital del Mar Research Institute-IMIM, Barcelona, Spain
- Lillian Werner
  - Biostatistics and Computational Biology, Harvard Medical School, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Stephanie Mullane
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- André P. Fay
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Robert O'Brien
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Jordi Barretina
  - Broad Institute, Cambridge, Massachusetts, United States of America
- André E. Minoche
  - Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Centre for Genomic Regulation (CRG), Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Sabina Signoretti
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Clara Montagut
  - Hospital del Mar Research Institute-IMIM, Barcelona, Spain
- Heinz Himmelbauer
  - Centre for Genomic Regulation (CRG), Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- David M. Berman
  - Department of Pathology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Philip Kantoff
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Toni K. Choueiri
  - Bladder Cancer Center, Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts, United States of America
- Jonathan E. Rosenberg
  - Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, New York, United States of America
|
10
|
Heitkam T, Holtgräwe D, Dohm JC, Minoche AE, Himmelbauer H, Weisshaar B, Schmidt T. Profiling of extensively diversified plant LINEs reveals distinct plant-specific subclades. Plant J 2014; 79:385-97. [PMID: 24862340 DOI: 10.1111/tpj.12565] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/27/2014] [Revised: 05/12/2014] [Accepted: 05/15/2014] [Indexed: 05/03/2023]
Abstract
A large fraction of eukaryotic genomes is made up of long interspersed nuclear elements (LINEs). Due to their capability to create novel copies via error-prone reverse transcription, they generate multiple families and reach high copy numbers. Although mammalian LINEs have been well described, plant LINEs have been only poorly investigated. Here, we present a systematic cross-species survey of LINEs in higher plant genomes shedding light on plant LINE evolution as well as diversity, and facilitating their annotation in genome projects. Applying a Hidden Markov Model (HMM)-based analysis, 59 390 intact LINE reverse transcriptases (RTs) were extracted from 23 plant genomes. These fall in only two out of 28 LINE clades (L1 and RTE) known in eukaryotes. While plant RTE LINEs are highly homogenous and mostly constitute only a single family per genome, plant L1 LINEs are extremely diverse and form numerous families. Despite their heterogeneity, all members across the 23 species fall into only seven L1 subclades, some of them defined here. Exemplarily focusing on the L1 LINEs of a basal reference plant genome (Beta vulgaris), we show that the subclade classification level does not only reflect RT sequence similarity, but also mirrors structural aspects of complete LINE retrotransposons, like element size, position and type of encoded enzymatic domains. Our comprehensive catalogue of plant LINE RTs serves the classification of highly diverse plant LINEs, while the provided subclade-specific HMMs facilitate their annotation.
Affiliation(s)
- Tony Heitkam
  - Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
|
11
|
Schmidt M, Hense S, Minoche AE, Dohm JC, Himmelbauer H, Schmidt T, Zakrzewski F. Cytosine methylation of an ancient satellite family in the wild beet Beta procumbens. Cytogenet Genome Res 2014; 143:157-67. [PMID: 24994030 DOI: 10.1159/000363485] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022] Open
Abstract
DNA methylation is an essential epigenetic feature for the regulation and maintenance of heterochromatin. Satellite DNA is a repetitive sequence component that often occurs in large arrays in heterochromatin of subtelomeric, intercalary and centromeric regions. Knowledge about the methylation status of satellite DNA is important for understanding the role of repetitive DNA in heterochromatization. In this study, we investigated the cytosine methylation of the ancient satellite family pEV in the wild beet Beta procumbens. The pEV satellite is widespread in species-specific pEV subfamilies in the genus Beta and most likely originated before the radiation of the Betoideae and Chenopodioideae. In B. procumbens, the pEV subfamily occurs abundantly and spans intercalary and centromeric regions. To uncover its cytosine methylation, we performed chromosome-wide immunostaining and bisulfite sequencing of pEV satellite repeats. We found that CG and CHG sites are highly methylated while CHH sites show only low levels of methylation. As a consequence of the low frequency of CG and CHG sites and the preferential occurrence of most cytosines in the CHH motif in pEV monomers, this satellite family displays only low levels of total cytosine methylation.
Affiliation(s)
- Martin Schmidt
  - Department of Plant Cell and Molecular Biology, TU Dresden, Dresden, Germany
|
12
|
Zakrzewski F, Schubert V, Viehoever P, Minoche AE, Dohm JC, Himmelbauer H, Weisshaar B, Schmidt T. The CHH motif in sugar beet satellite DNA: a modulator for cytosine methylation. Plant J 2014; 78:937-50. [PMID: 24661787 DOI: 10.1111/tpj.12519] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/22/2013] [Revised: 03/17/2014] [Accepted: 03/18/2014] [Indexed: 05/03/2023]
Abstract
Methylation of DNA is important for the epigenetic silencing of repetitive DNA in plant genomes. Knowledge about the cytosine methylation status of satellite DNAs, a major class of repetitive DNA, is scarce. One reason for this is that arrays of tandemly arranged sequences are usually collapsed in next-generation sequencing assemblies. We applied strategies to overcome this limitation and quantified the level of cytosine methylation and its pattern in three satellite families of sugar beet (Beta vulgaris) which differ in their abundance, chromosomal localization and monomer size. We visualized methylation levels along pachytene chromosomes with respect to small satellite loci at maximum resolution using chromosome-wide fluorescent in situ hybridization complemented with immunostaining and super-resolution microscopy. Only reduced methylation of many satellite arrays was obtained. To investigate methylation at the nucleotide level we performed bisulfite sequencing of 1569 satellite sequences. We found that the level of methylation of cytosine strongly depends on the sequence context: cytosines in the CHH motif show lower methylation (44-52%), while CG and CHG motifs are more strongly methylated. This affects the overall methylation of satellite sequences because CHH occurs frequently while CG and CHG are rare or even absent in the satellite arrays investigated. Evidently, CHH is the major target for modulation of the cytosine methylation level of adjacent monomers within individual arrays and contributes to their epigenetic function. This strongly indicates that asymmetric cytosine methylation plays a role in the epigenetic modification of satellite repeats in plant genomes.
Affiliation(s)
- Falk Zakrzewski
  - Department of Plant Cell and Molecular Biology, TU Dresden, D-01062, Dresden, Germany
|
13
|
Leiva-Eriksson N, Pin PA, Kraft T, Dohm JC, Minoche AE, Himmelbauer H, Bülow L. Differential expression patterns of non-symbiotic hemoglobins in sugar beet (Beta vulgaris ssp. vulgaris). Plant Cell Physiol 2014; 55:834-44. [PMID: 24486763 DOI: 10.1093/pcp/pcu027] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 05/08/2023]
Abstract
Biennial sugar beet (Beta vulgaris ssp. vulgaris) is a Caryophyllidae that has adapted its growth cycle to the seasonal temperature and daylength variation of temperate regions. This is the first time a holistic study of the expression pattern of non-symbiotic hemoglobins (nsHbs) is being carried out in a member of this group and under two essential environmental conditions for flowering, namely vernalization and length of photoperiod. BvHb genes were identified by sequence homology searches against the latest draft of the sugar beet genome. Three nsHb genes (BvHb1.1, BvHb1.2 and BvHb2) and one truncated Hb gene (BvHb3) were found in the genome of sugar beet. Gene expression profiling of the nsHb genes was carried out by quantitative PCR in different organs and developmental stages, as well as during vernalization and under different photoperiods. BvHb1.1 and BvHb2 showed differential expression during vernalization as well as during long and short days. The high expression of BvHb2 indicates that it has an active role in the cell, maybe even taking over some BvHb1.2 functions, except during germination, where BvHb1.2 together with BvHb1.1 (both Class 1 nsHbs) is highly expressed. The unprecedented finding of a leader peptide at the N-terminus of BvHb1.1, for the first time in an nsHb from higher plants, together with its observed expression, indicates that it may have a very specific role due to its suggested location in chloroplasts. Our findings open up new possibilities for research, breeding and engineering, since Hbs could be more involved in plant development than was previously anticipated.
Affiliation(s)
- Nélida Leiva-Eriksson
  - Department of Pure and Applied Biochemistry, Lund University, Box 124, 221.00 Lund, Sweden
|
14
|
Dohm JC, Minoche AE, Holtgräwe D, Capella-Gutiérrez S, Zakrzewski F, Tafer H, Rupp O, Sörensen TR, Stracke R, Reinhardt R, Goesmann A, Kraft T, Schulz B, Stadler PF, Schmidt T, Gabaldón T, Lehrach H, Weisshaar B, Himmelbauer H. The genome of the recently domesticated crop plant sugar beet (Beta vulgaris). Nature 2013; 505:546-9. [PMID: 24352233 DOI: 10.1038/nature12817] [Citation(s) in RCA: 326] [Impact Index Per Article: 29.6] [Received: 03/03/2013] [Accepted: 10/29/2013] [Indexed: 01/25/2023]
Abstract
Sugar beet (Beta vulgaris ssp. vulgaris) is an important crop of temperate climates which provides nearly 30% of the world's annual sugar production and is a source for bioethanol and animal feed. The species belongs to the order of Caryophyllales, is diploid with 2n = 18 chromosomes, has an estimated genome size of 714-758 megabases and shares an ancient genome triplication with other eudicot plants. Leafy beets have been cultivated since Roman times, but sugar beet is one of the most recently domesticated crops. It arose in the late eighteenth century when lines accumulating sugar in the storage root were selected from crosses made with chard and fodder beet. Here we present a reference genome sequence for sugar beet as the first non-rosid, non-asterid eudicot genome, advancing comparative genomics and phylogenetic reconstructions. The genome sequence comprises 567 megabases, of which 85% could be assigned to chromosomes. The assembly covers a large proportion of the repetitive sequence content that was estimated to be 63%. We predicted 27,421 protein-coding genes supported by transcript data and annotated them on the basis of sequence homology. Phylogenetic analyses provided evidence for the separation of Caryophyllales before the split of asterids and rosids, and revealed lineage-specific gene family expansions and losses. We sequenced spinach (Spinacia oleracea), another Caryophyllales species, and validated features that separate this clade from rosids and asterids. Intraspecific genomic variation was analysed based on the genome sequences of sea beet (Beta vulgaris ssp. maritima; progenitor of all beet crops) and four additional sugar beet accessions. We identified seven million variant positions in the reference genome, and also large regions of low variability, indicating artificial selection. The sugar beet genome sequence enables the identification of genes affecting agronomically relevant traits, supports molecular breeding and maximizes the plant's potential in energy biotechnology.
Affiliation(s)
- Juliane C Dohm
  - [1] Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany [2] Centre for Genomic Regulation (CRG), C. Dr. Aiguader 88, 08003 Barcelona, Spain [3] Universitat Pompeu Fabra (UPF), C. Dr. Aiguader 88, 08003 Barcelona, Spain [4]
- André E Minoche
  - [1] Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany [2] Centre for Genomic Regulation (CRG), C. Dr. Aiguader 88, 08003 Barcelona, Spain [3] Universitat Pompeu Fabra (UPF), C. Dr. Aiguader 88, 08003 Barcelona, Spain [4]
- Daniela Holtgräwe
  - Bielefeld University, CeBiTec and Department of Biology, Universitätsstraße 25, 33615 Bielefeld, Germany
- Salvador Capella-Gutiérrez
  - [1] Centre for Genomic Regulation (CRG), C. Dr. Aiguader 88, 08003 Barcelona, Spain [2] Universitat Pompeu Fabra (UPF), C. Dr. Aiguader 88, 08003 Barcelona, Spain
- Falk Zakrzewski
  - TU Dresden, Department of Biology, Zellescher Weg 20b, 01217 Dresden, Germany
- Hakim Tafer
  - University of Leipzig, Department of Computer Science, Härtelstraße 16-18, 04107 Leipzig, Germany
- Oliver Rupp
  - Bielefeld University, CeBiTec and Department of Biology, Universitätsstraße 25, 33615 Bielefeld, Germany
- Thomas Rosleff Sörensen
  - Bielefeld University, CeBiTec and Department of Biology, Universitätsstraße 25, 33615 Bielefeld, Germany
- Ralf Stracke
  - Bielefeld University, CeBiTec and Department of Biology, Universitätsstraße 25, 33615 Bielefeld, Germany
- Richard Reinhardt
  - Max Planck Genome Centre Cologne, Carl-von-Linné-Weg 10, 50829 Köln, Germany
- Alexander Goesmann
  - Bielefeld University, CeBiTec and Department of Biology, Universitätsstraße 25, 33615 Bielefeld, Germany
- Britta Schulz
  - KWS SAAT AG, Grimsehlstraße 31, 37574 Einbeck, Germany
- Peter F Stadler
  - University of Leipzig, Department of Computer Science, Härtelstraße 16-18, 04107 Leipzig, Germany
- Thomas Schmidt
  - TU Dresden, Department of Biology, Zellescher Weg 20b, 01217 Dresden, Germany
- Toni Gabaldón
  - [1] Centre for Genomic Regulation (CRG), C. Dr. Aiguader 88, 08003 Barcelona, Spain [2] Universitat Pompeu Fabra (UPF), C. Dr. Aiguader 88, 08003 Barcelona, Spain [3] Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
- Hans Lehrach
  - Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Bernd Weisshaar
  - Bielefeld University, CeBiTec and Department of Biology, Universitätsstraße 25, 33615 Bielefeld, Germany
- Heinz Himmelbauer
  - [1] Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany [2] Centre for Genomic Regulation (CRG), C. Dr. Aiguader 88, 08003 Barcelona, Spain [3] Universitat Pompeu Fabra (UPF), C. Dr. Aiguader 88, 08003 Barcelona, Spain
|
15
|
Weber B, Heitkam T, Holtgräwe D, Weisshaar B, Minoche AE, Dohm JC, Himmelbauer H, Schmidt T. Highly diverse chromoviruses of Beta vulgaris are classified by chromodomains and chromosomal integration. Mob DNA 2013; 4:8. [PMID: 23448600 PMCID: PMC3605345 DOI: 10.1186/1759-8753-4-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 09/11/2012] [Accepted: 01/22/2013] [Indexed: 12/25/2022] Open
Abstract
Background: Chromoviruses are one of the three genera of Ty3-gypsy long terminal repeat (LTR) retrotransposons, and are present in high copy numbers in plant genomes. They are widely distributed within the plant kingdom, with representatives even in lower plants such as green and red algae. Their hallmark is the presence of a chromodomain at the C-terminus of the integrase. The chromodomain exhibits structural characteristics similar to proteins of the heterochromatin protein 1 (HP1) family, which mediate the binding of each chromovirus type to specific histone variants. A specific integration via the chromodomain has been shown for only a few chromoviruses. However, a detailed study of different chromoviral clades populating a single plant genome has not yet been carried out. Results: We conducted a comprehensive survey of chromoviruses within the Beta vulgaris (sugar beet) genome, and found a highly diverse chromovirus population, with significant differences in element size, primarily caused by their flanking LTRs. In total, we identified and annotated full-length members of 16 families belonging to the four plant chromoviral clades: CRM, Tekay, Reina, and Galadriel. The families within each clade are structurally highly conserved; in particular, the position of the chromodomain coding region relative to the polypurine tract is clade-specific. Two distinct groups of chromodomains were identified. The group II chromodomain was present in three chromoviral clades, whereas families of the CRM clade contained a more divergent motif. Physical mapping using representatives of all four clades identified a clade-specific integration pattern. For some chromoviral families, we detected the presence of expressed sequence tags, indicating transcriptional activity. Conclusions: We present a detailed study of chromoviruses, belonging to the four major clades, which populate a single plant genome. Our results illustrate the diversity and family structure of B. vulgaris chromoviruses, and emphasize the role of chromodomains in the targeted integration of these viruses. We suggest that the diverse sets of plant chromoviruses with their different localization patterns might help to facilitate plant-genome organization in a structural and functional manner.
Affiliation(s)
- Beatrice Weber
- Institute of Botany, Dresden University of Technology, Dresden D-01062, Germany.
|
16
|
Wollrab C, Heitkam T, Holtgräwe D, Weisshaar B, Minoche AE, Dohm JC, Himmelbauer H, Schmidt T. Evolutionary reshuffling in the Errantivirus lineage Elbe within the Beta vulgaris genome. Plant J 2012; 72:636-51. [PMID: 22804913 DOI: 10.1111/j.1365-313x.2012.05107.x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 05/03/2023]
Abstract
LTR retrotransposons and retroviruses are closely related. Although a viral envelope gene is found in some LTR retrotransposons and all retroviruses, only the latter show infectivity. The identification of Ty3-gypsy-like retrotransposons possessing putative envelope-like open reading frames blurred the taxonomical borders and led to the establishment of the Errantivirus, Metavirus and Chromovirus genera within the Metaviridae. Only a few plant Errantiviruses have been described, and their evolutionary history is not well understood. In this study, we investigated 27 retroelements of four abundant Elbe retrotransposon families belonging to the Errantiviruses in Beta vulgaris (sugar beet). Retroelements of the Elbe lineage integrated between 0.02 and 5.59 million years ago, and show family-specific variations in autonomy and degree of rearrangements: while Elbe3 members are highly fragmented, often truncated and present in a high number of solo LTRs, Elbe2 members are mainly autonomous. We observed extensive reshuffling of structural motifs across families, leading to the formation of new retrotransposon families. Elbe retrotransposons harbor a typical envelope-like gene, often encoding transmembrane domains. During the course of Elbe evolution, the additional open reading frames have been strongly modified or independently acquired. Taken together, the Elbe lineage serves as a retrotransposon model reflecting the various stages in Errantivirus evolution, and allows a detailed analysis of retrotransposon family formation.
Affiliation(s)
- Cora Wollrab
- Department of Biology, Dresden University of Technology, D-01062, Dresden, Germany
|
17
|
Menzel G, Krebs C, Diez M, Holtgräwe D, Weisshaar B, Minoche AE, Dohm JC, Himmelbauer H, Schmidt T. Survey of sugar beet (Beta vulgaris L.) hAT transposons and MITE-like hATpin derivatives. Plant Mol Biol 2012; 78:393-405. [PMID: 22246381 DOI: 10.1007/s11103-011-9872-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 08/31/2011] [Accepted: 12/20/2011] [Indexed: 05/03/2023]
Abstract
Genome-wide analyses of repetitive DNA suggest a significant impact particularly of transposable elements on genome size and evolution of virtually all eukaryotic organisms. In this study, we analyzed the abundance and diversity of the hAT transposon superfamily of the sugar beet (B. vulgaris) genome, using molecular, bioinformatic and cytogenetic approaches. We identified 81 transposase-coding sequences, three of which are part of structurally intact but nonfunctional hAT transposons (BvhAT), in a B. vulgaris BAC library as well as in whole genome sequencing-derived data sets. Additionally, 116 complete and 497 truncated non-autonomous BvhAT derivatives lacking the transposase gene were detected in silico. The 116 complete derivatives were subdivided into four BvhATpin groups each characterized by a distinct terminal inverted repeat motif. Both BvhAT and BvhATpin transposons are specific for species of the genus Beta and closely related species, showing a localization on B. vulgaris chromosomes predominantly in euchromatic regions. The lack of any BvhAT transposase function together with the high degree of degeneration observed for the BvhAT and the BvhATpin genomic fraction contrasts with the abundance and activity of autonomous and non-autonomous hAT transposons revealed in other plant species. This indicates a possible genus-specific structural and functional repression of the hAT transposon superfamily during Beta diversification and evolution.
Affiliation(s)
- Gerhard Menzel
- Institute of Botany, Dresden University of Technology, 01062 Dresden, Germany
|
18
|
Minoche AE, Dohm JC, Himmelbauer H. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems. Genome Biol 2011; 12:R112. [PMID: 22067484 PMCID: PMC3334598 DOI: 10.1186/gb-2011-12-11-r112] [Citation(s) in RCA: 385] [Impact Index Per Article: 29.6] [Received: 07/16/2011] [Revised: 10/21/2011] [Accepted: 11/08/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The generation and analysis of high-throughput sequencing data are becoming a major component of many studies in molecular biology and medical research. Illumina's Genome Analyzer (GA) and HiSeq instruments are currently the most widely used sequencing devices. Here, we comprehensively evaluate properties of genomic HiSeq and GAIIx data derived from two plant genomes and one virus, with read lengths of 95 to 150 bases. RESULTS We provide quantifications and evidence for GC bias, error rates, error sequence context, effects of quality filtering, and the reliability of quality values. By combining different filtering criteria we reduced error rates 7-fold at the expense of discarding 12.5% of alignable bases. While overall error rates are low in HiSeq data we observed regions of accumulated wrong base calls. Only 3% of all error positions accounted for 24.7% of all substitution errors. Analyzing the forward and reverse strands separately revealed error rates of up to 18.7%. Insertions and deletions occurred at very low rates on average but increased to up to 2% in homopolymers. A positive correlation between read coverage and GC content was found depending on the GC content range. CONCLUSIONS The errors and biases we report have implications for the use and the interpretation of Illumina sequencing data. GAIIx and HiSeq data sets show slightly different error profiles. Quality filtering is essential to minimize downstream analysis artifacts. Supporting previous recommendations, the strand-specificity provides a criterion to distinguish sequencing errors from low abundance polymorphisms.
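The strand-specificity criterion mentioned in the conclusions can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the `StrandCounts` type, the function names, and both thresholds (`min_fraction`, `max_ratio`) are hypothetical choices made for this example. The idea is that a true low-abundance polymorphism should show up at similar frequency on both strands, whereas a systematic sequencing error tends to concentrate on one strand.

```python
from dataclasses import dataclass


@dataclass
class StrandCounts:
    """Base-call counts at one reference position, split by read strand."""
    fwd_ref: int  # forward-strand reads matching the reference
    fwd_alt: int  # forward-strand reads with a non-reference base
    rev_ref: int  # reverse-strand reads matching the reference
    rev_alt: int  # reverse-strand reads with a non-reference base


def alt_fraction(alt: int, ref: int) -> float:
    """Fraction of non-reference calls; 0.0 when the strand has no coverage."""
    total = alt + ref
    return alt / total if total else 0.0


def looks_strand_specific(c: StrandCounts,
                          min_fraction: float = 0.02,
                          max_ratio: float = 5.0) -> bool:
    """Flag positions whose non-reference calls sit mostly on one strand.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    fwd = alt_fraction(c.fwd_alt, c.fwd_ref)
    rev = alt_fraction(c.rev_alt, c.rev_ref)
    high, low = max(fwd, rev), min(fwd, rev)
    if high < min_fraction:
        return False  # too few non-reference calls on either strand to judge
    if low == 0.0:
        return True  # non-reference calls appear on one strand only
    return high / low > max_ratio
```

A position flagged by `looks_strand_specific` would be treated as a likely sequencing error in this sketch, while a position with balanced non-reference fractions on both strands would remain a candidate polymorphism.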
Affiliation(s)
- André E Minoche
- Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany
- Centre for Genomic Regulation (CRG) and UPF, C. Dr. Aiguader 88, 08003 Barcelona, Spain
- Juliane C Dohm
- Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany
- Centre for Genomic Regulation (CRG) and UPF, C. Dr. Aiguader 88, 08003 Barcelona, Spain
- Heinz Himmelbauer
- Centre for Genomic Regulation (CRG) and UPF, C. Dr. Aiguader 88, 08003 Barcelona, Spain
|