1
Zhou Y, Kathiresan N, Yu Z, Rivera LF, Yang Y, Thimma M, Manickam K, Chebotarov D, Mauleon R, Chougule K, Wei S, Gao T, Green CD, Zuccolo A, Xie W, Ware D, Zhang J, McNally KL, Wing RA. A high-performance computational workflow to accelerate GATK SNP detection across a 25-genome dataset. BMC Biol 2024; 22:13. [PMID: 38273258 PMCID: PMC10809545 DOI: 10.1186/s12915-024-01820-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 01/09/2024] [Indexed: 01/27/2024] Open
Abstract
BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the most widely used form of molecular genetic variation studies. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The genome analysis toolkit (GATK) is one of the most widely used SNP calling software tools publicly available, but unfortunately, high-performance computing versions of this tool have yet to become widely available and affordable.
RESULTS: Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy with comparable results with previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a "subpopulation aware" 16-genome rice reference panel with ~3000 resequenced rice accessions. The entire process took ~16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., IRGSP RefSeq).
CONCLUSIONS: This study developed an open-source pipeline (HPC-GVCW) to run GATK on HPC platforms, which significantly improved the speed at which SNPs can be called. The workflow is widely applicable as demonstrated successfully for four major crop species with genomes ranging in size from 400 Mb to 2.4 Gb. Using HPC-GVCW in production mode to call SNPs on a 25 multi-crop-reference genome data set produced over 1.1 billion SNPs that were publicly released for functional and breeding studies. For rice, many novel SNPs were identified and were found to reside within genes and open chromatin regions that are predicted to have functional consequences. Combined, our results demonstrate the usefulness of combining a high-performance SNP calling architecture solution with a subpopulation-aware reference genome panel for rapid SNP discovery and public deployment.
Affiliation(s)
- Yong Zhou
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
  - Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Nagarajan Kathiresan
  - KAUST Supercomputing Laboratory (KSL), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Zhichao Yu
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
- Luis F Rivera
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Yujian Yang
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
- Manjula Thimma
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Keerthana Manickam
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Dmytro Chebotarov
  - International Rice Research Institute (IRRI), Los Baños, Laguna, 4031, Philippines
- Ramil Mauleon
  - International Rice Research Institute (IRRI), Los Baños, Laguna, 4031, Philippines
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Sharon Wei
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Tingting Gao
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
- Carl D Green
  - Information Technology Department, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Andrea Zuccolo
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
  - Crop Science Research Center (CSRC), Scuola Superiore Sant'Anna, Pisa, 56127, Italy
- Weibo Xie
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
  - USDA ARS NEA Plant, Soil & Nutrition Laboratory Research Unit, Ithaca, NY, 14853, USA
- Jianwei Zhang
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
- Kenneth L McNally
  - International Rice Research Institute (IRRI), Los Baños, Laguna, 4031, Philippines
- Rod A Wing
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
  - Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
  - International Rice Research Institute (IRRI), Los Baños, Laguna, 4031, Philippines
2
Harrison PW, Amode MR, Austine-Orimoloye O, Azov A, Barba M, Barnes I, Becker A, Bennett R, Berry A, Bhai J, Bhurji SK, Boddu S, Branco Lins PR, Brooks L, Ramaraju S, Campbell L, Martinez MC, Charkhchi M, Chougule K, Cockburn A, Davidson C, De Silva N, Dodiya K, Donaldson S, El Houdaigui B, Naboulsi T, Fatima R, Giron CG, Genez T, Grigoriadis D, Ghattaoraya G, Martinez JG, Gurbich T, Hardy M, Hollis Z, Hourlier T, Hunt T, Kay M, Kaykala V, Le T, Lemos D, Lodha D, Marques-Coelho D, Maslen G, Merino G, Mirabueno L, Mushtaq A, Hossain S, Ogeh D, Sakthivel MP, Parker A, Perry M, Piližota I, Poppleton D, Prosovetskaia I, Raj S, Pérez-Silva J, Salam A, Saraf S, Saraiva-Agostinho N, Sheppard D, Sinha S, Sipos B, Sitnik V, Stark W, Steed E, Suner MM, Surapaneni L, Sutinen K, Tricomi FF, Urbina-Gómez D, Veidenberg A, Walsh TA, Ware D, Wass E, Willhoft N, Allen J, Alvarez-Jarreta J, Chakiachvili M, Flint B, Giorgetti S, Haggerty L, Ilsley G, Keatley J, Loveland J, Moore B, Mudge J, Naamati G, Tate J, Trevanion S, Winterbottom A, Frankish A, Hunt SE, Cunningham F, Dyer S, Finn R, Martin F, Yates A. Ensembl 2024. Nucleic Acids Res 2024; 52:D891-D899. [PMID: 37953337 PMCID: PMC10767893 DOI: 10.1093/nar/gkad1049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/20/2023] [Accepted: 10/24/2023] [Indexed: 11/14/2023] Open
Abstract
Ensembl (https://www.ensembl.org) is a freely available genomic resource that has produced high-quality annotations, tools, and services for vertebrates and model organisms for more than two decades. In recent years, there has been a dramatic shift in the genomic landscape, with a large increase in the number and phylogenetic breadth of high-quality reference genomes, alongside major advances in the pan-genome representations of higher species. In order to support these efforts and accelerate downstream research, Ensembl continues to focus on scaling for the rapid annotation of new genome assemblies, developing new methods for comparative analysis, and expanding the depth and quality of our genome annotations. This year we have continued our expansion to support global biodiversity research, doubling the number of annotated genomes we support on our Rapid Release site to over 1700, driven by our close collaboration with biodiversity projects such as Darwin Tree of Life. We have also strengthened support for key agricultural species, including the first regulatory builds for farmed animals, and have updated key tools and resources that support the global scientific community, notably the Ensembl Variant Effect Predictor. Ensembl data, software, and tools are freely available.
Affiliation(s)
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- M Ridwan Amode
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Olanrewaju Austine-Orimoloye
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrey G Azov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Matthieu Barba
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- If Barnes
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Arne Becker
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Ruth Bennett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrew Berry
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jyothish Bhai
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Simarpreet Kaur Bhurji
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sanjay Boddu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Paulo R Branco Lins
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Lucy Brooks
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Shashank Budhanuru Ramaraju
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Lahcen I Campbell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Manuel Carbajo Martinez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Mehrnaz Charkhchi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
- Alexander Cockburn
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Claire Davidson
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Nishadi H De Silva
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Kamalkumar Dodiya
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sarah Donaldson
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Bilal El Houdaigui
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Tamara El Naboulsi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Reham Fatima
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Carlos Garcia Giron
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Thiago Genez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Dionysios Grigoriadis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Gurpreet S Ghattaoraya
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jose Gonzalez Martinez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Tatiana A Gurbich
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Matthew Hardy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Zoe Hollis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Thibaut Hourlier
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Toby Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Mike Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Vinay Kaykala
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Tuan Le
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Diana Lemos
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Disha Lodha
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Diego Marques-Coelho
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Gareth Maslen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Gabriela Alejandra Merino
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Louisse Paola Mirabueno
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Aleena Mushtaq
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Syed Nakib Hossain
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Denye N Ogeh
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Manoj Pandian Sakthivel
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Anne Parker
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Malcolm Perry
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Ivana Piližota
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Daniel Poppleton
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Irina Prosovetskaia
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Shriya Raj
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- José G Pérez-Silva
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Ahamed Imran Abdul Salam
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Shradha Saraf
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Nuno Saraiva-Agostinho
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Dan Sheppard
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Swati Sinha
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Botond Sipos
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Vasily Sitnik
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- William Stark
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Emily Steed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Marie-Marthe Suner
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Likhitha Surapaneni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Kyösti Sutinen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Francesca Floriana Tricomi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- David Urbina-Gómez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andres Veidenberg
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Thomas A Walsh
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Doreen Ware
  - Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
  - USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
- Elizabeth Wass
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Natalie L Willhoft
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jamie Allen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jorge Alvarez-Jarreta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Marc Chakiachvili
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Bethany Flint
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Stefano Giorgetti
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Leanne Haggerty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Garth R Ilsley
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jon Keatley
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jane E Loveland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Benjamin Moore
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Guy Naamati
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- John Tate
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Stephen J Trevanion
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrea Winterbottom
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Adam Frankish
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sarah E Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Fiona Cunningham
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Sarah Dyer
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Robert D Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Fergal J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
- Andrew D Yates
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
3
Ntakirutimana F, Tranchant-Dubreuil C, Cubry P, Chougule K, Zhang J, Wing RA, Adam H, Lorieux M, Jouannic S. Genome-wide association analysis identifies natural allelic variants associated with panicle architecture variation in African rice, Oryza glaberrima Steud. G3 (Bethesda) 2023; 13:jkad174. [PMID: 37535690 PMCID: PMC10542218 DOI: 10.1093/g3journal/jkad174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 06/12/2023] [Accepted: 07/18/2023] [Indexed: 08/05/2023]
Abstract
African rice (Oryza glaberrima Steud), a short-day cereal crop closely related to Asian rice (Oryza sativa L.), has been cultivated in Sub-Saharan Africa for ∼ 3,000 years. Although less cultivated globally, it is a valuable genetic resource in creating high-yielding cultivars that are better adapted to diverse biotic and abiotic stresses. While inflorescence architecture, a key trait for rice grain yield improvement, has been extensively studied in Asian rice, the morphological and genetic determinants of this complex trait are less understood in African rice. In this study, using a previously developed association panel of 162 O. glaberrima accessions and new SNP variants characterized through mapping to a new version of the O. glaberrima reference genome, we conducted a genome-wide association study of four major morphological panicle traits. We have found a total of 41 stable genomic regions that are significantly associated with these traits, of which 13 co-localized with previously identified QTLs in O. sativa populations and 28 were unique for this association panel. Additionally, we found a genomic region of interest on chromosome 3 that was associated with the number of spikelets and primary and secondary branches. Within this region was localized the O. sativa ortholog of the PHYTOCHROME B gene (Oglab_006903/OgPHYB). Haplotype analysis revealed the occurrence of natural sequence variants at the OgPHYB locus associated with panicle architecture variation through modulation of the flowering time phenotype, whereas no equivalent alleles were found in O. sativa. The identification in this study of genomic regions specific to O. glaberrima indicates panicle-related intra-specific genetic variation in this species, increasing our understanding of the underlying molecular processes governing panicle architecture. Identified candidate genes and major haplotypes may facilitate the breeding of new African rice cultivars with preferred panicle traits.
Affiliation(s)
- Philippe Cubry
  - DIADE, University of Montpellier, IRD, CIRAD, 34394 Montpellier, France
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Jianwei Zhang
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan 430070, China
- Rod A Wing
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
  - Center for Desert Agriculture, Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Hélène Adam
  - DIADE, University of Montpellier, IRD, CIRAD, 34394 Montpellier, France
- Mathias Lorieux
  - DIADE, University of Montpellier, IRD, CIRAD, 34394 Montpellier, France
- Stefan Jouannic
  - DIADE, University of Montpellier, IRD, CIRAD, 34394 Montpellier, France
4
Zhou Y, Yu Z, Chebotarov D, Chougule K, Lu Z, Rivera LF, Kathiresan N, Al-Bader N, Mohammed N, Alsantely A, Mussurova S, Santos J, Thimma M, Troukhan M, Fornasiero A, Green CD, Copetti D, Kudrna D, Llaca V, Lorieux M, Zuccolo A, Ware D, McNally K, Zhang J, Wing RA. Pan-genome inversion index reveals evolutionary insights into the subpopulation structure of Asian rice. Nat Commun 2023; 14:1567. [PMID: 36944612 PMCID: PMC10030860 DOI: 10.1038/s41467-023-37004-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 06/01/2022] [Accepted: 02/27/2023] [Indexed: 03/23/2023] Open
Abstract
Understanding and exploiting genetic diversity is a key factor for the productive and stable production of rice. Here, we utilize 73 high-quality genomes that encompass the subpopulation structure of Asian rice (Oryza sativa), plus the genomes of two wild relatives (O. rufipogon and O. punctata), to build a pan-genome inversion index of 1769 non-redundant inversions that span an average of ~29% of the O. sativa cv. Nipponbare reference genome sequence. Using this index, we estimate an inversion rate of ~700 inversions per million years in Asian rice, which is 16 to 50 times higher than previously estimated for plants. Detailed analyses of these inversions show evidence of their effects on gene expression, recombination rate, and linkage disequilibrium. Our study uncovers the prevalence and scale of large inversions (≥100 bp) across the pan-genome of Asian rice and hints at their largely unexplored role in functional biology and crop performance.
Affiliation(s)
- Yong Zhou
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
  - Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Zhichao Yu
  - National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China
- Dmytro Chebotarov
  - International Rice Research Institute (IRRI), Los Baños, 4031, Laguna, Philippines
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Zhenyuan Lu
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Luis F Rivera
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Nagarajan Kathiresan
  - Supercomputing Core Lab, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Noor Al-Bader
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Nahed Mohammed
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Aseel Alsantely
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Saule Mussurova
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- João Santos
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Manjula Thimma
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | | | - Alice Fornasiero
- Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Carl D Green
- Information Technology Department, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Dario Copetti
- Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
| | - David Kudrna
- Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
| | - Victor Llaca
- Research and Development, Corteva Agriscience, Johnston, IA, 50131, USA
| | - Mathias Lorieux
- DIADE, University of Montpellier, CIRAD, IRD, Montpellier, France
| | - Andrea Zuccolo
- Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
- Crop Science Research Center (CSRC), Scuola Superiore Sant'Anna, Pisa, 56127, Italy.
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
- USDA ARS NEA Plant, Soil & Nutrition Laboratory Research Unit, Ithaca, NY, 14853, USA.
| | - Kenneth McNally
- International Rice Research Institute (IRRI), Los Baños, 4031, Laguna, Philippines.
| | - Jianwei Zhang
- Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA.
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, 430070, China.
| | - Rod A Wing
- Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
- Arizona Genomics Institute (AGI), School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA.
- International Rice Research Institute (IRRI), Los Baños, 4031, Laguna, Philippines.
|
5
|
Gladman N, Goodwin S, Chougule K, Richard McCombie W, Ware D. Era of gapless plant genomes: innovations in sequencing and mapping technologies revolutionize genomics and breeding. Curr Opin Biotechnol 2023; 79:102886. [PMID: 36640454 PMCID: PMC9899316 DOI: 10.1016/j.copbio.2022.102886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 12/03/2022] [Accepted: 12/13/2022] [Indexed: 01/15/2023]
Abstract
Whole-genome sequencing and assembly have revolutionized plant genetics and molecular biology over the last two decades. However, significant shortcomings in first- and second-generation technology resulted in imperfect reference genomes: numerous and large gaps of low quality or undeterminable sequence in areas of highly repetitive DNA along with limited chromosomal phasing restricted the ability of researchers to characterize regulatory noncoding elements and genic regions that underwent recent duplication events. Recently, advances in long-read sequencing have resulted in the first gapless, telomere-to-telomere (T2T) assemblies of plant genomes. This leap forward has the potential to increase the speed and confidence of genomics and molecular experimentation while reducing costs for the research community.
Affiliation(s)
- Nicholas Gladman
- U.S. Department of Agriculture-Agricultural Research Service, NEA Robert W. Holley Center for Agriculture and Health, 538 Tower Rd, Ithaca, NY 14853, USA; Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Sara Goodwin
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | | | - Doreen Ware
- U.S. Department of Agriculture-Agricultural Research Service, NEA Robert W. Holley Center for Agriculture and Health, 538 Tower Rd, Ithaca, NY 14853, USA; Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA.
|
6
|
Voelker WG, Krishnan K, Chougule K, Alexander LC, Lu Z, Olson A, Ware D, Songsomboon K, Ponce C, Brenton ZW, Boatwright JL, Cooper EA. Ten new high-quality genome assemblies for diverse bioenergy sorghum genotypes. Front Plant Sci 2023; 13:1040909. [PMID: 36684744 PMCID: PMC9846640 DOI: 10.3389/fpls.2022.1040909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 12/09/2022] [Indexed: 06/17/2023]
Abstract
INTRODUCTION Sorghum (Sorghum bicolor (L.) Moench) is an agriculturally and economically important staple crop that has immense potential as a bioenergy feedstock due to its relatively high productivity on marginal lands. To capitalize on and further improve sorghum as a potential source of sustainable biofuel, it is essential to understand the genomic mechanisms underlying complex traits related to yield, composition, and environmental adaptations. METHODS Expanding on a recently developed mapping population, we generated de novo genome assemblies for 10 parental genotypes from this population and identified a comprehensive set of over 24 thousand large structural variants (SVs) and over 10.5 million single nucleotide polymorphisms (SNPs). RESULTS We show that SVs and nonsynonymous SNPs are enriched in different gene categories, emphasizing the need for long read sequencing in crop species to identify novel variation. Furthermore, we highlight SVs and SNPs occurring in genes and pathways with known associations to critical bioenergy-related phenotypes and characterize the landscape of genetic differences between sweet and cellulosic genotypes. DISCUSSION These resources can be integrated into both ongoing and future mapping and trait discovery for sorghum and its myriad uses including food, feed, bioenergy, and increasingly as a carbon dioxide removal mechanism.
Affiliation(s)
- William G. Voelker
- Dept. of Bioinformatics & Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- North Carolina Research Campus, Kannapolis, NC, United States
| | - Krittika Krishnan
- Dept. of Bioinformatics & Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- North Carolina Research Campus, Kannapolis, NC, United States
| | - Kapeel Chougule
- Cold Spring Harbor Research Laboratory, Cold Spring Harbor, NY, United States
| | - Louie C. Alexander
- Dept. of Bioinformatics & Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- North Carolina Research Campus, Kannapolis, NC, United States
| | - Zhenyuan Lu
- Cold Spring Harbor Research Laboratory, Cold Spring Harbor, NY, United States
| | - Andrew Olson
- Cold Spring Harbor Research Laboratory, Cold Spring Harbor, NY, United States
| | - Doreen Ware
- Cold Spring Harbor Research Laboratory, Cold Spring Harbor, NY, United States
- United States Department of Agriculture - Agricultural Research Service in the North Atlantic Area (USDA-ARS NAA), Robert W. Holley Center for Agriculture and Health, Ithaca, NY, United States
| | - Kittikun Songsomboon
- Dept. of Bioinformatics & Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- North Carolina Research Campus, Kannapolis, NC, United States
| | - Cristian Ponce
- Dept. of Bioinformatics & Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- North Carolina Research Campus, Kannapolis, NC, United States
| | - Zachary W. Brenton
- Carolina Seed Systems, Darlington, SC, United States
- Advanced Plant Technology, Clemson University, Clemson, SC, United States
| | - J. Lucas Boatwright
- Advanced Plant Technology, Clemson University, Clemson, SC, United States
- Dept. of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
| | - Elizabeth A. Cooper
- Dept. of Bioinformatics & Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- North Carolina Research Campus, Kannapolis, NC, United States
|
7
|
Gladman N, Hufnagel B, Regulski M, Liu Z, Wang X, Chougule K, Kochian L, Magalhães J, Ware D. Sorghum root epigenetic landscape during limiting phosphorus conditions. Plant Direct 2022; 6:e393. [PMID: 35600998 PMCID: PMC9107021 DOI: 10.1002/pld3.393] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/12/2021] [Revised: 01/07/2022] [Accepted: 02/26/2022] [Indexed: 06/15/2023]
Abstract
Efficient acquisition and use of available phosphorus from the soil is crucial for plant growth, development, and yield. With an ever-increasing acreage of croplands with suboptimal available soil phosphorus, genetic improvement of sorghum germplasm for enhanced phosphorus acquisition from soil is crucial to increasing agricultural output and reducing inputs, while confronted with a growing world population and uncertain climate. Sorghum bicolor is a globally important commodity for food, fodder, and forage. Known for robust tolerance to heat, drought, and other abiotic stresses, its capacity for optimal phosphorus use efficiency (PUE) is still being investigated for optimized root system architectures (RSA). Whilst a few RSA-influencing genes have been identified in sorghum and other grasses, the epigenetic impact on expression and tissue-specific activation of candidate PUE genes remains elusive. Here, we present transcriptomic, epigenetic, and regulatory network profiling of RSA modulation in the BTx623 sorghum background in response to limiting phosphorus (LP) conditions. We show that during LP, sorghum RSA is remodeled to increase root length and surface area, likely enhancing its ability to acquire P. Global DNA 5-methylcytosine and H3K4 and H3K27 trimethylation levels decrease in response to LP, while H3K4me3 peaks and DNA hypomethylated regions contain recognition motifs of numerous developmental and nutrient responsive transcription factors that display disparate expression patterns between different root tissues (primary root apex, elongation zone, and lateral root apex).
Affiliation(s)
| | - Barbara Hufnagel
- Centre National de la Recherche Scientifique, Montpellier, Languedoc‐Roussillon, France
| | | | - Zhigang Liu
- Global Institute for Food Security, University of Saskatchewan, Saskatoon, Canada
| | - Xiaofei Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | | | - Leon Kochian
- Global Institute for Food Security, University of Saskatchewan, Saskatoon, Canada
| | | | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
- U.S. Department of Agriculture‐Agricultural Research Service, NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, New York, USA
|
8
|
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, Lugo CSB, Elliott TA, Ware D, Peterson T, Jiang N, Hirsch CN, Hufford MB. Author Correction: Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 2022; 23:76. [PMID: 35260190 PMCID: PMC8903655 DOI: 10.1186/s13059-022-02645-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Affiliation(s)
- Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Weijia Su
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
| | - Yi Liao
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Jireh R A Agda
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
| | - Adam J Hellinga
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
| | | | - Tyler A Elliott
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- USDA-ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY, 14853, USA
| | - Thomas Peterson
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
| | - Ning Jiang
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA.
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN, 55108, USA.
| | - Matthew B Hufford
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA.
|
9
|
Gladman N, Olson A, Wei S, Chougule K, Lu Z, Tello-Ruiz M, Meijs I, Van Buren P, Jiao Y, Wang B, Kumar V, Kumari S, Zhang L, Burke J, Chen J, Burow G, Hayes C, Emendack Y, Xin Z, Ware D. SorghumBase: a web-based portal for sorghum genetic information and community advancement. Planta 2022; 255:35. [PMID: 35015132 PMCID: PMC8752523 DOI: 10.1007/s00425-022-03821-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/07/2021] [Accepted: 12/27/2021] [Indexed: 05/05/2023]
Abstract
SorghumBase provides a community portal that integrates genetic, genomic, and breeding resources for sorghum germplasm improvement. Public research and development in agriculture rely on proper data and resource sharing within stakeholder communities. For plant breeders, agronomists, molecular biologists, geneticists, and bioinformaticians, centralizing desirable data into a user-friendly hub for crop systems is essential for successful collaborations and breakthroughs in germplasm development. Here, we present the SorghumBase web portal (https://www.sorghumbase.org), a resource for the sorghum research community. SorghumBase hosts a wide range of sorghum genomic information in a modular framework, built with open-source software, to provide a sustainable platform. This initial release of SorghumBase includes: (1) five sorghum reference genome assemblies in a pan-genome browser; (2) genetic variant information for natural diversity panels and ethyl methanesulfonate (EMS)-induced mutant populations; (3) search interface and integrated views of various data types; (4) links supporting interconnectivity with other repositories including genebank, QTL, and gene expression databases; and (5) a content management system to support access to community news and training materials. SorghumBase offers sorghum investigators improved data collation and access that will facilitate the growth of a robust research community to support genomics-assisted breeding.
Affiliation(s)
- Nicholas Gladman
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Zhenyuan Lu
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | | | - Ivar Meijs
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Peter Van Buren
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Yinping Jiao
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance, Texas Tech University, Lubbock, TX, 79409, USA
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Lifang Zhang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - John Burke
- Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, U.S. Department of Agriculture-Agricultural Research Service, Lubbock, TX, 79415, USA
| | - Junping Chen
- Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, U.S. Department of Agriculture-Agricultural Research Service, Lubbock, TX, 79415, USA
| | - Gloria Burow
- Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, U.S. Department of Agriculture-Agricultural Research Service, Lubbock, TX, 79415, USA
| | - Chad Hayes
- Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, U.S. Department of Agriculture-Agricultural Research Service, Lubbock, TX, 79415, USA
| | - Yves Emendack
- Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, U.S. Department of Agriculture-Agricultural Research Service, Lubbock, TX, 79415, USA
| | - Zhanguo Xin
- Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, U.S. Department of Agriculture-Agricultural Research Service, Lubbock, TX, 79415, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
- U.S. Department of Agriculture-Agricultural Research Service, NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY, 14853, USA.
|
10
|
Yates AD, Allen J, Amode RM, Azov AG, Barba M, Becerra A, Bhai J, Campbell LI, Carbajo Martinez M, Chakiachvili M, Chougule K, Christensen M, Contreras-Moreira B, Cuzick A, Da Rin Fioretto L, Davis P, De Silva NH, Diamantakis S, Dyer S, Elser J, Filippi CV, Gall A, Grigoriadis D, Guijarro-Clarke C, Gupta P, Hammond-Kosack KE, Howe KL, Jaiswal P, Kaikala V, Kumar V, Kumari S, Langridge N, Le T, Luypaert M, Maslen GL, Maurel T, Moore B, Muffato M, Mushtaq A, Naamati G, Naithani S, Olson A, Parker A, Paulini M, Pedro H, Perry E, Preece J, Quinton-Tulloch M, Rodgers F, Rosello M, Ruffier M, Seager J, Sitnik V, Szpak M, Tate J, Tello-Ruiz MK, Trevanion SJ, Urban M, Ware D, Wei S, Williams G, Winterbottom A, Zarowiecki M, Finn RD, Flicek P. Ensembl Genomes 2022: an expanding genome resource for non-vertebrates. Nucleic Acids Res 2021; 50:D996-D1003. [PMID: 34791415 PMCID: PMC8728113 DOI: 10.1093/nar/gkab1007] [Citation(s) in RCA: 94] [Impact Index Per Article: 31.3] [Received: 09/21/2021] [Revised: 10/07/2021] [Accepted: 11/10/2021] [Indexed: 11/28/2022] Open
Abstract
Ensembl Genomes (https://www.ensemblgenomes.org) provides access to non-vertebrate genomes and analysis complementing vertebrate resources developed by the Ensembl project (https://www.ensembl.org). The two resources collectively present genome annotation through a consistent set of interfaces spanning the tree of life presenting genome sequence, annotation, variation, transcriptomic data and comparative analysis. Here, we present our largest increase in plant, metazoan and fungal genomes since the project's inception creating one of the world's most comprehensive genomic resources and describe our efforts to reduce genome redundancy in our Bacteria portal. We detail our new efforts in gene annotation, our emerging support for pangenome analysis, our efforts to accelerate data dissemination through the Ensembl Rapid Release resource and our new AlphaFold visualization. Finally, we present details of our future plans including updates on our integration with Ensembl, and how we plan to improve our support for the microbial research community. Software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license). Data updates are synchronised with Ensembl's release cycle.
Affiliation(s)
- Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - James Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ridwan M Amode
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andrey G Azov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthieu Barba
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jyothish Bhai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Lahcen I Campbell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Manuel Carbajo Martinez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Marc Chakiachvili
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Mikkel Christensen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alayne Cuzick
- Rothamsted Research, Department of Biointeractions and Crop Protection, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Luca Da Rin Fioretto
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nishadi H De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Carla V Filippi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Instituto de Biotecnología, Centro de Investigaciones en Ciencias Veterinarias y Agronómicas (CICVyA), Instituto Nacional de Tecnología Agropecuaria (INTA); Instituto de Agrobiotecnología y Biología Molecular (IABIMO), INTA-CONICET Nicolas Repetto y Los Reseros s/n (1686), Hurlingham, Buenos Aires, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas-CONICET, Ciudad Autónoma de Buenos Aires, Argentina
| | - Astrid Gall
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dionysios Grigoriadis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Cristina Guijarro-Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Kim E Hammond-Kosack
- Rothamsted Research, Department of Biointeractions and Crop Protection, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Vinay Kaikala
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Nick Langridge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tuan Le
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gareth L Maslen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Maurel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Aleena Mushtaq
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michael Paulini
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Helder Pedro
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Faye Rodgers
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
| | - Marc Rosello
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Magali Ruffier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - James Seager
- Rothamsted Research, Department of Biointeractions and Crop Protection, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Vasily Sitnik
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michal Szpak
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - John Tate
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | | | - Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Martin Urban
- Rothamsted Research, Department of Biointeractions and Crop Protection, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Doreen Ware
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Gary Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andrea Winterbottom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Magdalena Zarowiecki
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
11
|
Vaughn JN, Korani W, Stein JC, Edwards JD, Peterson DG, Simpson SA, Youngblood RC, Grimwood J, Chougule K, Ware DH, McClung AM, Scheffler BE. Gene disruption by structural mutations drives selection in US rice breeding over the last century. PLoS Genet 2021; 17:e1009389. [PMID: 33735256 PMCID: PMC7971508 DOI: 10.1371/journal.pgen.1009389] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/31/2020] [Accepted: 01/28/2021] [Indexed: 12/30/2022]
Abstract
The genetic basis of general plant vigor is of major interest to food producers, yet the trait is recalcitrant to genetic mapping because of the number of loci involved, their small effects, and linkage. Observations of heterosis in many crops suggest that recessive, malfunctioning versions of genes are a major cause of poor performance, yet we have little information on the mutational spectrum underlying these disruptions. To address this question, we generated a long-read assembly of a tropical japonica rice (Oryza sativa) variety, Carolina Gold, which allowed us to identify structural mutations (>50 bp) and orient them with respect to their ancestral state using the outgroup, Oryza glaberrima. Supporting prior work, we find substantial genome expansion in the sativa branch. While transposable elements (TEs) account for the largest share of size variation, the majority of events are not directly TE-mediated. Tandem duplications are the most common source of insertions and are highly enriched among 50-200 bp mutations. To explore the relative impact of various mutational classes on crop fitness, we then track these structural events over the last century of US rice improvement using 101 resequenced varieties. Within this material, a pattern of temporary hybridization between medium and long-grain varieties was followed by recent divergence. During this long-term selection, structural mutations that impact gene exons have been removed at a greater rate than intronic indels and single-nucleotide mutations. These results support the use of ab initio estimates of mutational burden, based on structural data, as an orthogonal predictor in genomic selection.
Some crop varieties have superior performance across years and environments. In hybrids, harmful mutations in one parent are masked by the ancestral alleles in the other parent, resulting in increased vigor. Unfortunately, these mutations are very difficult to identify precisely because, individually, they only have a small effect. In this study, we use long-read sequencing to characterize the entire mutational spectrum between two rice varieties. We then track these mutations through the last century of rice breeding. We show that large structural mutations in exons are selected against at a greater rate than any other mutational class. These findings illuminate the nature of deleterious alleles and will guide attempts to predict variety vigor based solely on genomic information.
Affiliation(s)
- Justin N. Vaughn
- USDA-ARS, Genomics and Bioinformatics Research Unit, Stoneville, Mississippi, United States of America
- University of Georgia, Athens, Institute of Plant Breeding, Genetics, and Genomics, Athens, Georgia, United States of America
| | - Walid Korani
- University of Georgia, Athens, Institute of Plant Breeding, Genetics, and Genomics, Athens, Georgia, United States of America
| | - Joshua C. Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Jeremy D. Edwards
- USDA-ARS, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas, United States of America
| | - Daniel G. Peterson
- Mississippi State University, Institute for Genomics, Biocomputing & Biotechnology, Starkville, Mississippi, United States of America
| | - Sheron A. Simpson
- USDA-ARS, Genomics and Bioinformatics Research Unit, Stoneville, Mississippi, United States of America
| | - Ramey C. Youngblood
- Mississippi State University, Institute for Genomics, Biocomputing & Biotechnology, Starkville, Mississippi, United States of America
| | - Jane Grimwood
- Hudson-Alpha Institute for Biotechnology, Huntsville, Alabama, United States of America
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Doreen H. Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- USDA-ARS, Robert W. Holley Center for Agriculture and Health, Ithaca, New York, United States of America
| | - Anna M. McClung
- USDA-ARS, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas, United States of America
| | - Brian E. Scheffler
- USDA-ARS, Genomics and Bioinformatics Research Unit, Stoneville, Mississippi, United States of America
|
12
|
Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece J, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D'Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, Ware D. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res 2021; 49:D1452-D1463. [PMID: 33170273 DOI: 10.1093/nar/gkaa979] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 01/27/2023]
Abstract
Gramene (http://www.gramene.org), a knowledgebase founded on comparative functional analyses of genomic and pathway data for model plants and major crops, supports agricultural researchers worldwide. The resource is committed to open access and reproducible science based on the FAIR data principles. Since the last NAR update, we made nine releases; doubled the genome portal's content; expanded curated genes, pathways and expression sets; and implemented the Domain Informational Vocabulary Extraction (DIVE) algorithm for extracting gene function information from publications. The current release, #63 (October 2020), hosts 93 reference genomes, with over 3.9 million genes in 122 947 families with orthologous and paralogous classifications. Plant Reactome portrays pathway networks using a combination of manual biocuration in rice (320 reference pathways) and orthology-based projections to 106 species. The Reactome platform facilitates comparison between reference and projected pathways, gene expression analyses and overlays of gene-gene interactions. Gramene integrates ontology-based protein structure-function annotation; information on genetic, epigenetic, expression, and phenotypic diversity; and gene functional annotations extracted from plant-focused journals using DIVE. We train plant researchers in biocuration of genes and pathways; host curated maize gene structures as tracks in the maize genome browser; and integrate curated rice genes and pathways in the Plant Reactome.
Affiliation(s)
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Priyanka Garg
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Justin Cook
- Informatics and Bio-computing Program, Ontario Institute for Cancer Research, Toronto M5G 1L7, Canada
| | - Daniel Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Geromics Inc., Cambridge CB1 3NF, UK
| | - Peter D'Eustachio
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Amit Gupta
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Weijia Xu
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Jennifer Regala
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
- Current affiliation: American Urological Association, Linthicum, MD 21090, USA
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Royal Botanic Gardens, Kew Richmond, Surrey TW9 3AE, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Crispin Taylor
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
|
13
|
Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece J, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D'Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, Ware D. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res 2021; 49:D1452-D1463. [PMID: 33170273 DOI: 10.1093/nar/gkaa979/5973447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 05/20/2023]
Abstract
Gramene (http://www.gramene.org), a knowledgebase founded on comparative functional analyses of genomic and pathway data for model plants and major crops, supports agricultural researchers worldwide. The resource is committed to open access and reproducible science based on the FAIR data principles. Since the last NAR update, we made nine releases; doubled the genome portal's content; expanded curated genes, pathways and expression sets; and implemented the Domain Informational Vocabulary Extraction (DIVE) algorithm for extracting gene function information from publications. The current release, #63 (October 2020), hosts 93 reference genomes, with over 3.9 million genes in 122 947 families with orthologous and paralogous classifications. Plant Reactome portrays pathway networks using a combination of manual biocuration in rice (320 reference pathways) and orthology-based projections to 106 species. The Reactome platform facilitates comparison between reference and projected pathways, gene expression analyses and overlays of gene-gene interactions. Gramene integrates ontology-based protein structure-function annotation; information on genetic, epigenetic, expression, and phenotypic diversity; and gene functional annotations extracted from plant-focused journals using DIVE. We train plant researchers in biocuration of genes and pathways; host curated maize gene structures as tracks in the maize genome browser; and integrate curated rice genes and pathways in the Plant Reactome.
Affiliation(s)
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Priyanka Garg
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Justin Cook
- Informatics and Bio-computing Program, Ontario Institute for Cancer Research, Toronto M5G 1L7, Canada
| | - Daniel Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Geromics Inc., Cambridge CB1 3NF, UK
| | - Peter D'Eustachio
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Amit Gupta
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Weijia Xu
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Jennifer Regala
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
- Current affiliation: American Urological Association, Linthicum, MD 21090, USA
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Royal Botanic Gardens, Kew Richmond, Surrey TW9 3AE, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Crispin Taylor
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
|
14
|
Liu J, Seetharam AS, Chougule K, Ou S, Swentowsky KW, Gent JI, Llaca V, Woodhouse MR, Manchanda N, Presting GG, Kudrna DA, Alabady M, Hirsch CN, Fengler KA, Ware D, Michael TP, Hufford MB, Dawe RK. Gapless assembly of maize chromosomes using long-read technologies. Genome Biol 2020; 21:121. [PMID: 32434565 PMCID: PMC7238635 DOI: 10.1186/s13059-020-02029-9] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 02/02/2020] [Accepted: 04/23/2020] [Indexed: 12/16/2022]
Abstract
Creating gapless telomere-to-telomere assemblies of complex genomes is one of the ultimate challenges in genomics. We use two independent assemblies and an optical map-based merging pipeline to produce a maize genome (B73-Ab10) composed of 63 contigs and a contig N50 of 162 Mb. This genome includes gapless assemblies of chromosome 3 (236 Mb) and chromosome 9 (162 Mb), and 53 Mb of the Ab10 meiotic drive haplotype. The data also reveal the internal structure of seven centromeres and five heterochromatic knobs, showing that the major tandem repeat arrays (CentC, knob180, and TR-1) are discontinuous and frequently interspersed with retroelements.
Affiliation(s)
- Jianing Liu
- Department of Genetics, University of Georgia, Athens, GA, 30602, USA
| | - Arun S Seetharam
- Genome Informatics Facility, Iowa State University, Ames, IA, 50011, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Kyle W Swentowsky
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
| | - Jonathan I Gent
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
| | - Victor Llaca
- Corteva Agriscience™, 8325 NW 62nd Ave, Johnston, IA, 50131, USA
| | | | - Nancy Manchanda
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Gernot G Presting
- Molecular Biosciences and Bioengineering, University of Hawaii, Honolulu, HI, 96822, USA
| | - David A Kudrna
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
| | - Magdy Alabady
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
- Georgia Genomics and Bioinformatics Core Laboratory, University of Georgia, Athens, GA, 30602, USA
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
| | - Kevin A Fengler
- Corteva Agriscience™, 8325 NW 62nd Ave, Johnston, IA, 50131, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY, 14853, USA
| | - Todd P Michael
- Informatics Department, J. Craig Venter Institute, La Jolla, CA, USA
| | - Matthew B Hufford
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - R Kelly Dawe
- Department of Genetics, University of Georgia, Athens, GA, 30602, USA.
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA.
|
15
|
Liu J, Seetharam AS, Chougule K, Ou S, Swentowsky KW, Gent JI, Llaca V, Woodhouse MR, Manchanda N, Presting GG, Kudrna DA, Alabady M, Hirsch CN, Fengler KA, Ware D, Michael TP, Hufford MB, Dawe RK. Gapless assembly of maize chromosomes using long-read technologies. Genome Biol 2020. [PMID: 32434565 DOI: 10.1101/2020.01.14.906230v1.full] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 04/26/2023]
Abstract
Creating gapless telomere-to-telomere assemblies of complex genomes is one of the ultimate challenges in genomics. We use two independent assemblies and an optical map-based merging pipeline to produce a maize genome (B73-Ab10) composed of 63 contigs and a contig N50 of 162 Mb. This genome includes gapless assemblies of chromosome 3 (236 Mb) and chromosome 9 (162 Mb), and 53 Mb of the Ab10 meiotic drive haplotype. The data also reveal the internal structure of seven centromeres and five heterochromatic knobs, showing that the major tandem repeat arrays (CentC, knob180, and TR-1) are discontinuous and frequently interspersed with retroelements.
Affiliation(s)
- Jianing Liu
- Department of Genetics, University of Georgia, Athens, GA, 30602, USA
| | - Arun S Seetharam
- Genome Informatics Facility, Iowa State University, Ames, IA, 50011, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Kyle W Swentowsky
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
| | - Jonathan I Gent
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
| | - Victor Llaca
- Corteva Agriscience™, 8325 NW 62nd Ave, Johnston, IA, 50131, USA
| | | | - Nancy Manchanda
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Gernot G Presting
- Molecular Biosciences and Bioengineering, University of Hawaii, Honolulu, HI, 96822, USA
| | - David A Kudrna
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
| | - Magdy Alabady
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
- Georgia Genomics and Bioinformatics Core Laboratory, University of Georgia, Athens, GA, 30602, USA
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
| | - Kevin A Fengler
- Corteva Agriscience™, 8325 NW 62nd Ave, Johnston, IA, 50131, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY, 14853, USA
| | - Todd P Michael
- Informatics Department, J. Craig Venter Institute, La Jolla, CA, USA
| | - Matthew B Hufford
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - R Kelly Dawe
- Department of Genetics, University of Georgia, Athens, GA, 30602, USA.
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA.
|
16
|
Wang B, Tseng E, Baybayan P, Eng K, Regulski M, Jiao Y, Wang L, Olson A, Chougule K, Buren PV, Ware D. Variant phasing and haplotypic expression from long-read sequencing in maize. Commun Biol 2020; 3:78. [PMID: 32071408 PMCID: PMC7028979 DOI: 10.1038/s42003-020-0805-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/03/2020] [Accepted: 01/30/2020] [Indexed: 11/09/2022]
Abstract
Haplotype phasing of maize genetic variants is important for genome interpretation, population genetic analysis and functional analysis of allelic activity. We performed an isoform-level phasing study using two maize inbred lines and their reciprocal crosses, based on single-molecule, full-length cDNA sequencing. To phase and analyze transcripts between hybrids and parents, we developed IsoPhase. Using this tool, we validated the majority of SNPs called against matching short-read data from embryo, endosperm and root tissues, and identified allele-specific, gene-level and isoform-level differential expression between the inbred parental lines and hybrid offspring. After phasing 6907 genes in the reciprocal hybrids, we annotated the SNPs and identified large-effect genes. In addition, we identified parent-of-origin isoforms, distinct novel isoforms in maize parent and hybrid lines, and imprinted genes from different tissues. Finally, we characterized variation in cis- and trans-regulatory effects. Our study provides measures of haplotypic expression that could increase accuracy in studies of allelic expression.
Affiliation(s)
- Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Elizabeth Tseng
- Pacific Biosciences, 1380 Willow Road, Menlo Park, CA, 94025, USA
| | - Primo Baybayan
- Pacific Biosciences, 1380 Willow Road, Menlo Park, CA, 94025, USA
| | - Kevin Eng
- Pacific Biosciences, 1380 Willow Road, Menlo Park, CA, 94025, USA
| | - Michael Regulski
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Liya Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Peter Van Buren
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- USDA ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY, 14853, USA
|
17
|
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, Lugo CSB, Elliott TA, Ware D, Peterson T, Jiang N, Hirsch CN, Hufford MB. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 2019. [PMID: 31843001 DOI: 10.1101/657890v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/11/2023]
Abstract
BACKGROUND Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. RESULTS We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. CONCLUSIONS The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
Affiliation(s)
- Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Weijia Su
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
| | - Yi Liao
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Jireh R A Agda
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
| | - Adam J Hellinga
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
| | | | - Tyler A Elliott
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- USDA-ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY, 14853, USA
| | - Thomas Peterson
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
| | - Ning Jiang
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA.
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN, 55108, USA.
| | - Matthew B Hufford
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA.
|
18
|
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, Lugo CSB, Elliott TA, Ware D, Peterson T, Jiang N, Hirsch CN, Hufford MB. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 2019; 20:275. [PMID: 31843001 PMCID: PMC6913007 DOI: 10.1186/s13059-019-1905-y] [Citation(s) in RCA: 404] [Impact Index Per Article: 80.8] [Received: 05/24/2019] [Accepted: 11/28/2019] [Indexed: 11/10/2022]
Abstract
BACKGROUND Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. RESULTS We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. CONCLUSIONS The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
Affiliation(s)
- Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011 USA
- Weijia Su
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011 USA
- Yi Liao
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697 USA
- Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA
- Jireh R. A. Agda
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario N1G 2W1 Canada
- Adam J. Hellinga
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario N1G 2W1 Canada
- Tyler A. Elliott
- Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario N1G 2W1 Canada
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA
- USDA-ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY 14853 USA
- Thomas Peterson
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011 USA
- Ning Jiang
- Department of Horticulture, Michigan State University, East Lansing, MI 48824 USA
- Candice N. Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN 55108 USA
- Matthew B. Hufford
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011 USA
19
Tello-Ruiz MK, Marco CF, Hsu FM, Khangura RS, Qiao P, Sapkota S, Stitzer MC, Wasikowski R, Wu H, Zhan J, Chougule K, Barone LC, Ghiban C, Muna D, Olson AC, Wang L, Ware D, Micklos DA. Double triage to identify poorly annotated genes in maize: The missing link in community curation. PLoS One 2019; 14:e0224086. [PMID: 31658277 PMCID: PMC6816542 DOI: 10.1371/journal.pone.0224086] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/27/2019] [Accepted: 10/05/2019] [Indexed: 02/02/2023] Open
Abstract
The sophistication of gene prediction algorithms and the abundance of RNA-based evidence for the maize genome may suggest that manual curation of gene models is no longer necessary. However, quality metrics generated by the MAKER-P gene annotation pipeline identified 17,225 of 130,330 (13%) protein-coding transcripts in the B73 Reference Genome V4 gene set with models of low concordance to available biological evidence. Working with eight graduate students, we used the Apollo annotation editor to curate 86 transcript models flagged by quality metrics and a complementary method using the Gramene gene tree visualizer. All of the triaged models had significant errors, including missing or extra exons, non-canonical splice sites, and incorrect UTRs. A correct transcript model existed for about 60% of genes (or transcripts) flagged by quality metrics; we attribute this to the convention of elevating the transcript with the longest coding sequence (CDS) to the canonical, or first, position. The remaining 40% of flagged genes resulted in novel annotations and represent a manual curation space of about 10% of the maize genome (~4,000 protein-coding genes). MAKER-P metrics have a specificity of 100%, and a sensitivity of 85%; the gene tree visualizer has a specificity of 100%. Together with the Apollo graphical editor, our double triage provides an infrastructure to support the community curation of eukaryotic genomes by scientists, students, and potentially even citizen scientists.
Affiliation(s)
- Marcela K. Tello-Ruiz
- Plant Biology Program, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Department of Biological Sciences, State University of New York at Old Westbury, Old Westbury, New York, United States of America
- Cristina F. Marco
- DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Fei-Man Hsu
- Graduate School of Frontier Sciences, University of Tokyo, Chiba, Japan
- Rajdeep S. Khangura
- Department of Biochemistry, Purdue University, West Lafayette, Indiana, United States of America
- Pengfei Qiao
- Plant Biology Section, School of Integrative Plant Sciences, Cornell University, Ithaca, New York, United States of America
- Sirjan Sapkota
- Department of Plant and Environmental Sciences, Clemson University, Clemson, South Carolina, United States of America
- Michelle C. Stitzer
- Department of Plant Sciences and Center for Population Biology, University of California Davis, Davis, California, United States of America
- Rachael Wasikowski
- Department of Biological Sciences, University of Toledo, Toledo, Ohio, United States of America
- Hao Wu
- Genetics, Development & Cell Biology Department, Iowa State University, Ames, Iowa, United States of America
- Junpeng Zhan
- School of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- Donald Danforth Plant Science Center, St. Louis, Missouri, United States of America
- Kapeel Chougule
- Plant Biology Program, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Lindsay C. Barone
- DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Cornel Ghiban
- DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Demitri Muna
- Plant Biology Program, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Andrew C. Olson
- Plant Biology Program, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Liya Wang
- Plant Biology Program, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Doreen Ware
- Plant Biology Program, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- USDA, Agricultural Research Service, Washington, D.C., United States of America
- David A. Micklos
- DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
20
Tello-Ruiz MK, Naithani S, Stein JC, Gupta P, Campbell M, Olson A, Wei S, Preece J, Geniza MJ, Jiao Y, Lee YK, Wang B, Mulvaney J, Chougule K, Elser J, Al-Bader N, Kumari S, Thomason J, Kumar V, Bolser DM, Naamati G, Tapanari E, Fonseca N, Huerta L, Iqbal H, Keays M, Munoz-Pomer Fuentes A, Tang A, Fabregat A, D'Eustachio P, Weiser J, Stein LD, Petryszak R, Papatheodorou I, Kersey PJ, Lockhart P, Taylor C, Jaiswal P, Ware D. Gramene 2018: unifying comparative genomics and pathway resources for plant research. Nucleic Acids Res 2018; 46:D1181-D1189. [PMID: 29165610 PMCID: PMC5753211 DOI: 10.1093/nar/gkx1111] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Received: 09/15/2017] [Accepted: 10/25/2017] [Indexed: 12/24/2022] Open
Abstract
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.
Affiliation(s)
- Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Joshua C Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Michael Campbell
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Matthew J Geniza
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Young Koung Lee
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; Division of Biological Sciences and Institute for Basic Science, Wonkwang University, Iksan 54538, Korea
- Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Joseph Mulvaney
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Noor Al-Bader
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- James Thomason
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Daniel M Bolser
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Guy Naamati
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Electra Tapanari
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Nuno Fonseca
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Laura Huerta
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Haider Iqbal
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Maria Keays
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Amy Tang
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Antonio Fabregat
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Peter D'Eustachio
- Department of Biochemistry & Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
- Joel Weiser
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
- Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
- Robert Petryszak
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Irene Papatheodorou
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Paul J Kersey
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Patti Lockhart
- American Society of Plant Biologists, 15501 Monona Drive, Rockville, MD 20855-2768, USA
- Crispin Taylor
- American Society of Plant Biologists, 15501 Monona Drive, Rockville, MD 20855-2768, USA
- Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
21
Stein JC, Yu Y, Copetti D, Zwickl DJ, Zhang L, Zhang C, Chougule K, Gao D, Iwata A, Goicoechea JL, Wei S, Wang J, Liao Y, Wang M, Jacquemin J, Becker C, Kudrna D, Zhang J, Londono CEM, Song X, Lee S, Sanchez P, Zuccolo A, Ammiraju JSS, Talag J, Danowitz A, Rivera LF, Gschwend AR, Noutsos C, Wu CC, Kao SM, Zeng JW, Wei FJ, Zhao Q, Feng Q, El Baidouri M, Carpentier MC, Lasserre E, Cooke R, da Rosa Farias D, da Maia LC, Dos Santos RS, Nyberg KG, McNally KL, Mauleon R, Alexandrov N, Schmutz J, Flowers D, Fan C, Weigel D, Jena KK, Wicker T, Chen M, Han B, Henry R, Hsing YIC, Kurata N, de Oliveira AC, Panaud O, Jackson SA, Machado CA, Sanderson MJ, Long M, Ware D, Wing RA. Publisher Correction: Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet 2018; 50:1618. [PMID: 30291357 DOI: 10.1038/s41588-018-0261-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
This article was not made open access when initially published online, which was corrected before print publication. In addition, ORCID links were missing for 12 authors and have been added to the HTML and PDF versions of the article.
Affiliation(s)
- Joshua C Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Yeisoo Yu
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; Phyzen Genomics Institute, Phyzen, Inc., Seoul, South Korea
- Dario Copetti
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; International Rice Research Institute, Los Baños, Philippines
- Derrick J Zwickl
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Li Zhang
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Chengjun Zhang
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Dongying Gao
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA, USA
- Aiko Iwata
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA, USA
- Jose Luis Goicoechea
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jun Wang
- Department of Biological Sciences, Wayne State University, Detroit, MI, USA
- Yi Liao
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Muhua Wang
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany
- Julie Jacquemin
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; Crop Biodiversity and Breeding Informatics Group, Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
- Claude Becker
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Dave Kudrna
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Jianwei Zhang
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Carlos E M Londono
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Xiang Song
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Seunghee Lee
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Paul Sanchez
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; Rice Experiment Station, Biggs, CA, USA
- Andrea Zuccolo
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; Institute of Life Sciences, Scuola Superiore Sant'Anna, Pisa, Italy
- Jetty S S Ammiraju
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; DuPont-Pioneer, Johnston, IA, USA
- Jayson Talag
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Ann Danowitz
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Luis F Rivera
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; BIOS-Parque Los Yarumos, Manizales, Colombia
- Andrea R Gschwend
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Cheng-Chieh Wu
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan; Institute of Botany, National Taiwan University, Taipei, Taiwan
- Shu-Min Kao
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan; Department of Plant Systems Biology, VIB and Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Jhih-Wun Zeng
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Fu-Jin Wei
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan; Department of Forest Molecular Genetics and Biotechnology, Forestry and Forest Products Research Institute, Tsukuba, Japan
- Qiang Zhao
- National Center for Gene Research, Chinese Academy of Sciences, Shanghai, China
- Qi Feng
- National Center for Gene Research, Chinese Academy of Sciences, Shanghai, China
- Moaine El Baidouri
- Laboratoire Génome et Développement des Plantes, UMR 5096 UPVD/CNRS, Université de Perpignan Via Domitia, Perpignan, France
- Marie-Christine Carpentier
- Laboratoire Génome et Développement des Plantes, UMR 5096 UPVD/CNRS, Université de Perpignan Via Domitia, Perpignan, France
- Eric Lasserre
- Laboratoire Génome et Développement des Plantes, UMR 5096 UPVD/CNRS, Université de Perpignan Via Domitia, Perpignan, France
- Richard Cooke
- Laboratoire Génome et Développement des Plantes, UMR 5096 UPVD/CNRS, Université de Perpignan Via Domitia, Perpignan, France
- Daniel da Rosa Farias
- Plant Genomics and Breeding Center, Universidade Federal de Pelotas, Pelotas, Brazil
- Railson S Dos Santos
- Plant Genomics and Breeding Center, Universidade Federal de Pelotas, Pelotas, Brazil
- Kevin G Nyberg
- Department of Biology, University of Maryland, College Park, MD, USA
- Ramil Mauleon
- International Rice Research Institute, Los Baños, Philippines
- Jeremy Schmutz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Dave Flowers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Chuanzhu Fan
- Department of Biological Sciences, Wayne State University, Detroit, MI, USA
- Detlef Weigel
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Kshirod K Jena
- International Rice Research Institute, Los Baños, Philippines
- Thomas Wicker
- Institute of Plant Biology, University of Zurich, Zurich, Switzerland
- Mingsheng Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Bin Han
- National Center for Gene Research, Chinese Academy of Sciences, Shanghai, China
- Robert Henry
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, Queensland, Australia
- Yue-Ie C Hsing
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Nori Kurata
- National Institute of Genetics, Mishima, Japan
- Olivier Panaud
- Laboratoire Génome et Développement des Plantes, UMR 5096 UPVD/CNRS, Université de Perpignan Via Domitia, Perpignan, France
- Scott A Jackson
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA, USA
- Carlos A Machado
- Department of Biology, University of Maryland, College Park, MD, USA
- Michael J Sanderson
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Manyuan Long
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Robert W. Holley Center for Agriculture and Health, US Department of Agriculture, Agricultural Research Service, Ithaca, NY, USA
- Rod A Wing
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA; International Rice Research Institute, Los Baños, Philippines; Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
22
Park SH, Kyndt J, Chougule K, Park JJ, Brown JK. Low-phosphate-selected Auxenochlorella protothecoides redirects phosphate to essential pathways while producing more biomass. PLoS One 2018; 13:e0198953. [PMID: 29920531 PMCID: PMC6007911 DOI: 10.1371/journal.pone.0198953] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/02/2017] [Accepted: 05/29/2018] [Indexed: 11/19/2022] Open
Abstract
Despite the capacity to accumulate ~70% w/w of lipids, commercially produced unicellular green alga A. protothecoides may become compromised due to the high cost of phosphate fertilizers. To address this limitation, A. protothecoides was selected for adaptation to conditions of 100× and 5× lower phosphate and peptone, respectively, compared to 'wild-type media'. The A. protothecoides showed initial signs of adaptation by 45-50 days, and steady-state growth at ~100 days. The low phosphate (P)-adapted strain produced up to ~30% greater biomass, while total lipids (~10% w/w) remained about the same, compared to the wild-type strain. Metabolomic analyses indicated that the low P-adapted strain produced 3.3-fold more saturated palmitic acid (16:0) and 2.2-fold less linolenic acid (18:3) than the wild-type strain, resulting in an ~11% increase in caloric value, from 19.5 kJ/g for the wild-type strain to 21.6 kJ/g for the low P-adapted strain, due to the amounts and composition of certain saturated fatty acids. Biochemical changes in A. protothecoides adapted to lower phosphate conditions were assessed by comparative RNA-Seq analysis, which yielded 27,279 transcripts. Among them, 2,667 and 15 genes were significantly down- and up-regulated, at >999-fold and >3-fold (adjusted p-value <0.1), respectively. The expression of genes encoding proteins involved in cellular processes such as division, growth, and membrane biosynthesis showed a trend toward down-regulation. At the genomic level, synonymous SNPs and Indels were observed primarily in coding regions, with the 40S ribosomal subunit gene harboring substantial SNPs. Overall, the adapted strain out-performed the wild-type strain by prioritizing the use of its limited phosphate supply for essential biological processes. The low P-adapted A. protothecoides is expected to be more economical to grow than the wild-type strain, based on overall greater productivity and caloric content, while importantly, also requiring 100-fold less phosphate.
Affiliation(s)
- Sang-Hyuck Park
- Department of Biology, Colorado State University, Pueblo, Colorado, United States of America
- John Kyndt
- College of Science and Technology, Bellevue University, Bellevue, Nebraska, United States of America
- Kapeel Chougule
- Arizona Genomics Institute, The University of Arizona, Tucson, Arizona, United States of America
- Jeong-Jin Park
- Biomolecular Analysis Facility, University of Virginia, Charlottesville, Virginia, United States of America
- Judith K. Brown
- School of Plant Sciences, The University of Arizona, Tucson, Arizona, United States of America
23
Gupta P, Naithani S, Tello-Ruiz MK, Chougule K, D’Eustachio P, Fabregat A, Jiao Y, Keays M, Lee YK, Kumari S, Mulvaney J, Olson A, Preece J, Stein J, Wei S, Weiser J, Huerta L, Petryszak R, Kersey P, Stein LD, Ware D, Jaiswal P. Gramene Database: Navigating Plant Comparative Genomics Resources. Curr Plant Biol 2016; 7-8:10-15. [PMID: 28713666 PMCID: PMC5509230 DOI: 10.1016/j.cpb.2016.12.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 05/13/2023]
Abstract
Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationships to enrich the annotation of genomic data and provides tools to perform powerful comparative analyses across a wide spectrum of plant species. It consists of an integrated portal for querying, visualizing and analyzing data for 44 plant reference genomes, genetic variation data sets for 12 species, expression data for 16 species, curated rice pathways and orthology-based pathway projections for 66 plant species including various crops. Here we briefly describe the functions and uses of the Gramene database.
Affiliation(s)
- Parul Gupta
- Department of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA
- Sushma Naithani
- Department of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA
- Antonio Fabregat
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Maria Keays
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Justin Preece
- Department of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA
- Joshua Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Joel Weiser
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Laura Huerta
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Robert Petryszak
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Paul Kersey
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- USDA ARS NEA Plant, Soil & Nutrition Laboratory Research Unit, Ithaca, NY, USA
- Pankaj Jaiswal
- Department of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA
- To whom correspondence should be addressed. Address of the corresponding author: Department of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA. Phone: +1-541-737-8471. Fax: +1-541-737-3573.
24
Tello-Ruiz MK, Stein J, Wei S, Preece J, Olson A, Naithani S, Amarasinghe V, Dharmawardhana P, Jiao Y, Mulvaney J, Kumari S, Chougule K, Elser J, Wang B, Thomason J, Bolser DM, Kerhornou A, Walts B, Fonseca NA, Huerta L, Keays M, Tang YA, Parkinson H, Fabregat A, McKay S, Weiser J, D'Eustachio P, Stein L, Petryszak R, Kersey PJ, Jaiswal P, Ware D. Gramene 2016: comparative plant genomics and pathway resources. Nucleic Acids Res 2015; 44:D1133-40. [PMID: 26553803 PMCID: PMC4702844 DOI: 10.1093/nar/gkv1179] [Citation(s) in RCA: 108] [Impact Index Per Article: 12.0] [Received: 09/25/2015] [Accepted: 10/13/2015] [Indexed: 12/21/2022] Open
Abstract
Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼ 200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials.
Affiliation(s)
- Joshua Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Vindhya Amarasinghe
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Palitha Dharmawardhana
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Joseph Mulvaney
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- James Thomason
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Daniel M Bolser
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Arnaud Kerhornou
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Brandon Walts
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Nuno A Fonseca
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Laura Huerta
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Maria Keays
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Y Amy Tang
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Helen Parkinson
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Antonio Fabregat
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Sheldon McKay
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
- Joel Weiser
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
- Peter D'Eustachio
- Department of Biochemistry & Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
- Lincoln Stein
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
| | - Robert Petryszak
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Paul J Kersey
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
|