1
|
Tang T, Liu Y, Zheng B, Li R, Zhang X, Liu Y. Integration of hybrid and self-correction method improves the quality of long-read sequencing data. Brief Funct Genomics 2024; 23:249-255. [PMID: 37340778 DOI: 10.1093/bfgp/elad026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 06/04/2023] [Accepted: 06/05/2023] [Indexed: 06/22/2023] Open
Abstract
Third-generation sequencing (TGS) technologies have revolutionized genome science in the past decade. However, the long-read data produced by TGS platforms suffer from a much higher error rate than data from previous technologies, which complicates downstream analysis. Several error correction tools for long-read data have been developed; these tools can be categorized into hybrid and self-correction tools. So far, these two types of tools have been investigated separately, and their interplay remains understudied. Here, we integrate hybrid and self-correction methods for high-quality error correction. Our procedure leverages the inter-similarity between long-read data and high-accuracy information from short reads. We compare the performance of our method and state-of-the-art error correction tools on Escherichia coli and Arabidopsis thaliana datasets. The results show that the integration approach outperformed the existing error correction methods and holds promise for improving the quality of downstream analyses in genomic research.
Affiliation(s)
- Tao Tang
- School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Qixia District, 210023, Jiangsu, China
| | - Yiping Liu
- College of Computer Science and Electronic Engineering, Hunan University, 2 Lushan S Rd, Yuelu District, 410086, Changsha, China
| | - Binshuang Zheng
- School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Qixia District, 210023, Jiangsu, China
| | - Rong Li
- School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Qixia District, 210023, Jiangsu, China
| | - Xiaocai Zhang
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), 138632, Singapore, Singapore
| | - Yuansheng Liu
- College of Computer Science and Electronic Engineering, Hunan University, 2 Lushan S Rd, Yuelu District, 410086, Changsha, China
| |
|
2
|
Paré L, Bideau L, Baduel L, Dalle C, Benchouaia M, Schneider SQ, Laplane L, Clément Y, Vervoort M, Gazave E. Transcriptomic landscape of posterior regeneration in the annelid Platynereis dumerilii. BMC Genomics 2023; 24:583. [PMID: 37784028 PMCID: PMC10546743 DOI: 10.1186/s12864-023-09602-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 08/18/2023] [Indexed: 10/04/2023] Open
Abstract
BACKGROUND Restorative regeneration, the capacity to reform a lost body part following amputation or injury, is an important and still poorly understood process in animals. Annelids, or segmented worms, show amazing regenerative capabilities, and as such are a crucial group to investigate. Elucidating the molecular mechanisms that underpin regeneration in this major group remains a key goal. Among annelids, the nereidid Platynereis dumerilii (re)emerged recently as a front-line regeneration model. Following amputation of its posterior part, Platynereis worms can regenerate both differentiated tissues of their terminal part as well as a growth zone that contains putative stem cells. While this regeneration process follows specific and reproducible stages that have been well characterized, the transcriptomic landscape of these stages remains to be uncovered. RESULTS We generated a high-quality de novo Reference transcriptome for the annelid Platynereis dumerilii. We produced and analyzed three RNA-sequencing datasets, encompassing five stages of posterior regeneration, along with blastema stages and non-amputated tissues as controls. We included two of these regeneration RNA-seq datasets, as well as embryonic and tissue-specific datasets from the literature, to produce a Reference transcriptome. We used this Reference transcriptome to perform in-depth analyses of RNA-seq data during the course of regeneration to reveal the important dynamics of the gene expression process, with thousands of genes differentially expressed between stages, as well as unique and specific gene expression at each regeneration stage. The study of these genes highlighted the importance of the nervous system at both early and late stages of regeneration, as well as the enrichment of RNA-binding proteins (RBPs) during almost the entire regeneration process.
CONCLUSIONS In this study, we provided a high-quality de novo Reference transcriptome for the annelid Platynereis that is useful for investigating various developmental processes, including regeneration. Our extensive stage-specific transcriptional analysis during the course of posterior regeneration sheds light upon major molecular mechanisms and pathways, and will foster many specific studies in the future.
Affiliation(s)
- Louis Paré
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| | - Loïc Bideau
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| | - Loeiza Baduel
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| | - Caroline Dalle
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| | - Médine Benchouaia
- Département de biologie, GenomiqueENS, Institut de Biologie de l'ENS (IBENS), École normale supérieure, CNRS, INSERM, Université PSL, Paris, 75005, France
| | - Stephan Q Schneider
- Institute of Cellular and Organismic Biology, Academia Sinica, Taipei, 11529, Taiwan
| | - Lucie Laplane
- Université Paris I Panthéon-Sorbonne, CNRS UMR 8590 Institut d'Histoire et de Philosophie des Sciences et des Techniques (IHPST), Paris, France
- Gustave Roussy, UMR 1287, Villejuif, France
| | - Yves Clément
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| | - Michel Vervoort
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| | - Eve Gazave
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
| |
|
3
|
Jung J, Jhang SY, Kim B, Koh B, Ban C, Seo H, Park T, Chi WJ, Kim S, Kim H, Yu J. The first high-quality genome assembly and annotation of Patiria pectinifera. Sci Data 2023; 10:642. [PMID: 37730712 PMCID: PMC10511450 DOI: 10.1038/s41597-023-02508-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 08/29/2023] [Indexed: 09/22/2023] Open
Abstract
The blue bat star, a highly adaptive species in the East Sea of Korea, has displayed remarkable success in adapting to recent climate change. The genetic mechanisms behind this success were not well understood, prompting our report on the first chromosome-level assembly of the Patiria genus. We assembled the genome using Nanopore and Illumina sequences, yielding a total length of 615 Mb and a scaffold N50 of 24,204,423 bp. Hi-C analysis allowed us to anchor the scaffold sequences onto 22 pseudochromosomes. K-mer based analysis revealed a genome heterozygosity rate of 5.16%, higher than that of any previously reported echinoderm species. Our transposable element analysis exposed a substantial number of genome-wide retrotransposons and DNA transposons. These results offer valuable resources for understanding the evolutionary mechanisms behind P. pectinifera's successful adaptation in fluctuating environments.
Affiliation(s)
- Jaehoon Jung
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
| | - So Yun Jhang
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
| | - Bongsang Kim
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
| | - Bomin Koh
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
| | - Chaeyoung Ban
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
| | - Hyojung Seo
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
| | - Taeseo Park
- Animal Resources Division, National Institute of Biological Resources, Incheon, 22689, Republic of Korea
| | - Won-Jae Chi
- Microorganism Resources Division, National Institute of Biological Resources, Incheon, 22689, Republic of Korea
| | - Soonok Kim
- Microorganism Resources Division, National Institute of Biological Resources, Incheon, 22689, Republic of Korea
| | - Heebal Kim
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
| | - Jaewoong Yu
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
| |
|
4
|
Shumate A, Wong B, Pertea G, Pertea M. Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLoS Comput Biol 2022; 18:e1009730. [PMID: 35648784 PMCID: PMC9191730 DOI: 10.1371/journal.pcbi.1009730] [Citation(s) in RCA: 175] [Impact Index Per Article: 58.3] [Received: 12/08/2021] [Revised: 06/13/2022] [Accepted: 05/11/2022] [Indexed: 01/01/2023] Open
Abstract
Short-read RNA sequencing and long-read RNA sequencing each have their strengths and weaknesses for transcriptome assembly. While short reads are highly accurate, they are rarely able to span multiple exons. Long-read technology can capture full-length transcripts, but its relatively high error rate often leads to mis-identified splice sites. Here we present a new release of StringTie that performs hybrid-read assembly. By taking advantage of the strengths of both long and short reads, hybrid-read assembly with StringTie is more accurate than long-read only or short-read only assembly, and on some datasets it can more than double the number of correctly assembled transcripts, while obtaining substantially higher precision than the long-read data assembly alone. Here we demonstrate the improved accuracy on simulated data and real data from Arabidopsis thaliana, Mus musculus, and human. We also show that hybrid-read assembly is more accurate than correcting long reads prior to assembly while also being substantially faster. StringTie is freely available as open source software at https://github.com/gpertea/stringtie.
Affiliation(s)
- Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Brandon Wong
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Applied Math and Statistics, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Geo Pertea
- The Lieber Institute for Brain Development, Baltimore, Maryland, United States of America
| | - Mihaela Pertea
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
| |
|
5
|
Mohammadi MM, Bavi O. DNA sequencing: an overview of solid-state and biological nanopore-based methods. Biophys Rev 2021; 14:99-110. [PMID: 34840616 PMCID: PMC8609259 DOI: 10.1007/s12551-021-00857-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/31/2021] [Accepted: 10/14/2021] [Indexed: 12/23/2022] Open
Abstract
The field of sequencing has been a topic of significant interest since its emergence and has become increasingly important over time. Impressive achievements have been obtained in this field, especially in relation to DNA and RNA sequencing. Since the first achievements by Sanger and colleagues in the 1950s, many sequencing techniques have been developed, while others have disappeared. DNA sequencing has undergone three generations of major evolution. Each generation has its own specifications, which are mentioned briefly. Among these generations, nanopore sequencing has its own exciting characteristics, which are given more attention here. Among the pioneering technologies used by third-generation techniques, nanopores, either biological or solid-state, have been extensively studied both experimentally and theoretically. All sequencing technologies have their own advantages and disadvantages, and nanopores are no exception to this general rule. The research done to overcome these obstacles is also outlined in general terms. In this review, biological and solid-state nanopores are elaborated on, and their applications are also discussed briefly.
Affiliation(s)
- Mohammad M Mohammadi
- Department of Mechanical and Aerospace Engineering, Shiraz University of Technology, Shiraz, 71557-13876 Iran
| | - Omid Bavi
- Department of Mechanical and Aerospace Engineering, Shiraz University of Technology, Shiraz, 71557-13876 Iran
| |
|
6
|
Lorenzi C, Barriere S, Arnold K, Luco RF, Oldfield AJ, Ritchie W. IRFinder-S: a comprehensive suite to discover and explore intron retention. Genome Biol 2021; 22:307. [PMID: 34749764 PMCID: PMC8573998 DOI: 10.1186/s13059-021-02515-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/03/2021] [Accepted: 10/12/2021] [Indexed: 12/15/2022] Open
Abstract
Accurate quantification and detection of intron retention levels require specialized software. Building on our previous software, we create a suite of tools called IRFinder-S, to analyze and explore intron retention events in multiple samples. Specifically, IRFinder-S allows a better identification of true intron retention events using a convolutional neural network, allows the sharing of intron retention results between labs, integrates a dynamic database to explore and contrast available samples, and provides a tested method to detect differential levels of intron retention.
Affiliation(s)
- Claudio Lorenzi
- Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier, France
| | - Sylvain Barriere
- Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier, France
| | - Katharina Arnold
- Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier, France
| | - Reini F Luco
- Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier, France
| | - Andrew J Oldfield
- Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier, France
| | - William Ritchie
- Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier, France
| |
|
7
|
Wang Y, Zhao Y, Bollas A, Wang Y, Au KF. Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 797] [Impact Index Per Article: 199.3] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
Affiliation(s)
- Yunhao Wang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Yue Zhao
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Biomedical Informatics Shared Resources, The Ohio State University, Columbus, OH, USA
| | - Audrey Bollas
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Yuru Wang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Kin Fai Au
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Biomedical Informatics Shared Resources, The Ohio State University, Columbus, OH, USA
| |
|
8
|
Dorado G, Gálvez S, Rosales TE, Vásquez VF, Hernández P. Analyzing Modern Biomolecules: The Revolution of Nucleic-Acid Sequencing - Review. Biomolecules 2021; 11:1111. [PMID: 34439777 PMCID: PMC8393538 DOI: 10.3390/biom11081111] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/16/2021] [Revised: 07/12/2021] [Accepted: 07/23/2021] [Indexed: 02/06/2023] Open
Abstract
Recent developments have revolutionized the study of biomolecules. Among them are molecular markers, and the amplification and sequencing of nucleic acids. Sequencing is classified into three generations. The first allows sequencing of small DNA fragments. The second increases throughput while reducing turnaround time and cost, and is therefore better suited to sequencing full genomes and transcriptomes. The third generation is currently pushing technology to its limits, being able to sequence single molecules without prior amplification, which was previously impossible. It also represents a new revolution, allowing researchers to sequence RNA directly without prior retrotranscription. These technologies are having a significant impact on different areas, such as medicine, agronomy, ecology and biotechnology. Additionally, the study of biomolecules is revealing interesting evolutionary information. That includes deciphering what makes us human, including phenomena like non-coding RNA expansion. All this is redefining the concept of gene and transcript. Basic analyses and applications are now facilitated by new genome editing tools, such as CRISPR. All these developments in general, and nucleic-acid sequencing in particular, are opening an exciting new era of biomolecule analyses and applications, including personalized medicine, and the diagnosis and prevention of diseases in humans and other animals.
Affiliation(s)
- Gabriel Dorado
- Dep. Bioquímica y Biología Molecular, Campus Rabanales C6-1-E17, Campus de Excelencia Internacional Agroalimentario (ceiA3), Universidad de Córdoba, 14071 Córdoba, Spain
| | - Sergio Gálvez
- Dep. Lenguajes y Ciencias de la Computación, Boulevard Louis Pasteur 35, Universidad de Málaga, 29071 Málaga, Spain
| | - Teresa E. Rosales
- Laboratorio de Arqueobiología, Avda. Universitaria s/n, Universidad Nacional de Trujillo, 13011 Trujillo, Peru
| | - Víctor F. Vásquez
- Centro de Investigaciones Arqueobiológicas y Paleoecológicas Andinas Arqueobios, Martínez de Companón 430-Bajo 100, Urbanización San Andres, 13088 Trujillo, Peru
| | - Pilar Hernández
- Instituto de Agricultura Sostenible (IAS), Consejo Superior de Investigaciones Científicas (CSIC), Alameda del Obispo s/n, 14080 Córdoba, Spain
| |
|
9
|
Koiwai K, Koyama T, Tsuda S, Toyoda A, Kikuchi K, Suzuki H, Kawano R. Single-cell RNA-seq analysis reveals penaeid shrimp hemocyte subpopulations and cell differentiation process. eLife 2021; 10:e66954. [PMID: 34132195 PMCID: PMC8266392 DOI: 10.7554/elife.66954] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 01/28/2021] [Accepted: 06/15/2021] [Indexed: 01/03/2023] Open
Abstract
Crustacean aquaculture is expected to be a major source of fishery commodities in the near future. Hemocytes are key players of the immune system in shrimps; however, their classification, maturation, and differentiation are still under debate. To date, only discrete and inconsistent information on the classification of shrimp hemocytes has been reported, showing that the morphological characteristics are not sufficient to resolve their actual roles. Our present study using single-cell RNA sequencing revealed six types of hemocytes of Marsupenaeus japonicus based on their transcriptional profiles. We identified markers of each subpopulation and predicted the differentiation pathways involved in their maturation. We also predicted cell growth factors that might play crucial roles in hemocyte differentiation. Different immune roles among these subpopulations were suggested from the analysis of differentially expressed immune-related genes. These results provide a unified classification of shrimp hemocytes, which improves the understanding of its immune system.
Affiliation(s)
- Keiichiro Koiwai
- Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, Koganei, Japan
- Laboratory of Genome Science, Tokyo University of Marine Science and Technology, Minato, Japan
| | - Takashi Koyama
- Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Hamamatsu, Japan
- Graduate School of Fisheries and Environmental Sciences, Nagasaki University, Nagasaki, Japan
| | | | - Atsushi Toyoda
- Advanced Genomics Center, National Institute of Genetics, Mishima, Japan
| | - Kiyoshi Kikuchi
- Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Hamamatsu, Japan
| | - Hiroaki Suzuki
- Department of Precision Mechanics, Faculty of Science and Engineering, Chuo University, Bunkyo, Japan
| | - Ryuji Kawano
- Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, Koganei, Japan
| |
|
10
|
Patro R, Salmela L. Algorithms meet sequencing technologies - 10th edition of the RECOMB-Seq workshop. iScience 2021; 24:101956. [PMID: 33437938 PMCID: PMC7788091 DOI: 10.1016/j.isci.2020.101956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
DNA and RNA sequencing is a core technology in biological and medical research. The high throughput of these technologies and the consistent development of new experimental assays and biotechnologies demand the continuous development of methods to analyze the resulting data. The RECOMB Satellite Workshop on Massively Parallel Sequencing brings together leading researchers in computational genomics to discuss emerging frontiers in algorithm development for massively parallel sequencing data. The 10th meeting in this series, RECOMB-Seq 2020, was scheduled to be held in Padua, Italy, but due to the ongoing COVID-19 pandemic, the meeting was carried out virtually instead. The online workshop featured keynote talks by Paola Bonizzoni and Zamin Iqbal, two highlight talks, ten regular talks, and three short talks. Seven of the works presented in the workshop are featured in this edition of iScience, and many of the talks are available online in the RECOMB-Seq 2020 YouTube channel.
Affiliation(s)
- Rob Patro
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, MD, USA
| | - Leena Salmela
- Department of Computer Science and Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
| |
|
11
|
Hayrabedyan S, Kostova P, Zlatkov V, Todorova K. Single-cell transcriptomics in the context of long-read nanopore sequencing. BIOTECHNOL BIOTEC EQ 2021. [DOI: 10.1080/13102818.2021.1988868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022] Open
Affiliation(s)
- Soren Hayrabedyan
- Laboratory of Reproductive OMICs Technologies, Institute of Biology and Immunology of Reproduction, Bulgarian Academy of Sciences, Sofia, Bulgaria
| | - Petya Kostova
- Gynecology Clinic, National Oncology Hospital, Sofia, Bulgaria
| | - Viktor Zlatkov
- Department of Obstetrics and Gynecology, Faculty of Medicine, Medical University of Sofia, Sofia, Bulgaria
| | - Krassimira Todorova
- Laboratory of Reproductive OMICs Technologies, Institute of Biology and Immunology of Reproduction, Bulgarian Academy of Sciences, Sofia, Bulgaria
| |
|
12
|
Rousseau-Gueutin M, Belser C, Da Silva C, Richard G, Istace B, Cruaud C, Falentin C, Boideau F, Boutte J, Delourme R, Deniot G, Engelen S, de Carvalho JF, Lemainque A, Maillet L, Morice J, Wincker P, Denoeud F, Chèvre AM, Aury JM. Long-read assembly of the Brassica napus reference genome Darmor-bzh. Gigascience 2020; 9:giaa137. [PMID: 33319912 PMCID: PMC7736779 DOI: 10.1093/gigascience/giaa137] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 07/22/2020] [Revised: 09/18/2020] [Accepted: 11/09/2020] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND The combination of long reads and long-range information to produce genome assemblies is now accepted as a common standard. This strategy not only allows access to the gene catalogue of a given species but also reveals the architecture and organization of chromosomes, including complex regions such as telomeres and centromeres. The Brassica genus is not exempt, and many assemblies based on long reads are now available. The reference genome for Brassica napus, Darmor-bzh, which was published in 2014, was produced using short reads and its contiguity was extremely low compared with current assemblies of the Brassica genus. FINDINGS Herein, we report the new long-read assembly of Darmor-bzh genome (Brassica napus) generated by combining long-read sequencing data and optical and genetic maps. Using the PromethION device and 6 flowcells, we generated ∼16 million long reads representing 93× coverage and, more importantly, 6× with reads longer than 100 kb. This ultralong-read dataset allows us to generate one of the most contiguous and complete assemblies of a Brassica genome to date (contig N50 > 10 Mb). In addition, we exploited all the advantages of the nanopore technology to detect modified bases and sequence transcriptomic data using direct RNA to annotate the genome and focus on resistance genes. CONCLUSION Using these cutting-edge technologies, and in particular by relying on all the advantages of the nanopore technology, we provide the most contiguous Brassica napus assembly, a resource that will be valuable to the Brassica community for crop improvement and will facilitate the rapid selection of agronomically important traits.
Affiliation(s)
| | - Caroline Belser
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Corinne Da Silva
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Gautier Richard
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Benjamin Istace
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Corinne Cruaud
- Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Cyril Falentin
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Franz Boideau
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Julien Boutte
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Regine Delourme
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Gwenaëlle Deniot
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Stefan Engelen
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | | | - Arnaud Lemainque
- Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Loeiz Maillet
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Jérôme Morice
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - France Denoeud
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Anne-Marie Chèvre
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Jean-Marc Aury
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| |
|
13
|
Grabski DF, Broseus L, Kumari B, Rekosh D, Hammarskjold ML, Ritchie W. Intron retention and its impact on gene expression and protein diversity: A review and a practical guide. WILEY INTERDISCIPLINARY REVIEWS-RNA 2020; 12:e1631. [PMID: 33073477 DOI: 10.1002/wrna.1631] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/11/2020] [Revised: 09/16/2020] [Accepted: 09/23/2020] [Indexed: 12/12/2022]
Abstract
Intron retention (IR) occurs when a complete and unspliced intron remains in mature mRNA. An increasing body of literature has demonstrated a major role for IR in numerous biological functions, including several that impact human health and disease. Although experimental technologies used to study other forms of mRNA splicing can also be used to investigate IR, a specialized downstream computational analysis is optimal for IR discovery and analysis. Here we provide a review of IR and its biological implications, as well as a practical guide for how to detect and analyze it. Several methods, including long-read third-generation direct RNA sequencing, are described. We have developed an R package, FakIR, to facilitate the execution of the bioinformatic tasks recommended in this review, together with a tutorial on how to fit them to users' aims. Additionally, we provide guidelines and experimental protocols to validate IR discovery and to evaluate the potential impact of IR on gene expression and protein output. This article is categorized under: RNA Evolution and Genomics > Computational Analyses of RNA; RNA Processing > Splicing Regulation/Alternative Splicing; RNA Methods > RNA Analyses in vitro and In Silico.
Affiliation(s)
- David F Grabski
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, Virginia, USA; Myles H. Thaler Center for AIDS and Human Retrovirus Research, University of Virginia, Charlottesville, Virginia, USA
| | - Lucile Broseus
- IGH, Centre National de la Recherche Scientifique, University of Montpellier, Montpellier, France
| | - Bandana Kumari
- IGH, Centre National de la Recherche Scientifique, University of Montpellier, Montpellier, France
| | - David Rekosh
- Myles H. Thaler Center for AIDS and Human Retrovirus Research, University of Virginia, Charlottesville, Virginia, USA; Department of Microbiology, Immunology and Cancer Biology, University of Virginia School of Medicine, Charlottesville, Virginia, USA
| | - Marie-Louise Hammarskjold
- Myles H. Thaler Center for AIDS and Human Retrovirus Research, University of Virginia, Charlottesville, Virginia, USA; Department of Microbiology, Immunology and Cancer Biology, University of Virginia School of Medicine, Charlottesville, Virginia, USA
| | - William Ritchie
- IGH, Centre National de la Recherche Scientifique, University of Montpellier, Montpellier, France
| |
|