1
|
Chen Y, Davidson NM, Wan YK, Yao F, Su Y, Gamaarachchi H, Sim A, Patel H, Low HM, Hendra C, Wratten L, Hakkaart C, Sawyer C, Iakovleva V, Lee PL, Xin L, Ng HEV, Loo JM, Ong X, Ng HQA, Wang J, Koh WQC, Poon SYP, Stanojevic D, Tran HD, Lim KHE, Toh SY, Ewels PA, Ng HH, Iyer NG, Thiery A, Chng WJ, Chen L, DasGupta R, Sikic M, Chan YS, Tan BOP, Wan Y, Tam WL, Yu Q, Khor CC, Wüstefeld T, Lezhava A, Pratanwanich PN, Love MI, Goh WSS, Ng SB, Oshlack A, Göke J. A systematic benchmark of Nanopore long-read RNA sequencing for transcript-level analysis in human cell lines. Nat Methods 2025; 22:801-812. [PMID: 40082608 PMCID: PMC11978509 DOI: 10.1038/s41592-025-02623-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2021] [Accepted: 02/04/2025] [Indexed: 03/16/2025]
Abstract
The human genome contains instructions to transcribe more than 200,000 RNAs. However, many RNA transcripts are generated from the same gene, resulting in alternative isoforms that are highly similar and that remain difficult to quantify. To evaluate the ability to study RNA transcript expression, we profiled seven human cell lines with five different RNA-sequencing protocols, including short-read cDNA, Nanopore long-read direct RNA, amplification-free direct cDNA and PCR-amplified cDNA sequencing, and PacBio IsoSeq, with multiple spike-in controls, and additional transcriptome-wide N6-methyladenosine profiling data. We describe differences in read length, coverage, throughput and transcript expression, reporting that long-read RNA sequencing more robustly identifies major isoforms. We illustrate the value of the SG-NEx data to identify alternative isoforms, novel transcripts, fusion transcripts and N6-methyladenosine RNA modifications. Together, the SG-NEx data provide a comprehensive resource enabling the development and benchmarking of computational methods for profiling complex transcriptional events at isoform-level resolution.
Collapse
Affiliation(s)
- Ying Chen
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore.
| | - Nadia M Davidson
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville, Victoria, Australia
- Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
| | - Yuk Kei Wan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Fei Yao
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Yan Su
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Hasindu Gamaarachchi
- School of Computer Science and Engineering, UNSW Sydney, Sydney, New South Wales, Australia
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
| | - Andre Sim
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | | | - Hwee Meng Low
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Christopher Hendra
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Institute of Data Science, National University of Singapore, Singapore, Singapore
| | - Laura Wratten
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | | | - Chelsea Sawyer
- Bioinformatics and Biostatistics, The Francis Crick Institute, London, UK
| | - Viktoriia Iakovleva
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Division of Gastroenterology and Hepatology, Weill Cornell Medicine, New York, NY, USA
| | - Puay Leng Lee
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Lixia Xin
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Cardiovascular and Metabolic Disorders Program, Duke-NUS Medical School, Singapore, Singapore
| | - Hui En Vanessa Ng
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
| | - Jia Min Loo
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Xuewen Ong
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
| | - Hui Qi Amanda Ng
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Jiaxu Wang
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Wei Qian Casslynn Koh
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Suk Yeah Polly Poon
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Dominik Stanojevic
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Department of Electronic Systems and Information Processing, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
| | - Hoang-Dai Tran
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Kok Hao Edwin Lim
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Shen Yon Toh
- National Cancer Centre Singapore, Singapore, Singapore
| | | | - Huck-Hui Ng
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - N Gopalakrishna Iyer
- National Cancer Centre Singapore, Singapore, Singapore
- Duke-NUS Medical School, Singapore, Singapore
| | - Alexandre Thiery
- Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore
| | - Wee Joo Chng
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Department of Hematology-Oncology, National University Cancer Institute of Singapore, National University Health System, Singapore, Singapore
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Leilei Chen
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Department of Anatomy, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Ramanuj DasGupta
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Mile Sikic
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Department of Electronic Systems and Information Processing, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
| | - Yun-Shen Chan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Boon Ooi Patrick Tan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
| | - Yue Wan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Wai Leong Tam
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Qiang Yu
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Chiea Chuan Khor
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Duke-NUS Medical School, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
| | - Torsten Wüstefeld
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- National Cancer Centre Singapore, Singapore, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Alexander Lezhava
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Ploy N Pratanwanich
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Chula Intelligent and Complex Systems Research Unit, Chulalongkorn University, Bangkok, Thailand
| | - Michael I Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Wee Siong Sho Goh
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen, China
| | - Sarah B Ng
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Alicia Oshlack
- Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- School of Mathematics and Statistics, University of Melbourne, Parkville, Victoria, Australia
| | - Jonathan Göke
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore.
- Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore.
| |
Collapse
|
2
|
Kasianova AM, Penin AA, Schelkunov MI, Kasianov AS, Logacheva MD, Klepikova AV. Trans2express - de novo transcriptome assembly pipeline optimized for gene expression analysis. PLANT METHODS 2024; 20:128. [PMID: 39152473 PMCID: PMC11330051 DOI: 10.1186/s13007-024-01255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/17/2024] [Accepted: 08/01/2024] [Indexed: 08/19/2024]
Abstract
BACKGROUND As genomes of many eukaryotic species, especially plants, are large and complex, their de novo sequencing and assembly is still a difficult task despite progress in sequencing technologies. An alternative to genome assembly is the assembly of transcriptome, the set of RNA products of the expressed genes. While a bunch of de novo transcriptome assemblers exists, the challenges of transcriptomes (the existence of isoforms, the uneven expression levels across genes) complicates the generation of high-quality assemblies suitable for downstream analyses. RESULTS We developed Trans2express - a web-based tool and a pipeline of de novo hybrid transcriptome assembly and postprocessing based on rnaSPAdes with a set of subsequent filtrations. The pipeline was tested on Arabidopsis thaliana cDNA sequencing data obtained using Illumina and Oxford Nanopore Technologies platforms and three non-model plant species. The comparison of structural characteristics of the transcriptome assembly with reference Arabidopsis genome revealed the high quality of assembled transcriptome with 86.1% of Arabidopsis expressed genes assembled as a single contig. We tested the applicability of the transcriptome assembly for gene expression analysis. For both Arabidopsis and non-model species the results showed high congruence of gene expression levels and sets of differentially expressed genes between analyses based on genome and based on the transcriptome assembly. CONCLUSIONS We present Trans2express - a protocol for de novo hybrid transcriptome assembly aimed at recovering of a single transcript per gene. We expect this protocol to promote the characterization of transcriptomes and gene expression analysis in non-model plants and web-based tool to be of use to a wide range of plant biologists.
Collapse
Affiliation(s)
- Aleksandra M Kasianova
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
| | - Aleksey A Penin
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
| | - Mikhail I Schelkunov
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
| | - Artem S Kasianov
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
| | - Maria D Logacheva
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
| | - Anna V Klepikova
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia.
| |
Collapse
|
3
|
Kainth AS, Haddad GA, Hall JM, Ruthenburg AJ. Merging short and stranded long reads improves transcript assembly. PLoS Comput Biol 2023; 19:e1011576. [PMID: 37883581 PMCID: PMC10629667 DOI: 10.1371/journal.pcbi.1011576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2022] [Revised: 11/07/2023] [Accepted: 10/05/2023] [Indexed: 10/28/2023] Open
Abstract
Long-read RNA sequencing has arisen as a counterpart to short-read sequencing, with the potential to capture full-length isoforms, albeit at the cost of lower depth. Yet this potential is not fully realized due to inherent limitations of current long-read assembly methods and underdeveloped approaches to integrate short-read data. Here, we critically compare the existing methods and develop a new integrative approach to characterize a particularly challenging pool of low-abundance long noncoding RNA (lncRNA) transcripts from short- and long-read sequencing in two distinct cell lines. Our analysis reveals severe limitations in each of the sequencing platforms. For short-read assemblies, coverage declines at transcript termini resulting in ambiguous ends, and uneven low coverage results in segmentation of a single transcript into multiple transcripts. Conversely, long-read sequencing libraries lack depth and strand-of-origin information in cDNA-based methods, culminating in erroneous assembly and quantitation of transcripts. We also discover a cDNA synthesis artifact in long-read datasets that markedly impacts the identity and quantitation of assembled transcripts. Towards remediating these problems, we develop a computational pipeline to "strand" long-read cDNA libraries that rectifies inaccurate mapping and assembly of long-read transcripts. Leveraging the strengths of each platform and our computational stranding, we also present and benchmark a hybrid assembly approach that drastically increases the sensitivity and accuracy of full-length transcript assembly on the correct strand and improves detection of biological features of the transcriptome. When applied to a challenging set of under-annotated and cell-type variable lncRNA, our method resolves the segmentation problem of short-read sequencing and the depth problem of long-read sequencing, resulting in the assembly of coherent transcripts with precise 5' and 3' ends. Our workflow can be applied to existing datasets for superior demarcation of transcript ends and refined isoform structure, which can enable better differential gene expression analyses and molecular manipulations of transcripts.
Collapse
Affiliation(s)
- Amoldeep S. Kainth
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Gabriela A. Haddad
- Committee on Genetics, Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Johnathon M. Hall
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Alexander J. Ruthenburg
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois, United States of America
- Committee on Genetics, Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois, United States of America
| |
Collapse
|
4
|
Lu N, Qiao Y, An P, Luo J, Bi C, Li M, Lu Z, Tu J. Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data. Brief Bioinform 2023; 24:bbad275. [PMID: 37529913 DOI: 10.1093/bib/bbad275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2023] [Revised: 06/21/2023] [Accepted: 07/10/2023] [Indexed: 08/03/2023] Open
Abstract
MOTIVATION Multiple displacement amplification (MDA) has become the most commonly used method of whole genome amplification, generating a vast amount of DNA with higher molecular weight and greater genome coverage. Coupling with long-read sequencing, it is possible to sequence the amplicons of over 20 kb in length. However, the formation of chimeric sequences (chimeras, expressed as structural errors in sequencing data) in MDA seriously interferes with the bioinformatics analysis but its influence on long-read sequencing data is unknown. RESULTS We sequenced the phi29 DNA polymerase-mediated MDA amplicons on the PacBio platform and analyzed chimeras within the generated data. The 3rd-ChimeraMiner has been constructed as a pipeline for recognizing and restoring chimeras into the original structures in long-read sequencing data, improving the efficiency of using TGS data. Five long-read datasets and one high-fidelity long-read dataset with various amplification folds were analyzed. The result reveals that the mis-priming events in amplification are more frequently occurring than widely perceived, and the propor tion gradually accumulates from 42% to over 78% as the amplification continues. In total, 99.92% of recognized chimeric sequences were demonstrated to be artifacts, whose structures were wrongly formed in MDA instead of existing in original genomes. By restoring chimeras to their original structures, the vast majority of supplementary alignments that introduce false-positive structural variants are recycled, removing 97% of inversions on average and contributing to the analysis of structural variation in MDA-amplified samples. The impact of chimeras in long-read sequencing data analysis should be emphasized, and the 3rd-ChimeraMiner can help to quantify and reduce the influence of chimeras. AVAILABILITY AND IMPLEMENTATION The 3rd-ChimeraMiner is available on GitHub, https://github.com/dulunar/3rdChimeraMiner.
Collapse
Affiliation(s)
- Na Lu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Yi Qiao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Pengfei An
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Monash University-Southeast University Joint Research Institute, Suzhou 215123, China
| | - Jiajian Luo
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Changwei Bi
- College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China
| | - Musheng Li
- Department of Physiology and Cell Biology, University of Nevada, Reno School of Medicine, Reno, NV 89511, USA
| | - Zuhong Lu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Jing Tu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| |
Collapse
|
5
|
Zhu W, Liao X. LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads. Front Genet 2023; 14:1166975. [PMID: 37292144 PMCID: PMC10245045 DOI: 10.3389/fgene.2023.1166975] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2023] [Accepted: 05/04/2023] [Indexed: 06/10/2023] Open
Abstract
As the carrier of genetic information, RNA carries the information from genes to proteins. Transcriptome sequencing technology is an important way to obtain transcriptome sequences, and it is also the basis for transcriptome research. With the development of third-generation sequencing, long reads can cover full-length transcripts and reflect the composition of different isoforms. However, the high error rate of third-generation sequencing affects the accuracy of long reads and downstream analysis. The current error correction methods seldom consider the existence of different isoforms in RNA, which makes the diversity of isoforms a serious loss. Here, we introduce LCAT (long-read error correction algorithm for transcriptome sequencing data), a wrapper algorithm of MECAT, to reduce the loss of isoform diversity while keeping MECAT's error correction performance. The experimental results show that LCAT can not only improve the quality of transcriptome sequencing long reads but also retain the diversity of isoforms.
Collapse
Affiliation(s)
- Wufei Zhu
- Department of Endocrinology, Yichang Central People’s Hospital, The First College of Clinical Medical Science, China Three Gorges University, Yichang, China
| | - Xingyu Liao
- Computer, Electrical and Mathematical Sciences, and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| |
Collapse
|
6
|
Zhang Z, Pei P, Zhang M, Li F, Tang G. Chromosome-level genome assembly of Dastarcus helophoroides provides insights into CYP450 genes expression upon insecticide exposure. PEST MANAGEMENT SCIENCE 2023; 79:1467-1482. [PMID: 36502364 DOI: 10.1002/ps.7319] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/01/2022] [Revised: 10/26/2022] [Accepted: 12/11/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND Dastarcus helophoroides is an important natural enemy of cerambycids, and is wildly used in biological control of pests. Nevertheless, the absence of complete genomic information limits the investigation of the underlying molecular mechanisms. Here, a chromosome-level of Dastarcus helophoroides genome is assembled using a combination strategy of Illumina, PacBio, 10x™ Genomics, and Hi-C. RESULTS The final assembly is 609.09 Mb with contig N50, scaffold N50 and GC content of 5.46 Mb, 42.56 Mb and 31.50%, respectively, and 95.25% of the contigs anchor into 13 chromosomes. In total 14 890 protein-coding genes and 65.37% repeat sequences are predicted in the assembly genome. The phylogenetic analysis of single-copy gene families shared among 20 insect species indicates that Dastarcus helophoroides is placed as the sister species to clade (Nitidulidae+Curculionoidea+Chrysomeloidea) + Tenebrionoidea, and diverges from the related species ~242.9 Mya. In total 36 expanded gene families are identified in Dastarcus helophoroides genome, and are functionally related to drug metabolism and metabolism of xenobiotics by cytochrome P450. Some members of CYP4 Clade and CYP6 Clade are up-regulated in Dastarcus helophoroides adults upon insecticide exposure, of which expressions of DhCYP4Q, DhCYP6A14X1 and DhCYP4C1 are significantly up-regulated. The silencing of the three genes leads to adults more sensitive to insecticide and increased knocked-down rate, which may indicate their critical roles in stress resistance and detoxication. CONCLUSION Our study systematically integrated the chromosome-level genome, transcriptome and gene expression of Dastarcus helophoroides, which will provide valuable resources for understanding mechanisms of pesticide metabolism, growth and development, and utilization of the natural enemy in integrated control. © 2022 Society of Chemical Industry.
Collapse
Affiliation(s)
- Zhengqing Zhang
- Key Laboratory of National Forestry and Grassland Administration on Management of Western Forest Bio-Disaster, College of Forestry, Northwest A&F University, Yangling, P. R. China
| | - Pei Pei
- Key Laboratory of National Forestry and Grassland Administration on Management of Western Forest Bio-Disaster, College of Forestry, Northwest A&F University, Yangling, P. R. China
| | - Meng Zhang
- Key Laboratory of National Forestry and Grassland Administration on Management of Western Forest Bio-Disaster, College of Forestry, Northwest A&F University, Yangling, P. R. China
| | - Feifei Li
- Key Laboratory of National Forestry and Grassland Administration on Management of Western Forest Bio-Disaster, College of Forestry, Northwest A&F University, Yangling, P. R. China
| | - Guanghui Tang
- Key Laboratory of National Forestry and Grassland Administration on Management of Western Forest Bio-Disaster, College of Forestry, Northwest A&F University, Yangling, P. R. China
| |
Collapse
|
7
|
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, P. Łabaj P, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Collapse
Affiliation(s)
- Dhrithi Deshpande
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Karishma Chhugani
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Yutong Chang
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Caitlin Loeffler
- Department of Computer Science, University of California, Los Angeles, CA, United States
| | - Jinyang Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Agata Muszyńska
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
| | - Viorel Munteanu
- Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
| | - Harry Yang
- Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
| | - Jeremy Rotman
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Laura Tao
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | | | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, CA, United States
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
| | - Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
| | - Paweł P. Łabaj
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Department of Biotechnology, Boku University Vienna, Vienna, Austria
| | - Serghei Mangul
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
- *Correspondence: Serghei Mangul,
| |
Collapse
|
8
|
Srikakulam N, Sridevi G, Pandi G. High-quality reference transcriptome construction improves RNA-seq quantification in Oryza sativa indica. Front Genet 2022; 13:995072. [PMID: 36246658 PMCID: PMC9558114 DOI: 10.3389/fgene.2022.995072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2022] [Accepted: 09/02/2022] [Indexed: 11/13/2022] Open
Abstract
The Reference Transcriptomic Dataset (RTD) is an accurate and comprehensive collection of transcripts originating from a given organism. It holds the key to precise transcript quantification and downstream analysis of differential expressions and regulations. Currently, transcriptome annotations for most crop plants are far from complete. For example, Oryza sativa indica (O. sativa indica) is reported to have 40,759 transcripts in the Ensembl database without alternative transcript isoforms and alternative splicing (AS) events. To generate a high-quality RTD, we conducted RNA sequencing of rice leaf samples collected at various time points during Rhizoctonia solani infection. The obtained reads were analyzed by adopting the recently developed computational analysis pipeline to assemble the RTD with increased transcript and AS diversity for O. sativa indica (IndicaRTD). After stringent quality filtering, the newly constructed transcriptome annotation was comprised of 122,968 non-redundant transcripts from 53,695 genes. This study identified many novel transcripts compared to Ensembl deposited data that are important for regulating molecular and physiological processes in the plant system. Currently, the assembled IndicaRTD must allow fast quantification of transcript and gene expression with high precision.
Collapse
Affiliation(s)
- Nagesh Srikakulam
- Laboratory of RNA Biology and Epigenomics, Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
- *Correspondence: Nagesh Srikakulam, ; Gopal Pandi,
| | - Ganapathi Sridevi
- Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
| | - Gopal Pandi
- Laboratory of RNA Biology and Epigenomics, Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
- *Correspondence: Nagesh Srikakulam, ; Gopal Pandi,
| |
Collapse
|
9
|
Zhang R, Kuo R, Coulter M, Calixto CPG, Entizne JC, Guo W, Marquez Y, Milne L, Riegler S, Matsui A, Tanaka M, Harvey S, Gao Y, Wießner-Kroh T, Paniagua A, Crespi M, Denby K, Hur AB, Huq E, Jantsch M, Jarmolowski A, Koester T, Laubinger S, Li QQ, Gu L, Seki M, Staiger D, Sunkar R, Szweykowska-Kulinska Z, Tu SL, Wachter A, Waugh R, Xiong L, Zhang XN, Conesa A, Reddy ASN, Barta A, Kalyna M, Brown JWS. A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis. Genome Biol 2022; 23:149. [PMID: 35799267 PMCID: PMC9264592 DOI: 10.1186/s13059-022-02711-0] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2021] [Accepted: 06/15/2022] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures including alternative splicing, and transcription start and polyadenylation sites. However, accuracy is significantly affected by sequencing errors, mRNA degradation, or incomplete cDNA synthesis. RESULTS We present a new and comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3). AtRTD3 contains over 169,000 transcripts-twice that of the best current Arabidopsis transcriptome and including over 1500 novel genes. Seventy-eight percent of transcripts are from Iso-seq with accurately defined splice junctions and transcription start and end sites. We develop novel methods to determine splice junctions and transcription start and end sites accurately. Mismatch profiles around splice junctions provide a powerful feature to distinguish correct splice junctions and remove false splice junctions. Stratified approaches identify high-confidence transcription start and end sites and remove fragmentary transcripts due to degradation. AtRTD3 is a major improvement over existing transcriptomes as demonstrated by analysis of an Arabidopsis cold response RNA-seq time-series. AtRTD3 provides higher resolution of transcript expression profiling and identifies cold-induced differential transcription start and polyadenylation site usage. CONCLUSIONS AtRTD3 is the most comprehensive Arabidopsis transcriptome currently. It improves the precision of differential gene and transcript expression, differential alternative splicing, and transcription start/end site usage analysis from RNA-seq data. The novel methods for identifying accurate splice junctions and transcription start/end sites are widely applicable and will improve single-molecule sequencing analysis from any species.
Collapse
Affiliation(s)
- Runxuan Zhang
- Information and Computational Sciences, James Hutton Institute, Dundee, DD2 5DA, Scotland, UK.
| | - Richard Kuo
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK
| | - Max Coulter
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, Scotland, UK
| | - Cristiane P G Calixto
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, Scotland, UK
- Present address: Institute of Biosciences, University of São Paulo, São Paulo, 05508-090, Brazil
| | - Juan Carlos Entizne
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, Scotland, UK
| | - Wenbin Guo
- Information and Computational Sciences, James Hutton Institute, Dundee, DD2 5DA, Scotland, UK
| | - Yamile Marquez
- Centre for Genomic Regulation, C/ Dr. Aiguader 88, 08003, Barcelona, Spain
| | - Linda Milne
- Information and Computational Sciences, James Hutton Institute, Dundee, DD2 5DA, Scotland, UK
| | - Stefan Riegler
- Institute of Molecular Plant Biology, Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences (BOKU), Muthgasse 18, 1190, Vienna, Austria
- Present address: Institute of Science and Technology Austria, Am Campus 1, 3400, Klosterneuburg, Austria
| | - Akihiro Matsui
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
| | - Maho Tanaka
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
| | - Sarah Harvey
- Centre for Novel Agricultural Products (CNAP), Department of Biology, University of York Wentworth Way, York, YO10 5DD, UK
| | - Yubang Gao
- College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
| | - Theresa Wießner-Kroh
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, Auf der Morgenstelle 32, 72076, Tübingen, Germany
| | - Alejandro Paniagua
- Institute for Integrative Systems Biology (CSIC-UV), Spanish National Research Council, Paterna, Valencia, Spain
| | - Martin Crespi
- French National Centre for Scientific Research | CNRS INRAE-Universities of Paris Saclay and Paris, Institute of Plant Sciences Paris Saclay IPS2, Rue de Noetzlin, 91192, Gif sur Yvette, France
| | - Katherine Denby
- Centre for Novel Agricultural Products (CNAP), Department of Biology, University of York Wentworth Way, York, YO10 5DD, UK
| | - Asa Ben Hur
- Department of Computer Science, Colorado State University, 1873 Campus Delivery, Fort Collins, CO, 80523-1873, USA
| | - Enamul Huq
- Department of Molecular Biosciences, University of Texas at Austin, 100 East 24th St., Austin, TX, 78712-1095, USA
| | - Michael Jantsch
- Department of Cell and Developmental Biology, Center for Anatomy and Cell Biology, Medical University of Vienna, Schwarzspanierstrasse 17 A-1090, Vienna, Austria
| | - Artur Jarmolowski
- Department of Gene Expression, Adam Mickiewicz University, Poznań, Poland
| | - Tino Koester
- RNA Biology and Molecular Physiology, Faculty for Biology, Bielefeld University, Universitaetsstrasse 25, 33615, Bielefeld, Germany
| | - Sascha Laubinger
- Institut für Biologie und Umweltwissenschaften (IBU), Carl von Ossietzky Universität Oldenburg, Carl von Ossietzky-Str. 9-11, 26111, Oldenburg, Germany
- Institute of Biology, Department of Genetics, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
| | - Qingshun Quinn Li
- Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA, 91766, USA
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361102, Fujian, China
| | - Lianfeng Gu
- College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
| | - Motoaki Seki
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
| | - Dorothee Staiger
- RNA Biology and Molecular Physiology, Faculty for Biology, Bielefeld University, Universitaetsstrasse 25, 33615, Bielefeld, Germany
| | - Ramanjulu Sunkar
- Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, 74078, USA
| | | | - Shih-Long Tu
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
| | - Andreas Wachter
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, Auf der Morgenstelle 32, 72076, Tübingen, Germany
- Present address: Institute for Molecular Physiology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 17, 55128, Mainz, Germany
| | - Robbie Waugh
- Cell and Molecular Sciences, James Hutton Institute, Dundee, DD2 5DA, Scotland, UK
| | - Liming Xiong
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
| | - Xiao-Ning Zhang
- Biology Department, School of Arts and Sciences, St. Bonaventure University, 3261 West State Road, St. Bonaventure, NY, 14778, USA
| | - Ana Conesa
- Institute for Integrative Systems Biology (CSIC-UV), Spanish National Research Council, Paterna, Valencia, Spain
| | - Anireddy S N Reddy
- Department of Biology and Program in Cell and Molecular Biology, Colorado State University, Fort Collins, CO, 80523, USA
| | - Andrea Barta
- Max F. Perutz Laboratories, Medical University of Vienna, Center of Medical Biochemistry, Dr.-Bohr-Gasse 9/3, A-1030, Vienna, Austria
| | - Maria Kalyna
- Institute of Molecular Plant Biology, Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences (BOKU), Muthgasse 18, 1190, Vienna, Austria
| | - John W S Brown
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, Scotland, UK
- Cell and Molecular Sciences, James Hutton Institute, Dundee, DD2 5DA, Scotland, UK
| |
Collapse
|
10
|
Tay AP, Hamey JJ, Martyn GE, Wilson LOW, Wilkins MR. Identification of Protein Isoforms Using Reference Databases Built from Long and Short Read RNA-Sequencing. J Proteome Res 2022; 21:1628-1639. [PMID: 35612954 DOI: 10.1021/acs.jproteome.1c00968] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Alternative splicing can lead to distinct protein isoforms. These can have different functions in specific cells and tissues or in different developmental stages. In this study, we explored whether transcripts assembled from long read, nanopore-based, direct RNA-sequencing (RNA-seq) could improve the identification of protein isoforms in human K562 cells. By comparing with Illumina-based short read RNA-seq, we showed that a large proportion of Ensembl transcripts (5949/14,326) and genes expressing alternatively spliced transcripts (486/2981) identified with long direct reads were missed by short paired-end reads. By co-analyzing proteomic and transcriptomic data, we also showed that some peptides (826/35,976), proteins (262/3215), and protein isoforms arising from distinct transcript variants (574/1212) identified with isoform-specific peptides via custom long-read-based databases were missed in Illumina-derived databases. Finally, we generated unequivocal peptide evidence for a set of protein isoforms and showed that long read, direct RNA-seq allows the discovery of novel protein isoforms not already in reference databases or custom databases built from short read RNA-seq data. Our analysis highlights the benefits of long read RNA-seq data in the generation of reference databases to increase tandem mass spectrometry (MS/MS) identification of protein isoforms.
Collapse
Affiliation(s)
- Aidan P Tay
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia.,Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia.,Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Joshua J Hamey
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Gabriella E Martyn
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Laurence O W Wilson
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia.,Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Marc R Wilkins
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| |
Collapse
|
11
|
Zhou Y, Ren M, Zhang P, Jiang D, Yao X, Luo Y, Yang Z, Wang Y. Application of Nanopore Sequencing in the Detection of Foodborne Microorganisms. NANOMATERIALS (BASEL, SWITZERLAND) 2022; 12:1534. [PMID: 35564242 PMCID: PMC9100974 DOI: 10.3390/nano12091534] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Revised: 04/28/2022] [Accepted: 04/29/2022] [Indexed: 12/21/2022]
Abstract
Foodborne pathogens have become the subject of intense interest because of their high incidence and mortality worldwide. In the past few decades, people have developed many methods to solve this challenge. At present, methods such as traditional microbial culture methods, nucleic acid or protein-based pathogen detection methods, and whole-genome analysis are widely used in the detection of pathogenic microorganisms in food. However, these methods are limited by time-consuming, cumbersome operations or high costs. The development of nanopore sequencing technology offers the possibility to address these shortcomings. Nanopore sequencing, a third-generation technology, has the advantages of simple operation, high sensitivity, real-time sequencing, and low turnaround time. It can be widely used in the rapid detection and serotyping of foodborne pathogens. This review article discusses foodborne diseases, the principle of nanopore sequencing technology, the application of nanopore sequencing technology in foodborne pathogens detection, as well as its development prospects.
Collapse
Affiliation(s)
| | | | | | | | | | | | | | - Yin Wang
- Key Laboratory of Animal Diseases and Human Health of Sichuan Province, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu 611130, China; (Y.Z.); (M.R.); (P.Z.); (D.J.); (X.Y.); (Y.L.); (Z.Y.)
| |
Collapse
|
12
|
Bartalucci N, Romagnoli S, Vannucchi AM. A blood drop through the pore: nanopore sequencing in hematology. Trends Genet 2021; 38:572-586. [PMID: 34906378 DOI: 10.1016/j.tig.2021.11.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2021] [Revised: 11/09/2021] [Accepted: 11/15/2021] [Indexed: 10/19/2022]
Abstract
The development of new sequencing platforms, technologies, and bioinformatics tools in the past decade fostered key discoveries in human genomics. Among the most recent sequencing technologies, nanopore sequencing (NS) has caught the interest of researchers for its intriguing potential and flexibility. This up-to-date review highlights the recent application of NS in the hematology field, focusing on progress and challenges of the technological approaches employed for the identification of pathologic alterations. The molecular and analytic pipelines developed for the analysis of the whole-genome, target regions, and transcriptomics provide a proof of evidence of the unparalleled amount of information that could be retrieved by an innovative approach based on long-read sequencing.
Collapse
Affiliation(s)
- Niccolò Bartalucci
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, DENOTHE Excellence Center, Florence, Italy
| | - Simone Romagnoli
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, DENOTHE Excellence Center, Florence, Italy
| | - Alessandro Maria Vannucchi
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, DENOTHE Excellence Center, Florence, Italy.
| |
Collapse
|
13
|
Chen Z, He X. Application of third-generation sequencing in cancer research. MEDICAL REVIEW (BERLIN, GERMANY) 2021; 1:150-171. [PMID: 37724303 PMCID: PMC10388785 DOI: 10.1515/mr-2021-0013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/14/2021] [Accepted: 08/09/2021] [Indexed: 09/20/2023]
Abstract
In the past several years, nanopore sequencing technology from Oxford Nanopore Technologies (ONT) and single-molecule real-time (SMRT) sequencing technology from Pacific BioSciences (PacBio) have become available to researchers and are currently being tested for cancer research. These methods offer many advantages over most widely used high-throughput short-read sequencing approaches and allow the comprehensive analysis of transcriptomes by identifying full-length splice isoforms and several other posttranscriptional events. In addition, these platforms enable structural variation characterization at a previously unparalleled resolution and direct detection of epigenetic marks in native DNA and RNA. Here, we present a comprehensive summary of important applications of these technologies in cancer research, including the identification of complex structure variants, alternatively spliced isoforms, fusion transcript events, and exogenous RNA. Furthermore, we discuss the impact of the newly developed nanopore direct RNA sequencing (RNA-Seq) approach in advancing epitranscriptome research in cancer. Although the unique challenges still present for these new single-molecule long-read methods, they will unravel many aspects of cancer genome complexity in unprecedented ways and present an encouraging outlook for continued application in an increasing number of different cancer research settings.
Collapse
Affiliation(s)
- Zhiao Chen
- Fudan University Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Xianghuo He
- Fudan University Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Key Laboratory of Breast Cancer in Shanghai, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
| |
Collapse
|
14
|
Wang Y, Zhao Y, Bollas A, Wang Y, Au KF. Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 806] [Impact Index Per Article: 201.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
Collapse
Affiliation(s)
- Yunhao Wang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Yue Zhao
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Biomedical Informatics Shared Resources, The Ohio State University, Columbus, OH, USA
| | - Audrey Bollas
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Yuru Wang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Kin Fai Au
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA.
- Biomedical Informatics Shared Resources, The Ohio State University, Columbus, OH, USA.
| |
Collapse
|
15
|
Comparative Analysis of PacBio and Oxford Nanopore Sequencing Technologies for Transcriptomic Landscape Identification of Penaeus monodon. Life (Basel) 2021; 11:life11080862. [PMID: 34440606 PMCID: PMC8399832 DOI: 10.3390/life11080862] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2021] [Revised: 08/07/2021] [Accepted: 08/17/2021] [Indexed: 12/16/2022] Open
Abstract
With the advantages that long-read sequencing platforms such as Pacific Biosciences (Menlo Park, CA, USA) (PacBio) and Oxford Nanopore Technologies (Oxford, UK) (ONT) can offer, various research fields such as genomics and transcriptomics can exploit their benefits. Selecting an appropriate sequencing platform is undoubtedly crucial for the success of the research outcome, thus there is a need to compare these long-read sequencing platforms and evaluate them for specific research questions. This study aims to compare the performance of PacBio and ONT platforms for transcriptomic analysis by utilizing transcriptome data from three different tissues (hepatopancreas, intestine, and gonads) of the juvenile black tiger shrimp, Penaeus monodon. We compared three important features: (i) main characteristics of the sequencing libraries and their alignment with the reference genome, (ii) transcript assembly features and isoform identification, and (iii) correlation of the quantification of gene expression levels for both platforms. Our analyses suggest that read-length bias and differences in sequencing throughput are highly influential factors when using long reads in transcriptome studies. These comparisons can provide a guideline when designing a transcriptome study utilizing these two long-read sequencing technologies.
Collapse
|
16
|
Du N, Shang J, Sun Y. Improving protein domain classification for third-generation sequencing reads using deep learning. BMC Genomics 2021; 22:251. [PMID: 33836667 PMCID: PMC8033682 DOI: 10.1186/s12864-021-07468-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2020] [Accepted: 02/19/2021] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND With the development of third-generation sequencing (TGS) technologies, people are able to obtain DNA sequences with lengths from 10s to 100s of kb. These long reads allow protein domain annotation without assembly, thus can produce important insights into the biological functions of the underlying data. However, the high error rate in TGS data raises a new challenge to established domain analysis pipelines. The state-of-the-art methods are not optimized for noisy reads and have shown unsatisfactory accuracy of domain classification in TGS data. New computational methods are still needed to improve the performance of domain prediction in long noisy reads. RESULTS In this work, we introduce ProDOMA, a deep learning model that conducts domain classification for TGS reads. It uses deep neural networks with 3-frame translation encoding to learn conserved features from partially correct translations. In addition, we formulate our problem as an open-set problem and thus our model can reject reads not containing the targeted domains. In the experiments on simulated long reads of protein coding sequences and real TGS reads from the human genome, our model outperforms HMMER and DeepFam on protein domain classification. CONCLUSIONS In summary, ProDOMA is a useful end-to-end protein domain analysis tool for long noisy reads without relying on error correction.
Collapse
Affiliation(s)
- Nan Du
- Computer Science and Engineering, Michigan State University, East Lansing, 48824 USA
| | - Jiayu Shang
- Electrical Engineering, City University of Hong Kong, Hong Kong, People’s Republic of China
| | - Yanni Sun
- Electrical Engineering, City University of Hong Kong, Hong Kong, People’s Republic of China
| |
Collapse
|
17
|
Broseus L, Thomas A, Oldfield AJ, Severac D, Dubois E, Ritchie W. TALC: Transcript-level Aware Long-read Correction. Bioinformatics 2021; 36:5000-5006. [PMID: 32910174 DOI: 10.1093/bioinformatics/btaa634] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2020] [Revised: 05/08/2020] [Accepted: 07/09/2020] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION Long-read sequencing technologies are invaluable for determining complex RNA transcript architectures but are error-prone. Numerous 'hybrid correction' algorithms have been developed for genomic data that correct long reads by exploiting the accuracy and depth of short reads sequenced from the same sample. These algorithms are not suited for correcting more complex transcriptome sequencing data. RESULTS We have created a novel reference-free algorithm called Transcript-level Aware Long-Read Correction (TALC) which models changes in RNA expression and isoform representation in a weighted De Bruijn graph to correct long reads from transcriptome studies. We show that transcript-level aware correction by TALC improves the accuracy of the whole spectrum of downstream RNA-seq applications and is thus necessary for transcriptome analyses that use long read technology. AVAILABILITY AND IMPLEMENTATION TALC is implemented in C++ and available at https://github.com/lbroseus/TALC. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Lucile Broseus
- Department of Genome Dynamics, Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier 34396, France
| | - Aubin Thomas
- Department of Genome Dynamics, Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier 34396, France
| | - Andrew J Oldfield
- Department of Genome Dynamics, Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier 34396, France
| | - Dany Severac
- MGX-Montpellier GenomiX, c/o Institut de Génomique Fonctionnelle, Montpellier Cedex 5 34094, France
| | - Emeric Dubois
- MGX-Montpellier GenomiX, c/o Institut de Génomique Fonctionnelle, Montpellier Cedex 5 34094, France
| | - William Ritchie
- Department of Genome Dynamics, Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier 34396, France
| |
Collapse
|
18
|
Holley G, Beyter D, Ingimundardottir H, Møller PL, Kristmundsdottir S, Eggertsson HP, Halldorsson BV. Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly. Genome Biol 2021; 22:28. [PMID: 33419473 PMCID: PMC7792008 DOI: 10.1186/s13059-020-02244-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2020] [Accepted: 12/15/2020] [Indexed: 12/20/2022] Open
Abstract
A major challenge to long read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average with a median error rate as low as 0.22 %. SNP calls in Ratatosk corrected reads are nearly 99 % accurate and indel calls accuracy is increased by up to 37 %. An assembly of Ratatosk corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and less misassemblies than a PacBio HiFi reads assembly.
Collapse
Affiliation(s)
| | | | | | - Peter L Møller
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
| | - Snædis Kristmundsdottir
- deCODE genetics/Amgen Inc., Reykjavík, Iceland
- School of Technology, Reykjavik University, Reykjavík, Iceland
| | | | - Bjarni V Halldorsson
- deCODE genetics/Amgen Inc., Reykjavík, Iceland
- School of Technology, Reykjavik University, Reykjavík, Iceland
| |
Collapse
|
19
|
Sahlin K, Medvedev P. Error correction enables use of Oxford Nanopore technology for reference-free transcriptome analysis. Nat Commun 2021; 12:2. [PMID: 33397972 PMCID: PMC7782715 DOI: 10.1038/s41467-020-20340-8] [Citation(s) in RCA: 96] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Accepted: 11/25/2020] [Indexed: 01/24/2023] Open
Abstract
Oxford Nanopore (ONT) is a leading long-read technology which has been revolutionizing transcriptome analysis through its capacity to sequence the majority of transcripts from end-to-end. This has greatly increased our ability to study the diversity of transcription mechanisms such as transcription initiation, termination, and alternative splicing. However, ONT still suffers from high error rates which have thus far limited its scope to reference-based analyses. When a reference is not available or is not a viable option due to reference-bias, error correction is a crucial step towards the reconstruction of the sequenced transcripts and downstream sequence analysis of transcripts. In this paper, we present a novel computational method to error correct ONT cDNA sequencing data, called isONcorrect. IsONcorrect is able to jointly use all isoforms from a gene during error correction, thereby allowing it to correct reads at low sequencing depths. We are able to obtain a median accuracy of 98.9-99.6%, demonstrating the feasibility of applying cost-effective cDNA full transcript length sequencing for reference-free transcriptome analysis.
Collapse
Affiliation(s)
- Kristoffer Sahlin
- Department of Mathematics, Science for Life Laboratory, Stockholm University, 106 91, Stockholm, Sweden
| | - Paul Medvedev
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA.
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA.
- Center for Computational Biology and Bioinformatics, The Pennsylvania State University, University Park, PA, USA.
| |
Collapse
|
20
|
Hayrabedyan S, Kostova P, Zlatkov V, Todorova K. Single-cell transcriptomics in the context of long-read nanopore sequencing. BIOTECHNOL BIOTEC EQ 2021. [DOI: 10.1080/13102818.2021.1988868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022] Open
Affiliation(s)
- Soren Hayrabedyan
- Laboratory of Reproductive OMICs Technologies, Institute of Biology and Immunology of Reproduction, Bulgarian Academy of Sciences, Sofia, Bulgaria
| | - Petya Kostova
- Gynecology Clinic, National Oncology Hospital, Sofia, Bulgaria
| | - Viktor Zlatkov
- Department of Obstetrics and Gynecology, Faculty of Medicine, Medical University of Sofia, Sofia, Bulgaria
| | - Krassimira Todorova
- Laboratory of Reproductive OMICs Technologies, Institute of Biology and Immunology of Reproduction, Bulgarian Academy of Sciences, Sofia, Bulgaria
| |
Collapse
|
21
|
Zhang H, Jain C, Aluru S. A comprehensive evaluation of long read error correction methods. BMC Genomics 2020; 21:889. [PMID: 33349243 PMCID: PMC7751105 DOI: 10.1186/s12864-020-07227-0] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2020] [Accepted: 11/12/2020] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Third-generation single molecule sequencing technologies can sequence long reads, which is advancing the frontiers of genomics research. However, their high error rates prohibit accurate and efficient downstream analysis. This difficulty has motivated the development of many long read error correction tools, which tackle this problem through sampling redundancy and/or leveraging accurate short reads of the same biological samples. Existing studies to asses these tools use simulated data sets, and are not sufficiently comprehensive in the range of software covered or diversity of evaluation measures used. RESULTS In this paper, we present a categorization and review of long read error correction methods, and provide a comprehensive evaluation of the corresponding long read error correction tools. Leveraging recent real sequencing data, we establish benchmark data sets and set up evaluation criteria for a comparative assessment which includes quality of error correction as well as run-time and memory usage. We study how trimming and long read sequencing depth affect error correction in terms of length distribution and genome coverage post-correction, and the impact of error correction performance on an important application of long reads, genome assembly. We provide guidelines for practitioners for choosing among the available error correction tools and identify directions for future research. CONCLUSIONS Despite the high error rate of long reads, the state-of-the-art correction tools can achieve high correction quality. When short reads are available, the best hybrid methods outperform non-hybrid methods in terms of correction quality and computing resource usage. When choosing tools for use, practitioners are suggested to be careful with a few correction tools that discard reads, and check the effect of error correction tools on downstream analysis. Our evaluation code is available as open-source at https://github.com/haowenz/LRECE .
Collapse
Affiliation(s)
- Haowen Zhang
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30332, GA, USA
| | - Chirag Jain
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30332, GA, USA
| | - Srinivas Aluru
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30332, GA, USA. .,Institute for Data Engineering and Science, Georgia Institute of Technology, Atlanta, 30332, GA, USA.
| |
Collapse
|
22
|
Abstract
Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.
Collapse
Affiliation(s)
- Steve S Ho
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Alexander E Urban
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Ryan E Mills
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
| |
Collapse
|
23
|
Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol 2020; 21:30. [PMID: 32033565 PMCID: PMC7006217 DOI: 10.1186/s13059-020-1935-5] [Citation(s) in RCA: 918] [Impact Index Per Article: 183.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2019] [Accepted: 01/15/2020] [Indexed: 12/11/2022] Open
Abstract
Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characteristics of long-read data are thus required, but the fast pace of development of such tools can be overwhelming. To assist in the design and analysis of long-read sequencing projects, we review the current landscape of available tools and present an online interactive database, long-read-tools.org, to facilitate their browsing. We further focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis and highlight the challenges that remain.
Collapse
Affiliation(s)
- Shanika L. Amarasinghe
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Shian Su
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Xueyi Dong
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Luke Zappia
- Bioinformatics, Murdoch Children’s Research Institute, Parkville, 3052 Australia
- School of Biosciences, Faculty of Science, The University of Melbourne, Parkville, 3010 Australia
| | - Matthew E. Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
- School of Mathematics and StatisticsThe University of Melbourne, Parkville, 3010 Australia
| | - Quentin Gouil
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| |
Collapse
|
24
|
Marchet C, Morisse P, Lecompte L, Lefebvre A, Lecroq T, Peterlongo P, Limasset A. ELECTOR: evaluator for long reads correction methods. NAR Genom Bioinform 2019; 2:lqz015. [PMID: 33575566 PMCID: PMC7671326 DOI: 10.1093/nargab/lqz015] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2019] [Revised: 09/24/2019] [Accepted: 10/16/2019] [Indexed: 12/19/2022] Open
Abstract
The error rates of third-generation sequencing data have been capped >5%, mainly containing insertions and deletions. Thereby, an increasing number of diverse long reads correction methods have been proposed. The quality of the correction has huge impacts on downstream processes. Therefore, developing methods allowing to evaluate error correction tools with precise and reliable statistics is a crucial need. These evaluation methods rely on costly alignments to evaluate the quality of the corrected reads. Thus, key features must allow the fast comparison of different tools, and scale to the increasing length of the long reads. Our tool, ELECTOR, evaluates long reads correction and is directly compatible with a wide range of error correction tools. As it is based on multiple sequence alignment, we introduce a new algorithmic strategy for alignment segmentation, which enables us to scale to large instances using reasonable resources. To our knowledge, we provide the unique method that allows producing reproducible correction benchmarks on the latest ultra-long reads (>100 k bases). It is also faster than the current state-of-the-art on other datasets and provides a wider set of metrics to assess the read quality improvement after correction. ELECTOR is available on GitHub (https://github.com/kamimrcht/ELECTOR) and Bioconda.
Collapse
Affiliation(s)
- Camille Marchet
- Univ Rennes, CNRS, Inria, IRISA-UMR 6074, F-35000 Rennes, France.,Univ. Lille, CNRS, UMR 9189 - CRIStAL, 59655 Villeneuve-d'Ascq, France
| | - Pierre Morisse
- Normandie Université, UNIROUEN, INSA Rouen, LITIS, 76000 Rouen, France
| | - Lolita Lecompte
- Univ Rennes, CNRS, Inria, IRISA-UMR 6074, F-35000 Rennes, France
| | | | | | | | - Antoine Limasset
- Univ. Lille, CNRS, UMR 9189 - CRIStAL, 59655 Villeneuve-d'Ascq, France
| |
Collapse
|