1
|
Zhao S, Macakova K, Sinson JC, Dai H, Rosenfeld J, Zapata GE, Li S, Ward PA, Wang C, Qu C, Maywald B, Lee B, Eng C, Liu P. Clinical validation of RNA sequencing for Mendelian disorder diagnostics. Am J Hum Genet 2025; 112:779-792. [PMID: 40043707 DOI: 10.1016/j.ajhg.2025.02.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Revised: 02/06/2025] [Accepted: 02/06/2025] [Indexed: 03/12/2025] Open
Abstract
Despite rapid advancements in clinical sequencing, over half of diagnostic evaluations still lack definitive results. RNA sequencing (RNA-seq) has shown promise in research settings for bridging this gap by providing essential functional data for accurate interpretation of diagnostic sequencing results. However, despite advanced research pipelines, clinical translation of diagnostic RNA-seq has not yet been realized. We have developed and validated a clinical diagnostic RNA-seq test for individuals with suspected genetic disorders who have existing or concurrent comprehensive DNA diagnostic testing. This diagnostic RNA-seq test processes RNA samples from fibroblasts or blood and derives clinical interpretations based on the analytical detection of outliers in gene expressions and splicing patterns. The clinical validation involves 130 samples, including 90 negative and 40 positive samples. We developed provisional expression and splicing benchmarks using short-read and long-read RNA-seq data from the GM24385 lymphoblastoid sample produced by the Genome in a Bottle Consortium. For clinical validation, we first established reference ranges for each gene and junction based on expression distributions from our control data. We then evaluated the clinical performance of our outlier-based pipeline using 40 positive samples with previously identified diagnostic findings from the Undiagnosed Diseases Network project. Our study provides a paradigm and necessary resources for independent laboratories to validate a clinical RNA-seq test.
Affiliation(s)
- Sen Zhao
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Kristina Macakova
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA; Graduate Program in Diagnostic Genetics and Genomics, The University of Texas MD Anderson Cancer Center School of Health Professions, Houston, TX 77030, USA
| | - Jefferson C Sinson
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Hongzheng Dai
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Baylor Genetics, Houston, TX 77021, USA
| | - Jill Rosenfeld
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Gladys E Zapata
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Shenglan Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Patricia A Ward
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Christiana Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | | | - Becky Maywald
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Brendan Lee
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA
| | - Christine Eng
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA; Baylor Genetics, Houston, TX 77021, USA
| | - Pengfei Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Genetics and Multiomics Laboratory, Baylor College of Medicine, Houston, TX 77030, USA; Baylor Genetics, Houston, TX 77021, USA.
|
2
|
Riedl M, Ruggeri C, Marx N, Borth N. Fantastic genes and where to find them expressed in CHO. Comput Struct Biotechnol J 2025; 27:1407-1415. [PMID: 40242293 PMCID: PMC12002940 DOI: 10.1016/j.csbj.2025.03.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2024] [Revised: 03/27/2025] [Accepted: 03/31/2025] [Indexed: 04/18/2025] Open
Abstract
The transcriptome of Chinese hamster ovary (CHO) cells plays a crucial role in determining cellular characteristics that are essential for biopharmaceutical applications. RNA-sequencing has been extensively used to profile gene expression patterns, aiming to gain a better understanding of intracellular behavior and mechanisms. Individual datasets, however, do not provide a comprehensive overview and characterization of the CHO cell's transcriptome, such that the fundamental structure of the transcriptome remains unknown. Using 15 RNA-sequencing datasets, encompassing almost 300 samples of various experimental setups, conditions and cell lines, we explore and classify the protein-coding transcriptome of CHO cells. Differences in cell line lineages are found to be the primary source of variation in transcribed genes. By employing a novel approach, we identified the core transcriptome that is ubiquitously expressed in all cell lines and culture conditions, as well as genes that remain entirely non-expressed. Additionally, we identified a set of genes that may be active or inactive depending on different conditions, which are linked to biological processes including translation as well as immune and stress response. Lastly, by integrating chromatin states derived from histone modifications, we provided additional context on the epigenetic level that supports our protein-coding gene classification. Our study offers a comprehensive insight into the CHO cell transcriptome and lays the foundation for future research into cellular adaptation to changing conditions and the development of phenotypes.
Affiliation(s)
- Markus Riedl
- Department of Biotechnology, BOKU University, Vienna, Austria
| | | | - Nicolas Marx
- Department of Biotechnology, BOKU University, Vienna, Austria
| | - Nicole Borth
- Department of Biotechnology, BOKU University, Vienna, Austria
|
3
|
Vasquez-Velez L, D'Mello V, Soteropoulos P. RNA Sequencing Protocols for Short-Read Sequencing. Methods Mol Biol 2025; 2866:125-158. [PMID: 39546201 DOI: 10.1007/978-1-0716-4192-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
RNA sequencing (RNA-seq) methodologies allow the discovery of novel variants and transcripts. These comprise three general steps: (1) capture of RNA species of interest, (2) conversion of RNA to complementary DNA (cDNA), and (3) modification of cDNA to fit the sequencing platform. Here we describe four different library preparation protocols for short-read sequencing: cDNA synthesis with poly(A) selection, library preparation with ribosomal depletion, and cDNA synthesis with SMART® (Switching Mechanism at 5' end of RNA Template) technology for low and Pico inputs.
Affiliation(s)
| | - Veera D'Mello
- Genomics Center, Rutgers New Jersey Medical School, Newark, NJ, USA
| | - Patricia Soteropoulos
- Department of Microbiology, Biochemistry, and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, USA.
|
4
|
Huang H, Zhang M, Lu H, Chen Y, Sun W, Zhu J, Chen Z. Identification and evaluation of plasma exosome RNA biomarkers for non-invasive diagnosis of hepatocellular carcinoma using RNA-seq. BMC Cancer 2024; 24:1552. [PMID: 39696145 DOI: 10.1186/s12885-024-13332-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Accepted: 12/11/2024] [Indexed: 12/20/2024] Open
Abstract
BACKGROUND Non-invasive diagnostic methods, including medical imaging techniques and blood biomarkers such as alpha-fetoprotein (AFP), have been crucial in detecting hepatocellular carcinoma (HCC). However, imaging techniques are only effective for tumor size larger than 2 cm. AFP measurement remains unsatisfactory due to high rate of misdiagnosis and underdiagnosis. Therefore, new reliable biomarkers and better non-invasive diagnostic approach are necessary for HCC identification. METHODS The differentially expressed genes were identified using multiple public RNA-seq data of liver tissues from healthy individuals and HCC patients including peritumoral and tumor tissues. The hub genes for HCC diagnosis were identified combining pathway enrichment analysis and protein-protein interaction network analysis. The performance of hub genes for non-invasive HCC diagnosis was analyzed in plasma of healthy individuals, HBV infected patients, and HCC patients based on exosomal RNA-seq data. A multi-layer perceptron (MLP) model based on exosomal hub genes was developed for non-invasive HCC diagnosis. RESULTS Through differential gene expression and pathway enrichment analysis on multiple public RNA-seq datasets, we first identified 30 dysregulated genes in HCC tissues. Protein-protein interaction analysis further narrowed down this list to 10 key genes: BRCA2, CDK1, MCM4, PLK1, DNA2, BLM, PCNA, POLD1, BRCA1 and FEN1. By further evaluation using additional public HCC tissue datasets, POLD1 and MCM4 were excluded from consideration as potential biomarkers due to their suboptimal performance. Notably, CDK1, FEN1, and PCNA gene were found to be significantly elevated in the plasma exosomes of HCC patients compared to non-HCC individuals, including those with HBV-infected hepatitis and healthy controls. 
The MLP model, based on three biomarkers, showed an area under the curve (AUC) of 0.85 and 0.84 in training and test dataset respectively, after adjusting for the covariates sex and age. CONCLUSION We identified three key genes, CDK1, FEN1, and PCNA, as exosomal biomarkers for non-invasive diagnosis of HCC. The MLP model utilizing three biomarkers showed good differentiation between non-HCC individuals and HCC patients, which exhibits promising potential as a non-invasive diagnostic tool for detecting HCC. Additional validation with a larger sample size is essential to thoroughly assess the reliability of the biomarkers and the model's performance.
Affiliation(s)
- Heqing Huang
- Infectious Disease Department, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
| | - Min Zhang
- BamRock Research Department, Suzhou BamRock Biotechnology Ltd., Suzhou, Jiangsu Province, China
| | - Hong Lu
- Infectious Disease Department, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
| | - Yiling Chen
- Infectious Disease Department, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
| | - Weijie Sun
- Ulink College of Shanghai, Shanghai, China
| | - Jinghan Zhu
- Infectious Disease Department, The Fourth Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China.
| | - Zutao Chen
- Infectious Disease Department, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China.
- MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Key Laboratory of Pathogen Bioscience and Anti-infective Medicine, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China.
- Infectious Disease Department, The Fourth Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China.
|
5
|
Li G, Schnell D, Bhattacharjee A, Yarmarkovich M, Salomonis N. Quantifying tumor specificity using Bayesian probabilistic modeling for drug and immunotherapeutic target discovery. Cell Reports Methods 2024; 4:100900. [PMID: 39515334 PMCID: PMC11705768 DOI: 10.1016/j.crmeth.2024.100900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 07/16/2024] [Accepted: 10/17/2024] [Indexed: 11/16/2024]
Abstract
In diseases such as cancer, the design of new therapeutic strategies requires extensive, costly, and unfortunately sometimes deadly testing to reveal life threatening off-target effects. We hypothesized that the disease specificity of targets can be systematically learned for all genes by jointly evaluating complementary molecular measurements of healthy tissues using a hierarchical Bayesian modeling approach. Our method, BayesTS, integrates protein and gene expression evidence and includes tunable parameters to moderate tissue essentiality. Applied to all protein coding genes, BayesTS outperforms alternative strategies to define therapeutic targets and nominates previously unknown targets while allowing for incorporation of new types of modalities. To expand target repertoires, we show that extension of BayesTS to splicing antigens and combinatorial target pairs results in more specific targets for therapy. We expect that BayesTS will facilitate improved target prioritization for oncology drug development, ultimately leading to the discovery of more effective and safer treatments.
Affiliation(s)
- Guangyuan Li
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Department of Biomedical Informatics, College of Medicine, University of Cincinnati, Cincinnati, OH 45267, USA; Perlmutter Cancer Center, New York University Grossman School of Medicine, New York, NY, USA.
| | - Daniel Schnell
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Anukana Bhattacharjee
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Mark Yarmarkovich
- Perlmutter Cancer Center, New York University Grossman School of Medicine, New York, NY, USA
| | - Nathan Salomonis
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Department of Biomedical Informatics, College of Medicine, University of Cincinnati, Cincinnati, OH 45267, USA
|
6
|
Wang D, Liu Y, Zhang Y, Chen Q, Han Y, Hou W, Liu C, Yu Y, Li Z, Li Z, Zhao J, Shi L, Zheng Y, Li J, Zhang R. A real-world multi-center RNA-seq benchmarking study using the Quartet and MAQC reference materials. Nat Commun 2024; 15:6167. [PMID: 39039053 PMCID: PMC11263697 DOI: 10.1038/s41467-024-50420-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 07/02/2024] [Indexed: 07/24/2024] Open
Abstract
Translating RNA-seq into clinical diagnostics requires ensuring the reliability and cross-laboratory consistency of detecting clinically relevant subtle differential expressions, such as those between different disease subtypes or stages. As part of the Quartet project, we present an RNA-seq benchmarking study across 45 laboratories using the Quartet and MAQC reference samples spiked with ERCC controls. Based on multiple types of 'ground truth', we systematically assess the real-world RNA-seq performance and investigate the influencing factors involved in 26 experimental processes and 140 bioinformatics pipelines. Here we show greater inter-laboratory variations in detecting subtle differential expressions among the Quartet samples. Experimental factors including mRNA enrichment and strandedness, and each bioinformatics step, emerge as primary sources of variations in gene expression. We underscore the profound influence of experimental execution, and provide best practice recommendations for experimental designs, strategies for filtering low-expression genes, and the optimal gene annotation and analysis pipelines. In summary, this study lays the foundation for developing and quality control of RNA-seq for clinical diagnostic purposes.
Affiliation(s)
- Duo Wang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yuanfeng Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yanxi Han
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Cong Liu
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China
| | - Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ziyang Li
- Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan, PR China
| | - Ziqiang Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China
| | - Jiaxin Zhao
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, and Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, and Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China.
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China.
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China.
| | - Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, PR China.
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, PR China.
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, PR China.
|
7
|
Wulff JP, Hickner PV, Watson DW, Denning SS, Belikoff EJ, Scott MJ. Antennal transcriptome analysis reveals sensory receptors potentially associated with host detection in the livestock pest Lucilia cuprina. Parasit Vectors 2024; 17:308. [PMID: 39026238 PMCID: PMC11256703 DOI: 10.1186/s13071-024-06391-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 07/03/2024] [Indexed: 07/20/2024] Open
Abstract
BACKGROUND Lucilia cuprina (Wiedemann, 1830) (Diptera: Calliphoridae) is the main causative agent of flystrike of sheep in Australia and New Zealand. Female flies lay eggs in an open wound or natural orifice, and the developing larvae eat the host's tissues, a condition called myiasis. To improve our understanding of host-seeking behavior, we quantified gene expression in male and female antennae based on their behavior. METHODS A spatial olfactometer was used to evaluate the olfactory response of L. cuprina mated males and gravid females to fresh or rotting beef. Antennal RNA-Seq analysis was used to identify sensory receptors differentially expressed between groups. RESULTS Lucilia cuprina females were more attracted to rotten compared to fresh beef (> fivefold increase). However, males and some females did not respond to either type of beef. RNA-Seq analysis was performed on antennae dissected from attracted females, non-attracted females and males. Transcripts encoding sensory receptors from 11 gene families were identified above a threshold (≥ 5 transcript per million) including 49 ATP-binding cassette transporters (ABCs), two ammonium transporters (AMTs), 37 odorant receptors (ORs), 16 ionotropic receptors (IRs), 5 gustatory receptors (GRs), 22 odorant-binding proteins (OBPs), 9 CD36-sensory neuron membrane proteins (CD36/SNMPs), 4 chemosensory proteins (CSPs), 4 myeloid lipid-recognition (ML) and Niemann-Pick C2 disease proteins (ML/NPC2), 2 pickpocket receptors (PPKs) and 3 transient receptor potential channels (TRPs). Differential expression analyses identified sex-biased sensory receptors. CONCLUSIONS We identified sensory receptors that were differentially expressed between the antennae of both sexes and hence may be associated with host detection by female flies. The most promising for future investigations were as follows: an odorant receptor (LcupOR46) which is female-biased in L. cuprina and Cochliomyia hominivorax Coquerel, 1858; an ABC transporter (ABC G23.1) that was the sole sensory receptor upregulated in the antennae of females attracted to rotting beef compared to non-attracted females; a female-biased ammonia transporter (AMT_Rh50), which was previously associated with ammonium detection in Drosophila melanogaster Meigen, 1830. This is the first report suggesting a possible role for ABC transporters in L. cuprina olfaction and potentially in other insects.
Affiliation(s)
- Juan P Wulff
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA
| | - Paul V Hickner
- United States Department of Agriculture, Agricultural Research Service, Knipling-Bushland U.S. Livestock Insects Research Laboratory, 2700 Fredericksburg Road, Kerrville, TX, 78028-9184, USA
| | - David W Watson
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA
| | - Steven S Denning
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA
| | - Esther J Belikoff
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA
| | - Maxwell J Scott
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA.
|
8
|
Jackson DJ, Cerveau N, Posnien N. De novo assembly of transcriptomes and differential gene expression analysis using short-read data from emerging model organisms - a brief guide. Front Zool 2024; 21:17. [PMID: 38902827 PMCID: PMC11188175 DOI: 10.1186/s12983-024-00538-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 06/12/2024] [Indexed: 06/22/2024] Open
Abstract
Many questions in biology benefit greatly from the use of a variety of model systems. High-throughput sequencing methods have been a triumph in the democratization of diverse model systems. They allow for the economical sequencing of an entire genome or transcriptome of interest, and with technical variations can even provide insight into genome organization and the expression and regulation of genes. The analysis and biological interpretation of such large datasets can present significant challenges that depend on the 'scientific status' of the model system. While high-quality genome and transcriptome references are readily available for well-established model systems, the establishment of such references for an emerging model system often requires extensive resources such as finances, expertise and computation capabilities. The de novo assembly of a transcriptome represents an excellent entry point for genetic and molecular studies in emerging model systems as it can efficiently assess gene content while also serving as a reference for differential gene expression studies. However, the process of de novo transcriptome assembly is non-trivial, and as a rule must be empirically optimized for every dataset. For the researcher working with an emerging model system, and with little to no experience with assembling and quantifying short-read data from the Illumina platform, these processes can be daunting. In this guide we outline the major challenges faced when establishing a reference transcriptome de novo and we provide advice on how to approach such an endeavor. We describe the major experimental and bioinformatic steps, provide some broad recommendations and cautions for the newcomer to de novo transcriptome assembly and differential gene expression analyses. Moreover, we provide an initial selection of tools that can assist in the journey from raw short-read data to assembled transcriptome and lists of differentially expressed genes.
Affiliation(s)
- Daniel J Jackson
- University of Göttingen, Department of Geobiology, Goldschmidtstr.3, Göttingen, 37077, Germany.
| | - Nicolas Cerveau
- University of Göttingen, Department of Geobiology, Goldschmidtstr.3, Göttingen, 37077, Germany
| | - Nico Posnien
- University of Göttingen, Department of Developmental Biology, GZMB, Justus-Von-Liebig-Weg 11, Göttingen, 37077, Germany.
|
9
|
Sasseville M, Nguyen HDT, Drouin S, Bahadoor A. Production of Ochratoxin A and Citrinin and the Expression of Their Biosynthetic Genes from Penicillium verrucosum in Liquid Culture. ACS Omega 2024; 9:20368-20377. [PMID: 38737015 PMCID: PMC11080038 DOI: 10.1021/acsomega.4c00874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/04/2024] [Accepted: 04/05/2024] [Indexed: 05/14/2024]
Abstract
Penicillium verrucosum is a fungal pathogen capable of producing two mycotoxins of concern, ochratoxin A (OTA) and citrinin (CIT). The production profile of these two mycotoxins is not well understood but could help mitigate co-contamination in the food supply. As such, the production of OTA and CIT from P. verrucosum DAOMC 242724 was investigated under different growing conditions in liquid culture. We found that among the different liquid media chosen, liquid YES (yeast extract sucrose) medium induced the highest production of both OTA and CIT, when P. verrucosum DAOMC 242724 was cultured in stationary mode. Shake culture significantly reduced the amounts of OTA and CIT produced. Among all culture conditions tested, far greater amounts of CIT were produced compared to OTA. Consequently, upon transcriptomic data analysis, a statistically significant increase in the expression of CIT biosynthetic genes was easier to detect than the expression of OTA biosynthetic genes. Our study also revealed that the putative biosynthetic gene clusters of OTA and CIT in P. verrucosum DAOMC 242724 are likely distinct from each other. It appears that despite sharing a highly similar structure, the isocoumarin rings of OTA and CIT are each assembled by a specialized polyketide synthase enzyme. Our data identified a putative nonreducing polyketide synthase responsible for assembling the carbo-skeleton of CIT. In contrast, a highly reducing polyketide synthase appears to be involved in the biosynthesis of OTA.
Affiliation(s)
- Marc Sasseville
- Applied Genomics, Human Health Therapeutics, National Research Council, 6100 Royalmount Ave, Montreal, Quebec H4P 2R2, Canada
| | - Hai D. T. Nguyen
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Ave, Ottawa, Ontario K1A 0C6, Canada
| | - Simon Drouin
- Applied Genomics, Human Health Therapeutics, National Research Council, 6100 Royalmount Ave, Montreal, Quebec H4P 2R2, Canada
| | - Adilah Bahadoor
- Metrology, National Research Council, 1200 Montreal Road, Ottawa, Ontario K1A 0R6, Canada
|
10
|
Karagianni K, Bibi A, Madé A, Acharya S, Parkkonen M, Barbalata T, Srivastava PK, de Gonzalo-Calvo D, Emanueli C, Martelli F, Devaux Y, Dafou D, Nossent AY, on behalf of EU-CardioRNA COST Action CA17129. Recommendations for detection, validation, and evaluation of RNA editing events in cardiovascular and neurological/neurodegenerative diseases. Molecular Therapy. Nucleic Acids 2024; 35:102085. [PMID: 38192612 PMCID: PMC10772297 DOI: 10.1016/j.omtn.2023.102085] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/10/2024]
Abstract
RNA editing, a common and potentially highly functional form of RNA modification, encompasses two different RNA modifications, namely adenosine to inosine (A-to-I) and cytidine to uridine (C-to-U) editing. As inosines are interpreted as guanosines by the cellular machinery, both A-to-I and C-to-U editing change the nucleotide sequence of the RNA. Editing events in coding sequences have the potential to change the amino acid sequence of proteins, whereas editing events in noncoding RNAs can, for example, affect microRNA target binding. With advancing RNA sequencing technology, more RNA editing events are being discovered, studied, and reported. However, RNA editing events are still often overlooked or discarded as sequence read quality defects. With this position paper, we aim to provide guidelines and recommendations for the detection, validation, and follow-up experiments to study RNA editing, taking examples from the fields of cardiovascular and brain disease. We discuss all steps, from sample collection, storage, and preparation, to different strategies for RNA sequencing and editing-sensitive data analysis strategies, to validation and follow-up experiments, as well as potential pitfalls and gaps in the available technologies. This paper may be used as an experimental guideline for RNA editing studies in any disease context.
Affiliation(s)
- Korina Karagianni
- Department of Genetics, Development, and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
| | - Alessia Bibi
- Molecular Cardiology Laboratory, IRCCS Policlinico San Donato, Via Morandi 30, San Donato Milanese, 20097 Milan, Italy
- Department of Biosciences, University of Milan, Milan, Italy
| | - Alisia Madé
- Molecular Cardiology Laboratory, IRCCS Policlinico San Donato, Via Morandi 30, San Donato Milanese, 20097 Milan, Italy
| | - Shubhra Acharya
- Cardiovascular Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
- Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-alzette, Luxembourg
| | - Mikko Parkkonen
- Research Unit of Biomedicine and Internal Medicine, Department of Pharmacology and Toxicology, University of Oulu, Oulu, Finland
| | - Teodora Barbalata
- Lipidomics Department, Institute of Cellular Biology and Pathology “Nicolae Simionescu” of the Romanian Academy, 8, B. P. Hasdeu Street, 050568 Bucharest, Romania
| | | | - David de Gonzalo-Calvo
- Translational Research in Respiratory Medicine, University Hospital Arnau de Vilanova and Santa Maria, IRBLleida, Lleida, Spain
- CIBER of Respiratory Diseases (CIBERES), Institute of Health Carlos III, Madrid, Spain
| | | | - Fabio Martelli
- Molecular Cardiology Laboratory, IRCCS Policlinico San Donato, Via Morandi 30, San Donato Milanese, 20097 Milan, Italy
| | - Yvan Devaux
- Cardiovascular Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
| | - Dimitra Dafou
- Department of Genetics, Development, and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
| | - A. Yaël Nossent
- Department of Surgery, Leiden University Medical Center, Leiden, the Netherlands
- Department of Nutrition, Exercise and Sports (NEXS), University of Copenhagen, Copenhagen, Denmark
| | - on behalf of EU-CardioRNA COST Action CA17129
- Department of Genetics, Development, and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
- Molecular Cardiology Laboratory, IRCCS Policlinico San Donato, Via Morandi 30, San Donato Milanese, 20097 Milan, Italy
- Department of Biosciences, University of Milan, Milan, Italy
- Cardiovascular Research Unit, Luxembourg Institute of Health, Strassen, Luxembourg
- Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-alzette, Luxembourg
- Research Unit of Biomedicine and Internal Medicine, Department of Pharmacology and Toxicology, University of Oulu, Oulu, Finland
- Lipidomics Department, Institute of Cellular Biology and Pathology “Nicolae Simionescu” of the Romanian Academy, 8, B. P. Hasdeu Street, 050568 Bucharest, Romania
- National Heart & Lung Institute, Imperial College London, London, UK
- Translational Research in Respiratory Medicine, University Hospital Arnau de Vilanova and Santa Maria, IRBLleida, Lleida, Spain
- CIBER of Respiratory Diseases (CIBERES), Institute of Health Carlos III, Madrid, Spain
- Department of Surgery, Leiden University Medical Center, Leiden, the Netherlands
- Department of Nutrition, Exercise and Sports (NEXS), University of Copenhagen, Copenhagen, Denmark
|
11
|
Cheng S, You Y, Wang X, Yi C, Zhang W, Xie Y, Xiu L, Luo F, Lu Y, Wang J, Hu W. Dynamic profiles of lncRNAs reveal a functional natural antisense RNA that regulates the development of Schistosoma japonicum. PLoS Pathog 2024; 20:e1011949. [PMID: 38285715 PMCID: PMC10878521 DOI: 10.1371/journal.ppat.1011949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 02/20/2024] [Accepted: 01/06/2024] [Indexed: 01/31/2024] Open
Abstract
Schistosomes are flatworm parasites that undergo a complex life cycle involving two hosts. The regulation of the parasite's developmental processes relies on both coding RNAs and non-coding RNAs. However, the roles of non-coding RNAs, including long non-coding RNAs (lncRNAs) in schistosomes remain largely unexplored. Here we conduct advanced RNA sequencing on male and female S. japonicum during their pairing and reproductive development, resulting in the identification of nearly 8,000 lncRNAs. This extensive dataset enables us to construct a comprehensive co-expression network of lncRNAs and mRNAs, shedding light on their interactions during the crucial reproductive stages within the mammalian host. Importantly, we have also revealed a specific lncRNA, LNC3385, which appears to play a critical role in the survival and reproduction of the parasite. These findings not only enhance our understanding of the dynamic nature of lncRNAs during the reproductive phase of schistosomes but also highlight LNC3385 as a potential therapeutic target for combating schistosomiasis.
Affiliation(s)
- Shaoyun Cheng
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Yanmin You
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Xiaoling Wang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Cun Yi
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Wei Zhang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Yuxiang Xie
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Lei Xiu
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Fang Luo
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Yan Lu
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Jipeng Wang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Wei Hu
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Center for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, China
- College of Life Sciences, Inner Mongolia University, Hohhot, Inner Mongolia Autonomous Region, China
- Department of Infectious Diseases, Huashan Hospital, Fudan University, Shanghai, China
|
12
|
Zong L, Zhu Y, Jiang Y, Xia Y, Liu Q, Jiang S. A comprehensive assessment of exome capture methods for RNA sequencing of formalin-fixed and paraffin-embedded samples. BMC Genomics 2023; 24:777. [PMID: 38102591 PMCID: PMC10722801 DOI: 10.1186/s12864-023-09886-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 12/08/2023] [Indexed: 12/17/2023] Open
Abstract
RNA-Seq analysis of Formalin-Fixed and Paraffin-Embedded (FFPE) samples has emerged as a highly effective approach and is increasingly being used in clinical research and drug development. However, the processing and storage of FFPE samples are known to cause extensive degradation of RNAs, which limits the discovery of gene expression or gene fusion-based biomarkers using RNA sequencing, particularly methods reliant on Poly(A) enrichment. Recently, researchers have developed an exome targeted RNA-Seq methodology that utilizes biotinylated oligonucleotide probes to enrich RNA transcripts of interest, which could overcome these limitations. Nevertheless, the standardization of this experimental framework, including probe designs, sample multiplexing, sequencing read length, and bioinformatic pipelines, remains an essential requirement. In this study, we conducted a comprehensive comparison of three main commercially available exome capture kits and evaluated key experimental parameters, to provide the overview of the advantages and limitations associated with the selection of library preparation protocols and sequencing platforms. The results provide valuable insights into the best practices for obtaining high-quality data from FFPE samples.
Affiliation(s)
- Liang Zong
- Wuhan BGI Technology Service Co., Ltd. BGI-Wuhan, Wuhan, China
- College of Life and Health Sciences, Wuhan University of Science and Technology, Wuhan, China
| | - Yabing Zhu
- BGI Tech Solutions Co., Ltd. BGI-Shenzhen, Shenzhen, China
| | - Yuan Jiang
- Wuhan BGI Technology Service Co., Ltd. BGI-Wuhan, Wuhan, China
| | - Ying Xia
- Wuhan BGI Technology Service Co., Ltd. BGI-Wuhan, Wuhan, China
| | - Qun Liu
- Wuhan BGI Technology Service Co., Ltd. BGI-Wuhan, Wuhan, China
| | - Sanjie Jiang
- BGI Tech Solutions Co., Ltd. BGI-Shenzhen, Shenzhen, China.
|
13
|
Schuster J, Ritchie ME, Gouil Q. Restrander: rapid orientation and artefact removal for long-read cDNA data. NAR Genom Bioinform 2023; 5:lqad108. [PMID: 38143957 PMCID: PMC10748469 DOI: 10.1093/nargab/lqad108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 11/07/2023] [Accepted: 12/14/2023] [Indexed: 12/26/2023] Open
Abstract
In transcriptomic analyses, it is helpful to keep track of the strand of the RNA molecules. However, the Oxford Nanopore long-read cDNA sequencing protocols generate reads that correspond to either the first or second-strand cDNA, therefore the strandedness of the initial transcript has to be inferred bioinformatically. Reverse transcription and PCR can also introduce artefacts which should be flagged in data pre-processing. Here we introduce Restrander, a lightning-fast and highly accurate tool for restranding and removing artefacts in long-read cDNA sequencing data. Thanks to its C++ implementation, Restrander was faster than Oxford Nanopore Technologies' existing tool Pychopper, and correctly restranded more reads due to its strategy of searching for polyA/T tails in addition to primer sequences from the reverse transcription and template-switch steps. We found that restranding improved the process of visualising and exploring data, and increased the number of novel isoforms discovered by bambu, particularly in regions where sense and anti-sense transcripts co-occur. The artefact detection implemented in Restrander quantifies reads lacking the correct 5' and 3' ends, a useful feature in quality control for library preparation. Restrander is pre-configured for all major cDNA protocols, and can be customised with user-defined primers. Restrander is available at https://github.com/mritchielab/restrander.
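As a rough illustration of the polyA/T part of the strategy this abstract describes, the following Python sketch infers read orientation from tail homopolymers only. It is not Restrander's implementation: the function names, the 12-base tail threshold, and the 50-base search window are assumptions made for this example, and the primer search the abstract also mentions is omitted.

```python
# Illustrative sketch only; names and thresholds are hypothetical,
# not taken from Restrander.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def infer_strand(read: str, min_tail: int = 12, window: int = 50) -> str:
    """Guess orientation from a polyA run near the 3' end or a polyT run
    near the 5' end. Returns "+", "-", or "?" when the evidence is
    absent or contradictory."""
    read = read.upper()
    has_poly_a = "A" * min_tail in read[-window:]
    has_poly_t = "T" * min_tail in read[:window]
    if has_poly_a and not has_poly_t:
        return "+"
    if has_poly_t and not has_poly_a:
        return "-"
    return "?"


def restrand(read: str) -> tuple[str, str]:
    """Flip reads inferred as antisense so they match the transcript strand."""
    strand = infer_strand(read)
    if strand == "-":
        return reverse_complement(read), strand
    return read.upper(), strand
```

Reads returned with "?" carry no usable tail signal under these assumed thresholds; a fuller tool would consult additional evidence, such as primer sequences, before classifying them.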
Affiliation(s)
- Jakob Schuster
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, VIC 3010, Australia
| | - Matthew E Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, VIC 3010, Australia
| | - Quentin Gouil
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, VIC 3010, Australia
|
14
|
Oreper D, Klaeger S, Jhunjhunwala S, Delamarre L. The peptide woods are lovely, dark and deep: Hunting for novel cancer antigens. Semin Immunol 2023; 67:101758. [PMID: 37027981 DOI: 10.1016/j.smim.2023.101758] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/31/2022] [Revised: 03/22/2023] [Accepted: 03/22/2023] [Indexed: 04/08/2023]
Abstract
Harnessing the patient's immune system to control a tumor is a proven avenue for cancer therapy. T cell therapies as well as therapeutic vaccines, which target specific antigens of interest, are being explored as treatments in conjunction with immune checkpoint blockade. For these therapies, selecting the best suited antigens is crucial. Most of the focus has thus far been on neoantigens that arise from tumor-specific somatic mutations. Although there is clear evidence that T-cell responses against mutated neoantigens are protective, the large majority of these mutations are not immunogenic. In addition, most somatic mutations are unique to each individual patient and their targeting requires the development of individualized approaches. Therefore, novel antigen types are needed to broaden the scope of such treatments. We review high throughput approaches for discovering novel tumor antigens and some of the key challenges associated with their detection, and discuss considerations when selecting tumor antigens to target in the clinic.
Affiliation(s)
- Daniel Oreper
- Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
| | - Susan Klaeger
- Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
|
15
|
Dehghanzad R, Khalafiyan A, Khanahmad H. The Necessity of Using Strand-Specific cDNA for Achieving Accurate Transcriptome Analysis Result. Adv Biomed Res 2023; 12:108. [PMID: 37288031 PMCID: PMC10241614 DOI: 10.4103/abr.abr_102_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Revised: 08/04/2022] [Accepted: 08/06/2022] [Indexed: 06/09/2023] Open
Affiliation(s)
- Reyhaneh Dehghanzad
- Department of Medical Genetics, Faculty of Medical Science, Tehran University of Medical Science, Tehran, Iran
| | - Anis Khalafiyan
- Department of Genetics and Molecular Biology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Hossein Khanahmad
- Department of Genetics and Molecular Biology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
|
16
|
Yang L, Huang L, Mu Y, Li K. Characterization of A-to-I Editing in Pigs under a Long-Term High-Energy Diet. Int J Mol Sci 2023; 24:ijms24097921. [PMID: 37175634 PMCID: PMC10178050 DOI: 10.3390/ijms24097921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 04/11/2023] [Accepted: 04/24/2023] [Indexed: 05/15/2023] Open
Abstract
Long-term high-energy intake has detrimental effects on pig health and elevates the risk of metabolic disease. RNA editing modifying RNA bases in a post-transcriptional process has been extensively studied for model animals. However, less evidence is available that RNA editing plays a role in the development of metabolic disorders. Here, we profiled the A-to-I editing in three tissues and six gut segments and characterized the functional aspect of editing sites in model pigs for metabolic disorders. We detected 64,367 non-redundant A-to-I editing sites across the pig genome, and 20.1% correlated with their located genes' expression. The largest number of A-to-I sites was found in the abdominal aorta with the highest editing levels. The significant difference in editing levels between high-energy induced and control pigs was detected in the abdominal aorta, testis, duodenum, ileum, colon, and cecum. We next focused on 6041 functional A-to-I sites that detected differences or specificity between treatments. We found functional A-to-I sites specifically involved in a tissue-specific manner. Two of them, located in gene SLA-DQB1 and near gene B4GALT5 were found to be shared by three tissues and six gut segments. Although we did not find them enriched in each of the gene features, in correlation analysis, we noticed that functional A-to-I sites were significantly enriched in gene 3'-UTRs. This result indicates, in general, A-to-I editing has the largest potential in the regulation of gene expression through changing the 3'-UTRs' sequence, which is functionally involved in pigs under a long-term high-energy diet. Our work provides valuable knowledge of A-to-I editing sites functionally involved in the development of the metabolic disorder.
Affiliation(s)
- Liu Yang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
| | - Lei Huang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
| | - Yulian Mu
- State Key Laboratory of Animal Nutrition and Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs of China, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Kui Li
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- State Key Laboratory of Animal Nutrition and Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs of China, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
|
17
|
Gimenez G, Stockwell PA, Rodger EJ, Chatterjee A. Strategy for RNA-Seq Experimental Design and Data Analysis. Methods Mol Biol 2023; 2588:249-278. [PMID: 36418693 DOI: 10.1007/978-1-0716-2780-8_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
Abstract
Ribonucleic acids (RNAs) are fundamental molecules that control regulation and expression of the genome and therefore the function of a cell. Robust analysis and quantification of RNA transcripts hold critical importance in understanding cell function, altered phenotypes in different biological context, for understanding and targeting diseases. The development of RNA-sequencing (RNA-Seq) now provides opportunities to analyze the expression and function of RNA molecules at an unprecedented scale. However, the strategy for RNA-Seq experimental design and data analysis can substantially differ depending on the biological application. The design choice could also have significant impact for downstream results and interpretation of data. Here we describe key critical considerations required for RNA-Seq experimental design and also describe a step-by-step bioinformatics workflow detailing the different steps required for RNA-Seq data analysis. We believe this article will be a valuable guide for designing and analyzing RNA-Seq data to address a wide range of different biological questions.
Affiliation(s)
- Gregory Gimenez
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand.
| | - Peter A Stockwell
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
| | - Euan J Rodger
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
| | - Aniruddha Chatterjee
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- UPES University, School of Health Sciences, Dehradun, India
|
18
|
Wang T, Shen P, Chai R, He Y, Liu J. Profiling of bacterial transcriptome from ultra-low input with MiniBac-seq. Environ Microbiol 2022; 24:5774-5787. [PMID: 36053758 DOI: 10.1111/1462-2920.16169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Accepted: 08/10/2022] [Indexed: 01/12/2023]
Abstract
There is a lack of appropriate methods for preparing bacterial RNA-seq library with ultra-low amount of RNA. To address this issue, we developed miniBac-seq, a strand-specific method for high-quality library construction from sub-nanogram of total RNA, which is 100-fold lower than the current benchmark kit and dramatically reduces preparation cost ($28 + $15 × samples). We further demonstrated the high sensitivity of miniBac-seq via detecting more than 500 genes from amount of total RNA equivalent to that of a single bacterial cell. Finally, we profiled the transcriptome of growth-arrested bacteria in isogenic culture of Escherichia coli. This subpopulation of bacteria is generally low in abundance but is a potent reservoir of antibiotic persistence, and their gene expression has been largely unknown due to technical limitations. Using miniBac-seq, we identified potential molecular driver towards arrested growth as well as antibiotic tolerance. Our method thus expands the capacity to quantify bacterial transcriptome in situ, which is useful to the understanding of bacterial physiology and regulation in their native contexts.
Affiliation(s)
- Tianmin Wang
- Center for Infectious Disease Research, School of Medicine, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Ping Shen
- Center for Infectious Disease Research, School of Medicine, Tsinghua University, Beijing, China
| | - Ruochen Chai
- Center for Infectious Disease Research, School of Medicine, Tsinghua University, Beijing, China
| | - Yihui He
- Center for Infectious Disease Research, School of Medicine, Tsinghua University, Beijing, China
| | - Jintao Liu
- Center for Infectious Disease Research, School of Medicine, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
|
19
|
Salmen F, De Jonghe J, Kaminski TS, Alemany A, Parada GE, Verity-Legg J, Yanagida A, Kohler TN, Battich N, van den Brekel F, Ellermann AL, Arias AM, Nichols J, Hemberg M, Hollfelder F, van Oudenaarden A. High-throughput total RNA sequencing in single cells using VASA-seq. Nat Biotechnol 2022; 40:1780-1793. [PMID: 35760914 PMCID: PMC9750877 DOI: 10.1038/s41587-022-01361-8] [Citation(s) in RCA: 113] [Impact Index Per Article: 37.7] [Received: 11/20/2021] [Accepted: 05/13/2022] [Indexed: 01/14/2023]
Abstract
Most methods for single-cell transcriptome sequencing amplify the termini of polyadenylated transcripts, capturing only a small fraction of the total cellular transcriptome. This precludes the detection of many long non-coding, short non-coding and non-polyadenylated protein-coding transcripts and hinders alternative splicing analysis. We, therefore, developed VASA-seq to detect the total transcriptome in single cells, which is enabled by fragmenting and tailing all RNA molecules subsequent to cell lysis. The method is compatible with both plate-based formats and droplet microfluidics. We applied VASA-seq to more than 30,000 single cells in the developing mouse embryo during gastrulation and early organogenesis. Analyzing the dynamics of the total single-cell transcriptome, we discovered cell type markers, many based on non-coding RNA, and performed in vivo cell cycle analysis via detection of non-polyadenylated histone genes. RNA velocity characterization was improved, accurately retracing blood maturation trajectories. Moreover, our VASA-seq data provide a comprehensive analysis of alternative splicing during mammalian development, which highlighted substantial rearrangements during blood development and heart morphogenesis.
Affiliation(s)
- Fredrik Salmen
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands
- Oncode Institute, Utrecht, Netherlands
| | - Joachim De Jonghe
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Francis Crick Institute, London, UK
| | - Tomasz S Kaminski
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Warsaw, Poland
| | - Anna Alemany
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands
- Oncode Institute, Utrecht, Netherlands
| | | | - Joe Verity-Legg
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands
- Oncode Institute, Utrecht, Netherlands
| | - Ayaka Yanagida
- Division of Stem Cell Therapy, Center for Stem Cell Biology and Regenerative Medicine, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| | - Timo N Kohler
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Wellcome Trust - Medical Research Council Stem Cell Institute, University of Cambridge, Jeffrey Cheah Biomedical Centre, Cambridge, UK
| | - Nicholas Battich
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands
- Oncode Institute, Utrecht, Netherlands
| | - Floris van den Brekel
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands
- Oncode Institute, Utrecht, Netherlands
| | - Anna L Ellermann
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Alfonso Martinez Arias
- Systems Bioengineering, DCEXS, Universidad Pompeu Fabra, Doctor Aiguader 88 ICREA (Institució Catalana de Recerca i Estudis Avançats), Barcelona, Spain
| | - Jennifer Nichols
- Wellcome Trust - Medical Research Council Stem Cell Institute, University of Cambridge, Jeffrey Cheah Biomedical Centre, Cambridge, UK
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
| | - Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA
| | | | - Alexander van Oudenaarden
- Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center, Utrecht, Netherlands.
- Oncode Institute, Utrecht, Netherlands.
|
20
|
Pan-cancer identification of the relationship of metabolism-related differentially expressed transcription regulation with non-differentially expressed target genes via a gated recurrent unit network. Comput Biol Med 2022; 148:105883. [PMID: 35878490 DOI: 10.1016/j.compbiomed.2022.105883] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/06/2022] [Revised: 07/10/2022] [Accepted: 07/16/2022] [Indexed: 11/20/2022]
Abstract
The transcriptome describes the expression of all genes in a sample. Most studies have investigated the differential patterns or discrimination powers of transcript expression levels. In this study, we hypothesized that the quantitative correlations between the expression levels of transcription factors (TFs) and their regulated target genes (mRNAs) serve as a novel view of healthy status, and a disease sample exhibits a differential landscape (mqTrans) of transcription regulations compared with healthy status. We formulated quantitative transcription regulation relationships of metabolism-related genes as a multi-input multi-output regression model via a gated recurrent unit (GRU) network. The GRU model was trained using healthy blood transcriptomes and the expression levels of mRNAs were predicted by those of the TFs. The mqTrans feature of a gene was defined as the difference between its predicted and actual expression levels. A pan-cancer investigation of the differentially expressed mqTrans features was conducted between the early- and late-stage cancers in 26 cancer types of The Cancer Genome Atlas database. This study focused on the differentially expressed mqTrans features, that did not show differential expression in the actual expression levels. These genes could not be detected by conventional differential analysis. Such dark biomarkers are worthy of further wet-lab investigation. The experimental data also showed that the proposed mqTrans investigation improved the classification between early- and late-stage samples for some cancer types. Thus, the mqTrans features serve as a complementary view to transcriptomes, an OMIC type with mature high-throughput production technologies, and abundant public resources.
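The mqTrans definition in this abstract (predicted minus actual expression of a target gene) can be illustrated with a short Python sketch. The sign convention, the gene and TF names, and the hand-written linear predictor are all assumptions for this example; the predictor merely stands in for the trained GRU model, which the abstract does not specify in detail.

```python
# Illustrative sketch only. `toy_predict` is a hypothetical stand-in for a
# model trained on healthy samples; it is NOT the GRU network from the study.

def toy_predict(tf_levels: dict[str, float]) -> dict[str, float]:
    """Predict target-gene expression from TF expression (made-up weights)."""
    return {
        "GENE_X": 2.0 * tf_levels["TF_A"] + 1.0,
        "GENE_Y": tf_levels["TF_A"] + tf_levels["TF_B"],
    }


def mqtrans_features(predict, tf_levels, actual_mrna):
    """Return predicted-minus-actual expression for each target gene.

    Values near zero suggest the sample follows the regulation pattern the
    predictor learned; large magnitudes suggest altered regulation."""
    predicted = predict(tf_levels)
    return {gene: predicted[gene] - actual_mrna[gene] for gene in actual_mrna}


sample_tfs = {"TF_A": 1.5, "TF_B": 0.5}
sample_mrna = {"GENE_X": 2.5, "GENE_Y": 2.0}
features = mqtrans_features(toy_predict, sample_tfs, sample_mrna)
```

In this toy sample, GENE_X deviates from its predicted level while GENE_Y does not, so only GENE_X would be a candidate for downstream comparison between sample groups.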
|
21
|
Choi G, Jeon J, Lee H, Zhou S, Lee YH. Genome-wide profiling of long non-coding RNA of the rice blast fungus Magnaporthe oryzae during infection. BMC Genomics 2022; 23:132. [PMID: 35168559 PMCID: PMC8845233 DOI: 10.1186/s12864-022-08380-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/02/2021] [Accepted: 02/09/2022] [Indexed: 12/05/2022] Open
Abstract
Background: Long non-coding RNAs (lncRNAs) play essential roles in developmental processes and disease development at the transcriptional and post-transcriptional levels across diverse taxa. However, only a few studies have profiled fungal lncRNAs in a genome-wide manner during host infection. Results: Infection-associated lncRNAs were identified using lncRNA profiling over six stages of host infection (e.g., vegetative growth, pre-penetration, biotrophic, and necrotrophic stages) in the model pathogenic fungus, Magnaporthe oryzae. We identified 2,601 novel lncRNAs, including 1,286 antisense lncRNAs and 980 intergenic lncRNAs. Among the identified lncRNAs, 755 were expressed in a stage-specific manner and 560 were infection-specifically expressed lncRNAs (ISELs). To decipher the potential roles of lncRNAs during infection, we identified 365 protein-coding genes that were associated with 214 ISELs. Analysis of the predicted functions of these associated genes suggested that lncRNAs regulate pathogenesis-related genes, including xylanases and effectors. Conclusions: The ISELs and their associated genes provide a comprehensive view of lncRNAs during fungal pathogen-plant interactions. This study provides new insights into the role of lncRNAs in the rice blast fungus, as well as in other plant pathogenic fungi. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08380-4.
Affiliation(s)
- Gobong Choi
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Jongbum Jeon
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Plant Immunity Research Center, Seoul National University, Seoul, 08826, Korea
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Korea
- Hyunjun Lee
- Department of Agricultural Biotechnology, Seoul National University, Seoul, 08826, Korea
- Shenxian Zhou
- Department of Agricultural Biotechnology, Seoul National University, Seoul, 08826, Korea
- Yong-Hwan Lee
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Plant Immunity Research Center, Seoul National University, Seoul, 08826, Korea
- Department of Agricultural Biotechnology, Seoul National University, Seoul, 08826, Korea
- Center for Plant Microbiome Research, Center for Fungal Genetic Resources, Plant Genomics and Breeding Institute, and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
|
22
|
Systematic comparative analysis of strand-specific RNA-seq library preparation methods for low input samples. Sci Rep 2022; 12:1789. [PMID: 35110572 PMCID: PMC8810888 DOI: 10.1038/s41598-021-04583-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/02/2021] [Accepted: 12/22/2021] [Indexed: 02/07/2023] Open
Abstract
Despite the recent precipitous decline in the cost of genome sequencing, library preparation for RNA-seq is still laborious and expensive for applications such as high throughput screening. Limited availability of RNA generated by some experimental workflows poses an additional challenge and increases the cost of RNA library preparation. In a search for low cost, automation-compatible RNA library preparation kits that maintain strand specificity and are amenable to low input RNA quantities, we systematically tested two recent commercial technologies, Swift RNA and Swift Rapid RNA (presently offered by Integrated DNA Technologies, IDT), alongside the Illumina TruSeq stranded mRNA kit, the de facto standard workflow for bulk transcriptomics. We used the Universal Human Reference RNA (UHRR), composed of equal quantities of total RNA from 10 human cancer cell lines, to benchmark gene expression in these kits at input quantities ranging from 10 to 500 ng. We found normalized read counts between all treatment groups to be in high agreement. Compared to the Illumina TruSeq stranded mRNA kit, both Swift RNA library kits offer shorter workflow times enabled by their patented Adaptase technology. We also found the Swift RNA kit to produce the fewest differentially expressed genes and pathways directly attributable to input mRNA amount.
|
23
|
Signal B, Kahlke T. how_are_we_stranded_here: quick determination of RNA-Seq strandedness. BMC Bioinformatics 2022; 23:49. [PMID: 35065593 PMCID: PMC8783475 DOI: 10.1186/s12859-022-04572-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/09/2021] [Accepted: 01/10/2022] [Indexed: 11/07/2022] Open
Abstract
Background: Quality control checks are the first step in RNA-Sequencing analysis and enable the identification of common issues in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to implement downstream steps that account for these issues. Strand-specificity of reads is frequently overlooked and is often unavailable even in published data, yet when unknown or incorrectly specified it can have detrimental effects on the reproducibility and accuracy of downstream analyses. Results: To address these issues, we developed how_are_we_stranded_here, a Python library that helps to quickly infer strandedness of paired-end RNA-Sequencing data. Testing on both simulated and real RNA-Sequencing reads showed that it correctly measures strandedness, and that measures outside the normal range may indicate sample contamination. Conclusions: how_are_we_stranded_here is fast and user friendly, making it easy to implement in quality control pipelines prior to analysing RNA-Sequencing data. how_are_we_stranded_here is freely available at https://github.com/betsig/how_are_we_stranded_here. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04572-7.
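The general idea of a strandedness check can be sketched as follows. This is an illustration only and does not reproduce the how_are_we_stranded_here API or its internal thresholds: the function name `classify_strandedness`, the cutoff of 0.9, and the 0.4 to 0.6 "unstranded" band are all hypothetical choices. The input is assumed to be two counts, the reads whose orientation matches the annotated gene strand and the reads whose orientation opposes it.

```python
def classify_strandedness(sense_reads, antisense_reads,
                          stranded_cutoff=0.9, unstranded_band=(0.4, 0.6)):
    """Classify a library from counts of sense and antisense reads.

    Thresholds are illustrative assumptions, not values from any tool.
    """
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("no reads overlap annotated genes")
    sense_fraction = sense_reads / total
    if sense_fraction >= stranded_cutoff:
        call = "forward-stranded"
    elif sense_fraction <= 1 - stranded_cutoff:
        call = "reverse-stranded"
    elif unstranded_band[0] <= sense_fraction <= unstranded_band[1]:
        call = "unstranded"
    else:
        # Fractions between the bands fit none of the expected protocols.
        call = "ambiguous"
    return sense_fraction, call


for counts in [(950, 50), (50, 950), (500, 500), (750, 250)]:
    print(counts, classify_strandedness(*counts))
```

The "ambiguous" outcome corresponds to the abstract's remark that measures outside the normal range may indicate sample contamination.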
|
24
|
Pavlovich PV, Cauchy P. Sequences to Differences in Gene Expression: Analysis of RNA-Seq Data. Methods Mol Biol 2022; 2508:279-318. [PMID: 35737247 DOI: 10.1007/978-1-0716-2376-3_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
RNA-Seq is now a routinely employed assay to measure gene expression. As the technique matured over the last decade, so have dedicated analytic tools. In this chapter, we first describe the mainstream as well as the most up-to-date protocols and their implications on downstream analysis. We then detail the steps entailing RNA-Seq analysis in three main stages: (i) preprocessing and data preparation, (ii) upstream processing, and (iii) high-level analyses. We review the most recent and relevant tools as one workflow following a stepwise order. The chapter further encompasses in-depth features of these tools. Details of the required code are made available throughout the chapter, as well as of the underlying statistics. We illustrate these steps with analysis of publicly available RNA-Seq data.
Affiliation(s)
- Pierre Cauchy
- Universitätsklinikum Freiburg, Freiburg, Germany
- Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
|
25
|
Abstract
Rhodopsins are light-activated proteins displaying an enormous versatility of function as cation/anion pumps or sensing environmental stimuli and are widely distributed across all domains of life. Even with wide sequence divergence and uncertain evolutionary linkages between microbial (type 1) and animal (type 2) rhodopsins, the membrane orientation of the core structural scaffold of both was presumed universal. This was recently amended through the discovery of heliorhodopsins (HeRs; type 3), that, in contrast to known rhodopsins, display an inverted membrane topology and yet retain similarities in sequence, structure, and the light-activated response. While no ion-pumping activity has been demonstrated for HeRs and multiple crystal structures are available, fundamental questions regarding their cellular and ecological function or even their taxonomic distribution remain unresolved. Here, we investigated HeR function and distribution using genomic/metagenomic data with protein domain fusions, contextual genomic information, and gene coexpression analysis with strand-specific metatranscriptomics. We bring to resolution the debated monoderm/diderm occurrence patterns and show that HeRs are restricted to monoderms. Moreover, we provide compelling evidence that HeRs are a novel type of sensory rhodopsins linked to histidine kinases and other two-component system genes across phyla. In addition, we also describe two novel putative signal-transducing domains fused to some HeRs. We posit that HeRs likely function as generalized light-dependent switches involved in the mitigation of light-induced oxidative stress and metabolic circuitry regulation. Their role as sensory rhodopsins is corroborated by their photocycle dynamics and their presence/function in monoderms is likely connected to the higher sensitivity of these organisms to light-induced damage. 
IMPORTANCE Heliorhodopsins are enigmatic, novel rhodopsins with a membrane orientation that is opposite to all known rhodopsins. However, their cellular and ecological functions are unknown, and even their taxonomic distribution remains a subject of debate. We provide evidence that HeRs are a novel type of sensory rhodopsins linked to histidine kinases and other two-component system genes across phyla boundaries. In support of this, we also identify two novel putative signal transducing domains in HeRs that are fused with them. We also observe linkages of HeRs to genes involved in mitigation of light-induced oxidative stress and increased carbon and nitrogen metabolism. Finally, we synthesize these findings into a framework that connects HeRs with the cellular response to light in monoderms, activating light-induced oxidative stress defenses along with carbon/nitrogen metabolic circuitries. These findings are consistent with the evolutionary, taxonomic, structural, and genomic data available so far.
|
26
|
Chakraborty A, Mahajan S, Jaiswal SK, Sharma VK. Genome sequencing of turmeric provides evolutionary insights into its medicinal properties. Commun Biol 2021; 4:1193. [PMID: 34654884 PMCID: PMC8521574 DOI: 10.1038/s42003-021-02720-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/29/2020] [Accepted: 08/13/2021] [Indexed: 12/28/2022] Open
Abstract
Curcuma longa, or turmeric, is traditionally known for its immense medicinal properties and has diverse therapeutic applications. However, the absence of a reference genome sequence is a limiting factor in understanding the genomic basis of the origin of its medicinal properties. In this study, we present the draft genome sequence of C. longa, which belongs to the Zingiberaceae plant family, constructed using 10x Genomics linked reads and Oxford Nanopore long reads. For comprehensive gene set prediction and for insights into its gene expression, transcriptome sequencing of leaf tissue was also performed. The draft genome assembly had a size of 1.02 Gbp with ~70% repetitive sequences, and contained 50,401 coding gene sequences. The phylogenetic position of C. longa was resolved through a comprehensive genome-wide analysis including 16 other plant species. Using 5,388 orthogroups, the comparative evolutionary analysis performed across 17 species including C. longa revealed evolution in genes associated with secondary metabolism, phytohormone signaling, and various biotic and abiotic stress tolerance responses. These mechanisms are crucial for perennial and rhizomatous plants such as C. longa for defense and environmental stress tolerance via production of secondary metabolites, which are associated with the wide range of medicinal properties in C. longa.
Affiliation(s)
- Abhisek Chakraborty
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Shruti Mahajan
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Shubham K Jaiswal
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
- Vineet K Sharma
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
|
27
|
Peroxisome Proliferator-Activated Receptor γ, but Not α or G-Protein Coupled Estrogen Receptor Drives Functioning of Postnatal Boar Testis-Next Generation Sequencing Analysis. Animals (Basel) 2021; 11:ani11102868. [PMID: 34679887 PMCID: PMC8532933 DOI: 10.3390/ani11102868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/09/2021] [Revised: 09/19/2021] [Accepted: 09/27/2021] [Indexed: 12/12/2022] Open
Abstract
Simple Summary: To date, Next Generation Sequencing (NGS) analysis has not been used to identify biological processes and signaling pathways that are regulated in the boar postnatal testes. Our prior studies revealed that the peroxisome proliferator-activated receptor (PPAR) and G-protein coupled estrogen receptor (GPER) were significant for the morpho-functional status of testicular cells. Here, the pharmacological blockage of PPARα, PPARγ or GPER was performed in ex vivo immature boar testes. The NGS results showed 382 transcripts with altered expression. The blockage by the PPARγ antagonist markedly affected biological processes such as drug metabolism (genes: Ctsh, Duox2, Atp1b1, Acss2, Pkd2, Aldh2, Hbb, Sdhd, Cox3, Nd4, Nd5, Cytb, Cbr1, and Pid1), adhesion (genes: Plpp3, Anxa1, Atp1b1, S100a8, Cd93, Ephb4, Vsir, Cldn11, Gpc4, Fermt3, Dusp26, Sox9, and Cdh5) and tube development (genes: Ctsh, Mmp14, Dll4, Anxa1, Ephb4, Pkd2, Angptl4, Robo4, Sox9, Hikeshi, Ing2, Loc100738836, and Rarres2), as well as the Notch signaling pathway. This was not the case for the PPARα or GPER antagonists. Our observations suggested that PPARγ may be the principal player in the management of the development and function of boar testes during the early postnatal window. Moreover, because porcine gene expression patterns are highly similar to those of homologous human genes, our results can be used to understand both animal and human testes physiology and to predict or treat pathological processes.
Abstract: Porcine tissue gene expression is highly similar to the expression of homologous genes in humans. Based on this fact, studies on porcine tissues can be employed to understand human physiology and to predict or treat diseases. Our prior studies clearly showed that there was a regulatory partnership of the peroxisome proliferator-activated receptor (PPAR) and the G-protein coupled membrane estrogen receptor (GPER) that relied upon the tumorigenesis of human and mouse testicular interstitial cells, as well as the PPAR-estrogen related receptor and GPER–xenoestrogen relationships which affected the functional status of immature boar testes. The main objective of this study was to identify the biological processes and signaling pathways governed by PPARα, PPARγ and GPER in the immature testes of seven-day-old boars after pharmacological receptor ligand treatment. Boar testicular tissues were cultured in an organotypic system with the respective PPARα, PPARγ or GPER antagonists. To evaluate the effect of the individual receptor deprivation in testicular tissue on global gene expression, Next Generation Sequencing was performed. Bioinformatic analysis revealed 382 transcripts with altered expression. While tissues treated with PPARα or GPER antagonists showed little significance in the enrichment analysis, the tissues challenged with the PPARγ antagonist displayed significant alterations in biological processes such as drug metabolism, adhesion and tubule development. Diverse disruption in the Notch signaling pathway was also observed. The findings of our study suggest that PPARγ alone, and neither PPARα nor GPER, appears to be the main player in the regulation of boar testes functioning during the early postnatal developmental window.
|
28
|
Assis R. No Expression Divergence despite Transcriptional Interference between Nested Protein-Coding Genes in Mammals. Genes (Basel) 2021; 12:genes12091381. [PMID: 34573363 PMCID: PMC8467205 DOI: 10.3390/genes12091381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 08/23/2021] [Accepted: 08/24/2021] [Indexed: 01/05/2023] Open
Abstract
Nested protein-coding genes accumulated throughout metazoan evolution, with early analyses of human and Drosophila microarray data indicating that this phenomenon was simply due to the presence of large introns. However, a recent study employing RNA-seq data uncovered evidence of transcriptional interference driving rapid expression divergence between Drosophila nested genes, illustrating that accurate expression estimation of overlapping genes can enhance detection of their relationships. Hence, here I apply an analogous approach to strand-specific RNA-seq data from human and mouse to revisit the role of transcriptional interference in the evolution of mammalian nested genes. A genomic survey reveals that whereas mammalian nested genes indeed accrued over evolutionary time, they are retained at lower frequencies than in Drosophila. Though several properties of mammalian nested genes align with observations in Drosophila and with expectations under transcriptional interference, contrary to both, their expression divergence is not statistically different from that between unnested genes, and also does not increase after nesting. Together, these results support the hypothesis that lower selection efficiencies limit rates of gene expression evolution in mammals, leading to their reliance on immediate eradication of deleterious nested genes to avoid transcriptional interference.
Affiliation(s)
- Raquel Assis
- Department of Electrical Engineering and Computer Science, Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431, USA
29
|
Karimi MR, Karimi AH, Abolmaali S, Sadeghi M, Schmitz U. Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022] Open
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever that cancer should be viewed not only as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as vigorous assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches, in which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations, and network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
Affiliation(s)
- Mehdi Sadeghi
- Department of Cell & Molecular Biology, Semnan University, Semnan, Iran
- Ulf Schmitz
- Department of Molecular & Cell Biology, James Cook University, Townsville, QLD 4811, Australia
|
30
|
Li J, Singh U, Arendsee Z, Wurtele ES. Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data. Front Genet 2021; 12:722981. [PMID: 34484307 PMCID: PMC8415361 DOI: 10.3389/fgene.2021.722981] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/09/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022] Open
Abstract
The "dark transcriptome" can be considered the multitude of sequences that are transcribed but not annotated as genes. We evaluated expression of 6,692 annotated genes and 29,354 unannotated open reading frames (ORFs) in the Saccharomyces cerevisiae genome across diverse environmental, genetic and developmental conditions (3,457 RNA-Seq samples). Over 30% of the highly transcribed ORFs have translation evidence. Phylostratigraphic analysis infers most of these transcribed ORFs would encode species-specific proteins ("orphan-ORFs"); hundreds have mean expression comparable to annotated genes. These data reveal unannotated ORFs most likely to be protein-coding genes. We partitioned a co-expression matrix by Markov Chain Clustering; the resultant clusters contain 2,468 orphan-ORFs. We provide the aggregated RNA-Seq yeast data with extensive metadata as a project in MetaOmGraph (MOG), a tool designed for interactive analysis and visualization. This approach enables reuse of public RNA-Seq data for exploratory discovery, providing a rich context for experimentalists to make novel, experimentally testable hypotheses about candidate genes.
Affiliation(s)
- Jing Li
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Urminder Singh
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
- Zebulun Arendsee
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
- Eve Syrkin Wurtele
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
|
31
|
Singh N. Role of mammalian long non-coding RNAs in normal and neuro oncological disorders. Genomics 2021; 113:3250-3273. [PMID: 34302945 DOI: 10.1016/j.ygeno.2021.07.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/24/2021] [Revised: 07/10/2021] [Accepted: 07/14/2021] [Indexed: 12/09/2022]
Abstract
Long non-coding RNAs (lncRNAs) are expressed at lower levels than protein-coding genes but have a crucial role in gene regulation. LncRNAs are distinct: they are transcribed by RNA polymerase II, and their functionality depends on subcellular localization. Depending on their niche, they specifically interact with DNA, RNA, and proteins and modify chromatin function, regulate transcription at various stages, and form nuclear condensation bodies and nucleolar organization. LncRNAs may also change the stability and translation of cytoplasmic mRNAs and hamper signaling pathways. Thus, lncRNAs affect physio-pathological states and lead to the development of various disorders, immune responses, and cancer. To date, ~40% of lncRNAs have been reported in the nervous system (NS) and are involved in processes ranging from the early development/differentiation of the NS to synaptogenesis. LncRNA expression patterns in the most common adult and pediatric tumors suggest them as potential biomarkers and provide a rationale for targeting them pharmaceutically. Here, we discuss the mechanisms of lncRNA synthesis, localization, and functions in transcriptional, post-transcriptional, and other forms of gene regulation, methods of lncRNA identification, and their potential therapeutic applications in neuro oncological disorders as explained by molecular mechanisms in other malignant disorders.
Affiliation(s)
- Neetu Singh
- Molecular Biology Unit, Department of Centre for Advance Research, King George's Medical University, Lucknow, Uttar Pradesh 226 003, India
32
|
Haile S, Corbett RD, LeBlanc VG, Wei L, Pleasance S, Bilobram S, Nip KM, Brown K, Trinh E, Smith J, Trinh DL, Bala M, Chuah E, Coope RJN, Moore RA, Mungall AJ, Mungall KL, Zhao Y, Hirst M, Aparicio S, Birol I, Jones SJM, Marra MA. A Scalable Strand-Specific Protocol Enabling Full-Length Total RNA Sequencing From Single Cells. Front Genet 2021; 12:665888. [PMID: 34149808 PMCID: PMC8209500 DOI: 10.3389/fgene.2021.665888] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/09/2021] [Accepted: 04/21/2021] [Indexed: 12/14/2022] Open
Abstract
RNA sequencing (RNAseq) has been widely used to generate bulk gene expression measurements collected from pools of cells. Only relatively recently have single-cell RNAseq (scRNAseq) methods provided opportunities for gene expression analyses at the single-cell level, allowing researchers to study heterogeneous mixtures of cells at unprecedented resolution. Tumors tend to be composed of heterogeneous cellular mixtures and are frequently the subjects of such analyses. Extensive method developments have led to several protocols for scRNAseq but, owing to the small amounts of RNA in single cells, technical constraints have required compromises. For example, the majority of scRNAseq methods are limited to sequencing only the 3' or 5' termini of transcripts. Other protocols that facilitate full-length transcript profiling tend to capture only polyadenylated mRNAs and are generally limited to processing only 96 cells at a time. Here, we address these limitations and present a novel protocol that allows for the high-throughput sequencing of full-length, total RNA at single-cell resolution. We demonstrate that our method produced strand-specific sequencing data for both polyadenylated and non-polyadenylated transcripts, enabled the profiling of transcript regions beyond only transcript termini, and yielded data rich enough to allow identification of cell types from heterogeneous biological samples.
Affiliation(s)
- Simon Haile
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Richard D Corbett
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Veronique G LeBlanc
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Lisa Wei
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Stephen Pleasance
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Steve Bilobram
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Ka Ming Nip
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Kirstin Brown
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Eva Trinh
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Jillian Smith
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Diane L Trinh
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Miruna Bala
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Eric Chuah
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Robin J N Coope
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Richard A Moore
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Andrew J Mungall
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Karen L Mungall
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Yongjun Zhao
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Martin Hirst
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Samuel Aparicio
- Department of Molecular Oncology, BC Cancer, Vancouver, BC, Canada
- Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Steven J M Jones
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Marco A Marra
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
|
33
|
Ranjan G, Sehgal P, Sharma D, Scaria V, Sivasubbu S. Functional long non-coding and circular RNAs in zebrafish. Brief Funct Genomics 2021:elab014. [PMID: 33755040 DOI: 10.1093/bfgp/elab014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/17/2020] [Revised: 01/04/2021] [Accepted: 02/19/2021] [Indexed: 02/06/2023] Open
Abstract
The use of model organisms to understand the function of novel transcripts and genes has allowed us to delineate their molecular mechanisms in maintaining cellular homeostasis. Organisms such as zebrafish have contributed substantially to the fields of developmental and disease biology. Owing to advances in deep transcriptomics, many new transcript isoforms and non-coding RNAs such as long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) have been identified and cataloged in multiple databases, and many more are yet to be identified. Various methods and tools have been utilized to identify lncRNAs/circRNAs in zebrafish using deep sequencing of transcriptomes as templates. Functional analysis of a few candidates such as tie1-AS, ECAL1 and CDR1as in zebrafish provides a prospective outline for approaching other known or novel lncRNAs/circRNAs. New genetic alteration tools like TALENs and CRISPR have helped in probing the molecular function of lncRNAs/circRNAs in zebrafish. Furthermore, recent improvements in experimental and computational techniques enable the identification of lncRNA/circRNA counterparts in humans and zebrafish, thereby allowing easy modeling and analysis of function at the cellular level.
|
34
|
Choi HM, Lee SH, Lee MS, Park D, Choi SS. Investigation of the putative role of antisense transcripts as regulators of sense transcripts by correlation analysis of sense-antisense pairs in colorectal cancers. FASEB J 2021; 35:e21482. [PMID: 33710708 DOI: 10.1096/fj.202002297rrr] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/10/2020] [Revised: 02/12/2021] [Accepted: 02/15/2021] [Indexed: 12/13/2022]
Abstract
Antisense transcription is far more widespread than was expected when it was first identified in bacteria in the 1980s. However, the functional relevance of antisense transcripts in transcription remains controversial. Here, we investigated the putative role of antisense transcripts in regulating their corresponding sense transcripts by analyzing changes in correlative relationships between sense-antisense pairs under tumor and normal conditions. A total of 3469 sense-antisense gene pairs (SAGPs) were downloaded from BioMart, mapped to the sense and antisense genes in RNA-seq data derived from 80 paired colorectal cancer (CRC) samples, and analyzed. Cancer-related genes were significantly enriched in the significantly correlated SAGPs (SCPs). Genes differentially expressed between normal and tumor conditions were also significantly more enriched in SCPs than in non-SCPs. Interestingly, using differential correlation analysis, we found that tumor samples had a significantly larger density of genes with higher correlation coefficients than normal samples, as verified in various cancer transcriptomes from The Cancer Genome Atlas (TCGA). Moreover, we found that the magnitude of the correlation between SAGPs could distinguish CRCs with a poor prognosis from those with a good prognosis: correlation coefficients between the SAGPs of CRCs with a poor prognosis were significantly stronger than those of CRCs with a good prognosis. Consistent with this finding, the Cox proportional hazards model showed that survival rates differed significantly between patients with high and low expression of genes in the SCPs. All these results strongly support the idea that antisense transcripts are important regulators of their corresponding sense transcripts.
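The differential correlation idea in this abstract can be illustrated with a minimal sketch. It assumes a Pearson coefficient (the abstract does not name the correlation measure), and the expression values below are hypothetical, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical expression of one sense-antisense pair in five paired samples.
sense_normal = [5.0, 6.0, 7.0, 8.0, 9.0]
antisense_normal = [3.0, 2.5, 3.5, 2.8, 3.2]
sense_tumor = [5.0, 6.5, 8.0, 9.0, 11.0]
antisense_tumor = [2.0, 2.6, 3.1, 3.7, 4.4]

r_normal = pearson(sense_normal, antisense_normal)
r_tumor = pearson(sense_tumor, antisense_tumor)

# Differential correlation: how much the coupling changes between conditions.
delta_r = r_tumor - r_normal
print(f"r_normal={r_normal:.3f} r_tumor={r_tumor:.3f} delta={delta_r:.3f}")
```

In a real analysis this comparison would be repeated for every pair and tested for significance; the sketch only shows the per-pair quantity being compared.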
Affiliation(s)
- Hye-Mi Choi
- Division of Biomedical Convergence, College of Biomedical Science, Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, Korea
- Sang-Hyeop Lee
- Division of Biomedical Convergence, College of Biomedical Science, Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, Korea
- Min-Seok Lee
- Division of Biomedical Convergence, College of Biomedical Science, Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, Korea
- Sun Shim Choi
- Division of Biomedical Convergence, College of Biomedical Science, Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, Korea
|
35
|
Rodriguez PD, Paculova H, Kogut S, Heath J, Schjerven H, Frietze S. Non-Coding RNA Signatures of B-Cell Acute Lymphoblastic Leukemia. Int J Mol Sci 2021; 22:2683. [PMID: 33799946 PMCID: PMC7961854 DOI: 10.3390/ijms22052683] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/16/2021] [Revised: 03/01/2021] [Accepted: 03/03/2021] [Indexed: 12/15/2022]
Abstract
Non-coding RNAs (ncRNAs) comprise a diverse class of non-protein coding transcripts that regulate critical cellular processes associated with cancer. Advances in RNA-sequencing (RNA-Seq) have led to the characterization of non-coding RNA expression across different types of human cancers. Through comprehensive RNA-Seq profiling, a growing number of studies demonstrate that ncRNAs, including long non-coding RNA (lncRNAs) and microRNAs (miRNA), play central roles in progenitor B-cell acute lymphoblastic leukemia (B-ALL) pathogenesis. Furthermore, due to their central roles in cellular homeostasis and their potential as biomarkers, the study of ncRNAs continues to provide new insight into the molecular mechanisms of B-ALL. This article reviews the ncRNA signatures reported for all B-ALL subtypes, focusing on technological developments in transcriptome profiling and recently discovered examples of ncRNAs with biologic and therapeutic relevance in B-ALL.
Affiliation(s)
- Princess D. Rodriguez
- Department of Biomedical and Health Sciences, University of Vermont, Burlington, VT 05405, USA
- Hana Paculova
- Department of Biomedical and Health Sciences, University of Vermont, Burlington, VT 05405, USA
- Sophie Kogut
- Department of Biomedical and Health Sciences, University of Vermont, Burlington, VT 05405, USA
- Jessica Heath
- The University of Vermont Cancer Center, University of Vermont, Burlington, VT 05405, USA
- Department of Biochemistry, University of Vermont, Burlington, VT 05405, USA
- Department of Pediatrics, University of Vermont, Burlington, VT 05405, USA
- Hilde Schjerven
- Department of Laboratory Medicine, University of California, San Francisco, CA 94143, USA
- Seth Frietze
- Department of Biomedical and Health Sciences, University of Vermont, Burlington, VT 05405, USA
- The University of Vermont Cancer Center, University of Vermont, Burlington, VT 05405, USA
- Department of Biochemistry, University of Vermont, Burlington, VT 05405, USA
- Correspondence:
|
36
|
Grabski DF, Broseus L, Kumari B, Rekosh D, Hammarskjold ML, Ritchie W. Intron retention and its impact on gene expression and protein diversity: A review and a practical guide. Wiley Interdiscip Rev RNA 2020; 12:e1631. [PMID: 33073477 DOI: 10.1002/wrna.1631] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/11/2020] [Revised: 09/16/2020] [Accepted: 09/23/2020] [Indexed: 12/12/2022]
Abstract
Intron retention (IR) occurs when a complete and unspliced intron remains in mature mRNA. An increasing body of literature has demonstrated a major role for IR in numerous biological functions, including several that impact human health and disease. Although experimental technologies used to study other forms of mRNA splicing can also be used to investigate IR, a specialized downstream computational analysis is optimal for IR discovery and analysis. Here we provide a review of IR and its biological implications, as well as a practical guide for how to detect and analyze it. Several methods, including long-read third-generation direct RNA sequencing, are described. We have developed an R package, FakIR, to facilitate the execution of the bioinformatic tasks recommended in this review, along with a tutorial on how to fit them to users' aims. Additionally, we provide guidelines and experimental protocols to validate IR discovery and to evaluate the potential impact of IR on gene expression and protein output. This article is categorized under: RNA Evolution and Genomics > Computational Analyses of RNA; RNA Processing > Splicing Regulation/Alternative Splicing; RNA Methods > RNA Analyses in vitro and In Silico.
Affiliation(s)
- David F Grabski
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, Virginia, USA; Myles H. Thaler Center for AIDS and Human Retrovirus Research, University of Virginia, Charlottesville, Virginia, USA
- Lucile Broseus
- IGH, Centre National de la Recherche Scientifique, University of Montpellier, Montpellier, France
- Bandana Kumari
- IGH, Centre National de la Recherche Scientifique, University of Montpellier, Montpellier, France
- David Rekosh
- Myles H. Thaler Center for AIDS and Human Retrovirus Research, University of Virginia, Charlottesville, Virginia, USA; Department of Microbiology, Immunology and Cancer Biology, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Marie-Louise Hammarskjold
- Myles H. Thaler Center for AIDS and Human Retrovirus Research, University of Virginia, Charlottesville, Virginia, USA; Department of Microbiology, Immunology and Cancer Biology, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- William Ritchie
- IGH, Centre National de la Recherche Scientifique, University of Montpellier, Montpellier, France
|
37
|
Hu Y, Zhang Y, Liu C, Qin R, Gong D, Wang R, Zhang D, Che L, Chen D, Xin G, Gao F, Hu Q. Multi-omics profiling highlights lipid metabolism alterations in pigs fed low-dose antibiotics. BMC Genet 2020; 21:112. [PMID: 32957918 PMCID: PMC7507292 DOI: 10.1186/s12863-020-00918-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/23/2019] [Accepted: 09/11/2020] [Indexed: 02/02/2023]
Abstract
Background: To study the relationships among hepatocellular function, weight gain and metabolic imbalance caused by low-dose antibiotics (LDA) via epigenetic regulation of gene transcription, 32 weaned piglets were used as animal models and randomly allocated into two groups fed diets supplemented with either no antibiotics or LDA (chlortetracycline and virginiamycin). Results: During the 4 weeks of the experiment, LDA showed a clear growth-promoting effect, exemplified by significantly elevated body weight and average daily gain. Promoter methylome profiling using liquid hybridization capture-based bisulfite sequencing (LHC-BS) indicated that most of the 745 differentially methylated regions (DMRs) were hypermethylated in the LDA group. Several DMRs were significantly enriched in genes related to fatty acid metabolic pathways, such as FABP1 and PCK1. In addition, strand-specific transcriptome analysis of liver tissues identified 71 differentially expressed genes (DEGs), including ALOX15, CXCL10 and NNMT, three key DEGs that function in lipid metabolism and immunity and that had highly elevated expression in the LDA group. In accordance with these molecular changes, lipidome analysis of serum by LC-MS identified 38 significantly differential lipids, most of which were downregulated in the LDA group. Conclusions: Our results indicate that LDA could induce epigenetic and transcriptional changes in key genes and lead to enhanced efficiency of lipid metabolism in the liver.
Affiliation(s)
- Yue Hu
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Yihe Zhang
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Cong Liu
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Rui Qin
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Desheng Gong
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Ru Wang
- Institute of Animal Nutrition, Sichuan Agricultural University, Ya'an, 625014, Sichuan Province, China
- Du Zhang
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Lianqiang Che
- Institute of Animal Nutrition, Sichuan Agricultural University, Ya'an, 625014, Sichuan Province, China
- Daiwen Chen
- Institute of Animal Nutrition, Sichuan Agricultural University, Ya'an, 625014, Sichuan Province, China
- Guizhong Xin
- State Key Laboratory of Natural Medicines, Department of Chinese Medicines Analysis, China Pharmaceutical University, Nanjing, China
- Fei Gao
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China; Comparative Pediatrics and Nutrition, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, DK, Denmark
- Qi Hu
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
|
38
|
Chiu R, Nip KM, Birol I. Fusion-Bloom: fusion detection in assembled transcriptomes. Bioinformatics 2020; 36:2256-2257. [PMID: 31790154 PMCID: PMC7141844 DOI: 10.1093/bioinformatics/btz902] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/26/2019] [Revised: 11/13/2019] [Accepted: 11/27/2019] [Indexed: 11/13/2022]
Abstract
Summary: Presence or absence of gene fusions is one of the most important diagnostic markers in many cancer types. Consequently, fusion detection methods using various genomics data types, such as RNA sequencing (RNA-seq), are valuable tools for research and clinical applications. While information-rich RNA-seq data have proven to be instrumental in discovery of a number of hallmark fusion events, bioinformatics tools to detect fusions still have room for improvement. Here, we present Fusion-Bloom, a fusion detection method that leverages recent developments in de novo transcriptome assembly and assembly-based structural variant calling technologies (RNA-Bloom and PAVFinder, respectively). We benchmarked Fusion-Bloom against the performance of five other state-of-the-art fusion detection tools using multiple datasets. Overall, we observed Fusion-Bloom to display a good balance between detection sensitivity and specificity. We expect the tool to find applications in translational research and clinical genomics pipelines. Availability and implementation: Fusion-Bloom is implemented as a UNIX Make utility, available at https://github.com/bcgsc/pavfinder and released under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Readman Chiu
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
- Ka Ming Nip
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada; Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC V6H 3N1, Canada
- Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, BC V6H 3N1, Canada
|
39
|
Francisco DMF, Marchetti L, Rodríguez-Lorenzo S, Frías-Anaya E, Figueiredo RM, Winter P, Romero IA, de Vries HE, Engelhardt B, Bruggmann R. Advancing brain barriers RNA sequencing: guidelines from experimental design to publication. Fluids Barriers CNS 2020; 17:51. [PMID: 32811511 PMCID: PMC7433166 DOI: 10.1186/s12987-020-00207-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/14/2020] [Accepted: 07/06/2020] [Indexed: 12/22/2022]
Abstract
BACKGROUND: RNA sequencing (RNA-Seq) in its varied forms has become an indispensable tool for analyzing differential gene expression and thus for characterizing specific tissues. To understand the genetic signature of the brain barriers, RNA-Seq has also been introduced into brain barriers research. This has led to the availability of both bulk and single-cell RNA-Seq datasets over the last few years. If appropriately performed, RNA-Seq studies provide powerful datasets that allow for significant deepening of knowledge on the molecular mechanisms that establish the brain barriers. However, RNA-Seq studies comprise complex workflows that require many options and variables to be considered before, during and after the sequencing process itself. MAIN BODY: In the current manuscript, we build on the interdisciplinary experience of the European PhD Training Network BtRAIN (https://www.btrain-2020.eu/), where bioinformaticians and brain barriers researchers collaborated to analyze and establish RNA-Seq datasets on vertebrate brain barriers. The obstacles BtRAIN has identified in this process have been integrated into the present manuscript. It provides guidelines along the entire workflow of brain barriers RNA-Seq studies, starting from the overall experimental design to interpretation of results. Focusing on the vertebrate endothelial blood-brain barrier (BBB) and epithelial blood-cerebrospinal-fluid barrier (BCSFB) of the choroid plexus, we provide a step-by-step description of the workflow, highlighting the decisions to be made at each step and explaining the strengths and weaknesses of individual choices. Finally, we propose recommendations for accurate data interpretation and on the information to be included in a publication to ensure appropriate accessibility of the data and reproducibility of the observations by the scientific community.
CONCLUSION: Next-generation transcriptomic profiling of the brain barriers provides a novel resource for understanding the development, function and pathology of these barrier cells, which is essential for understanding CNS homeostasis and disease. Continuous advancement and sophistication of RNA-Seq will require interdisciplinary approaches between brain barrier researchers and bioinformaticians as successfully performed in BtRAIN. The present guidelines are built on the BtRAIN interdisciplinary experience and aim to facilitate collaboration of brain barriers researchers with bioinformaticians to advance RNA-Seq study design in the brain barriers community.
Affiliation(s)
- David M F Francisco
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, Switzerland
- Luca Marchetti
- Theodor Kocher Institute, University of Bern, Bern, Switzerland
- Sabela Rodríguez-Lorenzo
- MS Center Amsterdam, Amsterdam Neuroscience, Department of Molecular Cell Biology and Immunology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Eduardo Frías-Anaya
- School of Life, Health and Chemical Sciences, The Open University, Milton Keynes, UK
- Ricardo M Figueiredo
- GenXPro GmbH, Frankfurt/Main, Germany
- Johann Wolfgang Goethe University, Frankfurt/Main, Germany
- Ignacio Andres Romero
- School of Life, Health and Chemical Sciences, The Open University, Milton Keynes, UK
- Helga E de Vries
- MS Center Amsterdam, Amsterdam Neuroscience, Department of Molecular Cell Biology and Immunology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Rémy Bruggmann
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, Switzerland
|
40
|
Zhao S, Ye Z, Stanton R. Misuse of RPKM or TPM normalization when comparing across samples and sequencing protocols. RNA 2020; 26:903-909. [PMID: 32284352 PMCID: PMC7373998 DOI: 10.1261/rna.074922.120] [Citation(s) in RCA: 245] [Impact Index Per Article: 49.0] [Indexed: 05/09/2023]
Abstract
In recent years, RNA-sequencing (RNA-seq) has emerged as a powerful technology for transcriptome profiling. For a given gene, the number of mapped reads is not only dependent on its expression level and gene length, but also the sequencing depth. To normalize these dependencies, RPKM (reads per kilobase of transcript per million reads mapped) and TPM (transcripts per million) are used to measure gene or transcript expression levels. A common misconception is that RPKM and TPM values are already normalized, and thus should be comparable across samples or RNA-seq projects. However, RPKM and TPM represent the relative abundance of a transcript among a population of sequenced transcripts, and therefore depend on the composition of the RNA population in a sample. Quite often, it is reasonable to assume that total RNA concentration and distributions are very close across compared samples. Nevertheless, the sequenced RNA repertoires may differ significantly under different experimental conditions and/or across sequencing protocols; thus, the proportion of gene expression is not directly comparable in such cases. In this review, we illustrate typical scenarios in which RPKM and TPM are misused, unintentionally, and hope to raise scientists' awareness of this issue when comparing them across samples or different sequencing protocols.
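The composition dependence described in this abstract can be made concrete with a minimal sketch. It uses the standard RPKM and TPM definitions; the gene names, lengths and read counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total_reads) for g in counts}

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {g: counts[g] / lengths_bp[g] for g in counts}
    total_rate = sum(rate.values())
    return {g: rate[g] * 1e6 / total_rate for g in counts}

# Hypothetical gene lengths in base pairs.
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1000}

# geneA and geneB have identical read counts in both samples;
# only geneC differs, which changes the composition of the RNA pool.
sample1 = {"geneA": 100, "geneB": 200, "geneC": 100}
sample2 = {"geneA": 100, "geneB": 200, "geneC": 1000}

tpm1, tpm2 = tpm(sample1, lengths), tpm(sample2, lengths)
rpkm1, rpkm2 = rpkm(sample1, lengths), rpkm(sample2, lengths)

# geneA's relative abundance drops although its raw count is unchanged.
print(f"geneA TPM:  {tpm1['geneA']:.0f} -> {tpm2['geneA']:.0f}")
print(f"geneA RPKM: {rpkm1['geneA']:.0f} -> {rpkm2['geneA']:.0f}")
```

Because TPM values always sum to one million within a sample, a large change in one abundant gene shifts the values of every other gene, which is why the numbers are proportions rather than absolute expression levels.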
Affiliation(s)
- Shanrong Zhao
- Integrative Biology Center of Excellence, Pfizer Worldwide Research and Development, Cambridge, Massachusetts 02139, USA
- Zhan Ye
- Early Clinical Development, Pfizer Worldwide Research and Development, Cambridge, Massachusetts 02139, USA
- Robert Stanton
- Integrative Biology Center of Excellence, Pfizer Worldwide Research and Development, Cambridge, Massachusetts 02139, USA
|
41
|
Machado FB, Moharana KC, Almeida-Silva F, Gazara RK, Pedrosa-Silva F, Coelho FS, Grativol C, Venancio TM. Systematic analysis of 1298 RNA-Seq samples and construction of a comprehensive soybean (Glycine max) expression atlas. Plant J 2020; 103:1894-1909. [PMID: 32445587 DOI: 10.1111/tpj.14850] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 01/29/2020] [Revised: 04/15/2020] [Accepted: 05/06/2020] [Indexed: 05/23/2023]
Abstract
Soybean (Glycine max [L.] Merr.) is a major crop in animal feed and human nutrition, mainly for its rich protein and oil contents. The remarkable rise in soybean transcriptome studies over the past 5 years has generated an enormous amount of RNA-seq data, encompassing various tissues, developmental conditions and genotypes. In this study, we collected data from 1298 publicly available soybean transcriptome samples, processed the raw sequencing reads and mapped them to the soybean reference genome in a systematic fashion. We found that 94% of the annotated genes (52 737/56 044) had detectable expression in at least one sample. Unsupervised clustering revealed three major groups, comprising samples from aerial, underground and seed/seed-related parts. We found 452 genes with uniform and constant expression levels, supporting their roles as housekeeping genes. On the other hand, 1349 genes showed expression patterns heavily biased towards particular tissues. A transcript-level analysis revealed that 95% (70 963 of 74 490) of the assembled transcripts have intron chains exactly matching those from known transcripts, whereas 3256 assembled transcripts represent potentially novel splicing isoforms. The dataset compiled here constitutes a new resource for the community, which can be downloaded or accessed through a user-friendly web interface at http://venanciogroup.uenf.br/resources/. This comprehensive transcriptome atlas will likely accelerate research on soybean genetics and genomics.
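As a quick check of the arithmetic, the two percentages quoted in this abstract can be recomputed from the counts it reports. The variable names are illustrative only; the four numbers are taken from the abstract.

```python
# Counts as reported in the abstract.
annotated_genes = 56044
expressed_genes = 52737
assembled_transcripts = 74490
matching_transcripts = 70963

# Share of annotated genes with detectable expression in at least one sample.
pct_expressed = 100 * expressed_genes / annotated_genes

# Share of assembled transcripts whose intron chains match known transcripts.
pct_matching = 100 * matching_transcripts / assembled_transcripts

print(f"expressed genes: {pct_expressed:.1f}%")
print(f"matching transcripts: {pct_matching:.1f}%")
```

Both values round to the figures stated in the abstract (94% and 95%).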
Affiliation(s)
- Fabricio B Machado
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Kanhu C Moharana
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Fabricio Almeida-Silva
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Rajesh K Gazara
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Francisnei Pedrosa-Silva
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Fernanda S Coelho
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Clícia Grativol
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
- Thiago M Venancio
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil
|
42
|
Al Kadi M, Jung N, Ito S, Kameoka S, Hishida T, Motooka D, Nakamura S, Iida T, Okuzaki D. UNAGI: an automated pipeline for nanopore full-length cDNA sequencing uncovers novel transcripts and isoforms in yeast. Funct Integr Genomics 2020; 20:523-536. [PMID: 31955296 PMCID: PMC7283198 DOI: 10.1007/s10142-020-00732-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/09/2019] [Revised: 12/20/2019] [Accepted: 01/09/2020] [Indexed: 11/25/2022]
Abstract
Sequencing the entire RNA molecule leads to a better understanding of the transcriptome architecture. SMARTer (Switching Mechanism at 5'-End of RNA Template) is a technology aimed at generating full-length cDNA from low amounts of mRNA for sequencing by short-read sequencers such as those from Illumina. However, short-read sequencing such as Illumina technology includes fragmentation, which results in bias and information loss. Here, we built a pipeline, UNAGI (UNAnnotated Gene Identifier), to process long reads obtained with nanopore sequencing and compared this pipeline with the standard Illumina pipeline by studying the Saccharomyces cerevisiae transcriptome in full-length cDNA samples generated from two different biological samples: haploid and diploid cells. Additionally, we processed the long reads with another long-read tool, FLAIR. Our strand-aware method revealed significant differential gene expression that was masked in Illumina data by antisense transcripts. UNAGI outperformed the Illumina pipeline and FLAIR in transcript reconstruction (sensitivity and specificity of 80% and 40% for UNAGI vs. 18% and 34% for the Illumina pipeline and 79% and 32% for FLAIR). Moreover, UNAGI discovered 3877 unannotated transcripts, including 1282 intergenic transcripts, while the Illumina pipeline discovered only 238 unannotated transcripts. For isoform profiling, UNAGI also outperformed the Illumina pipeline and FLAIR in terms of sensitivity (91% vs. 82% and 63%, respectively). However, the low accuracy of nanopore sequencing led to a closer gap in terms of specificity with the Illumina pipeline (70% vs. 63%) and to a huge gap with FLAIR (70% vs. 0.02%).
Affiliation(s)
- Mohamad Al Kadi
- Department of Bacterial Infections, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Nicolas Jung
- Department of Infection Metagenomics, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Shingo Ito
- Department of Infection Metagenomics, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Shoichiro Kameoka
- Department of Infection Metagenomics, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Cykinso, Inc., Tokyo, 151-0053, Japan
- Takashi Hishida
- Department of Molecular Biology, Graduate School of Science, Gakushuin University, Tokyo, 171-0031, Japan
- Daisuke Motooka
- Department of Infection Metagenomics, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Shota Nakamura
- Department of Infection Metagenomics, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Osaka, 565-0871, Japan
- Tetsuya Iida
- Department of Bacterial Infections, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Department of Infection Metagenomics, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Daisuke Okuzaki
- Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, 565-0871, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Osaka, 565-0871, Japan
- Single Cell Genomics, Human Immunology, WPI Immunology Frontier Research Center, Osaka University, Osaka, 565-0871, Japan
- Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Yamadaoka 3-1, Suita City, Osaka, Japan
|
43
|
The Clinical Application of RNA Sequencing in Genetic Diagnosis of Mendelian Disorders. Clin Lab Med 2020; 40:121-133. [PMID: 32439064 DOI: 10.1016/j.cll.2020.02.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
|
44
|
A Multi-Omics Perspective of Quantitative Trait Loci in Precision Medicine. Trends Genet 2020; 36:318-336. [PMID: 32294413 DOI: 10.1016/j.tig.2020.01.009] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 06/26/2019] [Revised: 01/05/2020] [Accepted: 01/21/2020] [Indexed: 02/07/2023]
Abstract
Quantitative trait loci (QTL) analysis is an important approach to investigate the effects of genetic variants identified through an increasing number of large-scale, multidimensional 'omics data sets. In this 'big data' era, the research community has identified a significant number of molecular QTLs (molQTLs) and increased our understanding of their effects. Herein, we review multiple categories of molQTLs, including those associated with transcriptome, post-transcriptional regulation, epigenetics, proteomics, metabolomics, and the microbiome. We summarize approaches to identify molQTLs and to infer their causal effects. We further discuss the integrative analysis of molQTLs through a multi-omics perspective. Our review highlights future opportunities to better understand the functional significance of genetic variants and to utilize the discovery of molQTLs in precision medicine.

45
Simonovic S, Hinze C, Schmidt-Ott KM, Busch J, Jung M, Jung K, Rabien A. Limited utility of qPCR-based detection of tumor-specific circulating mRNAs in whole blood from clear cell renal cell carcinoma patients. BMC Urol 2020; 20:7. [PMID: 32013938 PMCID: PMC6998103 DOI: 10.1186/s12894-019-0542-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/06/2018] [Accepted: 10/21/2019] [Indexed: 02/08/2023] Open
Abstract
Background: RNA sequencing data is providing abundant information about the levels of dysregulation of genes in various tumors. These data, as well as data based on older microarray technologies, have enabled the identification of many genes which are upregulated in clear cell renal cell carcinoma (ccRCC) compared to matched normal tissue. Here we use RNA sequencing data to construct a panel of highly overexpressed genes in ccRCC, evaluate their RNA levels in whole blood, and determine any diagnostic potential of these levels for renal cell carcinoma patients. Methods: A bioinformatics analysis with Python was performed using TCGA, GEO and other databases to identify genes which are upregulated in ccRCC while being absent in the blood of healthy individuals. Quantitative real-time PCR (RT-qPCR) was subsequently used to measure the levels of candidate genes in whole blood (PAXgene) of 16 ccRCC patients versus 11 healthy individuals. PCR results were processed in qBase and GraphPad Prism, and statistical comparisons used the Mann-Whitney U test. Results: While most analyzed genes were either undetectable or did not show any dysregulated expression, two genes, CDK18 and CCND1, were paradoxically downregulated in the blood of ccRCC patients compared to healthy controls. Furthermore, LOX showed a tendency towards upregulation in metastatic ccRCC samples compared to non-metastatic samples. Conclusions: This analysis illustrates the difficulty of detecting tumor-regulated genes in blood and the possible influence of interference from expression in blood cells, even for genes conditionally absent in normal blood. Testing in plasma samples indicated that tumor-specific mRNAs were not detectable. While CDK18, CCND1 and LOX mRNAs might carry biomarker potential, this would require validation in an independent, larger patient cohort.
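For readers unfamiliar with the Mann-Whitney U test named in the Methods above, the statistic itself can be sketched in a few lines of Python. This is only an illustration of the rank-based comparison, not the authors' analysis (which the abstract says was run in qBase and GraphPad Prism); the function name `mann_whitney_u` and the expression values are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a > b; a tie counts as 0.5.
    U_a + U_b always equals len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Made-up relative expression values, for illustration only.
patients = [0.8, 1.1, 0.6, 0.9]
controls = [1.5, 1.2, 1.9]
print(mann_whitney_u(patients, controls))  # (0.0, 12.0): every patient value is lower
```

A U value of 0 for one group means its values never exceed those of the other group; turning U into a p-value additionally requires the exact or normal-approximation null distribution, which a statistics package would normally supply.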
Affiliation(s)
- Sinisa Simonovic
  - Department of Urology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany; Berlin Institute for Urologic Research, Berlin, Germany; Max-Delbrück-Center for Molecular Medicine (MDC), Berlin, Germany
- Christian Hinze
  - Max-Delbrück-Center for Molecular Medicine (MDC), Berlin, Germany
- Kai M Schmidt-Ott
  - Max-Delbrück-Center for Molecular Medicine (MDC), Berlin, Germany; Department of Nephrology and Medical Intensive Care, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Jonas Busch
  - Department of Urology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Monika Jung
  - Department of Urology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Klaus Jung
  - Department of Urology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany; Berlin Institute for Urologic Research, Berlin, Germany
- Anja Rabien
  - Department of Urology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany; Berlin Institute for Urologic Research, Berlin, Germany

46
Muhammad II, Kong SL, Akmar Abdullah SN, Munusamy U. RNA-seq and ChIP-seq as Complementary Approaches for Comprehension of Plant Transcriptional Regulatory Mechanism. Int J Mol Sci 2019; 21:E167. [PMID: 31881735 PMCID: PMC6981605 DOI: 10.3390/ijms21010167] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 12/08/2019] [Revised: 12/19/2019] [Accepted: 12/23/2019] [Indexed: 02/07/2023] Open
Abstract
The availability of data produced from various sequencing platforms offers the possibility to answer complex questions in plant research. However, drawbacks can arise when there are gaps in the information generated, and complementary platforms are essential to obtain more comprehensive data sets relating to specific biological processes, such as responses to environmental perturbations in plant systems. The investigation of transcriptional regulation raises different challenges, particularly in associating differentially expressed transcription factors with their downstream responsive genes. In this paper, we discuss the integration of transcription factor studies through RNA sequencing (RNA-seq) and Chromatin Immunoprecipitation sequencing (ChIP-seq). We show how the data from ChIP-seq can strengthen information generated from RNA-seq in elucidating gene regulatory mechanisms. In particular, we discuss how integration of ChIP-seq and RNA-seq data can help to unravel transcriptional regulatory networks. This review discusses recent advances in methods for studying transcriptional regulation using these two methods. It also provides guidelines for making choices in selecting specific protocols in RNA-seq pipelines for genome-wide analysis to achieve more detailed characterization of specific transcription regulatory pathways via ChIP-seq.
Affiliation(s)
- Isiaka Ibrahim Muhammad
  - Laboratory of Plantation Science and Technology, Institute of Plantation Studies, Universiti Putra Malaysia, Selangor 43400, Malaysia
- Sze Ling Kong
  - Laboratory of Plantation Science and Technology, Institute of Plantation Studies, Universiti Putra Malaysia, Selangor 43400, Malaysia
- Siti Nor Akmar Abdullah
  - Laboratory of Plantation Science and Technology, Institute of Plantation Studies, Universiti Putra Malaysia, Selangor 43400, Malaysia
  - Department of Agriculture Technology, Faculty of Agriculture, Universiti Putra Malaysia, Selangor 43400, Malaysia
- Umaiyal Munusamy
  - Laboratory of Plantation Science and Technology, Institute of Plantation Studies, Universiti Putra Malaysia, Selangor 43400, Malaysia

47
Zheng H, Brennan K, Hernaez M, Gevaert O. Benchmark of long non-coding RNA quantification for RNA sequencing of cancer samples. Gigascience 2019; 8:giz145. [PMID: 31808800 PMCID: PMC6897288 DOI: 10.1093/gigascience/giz145] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 04/08/2019] [Revised: 09/30/2019] [Accepted: 11/15/2019] [Indexed: 12/14/2022] Open
Abstract
Background: Long non-coding RNAs (lncRNAs) are emerging as important regulators of various biological processes. While many studies have exploited public resources such as RNA sequencing (RNA-Seq) data in The Cancer Genome Atlas to study lncRNAs in cancer, it is crucial to choose the optimal method for accurate expression quantification. Results: In this study, we compared the performance of pseudoalignment methods Kallisto and Salmon, alignment-based transcript quantification method RSEM, and alignment-based gene quantification methods HTSeq and featureCounts, in combination with read aligners STAR, Subread, and HISAT2, in lncRNA quantification, by applying them to both un-stranded and stranded RNA-Seq datasets. Full transcriptome annotation, including protein-coding and non-coding RNAs, greatly improves the specificity of lncRNA expression quantification. Pseudoalignment methods and RSEM outperform HTSeq and featureCounts for lncRNA quantification at both sample- and gene-level comparison, regardless of RNA-Seq protocol type, choice of aligners, and transcriptome annotation. Pseudoalignment methods and RSEM detect more lncRNAs and correlate highly with simulated ground truth. On the contrary, HTSeq and featureCounts often underestimate lncRNA expression. Antisense lncRNAs are poorly quantified by alignment-based gene quantification methods, which can be improved using stranded protocols and pseudoalignment methods. Conclusions: Considering the consistency with ground truth and computational resources, pseudoalignment methods Kallisto or Salmon in combination with full transcriptome annotation is our recommended strategy for RNA-Seq analysis for lncRNAs.
Affiliation(s)
- Hong Zheng
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, 1265 Welch Road, Stanford, 94305, CA, USA
- Kevin Brennan
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, 1265 Welch Road, Stanford, 94305, CA, USA
- Mikel Hernaez
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 W. Gregory Dr, Urbana, 61805, IL, USA
- Olivier Gevaert
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, 1265 Welch Road, Stanford, 94305, CA, USA
  - Department of Biomedical Data Science, Stanford University, 1265 Welch Road, Stanford, 94305, CA, USA

48
Pomaznoy M, Sethi A, Greenbaum J, Peters B. Identifying inaccuracies in gene expression estimates from unstranded RNA-seq data. Sci Rep 2019; 9:16342. [PMID: 31704962 PMCID: PMC6841694 DOI: 10.1038/s41598-019-52584-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/13/2019] [Accepted: 10/21/2019] [Indexed: 01/05/2023] Open
Abstract
RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, there are known caveats of this technology which can skew the gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information, then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information was not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
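The two-fold criterion mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the uslcount package or the authors' machine learning model; it only shows how genes might be flagged when estimates computed with and without strand information differ by a factor of two or more. The function name `flag_strand_sensitive`, the pseudocount, the gene IDs, and the counts are all assumptions made for illustration.

```python
import math


def flag_strand_sensitive(stranded, unstranded, min_fold=2.0, pseudocount=1.0):
    """Return sorted gene IDs whose two estimates differ by >= min_fold.

    `stranded` and `unstranded` map gene ID -> expression estimate.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    threshold = math.log2(min_fold)
    flagged = []
    for gene, with_strand in stranded.items():
        without_strand = unstranded.get(gene)
        if without_strand is None:
            continue  # gene missing from the unstranded table; skip it
        log2_ratio = math.log2(
            (without_strand + pseudocount) / (with_strand + pseudocount)
        )
        if abs(log2_ratio) >= threshold:
            flagged.append(gene)
    return sorted(flagged)


# Hypothetical estimates for three placeholder genes.
stranded = {"GENE_A": 100, "GENE_B": 10, "GENE_C": 50}
unstranded = {"GENE_A": 105, "GENE_B": 45, "GENE_C": 20}
print(flag_strand_sensitive(stranded, unstranded))  # ['GENE_B', 'GENE_C']
```

Working on the log2 scale treats over- and under-estimation symmetrically, so a gene whose unstranded estimate is half the stranded one is flagged just like one whose estimate doubles.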
Affiliation(s)
- Mikhail Pomaznoy
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, United States
- Ashu Sethi
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, United States
- Jason Greenbaum
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, United States
- Bjoern Peters
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, United States; Department of Medicine, University of California San Diego, La Jolla, CA, United States

49
Comparative evaluation of RNA-Seq library preparation methods for strand-specificity and low input. Sci Rep 2019; 9:13477. [PMID: 31530843 PMCID: PMC6748930 DOI: 10.1038/s41598-019-49889-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 04/05/2019] [Accepted: 08/19/2019] [Indexed: 01/04/2023] Open
Abstract
Library preparation is a key step in sequencing. For RNA sequencing there are advantages to both strand specificity and working with minute starting material, yet until recently there was no kit available enabling both. The Illumina TruSeq stranded mRNA Sample Preparation kit (TruSeq) requires abundant starting material while the Takara Bio SMART-Seq v4 Ultra Low Input RNA kit (V4) sacrifices strand specificity. The SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Pico) by Takara Bio claims to overcome these limitations. Comparative evaluation of these kits is important for selecting the appropriate protocol. We compared the three kits in a realistic differential expression analysis. We prepared and sequenced samples from two experimental conditions of biological interest with each of the three kits. We report differences between the kits at the level of differential gene expression; for example, the Pico kit results in 55% fewer differentially expressed genes than TruSeq. Nevertheless, the agreement of the observed enriched pathways suggests that comparable functional results can be obtained. In summary we conclude that the Pico kit sufficiently reproduces the results of the other kits at the level of pathway analysis while providing a combination of options that is not available in the other kits.

50
Galli V, Messias RS, Guzman F, Perin EC, Margis R, Rombaldi CV. Transcriptome analysis of strawberry (Fragaria × ananassa) fruits under osmotic stresses and identification of genes related to ascorbic acid pathway. Physiologia Plantarum 2019; 166:979-995. [PMID: 30367706 DOI: 10.1111/ppl.12861] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/04/2018] [Revised: 10/22/2018] [Accepted: 10/23/2018] [Indexed: 06/08/2023]
Abstract
Strawberry (Fragaria ananassa Duch.) is an economically important fruit with a high demand owing to its good taste and medicinal properties. However, its cultivation is affected by various biotic and abiotic stresses. Plants exhibit several intrinsic mechanisms to deal with stresses. In the case of strawberry, the mechanisms highlighting the response against these stresses remain to be elucidated, which has hampered the efforts to develop and cultivate strawberry plants with high yield and quality. Although a virtual reference genome of F. ananassa has recently been published, there is still a lack of information on the expression of genes in response to various stresses. Therefore, to provide molecular information for further studies with strawberry plants, we present the reference transcriptome dataset of F. ananassa, assembled and annotated from deep RNA-Seq data of fruits cultivated under salinity and drought stresses. We also systematically arranged a series of transcripts differentially expressed during these stresses, with an emphasis on genes related to the accumulation of ascorbic acid (AsA). Ascorbic acid is the most potent antioxidant present in these fruits and highly considered during biofortification. A comparison of the expression profile of these genes by RT-qPCR with the content of AsA in the fruits verified a tight regulation and balance between the expression of genes, from biosynthesis, degradation and recycling pathways, resulting in the reduced content of AsA in fruits under these stresses. These results provide a useful repertoire of genes for metabolic engineering, thereby improving the tolerance to stresses.
Affiliation(s)
- Vanessa Galli
  - Departamento de Ciência e Tecnologia de Alimentos, Universidade Federal de Pelotas, Pelotas, Brazil
- Rafael S Messias
  - Departamento de Ciência e Tecnologia de Alimentos, Universidade Federal de Pelotas, Pelotas, Brazil
- Frank Guzman
  - Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Ellen C Perin
  - Departamento de Ciência e Tecnologia de Alimentos, Universidade Federal de Pelotas, Pelotas, Brazil
- Rogério Margis
  - Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Cesar V Rombaldi
  - Departamento de Ciência e Tecnologia de Alimentos, Universidade Federal de Pelotas, Pelotas, Brazil