1
|
Morrissey A, Shi J, James DQ, Mahony S. Accurate allocation of multimapped reads enables regulatory element analysis at repeats. Genome Res 2024; 34:937-951. [PMID: 38986578 PMCID: PMC11293539 DOI: 10.1101/gr.278638.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 06/14/2024] [Indexed: 07/12/2024]
Abstract
Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. However, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multimapped" reads that align equally well to multiple genomic locations. Because multimapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multimapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multimapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multimapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq data sets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly beneficial in identifying ChIP-seq peaks at centromeres, near segmentally duplicated genes, and in younger TEs, enabling new regulatory analyses in these regions.
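The probabilistic allocation idea in this abstract can be illustrated with a minimal sketch. It is not Allo's implementation: the convolutional neural network is omitted entirely, and the function names, the 500 bp window, the pseudocount, and the single-chromosome coordinate model are all assumptions made for illustration. Each candidate location of a multimapped read is weighted by the number of uniquely mapped reads nearby, and the read is assigned by sampling from those weights.

```python
import random
from bisect import bisect_left, bisect_right


def local_count(unique_positions, center, window=500):
    """Count uniquely mapped reads within +/- window of center.

    unique_positions must be a sorted list of coordinates on one
    chromosome (a simplifying assumption of this sketch).
    """
    lo = bisect_left(unique_positions, center - window)
    hi = bisect_right(unique_positions, center + window)
    return hi - lo


def allocation_probs(unique_positions, candidates, window=500, pseudocount=1.0):
    """Return one probability per candidate location.

    Weights are local unique-read counts plus a pseudocount, so a
    candidate with no nearby unique reads keeps a non-zero chance.
    """
    weights = [
        local_count(unique_positions, c, window) + pseudocount
        for c in candidates
    ]
    total = sum(weights)
    return [w / total for w in weights]


def allocate(unique_positions, candidates, rng, window=500, pseudocount=1.0):
    """Sample a single location for one multimapped read."""
    probs = allocation_probs(unique_positions, candidates, window, pseudocount)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    unique = [100, 120, 150, 5000]      # hypothetical unique-read positions
    candidates = [130, 5000, 9000]      # hypothetical equally good alignments
    print(allocation_probs(unique, candidates))
    print(allocate(unique, candidates, random.Random(0)))
```

Sampling, rather than always choosing the highest-weight location, is one possible design choice; it avoids piling every ambiguous read onto a single repeat copy.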
Affiliation(s)
- Alexis Morrissey
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Jeffrey Shi
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Daniela Q James
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Shaun Mahony
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| |
|
2
|
Kobiita A, Silva PN, Schmid MW, Stoffel M. FoxM1 coordinates cell division, protein synthesis, and mitochondrial activity in a subset of β cells during acute metabolic stress. Cell Rep 2023; 42:112986. [PMID: 37590136 DOI: 10.1016/j.celrep.2023.112986] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/28/2022] [Revised: 06/06/2023] [Accepted: 07/31/2023] [Indexed: 08/19/2023] Open
Abstract
Pancreatic β cells display functional and transcriptional heterogeneity in health and disease. The sequence of events leading to β cell heterogeneity during metabolic stress is poorly understood. Here, we characterize β cell responses to early metabolic stress in vivo by employing RNA sequencing (RNA-seq), assay for transposase-accessible chromatin with sequencing (ATAC-seq), single-cell RNA-seq (scRNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), and real-time imaging to decipher temporal events of chromatin remodeling and gene expression regulating the unfolded protein response (UPR), protein synthesis, mitochondrial function, and cell-cycle progression. We demonstrate that a subpopulation of β cells with active UPR, decreased protein synthesis, and insulin secretory capacities is more susceptible to proliferation after insulin depletion. Alleviation of endoplasmic reticulum (ER) stress precedes the progression of the cell cycle and mitosis and ensures appropriate insulin synthesis. Furthermore, metabolic stress rapidly activates key transcription factors including FoxM1, which impacts on proliferative and quiescent β cells by regulating protein synthesis, ER stress, and mitochondrial activity via direct repression of mitochondrial-encoded genes.
Affiliation(s)
- Ahmad Kobiita
- Institute of Molecular Health Sciences, ETH Zürich, Otto-Stern-Weg 7, 8093 Zürich, Switzerland
| | - Pamuditha N Silva
- Institute of Molecular Health Sciences, ETH Zürich, Otto-Stern-Weg 7, 8093 Zürich, Switzerland
| | - Marc W Schmid
- MWSchmid GmbH, Hauptstrasse 34, 8750 Glarus, Switzerland
| | - Markus Stoffel
- Institute of Molecular Health Sciences, ETH Zürich, Otto-Stern-Weg 7, 8093 Zürich, Switzerland; Medical Faculty, Universitäts-Spital Zürich, Rämistrasse 100, 8091 Zürich, Switzerland.
| |
|
3
|
Roganowicz M, Bär D, Bersaglieri C, Aprigliano R, Santoro R. BAZ2A-RNA mediated association with TOP2A and KDM1A represses genes implicated in prostate cancer. Life Sci Alliance 2023; 6:e202301950. [PMID: 37184661 PMCID: PMC10130768 DOI: 10.26508/lsa.202301950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 04/13/2023] [Accepted: 04/17/2023] [Indexed: 05/16/2023] Open
Abstract
BAZ2A represses rRNA genes (rDNA) that are transcribed by RNA polymerase I. In prostate cancer (PCa), BAZ2A function goes beyond this role because it represses genes frequently silenced in metastatic disease. However, the mechanisms of this BAZ2A-mediated repression remain elusive. Here, we show that BAZ2A represses genes through its RNA-binding TAM domain using mechanisms differing from rDNA silencing. Although the TAM domain mediates BAZ2A recruitment to rDNA, in PCa, this is not required for BAZ2A association with target genes. Instead, the BAZ2A-TAM domain in association with RNA mediates the interaction with topoisomerase 2A (TOP2A) and histone demethylase KDM1A, whose expression positively correlates with BAZ2A levels in localized and metastatic PCa. TOP2A and KDM1A pharmacological inhibition up-regulate BAZ2A-repressed genes that are regulated by inactive enhancers bound by BAZ2A, whereas rRNA genes are not affected. Our findings showed a novel RNA-based mechanism of gene regulation in PCa. Furthermore, we determined that RNA-mediated interactions between BAZ2A and TOP2A and KDM1A repress genes critical to PCa and may prove to be useful to stratify prostate cancer risk and treatment in patients.
Affiliation(s)
- Marcin Roganowicz
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- RNA Biology Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Dominik Bär
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Cristiana Bersaglieri
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Rossana Aprigliano
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Raffaella Santoro
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| |
|
4
|
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Karishma Chhugani
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Yutong Chang
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Caitlin Loeffler
- Department of Computer Science, University of California, Los Angeles, CA, United States
| | - Jinyang Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Agata Muszyńska
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
| | - Viorel Munteanu
- Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
| | - Harry Yang
- Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
| | - Jeremy Rotman
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Laura Tao
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | | | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, CA, United States
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
| | - Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
| | - Paweł P. Łabaj
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Department of Biotechnology, Boku University Vienna, Vienna, Austria
| | - Serghei Mangul
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
- *Correspondence: Serghei Mangul,
| |
|
5
|
Francine P. Systems Biology: New Insight into Antibiotic Resistance. Microorganisms 2022; 10:2362. [PMID: 36557614 PMCID: PMC9781975 DOI: 10.3390/microorganisms10122362] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/15/2022] [Revised: 11/26/2022] [Accepted: 11/28/2022] [Indexed: 12/05/2022] Open
Abstract
Over the past few decades, antimicrobial resistance (AMR) has emerged as an important threat to public health, resulting from the global propagation of multidrug-resistant strains of various bacterial species. Knowledge of the intrinsic factors leading to this resistance is necessary to overcome these new strains. This has contributed to the increased use of omics technologies and their extrapolation to the system level. Understanding the mechanisms involved in antimicrobial resistance acquired by microorganisms at the system level is essential to obtain answers and explore options to combat this resistance. Therefore, the use of robust whole-genome sequencing approaches and other omics techniques such as transcriptomics, proteomics, and metabolomics provide fundamental insights into the physiology of antimicrobial resistance. To improve the efficiency of data obtained through omics approaches, and thus gain a predictive understanding of bacterial responses to antibiotics, the integration of mathematical models with genome-scale metabolic models (GEMs) is essential. In this context, here we outline recent efforts that have demonstrated that the use of omics technology and systems biology, as quantitative and robust hypothesis-generating frameworks, can improve the understanding of antibiotic resistance, and it is hoped that this emerging field can provide support for these new efforts.
Affiliation(s)
- Piubeli Francine
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Seville, 41012 Seville, Spain
| |
|
6
|
Liu X, Zhao J, Xue L, Zhao T, Ding W, Han Y, Ye H. A comparison of transcriptome analysis methods with reference genome. BMC Genomics 2022; 23:232. [PMID: 35337265 PMCID: PMC8957167 DOI: 10.1186/s12864-022-08465-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/21/2021] [Accepted: 03/08/2022] [Indexed: 11/10/2022] Open
Abstract
Background The application of RNA-seq technology has become more extensive and the number of analysis procedures available has increased over the past years. Selecting an appropriate workflow has become an important issue for researchers in the field. Methods In our study, six popular analytical procedures/pipelines were compared using four RNA-seq datasets from mouse, human, rat, and macaque, respectively. The gene expression value, fold change of gene expression, and statistical significance were evaluated to compare the similarities and differences among the six procedures. qRT-PCR was performed to validate the differentially expressed genes (DEGs) from all six procedures. Results Cufflinks-Cuffdiff demands the highest computing resources and Kallisto-Sleuth demands the least. Gene expression values, fold change, p and q values of differential expression (DE) analysis are highly correlated among procedures using HTseq for quantification. For genes with medium expression abundance, the expression values determined using the different procedures were similar. Major differences in expression values come from genes with particularly high or low expression levels. HISAT2-StringTie-Ballgown is more sensitive to genes with low expression levels, while Kallisto-Sleuth may only be useful to evaluate genes with medium to high abundance. When the same thresholds for fold change and p value are chosen in DE analysis, StringTie-Ballgown produces the fewest DEGs, while HTseq-DESeq2, -edgeR or -limma generally produces more DEGs. The performance of Cufflinks-Cuffdiff and Kallisto-Sleuth varies in different datasets. For DEGs with medium expression levels, the biological verification rates were similar among all procedures. Conclusion Results are highly correlated among RNA-seq analysis procedures using HTseq for quantification. Differences in gene expression values mainly come from genes with particularly high or low expression levels. Moreover, biological validation rates of DEGs from all six procedures were similar for genes with medium expression levels. Investigators can choose analytical procedures according to their available computer resources, or whether genes of high or low expression levels are of interest. If computer resources are abundant, one can utilize multiple procedures to obtain the intersection of results to get the most reliable DEGs, or to obtain a combination of results to get a more comprehensive DE profile for transcriptomes. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08465-0.
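The closing recommendation of this abstract (intersect DEG lists from several procedures for reliability, or combine them for coverage) amounts to two set operations. The sketch below is only an illustration of that idea; the function name, the pipeline labels used as dictionary keys, and the gene symbols are hypothetical and are not taken from the paper's data.

```python
def consensus_degs(deg_lists):
    """Combine DEG calls from several analysis procedures.

    deg_lists maps a procedure label to an iterable of gene identifiers.
    Returns (intersection, union): genes called by every procedure, and
    genes called by at least one procedure.
    """
    sets = [set(genes) for genes in deg_lists.values()]
    if not sets:
        return set(), set()
    return set.intersection(*sets), set.union(*sets)


if __name__ == "__main__":
    calls = {
        # Hypothetical DEG calls, for illustration only.
        "HTseq-DESeq2": ["GeneA", "GeneB", "GeneC"],
        "HTseq-edgeR": ["GeneA", "GeneB", "GeneD"],
        "Kallisto-Sleuth": ["GeneA", "GeneC"],
    }
    reliable, comprehensive = consensus_degs(calls)
    print(sorted(reliable))
    print(sorted(comprehensive))
```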
Affiliation(s)
- Xu Liu
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China; Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing, China
| | - Jialu Zhao
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China; Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing, China; Monogenic Disease Research Center for Neurological Disorders, Beijing Tiantan Hospital, Capital Medical University, Beijing, China; China National Clinical Research Center for Neurological Diseases, Beijing, China
| | - Liting Xue
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China; Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing, China
| | - Tian Zhao
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China; Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing, China
| | - Wei Ding
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China
| | - Yuying Han
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China; Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing, China
| | - Haihong Ye
- Department of Medical Genetics and Developmental Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, China; Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing, China
| |
|
7
|
Wang H, Feng X, Muhatai G, Wang L. Expression profile analysis of sheep ovary after superovulation and estrus synchronisation treatment. Vet Med Sci 2022; 8:1276-1287. [PMID: 35305293 PMCID: PMC9122410 DOI: 10.1002/vms3.783] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/22/2022] Open
Abstract
Superovulation is a widely used reproductive technique in livestock production, but the mechanism of superovulation in sheep is not yet clear. Here, a method of superovulation and estrus synchronisation was used to treat female Duolang sheep. After treatment, there were significant differences in serum FSH and LH levels and the number of dominant follicles between the two groups of sheep. We identified a total of 5021 differentially expressed genes (11, 13 and 15 days after treatment) and performed RT‐qPCR analysis to identify several mRNA expression levels. GO and KEGG enrichment analysis revealed that differentially expressed genes were involved in the regulation of signalling pathways of follicular development, cell cycle, material synthesis and energy metabolism; genes such as COL3A1, RPS8, ACTA2, RPL7, RPS6 and TNFAIP6 may play a key role in regulating the development of follicles. Our results show a comprehensive expression profile after superovulation and estrus synchronisation treatment. We provide the basis for further research on breeding techniques to improve the ovulation rate and birth rate of livestock.
Affiliation(s)
- Huie Wang
- College of Animal Science, Tarim University, Alar, Xinjiang, China
| | - Xinwei Feng
- College of Animal Science, Tarim University, Alar, Xinjiang, China
| | | | - Lan Wang
- College of Animal Science, Tarim University, Alar, Xinjiang, China
| |
|
8
|
Peña-Hernández R, Aprigliano R, Carina Frommel S, Pietrzak K, Steiger S, Roganowicz M, Lerra L, Bizzarro J, Santoro R. BAZ2A-mediated repression via H3K14ac-marked enhancers promotes prostate cancer stem cells. EMBO Rep 2021; 22:e53014. [PMID: 34403195 PMCID: PMC8567280 DOI: 10.15252/embr.202153014] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/06/2021] [Revised: 07/29/2021] [Accepted: 08/02/2021] [Indexed: 12/09/2022] Open
Abstract
Prostate cancer (PCa) is one of the most prevalent cancers in men. Cancer stem cells are thought to be associated with PCa relapse. Here, we show that BAZ2A is required for PCa cells with a cancer stem‐like state. BAZ2A genomic occupancy in PCa cells coincides with H3K14ac‐enriched chromatin regions. This association is mediated by BAZ2A‐bromodomain (BAZ2A‐BRD) that specifically binds H3K14ac. BAZ2A associates with inactive enhancers marked by H3K14ac and represses transcription of genes frequently silenced in aggressive and poorly differentiated PCa. BAZ2A‐mediated repression is also linked to EP300, which acetylates H3K14. BAZ2A‐BRD mutations or treatment with inhibitors abrogating BAZ2A‐BRD/H3K14ac interaction impair PCa stem cells. Furthermore, pharmacological inactivation of BAZ2A‐BRD impairs Pten‐loss oncogenic transformation of prostate organoids. Our findings indicate a role of BAZ2A‐BRD in PCa stem cell features and suggest potential epigenetic‐reader therapeutic strategies to target BAZ2A in aggressive PCa.
Affiliation(s)
- Rodrigo Peña-Hernández
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland; Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Rossana Aprigliano
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Sandra Carina Frommel
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Karolina Pietrzak
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland; Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Seraina Steiger
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Marcin Roganowicz
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland; RNA Biology Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Luigi Lerra
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland; RNA Biology Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Juliana Bizzarro
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Raffaella Santoro
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| |
|
9
|
OCTAD: an open workspace for virtually screening therapeutics targeting precise cancer patient groups using gene expression features. Nat Protoc 2020; 16:728-753. [PMID: 33361798 DOI: 10.1038/s41596-020-00430-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 11/14/2019] [Accepted: 09/28/2020] [Indexed: 12/20/2022]
Abstract
As the field of precision medicine progresses, treatments for patients with cancer are starting to be tailored to their molecular as well as their clinical features. The emerging cancer subtypes defined by these molecular features require that dedicated resources be used to assist the discovery of drug candidates for preclinical evaluation. Voluminous gene expression profiles of patients with cancer have been accumulated in public databases, enabling the creation of cancer-specific expression signatures. Meanwhile, large-scale gene expression profiles of cellular responses to chemical compounds have also recently become available. By matching the cancer-specific expression signature to compound-induced gene expression profiles from large drug libraries, researchers can prioritize small molecules that present high potency to reverse expression of signature genes for further experimental testing of their efficacy. This approach has proven to be an efficient and cost-effective way to identify efficacious drug candidates. However, the success of this approach requires multiscale procedures, imposing considerable challenges to many labs. To address this, we developed Open Cancer TherApeutic Discovery (OCTAD; http://octad.org): an open workspace for virtually screening compounds targeting precise groups of patients with cancer using gene expression features. Its database includes 19,127 patient tissue samples covering more than 50 cancer types and expression profiles for 12,442 distinct compounds. The program is used to perform deep-learning-based reference tissue selection, disease gene expression signature creation, drug reversal potency scoring and in silico validation. OCTAD is available as a web portal and a standalone R package to allow experimental and computational scientists to easily navigate the tool.
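The signature-matching step described in this abstract can be sketched generically. The abstract does not state how OCTAD computes its reversal potency score, so the example below substitutes a plain Spearman rank correlation between a disease signature and each compound-induced profile; a more negative correlation is read as stronger reversal. All function names, compound labels, and numeric values are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, averaging ranks across tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (vx * vy) ** 0.5


def rank_compounds(signature, profiles):
    """Sort compounds from most negative (most reversing) correlation up.

    signature: disease log fold changes, one per signature gene.
    profiles: compound name -> compound-induced changes, same gene order.
    """
    scores = {name: spearman(signature, p) for name, p in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    signature = [2.0, 1.0, -1.0, -2.0]            # hypothetical disease signature
    profiles = {
        "cmpd_A": [-1.5, -0.5, 0.4, 1.2],         # opposes the signature
        "cmpd_B": [1.0, 0.5, -0.2, -0.9],         # mimics the signature
    }
    for name, score in rank_compounds(signature, profiles):
        print(name, round(score, 3))
```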
|
10
|
Dalcher D, Tan JY, Bersaglieri C, Peña‐Hernández R, Vollenweider E, Zeyen S, Schmid MW, Bianchi V, Butz S, Roganowicz M, Kuzyakiv R, Baubec T, Marques AC, Santoro R. BAZ2A safeguards genome architecture of ground-state pluripotent stem cells. EMBO J 2020; 39:e105606. [PMID: 33433018 PMCID: PMC7705451 DOI: 10.15252/embj.2020105606] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/12/2020] [Revised: 08/31/2020] [Accepted: 09/02/2020] [Indexed: 12/30/2022] Open
Abstract
Chromosomes have an intrinsic tendency to segregate into compartments, forming long-distance contacts between loci of similar chromatin states. How genome compartmentalization is regulated remains elusive. Here, comparison of mouse ground-state embryonic stem cells (ESCs) characterized by open and active chromatin, and advanced serum ESCs with a more closed and repressed genome, reveals distinct regulation of their genome organization due to differential dependency on BAZ2A/TIP5, a component of the chromatin remodeling complex NoRC. On ESC chromatin, BAZ2A interacts with SNF2H, DNA topoisomerase 2A (TOP2A) and cohesin. BAZ2A associates with chromatin sub-domains within the active A compartment, which intersect through long-range contacts. We found that ground-state chromatin selectively requires BAZ2A to limit the invasion of active domains into repressive compartments. BAZ2A depletion increases chromatin accessibility at B compartments. Furthermore, BAZ2A regulates H3K27me3 genome occupancy in a TOP2A-dependent manner. Finally, ground-state ESCs require BAZ2A for growth, differentiation, and correct expression of developmental genes. Our results uncover the propensity of open chromatin domains to invade repressive domains, which is counteracted by chromatin remodeling to establish genome partitioning and preserve cell identity.
Affiliation(s)
- Damian Dalcher
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Jennifer Yihong Tan
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
| | - Cristiana Bersaglieri
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Rodrigo Peña‐Hernández
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Eva Vollenweider
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Stefan Zeyen
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Marc W Schmid
- Service and Support for Science IT, University of Zurich, Zurich, Switzerland
| | - Valerio Bianchi
- Oncode Institute, Hubrecht Institute‐KNAW, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Stefan Butz
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Marcin Roganowicz
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
| | - Rostyslav Kuzyakiv
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
- Service and Support for Science IT, University of Zurich, Zurich, Switzerland
| | - Tuncay Baubec
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| | - Ana Claudia Marques
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
| | - Raffaella Santoro
- Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, Zurich, Switzerland
| |
|
11
|
Eghbalnia HR, Wilfinger WW, Mackey K, Chomczynski P. Coordinated analysis of exon and intron data reveals novel differential gene expression changes. Sci Rep 2020; 10:15669. [PMID: 32973253 PMCID: PMC7515875 DOI: 10.1038/s41598-020-72482-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/03/2020] [Accepted: 08/24/2020] [Indexed: 12/14/2022]
Abstract
RNA-Seq expression analysis currently relies primarily upon exon expression data. The recognized role of introns during translation, and the presence of substantial RNA-Seq counts attributable to introns, provide the rationale for the simultaneous consideration of both exon and intron data. We describe here a method for the coordinated analysis of exon and intron data by investigating their relationship within individual genes and across samples, while taking into account changes in both variability and expression level. This coordinated analysis of exon and intron data offers strong evidence for significant differences that distinguish the profiles of the exon-only expression data from the combined exon and intron data. One advantage of our proposed method, called matched change characterization for exons and introns (MEI), is its straightforward applicability to existing archived data using small modifications to standard RNA-Seq pipelines. Using MEI, we demonstrate that when data are examined for changes in variability across control and case conditions, novel differential changes can be detected. Notably, when MEI criteria were employed in the analysis of an archived data set involving polyarthritic subjects, the number of differentially expressed genes was expanded by sevenfold. More importantly, the observed changes in exon and intron variability with statistically significant false discovery rates could be traced to specific immune pathway gene networks. The application of MEI analysis provides a strategy for incorporating the significance of exon and intron variability and further developing the role of using both exons and intron sequencing counts in studies of gene regulatory processes.
Affiliation(s)
- Hamid R Eghbalnia
- University of Wisconsin-Madison, Madison, USA; University of Cincinnati, Cincinnati, USA.
| | | | - Karol Mackey
- Molecular Research Center, Inc., Cincinnati, USA
|
12
|
Nikolaou KC, Vatandaslar H, Meyer C, Schmid MW, Tuschl T, Stoffel M. The RNA-Binding Protein A1CF Regulates Hepatic Fructose and Glycerol Metabolism via Alternative RNA Splicing. Cell Rep 2020; 29:283-300.e8. [PMID: 31597092 DOI: 10.1016/j.celrep.2019.08.100] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/25/2019] [Revised: 08/09/2019] [Accepted: 08/29/2019] [Indexed: 01/11/2023]
Abstract
The regulation of hepatic gene expression has been extensively studied at the transcriptional level; however, the control of metabolism through posttranscriptional gene regulation by RNA-binding proteins in physiological and disease states is less understood. Here, we report a major role for the hormone-sensitive RNA-binding protein (RBP) APOBEC1 complementation factor (A1CF) in the generation of hepatocyte-specific and alternatively spliced transcripts. Among these transcripts are isoforms for the dominant and high-affinity fructose-metabolizing ketohexokinase C and glycerol kinase, two key metabolic enzymes that are linked to hepatic gluconeogenesis and found to be markedly reduced upon hepatic ablation of A1cf. Consequently, mice lacking A1CF exhibit improved glucose tolerance and are protected from fructose-induced hyperglycemia, hepatic steatosis, and development of obesity. Our results identify a previously unreported function of A1CF as a regulator of alternative splicing of a subset of genes influencing hepatic glucose production through fructose and glycerol metabolism.
Affiliation(s)
- Kostas C Nikolaou
- Institute of Molecular Health Sciences, ETH Zurich, Otto-Stern-Weg 7, 8093 Zürich, Switzerland
| | - Hasan Vatandaslar
- Institute of Molecular Health Sciences, ETH Zurich, Otto-Stern-Weg 7, 8093 Zürich, Switzerland
| | - Cindy Meyer
- Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA
| | - Marc W Schmid
- MWSchmid GmbH, Möhrlistrasse 25, 8006 Zurich, Switzerland
| | - Thomas Tuschl
- Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA
| | - Markus Stoffel
- Institute of Molecular Health Sciences, ETH Zurich, Otto-Stern-Weg 7, 8093 Zürich, Switzerland; Medical Faculty, University of Zurich, 8091 Zurich, Switzerland.
|
13
|
Deschamps-Francoeur G, Simoneau J, Scott MS. Handling multi-mapped reads in RNA-seq. Comput Struct Biotechnol J 2020; 18:1569-1576. [PMID: 32637053 PMCID: PMC7330433 DOI: 10.1016/j.csbj.2020.06.014] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/29/2020] [Revised: 06/06/2020] [Accepted: 06/07/2020] [Indexed: 11/07/2022]
Abstract
Many eukaryotic genomes harbour large numbers of duplicated sequences, of diverse biotypes, resulting from several mechanisms including recombination, whole genome duplication and retro-transposition. Such repeated sequences complicate gene/transcript quantification during RNA-seq analysis due to reads mapping to more than one locus, sometimes involving genes embedded in other genes. Genes of different biotypes have dissimilar levels of sequence duplication, with long-noncoding RNAs and messenger RNAs sharing less sequence similarity to other genes than biotypes encoding shorter RNAs. Many strategies have been elaborated to handle these multi-mapped reads, resulting in increased accuracy in gene/transcript quantification, although separate tools are typically used to estimate the abundance of short and long genes due to their dissimilar characteristics. This review discusses the mechanisms leading to sequence duplication, the biotypes affected, the computational strategies employed to deal with multi-mapped reads and the challenges that still remain to be overcome.
Affiliation(s)
- Gabrielle Deschamps-Francoeur
- Département de Biochimie et Génomique Fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Joël Simoneau
- Département de Biochimie et Génomique Fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Michelle S. Scott
- Département de Biochimie et Génomique Fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
|
14
|
Liao Y, Smyth GK, Shi W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res 2019; 47:e47. [PMID: 30783653 PMCID: PMC6486549 DOI: 10.1093/nar/gkz114] [Citation(s) in RCA: 1678] [Impact Index Per Article: 279.7] [Received: 07/25/2018] [Revised: 01/02/2019] [Accepted: 02/13/2019] [Indexed: 11/29/2022]
Abstract
We present Rsubread, a Bioconductor software package that provides high-performance alignment and read counting functions for RNA-seq reads. Rsubread is based on the successful Subread suite with the added ease-of-use of the R programming environment, creating a matrix of read counts directly as an R object ready for downstream analysis. It integrates read mapping and quantification in a single package and has no software dependencies other than R itself. We demonstrate Rsubread’s ability to detect exon–exon junctions de novo and to quantify expression at the level of either genes, exons or exon junctions. The resulting read counts can be input directly into a wide range of downstream statistical analyses using other Bioconductor packages. Using SEQC data and simulations, we compare Rsubread to TopHat2, STAR and HTSeq as well as to counting functions in the Bioconductor infrastructure packages. We consider the performance of these tools on the combined quantification task starting from raw sequence reads through to summary counts, and in particular evaluate the performance of different combinations of alignment and counting algorithms. We show that Rsubread is faster and uses less memory than competitor tools and produces read count summaries that more accurately correlate with true values.
Affiliation(s)
- Yang Liao
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia; Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Gordon K Smyth
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia; School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Wei Shi
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia; School of Computing and Information Systems, The University of Melbourne, Parkville, Victoria 3010, Australia
|
15
|
Caldelari R, Dogga S, Schmid MW, Franke-Fayard B, Janse CJ, Soldati-Favre D, Heussler V. Transcriptome analysis of Plasmodium berghei during exo-erythrocytic development. Malar J 2019; 18:330. [PMID: 31551073 PMCID: PMC6760107 DOI: 10.1186/s12936-019-2968-7] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 06/14/2019] [Accepted: 09/17/2019] [Indexed: 12/18/2022]
Abstract
BACKGROUND: The complex life cycle of malaria parasites requires well-orchestrated stage specific gene expression. In the vertebrate host the parasites grow and multiply by schizogony in two different environments: within erythrocytes and within hepatocytes. Whereas erythrocytic parasites are well-studied in this respect, relatively little is known about the exo-erythrocytic stages. METHODS: In an attempt to fill this gap, genome wide RNA-seq analyses of various exo-erythrocytic stages of Plasmodium berghei including sporozoites, samples from a time-course of liver stage development and detached cells were performed. These latter contain infectious merozoites and represent the final step in exo-erythrocytic development. RESULTS: The analysis represents the complete transcriptome of the entire life cycle of P. berghei parasites with temporal detailed analysis of the liver stage allowing comparison of gene expression across the progression of the life cycle. These RNA-seq data from different developmental stages were used to cluster genes with similar expression profiles, in order to infer their functions. A comparison with published data from other parasite stages confirmed stage-specific gene expression and revealed numerous genes that are expressed differentially in blood and exo-erythrocytic stages. One of the most exo-erythrocytic stage-specific genes was PBANKA_1003900, which has previously been annotated as a "gametocyte specific protein". The promoter of this gene drove high GFP expression in exo-erythrocytic stages, confirming its expression profile seen by RNA-seq. CONCLUSIONS: The comparative analysis of the genome wide mRNA expression profiles of erythrocytic and different exo-erythrocytic stages could be used to improve the understanding of gene regulation in Plasmodium parasites and can be used to model exo-erythrocytic stage metabolic networks toward the identification of differences in metabolic processes during schizogony in erythrocytes and hepatocytes.
Affiliation(s)
- Reto Caldelari
- Institute of Cell Biology, University of Bern, Bern, Switzerland.
| | - Sunil Dogga
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva CMU, Geneva, Switzerland
| | | | - Blandine Franke-Fayard
- Leiden Malaria Research Group, Department of Parasitology, Leiden University Medical Center, Leiden, The Netherlands
| | - Chris J Janse
- Leiden Malaria Research Group, Department of Parasitology, Leiden University Medical Center, Leiden, The Netherlands
| | - Dominique Soldati-Favre
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva CMU, Geneva, Switzerland
| | - Volker Heussler
- Institute of Cell Biology, University of Bern, Bern, Switzerland.
|
16
|
Krattinger SG, Kang J, Bräunlich S, Boni R, Chauhan H, Selter LL, Robinson MD, Schmid MW, Wiederhold E, Hensel G, Kumlehn J, Sucher J, Martinoia E, Keller B. Abscisic acid is a substrate of the ABC transporter encoded by the durable wheat disease resistance gene Lr34. The New Phytologist 2019; 223:853-866. [PMID: 30913300 PMCID: PMC6618152 DOI: 10.1111/nph.15815] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 12/08/2018] [Accepted: 03/20/2019] [Indexed: 05/10/2023]
Abstract
The wheat Lr34res allele, coding for an ATP-binding cassette transporter, confers durable resistance against multiple fungal pathogens. The Lr34sus allele, differing from Lr34res by two critical nucleotide polymorphisms, is found in susceptible wheat cultivars. Lr34res is functionally transferrable as a transgene into all major cereals, including rice, barley, maize, and sorghum. Here, we used transcriptomics, physiology, genetics, and in vitro and in vivo transport assays to study the molecular function of Lr34. We report that Lr34res results in a constitutive induction of transcripts reminiscent of an abscisic acid (ABA)-regulated response in transgenic rice. Lr34-expressing rice was altered in biological processes that are controlled by this phytohormone, including dehydration tolerance, transpiration and seedling growth. In planta seedling and in vitro yeast accumulation assays revealed that both LR34res and LR34sus act as ABA transporters. However, whereas the LR34res protein was detected in planta the LR34sus version was not, suggesting a post-transcriptional regulatory mechanism. Our results identify ABA as a substrate of the LR34 ABC transporter. We conclude that LR34res-mediated ABA redistribution has a major effect on the transcriptional response and physiology of Lr34res-expressing plants and that ABA is a candidate molecule that contributes to Lr34res-mediated disease resistance.
Affiliation(s)
- Simon G. Krattinger
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Biological and Environmental Science & Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Joohyun Kang
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Stephanie Bräunlich
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Rainer Boni
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Harsh Chauhan
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Liselotte L. Selter
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Mark D. Robinson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Marc W. Schmid
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Elena Wiederhold
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Goetz Hensel
- Plant Reproductive Biology, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland/OT, Gatersleben, Germany
| | - Jochen Kumlehn
- Plant Reproductive Biology, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland/OT, Gatersleben, Germany
| | - Justine Sucher
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Enrico Martinoia
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Beat Keller
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
|
17
|
Grob S, Grossniklaus U. Invasive DNA elements modify the nuclear architecture of their insertion site by KNOT-linked silencing in Arabidopsis thaliana. Genome Biol 2019; 20:120. [PMID: 31186073 PMCID: PMC6560877 DOI: 10.1186/s13059-019-1722-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 11/30/2018] [Accepted: 05/22/2019] [Indexed: 01/05/2023]
Abstract
BACKGROUND: The three-dimensional (3D) organization of chromosomes is linked to epigenetic regulation and transcriptional activity. However, only few functional features of 3D chromatin architecture have been described to date. The KNOT is a 3D chromatin structure in Arabidopsis, comprising 10 interacting genomic regions termed KNOT ENGAGED ELEMENTs (KEEs). KEEs are enriched in transposable elements and associated small RNAs, suggesting a function in transposon biology. RESULTS: Here, we report the KNOT's involvement in regulating invasive DNA elements. Transgenes can specifically interact with the KNOT, leading to perturbations of 3D nuclear organization, which correlates with the transgene's expression: high KNOT interaction frequencies are associated with transgene silencing. KNOT-linked silencing (KLS) cannot readily be connected to canonical silencing mechanisms, such as RNA-directed DNA methylation and post-transcriptional gene silencing, as both cytosine methylation and small RNA abundance do not correlate with KLS. Furthermore, KLS exhibits paramutation-like behavior, as silenced transgenes can lead to the silencing of active transgenes in trans. CONCLUSION: Transgene silencing can be connected to a specific feature of Arabidopsis 3D nuclear organization, namely the KNOT. KLS likely acts either independent of or prior to canonical silencing mechanisms, such that its characterization not only contributes to our understanding of chromosome folding but also provides valuable insights into how genomes are defended against invasive DNA elements.
Affiliation(s)
- Stefan Grob
- Department of Plant and Microbial Biology & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland.
| | - Ueli Grossniklaus
- Department of Plant and Microbial Biology & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland.
|
18
|
Jin Y, Chen G, Xiao W, Hong H, Xu J, Guo Y, Xiao W, Shi T, Shi L, Tong W, Ning B. Sequencing XMET genes to promote genotype-guided risk assessment and precision medicine. Science China Life Sciences 2019; 62:895-904. [PMID: 31114935 DOI: 10.1007/s11427-018-9479-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/05/2018] [Accepted: 12/06/2018] [Indexed: 12/26/2022]
Abstract
High-throughput next-generation sequencing (NGS) is a massively parallel shotgun approach in which the genome is fragmented, sequenced in small pieces, and then analyzed either by alignment to a known reference genome or by de novo assembly without a reference genome. This technology has led to an explosion of sequencing-related projects across multidisciplinary fields of science. However, owing to the limitations of sequencing chemistry, the length of sequencing reads, and the complexity of genes, it is difficult to determine the sequences of some portions of the human genome, leaving gaps in genomic data that frustrate further analysis. In particular, some complex genes are difficult to sequence or map accurately because they contain high-GC and/or low-complexity regions and have complicated pseudogenes; examples include the genes encoding xenobiotic metabolizing enzymes and transporters (XMETs). Genetic variants in XMET genes are critical for predicting inter-individual variability in drug efficacy, drug safety, and susceptibility to environmental toxicity. We summarize and discuss the challenges, wet-lab methods, and bioinformatics algorithms involved in sequencing "complex" XMET genes, which may provide useful information for applying NGS technology in toxicogenomics and pharmacogenomics.
Affiliation(s)
- Yaqiong Jin
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, 100045, China
| | - Geng Chen
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Wenming Xiao
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Joshua Xu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Yongli Guo
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, 100045, China
| | - Wenzhong Xiao
- Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
| | - Tieliu Shi
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Cancer Center; Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, 200433, China
| | - Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Baitang Ning
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA.
|
19
|
Hou JY, Wu HY, He RQ, Lin P, Dang YW, Chen G. Clinical and prognostic value of chaperonin containing T-complex 1 subunit 3 in hepatocellular carcinoma: A Study based on microarray and RNA-sequencing with 4272 cases. Pathol Res Pract 2018; 215:177-194. [PMID: 30473171 DOI: 10.1016/j.prp.2018.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/28/2018] [Revised: 10/25/2018] [Accepted: 11/06/2018] [Indexed: 12/27/2022]
Abstract
Liver cancer is one of the few tumors with a steadily increasing morbidity and mortality; hepatocellular carcinoma (HCC) is the most common type of primary liver cancer. We combined the expression profiles of Chaperonin Containing T-complex 1 Subunit 3 (CCT3) in HCC tissues based on microarray and RNA-sequencing data. The CCT3 expression levels were extracted and examined based on 421 samples from The Cancer Genome Atlas (TCGA) (HCC, n = 371; non-HCC, n = 50) and 3851 samples from 31 microarray or RNA-sequencing datasets (HCC, n = 1975; non-tumor = 1876). We used a variety of meta-analytic methods, including SMD forest maps, sensitivity analysis, subgroup analysis and sROC curves, to confirm the final results. Meanwhile, database-derived immunohistochemistry data was used for validation. We also further explained the potential mechanism of CCT3 in HCC through signal pathway analyses and PPI network construction with the CCT3 co-expressed genes. The mRNA and protein expression of CCT3 in HCC tissues were higher than in non-HCC tissues. The expression of CCT3 differed between groups when grouped according to clinicopathological parameters, such as race, family history, and histological grade. The results of standardised mean difference (SMD) forest map and summary receiver operating characteristic (sROC) curve revealed that CCT3 was highly expressed in HCC tissues and had a high ability to distinguish between cancer tissues and non-cancer tissues. The main form of CCT3 gene alteration in HCC was mRNA up-regulation and amplification (23%), and the most common mutation type was missense. The mRNA expression of CCT3 in HCC was negatively correlated with DNA methylation. According to the Kyoto Encyclopedia of Genes and Genomes pathway analysis, CCT3 can influence HCC occurrence and development through cell cycle and DNA replication pathways. In summary, this study carries out the staging and prognostic analysis of HCC. 
It suggests that CCT3 might play an important part in the tumorigenesis and progression of HCC and may have a certain prognostic value in HCC. Moreover, CCT3 might represent a promising biomarker for HCC.
Affiliation(s)
- Jia-Yin Hou
- Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region 530021, China
| | - Hua-Yu Wu
- Department of Cell Biology and Genetics, School of Preclinical Medicine, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region 530021, China
| | - Rong-Quan He
- Department of Medical Oncology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region 530021, China
| | - Peng Lin
- Department of Ultrasonography, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region 530021, China
| | - Yi-Wu Dang
- Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region 530021, China
| | - Gang Chen
- Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region 530021, China.
|
20
|
Lippuner C, Ramakrishnan C, Basso WU, Schmid MW, Okoniewski M, Smith NC, Hässig M, Deplazes P, Hehl AB. RNA-Seq analysis during the life cycle of Cryptosporidium parvum reveals significant differential gene expression between proliferating stages in the intestine and infectious sporozoites. Int J Parasitol 2018; 48:413-422. [DOI: 10.1016/j.ijpara.2017.10.007] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 07/25/2017] [Revised: 10/06/2017] [Accepted: 10/21/2017] [Indexed: 10/18/2022]
|
21
|
Schmid MW. RNA-Seq Data Analysis Protocol: Combining In-House and Publicly Available Data. Methods Mol Biol 2017; 1669:309-335. [PMID: 28936668 DOI: 10.1007/978-1-4939-7286-9_24] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]
Abstract
Comparing gene expression profiles measured in a wide range of different tissue types, at different developmental stages, or under different environmental conditions can yield valuable insights into the mechanisms of cell/tissue specification and differentiation, or identify cell/tissue-type specific responses to environmental stimuli. Critical for such comparisons is the identical processing of data from different sources. This may also include the integration of a novel data set into an existing collection of data sets (e.g., in-house and publicly available data). Here, I describe a complete workflow for RNA-Seq data, from data processing steps to the comparison of gene expression profiles measured with RNA-Seq. I use publicly available data for demonstration purposes, but I also describe how to integrate your own data sets. The workflow runs on all three major operating systems (Linux, MacOS, and Windows). The scripts and the tutorial can be accessed on github.com/MWSchmid/RNAseq_protocol.
Affiliation(s)
- Marc W Schmid
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057, Zürich, Switzerland; Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, 8008, Zürich, Switzerland; URPP Global Change and Biodiversity, University of Zurich, Winterthurerstrasse 190, 8057, Zürich, Switzerland; S3IT, University of Zurich, Winterthurerstrasse 190, 8057, Zürich, Switzerland.
|
22
|
Consiglio A, Mencar C, Grillo G, Marzano F, Caratozzolo MF, Liuni S. A fuzzy method for RNA-Seq differential expression analysis in presence of multireads. BMC Bioinformatics 2016; 17:345. [PMID: 28185579 PMCID: PMC5123383 DOI: 10.1186/s12859-016-1195-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 01/17/2023]
Abstract
Background: When the reads obtained from high-throughput RNA sequencing are mapped against a reference database, a significant proportion of them, known as multireads, can map to more than one reference sequence. These multireads originate from gene duplications, repetitive regions or overlapping genes. Removing the multireads from the mapping results in RNA-Seq analyses causes an underestimation of the read counts, while estimating the real read count can lead to false positives during the detection of differentially expressed sequences.
Results: We present an innovative approach to deal with multireads and evaluate differential expression events, entirely based on fuzzy set theory. Since multireads cause uncertainty in the estimation of read counts during gene expression computation, they can also influence the reliability of differential expression analysis results by producing false positives. Our method manages the uncertainty in gene expression estimation by defining fuzzy read counts, and evaluates the possibility that a gene is differentially expressed using three fuzzy concepts: over-expression, same-expression and under-expression. The output of the method is a list of differentially expressed genes enriched with information about the uncertainty of the results due to the presence of multireads. We have tested the method on RNA-Seq data designed for case-control studies and compared the results with other existing tools for read count estimation and differential expression analysis.
Conclusions: Managing multireads with fuzzy sets yields a list of differential expression events that takes into account the uncertainty in the results caused by the presence of multireads. Biologists can use this additional information when selecting the most relevant differential expression events to validate with laboratory assays. Our method can be used to compute reliable differential expression events and to highlight possible false positives in the lists of differentially expressed genes computed with other tools.
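To make the idea of a fuzzy read count concrete, the following is a minimal sketch and not the authors' implementation. It assumes a triangular fuzzy number per gene: the lower bound is the unique read count, the upper bound adds every multiread that touches the gene, and the core adds a share of each multiread proportional to unique evidence. The names `FuzzyCount` and `fuzzy_counts` and the triangular shape are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyCount:
    """Hypothetical triangular fuzzy read count for one gene."""
    low: float   # reads mapping uniquely to the gene
    core: float  # unique reads plus a proportional share of multireads
    high: float  # unique reads plus every multiread touching the gene

    def membership(self, x: float) -> float:
        """Degree (0..1) to which x is a plausible read count."""
        if x < self.low or x > self.high:
            return 0.0
        if x == self.core:
            return 1.0
        if x < self.core:
            return (x - self.low) / (self.core - self.low)
        return (self.high - x) / (self.high - self.core)


def fuzzy_counts(unique, multireads):
    """unique: gene -> unique read count; multireads: list of gene sets,
    one set per multiread, naming the genes that read maps to."""
    counts = {}
    for gene, u in unique.items():
        share = 0.0
        touching = 0
        for genes in multireads:
            if gene in genes:
                touching += 1
                total = sum(unique.get(g, 0) for g in genes)
                # Split by unique evidence; fall back to an even split
                # when none of the candidate genes has unique reads.
                share += u / total if total else 1 / len(genes)
        counts[gene] = FuzzyCount(u, u + share, u + touching)
    return counts
```

With `unique = {"A": 30, "B": 10}` and four multireads shared by A and B, gene B gets the fuzzy count (10, 11, 14): the width of the interval expresses how much of its estimated count depends on ambiguous reads.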
Affiliation(s)
- Arianna Consiglio
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
- Corrado Mencar
  - Department of Informatics, University of Bari Aldo Moro, Bari, 70121, Italy
- Giorgio Grillo
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
- Flaviana Marzano
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
- Sabino Liuni
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
|
23
|
Abstract
High-throughput sequencing of small RNAs (sRNA-seq) is a popular method used to discover and annotate microRNAs (miRNAs), endogenous short interfering RNAs (siRNAs), and Piwi-associated RNAs (piRNAs). One of the key steps in sRNA-seq data analysis is alignment to a reference genome. sRNA-seq libraries often have a high proportion of reads that align to multiple genomic locations, which makes determining their true origins difficult. Commonly used sRNA-seq alignment methods result in either very low precision (choosing an alignment at random) or very low sensitivity (ignoring multi-mapping reads). Here, we describe and test an sRNA-seq alignment strategy that uses local genomic context to guide decisions on proper placements of multi-mapped sRNA-seq reads. Tests using simulated sRNA-seq data from three different plant genomes demonstrated that this local-weighting method outperforms other alignment strategies. Experimental analyses with real sRNA-seq data also indicate superior performance of local-weighting methods for both plant miRNAs and heterochromatic siRNAs. The local-weighting methods we have developed are implemented as part of the sRNA-seq analysis program ShortStack, which is freely available under a general public license. Improved genome alignments of sRNA-seq data should increase the quality of downstream analyses and genome annotation efforts.
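The local-weighting idea can be illustrated with a short sketch. This is not ShortStack's code: it assumes that each candidate location of a multi-mapped read is weighted by the number of uniquely mapped reads within a fixed window around it, and the function names and the 50 bp default window are hypothetical.

```python
from bisect import bisect_left, bisect_right


def local_unique_density(unique_positions, pos, window=50):
    """Count uniquely mapped reads within +/- window bp of pos.

    unique_positions must be a sorted list of read start coordinates.
    """
    lo = bisect_left(unique_positions, pos - window)
    hi = bisect_right(unique_positions, pos + window)
    return hi - lo


def placement_probabilities(candidates, unique_by_chrom, window=50):
    """Return one probability per candidate (chrom, pos) location.

    Each candidate is weighted by the local density of uniquely mapped
    reads; with no local evidence anywhere, the split is uniform.
    """
    weights = [
        local_unique_density(unique_by_chrom.get(chrom, []), pos, window)
        for chrom, pos in candidates
    ]
    total = sum(weights)
    if total == 0:
        return [1 / len(candidates)] * len(candidates)
    return [w / total for w in weights]
```

A read could then be assigned to its most probable location or sampled according to these probabilities; which of the two is preferable depends on whether downstream analysis needs a single placement per read.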
|
25
|
Schuierer S, Roma G. The exon quantification pipeline (EQP): a comprehensive approach to the quantification of gene, exon and junction expression from RNA-seq data. Nucleic Acids Res 2016; 44:e132. [PMID: 27302131 PMCID: PMC5027495 DOI: 10.1093/nar/gkw538] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 09/15/2015] [Accepted: 06/04/2016] [Indexed: 01/24/2023]
Abstract
The quantification of transcriptomic features is the basis of the analysis of RNA-seq data. We present an integrated alignment workflow and a simple counting-based approach to derive estimates for gene, exon and exon–exon junction expression. In contrast to previous counting-based approaches, EQP takes into account only reads whose alignment pattern agrees with the splicing pattern of the features of interest. This leads to improved gene expression estimates as well as to the generation of exon counts that allow disambiguating reads between overlapping exons. Unlike other methods that quantify skipped introns, EQP offers a novel way to compute junction counts based on the agreement of the read alignments with the exons on both sides of the junction, thus providing a uniformly derived set of counts. We evaluated the performance of EQP on both simulated and real Illumina RNA-seq data and compared it with other quantification tools. Our results suggest that EQP provides superior gene expression estimates and we illustrate the advantages of EQP's exon and junction counts. The provision of uniformly derived high-quality counts makes EQP an ideal quantification tool for differential expression and differential splicing studies. EQP is freely available for download at https://github.com/Novartis/EQP-cluster.
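As a rough illustration of counting junction reads by their agreement with the exons on both sides, consider the sketch below. It is not taken from EQP: it assumes reads are given as sorted lists of half-open aligned blocks `(start, end)`, exons as `(start, end)` pairs, and a read supports a junction only if one block ends exactly at the donor exon's end, the next starts exactly at the acceptor exon's start, and both blocks lie inside their exons. All names are hypothetical.

```python
def supports_junction(blocks, donor_exon, acceptor_exon):
    """True if two consecutive aligned blocks of a read span the junction
    and each block stays within the exon on its side."""
    for (s1, e1), (s2, e2) in zip(blocks, blocks[1:]):
        if (
            e1 == donor_exon[1]          # block ends at the donor boundary
            and s2 == acceptor_exon[0]   # next block starts at the acceptor
            and s1 >= donor_exon[0]      # upstream block lies inside donor
            and e2 <= acceptor_exon[1]   # downstream block inside acceptor
        ):
            return True
    return False


def junction_count(reads, donor_exon, acceptor_exon):
    """Number of reads whose alignment agrees with the junction."""
    return sum(supports_junction(b, donor_exon, acceptor_exon) for b in reads)
```

Reads that skip to a different acceptor site, run past an exon boundary, or are unspliced contribute nothing under this rule, which is the sense in which only alignments consistent with the annotated splicing pattern are counted.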
Affiliation(s)
- Sven Schuierer
  - Novartis Institutes for Biomedical Research, CH-4056 Basel, Switzerland
- Guglielmo Roma
  - Novartis Institutes for Biomedical Research, CH-4056 Basel, Switzerland
|
26
|
Schmid MW, Schmidt A, Grossniklaus U. The female gametophyte: an emerging model for cell type-specific systems biology in plant development. Front Plant Sci 2015; 6:907. [PMID: 26579157 PMCID: PMC4630298 DOI: 10.3389/fpls.2015.00907] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 07/09/2015] [Accepted: 10/10/2015] [Indexed: 05/03/2023]
Abstract
Systems biology, a holistic approach describing a system emerging from the interactions of its molecular components, critically depends on accurate qualitative determination and quantitative measurements of these components. Development and improvement of large-scale profiling methods ("omics") now facilitate comprehensive measurements of many relevant molecules. For multicellular organisms, such as animals, fungi, algae, and plants, the complexity of the system is augmented by the presence of specialized cell types and organs, and a complex interplay within and between them. Cell type-specific analyses are therefore crucial for the understanding of developmental processes and environmental responses. This review first gives an overview of current methods used for large-scale profiling of specific cell types, exemplified by recent advances in plant biology. The focus then lies on suitable model systems to study plant development and cell type specification. We introduce the female gametophyte of flowering plants as an ideal model to study fundamental developmental processes. Moreover, the female reproductive lineage is of importance for the emergence of evolutionary novelties such as an unequal parental contribution to the tissue nurturing the embryo or the clonal production of seeds by asexual reproduction (apomixis). Understanding these processes is not only interesting from a developmental or evolutionary perspective, but also bears great potential for further crop improvement and the simplification of breeding efforts. We finally highlight novel methods that are already available or will likely soon facilitate large-scale profiling of the specific cell types of the female gametophyte in both model and non-model species. We conclude that it may take only a few years until an evolutionary systems biology approach toward female gametogenesis deciphers some of its biologically most interesting and economically most valuable processes.
Affiliation(s)
- Ueli Grossniklaus
  - Department of Plant & Microbial Biology and Zurich-Basel Plant Science Center, University of Zurich, Zurich, Switzerland
|
27
|
Spies D, Ciaudo C. Dynamics in Transcriptomics: Advancements in RNA-seq Time Course and Downstream Analysis. Comput Struct Biotechnol J 2015; 13:469-77. [PMID: 26430493 PMCID: PMC4564389 DOI: 10.1016/j.csbj.2015.08.004] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 06/01/2015] [Revised: 08/05/2015] [Accepted: 08/07/2015] [Indexed: 12/17/2022]
Abstract
Analysis of gene expression has contributed to a plethora of biological and medical research studies. Microarrays have been intensively used for the profiling of gene expression during diverse developmental processes, treatments and diseases. New massively parallel sequencing methods, often referred to as RNA-sequencing (RNA-seq), are extensively improving our understanding of gene regulation and signaling networks. Computational methods developed originally for microarray analysis can now be optimized and applied to genome-wide studies to gain a better understanding of the whole transcriptome. This review addresses current challenges in RNA-seq analysis and specifically focuses on new bioinformatics tools developed for time series experiments. Furthermore, possible improvements in analysis and data integration, as well as future applications of differential expression analysis, are discussed.
Affiliation(s)
- Daniel Spies
  - Swiss Federal Institute of Technology Zurich, Department of Biology, Institute of Molecular Health Sciences, Zurich, Otto-Stern Weg 7, 8093 Zurich, Switzerland
  - Life Science Zurich Graduate School, Molecular Life Science Program, University of Zurich, Institute of Molecular Life Sciences, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Constance Ciaudo
  - Swiss Federal Institute of Technology Zurich, Department of Biology, Institute of Molecular Health Sciences, Zurich, Otto-Stern Weg 7, 8093 Zurich, Switzerland
|