1
Yu P, Li J, Deng SP, Zhang F, Grozdanov PN, Chin EWM, Martin SD, Vergnes L, Islam MS, Sun D, LaSalle JM, McGee SL, Goh E, MacDonald CC, Jin P. Integrated analysis of a compendium of RNA-Seq datasets for splicing factors. Sci Data 2020; 7:178. [PMID: 32546682 PMCID: PMC7297722 DOI: 10.1038/s41597-020-0514-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/17/2019] [Accepted: 03/13/2020] [Indexed: 02/05/2023]
Abstract
A vast number of public RNA-sequencing datasets have been generated and used widely to study transcriptome mechanisms. These data offer a precious opportunity for advancing biological research in transcriptome studies such as alternative splicing. We report the first large-scale integrated analysis of RNA-Seq data of splicing factors for systematically identifying key factors in diseases and biological processes. We analyzed 1,321 RNA-Seq libraries of various mouse tissues and cell lines, comprising more than 6.6 TB of sequences from 75 independent studies that experimentally manipulated 56 splicing factors. Using these data, RNA splicing signatures and gene expression signatures were computed, and signature comparison analysis identified a list of key splicing factors in Rett syndrome and cold-induced thermogenesis. We show that cold-induced RNA-binding proteins rescue the neurite outgrowth defects in Rett syndrome using neuronal morphology analysis, and we also reveal that SRSF1 and PTBP1 are required for energy expenditure in adipocytes using metabolic flux analysis. Our study provides an integrated analysis for identifying key factors in diseases and biological processes and highlights the importance of public data resources for identifying hypotheses for experimental testing.
Affiliation(s)
- Peng Yu
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
  - Medical Big Data Center, Sichuan University, Chengdu, China
- Jin Li
  - Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, College of Medicine, Texas A&M University, Houston, TX, 77030, USA
- Su-Ping Deng
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu, 215009, China
- Feiran Zhang
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, USA
- Petar N Grozdanov
  - Department of Cell Biology & Biochemistry, Texas Tech University Health Sciences Center, Lubbock, Texas, 79430, USA
- Eunice W M Chin
  - Neuroscience Academic Clinical Programme, Duke-NUS Medical School, NA, Singapore
- Sheree D Martin
  - Metabolic Reprogramming Laboratory, Metabolic Research Unit, School of Medicine and Centre for Molecular and Medical Research, Deakin University, Geelong, Victoria, Australia
- Laurent Vergnes
  - Department of Human Genetics, David Geffen School of Medicine, University of California-Los Angeles, Los Angeles, CA, USA
- M Saharul Islam
  - Department of Medical Microbiology and Immunology, Genome Center, and MIND Institute, University of California Davis, Davis, CA, USA
- Deqiang Sun
  - Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, College of Medicine, Texas A&M University, Houston, TX, 77030, USA
- Janine M LaSalle
  - Department of Medical Microbiology and Immunology, Genome Center, and MIND Institute, University of California Davis, Davis, CA, USA
- Sean L McGee
  - Metabolic Reprogramming Laboratory, Metabolic Research Unit, School of Medicine and Centre for Molecular and Medical Research, Deakin University, Geelong, Victoria, Australia
- Eyleen Goh
  - Neuroscience Academic Clinical Programme, Duke-NUS Medical School, NA, Singapore
- Clinton C MacDonald
  - Department of Cell Biology & Biochemistry, Texas Tech University Health Sciences Center, Lubbock, Texas, 79430, USA
- Peng Jin
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, USA
2
Ji H, Hui B, Wang J, Zhu Y, Tang L, Peng P, Wang T, Wang L, Xu S, Li J, Wang K. Long noncoding RNA MAPKAPK5-AS1 promotes colorectal cancer proliferation by partly silencing p21 expression. Cancer Sci 2019; 110:72-85. [PMID: 30343528 PMCID: PMC6317943 DOI: 10.1111/cas.13838] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/07/2018] [Revised: 09/25/2018] [Accepted: 10/04/2018] [Indexed: 02/06/2023]
Abstract
Colorectal cancer (CRC) is the third most common malignancy in the world, and long noncoding RNA (lncRNA) plays a critical role in carcinogenesis. Here, we report a novel lncRNA, MAPKAPK5-AS1, that acts as a critical oncogene in CRC. In addition, we attempted to explore the functions of MAPKAPK5-AS1 on tumor progression in vitro and in vivo. Quantitative RT-PCR was used to examine the expression of MAPKAPK5-AS1 in CRC tissues and cells. Expression of MAPKAPK5-AS1 was significantly upregulated in 50 CRC tissues, and increased expression of MAPKAPK5-AS1 was found to be associated with greater tumor size and advanced pathological stage in CRC patients. Knockdown of MAPKAPK5-AS1 significantly inhibited proliferation and caused apoptosis in CRC cells. We also found that p21 is a target of MAPKAPK5-AS1. In addition, we are the first to report that MAPKAPK5-AS1 plays a carcinogenic role in CRC. MAPKAPK5-AS1 is a novel prognostic biomarker and a potential therapeutic candidate for CRC.
Affiliation(s)
- Hao Ji
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Bingqing Hui
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Jirong Wang
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Ya Zhu
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Lingyu Tang
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
  - Institute of Digestive Endoscopy and Medical Center for Digestive Diseases, Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Peng Peng
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Tianjun Wang
  - Department of Obstetrics and Gynecology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Lijuan Wang
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
  - Department of Geriatrics, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- Shufeng Xu
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Juan Li
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Keming Wang
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
3
Li J, Zheng L, Uchiyama A, Bin L, Mauro TM, Elias PM, Pawelczyk T, Sakowicz-Burkiewicz M, Trzeciak M, Leung DYM, Morasso MI, Yu P. A data mining paradigm for identifying key factors in biological processes using gene expression data. Sci Rep 2018; 8:9083. [PMID: 29899432 PMCID: PMC5998123 DOI: 10.1038/s41598-018-27258-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/03/2017] [Accepted: 05/21/2018] [Indexed: 12/15/2022]
Abstract
A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.
Affiliation(s)
- Jin Li
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
  - TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, 77843, USA
- Le Zheng
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
- Akihiko Uchiyama
  - Laboratory of Skin Biology, National Institute for Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD, USA
- Lianghua Bin
  - Department of Pediatrics, National Jewish Health, Denver, Colorado, USA
- Theodora M Mauro
  - Dermatology Service, Veterans Affairs Medical Center, and Department of Dermatology, UCSF, San Francisco, California, USA
- Peter M Elias
  - Dermatology Service, Veterans Affairs Medical Center, and Department of Dermatology, UCSF, San Francisco, California, USA
- Tadeusz Pawelczyk
  - Department of Molecular Medicine, Medical University of Gdansk, Gdansk, Poland
- Magdalena Trzeciak
  - Department of Dermatology, Venerology and Allergology, Medical University of Gdansk, Gdansk, Poland
- Donald Y M Leung
  - Department of Pediatrics, National Jewish Health, Denver, Colorado, USA
- Maria I Morasso
  - Laboratory of Skin Biology, National Institute for Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD, USA
- Peng Yu
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
  - TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, 77843, USA