1
Abbas SZ, Qadir MI, Muhammad SA. Systems-level differential gene expression analysis reveals new genetic variants of oral cancer. Sci Rep 2020; 10:14667. [PMID: 32887903 PMCID: PMC7473858 DOI: 10.1038/s41598-020-71346-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/11/2019] [Accepted: 07/20/2020] [Indexed: 01/28/2023]
Abstract
Oral cancer (OC) ranks as the eleventh most common malignancy worldwide, with increasing incidence among young patients. Limited understanding of the complications of cancer progression, its developmental mechanisms, and their interactions is a major barrier to optimal and effective treatment strategies. We designed a system-level approach to explore the genetic complexity of the disease and to identify novel OC-related genes by detecting genomic alterations at the molecular level through cDNA differential analysis. We analyzed 21 oral cancer-related cDNA datasets and listed 30 differentially expressed genes (DEGs). Among these 30, we found 6 significant DEGs (CYP1A1, CYP1B1, ADCY2, C7, SERPINB5, and ANAPC13) and studied their functional roles in OC. Our genomic and interaction analysis showed significant enrichment of xenobiotic metabolism, the p53 signaling pathway, and microRNA pathways in OC progression and development. We used human proteomic data on post-translational modifications to interpret disease mutations and inter-individual genetic variation. The mutational analysis revealed sequence-predicted disordered regions of 14%, 12.5%, and 10.5% for ADCY2, CYP1B1, and C7, respectively. The miRNA target prediction showed functional molecular annotation including the specific miRNA targets hsa-miR-4282, hsa-miR-2052, and hsa-miR-216a-3p for CYP1B1, C7, and ADCY2, respectively, associated with oral cancer. We constructed the system-level network and found important gene signatures. The drug-gene interactions of OC source genes with seven FDA-approved OC drugs may help to design or identify new drug targets or to establish novel biomedical linkages regarding disease pathophysiology. This investigation demonstrates the importance of systems genetics for identifying 6 OC genes (CYP1A1, CYP1B1, ADCY2, C7, SERPINB5, and ANAPC13) as potential drug targets. Our integrative network-based system-level approach would help to find the genetic variants of OC that can accelerate drug discovery outcomes and support a better understanding of treatment strategies for many cancer types.
Affiliation(s)
- Syeda Zahra Abbas
  - Institute of Molecular Biology and Biotechnology, Bahauddin Zakariya University, Multan, Pakistan
- Muhammad Imran Qadir
  - Institute of Molecular Biology and Biotechnology, Bahauddin Zakariya University, Multan, Pakistan
- Syed Aun Muhammad
  - Institute of Molecular Biology and Biotechnology, Bahauddin Zakariya University, Multan, Pakistan
2
Wei C, Li Y, Huang K, Li G, He M. Exosomal miR-1246 in body fluids is a potential biomarker for gastrointestinal cancer. Biomark Med 2018; 12:1185-1196. [PMID: 30235938 DOI: 10.2217/bmm-2017-0440] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/14/2022]
Abstract
AIM The aim was to systematically evaluate whether exosomal miRNAs could be regarded as potential minimally invasive diagnostic biomarkers for gastrointestinal cancer. METHODS A systematic review and meta-analysis of exosomal miRNA expression in gastrointestinal cancer was performed. RESULTS A total of 370 articles were retrieved from PubMed and EMBASE. Summary receiver operating characteristic curves were drawn for three miRNAs (miR-21, miR-1246 and miR-4644). These exhibited sensitivities of 0.66, 0.920 and 0.750, respectively; specificities of 0.87, 0.958 and 0.769, respectively; and areas under the curve for discriminating gastrointestinal cancer patients from control subjects of 0.876, 0.969 and 0.827, respectively. CONCLUSION Exosomal miR-1246 had the highest diagnostic efficiency, which indicated that miR-1246 could be a biomarker.
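For readers less familiar with the diagnostic measures quoted in this abstract, the following is a minimal Python sketch of how sensitivity and specificity are computed from a single 2×2 table. The counts and the `Counts` class are hypothetical illustrations, not data from the study, and the sketch does not reproduce the pooled meta-analytic estimates or the summary ROC curves reported by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical 2x2 diagnostic table (not data from the paper)."""
    tp: int  # cancer patients the marker flags as positive
    fn: int  # cancer patients the marker misses
    tn: int  # controls the marker correctly leaves negative
    fp: int  # controls the marker wrongly flags as positive


def sensitivity(c: Counts) -> float:
    # Fraction of true cases that are detected.
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    # Fraction of controls that are correctly ruled out.
    return c.tn / (c.tn + c.fp)


# Illustrative counts only.
example = Counts(tp=92, fn=8, tn=95, fp=5)
print(f"sensitivity={sensitivity(example):.3f}, specificity={specificity(example):.3f}")
```

A meta-analysis combines such per-study tables with a statistical model rather than by simple averaging, so this sketch only illustrates the definitions.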
Affiliation(s)
- Chunmeng Wei
  - School of Public Health, Guangxi Medical University, Nanning 530021, PR China
- Yasi Li
  - College of Arts & Sciences, Stony Brook University, NY 11790, USA
- Kaiming Huang
  - School of Public Health, Guangxi Medical University, Nanning 530021, PR China
- Gang Li
  - School of Public Health, Guangxi Medical University, Nanning 530021, PR China
- Min He
  - School of Public Health, Guangxi Medical University, Nanning 530021, PR China
  - Key Laboratory of High-Incidence Tumor Prevention & Treatment (Guangxi Medical University), Ministry of Education, PR China
3
Mojtabavi Naeini M, Tavassoli M, Ghaedi K. Systematic bioinformatic approaches reveal novel gene expression signatures associated with acquired resistance to EGFR targeted therapy in lung cancer. Gene 2018; 667:62-69. [DOI: 10.1016/j.gene.2018.04.077] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/08/2018] [Revised: 04/23/2018] [Accepted: 04/25/2018] [Indexed: 11/25/2022]
4
Rattray NJW, Deziel NC, Wallach JD, Khan SA, Vasiliou V, Ioannidis JPA, Johnson CH. Beyond genomics: understanding exposotypes through metabolomics. Hum Genomics 2018; 12:4. [PMID: 29373992 PMCID: PMC5787293 DOI: 10.1186/s40246-018-0134-x] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 11/10/2017] [Accepted: 01/11/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND Over the past 20 years, advances in genomic technology have enabled unparalleled access to the information contained within the human genome. However, the multiple genetic variants associated with various diseases typically account for only a small fraction of the disease risk. This may be due to the multifactorial nature of disease mechanisms, the strong impact of the environment, and the complexity of gene-environment interactions. Metabolomics is the quantification of small molecules produced by metabolic processes within a biological sample. Metabolomics datasets contain a wealth of information that reflect the disease state and are consequent to both genetic variation and environment. Thus, metabolomics is being widely adopted for epidemiologic research to identify disease risk traits. In this review, we discuss the evolution and challenges of metabolomics in epidemiologic research, particularly for assessing environmental exposures and providing insights into gene-environment interactions, and mechanism of biological impact. MAIN TEXT Metabolomics can be used to measure the complex global modulating effect that an exposure event has on an individual phenotype. Combining information derived from all levels of protein synthesis and subsequent enzymatic action on metabolite production can reveal the individual exposotype. We discuss some of the methodological and statistical challenges in dealing with this type of high-dimensional data, such as the impact of study design, analytical biases, and biological variance. We show examples of disease risk inference from metabolic traits using metabolome-wide association studies. We also evaluate how these studies may drive precision medicine approaches, and pharmacogenomics, which have up to now been inefficient. Finally, we discuss how to promote transparency and open science to improve reproducibility and credibility in metabolomics. 
CONCLUSIONS Comparison of exposotypes at the human population level may help understanding how environmental exposures affect biology at the systems level to determine cause, effect, and susceptibilities. Juxtaposition and integration of genomics and metabolomics information may offer additional insights. Clinical utility of this information for single individuals and populations has yet to be routinely demonstrated, but hopefully, recent advances to improve the robustness of large-scale metabolomics will facilitate clinical translation.
Affiliation(s)
- Nicholas J. W. Rattray
  - Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven, CT USA
- Nicole C. Deziel
  - Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven, CT USA
- Joshua D. Wallach
  - Collaboration for Research Integrity and Transparency (CRIT), Yale Law School, New Haven, CT USA
  - Center for Outcomes Research and Evaluation (CORE), Yale-New Haven Health System, New Haven, CT USA
- Sajid A. Khan
  - Department of Surgery, Section of Surgical Oncology, Yale University School of Medicine, New Haven, CT USA
  - Yale Cancer Center, Yale University School of Medicine, New Haven, CT USA
- Vasilis Vasiliou
  - Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven, CT USA
  - Yale Cancer Center, Yale University School of Medicine, New Haven, CT USA
- John P. A. Ioannidis
  - Stanford Prevention Research Center, Department of Medicine, Stanford University, Stanford, CA USA
  - Department of Health Research and Policy, Stanford University, Stanford, CA USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA USA
  - Department of Statistics, Stanford University, Stanford, CA USA
  - Meta-Research Innovation Center at Stanford, Stanford University, Stanford, CA USA
- Caroline H. Johnson
  - Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven, CT USA
  - Yale Cancer Center, Yale University School of Medicine, New Haven, CT USA
5
Ho XD, Phung P, Q Le V, H Nguyen V, Reimann E, Prans E, Kõks G, Maasalu K, Le NT, H Trinh L, G Nguyen H, Märtson A, Kõks S. Whole transcriptome analysis identifies differentially regulated networks between osteosarcoma and normal bone samples. Exp Biol Med (Maywood) 2017; 242:1802-1811. [PMID: 29050494 DOI: 10.1177/1535370217736512] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 12/11/2022]
Abstract
We performed whole transcriptome analysis of osteosarcoma bone samples. Initially, we sequenced total RNA from 36 fresh-frozen samples (18 tumoral bone samples and 18 non-tumoral paired samples) matching in pairs for each osteosarcoma patient. We also performed independent gene expression analysis of formalin-fixed paraffin-embedded samples to verify the RNAseq results. Formalin-fixed paraffin-embedded samples allowed us to analyze the effect of chemotherapy. Data were analyzed with DESeq2, edgeR and Reactome packages of R. We found 5365 genes expressed differentially between the normal bone and osteosarcoma tissues with an FDR below 0.05, of which 3399 genes were upregulated and 1966 were downregulated. Among those genes, BTNL9, MMP14, ABCA10, ACACB, COL11A1, and PKM2 were expressed differentially with the highest significance between tumor and normal bone. Functional annotation with the reactome identified significant changes in the pathways related to the extracellular matrix degradation and collagen biosynthesis. It was suggested that chemotherapy may induce the modification of ECM with important collagen biosynthesis. Taken together, our results indicate that changes in the degradation of extracellular matrix seem to be an important mechanism of osteosarcoma and efficient chemotherapy induces the genes related to bone formation. Impact statement Osteosarcoma is a rare disease but it is of interest to many scientists all over the world because the current standard treatment still has poor results. We sequenced total RNA from 36 fresh-frozen paired samples (18 tumoral bone samples and 18 non-tumoral paired samples) from osteosarcoma patients. We found that differences in the gene expressions between the normal and affected bones reflected the changes in the regulation of the degradation of collagen and extracellular matrix. We believe that these findings contribute to the understanding of OS and suggest ideas for further studies.
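The "FDR below 0.05" cutoff in this abstract refers to p-values adjusted for multiple testing. The abstract does not state which adjustment was applied; the sketch below assumes the Benjamini-Hochberg procedure, which is the default adjustment in both DESeq2 and edgeR. It is written in plain Python for illustration only, the function name and p-values are hypothetical, and it does not reproduce the authors' R analysis.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each one is the minimum of p * m / rank over its own
    # rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (not data from the paper).
raw = [0.001, 0.02, 0.03, 0.8]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

With these illustrative inputs, the first three genes pass the 0.05 threshold and the fourth does not.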
Affiliation(s)
- Xuan Dung Ho
  - Department of Oncology, College of Medicine and Pharmacy, Hue University, Hue 53000, Vietnam
  - Department of Pathophysiology, University of Tartu, Tartu 50411, Estonia
- Phuong Phung
  - Department of Oncology, College of Medicine and Pharmacy, Hue University, Hue 53000, Vietnam
- Van Q Le
  - Department of Oncology, Hanoi Medical University, Hanoi 15000, Vietnam
- Van H Nguyen
  - Department of Oncology, Hanoi Medical University, Hanoi 15000, Vietnam
- Ene Reimann
  - Department of Pathophysiology, University of Tartu, Tartu 50411, Estonia
  - Department of Reproductive Biology, Estonian University of Life Sciences, Tartu 51014, Estonia
- Ele Prans
  - Department of Pathophysiology, University of Tartu, Tartu 50411, Estonia
- Gea Kõks
  - Department of Pathophysiology, University of Tartu, Tartu 50411, Estonia
- Katre Maasalu
  - Department of Traumatology and Orthopedics, University of Tartu, Tartu 50411, Estonia
  - Clinic of Traumatology and Orthopaedics of Tartu University Hospital, Tartu 50406, Estonia
- Nghi Tn Le
  - Department of Orthopedics, College of Medicine and Pharmacy, Hue University, Hue 53000, Vietnam
- Le H Trinh
  - Department of Oncology, Hanoi Medical University, Hanoi 15000, Vietnam
- Hoang G Nguyen
  - Department of Oncology, Hanoi Medical University, Hanoi 15000, Vietnam
- Aare Märtson
  - Department of Traumatology and Orthopedics, University of Tartu, Tartu 50411, Estonia
  - Clinic of Traumatology and Orthopaedics of Tartu University Hospital, Tartu 50406, Estonia
- Sulev Kõks
  - Department of Pathophysiology, University of Tartu, Tartu 50411, Estonia
  - Department of Reproductive Biology, Estonian University of Life Sciences, Tartu 51014, Estonia
6
Gene expression meta-analysis in diffuse low-grade glioma and the corresponding histological subtypes. Sci Rep 2017; 7:11741. [PMID: 28924174 PMCID: PMC5603565 DOI: 10.1038/s41598-017-12087-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 04/12/2017] [Accepted: 09/04/2017] [Indexed: 01/20/2023]
Abstract
Diffuse low-grade glioma (DLGG) is a well-differentiated, slow-growing tumour with an inherent tendency to progress to high-grade glioma. The potential roles of genetic alterations in DLGG development have not yet been fully delineated. Therefore, the current study performed an integrated gene expression meta-analysis of eight independent, publicly available microarray datasets including 291 DLGGs and 83 non-glioma (NG) samples to identify gene expression signatures associated with DLGG. Using INMEX, 708 differentially expressed genes (DEGs) (385 upregulated and 323 downregulated genes) were identified in DLGG compared to NG. Furthermore, 497 DEGs (222 upregulated and 275 downregulated genes) corresponding to two histological types were identified. Of these, high expression of HIP1R significantly correlated with increased overall survival, whereas high expression of TBXAS1 significantly correlated with decreased overall survival. Additionally, network-based meta-analysis identified FN1 and APP as the key hub genes in DLGG compared with NG. PTPN6 and CUL3 were the key hub genes identified in the astrocytoma relative to the oligodendroglioma. Further immunohistochemical validation revealed that MTHFD2 and SPARC were positively expressed in DLGG, whereas RBP4 was positively expressed in NG. These findings reveal potential molecular biomarkers for diagnosis and therapy in patients with DLGG and provide a rich and novel candidate reservoir for future studies.
7
Abstract
PURPOSE OF REVIEW This article discusses genomic investigations in ankylosing spondylitis (AS) beyond genome-wide association (GWA) studies, but prior to this, genetic variants achieving genome-wide significance will be summarized highlighting key pathways contributing to disease pathogenesis. RECENT FINDINGS Evidence suggests that disease pathogenesis is attributed to a complex interplay of genetic, environmental and immunological factors. GWA studies have greatly enhanced our understanding of AS pathogenesis by illuminating distinct immunomodulatory pathways affecting innate and acquired immunity, most notably the interleukin-23/interleukin-17 pathway. However, despite the wealth of new information gleaned from such studies, a fraction of the heritability (24.4%) has been explained. This review will focus on investigations beyond GWA studies including copy number variants, gene expression profiling, including microRNA (miRNA), epigenetics, rare variants and gene-gene interactions. SUMMARY To address the 'missing heritability' and advance beyond GWA studies, a concerted effort involving rethinking of study design and implementation of newer technologies will be required. The coming of age of next-generation sequencing and advancements in epigenetic and miRNA technologies, combined with familial-focused investigations using well-characterized cohorts, is likely to reveal some of the hidden genomic mysteries associated with AS.
8
Wang T, Zhang L, Tian P, Tian S. Identification of differentially-expressed genes between early-stage adenocarcinoma and squamous cell carcinoma lung cancer using meta-analysis methods. Oncol Lett 2017; 13:3314-3322. [PMID: 28521438 DOI: 10.3892/ol.2017.5838] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/27/2015] [Accepted: 10/06/2016] [Indexed: 01/04/2023]
Abstract
Lung adenocarcinoma (AC) and squamous cell lung carcinoma (SCC) are two major subtypes of non-small cell lung cancer (NSCLC). Previous studies have demonstrated that fundamental differences exist in the underlying mechanisms of tumor development, growth and invasion between these subtypes. The investigation of differentially-expressed genes (DEGs) between these two NSCLC subtypes is useful for determining and understanding such differences. The present study aimed to identify those DEGs using meta-analysis and the data from four microarray experiments, consisting of 164 AC and 161 SCC samples. Raw gene expression values were converted into the probability of expression (POE) representing the differentially-expressed probability of a gene and expression barcode values representing its expression status. The results indicated that when applying a meta-analysis using barcode values, heterogeneity in genes across studies was less severe than when applying a meta-analysis using POE values. DEGs in each meta-analysis method overlapped substantially (P=1.3×10⁻⁴), but the barcode method yielded a lower global false discovery rate. Based on this and several other performance statistics, it was concluded that the barcode approach outperformed the POE method. Finally, using those DEGs, ontology and pathway analyses were conducted. A number of genes and enriched pathways were found to be closely associated with NSCLC.
Affiliation(s)
- Tianjiao Wang
  - School of Life Science, Jilin University, Changchun, Jilin 130012, P.R. China
- Lei Zhang
  - School of Life Science, Jilin University, Changchun, Jilin 130012, P.R. China
  - Department of Neurology, The Second Hospital of Jilin University, Changchun, Jilin 130041, P.R. China
- Pu Tian
  - School of Life Science, Jilin University, Changchun, Jilin 130012, P.R. China
- Suyan Tian
  - Division of Clinical Epidemiology, First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
9
Jia X, Yu H, Zhang H, Si Y, Tian D, Zhao X, Luan J, Jia H. Integrated analysis of different microarray studies to identify candidate genes in type 1 diabetes. J Diabetes 2017; 9:149-157. [PMID: 26930153 DOI: 10.1111/1753-0407.12391] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/13/2015] [Revised: 01/20/2016] [Accepted: 02/15/2016] [Indexed: 01/05/2023]
Abstract
BACKGROUND Type 1 diabetes (T1D), an autoimmune disease, occurs most commonly in children. Identifying altered gene expression in peripheral blood mononuclear cells (PBMCs) of T1D may lead to new strategies for preserving or improving β-cell function in patients with T1D. METHODS The Gene Expression Omnibus database was searched for microarray studies in PBMCs of T1D. Subsequently, gene expression datasets from multiple microarray studies were integrated to obtain differentially expressed genes (DEGs) between T1D and normal controls (NC). Gene function analysis was performed to determine the functions of the DEGs identified. RESULTS Four microarray studies were available for analysis, including 199 T1D samples and 74 NC samples. Analysis revealed 695 genes that were significantly differentially expressed in PBMCs from T1D compared with NC samples, with 450 upregulated and 245 downregulated. Signal transduction (gene ontology [GO]: 0007165; false discovery rate [FDR] = 1.54 × 10⁻⁷) and protein binding (GO: 0005515; FDR = 2.93 × 10⁻²⁴) were significantly enriched for the GO categories of biological processes and molecular functions, respectively. The most significant pathway in the Kyoto Encyclopedia of Genes and Genomes analysis was arachidonic acid metabolism (FDR = 1.44 × 10⁻³). Protein-protein interaction network analysis showed that the significant hub proteins contained immature colon carcinoma transcript 1 (ICT1; degree = 214; clustering coefficient [C] = 4.39 × 10⁻⁵), zinc finger and BTB domain containing 16 (ZBTB16; degree = 112; C = 8.04 × 10⁻⁴), and SERTA domain containing 1 (SERTAD1; degree = 38; C = 0.0014). CONCLUSIONS This integrated analysis will help develop improved therapies and interventions for T1D by identifying novel drug targets.
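The hub statistics quoted in this abstract (degree and clustering coefficient) can be illustrated with a small Python sketch. The abstract does not say which software or which definition of the clustering coefficient was used; the sketch assumes the standard local clustering coefficient of an undirected graph, and the edge list and node names are hypothetical placeholders rather than interactions from the study.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    # Number of distinct interaction partners.
    return len(adj[node])


def clustering_coefficient(adj, node):
    # Share of possible links among a node's neighbours that actually exist.
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


# Hypothetical toy network: HUB touches three proteins, two of which interact.
edges = [("HUB", "A"), ("HUB", "B"), ("HUB", "C"), ("A", "B")]
adj = build_adjacency(edges)
print(degree(adj, "HUB"), clustering_coefficient(adj, "HUB"))
```

A protein with a high degree but a very low clustering coefficient, as reported for the hubs above, has many partners that rarely interact with each other.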
Affiliation(s)
- Xiaowei Jia
  - Department of Endocrinology, The 309 Hospital of Chinese People's Liberation Army, Beijing, China
- Haotian Yu
  - Department of Medicine, The 309 Hospital of Chinese People's Liberation Army, Beijing, China
- Hui Zhang
  - Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, China
- Yanfang Si
  - Department of Ophthalmology, The 309 Hospital of Chinese People's Liberation Army, Beijing, China
- Dengmei Tian
  - Department of Hematology, The 309 Hospital of Chinese People's Liberation Army, Beijing, China
- Xin Zhao
  - Department of Endocrinology, The 309 Hospital of Chinese People's Liberation Army, Beijing, China
- Jin Luan
  - Department of Disease Control, Center for Disease Control and Prevention of the Chinese Armed Police Force (CAPF), Beijing, China
- Hetang Jia
  - Department of Endocrinology, The 309 Hospital of Chinese People's Liberation Army, Beijing, China
10
Li JJ, Wang BQ, Fei Q, Yang Y, Li D. Identification of candidate genes in osteoporosis by integrated microarray analysis. Bone Joint Res 2016; 5:594-601. [PMID: 27908864 PMCID: PMC5227060 DOI: 10.1302/2046-3758.512.bjr-2016-0073.r1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/03/2016] [Accepted: 10/05/2016] [Indexed: 11/06/2022]
Abstract
Objectives To screen for altered gene expression profiles in peripheral blood mononuclear cells of patients with osteoporosis, we performed an integrated analysis of the online microarray studies of osteoporosis. Methods We searched the Gene Expression Omnibus (GEO) database for microarray studies of peripheral blood mononuclear cells in patients with osteoporosis. Subsequently, we integrated gene expression data sets from multiple microarray studies to obtain differentially expressed genes (DEGs) between patients with osteoporosis and normal controls. Gene function analysis was performed to uncover the functions of identified DEGs. Results A total of three microarray studies were selected for integrated analysis. In all, 1125 genes were found to be significantly differentially expressed between osteoporosis patients and normal controls, with 373 upregulated and 752 downregulated genes. Positive regulation of the cellular amino metabolic process (gene ontology (GO): 0033240, false discovery rate (FDR) = 1.00E+00) was significantly enriched under the GO category for biological processes, while for molecular functions, flavin adenine dinucleotide binding (GO: 0050660, FDR = 3.66E-01) and androgen receptor binding (GO: 0050681, FDR = 6.35E-01) were significantly enriched. DEGs were enriched in many osteoporosis-related signalling pathways, including those of mitogen-activated protein kinase (MAPK) and calcium. Protein-protein interaction (PPI) network analysis showed that the significant hub proteins contained ubiquitin specific peptidase 9, X-linked (Degree = 99), ubiquitin specific peptidase 19 (Degree = 57) and ubiquitin conjugating enzyme E2 B (Degree = 57). Conclusion Analysis of the gene function of identified differentially expressed genes may expand our understanding of the fundamental mechanisms leading to osteoporosis. Moreover, significantly enriched pathways, such as MAPK and calcium, may be involved in osteoporosis through osteoblastic differentiation and bone formation.
Affiliation(s)
- J J Li
  - Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, Yongan Road 95, Xicheng District, Beijing 100050, China
- B Q Wang
  - Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, Yongan Road 95, Xicheng District, Beijing 100050, China
- Q Fei
  - Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, Yongan Road 95, Xicheng District, Beijing 100050, China
- Y Yang
  - Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, Yongan Road 95, Xicheng District, Beijing 100050, China
- D Li
  - Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, Yongan Road 95, Xicheng District, Beijing 100050, China
11
Sun Y, Sang Z, Jiang Q, Ding X, Yu Y. Transcriptomic characterization of differential gene expression in oral squamous cell carcinoma: a meta-analysis of publicly available microarray data sets. Tumour Biol 2016; 37:10.1007/s13277-016-5439-6. [PMID: 27704359 DOI: 10.1007/s13277-016-5439-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/29/2016] [Accepted: 09/23/2016] [Indexed: 01/04/2023]
Abstract
Oral squamous cell carcinoma (OSCC) is a highly prevalent cancer worldwide, and OSCC often goes undiagnosed until advanced disease is present, which contributes to a low survival rate for OSCC patients. The identification of biomarkers for the early detection of OSCC and of novel therapeutic targets for OSCC treatment is an important research objective. We performed bioinformatics analyses of the gene expression profile of OSCC using microarray data to identify genes that contribute to the development of OSCC. We also predicted the transcription factors involved in the regulation of differential gene expression in OSCC. Our results showed that PI3K, EGFR, STAT1, and CPBP are important contributors to the changes in cellular physiology that occur during the development of OSCC. Therefore, these genes represent potential diagnostic biomarkers and therapeutic targets for OSCC.
Affiliation(s)
- Yang Sun
  - Department of Stomatology, Zhongshan Hospital, Fudan University, 111 Yixueyuan Road, Shanghai, 200032, China
- Zhijian Sang
  - Department of Stomatology, Zhongshan Hospital, Fudan University, 111 Yixueyuan Road, Shanghai, 200032, China
- Qian Jiang
  - Department of Stomatology, Zhongshan Hospital, Fudan University, 111 Yixueyuan Road, Shanghai, 200032, China
- Xiaojun Ding
  - Department of Stomatology, Zhongshan Hospital, Fudan University, 111 Yixueyuan Road, Shanghai, 200032, China
- Youcheng Yu
  - Department of Stomatology, Zhongshan Hospital, Fudan University, 111 Yixueyuan Road, Shanghai, 200032, China
12
Integrated Analysis of Expression Profile Based on Differentially Expressed Genes in Middle Cerebral Artery Occlusion Animal Models. Int J Mol Sci 2016; 17:ijms17050776. [PMID: 27213359 PMCID: PMC4881595 DOI: 10.3390/ijms17050776] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 04/05/2016] [Revised: 05/10/2016] [Accepted: 05/16/2016] [Indexed: 12/19/2022]
Abstract
Stroke is one of the most common causes of death, second only to heart disease. Molecular investigations of stroke are currently scarce. This study is intended to explore the gene expression profile after brain ischemia reperfusion. Meta-analysis, differential expression analysis, and integrated analysis were employed on eight microarray series. We explored the functions and pathways of target genes in gene ontology (GO) enrichment analysis and constructed a protein-protein interaction network. Meta-analysis identified 360 differentially expressed genes (DEGs) for Mus musculus and 255 for Rattus norvegicus. Differential expression analysis identified 44 DEGs for Mus musculus and 21 for Rattus norvegicus. Timp1 and Lcn2 were overexpressed in both species. The cytokine-cytokine receptor interaction and chemokine signaling pathways were highly enriched in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. We present a global view of the potential molecular differences between the middle cerebral artery occlusion (MCAO) animal model and sham for Mus musculus and Rattus norvegicus, including the biological processes and enriched pathways in DEGs. This research contributes to a clearer understanding of the inflammation process and to accurate identification of ischemic infarction stages, which might be translated into a therapeutic approach.
13
Fei Q, Lin J, Meng H, Wang B, Yang Y, Wang Q, Su N, Li J, Li D. Identification of upstream regulators for synovial expression signature genes in osteoarthritis. Joint Bone Spine 2016; 83:545-51. [PMID: 26832188 DOI: 10.1016/j.jbspin.2015.09.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/30/2015] [Accepted: 09/14/2015] [Indexed: 11/28/2022]
Abstract
OBJECTIVES: The detection of transcription factors (TFs) for OA signature genes provides better clues to the underlying regulatory mechanisms and therapeutic applications. METHODS: We searched the GEO database for synovial expression profiling from different OA microarray studies to perform a systematic analysis. Functional annotation of DEGs was conducted, including gene ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. Based on motif databases and the results from integrated analysis of current gene expression data, a global transcriptional regulatory network was constructed, and the upstream TFs were identified for OA signature genes. RESULTS: Six GEO datasets were obtained. In total, 805 genes across the studies were consistently differentially expressed in OA (469 up-regulated and 336 down-regulated genes) with FDR≤0.01. Supporting an involvement of ECM in the development of OA, we showed that ECM-receptor interaction was the most significant pathway in our KEGG analysis (P=5.92E-12). Sixty-one differentially expressed TFs were identified with FDR≤0.05. The constructed OA-specific regulatory networks consisted of 648 TF-target interactions between 51 TFs and 429 DEGs in the context of OA. The top 10 TFs covering the most downstream DEGs were identified as crucial TFs involved in the development of OA, including ARID3A, NFIC, ZNF354C, NR4A2, BRCA1, EHF, FOXL1, FOXC1, EGR1, and HOXA5. CONCLUSION: This integrated analysis has identified the OA signature, providing clues to the pathogenesis of OA at the molecular level, which may also be used as diagnostic markers for OA. Some crucial upstream regulators, such as NR4A2, EHF, and EGR1, may be considered as potential new therapeutic targets for OA.
Affiliation(s)
- Qi Fei
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- JiSheng Lin
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- Hai Meng
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- BingQiang Wang
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- Yong Yang
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- Qi Wang
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- Nan Su
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- Jinjun Li
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
- Dong Li
- Department of Orthopaedics, Beijing Friendship Hospital, Capital Medical University, 95, Yong'an Road, Beijing 100050, China
|
14
|
Integrated analysis of differential gene expression profiles in hippocampi to identify candidate genes involved in Alzheimer's disease. Mol Med Rep 2015; 12:6679-87. [PMID: 26324066 PMCID: PMC4626122 DOI: 10.3892/mmr.2015.4271] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/21/2014] [Accepted: 07/28/2015] [Indexed: 01/01/2023] Open
Abstract
Alzheimer's disease (AD) is a complex neurodegenerative disorder with largely unknown genetic mechanisms. Identifying altered neuronal gene expression in AD may provide diagnostic or therapeutic targets for AD. The present study aimed to identify differentially expressed genes (DEGs) and their further association with other biological processes that regulate causative factors for AD. The present study performed an integrated analysis of publicly available Gene Expression Omnibus datasets of AD hippocampi. Gene ontology (GO) enrichment analyses, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and protein-protein interaction (PPI) network analysis were performed. The present study detected 295 DEGs (109 upregulated and 186 downregulated genes) in hippocampi between AD and control samples by integrating four datasets of gene expression profiles of hippocampi of patients with AD. Respiratory electron transport chain (GO:0022904; P=1.64×10⁻¹¹) was the most significantly enriched GO term among biological processes, while for molecular functions, the most significantly enriched GO term was that of protein binding (GO:0005515; P=3.03×10⁻²⁹), and for cellular components, the most significantly enriched GO term was that of the cytoplasm (GO:0005737; P=8.67×10⁻³³). The most significant pathway in the KEGG analysis was oxidative phosphorylation (P=1.61×10⁻¹³). PPI network analysis showed that the significant hub proteins included β-actin (degree, 268), hepatoma-derived growth factor (degree, 218) and WD repeat-containing protein 82 (degree, 87). The integrated analysis performed in the present study serves as a basis for identifying novel drug targets to develop improved therapies and interventions for common and devastating neurological diseases such as AD.
|
15
|
Wang X, Ning Y, Guo X. Integrative meta-analysis of differentially expressed genes in osteoarthritis using microarray technology. Mol Med Rep 2015; 12:3439-3445. [PMID: 25975828 PMCID: PMC4526045 DOI: 10.3892/mmr.2015.3790] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/17/2014] [Accepted: 04/22/2015] [Indexed: 01/15/2023] Open
Abstract
The aim of the present study was to identify differentially expressed (DE) genes in patients with osteoarthritis (OA), and biological processes associated with changes in gene expression that occur in this disease. Using the INMEX (integrative meta-analysis of expression data) software tool, a meta-analysis of publicly available microarray Gene Expression Omnibus (GEO) datasets of OA was performed. Gene ontology (GO) enrichment analysis was performed in order to detect enriched functional attributes based on gene-associated GO terms. Three GEO datasets, containing 137 patients with OA and 52 healthy controls, were included in the meta-analysis. The analysis identified 85 genes that were consistently differentially expressed in OA (30 genes were upregulated and 55 genes were downregulated). The upregulated gene with the lowest P-value (P=5.36E-07) was S-phase kinase-associated protein 2, E3 ubiquitin protein ligase (SKP2). The downregulated gene with the lowest P-value (P=4.42E-09) was Proline rich 5 like (PRR5L). Among the 210 GO terms that were associated with the set of DE genes, the most significant two enrichments were observed in the GO categories of 'Immune response', with a P-value of 0.000129438, and 'Immune effectors process', with a P-value of 0.000288619. The current meta-analysis identified genes that were consistently DE in OA, in addition to biological pathways associated with changes in gene expression that occur during OA, which may provide insight into the molecular mechanisms underlying the pathogenesis of this disease.
Affiliation(s)
- Xi Wang
- School of Public Health, Xi'an Jiaotong University Health Science Center, Key Laboratory of Trace Elements and Endemic Diseases, National Health and Family Planning Commission, Xi'an, Shaanxi 710061, P.R. China
- Yujie Ning
- School of Public Health, Xi'an Jiaotong University Health Science Center, Key Laboratory of Trace Elements and Endemic Diseases, National Health and Family Planning Commission, Xi'an, Shaanxi 710061, P.R. China
- Xiong Guo
- School of Public Health, Xi'an Jiaotong University Health Science Center, Key Laboratory of Trace Elements and Endemic Diseases, National Health and Family Planning Commission, Xi'an, Shaanxi 710061, P.R. China
|
16
|
Xiao WH, Qu XL, Li XM, Sun YL, Zhao HX, Wang S, Zhou X. Identification of commonly dysregulated genes in colorectal cancer by integrating analysis of RNA-Seq data and qRT-PCR validation. Cancer Gene Ther 2015; 22:278-84. [PMID: 25908452 DOI: 10.1038/cgt.2015.20] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/20/2015] [Revised: 03/09/2015] [Accepted: 03/12/2015] [Indexed: 02/07/2023]
Abstract
The progression of colorectal cancer (CRC) is a multistep process and metastatic CRC is always incurable; consequently, CRC is the leading cause of cancer-related deaths. There is therefore an urgent need to identify useful biomarkers with sufficient sensitivity and specificity to detect this disease at early stages, which would significantly reduce the mortality of this malignancy. In this study, we performed an integrated analysis of different RNA-Seq data sets to find new candidate biomarkers for diagnosis, prognosis and therapeutic targeting of this malignancy, as well as to elucidate the molecular mechanisms of CRC carcinogenesis. We identified 883 differentially expressed genes (DEGs) across the studies between CRC and normal control (NC) tissues by combining five RNA-Seq data sets. Gene function analysis revealed high correlation with carcinogenesis. The 10 most significant DEGs were further evaluated by quantitative real-time polymerase chain reaction (qRT-PCR) in both rectal cancer (RC) and colon cancer (CC), and the results matched the integrated data well, suggesting that integrated analysis of different RNA-Seq data sets is an acceptable method. Integrated analysis of different RNA-Seq data sets may therefore be a useful way to overcome the limitation of small sample size in a single RNA-Seq study. In addition, our study showed that some genes, such as SIM2, ADAMTS6, FOXD4L4 and DNAH5, may have an important role in the development of CRC and could be applied to the diagnosis, prognosis and therapy of this malignancy. Our findings would also help to understand the pathology of CRC.
Affiliation(s)
- W H Xiao
- Department of Oncology, The First affiliated Hospital of PLA General Hospital, Beijing, China
- X L Qu
- Department of Oncology, The First affiliated Hospital of PLA General Hospital, Beijing, China
- X M Li
- Department of Oncology, The First affiliated Hospital of PLA General Hospital, Beijing, China
- Y L Sun
- Beijing Yangshen Bioinformatic Technology, Beijing, China
- H X Zhao
- Department of Oncology, The First affiliated Hospital of PLA General Hospital, Beijing, China
- S Wang
- Department of Oncology, The First affiliated Hospital of PLA General Hospital, Beijing, China
- X Zhou
- Department of Oncology, The First affiliated Hospital of PLA General Hospital, Beijing, China
|
17
|
Lee Youngseok, Ryu Seungwon, Bae Sejong, Park Taehwan, Kwon Kang, Noh Yunhee, Kim Sungyoung. Cross-platform meta-analysis of multiple gene expression profiles identifies novel expression signatures in acquired anthracycline-resistant breast cancer. Oncol Rep 2015; 33:1985-93. [DOI: 10.3892/or.2015.3810] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/01/2014] [Accepted: 02/02/2015] [Indexed: 11/06/2022] Open
|
18
|
Zhao P, Hu W, Wang H, Yu S, Li C, Bai J, Gui S, Zhang Y. Identification of differentially expressed genes in pituitary adenomas by integrating analysis of microarray data. Int J Endocrinol 2015; 2015:164087. [PMID: 25642247 PMCID: PMC4302352 DOI: 10.1155/2015/164087] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/22/2014] [Revised: 12/16/2014] [Accepted: 12/16/2014] [Indexed: 01/15/2023] Open
Abstract
Pituitary adenomas, monoclonal in origin, are the most common intracranial neoplasms. Altered gene expression as well as somatic mutations is detected frequently in pituitary adenomas. The purpose of this study was to detect differentially expressed genes (DEGs) and biological processes during tumor formation of pituitary adenomas. We performed an integrated analysis of publicly available GEO datasets of pituitary adenomas to identify DEGs between pituitary adenomas and normal control (NC) tissues. Gene function analysis including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, and protein-protein interaction (PPI) networks analysis was conducted to interpret the biological role of those DEGs. In this study we detected 3994 DEGs (2043 upregulated and 1951 downregulated) in pituitary adenoma through an integrated analysis of 5 different microarray datasets. Gene function analysis revealed that the functions of those DEGs were highly correlated with the development of pituitary adenoma. This integrated analysis of microarray data identified some genes and pathways associated with pituitary adenoma, which may help to understand the pathology underlying pituitary adenoma and contribute to the successful identification of therapeutic targets for pituitary adenoma.
Affiliation(s)
- Peng Zhao
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Wei Hu
- Department of Cardiology, Beijing Chuiyangliu Hospital, Beijing, China
- Hongyun Wang
- Beijing Neurosurgical Institute, Center of Brain Tumor, Beijing Institute for Brain Disorders, Capital Medical University, Beijing, China
- Shengyuan Yu
- Beijing Neurosurgical Institute, Center of Brain Tumor, Beijing Institute for Brain Disorders, Capital Medical University, Beijing, China
- Chuzhong Li
- Beijing Neurosurgical Institute, Center of Brain Tumor, Beijing Institute for Brain Disorders, Capital Medical University, Beijing, China
- Jiwei Bai
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Songbai Gui
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Yazhuo Zhang
- Beijing Neurosurgical Institute, Center of Brain Tumor, Beijing Institute for Brain Disorders, Capital Medical University, Beijing, China
- *Yazhuo Zhang:
|
19
|
Yang Z, Chen Y, Fu Y, Yang Y, Zhang Y, Chen Y, Li D. Meta-analysis of differentially expressed genes in osteosarcoma based on gene expression data. BMC Medical Genetics 2014; 15:80. [PMID: 25023069 PMCID: PMC4109777 DOI: 10.1186/1471-2350-15-80] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 05/27/2014] [Accepted: 06/30/2014] [Indexed: 02/04/2023]
Abstract
Background: To uncover the genes involved in the development of osteosarcoma (OS), we performed a meta-analysis of OS microarray data to identify differentially expressed genes (DEGs) and biological functions associated with gene expression changes between OS and normal control (NC) tissues. Methods: We used publicly available GEO datasets of OS to perform a meta-analysis. We performed Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and protein-protein interaction (PPI) network analysis. Results: Eight GEO datasets, including 240 samples of OS and 35 samples of controls, were available for the meta-analysis. We identified 979 DEGs across the studies between OS and NC tissues (472 up-regulated and 507 down-regulated). We found GO terms for molecular functions significantly enriched in protein binding (GO:0005515, P = 3.83E-60) and calcium ion binding (GO:0005509, P = 3.79E-13), while for biological processes, the enriched GO terms were cell adhesion (GO:0007155, P = 2.26E-19) and negative regulation of apoptotic process (GO:0043066, P = 3.24E-15), and for cellular component, the enriched GO terms were cytoplasm (GO:0005737, P = 9.18E-63) and extracellular region (GO:0005576, P = 2.28E-47). The most significant pathway in our KEGG analysis was Focal adhesion (P = 5.70E-15). Furthermore, ECM-receptor interaction (P = 1.27E-13) and Cell cycle (P = 4.53E-11) were found to be highly enriched. PPI network analysis indicated that the significant hub proteins included PTBP2 (Degree = 33), RGS4 (Degree = 15) and FXYD6 (Degree = 13). Conclusions: Our meta-analysis detected DEGs and biological functions associated with gene expression changes between OS and NC tissues, guiding further identification and treatment for OS.
Affiliation(s)
- Zuozhang Yang
- Bone and Soft Tissue Tumors Research Center of Yunnan Province, Department of Orthopaedics, The Third Affiliated Hospital of Kunming Medical University (Tumor Hospital of Yunnan Province), Kunming, Yunnan 650118, PR China
|
20
|
Lee YH, Bae SC, Song GG. Meta-analysis of gene expression profiles to predict response to biologic agents in rheumatoid arthritis. Clin Rheumatol 2014; 33:775-82. [PMID: 24595895 DOI: 10.1007/s10067-014-2547-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/24/2013] [Revised: 02/07/2014] [Accepted: 02/19/2014] [Indexed: 10/25/2022]
Abstract
Our aim was to identify differentially expressed (DE) genes and biological processes that may help predict patient response to biologic agents for rheumatoid arthritis (RA). Using the INMEX (integrative meta-analysis of expression data) software tool, we performed a meta-analysis of publicly available microarray Gene Expression Omnibus (GEO) datasets that examined patient response to biologic therapy for RA. Three GEO datasets, containing 79 responders and 34 non-responders, were included in the meta-analysis. We identified 1,374 genes that were consistently differentially expressed in responders vs. non-responders (651 up-regulated and 723 down-regulated). The up-regulated gene with the smallest p value (p=0.000192) was ASCC2 (Activating Signal Cointegrator 1 Complex Subunit 2), and the up-regulated gene with the largest fold change (average log fold change=-0.75869, p=0.000206) was KLRC3 (Killer Cell Lectin-Like Receptor Subfamily C, Member 3). The down-regulated gene with the smallest p value (p=0.000195) was MPL (Myeloproliferative Leukemia Virus Oncogene). Among the 236 GO terms associated with the set of DE genes, the most significantly enriched was "CTP biosynthetic process" (GO:0006241; p=0.000454). Our meta-analysis identified genes that were consistently DE in responders vs. non-responders, as well as biological pathways associated with this set of genes. These results provide insight into the molecular mechanisms underlying responsiveness to biologic therapy for RA.
Affiliation(s)
- Young Ho Lee
- Division of Rheumatology, Department of Internal Medicine, Korea University Anam Hospital, Korea University College of Medicine, 126-1 5 ga, Anam-dong, Seongbuk-gu, Seoul, 136-705, Korea
|
21
|
Li GH, Huang JF. Inferring therapeutic targets from heterogeneous data: HKDC1 is a novel potential therapeutic target for cancer. Bioinformatics 2013; 30:748-52. [DOI: 10.1093/bioinformatics/btt606] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 12/20/2022] Open
|
22
|
Reconstruction and analysis of human kidney-specific metabolic network based on omics data. BioMed Research International 2013; 2013:187509. [PMID: 24222897 PMCID: PMC3814056 DOI: 10.1155/2013/187509] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/07/2013] [Revised: 08/23/2013] [Accepted: 08/26/2013] [Indexed: 01/15/2023]
Abstract
With the advent of high-throughput data production, recent studies of tissue-specific metabolic networks have greatly advanced our understanding of the metabolic basis of various physiological and pathological processes. However, for the kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were identified using this model. Furthermore, we found that the discrepancies in metabolic processes between tissues correspond directly to the tissues' functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development by altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information on kidney metabolism and offer insights into complex kidney diseases.
|
23
|
Song GG, Kim JH, Seo YH, Choi SJ, Ji JD, Lee YH. Meta-analysis of differentially expressed genes in primary Sjogren's syndrome by using microarray. Hum Immunol 2013; 75:98-104. [PMID: 24090683 DOI: 10.1016/j.humimm.2013.09.012] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 08/05/2013] [Revised: 09/11/2013] [Accepted: 09/20/2013] [Indexed: 12/16/2022]
Abstract
INTRODUCTION: The purpose of this study was to identify differentially expressed (DE) genes and biological processes associated with changes in gene expression in primary Sjogren's syndrome (pSS). METHODS: We performed a meta-analysis using the INMEX program (integrative meta-analysis of expression data) of publicly available microarray GEO datasets of pSS. We performed Gene Ontology (GO) enrichment analyses and pathway analysis using Kyoto Encyclopedia of Genes and Genomes (KEGG). RESULTS: Three GEO datasets including 37 cases and 33 controls were available for the meta-analysis. We identified 179 genes across the studies which were consistently DE in pSS (146 up-regulated and 33 down-regulated). The up-regulated gene with the largest effect size (ES) (ES = -2.4228) was SELL (selectin L), whose product is required for the binding and subsequent rolling of leucocytes on endothelial cells to facilitate their migration into secondary lymphoid organs and inflammation sites. The most significant enrichment was in the immune response GO category (P = 2.52 × 10⁻²⁵). The most significant pathway in our KEGG analysis was Epstein-Barr virus infection (P = 9.91 × 10⁻⁶). CONCLUSIONS: Our meta-analysis demonstrated genes that were consistently DE and biological pathways associated with gene expression changes with pSS.
Affiliation(s)
- Gwan Gyu Song
- Division of Rheumatology, Department of Internal Medicine, Korea University College of Medicine, Seoul, Republic of Korea
- Jae-Hoon Kim
- Division of Rheumatology, Department of Internal Medicine, Korea University College of Medicine, Seoul, Republic of Korea
- Young Ho Seo
- Division of Rheumatology, Department of Internal Medicine, Korea University College of Medicine, Seoul, Republic of Korea
- Sung Jae Choi
- Division of Rheumatology, Department of Internal Medicine, Korea University College of Medicine, Seoul, Republic of Korea
- Jong Dae Ji
- Division of Rheumatology, Department of Internal Medicine, Korea University College of Medicine, Seoul, Republic of Korea
- Young Ho Lee
- Division of Rheumatology, Department of Internal Medicine, Korea University College of Medicine, Seoul, Republic of Korea
|
24
|
Lascorz J, Hemminki K, Försti A. Systematic enrichment analysis of gene expression profiling studies identifies consensus pathways implicated in colorectal cancer development. J Carcinog 2011; 10:7. [PMID: 21483658 PMCID: PMC3072670 DOI: 10.4103/1477-3163.78268] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 12/14/2010] [Accepted: 02/22/2011] [Indexed: 01/31/2023] Open
Abstract
Background: A large number of gene expression profiling (GEP) studies on colorectal carcinogenesis have been performed but no reliable gene signature has been identified so far due to the lack of reproducibility in the reported genes. There is growing evidence that functionally related genes, rather than individual genes, contribute to the etiology of complex traits. We used, as a novel approach, pathway enrichment tools to define functionally related genes that are consistently up- or down-regulated in colorectal carcinogenesis. Materials and Methods: We started the analysis with 242 unique annotated genes that had been reported by any of three recent meta-analyses covering GEP studies on genes differentially expressed in carcinoma vs normal mucosa. Most of these genes (218, 91.9%) had been reported in at least three GEP studies. These 242 genes were submitted to bioinformatic analysis using a total of nine tools to detect enrichment of Gene Ontology (GO) categories or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. As a final consistency criterion the pathway categories had to be enriched by several tools to be taken into consideration. Results: Our pathway-based enrichment analysis identified the categories of ribosomal protein constituents, extracellular matrix receptor interaction, carbonic anhydrase isozymes, and a general category related to inflammation and cellular response as significantly and consistently overrepresented entities. Conclusions: We triaged the genes covered by the published GEP literature on colorectal carcinogenesis and subjected them to multiple enrichment tools in order to identify the consistently enriched gene categories. These turned out to have known functional relationships to cancer development and thus deserve further investigation.
Affiliation(s)
- Jesús Lascorz
- Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
|
25
|
Eveland AL, Satoh-Nagasawa N, Goldshmidt A, Meyer S, Beatty M, Sakai H, Ware D, Jackson D. Digital gene expression signatures for maize development. Plant Physiology 2010; 154:1024-39. [PMID: 20833728 PMCID: PMC2971585 DOI: 10.1104/pp.110.159673] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 05/05/2023]
Abstract
Genome-wide expression signatures detect specific perturbations in developmental programs and contribute to functional resolution of key regulatory networks. In maize (Zea mays) inflorescences, mutations in the RAMOSA (RA) genes affect the determinacy of axillary meristems and thus alter branching patterns, an important agronomic trait. In this work, we developed and tested a framework for analysis of tag-based, digital gene expression profiles using Illumina's high-throughput sequencing technology and the newly assembled B73 maize reference genome. We also used a mutation in the RA3 gene to identify putative expression signatures specific to stem cell fate in axillary meristem determinacy. The RA3 gene encodes a trehalose-6-phosphate phosphatase and may act at the interface between developmental and metabolic processes. Deep sequencing of digital gene expression libraries, representing three biological replicate ear samples from wild-type and ra3 plants, generated 27 million 20- to 21-nucleotide reads with frequencies spanning 4 orders of magnitude. Unique sequence tags were anchored to 3'-ends of individual transcripts by DpnII and NlaIII digests, which were multiplexed during sequencing. We mapped 86% of nonredundant signature tags to the maize genome, which associated with 37,117 gene models and unannotated regions of expression. In total, 66% of genes were detected by at least nine reads in immature maize ears. We used comparative genomics to leverage existing information from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) in functional analyses of differentially expressed maize genes. Results from this study provide a basis for the analysis of short-read expression data in maize and resolved specific expression signatures that will help define mechanisms of action for the RA3 gene.
|
26
|
Harhay GP, Smith TP, Alexander LJ, Haudenschild CD, Keele JW, Matukumalli LK, Schroeder SG, Van Tassell CP, Gresham CR, Bridges SM, Burgess SC, Sonstegard TS. An atlas of bovine gene expression reveals novel distinctive tissue characteristics and evidence for improving genome annotation. Genome Biol 2010; 11:R102. [PMID: 20961407 PMCID: PMC3218658 DOI: 10.1186/gb-2010-11-10-r102] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 04/20/2010] [Revised: 07/22/2010] [Accepted: 10/20/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A comprehensive transcriptome survey, or gene atlas, provides information essential for a complete understanding of the genomic biology of an organism. We present an atlas of RNA abundance for 92 adult, juvenile and fetal cattle tissues and three cattle cell lines. RESULTS: The Bovine Gene Atlas was generated from 7.2 million unique digital gene expression tag sequences (300.2 million total raw tag sequences), from which 1.59 million unique tag sequences were identified that mapped to the draft bovine genome accounting for 85% of the total raw tag abundance. Filtering these tags yielded 87,764 unique tag sequences that unambiguously mapped to 16,517 annotated protein-coding loci in the draft genome accounting for 45% of the total raw tag abundance. Clustering of tissues based on tag abundance profiles generally confirmed ontology classification based on anatomy. There were 5,429 constitutively expressed loci and 3,445 constitutively expressed unique tag sequences mapping outside annotated gene boundaries that represent a resource for enhancing current gene models. Physical measures such as inferred transcript length or antisense tag abundance identified tissues with atypical transcriptional tag profiles. We report for the first time the tissue-specific variation in the proportion of mitochondrial transcriptional tag abundance. CONCLUSIONS: The Bovine Gene Atlas is the deepest and broadest transcriptome survey of any livestock genome to date. Commonalities and variation in sense and antisense transcript tag profiles identified in different tissues facilitate the examination of the relationship between gene expression, tissue, and gene function.
Affiliation(s)
- Gregory P Harhay
- USDA-ARS US Meat Animal Research Center, State Spur 18 D, Clay Center, NE 68901, USA
|
27
|
Quantification of the yeast transcriptome by single-molecule sequencing. Nat Biotechnol 2009; 27:652-8. [PMID: 19581875 DOI: 10.1038/nbt.1551] [Citation(s) in RCA: 153] [Impact Index Per Article: 10.2] [Received: 03/30/2009] [Accepted: 06/09/2009] [Indexed: 12/26/2022]
Abstract
We present single-molecule sequencing digital gene expression (smsDGE), a high-throughput, amplification-free method for accurate quantification of the full range of cellular polyadenylated RNA transcripts using a Helicos Genetic Analysis system. smsDGE involves a reverse-transcription and polyA-tailing sample preparation procedure followed by sequencing that generates a single read per transcript. We applied smsDGE to the transcriptome of Saccharomyces cerevisiae strain DBY746, using 6 of the available 50 channels in a single sequencing run, yielding on average 12 million aligned reads per channel. Using spiked-in RNA, accurate quantitative measurements were obtained over four orders of magnitude. High correlation was demonstrated across independent flow-cell channels, instrument runs and sample preparations. Transcript counting in smsDGE is highly efficient due to the representation of each transcript molecule by a single read. This efficiency, coupled with the high throughput enabled by the single-molecule sequencing platform, provides an alternative method for expression profiling.
|
28
|
Morrissy AS, Morin RD, Delaney A, Zeng T, McDonald H, Jones S, Zhao Y, Hirst M, Marra MA. Next-generation tag sequencing for cancer gene expression profiling. Genome Res 2009; 19:1825-35. [PMID: 19541910 DOI: 10.1101/gr.094482.109] [Citation(s) in RCA: 277] [Impact Index Per Article: 18.5] [Indexed: 12/13/2022]
Abstract
We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors, antisense transcripts, and intronic sequences, the latter possibly representing novel exons or genes. We observed increases in the diversity, abundance, and dynamic range of such rare transcripts and took advantage of the greater dynamic range of expression to identify, in cancers and normal libraries, altered expression ratios of alternative transcript isoforms. The strand-specific information of Tag-seq reads further allowed us to detect altered expression ratios of sense and antisense (S-AS) transcripts between cancer and normal libraries. S-AS transcripts were enriched in known cancer genes, while transcript isoforms were enriched in miRNA targeting sites. We found that transcript abundance had a stronger GC-bias in LongSAGE than Tag-seq, such that AT-rich tags were less abundant than GC-rich tags in LongSAGE. Tag-seq also performed better in gene discovery, identifying >98% of genes detected by LongSAGE and profiling a distinct subset of the transcriptome characterized by AT-rich genes, which was expressed at levels below those detectable by LongSAGE. Overall, Tag-seq is sensitive to rare transcripts, has less sequence composition bias relative to LongSAGE, and allows differential expression analysis for a greater range of transcripts, including transcripts encoding important regulatory molecules.
|
29
|
Kutlu B, Burdick D, Baxter D, Rasschaert J, Flamez D, Eizirik DL, Welsh N, Goodman N, Hood L. Detailed transcriptome atlas of the pancreatic beta cell. BMC Med Genomics 2009; 2:3. [PMID: 19146692 PMCID: PMC2635377 DOI: 10.1186/1755-8794-2-3] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.0] [Received: 07/03/2008] [Accepted: 01/15/2009] [Indexed: 01/21/2023] Open
Abstract
Background Gene expression patterns provide a detailed view of cellular functions. Comparison of profiles in disease vs normal conditions provides insights into the processes underlying disease progression. However, availability and integration of public gene expression datasets remains a major challenge. The aim of the present study was to explore the transcriptome of pancreatic islets and, based on this information, to prepare a comprehensive and open access inventory of insulin-producing beta cell gene expression, the Beta Cell Gene Atlas (BCGA). Methods We performed Massively Parallel Signature Sequencing (MPSS) analysis of human pancreatic islet samples and microarray analyses of purified rat beta cells, alpha cells and INS-1 cells, and compared the information with available array data in the literature. Results MPSS analysis detected around 7600 mRNA transcripts, of which around a third were of low abundance. We identified 2000 and 1400 transcripts that are enriched/depleted in beta cells compared to alpha cells and INS-1 cells, respectively. Microarray analysis identified around 200 transcription factors that are differentially expressed in either beta or alpha cells. We reanalyzed publicly available gene expression data and integrated these results with the new data from this study to build the BCGA. The BCGA contains basal (untreated conditions) gene expression level estimates in beta cells as well as in different cell types in human, rat and mouse pancreas. Hierarchical clustering of expression profile estimates classifies cell types based on species, while beta cells were clustered together. Conclusion Our gene atlas is a valuable source for detailed information on the gene expression distribution in beta cells and pancreatic islets along with insulin-producing cell lines. The BCGA tool, as well as the data and code used to generate the Atlas, are available at the T1Dbase website (T1DBase.org).
Affiliation(s)
- Burak Kutlu
- Institute for Systems Biology, Seattle, WA, USA.
|
30
|
Matukumalli LK, Schroeder SG. Sequence Based Gene Expression Analysis. Bioinformatics 2009. [DOI: 10.1007/978-0-387-92738-1_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
|
31
|
't Hoen PAC, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RHAM, de Menezes RX, Boer JM, van Ommen GJB, den Dunnen JT. Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res 2008; 36:e141. [PMID: 18927111 PMCID: PMC2588528 DOI: 10.1093/nar/gkn705] [Citation(s) in RCA: 560] [Impact Index Per Article: 35.0] [Indexed: 12/14/2022] Open
Abstract
The hippocampal expression profiles of wild-type mice and mice transgenic for δC-doublecortin-like kinase were compared with Solexa/Illumina deep sequencing technology and five different microarray platforms. With Illumina's digital gene expression assay, we obtained ∼2.4 million sequence tags per sample, their abundance spanning four orders of magnitude. Results were highly reproducible, even across laboratories. With a dedicated Bayesian model, we found differential expression of 3179 transcripts with an estimated false-discovery rate of 8.5%. This is a much higher figure than found for microarrays. The overlap in differentially expressed transcripts found with deep sequencing and microarrays was most significant for Affymetrix. The changes in expression observed by deep sequencing were larger than observed by microarrays or quantitative PCR. Relevant processes such as calmodulin-dependent protein kinase activity and vesicle transport along microtubules were found affected by deep sequencing but not by microarrays. While undetectable by microarrays, antisense transcription was found for 51% of all genes and alternative polyadenylation for 47%. We conclude that deep sequencing provides a major advance in robustness, comparability and richness of expression profiling data and is expected to boost collaborative, comparative and integrative genomics studies.
Affiliation(s)
- Peter A C 't Hoen
- The Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.
|
32
|
Hanriot L, Keime C, Gay N, Faure C, Dossat C, Wincker P, Scoté-Blachon C, Peyron C, Gandrillon O. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome. BMC Genomics 2008; 9:418. [PMID: 18796152 PMCID: PMC2562395 DOI: 10.1186/1471-2164-9-418] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 04/21/2008] [Accepted: 09/16/2008] [Indexed: 01/29/2023] Open
Abstract
Background "Open" transcriptome analysis methods allow gene expression to be studied without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high-throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results In order to perform a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method.
Affiliation(s)
- Lucie Hanriot
- UMR5534 CNRS Université Claude Bernard Lyon1, Université de Lyon, Institut Fédératif des Neurosciences de Lyon, Lyon cedex, France.
|
33
|
Rosenkranz R, Borodina T, Lehrach H, Himmelbauer H. Characterizing the mouse ES cell transcriptome with Illumina sequencing. Genomics 2008; 92:187-94. [PMID: 18602984 DOI: 10.1016/j.ygeno.2008.05.011] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Received: 03/22/2008] [Revised: 05/23/2008] [Accepted: 05/23/2008] [Indexed: 12/22/2022]
Abstract
Large datasets generated by Illumina sequencing are ideally suited to transcriptome characterization. We generated 3,052,501 27-mer reads from F1 mouse embryonic stem (ES) cell cDNA. Using the ELAND alignment tool, 74.5% of reads matched sequenced mouse resources, <1% were contaminants, and 3.7% failed quality control. Of the reads, 21.6% did not match mouse sequences using ELAND, but most of them were successfully aligned with mouse mRNAs using MegaBLAST. We conclude that most of the reads in the dataset are derived from mouse transcripts. A total of 14,434 mouse RefSeq genes were represented by at least 1 read. A Pearson correlation coefficient of 0.7 between Illumina sequencing and Illumina array expression data suggested similar results for both technologies. A weak 3' bias of reads was found. Reads from genes with low expression had lower GC content than the corresponding RefSeq genes, indicating a GC bias. Biases were confirmed with further Illumina read datasets generated with cDNA from mouse brain and from mutagen-treated F1 ES cells. We calculated relative expression values, because transcript length and read number were correlated. In the absence of signal saturation or background noise, we believe that short-read sequencing technologies will have a major impact on gene expression studies in the near future.
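As a rough illustration of the length correction mentioned in the abstract above, the sketch below computes a relative expression value as reads per kilobase of transcript per million mapped reads. The abstract does not give the authors' formula, so this RPKM-style normalization, the function name `relative_expression`, and the toy counts and lengths are assumptions made for illustration only.

```python
def relative_expression(read_counts, transcript_lengths):
    """Length- and depth-normalized expression (RPKM-style; an assumed formula).

    read_counts: dict mapping gene -> number of mapped reads
    transcript_lengths: dict mapping gene -> transcript length in bases
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("no mapped reads")
    values = {}
    for gene, count in read_counts.items():
        length_kb = transcript_lengths[gene] / 1000.0
        # Divide by transcript length (kb) and by sequencing depth (millions of reads).
        values[gene] = count * 1_000_000 / (length_kb * total_reads)
    return values


# Hypothetical toy data: geneA and geneB have equal read counts,
# but geneB is four times longer, so its relative expression is lower.
counts = {"geneA": 200, "geneB": 200, "geneC": 600}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 2000}
rel = relative_expression(counts, lengths)
for gene in sorted(rel):
    print(gene, round(rel[gene], 1))
```

With raw counts alone, geneA and geneB would look equally expressed; dividing by transcript length removes the advantage that longer transcripts have in collecting reads.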
Affiliation(s)
- Ruben Rosenkranz
- Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, Berlin, Germany
|
34
|
Dohm JC, Lottaz C, Borodina T, Himmelbauer H. Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res 2008; 36:e105. [PMID: 18660515 PMCID: PMC2532726 DOI: 10.1093/nar/gkn425] [Citation(s) in RCA: 748] [Impact Index Per Article: 46.8] [Indexed: 02/04/2023] Open
Abstract
Novel sequencing technologies permit the rapid production of large sequence data sets. These technologies are likely to revolutionize genetics and biomedical research, but a thorough characterization of the ultra-short read output is necessary. We generated and analyzed two Illumina 1G ultra-short read data sets, i.e. 2.8 million 27mer reads from a Beta vulgaris genomic clone and 12.3 million 36mers from the Helicobacter acinonychis genome. We found that error rates range from 0.3% at the beginning of reads to 3.8% at the end of reads. Wrong base calls are frequently preceded by base G. Base substitution error frequencies vary by 10- to 11-fold, with A > C transversion being among the most frequent and C > G transversions among the least frequent substitution errors. Insertions and deletions of single bases occur at very low rates. When simulating re-sequencing we found a 20-fold sequencing coverage to be sufficient to compensate errors by correct reads. The read coverage of the sequenced regions is biased; the highest read density was found in intervals with elevated GC content. High Solexa quality scores are over-optimistic and low scores underestimate the data quality. Our results show different types of biases and ways to detect them. Such biases have implications on the use and interpretation of Solexa data, for de novo sequencing, re-sequencing, the identification of single nucleotide polymorphisms and DNA methylation sites, as well as for transcriptome analysis.
Affiliation(s)
- Juliane C Dohm
- Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany
|
35
|
Chan SK, Griffith OL, Tai IT, Jones SJ. Meta-analysis of Colorectal Cancer Gene Expression Profiling Studies Identifies Consistently Reported Candidate Biomarkers. Cancer Epidemiol Biomarkers Prev 2008; 17:543-52. [DOI: 10.1158/1055-9965.epi-07-2615] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.4] [Indexed: 11/16/2022] Open
|
36
|
Hene L, Sreenu VB, Vuong MT, Abidi SHI, Sutton JK, Rowland-Jones SL, Davis SJ, Evans EJ. Deep analysis of cellular transcriptomes - LongSAGE versus classic MPSS. BMC Genomics 2007; 8:333. [PMID: 17892551 PMCID: PMC2104538 DOI: 10.1186/1471-2164-8-333] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 03/09/2007] [Accepted: 09/24/2007] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Deep transcriptome analysis will underpin a large fraction of post-genomic biology. 'Closed' technologies, such as microarray analysis, only detect the set of transcripts chosen for analysis, whereas 'open' e.g. tag-based technologies are capable of identifying all possible transcripts, including those that were previously uncharacterized. Although new technologies are now emerging, at present the major resources for open-type analysis are the many publicly available SAGE (serial analysis of gene expression) and MPSS (massively parallel signature sequencing) libraries. These technologies have never been compared for their utility in the context of deep transcriptome mining. RESULTS We used a single LongSAGE library of 503,431 tags and a "classic" MPSS library of 1,744,173 tags, both prepared from the same T cell-derived RNA sample, to compare the ability of each method to probe, at considerable depth, a human cellular transcriptome. We show that even though LongSAGE is more error-prone than MPSS, our LongSAGE library nevertheless generated 6.3-fold more genome-matching (and therefore likely error-free) tags than the MPSS library. An analysis of a set of 8,132 known genes detectable by both methods, and for which there is no ambiguity about tag matching, shows that MPSS detects only half (54%) the number of transcripts identified by SAGE (3,617 versus 1,955). Analysis of two additional MPSS libraries shows that each library samples a different subset of transcripts, and that in combination the three MPSS libraries (4,274,992 tags in total) still only detect 73% of the genes identified in our test set using SAGE. The fraction of transcripts detected by MPSS is likely to be even lower for uncharacterized transcripts, which tend to be more weakly expressed. The source of the loss of complexity in MPSS libraries compared to SAGE is unclear, but its effects become more severe with each sequencing cycle (i.e. as MPSS tag length increases). 
CONCLUSION We show that MPSS libraries are significantly less complex than much smaller SAGE libraries, revealing a serious bias in the generation of MPSS data unlikely to have been circumvented by later technological improvements. Our results emphasize the need for the rigorous testing of new expression profiling technologies.
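The detection figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the abstract (8,132 test-set genes, 3,617 detected by SAGE, 1,955 detected by MPSS); the function name `detection_ratio` is a hypothetical helper, not something taken from the paper.

```python
def detection_ratio(detected, reference):
    """Return the fraction of `reference` accounted for by `detected`."""
    if reference <= 0:
        raise ValueError("reference count must be positive")
    return detected / reference


test_set_genes = 8132   # known genes detectable by both methods
sage_detected = 3617    # transcripts identified by LongSAGE
mpss_detected = 1955    # transcripts identified by MPSS

# MPSS relative to SAGE: the "only half (54%)" figure in the abstract.
mpss_vs_sage = detection_ratio(mpss_detected, sage_detected)
print(f"MPSS / SAGE: {mpss_vs_sage:.0%}")  # prints "MPSS / SAGE: 54%"

# Share of the whole test set recovered by each method.
sage_share = detection_ratio(sage_detected, test_set_genes)
mpss_share = detection_ratio(mpss_detected, test_set_genes)
print(f"SAGE share of test set: {sage_share:.1%}")
print(f"MPSS share of test set: {mpss_share:.1%}")
```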
Affiliation(s)
- Lawrence Hene
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- Vattipally B Sreenu
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- Mai T Vuong
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- S Hussain I Abidi
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- Julian K Sutton
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- Sarah L Rowland-Jones
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- Simon J Davis
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
- Edward J Evans
- Nuffield Department of Clinical Medicine and MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, The University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DS, UK
|
37
|
Liu F, Jenssen TK, Trimarchi J, Punzo C, Cepko CL, Ohno-Machado L, Hovig E, Patrick Kuo W. Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates. BMC Genomics 2007; 8:153. [PMID: 17555589 PMCID: PMC1899500 DOI: 10.1186/1471-2164-8-153] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 11/23/2006] [Accepted: 06/07/2007] [Indexed: 02/06/2023] Open
Abstract
Background High-throughput systems for gene expression profiling have been developed and have matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised about the level of agreement across technologies. As part of an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). Results The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Conclusion Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive alternatives for measuring gene expression, and currently, both are important tools for transcriptome profiling.
Affiliation(s)
- Fang Liu
- Department of Tumor Biology, Rikshopitalet-Radiumhospitalet Medical Center, Montebello, NO-0310 Oslo, Norway
- PubGene AS, Vinderen, NO-0319 Oslo, Norway
- Jeff Trimarchi
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Claudio Punzo
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Connie L Cepko
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
- Eivind Hovig
- Department of Tumor Biology, Rikshopitalet-Radiumhospitalet Medical Center, Montebello, NO-0310 Oslo, Norway
- Department of Medical Informatics, Rikshopitalet-Radiumhospitalet Medical Center, Montebello, NO-0310 Oslo, Norway
- Winston Patrick Kuo
- Decision Systems Group, Brigham and Women's Hospital, Boston, MA, USA
- Department of Developmental Biology, Harvard School of Dental Medicine, Boston, MA, USA
- Department of Organismic and Evolutionary Biology/Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA
|
38
|
Shadeo A, Chari R, Vatcher G, Campbell J, Lonergan KM, Matisic J, van Niekerk D, Ehlen T, Miller D, Follen M, Lam WL, MacAulay C. Comprehensive serial analysis of gene expression of the cervical transcriptome. BMC Genomics 2007; 8:142. [PMID: 17543121 PMCID: PMC1899502 DOI: 10.1186/1471-2164-8-142] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 12/11/2006] [Accepted: 06/01/2007] [Indexed: 12/21/2022] Open
Abstract
Background More than half of the approximately 500,000 women diagnosed with cervical cancer worldwide each year will die from this disease. Investigation of genes expressed in precancer lesions compared to those expressed in normal cervical epithelium will yield insight into the early stages of disease. As such, establishing a baseline for comparison is critical in elucidating the abnormal biology of disease. In this study we examine the normal cervical tissue transcriptome and investigate the similarities and differences in relation to CIN III by Long-SAGE (L-SAGE). Results We have sequenced 691,390 tags from four L-SAGE libraries, increasing the existing gene expression data on cervical tissue 20-fold. One hundred and eighteen unique tags were highly expressed in normal cervical tissue and 107 of them mapped to unique genes, most of which belong to the ribosomal, calcium-binding and keratinizing gene families. We assessed these genes for aberrant expression in CIN III and five genes showed altered expression. In addition, we have identified twelve unique HPV 16 SAGE tags in the CIN III libraries that are absent in the normal libraries. Conclusion Establishing a baseline of gene expression in normal cervical tissue is key for identifying changes in cancer. We demonstrate the utility of this baseline data by identifying genes with aberrant expression in CIN III when compared to normal tissue.
Affiliation(s)
- Ashleen Shadeo
- Cancer Genetics & Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada
- Raj Chari
- Cancer Genetics & Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada
- Greg Vatcher
- Cancer Genetics & Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada
- Jennifer Campbell
- Cancer Genetics & Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada
- Kim M Lonergan
- Cancer Genetics & Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada
- Jasenka Matisic
- Pathology, British Columbia Cancer Agency, Vancouver, BC, Canada
- Dirk van Niekerk
- Pathology, British Columbia Cancer Agency, Vancouver, BC, Canada
- Thomas Ehlen
- Obstetrics and Gynaecology, The University of British Columbia, Vancouver, BC, Canada
- Gynecologic Oncology, British Columbia Cancer Agency, Vancouver, BC, Canada
- Dianne Miller
- Obstetrics and Gynaecology, The University of British Columbia, Vancouver, BC, Canada
- Gynecologic Oncology, British Columbia Cancer Agency, Vancouver, BC, Canada
- Michele Follen
- Gynecologic Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
- Wan L Lam
- Cancer Genetics & Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada
- Calum MacAulay
- Cancer Imaging, British Columbia Cancer Research Centre, Vancouver, BC, Canada
|
39
|
Ostrowski J, Mikula M, Karczmarski J, Rubel T, Wyrwicz LS, Bragoszewski P, Gaj P, Dadlez M, Butruk E, Regula J. Molecular defense mechanisms of Barrett's metaplasia estimated by an integrative genomics. J Mol Med (Berl) 2007; 85:733-43. [PMID: 17415542 DOI: 10.1007/s00109-007-0176-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 12/04/2006] [Revised: 01/11/2007] [Accepted: 01/30/2007] [Indexed: 12/18/2022]
Abstract
Barrett's esophagus is characterized by the replacement of squamous epithelium with specialized intestinal metaplastic mucosa. The exact mechanisms of initiation and development of Barrett's metaplasia remain unknown, but a hypothesis of "successful adaptation" against noxious reflux components has been proposed. To search for the repertoire of adaptation mechanisms of Barrett's metaplasia, we employed high-throughput functional genomic and proteomic methods that defined the molecular background of metaplastic mucosa resistance to reflux. Transcriptional profiling was established for 23 pairs of esophageal squamous epithelium and Barrett's metaplasia tissue samples using Affymetrix U133A 2.0 GeneChips and validated by quantitative real-time polymerase chain reaction. Differences in protein composition were assessed by electrophoretic and mass-spectrometry-based methods. Among 2,822 genes differentially expressed between Barrett's metaplasia and squamous epithelium, we observed significantly overexpressed metaplastic mucosa genes that encode cytokines and growth factors, constituents of extracellular matrix, basement membrane and tight junctions, and proteins involved in prostaglandin and phosphoinositol metabolism, nitric oxide production, and bioenergetics. Their expression likely reflects defense and repair responses of metaplastic mucosa, whereas overexpression of genes encoding heat shock proteins and several protein kinases in squamous epithelium may reflect lower resistance of normal esophageal epithelium than Barrett's metaplasia to reflux components. Despite the methodological and interpretative difficulties in data analyses discussed in this paper, our studies confirm that Barrett's metaplasia may be regarded as a specific microevolution allowing for accumulation of mucosal morphological and physiological changes that better protect against reflux injury.
Affiliation(s)
- Jerzy Ostrowski
- Department of Gastroenterology, Medical Center for Postgraduate Education, Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, ul. Roentgena 5, 02-781, Warsaw, Poland.
|
40
|
Wang SM. Understanding SAGE data. Trends Genet 2006; 23:42-50. [PMID: 17109989 DOI: 10.1016/j.tig.2006.11.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Received: 04/12/2006] [Revised: 10/05/2006] [Accepted: 11/01/2006] [Indexed: 02/08/2023]
Abstract
Serial analysis of gene expression (SAGE) is a method for identifying and quantifying transcripts from eukaryotic genomes. Since its invention, SAGE has been widely applied to analyzing gene expression in many biological and medical studies. Vast amounts of SAGE data have been collected and more than a thousand SAGE-related studies have been published since the mid-1990s. The principle of SAGE has been developed to address specific issues such as determination of normal gene structure and identification of abnormal genome structural changes. This review focuses on the general features of SAGE data, including the specificity of SAGE tags with respect to their original transcripts, the quantitative nature of SAGE data for differentially expressed genes, the reproducibility, the comparability of SAGE with microarray and the future potential of SAGE. Understanding these basic features should aid the proper interpretation of SAGE data to address biological and medical questions.
Affiliation(s)
- San Ming Wang
- Center for Functional Genomics, ENH Research Institute, Robert H. Lurie Comprehensive Cancer Center, Northwestern University, 1001 University Place, Evanston, IL 60201, USA.
|
41
|
Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R. NCBI GEO: mining tens of millions of expression profiles--database and tools update. Nucleic Acids Res 2006; 35:D760-5. [PMID: 17099226 PMCID: PMC1669752 DOI: 10.1093/nar/gkl887] [Citation(s) in RCA: 1010] [Impact Index Per Article: 56.1] [Indexed: 11/14/2022] Open
Abstract
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications are available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at
Affiliation(s)
- Tanya Barrett
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA.
|
42
|
Arhondakis S, Clay O, Bernardi G. Compositional properties of human cDNA libraries: practical implications. FEBS Lett 2006; 580:5772-8. [PMID: 17022979 DOI: 10.1016/j.febslet.2006.09.034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 06/30/2006] [Revised: 09/12/2006] [Accepted: 09/19/2006] [Indexed: 01/28/2023]
Abstract
The strikingly wide and bimodal gene distribution exhibited by the human genome has prompted us to study the correlations between EST-counts (expression levels) and base composition of genes, especially since existing data are contradictory. Here we investigate how cDNA library preparation affects the GC distributions of ESTs and/or genes found in the library, and address consequences for expression studies. We observe that strongly anomalous GC distributions often indicate experimental biases or deficits during their preparation. We propose the use of compositional distributions of raw ESTs from a cDNA library, and/or of the genes they represent, as a simple and effective tool for quality control.
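The quality-control idea in the abstract above, inspecting the GC distribution of the raw ESTs in a library, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the ten equal-width bins, and the toy sequences are all hypothetical, and ambiguous bases such as N are simply ignored.

```python
def gc_counts(seq):
    """Return (number of G/C bases, number of unambiguous A/C/G/T bases)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc, total


def gc_histogram(sequences, n_bins=10):
    """Count sequences per GC-fraction bin; bin i covers [i/n_bins, (i+1)/n_bins)."""
    hist = [0] * n_bins
    for seq in sequences:
        gc, total = gc_counts(seq)
        if total == 0:
            continue  # skip sequences with no unambiguous bases
        # Integer arithmetic avoids floating-point edge effects at bin borders;
        # a GC fraction of exactly 1.0 is placed in the last bin.
        index = min(gc * n_bins // total, n_bins - 1)
        hist[index] += 1
    return hist


# Hypothetical toy "library" of EST sequences.
ests = ["ATATATATAT", "GCGCGCGCGC", "ATGCATGCAT", "NNNN"]
hist = gc_histogram(ests)
for i, count in enumerate(hist):
    print(f"{i * 10:3d}-{(i + 1) * 10:3d}% GC: {count}")
```

A library whose histogram is strongly skewed relative to what is expected for the source genome would, following the abstract's argument, be a candidate for closer inspection of how it was prepared.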
Affiliation(s)
- Stilianos Arhondakis
- Laboratory of Molecular Evolution, Stazione Zoologica Anton Dohrn, 80121 Naples, Italy
|