1
Amniouel S, Yalamanchili K, Sankararaman S, Jafri MS. Evaluating Ovarian Cancer Chemotherapy Response Using Gene Expression Data and Machine Learning. BIOMEDINFORMATICS 2024; 4:1396-1424. [PMID: 39149564 PMCID: PMC11326537 DOI: 10.3390/biomedinformatics4020077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Background: Ovarian cancer (OC) is the most lethal gynecological cancer in the United States. Among the different types of OC, serous ovarian cancer (SOC) is the most prevalent. Transcriptomics techniques generate extensive gene expression data, yet only a few of these genes are relevant to clinical diagnosis. Methods: Feature selection (FS) methods address the challenges of high dimensionality in extensive datasets. This study proposes a computational framework that applies FS techniques to identify genes highly associated with platinum-based chemotherapy response in SOC patients. Using SOC datasets from the Gene Expression Omnibus (GEO) database, the LASSO and varSelRF FS methods were employed. Machine learning classification algorithms such as random forest (RF) and support vector machine (SVM) were also used to evaluate the performance of the models. Results: The proposed framework identified biomarker panels of 9 and 10 genes that are highly correlated with platinum-paclitaxel and platinum-only response in SOC patients, respectively. Predictive models trained on the identified gene signatures achieved an accuracy above 90%. Conclusions: We propose that applying multiple feature selection methods not only effectively reduces the number of identified biomarkers, enhancing their biological relevance, but also corroborates the efficacy of drug response prediction models in cancer treatment.
Affiliation(s)
- Soukaina Amniouel
  - School of System Biology, George Mason University, Fairfax, VA 22030, USA
- Keertana Yalamanchili
  - School of System Biology, George Mason University, Fairfax, VA 22030, USA
  - School of Engineering, Brown University, Providence, RI 02912, USA
- Sreenidhi Sankararaman
  - School of System Biology, George Mason University, Fairfax, VA 22030, USA
  - Department of Biomedical Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
- Mohsin Saleet Jafri
  - School of System Biology, George Mason University, Fairfax, VA 22030, USA
  - Center for Biomedical Engineering and Technology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
2
Tihagam RD, Bhatnagar S. A multi-platform normalization method for meta-analysis of gene expression data. Methods 2023:S1046-2023(23)00110-X. [PMID: 37423473 DOI: 10.1016/j.ymeth.2023.06.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/25/2023] [Revised: 06/21/2023] [Accepted: 06/29/2023] [Indexed: 07/11/2023]
Abstract
Transcriptomic profiling is a mainstay of translational cancer research and is often used to identify cancer subtypes, stratify responder vs. non-responder patients, predict survival, and identify potential targets for therapeutic intervention. Analysis of gene expression data gathered by RNA sequencing (RNA-seq) and microarray is generally the first step in identifying and characterizing cancer-associated molecular determinants. The methodological advancements and reduced costs associated with transcriptomic profiling have increased the number of publicly available gene expression profiles for cancer subtypes. Data from multiple datasets are routinely integrated to increase the number of samples, improve statistical power, and provide better insight into the heterogeneity of the biological determinant. However, using raw data from multiple platforms, species, and sources introduces systematic variations due to noise, batch effects, and biases. The integrated data are therefore mathematically adjusted through normalization, which allows direct comparison of expression measures among studies while minimizing technical and systemic variations. This study applied meta-analysis to multiple independent Affymetrix microarray and Illumina RNA-seq datasets available through the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA). We have previously identified tripartite motif-containing 37 (TRIM37), a breast cancer oncogene that drives tumorigenesis and metastasis in triple-negative breast cancer. In this article, we adapted and assessed the validity of Stouffer's z-score normalization method to interrogate TRIM37 expression across different cancer types using multiple large-scale datasets.
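As a rough illustration of the idea behind the abstract above, the following Python sketch standardizes expression values within one dataset and then combines per-dataset z-scores with the textbook Stouffer formula, Z = sum(w_i * z_i) / sqrt(sum(w_i^2)). It is only a minimal sketch of the general technique, not the authors' adapted method: the function names `zscore` and `stouffer`, the optional weights, and the example numbers are all assumptions made for illustration.

```python
import math
import statistics


def zscore(values):
    """Standardize one dataset's expression values to mean 0 and sample SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in values]


def stouffer(z_scores, weights=None):
    """Combine per-dataset z-scores with Stouffer's (optionally weighted) method."""
    if weights is None:
        weights = [1.0] * len(z_scores)  # unweighted: Z = sum(z) / sqrt(k)
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


# Hypothetical example: one gene's z-score in three independent datasets,
# weighted by the square root of each (made-up) sample size.
per_dataset_z = [1.8, 2.1, 0.9]
sample_sizes = [40, 120, 60]
combined = stouffer(per_dataset_z, [math.sqrt(n) for n in sample_sizes])
print(f"combined z = {combined:.3f}")
```

Weighting by the square root of the sample size is one common convention; an unweighted combination is obtained by omitting the `weights` argument.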
Affiliation(s)
- Rachisan Djiake Tihagam
  - Department of Medical Microbiology and Immunology, The University of California Davis School of Medicine, Davis, CA 95616, USA
- Sanchita Bhatnagar
  - Department of Medical Microbiology and Immunology, The University of California Davis School of Medicine, Davis, CA 95616, USA
3
Resistin-like beta reduction is associated to low survival rate and is downregulated by adjuvant therapy in colorectal cancer patients. Sci Rep 2023; 13:1490. [PMID: 36707698 PMCID: PMC9883247 DOI: 10.1038/s41598-023-28450-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/04/2022] [Accepted: 01/18/2023] [Indexed: 01/28/2023]
Abstract
Colorectal cancer (CRC) is one of the most common cancers, accounting for 1.8 million new cases worldwide every year. The identification of new potential therapeutic targets therefore represents a continuous challenge in improving the survival and quality of life of CRC patients. We analyzed a microarray dataset consisting of colon biopsies from healthy subjects (HS) and CRC patients. The results were further confirmed in a clinical setting by evaluating a series of CRC patients to assess the expression of Resistin-Like Beta (RETNLB) and correlate it with their clinical data. Our results showed a significant reduction of RETNLB expression in CRC biopsies compared to the HS mucosa. This reduction was significantly associated with TNM grade and patient age. Furthermore, a significant positive correlation was found in subjects with mutated KRAS, TP53, and BRAF. In particular, patients with a poor prognosis at 5 years exhibited lower RETNLB levels. The in-silico analysis data were confirmed by histochemical analysis in a series of CRC patients recruited by our group. The results indicate that low RETNLB levels are associated with an unfavorable prognosis in CRC patients and that RETNLB expression also depends on adjuvant therapy. Further studies are warranted to evaluate the molecular mechanisms underlying the role of RETNLB in CRC progression.
4
Zhao Z, Feng Q, Zhang Y, Ning Z. Adaptive risk-aware sharable and individual subspace learning for cancer survival analysis with multi-modality data. Brief Bioinform 2023; 24:6847200. [PMID: 36433784 DOI: 10.1093/bib/bbac489] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2022] [Revised: 09/16/2022] [Accepted: 10/15/2022] [Indexed: 11/27/2022]
Abstract
Biomedical multi-modality data (also named multi-omics data) refer to data that span different types and derive from multiple sources in clinical practice (e.g. gene sequences, proteomics and histopathological images), which can provide comprehensive perspectives on cancers and generally improve the performance of survival models. However, the performance improvement of multi-modality survival models may be hindered by two key issues: (1) how to learn and fuse modality-sharable and modality-individual representations from multi-modality data; (2) how to explore the potential risk-aware characteristics in each risk subgroup, which is beneficial to risk stratification and prognosis evaluation. Additionally, learning-based survival models generally involve numerous hyper-parameters, which require time-consuming parameter setting and might result in a suboptimal solution. In this paper, we propose an adaptive risk-aware sharable and individual subspace learning method for cancer survival analysis. The proposed method jointly learns sharable and individual subspaces from multi-modality data, while two auxiliary terms (i.e. intra-modality complementarity and inter-modality incoherence) are developed to preserve the complementary and distinctive properties of each modality. Moreover, it is equipped with a grouping co-expression constraint for obtaining risk-aware representations and preserving local consistency. Furthermore, an adaptive-weighted strategy is employed to efficiently estimate crucial parameters during the training stage. Experimental results on three public datasets demonstrate the superiority of our proposed model.
Affiliation(s)
- Zhangxin Zhao
  - School of Biomedical Engineering at Southern Medical University, Guangdong, China
- Qianjin Feng
  - School of Biomedical Engineering at Southern Medical University, Guangdong, China
- Yu Zhang
  - School of Biomedical Engineering, Southern Medical University, Guangdong, China
- Zhenyuan Ning
  - School of Biomedical Engineering at Southern Medical University, Guangdong, China
5
Kumar R, Khatri A, Acharya V. Deep learning uncovers distinct behavior of rice network to pathogens response. iScience 2022; 25:104546. [PMID: 35754717 PMCID: PMC9218438 DOI: 10.1016/j.isci.2022.104546] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/20/2022] [Revised: 05/06/2022] [Accepted: 06/02/2022] [Indexed: 12/15/2022]
Abstract
Rice, apart from abiotic stress, is prone to attack by multiple pathogens. Two rice pathogens in particular, the bacterium Xanthomonas oryzae (Xoo) and the hemibiotrophic fungus Magnaporthe oryzae, have been extensively explored for more than a decade. However, because holistic studies are lacking, we designed a deep learning-based rice network model (DLNet) that explores the quantitative differences resulting in the distinct rice network architecture. Validation studies on rice in response to biotic stresses show that DLNet outperforms other machine learning methods. The current findings indicate the compactness of the rice PTI network and the rise of independent modules in the rice ETI network, resulting in similar patterns of the plant immune response. The results also show more independent network modules and minimal structural disorder in the rice-M. oryzae model compared with the rice-Xoo model, revealing the different adaptation strategies of the rice plant to evade pathogen effectors.
Affiliation(s)
- Ravi Kumar
  - Functional Genomics and Complex System Lab, Biotechnology Division, The Himalayan Centre for High-throughput Computational Biology (HiCHiCoB, A BIC Supported by DBT, India), CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Abhishek Khatri
  - Functional Genomics and Complex System Lab, Biotechnology Division, The Himalayan Centre for High-throughput Computational Biology (HiCHiCoB, A BIC Supported by DBT, India), CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh, India
- Vishal Acharya
  - Functional Genomics and Complex System Lab, Biotechnology Division, The Himalayan Centre for High-throughput Computational Biology (HiCHiCoB, A BIC Supported by DBT, India), CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
6
Sarafidis M, Lambrou GI, Zoumpourlis V, Koutsouris D. An Integrated Bioinformatics Analysis towards the Identification of Diagnostic, Prognostic, and Predictive Key Biomarkers for Urinary Bladder Cancer. Cancers (Basel) 2022; 14:cancers14143358. [PMID: 35884419 PMCID: PMC9319344 DOI: 10.3390/cancers14143358] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/08/2022] [Revised: 07/03/2022] [Accepted: 07/06/2022] [Indexed: 02/04/2023]
Abstract
Simple Summary: Bladder cancer remains a challenge as far as its prognosis and treatment are concerned. The investigation of potential biomarkers and therapeutic targets is indispensable and still in progress. Most studies attempt to identify differential signatures between distinct molecular tumor subtypes. Therefore, keeping in mind the heterogeneity of urinary bladder tumors, we attempted to identify a consensus gene-related signature between the common expression profile of bladder cancer and control samples. In the quest for substantive features, we were able to identify key hub genes, whose signatures could hold diagnostic, prognostic, or therapeutic significance, but, primarily, could contribute to a better understanding of urinary bladder cancer biology. Abstract: Bladder cancer (BCa) is one of the most prevalent cancers worldwide and accounts for high morbidity and mortality. This study intended to elucidate potential key biomarkers related to the occurrence, development, and prognosis of BCa through an integrated bioinformatics analysis. In this context, a systematic meta-analysis, integrating 18 microarray gene expression datasets from the GEO repository into a merged meta-dataset, identified 815 robust differentially expressed genes (DEGs). The key hub genes resulting from DEG-based protein–protein interaction and weighted gene co-expression network analyses were screened for their differential expression in urine and blood plasma samples of BCa patients. Subsequently, they were tested for their prognostic value, and a three-gene signature model, including COL3A1, FOXM1, and PLK4, was built. In addition, they were tested for their predictive value regarding muscle-invasive BCa patients’ response to neoadjuvant chemotherapy. A six-gene signature model, including ANXA5, CD44, NCAM1, SPP1, CDCA8, and KIF14, was developed.
In conclusion, this study identified nine key biomarker genes, namely ANXA5, CDT1, COL3A1, SPP1, VEGFA, CDCA8, HJURP, TOP2A, and COL6A1, which were differentially expressed in urine or blood of BCa patients, held a prognostic or predictive value, and were immunohistochemically validated. These biomarkers may be of significance as prognostic and therapeutic targets for BCa.
Affiliation(s)
- Michail Sarafidis
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou Str., 15780 Athens, Greece
  - Correspondence: Tel.: +30-210-772-2430
- George I. Lambrou
  - Choremeio Research Laboratory, First Department of Pediatrics, National and Kapodistrian University of Athens, 8 Thivon & Levadeias Str., 11527 Athens, Greece
  - University Research Institute of Maternal and Child Health and Precision Medicine, National and Kapodistrian University of Athens, 8 Thivon & Levadeias Str., 11527 Athens, Greece
- Vassilis Zoumpourlis
  - Biomedical Applications Unit, Institute of Chemical Biology, National Hellenic Research Foundation, 48 Vas. Konstantinou Ave., 11635 Athens, Greece
- Dimitrios Koutsouris
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou Str., 15780 Athens, Greece
7
Zhou X, He YZ, Liu D, Lin CR, Liang D, Huang R, Wang L. An Autophagy-Related Gene Signature can Better Predict Prognosis and Resistance in Diffuse Large B-Cell Lymphoma. Front Genet 2022; 13:862179. [PMID: 35846146 PMCID: PMC9280409 DOI: 10.3389/fgene.2022.862179] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/25/2022] [Accepted: 05/12/2022] [Indexed: 01/11/2023]
Abstract
Background: Diffuse large B-cell lymphoma (DLBCL) is a highly heterogeneous disease, and about 30%–40% of patients will develop relapsed/refractory DLBCL. In this study, we aimed to develop a gene signature to predict survival outcomes of DLBCL patients based on autophagy-related genes (ARGs). Methods: We sequentially used univariate, least absolute shrinkage and selection operator (LASSO), and multivariate Cox regression analyses to build a gene signature. The Kaplan–Meier curve and the area under the receiver operating characteristic curve (AUC) were used to estimate the prognostic capability of the gene signature. GSEA analysis, the ESTIMATE and ssGSEA algorithms, and one-class logistic regression were used to analyze differences in pathways, immune response, and tumor stemness between the high- and low-risk groups. Results: In both the training cohort and the validation cohorts, high-risk patients had inferior overall survival compared with low-risk patients. The nomogram combining the autophagy-related gene signature and clinical factors discriminated survival outcomes better and showed favorable consistency between predicted and actual survival. GSEA analysis found that patients in the high-risk group were associated with the activation of doxorubicin resistance, NF-κB, cell cycle, and DNA replication pathways. The results of ESTIMATE, ssGSEA, and mRNAsi showed that the high-risk group exhibited lower immune cell infiltration and immune activation responses and had higher similarity to cancer stem cells. Conclusion: We proposed a novel and reliable autophagy-related gene signature that was capable of predicting the survival and resistance of patients with DLBCL and could guide individualized treatment in the future.
Affiliation(s)
- Xuan Zhou
  - Second Clinical Medical College of Southern Medical University, Zhujiang Hospital of Southern Medical University, Guangzhou, China
  - Department of Endocrinology, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Ying-Zhi He
  - Department of Hematology, Zhujiang Hospital of Southern Medical University, Guangzhou, China
- Dan Liu
  - The First School of Clinical Medicine, Guangdong Medical University, Zhanjiang, China
- Chao-Ran Lin
  - The First School of Clinical Medicine, Guangdong Medical University, Zhanjiang, China
- Dan Liang
  - Second Clinical Medical College of Southern Medical University, Zhujiang Hospital of Southern Medical University, Guangzhou, China
- Rui Huang
  - Department of Hematology, Zhujiang Hospital of Southern Medical University, Guangzhou, China
  - *Correspondence: Rui Huang; Liang Wang
- Liang Wang
  - Department of Hematology, Beijing TongRen Hospital, Capital Medical University, Beijing, China
  - *Correspondence: Rui Huang; Liang Wang
8
Tan K, Huang W, Liu X, Hu J, Dong S. A Hierarchical Graph Convolution Network for Representation Learning of Gene Expression Data. IEEE J Biomed Health Inform 2021; 25:3219-3229. [PMID: 33449889 DOI: 10.1109/jbhi.2021.3052008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
The curse of dimensionality, which is caused by high dimensionality and low sample size, is a major challenge in gene expression data analysis. However, the real situation is even worse: labelling data is laborious and time-consuming, so only a small part of the limited samples will be labelled. Having so few labelled samples further increases the difficulty of training deep learning models. Interpretability is an important requirement in biomedicine. Many existing deep learning methods try to provide interpretability, but are rarely applied to gene expression data. Recent semi-supervised graph convolution network methods try to address these problems by smoothing the label information over a graph. However, to the best of our knowledge, these methods only utilize graphs in either the feature space or the sample space, which restricts their performance. We propose a transductive semi-supervised representation learning method called a hierarchical graph convolution network (HiGCN) to aggregate the information of gene expression data in both feature and sample spaces. HiGCN first utilizes external knowledge to construct a feature graph and a similarity kernel to construct a sample graph. Then, two spatial-based GCNs are used to aggregate information on these graphs. To validate the model's performance, synthetic and real datasets are provided to lend empirical support. Compared with two recent models and three traditional models, HiGCN learns better representations of gene expression data, and these representations improve the performance of downstream tasks, especially when the model is trained on a few labelled samples. Important features can be extracted from our model to provide reliable interpretability.
9
Savino A, De Marzo N, Provero P, Poli V. Meta-Analysis of Microdissected Breast Tumors Reveals Genes Regulated in the Stroma but Hidden in Bulk Analysis. Cancers (Basel) 2021; 13:3371. [PMID: 34282769 PMCID: PMC8268805 DOI: 10.3390/cancers13133371] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/14/2021] [Revised: 06/22/2021] [Accepted: 06/29/2021] [Indexed: 02/06/2023]
Abstract
Transcriptome data provide a valuable resource for the study of cancer molecular mechanisms, but technical biases, sample heterogeneity, and small sample sizes result in poorly reproducible lists of regulated genes. Additionally, the presence of multiple cellular components contributing to cancer development complicates the interpretation of bulk transcriptomic profiles. To address these issues, we collected 48 microarray datasets derived from laser capture microdissected stroma or epithelium in breast tumors and performed a meta-analysis identifying robust lists of differentially expressed genes. This was used to create a database with carefully harmonized metadata that we make freely available to the research community. As predicted, combining the results of multiple datasets improved statistical power. Moreover, the separate analysis of stroma and epithelium allowed the identification of genes with different contributions in each compartment, which would not be detected by bulk analysis due to their distinct regulation in the two compartments. Our method can be profitably used to help in the discovery of biomarkers and the identification of functionally relevant genes in both the stroma and the epithelium. This database was made to be readily accessible through a user-friendly web interface.
Affiliation(s)
- Aurora Savino
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
- Niccolò De Marzo
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
- Paolo Provero
  - Department of Neurosciences “Rita Levi Montalcini”, University of Turin, Corso Massimo D’Azeglio 52, 10126 Turin, Italy
  - Center for Omics Sciences, Ospedale San Raffaele IRCCS, Via Olgettina 60, 20132 Milan, Italy
- Valeria Poli
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
10
Yu F, Quan F, Xu J, Zhang Y, Xie Y, Zhang J, Lan Y, Yuan H, Zhang H, Cheng S, Xiao Y, Li X. Breast cancer prognosis signature: linking risk stratification to disease subtypes. Brief Bioinform 2020; 20:2130-2140. [PMID: 30184043 DOI: 10.1093/bib/bby073] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 11/26/2017] [Revised: 07/14/2018] [Accepted: 07/28/2018] [Indexed: 01/29/2023]
Abstract
Breast cancer is a very complex and heterogeneous disease with variable molecular mechanisms of carcinogenesis and clinical behaviors. The identification of prognostic risk factors may enable effective diagnosis and treatment of breast cancer. In particular, numerous gene-expression-based prognostic signatures have been developed, and some of them have already been applied in clinical trials and practice. In this study, we summarized several representative gene-expression-based signatures with significant prognostic value and separately assessed their prognostic ability in their originally targeted breast cancer populations. Notably, many of the collected signatures were originally designed to predict the outcomes of estrogen receptor positive (ER+) patients or the whole breast cancer cohort; there are no typical signatures used for prognostic prediction in a specific population of patients defined by intrinsic subtype. We thus attempted to identify subtype-specific prognostic signatures via a computational framework for analyzing multi-omics profiles and patient survival. For both the discovery data set and an independent data set, we confirmed that the subtype-specific signature is a strong and significant independent prognostic factor in the corresponding cohort. These results indicate that the subtype-specific prognostic signature has a much higher resolution in risk stratification, which may lead to improved therapies and precision medicine for patients with breast cancer.
Affiliation(s)
- Fulong Yu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Fei Quan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Jinyuan Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yi Xie
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Jingyu Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Yujia Lan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Huating Yuan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Hongyi Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Shujun Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
  - State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, Cancer Institute and Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100021, China
- Yun Xiao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
11
Rong Z, Tan Q, Cao L, Zhang L, Deng K, Huang Y, Zhu ZJ, Li Z, Li K. NormAE: Deep Adversarial Learning Model to Remove Batch Effects in Liquid Chromatography Mass Spectrometry-Based Metabolomics Data. Anal Chem 2020; 92:5082-5090. [DOI: 10.1021/acs.analchem.9b05460] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/19/2022]
Affiliation(s)
- Zhiwei Rong
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Qilong Tan
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Lei Cao
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Liuchao Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Kui Deng
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Yue Huang
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Zheng-Jiang Zhu
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, China
- Zhenzi Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
- Kang Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
12
Kazakova A, Kakkola L, Ziegler T, Syrjänen R, Päkkilä H, Waris M, Soukka T, Julkunen I. Pandemic influenza A(H1N1pdm09) vaccine induced high levels of influenza-specific IgG and IgM antibodies as analyzed by enzyme immunoassay and dual-mode multiplex microarray immunoassay methods. Vaccine 2020; 38:1933-1942. [PMID: 31987689 DOI: 10.1016/j.vaccine.2020.01.022] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/03/2019] [Revised: 01/03/2020] [Accepted: 01/07/2020] [Indexed: 12/17/2022]
Abstract
Influenza A viruses continue to circulate throughout the world as yearly epidemics or occasional pandemics. Influenza infections can be prevented by seasonal multivalent or monovalent pandemic vaccines. In the present study, we describe a novel multiplex microarray immunoassay (MAIA) for simultaneous measurement of virus-specific IgG and IgM antibodies, using sera from Pandemrix-vaccinated adults collected at day 0 and at 28 and 180 days after vaccination as the study material. MAIA showed excellent correlation with a conventional enzyme immunoassay (EIA) for both IgG and IgM anti-influenza A antibodies and good correlation with the hemagglutination inhibition (HI) test. Pandemrix vaccine induced 5-30 fold increases in anti-H1N1pdm09 influenza antibodies as measured by HI, EIA or MAIA. A clear increase in virus-specific IgG antibodies was found in 93-97% of vaccinees by MAIA and EIA. Virus-specific IgM antibodies were found in 90-92% of vaccinees by MAIA and EIA, respectively, and IgM antibodies persisted for up to 6 months after vaccination in 55-62% of the vaccinees. Pandemic influenza vaccine induced strong anti-influenza A IgG and IgM responses that persisted for several months after vaccination. MAIA was demonstrated to be an excellent method for simultaneous measurement of antiviral IgG and IgM antibodies against multiple virus antigens. The method is thus well suited for large-scale epidemiological and vaccine immunity studies.
Affiliation(s)
- Anna Kazakova
- Institute of Biomedicine/Virology, University of Turku, Kiinamyllynkatu 10, 20520 Turku, Finland
- Laura Kakkola
- Institute of Biomedicine/Virology, University of Turku, Kiinamyllynkatu 10, 20520 Turku, Finland
- Thedi Ziegler
- Research Center for Child Psychiatry, University of Turku, Itäinen Pitkäkatu 1, 20520 Turku, Finland
- Ritva Syrjänen
- National Institute for Health and Welfare, Mannerheimintie 166, 00300 Helsinki, Finland
- Henna Päkkilä
- Department of Biotechnology, University of Turku, Kiinamyllynkatu 10, 20520 Turku, Finland
- Matti Waris
- Institute of Biomedicine/Virology, University of Turku, Kiinamyllynkatu 10, 20520 Turku, Finland; Turku University Hospital, Clinical Microbiology, Kiinamyllynkatu 10, 20520 Turku, Finland
- Tero Soukka
- Department of Biotechnology, University of Turku, Kiinamyllynkatu 10, 20520 Turku, Finland
- Ilkka Julkunen
- Institute of Biomedicine/Virology, University of Turku, Kiinamyllynkatu 10, 20520 Turku, Finland; Turku University Hospital, Clinical Microbiology, Kiinamyllynkatu 10, 20520 Turku, Finland.
|
13
|
The Transcription Factor Elf3 Is Essential for a Successful Mesenchymal to Epithelial Transition. Cells 2019; 8:cells8080858. [PMID: 31404945 PMCID: PMC6721682 DOI: 10.3390/cells8080858] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 06/10/2019] [Revised: 07/22/2019] [Accepted: 07/27/2019] [Indexed: 12/13/2022] Open
Abstract
The epithelial to mesenchymal transition (EMT) and the mesenchymal to epithelial transition (MET) are two critical biological processes that are involved in both physiological events such as embryogenesis and development and also pathological events such as tumorigenesis. They present with dramatic changes in cellular morphology and gene expression exhibiting acute changes in E-cadherin expression. Despite the comprehensive understanding of EMT, the regulation of MET is far from being understood. To find novel regulators of MET, we hypothesized that such factors would correlate with Cdh1 expression. Bioinformatics examination of several expression profiles suggested Elf3 as a strong candidate. Depletion of Elf3 at the onset of MET severely impaired the progression to the epithelial state. This MET defect was explained, in part, by the absence of E-cadherin at the plasma membrane. Moreover, during MET, ELF3 interacts with the Grhl3 promoter and activates its expression. Our findings present novel insights into the regulation of MET and reveal ELF3 as an indispensable guardian of the epithelial state. A better understanding of MET will, eventually, lead to better management of metastatic cancers.
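The candidate screen described in this abstract (looking for factors whose expression correlates with Cdh1) can be illustrated with a minimal sketch. The expression values, the gene labels `geneA` to `geneC`, and the helper names `pearson` and `rank_by_correlation` are all hypothetical, and the use of Pearson correlation is an assumption: the abstract does not say which correlation measure the authors used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)

def rank_by_correlation(reference, candidates):
    """Return (gene, r) pairs sorted by decreasing correlation with the reference."""
    scored = [(gene, pearson(reference, profile)) for gene, profile in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy expression values across five samples (invented for illustration only).
cdh1 = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "geneA": [2.0, 4.1, 5.9, 8.2, 10.0],  # tracks the reference closely
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related
}
ranking = rank_by_correlation(cdh1, candidates)
for gene, r in ranking:
    print(f"{gene}: r = {r:+.3f}")
```

In a real screen the profiles would come from the expression datasets themselves, and a correlation ranking like this would only nominate candidates for experimental follow-up.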
|
14
|
An Efficient Feature Selection Strategy Based on Multiple Support Vector Machine Technology with Gene Expression Data. BIOMED RESEARCH INTERNATIONAL 2018; 2018:7538204. [PMID: 30228989 PMCID: PMC6136508 DOI: 10.1155/2018/7538204] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/02/2018] [Revised: 07/17/2018] [Accepted: 07/29/2018] [Indexed: 11/18/2022]
Abstract
The application of gene expression data to the diagnosis and classification of cancer has become a hot issue in the field of cancer classification. Gene expression data usually contains a large number of tumor-free data and has the characteristics of high dimensions. In order to select determinant genes related to breast cancer from the initial gene expression data, we propose a new feature selection method, namely, support vector machine based on recursive feature elimination and parameter optimization (SVM-RFE-PO). The grid search (GS) algorithm, the particle swarm optimization (PSO) algorithm, and the genetic algorithm (GA) are applied to search the optimal parameters in the feature selection process. Herein, the new feature selection method contains three kinds of algorithms: support vector machine based on recursive feature elimination and grid search (SVM-RFE-GS), support vector machine based on recursive feature elimination and particle swarm optimization (SVM-RFE-PSO), and support vector machine based on recursive feature elimination and genetic algorithm (SVM-RFE-GA). Then the selected optimal feature subsets are used to train the SVM classifier for cancer classification. We also use random forest feature selection (RFFS), random forest feature selection and grid search (RFFS-GS), and minimal redundancy maximal relevance (mRMR) algorithm as feature selection methods to compare the effects of the SVM-RFE-PO algorithm. The results showed that the feature subset obtained using the SVM-RFE-PSO algorithm has better prediction performance, measured by Area Under Curve (AUC), in the testing data set. This algorithm is not only time-saving but also capable of extracting more representative and useful genes.
|
15
|
Irigoyen A, Jimenez-Luna C, Benavides M, Caba O, Gallego J, Ortuño FM, Guillen-Ponce C, Rojas I, Aranda E, Torres C, Prados J. Integrative multi-platform meta-analysis of gene expression profiles in pancreatic ductal adenocarcinoma patients for identifying novel diagnostic biomarkers. PLoS One 2018; 13:e0194844. [PMID: 29617451 PMCID: PMC5884535 DOI: 10.1371/journal.pone.0194844] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/10/2017] [Accepted: 03/09/2018] [Indexed: 01/16/2023] Open
Abstract
Applying differentially expressed genes (DEGs) to identify feasible biomarkers in diseases can be a hard task when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need to develop models able to compare their results, especially when different technologies can lead to signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by integrating gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), where there is an urgent need to find new biomarkers due to its late diagnosis, is an ideal candidate for testing this technology. Expression data from two different datasets, namely Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, were used for this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was then proposed to integrate these datasets. DEGs were finally identified from the integrated data by using the statistical programming language R. After our integrative meta-analysis, 5 genes were commonly identified within the individual analyses of the independent datasets. In addition, 28 novel genes that were not reported by the individual analyses ('gained' genes) were discovered. Several of these gained genes have already been related to other gastroenterological tumors. The proposed integrative meta-analysis has revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.
Affiliation(s)
- Antonio Irigoyen
- Department of Medical Oncology, Virgen de la Salud Hospital, Toledo, Spain
- Cristina Jimenez-Luna
- Institute of Biopathology and Regenerative Medicine (IBIMER), Center of Biomedical Research (CIBM), University of Granada, Granada, Spain
- Manuel Benavides
- Department of Medical Oncology, Virgen de la Victoria Hospital, Malaga, Spain
- Octavio Caba
- Department of Health Sciences, University of Jaen, Jaen, Spain
- Javier Gallego
- Department of Medical Oncology, University General Hospital of Elche, Alicante, Spain
- Francisco Manuel Ortuño
- Department of Computer Architecture and Computer Technology, Research Center for Information and Communications Technologies, University of Granada, Granada, Spain
- Ignacio Rojas
- Department of Computer Architecture and Computer Technology, Research Center for Information and Communications Technologies, University of Granada, Granada, Spain
- Enrique Aranda
- Maimonides Institute of Biomedical Research (IMIBIC), Reina Sofia Hospital, University of Cordoba, Cordoba, Spain
- Carolina Torres
- Department of Biochemistry and Molecular Biology I, Faculty of Sciences, University of Granada, Granada, Spain
- Jose Prados
- Institute of Biopathology and Regenerative Medicine (IBIMER), Center of Biomedical Research (CIBM), University of Granada, Granada, Spain
|
16
|
He X, Zhang C, Shi C, Lu Q. Meta-analysis of mRNA expression profiles to identify differentially expressed genes in lung adenocarcinoma tissue from smokers and non-smokers. Oncol Rep 2018; 39:929-938. [PMID: 29328493 PMCID: PMC5802042 DOI: 10.3892/or.2018.6197] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/26/2017] [Accepted: 12/29/2017] [Indexed: 11/24/2022] Open
Abstract
Compared to other types of lung cancer, lung adenocarcinoma patients with a history of smoking have a poor prognosis during the treatment of lung cancer. How lung adenocarcinoma-related genes are differentially expressed between smoker and non-smoker patients has yet to be fully elucidated. We performed a meta-analysis of four publicly available microarray datasets related to lung adenocarcinoma tissue in patients with a history of smoking using R statistical software. The top 50 differentially expressed genes (DEGs) in smoking vs. non‑smoking patients are shown using heat maps. Additionally, we conducted KEGG and GO analyses. In addition, we performed a PPI network analysis for 8 genes that were selected during a previous analysis. We identified a total of 2,932 DEGs (1,806 upregulated, 1,126 downregulated) and five genes (CDC45, CDC20, ANAPC7, CDC6, ESPL1) that may link lung adenocarcinoma to smoking history. Our study may provide new insights into the complex mechanisms of lung adenocarcinoma in smoking patients, and our novel gene expression signatures will be useful for future clinical studies.
Affiliation(s)
- Xiaona He
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi 330006, P.R. China
- Cheng Zhang
- Center for Experimental Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, P.R. China
- Chao Shi
- Center for Experimental Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, P.R. China
- Quqin Lu
- Department of Biostatistics and Epidemiology, School of Public Health, Nanchang University, Nanchang, Jiangxi 330006, P.R. China
|
17
|
Plantier L, Renaud H, Respaud R, Marchand-Adam S, Crestani B. Transcriptome of Cultured Lung Fibroblasts in Idiopathic Pulmonary Fibrosis: Meta-Analysis of Publically Available Microarray Datasets Reveals Repression of Inflammation and Immunity Pathways. Int J Mol Sci 2016; 17:ijms17122091. [PMID: 27983601 PMCID: PMC5187891 DOI: 10.3390/ijms17122091] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 09/25/2016] [Revised: 12/02/2016] [Accepted: 12/05/2016] [Indexed: 12/21/2022] Open
Abstract
Heritable profibrotic differentiation of lung fibroblasts is a key mechanism of idiopathic pulmonary fibrosis (IPF). Its mechanisms are yet to be fully understood. In this study, individual data from four independent microarray studies comparing the transcriptome of fibroblasts cultured in vitro from normal (total n = 20) and IPF (total n = 20) human lung were compiled for meta-analysis following normalization to z-scores. One hundred and thirteen transcripts were upregulated and 115 were downregulated in IPF fibroblasts using the Significance Analysis of Microarrays algorithm with a false discovery rate of 5%. Downregulated genes were highly enriched for Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional classes related to inflammation and immunity such as Defense response to virus, Influenza A, tumor necrosis factor (TNF) mediated signaling pathway, interferon-inducible absent in melanoma2 (AIM2) inflammasome as well as Apoptosis. Although upregulated genes were not enriched for any functional class, select factors known to play key roles in lung fibrogenesis were overexpressed in IPF fibroblasts, most notably connective tissue growth factor (CTGF) and serum response factor (SRF), supporting their role as drivers of IPF. The full data table is available as a supplement.
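The normalization step mentioned in this abstract (compiling studies after conversion to z-scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it z-scores a single gene across the samples of each study before pooling, the helper names `zscore` and `pool_studies` are hypothetical, the numbers are invented, and the Significance Analysis of Microarrays step is not reproduced.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("cannot z-score a constant series")
    return [(v - m) / s for v in values]

def pool_studies(studies):
    """Z-score one gene within each study, then pool the values by group label."""
    pooled = {}
    for samples in studies:
        labels = [label for label, _ in samples]
        scaled = zscore([value for _, value in samples])
        for label, z in zip(labels, scaled):
            pooled.setdefault(label, []).append(z)
    return pooled

# Toy data: the same gene measured in two studies on very different
# intensity scales (values invented for illustration only).
study_a = [("normal", 10.0), ("normal", 12.0), ("ipf", 6.0), ("ipf", 8.0)]
study_b = [("normal", 1000.0), ("normal", 1200.0), ("ipf", 600.0), ("ipf", 800.0)]

pooled = pool_studies([study_a, study_b])
difference = mean(pooled["ipf"]) - mean(pooled["normal"])
print(f"pooled z-score difference (IPF - normal): {difference:.3f}")
```

Because each study is rescaled separately, the 100-fold difference in raw intensity between the two toy studies disappears and their samples can be compared on one scale.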
Affiliation(s)
- Laurent Plantier
- Centre d'Étude des Pathologies Respiratoires-CEPR, Institut National de la Santé et de la Recherche Médicale-INSERM, Unité Mixte de Recherche-UMR1100, Labex Mabimprove, 37000 Tours, France.
- Université François Rabelais, 37000 Tours, France.
- Centre Hospitalier Régional Universitaire-CHRU de Tours, Hôpital Bretonneau, Service de Pneumologie et Explorations Fonctionnelles Respiratoires, 37000 Tours, France.
- Hélène Renaud
- Institut National de la Santé et de la Recherche Médicale-INSERM, Unité Mixte de Recherche-UMR1152, Labex Inflamex, 75018 Paris, France.
- Renaud Respaud
- Centre d'Étude des Pathologies Respiratoires-CEPR, Institut National de la Santé et de la Recherche Médicale-INSERM, Unité Mixte de Recherche-UMR1100, Labex Mabimprove, 37000 Tours, France.
- Université François Rabelais, 37000 Tours, France.
- Centre Hospitalier Régional Universitaire-CHRU de Tours, Hôpital Trousseau, Service de Pharmacie, 37170 Chambray-les-Tours, France.
- Sylvain Marchand-Adam
- Centre d'Étude des Pathologies Respiratoires-CEPR, Institut National de la Santé et de la Recherche Médicale-INSERM, Unité Mixte de Recherche-UMR1100, Labex Mabimprove, 37000 Tours, France.
- Université François Rabelais, 37000 Tours, France.
- Centre Hospitalier Régional Universitaire-CHRU de Tours, Hôpital Bretonneau, Service de Pneumologie et Explorations Fonctionnelles Respiratoires, 37000 Tours, France.
- Bruno Crestani
- Institut National de la Santé et de la Recherche Médicale-INSERM, Unité Mixte de Recherche-UMR1152, Labex Inflamex, 75018 Paris, France.
- Université Paris Diderot, PRES Sorbonne Paris Cité, 75018 Paris, France.
- AP-HP, Hôpital Bichat, Service de Pneumologie A, DHU FIRE, 75018 Paris, France.
|