1
RNA-Seq-Based Breast Cancer Subtypes Classification Using Machine Learning Approaches. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2020; 2020:4737969. [PMID: 33178256 PMCID: PMC7644310 DOI: 10.1155/2020/4737969] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/21/2019] [Revised: 05/31/2020] [Accepted: 10/09/2020] [Indexed: 12/20/2022]
Abstract
Background: Breast invasive carcinoma (BRCA) is not a single disease, as each subtype has a distinct morphological structure. Although several computational methods have been proposed for breast cancer subtype identification, the specific interaction mechanisms of the genes involved in each subtype are still incompletely understood. Identifying and exploring these interaction mechanisms for each subtype could have an important impact on personalized treatment for different patients.
Methods: We integrate the biological importance of genes from gene regulatory networks into the differential expression analysis to obtain weighted differentially expressed genes (weighted DEGs). A gene with a high weight regulates more target genes and thus holds more biological importance. In addition, we constructed gene coexpression networks for the control and experiment groups; the significantly different interaction structures motivated us to design a Gene Ontology (GO) enrichment analysis based on gene coexpression networks (GOEGCN). GOEGCN considers the two-sided distinction between the gene coexpression networks of the control and experiment groups, which allows us to study how the modulated coexpressed gene pairs affect biological functions at the GO level.
Results: We built a binary classifier with weighted DEGs for each subtype. The binary classifiers predicted unseen samples well, and the experimental results validated the effectiveness of the proposed approaches. The novel enriched GO terms obtained with GOEGCN for the control and experiment groups of each subtype explain, to some extent, the specific changes in biological function according to the two-sided distinction of coexpression network structures.
Conclusion: The weighted DEGs contain biological importance derived from the gene regulatory network. Based on the weighted DEGs, five binary classifiers were learned and showed good performance on the sensitivity, specificity, accuracy, F1, and AUC metrics. GOEGCN with weighted DEGs for the control and experiment groups produced novel GO enrichment results, and the newly enriched GO terms further reveal, to some extent, the changes in specific biological functions among the BRCA subtypes. The R code for this research is available at https://github.com/yxchspring/GOEGCN_BRCA_Subtypes.
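The five metrics named in this abstract can be computed from a binary classifier's predictions roughly as follows. This is a minimal Python sketch, not the authors' R code: the function names `binary_metrics` and `auc`, the 1/0 label coding (1 = the subtype of interest, 0 = all other samples), and the toy labels and scores are assumptions made for illustration only.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for labels coded 1 (subtype) and 0 (rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the positive class
    specificity = ratio(tn, tn + fp)  # recall on the negative class
    precision = ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and classifier outputs, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7]
    print(binary_metrics(y_true, y_pred))
    print(auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would be preferable for large cohorts.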
2
Han L, Zhang HC, Li L, Li CX, Di X, Qu X. Downregulation of Long Noncoding RNA HOTAIR and EZH2 Induces Apoptosis and Inhibits Proliferation, Invasion, and Migration of Human Breast Cancer Cells. Cancer Biother Radiopharm 2018; 33:241-251. [PMID: 30048163 DOI: 10.1089/cbr.2017.2432] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/03/2023]
Abstract
BACKGROUND: The long noncoding RNA HOTAIR (HOX transcript antisense intergenic RNA) has been reported to be a biomarker for various malignant tumors; however, its involvement in breast cancer is not fully understood. The aim of this study was to investigate the effects of HOTAIR and EZH2 (enhancer of zeste homologue 2) on the proliferation, invasion, migration, and apoptosis of breast cancer cells.
MATERIALS AND METHODS: The expression of HOTAIR and EZH2 in a normal human mammary epithelial cell line (HBL-100) and in breast cancer cell lines (MCF-7, MDA-MB-231, and SKBR-3) was measured by reverse transcription-quantitative polymerase chain reaction. MCF-7 cells, which exhibited the highest HOTAIR expression, were selected for further study and divided into control, negative control, and small interfering RNA-HOTAIR groups. The proliferation, invasion, migration, and apoptosis of breast cancer cells were evaluated by MTT assay, scratch test, Transwell assay, and flow cytometry, respectively. The binding of HOTAIR to EZH2 and PTEN was predicted bioinformatically and verified with a dual-luciferase reporter gene assay.
RESULTS: Expression of HOTAIR and EZH2 was lower in the normal human mammary epithelial cells and higher in the MCF-7, MDA-MB-231, and SKBR-3 breast cancer cells. Downregulation of HOTAIR or silencing of EZH2 repressed proliferation, invasion, and migration and promoted apoptosis of the breast cancer cells. Furthermore, HOTAIR bound specifically to EZH2 and PTEN, indicating that HOTAIR can inhibit PTEN expression by recruiting EZH2 in breast cancer, and TCGA data showed that PTEN expression was lower in breast cancer cells.
CONCLUSIONS: The study indicates higher expression of HOTAIR and EZH2 in the three breast cancer cell lines. Furthermore, downregulation of HOTAIR or silencing of EZH2 inhibited the proliferation, invasion, and migration of breast cancer cells while promoting their apoptosis.
Affiliation(s)
- Lu Han
- 1 Department of Thyroid and Breast Surgery, Tianjin 4th Centre Hospital, Tianjin, P.R. China
- Hai-Chao Zhang
- 1 Department of Thyroid and Breast Surgery, Tianjin 4th Centre Hospital, Tianjin, P.R. China
- Li Li
- 2 Department of General Surgery, Tianjin Haihe Hospital, Tianjin, P.R. China
- Cai-Xia Li
- 1 Department of Thyroid and Breast Surgery, Tianjin 4th Centre Hospital, Tianjin, P.R. China
- Xu Di
- 1 Department of Thyroid and Breast Surgery, Tianjin 4th Centre Hospital, Tianjin, P.R. China
- Xin Qu
- 1 Department of Thyroid and Breast Surgery, Tianjin 4th Centre Hospital, Tianjin, P.R. China
3
In-Silico Integration Approach to Identify a Key miRNA Regulating a Gene Network in Aggressive Prostate Cancer. Int J Mol Sci 2018; 19:ijms19030910. [PMID: 29562723 PMCID: PMC5877771 DOI: 10.3390/ijms19030910] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 02/28/2018] [Revised: 03/15/2018] [Accepted: 03/16/2018] [Indexed: 12/12/2022]
Abstract
Like other cancers, prostate cancer (PC) is caused by the accumulation of genetic alterations in cells that drive malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that microRNAs also have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer overlap poorly. The identification of co-expressed genes that are functionally related can identify a core network of genes associated with PC with better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a signature of four genes that overlaps with other published gene signatures and can distinguish, in silico, high Gleason-scored PC from normal human tissue; gene co-expression analysis further extended it to 19 genes. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC.
4
Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors. BMC Bioinformatics 2017; 18:552. [PMID: 29297278 PMCID: PMC5751802 DOI: 10.1186/s12859-017-1893-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 01/04/2023]
Abstract
Background: Phenotypic classification is problematic because small samples are ubiquitous, and for these the use of prior knowledge is critical. If knowledge concerning the feature-label distribution (for instance, genetic pathways) is available, it can be used in learning. Optimal Bayesian classification provides optimal classification under model uncertainty. It differs from classical Bayesian methods, in which a classification model is assumed and prior distributions are placed on model parameters. With optimal Bayesian classification, uncertainty is treated directly on the feature-label distribution, which assures full utilization of prior knowledge and is guaranteed to outperform classical methods.
Results: The salient problem confronting optimal Bayesian classification is prior construction. In this paper, we propose a new prior construction methodology based on a general framework of constraints in the form of conditional probability statements. We call this prior the maximal knowledge-driven information prior (MKDIP). The new constraint framework is more flexible than our previous methods, as it naturally handles the potential inconsistency in archived regulatory relationships, and conditioning can be augmented by other knowledge, such as population statistics. We also extend the application of prior construction to a multinomial mixture model when labels are unknown, which often occurs in practice. The performance of the proposed methods is examined on two important pathway families, the mammalian cell-cycle and a set of p53-related pathways, and also on a publicly available gene expression dataset of non-small cell lung cancer when combined with the existing prior knowledge on relevant signaling pathways.
Conclusion: The new proposed general prior construction framework extends the prior construction methodology to a more flexible framework that results in better inference when proper prior knowledge exists. Moreover, the extension of optimal Bayesian classification to multinomial mixtures, where data sets are both small and unlabeled, enables superior classifier design using small, unstructured data sets. We have demonstrated the effectiveness of our approach using pathway information and available knowledge of gene regulating functions; however, the underlying theory can be applied to a wide variety of knowledge types and to other applications with small samples.
5
Jhan JR, Andrechek ER. Effective personalized therapy for breast cancer based on predictions of cell signaling pathway activation from gene expression analysis. Oncogene 2017. [PMID: 28135251 DOI: 10.1038/onc.2016.503] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/19/2023]
Abstract
Current therapeutic outcomes for breast cancer underscore the complexity of treating a heterogeneous disease. Indeed, studies have shown that differences in gene expression among patients with the same subtype of breast cancer are correlated with the response to treatment. This strongly suggests that there is an urgent need to treat breast cancer with a personalized approach. Here we employed cell signaling pathway signatures to predict pathway activity in subtypes of MMTV-Myc mammary tumors. We then split tumors into subsets and developed individualized combinatorial treatments for two subtypes with distinct pathway activation patterns. Elevation of the EGFR, RAS and TGFβ pathways was observed in one subtype whereas these pathways were not predicted to be active in the other subtype that had high predicted activity of the Myc, Stat3 and Akt pathways. In a proof-of-principle experiment, treatment of these two subtypes with targeted therapies inhibited tumor growth only in the subtype of tumor where the therapy was designed to be active. We then analyzed gene expression profiles of human breast cancer patients and patient-derived xenograft (PDX) samples to predict pathway activity, and validated our approach of developing individualized treatments in mice with PDX tumors. Importantly, our combinatorial therapy resulted in tumor regression, including regression in PDX samples from triple-negative breast cancer. Together our data is a proof-of-principle experiment that demonstrates that cell signaling pathway signature-guided treatment for breast cancer is viable.
Affiliation(s)
- J-R Jhan
- Department of Physiology, Michigan State University, East Lansing, MI, USA
- E R Andrechek
- Department of Physiology, Michigan State University, East Lansing, MI, USA
6
Effective personalized therapy for breast cancer based on predictions of cell signaling pathway activation from gene expression analysis. Oncogene 2017; 36:3553-3561. [PMID: 28135251 DOI: 10.1038/onc.2016.503] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 06/15/2016] [Revised: 11/16/2016] [Accepted: 11/21/2016] [Indexed: 12/12/2022]
7
Cava C, Bertoli G, Castiglioni I. Integrating genetics and epigenetics in breast cancer: biological insights, experimental, computational methods and therapeutic potential. BMC SYSTEMS BIOLOGY 2015; 9:62. [PMID: 26391647 PMCID: PMC4578257 DOI: 10.1186/s12918-015-0211-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 05/14/2015] [Accepted: 09/15/2015] [Indexed: 12/11/2022]
Abstract
BACKGROUND: Development of human cancer can proceed through the accumulation of different genetic changes affecting the structure and function of the genome. Combined analyses of molecular data at multiple levels, such as DNA copy-number alteration, mRNA expression, and miRNA expression, can clarify biological functions and pathways deregulated in cancer. The integrative methods used to investigate these data involve different fields, including biology, bioinformatics, and statistics.
RESULTS: These methodologies are presented in this review, and their implementation in breast cancer is discussed with a focus on integration strategies. We report current applications, recent studies, and interesting results leading to the identification of candidate biomarkers for diagnosis, prognosis, and therapy in breast cancer by using both individual and combined analyses.
CONCLUSION: This review presents the state of the art of the role of different technologies in breast cancer based on the integration of genetics and epigenetics, and discusses some issues related to the new opportunities and challenges offered by the application of such integrative approaches.
Affiliation(s)
- Claudia Cava
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy.
- Gloria Bertoli
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy.
- Isabella Castiglioni
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy.
8
MicroRNA-10b and minichromosome maintenance complex component 5 gene as prognostic biomarkers in breast cancer. Tumour Biol 2015; 36:4487-94. [PMID: 25596707 DOI: 10.1007/s13277-015-3090-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/05/2014] [Accepted: 01/08/2015] [Indexed: 01/20/2023]
Abstract
The aim of this study is to identify a micro-ribonucleic acid (microRNA) and its target, and their relationship to outcome in breast cancer (BC). To achieve this aim, we investigated microRNA-10b (miR-10b) and minichromosome maintenance complex component 5 (MCM5) mRNA expression in 230 breast tissue samples by real-time PCR and semiquantitative conventional RT-PCR, respectively. Relapse-free survival (RFS) associated with miRNA-10b and MCM5 mRNA was tested by Kaplan-Meier survival analysis. The impact of miRNA-10b and MCM5 mRNA expression on survival was evaluated with a Cox proportional hazards regression model. The expression of miRNA-10b and MCM5 mRNA was positive in 86.4% and 79.7% of breast cancer patients, respectively. The overall concordance rate between miRNA-10b and MCM5 mRNA was 90.4%. The median follow-up period was 50 months. The survival analysis showed that high levels of both miR-10b and MCM5 were associated with short relapse-free survival in BC. We identified MCM5 mRNA expression changes consistent with miRNA-10b target regulation. Thus, miRNA-10b and MCM5 mRNA could be considered prognostic markers and potential therapeutic targets in breast cancer, to be applied to other patient data sets.
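The Kaplan-Meier analysis mentioned in this abstract estimates a relapse-free survival curve as a running product over event times. The sketch below illustrates the product-limit estimator in Python under stated assumptions: the function name `kaplan_meier`, the event coding (1 = relapse observed, 0 = censored), and the follow-up times are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if relapse was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            leaving += 1
            i += 1
        if relapses > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - relapses / n_at_risk
            curve.append((t, survival))
        # Both relapsed and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up data in months, for illustration only.
    months = [5, 8, 12, 12, 20]
    relapse = [1, 0, 1, 1, 0]
    for t, s in kaplan_meier(months, relapse):
        print(f"t={t:>3} months  S(t)={s:.3f}")
```

Censored patients contribute to the risk set until their last follow-up but never lower the curve themselves, which is the property that distinguishes this estimator from a simple fraction of relapse-free patients.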
9
Cava C, Bertoli G, Ripamonti M, Mauri G, Zoppis I, Rosa PAD, Gilardi MC, Castiglioni I. Integration of mRNA expression profile, copy number alterations, and microRNA expression levels in breast cancer to improve grade definition. PLoS One 2014; 9:e97681. [PMID: 24866763 PMCID: PMC4035288 DOI: 10.1371/journal.pone.0097681] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 10/18/2013] [Accepted: 04/23/2014] [Indexed: 12/20/2022]
Abstract
Defining the aggressiveness and growth rate of a malignant cell population is a key step in the clinical approach to treating tumor disease. The correct grading of breast cancer (BC) is a fundamental part in determining the appropriate treatment. Biological variables can make it difficult to elucidate the mechanisms underlying BC development. To identify potential markers that can be used for BC classification, we analyzed mRNAs expression profiles, gene copy numbers, microRNAs expression and their association with tumor grade in BC microarray-derived datasets. From mRNA expression results, we found that grade 2 BC is most likely a mixture of grade 1 and grade 3 that have been misclassified, being described by the gene signature of either grade 1 or grade 3. We assessed the potential of the new approach of integrating mRNA expression profile, copy number alterations, and microRNA expression levels to select a limited number of genomic BC biomarkers. The combination of mRNA profile analysis and copy number data with microRNA expression levels led to the identification of two gene signatures of 42 and 4 altered genes (FOXM1, KPNA4, H2AFV and DDX19A) respectively, the latter obtained through a meta-analytical procedure. The 42-based gene signature identifies 4 classes of up- or down-regulated microRNAs (17 microRNAs) and of their 17 target mRNA, and the 4-based genes signature identified 4 microRNAs (Hsa-miR-320d, Hsa-miR-139-5p, Hsa-miR-567 and Hsa-let-7c). These results are discussed from a biological point of view with respect to pathological features of BC. Our identified mRNAs and microRNAs were validated as prognostic factors of BC disease progression, and could potentially facilitate the implementation of assays for laboratory validation, due to their reduced number.
Affiliation(s)
- Claudia Cava
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
- Gloria Bertoli
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
- Marilena Ripamonti
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
- Giancarlo Mauri
- Department of Informatics, Systems and Communications, University of Milan–Bicocca, Milan, Italy
- Italo Zoppis
- Department of Informatics, Systems and Communications, University of Milan–Bicocca, Milan, Italy
- Maria Carla Gilardi
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
- Isabella Castiglioni
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
10
Esfahani MS, Dougherty ER. Incorporation of Biological Pathway Knowledge in the Construction of Priors for Optimal Bayesian Classification. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2014; 11:202-218. [PMID: 26355519 DOI: 10.1109/tcbb.2013.143] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 06/05/2023]
Abstract
Small samples are commonplace in genomic/proteomic classification, the result being inadequate classifier design and poor error estimation. The problem has recently been addressed by utilizing prior knowledge in the form of a prior distribution on an uncertainty class of feature-label distributions. A critical issue remains: how to incorporate biological knowledge into the prior distribution. For genomics/proteomics, the most common kind of knowledge is in the form of signaling pathways. Thus, it behooves us to find methods of transforming pathway knowledge into knowledge of the feature-label distribution governing the classification problem. In this paper, we address the problem of prior probability construction by proposing a series of optimization paradigms that utilize the incomplete prior information contained in pathways (both topological and regulatory). The optimization paradigms employ the marginal log-likelihood, established using a small number of feature-label realizations (sample points) regularized with the prior pathway information about the variables. In the special case of a Normal-Wishart prior distribution on the mean and inverse covariance matrix (precision matrix) of a Gaussian distribution, these optimization problems become convex. Companion website: gsp.tamu.edu/Publications/supplementary/shahrokh13a.