1. Häggström C, Rowley M, Liedberg F, Coolen ACC, Holmberg L. Latent heterogeneity of muscle-invasive bladder cancer in patient characteristics and survival: A population-based nation-wide study in the Bladder Cancer Data Base Sweden (BladderBaSe). Cancer Med 2023. [PMID: 37096787 DOI: 10.1002/cam4.5981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 02/13/2023] [Accepted: 04/08/2023] [Indexed: 04/26/2023]
Abstract
BACKGROUND Patients with muscle-invasive bladder cancer (MIBC) constitute a heterogeneous group in terms of patient and tumour characteristics ('case-mix') and prognosis. The aim of the current study was to investigate whether differences in survival could be used to divide MIBC patients into distinct classes using a recently developed latent class regression method for survival analysis with competing risks. METHODS We selected all participants diagnosed with MIBC in the Bladder Cancer Data Base Sweden (BladderBaSe) and analysed inter-patient heterogeneity in risk of death from bladder cancer and other causes. RESULTS Using data from 9653 MIBC patients, we detected heterogeneity with six distinct latent classes in the studied population. The largest and most frail class included 50% of the study population and was characterised by a somewhat larger proportion of women, higher age at diagnosis, more advanced disease and lower probability of curative treatment. Despite this, patients in this class treated with curative intent by radical cystectomy or radiotherapy had a lower association to risk of death. The second largest class included 23% and was substantially less frail than the largest class. The third and fourth classes each included around 9%-10%, whereas the fifth and sixth classes each included 3%-4% of the population. CONCLUSIONS Results from the current study are compatible with previous research, and the method can be used to adjust comparisons in prognosis between MIBC populations for influential differences in the distribution of sub-classes.
Affiliation(s)
- Christel Häggström
  - Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
  - Northern Registry Centre, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
  - Translational Oncology & Urology Research (TOUR), School of Cancer and Pharmaceutical Sciences, King's College, London, UK
- Mark Rowley
  - Saddle Point Science, York, UK
  - Saddle Point Science Europe, Nijmegen, The Netherlands
- Fredrik Liedberg
  - Department of Urology, Skåne University Hospital, Malmö, Sweden
  - Institution of Translational Medicine, Lund University, Malmö, Sweden
- Anthony C C Coolen
  - Saddle Point Science, York, UK
  - Saddle Point Science Europe, Nijmegen, The Netherlands
  - Department of Biophysics, Faculty of Science, Donders Institute, Radboud University, Nijmegen, The Netherlands
- Lars Holmberg
  - Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
  - Translational Oncology & Urology Research (TOUR), School of Cancer and Pharmaceutical Sciences, King's College, London, UK
2. Williams SM, Moore JH. Genetics and precision health: the ecological fallacy and artificial intelligence solutions. BioData Min 2023; 16:9. [PMID: 36927508 PMCID: PMC10018838 DOI: 10.1186/s13040-023-00327-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/18/2023]
Affiliation(s)
- Scott M Williams
  - Departments of Population and Quantitative Health Sciences, Department of Genetics and Genome Sciences, and Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
- Jason H Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
3. Woodward AA, Urbanowicz RJ, Naj AC, Moore JH. Genetic heterogeneity: Challenges, impacts, and methods through an associative lens. Genet Epidemiol 2022; 46:555-571. [PMID: 35924480 PMCID: PMC9669229 DOI: 10.1002/gepi.22497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 01/07/2023]
Abstract
Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.
Affiliation(s)
- Alexa A. Woodward
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ryan J. Urbanowicz
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Adam C. Naj
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
4. Yu C, Wang J. Data mining and mathematical models in cancer prognosis and prediction. Medical Review (Berlin, Germany) 2022; 2:285-307. [PMID: 37724193 PMCID: PMC10388766 DOI: 10.1515/mr-2021-0026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Accepted: 12/29/2021] [Indexed: 09/20/2023]
Abstract
Cancer is a fatal and complex disease. Individual differences within the same cancer type, or in the same patient at different stages of cancer development, may require distinct treatments. Pathological differences are reflected at the tissue, cell, and gene levels. The interactions between cancer cells and nearby microenvironments can also influence cancer progression and metastasis. It is a huge challenge to understand all of these mechanistically and quantitatively. Researchers have applied pattern recognition algorithms such as machine learning or data mining to predict cancer types or classifications. With rapidly growing computing power, researchers have begun to integrate huge data sets and multi-dimensional data types and information. Cells are controlled by gene expression, which is determined by promoter sequences and transcription regulators. For example, changes in gene expression through these underlying mechanisms can modify cell progression through the cell cycle. Such molecular activities are governed by gene regulation through the underlying gene regulatory networks, which are essential for cancer study when the information and gene regulations are clear and available. In this review, we briefly introduce several machine learning methods for cancer prediction and classification, including Artificial Neural Networks (ANNs), Decision Trees (DTs), Support Vector Machines (SVMs) and naive Bayes. We then describe a few typical models for building gene regulatory networks, such as correlation, regression and Bayes methods based on available data. These methods can help with cancer diagnosis and with predicting susceptibility, recurrence, and survival. Finally, we summarize and compare the modeling methods used to analyze the development and progression of cancer through gene regulatory networks. These models can provide possible physical strategies to analyze cancer progression in a systematic and quantitative way.
Affiliation(s)
- Chong Yu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
  - Department of Statistics, JiLin University of Finance and Economics, Changchun, Jilin Province, China
- Jin Wang
  - Department of Chemistry and of Physics and Astronomy, State University of New York, Stony Brook, NY, USA
5. Effectiveness of Artificial Intelligence for Personalized Medicine in Neoplasms: A Systematic Review. BioMed Research International 2022; 2022:7842566. [PMID: 35434134 PMCID: PMC9010213 DOI: 10.1155/2022/7842566] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/23/2021] [Revised: 01/29/2022] [Accepted: 03/06/2022] [Indexed: 02/07/2023]
Abstract
Purpose Artificial intelligence (AI) techniques are used in precision medicine to explore novel genotype and phenotype data. The main aims of precision medicine include early diagnosis, screening, and a personalized treatment regime for a patient based on genetic-oriented features and characteristics. The main objective of this study was to review AI techniques and their effectiveness in neoplasm precision medicine. Materials and Methods A comprehensive search was performed in Medline (through PubMed), Scopus, ISI Web of Science, IEEE Xplore, Embase, and Cochrane databases from inception to December 29, 2021, to identify studies that used AI methods for cancer precision medicine and to evaluate the outcomes of the models. Results Sixty-three studies were included in this systematic review. The main AI approaches in 17 papers (26.9%) were linear and nonlinear categories (random forest or decision trees), and in 21 citations, rule-based systems and deep learning models were used. Notably, 62% of the articles were conducted in the United States and China. R was the most frequently used software, and breast and lung cancer were the most frequently selected neoplasms in the papers. Of the 63 papers, 34 used genomic data (such as gene expression, somatic mutation, phenotype, and proteomics data) together with drug-response functional data as input to AI methods; in 16 papers (25.3%), drug-response functional data was used to personalize treatment. The maximum values of the assessment indicators accuracy, sensitivity, specificity, precision, recall, and area under the curve (AUC) in the included studies were 0.99, 1.00, 0.96, 0.98, 0.99, and 0.9929, respectively. Conclusion The findings showed that in many cases, artificial intelligence methods were applied effectively in personalized medicine.
6. Li CF, Chan TC, Pan CT, Vejvisithsakul PP, Lai JC, Chen SY, Hsu YW, Shiao MS, Shiue YL. EMP2 induces cytostasis and apoptosis via the TGFβ/SMAD/SP1 axis and recruitment of P2RX7 in urinary bladder urothelial carcinoma. Cell Oncol (Dordr) 2021; 44:1133-1150. [PMID: 34339014 DOI: 10.1007/s13402-021-00624-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/16/2021] [Accepted: 06/29/2021] [Indexed: 12/18/2022]
Abstract
PURPOSE Urinary bladder urothelial carcinoma (UBUC) is a common malignant disease, and its high recurrence rates impose a heavy clinical burden. The objective of this study was to identify signaling pathways downstream of epithelial membrane protein 2 (EMP2), which induces cytostasis and apoptosis in UBUC. METHODS A series of in vitro and in vivo assays using different UBUC-derived cell lines and mouse xenograft models were performed, respectively. In addition, primary UBUC specimens were evaluated by immunohistochemistry. RESULTS Exogenous expression of EMP2 in J82 UBUC cells significantly decreased DNA replication and altered the expression levels of several TGFβ signaling-related proteins. EMP2 knockdown in BFTC905 UBUC cells resulted in opposite effects. EMP2-dysregulated cell cycle progression was found to be mediated by the TGFβ/TGFBR1/SP1 family member SMAD. EMP2 or purinergic receptor P2X7 (P2RX7) gene expression upregulation induced apoptosis via both intrinsic and extrinsic pathways. In 242 UBUC patient samples, P2RX7 protein levels were found to be significantly and positively correlated with EMP2 protein levels. Low P2RX7 levels conferred poor disease-specific and metastasis-free survival rates, and significantly decreased apoptotic cell rates. EMP2 was found to physically interact with P2RX7. In the presence of a P2RX7 agonist, BzATP, overexpression of both EMP2 and P2RX7 significantly increased apoptotic cell rates compared to overexpression of EMP2 or P2RX7 alone. CONCLUSIONS EMP2 induces cytostasis via the TGFβ/SMAD/SP1 axis and recruits P2RX7 to enhance apoptosis in UBUC. Our data provide new insights that may be employed for the design of UBUC targeting therapies.
MESH Headings
- Animals
- Apoptosis/genetics
- Carcinoma, Transitional Cell/genetics
- Carcinoma, Transitional Cell/metabolism
- Carcinoma, Transitional Cell/pathology
- Cell Line, Tumor
- Cell Proliferation/genetics
- Gene Expression Regulation, Neoplastic
- Humans
- Immunoblotting
- Membrane Glycoproteins/genetics
- Membrane Glycoproteins/metabolism
- Mice, Inbred NOD
- Mice, SCID
- Proteins/genetics
- Proteins/metabolism
- Receptors, Purinergic P2X7/genetics
- Receptors, Purinergic P2X7/metabolism
- Reverse Transcriptase Polymerase Chain Reaction
- Signal Transduction/genetics
- Smad Proteins/genetics
- Smad Proteins/metabolism
- Sp1 Transcription Factor/genetics
- Sp1 Transcription Factor/metabolism
- Transforming Growth Factor beta/genetics
- Transforming Growth Factor beta/metabolism
- Transplantation, Heterologous
- Urinary Bladder Neoplasms/genetics
- Urinary Bladder Neoplasms/metabolism
- Urinary Bladder Neoplasms/pathology
- Mice
Affiliation(s)
- Chien-Feng Li
  - Department of Medical Research, Chi-Mei Medical Center, Tainan, Taiwan
  - National Cancer Research Institute, National Health Research Institutes, Tainan, Taiwan
  - Department of Pathology, Chi-Mei Medical Center, Tainan, Taiwan
- Ti-Chun Chan
  - Department of Medical Research, Chi-Mei Medical Center, Tainan, Taiwan
  - National Cancer Research Institute, National Health Research Institutes, Tainan, Taiwan
- Cheng-Tang Pan
  - Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung, Taiwan
  - Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan
- Pichpisith Pierre Vejvisithsakul
  - Institute of Biomedical Sciences, National Sun Yat-sen University, 70 Lienhai Rd, 80424, Kaohsiung, Taiwan
  - Section for Translational Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Jia-Chen Lai
  - Institute of Biomedical Sciences, National Sun Yat-sen University, 70 Lienhai Rd, 80424, Kaohsiung, Taiwan
- Szu-Yu Chen
  - Institute of Biomedical Sciences, National Sun Yat-sen University, 70 Lienhai Rd, 80424, Kaohsiung, Taiwan
- Ya-Wen Hsu
  - Institute of Biomedical Sciences, National Sun Yat-sen University, 70 Lienhai Rd, 80424, Kaohsiung, Taiwan
- Meng-Shin Shiao
  - Research Center, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Yow-Ling Shiue
  - Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung, Taiwan
  - Institute of Biomedical Sciences, National Sun Yat-sen University, 70 Lienhai Rd, 80424, Kaohsiung, Taiwan
7. Oh EJ, Parikh RB, Chivers C, Chen J. Two-Stage Approaches to Accounting for Patient Heterogeneity in Machine Learning Risk Prediction Models in Oncology. JCO Clin Cancer Inform 2021; 5:1015-1023. [PMID: 34591602 PMCID: PMC8812620 DOI: 10.1200/cci.21.00077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/14/2021] [Revised: 07/24/2021] [Accepted: 08/26/2021] [Indexed: 11/20/2022]
Abstract
PURPOSE Machine learning models developed from electronic health records data have been increasingly used to predict risk of mortality for general oncology patients. However, these models may have suboptimal performance because of patient heterogeneity. The objective of this work is to develop a new modeling approach to predicting short-term mortality that accounts for heterogeneity across multiple subgroups in the presence of a large number of electronic health record predictors. METHODS We proposed a two-stage approach to addressing heterogeneity among oncology patients of different cancer types for predicting their risk of mortality. Structured data were extracted from the University of Pennsylvania Health System for 20,723 patients of 11 cancer types, where 1,340 (6.5%) patients were deceased. We first modeled the overall risk for all patients without differentiating cancer types, as is done in current practice. We then developed cancer type-specific models using the overall risk score as a predictor along with preselected type-specific predictors. The overall and type-specific models were compared with respect to discrimination using the area under the precision-recall curve (AUPRC) and calibration using the calibration slope. We also proposed metrics that characterize the degree of risk heterogeneity by comparing risk predictors in the overall and type-specific models. RESULTS The two-stage modeling resulted in improved calibration and discrimination across all 11 cancer types. The improvement in AUPRC was significant for hematologic malignancies including leukemia, lymphoma, and myeloma. For instance, the AUPRC increased from 0.358 to 0.519 (∆ = 0.161; 95% CI, 0.102 to 0.224) and from 0.299 to 0.354 (∆ = 0.055; 95% CI, 0.009 to 0.107) for leukemia and lymphoma, respectively. For all 11 cancer types, the two-stage approach generated well-calibrated risks. A high degree of heterogeneity between type-specific and overall risk predictors was observed for most cancer types. CONCLUSION Our two-stage modeling approach that accounts for cancer type-specific risk heterogeneity has better calibration and discrimination than a model agnostic to cancer types.
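The two-stage idea described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the data are synthetic, both stages use plain logistic regression fitted by gradient descent, the names `x_shared` and `x_specific` are hypothetical, and training log-loss stands in for the AUPRC and calibration slope used in the study.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, x):
    """Linear predictor (log-odds): intercept w[0] plus weighted features."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def fit_logistic(X, y, init=None, lr=0.1, epochs=300):
    """Logistic regression by batch gradient descent; returns [intercept, w1, ...]."""
    n, d = len(X), len(X[0])
    w = list(init) if init is not None else [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, xi)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def log_loss(w, X, y):
    """Mean negative log-likelihood of a fitted model."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(sigmoid(score(w, xi)), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

# Synthetic cohort (hypothetical): x_shared matters for every patient,
# x_specific only matters for cancer type "B".
random.seed(0)
patients = []
for i in range(400):
    ctype = "A" if i % 2 == 0 else "B"
    x_shared, x_specific = random.uniform(-1, 1), random.uniform(-1, 1)
    logit = -1.0 + 1.5 * x_shared + (2.0 * x_specific if ctype == "B" else 0.0)
    died = 1 if random.random() < sigmoid(logit) else 0
    patients.append((ctype, x_shared, x_specific, died))

# Stage 1: one overall model that ignores cancer type.
w_overall = fit_logistic([[p[1]] for p in patients], [p[3] for p in patients])

# Stage 2: per type, refit using the overall score plus the type-specific predictor.
loss_overall, loss_two_stage = {}, {}
for ctype in ("A", "B"):
    rows = [p for p in patients if p[0] == ctype]
    y = [p[3] for p in rows]
    X2 = [[score(w_overall, [p[1]]), p[2]] for p in rows]
    # Weights [0, 1, 0] reproduce the overall model exactly, so with this small
    # step size gradient descent keeps or lowers the training loss from there.
    w_type = fit_logistic(X2, y, init=[0.0, 1.0, 0.0])
    loss_overall[ctype] = log_loss([0.0, 1.0, 0.0], X2, y)
    loss_two_stage[ctype] = log_loss(w_type, X2, y)

for ctype in ("A", "B"):
    print(ctype, round(loss_overall[ctype], 4), round(loss_two_stage[ctype], 4))
```

Feeding the stage-one score in as a single predictor keeps each type-specific model small, which is one reason such a design may suit subgroups with few events; a fair comparison would use held-out data rather than the training loss shown here.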
Affiliation(s)
- Eun Jeong Oh
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
- Ravi B. Parikh
  - Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
- Corey Chivers
  - University of Pennsylvania Health System, Philadelphia, PA
- Jinbo Chen
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
8. Hall MA, Wallace J, Lucas AM, Bradford Y, Verma SS, Müller-Myhsok B, Passero K, Zhou J, McGuigan J, Jiang B, Pendergrass SA, Zhang Y, Peissig P, Brilliant M, Sleiman P, Hakonarson H, Harley JB, Kiryluk K, Van Steen K, Moore JH, Ritchie MD. Novel EDGE encoding method enhances ability to identify genetic interactions. PLoS Genet 2021; 17:e1009534. [PMID: 34086673 PMCID: PMC8208534 DOI: 10.1371/journal.pgen.1009534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/07/2020] [Revised: 06/16/2021] [Accepted: 04/06/2021] [Indexed: 11/26/2022]
Abstract
Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)–rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.
Although traditional genetic encodings are widely implemented in genetics research, including in genome-wide association studies (GWAS) and epistasis, each method makes assumptions that may not reflect the underlying etiology. Here, we introduce a novel encoding method that estimates and assigns an individualized data-driven encoding for each single nucleotide polymorphism (SNP): the elastic data-driven genetic encoding (EDGE). With simulations, we demonstrate that this novel method is more accurate and robust than traditional encoding methods in estimating heterozygous genotype values, reducing the type I error, and detecting SNP-SNP interactions. We further applied the traditional encodings and EDGE to biomedical data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes, and EDGE identified a novel interaction for age-related cataract not detected by traditional methods, which replicated in data from the UK Biobank. EDGE provides an alternative approach to understanding and modeling diverse SNP models and is recommended for studying complex genetics in common human phenotypes.
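A data-driven heterozygote value of the kind this abstract describes can be sketched as follows. This is a simplified illustration under stated assumptions, not the published EDGE implementation: it assumes a quantitative phenotype, estimates the heterozygote value as the ratio of the heterozygous effect to the homozygous-alternate effect (both taken as differences of group means), and uses made-up toy data. The function names are hypothetical.

```python
from statistics import mean

def estimate_het_value(genotypes, phenotype):
    """Estimate the heterozygote value alpha = beta_het / beta_hom_alt.

    Genotypes are coded 0 (homozygous reference), 1 (heterozygous) and
    2 (homozygous alternate). For a quantitative phenotype, the codominant
    regression coefficients are differences of group means.
    """
    groups = {0: [], 1: [], 2: []}
    for g, y in zip(genotypes, phenotype):
        groups[g].append(y)
    if not all(groups.values()):
        raise ValueError("all three genotype classes must be observed")
    ref = mean(groups[0])
    beta_het = mean(groups[1]) - ref
    beta_hom_alt = mean(groups[2]) - ref
    if beta_hom_alt == 0:
        raise ValueError("no homozygous-alternate effect; alpha is undefined")
    return beta_het / beta_hom_alt

def encode(genotypes, het_value):
    """Map genotypes 0/1/2 to 0 / het_value / 1."""
    table = {0: 0.0, 1: het_value, 2: 1.0}
    return [table[g] for g in genotypes]

# Traditional encodings expressed on the same 0..1 scale for comparison.
TRADITIONAL = {"recessive": 0.0, "additive": 0.5, "dominant": 1.0}

# Hypothetical toy data: heterozygotes sit close to the homozygous-alternate group.
genotypes = [0, 0, 1, 1, 2, 2]
phenotype = [-0.1, 0.1, 0.7, 0.9, 0.9, 1.1]

alpha = estimate_het_value(genotypes, phenotype)
edge_encoded = encode(genotypes, alpha)
closest = min(TRADITIONAL, key=lambda name: abs(TRADITIONAL[name] - alpha))
print(alpha, closest)
```

In this toy case the estimated value falls between the additive and dominant encodings, which is the situation where a fixed traditional encoding would misstate the heterozygote effect before interaction testing.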
Affiliation(s)
- Molly A. Hall
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Penn State Cancer Institute, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John Wallace
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Anastasia M. Lucas
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Yuki Bradford
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Shefali S. Verma
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Bertram Müller-Myhsok
  - Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Kristin Passero
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jiayan Zhou
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John McGuigan
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Beibei Jiang
  - Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Yanfei Zhang
  - Genomic Medicine Institute, Geisinger Health System, Danville, Pennsylvania, United States of America
- Peggy Peissig
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Murray Brilliant
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Patrick Sleiman
  - Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Hakon Hakonarson
  - Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- John B. Harley
  - Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
  - United States Department of Veterans Affairs Medical Center, Cincinnati, Ohio, United States of America
- Krzysztof Kiryluk
  - Division of Nephrology, Department of Medicine, College of Physicians and Surgeons, Columbia University, New York, New York, United States of America
- Kristel Van Steen
  - WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liège, Belgium
  - Department of Human Genetics, University of Leuven, Leuven, Belgium
- Jason H. Moore
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Marylyn D. Ritchie
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
9. Dehghanian SZ, Pan CT, Lee JM, Shiue YL. ABT-751 Induces Multiple Anticancer Effects in Urinary Bladder Urothelial Carcinoma-Derived Cells: Highlighting the Induction of Cytostasis through the Inhibition of SKP2 at Both Transcriptional and Post-Translational Levels. Int J Mol Sci 2021; 22:ijms22020945. [PMID: 33478005 PMCID: PMC7835924 DOI: 10.3390/ijms22020945] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/15/2020] [Revised: 01/09/2021] [Accepted: 01/14/2021] [Indexed: 12/14/2022]
Abstract
The objective was to investigate the anti-cancer effects and underlying molecular mechanisms of cytostasis which were activated by an anti-microtubule drug, ABT-751, in two urinary bladder urothelial carcinoma (UBUC)-derived cell lines, BFTC905 and J82, with distinct genetic backgrounds. A series of in vitro assays demonstrated that ABT-751 induced G2/M cell cycle arrest, decreased cell number in the S phase of the cell cycle and suppressed colony formation/independent cell growth, accompanied by alterations of the protein levels of several cell cycle regulators. In addition, ABT-751 treatment significantly hindered cell migration and invasion along with the regulation of epithelial–mesenchymal transition-related proteins. ABT-751 triggered autophagy and apoptosis, downregulated the mechanistic target of rapamycin kinase (MTOR) and upregulated several pro-apoptotic proteins that are involved in extrinsic and intrinsic apoptotic pathways. Apoptosis enhanced by inhibition of autophagosome and autolysosome was also observed. Through the inhibition of the NFκB signaling pathway, ABT-751 suppressed S-phase kinase associated protein 2 (SKP2) transcription and subsequent translation by downregulation of active/phospho-AKT serine/threonine kinase 1 (AKT1), component of inhibitor of nuclear factor kappa B kinase complex (CHUK), NFKB inhibitor alpha (NFKBIA), nuclear RELA proto-oncogene, NFκB subunit (RELA) and maintained a strong interaction between NFKBIA and RELA to prevent RELA nuclear translocation for SKP2 transcription. ABT-751 downregulated stable/phospho-SKP2 including pSKP2(S64) and pSKP2(S72), which targeted cyclin-dependent kinase inhibitors for degradation through the inactivation of AKT. Our results suggested that ABT-751 may act as an anti-cancer drug by inhibiting cell migration and invasion while inducing cell cycle arrest, autophagy and apoptosis in distinct UBUC-derived cells. In particular, the upstream molecular mechanism of its anticancer effects was identified as ABT-751-induced cytostasis through the inhibition of SKP2 at both transcriptional and post-translational levels to stabilize cyclin dependent kinase inhibitor 1A (CDKN1A) and CDKN1B proteins.
Affiliation(s)
- Seyedeh Zahra Dehghanian
  - Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, 70 Lienhai Rd, Kaohsiung 80424, Taiwan
- Cheng-Tang Pan
  - Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
  - Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
- Yow-Ling Shiue
  - Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, 70 Lienhai Rd, Kaohsiung 80424, Taiwan
  - Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
  - Correspondence: Tel.: +886-7-5252000 (ext. 5818); Fax: +886-7-5250197
10. Sangphukieo A, Laomettachit T, Ruengjitchatchawalya M. Photosynthetic protein classification using genome neighborhood-based machine learning feature. Sci Rep 2020; 10:7108. [PMID: 32346070 PMCID: PMC7189237 DOI: 10.1038/s41598-020-64053-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/17/2019] [Accepted: 04/07/2020] [Indexed: 11/08/2022]
Abstract
Identification of novel photosynthetic proteins is important for understanding and improving photosynthetic efficiency. Genome neighborhood can provide additional useful information to identify photosynthetic proteins. We therefore expected that applying a computational approach, particularly machine learning (ML) with a genome neighborhood-based feature, should facilitate photosynthetic function assignment. Our results revealed a functional relationship between photosynthetic genes and their conserved neighboring genes, as observed by the 'Phylo score', indicating that their functions could be inferred from the genome neighborhood profile. Therefore, we created a new method for extracting patterns based on the genome neighborhood network (GNN) and applied them to photosynthetic protein classification using ML algorithms. A random forest (RF) classifier using genome neighborhood-based features achieved the highest accuracy, up to 87%, in the classification of photosynthetic proteins and also showed better performance (Matthews correlation coefficient = 0.718) than other available tools, including a sequence similarity search (0.447) and an ML-based method (0.361). Furthermore, we demonstrated the ability of our model to identify novel photosynthetic proteins compared to the other methods. Our classifier is available at http://bicep2.kmutt.ac.th/photomod_standalone, https://bit.ly/2S0I2Ox and DockerHub: https://hub.docker.com/r/asangphukieo/photomod.
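For readers unfamiliar with the Matthews correlation coefficient quoted in this abstract, the following is a minimal sketch of how it is computed from a binary confusion matrix. The function name and the counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not results from the study).
mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
print(round(mcc, 3))
```

Unlike accuracy, the coefficient uses all four cells of the confusion matrix, which is why it is often preferred when classes are imbalanced.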
Affiliation(s)
- Apiwat Sangphukieo
  - Bioinformatics and Systems Biology Program, School of Bioresources and Technology, King Mongkut's University of Technology Thonburi (KMUTT), Bang Khun Thian, Bangkok, 10150, Thailand
  - School of Information Technology, KMUTT, Bang Mod, Thung Khru, Bangkok, 10140, Thailand
- Teeraphan Laomettachit
  - Bioinformatics and Systems Biology Program, School of Bioresources and Technology, King Mongkut's University of Technology Thonburi (KMUTT), Bang Khun Thian, Bangkok, 10150, Thailand
- Marasri Ruengjitchatchawalya
  - Bioinformatics and Systems Biology Program, School of Bioresources and Technology, King Mongkut's University of Technology Thonburi (KMUTT), Bang Khun Thian, Bangkok, 10150, Thailand
  - Biotechnology program, School of Bioresources and Technology, KMUTT, Bang Khun Thian, Bangkok, 10150, Thailand
  - Algal Biotechnology Research Group, Pilot Plant Development and Training Institute (PDTI), KMUTT, Bang Khun Thian, Bangkok, 10150, Thailand
|
11
|
Wu WR, Lin JT, Pan CT, Chan TC, Liu CL, Wu WJ, Sheu JJC, Yeh BW, Huang SK, Jhung JY, Hsiao MS, Li CF, Shiue YL. Amplification-driven BCL6-suppressed cytostasis is mediated by transrepression of FOXO3 and post-translational modifications of FOXO3 in urinary bladder urothelial carcinoma. Am J Cancer Res 2020; 10:707-724. [PMID: 31903146 PMCID: PMC6929993 DOI: 10.7150/thno.39018] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/04/2019] [Accepted: 10/17/2019] [Indexed: 01/14/2023] Open
Abstract
Muscle-invasive urinary bladder urothelial carcinoma (UBUC) is a lethal disease for which effective prognostic markers and potential therapy targets are still lacking. Previous array comparative genomic hybridization identified that 3q27 is frequently amplified in muscle-invasive UBUCs, and one candidate proto-oncogene, B-cell CLL/lymphoma 6 (BCL6), maps to this region. We therefore aimed to explore its downstream targets and physiological roles in UBUC progression. Methods: Specimens from UBUC patients, NOD/SCID mice and several UBUC-derived cell lines were used to perform quantitative RT-PCR, fluorescence in situ hybridization, immunohistochemistry, xenograft, stable gene overexpression/knockdown and a series of in vitro experiments. Results: Amplification of the BCL6 gene led to upregulation of BCL6 mRNA and protein levels in a substantial set of advanced UBUCs. High BCL6 protein level significantly predicted poor disease-specific and metastasis-free survival. Knockdown of the BCL6 gene in J82 cells inhibited tumor growth and enhanced apoptosis in the NOD/SCID xenograft model. In vitro experiments demonstrated that BCL6 inhibited cytostasis and induced cell migration and invasion, along with alteration of the expression levels of several related regulators. At the molecular level, BCL6 inhibited forkhead box O3 (FOXO3) transcription and subsequent translation, and upregulated phosphorylated/inactive FOXO3, through phosphoinositide 3-kinase (PI3K)/AKT serine/threonine kinase (AKT) and/or epidermal growth factor receptor (EGFR)/mitogen-activated protein kinase 1/2 (MAP2K1/2) signaling pathway(s). Two BCL6 binding sites on the proximal promoter region of the FOXO3 gene were confirmed. Conclusion: Overexpression of BCL6 served as a poor prognostic factor in UBUC patients. In vivo and in vitro studies suggested that BCL6 functions as an oncogene through direct transrepression of the FOXO3 gene and through downregulation and phosphorylation of the FOXO3 protein.
|
12
|
Liu Y, Huang J, Urbanowicz RJ, Chen K, Manduchi E, Greene CS, Moore JH, Scheet P, Chen Y. Embracing study heterogeneity for finding genetic interactions in large-scale research consortia. Genet Epidemiol 2019; 44:52-66. [PMID: 31583758 DOI: 10.1002/gepi.22262] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/13/2018] [Revised: 08/02/2019] [Accepted: 08/09/2019] [Indexed: 11/12/2022]
Abstract
Genetic interactions have been recognized as a potentially important contributor to the heritability of complex diseases. Nevertheless, due to small effect sizes and stringent multiple-testing correction, identifying genetic interactions in complex diseases is particularly challenging. To address these challenges, many genomic research initiatives collaborate to form large-scale consortia and develop open access to enable sharing of genome-wide association study (GWAS) data. Despite the perceived benefits of data sharing from large consortia, a number of practical issues have arisen, such as privacy concerns about individual genomic information and heterogeneous data sources from distributed GWAS databases. In the context of large consortia, we demonstrate that the heterogeneously appearing marginal effects over distributed GWAS databases can offer new insights into genetic interactions for which conventional methods have had limited success. In this paper, we develop a novel two-stage testing procedure, named phylogenY-based effect-size tests for interactions using first 2 moments (YETI2), to detect genetic interactions through both pooled marginal effects, in terms of averaging site-specific marginal effects, and heterogeneity in marginal effects across sites, using a meta-analytic framework. YETI2 can not only be applied to large consortia without shared personal information but can also be used to leverage underlying heterogeneity in marginal effects to prioritize potential genetic interactions. We investigate the performance of YETI2 through simulation studies and apply YETI2 to bladder cancer data from dbGaP.
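The two quantities this abstract builds on, a pooled marginal effect and the heterogeneity of marginal effects across sites, can be illustrated with a standard fixed-effect meta-analysis. The sketch below is not YETI2 itself; it only computes an inverse-variance pooled estimate and Cochran's Q from site-level summary statistics, and all names and numbers are hypothetical.

```python
from typing import Sequence, Tuple


def pooled_effect_and_q(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float]:
    """Inverse-variance pooled effect and Cochran's Q across sites.

    betas: per-site marginal effect estimates for one variant.
    ses:   per-site standard errors of those estimates.
    """
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    # Q grows when site-specific effects scatter more than sampling error allows.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, q


# Hypothetical summary statistics from three sites (illustration only).
site_betas = [0.10, 0.30, -0.05]
site_ses = [0.05, 0.10, 0.05]
pooled, q = pooled_effect_and_q(site_betas, site_ses)
print(round(pooled, 4), round(q, 4))
```

Only summary statistics are exchanged in this kind of calculation, which is consistent with the abstract's point that individual-level data need not be shared.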
Affiliation(s)
- Yulun Liu
  - Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas
- Jing Huang
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Ryan J Urbanowicz
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Kun Chen
  - Department of Statistics, University of Connecticut, Storrs, Connecticut
- Elisabetta Manduchi
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Casey S Greene
  - Department of Pharmacology, University of Pennsylvania, Philadelphia, Pennsylvania
- Jason H Moore
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Paul Scheet
  - Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Yong Chen
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
|
13
|
Li X, Yang H, Wen K, Zhong X, Xia X, Liu L, Qin D. A Method for Analyzing Two-locus Epistasis of Complex Diseases based on Decision Tree and Mutual Entropy. Curr Proteomics 2019. [DOI: 10.2174/1570164616666190123150236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
Epistasis makes complex diseases difficult to understand, especially when heterogeneity also exists. Heterogeneity of complex diseases makes the distribution of the case population more complicated. However, traditional methods proposed to detect epistasis often ignore heterogeneity, resulting in low power of association studies.
Methods:
In this study, we first use rank information in the Classification Decision Tree and Mutual Entropy (CTME) to construct two different evaluation scores, which serve as multiple objectives. In addition, we improve the calculation of joint entropy between SNPs and the disease label, which raises the efficiency of CTME. Then, the ant colony algorithm is applied to search the two-locus epistatic combination space. To handle potential heterogeneity, all candidate two-locus SNPs are merged to recognize multiple different epistatic combinations. Finally, all these solutions are tested by the χ² test.
Results and Conclusion:
Experiments show that our method CTME improves the power of association studies. More importantly, CTME also detects multiple epistatic SNPs contributing to heterogeneity. The experimental results show that CTME has advantages in power and efficiency.
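The entropy-based scoring mentioned in the Methods can be illustrated with plain mutual information between a two-locus genotype combination and the disease label. This is a generic sketch under the assumption that genotypes and labels are given as equal-length lists; it is not the CTME scoring function, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def two_locus_mutual_information(
    snp_a: List[int], snp_b: List[int], labels: List[int]
) -> float:
    """I((A, B); Y) = H(A, B) + H(Y) - H(A, B, Y)."""
    pair = list(zip(snp_a, snp_b))
    joint = list(zip(snp_a, snp_b, labels))
    return entropy(pair) + entropy(labels) - entropy(joint)


# Toy XOR pattern: neither SNP is informative alone, the pair is.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
labels = [0, 1, 1, 0]
mi_pair = two_locus_mutual_information(snp_a, snp_b, labels)
mi_single = entropy(snp_a) + entropy(labels) - entropy(list(zip(snp_a, labels)))
print(mi_pair, mi_single)
```

The toy data show why pairwise scoring matters for epistasis: a purely interactive effect is invisible to single-locus tests.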
Affiliation(s)
- Xiong Li
  - Key Laboratory of Advanced Control & Optimization of Jiangxi Province, East China Jiaotong University, Nanchang, 330013, China
- Hui Yang
  - Key Laboratory of Advanced Control & Optimization of Jiangxi Province, East China Jiaotong University, Nanchang, 330013, China
- Kaifu Wen
  - Postdoctoral Research Station, Jiang Xi Holitech Technology Co., Ltd., Jian, 343700, China
- Xiaoming Zhong
  - Postdoctoral Research Station, Jiang Xi Holitech Technology Co., Ltd., Jian, 343700, China
- Xuewen Xia
  - School of Software, East China Jiaotong University, Nanchang, 330013, China
- Liyue Liu
  - School of Software, East China Jiaotong University, Nanchang, 330013, China
- Dehao Qin
  - School of Software, East China Jiaotong University, Nanchang, 330013, China
|
14
|
Hanley JP, Rizzo DM, Buzas JS, Eppstein MJ. A Tandem Evolutionary Algorithm for Identifying Causal Rules from Complex Data. Evolutionary Computation 2019; 28:87-114. [PMID: 30817200 DOI: 10.1162/evco_a_00252] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/09/2023]
Abstract
We propose a new evolutionary approach for discovering causal rules in complex classification problems from batch data. Key aspects include (a) the use of a hypergeometric probability mass function as a principled statistic for assessing fitness that quantifies the probability that the observed association between a given clause and target class is due to chance, taking into account the size of the dataset, the amount of missing data, and the distribution of outcome categories, (b) tandem age-layered evolutionary algorithms for evolving parsimonious archives of conjunctive clauses, and disjunctions of these conjunctions, each of which has probabilistically significant associations with outcome classes, and (c) separate archive bins for clauses of different orders, with dynamically adjusted order-specific thresholds. The method is validated on majority-on and multiplexer benchmark problems exhibiting various combinations of heterogeneity, epistasis, overlap, noise in class associations, missing data, extraneous features, and imbalanced classes. We also validate on a more realistic synthetic genome dataset with heterogeneity, epistasis, extraneous features, and noise. In all synthetic epistatic benchmarks, we consistently recover the true causal rule sets used to generate the data. Finally, we discuss an application to a complex real-world survey dataset designed to inform possible ecohealth interventions for Chagas disease.
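The hypergeometric probability mass function named in point (a) can be sketched as follows. This is a simplified illustration: it assumes no missing data and a single target class, whereas the abstract states that the actual fitness also accounts for missing data. The parameter names and the example counts are hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(exactly k target-class samples among `draws` matched samples).

    population: total number of samples in the dataset.
    successes:  number of samples belonging to the target class.
    draws:      number of samples matched by the clause.
    k:          number of matched samples that are in the target class.
    """
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


# Hypothetical case: a clause matches 5 of 20 samples, and all 5 fall in a
# target class that holds 10 of the 20 samples.
p_chance = hypergeom_pmf(k=5, population=20, successes=10, draws=5)
print(p_chance)
```

A small value indicates that the clause's agreement with the target class would be unlikely under random assignment, which is the sense in which a lower probability corresponds to a fitter clause.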
Affiliation(s)
- John P Hanley
  - Department of Civil and Environmental Engineering, University of Vermont, Burlington, 05405, USA
- Donna M Rizzo
  - Department of Civil and Environmental Engineering, University of Vermont, Burlington, 05405, USA
- Jeffrey S Buzas
  - Department of Mathematics and Statistics, University of Vermont, Burlington, 05405, USA
- Margaret J Eppstein
  - Department of Computer Science, University of Vermont, Burlington, 05405, USA
|
15
|
Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH. Benchmarking relief-based feature selection methods for bioinformatics data mining. J Biomed Inform 2018; 85:168-188. [PMID: 30030120 PMCID: PMC6299838 DOI: 10.1016/j.jbi.2018.07.015] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 01/15/2018] [Revised: 06/30/2018] [Accepted: 07/14/2018] [Indexed: 11/23/2022]
Abstract
Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data) and (5) be computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the 'Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We apply a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods, over a variety of problems. The results of this study (1) support the assertion that RBAs are particularly flexible, efficient, and powerful feature selection methods that differentiate relevant features having univariate, multivariate, epistatic, or heterogeneous associations, (2) confirm the efficacy of expansions for classification vs. regression, discrete vs. continuous features, missing data, multiple classes, or class imbalance, (3) identify previously unknown limitations of specific RBAs, and (4) suggest that while MultiSURF* performs best for explicitly identifying pure 2-way interactions, MultiSURF yields the most reliable feature selection performance across a wide range of problem types.
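As background for the Relief-based algorithms discussed here, the sketch below implements the basic Relief idea for discrete features and two classes: a feature gains weight when it differs on the nearest neighbour of the other class (nearest miss) and loses weight when it differs on the nearest neighbour of the same class (nearest hit). This is a plain-Python illustration that uses every instance once; it is not MultiSURF and does not use the ReBATE API, and the function name and toy data are assumptions.

```python
from typing import List, Sequence


def relief_scores(X: Sequence[Sequence[int]], y: Sequence[int]) -> List[float]:
    """Basic Relief feature weights for discrete features, using all instances."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p

    def distance(a: Sequence[int], b: Sequence[int]) -> int:
        # Hamming distance between two discrete feature vectors.
        return sum(ai != bi for ai, bi in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot score an instance without both neighbour types
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(p):
            weights[f] -= (X[i][f] != X[hit][f]) / n
            weights[f] += (X[i][f] != X[miss][f]) / n
    return weights


# Toy data: feature 0 equals the class label, feature 1 is unrelated noise.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]
scores = relief_scores(X, y)
print(scores)
```

Because the weights come from nearest-neighbour comparisons over all features jointly, this family of methods can respond to interacting features that univariate filters miss, which is the property the benchmark evaluates.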
Affiliation(s)
- Ryan J Urbanowicz
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Randal S Olson
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Peter Schmitt
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jason H Moore
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
|
16
|
Li X. A fast and exhaustive method for heterogeneity and epistasis analysis based on multi-objective optimization. Bioinformatics 2018; 33:2829-2836. [PMID: 28541468 DOI: 10.1093/bioinformatics/btx339] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 04/02/2017] [Accepted: 05/20/2017] [Indexed: 12/29/2022] Open
Abstract
Motivation: Existing epistasis analysis approaches have been criticized mainly for (i) ignoring heterogeneity during epistasis analysis; (ii) high computational costs; and (iii) volatility of performance and results. They therefore do not perform well in general, leading to a lack of reproducibility and low power in complex disease association studies. In this work, a fast scheme named ESMO is proposed to accelerate exhaustive searching based on multi-objective optimization, for concurrently analyzing heterogeneity and epistasis phenomena. In ESMO, mutual entropy and Bayesian network approaches are combined for evaluating epistatic SNP combinations. To be compatible with the heterogeneity of complex diseases, we designed an adaptive framework based on non-dominant sort and a top-k selection algorithm with improved time complexity O(k*M*N). Moreover, ESMO is accelerated by strategies such as trading space for time, calculation sharing and parallel computing. Finally, ESMO is nonparametric and model-free. Results: We compared ESMO with other recent or classic methods using different evaluation measures. The experimental results show that our method not only can quickly handle epistasis, but also can effectively detect heterogeneity of complex population structures. Availability and implementation: https://github.com/XiongLi2016/ESMO/tree/master/ESMO-common-master. Contact: lx_hncs@163.com.
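The combination of a non-dominated sort with top-k selection can be illustrated with a small two-objective example. The sketch below uses a naive quadratic-time Pareto filter, not the O(k*M*N) scheme the abstract describes, and the rule for ranking within the front (by the first objective) is an assumption made only for illustration. Scores are hypothetical and both objectives are treated as "larger is better".

```python
from typing import List, Sequence, Tuple

Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(scores: Sequence[Score]) -> List[int]:
    """Indices of candidates that no other candidate dominates (Pareto front)."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def select_top_k(scores: Sequence[Score], k: int) -> List[int]:
    """Keep at most k front members, ranked by the first objective (assumed rule)."""
    front = non_dominated(scores)
    return sorted(front, key=lambda i: scores[i][0], reverse=True)[:k]


# Hypothetical (objective 1, objective 2) scores for four SNP combinations.
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
front = non_dominated(scores)
chosen = select_top_k(scores, k=2)
print(front, chosen)
```

Keeping several non-dominated candidates rather than a single best one is what allows different SNP combinations, possibly tied to different subgroups, to survive selection.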
Affiliation(s)
- Xiong Li
  - School of Software, East China Jiaotong University, Nanchang 330013, China
|
17
|
Heterogeneity Analysis and Diagnosis of Complex Diseases Based on Deep Learning Method. Sci Rep 2018; 8:6155. [PMID: 29670206 PMCID: PMC5906634 DOI: 10.1038/s41598-018-24588-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/14/2018] [Accepted: 04/05/2018] [Indexed: 12/26/2022] Open
Abstract
Understanding the genetic mechanisms of complex diseases is a serious challenge. Existing methods often neglect the heterogeneity of complex diseases, resulting in a lack of power or low reproducibility. Addressing heterogeneity when detecting epistatic single nucleotide polymorphisms (SNPs) can enhance the power of association studies and improve the prediction performance of complex disease diagnosis. In this study, we propose a three-stage framework, comprising epistasis detection, clustering and prediction, to address both epistasis and heterogeneity of complex diseases based on a deep learning method. The epistasis detection stage applies a multi-objective optimization method to find several candidate sets of epistatic SNPs which contribute to different subtypes of complex diseases. Then, a K-means clustering algorithm is used to define subtypes of the case group. Finally, a deep learning model is trained on a graphics processing unit (GPU) for disease prediction. Experimental results on pure and heterogeneous datasets show that our method has potential practicality and can serve as a possible alternative to other methods. Therefore, when epistasis and heterogeneity exist at the same time, our method is especially suitable for diagnosis of complex diseases.
|
18
|
Löpprich M, Karmen C, Ganzinger M, Gietzelt M. Models and Data Sources Used in Systems Medicine. Methods Inf Med 2018; 55:107-13. [DOI: 10.3414/me15-01-0151] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/17/2015] [Accepted: 01/18/2016] [Indexed: 12/11/2022]
Abstract
Background: Systems medicine is a new approach for the development and selection of treatment strategies for patients with complex diseases. It is often referred to as the application of systems biology methods for decision making in patient care. For systems medicine computer applications, many different data sources have to be integrated and included in models. This is a challenging task for Medical Informatics since the approach exceeds traditional systems like Electronic Health Records. To prioritize research activities for systems medicine applications, it is necessary to get an overview of the modelling methods and data sources already used in this field. Objectives: We performed a systematic literature review with the objective of capturing current use of 1) modelling methods and 2) data sources in systems medicine related research projects. Methods: We queried the MEDLINE and ScienceDirect databases for papers associated with the search term systems medicine and related terms. Papers were screened and assessed in full text in a two-step process according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement guidelines. Results: The queries returned 698 articles, of which 34 papers were finally included in the study. A multitude of modelling approaches such as machine learning and network analysis was identified and classified. Since these approaches are also used in other domains, no methods specific to systems medicine could be identified. Omics data are the most widely used data types, followed by clinical data. Most studies only include a rather limited number of data sources. Conclusions: Currently, many different modelling approaches are used in systems medicine. Thus, highly flexible modular solutions are necessary for systems medicine clinical applications. However, the number of data sources included in the models is limited and most projects currently focus on prognosis. To leverage the potential of systems medicine further, it will be necessary to focus on treatment strategies for patients and consider a broader range of data.
|
19
|
Li CF, Wu WR, Chan TC, Wang YH, Chen LR, Wu WJ, Yeh BW, Liang SS, Shiue YL. Transmembrane and Coiled-Coil Domain 1 Impairs the AKT Signaling Pathway in Urinary Bladder Urothelial Carcinoma: A Characterization of a Tumor Suppressor. Clin Cancer Res 2017; 23:7650-7663. [PMID: 28972042 DOI: 10.1158/1078-0432.ccr-17-0002] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 01/03/2017] [Revised: 08/02/2017] [Accepted: 09/25/2017] [Indexed: 11/16/2022]
Abstract
Purpose: Urinary bladder urothelial carcinoma (UBUC) is a common malignant disease in developed countries. Cell-cycle dysregulation resulting in uncontrolled cell proliferation has been associated with UBUC development. This study aimed to explore the roles of TMCO1 in UBUCs. Experimental Design: Data mining, branched DNA assay, immunohistochemistry, xenograft, cell culture, quantitative RT-PCR, immunoblotting, stable and transient transfection, lentivirus production and stable knockdown, cell-cycle, cell viability and proliferation, soft-agar, wound-healing, transwell migration and invasion, coimmunoprecipitation, immunocytochemistry, and AKT serine/threonine kinase (AKT) activity assays and site-directed mutagenesis were used to study TMCO1 involvement in vivo and in vitro. Results: Data mining identified that the TMCO1 transcript was downregulated during the progression of UBUCs. In distinct UBUC-derived cell lines, changes in TMCO1 levels altered the cell-cycle distribution, cell viability, cell proliferation, and colony formation and modulated the AKT pathway. TMCO1 recruited the PH domain and leucine-rich repeat protein phosphatase 2 (PHLPP2) to dephosphorylate pAKT1(serine 473) (S473). Mutagenesis at S60 of the TMCO1 protein released TMCO1-induced cell-cycle arrest and restored the AKT pathway in BFTC905 cells. Stable TMCO1 (wild-type) overexpression suppressed, whereas T33A and S60A mutants recovered, tumor size in xenograft mice. Conclusions: Clinical associations, xenograft mice, and in vitro indications provide solid evidence that the TMCO1 gene is a novel tumor suppressor in UBUCs. TMCO1 dysregulates cell-cycle progression via suppression of the AKT pathway, and S60 of the TMCO1 protein is crucial for its tumor-suppressor roles. Clin Cancer Res; 23(24); 7650-63. ©2017 AACR.
Affiliation(s)
- Chien-Feng Li
  - Department of Pathology, Chi Mei Medical Center, Tainan, Taiwan
  - National Institute of Cancer Research, National Health Research Institute, Tainan, Taiwan
  - Department of Pathology, Kaohsiung Medical University, Kaohsiung, Taiwan
  - Department of Biotechnology, Southern Taiwan University of Science and Technology, Tainan, Taiwan
- Wen-Ren Wu
  - Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
- Ti-Chun Chan
  - Department of Pathology, Chi Mei Medical Center, Tainan, Taiwan
  - Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
- Yu-Hui Wang
  - Department of Pathology, Chi Mei Medical Center, Tainan, Taiwan
  - Institute of Bioinformatics and Biosignal Transduction, National Cheng Kung University, Tainan, Taiwan
- Lih-Ren Chen
  - Department of Biotechnology, Southern Taiwan University of Science and Technology, Tainan, Taiwan
  - Division of Physiology, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
  - Institute of Biotechnology, National Cheng Kung University, Tainan, Taiwan
- Wen-Jeng Wu
  - Department of Urology, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
  - Department of Urology, School of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
  - Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
  - Center for Infectious Disease and Cancer Research, Kaohsiung Medical University, Kaohsiung, Taiwan
  - Center for Stem Cell Research, Kaohsiung Medical University, Kaohsiung, Taiwan
  - Department of Urology, Kaohsiung Municipal Ta-Tung Hospital, Kaohsiung, Taiwan
  - Institute of Medical Science and Technology, Kaohsiung Medical University, Kaohsiung, Taiwan
- Bi-Wen Yeh
  - Department of Urology, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
- Shih-Shin Liang
  - Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
  - Department of Biotechnology, Kaohsiung Medical University, Kaohsiung, Taiwan
- Yow-Ling Shiue
  - Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
  - Department of Biological Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
  - Doctoral degree program in Marine Biotechnology, National Sun Yat-sen University, Kaohsiung, Taiwan
|
20
|
Li X, Jiang W. Method for generating multiple risky barcodes of complex diseases using ant colony algorithm. Theor Biol Med Model 2017; 14:4. [PMID: 28143579 PMCID: PMC5286784 DOI: 10.1186/s12976-017-0050-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 09/09/2016] [Accepted: 01/12/2017] [Indexed: 11/30/2022] Open
Abstract
Background: Susceptible barcode recognition plays an important role in the diagnosis and treatment of complex diseases. Numerous approaches have been proposed to identify risky barcodes involved in the progress of complex diseases. However, some methods only consider differences in barcode frequencies between the control and disease groups; as such, these methods may be partial or even wrong. For example, some barcodes with a high risk ratio have a low frequency in cases or a high frequency in controls, which may be unreasonable from a statistical point of view. Results: In our study, a stricter criterion, maximum discrepancy and maximum constituency, is designed to evaluate each barcode, and an ant colony algorithm is used to search the combination space of epistasis. For complex diseases with multiple subtypes, our method can list several potential barcodes contributing to different subtypes of complex diseases. Another contribution of this work is to introduce a method for determining the length of barcodes and excluding noisy barcodes whose frequencies are abnormal. In addition, common pathogenic genes shared by different risky barcodes are also recognized, which may provide key clues for further study, such as gene function analysis. Conclusions: Experimental results reveal that our method can find multiple risky barcodes whose risk ratio and odds ratio are >1. These barcodes could be related to different subtypes of complex diseases.
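The risk ratio and odds ratio used as the reporting threshold in the Conclusions follow the standard 2x2 contingency-table definitions. The sketch below computes both for one barcode; the counts and names are hypothetical and the code is not part of the authors' method.

```python
from typing import Tuple


def risk_and_odds_ratio(
    cases_with: int, controls_with: int, cases_without: int, controls_without: int
) -> Tuple[float, float]:
    """Risk ratio and odds ratio for carrying a barcode versus not carrying it."""
    risk_with = cases_with / (cases_with + controls_with)
    risk_without = cases_without / (cases_without + controls_without)
    risk_ratio = risk_with / risk_without
    odds_ratio = (cases_with * controls_without) / (controls_with * cases_without)
    return risk_ratio, odds_ratio


# Hypothetical counts: 40 barcode carriers (30 cases, 10 controls) and
# 60 non-carriers (20 cases, 40 controls).
rr, odds = risk_and_odds_ratio(
    cases_with=30, controls_with=10, cases_without=20, controls_without=40
)
print(rr, odds)
```

Both ratios can be large even when a barcode is rare among cases, which is the weakness of frequency-only criteria that the Background points out.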
Affiliation(s)
- Xiong Li
  - School of Software, East China Jiaotong University, Nanchang, 330013, China
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China
- Wen Jiang
  - Software School, Hunan Vocational College Of Science and Technology, Changsha, Hunan, 410118, China
|
21
|
Lazzarini N, Widera P, Williamson S, Heer R, Krasnogor N, Bacardit J. Functional networks inference from rule-based machine learning models. BioData Min 2016; 9:28. [PMID: 27597880 PMCID: PMC5011349 DOI: 10.1186/s13040-016-0106-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/11/2016] [Accepted: 08/11/2016] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, or complementary to, those found by similarity-based methods. RESULTS We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in the context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. AVAILABILITY The implementation of our network inference protocol is available at: http://ico2s.org/software/funel.html.
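The core assumption stated in the RESULTS, that genes used together within one rule may be functionally related, can be sketched as a co-occurrence network built from rule contents. This is only an illustration of that idea, not the FuNeL implementation; the representation of a rule as a set of gene names and the gene identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def rules_to_edges(rules: Iterable[Set[str]]) -> Dict[Tuple[str, str], int]:
    """Count how often each pair of genes appears together in the same rule."""
    edges: Counter = Counter()
    for genes in rules:
        # Sorting gives each undirected edge one canonical (a, b) key.
        for a, b in combinations(sorted(genes), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical rule set: each rule is the set of genes it tests.
rules = [{"G1", "G2"}, {"G1", "G2", "G3"}, {"G4"}]
edges = rules_to_edges(rules)
print(edges)
```

A gene that appears alone in a rule, such as G4 here, contributes no edge, so the resulting network reflects joint use in classification rather than similarity of expression levels.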
Affiliation(s)
- Nicola Lazzarini: Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Paweł Widera: Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Stuart Williamson: Clinical and Experimental Pharmacology Group, Cancer Research UK Manchester Institute, University of Manchester, Manchester, UK
- Rakesh Heer: Northern Institute for Cancer Research, Medical School, Newcastle University, Newcastle upon Tyne, UK
- Natalio Krasnogor: Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Jaume Bacardit: Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
|
22
|
The cAMP responsive element binding protein 1 transactivates epithelial membrane protein 2, a potential tumor suppressor in the urinary bladder urothelial carcinoma. Oncotarget 2016; 6:9220-39. [PMID: 25940704 PMCID: PMC4496213 DOI: 10.18632/oncotarget.3312] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 10/20/2014] [Accepted: 02/08/2015] [Indexed: 12/22/2022] Open
Abstract
In this study, we report that EMP2 plays a tumor suppressor role by inducing G2/M cell cycle arrest and suppressing cell viability, proliferation, and colony formation/anchorage-independent cell growth via regulation of G2/M checkpoints in distinct urinary bladder urothelial carcinoma (UBUC)-derived cell lines. Genistein treatment or exogenous expression of the cAMP responsive element binding protein 1 (CREB1) gene in different UBUC-derived cell lines induced EMP2 transcription and subsequent translation. Mutagenesis of either or both cAMP-responsive element(s) dramatically decreased the EMP2 promoter activity with or without genistein treatment or exogenous CREB1 expression, respectively. A significant correlation between the EMP2 immunointensity and primary tumor, nodal status, histological grade, vascular invasion and mitotic activity was identified. Multivariate analysis further demonstrated that low EMP2 immunoexpression is an independent prognostic factor for poor disease-specific survival. Genistein treatments, knockdown of the EMP2 gene and double knockdown of the CREB1 and EMP2 genes significantly inhibited tumor growth and notably downregulated CREB1 and EMP2 protein levels in the mouse xenograft models. Therefore, genistein induced CREB1 transcription and translation and upregulated the pCREB1(S133) protein level. Afterward, pCREB1(S133) transactivated the tumor suppressor gene, EMP2, in vitro and in vivo. Our study identified a novel transcriptional target of CREB1 that plays a tumor suppressor role.
|
23
|
Urbanowicz RJ, Moore JH. ExSTraCS 2.0: Description and Evaluation of a Scalable Learning Classifier System. Evolutionary Intelligence 2015; 8:89-116. [PMID: 26417393 PMCID: PMC4583133 DOI: 10.1007/s12065-015-0128-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 10/23/2022]
Abstract
Algorithmic scalability is a major concern for any machine learning strategy in this age of 'big data'. A large number of potentially predictive attributes is emblematic of problems in bioinformatics, genetic epidemiology, and many other fields. Previously, ExSTraCS was introduced as an extended Michigan-style supervised learning classifier system that combined a set of powerful heuristics to successfully tackle the challenges of classification, prediction, and knowledge discovery in complex, noisy, and heterogeneous problem domains. While Michigan-style learning classifier systems are powerful and flexible learners, they are not considered to be particularly scalable. For the first time, this paper presents a complete description of the ExSTraCS algorithm and introduces an effective strategy to dramatically improve learning classifier system scalability. ExSTraCS 2.0 addresses scalability with (1) a rule specificity limit, (2) new approaches to expert knowledge guided covering and mutation mechanisms, and (3) the implementation and utilization of the TuRF algorithm for improving the quality of expert knowledge discovery in larger datasets. Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improve nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to perform on related 200- and 2000-attribute datasets. ExSTraCS 2.0 was also able to reliably solve the 6, 11, 20, 37, 70, and 135 multiplexer problems, and did so in similar or fewer learning iterations than previously reported, with smaller finite training sets, and without using building blocks discovered from simpler multiplexer problems. Furthermore, ExSTraCS usability was made simpler through the elimination of previously critical run parameters.
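The multiplexer sizes listed in the abstract above follow from the standard definition of the benchmark: an input of k address bits followed by 2^k data bits, where the class is the data bit selected by the address. The sketch below illustrates that definition only; it is not taken from ExSTraCS, and the function name `multiplexer` is an assumption.

```python
def multiplexer(bits, k):
    """Return the data bit selected by the first k address bits.

    `bits` is a list of 0/1 values of length k + 2**k.
    """
    if len(bits) != k + 2 ** k:
        raise ValueError("expected k address bits plus 2**k data bits")
    # Read the address bits as a binary number.
    address = int("".join(str(b) for b in bits[:k]), 2)
    # The class label is the data bit at that address.
    return bits[k + address]


# Problem sizes for k = 2..7 address bits.
sizes = [k + 2 ** k for k in range(2, 8)]
print(sizes)

# 6-bit example: address bits "10" select data bit number 2.
example = [1, 0, 0, 0, 1, 0]
print(multiplexer(example, k=2))
```

Because the relevant data bit changes with the address, no single attribute predicts the class on its own, which is why the problem is used to test detection of interactions.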
Affiliation(s)
- Ryan J. Urbanowicz: Geisel School of Medicine, 1 Medical Center Dr., Lebanon NH, 03756, USA, Tel.: +603-653-6017, Fax: +603-653-9952
- Jason H. Moore: Geisel School of Medicine, 1 Medical Center Dr., Lebanon NH, 03756, USA, Tel.: +603-653-6017, Fax: +603-653-9952
|
24
|
Abstract
Here we introduce the ReliefF machine learning algorithm and some of its extensions for detecting and characterizing epistasis in genetic association studies. We provide a general overview of the method and then highlight some of the modifications that have greatly improved its power for genetic analysis. We end with a few examples of published studies of complex human diseases that have used ReliefF.
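To make the idea behind this family of methods concrete, the following is a minimal sketch of the original Relief weighting scheme for discrete features and two classes, using a single nearest hit and nearest miss per instance. It is a simplification, not ReliefF itself (which averages over several neighbours and handles noise and multiple classes); the function name `relief_scores` and the toy data are assumptions.

```python
def relief_scores(X, y):
    """Basic Relief feature weights for discrete features and two classes.

    For each instance, find the nearest neighbour of the same class (hit)
    and of the other class (miss). A feature is penalised when it differs
    from the hit and rewarded when it differs from the miss.
    """
    n, m = len(X), len(X[0])

    def distance(a, b):
        # Hamming distance between two discrete feature vectors.
        return sum(ai != bi for ai, bi in zip(a, b))

    weights = [0.0] * m
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(m):
            weights[f] -= (X[i][f] != X[hit][f]) / n
            weights[f] += (X[i][f] != X[miss][f]) / n
    return weights


# Toy epistasis example: the class is the XOR of features 0 and 1,
# and feature 2 is irrelevant. Neither feature 0 nor 1 is predictive alone.
X = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [a ^ b for a, b, _ in X]
scores = relief_scores(X, y)
print(scores)
```

On this toy data the two interacting features receive higher weights than the irrelevant one, even though neither has a main effect.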
|
25
|
Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J 2014; 13:8-17. [PMID: 25750696 PMCID: PMC4348437 DOI: 10.1016/j.csbj.2014.11.005] [Citation(s) in RCA: 1112] [Impact Index Per Article: 111.2] [Indexed: 02/06/2023] Open
Abstract
Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as they can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics field, to study the application of machine learning (ML) methods. These techniques have therefore been used to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs), have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend in the application of ML methods in cancer research, we present here the most recent publications that employ these techniques to model cancer risk or patient outcomes.
Key Words
- ANN, Artificial Neural Network
- AUC, Area Under Curve
- BCRSVM, Breast Cancer Support Vector Machine
- BN, Bayesian Network
- CFS, Correlation based Feature Selection
- Cancer recurrence
- Cancer survival
- Cancer susceptibility
- DT, Decision Tree
- ES, Early Stopping algorithm
- GEO, Gene Expression Omnibus
- HTT, High-throughput Technologies
- LCS, Learning Classifying Systems
- ML, Machine Learning
- Machine learning
- NCI caArray, National Cancer Institute Array Data Management System
- NSCLC, Non-small Cell Lung Cancer
- OSCC, Oral Squamous Cell Carcinoma
- PPI, Protein–Protein Interaction
- Predictive models
- ROC, Receiver Operating Characteristic
- SEER, Surveillance, Epidemiology and End results Database
- SSL, Semi-supervised Learning
- SVM, Support Vector Machine
- TCGA, The Cancer Genome Atlas Research Network
Affiliation(s)
- Konstantina Kourou: Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Themis P Exarchos: Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece; IMBB - FORTH, Dept. of Biomedical Research, Ioannina, Greece
- Konstantinos P Exarchos: Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Michalis V Karamouzis: Molecular Oncology Unit, Department of Biological Chemistry, Medical School, University of Athens, Athens, Greece
- Dimitrios I Fotiadis: Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece; IMBB - FORTH, Dept. of Biomedical Research, Ioannina, Greece
|
26
|
Holmes JH. Methods and applications of evolutionary computation in biomedicine. J Biomed Inform 2014; 49:11-5. [PMID: 24874181 DOI: 10.1016/j.jbi.2014.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/07/2014] [Revised: 05/12/2014] [Accepted: 05/13/2014] [Indexed: 12/20/2022]
Affiliation(s)
- John H Holmes: Associate Professor of Medical Informatics in Epidemiology, Department of Biostatistics and Epidemiology, University of Pennsylvania, Perelman School of Medicine, United States.
|
27
|
Rudd J, Moore JH, Urbanowicz RJ. A Multi-Core Parallelization Strategy for Statistical Significance Testing in Learning Classifier Systems. Evolutionary Intelligence 2013; 6. [PMID: 24358057 DOI: 10.1007/s12065-013-0092-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
Abstract
Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative to large and complex real world applications such as genetic epidemiology where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own. The compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our Python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear.
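The strategy described above, running independent permutation analyses as separate processes, can be sketched with the Python standard library. This is an illustration under stated assumptions and not the authors' implementation: a cheap accuracy calculation stands in for a full LCS run with cross validation, and the names `accuracy`, `permuted_statistic`, and `permutation_p_value` are hypothetical.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


def permuted_statistic(job):
    """One independent run: shuffle the labels with its own seed and rescore.

    In the setting of the abstract this would be a full LCS run on the
    permuted data; the accuracy call is a stand-in for that run.
    """
    seed, labels, predictions = job
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return accuracy(shuffled, predictions)


def permutation_p_value(labels, predictions, n_perm=200, workers=None):
    """Estimate a p-value from n_perm independent label permutations."""
    observed = accuracy(labels, predictions)
    jobs = [(seed, labels, predictions) for seed in range(n_perm)]
    if workers == 1:
        # Serial fallback, useful for debugging or single-core machines.
        null = [permuted_statistic(job) for job in jobs]
    else:
        # Each job is independent, so the runs can be spread across processes.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            null = list(pool.map(permuted_statistic, jobs))
    exceed = sum(stat >= observed for stat in null)
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    labels = [0, 1] * 20
    predictions = list(labels)  # a perfect classifier, for illustration
    print(permutation_p_value(labels, predictions, n_perm=200, workers=2))
```

Keeping `workers` at or below the number of CPU cores matches the condition under which the abstract reports approximately linear speedup.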
Affiliation(s)
- James Rudd: Dartmouth College, 1 Medical Center Dr., Lebanon, NH 03755, USA
- Jason H Moore: Dartmouth College, 1 Medical Center Dr., Lebanon, NH 03755, USA
|