1
Jurisica I. Explainable biology for improved therapies in precision medicine: AI is not enough. Best Pract Res Clin Rheumatol 2024; 38:102006. [PMID: 39332994 DOI: 10.1016/j.berh.2024.102006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Revised: 09/18/2024] [Accepted: 09/18/2024] [Indexed: 09/29/2024]
Abstract
Technological advances and high-throughput biochemical assays are rapidly changing how we formulate and test biological hypotheses, and how we treat patients. Most complex diseases arise on a background of genetic, lifestyle and environmental factors, and manifest themselves as a spectrum of symptoms. To fathom intricate biological processes and their changes from healthy to disease states, we need to systematically integrate and analyze multi-omics datasets, ontologies, and diverse annotations. Without proper management of such complex biological and clinical data, artificial intelligence (AI) algorithms alone cannot be effectively trained, validated, and successfully applied to provide trustworthy and patient-centric diagnosis, prognosis and treatment. Precision medicine requires effective use of multi-omics approaches, and offers many opportunities for using AI, "big data" analytics, and integrative computational biology workflows. Advances in optical and biochemical assay technologies, including sequencing, mass spectrometry and imaging modalities, have transformed research by empowering us to simultaneously view all genes expressed, identify proteome-wide changes, and assess interacting partners of each individual protein within a dynamically changing biological system, at an individual cell level. While such views are already having an impact on our understanding of healthy and disease conditions, it remains challenging to extract useful information comprehensively and systematically from individual studies, ensure that signal is separated from noise, develop models, and provide hypotheses for further research. Data remain incomplete and are often poorly connected using fragmented biological networks. In addition, statistical and machine learning models are developed at a cohort level and often not validated at the individual patient level.
Combining integrative computational biology and AI has the potential to improve understanding and treatment of diseases by identifying biomarkers and building explainable models characterizing individual patients. From systematic data analysis to more specific diagnostic, prognostic and predictive biomarkers, drug mechanism of action, and patient selection, such analyses influence multiple steps from prevention to disease characterization, and from prognosis to drug discovery. Data mining, machine learning, graph theory and advanced visualization may help identify diagnostic, prognostic and predictive biomarkers, and create causal models of disease. Intertwining computational prediction and modeling with biological experiments leads to faster, more biologically and clinically relevant discoveries. However, the results and models of computational analysis will only be as accurate and useful as the networks, ontologies and datasets used to build them are correct and comprehensive. High-quality, curated data portals provide the necessary foundation for translational research. They help to identify better biomarkers, new drugs, and precision treatments, and should lead to improved patient outcomes and quality of life. Intertwining computational prediction and modeling with biological experiments efficiently and effectively leads to more useful findings faster.
Affiliation(s)
- I Jurisica
- Division of Orthopaedics, Osteoarthritis Research Program, Schroeder Arthritis Institute, and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON, M5T 0S8, Canada; Departments of Medical Biophysics and Computer Science, and Faculty of Dentistry, University of Toronto, Toronto, ON, Canada; Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia.
2
Mandilaras V, Garg S, Cabanero M, Tan Q, Pastrello C, Burnier J, Karakasis K, Wang L, Dhani NC, Butler MO, Bedard PL, Siu LL, Clarke B, Shaw PA, Stockley T, Jurisica I, Oza AM, Lheureux S. TP53 mutations in high grade serous ovarian cancer and impact on clinical outcomes: a comparison of next generation sequencing and bioinformatics analyses. Int J Gynecol Cancer 2019; 29:346-352. [PMID: 30659026 DOI: 10.1136/ijgc-2018-000087] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/09/2018] [Revised: 11/25/2018] [Accepted: 11/29/2018] [Indexed: 01/08/2023]
Abstract
OBJECTIVE Mutations in TP53 are found in the majority of high grade serous ovarian cancers, leading to gain of function or loss of function of its protein product, p53, involved in oncogenesis. There have been conflicting reports as to the impact of the type of these mutations on prognosis. We aim to further elucidate this relationship in our cohort of patients. METHODS 229 patients with high grade serous ovarian cancer underwent tumor profiling through an institutional molecular screening program with targeted next generation sequencing. TP53 mutations were classified using methods previously described in the literature. Immunohistochemistry on formalin-fixed paraffin embedded tissue was used to assess for TP53 mutation. Using divisive hierarchical clustering, we generated patient clusters with similar clinicopathologic characteristics to investigate differences in outcomes. RESULTS Six different classification schemes of TP53 mutations were studied. These did not show an association with first platinum-free interval or overall survival. Next generation sequencing reliably predicted mutation in 80% of cases, similar to the proportion detected by immunohistochemistry. Divisive hierarchical clustering generated four main clusters, with cluster 3 having a significantly worse prognosis (p<0.0001; log-rank test). This cluster had a higher concentration of gain of function mutations and these patients were less likely to have undergone optimal debulking surgery. CONCLUSIONS Different classifications of TP53 mutations did not show an impact on outcomes in this study. Immunohistochemistry was a good predictor for TP53 mutation. Cluster analysis showed that a subgroup of patients with gain of function mutations (cluster 3) had a worse prognosis.
Affiliation(s)
- Victoria Mandilaras
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Swati Garg
- Advanced Molecular Diagnostics Laboratory, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Michael Cabanero
- Department of Laboratory Medicine and Pathology, University Health Network, Toronto, Ontario, Canada
- Qian Tan
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Chiara Pastrello
- Krembil Research Institute, University Health Network, Toronto, Ontario, Canada
- Julia Burnier
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Katherine Karakasis
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Lisa Wang
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Neesha C Dhani
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Marcus O Butler
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Philippe L Bedard
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Lillian L Siu
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Blaise Clarke
- Department of Laboratory Medicine and Pathology, University Health Network, Toronto, Ontario, Canada
- Patricia Ann Shaw
- Department of Laboratory Medicine and Pathology, University Health Network, Toronto, Ontario, Canada
- Tracy Stockley
- Advanced Molecular Diagnostics Laboratory, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Igor Jurisica
- Krembil Research Institute, University Health Network, Toronto, Ontario, Canada
- Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, Ontario, Canada
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
- Amit M Oza
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
- Stephanie Lheureux
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
3
Berrouiguet S, Perez-Rodriguez MM, Larsen M, Baca-García E, Courtet P, Oquendo M. From eHealth to iHealth: Transition to Participatory and Personalized Medicine in Mental Health. J Med Internet Res 2018; 20:e2. [PMID: 29298748 PMCID: PMC5772066 DOI: 10.2196/jmir.7412] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 04/11/2017] [Revised: 08/08/2017] [Accepted: 09/13/2017] [Indexed: 11/13/2022]
Abstract
Clinical assessment in psychiatry is commonly based on findings from brief, regularly scheduled in-person appointments. Although critically important, this approach reduces assessment to cross-sectional observations that miss essential information about disease course. The mental health provider makes all medical decisions based on this limited information. Thanks to recent technological advances such as mobile phones and other personal devices, electronic health (eHealth) data collection strategies now can provide access to real-time patient self-report data during the interval between visits. Since mobile phones are generally kept on at all times and carried everywhere, they are an ideal platform for the broad implementation of ecological momentary assessment technology. Integration of these tools into medical practice has heralded the eHealth era. Intelligent health (iHealth) further builds on and expands eHealth by adding novel built-in data analysis approaches based on (1) incorporation of new technologies into clinical practice to enhance real-time self-monitoring, (2) extension of assessment to the patient's environment including caregivers, and (3) data processing using data mining to support medical decision making and personalized medicine. This will shift mental health care from a reactive to a proactive and personalized discipline.
Affiliation(s)
- Sofian Berrouiguet
- Lab-STICC, IMT Atlantique, Université Bretagne Loire, Brest, France
- Laboratoire Soins primaires, Santé publique, Registre des cancers de Bretagne Occidentale SPURBO, Equipe d'accueil 7479, Brest, France
- Mark Larsen
- Black Dog Institute, University of New South Wales, Sydney, Australia
- Enrique Baca-García
- Department of Psychiatry, Fundación Jimenez Diaz Hospital, Autónoma University, Centro de Investigacion en Red Salud Mental, Madrid, Spain
- Philippe Courtet
- Department of Emergency Psychiatry, University Hospital of Montpellier, University of Montpellier, Montpellier, France
- Maria Oquendo
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
4
O'Hagan S, Kell DB. Analysing and Navigating Natural Products Space for Generating Small, Diverse, But Representative Chemical Libraries. Biotechnol J 2017; 13. [PMID: 29168302 DOI: 10.1002/biot.201700503] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/26/2017] [Revised: 11/09/2017] [Indexed: 01/01/2023]
Abstract
Armed with the digital availability of two natural products libraries, amounting to some 195 885 molecular entities, we ask the question of how we can best sample from them to maximize their "representativeness" in smaller and more usable libraries of 96, 384, 1152, and 1920 molecules. The term "representativeness" is intended to include diversity, but for numerical reasons (and the likelihood of being able to perform a QSAR) it is necessary to focus on areas of chemical space that are more highly populated. Encoding chemical structures as fingerprints using the RDKit "patterned" algorithm, we first assess the granularity of the natural products space using a simple clustering algorithm, showing that there are major regions of "denseness" but also a great many very sparsely populated areas. We then apply a "hybrid" hierarchical K-means clustering algorithm to the data to produce more statistically robust clusters from which representative and appropriate numbers of samples may be chosen. There is necessarily again a trade-off between cluster size and cluster number, but within these constraints, libraries containing 384 or 1152 molecules can be found that come from clusters that represent some 18 and 30% of the whole chemical space, with cluster sizes of, respectively, 50 and 27 or above, just about sufficient to perform a QSAR. By using the online availability of molecules via the Molport system (www.molport.com), we are also able to construct (and, for the first time, provide the contents of) a small virtual library of available molecules that provided effective coverage of the chemical space described. Consistent with this, the average molecular similarities of the contents of the libraries developed is considerably smaller than is that of the original libraries. The suggested libraries may have use in molecular or phenotypic screening, including for determining possible transporter substrates.
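The sampling idea in this abstract can be illustrated with a minimal sketch: Tanimoto similarity between fingerprints, followed by one representative per sufficiently large cluster. Everything concrete here is an assumption for illustration. Fingerprints are modeled as plain sets of on-bit indices rather than RDKit objects, the cluster assignments are given by hand rather than produced by the paper's hybrid hierarchical K-means procedure, and the medoid rule for choosing a representative is one plausible choice, not necessarily the authors'.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity of two bit-set fingerprints."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def medoid(members: List[str], fps: Dict[str, Fingerprint]) -> str:
    """Return the member with the highest total similarity to its cluster."""
    return max(members, key=lambda m: sum(tanimoto(fps[m], fps[o]) for o in members))


def pick_library(clusters: Dict[int, List[str]],
                 fps: Dict[str, Fingerprint],
                 min_size: int) -> List[str]:
    """Pick one representative from each cluster with at least min_size members."""
    return [medoid(ms, fps) for _, ms in sorted(clusters.items()) if len(ms) >= min_size]


# Hypothetical toy data: four molecules in two hand-assigned clusters.
fps = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3}),
    "mol_c": frozenset({1, 2, 3, 5}),
    "mol_d": frozenset({10, 11}),
}
clusters = {0: ["mol_a", "mol_b", "mol_c"], 1: ["mol_d"]}
library = pick_library(clusters, fps, min_size=2)
```

The `min_size` filter mirrors the abstract's point that only clusters above a minimum size are populated enough to support a QSAR; the singleton cluster is skipped.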
Affiliation(s)
- Steve O'Hagan
- School of Chemistry, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- The Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Douglas B Kell
- School of Chemistry, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- The Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Centre for the Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
5
6
Noderer WL, Flockhart RJ, Bhaduri A, Diaz de Arce AJ, Zhang J, Khavari PA, Wang CL. Quantitative analysis of mammalian translation initiation sites by FACS-seq. Mol Syst Biol 2014; 10:748. [PMID: 25170020 PMCID: PMC4299517 DOI: 10.15252/msb.20145136] [Citation(s) in RCA: 137] [Impact Index Per Article: 12.5] [Indexed: 01/03/2023]
Abstract
An approach combining fluorescence-activated cell sorting and high-throughput DNA sequencing (FACS-seq) was employed to determine the efficiency of start codon recognition for all possible translation initiation sites (TIS) utilizing AUG start codons. Using FACS-seq, we measured translation from a genetic reporter library representing all 65,536 possible TIS sequences spanning the −6 to +5 positions. We found that the motif RYMRMVAUGGC enhanced start codon recognition and translation efficiency. However, dinucleotide interactions, which cannot be conveyed by a single motif, were also important for modeling TIS efficiency. Our dataset combined with modeling allowed us to predict genome-wide translation initiation efficiency for all mRNA transcripts. Additionally, we screened somatic TIS mutations associated with tumorigenesis to identify candidate driver mutations consistent with known tumor expression patterns. Finally, we implemented a quantitative leaky scanning model to predict alternative initiation sites that produce truncated protein isoforms and compared predictions with ribosome footprint profiling data. The comprehensive analysis of the TIS sequence space enables quantitative predictions of translation initiation based on genome sequence.
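The library size quoted in this abstract can be checked with a short sketch. With the AUG fixed at positions +1 to +3, the −6 to +5 window leaves eight variable positions, and 4^8 = 65,536. The code below enumerates that space and counts how many sequences match the reported motif once its IUPAC codes (R = A/G, Y = C/U, M = A/C, V = A/C/G) are expanded. The function names are hypothetical, and this only illustrates the sequence space, not the FACS-seq measurement or the authors' model.

```python
from itertools import product
import re

BASES = "ACGU"

# IUPAC nucleotide codes that appear in the motif RYMRMVAUGGC.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "AG", "Y": "CU", "M": "AC", "V": "ACG"}


def all_tis_sequences():
    """Yield every -6..+5 sequence with AUG fixed at positions +1..+3."""
    for variable in product(BASES, repeat=8):  # 6 upstream + 2 downstream bases
        upstream = "".join(variable[:6])
        downstream = "".join(variable[6:])
        yield upstream + "AUG" + downstream


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif into a compiled regular expression."""
    return re.compile("".join(f"[{IUPAC[c]}]" for c in motif))


library = list(all_tis_sequences())
pattern = motif_to_regex("RYMRMVAUGGC")
matches = [s for s in library if pattern.fullmatch(s)]
```

Because a single degenerate motif treats each position independently, it cannot express the dinucleotide interactions the abstract mentions; capturing those would need pairwise features on top of a per-position model.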
Affiliation(s)
- William L Noderer
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
- Ross J Flockhart
- The Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA
- Aparna Bhaduri
- The Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA
- The Program in Cancer Biology, Stanford University School of Medicine, Stanford, CA, USA
- Jiajing Zhang
- The Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA
- Paul A Khavari
- The Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA
- Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
- Clifford L Wang
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
7
Holzinger A, Dehmer M, Jurisica I. Knowledge Discovery and interactive Data Mining in Bioinformatics--State-of-the-Art, future challenges and research directions. BMC Bioinformatics 2014; 15 Suppl 6:I1. [PMID: 25078282 PMCID: PMC4140208 DOI: 10.1186/1471-2105-15-s6-i1] [Citation(s) in RCA: 134] [Impact Index Per Article: 12.2] [Indexed: 01/28/2023]
Affiliation(s)
- Andreas Holzinger
- Research Unit Human-Computer Interaction, Austrian IBM Watson Think Group, Institute for Medical Informatics, Statistics & Documentation, Medical University Graz, Austria
- Institute of Information Systems and Computer Media, Graz University of Technology, Austria
- Matthias Dehmer
- Institute for Bioinformatics and Translational Research, UMIT Tyrol, Austria
- Igor Jurisica
- Departments of Medical Biophysics and Computer Science, University of Toronto, Ontario, Canada
- Princess Margaret Cancer Centre and Techna Institute for the Advancement of Technology for Health, University Health Network, IBM Life Sciences Discovery Centre, Ontario, Canada
8
Holzinger A, Zupan M. KNODWAT: a scientific framework application for testing knowledge discovery methods for the biomedical domain. BMC Bioinformatics 2013; 14:191. [PMID: 23763826 PMCID: PMC3691758 DOI: 10.1186/1471-2105-14-191] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 02/28/2013] [Accepted: 05/31/2013] [Indexed: 12/05/2022]
Abstract
Background Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods to assist professional end users in the field of Knowledge Discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a major challenge. However, there are so many diverse methods and methodologies available that for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. Results A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed, using Java on Spring framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as the MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used as well as jQuery for interactive user interface operations. Conclusions The framework presented is user-centric, highly extensible and flexible. Since it allows methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers, new to the field of knowledge discovery and data mining. For testing purposes two algorithms, CART and C4.5, were implemented using the WEKA data mining framework.
Affiliation(s)
- Andreas Holzinger
- Research Unit Human-Computer Interaction (HCI4MED), Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Auenbruggerplatz 2/V, Graz 8036, Austria.
9
Sex-specific differences in placental global gene expression in pregnancies complicated by asthma. Placenta 2011; 32:570-8. [DOI: 10.1016/j.placenta.2011.05.005] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.9] [Received: 11/21/2010] [Revised: 05/08/2011] [Accepted: 05/13/2011] [Indexed: 11/22/2022]
10
Hawrylycz M, Ng L, Page D, Morris J, Lau C, Faber S, Faber V, Sunkin S, Menon V, Lein E, Jones A. Multi-scale correlation structure of gene expression in the brain. Neural Netw 2011; 24:933-42. [PMID: 21764550 DOI: 10.1016/j.neunet.2011.06.012] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 12/23/2010] [Revised: 06/16/2011] [Accepted: 06/16/2011] [Indexed: 01/25/2023]
Abstract
The mammalian brain is best understood as a multi-scale hierarchical neural system, in the sense that connection and function occur on multiple scales from micro to macro. Modern genomic-scale expression profiling can provide insight into methodologies that elucidate this architecture. We present a methodology for understanding the relationship of gene expression and neuroanatomy based on correlation between gene expression profiles across tissue samples. A resulting tool, NeuroBlast, can identify networks of genes co-expressed within or across neuroanatomic structures. The method applies to any data modality that can be mapped with sufficient spatial resolution, and provides a computation technique to elucidate neuroanatomy via patterns of gene expression on spatial and temporal scales. In addition, from the perspective of spatial location, we discuss a complementary technique that identifies gene classes that contribute to defining anatomic patterns.
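The correlation-based search described in this abstract can be sketched in a few lines: given a seed gene, rank every other gene by the Pearson correlation of its expression profile across tissue samples. This is a generic illustration under stated assumptions, not NeuroBlast itself. The gene names and expression values are made up, and the abstract does not say which correlation measure or normalization the tool uses.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has zero variance; report it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def rank_coexpressed(seed: str, profiles: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank all other genes by correlation with the seed gene, best first."""
    scores = [(g, pearson(profiles[seed], p)) for g, p in profiles.items() if g != seed]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical expression values across five tissue samples.
profiles = {
    "gene_seed": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_up": [2.0, 4.0, 6.0, 8.0, 10.0],
    "gene_down": [5.0, 4.0, 3.0, 2.0, 1.0],
    "gene_flat": [3.0, 3.0, 3.0, 3.0, 3.0],
}
ranking = rank_coexpressed("gene_seed", profiles)
```

Restricting the sample columns to one anatomic structure, or pooling several, would give the "within or across structures" variants the abstract mentions.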
Affiliation(s)
- Mike Hawrylycz
- Allen Institute for Brain Science, 551 N. 34th Street, Seattle, WA 98103, USA.
11
12
Daiba A, Ito S, Takeuchi T, Yohda M. Gene expression informatics with an automatic histogram-type membership function for non-uniform data. Chem-Bio Informatics Journal 2010. [DOI: 10.1273/cbij.10.13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Akito Daiba
- Department of Biotechnology and Life Science, Graduate School of Technology, Tokyo University of Agriculture and Technology
- AP Solutions Consulting, Accelrys K.K
- Satoru Ito
- Scientific Information Department, Fujirebio Inc
- Tsutomu Takeuchi
- Division of Rheumatology/Clinical Immunology, Department of Internal Medicine, School of Medicine, Keio University
- Masafumi Yohda
- Department of Biotechnology and Life Science, Graduate School of Technology, Tokyo University of Agriculture and Technology
13
Chang YI, Chen JR, Tsai YC. Mining subspace clusters from DNA microarray data using large itemset techniques. J Comput Biol 2009; 16:745-68. [PMID: 19432542 DOI: 10.1089/cmb.2008.0161] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
Mining subspace clusters from DNA microarrays could help researchers identify genes that commonly contribute to a disease, where a subspace cluster indicates a subset of genes whose expression levels are similar under a subset of conditions. Since in a DNA microarray the number of genes is far larger than the number of conditions, previously proposed algorithms that compute the maximum dimension sets (MDSs) for any two genes take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for any two genes, we construct MDSs only for any two conditions. Then, we transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are only interested in subspace clusters with gene sets as large as possible, it is desirable to pay attention to gene sets that have reasonably large support values in the condition-pair MDSs. Our simulation results show that the proposed algorithm needs less processing time than previously proposed algorithms that construct gene-pair MDSs.
Affiliation(s)
- Ye-In Chang
- Department of Computer Science and Engineering, National Sun Yat-Sen University, Taiwan, R.O.C.
14
Goswami RS, Sukhai MA, Thomas M, Reis PP, Kamel-Reid S. Applications of microarray technology to Acute Myelogenous Leukemia. Cancer Inform 2008; 7:13-28. [PMID: 19352456 PMCID: PMC2664704 DOI: 10.4137/cin.s1015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 12/16/2022]
Abstract
Microarray technology is a powerful tool, which has been applied to further the understanding of gene expression changes in disease. Array technology has been applied to the diagnosis and prognosis of Acute Myelogenous Leukemia (AML). Arrays have also been used extensively in elucidating the mechanism of and predicting therapeutic response in AML, as well as to further define the mechanism of AML pathogenesis. In this review, we discuss the major paradigms of gene expression array analysis, and provide insights into the use of software tools to annotate the array dataset and elucidate deregulated pathways and gene interaction networks. We present the application of gene expression array technology to questions in acute myelogenous leukemia; specifically, disease diagnosis, treatment and prognosis, and disease pathogenesis. Finally, we discuss several new and emerging array technologies, and how they can be further utilized to improve our understanding of AML.
Affiliation(s)
- Rashmi S Goswami
- Division of Applied Molecular Oncology, Princess Margaret Hospital/Ontario Cancer Institute, University Health Network, Toronto, ON, Canada
15
Tone AA, Begley H, Sharma M, Murphy J, Rosen B, Brown TJ, Shaw PA. Gene expression profiles of luteal phase fallopian tube epithelium from BRCA mutation carriers resemble high-grade serous carcinoma. Clin Cancer Res 2008; 14:4067-78. [PMID: 18593983 DOI: 10.1158/1078-0432.ccr-07-4959] [Citation(s) in RCA: 118] [Impact Index Per Article: 6.9] [Indexed: 11/16/2022]
Abstract
PURPOSE To identify molecular alterations potentially involved in predisposition to adnexal serous carcinoma (SerCa) in the nonmalignant fallopian tube epithelium (FTE) of BRCA1/2 mutation carriers, given recent evidence implicating the distal FTE as a common source for SerCa. EXPERIMENTAL DESIGN We obtained and compared gene expression profiles of laser capture microdissected nonmalignant distal FTE from 12 known BRCA1/2 mutation carriers (FTEb) and 12 control women (FTEn) during the luteal and follicular phase, as well as 13 high-grade tubal and ovarian SerCa. RESULTS Gene expression profiles of tubal and ovarian SerCa specimens were indistinguishable by unsupervised cluster analysis and significance analysis of microarrays. FTEb samples as a group, and four individual FTEb samples from the luteal phase in particular, clustered closely with SerCa rather than normal control FTE. Differentially expressed genes from these four samples relative to other FTEb samples, as well as differentially expressed genes in all FTEb luteal samples relative to follicular samples, were mapped to the I2D protein-protein interaction database, revealing a complex network affecting signaling pathways previously implicated in tumorigenesis. Two candidates, disabled homolog 2 mitogen-responsive phosphoprotein (DAB2) and Ski-like (SKIL), were further validated by real-time reverse transcription-PCR and tissue arrays. FTEb luteal and SerCa samples expressed higher levels of oncogenic SKIL and decreased levels of tumor suppressor DAB2, relative to FTEb follicular samples. CONCLUSIONS These findings support a common molecular pathway for adnexal SerCa and implicate factors associated with the luteal phase in predisposition to ovarian cancer in BRCA mutation carriers.
Affiliation(s)
- Alicia A Tone
- Department of Laboratory Medicine and Pathobiology, Division of Gynecologic Oncology, University of Toronto, Canada
16
Sodek KL, Evangelou AI, Ignatchenko A, Agochiya M, Brown TJ, Ringuette MJ, Jurisica I, Kislinger T. Identification of pathways associated with invasive behavior by ovarian cancer cells using multidimensional protein identification technology (MudPIT). Molecular BioSystems 2008; 4:762-73. [PMID: 18563251 DOI: 10.1039/b717542f] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Indexed: 12/13/2022]
Abstract
Proteomic profiling has emerged as a useful tool for identifying tissue alterations in disease states including malignant transformation. The aim of this study was to reveal expression profiles associated with the highly motile/invasive ovarian cancer cell phenotype. Six ovarian cancer cell lines were subjected to proteomic characterization using multidimensional protein identification technology (MudPIT), and evaluated for their motile/invasive behavior, so that these parameters could be compared. Within whole cell extracts of the ovarian cancer cells, MudPIT identified proteins that mapped to 2245 unique genes. Western blot analysis for selected proteins confirmed the expression profiles revealed by MudPIT, demonstrating the fidelity of this high-throughput analysis. Unsupervised cluster analysis partitioned the cell lines in a manner that reflected their motile/invasive capacity. A comparison of protein expression profiles between cell lines of high (group 1) versus low (group 2) motile/invasive capacity revealed 300 proteins that were differentially expressed, of which 196 proteins were significantly upregulated in group 1. Protein network and KEGG pathway analysis indicated a functional interplay between proteins upregulated in group 1 cells, with increased expression of several key members of the actin cytoskeleton, extracellular matrix (ECM) and focal adhesion pathways. These proteomic expression profiles can be utilized to distinguish highly motile, aggressive ovarian cancer cells from less invasive ones, and could prove to be essential in the development of more effective strategies that target pivotal cell signaling pathways used by cancer cells during local invasion and distant metastasis.
|
17
|
Lau SK, Boutros PC, Pintilie M, Blackhall FH, Zhu CQ, Strumpf D, Johnston MR, Darling G, Keshavjee S, Waddell TK, Liu N, Lau D, Penn LZ, Shepherd FA, Jurisica I, Der SD, Tsao MS. Three-gene prognostic classifier for early-stage non small-cell lung cancer. J Clin Oncol 2007; 25:5562-9. [PMID: 18065728 DOI: 10.1200/jco.2007.12.0352] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.1] [Indexed: 11/20/2022] Open
Abstract
PURPOSE Several microarray studies have reported gene expression signatures that classify non-small-cell lung carcinoma (NSCLC) patients into different prognostic groups. However, the prognostic gene lists reported to date overlap poorly across studies, and few have been validated independently using more quantitative assay methods. PATIENTS AND METHODS The expression of 158 putative prognostic genes identified in previous microarray studies was analyzed by reverse transcription quantitative polymerase chain reaction in the tumors of 147 NSCLC patients. Concordance indices and risk scores were used to identify a stage-independent set of genes that could classify patients with significantly different prognoses. RESULTS We have identified a three-gene classifier (STX1A, HIF1A, and CCR7) for overall survival (hazard ratio = 3.8; 95% CI, 1.7 to 8.2; P < .001). The classifier was also able to stratify stage I and II patients and further improved the predictive ability of clinical factors such as histology and tumor stage. The predictive value of this three-gene classifier was validated in two large independent microarray data sets from Harvard and Duke Universities. CONCLUSION We have identified a new three-gene classifier that is independent of and improves on stage to stratify early-stage NSCLC patients with significantly different prognoses. This classifier may be tested further for its potential value to improve the selection of resected NSCLC patients in adjuvant therapy.
Affiliation(s)
- Suzanne K Lau
- Princess Margaret Hospital, 610 University Ave, Toronto, Ontario, Canada
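The abstract above names concordance indices as one of the tools used to rank candidate prognostic genes. As a rough illustration only, the following Python sketch computes Harrell's concordance index for right-censored survival data. The function name, the tie handling (a tied risk score counts as half concordant), and the toy numbers are assumptions made here; none of them is taken from the study.

```python
import math


def concordance_index(times, events, risk):
    """Harrell's C: share of comparable pairs in which the higher-risk
    patient has the shorter observed survival time.

    times  -- follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    risk   -- risk scores, higher meaning worse expected outcome
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair (i, j) is comparable when i had an observed event
            # strictly before j's follow-up time.
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # assumed convention for tied scores
    return concordant / comparable if comparable else math.nan


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risk = [0.9, 0.7, 0.5, 0.1]
    print(concordance_index(times, events, risk))
```

A value near 1 means the risk score orders patients almost perfectly by survival, 0.5 is what a random score would give, and values below 0.5 indicate an inverted ordering.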
|
18
|
Yoon S, De Micheli G. An application of zero-suppressed binary decision diagrams to clustering analysis of DNA microarray data. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2007; 2004:2925-8. [PMID: 17270890 DOI: 10.1109/iembs.2004.1403831] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
Clustering has been one of the most popular techniques to analyze gene expression data. The biclustering method is two-dimensional clustering of genes and experimental conditions to identify a group of genes that display a coherent behavior in some conditions. Although this method may provide additional insight overlooked by traditional clustering techniques, it is often computationally expensive to perform biclustering on practical gene expression data. In this work, we propose a novel biclustering technique that exploits the zero-suppressed binary decision diagrams (ZBDDs) to cope with such a computational challenge. The ZBDDs are a variant of the reduced ordered binary decision diagrams that have found a widespread use in optimization and verification of VLSI digital circuits. Our experimental results demonstrate that the ZBDDs can indeed extend the scalability of our biclustering algorithm substantially, thus enabling us to apply it to a wider spectrum of gene expression data.
|
19
|
Shi W, Bastianutto C, Li A, Perez-Ordonez B, Ng R, Chow KY, Zhang W, Jurisica I, Lo KW, Bayley A, Kim J, O'Sullivan B, Siu L, Chen E, Liu FF. Multiple dysregulated pathways in nasopharyngeal carcinoma revealed by gene expression profiling. Int J Cancer 2006; 119:2467-75. [PMID: 16858677 DOI: 10.1002/ijc.22107] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.1] [Indexed: 12/21/2022]
Abstract
Gene expression profiling was conducted using primary human nasopharyngeal carcinoma (NPC) biopsy samples to improve the understanding of the molecular pathways defining NPC and to identify novel potential therapeutic targets. RNA samples were extracted from 36 patients suspected to have NPC and hybridized onto the Affymetrix U133A chip. NPC was diagnosed in 19 patients, 11 had lymphoid hyperplasia (LH), and 6 were "normal" biopsies. Clinical stages for these NPC patients ranged from I-IV, including one M1. All NPC patients (except the M1) were treated with curative intent, which included radiotherapy alone (4 patients), or combined with chemotherapy (14 patients). Unsupervised clustering demonstrated a distinct NPC expression pattern, compared to normal biopsies. Subsequent Significance Analysis of Microarrays (SAM) derived from 14 NPC and 6 normal samples discovered 1,089 differentially regulated genes. Pathway analyses revealed novel insights into the mechanisms leading to NPC, whereby upregulation of NFkappaB2 and survivin play central roles in increasing resistance to apoptosis, and changes in integrin and WNT/beta-catenin signaling leading to uncontrolled proliferation. The role of survivin in resisting apoptosis in NPC was confirmed by RNA interference. Our data provide novel insights into the development and progression of NPC, and suggest survivin as a novel therapeutic target for NPC.
Affiliation(s)
- Wei Shi
- Division of Applied Molecular Oncology, Ontario Cancer Institute, Toronto, Canada
|
20
|
Kusumoto H, Takefuji Y. O(log₂ M) self-organizing map algorithm without learning of neighborhood vectors. IEEE TRANSACTIONS ON NEURAL NETWORKS 2006; 17:1656-61. [PMID: 17131681 DOI: 10.1109/tnn.2006.882370] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]
Abstract
In this letter, a new self-organizing map (SOM) algorithm with computational cost O(log₂ M) is proposed, where M² is the size of a feature map. The first SOM algorithm, with cost O(M²), was originally proposed by Kohonen. The proposed algorithm is composed of the subdividing method and the binary search method. The proposed algorithm does not need neighborhood functions, so it eliminates the computational cost of learning neighborhood vectors and the labor of adjusting the parameters of neighborhood functions. The effectiveness of the proposed algorithm was examined by an analysis of codon frequencies of Escherichia coli (E. coli) K12 genes. This drastic computational reduction, together with an accessible application that requires no adjustment of the neighborhood function, could contribute to many scientific areas.
|
21
|
|
22
|
Motamed-Khorasani A, Jurisica I, Letarte M, Shaw PA, Parkes RK, Zhang X, Evangelou A, Rosen B, Murphy KJ, Brown TJ. Differentially androgen-modulated genes in ovarian epithelial cells from BRCA mutation carriers and control patients predict ovarian cancer survival and disease progression. Oncogene 2006; 26:198-214. [PMID: 16832351 DOI: 10.1038/sj.onc.1209773] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Indexed: 11/09/2022]
Abstract
Epidemiological studies have implicated androgens in the etiology and progression of epithelial ovarian cancer. We previously reported that some androgen responses were dysregulated in malignant ovarian epithelial cells relative to control, non-malignant ovarian surface epithelial (OSE) cells. Moreover, dysregulated androgen responses were observed in OSE cells derived from patients with germline BRCA-1 or -2 mutations (OSEb), which account for the majority of familial ovarian cancer predisposition, and such altered responses may be involved in ovarian carcinogenesis or progression. In the present study, gene expression profiling using cDNA microarrays identified 17 genes differentially expressed in response to continuous androgen exposure in OSEb cells and ovarian cancer cells as compared to OSE cells derived from control patients. A subset of these differentially affected genes was selected and verified by quantitative real-time reverse transcription-polymerase chain reaction. Six of the gene products mapped to the OPHID protein-protein interaction database, and five were networked within two interacting partners. Basic leucine zipper transcription factor 2 (BACH2) and acetylcholinesterase (ACHE), which were upregulated by androgen in OSEb cells relative to OSE cells, were further investigated using an ovarian cancer tissue microarray from a separate set of 149 clinical samples. Both cytoplasmic ACHE and BACH2 immunostaining were significantly increased in ovarian cancer relative to benign cases. High levels of cytoplasmic ACHE staining correlated with decreased survival, whereas nuclear BACH2 staining correlated with decreased time to disease recurrence. The finding that products of genes differentially responsive to androgen in OSEb cells may predict survival and disease progression supports a role for altered androgen effects in ovarian cancer. 
In addition to BACH2 and ACHE, this study highlights a set of potentially functionally related genes for further investigation in ovarian cancer.
MESH Headings
- Acetylcholinesterase/genetics
- Acetylcholinesterase/metabolism
- Adult
- Aged
- Aged, 80 and over
- Androgens/pharmacology
- BRCA1 Protein/genetics
- Basic-Leucine Zipper Transcription Factors/genetics
- Basic-Leucine Zipper Transcription Factors/metabolism
- Carcinoma, Endometrioid/genetics
- Carcinoma, Endometrioid/metabolism
- Carcinoma, Papillary/genetics
- Carcinoma, Papillary/metabolism
- Cells, Cultured
- Cystadenocarcinoma, Serous/genetics
- Cystadenocarcinoma, Serous/metabolism
- Disease Progression
- Epithelial Cells/metabolism
- Female
- Flow Cytometry
- Gene Expression Profiling
- Gene Expression Regulation, Neoplastic
- Humans
- Immunoenzyme Techniques
- Leucine Zippers
- Middle Aged
- Mutation
- Oligonucleotide Array Sequence Analysis
- Ovarian Neoplasms/genetics
- Ovarian Neoplasms/metabolism
- Ovarian Neoplasms/mortality
- Ovary/metabolism
- Ovary/pathology
- RNA, Messenger/analysis
- RNA, Messenger/genetics
- RNA, Neoplasm/analysis
- Reverse Transcriptase Polymerase Chain Reaction
- Survival Rate
- Tissue Array Analysis
Affiliation(s)
- A Motamed-Khorasani
- The Samuel Lunenfeld Research Institute, Mt Sinai Hospital, Toronto, Ontario, Canada
|
23
|
Brierley MM, Marchington KL, Jurisica I, Fish EN. Identification of GAS-dependent interferon-sensitive target genes whose transcription is STAT2-dependent but ISGF3-independent. FEBS J 2006; 273:1569-81. [PMID: 16689942 DOI: 10.1111/j.1742-4658.2006.05176.x] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023]
Abstract
Signal transducer and activator of transcription 2 (STAT2) is best known as a critical transactivator component of the interferon-stimulated gene factor 3 (ISGF3) complex that drives the expression of many interferon (IFN)-inducible genes. However, STAT2 is also involved in DNA binding in non-ISGF3 transcriptional complexes. We used a DNA microarray to survey the expression of genes regulated by IFN-inducible, STAT2-dependent DNA binding, and compared the cDNAs of IFN-treated cells overexpressing intact STAT2 to those of IFN-treated cells overexpressing mutated STAT2 lacking the DNA binding domain. The IFN-inducible expression of genes known to be regulated by ISGF3 was similar in both cases. However, a subset of IFN-inducible genes was identified whose expression was decreased in cells expressing the mutated STAT2. Importantly, these genes all contained gamma-activated sequence (GAS)-like elements in their 5' flanking sequences. Our data reveal the existence of a collection of GAS-regulated target genes whose expression is IFN-inducible and independent of ISGF3 but highly dependent on the STAT2 DNA binding domain. This report is the first analysis of the contribution of the STAT2 DNA binding domain to IFN responses on a global basis, and shows that STAT2 is required for the IFN-inducible activation of the full spectrum of GAS target genes.
Affiliation(s)
- Melissa M Brierley
- Department of Cell and Molecular Biology, Toronto General Research Institute, University Health Network, University of Toronto, ON, Canada
|
24
|
Böcker A, Schneider G, Teckentrup A. NIPALSTREE: A New Hierarchical Clustering Approach for Large Compound Libraries and Its Application to Virtual Screening. J Chem Inf Model 2006; 46:2220-9. [PMID: 17125166 DOI: 10.1021/ci050541d] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
A hierarchical clustering algorithm--NIPALSTREE--was developed that is able to analyze large data sets in high-dimensional space. The result can be displayed as a dendrogram. At each tree level the algorithm projects a data set via principal component analysis onto one dimension. The data set is sorted according to this one dimension and split at the median position. To avoid distortion of clusters at the median position, the algorithm identifies a potentially more suited split point left or right of the median. The procedure is recursively applied on the resulting subsets until the maximal distance between cluster members exceeds a user-defined threshold. The approach was validated in a retrospective screening study for angiotensin converting enzyme (ACE) inhibitors. The resulting clusters were assessed for their purity and enrichment in actives belonging to this ligand class. Enrichment was observed in individual branches of the dendrogram. In further retrospective virtual screening studies employing the MDL Drug Data Report (MDDR), COBRA, and the SPECS catalog, NIPALSTREE was compared with the hierarchical k-means clustering approach. Results show that both algorithms can be used in the context of virtual screening. Intersecting the result lists obtained with both algorithms improved enrichment factors while losing only a few chemotypes.
Affiliation(s)
- Alexander Böcker
- Institut für Organische Chemie und Chemische Biologie, Johann Wolfgang Goethe-Universität, Marie-Curie-Strasse 11, D-60439 Frankfurt, Germany
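The splitting procedure described in the abstract above (project onto one principal component, sort, split at the median, recurse) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published NIPALSTREE implementation: the first principal component is approximated by plain power iteration, the search for a better split point to the left or right of the median is omitted, and the recursion is assumed to stop once the maximal pairwise distance within a subset no longer exceeds the threshold. All function names are hypothetical.

```python
import math
import random


def first_pc_scores(points, iters=100):
    """Scores of the points on an approximate first principal component,
    obtained by power iteration on the centered data (assumed stand-in)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    rng = random.Random(0)  # fixed seed keeps the sketch deterministic
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        # w = X^T (X v), i.e. one multiplication by the scatter matrix
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance left to follow
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def max_pairwise_distance(points):
    """Largest Euclidean distance between any two points (quadratic cost)."""
    return max(
        (math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]),
        default=0.0,
    )


def pca_median_tree(points, threshold):
    """Recursively split at the median of the first-component scores."""
    if len(points) < 2 or max_pairwise_distance(points) <= threshold:
        return {"members": points}  # leaf: subset is compact enough
    scores = first_pc_scores(points)
    order = sorted(range(len(points)), key=scores.__getitem__)
    mid = len(order) // 2
    left = [points[i] for i in order[:mid]]
    right = [points[i] for i in order[mid:]]
    return {
        "left": pca_median_tree(left, threshold),
        "right": pca_median_tree(right, threshold),
    }


def leaves(tree):
    """Collect the member lists of all leaf clusters, left to right."""
    if "members" in tree:
        return [tree["members"]]
    return leaves(tree["left"]) + leaves(tree["right"])


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups in two dimensions.
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(len(leaves(pca_median_tree(pts, threshold=2.0))))
```

The quadratic distance check is kept for readability; a sketch intended for large compound libraries would need a cheaper termination test.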
|
25
|
Seiden-Long IM, Brown KR, Shih W, Wigle DA, Radulovich N, Jurisica I, Tsao MS. Transcriptional targets of hepatocyte growth factor signaling and Ki-ras oncogene activation in colorectal cancer. Oncogene 2006; 25:91-102. [PMID: 16158056 DOI: 10.1038/sj.onc.1209005] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.1] [Indexed: 12/15/2022]
Abstract
Both Ki-ras mutation and hepatocyte growth factor (HGF) receptor Met overexpression occur at high frequency in colon cancer. This study investigates the transcriptional changes induced by Ki-ras oncogene and HGF/Met signaling activation in colon cancer cell lines in vitro and in vivo. The model system used in these studies included the DLD-1 colon cancer cell line with a mutated Ki-ras allele, and the DKO-4 cell line generated from DLD-1, with its mutant Ki-ras allele inactivated by targeted disruption. These cell lines were transduced with cDNAs of full-length Met receptor. Microarray transcriptional profiling was conducted on cell lines stimulated with HGF, as well as on tumor xenograft tissues. Overlapping genes between in vitro and in vivo microarray data sets were selected as a subset of HGF/Met and Ki-ras oncogene-regulated targets. Using the Online Predicted Human Interaction Database, novel HGF/Met and Ki-ras regulated proteins with putative functional linkage were identified. Novel proteins identified included histone acetyltransferase 1, phosphoribosyl pyrophosphate synthetase 2, chaperonin containing TCP1, subunit 8, CSE1 chromosome segregation 1-like (yeast)/cellular apoptosis susceptibility (mammals), CCR4-NOT transcription complex, subunit 8, and cyclin H. Transcript levels for these Met-signaling targets were correlated with Met expression levels, and were significantly elevated in both primary and metastatic human colorectal cancer samples compared to normal colorectal mucosa. These genes represent novel Met and/or Ki-ras transcriptionally coregulated genes with a high degree of validation in human colorectal cancers.
Affiliation(s)
- I M Seiden-Long
- Ontario Cancer Institute/Princess Margaret Hospital, University Health Network, University of Toronto, Toronto, Ontario, Canada
|
26
|
Abstract
The genome era provides two sources of knowledge to investigators whose goal is to discover new cancer therapies: first, information on the 20,000 to 40,000 genes that comprise the human genome, the proteins they encode, and the variation in these genes and proteins in human populations that place individuals at risk or that occur in disease; second, genome-wide analysis of cancer cells and tissues leads to the identification of new drug targets and the design of new therapeutic interventions. Using genome resources requires the storage and analysis of large amounts of diverse information on genetic variation, gene and protein functions, and interactions in regulatory processes and biochemical pathways. Cancer bioinformatics deals with organizing and analyzing the data so that important trends and patterns can be identified. Specific gene and protein targets on which cancer cells depend can be identified. Therapeutic agents directed against these targets can then be developed and evaluated. Finally, molecular and genetic variation within a population may become the basis of individualized treatment.
Affiliation(s)
- David W Mount
- Arizona Cancer Center, University of Arizona, 1515 North Campbell Avenue, P.O. Box 245024, Tucson, AZ 85724-5024, USA.
|
27
|
Choi P, Chen C. Genetic expression profiles and biologic pathway alterations in head and neck squamous cell carcinoma. Cancer 2005; 104:1113-28. [PMID: 16092115 DOI: 10.1002/cncr.21293] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.7] [Indexed: 11/08/2022]
Abstract
Head and neck squamous cell carcinoma (HNSCC) is associated with considerable mortality and morbidity and is a major public health concern worldwide. To date, > 20 studies incorporating DNA microarray analyses have examined genomewide genetic expression changes associated with the development of HNSCC. The authors identified published reports of genetic expression profiles of HNSCC by Medline database search. They performed a review of the reports to identify genes that have been found repeatedly to exhibit substantially altered expression in HNSCC. Genes with altered expression were subsequently examined in the context of defined biologic systems with the use of GenMapp 2.0 pathway analysis software. Genes most commonly found to exhibit altered expression were those encoding for cytoskeletal and extracellular matrix proteins, inflammatory mediators, proteins involved in epidermal differentiation, and cell adhesion molecules. Results of GenMapp 2.0 analysis suggested global down-regulation of genes that encode for ribosomal proteins and enzymes in the cholesterol biosynthesis pathway; and up-regulation of genes that encode for matrix metalloproteinases and genes that bear on the inflammatory response. The review indicated that there are several genes and pathways that exhibit substantially altered expression in cancerous versus noncancerous states across studies. Further investigation into the genomic, proteomic, and functional consequences of these gene expression alterations may provide insight into the pathophysiology of HNSCC.
Affiliation(s)
- Peter Choi
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, 98109, USA
|
28
|
Yoon S, Nardini C, Benini L, De Micheli G. Discovering coherent biclusters from gene expression data using zero-suppressed binary decision diagrams. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2005; 2:339-54. [PMID: 17044171 DOI: 10.1109/tcbb.2005.55] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 05/12/2023]
Abstract
The biclustering method can be a very useful analysis tool when some genes have multiple functions and experimental conditions are diverse in gene expression measurement. This is because the biclustering approach, in contrast to the conventional clustering techniques, focuses on finding a subset of the genes and a subset of the experimental conditions that together exhibit coherent behavior. However, the biclustering problem is inherently intractable, and it is often computationally costly to find biclusters with high levels of coherence. In this work, we propose a novel biclustering algorithm that exploits the zero-suppressed binary decision diagrams (ZBDDs) data structure to cope with the computational challenges. Our method can find all biclusters that satisfy specific input conditions, and it is scalable to practical gene expression data. We also present experimental results confirming the effectiveness of our approach.
Affiliation(s)
- Sungroh Yoon
- Computer Systems Laboratory, Stanford University, Room 334, William Gates Computer Science Hall, Stanford, CA 94305, USA.
|
29
|
Soleymanlou N, Jurisica I, Nevo O, Ietta F, Zhang X, Zamudio S, Post M, Caniggia I. Molecular evidence of placental hypoxia in preeclampsia. J Clin Endocrinol Metab 2005; 90:4299-308. [PMID: 15840747 PMCID: PMC6428057 DOI: 10.1210/jc.2005-0078] [Citation(s) in RCA: 288] [Impact Index Per Article: 14.4] [Indexed: 11/19/2022]
Abstract
BACKGROUND Oxygen plays a central role in human placental pathologies including preeclampsia, a leading cause of fetal and maternal death and morbidity. Insufficient uteroplacental oxygenation in preeclampsia is believed to be responsible for the molecular events leading to the clinical manifestations of this disease. DESIGN Using high-throughput functional genomics, we determined the global gene expression profiles of placentae from high altitude pregnancies, a natural in vivo model of chronic hypoxia, as well as that of first-trimester explants under 3 and 20% oxygen, an in vitro organ culture model. We next compared the genomic profile from these two models with that obtained from pregnancies complicated by preeclampsia. Microarray data were analyzed using the binary tree-structured vector quantization algorithm, which generates global gene expression maps. RESULTS Our results highlight a striking global gene expression similarity between 3% O₂-treated explants, high-altitude placentae, and importantly placentae from preeclamptic pregnancies. We demonstrate herein the utility of explant culture and high-altitude placenta as biologically relevant and powerful models for studying the oxygen-mediated events in preeclampsia. CONCLUSION Our results provide molecular evidence that aberrant global placental gene expression changes in preeclampsia may be due to reduced oxygenation and that these events can successfully be mimicked by in vivo and in vitro models of placental hypoxia.
Affiliation(s)
- Nima Soleymanlou
- Department of Obstetrics and Gynecology, Mount Sinai Hospital, Toronto, Ontario, Canada M5G 1X5
|
30
|
Böcker A, Derksen S, Schmidt E, Teckentrup A, Schneider G. A Hierarchical Clustering Approach for Large Compound Libraries. J Chem Inf Model 2005; 45:807-15. [PMID: 16045274 DOI: 10.1021/ci0500029] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Abstract
A modified version of the k-means clustering algorithm was developed that is able to analyze large compound libraries. A distance threshold determined by plotting the sum of radii of leaf clusters was used as a termination criterion for the clustering process. Hierarchical trees were constructed that can be used to obtain an overview of the data distribution and inherent cluster structure. The approach is also applicable to ligand-based virtual screening with the aim to generate preferred screening collections or focused compound libraries. Retrospective analysis of two activity classes was performed: inhibitors of caspase 1 [interleukin 1 (IL1) cleaving enzyme, ICE] and glucocorticoid receptor ligands. The MDL Drug Data Report (MDDR) and Collection of Bioactive Reference Analogues (COBRA) databases served as the compound pool, for which binary trees were produced. Molecules were encoded by all Molecular Operating Environment 2D descriptors and topological pharmacophore atom types. Individual clusters were assessed for their purity and enrichment of actives belonging to the two ligand classes. Significant enrichment was observed in individual branches of the cluster tree. After clustering a combined database of MDDR, COBRA, and the SPECS catalog, it was possible to retrieve MDDR ICE inhibitors with new scaffolds using COBRA ICE inhibitors as seeds. A Java implementation of the clustering method is available via the Internet (http://www.modlab.de).
Affiliation(s)
- Alexander Böcker
- Johann Wolfgang Goethe-Universität, Institut für Organische Chemie und Chemische Biologie, Marie-Curie-Str. 11, D-60439 Frankfurt, Germany
|
31
|
Blackhall FH, Pintilie M, Wigle DA, Jurisica I, Liu N, Radulovich N, Johnston MR, Keshavjee S, Tsao MS. Stability and heterogeneity of expression profiles in lung cancer specimens harvested following surgical resection. Neoplasia 2005; 6:761-7. [PMID: 15720802 PMCID: PMC1531680 DOI: 10.1593/neo.04301] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
One of the major concerns in microarray profiling studies of clinical samples is the effect of tissue sampling and RNA extraction on data. We analyzed gene expression in lung cancer specimens that were serially harvested from tumor mass and snap-frozen at several intervals up to 120 minutes after surgical resection. Global gene expression was profiled on cDNA microarrays, and selected stress and hypoxia-activated genes were evaluated using real-time reverse transcription polymerase chain reaction (RT-PCR). Remarkably, similar gene expression profiles were obtained for the majority of samples regardless of the time that had elapsed between resection and freezing. Real-time RT-PCR studies showed significant heterogeneity in the expression levels of stress and hypoxia-activated genes in samples obtained from different areas of a tumor specimen at one time point after resection. The variations between multiple samplings were significantly greater than those of elapsed time between sampling/freezing. Overall samples snap-frozen within 30 to 60 minutes of surgical resection are acceptable for gene expression studies, thus making sampling and snap-freezing of tumor samples in a routine surgical pathology laboratory setting feasible. However, sampling and pooling from multiple sites of each tumor may be necessary for expression profiling studies to overcome the molecular heterogeneity present in tumor specimens.
Affiliation(s)
- Fiona H Blackhall
- University Health Network, Ontario Cancer Institute/Princess Margaret Hospital, Toronto General Hospital and University of Toronto, Toronto, Ontario, Canada M5G 2M9
|
32
|
Barrios-Rodiles M, Brown KR, Ozdamar B, Bose R, Liu Z, Donovan RS, Shinjo F, Liu Y, Dembowy J, Taylor IW, Luga V, Przulj N, Robinson M, Suzuki H, Hayashizaki Y, Jurisica I, Wrana JL. High-throughput mapping of a dynamic signaling network in mammalian cells. Science 2005; 307:1621-5. [PMID: 15761153 DOI: 10.1126/science.1105776] [Citation(s) in RCA: 540] [Impact Index Per Article: 27.0] [Indexed: 12/21/2022]
Abstract
Signaling pathways transmit information through protein interaction networks that are dynamically regulated by complex extracellular cues. We developed LUMIER (for luminescence-based mammalian interactome mapping), an automated high-throughput technology, to map protein-protein interaction networks systematically in mammalian cells and applied it to the transforming growth factor-beta (TGFbeta) pathway. Analysis using self-organizing maps and k-means clustering identified links of the TGFbeta pathway to the p21-activated kinase (PAK) network, to the polarity complex, and to Occludin, a structural component of tight junctions. We show that Occludin regulates TGFbeta type I receptor localization for efficient TGFbeta-dependent dissolution of tight junctions during epithelial-to-mesenchymal transitions.
Affiliation(s)
- Miriam Barrios-Rodiles
- Program in Molecular Biology and Cancer, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada, M5G 1X5
|
33
|
Blackhall FH, Wigle DA, Jurisica I, Pintilie M, Liu N, Darling G, Johnston MR, Keshavjee S, Waddell T, Winton T, Shepherd FA, Tsao MS. Validating the prognostic value of marker genes derived from a non-small cell lung cancer microarray study. Lung Cancer 2005; 46:197-204. [PMID: 15474668 DOI: 10.1016/j.lungcan.2004.04.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 01/29/2004] [Revised: 03/18/2004] [Accepted: 04/01/2004] [Indexed: 10/26/2022]
Abstract
We previously reported that our cDNA microarray analysis of primary non-small cell lung carcinoma (NSCLC) could predict for patients at increased risk of cancer recurrence. From the result of this analysis, we selected 11 genes that were considered candidate prognostic marker genes and used the realtime reverse transcription polymerase chain reaction (RT-PCR) to investigate their expression in the same set of NSCLC cases used in the microarray study. Cluster analysis of the realtime RT-PCR data separated these patients into two groups with significantly different disease-free survivals (log-rank test, P < 0.017). In contrast, cluster analysis failed to confirm the prognostic significance of the realtime RT-PCR results for these 11 genes in a validation series of 92 NSCLC cases. In univariate analysis, hypoxia inducible factor 1alpha, Rho-GDP dissociation inhibitor (GDI) alpha (RhoGDI) and Citron/rho-interacting serine-threonine kinase 21 (Citron K21) were significant prognostic factors for disease-free survival in the entire cohort of 130 NSCLC patients, but none were significant in multivariate analysis. The results demonstrate that the prognostic significance of microarray (SAM) results can be partially validated using realtime RT-PCR, but secondary validation using larger and independent series of tumors is necessary to identify true prognostic marker genes.
Affiliation(s)
- Fiona H Blackhall
- Division of Cellular and Molecular Biology, University Health Network, Ontario Cancer Institute, Princess Margaret Hospital and University of Toronto, Toronto, Ontario, Canada M5G 2M9
|
34
|
Affiliation(s)
- C H Graham
- Department of Anatomy and Cell Biology, Queen's University, Kingston, Ontario, Canada
|
35
|
Warner GC, Reis PP, Jurisica I, Sultan M, Arora S, Macmillan C, Makitie AA, Grénman R, Reid N, Sukhai M, Freeman J, Gullane P, Irish J, Kamel-Reid S. Molecular classification of oral cancer by cDNA microarrays identifies overexpressed genes correlated with nodal metastasis. Int J Cancer 2004; 110:857-68. [PMID: 15170668 DOI: 10.1002/ijc.20197] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.3] [Indexed: 11/11/2022]
Abstract
Our purpose was to classify OSCCs based on their gene expression profiles, to identify differentially expressed genes in these cancers and to correlate genetic deregulation with clinical and histopathologic data and patient outcome. After conducting proof-of-principle experiments utilizing 6 HNSCC cell lines, the gene expression profiles of 20 OSCCs were determined using cDNA microarrays containing 19,200 sequences and the BTSVQ method of data analysis. We identified 2 sample clusters that correlated with the T3-T4 category of disease (p = 0.035) and nodal metastasis (p = 0.035). BTSVQ analysis identified a subset of 23 differentially expressed genes with the lowest QE scores in the cluster containing more advanced-stage tumors. Expression of 6 of these differentially expressed genes was validated by quantitative real-time RT-PCR. Statistical analysis of quantitative real-time RT-PCR data was performed and, after Bonferroni correction, CLDN1 overexpression was significantly correlated with the cluster containing more advanced-stage tumors (p = 0.007). Despite the clinical heterogeneity of OSCC, molecular subtyping by cDNA microarray analysis identified distinct patterns of gene expression associated with relevant clinical parameters. Application of this methodology represents an advance in the classification of oral cavity tumors and may ultimately aid in the development of more tailored therapies for oral carcinoma.
Affiliation(s)
- Giles C Warner
- Department of Cellular and Molecular Biology, Princess Margaret Hospital, Ontario Cancer Institute, University Health Network, 610 University Avenue, Toronto, Ontario M5G 2M9, Canada
|
36
|
Liu J, Blackhall F, Seiden-Long I, Jurisica I, Navab R, Liu N, Radulovich N, Wigle D, Sultan M, Hu J, Tsao MS, Johnston MR. Modeling of lung cancer by an orthotopically growing H460SM variant cell line reveals novel candidate genes for systemic metastasis. Oncogene 2004; 23:6316-24. [PMID: 15247903 DOI: 10.1038/sj.onc.1207795] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
Endobronchial implantation of NCI-H460 cells into the nude rat generates a primary lung tumor with mediastinal lymph node spread, but rarely systemic metastases. We isolated tumor cells from mediastinal nodes, orthotopically reimplanted the cells into nude rats and repeated this four times to derive a cell line, designated H460SM, that spontaneously metastasizes to bone, kidney, brain, soft tissue and contralateral lung. H460SM cells demonstrated higher invasive activity in vitro than parental NCI-H460 cells. Spectral karyotyping revealed a new inversion within 17q and loss of an extra normal copy of chromosome 14 present in parental NCI-H460 cells. Expression profiling of orthotopic primary tumors revealed differential expression of 360 genes. Of these, 173 were represented in the probe set of a 19.2K OCI cDNA microarray previously used to profile the gene expression of surgically resected lung cancer specimens. We have computationally validated clinical importance of these genes by using in silico analysis of 18 cases of pulmonary adenocarcinoma, which were split into two patient groups with markedly different clinical outcome. The model identifies additional novel candidate genes for the progression of lung cancer to systemic metastases and poor prognosis.
Affiliation(s)
- Jiang Liu
- Division of Thoracic Surgery, University Health Network, Princess Margaret Hospital and Ontario Cancer Institute, Ontario, Canada M5G 2M9
|
37
|
Nair TM, Zheng CL, Fink JL, Stuart RO, Gribskov M. Rival penalized competitive learning (RPCL): a topology-determining algorithm for analyzing gene expression data. Comput Biol Chem 2003; 27:565-74. [PMID: 14667784 DOI: 10.1016/j.compbiolchem.2003.09.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Grants] [Indexed: 01/04/2023]
Abstract
DNA arrays have become the immediate choice in the analysis of large-scale expression measurements. Understanding the expression pattern of genes provide functional information on newly identified genes by computational approaches. Gene expression pattern is an indicator of the state of the cell, and abnormal cellular states can be inferred by comparing expression profiles. Since co-regulated genes, and genes involved in a particular pathway, tend to show similar expression patterns, clustering expression patterns has become the natural method of choice to differentiate groups. However, most methods based on cluster analysis suffer from the usual problems (i) dead units, and (ii) the problem of determining the correct number of clusters (k) needed to classify the data. Selecting the k has been an open problem of pattern recognition and statistics for decades. Since clustering reveals similar patterns present in the data, fixing this number strongly influences the quality of the result. While there is no theoretical solution to this problem, the number of clusters can be decided by a heuristic clustering algorithm called rival penalized competitive learning (RPCL). We present a novel implementation of RPCL that transforms the correct number of clusters problem to the tractable problem of clustering based on the degree of similarity. This is biologically significant since our implementation clusters functionally co-regulated genes and genes that present similar patterns of expression. This new approach reveals potential genes that are co-involved in a biological process. This implementation of the RPCL algorithm is useful in differentiating groups involved in concerted functional regulation and helps to progressively home into patterns, which are closely similar.
Affiliation(s)
- T Murlidharan Nair
- San Diego Supercomputer Center, University of California at San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0537, USA.
|
38
|
Breitkreutz A, Boucher L, Breitkreutz BJ, Sultan M, Jurisica I, Tyers M. Phenotypic and Transcriptional Plasticity Directed by a Yeast Mitogen-Activated Protein Kinase Network. Genetics 2003; 165:997-1015. [PMID: 14668360 PMCID: PMC1462838 DOI: 10.1093/genetics/165.3.997] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Abstract
The yeast pheromone/filamentous growth MAPK pathway mediates both mating and invasive-growth responses. The interface between this MAPK module and the transcriptional machinery consists of a network of two MAPKs, Fus3 and Kss1; two regulators, Rst1 and Rst2 (a.k.a. Dig1 and Dig2); and two transcription factors, Ste12 and Tec1. Of 16 possible combinations of gene deletions in FUS3, KSS1, RST1, and RST2 in the Σ1278 background, 10 display constitutive invasive growth. Rst1 was the primary negative regulator of invasive growth, while other components either attenuated or enhanced invasive growth, depending on the genetic context. Despite activation of the invasive response by lesions at the same level in the MAPK pathway, transcriptional profiles of different invasive mutant combinations did not exhibit a unified program of gene expression. The distal MAPK regulatory network is thus capable of generating phenotypically similar invasive-growth states (an attractor) from different molecular architectures (trajectories) that can functionally compensate for one another. This systems-level robustness may also account for the observed diversity of signals that trigger invasive growth.
Affiliation(s)
- Ashton Breitkreutz
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
|
39
|
Usui T, Saitoh Y, Komada F. Induction of CYP3As in HepG2 cells by several drugs. Association between induction of CYP3A4 and expression of glucocorticoid receptor. Biol Pharm Bull 2003; 26:510-7. [PMID: 12673034 DOI: 10.1248/bpb.26.510] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022]
Abstract
The cytochrome P-450 3A (CYP3A) enzyme family is responsible for most of the drug metabolism in the human liver. In this study, we demonstrated the inductive effects of phenobarbital, rifampicin, carbamazepine, phenytoin, prednisolone, ciclosporin and clotrimazole on CYP3A4, CYP3A5 and CYP3A7 mRNA expression, and established the relationship between the expression of human glucocorticoid receptor alpha (hGR) mRNA and the induction of CYP3A4 mRNA in cultured HepG2 cells by reverse transcription polymerase chain reaction (RT-PCR). Treatment with prednisolone, rifampicin and carbamazepine rapidly induced the level of CYP3A4 mRNA expression by 3- to 6-fold. However, phenytoin and phenobarbital gradually induced CYP3A4 mRNA level by 3 to 4-fold. The induction of CYP3A4 mRNA expression by clotrimazole and ciclosporin was negligible. Treatment with phenytoin, rifampicin, carbamazepine and ciclosporin induced approximately 2-fold increases in the expression of CYP3A5 mRNA, although prednisolone, phenytoin and clotrimazole had no effect. Treatment with rifampicin, phenytoin, clotrimazole and ciclosporin resulted in approximately a 2-fold induction of the CYP3A7 mRNA level. Treatment with rifampicin and ciclosporin induced the expression of hGRalpha mRNA significantly in comparison with controls, although the induction of hGRalpha mRNA following treatment with other drugs was negligible. In cluster analysis, the induced level of CYP3A4, CYP3A5, CYP3A7 and hGRalpha mRNA by these drugs could be classified into four major clusters. This suggested that each cluster might be associated with different mechanism(s) of induction by these drugs. Furthermore, we studied the associations between the expression of hGRalpha mRNA and the induced level of CYP3A4 mRNA by prednisolone and ciclosporin. 
Treatment with both prednisolone and ciclosporin showed synergistic effects on induction of CYP3A4 mRNA and, following treatment with both drugs, the expression level of CYP3A4 mRNA was 2-fold greater compared with prednisolone alone after the fifth day. Positive correlations were observed between the levels of hGRalpha mRNA expression and those of CYP3A4 mRNA. This observation shows that the regulation of CYP3A4 gene expression was hGRalpha-dependent and that ciclosporin may function as a regulator of expression via hGRalpha.
Affiliation(s)
- Tatsuhiro Usui
- Department of Drug Informatics, Faculty of Pharmaceutical Sciences, Josai University, Sakado, Saitama, Japan
|