1
Iddawela M, Rueda O, Eremin J, Eremin O, Cowley J, Earl HM, Caldas C. Integrative analysis of copy number and gene expression in breast cancer using formalin-fixed paraffin-embedded core biopsy tissue: a feasibility study. BMC Genomics 2017; 18:526. [PMID: 28697743 PMCID: PMC5506605 DOI: 10.1186/s12864-017-3867-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/16/2016] [Accepted: 06/16/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND An absence of reliable molecular markers has hampered individualised breast cancer treatments, and a major limitation for translational research is the lack of fresh tissue. There are, however, abundant banks of formalin-fixed paraffin-embedded (FFPE) tissue. This study evaluated two platforms available for the analysis of DNA copy number and gene expression using FFPE samples. METHODS The cDNA-mediated annealing, selection, extension, and ligation assay (DASL™) was used for gene expression analysis and the Molecular Inversion Probes assay (Oncoscan™) was used for copy number analysis, both on FFPE tissues. Gene expression and copy number were evaluated in core-biopsy samples from patients with breast cancer undergoing neoadjuvant chemotherapy (NAC). RESULTS Forty-three core biopsies were evaluated, and the characteristic copy number changes in breast cancers (gains in 1q, 8q, 11q, 17q and 20q and losses in 6q, 8p, 13q and 16q) were confirmed. Regions that frequently exhibited gains in tumours showing a pathological complete response (pCR) to NAC were 1q (55%), 8q (40%) and 17q (40%), whereas 11q11 (37%) gain was the most frequent change in non-pCR tumours. Gains associated with poor survival were 11q13 (62%), 8q24 (54%) and 20q (47%). Gene expression assessed by DASL correlated with immunohistochemistry (IHC) analysis for oestrogen receptor (ER) [area under the curve (AUC) = 0.95], progesterone receptor (PR) (AUC = 0.90) and human epidermal growth factor type-2 receptor (HER-2) (AUC = 0.96). Differential expression analysis between ER+ and ER- cancers identified over-expression of TTF1, LAF-4 and C-MYB (p ≤ 0.05), and between pCR and non-pCR tumours, over-expression of CXCL9, AREG and B-MYB and under-expression of ABCG2. CONCLUSION This study was an integrative analysis of copy number and gene expression using FFPE core biopsies and showed that molecular marker data from FFPE tissues were consistent with those in previous studies using fresh-frozen samples. FFPE tissue can provide reliable information and will be a useful tool in molecular marker studies. TRIAL REGISTRATION Trial registration number ISRCTN09184069, registered retrospectively on 02/06/2010.
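The AUC values quoted in this abstract compare a continuous expression measurement against a binary IHC call. The abstract does not say how the AUC was computed, so the following is only a minimal sketch of one standard way to do it (the rank-based, Mann-Whitney formulation). The function name `auc` and the toy expression values are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: expression values and IHC status (1 = positive).
expression = [8.1, 7.4, 6.9, 3.2, 5.0, 7.0]
ihc_status = [1, 1, 1, 0, 0, 0]
toy_auc = auc(expression, ihc_status)
```

An AUC of 1.0 would mean the expression value separates IHC-positive from IHC-negative samples perfectly, while 0.5 corresponds to no better than chance.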
Affiliation(s)
- Mahesh Iddawela
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE UK
  - Department of Oncology, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge, UK
  - Cambridge Breast Unit, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, NIHR Cambridge Biomedical Research Centre and Cambridge Experimental Cancer Medicine Centre, Cambridge, UK
  - Department of Anatomy & Developmental Biology, Monash University, Clayton, VIC 3800 Australia
  - School of Clinical Sciences, Monash University, Clayton, Australia
- Oscar Rueda
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE UK
- Jenny Eremin
  - Research and Development, Lincoln Breast Unit, Lincoln County Hospital, Lincoln, UK
  - Nottingham Digestive Disease Centre, Faculty of Medicine and Health Sciences, University of Nottingham, Queen’s Medical Centre, Nottingham, UK
- Oleg Eremin
  - Research and Development, Lincoln Breast Unit, Lincoln County Hospital, Lincoln, UK
  - Nottingham Digestive Disease Centre, Faculty of Medicine and Health Sciences, University of Nottingham, Queen’s Medical Centre, Nottingham, UK
- Jed Cowley
  - PathLinks, Lincoln County Hospital, Lincoln, UK
- Helena M. Earl
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE UK
  - Department of Oncology, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge, UK
  - Cambridge Breast Unit, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, NIHR Cambridge Biomedical Research Centre and Cambridge Experimental Cancer Medicine Centre, Cambridge, UK
- Carlos Caldas
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE UK
  - Department of Oncology, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge, UK
  - Cambridge Breast Unit, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, NIHR Cambridge Biomedical Research Centre and Cambridge Experimental Cancer Medicine Centre, Cambridge, UK
2
Börnigen D, Moon YS, Rahnavard G, Waldron L, McIver L, Shafquat A, Franzosa EA, Miropolsky L, Sweeney C, Morgan XC, Garrett WS, Huttenhower C. A reproducible approach to high-throughput biological data acquisition and integration. PeerJ 2015; 3:e791. [PMID: 26157642 PMCID: PMC4493686 DOI: 10.7717/peerj.791] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/21/2014] [Accepted: 02/04/2015] [Indexed: 12/25/2022]
Abstract
Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa.
Affiliation(s)
- Daniela Börnigen
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yo Sup Moon
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
- Gholamali Rahnavard
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Levi Waldron
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
  - City University of New York School of Public Health, Hunter College, New York, NY, USA
- Lauren McIver
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
- Afrah Shafquat
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
- Eric A Franzosa
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Larissa Miropolsky
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
- Xochitl C Morgan
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Wendy S Garrett
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Immunology and Infectious Diseases, Harvard School of Public Health, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Curtis Huttenhower
  - Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
3
Veytsman B, Wang L, Cui T, Bruskin S, Baranova A. Distance-based classifiers as potential diagnostic and prediction tools for human diseases. BMC Genomics 2015; 15 Suppl 12:S10. [PMID: 25563076 PMCID: PMC4303935 DOI: 10.1186/1471-2164-15-s12-s10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
Typically, gene expression biomarkers are discovered in the course of high-throughput experiments, for example, RNAseq or microarray profiling. Analytic pipelines that extract so-called signatures suffer from the "dimensionality curse": the number of genes expressed exceeds the number of patients we can enroll in the study and use to train the discriminator algorithm. Hence, problems with the reproducibility of gene signatures are more common than not; when the algorithm is executed using a different training set, the resulting diagnostic signature may turn out to be completely different. In this paper we propose an alternative novel approach which takes into account quantifiable expression levels of all genes assayed. In our analysis, the cumulative gene expression pattern of an individual patient is represented as a point in the multidimensional space formed by all gene expression profiles assayed in a given system, where the clusters of "normal samples" and "affected samples" are defined. The degree of separation of the given sample from the space occupied by "normal samples" reflects the drift of the sample away from homeostasis in the course of development of the pathophysiological processes that underlie the disease. The outlined approach was validated using the publicly available glioma dataset deposited in Rembrandt and associated with survival data. Additionally, the applicability of the distance analysis to the classification of non-malignant samples was tested using psoriatic lesions and non-lesional matched controls as a model. Keywords: biomarkers; clustering; human diseases; RNA
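The idea of treating each patient as a point in gene expression space and measuring its separation from the "normal" cluster can be illustrated with a small sketch. The abstract does not state which distance metric or cluster representation the authors used, so the Euclidean distance and the use of cluster centroids below are assumptions made purely for illustration. The function names and the toy three-gene profiles are hypothetical.

```python
import math


def centroid(samples):
    """Mean expression vector of a list of equal-length expression profiles."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample, normal, affected):
    """Assign the sample to the cluster with the nearer centroid.

    Also returns the distance to the normal centroid, used here as a
    stand-in for the sample's drift away from the normal state.
    """
    d_normal = euclidean(sample, centroid(normal))
    d_affected = euclidean(sample, centroid(affected))
    label = "normal" if d_normal <= d_affected else "affected"
    return label, d_normal


# Hypothetical toy profiles over three genes (real data would use all genes assayed).
normal = [[1.0, 2.0, 3.0], [1.2, 1.8, 3.1]]
affected = [[4.0, 0.5, 6.0], [4.2, 0.7, 5.8]]
label, drift = classify([3.9, 0.6, 6.1], normal, affected)
```

Because every assayed gene contributes to the distance, no gene-selection step is needed, which is the property the abstract contrasts with signature-based pipelines.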
4
Abstract
INTRODUCTION There is a certain degree of frustration and discontent in the area of microarray gene expression data analysis of cancer datasets. It arises from the mathematical problem called the 'curse of dimensionality,' which is due to the small number of samples available in training sets used for calculating transcriptional signatures from the large number of differentially expressed (DE) genes measured by microarrays. The new generation of causal reasoning algorithms can provide solutions to the curse of dimensionality by transforming microarray data into the activity of a small number of cancer hallmark pathways. This new approach can make feature space dimensionality optimal for mathematical signature calculations. AREAS COVERED The author reviews the reasons behind the current frustration with transcriptional signatures derived from DE genes in cancer. He also provides an overview of the novel methods for signature calculations based on differentially variable genes and expression regulators. Furthermore, the author provides perspectives on causal reasoning algorithms that use prior knowledge about regulatory events described in the scientific literature to identify expression regulators responsible for the differential expression observed in cancer samples. EXPERT OPINION The author advocates causal reasoning methods to calculate cancer pathway activity signatures. The current challenge for these algorithms is in ensuring the quality of the knowledgebase. Indeed, the development of cancer hallmark pathway collections, together with statistical algorithms to transform the activity of expression regulators into pathway activity, is necessary for causal reasoning to be used in cancer research.
Affiliation(s)
- Anton Yuryev
  - Elsevier, Inc., 5635 Fishers Lane, Rockville, MD 20852 USA