1
Pourcelot E, El Samra G, Mossuz P, Moulis JM. Molecular Insight into Iron Homeostasis of Acute Myeloid Leukemia Blasts. Int J Mol Sci 2023; 24:14307. [PMID: 37762610 PMCID: PMC10531764 DOI: 10.3390/ijms241814307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 09/12/2023] [Accepted: 09/15/2023] [Indexed: 09/29/2023]
Abstract
Acute myeloid leukemia (AML) remains a disease of gloomy prognosis despite intense efforts to understand its molecular foundations and to find efficient treatments. In search of new characteristic features of AML blasts, we first examined experimental conditions supporting the amplification of hematological CD34+ progenitors ex vivo. Both AML blasts and healthy progenitors heavily depended on iron availability. However, even if known features, such as easier engagement in the cell cycle and amplification factor by healthy progenitors, were observed, multiplying progenitors in a fully defined medium is not readily obtained without modifying their cellular characteristics. As such, we measured selected molecular data including mRNA, proteins, and activities right after isolation. Leukemic blasts showed clear signs of metabolic and signaling shifts as already known, and we provide unprecedented data emphasizing disturbed cellular iron homeostasis in these blasts. The combined quantitative data relative to the latter pathway allowed us to stratify the studied patients in two sets with different iron status. This categorization is likely to impact the efficiency of several therapeutic strategies targeting cellular iron handling that may be applied to eradicate AML blasts.
Affiliation(s)
- Emmanuel Pourcelot
  - Laboratory of Fundamental and Applied Bioenergetics (LBFA), University Grenoble Alpes, INSERM U1055, 38000 Grenoble, France
  - Department of Biological Hematology, Institute of Biology and Pathology, Hospital of Grenoble Alpes (CHUGA), CS 20217, 38043 Grenoble, CEDEX 9, France
- Ghina El Samra
  - Laboratory of Fundamental and Applied Bioenergetics (LBFA), University Grenoble Alpes, INSERM U1055, 38000 Grenoble, France
- Pascal Mossuz
  - Department of Biological Hematology, Institute of Biology and Pathology, Hospital of Grenoble Alpes (CHUGA), CS 20217, 38043 Grenoble, CEDEX 9, France
  - Team “Epigenetic and Cellular Signaling”, Institute for Advanced Biosciences, University Grenoble Alpes (UGA), INSERM U1209/CNRS 5309, 38700 Grenoble, France
- Jean-Marc Moulis
  - Laboratory of Fundamental and Applied Bioenergetics (LBFA), University Grenoble Alpes, INSERM U1055, 38000 Grenoble, France
  - University Grenoble Alpes, CEA, IRIG, 38000 Grenoble, France
2
Eckardt JN, Röllig C, Metzeler K, Kramer M, Stasik S, Georgi JA, Heisig P, Spiekermann K, Krug U, Braess J, Görlich D, Sauerland CM, Woermann B, Herold T, Berdel WE, Hiddemann W, Kroschinsky F, Schetelig J, Platzbecker U, Müller-Tidow C, Sauer T, Serve H, Baldus C, Schäfer-Eckart K, Kaufmann M, Krause S, Hänel M, Schliemann C, Hanoun M, Thiede C, Bornhäuser M, Wendt K, Middeke JM. Prediction of complete remission and survival in acute myeloid leukemia using supervised machine learning. Haematologica 2023; 108:690-704. [PMID: 35708137 PMCID: PMC9973482 DOI: 10.3324/haematol.2021.280027] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/15/2021] [Indexed: 11/09/2022]
Abstract
Achievement of complete remission signifies a crucial milestone in the therapy of acute myeloid leukemia (AML) while refractory disease is associated with dismal outcomes. Hence, accurately identifying patients at risk is essential to tailor treatment concepts individually to disease biology. We used nine machine learning (ML) models to predict complete remission and 2-year overall survival in a large multicenter cohort of 1,383 AML patients who received intensive induction therapy. Clinical, laboratory, cytogenetic and molecular genetic data were incorporated and our results were validated on an external multicenter cohort. Our ML models autonomously selected predictive features including established markers of favorable or adverse risk as well as identifying markers of so-far controversial relevance. De novo AML, extramedullary AML, double-mutated CEBPA, mutations of CEBPA-bZIP, NPM1, FLT3-ITD, ASXL1, RUNX1, SF3B1, IKZF1, TP53, and U2AF1, t(8;21), inv(16)/t(16;16), del(5)/del(5q), del(17)/del(17p), normal or complex karyotypes, age and hemoglobin concentration at initial diagnosis were statistically significant markers predictive of complete remission, while t(8;21), del(5)/del(5q), inv(16)/t(16;16), del(17)/del(17p), double-mutated CEBPA, CEBPA-bZIP, NPM1, FLT3-ITD, DNMT3A, SF3B1, U2AF1, and TP53 mutations, age, white blood cell count, peripheral blast count, serum lactate dehydrogenase level and hemoglobin concentration at initial diagnosis as well as extramedullary manifestations were predictive for 2-year overall survival. For prediction of complete remission and 2-year overall survival areas under the receiver operating characteristic curves ranged between 0.77-0.86 and between 0.63-0.74, respectively in our test set, and between 0.71-0.80 and 0.65-0.75 in the external validation cohort. We demonstrated the feasibility of ML for risk stratification in AML as a model disease for hematologic neoplasms, using a scalable and reusable ML framework. 
Our study illustrates the clinical applicability of ML as a decision support system in hematology.
Affiliation(s)
- Jan-Niklas Eckardt
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Christoph Röllig
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Klaus Metzeler
  - Medical Clinic and Policlinic I, Hematology and Cell Therapy, University Hospital, Leipzig
- Michael Kramer
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Sebastian Stasik
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Peter Heisig
  - Institute of Software and Multimedia Technology, Technical University Dresden, Dresden
- Karsten Spiekermann
  - Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich
- Utz Krug
  - Medical Clinic III, Hospital Leverkusen, Leverkusen
- Jan Braess
  - Hospital Barmherzige Brueder Regensburg, Regensburg
- Dennis Görlich
  - Institute for Biometrics and Clinical Research, University Muenster, Muenster
- Bernhard Woermann
  - Department of Hematology, Oncology and Tumor Immunology, Charité, Berlin
- Tobias Herold
  - Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich
- Wolfgang E Berdel
  - Department of Internal Medicine A, University Hospital Muenster, Muenster
- Wolfgang Hiddemann
  - Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich
- Frank Kroschinsky
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Johannes Schetelig
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Uwe Platzbecker
  - Medical Clinic and Policlinic I, Hematology and Cell Therapy, University Hospital, Leipzig
- Carsten Müller-Tidow
  - Department of Medicine V, University Hospital Heidelberg, Heidelberg, Germany; German Consortium for Translational Cancer Research DKFZ, Heidelberg
- Tim Sauer
  - Department of Medicine V, University Hospital Heidelberg, Heidelberg
- Hubert Serve
  - Department of Medicine 2, Hematology and Oncology, Goethe University Frankfurt, Frankfurt
- Claudia Baldus
  - Department of Hematology and Oncology, University Hospital Schleswig Holstein, Kiel
- Kerstin Schäfer-Eckart
  - Department of Internal Medicine 5, Paracelsus Medical Private University Nuremberg, Nuremberg
- Martin Kaufmann
  - Department of Hematology, Oncology and Palliative Care, Robert-Bosch Hospital, Stuttgart
- Stefan Krause
  - Department of Internal Medicine 5, University Hospital Erlangen, Erlangen
- Mathias Hänel
  - Department of Internal Medicine 3, Klinikum Chemnitz GmbH, Chemnitz, Germany; Department of Hematology and Stem Cell Transplantation, University Hospital Essen, Essen
- Maher Hanoun
  - Department of Internal Medicine 3, Klinikum Chemnitz GmbH, Chemnitz, Germany; Department of Hematology and Stem Cell Transplantation, University Hospital Essen, Essen
- Christian Thiede
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany; German Consortium for Translational Cancer Research DKFZ, Heidelberg
- Martin Bornhäuser
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany; German Consortium for Translational Cancer Research DKFZ, Heidelberg, Germany; National Center for Tumor Diseases (NCT), Dresden
- Karsten Wendt
  - Medical Clinic and Policlinic I, Hematology and Cell Therapy, University Hospital, Leipzig
- Jan Moritz Middeke
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
3
Kontio JAJ, Pyhäjärvi T, Sillanpää MJ. Model guided trait-specific co-expression network estimation as a new perspective for identifying molecular interactions and pathways. PLoS Comput Biol 2021; 17:e1008960. [PMID: 33939702 PMCID: PMC8118548 DOI: 10.1371/journal.pcbi.1008960] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2020] [Revised: 05/13/2021] [Accepted: 04/13/2021] [Indexed: 11/19/2022]
Abstract
A wide variety of 1) parametric regression models and 2) co-expression networks have been developed for finding gene-by-gene interactions underlying complex traits from expression data. While both methodological schemes have their own well-known benefits, little is known about their synergistic potential. Our study introduces their methodological fusion, which cross-exploits the strengths of the individual approaches via a built-in information-sharing mechanism. This fusion is theoretically based on certain trait-conditioned dependency patterns between two genes, which depend on the genes' roles in the underlying parametric model. The resulting trait-specific co-expression network estimation method 1) serves to enhance the interpretation of biological networks in a parametric sense, and 2) exploits the underlying parametric model itself in the estimation process. To also account for the substantial intrinsic noise and collinearities often entailed by expression data, a tailored co-expression measure is introduced along with this framework to alleviate related computational problems. A remarkable advance over the reference methods in simulated scenarios substantiates the method's high efficiency. As proof-of-concept, this synergistic approach is successfully applied in survival analysis with acute myeloid leukemia data, further highlighting the framework's versatility and broad practical relevance.
Affiliation(s)
- Juho A. J. Kontio
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
- Tanja Pyhäjärvi
  - Department of Ecology and Genetics, University of Oulu, Oulu, Finland
  - Department of Forest Sciences, University of Helsinki, Helsinki, Finland
- Mikko J. Sillanpää
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
4
Kontio JAJ, Rinta-Aho MJ, Sillanpää MJ. Estimating Linear and Nonlinear Gene Coexpression Networks by Semiparametric Neighborhood Selection. Genetics 2020; 215:597-607. [PMID: 32414870 PMCID: PMC7337083 DOI: 10.1534/genetics.120.303186] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/13/2020] [Accepted: 05/11/2020] [Indexed: 11/18/2022]
Abstract
Although nonlinear relationships between genes are acknowledged, only a few methods exist for estimating nonlinear gene coexpression networks or gene regulatory networks (GCNs/GRNs), and they share common deficiencies. These methods often consider only pairwise associations between genes and are therefore poorly capable of identifying higher-order regulatory patterns in which multiple genes should be considered simultaneously. Another critical issue in current nonlinear GCN/GRN estimation approaches is that they consider linear and nonlinear dependencies at the same time, nonparametrically and in confounded form. This severely undermines the chances of finding nonlinear associations, since the power to detect nonlinear dependencies is lower than for linear dependencies, and the sparsity-inducing procedures might favor linear relationships over nonlinear ones merely because of small sample sizes. In this paper, we propose a method to estimate undirected nonlinear GCNs independently of the linear associations between genes, based on a novel semiparametric neighborhood selection procedure capable of identifying complex nonlinear associations between genes. Simulation studies using the common DREAM3 and DREAM9 datasets show that the proposed method performs better than the current nonlinear GCN/GRN estimation methods.
Affiliation(s)
- Juho A J Kontio
  - Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland
- Marko J Rinta-Aho
  - Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland
- Mikko J Sillanpää
  - Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland
  - Infotech Oulu, University of Oulu, 90014, Finland
5
Kontio JAJ, Sillanpää MJ. Scalable Nonparametric Prescreening Method for Searching Higher-Order Genetic Interactions Underlying Quantitative Traits. Genetics 2019; 213:1209-1224. [PMID: 31585953 PMCID: PMC6893368 DOI: 10.1534/genetics.119.302658] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/23/2019] [Accepted: 09/27/2019] [Indexed: 02/07/2023]
Abstract
Gaussian process (GP)-based automatic relevance determination (ARD) is known to be an efficient technique for identifying determinants of gene-by-gene interactions important to trait variation. However, the estimation of GP models is feasible only for low-dimensional datasets (∼200 variables), which severely limits application of the GP-based ARD method to high-throughput sequencing data. In this paper, we provide a nonparametric prescreening method that preserves virtually all the major benefits of the GP-based ARD method and extends its scalability to the typical high-dimensional datasets used in practice. In several simulated test scenarios, the proposed method compared favorably with existing nonparametric dimension reduction/prescreening methods suitable for higher-order interaction searches. As a real-data example, the proposed method was applied to a high-throughput dataset downloaded from The Cancer Genome Atlas (TCGA) with measured expression levels of 16,976 genes (after preprocessing) from patients diagnosed with acute myeloid leukemia.
Affiliation(s)
- Juho A J Kontio
  - Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland
- Mikko J Sillanpää
  - Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland
  - Infotech Oulu, University of Oulu, 90014, Finland
6
Shi M, Xu G. Development and validation of GMI signature based random survival forest prognosis model to predict clinical outcome in acute myeloid leukemia. BMC Med Genomics 2019; 12:90. [PMID: 31242922 PMCID: PMC6595612 DOI: 10.1186/s12920-019-0540-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/27/2019] [Accepted: 05/30/2019] [Indexed: 12/13/2022]
Abstract
Background: Acute myeloid leukemia (AML) is a disease with marked molecular heterogeneity and a high early death rate. Our aim was to investigate an integrated Gene expression, MiRNA and miRNA-mRNA Interactions (GMI) signature for improving risk stratification of AML.
Methods: We identified differentially expressed genes by pooling data from 861 human AML patients and 75 normal cases. We then used miRWalk to identify the functional miRNA-mRNA regulatory module. The GMI signature based random survival forest (RSF) prognosis model was developed from a training data set and evaluated in independent patient cohorts from The Cancer Genome Atlas (TCGA) dataset (N = 147). Univariate and multivariate Cox proportional hazards regression analyses were applied to evaluate the prognostic value of the GMI signature.
Results: We identified 139 differentially expressed genes between normal and AML samples. We discovered the functional miRNA-mRNA regulatory module that participates in the network of cancer progression. We named 23 differentially expressed genes and 16 validated target miRNAs as the GMI signature. The RSF model-based scores separated independent patient cohorts into two groups with significantly different overall survival (C-index = 0.59, hazard ratio [HR], 2.12; 95% confidence interval [CI], 1.11–4.03; p = 0.019). Similar results were obtained with reversed training and testing datasets (C-index = 0.58, hazard ratio [HR], 2.08; 95% confidence interval [CI], 1.02–4.24; p = 0.038). The GMI signature score contributed more information about recurrence than standard clinical covariates.
Conclusion: The GMI signature based RSF prognosis model not only reflects regulatory relationships from the identified miRNA-mRNA module but also informs patient prognosis. While in the TCGA data set the GMI signature score contributed additional information about recurrence in comparison to standard clinical covariates, further studies are needed to determine its clinical significance.
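As an illustration of the C-index values quoted in this abstract, the sketch below computes a concordance index in the style of Harrell's C on a tiny invented dataset. It is a minimal sketch, not the authors' implementation: the function name `concordance_index`, the toy data, and the simplified handling of tied survival times (such pairs are skipped) are all assumptions made for illustration.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) and a strictly shorter follow-up time than patient j.
    The pair is concordant when i also has the higher risk score;
    tied risk scores count as half. Tied times are skipped (simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up times, event flags (0 = censored), risk scores.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.3, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the reported values of 0.58 to 0.59 indicate modest discrimination.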
Affiliation(s)
- Mingguang Shi
  - School of Electric Engineering and Automation, Hefei University of Technology, Hefei, 230009, Anhui, China
- Guofu Xu
  - School of Electric Engineering and Automation, Hefei University of Technology, Hefei, 230009, Anhui, China
7
Tucker JD, Day S, Tang W, Bayus B. Crowdsourcing in medical research: concepts and applications. PeerJ 2019; 7:e6762. [PMID: 30997295 PMCID: PMC6463854 DOI: 10.7717/peerj.6762] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Received: 11/05/2018] [Accepted: 03/11/2019] [Indexed: 12/23/2022]
Abstract
Crowdsourcing shifts medical research from a closed environment to an open collaboration between the public and researchers. We define crowdsourcing as an approach to problem solving which involves an organization having a large group attempt to solve a problem or part of a problem, then sharing solutions. Crowdsourcing allows large groups of individuals to participate in medical research through innovation challenges, hackathons, and related activities. The purpose of this literature review is to examine the definition, concepts, and applications of crowdsourcing in medicine. This multi-disciplinary review defines crowdsourcing for medicine, identifies conceptual antecedents (collective intelligence and open source models), and explores implications of the approach. Several critiques of crowdsourcing are also examined. Although several crowdsourcing definitions exist, there are two essential elements: (1) having a large group of individuals, including those with skills and those without skills, propose potential solutions; (2) sharing solutions through implementation or open access materials. The public can be a central force in contributing to formative, pre-clinical, and clinical research. A growing evidence base suggests that crowdsourcing in medicine can result in high-quality outcomes, broad community engagement, and more open science.
Affiliation(s)
- Joseph D. Tucker
  - Institute for Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, University of London, London, UK
  - Social Entrepreneurship to Spur Health (SESH) Global, Guangzhou, China
- Suzanne Day
  - Institute for Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Social Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Weiming Tang
  - Institute for Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of STD Control, Dermatology Hospital of Southern Medical University, Guangzhou, China
- Barry Bayus
  - Kenan-Flagler School of Business, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
8
Ali M, Khan SA, Wennerberg K, Aittokallio T. Global proteomics profiling improves drug sensitivity prediction: results from a multi-omics, pan-cancer modeling approach. Bioinformatics 2019; 34:1353-1362. [PMID: 29186355 PMCID: PMC5905617 DOI: 10.1093/bioinformatics/btx766] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 03/14/2017] [Accepted: 11/27/2017] [Indexed: 12/13/2022]
Abstract
Motivation: Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds.
Results: Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and require additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal for both the cytotoxic and the targeted compounds. Further, information about the cellular abundance of primary target proteins was found to be critical for predicting the response of targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients.
Availability and implementation: Processed datasets and R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps.
Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mehreen Ali
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, 00290 Helsinki, Finland
  - Helsinki Institute for Information Technology (HIIT), Aalto University, 02150 Espoo, Finland
- Suleiman A Khan
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, 00290 Helsinki, Finland
  - Helsinki Institute for Information Technology (HIIT), Aalto University, 02150 Espoo, Finland
- Krister Wennerberg
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, 00290 Helsinki, Finland
- Tero Aittokallio
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, 00290 Helsinki, Finland
  - Helsinki Institute for Information Technology (HIIT), Aalto University, 02150 Espoo, Finland
  - Department of Mathematics and Statistics, University of Turku, 20014 Turku, Finland
9
Stratification of amyotrophic lateral sclerosis patients: a crowdsourcing approach. Sci Rep 2019; 9:690. [PMID: 30679616 PMCID: PMC6345935 DOI: 10.1038/s41598-018-36873-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 07/17/2018] [Accepted: 11/26/2018] [Indexed: 12/11/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease where substantial heterogeneity in clinical presentation urgently requires a better stratification of patients for the development of drug trials and clinical care. In this study we explored stratification through a crowdsourcing approach, the DREAM Prize4Life ALS Stratification Challenge. Using data from >10,000 patients from ALS clinical trials and 1479 patients from community-based patient registers, more than 30 teams developed new approaches for machine learning and clustering, outperforming the best current predictions of disease outcome. We propose a new method to integrate and analyze patient clusters across methods, showing a clear pattern of consistent and clinically relevant sub-groups of patients that also enabled the reliable classification of new patients. Our analyses reveal novel insights into ALS and describe for the first time the potential of crowdsourcing to uncover hidden patient sub-populations and to accelerate disease understanding and therapeutic development.
10
Byron A. Reproducibility and Crossplatform Validation of Reverse-Phase Protein Array Data. Adv Exp Med Biol 2019; 1188:181-201. [PMID: 31820389 DOI: 10.1007/978-981-32-9755-5_10] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023]
Abstract
Reverse-phase protein array (RPPA) technology is a high-throughput antibody- and microarray-based approach for the rapid profiling of levels of proteins and protein posttranslational modifications in biological specimens. The technology consumes small amounts of samples, can sensitively detect low-abundance proteins and posttranslational modifications, enables measurements of multiple signaling pathways in parallel, has the capacity to analyze large sample numbers, and offers robust interexperimental reproducibility. These features of RPPA experiments have motivated and enabled the use of RPPA technology in various biomedical, translational, and clinical applications, including the delineation of molecular mechanisms of disease, profiling of druggable signaling pathway activation, and search for new prognostic markers. Owing to the complexity of many of these applications, such as developing multiplex protein assays for diagnostic laboratories or integrating posttranslational modification-level data using large-scale proteogenomic approaches, robust and well-validated data are essential. There are many distinct components of an RPPA workflow, and numerous possible technical setups and analysis parameter options exist. The differences between RPPA platform setups around the world offer opportunities to assess and minimize interplatform variation. Crossplatform validation may also aid in the evaluation of robust, platform-independent protein markers of disease and response to therapy.
Affiliation(s)
- Adam Byron
  - Cancer Research UK Edinburgh Centre, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
11
Ali M, Aittokallio T. Machine learning and feature selection for drug response prediction in precision oncology applications. Biophys Rev 2018; 11:31-39. [PMID: 30097794 PMCID: PMC6381361 DOI: 10.1007/s12551-018-0446-z] [Citation(s) in RCA: 92] [Impact Index Per Article: 15.3] [Received: 04/17/2018] [Accepted: 07/22/2018] [Indexed: 02/07/2023]
Abstract
In-depth modeling of the complex interplay among multiple omics data measured from cancer cell lines or patient tumors is providing new opportunities toward identification of tailored therapies for individual cancer patients. Supervised machine learning algorithms are increasingly being applied to the omics profiles as they enable integrative analyses among the high-dimensional data sets, as well as personalized predictions of therapy responses using multi-omics panels of response-predictive biomarkers identified through feature selection and cross-validation. However, technical variability and frequent missingness in input "big data" require the application of dedicated data preprocessing pipelines that often lead to some loss of information and compressed view of the biological signal. We describe here the state-of-the-art machine learning methods for anti-cancer drug response modeling and prediction and give our perspective on further opportunities to make better use of high-dimensional multi-omics profiles along with knowledge about cancer pathways targeted by anti-cancer compounds when predicting their phenotypic responses.
Affiliation(s)
- Mehreen Ali
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, FI-00290, Helsinki, Finland
  - Helsinki Institute for Information Technology (HIIT), Aalto University, FI-02150, Espoo, Finland
- Tero Aittokallio
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, FI-00290, Helsinki, Finland
  - Helsinki Institute for Information Technology (HIIT), Aalto University, FI-02150, Espoo, Finland
  - Department of Mathematics and Statistics, University of Turku, FI-20014, Turku, Finland
12
Chebouba L, Boughaci D, Guziolowski C. Proteomics Versus Clinical Data and Stochastic Local Search Based Feature Selection for Acute Myeloid Leukemia Patients' Classification. J Med Syst 2018; 42:129. [PMID: 29869179 DOI: 10.1007/s10916-018-0972-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/10/2017] [Accepted: 05/18/2018] [Indexed: 01/02/2023]
Abstract
The use of data from high-throughput technologies in drug target problems has become widespread over the last decades. This study proposes a meta-heuristic framework using stochastic local search (SLS) combined with random forest (RF), where the aim is to identify the most important genes and proteins leading to the best classification of Acute Myeloid Leukemia (AML) patients. First, we use a stochastic local search meta-heuristic as a feature selection technique to select the most significant proteins for the classification step. Then we apply RF to classify new patients into their corresponding classes. For evaluation, we train the RF classifier on the training data to obtain a model, then apply this model to the test data to predict the class. We use the balanced accuracy (BAC) and the area under the receiver operating characteristic curve (AUROC) as metrics to measure the performance of our model. The proposed method is evaluated on the dataset from the DREAM 9 challenge. It is compared with a pure random forest (without feature selection) and with the two best-ranked results of the DREAM 9 challenge. We used three types of data: clinical data only, proteomics data only, and clinical and proteomics data combined. The numerical results show that the highest scores are obtained when using clinical data alone, and the lowest when using proteomics data alone. Further, our method achieves promising results compared to the methods presented in the DREAM challenge.
Affiliation(s)
- Lokmane Chebouba
- Department of Computer Science, LRIA Laboratory, Electrical Engineering and Computer Science Faculty, University of Science and Technology Houari Boumediene (USTHB), El-Alia BP 32, Bab-Ezzouar, Algiers, Algeria.
- Dalila Boughaci
- Department of Computer Science, LRIA Laboratory, Electrical Engineering and Computer Science Faculty, University of Science and Technology Houari Boumediene (USTHB), El-Alia BP 32, Bab-Ezzouar, Algiers, Algeria
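The two metrics named in the abstract above, balanced accuracy (BAC) and AUROC, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `balanced_accuracy` and `auroc`, the binary 0/1 label encoding, and the toy labels and scores are all assumptions made for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four patients, two per class.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]                    # classifier scores (illustrative)
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(balanced_accuracy(y_true, y_pred))  # 0.5
print(auroc(y_true, scores))              # 0.75
```

BAC averages the per-class recall, so it is not inflated when one response class dominates, while AUROC depends only on how the scores rank the patients and not on the chosen threshold.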
13
Chebouba L, Miannay B, Boughaci D, Guziolowski C. Discriminate the response of Acute Myeloid Leukemia patients to treatment by using proteomics data and Answer Set Programming. BMC Bioinformatics 2018. [PMID: 29536824 PMCID: PMC5850944 DOI: 10.1186/s12859-018-2034-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022] Open
Abstract
Background: In recent years, several approaches have been applied to biomedical data to detect disease-specific proteins and genes in order to better target drugs. It has been shown that statistical and machine learning based methods mainly use clinical data and later improve their results by adding omics data. This work proposes a new method to discriminate the response of Acute Myeloid Leukemia (AML) patients to treatment. The proposed approach uses proteomics data and prior regulatory knowledge in the form of networks to predict cancer treatment outcomes by identifying the Boolean networks specific to each type of drug response. To show its effectiveness, we evaluate our method on a dataset from the DREAM 9 challenge. Results: The results are encouraging and demonstrate the benefit of our approach in distinguishing patient groups with different responses to treatment. In particular, each treatment response group is characterized by a predictive model in the form of a signaling Boolean network. This model describes regulatory mechanisms that are specific to each response group. The proteins in this model were selected from the complete dataset by imposing optimization constraints that maximize the difference in the logical response of the Boolean network associated with each group of patients given the omics dataset. This mechanistic and predictive model also allows us to classify new patient data into the two patient response groups. Conclusions: We propose a new method, using a Prior Knowledge Network and proteomics data, to detect the most relevant proteins for understanding different patient responses to treatment in order to better target drugs. The results are interesting and show the effectiveness of our method. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2034-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lokmane Chebouba
- Department of Computer Science, LRIA Laboratory, Electrical Engineering and Computer Science Faculty, University of Science and Technology Houari Boumediene (USTHB), El-Alia BP 32 Bab-Ezzouar, Algiers, 16111, Algeria; LS2N, UMR 6004, École Centrale de Nantes, Nantes, France
- Dalila Boughaci
- Department of Computer Science, LRIA Laboratory, Electrical Engineering and Computer Science Faculty, University of Science and Technology Houari Boumediene (USTHB), El-Alia BP 32 Bab-Ezzouar, Algiers, 16111, Algeria
14
Sweeney TE, Perumal TM, Henao R, Nichols M, Howrylak JA, Choi AM, Bermejo-Martin JF, Almansa R, Tamayo E, Davenport EE, Burnham KL, Hinds CJ, Knight JC, Woods CW, Kingsmore SF, Ginsburg GS, Wong HR, Parnell GP, Tang B, Moldawer LL, Moore FE, Omberg L, Khatri P, Tsalik EL, Mangravite LM, Langley RJ. A community approach to mortality prediction in sepsis via gene expression analysis. Nat Commun 2018; 9:694. [PMID: 29449546 PMCID: PMC5814463 DOI: 10.1038/s41467-018-03078-2] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Received: 01/12/2017] [Accepted: 01/18/2018] [Indexed: 12/27/2022] Open
Abstract
Improved risk stratification and prognosis prediction in sepsis is a critical unmet need. Clinical severity scores and available assays such as blood lactate reflect global illness severity with suboptimal performance, and do not specifically reveal the underlying dysregulation of sepsis. Here, we present prognostic models for 30-day mortality generated independently by three scientific groups by using 12 discovery cohorts containing transcriptomic data collected from primarily community-onset sepsis patients. Predictive performance is validated in five cohorts of community-onset sepsis patients in which the models show summary AUROCs ranging from 0.765 to 0.89. Similar performance is observed in four cohorts of hospital-acquired sepsis. Combining the new gene-expression-based prognostic models with prior clinical severity scores leads to significant improvement in prediction of 30-day mortality as measured via AUROC and net reclassification improvement index. These models provide an opportunity to develop molecular bedside tests that may improve risk stratification and mortality prediction in patients with sepsis.
Affiliation(s)
- Timothy E Sweeney
- Stanford Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Division of Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Inflammatix Inc., Burlingame, CA, 94010, USA
- Ricardo Henao
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC, 27708, USA
- Department of Electrical and Computer Engineering, Duke University, Durham, NC, 27708, USA
- Marshall Nichols
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC, 27708, USA
- Judith A Howrylak
- Division of Pulmonary and Critical Care Medicine, Penn State Milton S. Hershey Medical Center, Hershey, PA, 17033, USA
- Augustine M Choi
- Department of Medicine, Cornell Medical Center, New York, NY, 10065, USA
- Raquel Almansa
- Hospital Clínico Universitario de Valladolid/IECSCYL, Valladolid, 47005, Spain
- Eduardo Tamayo
- Hospital Clínico Universitario de Valladolid/IECSCYL, Valladolid, 47005, Spain
- Emma E Davenport
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Partners Center for Personalized Genetic Medicine, Boston, MA, 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Katie L Burnham
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Charles J Hinds
- William Harvey Research Institute, Barts and The London School of Medicine, Queen Mary University, London, EC1M 6BQ, UK
- Julian C Knight
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Christopher W Woods
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC, 27708, USA
- Division of Infectious Diseases and International Health, Department of Medicine, Duke University, Durham, NC, 27710, USA
- Durham Veteran's Affairs Health Care System, Durham, NC, 27705, USA
- Geoffrey S Ginsburg
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC, 27708, USA
- Hector R Wong
- Division of Critical Care Medicine, Cincinnati Children's Hospital Medical Center and Cincinnati Children's Research Foundation, Cincinnati, OH, 45223, USA
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, 45267, USA
- Grant P Parnell
- Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Westmead, NSW, 2145, Australia
- Benjamin Tang
- Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Westmead, NSW, 2145, Australia
- Department of Intensive Care Medicine, Nepean Hospital, Sydney, Australia, Penrith, NSW, 2751, Australia
- Nepean Genomic Research Group, Nepean Clinical School, University of Sydney, Penrith, NSW, 2751, Australia
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Westmead, NSW, 2145, Australia
- Lyle L Moldawer
- Department of Surgery, University of Florida College of Medicine, Gainesville, FL, 32610, USA
- Frederick E Moore
- Department of Surgery, University of Florida College of Medicine, Gainesville, FL, 32610, USA
- Purvesh Khatri
- Stanford Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Division of Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Ephraim L Tsalik
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC, 27708, USA
- Division of Infectious Diseases and International Health, Department of Medicine, Duke University, Durham, NC, 27710, USA
- Durham Veteran's Affairs Health Care System, Durham, NC, 27705, USA
- Raymond J Langley
- Department of Pharmacology, University of South Alabama, Mobile, AL, 36688, USA
15
Liang Y, Kelemen A. Computational dynamic approaches for temporal omics data with applications to systems medicine. BioData Min 2017. [PMID: 28638442 PMCID: PMC5473988 DOI: 10.1186/s13040-017-0140-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/13/2022] Open
Abstract
Modeling and predicting biological dynamic systems and simultaneously estimating the kinetic structural and functional parameters are extremely important in systems and computational biology. This is key for understanding the complexity of human health, drug response, disease susceptibility and pathogenesis for systems medicine. Temporal omics data used to measure dynamic biological systems are essential for discovering complex biological interactions, clinical mechanisms and causation. However, delineating the possible associations and causalities among genes, proteins, metabolites, cells and other biological entities from high-throughput time-course omics data is challenging, and conventional experimental techniques are not suited to this task in the big omics era. In this paper, we present various recently developed dynamic trajectory and causal network approaches for temporal omics data, which are extremely useful for researchers who want to start working in this challenging research area. Moreover, applications to various biological systems, health conditions and disease status, and examples that summarize the state-of-the-art performance for different specific mining tasks are presented. We critically discuss the merits, drawbacks and limitations of the approaches, and the associated main challenges for the years ahead. The most recent computing tools and software for analyzing specific problem types, associated platform resources, and other potentials for the dynamic trajectory and interaction methods are also presented and discussed in detail.
Affiliation(s)
- Yulan Liang
- Department of Family and Community Health, University of Maryland, Baltimore, MD 21201 USA
- Arpad Kelemen
- Department of Organizational Systems and Adult Health, University of Maryland, Baltimore, MD 21201 USA
16
Way GP, Allaway RJ, Bouley SJ, Fadul CE, Sanchez Y, Greene CS. A machine learning classifier trained on cancer transcriptomes detects NF1 inactivation signal in glioblastoma. BMC Genomics 2017; 18:127. [PMID: 28166733 PMCID: PMC5292791 DOI: 10.1186/s12864-017-3519-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 09/17/2016] [Accepted: 01/26/2017] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: We have identified molecules that exhibit synthetic lethality in cells with loss of the neurofibromin 1 (NF1) tumor suppressor gene. However, recognizing tumors that have inactivation of the NF1 tumor suppressor function is challenging because the loss may occur via mechanisms that do not involve mutation of the genomic locus. Degradation of the NF1 protein, independent of NF1 mutation status, phenocopies inactivating mutations to drive tumors in human glioma cell lines. NF1 inactivation may alter the transcriptional landscape of a tumor and allow a machine learning classifier to detect which tumors will benefit from synthetic lethal molecules. RESULTS: We developed a strategy to predict tumors with low NF1 activity and hence tumors that may respond to treatments that target cells lacking NF1. Using RNAseq data from The Cancer Genome Atlas (TCGA), we trained an ensemble of 500 logistic regression classifiers that integrates mutation status with whole transcriptomes to predict NF1 inactivation in glioblastoma (GBM). On TCGA data, the classifier detected NF1 mutated tumors (test set area under the receiver operating characteristic curve (AUROC) mean = 0.77, 95% quantile = 0.53 - 0.95) over 50 random initializations. On RNA-Seq data transformed into the space of gene expression microarrays, this method produced a classifier with similar performance (test set AUROC mean = 0.77, 95% quantile = 0.53 - 0.96). We applied our ensemble classifier trained on the transformed TCGA data to a microarray validation set of 12 samples with matched RNA and NF1 protein-level measurements. The classifier's NF1 score was associated with NF1 protein concentration in these samples. CONCLUSIONS: We demonstrate that TCGA can be used to train accurate predictors of NF1 inactivation in GBM. The ensemble classifier performed well for samples with very high or very low NF1 protein concentrations but had mixed performance in samples with intermediate NF1 concentrations. Nevertheless, high-performing and validated predictors have the potential to be paired with targeted therapies and personalized medicine.
Affiliation(s)
- Gregory P Way
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA, USA; Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA
- Robert J Allaway
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Dartmouth College, HB 7650, Hanover, NH, 03755, USA
- Stephanie J Bouley
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Dartmouth College, HB 7650, Hanover, NH, 03755, USA
- Camilo E Fadul
- Department of Neurology, University of Virginia, Charlottesville, VA, USA
- Yolanda Sanchez
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Dartmouth College, HB 7650, Hanover, NH, 03755, USA; Norris Cotton Cancer Center, Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA
- Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA.
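The abstract above describes an ensemble of logistic regression classifiers whose combined output acts as an NF1 score. The sketch below shows one generic way to build such an ensemble in pure Python. It is not the authors' pipeline: the abstract does not say how the 500 members differ, so bootstrap resampling, the single input feature, the gradient-descent settings, the toy data, and every function name here are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """Fit weight w and bias b for one feature by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_ensemble(xs, ys, n_models=25, seed=0):
    """Train each member on a bootstrap resample (an assumed way to diversify members)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def ensemble_score(models, x):
    """Average the members' predicted probabilities into a single score."""
    return sum(sigmoid(w * x + b) for w, b in models) / len(models)


# Hypothetical toy data: one expression-like feature, label 1 = inactivated.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = fit_ensemble(xs, ys)

print(ensemble_score(models, 2.0))   # close to 1 for a clearly positive sample
print(ensemble_score(models, -2.0))  # close to 0 for a clearly negative sample
```

Averaging probabilities across members yields a continuous score rather than a hard label, which is the kind of quantity that can then be compared against a measured protein concentration.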
17
Liu L, Chang Y, Yang T, Noren DP, Long B, Kornblau S, Qutub A, Ye J. Evolution-informed modeling improves outcome prediction for cancers. Evol Appl 2016; 10:68-76. [PMID: 28035236 PMCID: PMC5192825 DOI: 10.1111/eva.12417] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/08/2016] [Accepted: 08/17/2016] [Indexed: 12/19/2022] Open
Abstract
Despite wide applications of high-throughput biotechnologies in cancer research, many biomarkers discovered by exploring large-scale omics data do not provide satisfactory performance when used to predict cancer treatment outcomes. This problem is partly due to the overlooking of functional implications of molecular markers. Here, we present a novel computational method that uses evolutionary conservation as prior knowledge to discover bona fide biomarkers. Evolutionary selection at the molecular level is nature's test on functional consequences of genetic elements. By prioritizing genes that show significant statistical association and high functional impact, our new method reduces the chances of including spurious markers in the predictive model. When applied to predicting therapeutic responses for patients with acute myeloid leukemia and to predicting metastasis for patients with prostate cancers, the new method gave rise to evolution-informed models that enjoyed low complexity and high accuracy. The identified genetic markers also have significant implications in tumor progression and embrace potential drug targets. Because evolutionary conservation can be estimated as a gene-specific, position-specific, or allele-specific parameter on the nucleotide level and on the protein level, this new method can be extended to apply to miscellaneous "omics" data to accelerate biomarker discoveries.
Affiliation(s)
- Li Liu
- Department of Biomedical Informatics, Arizona State University, Tempe, AZ, USA
- Yung Chang
- School of Life Science, Arizona State University, Tempe, AZ, USA
- Tao Yang
- Department of Computer Science and Engineering, Arizona State University, Tempe, AZ, USA
- David P Noren
- Department of Bioengineering, Rice University, Houston, TX, USA
- Byron Long
- Department of Bioengineering, Rice University, Houston, TX, USA
- Steven Kornblau
- The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Amina Qutub
- Department of Bioengineering, Rice University, Houston, TX, USA
- Jieping Ye
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA