1
|
Wang B, Huang C, Liu X, Liu Z, Zhang Y, Zhao W, Xu Q, Ho PC, Xiao Z. iMetAct: An integrated systematic inference of metabolic activity for dissecting tumor metabolic preference and tumor-immune microenvironment. Cell Rep 2025; 44:115375. [PMID: 40053454 DOI: 10.1016/j.celrep.2025.115375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2024] [Revised: 12/03/2024] [Accepted: 02/10/2025] [Indexed: 03/09/2025] Open
Abstract
Metabolic enzymes play a central role in cancer metabolic reprogramming, and their dysregulation creates vulnerabilities that can be exploited for therapy. However, accurately measuring metabolic enzyme activity in a high-throughput manner remains challenging due to the complex, multi-layered regulatory mechanisms involved. Here, we present iMetAct, a framework that integrates metabolic-transcription networks with an information propagation strategy to infer enzyme activity from gene expression data. iMetAct outperforms expression-based methods in predicting metabolite conversion rates by accounting for the effects of post-translational modifications. With iMetAct, we identify clinically significant subtypes of hepatocellular carcinoma with distinct metabolic preferences driven by dysregulated enzymes and metabolic regulators acting at both the transcriptional and non-transcriptional levels. Moreover, applying iMetAct to single-cell RNA sequencing data allows for the exploration of cancer cell metabolism and its interplay with immune regulation in the tumor microenvironment. An accompanying online platform further facilitates tumor metabolic analysis, patient stratification, and immune microenvironment characterization.
Collapse
Affiliation(s)
- Binxian Wang
- Institute of Molecular and Translational Medicine, Department of Biochemistry and Molecular Biology, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China; Key Laboratory of Environment and Disease-Related Genes, Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Chao Huang
- Institute of Molecular and Translational Medicine, Department of Biochemistry and Molecular Biology, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China; Key Laboratory of Environment and Disease-Related Genes, Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Xuan Liu
- Institute of Molecular and Translational Medicine, Department of Biochemistry and Molecular Biology, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China; Key Laboratory of Environment and Disease-Related Genes, Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Zhenni Liu
- Institute of Molecular and Translational Medicine, Department of Biochemistry and Molecular Biology, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China; Key Laboratory of Environment and Disease-Related Genes, Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Yilei Zhang
- Institute of Molecular and Translational Medicine, Department of Biochemistry and Molecular Biology, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China; Key Laboratory of Environment and Disease-Related Genes, Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Wei Zhao
- Department of General Surgery, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Qiuran Xu
- Key Laboratory of Tumor Molecular Diagnosis and Individualized Medicine of Zhejiang Province, Zhejiang Provincial People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang 310014, China
| | - Ping-Chih Ho
- Department of Oncology, University of Lausanne, Epalinges, Switzerland; Ludwig Institute for Cancer Research, University of Lausanne, Epalinges, Switzerland.
| | - Zhengtao Xiao
- Institute of Molecular and Translational Medicine, Department of Biochemistry and Molecular Biology, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China; Key Laboratory of Environment and Disease-Related Genes, Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China.
| |
Collapse
|
2
|
Spellicy SE, Hess DC. Recycled Translation: Repurposing Drugs for Stroke. Transl Stroke Res 2022; 13:866-880. [PMID: 35218497 PMCID: PMC9844207 DOI: 10.1007/s12975-022-01000-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2022] [Revised: 02/16/2022] [Accepted: 02/19/2022] [Indexed: 01/19/2023]
Abstract
Stroke, which continues to be a leading cause of death and long-term disability worldwide, has often been described as a clinical graveyard. While multiple small molecule therapeutics have undergone clinical trials in stroke, currently only one Food and Drug Administration (FDA)-approved medication exists for the treatment of stroke, the biological, recombinant tissue plasminogen activator (rt-PA). Repurposing of therapeutics which have previously gained FDA approval for alternative indications serves as a prospective option for stroke therapeutic translation. In contrast to de novo drug development, repurposing strategies have patient-centered and economic advantages. These include increased safety, increased chance of approval, decreased time to approval, and decreased capital investment. Presently, 37 active stroke clinical trials utilize repurposed therapeutics with various initial indications and dosing paradigms. The currently studied repurposed therapeutics fall into six mechanistic categories: (1) anticoagulation; (2) vasculature integrity, response, or red blood cell (RBC) alterations; (3) immune system regulation; (4) neurotransmission; and (5) neuroprotection. Directed hypothesis-driven computational investigation utilizing drug databases, in silico drug-protein interaction modeling, genomic data, and consensus methodology can determine if the current mechanistic repurposing categories have the highest chance of translational success or if other mechanistic avenues should be explored. With this increased focus on repurposed therapeutic strategies over de novo strategies, evolution and optimization of regulatory protections are needed to incentivize innovators and investigators.
Collapse
Affiliation(s)
- Samantha E Spellicy
- M.D./Ph.D. Program, Office of Academic Affairs, Medical College of Georgia at Augusta University, Augusta, GA, USA.
| | - David C Hess
- Dean's Office, Medical College of Georgia at Augusta University, Augusta, GA, USA
| |
Collapse
|
3
|
Integration of Human Protein Sequence and Protein-Protein Interaction Data by Graph Autoencoder to Identify Novel Protein-Abnormal Phenotype Associations. Cells 2022; 11:cells11162485. [PMID: 36010562 PMCID: PMC9406402 DOI: 10.3390/cells11162485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Revised: 07/31/2022] [Accepted: 08/05/2022] [Indexed: 11/18/2022] Open
Abstract
Understanding gene functions and their associated abnormal phenotypes is crucial in the prevention, diagnosis and treatment against diseases. The Human Phenotype Ontology (HPO) is a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. However, the current HPO annotations are far from completion, and only a small fraction of human protein-coding genes has HPO annotations. Thus, it is necessary to predict protein-phenotype associations using computational methods. Protein sequences can indicate the structure and function of the proteins, and interacting proteins are more likely to have same function. It is promising to integrate these features for predicting HPO annotations of human protein. We developed GraphPheno, a semi-supervised method based on graph autoencoders, which does not require feature engineering to capture deep features from protein sequences, while also taking into account the topological properties in the protein–protein interaction network to predict the relationships between human genes/proteins and abnormal phenotypes. Cross validation and independent dataset tests show that GraphPheno has satisfactory prediction performance. The algorithm is further confirmed on automatic HPO annotation for no-knowledge proteins under the benchmark of the second Critical Assessment of Functional Annotation, 2013–2014 (CAFA2), where GraphPheno surpasses most existing methods. Further bioinformatics analysis shows that predicted certain phenotype-associated genes using GraphPheno share similar biological properties with known ones. In a case study on the phenotype of abnormality of mitochondrial respiratory chain, top prioritized genes are validated by recent papers. We believe that GraphPheno will help to reveal more associations between genes and phenotypes, and contribute to the discovery of drug targets.
Collapse
|
4
|
Gliozzo J, Mesiti M, Notaro M, Petrini A, Patak A, Puertas-Gallardo A, Paccanaro A, Valentini G, Casiraghi E. Heterogeneous data integration methods for patient similarity networks. Brief Bioinform 2022; 23:6604996. [PMID: 35679533 PMCID: PMC9294435 DOI: 10.1093/bib/bbac207] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2021] [Revised: 04/14/2022] [Accepted: 05/04/2022] [Indexed: 12/29/2022] Open
Abstract
Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.
Collapse
Affiliation(s)
- Jessica Gliozzo
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,European Commission, Joint Research Centre (JRC), Ispra (VA), Italy.,CINI, Infolife National Laboratory, Roma, Italy
| | - Marco Mesiti
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
| | - Marco Notaro
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
| | - Alessandro Petrini
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
| | - Alex Patak
- European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
| | | | - Alberto Paccanaro
- Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX UK.,School of Applied Mathematics (EMAp), Fundação Getúlio Vargas, Rio de Janeiro Brazil
| | - Giorgio Valentini
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy.,DSRC UNIMI, Data Science Research Center, Milano, 20135, Italy.,ELLIS, European Laboratory for Learning and Intelligent Systems, Berlin, Germany
| | - Elena Casiraghi
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
| |
Collapse
|
5
|
Liu L, Mamitsuka H, Zhu S. HPODNets: deep graph convolutional networks for predicting human protein-phenotype associations. Bioinformatics 2022; 38:799-808. [PMID: 34672333 DOI: 10.1093/bioinformatics/btab729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2021] [Revised: 09/18/2021] [Accepted: 10/18/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Deciphering the relationship between human genes/proteins and abnormal phenotypes is of great importance in the prevention, diagnosis and treatment against diseases. The Human Phenotype Ontology (HPO) is a standardized vocabulary that describes the phenotype abnormalities encountered in human disorders. However, the current HPO annotations are still incomplete. Thus, it is necessary to computationally predict human protein-phenotype associations. In terms of current, cutting-edge computational methods for annotating proteins (such as functional annotation), three important features are (i) multiple network input, (ii) semi-supervised learning and (iii) deep graph convolutional network (GCN), whereas there are no methods with all these features for predicting HPO annotations of human protein. RESULTS We develop HPODNets with all above three features for predicting human protein-phenotype associations. HPODNets adopts a deep GCN with eight layers which allows to capture high-order topological information from multiple interaction networks. Empirical results with both cross-validation and temporal validation demonstrate that HPODNets outperforms seven competing state-of-the-art methods for protein function prediction. HPODNets with the architecture of deep GCNs is confirmed to be effective for predicting HPO annotations of human protein and, more generally, node label ranking problem with multiple biomolecular networks input in bioinformatics. AVAILABILITY AND IMPLEMENTATION https://github.com/liulizhi1996/HPODNets. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Lizhi Liu
- School of Computer Science, Fudan University, Shanghai 200433, China
| | - Hiroshi Mamitsuka
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto Prefecture 611-0011, Japan.,Department of Computer Science, Aalto University, Espoo 02150, Finland
| | - Shanfeng Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai 200433, China.,MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China.,Zhangjiang Fudan International Innovation Center, Shanghai 200433, China.,Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China.,Institute of Artificial Intelligence Biomedicine, Nanjing University, Nanjing 210032, China
| |
Collapse
|
6
|
Liu L, Zhu S. Computational Methods for Prediction of Human Protein-Phenotype Associations: A Review. PHENOMICS (CHAM, SWITZERLAND) 2021; 1:171-185. [PMID: 36939789 PMCID: PMC9590544 DOI: 10.1007/s43657-021-00019-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/23/2021] [Revised: 06/05/2021] [Accepted: 06/16/2021] [Indexed: 12/01/2022]
Abstract
Deciphering the relationship between human proteins (genes) and phenotypes is one of the fundamental tasks in phenomics research. The Human Phenotype Ontology (HPO) builds upon a standardized logical vocabulary to describe the abnormal phenotypes encountered in human diseases and paves the way towards the computational analysis of their genetic causes. To date, many computational methods have been proposed to predict the HPO annotations of proteins. In this paper, we conduct a comprehensive review of the existing approaches to predicting HPO annotations of novel proteins, identifying missing HPO annotations, and prioritizing candidate proteins with respect to a certain HPO term. For each topic, we first give the formalized description of the problem, and then systematically revisit the published literatures highlighting their advantages and disadvantages, followed by the discussion on the challenges and promising future directions. In addition, we point out several potential topics to be worthy of exploration including the selection of negative HPO annotations and detecting HPO misannotations. We believe that this review will provide insight to the researchers in the field of computational phenotype analyses in terms of comprehending and developing novel prediction algorithms.
Collapse
Affiliation(s)
- Lizhi Liu
- School of Computer Science, Fudan University, Shanghai, 200433 China
| | - Shanfeng Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433 China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, 200433 China
- MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433 China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433 China
- Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, 200433 China
| |
Collapse
|
7
|
Janyasupab P, Suratanee A, Plaimas K. Network diffusion with centrality measures to identify disease-related genes. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2021; 18:2909-2929. [PMID: 33892577 DOI: 10.3934/mbe.2021147] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Disease-related gene prioritization is one of the most well-established pharmaceutical techniques used to identify genes that are important to a biological process relevant to a disease. In identifying these essential genes, the network diffusion (ND) approach is a widely used technique applied in gene prioritization. However, there is still a large number of candidate genes that need to be evaluated experimentally. Therefore, it would be of great value to develop a new strategy to improve the precision of the prioritization. Given the efficiency and simplicity of centrality measures in capturing a gene that might be important to the network structure, herein, we propose a technique that extends the scope of ND through a centrality measure to identify new disease-related genes. Five common centrality measures with different aspects were examined for integration in the traditional ND model. A total of 40 diseases were used to test our developed approach and to find new genes that might be related to a disease. Results indicated that the best measure to combine with the diffusion is closeness centrality. The novel candidate genes identified by the model for all 40 diseases were provided along with supporting evidence. In conclusion, the integration of network centrality in ND is a simple but effective technique to discover more precise disease-related genes, which is extremely useful for biomedical science.
Collapse
Affiliation(s)
- Panisa Janyasupab
- Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, 10330, Thailand
| | - Apichat Suratanee
- Intelligent and Nonlinear Dynamic Innovations Research Center, Department of Mathematics, Faculty of Applied Science, King Mongkut's University of Technology North Bangkok, Bangkok 10800, Thailand
| | - Kitiporn Plaimas
- Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, 10330, Thailand
- Omics Science and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, 10330, Thailand
| |
Collapse
|
8
|
Marín-Llaó J, Mubeen S, Perera-Lluna A, Hofmann-Apitius M, Picart-Armada S, Domingo-Fernández D. MultiPaths: a python framework for analyzing multi-layer biological networks using diffusion algorithms. Bioinformatics 2020; 37:137-139. [PMID: 33367476 PMCID: PMC8034528 DOI: 10.1093/bioinformatics/btaa1069] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2020] [Revised: 11/23/2020] [Accepted: 12/14/2020] [Indexed: 11/13/2022] Open
Abstract
Summary High-throughput screening yields vast amounts of biological data which can be highly challenging to interpret. In response, knowledge-driven approaches emerged as possible solutions to analyze large datasets by leveraging prior knowledge of biomolecular interactions represented in the form of biological networks. Nonetheless, given their size and complexity, their manual investigation quickly becomes impractical. Thus, computational approaches, such as diffusion algorithms, are often employed to interpret and contextualize the results of high-throughput experiments. Here, we present MultiPaths, a framework consisting of two independent Python packages for network analysis. While the first package, DiffuPy, comprises numerous commonly used diffusion algorithms applicable to any generic network, the second, DiffuPath, enables the application of these algorithms on multi-layer biological networks. To facilitate its usability, the framework includes a command line interface, reproducible examples and documentation. To demonstrate the framework, we conducted several diffusion experiments on three independent multi-omics datasets over disparate networks generated from pathway databases, thus, highlighting the ability of multi-layer networks to integrate multiple modalities. Finally, the results of these experiments demonstrate how the generation of harmonized networks from disparate databases can improve predictive performance with respect to individual resources. Availability and implementation DiffuPy and DiffuPath are publicly available under the Apache License 2.0 at https://github.com/multipaths. Supplementary information Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Josep Marín-Llaó
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, 53757, Germany.,B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, 08028, Spain
| | - Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, 53757, Germany.,Fraunhofer Center for Machine Learning, Germany
| | - Alexandre Perera-Lluna
- B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, 08028, Spain
| | - Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, 53757, Germany
| | - Sergio Picart-Armada
- B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, 08028, Spain
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, 53757, Germany.,Fraunhofer Center for Machine Learning, Germany
| |
Collapse
|
9
|
Vascon S, Frasca M, Tripodi R, Valentini G, Pelillo M. Protein function prediction as a graph-transduction game. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2018.04.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]
|
10
|
Network modeling of patients' biomolecular profiles for clinical phenotype/outcome prediction. Sci Rep 2020; 10:3612. [PMID: 32107391 PMCID: PMC7046773 DOI: 10.1038/s41598-020-60235-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2019] [Accepted: 11/05/2019] [Indexed: 12/15/2022] Open
Abstract
Methods for phenotype and outcome prediction are largely based on inductive supervised models that use selected biomarkers to make predictions, without explicitly considering the functional relationships between individuals. We introduce a novel network-based approach named Patient-Net (P-Net) in which biomolecular profiles of patients are modeled in a graph-structured space that represents gene expression relationships between patients. Then a kernel-based semi-supervised transductive algorithm is applied to the graph to explore the overall topology of the graph and to predict the phenotype/clinical outcome of patients. Experimental tests involving several publicly available datasets of patients afflicted with pancreatic, breast, colon and colorectal cancer show that our proposed method is competitive with state-of-the-art supervised and semi-supervised predictive systems. Importantly, P-Net also provides interpretable models that can be easily visualized to gain clues about the relationships between patients, and to formulate hypotheses about their stratification.
Collapse
|
11
|
Biswas AK, Kim DC, Kang M, Gao JX. Robust Inductive Matrix Completion Strategy to Explore Associations Between LincRNAs and Human Disease Phenotypes. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:2066-2077. [PMID: 29994224 DOI: 10.1109/tcbb.2018.2844816] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Over the past few years, it has been established that a number of long intergenic non-coding RNAs (lincRNAs) are linked to a wide variety of human diseases. The relationship among many other lincRNAs still remains as puzzle. Validation of such link between the two entities through biological experiments is expensive. However, piles of information about the two are becoming available, thanks to the High Throughput Sequencing (HTS) platforms, Genome Wide Association Studies (GWAS), etc., thereby opening opportunity for cutting-edge machine learning and data mining approaches. However, there are only a few in silico lincRNA-disease association inference tools available to date, and none of these utilizes side information of both the entities. The recently developed Inductive Matrix Completion (IMC) technique provides a recommendation platform among two entities considering respective side information. But, the formulation of IMC is incapable of handling noise and outliers that may present in the dataset, while data sparsity consideration is another issue with the standard IMC method. Thus, a robust version of IMC is needed that can solve these two issues. As a remedy, in this paper, we propose Robust Inductive Matrix Completion (RIMC) using l2,1 norm loss function as well as l2,1 norm based regularization. We applied RIMC to the available association data between human lincRNAs and OMIM disease phenotypes as well as a diverse set of side information about the lincRNAs and the diseases. Our method performs better than the state-of-the-art methods in terms of precision@k and recall@k at the top- k disease prioritization to the subject lincRNAs. We also demonstrate that RIMC is equally effective for querying about novel lincRNAs, as well as predicting rank of a newly known disease for a set of well-characterized lincRNAs. Availability: All the supporting datasets are available at the publicly accessible URL located at http://biomecis.uta.edu/~ashis/res/RIMC/.
Collapse
|
12
|
Picart-Armada S, Barrett SJ, Willé DR, Perera-Lluna A, Gutteridge A, Dessailly BH. Benchmarking network propagation methods for disease gene identification. PLoS Comput Biol 2019; 15:e1007276. [PMID: 31479437 PMCID: PMC6743778 DOI: 10.1371/journal.pcbi.1007276] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2019] [Revised: 09/13/2019] [Accepted: 07/16/2019] [Indexed: 12/17/2022] Open
Abstract
In-silico identification of potential target genes for disease is an essential aspect of drug target discovery. Recent studies suggest that successful targets can be found through by leveraging genetic, genomic and protein interaction information. Here, we systematically tested the ability of 12 varied algorithms, based on network propagation, to identify genes that have been targeted by any drug, on gene-disease data from 22 common non-cancerous diseases in OpenTargets. We considered two biological networks, six performance metrics and compared two types of input gene-disease association scores. The impact of the design factors in performance was quantified through additive explanatory models. Standard cross-validation led to over-optimistic performance estimates due to the presence of protein complexes. In order to obtain realistic estimates, we introduced two novel protein complex-aware cross-validation schemes. When seeding biological networks with known drug targets, machine learning and diffusion-based methods found around 2-4 true targets within the top 20 suggestions. Seeding the networks with genes associated to disease by genetics decreased performance below 1 true hit on average. The use of a larger network, although noisier, improved overall performance. We conclude that diffusion-based prioritisers and machine learning applied to diffusion-based features are suited for drug discovery in practice and improve over simpler neighbour-voting methods. We also demonstrate the large impact of choosing an adequate validation strategy and the definition of seed disease genes. The use of biological network data has proven its effectiveness in many areas from computational biology. Networks consist of nodes, usually genes or proteins, and edges that connect pairs of nodes, representing information such as physical interactions, regulatory roles or co-occurrence. In order to find new candidate nodes for a given biological property, the so-called network propagation algorithms start from the set of known nodes with that property and leverage the connections from the biological network to make predictions. Here, we assess the performance of several network propagation algorithms to find sensible gene targets for 22 common non-cancerous diseases, i.e. those that have been found promising enough to start the clinical trials with any compound. We focus on obtaining performance metrics that reflect a practical scenario in drug development where only a small set of genes can be essayed. We found that the presence of protein complexes biased the performance estimates, leading to over-optimistic conclusions, and introduced two novel strategies to address it. Our results support that network propagation is still a viable approach to find drug targets, but that special care needs to be put on the validation strategy. Algorithms benefitted from the use of a larger -although noisier- network and of direct evidence data, rather than indirect genetic associations to disease.
Collapse
Affiliation(s)
- Sergio Picart-Armada
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, Spain
- Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Spain
- * E-mail:
| | | | | | - Alexandre Perera-Lluna
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, Spain
- Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Spain
| | - Alex Gutteridge
- Computational Biology and Statistics, GSK, Stevenage, United Kingdom
| | | |
Collapse
|
13
|
Frasca M, Bianchi NC. Multitask Protein Function Prediction through Task Dissimilarity. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1550-1560. [PMID: 28328509 DOI: 10.1109/tcbb.2017.2684127] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Automated protein function prediction is a challenging problem with distinctive features, such as the hierarchical organization of protein functions and the scarcity of annotated proteins for most biological functions. We propose a multitask learning algorithm addressing both issues. Unlike standard multitask algorithms, which use task (protein functions) similarity information as a bias to speed up learning, we show that dissimilarity information enforces separation of rare class labels from frequent class labels, and for this reason is better suited for solving unbalanced protein function prediction problems. We support our claim by showing that a multitask extension of the label propagation algorithm empirically works best when the task relatedness information is represented using a dissimilarity matrix as opposed to a similarity matrix. Moreover, the experimental comparison carried out on three model organism shows that our method has a more stable performance in both "protein-centric" and "function-centric" evaluation settings.
Collapse
|
14
|
Cáceres JJ, Paccanaro A. Disease gene prediction for molecularly uncharacterized diseases. PLoS Comput Biol 2019; 15:e1007078. [PMID: 31276496 PMCID: PMC6636748 DOI: 10.1371/journal.pcbi.1007078] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2018] [Revised: 07/17/2019] [Accepted: 05/09/2019] [Indexed: 02/06/2023] Open
Abstract
Network medicine approaches have been largely successful at increasing our knowledge of molecularly characterized diseases. Given a set of disease genes associated with a disease, neighbourhood-based methods and random walkers exploit the interactome allowing the prediction of further genes for that disease. In general, however, diseases with no known molecular basis constitute a challenge. Here we present a novel network approach to prioritize gene-disease associations that is able to also predict genes for diseases with no known molecular basis. Our method, which we have called Cardigan (ChARting DIsease Gene AssociatioNs), uses semi-supervised learning and exploits a measure of similarity between disease phenotypes. We evaluated its performance at predicting genes for both molecularly characterized and uncharacterized diseases in OMIM, using both weighted and binary interactomes, and compared it with state-of-the-art methods. Our tests, which use datasets collected at different points in time to replicate the dynamics of the disease gene discovery process, prove that Cardigan is able to accurately predict disease genes for molecularly uncharacterized diseases. Additionally, standard leave-one-out cross validation tests show how our approach outperforms state-of-the-art methods at predicting genes for molecularly characterized diseases by 14%-65%. Cardigan can also be used for disease module prediction, where it outperforms state-of-the-art methods by 87%-299%.
Collapse
Affiliation(s)
- Juan J. Cáceres
- Centre for Systems and Synthetic Biology & Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Alberto Paccanaro
- Centre for Systems and Synthetic Biology & Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
- * E-mail:
| |
Collapse
|
15
|
Navarro C, Martínez V, Blanco A, Cano C. ProphTools: general prioritization tools for heterogeneous biological networks. Gigascience 2018; 6:1-8. [PMID: 29186475 PMCID: PMC5751048 DOI: 10.1093/gigascience/gix111] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2017] [Accepted: 11/09/2017] [Indexed: 12/17/2022] Open
Abstract
Background Networks have been proven effective representations for the analysis of biological data. As such, there exist multiple methods to extract knowledge from biological networks. However, these approaches usually limit their scope to a single biological entity type of interest or they lack the flexibility to analyze user-defined data. Results We developed ProphTools, a flexible open-source command-line tool that performs prioritization on a heterogeneous network. ProphTools prioritization combines a Flow Propagation algorithm similar to a Random Walk with Restarts and a weighted propagation method. A flexible model for the representation of a heterogeneous network allows the user to define a prioritization problem involving an arbitrary number of entity types and their interconnections. Furthermore, ProphTools provides functionality to perform cross-validation tests, allowing users to select the best network configuration for a given problem. ProphTools core prioritization methodology has already been proven effective in gene-disease prioritization and drug repositioning. Here we make ProphTools available to the scientific community as flexible, open-source software and perform a new proof-of-concept case study on long noncoding RNAs (lncRNAs) to disease prioritization. Conclusions ProphTools is robust prioritization software that provides the flexibility not present in other state-of-the-art network analysis approaches, enabling researchers to perform prioritization tasks on any user-defined heterogeneous network. Furthermore, the application to lncRNA-disease prioritization shows that ProphTools can reach the performance levels of ad hoc prioritization tools without losing its generality.
Collapse
Affiliation(s)
- Carmen Navarro
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
| | - Victor Martínez
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
| | - Armando Blanco
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
| | - Carlos Cano
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
| |
Collapse
|
16
|
Frasca M, Grossi G, Gliozzo J, Mesiti M, Notaro M, Perlasca P, Petrini A, Valentini G. A GPU-based algorithm for fast node label learning in large and unbalanced biomolecular networks. BMC Bioinformatics 2018; 19:353. [PMID: 30367594 PMCID: PMC6191976 DOI: 10.1186/s12859-018-2301-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Several problems in network biology and medicine can be cast into a framework where entities are represented through partially labeled networks, and the aim is inferring the labels (usually binary) of the unlabeled part. Connections represent functional or genetic similarity between entities, while the labellings often are highly unbalanced, that is one class is largely under-represented: for instance in the automated protein function prediction (AFP) for most Gene Ontology terms only few proteins are annotated, or in the disease-gene prioritization problem only few genes are actually known to be involved in the etiology of a given disease. Imbalance-aware approaches to accurately predict node labels in biological networks are thereby required. Furthermore, such methods must be scalable, since input data can be large-sized as, for instance, in the context of multi-species protein networks. RESULTS We propose a novel semi-supervised parallel enhancement of COSNET, an imbalance-aware algorithm build on Hopfield neural model recently suggested to solve the AFP problem. By adopting an efficient representation of the graph and assuming a sparse network topology, we empirically show that it can be efficiently applied to networks with millions of nodes. The key strategy to speed up the computations is to partition nodes into independent sets so as to process each set in parallel by exploiting the power of GPU accelerators. This parallel technique ensures the convergence to asymptotically stable attractors, while preserving the asynchronous dynamics of the original model. Detailed experiments on real data and artificial big instances of the problem highlight scalability and efficiency of the proposed method. CONCLUSIONS By parallelizing COSNET we achieved on average a speed-up of 180x in solving the AFP problem in the S. cerevisiae, Mus musculus and Homo sapiens organisms, while lowering memory requirements. In addition, to show the potential applicability of the method to huge biomolecular networks, we predicted node labels in artificially generated sparse networks involving hundreds of thousands to millions of nodes.
Collapse
Affiliation(s)
- Marco Frasca
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| | - Giuliano Grossi
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| | - Jessica Gliozzo
- Department of Dermatology, Fondazione IRCCS Ca’ Granda,, Ospedale Maggiore Policlinico, Milan, 20122 Italy
| | - Marco Mesiti
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| | - Marco Notaro
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| | - Paolo Perlasca
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| | - Alessandro Petrini
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| | - Giorgio Valentini
- AnacletoLab - Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, Milano, 20135 Italy
| |
Collapse
|
17
|
Doğan T. HPO2GO: prediction of human phenotype ontology term associations for proteins using cross ontology annotation co-occurrences. PeerJ 2018; 6:e5298. [PMID: 30083448 PMCID: PMC6076985 DOI: 10.7717/peerj.5298] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2018] [Accepted: 07/03/2018] [Indexed: 01/24/2023] Open
Abstract
Analysing the relationships between biomolecules and the genetic diseases is a highly active area of research, where the aim is to identify the genes and their products that cause a particular disease due to functional changes originated from mutations. Biological ontologies are frequently employed in these studies, which provides researchers with extensive opportunities for knowledge discovery through computational data analysis. In this study, a novel approach is proposed for the identification of relationships between biomedical entities by automatically mapping phenotypic abnormality defining HPO terms with biomolecular function defining GO terms, where each association indicates the occurrence of the abnormality due to the loss of the biomolecular function expressed by the corresponding GO term. The proposed HPO2GO mappings were extracted by calculating the frequency of the co-annotations of the terms on the same genes/proteins, using already existing curated HPO and GO annotation sets. This was followed by the filtering of the unreliable mappings that could be observed due to chance, by statistical resampling of the co-occurrence similarity distributions. Furthermore, the biological relevance of the finalized mappings were discussed over selected cases, using the literature. The resulting HPO2GO mappings can be employed in different settings to predict and to analyse novel gene/protein—ontology term—disease relations. As an application of the proposed approach, HPO term—protein associations (i.e., HPO2protein) were predicted. In order to test the predictive performance of the method on a quantitative basis, and to compare it with the state-of-the-art, CAFA2 challenge HPO prediction target protein set was employed. The results of the benchmark indicated the potential of the proposed approach, as HPO2GO performance was among the best (Fmax = 0.35). The automated cross ontology mapping approach developed in this work may be extended to other ontologies as well, to identify unexplored relation patterns at the systemic level. The datasets, results and the source code of HPO2GO are available for download at: https://github.com/cansyl/HPO2GO.
Collapse
Affiliation(s)
- Tunca Doğan
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey.,Cancer Systems Biology Laboratory (KanSiL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey.,European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| |
Collapse
|
18
|
Picart-Armada S, Thompson WK, Buil A, Perera-Lluna A. diffuStats: an R package to compute diffusion-based scores on biological networks. Bioinformatics 2018; 34:533-534. [PMID: 29029016 PMCID: PMC5860365 DOI: 10.1093/bioinformatics/btx632] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2017] [Revised: 09/28/2017] [Accepted: 10/04/2017] [Indexed: 12/31/2022] Open
Abstract
Summary Label propagation and diffusion over biological networks are a common mathematical formalism in computational biology for giving context to molecular entities and prioritizing novel candidates in the area of study. There are several choices in conceiving the diffusion process-involving the graph kernel, the score definitions and the presence of a posterior statistical normalization-which have an impact on the results. This manuscript describes diffuStats, an R package that provides a collection of graph kernels and diffusion scores, as well as a parallel permutation analysis for the normalized scores, that eases the computation of the scores and their benchmarking for an optimal choice. Availability and implementation The R package diffuStats is publicly available in Bioconductor, https://bioconductor.org, under the GPL-3 license. Contact sergi.picart@upc.edu. Supplementary information Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Sergio Picart-Armada
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Department of Biomedical Engineering, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
| | - Wesley K Thompson
- Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Roskilde, Denmark
- Department of Family Medicine and Public Health, University of California, San Diego, La Jolla, CA, USA
| | - Alfonso Buil
- Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Roskilde, Denmark
| | - Alexandre Perera-Lluna
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Department of Biomedical Engineering, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
| |
Collapse
|
19
|
Notaro M, Schubach M, Robinson PN, Valentini G. Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods. BMC Bioinformatics 2017; 18:449. [PMID: 29025394 PMCID: PMC5639780 DOI: 10.1186/s12859-017-1854-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2017] [Accepted: 10/02/2017] [Indexed: 03/12/2023] Open
Abstract
BACKGROUND The prediction of human gene-abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene-disease associations has been widely investigated, the related problem of gene-phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. RESULTS We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a "flat" learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. CONCLUSIONS Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
Collapse
Affiliation(s)
- Marco Notaro
- Anacleto Lab - Dipartimento di Informatica, Universitá degli Studi di Milano, Via Comelico 39, Milan, 20135 Italy
| | - Max Schubach
- Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Augustenburger Platz 1, Berlin, 13353 Germany
- Berlin Institute of Health (BIH), Anna-Louisa-Karsch-Str. 2, Berlin, 10178 Germany
| | - Peter N. Robinson
- Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Augustenburger Platz 1, Berlin, 13353 Germany
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, Berlin, 14195 Germany
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr, Farmington, 06032 CT USA
- Institute for Systems Genomics, University of Connecticut, 10 Discovery Dr, Farmington, 06032 CT USA
| | - Giorgio Valentini
- Anacleto Lab - Dipartimento di Informatica, Universitá degli Studi di Milano, Via Comelico 39, Milan, 20135 Italy
| |
Collapse
|
20
|
Frasca M. Gene2DisCo: Gene to disease using disease commonalities. Artif Intell Med 2017; 82:34-46. [DOI: 10.1016/j.artmed.2017.08.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2017] [Revised: 07/24/2017] [Accepted: 08/13/2017] [Indexed: 01/10/2023]
|
21
|
Cha Y, Erez T, Reynolds IJ, Kumar D, Ross J, Koytiger G, Kusko R, Zeskind B, Risso S, Kagan E, Papapetropoulos S, Grossman I, Laifenfeld D. Drug repurposing from the perspective of pharmaceutical companies. Br J Pharmacol 2017; 175:168-180. [PMID: 28369768 DOI: 10.1111/bph.13798] [Citation(s) in RCA: 247] [Impact Index Per Article: 30.9] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2017] [Revised: 03/06/2017] [Accepted: 03/08/2017] [Indexed: 02/01/2023] Open
Abstract
Drug repurposing holds the potential to bring medications with known safety profiles to new patient populations. Numerous examples exist for the identification of new indications for existing molecules, most stemming from serendipitous findings or focused recent efforts specifically limited to the mode of action of a specific drug. In recent years, the need for new approaches to drug research and development, combined with the advent of big data repositories and associated analytical methods, has generated interest in developing systematic approaches to drug repurposing. A variety of innovative computational methods to enable systematic repurposing screens, experimental as well as through in silico approaches, have emerged. An efficient drug repurposing pipeline requires the combination of access to molecular data, appropriate analytical expertise to enable robust insights, expertise and experimental set-up for validation and clinical development know-how. In this review, we describe some of the main approaches to systematic repurposing and discuss the various players in this field and the need for strategic collaborations to increase the likelihood of success in bringing existing molecules to new indications, as well as the current advantages, considerations and challenges in repurposing as a drug development strategy pursued by pharmaceutical companies. LINKED ARTICLES This article is part of a themed section on Inventing New Therapies Without Reinventing the Wheel: The Power of Drug Repurposing. To view the other articles in this section visit http://onlinelibrary.wiley.com/doi/10.1111/bph.v175.2/issuetoc.
Collapse
Affiliation(s)
- Y Cha
- Immuneering Corporation, Cambridge, MA, USA
| | - T Erez
- Global Research and Development, Teva Pharmaceutical Industries, Netanya, Israel
| | - I J Reynolds
- Global Research and Development, Teva Pharmaceutical Industries, West Chester, PA, USA
| | - D Kumar
- Immuneering Corporation, Cambridge, MA, USA
| | - J Ross
- Immuneering Corporation, Cambridge, MA, USA
| | - G Koytiger
- Immuneering Corporation, Cambridge, MA, USA
| | - R Kusko
- Immuneering Corporation, Cambridge, MA, USA
| | - B Zeskind
- Immuneering Corporation, Cambridge, MA, USA
| | - S Risso
- Global Research and Development, Teva Pharmaceutical Industries, West Chester, PA, USA
| | - E Kagan
- Global Research and Development, Teva Pharmaceutical Industries, Netanya, Israel
| | - S Papapetropoulos
- Global Research and Development, Teva Pharmaceutical Industries, Frazer, PA, USA
| | - I Grossman
- Global Research and Development, Teva Pharmaceutical Industries, Netanya, Israel
| | - D Laifenfeld
- Global Research and Development, Teva Pharmaceutical Industries, Netanya, Israel
| |
Collapse
|