1
Huusari R, Wang T, Szedmak S, Dias D, Aittokallio T, Rousu J. Scaling up drug combination surface prediction. Brief Bioinform 2025; 26:bbaf099. [PMID: 40079263 PMCID: PMC11904408 DOI: 10.1093/bib/bbaf099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2024] [Revised: 02/19/2025] [Accepted: 02/24/2025] [Indexed: 03/15/2025]
Abstract
Drug combinations are required to treat advanced cancers and other complex diseases. Compared with monotherapy, combination treatments can enhance efficacy and reduce toxicity by lowering the doses of single drugs, and synergistic combinations are of particular interest. Since drug combination screening experiments are costly and time-consuming, reliable machine learning models are needed for prioritizing potential combinations for further studies. Most of the current machine learning models are based on scalar-valued approaches, which predict individual response values or synergy scores for drug combinations. We take a functional output prediction approach, in which full, continuous dose-response combination surfaces are predicted for each drug combination on the cell lines. We investigate the predictive power of the recently proposed comboKR method, which is based on a powerful input-output kernel regression technique and functional modeling of the response surface. In this work, we develop a scaled-up formulation of the comboKR, which also implements improved modeling choices: we (1) incorporate new modeling choices for the output drug combination response surfaces into the comboKR framework, and (2) propose a projected gradient descent method to solve the challenging pre-image problem that is traditionally solved with simple candidate set approaches. We provide thorough experimental analysis of comboKR 2.0 with three real-world datasets within various challenging experimental settings, including cases where drugs or cell lines have not been encountered in the training data. Our comparison with synergy score prediction methods further highlights the relevance of dose-response prediction approaches, instead of relying on simple scoring methods.
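The abstract above mentions projected gradient descent for the pre-image problem. The following is only a generic sketch of that optimization scheme, not the comboKR 2.0 implementation: the function name `projected_gradient_descent`, the quadratic objective, and the box constraint [0, 1] are all illustrative assumptions, since the abstract does not specify the objective or the feasible set.

```python
import numpy as np

def projected_gradient_descent(grad, project, x0, step=0.1, n_iter=200):
    """Generic projected gradient descent (hypothetical helper).

    grad    -- callable returning the gradient of the objective at x
    project -- callable mapping any point back onto the feasible set
    """
    x = project(np.asarray(x0, dtype=float))
    for _ in range(n_iter):
        # Take a plain gradient step, then project back onto the feasible set.
        x = project(x - step * grad(x))
    return x

# Toy problem (an assumption for illustration): minimise ||x - target||^2
# subject to 0 <= x <= 1 in every coordinate.
target = np.array([1.5, -0.3, 0.4])
grad = lambda x: 2.0 * (x - target)
project = lambda x: np.clip(x, 0.0, 1.0)

x_star = projected_gradient_descent(grad, project, np.zeros(3))
```

For this toy objective the constrained minimiser is simply the target clipped to the box, which makes the sketch easy to check by hand.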
Affiliation(s)
- Riikka Huusari
  - Department of Computer Science, Aalto University, Otakaari 1B, FI-00076 Espoo, Finland
- Tianduanyi Wang
  - Department of Computer Science, Aalto University, Otakaari 1B, FI-00076 Espoo, Finland
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Tukholmankatu 8, FI-00270 Helsinki, Finland
- Sandor Szedmak
  - Department of Computer Science, Aalto University, Otakaari 1B, FI-00076 Espoo, Finland
- Diogo Dias
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Tukholmankatu 8, FI-00270 Helsinki, Finland
  - Hematology Research Unit, University of Helsinki and Helsinki University Hospital, Haartmaninkatu 8, FI-00290 Helsinki, Finland
  - Translational Immunology Research Program, University of Helsinki, Haartmaninkatu 8, FI-00290 Helsinki, Finland
- Tero Aittokallio
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Tukholmankatu 8, FI-00270 Helsinki, Finland
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Ullernchausseen 70, N-0379 Oslo, Norway
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Sognsvannsveien 9, N-0372 Oslo, Norway
- Juho Rousu
  - Department of Computer Science, Aalto University, Otakaari 1B, FI-00076 Espoo, Finland
2
Liu B, Tsoumakas G. Integrating Similarities via Local Interaction Consistency and Optimizing Area Under the Curve Measures via Matrix Factorization for Drug-Target Interaction Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:2212-2225. [PMID: 39226198 DOI: 10.1109/tcbb.2024.3453499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
In drug discovery, identifying drug-target interactions (DTIs) via experimental approaches is a tedious and expensive procedure. Computational methods efficiently predict DTIs and recommend a small part of potential interacting pairs for further experimental confirmation, accelerating the drug discovery process. Although fusing heterogeneous drug and target similarities can improve the prediction ability, the existing similarity combination methods ignore the interaction consistency for neighbour entities. Furthermore, area under the precision-recall curve (AUPR) and area under the receiver operating characteristic curve (AUC) are two widely used evaluation metrics in DTI prediction. However, the two metrics are seldom considered as losses within existing DTI prediction methods. We propose a local interaction consistency (LIC) aware similarity integration method to fuse vital information from diverse views for DTI prediction models. Furthermore, we propose two matrix factorization (MF) methods that optimize AUPR and AUC using convex surrogate losses respectively, and then develop an ensemble MF approach that takes advantage of the two area under the curve metrics by combining the two single metric based MF models. Experimental results under different prediction settings show that the proposed methods outperform various competitors in terms of the metric(s) they optimize and are reliable in discovering potential new DTIs.
3
Huang J, Sun C, Li M, Tang R, Xie B, Wang S, Wei JM. Structure-inclusive similarity based directed GNN: a method that can control information flow to predict drug-target binding affinity. Bioinformatics 2024; 40:btae563. [PMID: 39292540 PMCID: PMC11474107 DOI: 10.1093/bioinformatics/btae563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/21/2024] [Accepted: 09/17/2024] [Indexed: 09/20/2024]
Abstract
MOTIVATION Exploring the association between drugs and targets is essential for drug discovery and repurposing. Compared with traditional methods that treat this exploration as a binary classification task, predicting the drug-target binding affinity can provide more specific information. Many studies are based on the assumption that similar drugs may interact with the same target. These methods construct a symmetric graph according to the undirected drug similarity or target similarity. Although these similarities can measure the difference between two molecules, they are unable to capture the inclusion relationship of their substructures. For example, if drug A contains all the substructures of drug B, then in the message-passing mechanism of the graph neural network, drug A should acquire all the properties of drug B, while drug B should only obtain some of the properties of A. RESULTS To this end, we proposed a structure-inclusive similarity (SIS) which measures the similarity of two drugs by considering the inclusion relationship of their substructures. Based on SIS, we constructed a drug graph and a target graph, respectively, and predicted the binding affinities between drugs and targets by a graph convolutional network-based model. Experimental results show that considering the inclusion relationship of the substructures of two molecules can effectively improve the accuracy of the prediction model. Our SIS-based prediction method outperforms several state-of-the-art methods for drug-target binding affinity prediction. The case studies demonstrate that our model is a practical tool to predict the binding affinity between drugs and targets. AVAILABILITY AND IMPLEMENTATION Source codes and data are available at https://github.com/HuangStomach/SISDTA.
Affiliation(s)
- Jipeng Huang
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Chang Sun
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Minglei Li
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Rong Tang
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Bin Xie
  - College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China
- Shuqin Wang
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin, Xi Qing District 300387, China
- Jin-Mao Wei
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
4
Guichaoua G, Pinel P, Hoffmann B, Azencott CA, Stoven V. Drug-Target Interactions Prediction at Scale: The Komet Algorithm with the LCIdb Dataset. J Chem Inf Model 2024; 64:6938-6956. [PMID: 39237105 PMCID: PMC11423346 DOI: 10.1021/acs.jcim.4c00422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2024]
Abstract
Drug-target interactions (DTIs) prediction algorithms are used at various stages of the drug discovery process. In this context, specific problems such as deorphanization of a new therapeutic target or target identification of a drug candidate arising from phenotypic screens require large-scale predictions across the protein and molecule spaces. DTI prediction heavily relies on supervised learning algorithms that use known DTIs to learn associations between molecule and protein features, allowing for the prediction of new interactions based on learned patterns. The algorithms must be broadly applicable to enable reliable predictions, even in regions of the protein or molecule spaces where data may be scarce. In this paper, we address two key challenges to fulfill these goals: building large, high-quality training datasets and designing prediction methods that can scale, in order to be trained on such large datasets. First, we introduce LCIdb, a curated, large-sized dataset of DTIs, offering extensive coverage of both the molecule and druggable protein spaces. Notably, LCIdb contains a much higher number of molecules than publicly available benchmarks, expanding coverage of the molecule space. Second, we propose Komet (Kronecker Optimized METhod), a DTI prediction pipeline designed for scalability without compromising performance. Komet leverages a three-step framework, incorporating efficient computation choices tailored for large datasets and involving the Nyström approximation. Specifically, Komet employs a Kronecker interaction module for (molecule, protein) pairs, which efficiently captures determinants in DTIs, and whose structure allows for reduced computational complexity and quasi-Newton optimization, ensuring that the model can handle large training sets, without compromising on performance. Our method is implemented in open-source software, leveraging GPU parallel computation for efficiency. 
We demonstrate the usefulness of our pipeline on various datasets, showing that Komet displays superior scalability and prediction performance compared to state-of-the-art deep learning approaches. Additionally, we illustrate the generalization properties of Komet by showing its performance on an external dataset, and on the publicly available LH benchmark designed for scaffold hopping problems. Komet is available open source at https://komet.readthedocs.io and all datasets, including LCIdb, can be found at https://zenodo.org/records/10731712.
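The abstract above names the Nyström approximation as one of the ingredients that lets the pipeline scale. The following is a minimal, generic sketch of Nyström features for an RBF kernel, not Komet's actual code: the helper names `rbf_kernel` and `nystrom_features`, the kernel choice, and the `gamma` and `eps` parameters are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, landmarks, gamma=0.5, eps=1e-10):
    """Explicit features Phi with Phi @ Phi.T ~= K(X, X), built from a few landmarks."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)   # small landmark-landmark block
    K_nm = rbf_kernel(X, landmarks, gamma)           # data-landmark block
    w, U = np.linalg.eigh(K_mm)                      # K_mm = U diag(w) U^T
    keep = w > eps                                   # drop numerically null directions
    return K_nm @ (U[:, keep] / np.sqrt(w[keep]))    # K_nm K_mm^{-1/2}

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))

# With every point used as a landmark the approximation reproduces the kernel;
# with fewer landmarks it yields a lower-dimensional approximation.
Phi_full = nystrom_features(X, X)
Phi_small = nystrom_features(X, X[:4])
```

Working with the explicit features replaces an n-by-n kernel matrix by an n-by-r feature matrix, where r is the number of landmarks, which is the usual motivation for the approximation on large datasets.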
Affiliation(s)
- Gwenn Guichaoua
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
- Philippe Pinel
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
  - Iktos SAS, 75017 Paris, France
- Chloé-Agathe Azencott
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
- Véronique Stoven
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
5
Huang D, Xie J. EMPDTA: An End-to-End Multimodal Representation Learning Framework with Pocket Online Detection for Drug-Target Affinity Prediction. Molecules 2024; 29:2912. [PMID: 38930976 PMCID: PMC11206982 DOI: 10.3390/molecules29122912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 06/15/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024]
Abstract
Accurately predicting drug-target interactions is a critical yet challenging task in drug discovery. Traditionally, pocket detection and drug-target affinity prediction have been treated as separate aspects of drug-target interaction, with few methods combining these tasks within a unified deep learning system to accelerate drug development. In this study, we propose EMPDTA, an end-to-end framework that integrates protein pocket prediction and drug-target affinity prediction to provide a comprehensive understanding of drug-target interactions. The EMPDTA framework consists of three main modules: pocket online detection, multimodal representation learning for affinity prediction, and multi-task joint training. The performance and potential of the proposed framework have been validated across diverse benchmark datasets, achieving robust results in both tasks. Furthermore, the visualization results of the predicted pockets demonstrate accurate pocket detection, confirming the effectiveness of our framework.
Affiliation(s)
- Jiang Xie
  - School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
6
Viljanen M, Airola A, Pahikkala T. Generalized vec trick for fast learning of pairwise kernel models. Mach Learn 2022. [DOI: 10.1007/s10994-021-06127-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. In this work, we present a comprehensive review of pairwise kernels, that have been proposed for incorporating prior knowledge about the relationship between the objects. Specifically, we consider the standard, symmetric and anti-symmetric Kronecker product kernels, metric-learning, Cartesian, ranking, as well as linear, polynomial and Gaussian kernels. Recently, a $O(nm+nq)$ time generalized vec trick algorithm, where $n$, $m$, and $q$ denote the number of pairs, drugs and targets, was introduced for training kernel methods with the Kronecker product kernel. This was a significant improvement over previous $O(n^2)$ training methods, since in most real-world applications $m,q \ll n$. In this work we show how all the reviewed kernels can be expressed as sums of Kronecker products, allowing the use of generalized vec trick for speeding up their computation. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and provide an extensive comparison of the kernels on a number of biological interaction prediction tasks.
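To make the complexity claim in the abstract above concrete, the following sketch multiplies a vector by the Kronecker product pairwise kernel matrix without ever forming the n-by-n matrix. It is an illustrative reconstruction of the idea under stated assumptions, not the authors' implementation: the function name `pairwise_kron_matvec` and the argument layout (a drug kernel `D`, a target kernel `T`, and index arrays giving the drug and target of each pair) are hypothetical.

```python
import numpy as np

def pairwise_kron_matvec(D, T, drug_idx, target_idx, v):
    """Compute u = K v where K[i, j] = D[d_i, d_j] * T[t_i, t_j].

    D: (m, m) drug kernel, T: (q, q) target kernel,
    drug_idx, target_idx: length-n integer arrays describing the n pairs.
    Runs in O(nm + nq) time instead of the O(n^2) of the explicit product.
    """
    m, q = D.shape[0], T.shape[0]
    # Step 1, O(nm): Bt[t, a] = sum over pairs j with t_j == t of D[a, d_j] * v_j.
    Bt = np.zeros((q, m))
    np.add.at(Bt, target_idx, (D[:, drug_idx] * v).T)
    # Step 2, O(nq): u_i = sum over t of T[t_i, t] * Bt[t, d_i].
    return np.einsum("ij,ji->i", T[target_idx], Bt[:, drug_idx])

# Small random problem to compare against the explicit pairwise kernel matrix.
rng = np.random.default_rng(0)
m, q, n = 5, 4, 12
A = rng.normal(size=(m, m)); D = A @ A.T          # symmetric PSD drug kernel
B = rng.normal(size=(q, q)); T = B @ B.T          # symmetric PSD target kernel
drug_idx = rng.integers(0, m, size=n)
target_idx = rng.integers(0, q, size=n)
v = rng.normal(size=n)

u_fast = pairwise_kron_matvec(D, T, drug_idx, target_idx, v)
K_explicit = D[np.ix_(drug_idx, drug_idx)] * T[np.ix_(target_idx, target_idx)]
```

A fast matrix-vector product of this kind is what iterative solvers need, so the full pairwise kernel matrix never has to be stored.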
7
Rifaioglu AS, Cetin Atalay R, Cansen Kahraman D, Doğan T, Martin M, Atalay V. MDeePred: novel multi-channel protein featurization for deep learning-based binding affinity prediction in drug discovery. Bioinformatics 2021; 37:693-704. [PMID: 33067636 DOI: 10.1093/bioinformatics/btaa858] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 05/02/2020] [Revised: 08/16/2020] [Accepted: 10/06/2020] [Indexed: 12/20/2022]
Abstract
MOTIVATION Identification of interactions between bioactive small molecules and target proteins is crucial for novel drug discovery, drug repurposing and uncovering off-target effects. Due to the tremendous size of the chemical space, experimental bioactivity screening efforts require the aid of computational approaches. Although deep learning models have been successful in predicting bioactive compounds, effective and comprehensive featurization of proteins, to be given as input to deep neural networks, remains a challenge. RESULTS Here, we present a novel protein featurization approach to be used in deep learning-based compound-target protein binding affinity prediction. In the proposed method, multiple types of protein features such as sequence, structural, evolutionary and physicochemical properties are incorporated within multiple 2D vectors, which are then fed to state-of-the-art pairwise input hybrid deep neural networks to predict the real-valued compound-target protein interactions. The method adopts the proteochemometric approach, where both the compound and target protein features are used at the input level to model their interaction. The whole system is called MDeePred and it is a new method to be used for the purposes of computational drug discovery and repositioning. We evaluated MDeePred on well-known benchmark datasets and compared its performance with the state-of-the-art methods. We also performed in vitro comparative analysis of MDeePred predictions with selected kinase inhibitors' action on cancer cells. MDeePred is a scalable method with sufficiently high predictive performance. The featurization approach proposed here can also be utilized for other protein-related predictive tasks. AVAILABILITY AND IMPLEMENTATION The source code, datasets, additional information and user instructions of MDeePred are available at https://github.com/cansyl/MDeePred. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- A S Rifaioglu
  - Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
  - Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey
- R Cetin Atalay
  - Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
  - Section of Pulmonary and Critical Care Medicine, The University of Chicago, Chicago, IL, USA
- D Cansen Kahraman
  - Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- T Doğan
  - Department of Computer Engineering, Hacettepe University, Ankara, Turkey
  - Institute of Informatics, Hacettepe University, Ankara, Turkey
- M Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, Hinxton, UK
- V Atalay
  - Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
8
Cichońska A, Ravikumar B, Allaway RJ, Wan F, Park S, Isayev O, Li S, Mason M, Lamb A, Tanoli Z, Jeon M, Kim S, Popova M, Capuzzi S, Zeng J, Dang K, Koytiger G, Kang J, Wells CI, Willson TM, Oprea TI, Schlessinger A, Drewry DH, Stolovitzky G, Wennerberg K, Guinney J, Aittokallio T. Crowdsourced mapping of unexplored target space of kinase inhibitors. Nat Commun 2021; 12:3307. [PMID: 34083538 PMCID: PMC8175708 DOI: 10.1038/s41467-021-23165-1] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 06/20/2020] [Accepted: 04/15/2021] [Indexed: 12/31/2022]
Abstract
Despite decades of intensive search for compounds that modulate the activity of particular protein targets, a large proportion of the human kinome remains as yet undrugged. Effective approaches are therefore required to map the massive space of unexplored compound-kinase interactions for novel and potent activities. Here, we carry out a crowdsourced benchmarking of predictive algorithms for kinase inhibitor potencies across multiple kinase families tested on unpublished bioactivity data. We find the top-performing predictions are based on various models, including kernel learning, gradient boosting and deep learning, and their ensemble leads to a predictive accuracy exceeding that of single-dose kinase activity assays. We design experiments based on the model predictions and identify unexpected activities even for under-studied kinases, thereby accelerating experimental mapping efforts. The open-source prediction algorithms together with the bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking prediction algorithms and for extending the druggable kinome.
Affiliation(s)
- Anna Cichońska
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
  - Department of Computer Science, Helsinki Institute for Information Technology (HIIT), Aalto University, Espoo, Finland
  - Department of Computing, University of Turku, Turku, Finland
- Balaguru Ravikumar
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Fangping Wan
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Sungjoon Park
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA, USA
- Shuya Li
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Michael Mason
  - Computational Oncology, Sage Bionetworks, Seattle, WA, USA
- Andrew Lamb
  - Computational Oncology, Sage Bionetworks, Seattle, WA, USA
- Ziaurrehman Tanoli
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Minji Jeon
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Sunkyu Kim
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Mariya Popova
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA, USA
- Stephen Capuzzi
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Jianyang Zeng
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Kristen Dang
  - Computational Oncology, Sage Bionetworks, Seattle, WA, USA
- Jaewoo Kang
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Carrow I Wells
  - Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Timothy M Willson
  - Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Tudor I Oprea
  - Translational Informatics Division and Comprehensive Cancer Center, University of New Mexico School of Medicine, Albuquerque, NM, USA
- Avner Schlessinger
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- David H Drewry
  - Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Krister Wennerberg
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
- Justin Guinney
  - Computational Oncology, Sage Bionetworks, Seattle, WA, USA
- Tero Aittokallio
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
  - Department of Computer Science, Helsinki Institute for Information Technology (HIIT), Aalto University, Espoo, Finland
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
  - Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), University of Oslo, Oslo, Norway
9
Zhang C, Cheng J, Tian Q. Multiview Semantic Representation for Visual Recognition. IEEE Transactions on Cybernetics 2020; 50:2038-2049. [PMID: 30418893 DOI: 10.1109/tcyb.2018.2875728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]
Abstract
Due to interclass and intraclass variations, the images of different classes are often cluttered, which makes efficient classification hard. The use of discriminative classification algorithms helps to alleviate this problem. However, it is still an open problem to accurately model the relationships between visual representations and human perception. To alleviate these problems, in this paper, we propose a novel multiview semantic representation (MVSR) algorithm for efficient visual recognition. First, we leverage visually based methods to get initial image representations. We then use both visual and semantic similarities to divide images into groups which are then used for semantic representations. We treat different image representation strategies, partition methods, and numbers as different views. A graph is then used to combine the discriminative power of different views. The similarities between images can be obtained by measuring the similarities of graphs. Finally, we train classifiers to predict the categories of images. We evaluate the discriminative power of the proposed MVSR method for visual recognition on several public image datasets. Experimental results show the effectiveness of the proposed method.
10
Li S, Wan F, Shu H, Jiang T, Zhao D, Zeng J. MONN: A Multi-objective Neural Network for Predicting Compound-Protein Interactions and Affinities. Cell Syst 2020. [DOI: 10.1016/j.cels.2020.03.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 12/11/2022]
11
Ak Ç, Ergönül Ö, Şencan İ, Torunoğlu MA, Gönen M. Spatiotemporal prediction of infectious diseases using structured Gaussian processes with application to Crimean-Congo hemorrhagic fever. PLoS Negl Trop Dis 2018; 12:e0006737. [PMID: 30118497 PMCID: PMC6114917 DOI: 10.1371/journal.pntd.0006737] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/23/2018] [Revised: 08/29/2018] [Accepted: 08/07/2018] [Indexed: 11/18/2022]
Abstract
Background Infectious diseases are one of the primary healthcare problems worldwide, leading to millions of deaths annually. To develop effective control and prevention strategies, we need reliable computational tools to understand disease dynamics and to predict future cases. These computational tools can be used by policy makers to make more informed decisions. Methodology/Principal findings In this study, we developed a computational framework based on Gaussian processes to perform spatiotemporal prediction of infectious diseases and exploited the special structure of similarity matrices in our formulation to obtain a very efficient implementation. We then tested our framework on the problem of modeling Crimean–Congo hemorrhagic fever cases between years 2004 and 2015 in Turkey. Conclusions/Significance We showed that our Gaussian process formulation obtained better results than two frequently used standard machine learning algorithms (i.e., random forests and boosted regression trees) under temporal, spatial, and spatiotemporal prediction scenarios. These results showed that our framework has the potential to make an important contribution to public health policy makers. Infectious diseases cause important health problems worldwide and create difficult challenges for public health policy makers. That is why they need reliable computational tools to better understand disease and to predict case counts. They will benefit from such computational tools to make more informed decisions in developing control and prevention strategies. We formulated a computational framework that can be used to model spatial, temporal, or spatiotemporal dynamics of infectious diseases. We showed the utility of our framework on the problem of modeling Crimean–Congo hemorrhagic fever in Turkey.
Affiliation(s)
- Çiğdem Ak
  - Graduate School of Sciences and Engineering, Koç University, İstanbul, Turkey
- Önder Ergönül
  - Department of Infectious Diseases and Clinical Microbiology, School of Medicine, Koç University, İstanbul, Turkey
- İrfan Şencan
  - Public Health Directorate, Ministry of Health, Ankara, Turkey
- Mehmet Gönen
  - Department of Industrial Engineering, College of Engineering, Koç University, İstanbul, Turkey
  - School of Medicine, Koç University, İstanbul, Turkey
12
Cichonska A, Pahikkala T, Szedmak S, Julkunen H, Airola A, Heinonen M, Aittokallio T, Rousu J. Learning with multiple pairwise kernels for drug bioactivity prediction. Bioinformatics 2018; 34:i509-i518. [PMID: 29949975 PMCID: PMC6022556 DOI: 10.1093/bioinformatics/bty277] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 11/17/2022]
Abstract
Motivation Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially multiple kernel learning (MKL) offers promising benefits as it enables integrating various types of complex biomedical information sources in the form of kernels, along with learning their importance for the prediction task. However, the immense size of pairwise kernel spaces remains a major bottleneck, making the existing MKL algorithms computationally infeasible even for a small number of input pairs. Results We introduce pairwiseMKL, the first method for time- and memory-efficient learning with multiple pairwise kernels. pairwiseMKL first determines the mixture weights of the input pairwise kernels, and then learns the pairwise prediction function. Both steps are performed efficiently without explicit computation of the massive pairwise matrices, therefore making the method applicable to solving large pairwise learning problems. We demonstrate the performance of pairwiseMKL in two related tasks of quantitative drug bioactivity prediction using up to 167 995 bioactivity measurements and 3120 pairwise kernels: (i) prediction of anticancer efficacy of drug compounds across a large panel of cancer cell lines; and (ii) prediction of target profiles of anticancer compounds across their kinome-wide target spaces. We show that pairwiseMKL provides accurate predictions using sparse solutions in terms of selected kernels, and therefore it also automatically identifies data sources relevant for the prediction problem. Availability and implementation Code is available at https://github.com/aalto-ics-kepaco. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anna Cichonska
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
  - Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland
- Tapio Pahikkala
  - Department of Information Technology, University of Turku, Turku, Finland
- Sandor Szedmak
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
- Heli Julkunen
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
- Antti Airola
  - Department of Information Technology, University of Turku, Turku, Finland
- Markus Heinonen
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
- Tero Aittokallio
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
  - Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
- Juho Rousu
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland