1
Li P, Hua L, Ma Z, Hu W, Liu Y, Zhu J. Conformalized Graph Learning for Molecular ADMET Property Prediction and Reliable Uncertainty Quantification. J Chem Inf Model 2024; 64:8705-8717. [PMID: 39571080 DOI: 10.1021/acs.jcim.4c01139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2024]
Abstract
Drug discovery and development is a complex and costly process, with a substantial portion of the expense dedicated to characterizing the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of new drug candidates. While the advent of deep learning and molecular graph neural networks (GNNs) has significantly enhanced in silico ADMET prediction capabilities, reliably quantifying prediction uncertainty remains a critical challenge. The performance of GNNs is influenced by both the volume and the quality of the data. Hence, determining the reliability and extent of a prediction is as crucial as achieving accurate predictions, especially for out-of-domain (OoD) compounds. This paper introduces a novel GNN model called conformalized fusion regression (CFR). CFR combines a GNN model with a joint mean-quantile regression loss and an ensemble-based conformal prediction (CP) method. Through rigorous evaluation across various ADMET tasks, we demonstrate that our framework provides accurate predictions, reliable probability calibration, and high-quality prediction intervals, outperforming existing uncertainty quantification methods.
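The two ingredients named in this abstract, a joint mean-quantile loss and conformal calibration of the resulting intervals, can be illustrated with a minimal NumPy sketch. This is not the authors' CFR implementation: it omits the GNN and replaces the ensemble-based CP step with plain split conformal calibration of quantile intervals. The function names, the `weight` parameter, and the choice of mean squared error for the mean term are assumptions made for illustration.

```python
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Mean pinball (quantile) loss of predictions q_pred at quantile level tau."""
    diff = y - q_pred
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def joint_mean_quantile_loss(y, mean_pred, lo_pred, hi_pred, alpha=0.1, weight=1.0):
    """Squared error on the mean head plus pinball losses on two quantile heads.

    The additive form and `weight` are illustrative assumptions.
    """
    mse = float(np.mean((y - mean_pred) ** 2))
    quantile_term = (pinball_loss(y, lo_pred, alpha / 2)
                     + pinball_loss(y, hi_pred, 1 - alpha / 2))
    return mse + weight * quantile_term

def conformal_offset(y_cal, lo_cal, hi_cal, alpha=0.1):
    """Split conformal correction for quantile intervals on a calibration set.

    Returns the offset to subtract from the lower bound and add to the upper
    bound so that intervals reach roughly 1 - alpha marginal coverage.
    """
    # Nonconformity score: how far y falls outside [lo, hi] (negative if inside).
    scores = np.sort(np.maximum(lo_cal - y_cal, y_cal - hi_cal))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))  # 1-based rank of the score to use
    return float(scores[k - 1]) if k <= n else float("inf")
```

At prediction time the calibrated interval for a new compound would be `[lo_pred - offset, hi_pred + offset]`, where `offset` comes from held-out calibration data that was not used for training.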
Affiliation(s)
- Peiyao Li
  - Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  - Molecular Science, BeiGene (Beijing) Inc., Beijing 102206, China
- Lan Hua
  - Molecular Science, BeiGene (Beijing) Inc., Beijing 102206, China
- Zhechao Ma
  - Department of Computer Science and Technology, Hefei University of Technology, Hefei 230009, China
- Wenbo Hu
  - Department of Computer Science and Technology, Hefei University of Technology, Hefei 230009, China
- Ye Liu
  - Molecular Science, BeiGene (Beijing) Inc., Beijing 102206, China
- Jun Zhu
  - Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2
Jin YW, Hu P, Liu Q. NNICE: a deep quantile neural network algorithm for expression deconvolution. Sci Rep 2024; 14:14040. [PMID: 38890415 PMCID: PMC11189483 DOI: 10.1038/s41598-024-65053-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 06/17/2024] [Indexed: 06/20/2024] Open
Abstract
Cell-type composition is a key indicator of health. Advancements in bulk gene expression data curation, single-cell RNA-sequencing technologies, and computational deconvolution approaches offer a new perspective for learning about the composition of different cell types quickly and affordably. In this study, we developed a quantile regression and deep learning-based method called Neural Network Immune Contexture Estimator (NNICE) to estimate cell type abundance and its uncertainty by automatically deconvolving bulk RNA-seq data. The proposed NNICE model successfully recovered ground-truth cell type fraction values given unseen bulk mixture gene expression profiles from the same dataset it was trained on. Compared with baseline methods, NNICE achieved better performance in deconvolving both pseudo-bulk gene expression data (Pearson correlation R = 0.9) and real bulk gene expression data (Pearson correlation R = 0.9) across all cell types. In conclusion, NNICE combines statistical inference with deep learning to provide accurate and interpretable cell type deconvolution from bulk gene expression.
Affiliation(s)
- Yong Won Jin
  - Department of Biochemistry & Medical Genetics, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, R3E 0J9, Canada
- Pingzhao Hu
  - Department of Biochemistry & Medical Genetics, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, R3E 0J9, Canada
  - Department of Biochemistry, Schulich School of Medicine & Dentistry, Western University, London, ON, N6A 5C1, Canada
- Qian Liu
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, R3B 2E9, Canada
3
Wang W, Balsalobre-Lorente D, Anwar A, Adebayo TS, Cong PT, Quynh NN, Nguyen MQ. Shaping a greener future: The role of geopolitical risk, renewable energy and financial development on environmental sustainability using the LCC hypothesis. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 357:120708. [PMID: 38552512 DOI: 10.1016/j.jenvman.2024.120708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 02/02/2024] [Accepted: 03/19/2024] [Indexed: 04/14/2024]
Abstract
The recent progress report of Sustainable Development Goals (SDG) 2023 highlighted the extreme reactions of environmental degradation. This report also shows that the current efforts for achieving environmental sustainability (SDG 13) are inadequate and that a comprehensive policy agenda is needed. Although the present literature has highlighted several determinants of environmental degradation, the influence of geopolitical risk on environmental quality (EQ) is relatively ignored. To fill this research gap and propose an inclusive policy structure for achieving the sustainable development goals, this study is the earliest attempt to delve into the effects of geopolitical risk (GPR), financial development (FD), and renewable energy consumption (REC) on load capacity factor (LCF) under the framework of the load capacity curve (LCC) hypothesis for selected Asian countries during 1990-2020. In this regard, we use several preliminary sensitivity tests to check the features and reliability of the dataset. We then use panel quantile regression to investigate long-run relationships. The empirical results affirm the existence of the LCC hypothesis in the selected Asian countries. Our findings also show that geopolitical risk reduces environmental quality, whereas financial development and REC increase environmental quality. Drawing from the empirical findings, this study suggests a holistic policy approach for achieving the targets of SDG 13 (climate change).
Affiliation(s)
- Wenjun Wang
  - International Business School of Shaanxi Normal University, Xi'an, China
- Daniel Balsalobre-Lorente
  - Department of Applied Economics I, University of Castilla-La Mancha, Spain
  - Department of Management and Marketing, Czech University of Life Sciences Prague, Faculty of Economics and Management, Prague, Czech Republic
  - UNEC Research Methods Application Center, Azerbaijan State University of Economics (UNEC), Istiqlaliyyat Str. 6, Baku, 1001, Azerbaijan
- Ahsan Anwar
  - UCSI Graduate Business School, UCSI University, Kuala Lumpur, Malaysia
  - Advanced Research Centre, European University of Lefke, Lefke, Northern Cyprus, TR-10, Mersin, Turkey
- Tomiwa Sunday Adebayo
  - Department of Business Administration, Faculty of Economics and Administrative Science, Cyprus International University, Nicosia, Mersin 10, Turkey
  - Adnan Kassar School of Business, Lebanese American University, Beirut, Lebanon
  - University of Tashkent for Applied Sciences, Str. Gavhar 1, Tashkent 100149, Uzbekistan
- Phan The Cong
  - Faculty of Economics, Thuongmai University, Hanoi, Viet Nam
4
Wang B, Lu J, Li T, Yan Z, Zhang G. A quantile fusion methodology for deep forecasting. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.02.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
5
Medrano RD, Aznarte JL. On the inclusion of spatial information for spatio-temporal neural networks. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06111-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
6
Oala L, Heiß C, Macdonald J, März M, Kutyniok G, Samek W. Detecting failure modes in image reconstructions with interval neural network uncertainty. Int J Comput Assist Radiol Surg 2021; 16:2089-2097. [PMID: 34480723 PMCID: PMC8616888 DOI: 10.1007/s11548-021-02482-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2021] [Accepted: 08/10/2021] [Indexed: 12/03/2022]
Abstract
Purpose: The quantitative detection of failure modes is important for making deep neural networks reliable and usable at scale. We consider three examples of common failure modes in image reconstruction and demonstrate the potential of uncertainty quantification as a fine-grained alarm system. Methods: We propose a deterministic, modular and lightweight approach called Interval Neural Network (INN) that produces fast and easy-to-interpret uncertainty scores for deep neural networks. Importantly, INNs can be constructed post hoc for already trained prediction networks. We compare it against state-of-the-art baseline methods (MCDrop, ProbOut). Results: We demonstrate on controlled, synthetic inverse problems the capacity of INNs to capture uncertainty due to noise as well as directional error information. On a real-world inverse problem with human CT scans, we show that INNs produce uncertainty scores which improve the detection of all considered failure modes compared to the baseline methods. Conclusion: Interval Neural Networks offer a promising tool to expose weaknesses of deep image reconstruction models and ultimately make them more reliable. The fact that they can be applied post hoc to equip already trained deep neural network models with uncertainty scores makes them particularly interesting for deployment.
Affiliation(s)
- Luis Oala
  - Department of Artificial Intelligence, Fraunhofer HHI, Berlin, Germany
- Cosmas Heiß
  - Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
- Jan Macdonald
  - Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
- Maximilian März
  - Department of Artificial Intelligence, Fraunhofer HHI, Berlin, Germany
- Gitta Kutyniok
  - Mathematisches Institut, Ludwig-Maximilians-Universität München, Munich, Germany
- Wojciech Samek
  - Department of Artificial Intelligence, Fraunhofer HHI, Berlin, Germany
7
Quantile Regression for Uncertainty Estimation in VAEs with Applications to Brain Lesion Detection. ACTA ACUST UNITED AC 2021. [PMID: 34334982 DOI: 10.1007/978-3-030-78191-0_53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
The Variational AutoEncoder (VAE) has become one of the most popular models for anomaly detection in applications such as lesion detection in medical images. The VAE is a generative graphical model that is used to learn the data distribution from samples and then generate new samples from this distribution. By training on normal samples, the VAE can be used to detect inputs that deviate from this learned distribution. The VAE models the output as a conditionally independent Gaussian characterized by means and variances for each output dimension. VAEs can therefore use reconstruction probability instead of reconstruction error for anomaly detection. Unfortunately, joint optimization of both mean and variance in the VAE leads to the well-known problem of shrinkage or underestimation of variance. We describe an alternative VAE model, Quantile-Regression VAE (QR-VAE), that avoids this variance shrinkage problem by estimating conditional quantiles for the given input image. Using the estimated quantiles, we compute the conditional mean and variance for input images under the Gaussian model. We then compute reconstruction probability using this model as a principled approach to outlier or anomaly detection. We also show how our approach can be used for heterogeneous thresholding of images for detecting lesions in brain images.
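The step of turning estimated quantiles into a conditional mean and variance under a Gaussian model can be sketched as follows. For a Gaussian, the quantile at level tau equals mu + sigma * z_tau, where z_tau is the standard normal quantile, so two quantiles determine mu and sigma. This is only an illustration of that identity, not the QR-VAE code: the function name and the default quantile levels (0.15 and 0.5) are assumptions, and in the model the inputs would be per-pixel network outputs rather than scalars.

```python
from statistics import NormalDist

def gaussian_from_quantiles(q_lo, q_hi, tau_lo=0.15, tau_hi=0.5):
    """Recover (mean, variance) of a Gaussian from two of its quantiles.

    q_lo and q_hi are the quantile values at levels tau_lo < tau_hi.
    The default levels are illustrative assumptions.
    """
    std_normal = NormalDist()           # standard normal, mean 0 and sd 1
    z_lo = std_normal.inv_cdf(tau_lo)   # standard normal quantile at tau_lo
    z_hi = std_normal.inv_cdf(tau_hi)   # standard normal quantile at tau_hi
    sigma = (q_hi - q_lo) / (z_hi - z_lo)
    mu = q_hi - sigma * z_hi
    return mu, sigma ** 2

# Usage: quantiles taken from a known Gaussian recover its parameters.
reference = NormalDist(mu=2.0, sigma=3.0)
mean, variance = gaussian_from_quantiles(reference.inv_cdf(0.15), reference.inv_cdf(0.5))
```

Because each quantile is fitted with its own pinball loss rather than jointly with the mean through a likelihood, the variance obtained this way is not tied to the mean estimate during training, which is the motivation the abstract gives for avoiding variance shrinkage.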
8
Rizos G, Schuller BW. Average Jane, Where Art Thou? – Recent Avenues in Efficient Machine Learning Under Subjectivity Uncertainty. INFORMATION PROCESSING AND MANAGEMENT OF UNCERTAINTY IN KNOWLEDGE-BASED SYSTEMS 2020. [PMCID: PMC7274315 DOI: 10.1007/978-3-030-50146-4_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
In machine learning tasks, an actual ‘ground truth’ may not be available. Then, machines often have to rely on human labelling of data. This becomes more challenging the more subjective the learning task is, as human agreement can be low. To cope with the resulting high uncertainty, one could train individual models reflecting a single human’s opinion. However, this is not viable if one aims at mirroring the general opinion of a hypothetical ‘completely average person’ – the ‘average Jane’. Here, I summarise approaches to learning efficiently in such a case. First, different strategies of reaching a single learning target from several labellers will be discussed. This includes varying labeller trustability and the case of time-continuous labels with potential dynamics. As human labelling is a labour-intensive endeavour, active and cooperative learning strategies can help reduce the number of labels needed. Next, sample informativeness can be exploited in teacher-based algorithms to additionally weigh data by certainty. In addition, multi-target learning of different labeller tracks in parallel and/or of the uncertainty can help improve the model robustness and provide an additional uncertainty measure. Cross-modal strategies to reduce uncertainty offer another view. From these and further recent strategies, I distil a number of future avenues to handle subjective uncertainty in machine learning. These comprise the processing of bigger, yet weakly labelled, data based, amongst others, on reinforcement learning, lifelong learning, and self-learning. Illustrative examples stem from the fields of Affective Computing and Digital Health – both notoriously marked by subjectivity uncertainty.