Abbassi O, Ziti S. QMGBP-DL: a deep learning and machine learning approach for quantum molecular graph band-gap prediction.
Mol Divers 2025:10.1007/s11030-025-11178-7. [PMID: 40252145 DOI: 10.1007/s11030-025-11178-7]
[Received: 01/31/2025] [Accepted: 03/26/2025] [Indexed: 04/21/2025]
Abstract
Predicting molecular and quantum material properties, especially the band gap, is crucial for accelerating discovery in drug design and materials science. Although graph neural networks and probabilistic encoders are well established in molecular data analysis, their targeted integration for band-gap prediction remains an active research area. This paper introduces QMGBP-DL, a deep learning approach that combines a molecular graph encoder with machine learning models to improve the prediction accuracy of molecular and material band-gap energy. The encoder uses graph convolutional networks (GCNs) to derive latent representations of chemical structures from SMILES strings and is optimized with a Kullback-Leibler divergence loss. These representations serve as inputs for training various machine learning models to predict properties. QMGBP-DL is evaluated on the QM9, PCQM4M, and OPV datasets, where it shows significant improvements, particularly when a random forest model is used for property prediction. A comparative analysis against the established approaches DenseGNN, MEGNet, and ALIGNN shows that QMGBP-DL achieves notably lower mean absolute error (MAE) values when predicting HOMO, LUMO, and band gap. These results indicate that graph-based molecular encoding combined with traditional machine learning models, particularly random forest, is highly effective for accurate band-gap prediction, thereby facilitating material discovery and design.
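The two quantities the abstract names, the Kullback-Leibler divergence loss and the MAE metric, can be illustrated with a minimal sketch. It assumes a VAE-style encoder that outputs a diagonal Gaussian over the latent space and is regularized toward a standard normal prior; the abstract does not confirm this exact form, and the function names and numeric values below are hypothetical. The GCN encoder and the random forest regressor themselves are omitted.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.

    Assumption: the encoder emits a mean and a log-variance per latent
    dimension, as in a standard variational autoencoder.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar)
    )


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical latent statistics for a single molecule (2 latent dimensions).
kl_at_prior = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])   # posterior equals prior
kl_shifted = kl_to_standard_normal([1.0, -1.0], [0.0, 0.0])   # means moved off zero

# Hypothetical band-gap values in eV, for illustration only.
band_gap_true = [5.0, 6.0, 7.0]
band_gap_pred = [5.5, 6.0, 6.0]
mae = mean_absolute_error(band_gap_true, band_gap_pred)

print(kl_at_prior, kl_shifted, mae)
```

In this sketch the KL term vanishes when the encoder output matches the prior and grows as the latent means move away from zero. In a pipeline of the kind described, the latent vector would be passed to a downstream regressor, and MAE would be computed on held-out molecules.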