Gill JK, Chetty M, Lim S, Hallinan J. BioBERT based text mining for incorporating prior knowledge in the inference of genetic network models. Comput Biol Med 2025; 186:109623. [PMID: 39753024 DOI: 10.1016/j.compbiomed.2024.109623]
[Received: 08/11/2024] [Revised: 12/03/2024] [Accepted: 12/23/2024] [Indexed: 02/20/2025]
Abstract
Reconstruction of Gene Regulatory Networks (GRNs) is essential for understanding gene interactions, their impact on cellular processes, and the manifestation of diseases, and it supports applications such as drug discovery. Among the various mathematical and dynamic models used for GRN reconstruction, the S-system model, comprising non-linear differential equations, is widely utilised to capture the behaviour of complex biological systems with non-linear and time-dependent interactions. However, as the network size increases, the computational demand of network inference grows because more parameters must be estimated, significantly impacting the performance of optimisation algorithms. Incorporating biologically relevant prior knowledge using advanced Natural Language Processing methods can effectively address this limitation by reducing the number of parameters that must be estimated, thereby enhancing speed and accuracy. In this study, we introduce PRESS, an integrated Prior Knowledge Enhanced S-system model for accurate GRN reconstruction, which automates the incorporation of prior knowledge obtained through systematic extraction from published literature. PRESS exploits our recently reported BioBERT-based Gene Interaction Extraction Framework with enhanced targeted genetic relation extraction and the prediction of regulatory genes. The effectiveness of the optimisation algorithm in learning model parameters is further enhanced through a novel fitness evaluation, which limits the maximum number of regulatory genes to mimic real GRNs. This integrated method, combining a robust relation extraction framework for automated prior knowledge with a GRN reconstruction model, is novel and has not been reported previously. Experimental results obtained using Escherichia coli subnetworks and the benchmark SOS dataset demonstrate substantial reductions in computational cost while simultaneously improving prediction accuracy.
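For readers unfamiliar with the model, the following is a minimal Python sketch of the standard S-system form, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij, together with two ideas the abstract describes: using prior knowledge to fix some kinetic orders to zero, and penalising candidate networks that assign too many regulators to a gene. It is an illustration only and not the PRESS implementation. The function names, the boolean `allowed` mask, the tolerance, and the linear form of the penalty are all assumptions; the abstract does not specify how the fitness evaluation is defined.

```python
import math


def s_system_rhs(x, alpha, beta, g, h):
    """Right-hand side of an S-system for expression levels x.

    dX_i/dt = alpha_i * prod_j X_j**g[i][j] - beta_i * prod_j X_j**h[i][j]
    """
    n = len(x)
    dx = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        dx.append(production - degradation)
    return dx


def apply_prior_mask(orders, allowed):
    """Set kinetic orders to zero wherever the (hypothetical) prior says
    gene j does not regulate gene i, so they need not be estimated."""
    n = len(orders)
    return [[orders[i][j] if allowed[i][j] else 0.0 for j in range(n)]
            for i in range(n)]


def count_free_orders(allowed):
    """Number of kinetic orders left for the optimiser to estimate."""
    return sum(1 for row in allowed for a in row if a)


def regulator_penalty(orders, max_regulators, tol=1e-6, weight=1.0):
    """Assumed penalty: 'weight' per regulator above the cap, per gene."""
    penalty = 0.0
    for row in orders:
        n_regulators = sum(1 for v in row if abs(v) > tol)
        penalty += weight * max(0, n_regulators - max_regulators)
    return penalty


if __name__ == "__main__":
    x = [1.0, 4.0]
    alpha, beta = [2.0, 1.0], [1.0, 1.0]
    g = [[0.0, 0.5], [1.0, 0.0]]   # gene 1 activates gene 0, and vice versa
    h = [[1.0, 0.0], [0.0, 1.0]]   # each gene degrades at its own rate
    print(s_system_rhs(x, alpha, beta, g, h))
```

In this toy setting, a full two-gene S-system has four production kinetic orders; a mask that allows only the two cross-regulations halves that number, which is the kind of reduction in estimated parameters that the abstract attributes to prior knowledge. A penalty such as `regulator_penalty` would be added to a data-fit error when scoring candidate parameter sets.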