Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sun A, Xiao X, Xu Z. iPTT(2 L)-CNN: A Two-Layer Predictor for Identifying Promoters and Their Types in Plant Genomes by Convolutional Neural Network. Comput Math Methods Med 2021;2021:6636350. [PMID: 33488763 DOI: 10.1155/2021/6636350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/24/2020] [Revised: 12/13/2020] [Accepted: 12/16/2020] [Indexed: 11/18/2022]

Cited by Other Article(s)

Lei X, Wang X, Chen G, Liang C, Li Q, Jiang H, Xiong W. Combining diffusion and transformer models for enhanced promoter synthesis and strength prediction in deep learning. mSystems 2025;10:e0018325. [PMID: 40105319 PMCID: PMC12013266 DOI: 10.1128/msystems.00183-25] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2025] [Accepted: 02/13/2025] [Indexed: 03/20/2025]

Abstract

In the field of synthetic biology, the engineering of synthetic promoters that outperform their natural counterparts is of paramount importance, which can optimize the expression of exogenous genes, enhance the efficiency of metabolic pathways, and possess substantial commercial value. Research indicates that some synthetic promoters have higher transcriptional activity compared to strong natural promoters. However, with the exponential increase in complexity due to the 4^n potential combinations in a promoter sequence of length n, identifying effective synthetic promoters remains a formidable challenge. Deep learning models, by adaptively learning from extensive data sets, have become instrumental in analyzing biological data. This study introduces a diffusion model-based approach for designing promoters viable in model bacteria such as Escherichia coli and cyanobacteria. This model proficiently assimilates and utilizes inherent biological features from natural promoter sequences to engineer synthetic variants. Additionally, we employed a transformer model to evaluate the efficacy of these synthetic promoters, aiming at screening those with high performance. The experimental findings suggest that the synthetic promoters by the diffusion model not only share key biological features with their natural counterparts but also demonstrate greater similarity to natural promoters than those generated by a variational autoencoder. In predicting promoter strength, the transformer model demonstrated improved performance over the convolutional neural network. Finally, we developed an integrated platform for generating promoters and predicting their strength.

IMPORTANCE

We demonstrated that diffusion models are superior in accomplishing the promoter synthesis task compared to other state-of-the-art deep learning models. The effectiveness of our method was validated using data sets of Escherichia coli and cyanobacteria promoters, showing more stable and prompt convergence and more natural-like promoters than the variational autoencoder model. We extracted sequence information, dimer information, and position information from promoters and combined them with a transformer model to predict promoter strength. Our prediction results were more accurate than those obtained with a convolutional neural network model. Our in silico experiments systematically introduced mutations in promoter sequences and explored their contribution to promoter strength, highlighting the depth of learning in our model.
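The in silico mutation scan mentioned above can be pictured with a minimal sketch. It enumerates every single-base substitution of a sequence and records how a strength predictor's score changes relative to the unmutated sequence. The names `mutation_effects` and `toy_strength` are hypothetical, and `toy_strength` is only a stand-in for a trained model; the abstract does not describe the authors' actual procedure or code.

```python
from typing import Callable, Dict, Tuple

BASES = "ACGT"


def mutation_effects(
    seq: str, predict_strength: Callable[[str], float]
) -> Dict[Tuple[int, str], float]:
    """Score change for every single-base substitution of seq.

    Keys are (position, substituted base); values are
    predicted strength of the mutant minus that of the original.
    """
    seq = seq.upper()
    baseline = predict_strength(seq)
    effects: Dict[Tuple[int, str], float] = {}
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt == ref:
                continue  # skip the unmutated base
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[(pos, alt)] = predict_strength(mutant) - baseline
    return effects


def toy_strength(seq: str) -> float:
    """Stand-in predictor (NOT a trained model): fraction of A bases."""
    return seq.count("A") / len(seq) if seq else 0.0


effects = mutation_effects("ACG", toy_strength)
# Rank substitutions by the size of their effect on the score.
ranked = sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

A sequence of length n has 3n single-base mutants, so a full scan of this kind stays cheap even though the space of all possible sequences grows as 4^n.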

iProm-Zea: A two-layer model to identify plant promoters and their types using convolutional neural network. Genomics 2022;114:110384. [PMID: 35533969 DOI: 10.1016/j.ygeno.2022.110384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/14/2022] [Revised: 04/18/2022] [Accepted: 05/02/2022] [Indexed: 01/14/2023]

Abstract

A promoter is a short DNA sequence near the start codon, responsible for initiating the transcription of a specific gene in the genome. The accurate recognition of promoters is important for achieving a better understanding of transcriptional regulation. Because of their importance in the process of biological transcriptional regulation, there is an urgent need to develop in silico tools to identify promoters and their types in a timely and accurate manner. A number of prediction methods have been developed in this regard; however, almost all of them are merely used for identifying promoters and their strength or sigma types. The TATA box region in TATA promoters influences post-transcriptional processes; therefore, in the current study, we developed a two-layer predictor called "iProm-Zea" using a convolutional neural network (CNN) to identify TATA and TATA-less promoters. The first layer can be used to identify a given DNA sequence as a promoter or non-promoter. The second layer can be used to identify whether the recognized promoter is a TATA promoter. To find an optimal feature encoding scheme and model, we employed four feature encoding schemes on different machine learning and CNN algorithms, and based on the evaluation results, we selected a one-hot encoding scheme and a CNN model for iProm-Zea. The 5-fold cross-validation testing results demonstrated that the constructed predictor showed great potential for identifying promoters and classifying them as TATA and TATA-less promoters. Furthermore, we performed cross-species analysis of iProm-Zea to evaluate its performance in other species. Moreover, to make it easier for other experimental scientists to obtain the results they need, we established a freely accessible and user-friendly web server at http://nsclbio.jbnu.ac.kr/tools/iProm-Zea/.
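As a rough illustration of the one-hot encoding and two-layer decision flow described in this abstract, the sketch below encodes a DNA string as a length-by-4 matrix and passes it through two scorers in sequence. The column order `ACGT`, the 0.5 threshold, and the functions `toy_promoter_score` and `toy_tata_score` are assumptions made for the example; the two scorers are simple stand-ins for the trained CNNs and are not the iProm-Zea models.

```python
from typing import Callable, List

BASES = "ACGT"  # assumed column order; the abstract does not specify one

Matrix = List[List[int]]
Scorer = Callable[[Matrix], float]


def one_hot(seq: str) -> Matrix:
    """Encode a DNA string as a len(seq) x 4 matrix.

    Bases outside ACGT (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def two_layer_predict(
    seq: str, is_promoter: Scorer, is_tata: Scorer, threshold: float = 0.5
) -> str:
    """Layer 1: promoter vs non-promoter. Layer 2: TATA vs TATA-less."""
    x = one_hot(seq)
    if is_promoter(x) < threshold:
        return "non-promoter"
    return "TATA promoter" if is_tata(x) >= threshold else "TATA-less promoter"


def toy_promoter_score(x: Matrix) -> float:
    """Stand-in for the first-layer CNN: fraction of A/T positions."""
    return sum(row[0] + row[3] for row in x) / len(x) if x else 0.0


def toy_tata_score(x: Matrix) -> float:
    """Stand-in for the second-layer CNN: 1.0 if the motif TATA occurs."""
    decoded = "".join(BASES[row.index(1)] if 1 in row else "N" for row in x)
    return 1.0 if "TATA" in decoded else 0.0


label = two_layer_predict("GGTATAAAGG", toy_promoter_score, toy_tata_score)
```

The second scorer is only consulted for sequences that pass the first layer, which mirrors the cascade the abstract describes; in a real system each scorer would be a trained model operating on the same one-hot matrix.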