1
Chao KH, Mao A, Salzberg SL, Pertea M. Splam: a deep-learning-based splice site predictor that improves spliced alignments. Genome Biol 2024; 25:243. [PMID: 39285451 PMCID: PMC11406845 DOI: 10.1186/s13059-024-03379-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 08/28/2024] [Indexed: 09/19/2024] Open
Abstract
The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. We describe Splam, a novel method for predicting splice junctions in DNA using deep residual convolutional neural networks. Unlike previous models, Splam looks at a 400-base-pair window flanking each splice site, reflecting the biological splicing process that relies primarily on signals within this window. Splam also trains on donor and acceptor pairs together, mirroring how the splicing machinery recognizes both ends of each intron. Compared to SpliceAI, Splam is consistently more accurate, achieving 96% accuracy in predicting human splice junctions.
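The windowing idea in this abstract can be sketched as follows. This is a minimal illustration, not Splam's actual preprocessing: the function names, the centring of the window on the splice-site index, and the use of `N` padding at sequence ends are all assumptions made for the example.

```python
def flank_window(seq, pos, window=400, pad="N"):
    """Return `window` bases centred on 0-based index `pos`.

    Assumes 0 <= pos <= len(seq). Positions that fall outside the
    sequence are filled with `pad` (an assumed convention).
    """
    half = window // 2
    start, end = pos - half, pos + half
    left = pad * max(0, -start)             # padding before the sequence start
    right = pad * max(0, end - len(seq))    # padding past the sequence end
    return left + seq[max(0, start):min(len(seq), end)] + right


def junction_input(seq, donor, acceptor, window=400):
    """Concatenate the donor and acceptor windows so a model sees both ends of an intron."""
    return flank_window(seq, donor, window) + flank_window(seq, acceptor, window)
```

With a 400-base window at each site, the paired input is 800 bases long; how the real model encodes and consumes that input is not specified in the abstract.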
Affiliation(s)
- Kuan-Hao Chao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21211, USA.
- Alan Mao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21211, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Steven L Salzberg
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21211, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, 21205, USA
- Mihaela Pertea
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21211, USA.
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA.
2
Adnan N, Umer F, Malik S, Hussain OA. Multi-model deep learning approach for segmentation of teeth and periapical lesions on pantomographs. Oral Surg Oral Med Oral Pathol Oral Radiol 2024; 138:196-204. [PMID: 38616480 DOI: 10.1016/j.oooo.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Revised: 08/20/2023] [Accepted: 11/06/2023] [Indexed: 04/16/2024]
Abstract
INTRODUCTION: The fields of medicine and dentistry are beginning to integrate artificial intelligence (AI) in diagnostics. This may reduce subjectivity and improve the accuracy of diagnoses and treatment planning. Current evidence on pathosis detection on pantomographs (PGs) indicates the presence or absence of disease in the entire radiographic image, with little evidence of the relation of periapical pathosis to the causative tooth. OBJECTIVE: To develop a deep learning (DL) AI model for the segmentation of periapical pathosis and its relation to teeth on PGs. METHOD: A total of 250 PGs were manually annotated by subject experts to lay down the ground truth for training AI algorithms on the segmentation of periapical pathosis. Two approaches were used for lesion detection: Multi-models 1 and 2, using U-net and Mask RCNN algorithms, respectively. The resulting segmented lesions generated on the testing data set were superimposed with results of teeth segmentation and numbering algorithms trained separately to relate lesions to causative teeth. Hence, both multi-model approaches related periapical pathosis to the causative teeth on PGs. RESULTS: The performance metrics of lesion segmentation carried out by U-net are as follows: accuracy = 98.1%, precision = 84.5%, recall = 80.3%, F1 score = 82.2%, Dice index = 75.2%, and Intersection over Union = 67.6%. Mask RCNN carried out lesion segmentation with an accuracy of 46.7%, precision of 80.6%, recall of 55%, and F1 score of 63.1%. CONCLUSION: In this study, the multi-model approach successfully related periapical pathosis to the causative tooth on PGs. However, U-net outperformed Mask RCNN in the tasks performed, suggesting that U-net will remain the standard for medical image segmentation tasks. Further training of the models on other findings and an increased number of images will lead to the automation of the detection of common radiographic findings in the dental diagnostic workflow.
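For readers unfamiliar with the metrics listed in this abstract, the sketch below shows their standard pixel-wise definitions for a single pair of binary masks. It is an illustration only: the function name and the flat-list mask representation are assumptions, and the study's own evaluation code and averaging scheme are not described in the abstract.

```python
def segmentation_metrics(pred, truth):
    """Pixel-wise metrics for two equally long binary masks (flat sequences of 0/1)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1          # lesion pixel correctly predicted
        elif p and not t:
            fp += 1          # background predicted as lesion
        elif t:
            fn += 1          # lesion pixel missed
        else:
            tn += 1          # background correctly predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "dice": dice, "iou": iou}
```

On a single mask pair, Dice and F1 are algebraically the same quantity, so differing reported values generally reflect how scores were aggregated across images or objects. Accuracy counts true-negative background pixels, which is why it can be far higher than the overlap-based scores when lesions occupy a small part of the image.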
Affiliation(s)
- Niha Adnan
- Department of Surgery, Aga Khan University Hospital, Karachi, Pakistan
- Fahad Umer
- Department of Surgery, Aga Khan University Hospital, Karachi, Pakistan.
- Owais A Hussain
- Karachi Institute of Economics and Technology, Karachi, Pakistan
3
Liu X, Zhang H, Zeng Y, Zhu X, Zhu L, Fu J. DRANetSplicer: A Splice Site Prediction Model Based on Deep Residual Attention Networks. Genes (Basel) 2024; 15:404. [PMID: 38674339 PMCID: PMC11048956 DOI: 10.3390/genes15040404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 04/28/2024] Open
Abstract
The precise identification of splice sites is essential for unraveling the structure and function of genes, constituting a pivotal step in the gene annotation process. In this study, we developed a novel deep learning model, DRANetSplicer, that integrates residual learning and attention mechanisms for enhanced accuracy in capturing the intricate features of splice sites. We constructed multiple datasets using the most recent versions of genomic data from three different organisms, Oryza sativa japonica, Arabidopsis thaliana and Homo sapiens. This approach allows us to train models with a richer set of high-quality data. DRANetSplicer outperformed benchmark methods on donor and acceptor splice site datasets, achieving average accuracies of 96.57% (donor) and 95.82% (acceptor) across the three organisms. Comparative analyses with benchmark methods, including SpliceFinder, Splice2Deep, Deep Splicer, EnsembleSplice, and DNABERT, revealed DRANetSplicer's superior predictive performance, resulting in relative reductions in average error rate of at least 4.2% (donor) and 11.6% (acceptor). We utilized the DRANetSplicer model trained on O. sativa japonica data to predict splice sites in A. thaliana, achieving accuracies of 94.89% for donor sites and 94.25% for acceptor sites. These results indicate that DRANetSplicer possesses excellent cross-organism predictive capabilities, with its performance in cross-organism predictions even surpassing that of benchmark methods in non-cross-organism predictions. Cross-organism validation showcased DRANetSplicer's excellence in predicting splice sites across similar organisms, supporting its applicability in gene annotation for understudied organisms. We employed multiple methods to visualize the decision-making process of the model. The visualization results indicate that DRANetSplicer can learn and interpret well-known biological features, further validating its overall performance.
Our study systematically examined and confirmed the predictive ability of DRANetSplicer from various levels and perspectives, indicating that its practical application in gene annotation is justified.
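The "relative reduction in average error rate" quoted in this abstract is usually computed from accuracies as shown below. This is a sketch of the conventional definition (error rate = 1 - accuracy); the function name is hypothetical, and the accuracy values used in the usage lines are made up for illustration and are not taken from the paper.

```python
def relative_error_reduction(baseline_accuracy, new_accuracy):
    """Fraction by which the error rate (1 - accuracy) shrinks relative to the baseline."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error == 0:
        raise ValueError("baseline has zero error; relative reduction is undefined")
    return (baseline_error - new_error) / baseline_error


# Hypothetical numbers: moving from 95% to 96% accuracy removes a fifth of the errors.
reduction = relative_error_reduction(0.95, 0.96)
print(f"{reduction:.1%}")
```

Because the measure is relative to the baseline error, a gain of one percentage point in accuracy corresponds to a larger relative reduction when the baseline is already accurate.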
Affiliation(s)
- Xueyan Liu
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Hongyan Zhang
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Ying Zeng
- School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
- Xinghui Zhu
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Lei Zhu
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Jiahui Fu
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
4
Choi JY, Kim H, Kim JK, Lee IS, Ryu IH, Kim JS, Yoo TK. Deep learning prediction of steep and flat corneal curvature using fundus photography in post-COVID telemedicine era. Med Biol Eng Comput 2024; 62:449-463. [PMID: 37889431 DOI: 10.1007/s11517-023-02952-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 10/14/2023] [Indexed: 10/28/2023]
Abstract
Fundus photography (FP) is increasingly used. Corneal curvature is an essential factor in refractive errors and is associated with several pathological corneal conditions. As FP-based examination systems have already been widely distributed, it would be helpful for telemedicine to extract information such as corneal curvature using FP. This study aims to develop a deep learning model based on FP for corneal curvature prediction by categorizing corneas into steep, regular, and flat groups. The EfficientNetB0 architecture with transfer learning was used to learn FP patterns to predict flat, regular, and steep corneas. In validation, the model achieved a multiclass accuracy of 0.727, a Matthews correlation coefficient of 0.519, and an unweighted Cohen's κ of 0.590. The areas under the receiver operating characteristic curves for binary prediction of flat and steep corneas were 0.863 and 0.848, respectively. The optic nerve and its peripheral areas were the main focus of the model. The developed algorithm shows that FP can potentially be used as an imaging modality to estimate corneal curvature in the post-COVID-19 era, whereby patients may benefit from the detection of abnormal corneal curvatures using FP in the telemedicine setting.
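The two agreement statistics named in this abstract can be computed from a multiclass confusion matrix as sketched below. The code uses the standard multiclass generalisation of the Matthews correlation coefficient and the unweighted Cohen's κ; the function name and the row-is-truth, column-is-prediction layout are assumptions, and the example matrix is invented rather than taken from the study.

```python
from math import sqrt


def multiclass_mcc_and_kappa(cm):
    """Return (MCC, unweighted Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    true_counts = [sum(cm[i]) for i in range(n)]                       # row sums
    pred_counts = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # column sums
    cross = sum(t * p for t, p in zip(true_counts, pred_counts))

    denom = sqrt((total ** 2 - sum(p * p for p in pred_counts))
                 * (total ** 2 - sum(t * t for t in true_counts)))
    mcc = (correct * total - cross) / denom if denom else 0.0

    p_observed = correct / total          # observed agreement
    p_expected = cross / total ** 2       # agreement expected by chance
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return mcc, kappa


# Invented 3-class example (flat, regular, steep), 18 samples in total.
mcc, kappa = multiclass_mcc_and_kappa([[5, 1, 0], [1, 4, 1], [0, 1, 5]])
```

Both statistics correct raw accuracy for chance agreement, which matters when the three classes are imbalanced.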
Affiliation(s)
- Joon Yul Choi
- Department of Biomedical Engineering, Yonsei University, Wonju, South Korea
- Jin Kuk Kim
- Department of Refractive Surgery, B&VIIT Eye Center, B2 GT Tower, 1317-23 Seocho-Dong, Seocho-Gu, Seoul, South Korea
- In Sik Lee
- Department of Refractive Surgery, B&VIIT Eye Center, B2 GT Tower, 1317-23 Seocho-Dong, Seocho-Gu, Seoul, South Korea
- Ik Hee Ryu
- Department of Refractive Surgery, B&VIIT Eye Center, B2 GT Tower, 1317-23 Seocho-Dong, Seocho-Gu, Seoul, South Korea
- Research and Development Department, VISUWORKS, Seoul, South Korea
- Jung Soo Kim
- Research and Development Department, VISUWORKS, Seoul, South Korea
- Tae Keun Yoo
- Department of Refractive Surgery, B&VIIT Eye Center, B2 GT Tower, 1317-23 Seocho-Dong, Seocho-Gu, Seoul, South Korea.
- Research and Development Department, VISUWORKS, Seoul, South Korea.
5
Yu L, Zhang Y, Xue L, Liu F, Jing R, Luo J. EnsembleDL-ATG: Identifying autophagy proteins by integrating their sequence and evolutionary information using an ensemble deep learning framework. Comput Struct Biotechnol J 2023; 21:4836-4848. [PMID: 37854634 PMCID: PMC10579870 DOI: 10.1016/j.csbj.2023.09.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 09/26/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023] Open
Abstract
Autophagy is a primary mechanism for maintaining cellular homeostasis. The synergistic actions of autophagy-related (ATG) proteins strictly regulate the whole autophagic process. Therefore, accurate identification of ATGs is a first and critical step to reveal the molecular mechanism underlying the regulation of autophagy. Current computational methods can predict ATGs from primary protein sequences, but owing to the limitations of algorithms, significant room for improvement still exists. In this research, we propose EnsembleDL-ATG, an ensemble deep learning framework that aggregates multiple deep learning models to predict ATGs from protein sequence and evolutionary information. We first evaluated the performance of individual networks for various feature descriptors to identify the most promising models. Then, we explored all possible combinations of independent models to select the most effective ensemble architecture. The final framework is built from four different deep learning models. Experimental results show that our proposed method achieves a prediction accuracy of 94.5% and an MCC of 0.890, which are nearly 4% and 0.08 higher than ATGPred-FL, respectively. Overall, EnsembleDL-ATG is the first ATG machine learning predictor based on ensemble deep learning. The benchmark data and code utilized in this study can be accessed for free at https://github.com/jingry/autoBioSeqpy/tree/2.0/examples/EnsembleDL-ATG.
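The search over model combinations described in this abstract can be illustrated with a small sketch. It assumes soft voting (averaging each model's predicted probability) and accuracy as the selection criterion; both are assumptions for illustration, as are all function and model names, and the aggregation rule actually used by EnsembleDL-ATG may differ.

```python
from itertools import combinations


def ensemble_accuracy(prob_lists, labels, threshold=0.5):
    """Accuracy of a soft-voting ensemble: average the probabilities, then threshold."""
    correct = 0
    for i, y in enumerate(labels):
        mean_p = sum(probs[i] for probs in prob_lists) / len(prob_lists)
        correct += int((mean_p >= threshold) == bool(y))
    return correct / len(labels)


def best_ensemble(model_probs, labels, min_size=1):
    """Try every combination of models and return (names, accuracy) of the best one.

    model_probs maps a (hypothetical) model name to its per-sample probabilities.
    Ties keep the first combination found, i.e. the smallest one.
    """
    best_names, best_acc = None, -1.0
    names = sorted(model_probs)
    for size in range(min_size, len(names) + 1):
        for combo in combinations(names, size):
            acc = ensemble_accuracy([model_probs[m] for m in combo], labels)
            if acc > best_acc:
                best_names, best_acc = combo, acc
    return best_names, best_acc
```

In practice the combination should be chosen on validation data that is separate from the final test set; selecting and reporting on the same samples gives an optimistic estimate.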
Affiliation(s)
- Lezheng Yu
- School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550018, Guizhou, China
- Basic Medical College, Southwest Medical University, Luzhou 646000, Sichuan, China
- Yonglin Zhang
- Department of Pharmacy, The Affiliated Hospital of North Sichuan Medical College, Nanchong 637000, Sichuan, China
- Li Xue
- School of Public Health, Southwest Medical University, Luzhou 646000, Sichuan, China
- Fengjuan Liu
- School of Geography and Resources, Guizhou Education University, Guiyang 550018, Guizhou, China
- Runyu Jing
- School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Jiesi Luo
- Basic Medical College, Southwest Medical University, Luzhou 646000, Sichuan, China
- Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, Southwest Medical University, Luzhou 646000, Sichuan, China