Reference Citation Analysis

For: Abbasi WA, Minhas FUAA. Issues in performance evaluation for host-pathogen protein interaction prediction. J Bioinform Comput Biol 2016;14:1650011. [PMID: 26932275 DOI: 10.1142/s0219720016500116] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/18/2022]

Cited by Other Article(s)

Li M, Shi W, Zhang F, Zeng M, Li Y. A Deep Learning Framework for Predicting Protein Functions With Co-Occurrence of GO Terms. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:833-842. [PMID: 35476573 DOI: 10.1109/tcbb.2022.3170719] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]

Li K, Quan L, Jiang Y, Li Y, Zhou Y, Wu T, Lyu Q. ctP²ISP: Protein-Protein Interaction Sites Prediction Using Convolution and Transformer With Data Augmentation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:297-306. [PMID: 35213314 DOI: 10.1109/tcbb.2022.3154413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Guo Z, Yamaguchi R. Machine learning methods for protein-protein binding affinity prediction in protein design. Frontiers in Bioinformatics 2022;2:1065703. [PMID: 36591334 PMCID: PMC9800603 DOI: 10.3389/fbinf.2022.1065703] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/10/2022] [Accepted: 12/01/2022] [Indexed: 12/23/2022]

Kaundal R, Loaiza CD, Duhan N, Flann N. deepHPI: a comprehensive deep learning platform for accurate prediction and visualization of host-pathogen protein-protein interactions. Brief Bioinform 2022;23:6576450. [PMID: 35511057 DOI: 10.1093/bib/bbac125] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/12/2021] [Revised: 02/07/2022] [Accepted: 03/15/2022] [Indexed: 01/06/2023]

Abstract

Host-pathogen protein interactions (HPIs) play vital roles in many biological processes and are directly involved in infectious diseases. With pandemics becoming more frequent in the last couple of decades, such as the recent outbreak of COVID-19 causing millions of deaths, it has become more critical to develop advanced methods to accurately predict pathogen interactions with their respective hosts. During the last decade, experimental methods to identify HPIs have been used to decipher host-pathogen systems, with the caveat that those techniques are labor-intensive, expensive and time-consuming. Alternatively, accurate prediction of HPIs can be performed by the use of data-driven machine learning. To provide a more robust and accurate solution for the HPI prediction problem, we have developed a deepHPI tool based on deep learning. The web server delivers four host-pathogen model types: plant-pathogen, human-bacteria, human-virus and animal-pathogen, extending its applicability to a wide range of analyses and use cases. The deepHPI web tool is the first to use convolutional neural network models for HPI prediction. These models have been selected based on a comprehensive evaluation of protein features and neural network architectures. The best prediction models have been tested on independent validation datasets, which achieved an overall Matthews correlation coefficient value of 0.87 for animal-pathogen using the combined pseudo-amino acid composition and conjoint triad (PAAC_CT) features, 0.75 for human-bacteria using the combined pseudo-amino acid composition, conjoint triad and normalized Moreau-Broto feature (PAAC_CT_NMBroto), 0.96 for human-virus using PAAC_CT_NMBroto and 0.94 for plant-pathogen interactions using the combined pseudo-amino acid composition, composition and transition feature (PAAC_CTDC_CTDT).
Our server running deepHPI is deployed on a high-performance computing cluster that enables large and multiple user requests, and it provides more information about interactions discovered. It presents an enriched visualization of the resulting host-pathogen networks that is augmented with external links to various protein annotation resources. We believe that the deepHPI web server will be very useful to researchers, particularly those working on infectious diseases. Additionally, many novel and known host-pathogen systems can be further investigated to significantly advance our understanding of complex disease-causing agents. The developed models are established on a web server, which is freely accessible at http://bioinfo.usu.edu/deepHPI/.

Andleeb S, Abbasi WA, Ghulam Mustafa R, Islam GU, Naseer A, Shafique I, Parween A, Shaheen B, Shafiq M, Altaf M, Ali Abbas S. ESIDE: A computationally intelligent method to identify earthworm species (E. fetida) from digital images: Application in taxonomy. PLoS One 2021;16:e0255674. [PMID: 34529673 PMCID: PMC8445633 DOI: 10.1371/journal.pone.0255674] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/29/2021] [Accepted: 07/21/2021] [Indexed: 11/19/2022]

Abbasi WA, Abbas SA, Andleeb S. PANDA: Predicting the change in proteins binding affinity upon mutations by finding a signal in primary structures. J Bioinform Comput Biol 2021;19:2150015. [PMID: 34126874 DOI: 10.1142/s0219720021500153] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]

Abstract

Accurately determining a change in protein binding affinity upon mutations is important to find novel therapeutics and to assist mutagenesis studies. Determination of change in binding affinity upon mutations requires sophisticated, expensive, and time-consuming wet-lab experiments that can be supported with computational methods. Most of the available computational prediction techniques depend upon protein structures, which limits their applicability to protein complexes with known 3D structures. In this work, we explore the sequence-based prediction of change in protein binding affinity upon mutation and question the effectiveness of k-fold cross-validation (CV) across mutations adopted in previous studies to assess the generalization ability of such predictors with no known mutation during training. We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the change in protein binding affinity upon mutation. Our proposed sequence-based novel change in protein binding affinity predictor called PANDA performs comparably to the existing methods gauged through an appropriate CV scheme and an external independent test dataset. On an external test dataset, our proposed method gives a maximum Pearson correlation coefficient of 0.52 in comparison to the state-of-the-art existing protein structure-based method called MutaBind which gives a maximum Pearson correlation coefficient of 0.59. Our proposed protein sequence-based method, to predict a change in binding affinity upon mutations, has wide applicability and comparable performance in comparison to existing protein structure-based methods. We made PANDA easily accessible through a cloud-based webserver and python code available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/panda, respectively.
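The concern raised in this abstract about cross-validation across mutations can be illustrated with a small sketch. It is not the PANDA authors' code: the function names `grouped_folds` and `pearson`, the greedy fold assignment, and the toy complex labels are all assumptions made for illustration. The idea shown is that every mutation belonging to the same protein complex is kept in a single fold, so a model is never tested on a complex it saw during training, and that predictions are then scored with the Pearson correlation coefficient.

```python
from collections import defaultdict
from math import sqrt


def grouped_folds(groups, k):
    """Split sample indices into k folds so that all samples sharing a
    group label (here: a hypothetical protein-complex ID) land in one fold."""
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    folds = [[] for _ in range(k)]
    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    for label in sorted(by_group, key=lambda g: (-len(by_group[g]), g)):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_group[label])
    return folds


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy data: eight mutations drawn from four hypothetical complexes.
complexes = ["A", "A", "A", "B", "B", "C", "C", "D"]
folds = grouped_folds(complexes, k=3)
for i, fold in enumerate(folds):
    print(f"fold {i}: indices {fold}, complexes {sorted({complexes[j] for j in fold})}")
```

With a plain k-fold split over mutations, two mutations of complex "A" could end up on opposite sides of the train/test boundary, which is the leakage the grouped split avoids.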

Abbasi WA, Abbas SA, Andleeb S, Ul Islam G, Ajaz SA, Arshad K, Khalil S, Anjam A, Ilyas K, Saleem M, Chughtai J, Abbas A. COVIDC: An expert system to diagnose COVID-19 and predict its severity using chest CT scans: Application in radiology. Informatics in Medicine Unlocked 2021;23:100540. [PMID: 33644298 PMCID: PMC7901302 DOI: 10.1016/j.imu.2021.100540] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/26/2020] [Revised: 02/17/2021] [Accepted: 02/19/2021] [Indexed: 01/09/2023]

Abstract

Early diagnosis of Coronavirus disease 2019 (COVID-19) is important, especially in the absence or inadequate provision of a specific vaccine, to stop the surge of this lethal infection by advising quarantine. This diagnosis is challenging as most of the patients having COVID-19 infection stay asymptomatic, while others showing symptoms are hard to distinguish from patients having different respiratory infections such as severe flu and pneumonia. Because wet-lab diagnostic tests for COVID-19 are costly and time-consuming, there is a pressing need for an alternative, non-invasive, rapid, and low-cost automatic screening system. A chest CT scan can effectively be used as an alternative modality to detect and diagnose the COVID-19 infection. In this study, we present an automatic COVID-19 diagnostic and severity prediction system called COVIDC (COVID-19 detection using CT scans) that uses deep feature maps from the chest CT scans for this purpose. Our newly proposed system not only detects COVID-19 but also predicts its severity by using a two-phase classification approach (COVID vs non-COVID, and COVID-19 severity) with deep feature maps and different shallow supervised classification algorithms such as SVMs and random forest to handle data scarcity. We performed a stringent COVIDC performance evaluation not only through 10-fold cross-validation and an external validation dataset but also in a real setting under the supervision of an experienced radiologist. In all the evaluation settings, COVIDC outperformed all the existing state-of-the-art methods designed to detect COVID-19 with an F1 score of 0.94 on the validation dataset and justified its use to diagnose COVID-19 effectively in the real setting by correctly classifying 9 out of 10 COVID-19 CT scans. We made COVIDC openly accessible through a cloud-based webserver and python code available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/covidc.
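The two-phase classification described in this abstract can be pictured as a simple cascade. The sketch below is only an assumption about the control flow: `two_phase_predict`, the toy threshold classifiers, and the severity labels are hypothetical stand-ins, not the deep feature maps, SVMs, or random forests used in COVIDC.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def two_phase_predict(
    features: Features,
    is_covid: Callable[[Features], bool],
    severity: Callable[[Features], str],
) -> str:
    """Phase 1 decides COVID vs non-COVID; phase 2 runs only on positives."""
    if not is_covid(features):
        return "non-COVID"
    return "COVID-19, " + severity(features)


# Hypothetical stand-in classifiers; a real system would plug in trained models.
def toy_is_covid(features: Features) -> bool:
    return sum(features) / len(features) > 0.5


def toy_severity(features: Features) -> str:
    return "severe" if max(features) > 0.9 else "non-severe"


print(two_phase_predict([0.1, 0.2, 0.3], toy_is_covid, toy_severity))
print(two_phase_predict([0.6, 0.7, 0.95], toy_is_covid, toy_severity))
```

Separating the two decisions means the severity model only needs training data from confirmed cases, which is one plausible reason for choosing a cascade when labeled data are scarce.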

Affiliation(s)

Wajid Arshad Abbasi Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Syed Ali Abbas Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Saiqa Andleeb Biotechnology Lab., Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Ghafoor Ul Islam Biotechnology Lab., Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Syeda Adin Ajaz Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Kinza Arshad Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Sadia Khalil Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Asma Anjam Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Kashif Ilyas Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Mohsib Saleem Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Jawad Chughtai Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
Ayesha Abbas Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan

COVIDX: Computer-aided diagnosis of COVID-19 and its severity prediction with raw digital chest X-ray scans. Quantitative Biology 2021. [DOI: 10.15302/j-qb-021-0278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Abbasi WA, Yaseen A, Hassan FU, Andleeb S, Minhas FUAA. ISLAND: in-silico proteins binding affinity prediction using sequence information. BioData Min 2020;13:20. [PMID: 33292419 PMCID: PMC7688004 DOI: 10.1186/s13040-020-00231-w] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 04/03/2020] [Accepted: 11/15/2020] [Indexed: 12/30/2022]

Zhang H, Cheng W, Zheng J, Wang P, Liu Q, Li Z, Shi T, Zhou Y, Mao Y, Yu X. Identification and Molecular Characterization of a Pellino Protein in Kuruma Prawn (Marsupenaeus japonicus) in Response to White Spot Syndrome Virus and Vibrio parahaemolyticus Infection. Int J Mol Sci 2020;21:ijms21041243. [PMID: 32069894 PMCID: PMC7072872 DOI: 10.3390/ijms21041243] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/09/2020] [Revised: 01/23/2020] [Accepted: 02/05/2020] [Indexed: 12/22/2022]

Affiliation(s)

Heqian Zhang Joint Laboratory of Guangdong Province and Hong Kong Regions on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou 510642, China
Wenzhi Cheng State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen 361102, China
Jinbin Zheng State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen 361102, China
Panpan Wang State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen 361102, China
Qinghui Liu Joint Laboratory of Guangdong Province and Hong Kong Regions on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou 510642, China
Zhen Li Joint Laboratory of Guangdong Province and Hong Kong Regions on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou 510642, China
Tianyi Shi State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen 361102, China
Yijian Zhou State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen 361102, China
Yong Mao State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China; Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen 361102, China (corresponding author)
Xiangyong Yu Joint Laboratory of Guangdong Province and Hong Kong Regions on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou 510642, China (corresponding author)

Gull S, Shamim N, Minhas F. AMAP: Hierarchical multi-label prediction of biologically active and antimicrobial peptides. Comput Biol Med 2019;107:172-181. [DOI: 10.1016/j.compbiomed.2019.02.018] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 12/07/2018] [Revised: 02/17/2019] [Accepted: 02/20/2019] [Indexed: 12/12/2022]

Ivan FX, Kwoh CK, Chow VT, Zheng J. Genome Analysis – Identification of Genes Involved in Host-Pathogen Protein-Protein Interaction Networks. Encyclopedia of Bioinformatics and Computational Biology 2019:410-424. [DOI: 10.1016/b978-0-12-809633-8.20124-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]

Abbasi WA, Asif A, Ben-Hur A, Minhas FUAA. Learning protein binding affinity using privileged information. BMC Bioinformatics 2018;19:425. [PMID: 30442086 PMCID: PMC6238365 DOI: 10.1186/s12859-018-2448-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 07/17/2018] [Accepted: 10/25/2018] [Indexed: 01/04/2023]

Abstract

BACKGROUND

Determining protein-protein interactions and their binding affinity is important in understanding cellular biological processes, discovery and design of novel therapeutics, protein engineering, and mutagenesis studies. Due to the time and effort required in wet lab experiments, computational prediction of binding affinity from sequence or structure is an important area of research. Structure-based methods, though more accurate than sequence-based techniques, are limited in their applicability due to limited availability of protein structure data.

RESULTS

In this study, we propose a novel machine learning method for predicting binding affinity that uses protein 3D structure as privileged information at training time while expecting only protein sequence information during testing. Using the method, which is based on the framework of learning using privileged information (LUPI), we have achieved improved performance over corresponding sequence-based binding affinity prediction methods that do not have access to privileged information during training. Our experiments show that with the proposed framework which uses structure only during training, it is possible to achieve classification performance comparable to that which is obtained using structure-based features. Evaluation on an independent test set shows improved performance over the PPA-Pred2 method as well.

CONCLUSIONS

The proposed method outperforms several baseline learners and a state-of-the-art binding affinity predictor not only in cross-validation, but also on an additional validation dataset, demonstrating the utility of the LUPI framework for problems that would benefit from classification using structure-based features. The implementation of LUPI developed for this work is expected to be useful in other areas of bioinformatics as well.

Yang KK, Wu Z, Bedbrook CN, Arnold FH. Learned protein embeddings for machine learning. Bioinformatics 2018;34:2642-2648. [PMID: 29584811 PMCID: PMC6061698 DOI: 10.1093/bioinformatics/bty178] [Citation(s) in RCA: 152] [Impact Index Per Article: 21.7] [Received: 11/30/2017] [Revised: 03/20/2018] [Accepted: 03/22/2018] [Indexed: 12/26/2022]

Basit AH, Abbasi WA, Asif A, Gull S, Minhas FUAA. Training host-pathogen protein-protein interaction predictors. J Bioinform Comput Biol 2018;16:1850014. [PMID: 30060698 DOI: 10.1142/s0219720018500142] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]

Abbasi WA, Asif A, Andleeb S, Minhas FUAA. CaMELS: In silico prediction of calmodulin binding proteins and their binding sites. Proteins 2017;85:1724-1740. [DOI: 10.1002/prot.25330] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 12/09/2016] [Revised: 05/13/2017] [Accepted: 06/07/2017] [Indexed: 11/08/2022]

Choi D, Park B, Chae H, Lee W, Han K. Predicting protein-binding regions in RNA using nucleotide profiles and compositions. BMC Systems Biology 2017;11:16. [PMID: 28361677 PMCID: PMC5374631 DOI: 10.1186/s12918-017-0386-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]

Abstract

Background

Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins. Recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use.

Results

We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross validation, leave-one-protein-out (LOPO) cross validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross validation showed a lower performance than the 10-fold cross validation, but the performance remains high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing of our model and other state-of-the-art methods on the same dataset showed that our model is better than the others.

Conclusions

Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. However, a slight performance gain was obtained when using the sequence profiles along with nucleotide compositions. These are preliminary results of ongoing research, but they demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding.

Electronic supplementary material

The online version of this article (doi:10.1186/s12918-017-0386-4) contains supplementary material, which is available to authorized users.

Kim B, Alguwaizani S, Zhou X, Huang DS, Park B, Han K. An improved method for predicting interactions between virus and human proteins. J Bioinform Comput Biol 2016;15:1650024. [PMID: 27397631 DOI: 10.1142/s0219720016500244] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/26/2023]