1
|
Chen S, Zheng P, Zheng L, Yao Q, Meng Z, Lin L, Chen X, Liu R. BERT-DomainAFP: Antifreeze protein recognition and classification model based on BERT and structural domain annotation. iScience 2025; 28:112077. [PMID: 40241758 PMCID: PMC12002629 DOI: 10.1016/j.isci.2025.112077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2024] [Revised: 01/03/2025] [Accepted: 02/17/2025] [Indexed: 04/18/2025] Open
Abstract
Antifreeze proteins (AFPs) are crucial for organisms to adapt to low temperatures, with applications in medicine, food storage, aquaculture, and agriculture. Accurate AFP identification is challenging due to structural and sequence diversity. To improve prediction and classification, we propose BERT-DomainAFP, a deep learning model trained on the AntiFreezeDomains dataset created with a novel annotation strategy. The model uses pre-trained ProteinBERT and incorporates oversampling and undersampling techniques to handle unbalanced data, ensuring high predictive ability. BERT-DomainAFP achieves 98.48% accuracy, the highest among existing models, and can classify different AFP types based on structural domain features. This model outperforms current tools, offering a promising solution for AFP recognition and classification in research and applications.
Collapse
Affiliation(s)
- Shengzhen Chen
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Ping Zheng
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Lele Zheng
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Qinglong Yao
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Ziyu Meng
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Longshan Lin
- Laboratory of Marine Biodiversity Research, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen 361005, China
| | - Xinhua Chen
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Ruoyu Liu
- State Key Laboratory of Mariculture Breeding, Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| |
Collapse
|
2
|
Qi D, Liu T. VotePLMs-AFP: Identification of antifreeze proteins using transformer-embedding features and ensemble learning. Biochim Biophys Acta Gen Subj 2024; 1868:130721. [PMID: 39426757 DOI: 10.1016/j.bbagen.2024.130721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2024] [Revised: 09/24/2024] [Accepted: 10/11/2024] [Indexed: 10/21/2024]
Abstract
Antifreeze proteins (AFPs) are a unique class of biomolecules capable of protecting other proteins, cell membranes, and cellular structures within organisms from damage caused by freezing conditions. Given the significance of AFPs in various domains such as biotechnology, agriculture, and medicine, several machine learning methods have been developed to identify AFPs. However, due to the complexity and diversity of AFPs, the predictive performance of existing methods is limited. Therefore, there is an urgent need to develop an efficient and rapid computational method for accurately predicting AFPs. In this study, we proposed a novel predictor based on transformer-embedding features and ensemble learning for the identification of AFPs, termed VotePLMs-AFP. Firstly, three types of feature descriptors were extracted from pre-trained protein language models (PLMs) during the feature extraction process. Subsequently, we analyzed six combinations generated by these three embeddings to explore the optimal feature set, which was input into the soft voting-based ensemble learning classifier for the identification of AFPs. Finally, we evaluated the model on the two benchmark datasets. The experimental results show that our model achieves high prediction accuracy in 10-fold cross-validation (CV) and independent set testing, outperforming existing state-of-the-art methods. Therefore, our model could serve as an effective tool for predicting AFPs.
Collapse
Affiliation(s)
- Dawei Qi
- College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
| | - Taigang Liu
- College of Information Technology, Shanghai Ocean University, Shanghai 201306, China.
| |
Collapse
|
3
|
Xia Y, Zhang Y, Liu D, Zhu YH, Wang Z, Song J, Yu DJ. BLAM6A-Merge: Leveraging Attention Mechanisms and Feature Fusion Strategies to Improve the Identification of RNA N6-Methyladenosine Sites. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1803-1815. [PMID: 38913512 DOI: 10.1109/tcbb.2024.3418490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/26/2024]
Abstract
RNA N6-methyladenosine is a prevalent and abundant type of RNA modification that exerts significant influence on diverse biological processes. To date, numerous computational approaches have been developed for predicting methylation, with most of them ignoring the correlations of different encoding strategies and failing to explore the adaptability of various attention mechanisms for methylation identification. To solve the above issues, we proposed an innovative framework for predicting RNA m6A modification site, termed BLAM6A-Merge. Specifically, it utilized a multimodal feature fusion strategy to combine the classification results of four features and Blastn tool. Apart from this, different attention mechanisms were employed for extracting higher-level features on specific features after the screening process. Extensive experiments on 12 benchmarking datasets demonstrated that BLAM6A-Merge achieved superior performance (average AUC: 0.849 for the full transcript mode and 0.784 for the mature mRNA mode). Notably, the Blastn tool was employed for the first time in the identification of methylation sites.
Collapse
|
4
|
Wu J, Liu Y, Zhu Y, Yu DJ. Improving Antifreeze Proteins Prediction With Protein Language Models and Hybrid Feature Extraction Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:2349-2358. [PMID: 39316498 DOI: 10.1109/tcbb.2024.3467261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/26/2024]
Abstract
Accurate identification of antifreeze proteins (AFPs) is crucial in developing biomimetic synthetic anti-icing materials and low-temperature organ preservation materials. Although numerous machine learning-based methods have been proposed for AFPs prediction, the complex and diverse nature of AFPs limits the prediction performance of existing methods. In this study, we propose AFP-Deep, a new deep learning method to predict antifreeze proteins by integrating embedding from protein sequences with pre-trained protein language models and evolutionary contexts with hybrid feature extraction networks. The experimental results demonstrated that the main advantage of AFP-Deep is its utilization of pre-trained protein language models, which can extract discriminative global contextual features from protein sequences. Additionally, the hybrid deep neural networks designed for protein language models and evolutionary context feature extraction enhance the correlation between embeddings and antifreeze pattern. The performance evaluation results show that AFP-Deep achieves superior performance compared to state-of-the-art models on benchmark datasets, achieving an AUPRC of 0.724 and 0.924, respectively.
Collapse
|
5
|
Feng C, Wei H, Li X, Feng B, Xu C, Zhu X, Liu R. A stacking-based algorithm for antifreeze protein identification using combined physicochemical, pseudo amino acid composition, and reduction property features. Comput Biol Med 2024; 176:108534. [PMID: 38754217 DOI: 10.1016/j.compbiomed.2024.108534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2024] [Revised: 04/03/2024] [Accepted: 04/28/2024] [Indexed: 05/18/2024]
Abstract
Antifreeze proteins have wide applications in the medical and food industries. In this study, we propose a stacking-based classifier that can effectively identify antifreeze proteins. Initially, feature extraction was performed in three aspects: reduction properties, scalable pseudo amino acid composition, and physicochemical properties. A hybrid feature set comprised of the combined information from these three categories was obtained. Subsequently, we trained the training set based on LightGBM, XGBoost, and RandomForest algorithms, and the training outcomes were passed to the Logistic algorithm for matching, thereby establishing a stacking algorithm. The proposed algorithm was tested on the test set and an independent validation set. Experimental data indicates that the algorithm achieved a recognition accuracy of 98.3 %, and an accuracy of 98.5 % on the validation set. Lastly, we analyzed the reasons why numerical features achieved high recognition capabilities from multiple aspects. Data dimensionality reduction and the analysis from two-dimensional and three-dimensional views revealed separability between positive and negative samples, and the protein three-dimensional structure further demonstrated significant differences in related features between the two samples. Analysis of the classifier revealed that Hr*Hr, HrHr, and Sc-PseAAC_1, 188D(152,116,57,183) were among the seven most important numerical features affecting algorithm recognition. For Hr*Hr and HrHr, supportive sequence level evidence for the reduction dictionary was found in terms of conservation area analysis, multiple sequence alignment, and amino acid conservative substitution. Moreover, the importance of the reduction dictionary was recognized through a comparative analysis of importance before and after the reduction, realizing the effectiveness of the dictionary in improving feature importance. A decision tree model has been utilized to discern the distinctions between dipeptides associated with the physical and chemical properties of His(H), Iso(I), Leu(L), and Lys(K) and other dipeptides. We finally analyzed the other seven features of importance, and data analysis confirmed that hydrophobicity, secondary structure, charge properties, van der Waals forces, and solvent accessibility are also factors affecting the antifreeze capability of proteins.
Collapse
Affiliation(s)
- Changli Feng
- Department of Information Science and Technology, Taishan University, Taian, 271000, China.
| | - Haiyan Wei
- Department of Information Science and Technology, Taishan University, Taian, 271000, China.
| | - Xin Li
- Department of Information Science and Technology, Taishan University, Taian, 271000, China.
| | - Bin Feng
- Department of Information Science and Technology, Taishan University, Taian, 271000, China.
| | - Chugui Xu
- Department of Information Science and Technology, Taishan University, Taian, 271000, China.
| | - Xiaorong Zhu
- Department of Information Science and Technology, Taishan University, Taian, 271000, China.
| | - Ruijun Liu
- School of Software, Beihang University, Beijing, 100191, China.
| |
Collapse
|
6
|
Box ICH, van der Burg KRL, Marshall KE. Analysis of Ice-Binding Protein Evolution. Methods Mol Biol 2024; 2730:219-229. [PMID: 37943462 DOI: 10.1007/978-1-0716-3503-2_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2023]
Abstract
Discovering novel ice-binding proteins (IBPs) is important for understanding the evolution of IBPs but it is difficult to determine where resources should be directed in the search for novel IBPs. For this reason, we developed a simple bioinformatic approach for aiding in the determination of where to direct efforts in the search for IBPs. First, BLAST is used to obtain a candidate list of putative IBPs. Next, phylogenetic trees are constructed to map the candidate list of putative IBPs to determine if any patterns are forming. These candidate putative IBPs and their patterns are then assessed through the production of ancestral sequences and reverse BLAST searches, in addition to the use of IBP calculators, to determine which sequences should be cut to produce the final putative IBP list. Finally, we explain an avenue to investigate these putative IBPs further for the development of hypotheses on their evolution.
Collapse
Affiliation(s)
- Isaiah C H Box
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
| | | | - Katie E Marshall
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada.
| |
Collapse
|
7
|
Khan A, Uddin J, Ali F, Kumar H, Alghamdi W, Ahmad A. AFP-SPTS: An Accurate Prediction of Antifreeze Proteins Using Sequential and Pseudo-Tri-Slicing Evolutionary Features with an Extremely Randomized Tree. J Chem Inf Model 2023; 63:826-834. [PMID: 36649569 DOI: 10.1021/acs.jcim.2c01417] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]
Abstract
The development of intracellular ice in the bodies of cold-blooded living organisms may cause them to die. These species yield antifreeze proteins (AFPs) to live in subzero temperature environments. Additionally, AFPs are implemented in biotechnological, industrial, agricultural, and medical fields. Machine learning-based predictors were presented for AFP identification. However, more accurate predictors are still highly desirable for boosting the AFP prediction. This work presents a novel approach, named AFP-SPTS, for the correct prediction of AFPs. We explored the discriminative features with four schemes, namely, dipeptide deviation from the expected mean (DDE), reduced amino acid alphabet (RAAA), grouped dipeptide composition (GDPC), and a novel representative method, called pseudo-position-specific scoring matrix tri-slicing (PseTS-PSSM). Considering the advantages of ensemble learning strategy, we fused each feature vector into different combinations and trained the models with five machine learning algorithms, i.e., multilayer perceptron (MLP), extremely randomized tree (ERT), decision tree (DT), random forest (RF), and AdaBoost. Among all models, PseTS-PSSM + RAAA with an extremely randomized tree attained the best outcomes. The proposed predictor (AFP-SPTS) boosted the accuracies of AFPs in the literature by 1.82 and 4.1%.
Collapse
Affiliation(s)
- Adnan Khan
- Qurtuba University of Science and Information Technology, Peshawar5000, Khyber Pakhtunkhwa, Pakistan
| | - Jamal Uddin
- Qurtuba University of Science and Information Technology, Peshawar5000, Khyber Pakhtunkhwa, Pakistan
| | - Farman Ali
- Sarhad University of Science and Information Technology, Mardan Campus, Peshawar23200, Pakistan.,Department of Elementary and Secondary Education Department, Government of Khyber Pakhtunkhwa, Peshawar5000, Khyber Pakhtunkhwa, Pakistan
| | - Harish Kumar
- Department of Computer Science, College of Computer Science, King Khalid University, Abha61421, Saudi Arabia
| | - Wajdi Alghamdi
- Department of Information Technology, Faculty of Computing and Information Technology, King AbdulAziz University, Jeddah21589, Saudi Arabia
| | - Aftab Ahmad
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan23200, Pakistan
| |
Collapse
|
8
|
Khan A, Uddin J, Ali F, Ahmad A, Alghushairy O, Banjar A, Daud A. Prediction of antifreeze proteins using machine learning. Sci Rep 2022; 12:20672. [PMID: 36450775 PMCID: PMC9712683 DOI: 10.1038/s41598-022-24501-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Accepted: 11/16/2022] [Indexed: 12/03/2022] Open
Abstract
Living organisms including fishes, microbes, and animals can live in extremely cold weather. To stay alive in cold environments, these species generate antifreeze proteins (AFPs), also referred to as ice-binding proteins. Moreover, AFPs are extensively utilized in many important fields including medical, agricultural, industrial, and biotechnological. Several predictors were constructed to identify AFPs. However, due to the sequence and structural heterogeneity of AFPs, correct identification is still a challenging task. It is highly desirable to develop a more promising predictor. In this research, a novel computational method, named AFP-LXGB has been proposed for prediction of AFPs more precisely. The information is explored by Dipeptide Composition (DPC), Grouped Amino Acid Composition (GAAC), Position Specific Scoring Matrix-Segmentation-Autocorrelation Transformation (Sg-PSSM-ACT), and Pseudo Position Specific Scoring Matrix Tri-Slicing (PseTS-PSSM). Keeping the benefits of ensemble learning, these feature sets are concatenated into different combinations. The best feature set is selected by Extremely Randomized Tree-Recursive Feature Elimination (ERT-RFE). The models are trained by Light eXtreme Gradient Boosting (LXGB), Random Forest (RF), and Extremely Randomized Tree (ERT). Among classifiers, LXGB has obtained the best prediction results. The novel method (AFP-LXGB) improved the accuracies by 3.70% and 4.09% than the best methods. These results verified that AFP-LXGB can predict AFPs more accurately and can participate in a significant role in medical, agricultural, industrial, and biotechnological fields.
Collapse
Affiliation(s)
- Adnan Khan
- grid.444994.00000 0004 0609 284XQurtuba University of Science and Technology, Peshawar, Khyber Pakhtunkhwa Pakistan
| | - Jamal Uddin
- grid.444994.00000 0004 0609 284XQurtuba University of Science and Technology, Peshawar, Khyber Pakhtunkhwa Pakistan
| | - Farman Ali
- Department of Elementary and Secondary Education, Peshawar, Khyber Pakhtunkhwa Pakistan ,grid.444996.20000 0004 0609 292XSarhad University of Science and Information Technology, Mardan, Pakistan
| | - Ashfaq Ahmad
- grid.440522.50000 0004 0478 6450Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Omar Alghushairy
- grid.460099.2Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia
| | - Ameen Banjar
- grid.460099.2Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia
| | - Ali Daud
- Abu Dhabi School of Management, Abu Dhabi, United Arab Emirates ,grid.460099.2Department of Computer Science and Artificial Intelligence, University of Jeddah, Jeddah, Saudi Arabia
| |
Collapse
|
9
|
Satyakam, Zinta G, Singh RK, Kumar R. Cold adaptation strategies in plants—An emerging role of epigenetics and antifreeze proteins to engineer cold resilient plants. Front Genet 2022; 13:909007. [PMID: 36092945 PMCID: PMC9459425 DOI: 10.3389/fgene.2022.909007] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Accepted: 07/21/2022] [Indexed: 11/13/2022] Open
Abstract
Cold stress adversely affects plant growth, development, and yield. Also, the spatial and geographical distribution of plant species is influenced by low temperatures. Cold stress includes chilling and/or freezing temperatures, which trigger entirely different plant responses. Freezing tolerance is acquired via the cold acclimation process, which involves prior exposure to non-lethal low temperatures followed by profound alterations in cell membrane rigidity, transcriptome, compatible solutes, pigments and cold-responsive proteins such as antifreeze proteins. Moreover, epigenetic mechanisms such as DNA methylation, histone modifications, chromatin dynamics and small non-coding RNAs play a crucial role in cold stress adaptation. Here, we provide a recent update on cold-induced signaling and regulatory mechanisms. Emphasis is given to the role of epigenetic mechanisms and antifreeze proteins in imparting cold stress tolerance in plants. Lastly, we discuss genetic manipulation strategies to improve cold tolerance and develop cold-resistant plants.
Collapse
|
10
|
Hosen MF, Mahmud SH, Ahmed K, Chen W, Moni MA, Deng HW, Shoombuatong W, Hasan MM. DeepDNAbP: A deep learning-based hybrid approach to improve the identification of deoxyribonucleic acid-binding proteins. Comput Biol Med 2022; 145:105433. [DOI: 10.1016/j.compbiomed.2022.105433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2021] [Revised: 03/11/2022] [Accepted: 03/20/2022] [Indexed: 11/03/2022]
|
11
|
Box ICH, Matthews BJ, Marshall KE. Molecular evidence of intertidal habitats selecting for repeated ice-binding protein evolution in invertebrates. J Exp Biol 2022; 225:274373. [PMID: 35258616 DOI: 10.1242/jeb.243409] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2021] [Accepted: 12/20/2021] [Indexed: 12/21/2022]
Abstract
Ice-binding proteins (IBPs) have evolved independently in multiple taxonomic groups to improve their survival at sub-zero temperatures. Intertidal invertebrates in temperate and polar regions frequently encounter sub-zero temperatures, yet there is little information on IBPs in these organisms. We hypothesized that there are far more IBPs than are currently known and that the occurrence of freezing in the intertidal zone selects for these proteins. We compiled a list of genome-sequenced invertebrates across multiple habitats and a list of known IBP sequences and used BLAST to identify a wide array of putative IBPs in those invertebrates. We found that the probability of an invertebrate species having an IBP was significantly greater in intertidal species than in those primarily found in open ocean or freshwater habitats. These intertidal IBPs had high sequence similarity to fish and tick antifreeze glycoproteins and fish type II antifreeze proteins. Previously established classifiers based on machine learning techniques further predicted ice-binding activity in the majority of our newly identified putative IBPs. We investigated the potential evolutionary origin of one putative IBP from the hard-shelled mussel Mytilus coruscus and suggest that it arose through gene duplication and neofunctionalization. We show that IBPs likely readily evolve in response to freezing risk and that there is an array of uncharacterized IBPs, and highlight the need for broader laboratory-based surveys of the diversity of ice-binding activity across diverse taxonomic and ecological groups.
Collapse
Affiliation(s)
- Isaiah C H Box
- Department of Zoology, University of British Columbia, 6270 University Blvd, Vancouver, BC, CanadaV6T 1Z4
| | - Benjamin J Matthews
- Department of Zoology, University of British Columbia, 6270 University Blvd, Vancouver, BC, CanadaV6T 1Z4
| | - Katie E Marshall
- Department of Zoology, University of British Columbia, 6270 University Blvd, Vancouver, BC, CanadaV6T 1Z4
| |
Collapse
|
12
|
Ali F, Akbar S, Ghulam A, Maher ZA, Unar A, Talpur DB. AFP-CMBPred: Computational identification of antifreeze proteins by extending consensus sequences into multi-blocks evolutionary information. Comput Biol Med 2021; 139:105006. [PMID: 34749096 DOI: 10.1016/j.compbiomed.2021.105006] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2021] [Revised: 10/29/2021] [Accepted: 10/29/2021] [Indexed: 11/30/2022]
Abstract
In extremely cold environments, living organisms like plants, animals, fishes, and microbes can die due to the intracellular ice formation in their bodies. To sustain life in such cold environments, some cold-blooded species produced Antifreeze proteins (AFPs), also called ice-binding proteins. AFPs are not only limited to the medical field but also have diverse significance in the area of biotechnology, agriculture, and the food industry. Different AFPs exhibit high heterogeneity in their structures and sequences. Keeping the significance of AFPs, several machine-learning-based models have been developed by scientists for the prediction of AFPs. However, due to the complex and diverse nature of AFPs, the prediction performance of the existing methods is limited. Therefore, it is highly indispensable for researchers to develop a reliable computational model that can accurately predict AFPs. In this connection, this study presents a novel predictor for AFPs, named AFP-CMBPred. The sequences of AFPs are formulated via four different feature representation methods, such as Amphiphilic pseudo amino acid composition (Amp-PseAAC), Dipeptide Deviation from Expected Mean (DDE), Multi-Blocks Position Specific Scoring Matrix (MB-PSSM), and Consensus Sequence-based on Multi-Blocks Position Specific Scoring Matrix (CS-MB-PSSM) to collect local and global descriptors. In the next step, the extracted feature vectors are evaluated via Support Vector Machine (SVM) and Random Forest (RF) based classification learners. The prediction performance of both classifiers is further assessed using three validation methods i.e., jackknife test, 10-fold cross-validation test, and independent test. After examining the prediction rates of all validation tests, it was found that our proposed model achieved the higher prediction accuracies of ∼2.65%, ∼2.84%, and ∼3.37% using jackknife, K-fold, and independent test, respectively. The experimental outcomes validate that our proposed "AFP-CMBPred" predictor secured the highest prediction results than the existing models for the identification of AFPs. It is further anticipated that our proposed AFP-CMBPred model will be considered a valuable tool in the research academia and drug development.
Collapse
Affiliation(s)
- Farman Ali
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
| | - Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan
| | - Ali Ghulam
- Computerization and Network Section, Sindh Agriculture University, Tandojam, Pakistan
| | | | - Ahsanullah Unar
- School of Life Science, University of Science and Technology, China
| | - Dhani Bux Talpur
- School of Information and Communication Engineering, Guilin University of Electronic Technology, Guilin, China
| |
Collapse
|
13
|
Prediction and analysis of antifreeze proteins. Heliyon 2021; 7:e07953. [PMID: 34604556 PMCID: PMC8473546 DOI: 10.1016/j.heliyon.2021.e07953] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2021] [Revised: 07/28/2021] [Accepted: 09/03/2021] [Indexed: 11/20/2022] Open
Abstract
Antifreeze proteins (AFPs) are proteins that protect cellular fluids and body fluids from freezing by inhibiting the nucleation and growth of ice crystals and preventing ice recrystallization, thereby contributing to the maintenance of life in living organisms. They exist in fish, insects, microorganisms, and fungi. However, the number of known AFPs is currently limited, and it is essential to construct a reliable dataset of AFPs and develop a bioinformatics tool to predict AFPs. In this work, we first collected AFPs sequences from UniProtKB considering the reliability of annotations and, based on these datasets, developed a prediction system using random forest. We achieved accuracies of 0.961 and 0.947 for non-redundant sequences with less than 90% and 30% identities and achieved the accuracy of 0.953 for representative sequences for each species. Using the ability of random forest, we identified the sequence features that contributed to the prediction. Some sequence features were common to AFPs from different species. These features include the Cys content, Ala-Ala content, Trp-Gly content, and the amino acids' distribution related to the disorder propensity. The computer program and the dataset developed in this work are available from the GitHub site: https://github.com/ryomiya/Prediction-and-analysis-of-antifreeze-proteins.
Collapse
|
14
|
Wang S, Deng L, Xia X, Cao Z, Fei Y. Predicting antifreeze proteins with weighted generalized dipeptide composition and multi-regression feature selection ensemble. BMC Bioinformatics 2021; 22:340. [PMID: 34162327 PMCID: PMC8220696 DOI: 10.1186/s12859-021-04251-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2021] [Accepted: 06/09/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Antifreeze proteins (AFPs) are a group of proteins that inhibit body fluids from growing to ice crystals and thus improve biological antifreeze ability. It is vital to the survival of living organisms in extremely cold environments. However, little research is performed on sequences feature extraction and selection for antifreeze proteins classification in the structure and function prediction, which is of great significance. RESULTS In this paper, to predict the antifreeze proteins, a feature representation of weighted generalized dipeptide composition (W-GDipC) and an ensemble feature selection based on two-stage and multi-regression method (LRMR-Ri) are proposed. Specifically, four feature selection algorithms: Lasso regression, Ridge regression, Maximal information coefficient and Relief are used to select the feature sets, respectively, which is the first stage of LRMR-Ri method. If there exists a common feature subset among the above four sets, it is the optimal subset; otherwise we use Ridge regression to select the optimal subset from the public set pooled by the four sets, which is the second stage of LRMR-Ri. The LRMR-Ri method combined with W-GDipC was performed both on the antifreeze proteins dataset (binary classification), and on the membrane protein dataset (multiple classification). Experimental results show that this method has good performance in support vector machine (SVM), decision tree (DT) and stochastic gradient descent (SGD). The values of ACC, RE and MCC of LRMR-Ri and W-GDipC with antifreeze proteins dataset and SVM classifier have reached as high as 95.56%, 97.06% and 0.9105, respectively, much higher than those of each single method: Lasso, Ridge, Mic and Relief, nearly 13% higher than single Lasso for ACC. CONCLUSION The experimental results show that the proposed LRMR-Ri and W-GDipC method can significantly improve the accuracy of antifreeze proteins prediction compared with other similar single feature methods. In addition, our method has also achieved good results in the classification and prediction of membrane proteins, which verifies its widely reliability to a certain extent.
Collapse
Affiliation(s)
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, China.
| | - Lin Deng
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, China
| | - Xinnan Xia
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, China.
| | - Zicheng Cao
- School of Public Health (Shenzhen), Sun Yat-Sen University, Guangzhou, 510006, China
| | - Yu Fei
- School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, 650221, China.
| |
Collapse
|
15
|
Zhang HP, Liu W, An JQ, Yang P, Guo LH, Li YQ, Lv J, Yu SH. Transcriptome analyses and weighted gene coexpression network analysis reveal key pathways and genes involved in the rapid cold resistance of the Chinese white wax scale insect. ARCHIVES OF INSECT BIOCHEMISTRY AND PHYSIOLOGY 2021; 107:e21781. [PMID: 33687102 DOI: 10.1002/arch.21781] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2021] [Revised: 02/10/2021] [Accepted: 02/11/2021] [Indexed: 06/12/2023]
Abstract
The Chinese white wax scale insect, Ericerus pela, is an important resource insect in China. The rapid response of E. pela to decreasing temperatures plays key roles in the population distribution. In this study, we analyzed the gene expression of E. pela treated with low temperature using transcriptome analyses and weighted gene coexpression network analysis (WGCNA). The results showed that the cold resistance of E. pela involved changes in the expression of many genes. The genes were mainly involved in alcohol formation activity, lipid metabolism, membrane and structure, and oxidoreductase activity. According to the WGCNA results, some pathways related to cold resistance were found in the genes in the modules, such as cytoskeleton proteins, cytoskeleton protein pathway, biosynthesis of unsaturated fatty acids, glycerophospholipid metabolism, ether lipid metabolism, and thermogenesis. Some of the hub genes were nonspecific lipid-transfer proteins, DnaJ homolog subfamily C member 13, paramyosin, tropomodulin, and tubulin beta chain. In particular, the hub genes of the tan module included the heat shock protein (hsp) 10, hsp 60, hsp 70, and hsp 90 genes. Thirty-five antifreeze protein (afp) genes were identified according to the annotation results. Three afp genes were further identified among the hub genes. Six of these genes were selected for heterogeneous protein expression. One of them was expressed successfully. The thermal hysteresis activity (THA) analyses showed that the THA was 1.73°C. These results showed that the cytoskeleton, lipid metabolism, thermogenesis, HSPs and AFPs may play important roles in the cold resistance of E. pela.
Collapse
Affiliation(s)
- Hong-Ping Zhang
- College of Agriculture and Life Sciences, Kunming University, Kunming, China
| | - Wei Liu
- Research Institute of Resource Insects, Chinese Academy of Forestry, Key Laboratory of Cultivating and Utilization of Resource Insects of State Forestry Administration, Kunming, China
| | - Jia-Qi An
- Research Institute of Resource Insects, Chinese Academy of Forestry, Key Laboratory of Cultivating and Utilization of Resource Insects of State Forestry Administration, Kunming, China
| | - Pu Yang
- Research Institute of Resource Insects, Chinese Academy of Forestry, Key Laboratory of Cultivating and Utilization of Resource Insects of State Forestry Administration, Kunming, China
| | - Li-Hong Guo
- College of Agriculture and Life Sciences, Kunming University, Kunming, China
| | - Yan-Qiong Li
- College of Agriculture and Life Sciences, Kunming University, Kunming, China
| | - Jing Lv
- College of Agriculture and Life Sciences, Kunming University, Kunming, China
| | - Shu-Hui Yu
- College of Agriculture and Life Sciences, Kunming University, Kunming, China
| |
Collapse
|
16
|
Eskandari A, Leow TC, Rahman MBA, Oslan SN. Antifreeze Proteins and Their Practical Utilization in Industry, Medicine, and Agriculture. Biomolecules 2020; 10:biom10121649. [PMID: 33317024 PMCID: PMC7764015 DOI: 10.3390/biom10121649] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2020] [Revised: 11/28/2020] [Accepted: 11/30/2020] [Indexed: 12/15/2022] Open
Abstract
Antifreeze proteins (AFPs) are specific proteins, glycopeptides, and peptides made by different organisms to allow cells to survive in sub-zero conditions. AFPs function by reducing the water’s freezing point and avoiding ice crystals’ growth in the frozen stage. Their capability in modifying ice growth leads to the stabilization of ice crystals within a given temperature range and the inhibition of ice recrystallization that decreases the drip loss during thawing. This review presents the potential applications of AFPs from different sources and types. AFPs can be found in diverse sources such as fish, yeast, plants, bacteria, and insects. Various sources reveal different α-helices and β-sheets structures. Recently, analysis of AFPs has been conducted through bioinformatics tools to analyze their functions within proper time. AFPs can be used widely in various aspects of application and have significant industrial functions, encompassing the enhancement of foods’ freezing and liquefying properties, protection of frost plants, enhancement of ice cream’s texture, cryosurgery, and cryopreservation of cells and tissues. In conclusion, these applications and physical properties of AFPs can be further explored to meet other industrial players. Designing the peptide-based AFP can also be done to subsequently improve its function.
Collapse
Affiliation(s)
- Azadeh Eskandari
- Enzyme and Microbial Technology Research Centre, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia; (A.E.); (T.C.L.)
- Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia
| | - Thean Chor Leow
- Enzyme and Microbial Technology Research Centre, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia; (A.E.); (T.C.L.)
- Department of Cell and Molecular Biology, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia
- Enzyme Technology Laboratory, Institute of Bioscience, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia
| | | | - Siti Nurbaya Oslan
- Enzyme and Microbial Technology Research Centre, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia; (A.E.); (T.C.L.)
- Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia
- Enzyme Technology Laboratory, Institute of Bioscience, Universiti Putra Malaysia, UPM, Serdang 43400, Selangor, Malaysia
- Correspondence: ; Tel.: +60-39769-6710; Fax: +60-39769-7590
| |
Collapse
|
17
|
Khan F, Khan M, Iqbal N, Khan S, Muhammad Khan D, Khan A, Wei DQ. Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach. Front Genet 2020; 11:539227. [PMID: 33093842 PMCID: PMC7527634 DOI: 10.3389/fgene.2020.539227] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2020] [Accepted: 08/13/2020] [Indexed: 01/20/2023] Open
Abstract
Meiotic recombination is the driving force of evolutionary development and an important source of genetic variation. The meiotic recombination does not take place randomly in a chromosome but occurs in some regions of the chromosome. A region in chromosomes with higher rate of meiotic recombination events are considered as hotspots and a region where frequencies of the recombination events are lower are called coldspots. Prediction of meiotic recombination spots provides useful information about the basic functionality of inheritance and genome diversity. This study proposes an intelligent computational predictor called iRSpots-DNN for the identification of recombination spots. The proposed predictor is based on a novel feature extraction method and an optimized deep neural network (DNN). The DNN was employed as a classification engine whereas, the novel features extraction method was developed to extract meaningful features for the identification of hotspots and coldspots across the yeast genome. Unlike previous algorithms, the proposed feature extraction avoids bias among different selected features and preserved the sequence discriminant properties along with the sequence-structure information simultaneously. This study also considered other effective classifiers named support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF) to predict recombination spots. Experimental results on a benchmark dataset with 10-fold cross-validation showed that iRSpots-DNN achieved the highest accuracy, i.e., 95.81%. Additionally, the performance of the proposed iRSpots-DNN is significantly better than the existing predictors on a benchmark dataset. The relevant benchmark dataset and source code are freely available at: https://github.com/Fatima-Khan12/iRspot_DNN/tree/master/iRspot_DNN.
Collapse
Affiliation(s)
- Fatima Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Mukhtaj Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Nadeem Iqbal
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Salman Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Dost Muhammad Khan
- Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Abbas Khan
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Dong-Qing Wei
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.,State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Ministry of Education, Shanghai, China.,Peng Cheng Laboratory, Shenzhen, China
| |
Collapse
|
18
|
Zhu YH, Hu J, Qi Y, Song XN, Yu DJ. Boosting Granular Support Vector Machines for the Accurate Prediction of Protein-Nucleotide Binding Sites. Comb Chem High Throughput Screen 2020; 22:455-469. [PMID: 31553288 DOI: 10.2174/1386207322666190925125524] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2019] [Revised: 06/21/2019] [Accepted: 08/23/2019] [Indexed: 11/22/2022]
Abstract
AIM AND OBJECTIVE The accurate identification of protein-ligand binding sites helps elucidate protein function and facilitate the design of new drugs. Machine-learning-based methods have been widely used for the prediction of protein-ligand binding sites. Nevertheless, the severe class imbalance phenomenon, where the number of nonbinding (majority) residues is far greater than that of binding (minority) residues, has a negative impact on the performance of such machine-learning-based predictors. MATERIALS AND METHODS In this study, we aim to relieve the negative impact of class imbalance by Boosting Multiple Granular Support Vector Machines (BGSVM). In BGSVM, each base SVM is trained on a granular training subset consisting of all minority samples and some reasonably selected majority samples. The efficacy of BGSVM for dealing with class imbalance was validated by benchmarking it with several typical imbalance learning algorithms. We further implemented a protein-nucleotide binding site predictor, called BGSVM-NUC, with the BGSVM algorithm. RESULTS Rigorous cross-validation and independent validation tests for five types of proteinnucleotide interactions demonstrated that the proposed BGSVM-NUC achieves promising prediction performance and outperforms several popular sequence-based protein-nucleotide binding site predictors. The BGSVM-NUC web server is freely available at http://csbio.njust.edu.cn/bioinf/BGSVM-NUC/ for academic use.
Collapse
Affiliation(s)
- Yi-Heng Zhu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
| | - Jun Hu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Yong Qi
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
| | - Xiao-Ning Song
- School of Internet of Things, Jiangnan University, Wuxi 214122, China
| | - Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
| |
Collapse
|
19
|
iTTCA-Hybrid: Improved and robust identification of tumor T cell antigens by utilizing hybrid feature representation. Anal Biochem 2020; 599:113747. [DOI: 10.1016/j.ab.2020.113747] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2020] [Revised: 04/13/2020] [Accepted: 04/16/2020] [Indexed: 02/07/2023]
|
20
|
Sun S, Ding H, Wang D, Han S. Identifying Antifreeze Proteins Based on Key Evolutionary Information. Front Bioeng Biotechnol 2020; 8:244. [PMID: 32274383 PMCID: PMC7113384 DOI: 10.3389/fbioe.2020.00244] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2020] [Accepted: 03/09/2020] [Indexed: 01/08/2023] Open
Abstract
Antifreeze proteins are important antifreeze materials that have been widely used in industry, including in cryopreservation, de-icing, and food storage applications. However, the quantity of some commercially produced antifreeze proteins is insufficient for large-scale industrial applications. Further, many antifreeze proteins have properties such as cytotoxicity, severely hindering their applications. Understanding the mechanisms underlying the protein-ice interactions and identifying novel antifreeze proteins are, therefore, urgently needed. In this study, to uncover the mechanisms underlying protein-ice interactions and provide an efficient and accurate tool for identifying antifreeze proteins, we assessed various evolutionary features based on position-specific scoring matrices (PSSMs) and evaluated their importance for discriminating of antifreeze and non-antifreeze proteins. We then parsimoniously selected seven key features with the highest importance. We found that the selected features showed opposite tendencies (regarding the conservation of certain amino acids) between antifreeze and non-antifreeze proteins. Five out of the seven features had relatively high contributions to the discrimination of antifreeze and non-antifreeze proteins, as revealed by a principal component analysis, i.e., the conservation of the replacement of Cys, Trp, and Gly in antifreeze proteins by Ala, Met, and Ala, respectively, in the related proteins, and the conservation of the replacement of Arg in non-antifreeze proteins by Ser and Arg in the related proteins. Based on the seven parsimoniously selected key features, we established a classifier using support vector machine, which outperformed the state-of-the-art tools. These results suggest that understanding evolutionary information is crucial to designing accurate automated methods for discriminating antifreeze and non-antifreeze proteins. Our classifier, therefore, is an efficient tool for annotating new proteins with antifreeze functions based on sequence information and can facilitate their application in industry.
Collapse
Affiliation(s)
- Shanwen Sun
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Hui Ding
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Donghua Wang
- Department of General Surgery, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
| | - Shuguang Han
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
Collapse
|
21
|
Xiang H, Yang X, Ke L, Hu Y. The properties, biotechnologies, and applications of antifreeze proteins. Int J Biol Macromol 2020; 153:661-675. [PMID: 32156540 DOI: 10.1016/j.ijbiomac.2020.03.040] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2020] [Revised: 03/04/2020] [Accepted: 03/06/2020] [Indexed: 01/30/2023]
Abstract
By natural selection, organisms evolve different solutions to cope with extremely cold weather. The emergence of an antifreeze protein gene is one of the most momentous solutions. Antifreeze proteins possess an importantly functional ability for organisms to survive in cold environments and are widely found in various cold-tolerant species. In this review, we summarize the origin of antifreeze proteins, describe the diversity of their species-specific properties and functions, and highlight the related biotechnology on the basis of both laboratory tests and bioinformatics analysis. The most recent advances in the applications of antifreeze proteins are also discussed. We expect that this systematic review will contribute to the comprehensive knowledge of antifreeze proteins to readers.
Collapse
Affiliation(s)
- Hong Xiang
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People's Republic of China.; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institutes of Advanced Technology
| | - Xiaohu Yang
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People's Republic of China.; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institutes of Advanced Technology
| | - Lei Ke
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People's Republic of China.; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institutes of Advanced Technology
| | - Yong Hu
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People's Republic of China.; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institutes of Advanced Technology.
| |
Collapse
|
22
|
Wang S, Wang X. Prediction of protein structural classes by different feature expressions based on 2-D wavelet denoising and fusion. BMC Bioinformatics 2019; 20:701. [PMID: 31874617 PMCID: PMC6929547 DOI: 10.1186/s12859-019-3276-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Protein structural class predicting is a heavily researched subject in bioinformatics that plays a vital role in protein functional analysis, protein folding recognition, rational drug design and other related fields. However, when traditional feature expression methods are adopted, the features usually contain considerable redundant information, which leads to a very low recognition rate of protein structural classes. RESULTS We constructed a prediction model based on wavelet denoising using different feature expression methods. A new fusion idea, first fuse and then denoise, is proposed in this article. Two types of pseudo amino acid compositions are utilized to distill feature vectors. Then, a two-dimensional (2-D) wavelet denoising algorithm is used to remove the redundant information from two extracted feature vectors. The two feature vectors based on parallel 2-D wavelet denoising are fused, which is known as PWD-FU-PseAAC. The related source codes are available at https://github.com/Xiaoheng-Wang12/Wang-xiaoheng/tree/master. CONCLUSIONS Experimental verification of three low-similarity datasets suggests that the proposed model achieves notably good results as regarding the prediction of protein structural classes.
Collapse
Affiliation(s)
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, People's Republic of China.
| | - Xiaoheng Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, People's Republic of China
| |
Collapse
|
23
|
Wang F, Guan ZX, Dao FY, Ding H. A Brief Review of the Computational Identification of Antifreeze Protein. CURR ORG CHEM 2019. [DOI: 10.2174/1385272823666190718145613] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Lots of cold-adapted organisms could produce antifreeze proteins (AFPs) to counter the freezing of cell fluids by controlling the growth of ice crystal. AFPs have been found in various species such as in vertebrates, invertebrates, plants, bacteria, and fungi. These AFPs from fish, insects and plants displayed a high diversity. Thus, the identification of the AFPs is a challenging task in computational proteomics. With the accumulation of AFPs and development of machine meaning methods, it is possible to construct a high-throughput tool to timely identify the AFPs. In this review, we briefly reviewed the application of machine learning methods in antifreeze proteins identification from difference section, including published benchmark dataset, sequence descriptor, classification algorithms and published methods. We hope that this review will produce new ideas and directions for the researches in identifying antifreeze proteins.
Collapse
Affiliation(s)
- Fang Wang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Zheng-Xing Guan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| |
Collapse
|
24
|
Akbar S, Hayat M, Kabir M, Iqbal M. iAFP-gap-SMOTE: An Efficient Feature Extraction Scheme Gapped Dipeptide Composition is Coupled with an Oversampling Technique for Identification of Antifreeze Proteins. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180816101653] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Antifreeze proteins (AFPs) perform distinguishable roles in maintaining homeostatic conditions of living organisms and protect their cell and body from freezing in extremely cold conditions. Owing to high diversity in protein sequences and structures, the discrimination of AFPs from non- AFPs through experimental approaches is expensive and lengthy. It is, therefore, vastly desirable to propose a computational intelligent and high throughput model that truly reflects AFPs quickly and accurately. In a sequel, a new predictor called “iAFP-gap-SMOTE” is proposed for the identification of AFPs. Protein sequences are expressed by adopting three numerical feature extraction schemes namely; Split Amino Acid Composition, G-gap di-peptide Composition and Reduce Amino Acid alphabet composition. Usually, classification hypothesis biased towards majority class in case of the imbalanced dataset. Oversampling technique Synthetic Minority Over-sampling Technique is employed in order to increase the instances of the lower class and control the biasness. 10-fold cross-validation test is applied to appraise the success rates of “iAFP-gap-SMOTE” model. After the empirical investigation, “iAFP-gap-SMOTE” model obtained 95.02% accuracy. The comparison suggested that the accuracy of” iAFP-gap-SMOTE” model is higher than that of the present techniques in the literature so far. It is greatly recommended that our proposed model “iAFP-gap-SMOTE” might be helpful for the research community and academia.
Collapse
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
| | - Muhammad Kabir
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
| | - Muhammad Iqbal
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
| |
Collapse
|
25
|
Prediction of membrane protein types by exploring local discriminative information from evolutionary profiles. Anal Biochem 2019; 564-565:123-132. [DOI: 10.1016/j.ab.2018.10.027] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2018] [Revised: 10/23/2018] [Accepted: 10/25/2018] [Indexed: 11/17/2022]
|
26
|
Akbar S, Hayat M. iMethyl-STTNC: Identification of N 6-methyladenosine sites by extending the idea of SAAC into Chou's PseAAC to formulate RNA sequences. J Theor Biol 2018; 455:205-211. [PMID: 30031793 DOI: 10.1016/j.jtbi.2018.07.018] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2018] [Revised: 07/14/2018] [Accepted: 07/17/2018] [Indexed: 11/17/2022]
Abstract
N6- methyladenosine (m6A) is a vital post-transcriptional modification, which adds another layer of epigenetic regulation at RNA level. It chemically modifies mRNA that effects protein expression. RNA sequence contains many genetic code motifs (GAC). Among these codes, identification of methylated or not methylated GAC motif is highly indispensable. However, with a large number of RNA sequences generated in post-genomic era, it becomes a challenging task how to accurately and speedily characterize these sequences. In view of this, the concept of an intelligent is incorporated with a computational model that truly and fast reflects the motif of the desired classes. An intelligent computational model "iMethyl-STTNC" model is proposed for identification of methyladenosine sites in RNA. In the proposed study, four feature extraction techniques, such as; Pseudo-dinucleotide-composition, Pseudo-trinucleotide-composition, split-trinucleotide-composition, and split-tetra-nucleotides-composition (STTNC) are utilized for genuine numerical descriptors. Three different classification algorithms including probabilistic neural network, Support vector machine (SVM), and K-nearest neighbor are adopted for prediction. After examining the outcomes of prediction model on each feature spaces, SVM using STTNC feature space reported the highest accuracy of 69.84%, 91.84% on dataset1 and dataset2, respectively. The reported results show that our proposed predictor has achieved encouraging results compared to the present approaches, so far in the research. It is finally reckoned that our developed model might be beneficial for in-depth analysis of genomes and drug development.
Collapse
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan.
| |
Collapse
|
27
|
iMem-2LSAAC: A two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into chou's pseudo amino acid composition. J Theor Biol 2018; 442:11-21. [DOI: 10.1016/j.jtbi.2018.01.008] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2017] [Revised: 12/23/2017] [Accepted: 01/10/2018] [Indexed: 02/08/2023]
|
28
|
A Two-Layer Computational Model for Discrimination of Enhancer and Their Types Using Hybrid Features Pace of Pseudo K-Tuple Nucleotide Composition. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2017. [DOI: 10.1007/s13369-017-2818-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
29
|
Tahir M, Hayat M. Machine learning based identification of protein–protein interactions using derived features of physiochemical properties and evolutionary profiles. Artif Intell Med 2017; 78:61-71. [DOI: 10.1016/j.artmed.2017.06.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2017] [Revised: 06/09/2017] [Accepted: 06/11/2017] [Indexed: 02/09/2023]
|
30
|
CryoProtect: A Web Server for Classifying Antifreeze Proteins from Nonantifreeze Proteins. J CHEM-NY 2017. [DOI: 10.1155/2017/9861752] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Antifreeze protein (AFP) is an ice-binding protein that protects organisms from freezing in extremely cold environments. AFPs are found across a diverse range of species and, therefore, significantly differ in their structures. As there are no consensus sequences available for determining the ice-binding domain of AFPs, thus the prediction and characterization of AFPs from their sequence is a challenging task. This study addresses this issue by predicting AFPs directly from sequence on a large set of 478 AFPs and 9,139 non-AFPs using machine learning (e.g., random forest) as a function of interpretable features (e.g., amino acid composition, dipeptide composition, and physicochemical properties). Furthermore, AFPs were characterized using propensity scores and important physicochemical properties via statistical and principal component analysis. The predictive model afforded high performance with an accuracy of 88.28% and results revealed that AFPs are likely to be composed of hydrophobic amino acids as well as amino acids with hydroxyl and sulfhydryl side chains. The predictive model is provided as a free publicly available web server called CryoProtect for classifying query protein sequence as being either AFP or non-AFP. The data set and source code are for reproducing the results which are provided on GitHub.
Collapse
|
31
|
Chow-Shi-Yée M, Briard JG, Grondin M, Averill-Bates DA, Ben RN, Ouellet F. Inhibition of ice recrystallization and cryoprotective activity of wheat proteins in liver and pancreatic cells. Protein Sci 2016; 25:974-86. [PMID: 26889747 DOI: 10.1002/pro.2903] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2015] [Revised: 02/09/2016] [Accepted: 02/12/2016] [Indexed: 01/08/2023]
Abstract
Efficient cryopreservation of cells at ultralow temperatures requires the use of substances that help maintain viability and metabolic functions post-thaw. We are developing new technology where plant proteins are used to substitute the commonly-used, but relatively toxic chemical dimethyl sulfoxide. Recombinant forms of four structurally diverse wheat proteins, TaIRI-2 (ice recrystallization inhibition), TaBAS1 (2-Cys peroxiredoxin), WCS120 (dehydrin), and TaENO (enolase) can efficiently cryopreserve hepatocytes and insulin-secreting INS832/13 cells. This study shows that TaIRI-2 and TaENO are internalized during the freeze-thaw process, while TaBAS1 and WCS120 remain at the extracellular level. Possible antifreeze activity of the four proteins was assessed. The "splat cooling" method for quantifying ice recrystallization inhibition activity (a property that characterizes antifreeze proteins) revealed that TaIRI-2 and TaENO are more potent than TaBAS1 and WCS120. Because of their ability to inhibit ice recrystallization, the wheat recombinant proteins TaIRI-2 and TaENO are promising candidates and could prove useful to improve cryopreservation protocols for hepatocytes and insulin-secreting cells, and possibly other cell types. TaENO does not have typical ice-binding domains, and the TargetFreeze tool did not predict an antifreeze capacity, suggesting the existence of nontypical antifreeze domains. The fact that TaBAS1 is an efficient cryoprotectant but does not show antifreeze activity indicates a different mechanism of action. The cryoprotective properties conferred by WCS120 depend on biochemical properties that remain to be determined. Overall, our results show that the proteins' efficiencies vary between cell types, and confirm that a combination of different protection mechanisms is needed to successfully cryopreserve mammalian cells.
Collapse
Affiliation(s)
- Mélanie Chow-Shi-Yée
- Département Des Sciences Biologiques, Université Du Québec À Montréal, Montréal, Canada
| | - Jennie G Briard
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Canada
| | - Mélanie Grondin
- Département Des Sciences Biologiques, Université Du Québec À Montréal, Montréal, Canada
| | - Diana A Averill-Bates
- Département Des Sciences Biologiques, Université Du Québec À Montréal, Montréal, Canada
| | - Robert N Ben
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Canada
| | - François Ouellet
- Département Des Sciences Biologiques, Université Du Québec À Montréal, Montréal, Canada
| |
Collapse
|
32
|
Ahmad K, Waris M, Hayat M. Prediction of Protein Submitochondrial Locations by Incorporating Dipeptide Composition into Chou's General Pseudo Amino Acid Composition. J Membr Biol 2016; 249:293-304. [PMID: 26746980 DOI: 10.1007/s00232-015-9868-8] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2015] [Accepted: 12/30/2015] [Indexed: 12/15/2022]
Abstract
Mitochondrion is the key organelle of eukaryotic cell, which provides energy for cellular activities. Submitochondrial locations of proteins play crucial role in understanding different biological processes such as energy metabolism, program cell death, and ionic homeostasis. Prediction of submitochondrial locations through conventional methods are expensive and time consuming because of the large number of protein sequences generated in the last few decades. Therefore, it is intensively desired to establish an automated model for identification of submitochondrial locations of proteins. In this regard, the current study is initiated to develop a fast, reliable, and accurate computational model. Various feature extraction methods such as dipeptide composition (DPC), Split Amino Acid Composition, and Composition and Translation were utilized. In order to overcome the issue of biasness, oversampling technique SMOTE was applied to balance the datasets. Several classification learners including K-Nearest Neighbor, Probabilistic Neural Network, and support vector machine (SVM) are used. Jackknife test is applied to assess the performance of classification algorithms using two benchmark datasets. Among various classification algorithms, SVM achieved the highest success rates in conjunction with the condensed feature space of DPC, which are 95.20 % accuracy on dataset SML3-317 and 95.11 % on dataset SML3-983. The empirical results revealed that our proposed model obtained the highest results so far in the literatures. It is anticipated that our proposed model might be useful for future studies.
Collapse
Affiliation(s)
- Khurshid Ahmad
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Muhammad Waris
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan.
| |
Collapse
|
33
|
Badet T, Peyraud R, Raffaele S. Common protein sequence signatures associate with Sclerotinia borealis lifestyle and secretion in fungal pathogens of the Sclerotiniaceae. FRONTIERS IN PLANT SCIENCE 2015; 6:776. [PMID: 26442085 DOI: 10.3389/fpls.2015.00776issn=1664-462x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Received: 06/23/2015] [Accepted: 09/10/2015] [Indexed: 05/25/2023]
Abstract
Fungal plant pathogens produce secreted proteins adapted to function outside fungal cells to facilitate colonization of their hosts. In many cases such as for fungi from the Sclerotiniaceae family the repertoire and function of secreted proteins remains elusive. In the Sclerotiniaceae, whereas Sclerotinia sclerotiorum and Botrytis cinerea are cosmopolitan broad host-range plant pathogens, Sclerotinia borealis has a psychrophilic lifestyle with a low optimal growth temperature, a narrow host range and geographic distribution. To spread successfully, S. borealis must synthesize proteins adapted to function in its specific environment. The search for signatures of adaptation to S. borealis lifestyle may therefore help revealing proteins critical for colonization of the environment by Sclerotiniaceae fungi. Here, we analyzed amino acids usage and intrinsic protein disorder in alignments of groups of orthologous proteins from the three Sclerotiniaceae species. We found that enrichment in Thr, depletion in Glu and Lys, and low disorder frequency in hot loops are significantly associated with S. borealis proteins. We designed an index to report bias in these properties and found that high index proteins were enriched among secreted proteins in the three Sclerotiniaceae fungi. High index proteins were also enriched in function associated with plant colonization in S. borealis, and in in planta-induced genes in S. sclerotiorum. We highlight a novel putative antifreeze protein and a novel putative lytic polysaccharide monooxygenase identified through our pipeline as candidate proteins involved in colonization of the environment. Our findings suggest that similar protein signatures associate with S. borealis lifestyle and with secretion in the Sclerotiniaceae. These signatures may be useful for identifying proteins of interest as targets for the management of plant diseases.
Collapse
Affiliation(s)
- Thomas Badet
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441 Castanet-Tolosan, France ; Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594 Castanet-Tolosan, France
| | - Rémi Peyraud
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441 Castanet-Tolosan, France ; Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594 Castanet-Tolosan, France
| | - Sylvain Raffaele
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441 Castanet-Tolosan, France ; Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594 Castanet-Tolosan, France
| |
Collapse
|
34
|
Kabir M, Hayat M. iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou’s PseAAC to formulate DNA samples. Mol Genet Genomics 2015; 291:285-96. [DOI: 10.1007/s00438-015-1108-5] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2015] [Accepted: 08/19/2015] [Indexed: 10/23/2022]
|
35
|
Badet T, Peyraud R, Raffaele S. Common protein sequence signatures associate with Sclerotinia borealis lifestyle and secretion in fungal pathogens of the Sclerotiniaceae. FRONTIERS IN PLANT SCIENCE 2015; 6:776. [PMID: 26442085 PMCID: PMC4585107 DOI: 10.3389/fpls.2015.00776] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/23/2015] [Accepted: 09/10/2015] [Indexed: 05/04/2023]
Abstract
Fungal plant pathogens produce secreted proteins adapted to function outside fungal cells to facilitate colonization of their hosts. In many cases such as for fungi from the Sclerotiniaceae family the repertoire and function of secreted proteins remains elusive. In the Sclerotiniaceae, whereas Sclerotinia sclerotiorum and Botrytis cinerea are cosmopolitan broad host-range plant pathogens, Sclerotinia borealis has a psychrophilic lifestyle with a low optimal growth temperature, a narrow host range and geographic distribution. To spread successfully, S. borealis must synthesize proteins adapted to function in its specific environment. The search for signatures of adaptation to S. borealis lifestyle may therefore help revealing proteins critical for colonization of the environment by Sclerotiniaceae fungi. Here, we analyzed amino acids usage and intrinsic protein disorder in alignments of groups of orthologous proteins from the three Sclerotiniaceae species. We found that enrichment in Thr, depletion in Glu and Lys, and low disorder frequency in hot loops are significantly associated with S. borealis proteins. We designed an index to report bias in these properties and found that high index proteins were enriched among secreted proteins in the three Sclerotiniaceae fungi. High index proteins were also enriched in function associated with plant colonization in S. borealis, and in in planta-induced genes in S. sclerotiorum. We highlight a novel putative antifreeze protein and a novel putative lytic polysaccharide monooxygenase identified through our pipeline as candidate proteins involved in colonization of the environment. Our findings suggest that similar protein signatures associate with S. borealis lifestyle and with secretion in the Sclerotiniaceae. These signatures may be useful for identifying proteins of interest as targets for the management of plant diseases.
Collapse
Affiliation(s)
- Thomas Badet
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441Castanet-Tolosan, France
- Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594Castanet-Tolosan, France
| | - Rémi Peyraud
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441Castanet-Tolosan, France
- Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594Castanet-Tolosan, France
| | - Sylvain Raffaele
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441Castanet-Tolosan, France
- Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594Castanet-Tolosan, France
- *Correspondence: Sylvain Raffaele, Laboratoire des Interactions Plante Micro-organismes, 24 Chemin de Borde Rouge – Auzeville, 31326 Castanet Tolosan, France
| |
Collapse
|