1
Xu Z, Zhang R, Chen H, Zhang L, Yan X, Qin Z, Cong S, Tan Z, Li T, Du M. Characterization and preparation of food-derived peptides on improving osteoporosis: A review. Food Chem X 2024; 23:101530. [PMID: 38933991 PMCID: PMC11200288 DOI: 10.1016/j.fochx.2024.101530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Revised: 05/18/2024] [Accepted: 06/01/2024] [Indexed: 06/28/2024]
Abstract
Osteoporosis is a systemic bone disease characterized by reduced bone mass and deterioration of the microstructure of bone tissue, leading to an increased risk of fragility fractures and affecting human health worldwide. Food-derived peptides are widely used in functional foods due to their low toxicity, ease of digestion and absorption, and potential to improve osteoporosis. This review summarizes and discusses methods of diagnosing osteoporosis, treatment approaches, specific peptides as alternatives to conventional drugs, and laboratory methods for preparing and identifying peptides. It finds that peptides interacting with the RGD (arginine-glycine-aspartic acid)-binding active sites of integrin can alleviate osteoporosis, analyzes the interaction sites between these osteogenic peptides and integrin, and discusses their effects on improving osteoporosis. These findings may provide new insights for the rapid screening of osteogenic peptides and a theoretical basis for their application in bone materials and functional foods.
Affiliation(s)
- Zhe Xu
- School of Food Science and Technology, State Key Laboratory of Marine Food Processing and Safety Control, Dalian Polytechnic University, Dalian 116034, China
- College of Life Sciences, Key Laboratory of Biotechnology and Bioresources Utilization, Dalian Minzu University, Ministry of Education, Dalian 116600, China
- Institute of Bast Fiber Crops & Center of Southern Economic Crops, Chinese Academy of Agricultural Sciences, Changsha 410205, China
- Rui Zhang
- School of Food Science and Technology, State Key Laboratory of Marine Food Processing and Safety Control, Dalian Polytechnic University, Dalian 116034, China
- Hongrui Chen
- School of Food and Bioengineering, Food Microbiology Key Laboratory of Sichuan Province, Chongqing Key Laboratory of Speciality Food Co-Built by Sichuan and Chongqing, Xihua University, Chengdu, Sichuan 611130, China
- Lijuan Zhang
- College of Life Sciences, Key Laboratory of Biotechnology and Bioresources Utilization, Dalian Minzu University, Ministry of Education, Dalian 116600, China
- Xu Yan
- College of Life Sciences, Key Laboratory of Biotechnology and Bioresources Utilization, Dalian Minzu University, Ministry of Education, Dalian 116600, China
- Zijin Qin
- Department of Food Science and Technology, University of Georgia, Clarke, Athens, GA 30602, USA
- Shuang Cong
- College of Life Sciences, Yantai University, Yantai, Shandong 264005, China
- Zhijian Tan
- Institute of Bast Fiber Crops & Center of Southern Economic Crops, Chinese Academy of Agricultural Sciences, Changsha 410205, China
- Tingting Li
- College of Life Sciences, Key Laboratory of Biotechnology and Bioresources Utilization, Dalian Minzu University, Ministry of Education, Dalian 116600, China
- Ming Du
- School of Food Science and Technology, State Key Laboratory of Marine Food Processing and Safety Control, Dalian Polytechnic University, Dalian 116034, China
2
Sreekumar S, Divya K, Joy N, Soniya EV. De novo transcriptome profiling unveils the regulation of phenylpropanoid biosynthesis in unripe Piper nigrum berries. BMC Plant Biology 2022; 22:501. [PMID: 36284267 PMCID: PMC9597958 DOI: 10.1186/s12870-022-03878-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 09/09/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND Black pepper (Piper nigrum L.) is rich in bioactive compounds that make it an important constituent of traditional medicines. Although the unripe fruits have long been used in different Ayurvedic formulations, the mechanism of gene regulation resulting in the production of the bioactive compounds in black pepper has not been much investigated. Exploring the regulatory factors favouring the production of bioactive compounds would ultimately help to increase the medicinally important content of black pepper. The factors that enhance the biosynthesis of these compounds could be potential candidates for metabolic engineering strategies to obtain high-level production of significant biomolecules. RESULTS Because P. nigrum is a non-model plant, de novo sequencing technology was used to unravel comprehensive information about the genes and transcription factors that are expressed in mature unripe green berries of P. nigrum, from which commercially available black pepper is prepared. In this study, the key gene regulation involved in the synthesis of bioactive principles in black pepper was brought out, with a focus on the highly expressed phenylpropanoid pathway genes. Quantitative real-time PCR analysis of critical genes and transcription factors in the different developmental stages, from bud to mature green berries, provides information useful for choosing the developmental stage best suited to the production of a particular bioactive compound. A comparison with a previous study is also included to place the results of this study in context. CONCLUSIONS The current study uncovered significant information regarding the gene expression and regulation responsible for the bioactivity of black pepper. The key transcription factors and enzymes analyzed in this study are promising targets for achieving high-level production of significant biomolecules through metabolic engineering.
Affiliation(s)
- Sweda Sreekumar
- Transdisciplinary Biology, Rajiv Gandhi Centre for Biotechnology (RGCB), Thiruvananthapuram, Kerala, India
- Research Centre, University of Kerala, Thiruvananthapuram, Kerala, India
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
- Kattupalli Divya
- Transdisciplinary Biology, Rajiv Gandhi Centre for Biotechnology (RGCB), Thiruvananthapuram, Kerala, India
- Research Centre, University of Kerala, Thiruvananthapuram, Kerala, India
- Nisha Joy
- Transdisciplinary Biology, Rajiv Gandhi Centre for Biotechnology (RGCB), Thiruvananthapuram, Kerala, India
- Centre for Gene Regulation & Expression, School of Life Sciences, University of Dundee, Dundee, Scotland
- E V Soniya
- Transdisciplinary Biology, Rajiv Gandhi Centre for Biotechnology (RGCB), Thiruvananthapuram, Kerala, India.
3
Miao M, De Clercq E, Li G. Towards Efficient and Accurate SARS-CoV-2 Genome Sequence Typing Based on Supervised Learning Approaches. Microorganisms 2022; 10:1785. [PMID: 36144387 PMCID: PMC9505117 DOI: 10.3390/microorganisms10091785] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/11/2022] [Revised: 08/24/2022] [Accepted: 09/01/2022] [Indexed: 11/16/2022]
Abstract
Despite the active development of SARS-CoV-2 surveillance methods (e.g., Nextstrain, GISAID, Pangolin), the global emergence of various SARS-CoV-2 viral lineages that potentially cause antiviral and vaccine failure has driven the need for accurate and efficient SARS-CoV-2 genome sequence classifiers. This study presents an optimized method that accurately identifies the viral lineages of SARS-CoV-2 genome sequences using existing schemes. For Nextstrain and GISAID clades, a template matching-based method is proposed to quantify the differences between viral clades and to play an important role in classification evaluation. Furthermore, to improve the typing accuracy of SARS-CoV-2 genome sequences, an ensemble model that integrates a combination of machine learning-based methods (such as Random Forest and Catboost) with optimized weights is proposed for Nextstrain, Pangolin, and GISAID clades. Cross-validation is applied to optimize the parameters of the machine learning-based method and the weight settings of the ensemble model. To improve the efficiency of the model, in addition to the one-hot encoding method, we have proposed a nucleotide site mutation-based data structure that requires less computational resources and performs better in SARS-CoV-2 genome sequence typing. Based on an accumulated database of >1 million SARS-CoV-2 genome sequences, performance evaluations show that the proposed system has a typing accuracy of 99.879%, 97.732%, and 96.291% for Nextstrain, Pangolin, and GISAID clades, respectively. A single prediction only takes an average of <20 ms on a portable laptop. Overall, this study provides an efficient and accurate SARS-CoV-2 genome sequence typing system that benefits current and future surveillance of SARS-CoV-2 variants.
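The abstract contrasts one-hot encoding with a nucleotide site mutation-based data structure but does not specify the latter. The sketch below is only one plausible reading, not the authors' implementation: it assumes sequences are already aligned to a reference and stores only the sites that differ from it. The names `one_hot`, `site_mutations`, and the toy sequences are hypothetical.

```python
from typing import Dict, List

NUCLEOTIDES = "ACGT"


def one_hot(seq: str) -> List[List[int]]:
    """Dense encoding: one 4-element indicator row per site."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq]


def site_mutations(seq: str, reference: str) -> Dict[int, str]:
    """Sparse encoding (assumed form): 0-based position -> observed base,
    kept only for sites that differ from the reference."""
    if len(seq) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return {i: b for i, (a, b) in enumerate(zip(reference, seq)) if a != b}


# Toy aligned sequences; real genomes are roughly 30,000 sites long.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"

dense = one_hot(sample)                     # 10 rows of 4 values each
sparse = site_mutations(sample, reference)  # only the 2 differing sites
```

Under this assumption the sparse form grows with the number of mutations rather than with genome length, which is consistent with the abstract's claim of lower computational cost.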
Affiliation(s)
- Miao Miao
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, Changsha 410078, China
- Erik De Clercq
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- Guangdi Li
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, Changsha 410078, China
- Hunan Children’s Hospital, Changsha 410007, China
- Correspondence: Tel.: +86-731-8480-5414
4
Deep Learning Based-Virtual Screening Using 2D Pharmacophore Fingerprint in Drug Discovery. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
5
Wu W, Zhu S, Xu L, Zhu L, Wang D, Liu Y, Liu S, Hao Z, Lu Y, Yang L, Shi J, Chen J. Genome-wide identification of the Liriodendron chinense WRKY gene family and its diverse roles in response to multiple abiotic stress. BMC Plant Biology 2022; 22:25. [PMID: 35012508 PMCID: PMC8744262 DOI: 10.1186/s12870-021-03371-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 09/14/2021] [Accepted: 11/29/2021] [Indexed: 05/27/2023]
Abstract
BACKGROUND Liriodendron chinense (Lchi) is a tree species within the Magnoliaceae family and is considered a basal angiosperm. Excessively low or high temperatures and soil drought restrict its growth, so improving the abiotic stress tolerance of L. chinense is a key issue for study. WRKYs are a major family of plant transcription factors that are often involved in biotic and abiotic stress responses. So far, it is still largely unknown if and how the LchiWRKY gene family is tied to regulating L. chinense stress responses. Therefore, studying the involvement of the WRKY gene family in abiotic stress regulation in L. chinense could be very informative in showing how this tree deals with such stressful conditions. RESULTS In this research, we performed a genome-wide analysis of the Liriodendron chinense (Lchi) WRKY gene family, studying their classification relationships, gene structure, chromosomal locations, gene duplication, cis-elements, and response to abiotic stress. The 44 members of the LchiWRKY gene family show considerable sequence diversity, with lengths ranging from 525 bp to 40,981 bp. Using classification analysis, we divided the 44 LchiWRKY genes into three phylogenetic groups (I, II, III), with group II being further divided into five subgroups (IIa, IIb, IIc, IId, IIe). Comparative phylogenetic analysis including the WRKY families from 17 plant species suggested that LchiWRKYs are closely related to the WRKY family of the magnoliid Cinnamomum kanehirae, and that the family has fewer members than those of higher plants. We found the LchiWRKYs to be evenly distributed across 15 chromosomes, with their duplication events suggesting that tandem duplication may have played a major role in the LchiWRKY gene expansion model. A Ka/Ks analysis indicated that they mainly underwent purifying selection and were distributed in group IId.
Motif analysis showed that LchiWRKYs contained 20 motifs, and the different phylogenetic groups contained conserved motifs. Gene ontology (GO) analysis showed that LchiWRKYs were mainly enriched in two categories, i.e., biological process and molecular function. Two group IIc members (LchiWRKY10 and LchiWRKY37) contain unique WRKY element sequence variants (WRKYGKK and WRKYGKS). Gene structure analysis showed that most LchiWRKYs possess three exons and two different types of introns, the R- and V-types, both of which are contained within the WRKY domain (WD). Additional promoter cis-element analysis indicated that 12 cis-elements with different functions in environmental adaptability occur across all LchiWRKY groups. Heat, cold, and drought stress mainly induced the expression of group II and I LchiWRKYs, some of which had undergone gene duplication during evolution, and more than half of which had three exons. LchiWRKY33 mainly responded to cold stress, LchiWRKY25 mainly responded to heat stress, and LchiWRKY18 mainly responded to drought stress, with almost 4-fold higher expression, while five LchiWRKYs (LchiWRKY5, LchiWRKY23, LchiWRKY14, LchiWRKY27, and LchiWRKY36) responded equally to all three stresses with more than 6-fold expression. Subcellular localization analysis showed that all LchiWRKYs were localized in the nucleus, and subcellular localization experiments on LchiWRKY18 and LchiWRKY36 also showed that these two transcription factors were expressed in the nucleus. CONCLUSIONS This study shows that in Liriodendron chinense, several WRKY genes, such as LchiWRKY33, LchiWRKY25, and LchiWRKY18, respond to cold, heat, or drought stress, suggesting that they may indeed play a role in regulating the tree's response to such conditions. This information will play a pivotal role in directing further studies on the function of the LchiWRKY gene family in abiotic stress response and provides a theoretical basis for popularizing afforestation in different regions of China.
Affiliation(s)
- Weihuang Wu
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Sheng Zhu
- College of Biology and the Environment, Nanjing Forestry University, Nanjing, China
- Lin Xu
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Liming Zhu
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Dandan Wang
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Yang Liu
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Siqin Liu
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Zhaodong Hao
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Ye Lu
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Liming Yang
- College of Biology and the Environment, Nanjing Forestry University, Nanjing, China
- Jisen Shi
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Jinhui Chen
- Key Laboratory of Forest Genetics and Biotechnology, Ministry of Education of China, Co-Innovation Center for the Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China.
6
Cai K, Liu H, Chen S, Liu Y, Zhao X, Chen S. Genome-wide identification and analysis of class III peroxidases in Betula pendula. BMC Genomics 2021; 22:314. [PMID: 33932996 PMCID: PMC8088069 DOI: 10.1186/s12864-021-07622-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/13/2020] [Accepted: 04/15/2021] [Indexed: 11/12/2022]
Abstract
BACKGROUND Class III peroxidase (POD) proteins are widely present in the plant kingdom and are involved in a broad range of physiological processes, including stress responses and lignin polymerization, throughout the plant life cycle. At present, POD genes have been studied in Arabidopsis, rice, poplar, maize and Chinese pear, but there are no reports on the identification and function of the POD gene family in Betula pendula. RESULTS We identified 90 nonredundant POD genes in Betula pendula (designated BpPODs). According to phylogenetic relationships, these POD genes were classified into 12 groups. The BpPODs are distributed in different numbers on the 14 chromosomes, and some BpPODs are located sequentially in tandem on chromosomes. In addition, we analyzed the conserved domains of BpPOD proteins and found that they contain highly conserved motifs. We also investigated their expression patterns in different tissues; the results showed that some BpPODs might play an important role in xylem, leaf, root and flower. Furthermore, under low temperature conditions, some BpPODs showed different expression patterns at different times. CONCLUSIONS Research on the structure and function of the POD genes in Betula pendula plays a very important role in understanding the growth and development process and the molecular mechanism of stress resistance. These results lay the theoretical foundation for the genetic improvement of Betula pendula.
Affiliation(s)
- Kewei Cai
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, 150040, China
- Huixin Liu
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, 150040, China
- Song Chen
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, 150040, China
- Yi Liu
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, 150040, China
- Xiyang Zhao
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, 150040, China
- Su Chen
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, 150040, China.
7
Thadani NN, Zhou Q, Reyes Gamas K, Butler S, Bueno C, Schafer NP, Morcos F, Wolynes PG, Suh J. Frustration and Direct-Coupling Analyses to Predict Formation and Function of Adeno-Associated Virus. Biophys J 2020; 120:489-503. [PMID: 33359833 DOI: 10.1016/j.bpj.2020.12.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/06/2020] [Revised: 11/08/2020] [Accepted: 12/08/2020] [Indexed: 01/03/2023]
Abstract
Adeno-associated virus (AAV) is a promising gene therapy vector because of its efficient gene delivery and relatively mild immunogenicity. To improve delivery target specificity, researchers use combinatorial and rational library design strategies to generate novel AAV capsid variants. These approaches frequently propose high proportions of nonforming or noninfective capsid protein sequences that reduce the effective depth of synthesized vector DNA libraries, thereby raising the discovery cost of novel vectors. We evaluated two computational techniques for their ability to estimate the impact of residue mutations on AAV capsid protein-protein interactions and thus predict changes in vector fitness, reasoning that these approaches might inform the design of functionally enriched AAV libraries and accelerate therapeutic candidate identification. The Frustratometer computes an energy function derived from the energy landscape theory of protein folding. Direct-coupling analysis (DCA) is a statistical framework that captures residue coevolution within proteins. We applied the Frustratometer to select candidate protein residues predicted to favor assembled or disassembled capsid states, then predicted mutation effects at these sites using the Frustratometer and DCA. Capsid mutants were experimentally assessed for changes in virus formation, stability, and transduction ability. The Frustratometer-based metric showed a counterintuitive correlation with viral stability, whereas a DCA-derived metric was highly correlated with virus transduction ability in the small population of residues studied. Our results suggest that coevolutionary models may be able to elucidate complex capsid residue-residue interaction networks essential for viral function, but further study is needed to understand the relationship between protein energy simulations and viral capsid metastability.
Affiliation(s)
- Qin Zhou
- Department of Biological Sciences, University of Texas at Dallas, Richardson, Texas
- Susan Butler
- Department of Bioengineering, Rice University, Houston, Texas
- Carlos Bueno
- Center for Theoretical Biological Physics, Rice University, Houston, Texas; Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas
- Nicholas P Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, Texas; Department of Chemistry, Rice University, Houston, Texas
- Faruck Morcos
- Department of Biological Sciences, University of Texas at Dallas, Richardson, Texas; Center for Systems Biology, University of Texas at Dallas, Richardson, Texas; Department of Bioengineering, University of Texas at Dallas, Richardson, Texas
- Peter G Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas; Department of Chemistry, Rice University, Houston, Texas; Department of Biosciences, Rice University, Houston, Texas; Department of Physics, Rice University, Houston, Texas
- Junghae Suh
- Department of Bioengineering, Rice University, Houston, Texas; Department of Biosciences, Rice University, Houston, Texas; Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas; Systems, Synthetic, and Physical Biology Program, Rice University, Houston, Texas.
8
Kalmykova SD, Arapidi GP, Urban AS, Osetrova MS, Gordeeva VD, Ivanov VT, Govorun VM. In Silico Analysis of Peptide Potential Biological Functions. Russian Journal of Bioorganic Chemistry 2018. [DOI: 10.1134/s106816201804009x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/23/2022]
9
Abstract
Amino acid mutations in proteins are random, and those mutations that are beneficial or neutral survive during the course of evolution. Conservation or co-evolution analyses are performed on the multiple sequence alignment of homologous proteins to understand how important different amino acids or groups of them are. However, these traditional analyses do not explore the directed influence of amino acid mutations, such as compensatory effects. In this work we develop a method to capture the directed evolutionary impact of one amino acid on all other amino acids, and provide a visual network representation for it. The method developed for these directed networks of inter- and intra-protein evolutionary interactions can also be used for noting the differences in amino acid evolution between control and experimental groups. The analysis is illustrated with a few examples, where the method identifies several directed interactions of functionally critical amino acids. The impact of an amino acid is quantified as the number of amino acids that are influenced as a consequence of its mutation, and it is intended to summarize the compensatory mutations in large evolutionary sequence data sets as well as to rationally identify targets for mutagenesis when their functional significance cannot be assessed using structure or conservation.
10
Xu Y, Liu F, Han G, Cheng B. Genome-wide identification and comparative analysis of phosphate starvation-responsive transcription factors in maize and three other gramineous plants. Plant Cell Reports 2018; 37:711-726. [PMID: 29396709 DOI: 10.1007/s00299-018-2262-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/08/2017] [Accepted: 01/24/2018] [Indexed: 06/07/2023]
Abstract
The present study identified several important candidate Pi regulation genes of maize and provides a better understanding of the generation of PHR genes in gramineous plants. Plants have evolved adaptive responses to cope with low phosphate (Pi) soils. Previous studies have indicated that phosphate starvation response (PHR) genes play central roles in regulating plant Pi starvation responses. However, investigation of the PHR family in gramineous plants is limited. In this study, we identified 64 PHR genes in four gramineous plants, namely maize, rice, sorghum, and brachypodium, and conducted systematic analyses of the phylogeny, structure, collinearity, and expression patterns of these PHR genes. Genome synteny analysis revealed that a number of PHR genes were present in the corresponding syntenic blocks of maize, rice, sorghum, and brachypodium, indicating that large-scale duplication events contributed significantly to the expansion and evolution of PHR genes in these gramineous plants. Gene expression analysis showed that many PHR genes were expressed in various tissues, suggesting that these genes are involved in Pi redistribution and allocation. In addition, the expression levels of PHR genes from maize and rice under low Pi stress conditions revealed that some PHRs may play an important role in the Pi starvation response. Our results provide a better understanding of the generation of PHR genes in gramineous plants and identify several important candidate Pi regulation genes of maize.
Affiliation(s)
- Yunjian Xu
- National Engineering Laboratory of Crop Stress Resistance, Anhui Agricultural University, No. 130, Changjiang West Road, Hefei, 230036, China
- Fang Liu
- National Engineering Laboratory of Crop Stress Resistance, Anhui Agricultural University, No. 130, Changjiang West Road, Hefei, 230036, China
- College of Agronomy, Anhui Agricultural University, No. 130, Changjiang West Road, Hefei, 230036, China
- Guomin Han
- National Engineering Laboratory of Crop Stress Resistance, Anhui Agricultural University, No. 130, Changjiang West Road, Hefei, 230036, China.
- Beijiu Cheng
- National Engineering Laboratory of Crop Stress Resistance, Anhui Agricultural University, No. 130, Changjiang West Road, Hefei, 230036, China.
11
Maddi AMA, Eslahchi C. Discovering overlapped protein complexes from weighted PPI networks by removing inter-module hubs. Sci Rep 2017; 7:3247. [PMID: 28607455 PMCID: PMC5468366 DOI: 10.1038/s41598-017-03268-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/27/2016] [Accepted: 04/26/2017] [Indexed: 12/21/2022]
Abstract
Detecting known protein complexes and predicting undiscovered protein complexes from protein-protein interaction (PPI) networks helps us to understand the principles of cell organization and its functions. However, the experimental discovery of protein complexes remains limited, so computational methods are useful for overcoming the experimental limitations. Even so, extracting protein complexes from a PPI network is often nontrivial. Two major constraints are the large amount of noise and the fact that the occurrence times of different interactions in the PPI network are ignored. In this paper, an efficient algorithm, Inter Module Hub Removal Clustering (IMHRC), is developed based on inter-module hub removal in the weighted PPI network, and it can detect overlapped complexes. By removing some of the inter-module hubs and module hubs, IMHRC eliminates a large amount of noise in the dataset and implicitly accounts for the different occurrence times of the PPIs in the network. The performance of IMHRC was evaluated on several benchmark datasets, and the results were compared with some of the state-of-the-art models. The protein complexes discovered with the IMHRC method show significantly better agreement with the real complexes than those of other current methods. Our algorithm provides an accurate and scalable method for detecting and predicting protein complexes from PPI networks.
Affiliation(s)
- A M A Maddi
- Department of Electrical and computer Engineering, Isfahan University of Technology, Isfahan, 1983963113, Iran
- School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, 193955746, Iran
- Ch Eslahchi
- Department of Computer Sciences, Faculty of Mathematics, Shahid Beheshti University, Tehran, 1983963113, Iran.
- School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, 193955746, Iran.
12
Ou-Yang L, Zhang XF, Dai DQ, Wu MY, Zhu Y, Liu Z, Yan H. Protein complex detection based on partially shared multi-view clustering. BMC Bioinformatics 2016; 17:371. [PMID: 27623844 PMCID: PMC5022186 DOI: 10.1186/s12859-016-1164-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/11/2015] [Accepted: 07/23/2016] [Indexed: 01/05/2023]
Abstract
Background Protein complexes are the key molecular entities to perform many essential biological functions. In recent years, high-throughput experimental techniques have generated a large amount of protein interaction data. As a consequence, computational analysis of such data for protein complex detection has received increased attention in the literature. However, most existing works focus on predicting protein complexes from a single type of data, either physical interaction data or co-complex interaction data. These two types of data provide compatible and complementary information, so it is necessary to integrate them to discover the underlying structures and obtain better performance in complex detection. Results In this study, we propose a novel multi-view clustering algorithm, called the Partially Shared Multi-View Clustering model (PSMVC), to carry out such an integrated analysis. Unlike traditional multi-view learning algorithms that focus on mining either consistent or complementary information embedded in the multi-view data, PSMVC can jointly explore the shared and specific information inherent in different views. In our experiments, we compare the complexes detected by PSMVC from single data source with those detected from multiple data sources. We observe that jointly analyzing multi-view data benefits the detection of protein complexes. Furthermore, extensive experiment results demonstrate that PSMVC performs much better than 16 state-of-the-art complex detection techniques, including ensemble clustering and data integration techniques. Conclusions In this work, we demonstrate that when integrating multiple data sources, using partially shared multi-view clustering model can help to identify protein complexes which are not readily identifiable by conventional single-view-based methods and other integrative analysis methods. All the results and source codes are available on https://github.com/Oyl-CityU/PSMVC. 
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1164-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Le Ou-Yang
  - College of Information Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China
  - Department of Electronic and Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, China
- Xiao-Fei Zhang
  - School of Mathematics and Statistics and Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan, 430079, China
- Dao-Qing Dai
  - Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xin Gang Road West, Guangzhou, 510275, China
- Meng-Yun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Guoding Road, Shanghai, 200433, China
- Yuan Zhu
  - School of Automation, China University of Geosciences, Wuhan, China
- Zhiyong Liu
  - Shenzhen Polytechnic, Shenzhen, 518055, China
- Hong Yan
  - Department of Electronic and Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, China
|
13
|
Abstract
The HIV genome encodes a small number of viral proteins (i.e., 16), invariably establishing cooperative associations among HIV proteins and between HIV and host proteins, to invade host cells and hijack their internal machineries. As a known example, the HIV envelope glycoprotein GP120 is closely associated with GP41 for viral entry. From a genome-wide perspective, a hypothesis can be worked out to determine whether 16 HIV proteins could develop 120 possible pairwise associations either by physical interactions or by functional associations mediated via HIV or host molecules. Here, we present the first systematic review of experimental evidence on HIV genome-wide protein associations using a large body of publications accumulated over the past 3 decades. Of 120 possible pairwise associations between 16 HIV proteins, at least 34 physical interactions and 17 functional associations have been identified. To achieve efficient viral replication and infection, HIV protein associations play essential roles (e.g., cleavage, inhibition, and activation) during the HIV life cycle. In either a dispensable or an indispensable manner, each HIV protein collaborates with another viral protein to accomplish specific activities that precisely take place at the proper stages of the HIV life cycle. In addition, HIV genome-wide protein associations have an impact on anti-HIV inhibitors due to the extensive cross talk between drug-inhibited proteins and other HIV proteins. Overall, this study presents for the first time a comprehensive overview of HIV genome-wide protein associations, highlighting meticulous collaborations between all viral proteins during the HIV life cycle.
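The figure of 120 is the number of unordered pairs that can be drawn from 16 proteins, C(16, 2) = 16 × 15 / 2 = 120. A minimal Python sketch of that count follows; the labels P1 to P16 are placeholders introduced here for illustration and are not the actual HIV protein names.

```python
from itertools import combinations
from math import comb

# Placeholder labels (assumption): the abstract does not list the 16 proteins.
proteins = [f"P{i}" for i in range(1, 17)]

# Each unordered pair of distinct proteins is one possible association.
pairs = list(combinations(proteins, 2))
n_pairs = len(pairs)

# Cross-check against the closed form n * (n - 1) / 2.
assert n_pairs == comb(len(proteins), 2)
print(n_pairs)
```

Enumerating the pairs explicitly, rather than only computing the count, gives a natural container for tagging each pair as a physical interaction, a functional association, or neither.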
Affiliation(s)
- Guangdi Li
  - Department of Metabolism and Endocrinology, Metabolic Syndrome Research Center, Key Laboratory of Diabetes Immunology, Ministry of Education, National Clinical Research Center for Metabolic Diseases, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
  - KU Leuven-University of Leuven, Rega Institute for Medical Research, Department of Microbiology and Immunology, Leuven, Belgium
- Erik De Clercq
  - KU Leuven-University of Leuven, Rega Institute for Medical Research, Department of Microbiology and Immunology, Leuven, Belgium
|
14
|
Zhang J, Jia H, Li J, Li Y, Lu M, Hu J. Molecular evolution and expression divergence of the Populus euphratica Hsf genes provide insight into the stress acclimation of desert poplar. Sci Rep 2016; 6:30050. [PMID: 27425424 PMCID: PMC4948027 DOI: 10.1038/srep30050] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 03/05/2016] [Accepted: 06/29/2016] [Indexed: 12/27/2022]
Abstract
The heat shock transcription factor (Hsf) family is one of the most important groups of regulators in the plant kingdom. Hsfs have been shown to be involved in various processes associated with plant growth and development, as well as in responses to hormones and abiotic stresses. In this study, we carried out a comprehensive analysis of the Hsf family in desert poplar, Populus euphratica. A total of 32 genes encoding Hsfs were identified and classified into three main classes (A, B, and C). Gene structure and conserved motif analyses indicated that the members of each class were relatively conserved. A total of 10 paralogous pairs were identified in the PeuHsf family, nine of which were generated by whole-genome duplication events. Ka/Ks analysis showed that PeuHsfs were under purifying selection. In addition, various cis-acting elements involved in hormone and stress responses were located in the promoter regions of PeuHsfs. Gene expression analysis indicated that several PeuHsfs showed tissue-specific expression. Compared with Arabidopsis, more PeuHsf genes were significantly induced by heat, drought, and salt stresses (21, 19, and 22 PeuHsfs, respectively). Our findings help in understanding the distinctive adaptability of P. euphratica to extreme environments and provide a basis for future functional analysis of PeuHsfs.
Affiliation(s)
- Jin Zhang
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China
- Huixia Jia
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China
- Jianbo Li
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Yu Li
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Mengzhu Lu
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China
- Jianjun Hu
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China
|
15
|
Abstract
Since the first antiviral drug, idoxuridine, was approved in 1963, 90 antiviral drugs categorized into 13 functional groups have been formally approved for the treatment of the following 9 human infectious diseases: (i) HIV infections (protease inhibitors, integrase inhibitors, entry inhibitors, nucleoside reverse transcriptase inhibitors, nonnucleoside reverse transcriptase inhibitors, and acyclic nucleoside phosphonate analogues), (ii) hepatitis B virus (HBV) infections (lamivudine, interferons, nucleoside analogues, and acyclic nucleoside phosphonate analogues), (iii) hepatitis C virus (HCV) infections (ribavirin, interferons, NS3/4A protease inhibitors, NS5A inhibitors, and NS5B polymerase inhibitors), (iv) herpesvirus infections (5-substituted 2'-deoxyuridine analogues, entry inhibitors, nucleoside analogues, pyrophosphate analogues, and acyclic guanosine analogues), (v) influenza virus infections (ribavirin, matrix 2 protein inhibitors, RNA polymerase inhibitors, and neuraminidase inhibitors), (vi) human cytomegalovirus infections (acyclic guanosine analogues, acyclic nucleoside phosphonate analogues, pyrophosphate analogues, and oligonucleotides), (vii) varicella-zoster virus infections (acyclic guanosine analogues, nucleoside analogues, 5-substituted 2'-deoxyuridine analogues, and antibodies), (viii) respiratory syncytial virus infections (ribavirin and antibodies), and (ix) external anogenital warts caused by human papillomavirus infections (imiquimod, sinecatechins, and podofilox). Here, we present for the first time a comprehensive overview of antiviral drugs approved over the past 50 years, shedding light on the development of effective antiviral treatments against current and emerging infectious diseases worldwide.
Affiliation(s)
- Erik De Clercq
  - KU Leuven-University of Leuven, Rega Institute for Medical Research, Department of Microbiology and Immunology, Leuven, Belgium
- Guangdi Li
  - KU Leuven-University of Leuven, Rega Institute for Medical Research, Department of Microbiology and Immunology, Leuven, Belgium
  - Department of Metabolism and Endocrinology, Metabolic Syndrome Research Center, Key Laboratory of Diabetes Immunology, Ministry of Education, National Clinical Research Center for Metabolic Diseases, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
|
16
|
HIV Genome-Wide Protein Associations: a Review of 30 Years of Research. Microbiol Mol Biol Rev 2016; 80:679-731. [PMID: 27357278 DOI: 10.1128/mmbr.00065-15] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Indexed: 12/16/2022]
Abstract
The HIV genome encodes a small number of viral proteins (i.e., 16), invariably establishing cooperative associations among HIV proteins and between HIV and host proteins, to invade host cells and hijack their internal machineries. As a known example, the HIV envelope glycoprotein GP120 is closely associated with GP41 for viral entry. From a genome-wide perspective, a hypothesis can be worked out to determine whether 16 HIV proteins could develop 120 possible pairwise associations either by physical interactions or by functional associations mediated via HIV or host molecules. Here, we present the first systematic review of experimental evidence on HIV genome-wide protein associations using a large body of publications accumulated over the past 3 decades. Of 120 possible pairwise associations between 16 HIV proteins, at least 34 physical interactions and 17 functional associations have been identified. To achieve efficient viral replication and infection, HIV protein associations play essential roles (e.g., cleavage, inhibition, and activation) during the HIV life cycle. In either a dispensable or an indispensable manner, each HIV protein collaborates with another viral protein to accomplish specific activities that precisely take place at the proper stages of the HIV life cycle. In addition, HIV genome-wide protein associations have an impact on anti-HIV inhibitors due to the extensive cross talk between drug-inhibited proteins and other HIV proteins. Overall, this study presents for the first time a comprehensive overview of HIV genome-wide protein associations, highlighting meticulous collaborations between all viral proteins during the HIV life cycle.
|
17
|
Bonnici V, Manca V. Informational laws of genome structures. Sci Rep 2016; 6:28840. [PMID: 27354155 PMCID: PMC4937431 DOI: 10.1038/srep28840] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 02/01/2016] [Accepted: 06/09/2016] [Indexed: 01/06/2023]
Abstract
In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
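To make the choice of k concrete, the sketch below picks k from the genome length and computes the empirical Shannon entropy of the resulting k-mer distribution. It reads "lg2" as the base-2 logarithm and rounds k to the nearest integer; both are assumptions, as are the function names and the toy sequence. It does not reproduce the paper's informational indexes, only the basic entropy they build on.

```python
import math
from collections import Counter


def choose_k(n: int) -> int:
    """Word length k = log2(n), rounded to the nearest integer (rounding is an assumption)."""
    return max(1, round(math.log2(n)))


def kmer_entropy(genome: str, k: int) -> float:
    """Empirical Shannon entropy, in bits, of the overlapping k-mers in `genome`."""
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


toy = "ACGTACGTTGCAACGT"   # hypothetical 16-base "genome"
k = choose_k(len(toy))     # log2(16) = 4
h = kmer_entropy(toy, k)
print(k, round(h, 3))
```

With n − k + 1 overlapping words, the entropy can never exceed log2(n − k + 1), which is one reason a k that grows with log n is a natural scale for comparing a real genome with a random one of the same length.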
Affiliation(s)
- Vincenzo Bonnici
  - University of Verona, Department of Computer Science, University of Verona, Verona 37134, Italy
  - Center for BioMedical Computing, University of Verona, Verona, 37134, Italy
- Vincenzo Manca
  - University of Verona, Department of Computer Science, University of Verona, Verona 37134, Italy
  - Center for BioMedical Computing, University of Verona, Verona, 37134, Italy
|
18
|
Lamiable A, Thévenet P, Rey J, Vavrusa M, Derreumaux P, Tufféry P. PEP-FOLD3: faster de novo structure prediction for linear peptides in solution and in complex. Nucleic Acids Res 2016; 44:W449-54. [PMID: 27131374 PMCID: PMC4987898 DOI: 10.1093/nar/gkw329] [Citation(s) in RCA: 599] [Impact Index Per Article: 74.9] [Received: 02/19/2016] [Accepted: 04/17/2016] [Indexed: 01/15/2023]
Abstract
Structure determination of linear peptides of 5–50 amino acids in aqueous solution and interacting with proteins is a key aspect of structural biology. PEP-FOLD3 is a novel computational framework that allows both (i) de novo free or biased prediction for linear peptides between 5 and 50 amino acids, and (ii) the generation of native-like conformations of peptides interacting with a protein when the interaction site is known in advance. PEP-FOLD3 is fast, and usually returns solutions in a few minutes. Testing PEP-FOLD3 on 56 peptides in aqueous solution led to experimental-like conformations for 80% of the targets. Using a benchmark of 61 peptide–protein targets starting from the unbound form of the protein receptor, PEP-FOLD3 was able to generate peptide poses deviating on average by 3.3 Å from the experimental conformation and to return a native-like pose in the first 10 clusters for 52% of the targets. PEP-FOLD3 is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3.
Affiliation(s)
- Alexis Lamiable
  - Molécules Thérapeutiques in Silico, RPBS, INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, 75205 Paris Cedex 13, France
- Pierre Thévenet
  - Molécules Thérapeutiques in Silico, RPBS, INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, 75205 Paris Cedex 13, France
- Julien Rey
  - Molécules Thérapeutiques in Silico, RPBS, INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, 75205 Paris Cedex 13, France
- Marek Vavrusa
  - Molécules Thérapeutiques in Silico, RPBS, INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, 75205 Paris Cedex 13, France
- Philippe Derreumaux
  - Institut de Biologie Physico Chimique, Laboratoire de Biochimie Théorique, Université Paris Diderot, Sorbonne Paris Cité, CNRS UPR 9080, 75005 Paris, France
- Pierre Tufféry
  - Molécules Thérapeutiques in Silico, RPBS, INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, 75205 Paris Cedex 13, France
|
19
|
Khouri R, Vandamme AM. Virus genetic variability involvement in transmissibility of HIV-1 immune activation and disease progression. Future Virol 2015. [DOI: 10.2217/fvl.15.96] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
Affiliation(s)
- Ricardo Khouri
  - LIMI-LIP, CPqGM, Oswaldo Cruz Foundation (FIOCRUZ), Salvador-Bahia, Brazil
  - KU Leuven – University of Leuven, Department of Microbiology & Immunology, Rega Institute for Medical Research, Clinical & Epidemiological Virology, Leuven, Belgium
- Anne-Mieke Vandamme
  - KU Leuven – University of Leuven, Department of Microbiology & Immunology, Rega Institute for Medical Research, Clinical & Epidemiological Virology, Leuven, Belgium
  - Center for Global Health & Tropical Medicine, Unidade de Microbiologia, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
|
20
|
Coevolution Analysis of HIV-1 Envelope Glycoprotein Complex. PLoS One 2015; 10:e0143245. [PMID: 26579711 PMCID: PMC4651434 DOI: 10.1371/journal.pone.0143245] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/25/2015] [Accepted: 11/02/2015] [Indexed: 11/19/2022]
Abstract
The HIV-1 Env spike is the main protein complex that facilitates HIV-1 entry into CD4+ host cells. HIV-1 entry is a multistep process that is not yet completely understood. This process involves several protein-protein interactions between HIV-1 Env and a variety of host cell receptors, along with many conformational changes within the spike. Owing to its high mutation rate and plasticity, HIV-1 Env has developed strategies to escape immense immune pressure and entry inhibitors. We applied a coevolution and residue-residue contact detection method to identify coevolution patterns within HIV-1 Env protein sequences representing all group M subtypes. We identified 424 coevolving residue pairs within HIV-1 Env. The majority of predicted pairs are residue-residue contacts and are proximal in 3D structure. Furthermore, many of the detected pairs have functional implications due to contributions to either CD4 or coreceptor binding, or to variable loop, gp120-gp41, and interdomain interactions. This study provides a new dimension of information in HIV research. The identified residue couplings may be important not only in assisting gp120 and gp41 coordinate structure prediction, but also in designing new and effective entry inhibitors that incorporate mutation patterns of HIV-1 Env.
|
21
|
Motion GB, Howden AJM, Huitema E, Jones S. DNA-binding protein prediction using plant specific support vector machines: validation and application of a new genome annotation tool. Nucleic Acids Res 2015; 43:e158. [PMID: 26304539 PMCID: PMC4678848 DOI: 10.1093/nar/gkv805] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 04/07/2015] [Accepted: 07/28/2015] [Indexed: 11/26/2022]
Abstract
There are currently 151 plants with draft genomes available, but levels of functional annotation for putative protein products are low. Therefore, accurate computational predictions are essential to annotate genomes in the first instance, and to provide focus for the more costly and time-consuming functional assays that follow. DNA-binding proteins are an important class of proteins that require annotation, but current computational methods are not applicable to genome-wide predictions in plant species. Here, we explore the use of species- and lineage-specific models for the prediction of DNA-binding proteins in plants. We show that a species-specific support vector machine model based on Arabidopsis sequence data is more accurate (accuracy 81%) than a generic model (74%), and based on this we develop a plant-specific model for predicting DNA-binding proteins. We apply this model to the tomato proteome and demonstrate its ability to perform accurate high-throughput prediction of DNA-binding proteins. In doing so, we have annotated 36 currently uncharacterised proteins by assigning a putative DNA-binding function. Our model is publicly available and we propose it be used in combination with existing tools to help increase annotation levels of DNA-binding proteins encoded in plant genomes.
Affiliation(s)
- Graham B Motion
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
  - Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Andrew J M Howden
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Edgar Huitema
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Susan Jones
  - Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
|
22
|
Li G, Piampongsant S, Faria NR, Voet A, Pineda-Peña AC, Khouri R, Lemey P, Vandamme AM, Theys K. An integrated map of HIV genome-wide variation from a population perspective. Retrovirology 2015; 12:18. [PMID: 25808207 PMCID: PMC4358901 DOI: 10.1186/s12977-015-0148-6] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Received: 07/11/2014] [Accepted: 01/28/2015] [Indexed: 01/01/2023]
Abstract
Background: The HIV pandemic is characterized by extensive genetic variability, which has challenged the development of HIV drugs and vaccines. Although HIV genomes have been classified into different types, groups, subtypes and recombinants, a comprehensive study that maps HIV genome-wide diversity at the population level is still lacking to date. This study aims to characterize HIV genomic diversity in large-scale sequence populations, and to identify driving factors that shape HIV genome diversity. Results: A total of 2996 full-length genomic sequences from 1705 patients infected with 16 major HIV groups, subtypes and circulating recombinant forms (CRFs) were analyzed along with structural, immunological and peptide inhibitor information. Average nucleotide diversity of HIV genomes was almost 50% between HIV-1 and HIV-2 types, 37.5% between HIV-1 groups, 14.7% between HIV-1 subtypes, 8.2% within individual HIV-1 subtypes and less than 1% within single patients. Along the HIV genome, diversity patterns and compositions of nucleotides and amino acids were highly similar across different groups, subtypes and CRFs. Current HIV-derived peptide inhibitors were predominantly derived from conserved, solvent accessible and intrinsically ordered structures in the HIV-1 subtype B genome. We identified these conserved regions in Capsid, Nucleocapsid, Protease, Integrase, Reverse transcriptase, Vpr and the GP41 N terminus as potential drug targets. In the analysis of factors that impact HIV-1 genomic diversity, we focused on protein multimerization, immunological constraints and HIV-human protein interactions. We found that amino acid diversity in monomeric proteins was higher than in multimeric proteins, and diversified positions were preferably located within human CD4 T cell and antibody epitopes. Moreover, intrinsic disorder regions in HIV-1 proteins coincided with high levels of amino acid diversity, facilitating a large number of interactions between HIV-1 and human proteins. Conclusions: This first large-scale analysis provided a detailed mapping of HIV genomic diversity and highlighted drug-target regions conserved across different groups, subtypes and CRFs. Our findings suggest that, in addition to the impact of protein multimerization and immune selective pressure on HIV-1 diversity, HIV-human protein interactions are facilitated by high variability within intrinsically disordered structures. Electronic supplementary material: The online version of this article (doi:10.1186/s12977-015-0148-6) contains supplementary material, which is available to authorized users.
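The diversity percentages quoted above are averages over pairs of sequences. The abstract does not state which estimator was used, so the following is only a sketch under stated assumptions: it takes diversity to be the uncorrected p-distance (the fraction of differing sites) averaged over all pairs of pre-aligned, equal-length sequences, and it skips any site where either sequence has a gap. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned sequences, ignoring gap sites."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise_diversity(seqs: list[str]) -> float:
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not real HIV data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
diversity = mean_pairwise_diversity(toy_alignment)
print(f"{100 * diversity:.1f}%")
```

Restricting the pairs to sequences from the same patient, the same subtype, or different groups would give the within- and between-level figures of the kind the abstract reports; a corrected substitution model would be needed at high divergence, where the raw p-distance saturates.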
|