1
Hernández-Miranda OA, Campos JE, Sandoval-Zapotitla E, Rosas U, Ortiz-Melo MT, Salazar-Rojas VM. Transcriptomic analysis reveals molecular phenological changes during the flower-to-fruit transition in Vanilla planifolia Andrews (Orchidaceae). BMC PLANT BIOLOGY 2025; 25:437. [PMID: 40186135 PMCID: PMC11971897 DOI: 10.1186/s12870-025-06476-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Accepted: 03/27/2025] [Indexed: 04/07/2025]
Abstract
BACKGROUND The transition from flower to fruit, encompassing flower formation to fruit maturation, has been extensively studied in model plants such as Arabidopsis thaliana. However, the Orchidaceae family, including Vanilla planifolia, exhibits a unique phenomenon known as post-pollination syndrome (PPS), where pollination initiates ovule development but often leads to premature ovary drop. This phenomenon significantly impacts the yield and stability of V. planifolia crops. Understanding the molecular mechanisms underlying PPS is essential for improving crop production. This study explores transcriptomic and histological variations to identify key molecular and phenological changes in the ovary during the flower-to-fruit transition in V. planifolia.
RESULTS The flower-to-fruit transition in Vanilla planifolia involves dynamic changes in gene expression and phenotypic events, which can be categorized into four distinct stages: (1) Pre-pollination: Ovary differentiation is characterized by the enrichment of nitrogen metabolism and photoperiod-responsive pathways. The upregulation of VpVRN5-like and VpNAC14-like suggests their roles in photoperiod-induced flowering and ovarian tissue differentiation in response to nitrate availability. (2) Pollination: Key events include nucellar filament branching and the functional enrichment of pathways associated with growth and responses to light intensity. The upregulation of VpMBS1-like indicates its involvement in regulating and adapting to high light conditions. (3) Post-pollination: This stage is marked by embryo sac formation and pollen tube elongation, with enrichment in auxin response pathways. The upregulation of VpIAA6-like and VpRALF27-like suggests their roles in auxin signaling during ovule development. (4) Fertilization: Seed development is associated with the enrichment of abiotic stress response pathways and carbohydrate transport. The upregulation of VpAAE3-like, VpPR1-like, and VpSWET12-like suggests functions in stress responses and sucrose transport, potentially linked to fungal interactions or symbiosis.
CONCLUSIONS This study characterizes the molecular and phenological changes occurring during the flower-to-fruit transition in V. planifolia by integrating transcriptomic analysis with anatomical data on post-pollination syndrome. Based on functional predictions, this approach provides valuable insights into the mechanisms governing this transition in plants exhibiting PPS and identifies candidate genes for future experimental validation in V. planifolia.
CLINICAL TRIAL NUMBER Not applicable.
Affiliation(s)
- Olga Andrea Hernández-Miranda
  - Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Colonia Los Reyes Ixtacala Tlalnepantla, Estado de México, Avenida de los Barrios Número 1, Mexico, C.P. 54090, Mexico
  - Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México. Cto. de Posgrados, Ciudad Universitaria Del. Coyoacán, Ciudad de México, C. P. 04510, Mexico
- Jorge E Campos
  - Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Colonia Los Reyes Ixtacala Tlalnepantla, Estado de México, Avenida de los Barrios Número 1, Mexico, C.P. 54090, Mexico
- Estela Sandoval-Zapotitla
  - Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México. Cto. Zona Deportiva, Ciudad Universitaria Del. Coyoacán, Ciudad de México, C. P. 04510, Mexico
- Ulises Rosas
  - Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México. Cto. Zona Deportiva, Ciudad Universitaria Del. Coyoacán, Ciudad de México, C. P. 04510, Mexico
- María Teresa Ortiz-Melo
  - Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Colonia Los Reyes Ixtacala Tlalnepantla, Estado de México, Avenida de los Barrios Número 1, Mexico, C.P. 54090, Mexico
- Victor Manuel Salazar-Rojas
  - Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Colonia Los Reyes Ixtacala Tlalnepantla, Estado de México, Avenida de los Barrios Número 1, Mexico, C.P. 54090, Mexico
2
Jones DAB, Rybak K, Hossain M, Bertazzoni S, Williams A, Tan KC, Phan HTT, Hane JK. Repeat-induced point mutations driving Parastagonospora nodorum genomic diversity are balanced by selection against non-synonymous mutations. Commun Biol 2024; 7:1614. [PMID: 39627497 PMCID: PMC11615325 DOI: 10.1038/s42003-024-07327-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 11/27/2024] [Indexed: 12/06/2024]
Abstract
Parastagonospora nodorum is a necrotrophic fungal pathogen of wheat with significant genomic resources. Population-level pangenome data for 173 isolates, of which 156 were from Western Australia (WA) and 17 were international, were examined for overall genomic diversity and effector gene content. A heterothallic core population occurred across all regions of WA, with asexually reproducing clonal clusters in drier northern regions. High potential for SNP diversity in the form of repeat-induced point mutation (RIP)-like transitions was observed across the genome, suggesting widespread 'RIP-leakage' from transposon-rich repetitive sequences into non-repetitive regions. The strong potential for RIP-like mutations was balanced by negative selection against non-synonymous SNPs, which was observed within protein-coding regions. Protein isoform profiles of known effector loci (SnToxA, SnTox1, SnTox3, SnTox267, and SnTox5) indicated low levels of non-synonymous and high levels of silent RIP-like mutations. Effector predictions identified 186 candidate secreted predicted effector proteins (CSEPs), 69 of which had functional annotations and included confirmed effectors. Pangenome-based effector isoform profiles across WA were distinct from global isolates and were conserved relative to population structure, and may enable new approaches for monitoring crop disease pathotypes.
Affiliation(s)
- Darcy A B Jones
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- Kasia Rybak
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- Mohitul Hossain
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- Stefania Bertazzoni
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- Angela Williams
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- Kar-Chun Tan
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- Huyen T T Phan
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
- James K Hane
  - Centre for Crop & Disease Management, School of Molecular & Life Sciences, Curtin University, Perth, WA, Australia
3
Taha K. Employing Machine Learning Techniques to Detect Protein Function: A Survey, Experimental, and Empirical Evaluations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1965-1986. [PMID: 39008392 DOI: 10.1109/tcbb.2024.3427381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
This review article delves deeply into the various machine learning (ML) methods and algorithms employed in discerning protein functions. Each method discussed is assessed for its efficacy, limitations, potential improvements, and future prospects. We present an innovative hierarchical classification system that arranges algorithms into intricate categories and unique techniques. This taxonomy is based on a tri-level hierarchy, starting with the methodology category and narrowing down to specific techniques. Such a framework allows for a structured and comprehensive classification of algorithms, assisting researchers in understanding the interrelationships among diverse algorithms and techniques. The study incorporates both empirical and experimental evaluations to differentiate between the techniques. The empirical evaluation ranks the techniques based on four criteria. The experimental assessments rank: (1) individual techniques under the same methodology sub-category, (2) different sub-categories within the same category, and (3) the broad categories themselves. Integrating the innovative methodological classification, empirical findings, and experimental assessments, the article offers a well-rounded understanding of ML strategies in protein function identification. The paper also explores techniques for multi-task and multi-label detection of protein functions, in addition to focusing on single-task methods. Moreover, the paper sheds light on the future avenues of ML in protein function determination.
4
Zhou L, Tao C, Shen X, Sun X, Wang J, Yuan Q. Unlocking the potential of enzyme engineering via rational computational design strategies. Biotechnol Adv 2024; 73:108376. [PMID: 38740355 DOI: 10.1016/j.biotechadv.2024.108376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 04/27/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024]
Abstract
Enzymes play a pivotal role in various industries by enabling efficient, eco-friendly, and sustainable chemical processes. However, the low turnover rates and poor substrate selectivity of enzymes limit their large-scale applications. Rational computational enzyme design, facilitated by computational algorithms, offers a more targeted and less labor-intensive approach. Although there has been notable advancement in employing rational computational protein engineering strategies to overcome these issues, it has not been comprehensively reviewed so far. This article reviews recent developments in rational computational enzyme design, categorizing them into three types: structure-based, sequence-based, and data-driven machine learning computational design. Case studies are presented to demonstrate successful enhancements in catalytic activity, stability, and substrate selectivity. Lastly, the article provides a thorough analysis of these approaches, highlights existing challenges and potential solutions, and offers insights into future development directions.
Affiliation(s)
- Lei Zhou
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Chunmeng Tao
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xiaolin Shen
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xinxiao Sun
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Jia Wang
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Qipeng Yuan
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
5
Ndochinwa OG, Wang QY, Amadi OC, Nwagu TN, Nnamchi CI, Okeke ES, Moneke AN. Current status and emerging frontiers in enzyme engineering: An industrial perspective. Heliyon 2024; 10:e32673. [PMID: 38912509 PMCID: PMC11193041 DOI: 10.1016/j.heliyon.2024.e32673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 06/05/2024] [Accepted: 06/06/2024] [Indexed: 06/25/2024]
Abstract
Protein engineering mechanisms can be an efficient approach to enhance the biochemical properties of various biocatalysts. Immobilization of biocatalysts and the introduction of new-to-nature chemical reactivities are also possible through the same mechanism. Discovering new protocols that yield catalytically active proteins that are novel in being stable, active, and stereoselective is an essential area of concurrent bioorganic chemistry (the synergistic relationship between organic chemistry and biochemistry in the context of enzyme engineering). However, with our current level of knowledge about protein folding and its correlation with protein conformation and activities, it is almost impossible to design proteins with specific biological and physical properties. Hence, contemporary protein engineering typically involves reprogramming existing enzymes by mutagenesis to generate new phenotypes with desired properties. These processes ensure that limitations of naturally occurring enzymes are not encountered. For example, researchers have engineered cellulases and hemicellulases to withstand harsh conditions encountered during biomass pretreatment, such as high temperatures and acidic environments. By enhancing the activity and robustness of these enzymes, biofuel production becomes more economically viable and environmentally sustainable. Recent trends in enzyme engineering have enabled the development of tailored biocatalysts for pharmaceutical applications. For instance, researchers have engineered enzymes such as cytochrome P450s and amine oxidases to catalyze challenging reactions involved in drug synthesis. In addition to conventional methods, there has been an increasing application of machine learning techniques to identify patterns in data. These patterns are then used to predict protein structures, enhance enzyme solubility, stability, and function, forecast substrate specificity, and assist in rational protein design. In this review, we discuss recent trends in enzyme engineering to optimize the biochemical properties of various biocatalysts. Using examples relevant to biotechnology, we elaborate on the significance of enzyme engineering and on how these methods could be applied to optimize the biochemical properties of a naturally occurring enzyme.
Affiliation(s)
- Obinna Giles Ndochinwa
  - Department of Microbiology, Faculty of Biological Science, University of Nigeria, Nsukka, Nigeria
- Qing-Yan Wang
  - State Key Laboratory of Biomass Enzyme Technology, National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi, China
- Oyetugo Chioma Amadi
  - Department of Microbiology, Faculty of Biological Science, University of Nigeria, Nsukka, Nigeria
- Tochukwu Nwamaka Nwagu
  - Department of Microbiology, Faculty of Biological Science, University of Nigeria, Nsukka, Nigeria
- Emmanuel Sunday Okeke
  - Department of Biochemistry, Faculty of Biological Sciences & Natural Science Unit, School of General Studies, University of Nigeria, Nsukka, Enugu State, 410001, Nigeria
  - Institute of Environmental Health and Ecological Security, School of the Environment and Safety, Jiangsu University, 301 Xuefu Rd., 212013, Zhenjiang, Jiangsu, China
- Anene Nwabu Moneke
  - Department of Microbiology, Faculty of Biological Science, University of Nigeria, Nsukka, Nigeria
6
Li Q, Button-Simons KA, Sievert MAC, Chahoud E, Foster GF, Meis K, Ferdig MT, Milenković T. Enhancing Gene Co-Expression Network Inference for the Malaria Parasite Plasmodium falciparum. Genes (Basel) 2024; 15:685. [PMID: 38927622 PMCID: PMC11202799 DOI: 10.3390/genes15060685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 05/22/2024] [Accepted: 05/22/2024] [Indexed: 06/28/2024]
Abstract
BACKGROUND Malaria results in more than 550,000 deaths each year due to drug resistance in the most lethal Plasmodium (P.) species P. falciparum. A full P. falciparum genome was published in 2002, yet 44.6% of its genes have unknown functions. Improving the functional annotation of genes is important for identifying drug targets and understanding the evolution of drug resistance. RESULTS Genes function by interacting with one another. So, analyzing gene co-expression networks can enhance functional annotations and prioritize genes for wet lab validation. Earlier efforts to build gene co-expression networks in P. falciparum have been limited to a single network inference method or gaining biological understanding for only a single gene and its interacting partners. Here, we explore multiple inference methods and aim to systematically predict functional annotations for all P. falciparum genes. We evaluate each inferred network based on how well it predicts existing gene-Gene Ontology (GO) term annotations using network clustering and leave-one-out crossvalidation. We assess overlaps of the different networks' edges (gene co-expression relationships), as well as predicted functional knowledge. The networks' edges are overall complementary: 47-85% of all edges are unique to each network. In terms of the accuracy of predicting gene functional annotations, all networks yielded relatively high precision (as high as 87% for the network inferred using mutual information), but the highest recall reached was below 15%. All networks having low recall means that none of them capture a large amount of all existing gene-GO term annotations. In fact, their annotation predictions are highly complementary, with the largest pairwise overlap of only 27%. We provide ranked lists of inferred gene-gene interactions and predicted gene-GO term annotations for future use and wet lab validation by the malaria community. 
CONCLUSIONS The different networks seem to capture different aspects of the P. falciparum biology in terms of both inferred interactions and predicted gene functional annotations. Thus, relying on a single network inference method should be avoided when possible. SUPPLEMENTARY DATA Attached.
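The abstract's contrast between high precision and low recall can be made concrete with a small sketch. The code below assumes predictions and known annotations are both represented as sets of (gene, GO term) pairs; the gene and term identifiers are hypothetical placeholders and the function is illustrative, not the authors' evaluation pipeline.

```python
def precision_recall(predicted, known):
    """Compare predicted (gene, GO term) pairs against known annotations.

    precision = correct predictions / all predictions
    recall    = correct predictions / all known annotations
    """
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


# Hypothetical toy data: five known annotations, one (correct) prediction.
known = {
    ("gene1", "GO:A"), ("gene2", "GO:A"), ("gene3", "GO:B"),
    ("gene4", "GO:B"), ("gene5", "GO:C"),
}
predicted = {("gene1", "GO:A")}

precision, recall = precision_recall(predicted, known)
print(precision, recall)
```

In this toy case every prediction is correct, so precision is high, but only one of five known annotations is recovered, so recall is low. This is the same qualitative pattern the abstract reports for the inferred networks.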
Affiliation(s)
- Qi Li
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
  - Lucy Family Institute for Data & Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Katrina A. Button-Simons
  - Lucy Family Institute for Data & Society, University of Notre Dame, Notre Dame, IN 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Mackenzie A. C. Sievert
  - Lucy Family Institute for Data & Society, University of Notre Dame, Notre Dame, IN 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Elias Chahoud
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Department of Preprofessional Studies, University of Notre Dame, Notre Dame, IN 46556, USA
- Gabriel F. Foster
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Kaitlynn Meis
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Michael T. Ferdig
  - Lucy Family Institute for Data & Society, University of Notre Dame, Notre Dame, IN 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
  - Lucy Family Institute for Data & Society, University of Notre Dame, Notre Dame, IN 46556, USA
7
Bonello J, Orengo C. FunPredCATH: An ensemble method for predicting protein function using CATH. BIOCHIMICA ET BIOPHYSICA ACTA. PROTEINS AND PROTEOMICS 2024; 1872:140985. [PMID: 38122964 DOI: 10.1016/j.bbapap.2023.140985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 12/05/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]
Abstract
MOTIVATION The number of unannotated proteins in UniProt grows at a very high rate every year due to more efficient sequencing methods. However, the experimental annotation of proteins is a lengthy and expensive process. Using computational techniques to narrow the search can speed up the process by providing highly specific Gene Ontology (GO) terms.
METHODOLOGY We propose an ensemble approach that combines three generic base predictors that predict Gene Ontology (BP, CC and MF) terms from sequences across different species. We train our models on UniProtGOA annotation data and use the CATH domain resources to identify the protein families. We then calculate a score based on the prevalence of individual GO terms in the functional families that is then used as an indicator of confidence when assigning the GO term to an uncharacterised protein.
METHODS In the ensemble, we use a statistics-based method that scores the occurrence of GO terms in a CATH FunFam against a background set of proteins annotated by the same GO term. We also developed a set-based method that uses Set Intersection and Set Union to score the occurrence of GO terms within the same CATH FunFam. Finally, we also use FunFams-Plus, a predictor method developed by the Orengo Group at UCL to predict GO terms for uncharacterised proteins in the CAFA3 challenge.
EVALUATION We evaluated the methods against the CAFA3 benchmark and DomFun. We used the Precision, Recall and Fmax metrics and the benchmark datasets that are used in CAFA3 to evaluate our models and compare them to the CAFA3 results. Our results show that FunPredCATH compares well with top CAFA methods in the different ontologies and benchmarks.
CONTRIBUTIONS FunPredCATH compares well with other prediction methods on CAFA3, and the ensemble approach outperforms the base methods. We show that non-IEA models obtain higher Fmax scores than the IEA counterparts, while the models including IEA annotations have higher coverage at the expense of a lower Fmax score.
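The Fmax metric mentioned in the evaluation can be sketched as follows. This is a minimal illustration of the protein-centric Fmax commonly used in CAFA-style evaluations, assuming each prediction carries a confidence score in (0, 1]. It omits propagation of terms up the GO hierarchy, and the protein and term identifiers are hypothetical; it is not the FunPredCATH code.

```python
def fmax(predictions, truth, thresholds=None):
    """Maximum F-measure over score thresholds.

    predictions: {protein: {go_term: score}}
    truth:       {protein: non-empty set of true go_terms}
    Precision is averaged over proteins with at least one prediction at the
    threshold; recall is averaged over all proteins in `truth`.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, score in scored.items() if score >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Hypothetical toy benchmark with two proteins.
truth = {"P1": {"GO:1", "GO:2"}, "P2": {"GO:3"}}
predictions = {
    "P1": {"GO:1": 0.9, "GO:2": 0.4, "GO:9": 0.3},
    "P2": {"GO:4": 0.8},
}
print(fmax(predictions, truth))
```

Sweeping the threshold is what makes the metric a single number per method: each method is judged at its own best operating point rather than at a fixed cut-off.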
Affiliation(s)
- Joseph Bonello
  - Department of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom; Department of Computer Information Systems, University of Malta, Faculty of ICT, Msida, MSD 2080, Malta
- Christine Orengo
  - Department of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom
8
Weng YM, Shashank PR, Godfrey RK, Plotkin D, Parker BM, Wist T, Kawahara AY. Evolutionary genomics of three agricultural pest moths reveals rapid evolution of host adaptation and immune-related genes. Gigascience 2024; 13:giad103. [PMID: 38165153 PMCID: PMC10759296 DOI: 10.1093/gigascience/giad103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 08/01/2023] [Accepted: 11/15/2023] [Indexed: 01/03/2024]
Abstract
BACKGROUND Understanding the genotype of pest species provides an important baseline for designing integrated pest management (IPM) strategies. Recently developed long-read sequence technologies make it possible to compare genomic features of nonmodel pest species to disclose the evolutionary path underlying the pest species profiles. Here we sequenced and assembled genomes for 3 agricultural pest gelechiid moths: Phthorimaea absoluta (tomato leafminer), Keiferia lycopersicella (tomato pinworm), and Scrobipalpa atriplicella (goosefoot groundling moth). We also compared genomes of tomato leafminer and tomato pinworm with published genomes of Phthorimaea operculella and Pectinophora gossypiella to investigate the gene family evolution related to the pest species profiles. RESULTS We found that the 3 solanaceous feeding species, P. absoluta, K. lycopersicella, and P. operculella, are clustered together. Gene family evolution analyses with the 4 species show clear gene family expansions on host plant-associated genes for the 3 solanaceous feeding species. These genes are involved in host compound sensing (e.g., gustatory receptors), detoxification (e.g., ABC transporter C family, cytochrome P450, glucose-methanol-choline oxidoreductase, insect cuticle proteins, and UDP-glucuronosyl), and digestion (e.g., serine proteases and peptidase family S1). A gene ontology enrichment analysis of rapid evolving genes also suggests enriched functions in host sensing and immunity. CONCLUSIONS Our results of family evolution analyses indicate that host plant adaptation and pathogen defense could be important drivers in species diversification among gelechiid moths.
Affiliation(s)
- Yi-Ming Weng
  - McGuire Center for Lepidoptera & Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Pathour R Shashank
  - McGuire Center for Lepidoptera & Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
  - Division of Entomology, ICAR-Indian Agricultural Research Institute, Pusa, New Delhi 110012, India
- R Keating Godfrey
  - McGuire Center for Lepidoptera & Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- David Plotkin
  - McGuire Center for Lepidoptera & Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Brandon M Parker
  - McGuire Center for Lepidoptera & Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Tyler Wist
  - Agriculture and Agri-Food Canada, Saskatoon, SK, S7N 0X2, Canada
- Akito Y Kawahara
  - McGuire Center for Lepidoptera & Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
9
Li G, Jia L, Wang K, Sun T, Huang J. Prediction of Thermostability of Enzymes Based on the Amino Acid Index (AAindex) Database and Machine Learning. Molecules 2023; 28:8097. [PMID: 38138586 PMCID: PMC10746113 DOI: 10.3390/molecules28248097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 12/06/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
The combination of wet-lab experimental data on multi-site combinatorial mutations and machine learning is an innovative method in protein engineering. In this study, we used an innovative sequence-activity relationship (innov'SAR) methodology based on novel descriptors and digital signal processing (DSP) to construct a predictive model. In this paper, 21 experimental (R)-selective amine transaminases from Aspergillus terreus (AT-ATA) were used as an input to predict higher thermostability mutants than those predicted using the existing data. We successfully improved the coefficient of determination (R²) of the model from 0.66 to 0.92. In addition, root-mean-squared deviation (RMSD), root-mean-squared fluctuation (RMSF), solvent accessible surface area (SASA), hydrogen bonds, and the radius of gyration were estimated based on molecular dynamics simulations, and the differences between the predicted mutants and the wild-type (WT) were analyzed. The successful application of the innov'SAR algorithm in improving the thermostability of AT-ATA may help in directed evolutionary screening and open up new avenues for protein engineering.
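For readers unfamiliar with the coefficient of determination reported above, the sketch below shows the standard definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. The numeric values are hypothetical and unrelated to the study's data, and the function is not part of the innov'SAR method.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical measurements and model outputs (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.1, 1.9, 3.2, 3.8]
mean_only = [2.5, 2.5, 2.5, 2.5]

print(r_squared(observed, close_fit))  # close to 1 for a good fit
print(r_squared(observed, mean_only))  # 0 when the model predicts only the mean
```

A value near 1 means the model explains most of the variance in the measured property, which is why an increase from 0.66 to 0.92 is reported as an improvement.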
Affiliation(s)
- Gaolin Li
  - School of Biological and Chemical Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China
- Lili Jia
  - State Key Laboratory of Rice Biology and Breeding, China National Rice Research Institute, Hangzhou 311400, China
- Kang Wang
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310023, China
- Tingting Sun
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310023, China
- Jun Huang
  - School of Biological and Chemical Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China
10
Rodrigues Neto JC, Salgado FF, Braga ÍDO, Carvalho da Silva TL, Belo Silva VN, Leão AP, Ribeiro JADA, Abdelnur PV, Valadares LF, de Sousa CAF, Souza Júnior MT. Osmoprotectants play a major role in the Portulaca oleracea resistance to high levels of salinity stress - insights from a metabolomics and proteomics integrated approach. FRONTIERS IN PLANT SCIENCE 2023; 14:1187803. [PMID: 37384354 PMCID: PMC10296175 DOI: 10.3389/fpls.2023.1187803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 05/03/2023] [Indexed: 06/30/2023]
Abstract
Introduction Purslane (Portulaca oleracea L.) is a non-conventional food plant used extensively in folk medicine and classified as a multipurpose plant species, serving as a source of features of direct importance to the agricultural and agri-industrial sectors. This species is considered a suitable model to study the mechanisms behind resistance to several abiotic stresses including salinity. The recently achieved technological developments in high-throughput biology opened a new window of opportunity to gain additional insights on purslane resistance to salinity stress, a complex, multigenic, and still not well-understood trait. Only a few reports on single-omics analysis (SOA) of purslane are available, and only one multi-omics integration (MOI) analysis exists so far integrating distinct omics platforms (transcriptomics and metabolomics) to characterize the response of purslane plants to salinity stress.
Methods The present study is a second step in building a robust database on the morpho-physiological and molecular responses of purslane to salinity stress and its subsequent use in attempting to decode the genetics behind its resistance to this abiotic stress. Here, the characterization of the morpho-physiological responses of adult purslane plants to salinity stress and a metabolomics and proteomics integrative approach to study the changes at the molecular level in their leaves and roots is presented.
Results and discussion Adult plants of the B1 purslane accession lost approximately 50% of the fresh and dry weight (from shoots and roots) when submitted to very high salinity stress (2.0 g of NaCl/100 g of the substrate). The resistance to very high levels of salinity stress increases as the purslane plant matures, and most of the absorbed sodium remains in the roots, with only a part (~12%) reaching the shoots. Crystal-like structures, constituted mainly by Na+, Cl-, and K+, were found in the leaf veins and intercellular space near the stoma, indicating that this species has a mechanism of salt exclusion operating on the leaves, which has its role in salt tolerance. The MOI approach showed that 41 metabolites were statistically significant on the leaves and 65 metabolites on the roots of adult purslane plants. The combination of the mummichog algorithm and metabolomics database comparison revealed that the glycine, serine, and threonine; amino sugar and nucleotide sugar; and glycolysis/gluconeogenesis pathways were the most significantly enriched pathways when considering the total number of occurrences in the leaves (with 14, 13, and 13, respectively) and roots (all with eight) of adult plants; that purslane plants employ the adaptive mechanism of osmoprotection to mitigate the negative effect of very high levels of salinity stress; and that this mechanism is prevalent in the leaves. The multi-omics database built by our group underwent a screen for salt-responsive genes, which are now under further characterization for their potential to promote resistance to salinity stress when heterologously overexpressed in salt-sensitive plants.
Affiliation(s)
- André Pereira Leão
- The Brazilian Agricultural Research Corporation, Embrapa Agroenergy, Brasília, DF, Brazil
- Manoel Teixeira Souza Júnior
- The Brazilian Agricultural Research Corporation, Embrapa Agroenergy, Brasília, DF, Brazil
- Graduate Program of Plant Biotechnology, Federal University of Lavras, Lavras, MG, Brazil
|
11
|
Rahman A, Sarker MT, Islam MA, Hossain MU, Hasan M, Susmi TF. Targeting Essential Hypothetical Proteins of Pseudomonas aeruginosa PAO1 for Mining of Novel Therapeutics: An In Silico Approach. BIOMED RESEARCH INTERNATIONAL 2023; 2023:1787485. [PMID: 37090194 PMCID: PMC10119676 DOI: 10.1155/2023/1787485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/26/2022] [Revised: 01/24/2023] [Accepted: 02/06/2023] [Indexed: 04/25/2023]
Abstract
As an omnipresent opportunistic bacterium, Pseudomonas aeruginosa PAO1 is responsible for acute and chronic infection in immunocompromised individuals. Currently, this bacterium is on WHO's red list, for which new antibiotics are urgently required. Finding essential genes and essential hypothetical proteins (EHPs) can be crucial in identifying novel druggable targets and therapeutics. This study aimed to characterize these EHPs and to analyze their subcellular and physicochemical properties, PPI network, non-homology to human proteins, virulence factors, and potential as novel drug targets, and finally to perform structural analysis of the identified targets, employing around 42 robust bioinformatics tools/databases, the output of which was evaluated using ROC analysis. The study discovered 18 EHPs from 336 essential genes, with domain and functional annotation revealing that 50% of these proteins belong to the enzyme category. The majority are cytoplasmic and cytoplasmic membrane proteins, with half being stable proteins subjected to PPI network analysis. The network contains 261 nodes and 269 edges for 9 proteins of interest, with 11 hubs containing at least three nodes each. Finally, a pipeline builder predicted 7 proteins with novel drug targets, 5 proteins nonhomologous to the human proteome, human antitargets, and human gut flora, and 3 virulent proteins. Among these, homology modeling of NP_249450 and NP_251676 was done, and the Ramachandran plot analysis revealed that more than 94% of the residues were in the preferred region. By analyzing functional attributes and virulence characteristics, the findings of this study may facilitate the development of innovative antibacterial drug targets and drugs against Pseudomonas aeruginosa PAO1.
Affiliation(s)
- Atikur Rahman
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
- Md. Takim Sarker
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
- Md Ashiqul Islam
- Department of Chemistry and Biochemistry, University of Windsor, Canada
- Mohammad Uzzal Hossain
- Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Mahmudul Hasan
- Department of Pharmaceuticals and Industrial Biotechnology, Sylhet Agricultural University, Sylhet 3100, Bangladesh
- Tasmina Ferdous Susmi
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
|
12
|
Chen R, Sanders SM, Ma Z, Paschall J, Chang ES, Riscoe BM, Schnitzler CE, Baxevanis AD, Nicotra ML. XY sex determination in a cnidarian. BMC Biol 2023; 21:32. [PMID: 36782149 PMCID: PMC9926710 DOI: 10.1186/s12915-023-01532-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 01/31/2023] [Indexed: 02/15/2023] Open
Abstract
BACKGROUND Sex determination occurs across animal species, but most of our knowledge about its mechanisms comes from only a handful of bilaterian taxa. This limits our ability to infer the evolutionary history of sex determination within animals. RESULTS In this study, we generated a linkage map of the genome of the colonial cnidarian Hydractinia symbiolongicarpus and used it to demonstrate that this species has an XX/XY sex determination system. We demonstrate that the X and Y chromosomes have pseudoautosomal and non-recombining regions. We then use the linkage map and a method based on the depth of sequencing coverage to identify genes encoded in the non-recombining region and show that many of them have male gonad-specific expression. In addition, we demonstrate that recombination rates are enhanced in the female genome and that the haploid chromosome number in Hydractinia is n = 15. CONCLUSIONS These findings establish Hydractinia as a tractable non-bilaterian model system for the study of sex determination and the evolution of sex chromosomes.
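The abstract mentions "a method based on the depth of sequencing coverage" without describing it. As a generic illustration only (not the authors' procedure), sex-specific regions in an XX/XY system are often screened by comparing normalised read depth between males and females: a Y-specific gene has coverage in males only, while an X-specific gene shows roughly half the female depth in males. The function name, the normalisation convention, and all thresholds below are assumptions chosen for this sketch.

```python
import math


def classify_gene(male_depth, female_depth, min_depth=0.25, tol=0.4):
    """Classify a gene from per-sex read depth.

    Depths are assumed to be normalised so that 1.0 is the typical
    autosomal depth in each sex. `min_depth` and `tol` are hypothetical
    thresholds, not values from the cited study.
    """
    if male_depth < min_depth and female_depth < min_depth:
        return "uncovered"
    if female_depth < min_depth:
        # Covered in males only: candidate for the Y non-recombining region.
        return "Y-specific candidate"
    if male_depth < min_depth:
        return "ambiguous"
    log_ratio = math.log2(male_depth / female_depth)
    if abs(log_ratio) <= tol:
        # Equal depth in both sexes.
        return "autosomal or pseudoautosomal"
    if abs(log_ratio + 1.0) <= tol:
        # Males carry one copy, females two: ratio near 0.5 (log2 = -1).
        return "X-specific candidate"
    return "ambiguous"


# Toy depths, invented for illustration.
examples = {
    "geneA": classify_gene(0.5, 0.02),
    "geneB": classify_gene(0.5, 1.0),
    "geneC": classify_gene(1.0, 1.05),
}
```

In practice such a screen would be combined with other evidence, as the abstract indicates the authors did with their linkage map.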
Affiliation(s)
- Ruoxu Chen
- School of Medicine, Tsinghua University, Beijing, China
- Visiting Scholar, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Steven M Sanders
- Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Zhiwei Ma
- Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Justin Paschall
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- E Sally Chang
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Brooke M Riscoe
- Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Andreas D Baxevanis
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Matthew L Nicotra
- Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Immunology, University of Pittsburgh, Pittsburgh, PA, USA
|
13
|
Dhanapal AR, Thandeeswaran M, Muthusamy P, Jayaraman A. Identification and structural prediction of the unrevealed amidohydrolase enzyme: Pterin deaminase from Agrobacterium tumefaciens LBA4404. Biotechnol Appl Biochem 2023; 70:193-200. [PMID: 35352406 DOI: 10.1002/bab.2342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Accepted: 02/28/2022] [Indexed: 11/11/2022]
Abstract
Microbes make a remarkable contribution to the health and well-being of living beings all over the world. Pterin deaminase is an amidohydrolase enzyme that exhibits antitumor, anticancer, and antioxidant properties. Given the existing evidence for pterin deaminase in microbial sources, an attempt was made to reveal the existence of this enzyme in the unexplored bacterium Agrobacterium tumefaciens LBA4404. The cells were harvested, the enzyme was characterized as intracellular, and it was then partially purified through acetone precipitation. A further purification step was carried out by ion-exchange chromatography (HiTrap Q FF) using fast protein liquid chromatography (FPLC). The approximate molecular weight of the purified pterin deaminase was then determined through SDS-PAGE. The purified protein was identified by MALDI-TOF, and the sequence was explored through a Mascot search. Additionally, the three-dimensional structure, as well as the ligand-binding sites, was predicted and then validated, and the stability of this enzyme was confirmed for the first time. Thus, the present study revealed the selected parameters that have a considerable impact on the identification and purification of pterin deaminase from A. tumefaciens LBA4404 for the first time. The enzyme's specificity makes it a favorable candidate as a potent anticancer agent.
Affiliation(s)
- Anand Raj Dhanapal
- Department of Biotechnology, Karpagam Academy of Higher Education, Coimbatore, India
- Murugesan Thandeeswaran
- Cancer Therapeutics Laboratory, Department of Microbial Biotechnology, Bharathiar University, Coimbatore, India
- Angayarkanni Jayaraman
- Cancer Therapeutics Laboratory, Department of Microbial Biotechnology, Bharathiar University, Coimbatore, India
|
14
|
Paik I, Ngo PHT, Shroff R, Diaz DJ, Maranhao AC, Walker DJ, Bhadra S, Ellington AD. Improved Bst DNA Polymerase Variants Derived via a Machine Learning Approach. Biochemistry 2023; 62:410-418. [PMID: 34762799 PMCID: PMC9514386 DOI: 10.1021/acs.biochem.1c00451] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Indexed: 02/02/2023]
Abstract
The DNA polymerase I from Geobacillus stearothermophilus (also known as Bst DNAP) is widely used in isothermal amplification reactions, where its strand displacement ability is prized. More robust versions of this enzyme should be enabled for diagnostic applications, especially for carrying out higher temperature reactions that might proceed more quickly. To this end, we appended a short fusion domain from the actin-binding protein villin that improved both stability and purification of the enzyme. In parallel, we have developed a machine learning algorithm that assesses the relative fit of individual amino acids to their chemical microenvironments at any position in a protein and applied this algorithm to predict sequence substitutions in Bst DNAP. The top predicted variants had greatly improved thermotolerance (heating prior to assay), and upon combination, the mutations showed additive thermostability, with denaturation temperatures up to 2.5 °C higher than the parental enzyme. The increased thermostability of the enzyme allowed faster loop-mediated isothermal amplification assays to be carried out at 73 °C, where both Bst DNAP and its improved commercial counterpart Bst 2.0 are inactivated. Overall, this is one of the first examples of the application of machine learning approaches to the thermostabilization of an enzyme.
Affiliation(s)
- Inyup Paik
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Phuoc H. T. Ngo
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology and Department of Chemistry, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States
- Raghav Shroff
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States; CCDC Army Research Lab-South, Austin, Texas 78712, United States
- Daniel J. Diaz
- Center for Systems and Synthetic Biology and Department of Chemistry, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States
- Andre C. Maranhao
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- David J.F. Walker
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Sanchita Bhadra
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Andrew D. Ellington
- Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, Texas 78712, United States; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
|
15
|
Sarker B, Khare N, Devignes MD, Aridhi S. Improving automatic GO annotation with semantic similarity. BMC Bioinformatics 2022; 23:433. [PMID: 36510133 PMCID: PMC9743508 DOI: 10.1186/s12859-022-04958-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/18/2022] [Accepted: 09/19/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Automatic functional annotation of proteins is an open research problem in bioinformatics. The growing number of protein entries in public databases, for example in UniProtKB, poses challenges for manual functional annotation. Manual annotation requires expert human curators to search and read related research articles, interpret the results, and assign the annotations to the proteins. Thus, it is a time-consuming and expensive process. Therefore, designing computational tools to perform automatic annotation, leveraging the high-quality manual annotations that already exist in UniProtKB/SwissProt, is an important research problem. RESULTS In this paper, we extend and adapt the GrAPFI (graph-based automatic protein function inference) method (Sarker et al. in BMC Bioinform 21, 2020; Sarker et al., in: Proceedings of 7th international conference on complex networks and their applications, Cambridge, 2018) for automatic annotation of proteins with gene ontology (GO) terms, renaming it GrAPFI-GO. The original GrAPFI method uses label propagation in a similarity graph where proteins are linked through the domains, families, and superfamilies that they share. Here, we also explore various types of similarity measures based on common neighbors in the graph. Moreover, GO terms are arranged in a hierarchical manner according to semantic parent-child relations. Therefore, we propose an efficient pruning and post-processing technique that integrates both semantic similarity and hierarchical relations between the GO terms. We produce experimental results comparing the GrAPFI-GO method with and without considering common neighbors similarity. We also test the performance of GrAPFI-GO and other annotation tools for GO annotation on a benchmark of proteins with and without the proposed pruning and post-processing procedure.
CONCLUSION Our results show that the proposed semantic hierarchical post-processing potentially improves the performance of GrAPFI-GO and of other annotation tools as well. Thus, GrAPFI-GO offers an original, efficient, and reusable procedure to exploit the semantic relations among the GO terms in order to improve the automatic annotation of protein functions.
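The two ideas named in this abstract, label propagation over a protein similarity graph and hierarchy-aware pruning of predicted GO terms, can be illustrated with a minimal sketch. This is not the GrAPFI-GO implementation: the Jaccard edge weight, the score normalisation, the `min_score` threshold, the "keep only the most specific term" rule, and all identifiers in the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of domain/family identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_labels(query_domains, annotated):
    """Score GO terms for a query protein from its annotated neighbours.

    `annotated` maps protein id -> (set of domain ids, set of GO terms).
    Each neighbour votes for its terms with a weight equal to its
    similarity to the query; scores are normalised by the total weight.
    """
    scores = defaultdict(float)
    total = 0.0
    for domains, go_terms in annotated.values():
        weight = jaccard(query_domains, domains)
        if weight == 0.0:
            continue  # no shared domain, so no edge in the graph
        total += weight
        for term in go_terms:
            scores[term] += weight
    return {t: s / total for t, s in scores.items()} if total else {}


def ancestors(term, parents):
    """All ancestors of `term` in a child -> set-of-parents mapping."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def prune_to_most_specific(scores, parents, min_score=0.5):
    """Keep confident terms, then drop any that are ancestors of another kept term."""
    kept = {t for t, s in scores.items() if s >= min_score}
    redundant = set()
    for term in kept:
        redundant |= ancestors(term, parents) & kept
    return kept - redundant


# Toy data with made-up identifiers.
annotated = {
    "P1": ({"D1", "D2"}, {"GO:A", "GO:B"}),
    "P2": ({"D2", "D3"}, {"GO:C"}),
}
parents = {"GO:B": {"GO:A"}, "GO:C": {"GO:A"}}

scores = propagate_labels({"D1", "D2"}, annotated)
predicted = prune_to_most_specific(scores, parents)
```

Here the general term `GO:A` is dropped because its more specific child `GO:B` is also predicted; a real post-processing step could equally propagate scores upward or use a semantic similarity measure, which the abstract mentions but does not specify.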
Affiliation(s)
- Bishnu Sarker
- CNRS, Inria, LORIA, University of Lorraine, 54000 Nancy, France; Khulna University of Engineering and Technology, Khulna, Bangladesh; School of Applied Computational Sciences, Meharry Medical College, Nashville, TN, USA
- Navya Khare
- CNRS, Inria, LORIA, University of Lorraine, 54000 Nancy, France; International Institute of Information Technology, Hyderabad, India
- Sabeur Aridhi
- CNRS, Inria, LORIA, University of Lorraine, 54000 Nancy, France
|
16
|
Goudey B, Geard N, Verspoor K, Zobel J. Propagation, detection and correction of errors using the sequence database network. Brief Bioinform 2022; 23:6764545. [PMID: 36266246 PMCID: PMC9677457 DOI: 10.1093/bib/bbac416] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/10/2022] [Revised: 07/31/2022] [Accepted: 08/28/2022] [Indexed: 12/14/2022] Open
Abstract
Nucleotide and protein sequences stored in public databases are the cornerstone of many bioinformatics analyses. The records containing these sequences are prone to a wide range of errors, including incorrect functional annotation, sequence contamination and taxonomic misclassification. One source of information that can help to detect errors is the strong interdependency between records. Novel sequences in one database draw their annotations from existing records, may generate new records in multiple other locations and will have varying degrees of similarity with existing records across a range of attributes. A network perspective of these relationships between sequence records, within and across databases, offers new opportunities to detect, or even correct, erroneous entries and more broadly to make inferences about record quality. Here, we describe this novel perspective of sequence database records as a rich network, which we call the sequence database network, and illustrate the opportunities this perspective offers for quantification of database quality and detection of spurious entries. We provide an overview of the relevant databases and describe how the interdependencies between sequence records across these databases can be exploited by network analyses. We review the process of sequence annotation and provide a classification of sources of error, highlighting propagation as a major source. We illustrate the value of a network perspective through three case studies that use network analysis to detect errors, and explore the quality and quantity of critical relationships that would inform such network analyses. This systematic description of a network perspective of sequence database records provides a novel direction to combat the proliferation of errors within these critical bioinformatics resources.
Affiliation(s)
- Benjamin Goudey
- Corresponding author. Benjamin Goudey, School of Computing and Information Systems, University of Melbourne Parkville, Victoria, 3010
- Nicholas Geard
- School of Computing and Information Systems, University of Melbourne Parkville, Victoria, 3010
- Karin Verspoor
- School of Computing Technologies, RMIT University Melbourne, Victoria, 3000
- Justin Zobel
- School of Computing and Information Systems, University of Melbourne Parkville, Victoria, 3010
|
17
|
Anuntasomboon P, Siripattanapipong S, Unajak S, Choowongkomon K, Burchmore R, Leelayoova S, Mungthin M, E-kobon T. Making the Most of Its Short Reads: A Bioinformatics Workflow for Analysing the Short-Read-Only Data of Leishmania orientalis (Formerly Named Leishmania siamensis) Isolate PCM2 in Thailand. BIOLOGY 2022; 11:biology11091272. [PMID: 36138751 PMCID: PMC9495971 DOI: 10.3390/biology11091272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 08/23/2022] [Accepted: 08/24/2022] [Indexed: 11/17/2022]
Abstract
Simple Summary: Leishmaniasis is a parasitic disease caused by flagellated protozoa of the genus Leishmania. Multiple genome sequencing platforms have been employed to complete Leishmania genomes, but at high cost. This study proposes an integrative bioinformatic workflow for assembling only the short-read data of Leishmania orientalis isolate PCM2 from Thailand and producing an acceptable-quality genome for further genomic analysis. This workflow gives extensive information required for identifying strain-specific markers and virulence-associated genes useful for drug and vaccine development before a more exhaustive and expensive investigation. Abstract: Background: Leishmania orientalis (formerly named Leishmania siamensis) has been neglected for years in Thailand. The genomic study of L. orientalis has gained much attention recently after the release of the first high-quality reference genome of the isolate LSCM4. The integrative approach of multiple sequencing platforms for whole-genome sequencing has proven effective, but at considerable cost. This study presents a preliminary bioinformatic workflow including the use of multi-step de novo assembly coupled with the reference-based assembly method to produce high-quality genomic drafts from the short-read Illumina sequence data of L. orientalis isolate PCM2. Results: Integrating multi-step de novo assembly by MEGAHIT and SPAdes with the reference-based method using the L. enriettii genome, and salvaging the unmapped reads, resulted in a 30.27 Mb genomic draft of L. orientalis isolate PCM2 with 3367 contigs and 8887 predicted genes. The results from the integrated approach showed the best integrity, coverage, and contig alignment when compared to the genome of L. orientalis isolate LSCM4 collected from the northern province of Thailand. Similar patterns of gene ratios and frequency were observed from the GO biological process annotation.
Fifty GO terms were assigned to the assembled genomes, and 23 of these (accounting for 61.6% of the annotated genes) showed higher gene counts and ratios when results from our workflow were compared to those of the LSCM4 isolate. Conclusions: These results indicated that our proposed bioinformatic workflow produced an acceptable-quality genome of L. orientalis strain PCM2 for functional genomic analysis, maximising the usage of the short-read data. This workflow would give extensive information required for identifying strain-specific markers and virulence-associated genes useful for drug and vaccine development before a more exhaustive and expensive investigation.
Affiliation(s)
- Pornchai Anuntasomboon
- Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Omics Center for Agriculture, Bioresources, Food, and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
- Sasimanas Unajak
- Department of Biochemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Kiattawee Choowongkomon
- Department of Biochemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Richard Burchmore
- Glasgow Polyomics, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Saovanee Leelayoova
- Department of Parasitology, Phramongkutklao College of Medicine, Bangkok 10400, Thailand
- Mathirut Mungthin
- Department of Parasitology, Phramongkutklao College of Medicine, Bangkok 10400, Thailand
- Teerasak E-kobon
- Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Omics Center for Agriculture, Bioresources, Food, and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
- Correspondence: Tel.: +66-812-85-4672
|
18
|
Rule-Based Pruning and In Silico Identification of Essential Proteins in Yeast PPIN. Cells 2022; 11:cells11172648. [PMID: 36078056 PMCID: PMC9454873 DOI: 10.3390/cells11172648] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/28/2022] [Revised: 08/18/2022] [Accepted: 08/22/2022] [Indexed: 11/25/2022] Open
Abstract
Proteins are vital for the significant cellular activities of living organisms. However, not all of them are essential. Identifying essential proteins through biological experiments is more laborious and time-consuming than the computational approaches used in recent times. However, practical implementation of conventional methods sometimes becomes challenging because of poor performance in specific scenarios. Thus, more developed and efficient computational prediction models are required for essential protein identification. This research proposes an effective methodology capable of predicting essential proteins in a refined yeast protein–protein interaction network (PPIN). The rule-based refinement is done using protein complex and local interaction density information derived from the neighborhood properties of proteins in the network. Identification and pruning of non-essential proteins are equally crucial here. In the initial phase, careful assessment is performed by applying node and edge weights to identify and discard the non-essential proteins from the interaction network. Three cut-off levels are considered for each node and edge weight for pruning the non-essential proteins. Once the PPIN has been filtered, the second phase starts with two centrality-based approaches: (1) local interaction density (LID) and (2) local interaction density with protein complex (LIDC), which are successively implemented to identify the essential proteins in the yeast PPIN. Our proposed methodology achieves better performance in comparison to the existing state-of-the-art techniques.
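The two-phase idea described above (prune the network by node and edge weights, then rank the remaining proteins by a local density measure) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: the weights, the single cut-off per weight, and the density definition used here (the fraction of possible links that exist among a protein's neighbours) are hypothetical choices, and the protein-complex variant (LIDC) is not covered.

```python
from itertools import combinations


def prune(edges, node_weight, node_cutoff, edge_cutoff):
    """Keep edges whose own weight and both endpoint weights reach the cut-offs."""
    return {
        (u, v): w
        for (u, v), w in edges.items()
        if w >= edge_cutoff
        and node_weight[u] >= node_cutoff
        and node_weight[v] >= node_cutoff
    }


def adjacency(edges):
    """Build an undirected adjacency map from an edge dictionary."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_density(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    nbrs = sorted(adj[node])
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)


# Toy network with invented weights.
edges = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "C"): 0.7,
    ("A", "D"): 0.6,
    ("D", "E"): 0.1,
}
node_weight = {"A": 1.0, "B": 0.9, "C": 0.8, "D": 0.7, "E": 0.2}

filtered = prune(edges, node_weight, node_cutoff=0.5, edge_cutoff=0.5)
adj = adjacency(filtered)
ranking = sorted(adj, key=lambda n: (-local_density(adj, n), n))
```

In this toy network the low-weight protein `E` and its edge are discarded in the first phase, and the remaining proteins are ordered by how densely their neighbours interact with each other.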
|
19
|
Coulter M, Entizne JC, Guo W, Bayer M, Wonneberger R, Milne L, Schreiber M, Haaning A, Muehlbauer GJ, McCallum N, Fuller J, Simpson C, Stein N, Brown JWS, Waugh R, Zhang R. BaRTv2: a highly resolved barley reference transcriptome for accurate transcript-specific RNA-seq quantification. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2022; 111:1183-1202. [PMID: 35704392 PMCID: PMC9546494 DOI: 10.1111/tpj.15871] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/18/2021] [Revised: 05/02/2022] [Accepted: 06/09/2022] [Indexed: 06/15/2023]
Abstract
Accurate characterisation of splice junctions (SJs) as well as transcription start and end sites in reference transcriptomes allows precise quantification of transcripts from RNA-seq data, and enables detailed investigations of transcriptional and post-transcriptional regulation. Using novel computational methods and a combination of PacBio Iso-seq and Illumina short-read sequences from 20 diverse tissues and conditions, we generated a comprehensive and highly resolved barley reference transcript dataset from the European 2-row spring barley cultivar Barke (BaRTv2.18). Stringent and thorough filtering was carried out to maintain the quality and accuracy of the SJs and transcript start and end sites. BaRTv2.18 shows increased transcript diversity and completeness compared with an earlier version, BaRTv1.0. The accuracy of transcript-level quantification, SJs and transcript start and end sites has been validated extensively using parallel technologies and analysis, including high-resolution reverse transcriptase-polymerase chain reaction and 5'-RACE. BaRTv2.18 contains 39 434 genes and 148 260 transcripts, representing the most comprehensive and resolved reference transcriptome in barley to date. It provides an important and high-quality resource for advanced transcriptomic analyses, including both transcriptional and post-transcriptional regulation, with exceptional resolution and precision.
Affiliation(s)
- Max Coulter
- Division of Plant Sciences, University of Dundee, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Juan Carlos Entizne
- Division of Plant Sciences, University of Dundee, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Wenbin Guo
- Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Micha Bayer
- Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Ronja Wonneberger
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, D-06466 Stadt Seeland, Germany
- Linda Milne
- Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Miriam Schreiber
- Division of Plant Sciences, University of Dundee, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Allison Haaning
- Department of Agronomy and Plant Genetics, University of Minnesota, 1991 Upper Buford Circle, 542 Borlaug Hall, St Paul, Minnesota 55108, USA
- Gary J. Muehlbauer
- Department of Agronomy and Plant Genetics, University of Minnesota, 1991 Upper Buford Circle, 542 Borlaug Hall, St Paul, Minnesota 55108, USA
- Nicola McCallum
- Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- John Fuller
- Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Craig Simpson
- Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, D-06466 Stadt Seeland, Germany
- Center for Integrated Breeding Research (CiBreed), Georg-August-University, Göttingen, Germany
- John W. S. Brown
- Division of Plant Sciences, University of Dundee, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Robbie Waugh
- Division of Plant Sciences, University of Dundee, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- School of Agriculture and Wine & Waite Research Institute, University of Adelaide, Waite Campus, Glen Osmond, South Australia 5064, Australia
- Runxuan Zhang
- Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
|
20
Booth MW, Breed MF, Kendrick GA, Bayer PE, Severn-Ellis AA, Sinclair EA. Tissue-specific transcriptome profiles identify functional differences key to understanding whole plant response to life in variable salinity. Biol Open 2022; 11:276025. [PMID: 35876771 PMCID: PMC9428325 DOI: 10.1242/bio.059147] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/15/2021] [Accepted: 07/14/2022] [Indexed: 11/20/2022]
Abstract
Plants endure environmental stressors via adaptation and phenotypic plasticity. Studying these mechanisms in seagrasses is extremely relevant as they are important primary producers and functionally significant carbon sinks. These mechanisms are not well understood at the tissue level in seagrasses. Using RNA-seq, we generated transcriptome sequences from tissue of leaf, basal leaf meristem and root organs of Posidonia australis, establishing baseline in situ transcriptomic profiles for tissues across a salinity gradient. Samples were collected from four P. australis meadows growing in Shark Bay, Western Australia. Analysis of gene expression showed significant differences between tissue types, with more variation among leaves than meristem or roots. Gene ontology enrichment analysis showed the differences were largely due to the role of photosynthesis, plant growth and nutrient absorption in leaf, meristem and root organs, respectively. Differential gene expression of leaf and meristem showed upregulation of salinity regulation processes in higher salinity meadows. Our study highlights the importance of considering leaf meristem tissue when evaluating whole-plant responses to environmental change. This article has an associated First Person interview with the first author of the paper. Summary: Differences in seagrass leaf, meristem and root transcriptomes across variable salinities are due to tissue-specific processes. Leaf meristem contained the broadest process range, indicating preferential use for inferring plant-wide activity.
Affiliation(s)
- Mitchell W Booth
  - School of Biological Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia; Oceans Institute, The University of Western Australia, Crawley, Western Australia 6009, Australia
- Martin F Breed
  - College of Science and Engineering, Flinders University, Bedford Park, South Australia 5042, Australia
- Gary A Kendrick
  - School of Biological Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia; Oceans Institute, The University of Western Australia, Crawley, Western Australia 6009, Australia
- Philipp E Bayer
  - School of Biological Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
- Anita A Severn-Ellis
  - School of Biological Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia; Aquatic Animal Health Research, Indian Ocean Marine Research Centre, Department of Primary Industries and Regional Development, Western Australia, 6020, Australia
- Elizabeth A Sinclair
  - School of Biological Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia; Oceans Institute, The University of Western Australia, Crawley, Western Australia 6009, Australia; Kings Park Science, Department of Biodiversity Conservation and Attractions, 1 Kattidj Close, West Perth, Western Australia, 6005, Australia
21
Borges Araujo AJ, Cerruti GV, Zuccarelli R, Rodriguez Ruiz M, Freschi L, Singh R, Moerschbacher BM, Floh EIS, Wendt dos Santos AL. Proteomic Analysis of S-Nitrosation Sites During Somatic Embryogenesis in Brazilian Pine, Araucaria angustifolia (Bertol.) Kuntze. FRONTIERS IN PLANT SCIENCE 2022; 13:902068. [PMID: 35845673 PMCID: PMC9280032 DOI: 10.3389/fpls.2022.902068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/22/2022] [Accepted: 05/27/2022] [Indexed: 06/15/2023]
Abstract
Cysteine S-nitrosation is a redox-based post-translational modification that mediates nitric oxide (NO) regulation of various aspects of plant growth, development and stress responses. Despite its importance, studies exploring protein signaling pathways that are regulated by S-nitrosation during somatic embryogenesis have not been performed. In the present study, endogenous cysteine S-nitrosation sites and S-nitrosated proteins were identified by iodo-TMT labeling during somatic embryogenesis in Brazilian pine, an endangered native conifer of South America. In addition, endogenous S-nitrosothiol (SNO) levels and S-nitrosoglutathione reductase (GSNOR) activity were determined in cell lines with contrasting embryogenic potential. Overall, we identified an array of proteins associated with a large variety of biological processes and molecular functions, some of them already described as important for somatic embryogenesis (Class IV chitinase, pyruvate dehydrogenase E1 and dehydroascorbate reductase). In total, our S-nitrosoproteome analyses identified 18 endogenously S-nitrosated proteins and 50 in vitro S-nitrosated proteins (after GSNO treatment) during cell culture proliferation and embryo development. Furthermore, SNO levels and GSNOR activity were increased during embryo formation. These findings expand our understanding of the Brazilian pine proteome and offer new insights into the potential use of pharmacological manipulation of NO levels by using NO inhibitors and donors during somatic embryogenesis.
Affiliation(s)
- Rafael Zuccarelli
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Marta Rodriguez Ruiz
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Luciano Freschi
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Ratna Singh
  - Department of Plant Biology and Biotechnology, WWU Münster, Münster, Germany
- Eny Iochevet Segal Floh
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
22
Yazdi HP, Ravinet M, Rowe M, Saetre GP, Guldvog CØ, Eroukhmanoff F, Marzal A, Magallanes S, Runemark A. Extensive transgressive gene expression in testis but not ovary in the homoploid hybrid Italian sparrow. Mol Ecol 2022; 31:4067-4077. [PMID: 35726533 PMCID: PMC9542029 DOI: 10.1111/mec.16572] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/09/2021] [Revised: 05/01/2022] [Accepted: 05/12/2022] [Indexed: 11/30/2022]
Abstract
Hybridization can result in novel allelic combinations which can impact the hybrid phenotype through changes in gene expression. While misexpression in F1 hybrids is well documented, how gene expression evolves in stabilized hybrid taxa remains an open question. As gene expression evolves in a stabilizing manner, break‐up of co‐evolved cis‐ and trans‐regulatory elements could lead to transgressive patterns of gene expression in hybrids. Here, we address to what extent gonad gene expression has evolved in an established and stable homoploid hybrid, the Italian sparrow (Passer italiae). Through comparison of gene expression in gonads from individuals of the two parental species (i.e., house and Spanish sparrow) to that of Italian sparrows, we find evidence for strongly transgressive expression in male Italian sparrows—2530 genes (22% of testis genes tested for inheritance) exhibit expression patterns outside the range of both parent species. In contrast, Italian sparrow ovary expression was similar to that of one of the parent species, the house sparrow (Passer domesticus). Moreover, the Italian sparrow testis transcriptome is 26 times as diverged from those of the parent species as the parental transcriptomes are from each other, despite being genetically intermediate. This highlights the potential for regulation of gene expression to produce novel variation following hybridization. Genes involved in mitochondrial respiratory chain complexes and protein synthesis are enriched in the subset that is over‐dominantly expressed in Italian sparrow testis, suggesting that selection on key functions has moulded the hybrid Italian sparrow transcriptome.
Affiliation(s)
- Mark Ravinet
  - School of Life Sciences, University of Nottingham, Nottingham, UK
- Melissah Rowe
  - Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), AB, Wageningen, The Netherlands
- Glenn-Peter Saetre
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, PO, Oslo, Norway
- Caroline Øien Guldvog
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, PO, Oslo, Norway
- Fabrice Eroukhmanoff
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, PO, Oslo, Norway
- Alfonso Marzal
  - Department of Anatomy, Cellular Biology and Zoology, University of Extremadura, Badajoz, Spain
- Sergio Magallanes
  - Department of Anatomy, Cellular Biology and Zoology, University of Extremadura, Badajoz, Spain; Department of Wetland Ecology, Doñana Biological Station (EBD-CSIC), Avda. Américo Vespucio, 41092, Seville, Spain
- Anna Runemark
  - Department of Biology, Lund University, Lund, Sweden
23
Reijnders MJMF, Waterhouse RM. CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation. PLoS Comput Biol 2022; 18:e1010075. [PMID: 35560159 PMCID: PMC9132264 DOI: 10.1371/journal.pcbi.1010075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2021] [Revised: 05/25/2022] [Accepted: 04/04/2022] [Indexed: 11/29/2022]
Abstract
Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, a consensus dataset is produced with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from a deep learning-based, a sequence similarity-based, and two protein domain-based methods, delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures CrowdGO performance matches that of the community’s best performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations.

New technologies mean that we are able to read the genetic blueprints in the form of complete genome sequences from many different species. We are also able to use computational methods combined with evidence from experiments to map out the locations in the genomes of many thousands of genes and other important regions. However, discovering and characterising the biological functions of all these genes and their protein products requires considerably more experimental work. In order to gain insights into the possible functions of the many genes currently lacking functional information from experiments we must therefore rely on methods that computationally predict protein functions. Many different software tools have been developed to tackle this challenge, each with their own strengths and weaknesses as shown by several community-based competitions that assess the performance of the predictors. Taking advantage of powerful modern machine learning techniques, we developed CrowdGO, a new software that aims to combine predictions from several tools and produce comprehensive and accurate gene functional annotations. CrowdGO is able to computationally assess agreements and conflicts amongst annotations from different predictors to then re-evaluate the results and deliver enhanced predictions of protein functions.
Affiliation(s)
- Maarten J. M. F. Reijnders
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Robert M. Waterhouse
  - Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, Switzerland
24
Cao A, de la Fuente M, Gesteiro N, Santiago R, Malvar RA, Butrón A. Genomics and Pathways Involved in Maize Resistance to Fusarium Ear Rot and Kernel Contamination With Fumonisins. FRONTIERS IN PLANT SCIENCE 2022; 13:866478. [PMID: 35586219 PMCID: PMC9108495 DOI: 10.3389/fpls.2022.866478] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/31/2022] [Accepted: 03/25/2022] [Indexed: 06/15/2023]
Abstract
Fusarium verticillioides is a causal agent of maize ear rot and produces fumonisins, which are mycotoxins that are toxic to animals and humans. In this study, quantitative trait loci (QTLs) and bulk-segregant RNA-seq approaches were used to uncover genomic regions and pathways involved in resistance to Fusarium ear rot (FER) and to fumonisin accumulation in maize kernels. Genomic regions at bins 4.07-4.1, 6-6.01, 6.04-6.05, and 8.05-8.08 were related to FER resistance and/or reduced fumonisin levels in kernels. A comparison of transcriptomes between resistant and susceptible inbred bulks 10 days after inoculation with F. verticillioides revealed 364 differentially expressed genes (DEGs). In the resistant inbred bulks, genes involved in sink metabolic processes such as fatty acid and starch biosynthesis were downregulated, as well as those involved in phytosulfokine signaling and many other genes involved in cell division; while genes involved in secondary metabolism and compounds/processes related to resistance were upregulated, especially those related to cell wall biosynthesis/rearrangement and flavonoid biosynthesis. These trends are indicative of a growth-defense trade-off. Among the DEGs, Zm00001d053603, Zm00001d035562, Zm00001d037810, Zm00001d037921, and Zm00001d010840 were polymorphic between resistant and susceptible bulks, were located in the confidence intervals of detected QTLs, and showed large differences in transcript levels between the resistant and susceptible bulks. Thus, they were identified as candidate genes involved in resistance to FER and/or reduced fumonisin accumulation.
Affiliation(s)
- Ana Cao
  - Misión Biológica de Galicia (CSIC), Pontevedra, Spain
- Rogelio Santiago
  - Misión Biológica de Galicia (CSIC), Pontevedra, Spain
  - Agrobiología Ambiental, Calidad de Suelos y Plantas (UVIGO), Unidad Asociada a la MBG (CSIC), Pontevedra, Spain
- Rosa Ana Malvar
  - Misión Biológica de Galicia (CSIC), Pontevedra, Spain
  - Agrobiología Ambiental, Calidad de Suelos y Plantas (UVIGO), Unidad Asociada a la MBG (CSIC), Pontevedra, Spain
- Ana Butrón
  - Misión Biológica de Galicia (CSIC), Pontevedra, Spain
25
Hakala K, Kaewphan S, Bjorne J, Mehryary F, Moen H, Tolvanen M, Salakoski T, Ginter F. Neural Network and Random Forest Models in Protein Function Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:1772-1781. [PMID: 33306472 DOI: 10.1109/tcbb.2020.3044230] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Over the past decade, the demand for automated protein function prediction has increased due to the volume of newly sequenced proteins. In this paper, we address the function prediction task by developing an ensemble system automatically assigning Gene Ontology (GO) terms to the given input protein sequence. We develop an ensemble system which combines the GO predictions made by random forest (RF) and neural network (NN) classifiers. Both RF and NN models rely on features derived from BLAST sequence alignments, taxonomy and protein signature analysis tools. In addition, we report on experiments with a NN model that directly analyzes the amino acid sequence as its sole input, using a convolutional layer. The Swiss-Prot database is used as the training and evaluation data. In the CAFA3 evaluation, which relies on experimental verification of the functional predictions, our submitted ensemble model demonstrates competitive performance ranking among top-10 best-performing systems out of over 100 submitted systems. In this paper, we evaluate and further improve the CAFA3-submitted system. Our machine learning models together with the data pre-processing and feature generation tools are publicly available as an open source software at https://github.com/TurkuNLP/CAFA3.
26
Ilgisonis EV, Pogodin PV, Kiseleva OI, Tarbeeva SN, Ponomarenko EA. Evolution of Protein Functional Annotation: Text Mining Study. J Pers Med 2022; 12:jpm12030479. [PMID: 35330478 PMCID: PMC8952229 DOI: 10.3390/jpm12030479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 11/23/2022]
Abstract
Within the Human Proteome Project initiative framework for creating functional annotations of uPE1 proteins, the neXt-CP50 Challenge was launched in 2018. In analogy with the missing-protein challenge, each team deciphers the functional features of the proteins in the chromosome-centric mode. However, the neXt-CP50 Challenge is more complicated than the missing-protein challenge: the approaches and methods for solving the problem are clear, but neither the concept of protein function nor specific experimental and/or bioinformatics protocols have been standardized to address it. We proposed using a retrospective analysis of the key HPP repository, the neXtProt database, to identify the most frequently used experimental and bioinformatic methods for analyzing protein functions, and the dynamics of accumulation of functional annotations. It has been shown that the dynamics of the increase in the number of proteins with known functions are greater than the progress made in the experimental confirmation of the existence of questionable proteins in the framework of the missing-protein challenge. At the same time, the functional annotation is based on the guilt-by-association postulate, according to which, based on large-scale experiments on API-MS and Y2H, proteins with unknown functions are most likely mapped through “handshakes” to biochemical processes.
27
Pasternak Z, Chapnik N, Yosef R, Kopelman NM, Jurkevitch E, Segev E. Identifying protein function and functional links based on large-scale co-occurrence patterns. PLoS One 2022; 17:e0264765. [PMID: 35239724 PMCID: PMC8893610 DOI: 10.1371/journal.pone.0264765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 02/16/2022] [Indexed: 11/23/2022]
Abstract
Objective: The vast majority of known proteins have not been experimentally tested even at the level of measuring their expression, and the function of many proteins remains unknown. In order to decipher protein function and examine functional associations, we developed "Cliquely", a software tool based on the exploration of co-occurrence patterns. Computational model: Using a set of more than 23 million proteins divided into 404,947 orthologous clusters, we explored the co-occurrence graph of 4,742 fully sequenced genomes from the three domains of life. Edge weights in this graph represent co-occurrence probabilities. We use the Bron–Kerbosch algorithm to detect maximal cliques in this graph, fully-connected subgraphs that represent meaningful biological networks from different functional categories. Main results: We demonstrate that Cliquely can successfully identify known networks from various pathways, including nitrogen fixation, glycolysis, methanogenesis, mevalonate and ribosome proteins. Identifying the virulence-associated type III secretion system (T3SS) network, Cliquely also added 13 previously uncharacterized novel proteins to the T3SS network, demonstrating the strength of this approach. Cliquely is freely available and open source. Users can employ the tool to explore co-occurrence networks using a protein of interest and a customizable level of stringency, either for the entire dataset or for one of the three domains: Archaea, Bacteria, or Eukarya.
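As an illustration of the maximal-clique step named in this abstract, the following is a minimal sketch of the Bron–Kerbosch algorithm (pivoting variant) in Python. It is not Cliquely's code: the function names `threshold_graph` and `bron_kerbosch`, the toy cluster labels, the edge weights, and the idea of turning the weighted graph into an unweighted one by applying a minimum co-occurrence weight are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Iterator, Optional, Set, Tuple

Graph = Dict[str, Set[str]]


def threshold_graph(weights: Dict[Tuple[str, str], float], min_weight: float) -> Graph:
    """Build an undirected adjacency map, keeping edges with weight >= min_weight.

    Assumption for this sketch: the stringency setting is modelled as a simple
    cut-off on the co-occurrence weight.
    """
    graph: Graph = {}
    for (u, v), w in weights.items():
        graph.setdefault(u, set())
        graph.setdefault(v, set())
        if w >= min_weight:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def bron_kerbosch(
    graph: Graph,
    r: Optional[Set[str]] = None,
    p: Optional[Set[str]] = None,
    x: Optional[Set[str]] = None,
) -> Iterator[FrozenSet[str]]:
    """Yield every maximal clique of an undirected graph (with pivoting)."""
    r = set() if r is None else r          # vertices in the clique being grown
    p = set(graph) if p is None else p     # candidates that can still extend r
    x = set() if x is None else x          # vertices already fully explored
    if not p and not x:
        yield frozenset(r)                 # r cannot be extended: it is maximal
        return
    # Pivot on the vertex with the most neighbours among the candidates,
    # so that fewer branches need to be explored.
    pivot = max(p | x, key=lambda v: len(graph[v] & p))
    for v in list(p - graph[pivot]):
        yield from bron_kerbosch(graph, r | {v}, p & graph[v], x & graph[v])
        p.remove(v)
        x.add(v)


# Hypothetical co-occurrence weights between four orthologous clusters.
toy_weights = {
    ("A", "B"): 0.90,
    ("A", "C"): 0.80,
    ("B", "C"): 0.85,
    ("C", "D"): 0.95,
    ("A", "D"): 0.20,
}

if __name__ == "__main__":
    for clique in bron_kerbosch(threshold_graph(toy_weights, 0.7)):
        print(sorted(clique))
```

Lowering the cut-off in this toy example admits the weak A–D edge and changes which groups come out as maximal cliques, which is one plausible way a stringency parameter could shape the reported networks.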
Affiliation(s)
- Zohar Pasternak
  - Division of Identification and Forensic Science, Israel Police, Jerusalem, Israel
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Noam Chapnik
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Roy Yosef
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Naama M. Kopelman
  - Faculty of Science, Holon Institute of Technology, Holon, Israel
- Edouard Jurkevitch
  - Department of Plant Pathology and Microbiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Elad Segev
  - Faculty of Science, Holon Institute of Technology, Holon, Israel
28
Törönen P, Holm L. PANNZER-A practical tool for protein function prediction. Protein Sci 2022; 31:118-128. [PMID: 34562305 PMCID: PMC8740830 DOI: 10.1002/pro.4193] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 08/03/2021] [Revised: 09/22/2021] [Accepted: 09/22/2021] [Indexed: 01/03/2023]
Abstract
The facility of next-generation sequencing has led to an explosion of gene catalogs for novel genomes, transcriptomes and metagenomes, which are functionally uncharacterized. Computational inference has emerged as a necessary substitute for first-hand experimental evidence. PANNZER (Protein ANNotation with Z-scoRE) is a high-throughput functional annotation web server that stands out among similar publicly accessible web servers in supporting submission of up to 100,000 protein sequences at once and providing both Gene Ontology (GO) annotations and free text description predictions. Here, we demonstrate the use of PANNZER and discuss future plans and challenges. We present two case studies to illustrate problems related to data quality and method evaluation. Some commonly used evaluation metrics and evaluation datasets promote methods that favor unspecific and broad functional classes over more informative and specific classes. We argue that this can bias the development of automated function prediction methods. The PANNZER web server and source code are available at http://ekhidna2.biocenter.helsinki.fi/sanspanz/.
Affiliation(s)
- Petri Törönen
  - Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Liisa Holm
  - Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland; Organismal and Evolutionary Biology Research Program, Faculty of Biosciences, University of Helsinki, Helsinki, Finland
29
Wu B, Cox MP. Comparative genomics reveals a core gene toolbox for lifestyle transitions in Hypocreales fungi. Environ Microbiol 2021; 23:3251-3264. [PMID: 33939870 PMCID: PMC8360070 DOI: 10.1111/1462-2920.15554] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/05/2020] [Revised: 04/29/2021] [Accepted: 04/30/2021] [Indexed: 12/13/2022]
Abstract
Fungi have evolved diverse lifestyles and adopted pivotal new roles in both natural ecosystems and human environments. However, the molecular mechanisms underlying their adaptation to new lifestyles are obscure. Here, we hypothesize that genes shared across all species with the same lifestyle, but absent in genera with alternative lifestyles, are crucial to that lifestyle. By analysing dozens of species within four genera in a fungal order, with each genus following a different lifestyle, we find that genus-specific genes are typically few in number. Notably, not all genus-specific genes appear to derive from de novo birth, with most instead reflecting recurrent loss across the fungi. Importantly, however, a subset of these genus-specific genes are shared by fungi with the same lifestyle in quite different evolutionary orders, thus supporting the view that some genus-specific genes are necessary for specific lifestyles. These lifestyle-specific genes are enriched for key functional classes and often exhibit specialized expression patterns. Genus-specific selection also contributes to lifestyle transitions, and is especially associated with intensity of pathogenesis. Our study, therefore, suggests that fungal adaptation to new lifestyles often requires just a small number of core genes, with gene turnover and positive selection playing complementary roles.
Affiliation(s)
- Baojun Wu
  - Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North 4410, New Zealand
  - Bio‐Protection Research Centre, Massey University, Palmerston North 4410, New Zealand
- Murray P. Cox
  - Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North 4410, New Zealand
  - Bio‐Protection Research Centre, Massey University, Palmerston North 4410, New Zealand
30
Wimalanathan K, Lawrence-Dill CJ. Gene Ontology Meta Annotator for Plants (GOMAP). PLANT METHODS 2021; 17:54. [PMID: 34034755 PMCID: PMC8146647 DOI: 10.1186/s13007-021-00754-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/24/2021] [Accepted: 05/10/2021] [Indexed: 05/03/2023]
Abstract
Annotating gene structures and functions to genome assemblies is necessary to make assembly resources useful for biological inference. Gene Ontology (GO) term assignment is the most used functional annotation system, and new methods for GO assignment have improved the quality of GO-based function predictions. The Gene Ontology Meta Annotator for Plants (GOMAP) is an optimized, high-throughput, and reproducible pipeline for genome-scale GO annotation of plants. We containerized GOMAP to increase portability and reproducibility and also optimized its performance for HPC environments. Here we report on the pipeline's availability and performance for annotating large, repetitive plant genomes and describe how GOMAP was used to annotate multiple maize genomes as a test case. Assessment shows that GOMAP expands and improves the number of genes annotated and annotations assigned per gene as well as the quality (based on [Formula: see text]) of GO assignments in maize. GOMAP has been deployed to annotate other species including wheat, rice, barley, cotton, and soy. Instructions and access to the GOMAP Singularity container are freely available online at https://bioinformapping.com/gomap/. A list of annotated genomes and links to data is maintained at https://dill-picl.org/projects/gomap/.
Affiliation(s)
- Kokulapalan Wimalanathan
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50010, USA
  - Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
  - Greenlight Biosciences Inc., Medford, MA, 02155, USA
- Carolyn J Lawrence-Dill
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50010, USA
  - Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
  - Department of Agronomy, Iowa State University, Ames, IA, 50010, USA
31
Michell C, Wutke S, Aranda M, Nyman T. Genomes of the willow-galling sawflies Euura lappo and Eupontania aestiva (Hymenoptera: Tenthredinidae): a resource for research on ecological speciation, adaptation, and gall induction. G3 (BETHESDA, MD.) 2021; 11:jkab094. [PMID: 33788947 PMCID: PMC8104934 DOI: 10.1093/g3journal/jkab094] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/25/2021] [Accepted: 03/09/2021] [Indexed: 12/14/2022]
Abstract
Hymenoptera is a hyperdiverse insect order represented by over 153,000 different species. As many hymenopteran species perform various crucial roles for our environments, such as pollination, herbivory, and parasitism, they are of high economic and ecological importance. There are 99 hymenopteran genomes in the NCBI database, yet only five are representative of the paraphyletic suborder Symphyta (sawflies, woodwasps, and horntails), while the rest represent the suborder Apocrita (bees, wasps, and ants). Here, using a combination of 10X Genomics linked-read sequencing, Oxford Nanopore long-read technology, and Illumina short-read data, we assembled the genomes of two willow-galling sawflies (Hymenoptera: Tenthredinidae: Nematinae: Euurina): the bud-galling species Euura lappo and the leaf-galling species Eupontania aestiva. The final assembly for E. lappo is 259.85 Mbp in size, with a contig N50 of 209.0 kbp and a BUSCO score of 93.5%. The E. aestiva genome is 222.23 Mbp in size, with a contig N50 of 49.7 kbp and a 90.2% complete BUSCO score. De novo annotation of repetitive elements showed that 27.45% of the genome was composed of repetitive elements in E. lappo and 16.89% in E. aestiva, which is a marked increase compared to previously published hymenopteran genomes. The genomes presented here provide a resource for inferring phylogenetic relationships among basal hymenopterans, comparative studies on host-related genomic adaptation in plant-feeding insects, and research on the mechanisms of plant manipulation by gall-inducing insects.
Affiliation(s)
- Craig Michell
  - Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, 80100, Finland
- Saskia Wutke
  - Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, 80100, Finland
- Manuel Aranda
  - Biological and Environmental Sciences & Engineering Division, Red Sea Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
- Tommi Nyman
  - Department of Ecosystems in the Barents Region, Norwegian Institute of Bioeconomy Research, Svanvik, 9925, Norway
32
|
Genome assembly, sex-biased gene expression and dosage compensation in the damselfly Ischnura elegans. Genomics 2021; 113:1828-1837. [PMID: 33831439 DOI: 10.1016/j.ygeno.2021.04.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/12/2020] [Revised: 02/27/2021] [Accepted: 04/04/2021] [Indexed: 12/14/2022]
Abstract
The evolution of sex chromosomes, and patterns of sex-biased gene expression and dosage compensation, are poorly known among early winged insects such as odonates. We assembled and annotated the genome of Ischnura elegans (blue-tailed damselfly), which, like other odonates, has a male-hemigametic sex-determining system (X0 males, XX females). By identifying X-linked genes in I. elegans and their orthologs in other insect genomes, we found homologies between the X chromosome in odonates and chromosomes of other orders, including the X chromosome in Coleoptera. Next, we showed balanced expression of X-linked genes between sexes in adult I. elegans, i.e. evidence of dosage compensation. Finally, among the genes in the sex-determining pathway only fruitless was found to be X-linked, while only doublesex showed sex-biased expression. This study reveals partly conserved sex chromosome synteny and independent evolution of dosage compensation among insect orders separated by several hundred million years of evolutionary history.
|
33
|
Naake T, Maeda HA, Proost S, Tohge T, Fernie AR. Kingdom-wide analysis of the evolution of the plant type III polyketide synthase superfamily. PLANT PHYSIOLOGY 2021; 185:857-875. [PMID: 33793871 PMCID: PMC8133574 DOI: 10.1093/plphys/kiaa086] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/03/2020] [Accepted: 12/07/2020] [Indexed: 05/19/2023]
Abstract
The emergence of type III polyketide synthases (PKSs) was a prerequisite for the conquest of land by the green lineage. Within the PKS superfamily, chalcone synthases (CHSs) provide the entry point reaction to the flavonoid pathway, while LESS ADHESIVE POLLEN 5 and 6 (LAP5/6) provide constituents of the outer exine pollen wall. To study the deep evolutionary history of this key family, we conducted phylogenomic synteny network and phylogenetic analyses of whole-genome data from 126 species spanning the green lineage including Arabidopsis thaliana, tomato (Solanum lycopersicum), and maize (Zea mays). This study thereby combined study of genomic location and context with changes in gene sequences. We found that the two major clades, CHS and LAP5/6 homologs, evolved early by a segmental duplication event prior to the divergence of Bryophytes and Tracheophytes. We propose that the macroevolution of the type III PKS superfamily is governed by whole-genome duplications and triplications. The combined phylogenetic and synteny analyses in this study provide insights into changes in the genomic location and context that are retained for a longer time scale with more recent functional divergence captured by gene sequence alterations.
Affiliation(s)
- Thomas Naake
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany
- Hiroshi A Maeda
  - Department of Botany, University of Wisconsin–Madison, 430 Lincoln Drive, Madison, WI 53706, USA
- Sebastian Proost
  - Laboratory of Molecular Bacteriology, Department of Microbiology and Immunology, Rega Institute, KU Leuven, Herestraat, 3000 Leuven, Belgium
  - VIB-KU Leuven Center for Microbiology, Campus Gasthuisberg, Rega Instituut, Herestraat, 3000 Leuven, Belgium
- Takayuki Tohge
  - Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Alisdair R Fernie
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany
|
35
|
Milne L, Bayer M, Rapazote-Flores P, Mayer CD, Waugh R, Simpson CG. EORNA, a barley gene and transcript abundance database. Sci Data 2021; 8:90. [PMID: 33767193 PMCID: PMC7994555 DOI: 10.1038/s41597-021-00872-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/14/2020] [Accepted: 02/22/2021] [Indexed: 01/31/2023]
Abstract
A high-quality barley gene reference transcript dataset (BaRTv1.0) was used to quantify gene and transcript abundances from 22 RNA-seq experiments, covering 843 separate samples. Using the abundance data, we developed a Barley Expression Database (EORNA*) to underpin a visualisation tool that displays comparative gene and transcript abundance data on demand as transcripts per million (TPM) across all samples and all genes. EORNA provides gene and transcript models for all of the transcripts contained in BaRTv1.0, and these can be conveniently identified through either BaRT or HORVU gene names, or by direct BLAST of query sequences. Browsing the quantification data reveals cultivar-, tissue- and condition-specific gene expression and shows changes in the proportions of individual transcripts that have arisen via alternative splicing. TPM values can be easily extracted to allow users to determine the statistical significance of observed transcript abundance variation among samples or to perform meta-analyses on multiple RNA-seq experiments. * Eòrna is the Scottish Gaelic word for barley.
Affiliation(s)
- Linda Milne
  - Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, UK
- Micha Bayer
  - Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, UK
- Paulo Rapazote-Flores
  - Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, UK
- Claus-Dieter Mayer
  - Biomathematics and Statistics Scotland, University of Aberdeen, Aberdeen, AB25 2ZD, UK
- Robbie Waugh
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, UK
  - Division of Plant Sciences, School of Life Sciences, University of Dundee at the James Hutton Institute, Dundee, DD2 5DA, UK
  - School of Agriculture and Wine & Waite Research Institute, University of Adelaide, Waite Campus, Glen Osmond, SA, 5064, Australia
- Craig G Simpson
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, UK.
|
36
|
A Computational Method to Predict Effects of Residue Mutations on the Catalytic Efficiency of Hydrolases. Catalysts 2021. [DOI: 10.3390/catal11020286] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
Abstract
With scientific and technological advances, a growing body of research has focused on engineering enzymes with enhanced efficiency and activity. Computer-based enzyme modification plays a significant role here, compensating for the time-consuming and labor-intensive nature of experimental methods. In this study, for the first time, we collected and manually curated a data set of hydrolase mutations, including structural information on enzyme-substrate complexes, mutated sites, and kcat/Km values obtained from in vitro assays. We further constructed a classification model using the random forest algorithm to predict the effects of residue mutations on the catalytic efficiency (increase or decrease) of hydrolases. This method achieved impressive performance on a blind test set, with an area under the receiver operating characteristic curve of 0.86 and a Matthews correlation coefficient of 0.659. Our results demonstrate that computational mutagenesis can guide enzyme modification, which may expedite the design of engineered hydrolases.
|
37
|
Yetsko K, Farrell JA, Blackburn NB, Whitmore L, Stammnitz MR, Whilde J, Eastman CB, Ramia DR, Thomas R, Krstic A, Linser P, Creer S, Carvalho G, Devlin MA, Nahvi N, Leandro AC, deMaar TW, Burkhalter B, Murchison EP, Schnitzler C, Duffy DJ. Molecular characterization of a marine turtle tumor epizootic, profiling external, internal and postsurgical regrowth tumors. Commun Biol 2021; 4:152. [PMID: 33526843 PMCID: PMC7851172 DOI: 10.1038/s42003-021-01656-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/03/2020] [Accepted: 12/31/2020] [Indexed: 01/30/2023]
Abstract
Sea turtle populations are under threat from an epizootic tumor disease (animal epidemic) known as fibropapillomatosis. Fibropapillomatosis continues to spread geographically, with prevalence of the disease also growing at many longer-affected sites globally. However, we do not yet understand the precise environmental, mutational and viral events driving fibropapillomatosis tumor formation and progression. Here we perform transcriptomic and immunohistochemical profiling of five fibropapillomatosis tumor types: external new, established and postsurgical regrowth tumors, and internal lung and kidney tumors. We reveal that internal tumors are molecularly distinct from the more common external tumors. However, they have a small number of conserved potentially therapeutically targetable molecular vulnerabilities in common, such as the MAPK, Wnt, TGFβ and TNF oncogenic signaling pathways. These conserved oncogenic drivers recapitulate remarkably well the core pan-cancer drivers responsible for human cancers. Fibropapillomatosis has been considered benign, but metastatic-related transcriptional signatures are strongly activated in kidney and established external tumors. Tumors in turtles with poor outcomes (died/euthanized) have genes associated with apoptosis and immune function suppressed, with these genes providing putative predictive biomarkers. Together, these results offer an improved understanding of fibropapillomatosis tumorigenesis and provide insights into the origins, inter-tumor relationships, and therapeutic treatment for this wildlife epizootic.
Affiliation(s)
- Kelsey Yetsko
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Jessica A Farrell
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
  - Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- Nicholas B Blackburn
  - Department of Human Genetics, School of Medicine, University of Texas Rio Grande Valley, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, School of Medicine, University of Texas Rio Grande Valley, Brownsville, TX, USA
  - Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Liam Whitmore
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
  - Department of Biological Sciences, School of Natural Sciences, Faculty of Science and Engineering, University of Limerick, Limerick, Ireland
- Maximilian R Stammnitz
  - Transmissible Cancer Group, Department of Veterinary Medicine, University of Cambridge, Cambridge, CB3 0ES, UK
- Jenny Whilde
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Catherine B Eastman
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Devon Rollinson Ramia
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Rachel Thomas
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Aleksandar Krstic
  - Systems Biology Ireland & Precision Oncology Ireland, School of Medicine, University College Dublin, Belfield, Dublin, 4, Ireland
- Paul Linser
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Simon Creer
  - Molecular Ecology and Fisheries Genetics Laboratory, School of Biological Sciences, Bangor University, Bangor, Gwynedd, LL57 2UW, UK
- Gary Carvalho
  - Molecular Ecology and Fisheries Genetics Laboratory, School of Biological Sciences, Bangor University, Bangor, Gwynedd, LL57 2UW, UK
- Nina Nahvi
  - Sea Turtle Inc., South Padre Island, TX, USA
- Ana Cristina Leandro
  - Department of Human Genetics, School of Medicine, University of Texas Rio Grande Valley, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, School of Medicine, University of Texas Rio Grande Valley, Brownsville, TX, USA
- Brooke Burkhalter
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
- Elizabeth P Murchison
  - Transmissible Cancer Group, Department of Veterinary Medicine, University of Cambridge, Cambridge, CB3 0ES, UK
- Christine Schnitzler
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA
  - Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- David J Duffy
  - The Whitney Laboratory for Marine Bioscience and Sea Turtle Hospital, University of Florida, St. Augustine, FL, 32080, USA.
  - Department of Biology, University of Florida, Gainesville, FL, 32611, USA.
  - Department of Biological Sciences, School of Natural Sciences, Faculty of Science and Engineering, University of Limerick, Limerick, Ireland.
  - Systems Biology Ireland & Precision Oncology Ireland, School of Medicine, University College Dublin, Belfield, Dublin, 4, Ireland.
  - Molecular Ecology and Fisheries Genetics Laboratory, School of Biological Sciences, Bangor University, Bangor, Gwynedd, LL57 2UW, UK.
|
38
|
Azlan A, Obeidat SM, Theva Das K, Yunus MA, Azzam G. Genome-wide identification of Aedes albopictus long noncoding RNAs and their association with dengue and Zika virus infection. PLoS Negl Trop Dis 2021; 15:e0008351. [PMID: 33481791 PMCID: PMC7872224 DOI: 10.1371/journal.pntd.0008351] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/04/2020] [Revised: 02/09/2021] [Accepted: 11/20/2020] [Indexed: 12/14/2022]
Abstract
The Asian tiger mosquito, Aedes albopictus (Ae. albopictus), is an important vector that transmits arboviruses such as dengue (DENV), Zika (ZIKV) and Chikungunya virus (CHIKV). Long noncoding RNAs (lncRNAs) are known to regulate various biological processes. Knowledge of Ae. albopictus lncRNAs and their functional role in virus-host interactions is still limited. Here, we identified and characterized the lncRNAs in the genome of an arbovirus vector, Ae. albopictus, and evaluated their potential involvement in DENV and ZIKV infection. We used 148 public datasets and identified a total of 10,867 novel lncRNA transcripts, of which 5,809, 4,139, and 919 were intergenic, intronic and antisense, respectively. The Ae. albopictus lncRNAs shared many characteristics with those of other species, such as short length, low GC content, and low sequence conservation. RNA-sequencing of Ae. albopictus cells infected with DENV and ZIKV showed that the expression of lncRNAs was altered upon virus infection. Target prediction analysis revealed that Ae. albopictus lncRNAs may regulate the expression of genes involved in immunity and other metabolic and cellular processes. To verify the role of lncRNAs in virus infection, we generated mutations in lncRNA loci using CRISPR-Cas9, and discovered that mutations in two lncRNA loci, namely XLOC_029733 (novel lncRNA transcript id: lncRNA_27639.2) and LOC115270134 (known lncRNA transcript id: XR_003899061.1), resulted in enhancement of DENV and ZIKV replication. The results presented here provide an important foundation for future studies of lncRNAs and their relationship with virus infection in Ae. albopictus.
Ae. albopictus is an important vector of arboviruses such as dengue and Zika viruses. Studies on virus-host interaction at the gene expression and molecular level are crucial, especially in devising methods to inhibit virus replication in Aedes mosquitoes. Previous reports have shown that, besides protein-coding genes, noncoding RNAs such as lncRNAs are also involved in virus-host interaction. In this study, we report a comprehensive catalog of novel lncRNA transcripts in the genome of Ae. albopictus. We also show that the expression of lncRNAs was altered upon infection with dengue and Zika. Additionally, depletion of certain lncRNAs resulted in increased replication of dengue and Zika, suggesting a potential association of lncRNAs with virus infection. The results of this study provide a new avenue for the investigation of mosquito-virus interactions, especially with respect to noncoding genes.
Affiliation(s)
- Azali Azlan
  - School of Biological Sciences, Universiti Sains Malaysia, Penang, Malaysia
- Sattam M. Obeidat
  - School of Biological Sciences, Universiti Sains Malaysia, Penang, Malaysia
- Kumitaa Theva Das
  - Infectomics Cluster, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Bertam, Kepala Batas, Pulau Pinang, Malaysia
- Muhammad Amir Yunus
  - Infectomics Cluster, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Bertam, Kepala Batas, Pulau Pinang, Malaysia
- Ghows Azzam
  - School of Biological Sciences, Universiti Sains Malaysia, Penang, Malaysia
|
39
|
Rowe M, Whittington E, Borziak K, Ravinet M, Eroukhmanoff F, Sætre GP, Dorus S. Molecular Diversification of the Seminal Fluid Proteome in a Recently Diverged Passerine Species Pair. Mol Biol Evol 2020; 37:488-506. [PMID: 31665510 PMCID: PMC6993853 DOI: 10.1093/molbev/msz235] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/14/2022]
Abstract
Seminal fluid proteins (SFPs) mediate an array of postmating reproductive processes that influence fertilization and fertility. As such, it is widely held that SFPs may contribute to postmating, prezygotic reproductive barriers between closely related taxa. We investigated seminal fluid (SF) diversification in a recently diverged passerine species pair (Passer domesticus and Passer hispaniolensis) using a combination of proteomic and comparative evolutionary genomic approaches. First, we characterized and compared the SF proteome of the two species, revealing consistencies with known aspects of SFP biology and function in other taxa, including the presence and diversification of proteins involved in immunity and sperm maturation. Second, using whole-genome resequencing data, we assessed patterns of genomic differentiation between house and Spanish sparrows. These analyses detected divergent selection on immunity-related SF genes and positive selective sweeps in regions containing a number of SF genes that also exhibited protein abundance diversification between species. Finally, we analyzed the molecular evolution of SFPs across 11 passerine species and found a significantly higher rate of positive selection in SFPs compared with the rest of the genome, as well as significant enrichments for functional pathways related to immunity in the set of positively selected SF genes. Our results suggest that selection on immunity pathways is an important determinant of passerine SF composition and evolution. Assessing the role of immunity genes in speciation in other recently diverged taxa should be prioritized given the potential role for immunity-related proteins in reproductive incompatibilities in Passer sparrows.
Affiliation(s)
- Melissah Rowe
  - Natural History Museum, University of Oslo, Oslo, Norway
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
  - Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
- Emma Whittington
  - Center for Reproductive Evolution, Department of Biology, Syracuse University, Syracuse, NY
- Kirill Borziak
  - Center for Reproductive Evolution, Department of Biology, Syracuse University, Syracuse, NY
- Mark Ravinet
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Fabrice Eroukhmanoff
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Glenn-Peter Sætre
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Steve Dorus
  - Center for Reproductive Evolution, Department of Biology, Syracuse University, Syracuse, NY
|
40
|
Chang HY, Tong CBS. Identification of Candidate Genes Involved in Fruit Ripening and Crispness Retention Through Transcriptome Analyses of a 'Honeycrisp' Population. PLANTS (BASEL, SWITZERLAND) 2020; 9:E1335. [PMID: 33050481 PMCID: PMC7650588 DOI: 10.3390/plants9101335] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/12/2020] [Revised: 09/23/2020] [Accepted: 10/02/2020] [Indexed: 02/05/2023]
Abstract
Crispness retention is a postharvest trait that fruit of the 'Honeycrisp' apple and some of its progeny possess. To investigate the molecular mechanisms of crispness retention, progeny individuals derived from a 'Honeycrisp' × MN1764 population with fruit that either retain crispness (named "Retain"), lose crispness (named "Lose"), or that are not crisp at harvest (named "Non-crisp") were selected for transcriptomic comparisons. Differentially expressed genes (DEGs) were identified using RNA-Seq, and the expression levels of the DEGs were validated using nCounter®. Functional annotation of the DEGs revealed distinct ripening behaviors between fruit of the "Retain" and "Non-crisp" individuals, characterized by opposing expression patterns of auxin- and ethylene-related genes. However, both types of genes were highly expressed in the fruit of "Lose" individuals and 'Honeycrisp', which led to the potential involvements of genes encoding auxin-conjugating enzyme (GH3), ubiquitin ligase (ETO), and jasmonate O-methyltransferase (JMT) in regulating fruit ripening. Cell wall-related genes also differentiated the phenotypic groups; greater numbers of cell wall synthesis genes were highly expressed in fruit of the "Retain" individuals and 'Honeycrisp' when compared with "Non-crisp" individuals and MN1764. On the other hand, the phenotypic differences between fruit of the "Retain" and "Lose" individuals could be attributed to the functioning of fewer cell wall-modifying genes. A cell wall-modifying gene, MdXTH, was consistently identified as differentially expressed in those fruit over two years in this study, so is a major candidate for crispness retention.
Affiliation(s)
- Hsueh-Yuan Chang
  - Department of Horticultural Science, University of Minnesota, Saint Paul, MN 55108, USA
|
41
|
Du Z, He Y, Li J, Uversky VN. DeepAdd: Protein function prediction from k-mer embedding and additional features. Comput Biol Chem 2020; 89:107379. [PMID: 33011616 DOI: 10.1016/j.compbiolchem.2020.107379] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/13/2019] [Revised: 09/15/2020] [Accepted: 09/17/2020] [Indexed: 10/23/2022]
Abstract
With the application of new high-throughput sequencing technology, a large number of protein sequences is becoming available. Determining the functional characteristics of these proteins experimentally is expensive and time-consuming. Furthermore, at the organismal level, this kind of experimental functional analysis can be conducted only for a few selected model organisms. Computational function prediction methods can be used to fill this gap. The functions of proteins are classified by Gene Ontology (GO), which contains more than 40,000 classes in three domains: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Additionally, since proteins have many functions, function prediction represents a multi-label and multi-class problem. We developed a new method to predict protein function from sequence. To this end, a natural language model was used to generate word embeddings of the sequence, from which features were learned by deep learning, together with additional features to locate every protein. Our method uses the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and observe a noticeable improvement over several algorithms, such as FFPred, DeepGO, GoFDR and other methods compared on the CAFA3 datasets.
Affiliation(s)
- Zhihua Du
  - Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Guangdong Province, PR China.
- Yufeng He
  - Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Guangdong Province, PR China
- Jianqiang Li
  - Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Guangdong Province, PR China
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL, USA
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL, USA
  - Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Institutskaya Str., 7, Pushchino, Moscow Region, 142290, Russia.
|
42
|
Vieira Velloso CC, de Oliveira CA, Gomes EA, Lana UGDP, de Carvalho CG, Guimarães LJM, Pastina MM, de Sousa SM. Genome-guided insights of tropical Bacillus strains efficient in maize growth promotion. FEMS Microbiol Ecol 2020; 96:5891423. [DOI: 10.1093/femsec/fiaa157] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/26/2020] [Accepted: 08/10/2020] [Indexed: 02/06/2023]
Abstract
Plant growth-promoting bacteria (PGPB) are an efficient and sustainable alternative for mitigating biotic and abiotic stresses in maize. This work aimed to sequence the genomes of two Bacillus strains (B116 and B119) and to evaluate their plant growth-promoting (PGP) potential in vitro and their capacity to trigger specific responses in different maize genotypes. Analysis of the genomic sequences revealed the presence of genes related to PGP activities. Both strains were able to produce biofilm and exopolysaccharides, and to solubilize phosphate. Strain B119 produced higher amounts of IAA-like molecules and phytase, whereas B116 was capable of producing more acid phosphatase. Maize seedlings inoculated with either strain were subjected to polyethylene glycol-induced osmotic stress and showed an increase in thicker roots, which resulted in a higher root dry weight. The inoculation also increased the total dry weight and modified the root morphology of 16 out of 21 maize genotypes, indicating that the bacteria triggered specific responses depending on plant genotype background. Maize root remodeling was related to growth promotion mechanisms found in genomic prediction and confirmed by in vitro analysis. Overall, the genomic and phenotypic characterization brought new insights into the mechanisms of PGP in tropical Bacillus.
Affiliation(s)
- Camila Cristina Vieira Velloso
  - Universidade Federal de São João del-Rei, Rua Padre João Pimentel, 80 - Dom Bosco, São João del-Rei - MG, 36301-158, Brazil
- Christiane Abreu de Oliveira
  - Centro Universitário de Sete Lagoas, Avenida Marechal Castelo Branco, 2765 - Santo Antonio, Sete Lagoas - MG, 35701-242, Brazil
  - Embrapa Milho e Sorgo, Rodovia MG 424 Km 45, Zona Rural, Sete Lagoas - MG, 35701-970, Brazil
- Eliane Aparecida Gomes
  - Embrapa Milho e Sorgo, Rodovia MG 424 Km 45, Zona Rural, Sete Lagoas - MG, 35701-970, Brazil
- Ubiraci Gomes de Paula Lana
  - Centro Universitário de Sete Lagoas, Avenida Marechal Castelo Branco, 2765 - Santo Antonio, Sete Lagoas - MG, 35701-242, Brazil
  - Embrapa Milho e Sorgo, Rodovia MG 424 Km 45, Zona Rural, Sete Lagoas - MG, 35701-970, Brazil
- Chainheny Gomes de Carvalho
  - Centro Universitário de Sete Lagoas, Avenida Marechal Castelo Branco, 2765 - Santo Antonio, Sete Lagoas - MG, 35701-242, Brazil
- Maria Marta Pastina
  - Universidade Federal de São João del-Rei, Rua Padre João Pimentel, 80 - Dom Bosco, São João del-Rei - MG, 36301-158, Brazil
  - Embrapa Milho e Sorgo, Rodovia MG 424 Km 45, Zona Rural, Sete Lagoas - MG, 35701-970, Brazil
- Sylvia Morais de Sousa
  - Universidade Federal de São João del-Rei, Rua Padre João Pimentel, 80 - Dom Bosco, São João del-Rei - MG, 36301-158, Brazil
  - Centro Universitário de Sete Lagoas, Avenida Marechal Castelo Branco, 2765 - Santo Antonio, Sete Lagoas - MG, 35701-242, Brazil
  - Embrapa Milho e Sorgo, Rodovia MG 424 Km 45, Zona Rural, Sete Lagoas - MG, 35701-970, Brazil
43
Zhu C, Hu W, Zhao M, Huang MY, Cheng HZ, He JP, Liu JL. The Pre-Implantation Embryo Induces Uterine Inflammatory Reaction in Mice. Reprod Sci 2020; 28:60-68. [PMID: 32651899 DOI: 10.1007/s43032-020-00259-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/07/2020] [Accepted: 07/02/2020] [Indexed: 10/23/2022]
Abstract
It is well established that uterine function during the peri-implantation period is precisely regulated by ovarian estrogen and progesterone. The embryo enters the uterine cavity before implantation. However, the impact of the pre-implantation embryo on uterine function is largely unknown. In the present study, we performed RNA-seq analysis of mouse uterus on the morning of day 4 of natural pregnancy (with embryos in the uterus) and pseudo-pregnancy (without embryos in the uterus). We found that 146 genes were upregulated and 77 genes were downregulated by the pre-implantation embryo. Gene ontology and gene network analysis highlighted the activation of an inflammatory reaction in the uterus. By examining the promoter regions of differentially expressed genes, we found that NF-kappaB was a causal transcription factor. Finally, we validated 4 inflammation-related genes by quantitative RT-PCR. These 4 genes are likely the main mediators of the inflammatory reaction in the uterus triggered by the pre-implantation embryo. Our data indicate that the pre-implantation embryo causes a uterine inflammatory reaction, which in turn might contribute to the establishment of uterine receptivity and embryo implantation.
Affiliation(s)
- Can Zhu
  - College of Veterinary Medicine, South China Agricultural University, No. 483 Wushan Road, Tianhe District, Guangzhou, 510642, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Wei Hu
  - College of Life Sciences and Resource Environment, Yichun University, Yichun, 336000, China
- Miao Zhao
  - College of Veterinary Medicine, South China Agricultural University, No. 483 Wushan Road, Tianhe District, Guangzhou, 510642, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Ming-Yu Huang
  - College of Veterinary Medicine, South China Agricultural University, No. 483 Wushan Road, Tianhe District, Guangzhou, 510642, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Hao-Zhuang Cheng
  - College of Veterinary Medicine, South China Agricultural University, No. 483 Wushan Road, Tianhe District, Guangzhou, 510642, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Jia-Peng He
  - College of Veterinary Medicine, South China Agricultural University, No. 483 Wushan Road, Tianhe District, Guangzhou, 510642, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Ji-Long Liu
  - College of Veterinary Medicine, South China Agricultural University, No. 483 Wushan Road, Tianhe District, Guangzhou, 510642, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
44
Rosa BA, Choi YJ, McNulty SN, Jung H, Martin J, Agatsuma T, Sugiyama H, Le TH, Doanh PN, Maleewong W, Blair D, Brindley PJ, Fischer PU, Mitreva M. Comparative genomics and transcriptomics of 4 Paragonimus species provide insights into lung fluke parasitism and pathogenesis. Gigascience 2020; 9:giaa073. [PMID: 32687148 PMCID: PMC7370270 DOI: 10.1093/gigascience/giaa073] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/27/2019] [Revised: 03/19/2020] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Paragonimus spp. (lung flukes) are among the most injurious foodborne helminths, infecting ∼23 million people and subjecting ∼292 million to infection risk. Paragonimiasis is acquired from infected undercooked crustaceans and primarily affects the lungs but often causes lesions elsewhere including the brain. The disease is easily mistaken for tuberculosis owing to similar pulmonary symptoms, and accordingly, diagnostics are in demand. RESULTS We assembled, annotated, and compared draft genomes of 4 prevalent and distinct Paragonimus species: Paragonimus miyazakii, Paragonimus westermani, Paragonimus kellicotti, and Paragonimus heterotremus. Genomes ranged from 697 to 923 Mb, included 12,072-12,853 genes, and were 71.6-90.1% complete according to BUSCO. Orthologous group analysis spanning 21 species (lung, liver, and blood flukes, additional platyhelminths, and hosts) provided insights into lung fluke biology. We identified 256 lung fluke-specific and conserved orthologous groups with consistent transcriptional adult-stage Paragonimus expression profiles and enriched for iron acquisition, immune modulation, and other parasite functions. Previously identified Paragonimus diagnostic antigens were matched to genes, providing an opportunity to optimize and ensure pan-Paragonimus reactivity for diagnostic assays. CONCLUSIONS This report provides advances in molecular understanding of Paragonimus and underpins future studies into the biology, evolution, and pathogenesis of Paragonimus and related foodborne flukes. We anticipate that these novel genomic and transcriptomic resources will be invaluable for future lung fluke research.
Affiliation(s)
- Bruce A Rosa
  - Department of Internal Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
- Young-Jun Choi
  - Department of Internal Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
- Samantha N McNulty
  - The McDonnell Genome Institute at Washington University, School of Medicine, 4444 Forest Park Ave, St. Louis, MO 63108, USA
- Hyeim Jung
  - Department of Internal Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
- John Martin
  - Department of Internal Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
- Takeshi Agatsuma
  - Department of Environmental Health Sciences, Kochi Medical School, Kohasu, Oko-cho 185-1, Nankoku, Kochi, 783-8505, Japan
- Hiromu Sugiyama
  - Laboratory of Helminthology, Department of Parasitology, National Institute of Infectious Diseases, 1-23-1 Toyama, Shinjuku-ku, Tokyo 162-8640, Japan
- Thanh Hoa Le
  - Department of Immunology, Institute of Biotechnology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cay Giay, Ha Noi 10307, Vietnam
- Pham Ngoc Doanh
  - Institute of Ecology and Biological Resources, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cay Giay, Ha Noi 10307, Vietnam
  - Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cay Giay, Ha Noi 10307, Vietnam
- Wanchai Maleewong
  - Research and Diagnostic Center for Emerging Infectious Diseases, Khon Kaen University, 123 Moo 16 Mittraphap Rd., Nai-Muang, Muang District, Khon Kaen 40002, Thailand
  - Department of Parasitology, Faculty of Medicine, Khon Kaen University, 123 Moo 16 Mittraphap Rd., Nai-Muang, Muang District, Khon Kaen 40002, Thailand
- David Blair
  - College of Marine and Environmental Sciences, James Cook University, 1 James Cook Drive, Townsville, Queensland 4811, Australia
- Paul J Brindley
  - Departments of Microbiology, Immunology and Tropical Medicine, and Research Center for Neglected Diseases of Poverty, and Pathology School of Medicine & Health Sciences, George Washington University, Ross Hall 2300 Eye Street, NW, Washington, DC 20037, USA
- Peter U Fischer
  - Department of Internal Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
- Makedonka Mitreva
  - Department of Internal Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
  - The McDonnell Genome Institute at Washington University, School of Medicine, 4444 Forest Park Ave, St. Louis, MO 63108, USA
45
Bouziane H, Chouarfia A. Use of Chou's 5-steps rule to predict the subcellular localization of gram-negative and gram-positive bacterial proteins by multi-label learning based on gene ontology annotation and profile alignment. J Integr Bioinform 2020; 18:51-79. [PMID: 32598314 PMCID: PMC8035964 DOI: 10.1515/jib-2019-0091] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/08/2019] [Accepted: 04/08/2020] [Indexed: 12/31/2022] Open
Abstract
To date, many proteins generated by large-scale genome sequencing projects remain uncharacterized and are subject to intensive investigation by both experimental and computational means. Knowledge of protein subcellular localization (SCL) is of key importance for elucidating protein function. However, predicting it remains a challenging task, especially for multi-site proteins, which shuttle between cell compartments to perform their biological functions, and for proteins that lack significant homology to proteins of known subcellular location. Owing to their low cost and reasonable accuracy, machine learning-based methods have gained much attention in this context, helped by the availability of a plethora of biological databases and annotated proteins for analysis and benchmarking. Various predictive models have been proposed to tackle the SCL problem using different protein sequence features pertaining to subcellular localization; however, the overwhelming majority of them focus on single localization and cover very few cellular locations. Prediction has mainly been based on sorting signals, amino acid compositions, and homology. To improve prediction quality, the focus is now on knowledge extracted from annotation databases, such as protein-protein interactions and Gene Ontology (GO) functional domain annotation, which has recently become widely adopted and essential information for learning systems. To address this problem, in the present study we treated SCL prediction as a multi-label learning problem and attempted to label both single-site and multi-site unannotated bacterial protein sequences by mining protein homology relationships using both the GO terms of protein homologs and PSI-BLAST profiles. Experiments using 5-fold cross-validation on the benchmark datasets showed a significant improvement in the results obtained by the proposed consensus multi-label prediction model, which discriminates six compartments for Gram-negative and five compartments for Gram-positive bacterial proteins.
Affiliation(s)
- Hafida Bouziane
  - Département d’Informatique, Université des Sciences et de la Technologie d’Oran Mohamed Boudiaf, USTO-MB BP 1505, El M’Naouer, 31000, Oran, Algeria
- Abdallah Chouarfia
  - Département d’Informatique, Université des Sciences et de la Technologie d’Oran Mohamed Boudiaf, USTO-MB BP 1505, El M’Naouer, 31000, Oran, Algeria
46
Sah PP, Bhattacharya S, Banerjee A, Ray S. Identification of novel therapeutic target and epitopes through proteome mining from essential hypothetical proteins in Salmonella strains: An In silico approach towards antivirulence therapy and vaccine development. Infection, Genetics and Evolution 2020; 83:104315. [PMID: 32276082 DOI: 10.1016/j.meegid.2020.104315] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/06/2019] [Revised: 03/29/2020] [Accepted: 04/02/2020] [Indexed: 10/24/2022]
Abstract
Salmonella strains are responsible for a high mortality rate from foodborne illness worldwide, which necessitates the discovery of novel drugs and vaccines. Essential hypothetical proteins (EHPs), whose structures and functions were previously unknown, could serve as potential therapeutic and vaccine targets. Antivirulence therapy, which uses virulence factors as drug targets, is expected to emerge as a superior therapeutic approach. This study annotated the biological functions of 96 of the 106 essential hypothetical proteins in five strains of Salmonella and classified them into nine important protein categories. Thirty-four virulence factors were predicted among the EHPs, of which 11 were identified as pathogen-specific potential drug targets for antivirulence therapy. These targets were non-homologous to both the human and gut microbiota proteomes, avoiding cross-reactivity with them. Seven of the identified targets had druggable properties, while the remaining four were novel targets. Four identified targets (DEG10320148, DEG10110027, DEG10110040 and DEG10110142) had antigenic properties and were further classified as two membrane-bound lipid-binding transmembrane proteins, a zinc-binding membrane protein and an extracellular glycosylase. These targets could potentially be used for the development of subunit vaccines. The study further identified 11 highly conserved and exposed epitope sequences from these 4 vaccine targets. The three-dimensional structures of the vaccine targets were also elucidated, highlighting the conformation of the epitopes. This study identified potential therapeutic targets for antivirulence therapy against Salmonella. It could therefore stimulate novel drug design as well as provide important leads for new Salmonella vaccine development.
Affiliation(s)
- Arundhati Banerjee
  - Department of Biochemistry and Biophysics, University of Kalyani, Kalyani, Nadia, India
- Sujay Ray
  - Amity Institute of Biotechnology, Amity University, Kolkata, India
47
Bhattacharya S, Ghosh P, Banerjee D, Banerjee A, Ray S. In Silico Drug Target Discovery Through Proteome Mining from M. tuberculosis: An Insight into Antivirulent Therapy. Comb Chem High Throughput Screen 2020; 23:253-268. [PMID: 32072892 DOI: 10.2174/1386207323666200219120903] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/02/2019] [Revised: 01/23/2020] [Accepted: 02/01/2020] [Indexed: 11/22/2022]
Abstract
AIM AND OBJECTIVE One of the challenges to conventional therapies against Mycobacterium tuberculosis is the development of multi-drug resistant pathogenic strains. This study was undertaken to explore new therapeutic targets for antivirulence therapy by utilizing the pathogen's essential hypothetical proteins that serve as virulence factors, which is the essential first step in novel drug design. METHODS Functional annotation of essential hypothetical proteins from Mycobacterium tuberculosis (H37Rv strain) was performed through domain annotation, Gene Ontology analysis, physicochemical characterization and prediction of subcellular localization. Virulence factors among the essential hypothetical proteins were predicted, among which pathogen-specific drug target candidates, non-homologous to human and gut microbiota proteins, were identified. This was followed by druggability and spectrum analysis of the identified targets. RESULTS AND CONCLUSION The study assigned functions to 83 essential hypothetical proteins of Mycobacterium tuberculosis, among which 25 were identified as virulence factors. Of these 25, 12 virulence factors were identified as potential pathogen-specific drug target candidates. Nine potential targets had druggable properties and the remaining three were considered novel targets. Exploration of these targets will provide new insights into future drug development. Characterization of subcellular localization revealed that most of the predicted targets were cytoplasmic, which could be ideal for intracellular drugs, while two drug targets were membrane-bound, ideal for vaccines. Spectrum analysis identified one broad-spectrum and 11 narrow-spectrum targets. This study could, therefore, stimulate the design of novel therapeutics for antivirulence therapy, which have the potential to replace conventional antibiotic therapies and overcome the lethality of antibiotic-resistant strains.
Affiliation(s)
- Puja Ghosh
  - Amity Institute of Biotechnology, Amity University, Kolkata, India
- Arundhati Banerjee
  - Department of Biochemistry and Biophysics, University of Kalyani, Kalyani, Nadia, India
- Sujay Ray
  - Amity Institute of Biotechnology, Amity University, Kolkata, India
48
Choi YJ, Fontenla S, Fischer PU, Le TH, Costábile A, Blair D, Brindley PJ, Tort JF, Cabada MM, Mitreva M. Adaptive Radiation of the Flukes of the Family Fasciolidae Inferred from Genome-Wide Comparisons of Key Species. Mol Biol Evol 2020; 37:84-99. [PMID: 31501870 DOI: 10.1093/molbev/msz204] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 02/06/2023] Open
Abstract
Liver and intestinal flukes of the family Fasciolidae cause zoonotic food-borne infections that impact both agriculture and human health throughout the world. Their evolutionary history and the genetic basis underlying their phenotypic and ecological diversity are not well understood. To close that knowledge gap, we compared the whole genomes of Fasciola hepatica, Fasciola gigantica, and Fasciolopsis buski and determined that the split between Fasciolopsis and Fasciola took place ∼90 Ma in the late Cretaceous period, and that between 65 and 50 Ma an intermediate host switch and a shift from intestinal to hepatic habitats occurred in the Fasciola lineage. The rapid climatic and ecological changes occurring during this period may have contributed to the adaptive radiation of these flukes. Expansion of cathepsins, fatty-acid-binding proteins, protein disulfide-isomerases, and molecular chaperones in the genus Fasciola highlights the significance of excretory-secretory proteins in these liver-dwelling flukes. Fasciola hepatica and Fasciola gigantica diverged ∼5 Ma near the Miocene-Pliocene boundary that coincides with reduced faunal exchange between Africa and Eurasia. Severe decrease in the effective population size ∼10 ka in Fasciola is consistent with a founder effect associated with its recent global spread through ruminant domestication. G-protein-coupled receptors may have key roles in adaptation of physiology and behavior to new ecological niches. This study has provided novel insights about the genome evolution of these important pathogens, has generated genomic resources to enable development of improved interventions and diagnosis, and has laid a solid foundation for genomic epidemiology to trace drug resistance and to aid surveillance.
Affiliation(s)
- Young-Jun Choi
  - McDonnell Genome Institute at Washington University in St. Louis, St. Louis, MO
- Santiago Fontenla
  - Departamento de Genética, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- Peter U Fischer
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO
- Thanh Hoa Le
  - Immunology Department, Institute of Biotechnology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Alicia Costábile
  - Departamento de Genética, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- David Blair
  - College of Science and Engineering, James Cook University, Townsville, QLD, Australia
- Paul J Brindley
  - Department of Microbiology, Immunology and Tropical Medicine, and Research Center for Neglected Diseases of Poverty, School of Medicine & Health Sciences, George Washington University, Washington, DC
- Jose F Tort
  - Departamento de Genética, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- Miguel M Cabada
  - Division of Infectious Diseases, Department of Medicine, School of Medicine, University of Texas Medical Branch, Galveston, TX
- Makedonka Mitreva
  - McDonnell Genome Institute at Washington University in St. Louis, St. Louis, MO
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO
49
Whittington E, Karr TL, Mongue AJ, Dorus S, Walters JR. Evolutionary Proteomics Reveals Distinct Patterns of Complexity and Divergence between Lepidopteran Sperm Morphs. Genome Biol Evol 2020; 11:1838-1846. [PMID: 31268533 PMCID: PMC6607854 DOI: 10.1093/gbe/evz080] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Accepted: 04/05/2019] [Indexed: 12/18/2022] Open
Abstract
Spermatozoa are one of the most strikingly diverse animal cell types. One poorly understood example of this diversity is sperm heteromorphism, where males produce multiple distinct morphs of sperm in a single ejaculate. Typically, only one morph is capable of fertilization and the function of the nonfertilizing morph, called parasperm, remains to be elucidated. Sperm heteromorphism has multiple independent origins, including Lepidoptera (moths and butterflies), where males produce a fertilizing eupyrene sperm and an apyrene parasperm, which lacks a nucleus and nuclear DNA. Here we report a comparative proteomic analysis of eupyrene and apyrene sperm between two distantly related lepidopteran species, the monarch butterfly (Danaus plexippus) and Carolina sphinx moth (Manduca sexta). In both species, we identified ∼700 sperm proteins, with half present in both morphs and the majority of the remainder observed only in eupyrene sperm. Apyrene sperm thus have a distinctly less complex proteome. Gene ontology (GO) analysis revealed proteins shared between morphs tend to be associated with canonical sperm cell structures (e.g., flagellum) and metabolism (e.g., ATP production). GO terms for morph-specific proteins broadly reflect known structural differences, but also suggest a role for apyrene sperm in modulating female neurobiology. Comparative analysis indicates that proteins shared between morphs are most conserved between species as components of sperm, whereas morph-specific proteins turn over more quickly, especially in apyrene sperm. The rapid divergence of apyrene sperm content is consistent with a relaxation of selective constraints associated with fertilization and karyogamy. On the other hand, parasperm generally exhibit greater evolutionary lability, and our observations may therefore reflect adaptive responses to shifting regimes of sexual selection.
Affiliation(s)
- Emma Whittington
  - Center for Reproductive Evolution, Department of Biology, Syracuse University
- Timothy L Karr
  - Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University
- Andrew J Mongue
  - Department of Ecology and Evolutionary Biology, University of Kansas
- Steve Dorus
  - Center for Reproductive Evolution, Department of Biology, Syracuse University
- James R Walters
  - Department of Ecology and Evolutionary Biology, University of Kansas
50
Garcia DC, Cheng X, Land ML, Standaert RF, Morrell-Falvey JL, Doktycz MJ. Computationally Guided Discovery and Experimental Validation of Indole-3-acetic Acid Synthesis Pathways. ACS Chem Biol 2019; 14:2867-2875. [PMID: 31693336 DOI: 10.1021/acschembio.9b00725] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Abstract
Elucidating the interaction networks associated with secondary metabolite production in microorganisms is an ongoing challenge made all the more daunting by the rate at which DNA sequencing technology reveals new genes and potential pathways. Developing the culturing methods, expression conditions, and genetic systems needed for validating pathways in newly discovered microorganisms is often not possible. Therefore, new tools and techniques are needed for defining complex metabolic pathways. Here, we describe an in vitro computationally assisted pathway description approach that employs bioinformatic searches of genome databases, protein structural modeling, and protein-ligand-docking simulations to predict the gene products most likely to be involved in a particular secondary metabolite production pathway. This information is then used to direct in vitro reconstructions of the pathway and subsequent confirmation of pathway activity using crude enzyme preparations. As a test system, we elucidated the pathway for biosynthesis of indole-3-acetic acid (IAA) in the plant-associated microbe Pantoea sp. YR343. This organism is capable of metabolizing tryptophan into the plant phytohormone IAA. BLAST analyses identified a likely three-step pathway involving an amino transferase, an indole pyruvate decarboxylase, and a dehydrogenase. However, multiple candidate enzymes were identified at each step, resulting in a large number of potential pathway reconstructions (32 different enzyme combinations). Our approach shows the effectiveness of crude extracts to rapidly elucidate enzymes leading to functional pathways. Results are compared to affinity purified enzymes for select combinations and found to yield similar relative activities. Further, in vitro testing of the pathway reconstructions revealed the "underground" nature of IAA metabolism in Pantoea sp. YR343 and the various mechanisms used to produce IAA. Importantly, our experiments illustrate the scalable integration of computational tools and cell-free enzymatic reactions to identify and validate metabolic pathways in a broadly applicable manner.
Affiliation(s)
- David C. Garcia
  - Biological and Nanoscale Systems Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
  - Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, Tennessee 37996-4519, United States
- Xiaolin Cheng
  - College of Pharmacy, The Ohio State University, Columbus, Ohio 43210, United States
- Miriam L. Land
  - Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Robert F. Standaert
  - Biological and Nanoscale Systems Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
  - Department of Chemistry, East Tennessee State University, Johnson City, Tennessee 37604, United States
- Jennifer L. Morrell-Falvey
  - Biological and Nanoscale Systems Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Mitchel J. Doktycz
  - Biological and Nanoscale Systems Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
  - Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, Tennessee 37996-4519, United States