1
Lu J, Xie Y, Li C, Yang J, Fu J. Tensor decomposition reveals trans-regulated gene modules in maize drought response. J Genet Genomics 2024:S1673-8527(24)00285-6. [PMID: 39522680 DOI: 10.1016/j.jgg.2024.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 10/22/2024] [Accepted: 10/24/2024] [Indexed: 11/16/2024]
Abstract
When plants respond to drought stress, dynamic cellular changes occur, accompanied by alterations in gene expression, which often act through trans-regulation. However, the detection of trans-acting genetic variants and networks of genes is challenged by the large number of genes and markers. Using a tensor decomposition method, we identify trans-acting expression quantitative trait loci (trans-eQTLs) linked to gene modules, rather than individual genes, which were associated with maize drought response. Module-to-trait association analysis demonstrates that half of the modules are relevant to drought-related traits. Genome-wide association studies of the expression patterns of each module identify 286 trans-eQTLs linked to drought-responsive modules, the majority of which cannot be detected based on individual gene expression. Notably, the trans-eQTLs located in the regions selected during maize improvement tend towards relatively strong selection. We further prioritize the genes that affect the transcriptional regulation of multiple genes in trans, as exemplified by two transcription factor genes. Our analyses highlight that multidimensional reduction could facilitate the identification of trans-acting variations in gene expression in response to dynamic environments and serve as a promising technique for high-order data processing in future crop breeding.
Affiliation(s)
- Jiawen Lu
- State Key Laboratory of Crop Gene Resources and Breeding, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Yuxin Xie
- State Key Laboratory of Crop Gene Resources and Breeding, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Chunhui Li
- State Key Laboratory of Crop Gene Resources and Breeding, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Jinliang Yang
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA
- Junjie Fu
- State Key Laboratory of Crop Gene Resources and Breeding, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
2
Polyong CP, Roytrakul S, Sirivarasai J, Yingratanasuk T, Thetkathuek A. Novel Serum Proteomes Expressed from Benzene Exposure Among Gasoline Station Attendants. Biomark Insights 2024; 19:11772719241259604. [PMID: 38868168 PMCID: PMC11168042 DOI: 10.1177/11772719241259604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 05/17/2024] [Indexed: 06/14/2024] Open
Abstract
Background: Research on the proteomic impact of benzene exposure in fuel station employees remains sparse, underscoring the need for detailed health impact assessments focusing on biomarker evaluation. Objectives: This investigation aimed to analyze the differences in blood parameters and serum proteomes resulting from benzene exposure between gasoline station attendants (B-GSA) and a control group. Design and methods: A cross-sectional analytical study was conducted with 96 participants, comprising 54 in the B-GSA group and 42 in the control group. The methodology employed included an interview questionnaire alongside urine and blood sample collections. The urine samples were analyzed for trans,trans-muconic acid (t,t-MA) levels, while the blood samples underwent complete blood count analysis and proteome profiling. Results: Post-shift analysis indicated that the B-GSA group exhibited significantly higher levels of t,t-MA and monocytes compared to the control group (P < .05). Proteome quantification identified 1448 proteins differentially expressed between the B-GSA and control groups. Among these, 20 proteins correlated with the levels of t,t-MA in urine. Notably, 4 proteins demonstrated more than a 2-fold down-regulation in the B-GSA group: HBS1-like, non-structural maintenance of chromosomes element 1 homolog, proprotein convertase subtilisin/kexin type 4, and zinc finger protein 658. The KEGG pathway analysis revealed associations with apoptosis, cancer pathways, p53 signaling, and the TNF signaling pathway. Conclusion: The changes in these 4 significant proteins may elucidate the molecular mechanisms underlying benzene toxicity and suggest their potential as biomarkers for benzene poisoning in future assessments.
Affiliation(s)
- Chan Pattama Polyong
- Occupational Health and Safety Program, Faculty of Science and Technology, Bansomdejchaopraya Rajabhat University, Bangkok, Thailand
- Sittiruk Roytrakul
- National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency, Pathum Thani, Thailand
- Jintana Sirivarasai
- Nutrition Division, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Tanongsak Yingratanasuk
- Department of Industrial Hygiene and Safety, Faculty of Public Health, Burapha University, Chonburi, Thailand
- Anamai Thetkathuek
- Department of Industrial Hygiene and Safety, Faculty of Public Health, Burapha University, Chonburi, Thailand
3
Goldberg JK, Olcerst A, McKibben M, Hare JD, Barker MS, Bronstein JL. A de novo long-read genome assembly of the sacred datura plant (Datura wrightii) reveals a role of tandem gene duplications in the evolution of herbivore-defense response. BMC Genomics 2024; 25:15. [PMID: 38166627 PMCID: PMC10759348 DOI: 10.1186/s12864-023-09894-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 12/11/2023] [Indexed: 01/05/2024] Open
Abstract
The sacred datura plant (Solanales: Solanaceae: Datura wrightii) has been used to study plant-herbivore interactions for decades. The wealth of information that has resulted leads it to have potential as a model system for studying the ecological and evolutionary genomics of these interactions. We present a de novo Datura wrightii genome assembled using PacBio HiFi long-reads. Our assembly is highly complete and contiguous (N50 = 179 Mb, BUSCO Complete = 97.6%). We successfully detected a previously documented ancient whole genome duplication using our assembly and have classified the gene duplication history that generated its coding sequence content. We use it as the basis for a genome-guided differential expression analysis to identify the induced responses of this plant to one of its specialized herbivores (Coleoptera: Chrysomelidae: Lema daturaphila). We find over 3000 differentially expressed genes associated with herbivory and that elevated expression levels of over 200 genes last for several days. We also combined our analyses to determine the role that different gene duplication categories have played in the evolution of Datura-herbivore interactions. We find that tandem duplications have expanded multiple functional groups of herbivore responsive genes with defensive functions, including UGT-glycosyltransferases, oxidoreductase enzymes, and peptidase inhibitors. Overall, our results expand our knowledge of herbivore-induced plant transcriptional responses and the evolutionary history of the underlying herbivore-response genes.
Affiliation(s)
- Jay K Goldberg
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA.
- Aaron Olcerst
- Department of Entomology, University of California Riverside, Riverside, CA, USA
- Michael McKibben
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- J Daniel Hare
- Department of Entomology, University of California Riverside, Riverside, CA, USA
- Michael S Barker
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Judith L Bronstein
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
4
Cunsolo V, Di Francesco A, Pittalà MGG, Saletti R, Foti S. The TriMet_DB: A Manually Curated Database of the Metabolic Proteins of Triticum aestivum. Nutrients 2022; 14:nu14245377. [PMID: 36558536 PMCID: PMC9781733 DOI: 10.3390/nu14245377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 12/07/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022] Open
Abstract
Mass-spectrometry-based wheat proteomics is challenging because the current interpretation of mass spectrometry data relies on public databases that are not exhaustive (UniProtKB/Swiss-Prot) or contain many redundant and poor or un-annotated entries (UniProtKB/TrEMBL). Here, we report the development of a manually curated database of the metabolic proteins of Triticum aestivum (hexaploid wheat), named TriMet_DB (Triticum aestivum Metabolic Proteins DataBase). The manually curated TriMet_DB was generated in FASTA format so that it can be read directly by programs used to interpret the mass spectrometry data. Furthermore, the complete list of entries included in the TriMet_DB is reported in a freely available resource, which includes for each protein the description, the gene code, the protein family, and the allergen name (if any). To evaluate its performance, the TriMet_DB was used to interpret the MS data acquired on the metabolic protein fraction extracted from the cultivar MEC of Triticum aestivum. Data are available via ProteomeXchange with identifier PXD037709.
5
Guo W, Coulter M, Waugh R, Zhang R. The value of genotype-specific reference for transcriptome analyses in barley. Life Sci Alliance 2022; 5:e202101255. [PMID: 35459738 PMCID: PMC9034525 DOI: 10.26508/lsa.202101255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2021] [Revised: 04/10/2022] [Accepted: 04/11/2022] [Indexed: 12/31/2022] Open
Abstract
It is increasingly apparent that although different genotypes within a species share "core" genes, they also contain variable numbers of "specific" genes and different structures of "core" genes that are only present in a subset of individuals. Using a common reference genome may thus lead to a loss of genotype-specific information in the assembled Reference Transcript Dataset (RTD) and the generation of erroneous, incomplete or misleading transcriptomics analysis results. In this study, we assembled genotype-specific RTD (sRTD) and common reference-based RTD (cRTD) from RNA-seq data of cultivated Barke and Morex barley, respectively. Our quantitative evaluation showed that the sRTD has a significantly higher diversity of transcripts and alternative splicing events, whereas the cRTD missed 40% of transcripts present in the sRTD and it only has ∼70% accurate transcript assemblies. We found that the sRTD is more accurate for transcript quantification as well as differential expression analysis. However, gene-level quantification is less affected, which may be a reasonable compromise when a high-quality genotype-specific reference is not available.
Affiliation(s)
- Wenbin Guo
- Information and Computational Sciences, James Hutton Institute, Dundee, UK
- Max Coulter
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
- Robbie Waugh
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
- Cell and Molecular Sciences, James Hutton Institute, Dundee, UK
- Runxuan Zhang
- Information and Computational Sciences, James Hutton Institute, Dundee, UK
6
Vos RA, van der Veen-van Wijk CAM, Schranz ME, Vrieling K, Klinkhamer PGL, Lens F. Refining bulk segregant analyses: ontology-mediated discovery of flowering time genes in Brassica oleracea. Plant Methods 2022; 18:92. [PMID: 35780674 PMCID: PMC9252076 DOI: 10.1186/s13007-022-00921-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 06/16/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Bulk segregant analysis (BSA) can help identify quantitative trait loci (QTLs), but this may result in substantial bycatch of functionally irrelevant genes. RESULTS Here we develop a Gene Ontology-mediated approach to zoom in on specific genes located inside QTLs identified by BSA as implicated in a continuous trait. We apply this to a novel experimental system: flowering time in the giant woody Jersey kale, which we phenotyped in four bulks of flowering onset. Our inferred QTLs yielded tens of thousands of candidate genes. We reduced this by two orders of magnitude by focusing on genes annotated with terms contained within relevant subgraphs of the Gene Ontology. A pathway enrichment test then led to the circadian rhythm pathway. The genes that enriched this pathway are attested from previous research as regulating flowering time. Within that pathway, the genes CCA1, FT, and TSF were identified as having functionally significant variation compared to Arabidopsis. We validated and confirmed our ontology-mediated results through genome sequencing and homology-based SNP analysis. However, our ontology-mediated approach produced additional genes of putative importance, showing that the approach aids in exploration and discovery. CONCLUSIONS Our method is potentially applicable to the study of other complex traits and we therefore make our workflows available as open-source code and a reusable Docker container.
Affiliation(s)
- Rutger A Vos
- Naturalis Biodiversity Center, P.O. Box 9517, 2300 RA, Leiden, The Netherlands.
- Institute of Biology Leiden, Leiden University, Sylviusweg 72, 2333 BE, Leiden, The Netherlands.
- M Eric Schranz
- Biosystematics Group, Wageningen University and Research, P.O. Box 16, 6700AP, Wageningen, The Netherlands
- Klaas Vrieling
- Institute of Biology Leiden, Leiden University, Sylviusweg 72, 2333 BE, Leiden, The Netherlands
- Peter G L Klinkhamer
- Institute of Biology Leiden, Leiden University, Sylviusweg 72, 2333 BE, Leiden, The Netherlands
- Frederic Lens
- Naturalis Biodiversity Center, P.O. Box 9517, 2300 RA, Leiden, The Netherlands
- Institute of Biology Leiden, Leiden University, Sylviusweg 72, 2333 BE, Leiden, The Netherlands
7
Wang N, Zhang J, Liu B. IDRBP-PPCT: Identifying Nucleic Acid-Binding Proteins Based on Position-Specific Score Matrix and Position-Specific Frequency Matrix Cross Transformation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2284-2293. [PMID: 33780341 DOI: 10.1109/tcbb.2021.3069263] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 06/12/2023]
Abstract
DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs) are two important nucleic acid-binding proteins (NABPs), which play important roles in biological processes such as replication, translation and transcription of genetic material. Some proteins (DRBPs) bind to both DNA and RNA and also play a key role in gene expression. Identification of DBPs, RBPs and DRBPs is important to study protein-nucleic acid interactions. Computational methods are increasingly being proposed to automatically identify DNA- or RNA-binding proteins based only on protein sequences. One challenge is to design an effective protein representation method to convert protein sequences into fixed-dimension feature vectors. In this study, we proposed a novel protein representation method called Position-Specific Scoring Matrix (PSSM) and Position-Specific Frequency Matrix (PSFM) Cross Transformation (PPCT) to represent protein sequences. This method contains the evolutionary information in PSSM and PSFM, and their correlations. A new computational predictor called IDRBP-PPCT was proposed by combining PPCT and the two-layer framework based on the random forest algorithm to identify DBPs, RBPs and DRBPs. The experimental results on the independent dataset and the tomato genome proved the effectiveness of the proposed method. A user-friendly web-server of IDRBP-PPCT was constructed, which is freely available at http://bliulab.net/IDRBP-PPCT.
8
Martyn GD, Veggiani G, Kusebauch U, Morrone SR, Yates BP, Singer AU, Tong J, Manczyk N, Gish G, Sun Z, Kurinov I, Sicheri F, Moran MF, Moritz RL, Sidhu SS. Engineered SH2 Domains for Targeted Phosphoproteomics. ACS Chem Biol 2022; 17:1472-1484. [PMID: 35613471 DOI: 10.1021/acschembio.2c00051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
A comprehensive analysis of the phosphoproteome is essential for understanding molecular mechanisms of human diseases. However, current tools used to enrich phosphotyrosine (pTyr) are limited in their applicability and scope. Here, we engineered new superbinder Src-Homology 2 (SH2) domains that enrich diverse sets of pTyr-peptides. We used phage display to select a Fes-SH2 domain variant (superFes; sFes1) with high affinity for pTyr and solved its structure bound to a pTyr-peptide. We performed systematic structure-function analyses of the superbinding mechanisms of sFes1 and superSrc-SH2 (sSrc1), another SH2 superbinder. We grafted the superbinder motifs from sFes1 and sSrc1 into 17 additional SH2 domains and confirmed increased binding affinity for specific pTyr-peptides. Using mass spectrometry (MS), we demonstrated that SH2 superbinders have distinct specificity profiles and superior capabilities to enrich pTyr-peptides. Finally, using combinations of SH2 superbinders as affinity purification (AP) tools we showed that unique subsets of pTyr-peptides can be enriched with unparalleled depth and coverage.
Affiliation(s)
- Gregory D. Martyn
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Gianluca Veggiani
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S3E1, Canada
- Ulrike Kusebauch
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Seamus R. Morrone
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Bradley P. Yates
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S3E1, Canada
- Alex U. Singer
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S3E1, Canada
- Jiefei Tong
- Program in Cell biology, Hospital for Sick Children, Toronto M5G 0A4, Canada
- Noah Manczyk
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Gerald Gish
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Zhi Sun
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Igor Kurinov
- Department of Chemistry and Chemical Biology, Cornell University, NE-CAT, Argonne, Illinois 60439, United States
- Frank Sicheri
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Michael F. Moran
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Program in Cell biology, Hospital for Sick Children, Toronto M5G 0A4, Canada
- The Hospital for Sick Children, SPARC Biocentre, Toronto, Ontario M5G 0A4, Canada
- Robert L. Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Sachdev S. Sidhu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
9
Gupta P, Naithani S, Preece J, Kim S, Cheng T, D'Eustachio P, Elser J, Bolton EE, Jaiswal P. Plant Reactome and PubChem: The Plant Pathway and (Bio)Chemical Entity Knowledgebases. Methods Mol Biol 2022; 2443:511-525. [PMID: 35037224 DOI: 10.1007/978-1-0716-2067-0_27] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/14/2023]
Abstract
Plant Reactome (https://plantreactome.gramene.org) and PubChem (https://pubchem.ncbi.nlm.nih.gov) are two reference data portals and resources for curated plant pathways, small molecules, metabolites, gene products, and macromolecular interactions. Plant Reactome knowledgebase, a conceptual plant pathway network, is built by biocuration and integrating (bio)chemical entities, gene products, and macromolecular interactions. It provides manually curated pathways for the reference species Oryza sativa (rice) and gene orthology-based projections that extend pathway knowledge to 106 plant species. Currently, it hosts 320 reference pathways for plant metabolism, hormone signaling, transport, genetic regulation, plant organ development and differentiation, and biotic and abiotic stress responses. In addition to the pathway browsing and search functions, the Plant Reactome provides the analysis tools for pathway comparison between reference and projected species, pathway enrichment in gene expression data, and overlay of gene-gene interaction data on pathways. PubChem, a popular reference database of (bio)chemical entities, provides information on small molecules and other types of chemical entities, such as siRNAs, miRNAs, lipids, carbohydrates, and chemically modified nucleotides. The data in PubChem is collected from hundreds of data sources, including Plant Reactome. This chapter provides a brief overview of the Plant Reactome and the PubChem knowledgebases, their association to other public resources providing accessory information, and how users can readily access the contents.
Affiliation(s)
- Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
- Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
- Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
- Sunghwan Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Tiejun Cheng
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
- Evan E Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA.
10
Yu J, Jung S, Cheng CH, Lee T, Zheng P, Buble K, Crabb J, Humann J, Hough H, Jones D, Campbell JT, Udall J, Main D. CottonGen: The Community Database for Cotton Genomics, Genetics, and Breeding Research. Plants (Basel, Switzerland) 2021; 10:plants10122805. [PMID: 34961276 PMCID: PMC8705096 DOI: 10.3390/plants10122805] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 11/18/2021] [Revised: 12/11/2021] [Accepted: 12/12/2021] [Indexed: 05/12/2023]
Abstract
Over the last eight years, the volume of whole genome, gene expression, SNP genotyping, and phenotype data generated by the cotton research community has exponentially increased. The efficient utilization/re-utilization of these complex and large datasets for knowledge discovery, translation, and application in crop improvement requires them to be curated, integrated with other types of data, and made available for access and analysis through efficient online search tools. Initiated in 2012, CottonGen is an online community database providing access to integrated peer-reviewed cotton genomic, genetic, and breeding data, and analysis tools. Used by cotton researchers worldwide, and managed by experts with crop-specific knowledge, it continues to be the logical choice to integrate new data and provide necessary interfaces for information retrieval. The repository in CottonGen contains colleague, gene, genome, genotype, germplasm, map, marker, metabolite, phenotype, publication, QTL, species, transcriptome, and trait data curated by the CottonGen team. The number of data entries housed in CottonGen has increased dramatically, for example, since 2014 there has been an 18-fold increase in genes/mRNAs, a 23-fold increase in whole genomes, and a 372-fold increase in genotype data. New tools include a genetic map viewer, a genome browser, a synteny viewer, a metabolite pathways browser, sequence retrieval, BLAST, and a breeding information management system (BIMS), as well as various search pages for new data types. CottonGen serves as the home to the International Cotton Genome Initiative, managing its elections and serving as a communication and coordination hub for the community. With its extensive curation and integration of data and online tools, CottonGen will continue to facilitate utilization of its critical resources to empower research for cotton crop improvement.
Affiliation(s)
- Jing Yu
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Chun-Huai Cheng
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Taein Lee
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Ping Zheng
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Katheryn Buble
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- James Crabb
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Jodi Humann
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Heidi Hough
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Don Jones
- Cotton Incorporated, Cary, NC 27513, USA;
- J. Todd Campbell
- The Agricultural Research Service of U.S. Department of Agriculture, Florence, SC 29501, USA;
- Josh Udall
- The Agricultural Research Service of U.S. Department of Agriculture, College Station, TX 77845, USA;
- Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA 99164, USA; (J.Y.); (S.J.); (C.-H.C.); (T.L.); (P.Z.); (K.B.); (J.C.); (J.H.); (H.H.)
- Correspondence: ; Tel.: +1-509-335-2774
11
van den Bent I, Makrodimitris S, Reinders M. The Power of Universal Contextualized Protein Embeddings in Cross-species Protein Function Prediction. Evol Bioinform Online 2021; 17:11769343211062608. [PMID: 34880594 PMCID: PMC8647222 DOI: 10.1177/11769343211062608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2021] [Accepted: 11/03/2021] [Indexed: 11/16/2022] Open
Abstract
Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information of amino acids, thereby modeling the underlying principles of protein sequences insensitive to the context of species. We used an existing pre-trained protein embedding method and subjected its molecular function prediction performance to detailed characterization, first to advance the understanding of protein language models, and second to determine areas of improvement. Then, we applied the model in a transfer learning task by training a function predictor based on the embeddings of annotated protein sequences of one training species and making predictions on the proteins of several test species with varying evolutionary distance. We show that this approach successfully generalizes knowledge about protein function from one eukaryotic species to various other species, outperforming both an alignment-based and a supervised-learning-based baseline. This implies that such a method could be effective for molecular function prediction in inadequately annotated species from understudied taxonomic kingdoms.
Affiliation(s)
- Irene van den Bent
- Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands
- Stavros Makrodimitris
- Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands
- Keygene N.V., Wageningen, the Netherlands
- Marcel Reinders
- Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands
12
Foerster H, Battey JND, Sierro N, Ivanov NV, Mueller LA. Metabolic networks of the Nicotiana genus in the spotlight: content, progress and outlook. Brief Bioinform 2021; 22:bbaa136. [PMID: 32662816 PMCID: PMC8138835 DOI: 10.1093/bib/bbaa136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2020] [Revised: 05/19/2020] [Accepted: 06/04/2020] [Indexed: 01/09/2023] Open
Abstract
Manually curated metabolic databases residing at the Sol Genomics Network comprise two taxon-specific databases for the Solanaceae family, i.e. SolanaCyc and the genus Nicotiana, i.e. NicotianaCyc as well as six species-specific databases for Nicotiana tabacum TN90, N. tabacum K326, Nicotiana benthamiana, N. sylvestris, N. tomentosiformis and N. attenuata. New pathways were created through the extraction, examination and verification of related data from the literature and the aid of external database guided by an expert-led curation process. Here we describe the curation progress that has been achieved in these databases since the first release version 1.0 in 2016, the curation flow and the curation process using the example metabolic pathway for cholesterol in plants. The current content of our databases comprises 266 pathways and 36 superpathways in SolanaCyc and 143 pathways plus 21 superpathways in NicotianaCyc, manually curated and validated specifically for the Solanaceae family and Nicotiana genus, respectively. The curated data have been propagated to the respective Nicotiana-specific databases, which resulted in the enrichment and more accurate presentation of their metabolic networks. The quality and coverage in those databases have been compared with related external databases and discussed in terms of literature support and metabolic content.
|
13
|
Barrera-Redondo J, Sánchez-de la Vega G, Aguirre-Liguori JA, Castellanos-Morales G, Gutiérrez-Guerrero YT, Aguirre-Dugua X, Aguirre-Planter E, Tenaillon MI, Lira-Saade R, Eguiarte LE. The domestication of Cucurbita argyrosperma as revealed by the genome of its wild relative. HORTICULTURE RESEARCH 2021; 8:109. [PMID: 33931618 PMCID: PMC8087764 DOI: 10.1038/s41438-021-00544-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/09/2020] [Revised: 03/03/2021] [Accepted: 03/14/2021] [Indexed: 05/06/2023]
Abstract
Despite their economic importance and well-characterized domestication syndrome, the genomic impact of domestication and the identification of variants underlying the domestication traits in Cucurbita species (pumpkins and squashes) is currently lacking. Cucurbita argyrosperma, also known as cushaw pumpkin or silver-seed gourd, is a Mexican crop consumed primarily for its seeds rather than fruit flesh. This makes it a good model to study Cucurbita domestication, as seeds were an essential component of early Mesoamerican diet and likely the first targets of human-guided selection in pumpkins and squashes. We obtained population-level data using tunable Genotype by Sequencing libraries for 192 individuals of the wild and domesticated subspecies of C. argyrosperma across Mexico. We also assembled the first high-quality wild Cucurbita genome. Comparative genomic analyses revealed several structural variants and presence/absence of genes related to domestication. Our results indicate a monophyletic origin of this domesticated crop in the lowlands of Jalisco. We found evidence of gene flow between the domesticated and wild subspecies, which likely alleviated the effects of the domestication bottleneck. We uncovered candidate domestication genes that are involved in the regulation of growth hormones, plant defense mechanisms, seed development, and germination. The presence of shared selected alleles with the closely related species Cucurbita moschata suggests domestication-related introgression between both taxa.
Affiliation(s)
- Josué Barrera-Redondo
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Circuito Exterior s/n Anexo al Jardín Botánico, 04510, Ciudad de México, México.
| | - Guillermo Sánchez-de la Vega
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Circuito Exterior s/n Anexo al Jardín Botánico, 04510, Ciudad de México, México
| | - Jonás A Aguirre-Liguori
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
| | - Gabriela Castellanos-Morales
- Departamento de Conservación de la Biodiversidad, El Colegio de la Frontera Sur, Villahermosa, Carretera Villahermosa-Reforma km 15.5 Ranchería El Guineo 2ª sección, 86280, Villahermosa, Tabasco, México
| | - Yocelyn T Gutiérrez-Guerrero
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Circuito Exterior s/n Anexo al Jardín Botánico, 04510, Ciudad de México, México
| | - Xitlali Aguirre-Dugua
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Circuito Exterior s/n Anexo al Jardín Botánico, 04510, Ciudad de México, México
| | - Erika Aguirre-Planter
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Circuito Exterior s/n Anexo al Jardín Botánico, 04510, Ciudad de México, México
| | - Maud I Tenaillon
- Génétique Quantitative et Evolution - Le Moulon, Université Paris-Saclay, Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Centre National de la Recherche Scientifique, AgroParisTech, Gif-sur-Yvette, 91190, France
| | - Rafael Lira-Saade
- UBIPRO, Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Av. de los Barrios #1, Col. Los Reyes Iztacala, Tlalnepantla, Edo. de Mex, 54090, México.
| | - Luis E Eguiarte
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Circuito Exterior s/n Anexo al Jardín Botánico, 04510, Ciudad de México, México.
| |
|
14
|
Michael TP, Ernst E, Hartwick N, Chu P, Bryant D, Gilbert S, Ortleb S, Baggs EL, Sree KS, Appenroth KJ, Fuchs J, Jupe F, Sandoval JP, Krasileva KV, Borisjuk L, Mockler TC, Ecker JR, Martienssen RA, Lam E. Genome and time-of-day transcriptome of Wolffia australiana link morphological minimization with gene loss and less growth control. Genome Res 2021; 31:225-238. [PMID: 33361111 PMCID: PMC7849404 DOI: 10.1101/gr.266429.120] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 05/25/2020] [Accepted: 12/16/2020] [Indexed: 11/24/2022]
Abstract
Rootless plants in the genus Wolffia are some of the fastest growing known plants on Earth. Wolffia have a reduced body plan, primarily multiplying through a budding type of asexual reproduction. Here, we generated draft reference genomes for Wolffia australiana (Benth.) Hartog & Plas, which has the smallest genome size in the genus at 357 Mb and has a reduced set of predicted protein-coding genes at about 15,000. Comparison between multiple high-quality draft genome sequences from W. australiana clones confirmed loss of several hundred genes that are highly conserved among flowering plants, including genes involved in root developmental and light signaling pathways. Wolffia has also lost most of the conserved nucleotide-binding leucine-rich repeat (NLR) genes that are known to be involved in innate immunity, as well as those involved in terpene biosynthesis, while having a significant overrepresentation of genes in the sphingolipid pathways that may signify an alternative defense system. Diurnal expression analysis revealed that only 13% of Wolffia genes are expressed in a time-of-day (TOD) fashion, which is less than the typical ∼40% found in several model plants under the same condition. In contrast to the model plants Arabidopsis and rice, many of the pathways associated with multicellular and developmental processes are not under TOD control in W. australiana, where genes that cycle under the conditions tested predominantly have carbon processing and chloroplast-related functions. The Wolffia genome and TOD expression data set thus provide insight into the interplay between a streamlined plant body plan and optimized growth.
Affiliation(s)
- Todd P Michael
- Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
| | - Evan Ernst
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - Nolan Hartwick
- Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
| | - Philomena Chu
- Department of Plant Biology, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, USA
| | - Douglas Bryant
- Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA
| | - Sarah Gilbert
- Department of Plant Biology, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, USA
| | - Stefan Ortleb
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben 06466, Germany
| | - Erin L Baggs
- Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California 94720, USA
| | - K Sowjanya Sree
- Department of Environmental Science, Central University of Kerala, Periye, Kerala 671316, India
| | - Joerg Fuchs
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben 06466, Germany
| | - Florian Jupe
- Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
| | - Justin P Sandoval
- Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
| | - Ksenia V Krasileva
- Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California 94720, USA
| | - Ljudmylla Borisjuk
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben 06466, Germany
| | - Todd C Mockler
- Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA
| | - Joseph R Ecker
- Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
- Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
| | - Robert A Martienssen
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - Eric Lam
- Department of Plant Biology, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, USA
| |
|
15
|
Marchant DB, Sessa EB, Wolf PG, Heo K, Barbazuk WB, Soltis PS, Soltis DE. The C-Fern (Ceratopteris richardii) genome: insights into plant genome evolution with the first partial homosporous fern genome assembly. Sci Rep 2019; 9:18181. [PMID: 31796775 PMCID: PMC6890710 DOI: 10.1038/s41598-019-53968-8] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 01/02/2019] [Accepted: 11/04/2019] [Indexed: 01/04/2023]
Abstract
Ferns are notorious for possessing large genomes and numerous chromosomes. Despite decades of speculation, the processes underlying the expansive genomes of ferns are unclear, largely due to the absence of a sequenced homosporous fern genome. The lack of this crucial resource has not only hindered investigations of evolutionary processes responsible for the unusual genome characteristics of homosporous ferns, but also impeded synthesis of genome evolution across land plants. Here, we used the model fern species Ceratopteris richardii to address the processes (e.g., polyploidy, spread of repeat elements) by which the large genomes and high chromosome numbers typical of homosporous ferns may have evolved and have been maintained. We directly compared repeat compositions in species spanning the green plant tree of life and a diversity of genome sizes, as well as both short- and long-read-based assemblies of Ceratopteris. We found evidence consistent with a single ancient polyploidy event in the evolutionary history of Ceratopteris based on both genomic and cytogenetic data, and on repeat proportions similar to those found in large flowering plant genomes. This study provides a major stepping-stone in the understanding of land plant evolutionary genomics by providing the first homosporous fern reference genome, as well as insights into the processes underlying the formation of these massive genomes.
Affiliation(s)
- D Blaine Marchant
- Department of Biology, Stanford University, Stanford, CA, 94305, USA.
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA.
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA.
| | - Emily B Sessa
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- The Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
| | - Paul G Wolf
- Department of Biology, Utah State University, Logan, UT, 84322, USA
- Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, 35899, USA
| | - Kweon Heo
- Department of Applied Plant Sciences, Kangwon National University, Chuncheon, 24341, Korea
| | - W Brad Barbazuk
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- The Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- The Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
- The Biodiversity Institute, University of Florida, Gainesville, FL, 32611, USA
| | - Douglas E Soltis
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- The Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
- The Biodiversity Institute, University of Florida, Gainesville, FL, 32611, USA
| |
|
16
|
Gabaldón T. Recent trends in molecular diagnostics of yeast infections: from PCR to NGS. FEMS Microbiol Rev 2019; 43:517-547. [PMID: 31158289 PMCID: PMC8038933 DOI: 10.1093/femsre/fuz015] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 03/13/2019] [Accepted: 05/31/2019] [Indexed: 12/29/2022]
Abstract
The incidence of opportunistic yeast infections in humans has been increasing over recent years. These infections are difficult to treat and diagnose, in part due to the large number and broad diversity of species that can underlie the infection. In addition, resistance to one or several antifungal drugs in infecting strains is increasingly being reported, severely limiting therapeutic options and showcasing the need for rapid detection of the infecting agent and its drug susceptibility profile. Current methods for species and resistance identification lack satisfactory sensitivity and specificity, and often require prior culturing of the infecting agent, which delays diagnosis. Recently developed high-throughput technologies such as next generation sequencing or proteomics are opening completely new avenues for more sensitive, accurate and fast diagnosis of yeast pathogens. These approaches are the focus of intensive research, but translation into the clinics requires overcoming important challenges. In this review, we provide an overview of existing and recently emerged approaches that can be used in the identification of yeast pathogens and their drug resistance profiles. Throughout the text we highlight the advantages and disadvantages of each methodology and discuss the most promising developments in their path from bench to bedside.
Affiliation(s)
- Toni Gabaldón
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- ICREA, Pg Lluís Companys 23, 08010 Barcelona, Spain
| |
|
17
|
Lu J, Xu M, Cai J, Yu D, Meng Y, Wang H. Transcriptome-wide identification of microRNAs and functional insights inferred from microRNA-target pairs in Physalis angulata L. PLANT SIGNALING & BEHAVIOR 2019; 14:1629267. [PMID: 31184247 PMCID: PMC6619950 DOI: 10.1080/15592324.2019.1629267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/05/2019] [Revised: 05/27/2019] [Accepted: 06/03/2019] [Indexed: 06/09/2023]
Abstract
Physalis angulata L., a member of the family Solanaceae, is widely used as a folk medicine in various countries. Continuous research efforts are devoted to the discovery of the effective medicinal ingredients from Physalis angulata. However, due to the limited resources of genome and transcriptome sequencing data, only a few studies have been performed at the gene regulatory level. In this study, the transcriptomes of five organs (roots, stems, leaves, flowers and fruits) of Physalis angulata were reported. Based on the transcriptome assembly containing 196,117 unique transcripts, a total of 17,556 SSRs (simple sequence repeats) were identified, which could be useful for RNA-based barcoding to discriminate plants closely related to Physalis angulata. Additionally, 24 transcripts were discovered to be potential microRNA (miRNA) precursors, which encode a total of 31 distinct mature miRNAs. Some of these precursors showed organ-specific expression patterns. Target prediction revealed 116 miRNA-target pairs, involving 31 miRNAs and 83 target transcripts in Physalis angulata. Taken together, our results could serve as a data resource for in-depth studies on the molecular regulatory mechanisms related to the production of medicinal ingredients in Physalis angulata.
Affiliation(s)
- Jiangjie Lu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
| | - Min Xu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
| | - Jiahui Cai
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
| | - Dongliang Yu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
| | - Yijun Meng
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
| | - Huizhong Wang
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
| |
|
18
|
Yan R, Wang X, Tian Y, Xu J, Xu X, Lin J. Prediction of zinc-binding sites using multiple sequence profiles and machine learning methods. Mol Omics 2019; 15:205-215. [PMID: 31046040 DOI: 10.1039/c9mo00043g] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/11/2023]
Abstract
The zinc (Zn2+) cofactor has been proven to be involved in numerous biological mechanisms and the zinc-binding site is recognized as one of the most important post-translation modifications in proteins. Therefore, accurate knowledge of zinc ions in protein structures can provide potential clues for elucidation of protein folding and functions. However, determining zinc-binding residues by experimental means is usually lab-intensive and associated with high cost in most cases. In this context, the development of computational tools for identifying zinc-binding sites is highly desired, especially in the current post-genomic era. In this work, we developed a novel zinc-binding site prediction method by combining several intensively-trained machine learning models. To establish an accurate and generative method, we downloaded all zinc-binding proteins from the Protein Data Bank and prepared a non-redundant dataset. Meanwhile, a well-prepared dataset by other groups was also used. Then, effective and complementary features were extracted from sequences and three-dimensional structures of these proteins. Moreover, several well-designed machine learning models were intensively trained to construct accurate models. To assess the performance, the obtained predictors were stringently benchmarked using the diverse zinc-binding sites. Furthermore, several state-of-the-art in silico methods developed specifically for zinc-binding sites were also evaluated and compared. The results confirmed that our method is very competitive in real world applications and could become a complementary tool to wet lab experiments. To facilitate research in the community, a web server and stand-alone program implementing our method were constructed and are publicly available at . The downloadable program of our method can be easily used for the high-throughput screening of potential zinc-binding sites across proteomes.
Affiliation(s)
- Renxiang Yan
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China, and Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
| | - Xiaofeng Wang
- College of Mathematics and Computer Science, Shanxi Normal University, Linfen 041004, China
| | - Yarong Tian
- Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 40530, Sweden
| | - Jing Xu
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China, and Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
| | - Xiaoli Xu
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China.
| | - Juan Lin
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China, and Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
| |
|
19
|
Karim MR, Michel A, Zappa A, Baranov P, Sahay R, Rebholz-Schuhmann D. Improving data workflow systems with cloud services and use of open data for bioinformatics research. Brief Bioinform 2019; 19:1035-1050. [PMID: 28419324 PMCID: PMC6169675 DOI: 10.1093/bib/bbx039] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 10/21/2016] [Indexed: 11/22/2022]
Abstract
Data workflow systems (DWFSs) enable bioinformatics researchers to combine components for data access and data analytics, and to share the final data analytics approach with their collaborators. Increasingly, such systems have to cope with large-scale data, such as full genomes (about 200 GB each), public fact repositories (about 100 TB of data) and 3D imaging data at even larger scales. As moving the data becomes cumbersome, the DWFS needs to embed its processes into a cloud infrastructure, where the data are already hosted. As the standardized public data play an increasingly important role, the DWFS needs to comply with Semantic Web technologies. This advancement to DWFS would reduce overhead costs and accelerate the progress in bioinformatics research based on large-scale data and public resources, as researchers would require less specialized IT knowledge for the implementation. Furthermore, the high data growth rates in bioinformatics research drive the demand for parallel and distributed computing, which then imposes a need for scalability and high-throughput capabilities onto the DWFS. As a result, requirements for data sharing and access to public knowledge bases suggest that compliance of the DWFS with Semantic Web standards is necessary. In this article, we will analyze the existing DWFS with regard to their capabilities toward public open data use as well as large-scale computational and human interface requirements. We untangle the parameters for selecting a preferable solution for bioinformatics research with particular consideration to using cloud services and Semantic Web technologies. Our analysis leads to research guidelines and recommendations toward the development of future DWFS for the bioinformatics research community.
Affiliation(s)
- Md Rezaul Karim
- Semantics in eHealth and Life Sciences (SeLS), Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
| | - Audrey Michel
- School of Biochemistry and Cell Biology, University College Cork, Ireland
| | - Achille Zappa
- Insight Centre for Data Analytics, National University of Ireland Galway, Dangan, Galway, Ireland
| | - Pavel Baranov
- School of Biochemistry and Cell Biology, University College Cork, Ireland
| | - Ratnesh Sahay
- Semantics in eHealth and Life Sciences (SeLS), Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
| |
|
20
|
Featherston J, Arakaki Y, Hanschen ER, Ferris PJ, Michod RE, Olson BJSC, Nozaki H, Durand PM. The 4-Celled Tetrabaena socialis Nuclear Genome Reveals the Essential Components for Genetic Control of Cell Number at the Origin of Multicellularity in the Volvocine Lineage. Mol Biol Evol 2019; 35:855-870. [PMID: 29294063 DOI: 10.1093/molbev/msx332] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 12/22/2022]
Abstract
Multicellularity is the premier example of a major evolutionary transition in individuality and was a foundational event in the evolution of macroscopic biodiversity. The volvocine chlorophyte lineage is well suited for studying this process. Extant members span unicellular, simple colonial, and obligate multicellular taxa with germ-soma differentiation. Here, we report the nuclear genome sequence of one of the most morphologically simple organisms in this lineage, the 4-celled colonial Tetrabaena socialis, and compare this to the three other complete volvocine nuclear genomes. Using conservative estimates of gene family expansions, a minimal set of expanded gene families was identified that associate with the origin of multicellularity. These families are rich in genes related to developmental processes. A subset of these families is lineage specific, which suggests that at a genomic level the evolution of multicellularity also includes lineage-specific molecular developments. Multiple points of evidence associate modifications to the ubiquitin proteasomal pathway (UPP) with the beginning of coloniality. Genes undergoing positive or accelerating selection in the multicellular volvocines were found to be enriched in components of the UPP, and gene families gained at the origin of multicellularity include components of the UPP. A defining feature of colonial/multicellular life cycles is the genetic control of cell number. The genomic data presented here, which include diversification of cell cycle genes and modifications to the UPP, align the genetic components with the evolution of this trait.
Affiliation(s)
- Jonathan Featherston
- Evolutionary Studies Institute, University of the Witwatersrand, Johannesburg, South Africa; Agricultural Research Council, Biotechnology Platform, Pretoria, South Africa
| | - Yoko Arakaki
- Department of Biological Sciences, Graduate School of Science, University of Tokyo, Bunkyo-ku, Tokyo, Hongo, Japan
| | - Erik R Hanschen
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
| | - Patrick J Ferris
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
| | - Richard E Michod
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
| | - Hisayoshi Nozaki
- Department of Biological Sciences, Graduate School of Science, University of Tokyo, Bunkyo-ku, Tokyo, Hongo, Japan
| | - Pierre M Durand
- Evolutionary Studies Institute, University of the Witwatersrand, Johannesburg, South Africa
| |
|
21
|
Pinus massoniana Introgression Hybrids Display Differential Expression of Reproductive Genes. FORESTS 2019. [DOI: 10.3390/f10030230] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
Pinus massoniana and P. hwangshanensis are two conifer species located in southern China, which are of both economic and ornamental value. Around the middle and lower reaches of the Yangtze River, P. massoniana occurs mainly at altitudes below 700 m, while P. hwangshanensis can be found above 900 m. At altitudes where the distribution of both pines overlaps, a natural introgression hybrid exists, which we will further refer to as the Z pine. This pine has a morphological character that shares attributes of both P. massoniana and P. hwangshanensis. However, compared to the other two pines, its reproductive structure, the pinecone, has an ultra-low ripening rate with seeds that germinate poorly. In this study, we aimed to find the reason for the impaired cone maturation by comparing transcriptome libraries of P. massoniana and Z pine cones at seven successive growth stages. After sequencing and assembly, we obtained unigenes and then annotated them against NCBI’s non-redundant nucleotide and protein sequences, Swiss-Prot, Clusters of Orthologous Groups, Gene Ontology and KEGG Orthology databases. Gene expression levels were estimated and differentially expressed genes (DEGs) of the two pines were mined and analyzed. We found that several of them indeed relate to reproductive process. At every growth stage, these genes are expressed at a higher level in P. massoniana than in the Z pine. These data provide insight into understanding which molecular mechanisms are altered between P. massoniana and the Z pine that might cause changes in the reproductive process.
|
22
|
Alves TO, D’Almeida CTS, Scherf KA, Ferreira MSL. Modern Approaches in the Identification and Quantification of Immunogenic Peptides in Cereals by LC-MS/MS. FRONTIERS IN PLANT SCIENCE 2019; 10:1470. [PMID: 31798614 PMCID: PMC6868032 DOI: 10.3389/fpls.2019.01470] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 08/17/2019] [Accepted: 10/22/2019] [Indexed: 05/17/2023]
Abstract
Celiac disease (CD) is an immunogenic disorder that affects the small intestine. It is caused by the ingestion of gluten, a protein network formed by prolamins and glutelins from cereals such as wheat, barley, rye and, possibly, oats. For predisposed people, gluten presents epitopes able to stimulate T-cells causing symptoms like nausea, vomiting, diarrhea, among others unrelated to the gastrointestinal system. The only treatment for CD is to maintain a gluten-free diet, not exceeding 20 mg/kg of gluten, what is generally considered the safe amount for celiacs. Due to this context, it is very important to identify and quantify the gluten content of food products. ELISA is the most commonly used method to detect gluten traces in food. However, by detecting only prolamins, the results of ELISA tests may be underestimated. For this reason, more reliable and sensitive assays are needed to improve gluten quantification. Because of high sensitivity and the ability to detect even trace amounts of peptides in complex matrices, the most promising approaches to verify the presence of gluten peptides in food are non-immunological techniques, like liquid chromatography coupled to mass spectrometry. Different methodologies using this approach have been developed and described in the last years, ranging from non-targeted and exploratory analysis to targeted and specific methods depending on the purpose of interest. Non-targeted analyses aim to define the proteomic profile of the sample, while targeted analyses allow the search for specific peptides, making it possible to quantify them. This review aims to gather and summarize the main proteomic techniques used in the identification and quantitation of gluten peptides related to CD-activity and gluten-related allergies.
Affiliation(s)
- Thais O. Alves
- Food and Nutrition Graduate Program (PPGAN), Laboratory of Bioactives, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
- Laboratory of Protein Biochemistry—Center of Innovation in Mass Spectrometry (LBP-IMasS), UNIRIO, Rio de Janeiro, Brazil
- Carolina T. S. D’Almeida
- Food and Nutrition Graduate Program (PPGAN), Laboratory of Bioactives, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
- Laboratory of Protein Biochemistry—Center of Innovation in Mass Spectrometry (LBP-IMasS), UNIRIO, Rio de Janeiro, Brazil
- Katharina A. Scherf
- Department of Bioactive and Functional Food Chemistry, Institute of Applied Biosciences, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Mariana S. L. Ferreira
- Food and Nutrition Graduate Program (PPGAN), Laboratory of Bioactives, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
- Laboratory of Protein Biochemistry—Center of Innovation in Mass Spectrometry (LBP-IMasS), UNIRIO, Rio de Janeiro, Brazil
- *Correspondence: Mariana S. L. Ferreira,
|
23
|
Firdaus-Raih M, Hashim NHF, Bharudin I, Abu Bakar MF, Huang KK, Alias H, Lee BKB, Mat Isa MN, Mat-Sharani S, Sulaiman S, Tay LJ, Zolkefli R, Muhammad Noor Y, Law DSN, Abdul Rahman SH, Md-Illias R, Abu Bakar FD, Najimudin N, Abdul Murad AM, Mahadi NM. The Glaciozyma antarctica genome reveals an array of systems that provide sustained responses towards temperature variations in a persistently cold habitat. PLoS One 2018; 13:e0189947. [PMID: 29385175 PMCID: PMC5791967 DOI: 10.1371/journal.pone.0189947] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 08/22/2017] [Accepted: 12/05/2017] [Indexed: 01/01/2023] Open
Abstract
Extremely low temperatures present various challenges to life that include ice formation and effects on metabolic capacity. Psychrophilic microorganisms typically have an array of mechanisms to enable survival in cold temperatures. In this study, we sequenced and analysed the genome of a psychrophilic yeast isolated in the Antarctic region, Glaciozyma antarctica. The genome annotation identified 7857 protein coding sequences. From the genome sequence analysis we were able to identify genes that encode proteins known to be associated with cold survival, in addition to annotating genes that are unique to G. antarctica. For genes that are known to be involved in cold adaptation, such as anti-freeze proteins (AFPs), our gene expression analysis revealed that they were differentially transcribed over time and in response to different temperatures. This indicated the presence of an array of adaptation systems that can respond to a changing but persistent cold environment. We were also able to validate the activity of all the annotated AFPs, where the recombinant AFPs demonstrated anti-freeze capacity. This work is an important foundation for further collective exploration into psychrophilic microbiology where, among other potential, the genes unique to this species may represent a pool of novel mechanisms for cold survival.
Affiliation(s)
- Mohd Firdaus-Raih
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- * E-mail:
- Noor Haza Fazlin Hashim
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Izwan Bharudin
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Mohd Faizal Abu Bakar
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Malaysia Genome Institute, Jalan Bangi Lama, Kajang, Selangor, Malaysia
- Kie Kyon Huang
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Halimah Alias
- Malaysia Genome Institute, Jalan Bangi Lama, Kajang, Selangor, Malaysia
- Bernard K. B. Lee
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Mohd Noor Mat Isa
- Malaysia Genome Institute, Jalan Bangi Lama, Kajang, Selangor, Malaysia
- Shuhaila Mat-Sharani
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Malaysia Genome Institute, Jalan Bangi Lama, Kajang, Selangor, Malaysia
- Suhaila Sulaiman
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Lih Jinq Tay
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Radziah Zolkefli
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Yusuf Muhammad Noor
- Malaysia Genome Institute, Jalan Bangi Lama, Kajang, Selangor, Malaysia
- Department of Biosciences Engineering, Faculty of Chemical & Natural Resources Engineering, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
- Douglas Sie Nguong Law
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Siti Hamidah Abdul Rahman
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Rosli Md-Illias
- Department of Biosciences Engineering, Faculty of Chemical & Natural Resources Engineering, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
- Farah Diba Abu Bakar
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Nazalan Najimudin
- School of Biological Sciences, Universiti Sains Malaysia, Penang, Malaysia
- Abdul Munir Abdul Murad
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
|
24
|
Condon B, Almsaeed A, Chen M, West J, Staton M. Tripal Developer Toolkit. Database: The Journal of Biological Databases and Curation 2018; 2018:5103920. [PMID: 30295719 PMCID: PMC6147213 DOI: 10.1093/database/bay099] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/05/2018] [Accepted: 08/27/2018] [Indexed: 11/28/2022]
Abstract
Tripal is a community database construction toolkit utilizing the content management system Drupal. Tripal is used to make biological, genetic and genomic data more discoverable, shareable, searchable and standardized. As funding for community-level genomics databases declines, Tripal’s open-source codebase provides a means for sites to be built and maintained with a minimal investment in staff and new development. Tripal is ultimately as strong as the community of sites and developers that use it. We present a set of developer tools that will make building and maintaining Tripal 3 sites easier for new and returning users. These tools break down barriers to entry such as setting up developer and testing environments, acquiring and loading test datasets, working with controlled vocabulary terms and writing new Drupal classes.
Affiliation(s)
- Bradford Condon
- Department of Entomology and Plant Pathology, University of Tennessee Institute of Agriculture, E.J. Chapman Blvd, 370 Plant Biotechnology Building, Knoxville, TN
- Abdullah Almsaeed
- Department of Entomology and Plant Pathology, University of Tennessee Institute of Agriculture, E.J. Chapman Blvd, 370 Plant Biotechnology Building, Knoxville, TN
- Ming Chen
- Department of Entomology and Plant Pathology, University of Tennessee Institute of Agriculture, E.J. Chapman Blvd, 370 Plant Biotechnology Building, Knoxville, TN
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, M411 Walters Life Science, Knoxville, TN
- Joe West
- Department of Entomology and Plant Pathology, University of Tennessee Institute of Agriculture, E.J. Chapman Blvd, 370 Plant Biotechnology Building, Knoxville, TN
- Margaret Staton
- Department of Entomology and Plant Pathology, University of Tennessee Institute of Agriculture, E.J. Chapman Blvd, 370 Plant Biotechnology Building, Knoxville, TN
|
25
|
Duan D, Jia Y, Yang J, Li ZH. Comparative Transcriptome Analysis of Male and Female Conelets and Development of Microsatellite Markers in Pinus bungeana, an Endemic Conifer in China. Genes (Basel) 2017; 8:genes8120393. [PMID: 29257091 PMCID: PMC5748711 DOI: 10.3390/genes8120393] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/28/2017] [Revised: 12/11/2017] [Accepted: 12/12/2017] [Indexed: 02/02/2023] Open
Abstract
Sex determination in gymnosperms is still poorly characterized due to the lack of genomic/transcriptome resources and useful molecular genetic markers. To enhance our understanding of the molecular mechanisms of the determination of sexual recognition of reproductive structures in conifers, the transcriptomes of male and female conelets were characterized in a Chinese endemic conifer species, Pinus bungeana Zucc. ex Endl. A total of 39.62 Gb of high-throughput sequencing reads were obtained from the two kinds of sexual conelets. After de novo assembly of the obtained reads, 85,305 unigenes were identified, 53,944 (63.23%) of which were annotated with public databases. A total of 12,073 differentially expressed genes were detected between the two sexes in P. bungeana, and 5766 (47.76%) of them were up-regulated in females. The Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis suggested that some of the genes were significantly associated with the sex determination process of P. bungeana, such as those involved in tryptophan metabolism, zeatin biosynthesis, cysteine and methionine metabolism, and the phenylpropanoid biosynthesis pathways. Meanwhile, some important plant hormone pathways (e.g., the gibberellin (GA) pathway, carotenoid biosynthesis, and the brassinosteroid (BR) biosynthesis pathway) that affected sexual determination were also induced in P. bungeana. In addition, 8791 expressed sequence tag-simple sequence repeats (EST-SSRs) from 7859 unigenes were detected in P. bungeana. The most abundant repeat types were dinucleotides (1926), followed by trinucleotides (1711). The dominant classes of the sequence repeat were A/T (4942) in mononucleotides and AT/AT (1283) in dinucleotides. Among these EST-SSRs, 84 pairs of primers were randomly selected for the characterization of potential molecular genetic markers. Finally, 19 polymorphic EST-SSR primers were characterized. We found low to moderate levels of genetic diversity (NA = 1.754; HO = 0.206; HE = 0.205) across natural populations of P. bungeana. The cluster analysis revealed two distinct genetic groups for the six populations that were sampled in this endemic species, which might be caused by the fragmentation of habitats and long-term geographic isolation among different populations. Taken together, this work provides important insights into the molecular mechanisms of sexual identity in the reproductive organs of P. bungeana. The molecular genetic resources that were identified in this study will also facilitate further studies in functional genomics and population genetics in the Pinus species.
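The abstract above counts simple sequence repeats by motif length (mononucleotide, dinucleotide, trinucleotide). As a rough illustration of how such repeats can be located in a unigene sequence, the following minimal Python sketch scans for perfect tandem repeats with a regular expression. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip homopolymer motifs such as "AA" or "AAA"; those runs
            # belong to the mononucleotide class.
            if k > 1 and len(set(motif)) == 1:
                continue
            hits.append((motif, len(m.group(0)) // k, m.start()))
    return hits
```

Each hit reports the motif, the number of tandem copies, and the 0-based start position. Grouping motifs with their reverse complements (as in the AT/AT class notation) is not handled here.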
Affiliation(s)
- Jie Yang
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences, Northwest University, Xi'an 710069, China.
- Zhong-Hu Li
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences, Northwest University, Xi'an 710069, China.
|
26
|
Lu J, Xu D, Jiang Y, Kong S, Shen Z, Xia S, Lu L. Integrated analysis of mRNA and viral miRNAs in the kidney of Carassius auratus gibelio response to cyprinid herpesvirus 2. Sci Rep 2017; 7:13787. [PMID: 29062054 PMCID: PMC5653811 DOI: 10.1038/s41598-017-14217-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/14/2016] [Accepted: 10/06/2017] [Indexed: 12/14/2022] Open
Abstract
MicroRNAs (miRNAs) are small, non-coding single stranded RNAs that play crucial roles in numerous biological processes. Vertebrate herpesviruses encode multiple viral miRNAs that modulate host and viral genes. However, the roles of viral miRNAs in lower vertebrates have not been fully determined. Here, we used high-throughput sequencing to analyse the miRNA and mRNA expression profiles of Carassius auratus gibelio in response to infection by cyprinid herpesvirus 2 (CyHV-2). RNA sequencing obtained 26,664 assembled transcripts, including 2,912 differentially expressed genes. Based on small RNA sequencing and secondary structure predictions, we identified 17 CyHV-2 encoded miRNAs, among which 14 were validated by stem-loop quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) and eight were validated by northern blotting. Furthermore, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of miRNA-mRNA pairs revealed diverse affected immune signalling pathways, including the RIG-I-like receptor and JAK-STAT pathways. Finally, we presented four genes involved in RIG-I-like pathways, namely the host genes IRF3, RBMX and PIN1 and the viral gene ORF4, which are negatively regulated by the CyHV-2 encoded miRNA miR-C4. The present study is the first to provide a comprehensive overview of viral miRNA-mRNA co-regulation, which might have a key role in controlling post-transcriptomic regulation during CyHV-2 infection.
Affiliation(s)
- Jianfei Lu
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China
- Dan Xu
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China
- Key Laboratory of Agriculture Ministry for Freshwater Aquatic Genetic Resources, Shanghai Ocean University, Shanghai, P. R. China
- National Experimental Teaching Demonstration Center for Fishery Sciences, Shanghai Ocean University, Shanghai, P. R. China
- Yousheng Jiang
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China
- Key Laboratory of Agriculture Ministry for Freshwater Aquatic Genetic Resources, Shanghai Ocean University, Shanghai, P. R. China
- National Experimental Teaching Demonstration Center for Fishery Sciences, Shanghai Ocean University, Shanghai, P. R. China
- Shanyun Kong
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China
- Zhaoyuan Shen
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China
- Siyao Xia
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China
- Liqun Lu
- National Pathogen Collection Center for Aquatic Animals, Shanghai Ocean University, Shanghai, P. R. China.
- Key Laboratory of Agriculture Ministry for Freshwater Aquatic Genetic Resources, Shanghai Ocean University, Shanghai, P. R. China.
- National Experimental Teaching Demonstration Center for Fishery Sciences, Shanghai Ocean University, Shanghai, P. R. China.
|
27
|
Karakülah G, Suner A. PlanTEnrichment: A tool for enrichment analysis of transposable elements in plants. Genomics 2017; 109:336-340. [DOI: 10.1016/j.ygeno.2017.05.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 03/31/2017] [Revised: 05/29/2017] [Accepted: 05/31/2017] [Indexed: 02/01/2023]
|
28
|
Wang X, Xu ML, Li BQ, Zhai HL, Liu JJ, Li SY. Prediction of phosphorylation sites based on Krawtchouk image moments. Proteins 2017; 85:2231-2238. [PMID: 28921635 DOI: 10.1002/prot.25388] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/17/2017] [Revised: 08/30/2017] [Accepted: 09/12/2017] [Indexed: 11/05/2022]
Abstract
Protein phosphorylation is one of the most pervasive post-translational modifications and regulates diverse cellular processes in organisms. Under the catalysis of protein kinases, protein phosphorylation usually occurs at serine (S), threonine (T), or tyrosine (Y) residues. In this contribution, we proposed a novel scheme (named KMPhos) for the theoretical prediction of protein phosphorylation sites. First, a numerical matrix was obtained from a protein sequence fragment by replacing the characters of the residues with the chemical descriptors of amino acid molecules to approximately describe the chemical environment of the protein fragment, and this matrix was converted to a grayscale image. Then the Krawtchouk image moments were calculated and used to establish the support vector machine models. The accuracies of 10-fold cross validation for the obtained models on the training set are up to 89.7%, 88.6%, and 90.1% for the residues S, Y, and T, respectively. For the independent test set, the prediction accuracies are up to 90.7% (S), 87.8% (T), and 89.3% (Y). The results of ROC and other evaluations are also satisfactory. Compared with several specialized prediction tools, KMPhos provided higher accuracy and reliability. A KMPhos package is provided and can be used directly for phosphorylation site prediction.
Affiliation(s)
- Xue Wang
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Min Li Xu
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Bao Qiong Li
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Hong Lin Zhai
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Jin Jin Liu
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Shu Yan Li
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
|
29
|
Bromilow S, Gethings LA, Buckley M, Bromley M, Shewry PR, Langridge JI, Clare Mills EN. A curated gluten protein sequence database to support development of proteomics methods for determination of gluten in gluten-free foods. J Proteomics 2017; 163:67-75. [PMID: 28385663 PMCID: PMC5479479 DOI: 10.1016/j.jprot.2017.03.026] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 01/24/2017] [Revised: 03/20/2017] [Accepted: 03/28/2017] [Indexed: 12/11/2022]
Abstract
The unique physicochemical properties of wheat gluten enable a diverse range of food products to be manufactured. However, gluten triggers coeliac disease, a condition which is treated using a gluten-free diet. Analytical methods are required to confirm if foods are gluten-free, but current immunoassay-based methods can be unreliable. Proteomic methods offer an alternative but require comprehensive and well annotated sequence databases, which are lacking for gluten. A manually curated database (GluPro V1.0) of gluten proteins, comprising 630 discrete unique full length protein sequences, has been compiled. It is representative of the different types of gliadin and glutenin components found in gluten. An in silico comparison of their coeliac toxicity was undertaken by analysing the distribution of coeliac toxic motifs. This demonstrated that whilst the α-gliadin proteins contained more toxic motifs, these were distributed across all gluten protein sub-types. Comparison of annotations observed using a discovery proteomics dataset acquired using ion mobility MS/MS showed that more reliable identifications were obtained using the GluPro V1.0 database compared to the complete reviewed Viridiplantae database. This highlights the value of a curated sequence database specifically designed to support proteomic workflows and the development of methods to detect and quantify gluten. SIGNIFICANCE: We have constructed the first manually curated open-source wheat gluten protein sequence database (GluPro V1.0) in a FASTA format to support the application of proteomic methods for gluten protein detection and quantification. We have also analysed the manually verified sequences to give the first comprehensive overview of the distribution of sequences able to elicit a reaction in coeliac disease, the prevalent form of gluten intolerance. Provision of this database will improve the reliability of gluten protein identification by proteomic analysis, and aid the development of targeted mass spectrometry methods in line with Codex Alimentarius Commission requirements for foods designed to meet the needs of gluten intolerant individuals.
Affiliation(s)
- Sophie Bromilow
- School of Biological Sciences, Manchester Institute of Biotechnology, Manchester Academic Health Sciences Centre, University of Manchester, M17DN, UK
- Lee A Gethings
- Waters Corporation, Stamford Avenue, Altrincham Road, Wilmslow SK9 4AX, UK
- Mike Buckley
- School of Earth and Environmental Sciences, Manchester Institute of Biotechnology, University of Manchester, M17DN, UK
- Mike Bromley
- Genon Laboratories Limited, Cragg Vale, Halifax, UK
- James I Langridge
- Waters Corporation, Stamford Avenue, Altrincham Road, Wilmslow SK9 4AX, UK
- E N Clare Mills
- School of Biological Sciences, Manchester Institute of Biotechnology, Manchester Academic Health Sciences Centre, University of Manchester, M17DN, UK.
|
30
|
Stein O, Avin-Wittenberg T, Krahnert I, Zemach H, Bogol V, Daron O, Aloni R, Fernie AR, Granot D. Arabidopsis Fructokinases Are Important for Seed Oil Accumulation and Vascular Development. Frontiers in Plant Science 2017; 7:2047. [PMID: 28119723 PMCID: PMC5222831 DOI: 10.3389/fpls.2016.02047] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 10/13/2016] [Accepted: 12/21/2016] [Indexed: 05/25/2023]
Abstract
Sucrose (a disaccharide made of glucose and fructose) is the primary carbon source transported to sink organs in many plants. Since fructose accounts for half of the hexoses used for metabolism in sink tissues, plant fructokinases (FRKs), the main fructose-phosphorylating enzymes, are likely to play a central role in plant development. However, to date, their specific functions have been the subject of only limited study. The Arabidopsis genome contains seven genes encoding six cytosolic FRKs and a single plastidic FRK. T-DNA knockout mutants for five of the seven FRKs were identified and used in this study. Single knockouts of the FRK mutants did not exhibit any unusual phenotype. Double-mutants of AtFRK6 (plastidic) and AtFRK7 showed normal growth in soil, but yielded dark, distorted seeds. The seed distortion could be complemented by expression of the well-characterized tomato SlFRK1, confirming that a lack of FRK activity was the primary cause of the seed phenotype. Seeds of the double-mutant germinated, but failed to establish on 1/2 MS plates. Seed establishment was made possible by the addition of glucose or sucrose, indicating reduced seed storage reserves. Metabolic profiling of the double-mutant seeds revealed decreased TCA cycle metabolites and reduced fatty acid metabolism. Examination of the mutant embryo cells revealed smaller oil bodies, the primary storage reserve in Arabidopsis seeds. Quadruple and penta FRK mutants showed growth inhibition and leaf wilting. Anatomical analysis revealed smaller trachea elements and smaller xylem area, accompanied by necrosis around the cambium and the phloem. These results demonstrate overlapping and complementary roles of the plastidic AtFRK6 and the cytosolic AtFRK7 in seed storage accumulation, and the importance of AtFRKs for vascular development.
Affiliation(s)
- Ofer Stein
- Volcani Center, Institute of Plant Sciences, Agricultural Research Organization, Bet Dagan, Israel
- Robert H. Smith Faculty of Agriculture, Institute of Plant Sciences and Genetics in Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Tamar Avin-Wittenberg
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Potsdam-Golm, Germany
- Department of Plant and Environmental Sciences, Hebrew University of Jerusalem, Givat Ram, Jerusalem, Israel
- Ina Krahnert
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Potsdam-Golm, Germany
- Hanita Zemach
- Volcani Center, Institute of Plant Sciences, Agricultural Research Organization, Bet Dagan, Israel
- Vlada Bogol
- Volcani Center, Institute of Plant Sciences, Agricultural Research Organization, Bet Dagan, Israel
- Oksana Daron
- Department of Life Sciences, Ben-Gurion University, Beer-Sheva, Israel
- Roni Aloni
- Department of Plant Sciences, Tel Aviv University, Tel Aviv, Israel
- Alisdair R. Fernie
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Potsdam-Golm, Germany
- David Granot
- Volcani Center, Institute of Plant Sciences, Agricultural Research Organization, Bet Dagan, Israel
|
31
|
Li H, Joh YS, Kim H, Paek E, Lee SW, Hwang KB. Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification. BMC Genomics 2016; 17:1031. [PMID: 28155652 PMCID: PMC5259817 DOI: 10.1186/s12864-016-3327-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/16/2022] Open
Abstract
Background: Proteogenomics is a promising approach for various tasks ranging from gene annotation to cancer research. Databases for proteogenomic searches are often constructed by adding peptide sequences inferred from genomic or transcriptomic evidence to reference protein sequences. Such inflation of databases has the potential to identify novel peptides. However, it also raises concerns about sensitive and reliable peptide identification. Spurious peptides included in target databases may result in an underestimated false discovery rate (FDR). On the other hand, inflation of decoy databases could decrease the sensitivity of peptide identification due to the increased number of high-scoring random hits. Although several studies have addressed these issues, widely applicable guidelines for sensitive and reliable proteogenomic search have hardly been available. Results: To systematically evaluate the effect of database inflation in proteogenomic searches, we constructed a variety of real and simulated proteogenomic databases for yeast and human tandem mass spectrometry (MS/MS) data, respectively. Against these databases, we tested two popular database search tools with various approaches to search result validation: the target-decoy search strategy (with and without a refined scoring-metric) and a mixture model-based method. The effect of separate filtering of known and novel peptides was also examined. The results from real and simulated proteogenomic searches confirmed that separate filtering increases the sensitivity and reliability in proteogenomic search. However, no one method consistently identified the largest (or the smallest) number of novel peptides from real proteogenomic searches. Conclusions: We propose to use a set of search result validation methods with separate filtering, for sensitive and reliable identification of peptides in proteogenomic search.
Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-3327-5) contains supplementary material, which is available to authorized users.
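To make the idea of target-decoy validation with separate filtering concrete, here is a minimal Python sketch. It estimates FDR at a score threshold as the number of decoy hits divided by the number of target hits, then applies the threshold search separately to known and novel peptides. The `PSM` class, its field names, and the simple decoys/targets estimator are assumptions made for illustration; they are not the scoring metrics or tools evaluated in the study.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """A peptide-spectrum match (hypothetical minimal record)."""
    score: float    # assumed: higher score means a better match
    is_decoy: bool  # True if the hit is against the decoy database
    is_novel: bool  # True if the peptide comes from the genomic/transcriptomic extension


def fdr_at(psms, threshold):
    """Estimate FDR as decoys / targets among PSMs scoring >= threshold."""
    targets = sum(1 for p in psms if p.score >= threshold and not p.is_decoy)
    decoys = sum(1 for p in psms if p.score >= threshold and p.is_decoy)
    if targets == 0:
        return float("inf") if decoys else 0.0
    return decoys / targets


def accepted_targets(psms, max_fdr=0.01):
    """Target PSMs above the lowest score threshold with estimated FDR <= max_fdr."""
    best = None
    for t in sorted({p.score for p in psms}, reverse=True):
        if fdr_at(psms, t) <= max_fdr:
            best = t
    if best is None:
        return []
    return [p for p in psms if p.score >= best and not p.is_decoy]


def separate_filtering(psms, max_fdr=0.01):
    """Apply the FDR filter to known and novel peptides independently."""
    known = [p for p in psms if not p.is_novel]
    novel = [p for p in psms if p.is_novel]
    return accepted_targets(known, max_fdr), accepted_targets(novel, max_fdr)
```

Filtering the two classes independently means that the score threshold for novel peptides is set by decoy hits within the novel class only, rather than being diluted by the larger pool of known peptides.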
Affiliation(s)
- Honglan Li
- School of Computer Science and Engineering, Soongsil University, Seoul, 06978, Republic of Korea
- Yoon Sung Joh
- Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
- Hyunwoo Kim
- Scientific Data Research Center, Korea Institute of Science and Technology Information, Daejeon, 34141, Republic of Korea
- Eunok Paek
- Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
- Sang-Won Lee
- Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul, 02841, Republic of Korea
- Kyu-Baek Hwang
- School of Computer Science and Engineering, Soongsil University, Seoul, 06978, Republic of Korea.
|
32
|
Rost B, Radivojac P, Bromberg Y. Protein function in precision medicine: deep understanding with machine learning. FEBS Lett 2016; 590:2327-41. [PMID: 27423136 PMCID: PMC5937700 DOI: 10.1002/1873-3468.12307] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 06/27/2016] [Revised: 07/12/2016] [Accepted: 07/12/2016] [Indexed: 12/21/2022]
Abstract
Precision medicine and personalized health efforts propose leveraging complex molecular, medical and family history, along with other types of personal data toward better life. We argue that this ambitious objective will require advanced and specialized machine learning solutions. Simply skimming some low-hanging results off the data wealth might have limited potential. Instead, we need to better understand all parts of the system to define medically relevant causes and effects: how do particular sequence variants affect particular proteins and pathways? How do these effects, in turn, cause the health or disease-related phenotype? Toward this end, deeper understanding will not simply diffuse from deeper machine learning, but from more explicit focus on understanding protein function, context-specific protein interaction networks, and impact of variation on both.
Affiliation(s)
- Burkhard Rost
- Department of Informatics and Bioinformatics, Institute for Advanced Studies, Technical University of Munich, Garching, Germany
- Predrag Radivojac
- School of Informatics and Computing, Indiana University, Bloomington, IN, USA
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
|
33
|
da Costa JP, Rocha-Santos T, Duarte AC. Analytical tools to assess aging in humans: The rise of geri-omics. Trends Analyt Chem 2016. [DOI: 10.1016/j.trac.2015.09.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023]
|
34
|
Meng Y, Yu D, Xue J, Lu J, Feng S, Shen C, Wang H. A transcriptome-wide, organ-specific regulatory map of Dendrobium officinale, an important traditional Chinese orchid herb. Sci Rep 2016; 6:18864. [PMID: 26732614 PMCID: PMC4702150 DOI: 10.1038/srep18864] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 08/19/2015] [Accepted: 11/30/2015] [Indexed: 02/07/2023]
Abstract
Dendrobium officinale is an important traditional Chinese herb. Here, we did a transcriptome-wide, organ-specific study on this valuable plant by combining RNA, small RNA (sRNA) and degradome sequencing. RNA sequencing of four organs (flower, root, leaf and stem) of Dendrobium officinale enabled us to obtain 536,558 assembled transcripts, from which 2,645, 256, 42 and 54 were identified to be highly expressed in the four organs respectively. Based on sRNA sequencing, 2,038, 2, 21 and 24 sRNAs were identified to be specifically accumulated in the four organs respectively. A total of 1,047 mature microRNA (miRNA) candidates were detected. Based on secondary structure predictions and sequencing, tens of potential miRNA precursors were identified from the assembled transcripts. Interestingly, phase-distributed sRNAs with degradome-based processing evidence were discovered on the long-stem structures of two precursors. Target identification was performed for the 1,047 miRNA candidates, resulting in the discovery of 1,257 miRNA-target pairs. Finally, some biologically meaningful subnetworks involving hormone signaling, development, secondary metabolism and Argonaute 1-related regulation were established. All of the sequencing data sets are available at NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra/). In summary, our study provides a valuable resource for in-depth molecular and functional studies on this important Chinese orchid herb.
Affiliation(s)
- Yijun Meng
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China; Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou 310036, China
- Dongliang Yu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China
- Jie Xue
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China; Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou 310036, China
- Jiangjie Lu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China; Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou 310036, China
- Shangguo Feng
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China; Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou 310036, China
- Chenjia Shen
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China; Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou 310036, China
- Huizhong Wang
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, PR China; Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou 310036, China
|
35
|
Boutet E, Lieberherr D, Tognolli M, Schneider M, Bansal P, Bridge AJ, Poux S, Bougueleret L, Xenarios I. UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol Biol 2016; 1374:23-54. [PMID: 26519399 DOI: 10.1007/978-1-4939-3167-5_2] [Citation(s) in RCA: 515] [Impact Index Per Article: 57.2] [Indexed: 05/08/2023]
Abstract
The Universal Protein Resource (UniProt, http://www.uniprot.org) consortium is an initiative of the SIB Swiss Institute of Bioinformatics (SIB), the European Bioinformatics Institute (EBI) and the Protein Information Resource (PIR) to provide the scientific community with a central resource for protein sequences and functional information. The UniProt consortium maintains the UniProt KnowledgeBase (UniProtKB), updated every 4 weeks, and several supplementary databases including the UniProt Reference Clusters (UniRef) and the UniProt Archive (UniParc). The Swiss-Prot section of the UniProt KnowledgeBase (UniProtKB/Swiss-Prot) contains publicly available expertly manually annotated protein sequences obtained from a broad spectrum of organisms. Plant protein entries are produced in the frame of the Plant Proteome Annotation Program (PPAP), with an emphasis on characterized proteins of Arabidopsis thaliana and Oryza sativa. High level annotations provided by UniProtKB/Swiss-Prot are widely used to predict annotation of newly available proteins through automatic pipelines. The purpose of this chapter is to present a guided tour of a UniProtKB/Swiss-Prot entry. We will also present some of the tools and databases that are linked to each entry.
Affiliation(s)
- Emmanuel Boutet
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Damien Lieberherr
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Michael Tognolli
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Michel Schneider
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Parit Bansal
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Alan J Bridge
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Sylvain Poux
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Lydie Bougueleret
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- Ioannis Xenarios
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, rue Michel Servet 1, CH-1211, Geneva 4, Switzerland
- University of Lausanne, CIG, Lausanne, 1015, Switzerland
|
36
|
Dai SX, Li GH, Gao YD, Huang JF. Pharmacophore-Map-Pick: A Method to Generate Pharmacophore Models for All Human GPCRs. Mol Inform 2015; 35:81-91. [PMID: 27491793 DOI: 10.1002/minf.201500075] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/18/2015] [Accepted: 09/21/2015] [Indexed: 01/04/2023]
Abstract
GPCR-based drug discovery is hindered by a lack of effective screening methods for most GPCRs that have neither ligands nor high-quality structures. With the aim to identify lead molecules for these GPCRs, we developed a new method called Pharmacophore-Map-Pick to generate pharmacophore models for all human GPCRs. The model of ADRB2 generated using this method not only predicts the binding mode of ADRB2-ligands correctly but also performs well in virtual screening. Findings also demonstrate that this method is powerful for generating high-quality pharmacophore models. The average enrichment for the pharmacophore models of the 15 targets in different GPCR families reached 15-fold at 0.5 % false-positive rate. Therefore, the pharmacophore models can be applied in virtual screening directly with no requirement for any ligand information or shape constraints. A total of 2386 pharmacophore models for 819 different GPCRs (99 % coverage (819/825)) were generated and are available at http://bsb.kiz.ac.cn/GPCRPMD.
Affiliation(s)
- Shao-Xing Dai
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, P. R. China (phone/fax: +86 087165199200/+86 087165199200); Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Gong-Hua Li
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, P. R. China (phone/fax: +86 087165199200/+86 087165199200); Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Yue-Dong Gao
- Kunming Biological Diversity Regional Center of Instruments, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, P. R. China
- Jing-Fei Huang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, P. R. China (phone/fax: +86 087165199200/+86 087165199200); Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100049, P. R. China; Kunming Institute of Zoology - Chinese University of Hongkong Joint Research Center for Bio-resources and Human Disease Mechanisms, Kunming 650223, P. R. China
|
37
|
Sengupta A, Grover M, Chakraborty A, Saxena S. HEPNet: A Knowledge Base Model of Human Energy Pool Network for Predicting the Energy Availability Status of an Individual. PLoS One 2015; 10:e0127918. [PMID: 26053019 PMCID: PMC4460090 DOI: 10.1371/journal.pone.0127918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/04/2014] [Accepted: 04/20/2015] [Indexed: 11/18/2022]
Abstract
HEPNet is an electronic representation of metabolic reactions occurring within human cellular organization focusing on inflow and outflow of the energy currency ATP, GTP and other energy associated moieties. The backbone of HEPNet consists of primary bio-molecules such as carbohydrates, proteins and fats which ultimately constitute the chief source for the synthesis and obliteration of energy currencies in a cell. A series of biochemical pathways and reactions constituting the catabolism and anabolism of various metabolites are portrayed through cellular compartmentalization. The depicted pathways function synchronously toward an overarching goal of producing ATP and other energy associated moieties to bring into play a variety of cellular functions. HEPNet is manually curated with raw data from experiments and is also connected to KEGG and Reactome databases. This model has been validated by simulating it with physiological states like fasting, starvation, exercise and disease conditions like glycaemia, uremia and dihydrolipoamide dehydrogenase deficiency (DLDD). The results clearly indicate that ATP is the master regulator under different metabolic conditions and physiological states. The results also highlight that energy currencies play a minor role. However, the moiety creatine phosphate has a unique character, since it is a ready-made source of phosphoryl groups for the rapid synthesis of ATP from ADP. HEPNet provides a framework for further expanding the network to diverse age groups of both sexes, followed by the understanding of energetics in more complex metabolic pathways that are related to human disorders.
Affiliation(s)
- Abhishek Sengupta
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, U.P., India
- Monendra Grover
- Centre for Agricultural Bioinformatics (CABin), Indian Agricultural Statistics Research Institute (IASRI), ICAR, New Delhi, India
- Amlan Chakraborty
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, U.P., India
- Sarika Saxena
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, U.P., India
|
38
|
Hooper CM, Tanz SK, Castleden IR, Vacher MA, Small ID, Millar AH. SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome. Bioinformatics 2014; 30:3356-64. [PMID: 25150248 DOI: 10.1093/bioinformatics/btu550] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Indexed: 12/11/2022]
Abstract
MOTIVATION: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory. RESULTS: To help unify these multiple data sources, we have developed the SUBcellular Arabidopsis consensus (SUBAcon) algorithm, a naive Bayes classifier that integrates 22 computational prediction algorithms, experimental GFP and MS localizations, protein-protein interaction and co-expression data to derive a consensus call and probability. SUBAcon classifies protein location in Arabidopsis more accurately than single predictors. AVAILABILITY: SUBAcon is a useful tool for recovering proteome-wide subcellular locations of Arabidopsis proteins and is displayed in the SUBA3 database (http://suba.plantenergy.uwa.edu.au). The source code and input data are available through the SUBA3 server (http://suba.plantenergy.uwa.edu.au//SUBAcon.html) and the Arabidopsis SUbproteome REference (ASURE) training set can be accessed using the ASURE web portal (http://suba.plantenergy.uwa.edu.au/ASURE).
Affiliation(s)
- Cornelia M Hooper
- Centre of Excellence in Computational Systems Biology, The University of Western Australia, Perth, WA 6009, Australia and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia
- Sandra K Tanz
- Centre of Excellence in Computational Systems Biology, The University of Western Australia, Perth, WA 6009, Australia and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia
- Ian R Castleden
- Centre of Excellence in Computational Systems Biology, The University of Western Australia, Perth, WA 6009, Australia and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia
- Michael A Vacher
- Centre of Excellence in Computational Systems Biology, The University of Western Australia, Perth, WA 6009, Australia and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia
- Ian D Small
- Centre of Excellence in Computational Systems Biology, The University of Western Australia, Perth, WA 6009, Australia and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia
- A Harvey Millar
- Centre of Excellence in Computational Systems Biology, The University of Western Australia, Perth, WA 6009, Australia and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia
|
39
|
Ranking biomedical annotations with annotator's semantic relevancy. Comput Math Methods Med 2014; 2014:258929. [PMID: 24899918 PMCID: PMC4037603 DOI: 10.1155/2014/258929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2014] [Accepted: 04/09/2014] [Indexed: 11/17/2022]
Abstract
Biomedical annotation is a common and effective artifact for researchers to discuss, show opinion, and share discoveries. It is becoming increasingly popular in many online research communities, and implies much useful information. Ranking biomedical annotations is a critical problem for data users to efficiently get information. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate the knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second way is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes and merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.
|
40
|
Lohse M, Nagel A, Herter T, May P, Schroda M, Zrenner R, Tohge T, Fernie AR, Stitt M, Usadel B. Mercator: a fast and simple web server for genome scale functional annotation of plant sequence data. Plant Cell Environ 2014; 37:1250-8. [PMID: 24237261 DOI: 10.1111/pce.12231] [Citation(s) in RCA: 401] [Impact Index Per Article: 36.5] [Received: 08/01/2013] [Revised: 10/23/2013] [Accepted: 10/28/2013] [Indexed: 05/18/2023]
Abstract
Next-generation technologies generate an overwhelming amount of gene sequence data. Efficient annotation tools are required to make these data amenable to functional genomics analyses. The Mercator pipeline automatically assigns functional terms to protein or nucleotide sequences. It uses the MapMan 'BIN' ontology, which is tailored for functional annotation of plant 'omics' data. The classification procedure performs parallel sequence searches against reference databases, compiles the results and computes the most likely MapMan BINs for each query. In the current version, the pipeline relies on manually curated reference classifications originating from the three reference organisms (Arabidopsis, Chlamydomonas, rice), various other plant species that have a reviewed SwissProt annotation, and more than 2000 protein domain and family profiles at InterPro, CDD and KOG. Functional annotations predicted by Mercator achieve accuracies above 90% when benchmarked against manual annotation. In addition to mapping files for direct use in the visualization software MapMan, Mercator provides graphical overview charts, detailed annotation information in a convenient web browser interface and a MapMan-to-GO translation table to export results as GO terms. Mercator is available free of charge via http://mapman.gabipd.org/web/guest/app/Mercator.
Affiliation(s)
- Marc Lohse
- Max-Planck-Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
|
41
|
Binder JX, Pletscher-Frankild S, Tsafou K, Stolte C, O'Donoghue SI, Schneider R, Jensen LJ. COMPARTMENTS: unification and visualization of protein subcellular localization evidence. Database (Oxford) 2014; 2014:bau012. [PMID: 24573882 PMCID: PMC3935310 DOI: 10.1093/database/bau012] [Citation(s) in RCA: 424] [Impact Index Per Article: 38.5] [Indexed: 12/21/2022]
Abstract
Information on protein subcellular localization is important to understand the cellular functions of proteins. Currently, such information is manually curated from the literature, obtained from high-throughput microscopy-based screens and predicted from primary sequence. To get a comprehensive view of the localization of a protein, it is thus necessary to consult multiple databases and prediction tools. To address this, we present the COMPARTMENTS resource, which integrates all sources listed above as well as the results of automatic text mining. The resource is automatically kept up to date with source databases, and all localization evidence is mapped onto common protein identifiers and Gene Ontology terms. We further assign confidence scores to the localization evidence to facilitate comparison of different types and sources of evidence. To further improve the comparability, we assign confidence scores based on the type and source of the localization evidence. Finally, we visualize the unified localization evidence for a protein on a schematic cell to provide a simple overview. Database URL: http://compartments.jensenlab.org
Affiliation(s)
- Janos X Binder
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany, Bioinformatics Core Facility, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4362 Esch-sur-Alzette, Luxembourg, Department of Disease Systems Biology, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark, CSIRO Computational Informatics, Sydney, NSW 2113 Australia and Garvan Institute of Medical Research, Sydney, NSW 2100, Australia
|
42
|
Abstract
Prediction of proteasomal cleavage sites has been a focus of computational biology. To date, the predictive methods are mostly based on nonlinear classifiers and variables with little physicochemical meaning. In this paper, the physicochemical properties of 14 residues both upstream and downstream of a cleavage site are characterized by VHSE (principal component score vector of hydrophobic, steric, and electronic properties) descriptors. Then, the resulting VHSE descriptors are employed to construct prediction models by support vector machine (SVM). For both in vivo and in vitro datasets, the performance of the VHSE-based method is comparatively better than that of the well-known PAProC, MAPPP, and NetChop methods. The results reveal that the hydrophobic property of 10 residues both upstream and downstream of the cleavage site is a dominant factor affecting in vivo and in vitro cleavage specificities, followed by the residue’s electronic and steric properties. Furthermore, the difference in hydrophobic potential between residues flanking the cleavage site is proposed to favor substrate cleavages. Overall, the interpretable VHSE-based method provides a preferable way to predict proteasomal cleavage sites.
|
43
|
Chandrakar B, Jain A, Roy S, Gutlapalli VR, Saraf S, Suppahia A, Verma A, Tiwari A, Yadav M, Nayarisseri A. Molecular modeling of Acetyl-CoA carboxylase (ACC) from Jatropha curcas and virtual screening for identification of inhibitors. J Pharm Res 2013. [DOI: 10.1016/j.jopr.2013.07.032] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
|
44
|
Tanz SK, Castleden I, Hooper CM, Vacher M, Small I, Millar HA. SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res 2013; 41:D1185-91. [PMID: 23180787 PMCID: PMC3531127 DOI: 10.1093/nar/gks1151] [Citation(s) in RCA: 236] [Impact Index Per Article: 19.7] [Received: 09/07/2012] [Revised: 10/24/2012] [Accepted: 10/25/2012] [Indexed: 12/27/2022]
Abstract
The subcellular location database for Arabidopsis proteins (SUBA3, http://suba.plantenergy.uwa.edu.au) combines manual literature curation of large-scale subcellular proteomics, fluorescent protein visualization and protein-protein interaction (PPI) datasets with subcellular targeting calls from 22 prediction programs. More than 14 500 new experimental locations have been added since its first release in 2007. Overall, nearly 650 000 new calls of subcellular location for 35 388 non-redundant Arabidopsis proteins are included (almost six times the information in the previous SUBA version). A re-designed interface makes the SUBA3 site more intuitive and easier to use than earlier versions and provides powerful options to search for PPIs within the context of cell compartmentation. SUBA3 also includes detailed localization information for reference organelle datasets and incorporates green fluorescent protein (GFP) images for many proteins. To determine as objectively as possible where a particular protein is located, we have developed SUBAcon, a Bayesian approach that incorporates experimental localization and targeting prediction data to best estimate a protein's location in the cell. The probabilities of subcellular location for each protein are provided and displayed as a pictographic heat map of a plant cell in SUBA3.
Affiliation(s)
- Sandra K. Tanz
- Centre of Excellence in Computational Systems Biology, ARC Centre of Excellence in Plant Energy Biology and Centre for Comparative Analysis on Biomolecular Networks (CABiN), The University of Western Australia, Perth, WA 6009, Australia
- Ian Castleden
- Centre of Excellence in Computational Systems Biology, ARC Centre of Excellence in Plant Energy Biology and Centre for Comparative Analysis on Biomolecular Networks (CABiN), The University of Western Australia, Perth, WA 6009, Australia
- Cornelia M. Hooper
- Centre of Excellence in Computational Systems Biology, ARC Centre of Excellence in Plant Energy Biology and Centre for Comparative Analysis on Biomolecular Networks (CABiN), The University of Western Australia, Perth, WA 6009, Australia
- Michael Vacher
- Centre of Excellence in Computational Systems Biology, ARC Centre of Excellence in Plant Energy Biology and Centre for Comparative Analysis on Biomolecular Networks (CABiN), The University of Western Australia, Perth, WA 6009, Australia
- Ian Small
- Centre of Excellence in Computational Systems Biology, ARC Centre of Excellence in Plant Energy Biology and Centre for Comparative Analysis on Biomolecular Networks (CABiN), The University of Western Australia, Perth, WA 6009, Australia
- Harvey A. Millar
- Centre of Excellence in Computational Systems Biology, ARC Centre of Excellence in Plant Energy Biology and Centre for Comparative Analysis on Biomolecular Networks (CABiN), The University of Western Australia, Perth, WA 6009, Australia
|
45
|
Parage C, Tavares R, Réty S, Baltenweck-Guyot R, Poutaraud A, Renault L, Heintz D, Lugan R, Marais GA, Aubourg S, Hugueney P. Structural, functional, and evolutionary analysis of the unusually large stilbene synthase gene family in grapevine. Plant Physiol 2012; 160:1407-19. [PMID: 22961129 PMCID: PMC3490603 DOI: 10.1104/pp.112.202705] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Received: 07/02/2012] [Accepted: 08/30/2012] [Indexed: 05/04/2023]
Abstract
Stilbenes are a small family of phenylpropanoids produced in a number of unrelated plant species, including grapevine (Vitis vinifera). In addition to their participation in defense mechanisms in plants, stilbenes, such as resveratrol, display important pharmacological properties and are postulated to be involved in the health benefits associated with a moderate consumption of red wine. Stilbene synthases (STSs), which catalyze the biosynthesis of the stilbene backbone, seem to have evolved from chalcone synthases (CHSs) several times independently in stilbene-producing plants. STS genes usually form small families of two to five closely related paralogs. By contrast, the sequence of grapevine reference genome (cv PN40024) has revealed an unusually large STS gene family. Here, we combine molecular evolution and structural and functional analyses to investigate further the high number of STS genes in grapevine. Our reannotation of the STS and CHS gene families yielded 48 STS genes, including at least 32 potentially functional ones. Functional characterization of nine genes representing most of the STS gene family diversity clearly indicated that these genes do encode for proteins with STS activity. Evolutionary analysis of the STS gene family revealed that both STS and CHS evolution are dominated by purifying selection, with no evidence for strong selection for new functions among STS genes. However, we found a few sites under different selection pressures in CHS and STS sequences, whose potential functional consequences are discussed using a structural model of a typical STS from grapevine that we developed.
Affiliation(s)
- Stéphane Réty
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
- Raymonde Baltenweck-Guyot
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Anne Poutaraud
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Lauriane Renault
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Dimitri Heintz
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Raphaël Lugan
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Gabriel A.B. Marais
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Sébastien Aubourg
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| | - Philippe Hugueney
- Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1131 Santé de la Vigne et Qualité du Vin, F–68021 Colmar, France (C.P., R.B.-G., A.P., L.R., P.H.); Centre National de la Recherche Scientifique, Université Lyon 1, Unité Mixte de Recherche 5558 Laboratoire de Biométrie et Biologie Evolutive, F–69622 Villeurbanne, France (R.T., G.A.B.M.); Institut National de la Recherche Agronomique, Unité Mixte de Recherche 1165 Unité de Recherche en Génomique Végétale, Université d’Evry-Val-d’Essonne, Equipe de Recherche Labellisée 8196 Centre National de la Recherche Scientifique, F–91057 Evry, France (S.A.); Centre National de la Recherche Scientifique, Unité Propre de Recherche 2357 Institut de Biologie Moléculaire des Plantes, F–67084 Strasbourg, France (D.H., R.L.); Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8015 Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, Faculte de Pharmacie, Université Paris Descartes, F–75270 Paris, France (S.R.); Université de Strasbourg, F–67081 Strasbourg, France (C.P., R.B.-G., A.P., L.R., D.H., R.L., P.H.); Instituto Gulbenkian de Ciência, P–2780–156 Oeiras, Portugal (R.T., G.A.B.M.)
| |
Collapse
|
46
|
Hakeem KR, Chandna R, Ahmad P, Iqbal M, Ozturk M. Relevance of Proteomic Investigations in Plant Abiotic Stress Physiology. OMICS: A Journal of Integrative Biology 2012; 16:621-35. [DOI: 10.1089/omi.2012.0041] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023]
Affiliation(s)
- Khalid Rehman Hakeem
- Molecular Ecology Laboratory, Department of Botany, Jamia Hamdard, New Delhi, India
- Ruby Chandna
- Molecular Ecology Laboratory, Department of Botany, Jamia Hamdard, New Delhi, India
- Parvaiz Ahmad
- Department of Botany, Amar Singh College, University of Kashmir, Srinagar, India
- Muhammad Iqbal
- Molecular Ecology Laboratory, Department of Botany, Jamia Hamdard, New Delhi, India
- Munir Ozturk
- Department of Botany, Ege University, Bornova, Izmir, Turkey
47
|
Palmieri MC, Perazzolli M, Matafora V, Moretto M, Bachi A, Pertot I. Proteomic analysis of grapevine resistance induced by Trichoderma harzianum T39 reveals specific defence pathways activated against downy mildew. Journal of Experimental Botany 2012; 63:6237-51. [PMID: 23105132 PMCID: PMC3481215 DOI: 10.1093/jxb/ers279] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 05/18/2023]
Abstract
Downy mildew is caused by the oomycete Plasmopara viticola and is one of the most serious diseases of grapevine. The beneficial microorganism Trichoderma harzianum T39 (T39) has previously been shown to induce plant-mediated resistance and to reduce the severity of downy mildew in susceptible grapevines. In order to better understand the cellular processes associated with T39-induced resistance, the proteomic and histochemical changes activated by T39 in grapevine were investigated before and 1 day after P. viticola inoculation. A comprehensive proteomic analysis of T39-induced resistance in grapevine was performed using an eight-plex iTRAQ protocol, resulting in the identification and quantification of a total of 800 proteins. Most of the proteins directly affected by T39 were found to be involved in signal transduction, indicating activation of a complete microbial recognition machinery. Moreover, T39-induced resistance was associated with rapid accumulation of reactive oxygen species and callose at infection sites, as well as changes in abundance of proteins involved in response to stress and redox balance, indicating an active defence response to downy mildew. On the other hand, proteins affected by P. viticola in control plants mainly decreased in abundance, possibly reflecting the establishment of a compatible interaction. Finally, the high-throughput iTRAQ protocol allowed de novo peptide sequencing, which will be used to improve annotation of the Vitis vinifera cv. Pinot Noir proteome.
Affiliation(s)
- Maria Cristina Palmieri
- IASMA Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010 San Michele all’Adige, Trento, Italy
- Michele Perazzolli
- IASMA Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010 San Michele all’Adige, Trento, Italy
- * To whom correspondence should be addressed. E-mail:
- Vittoria Matafora
- Biological Mass Spectrometry Unit DIBIT, San Raffaele Scientific Institute, via Olgettina 58, 20132 Milano, Italy
- Marco Moretto
- IASMA Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010 San Michele all’Adige, Trento, Italy
- Angela Bachi
- Biological Mass Spectrometry Unit DIBIT, San Raffaele Scientific Institute, via Olgettina 58, 20132 Milano, Italy
- Ilaria Pertot
- IASMA Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010 San Michele all’Adige, Trento, Italy
48
|
Hamp T, Rost B. Alternative protein-protein interfaces are frequent exceptions. PLoS Comput Biol 2012; 8:e1002623. [PMID: 22876170 PMCID: PMC3410849 DOI: 10.1371/journal.pcbi.1002623] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 10/28/2011] [Accepted: 06/11/2012] [Indexed: 11/18/2022] Open
Abstract
The intricate molecular details of protein-protein interactions (PPIs) are crucial for function. Therefore, measuring the same interacting protein pair again, we expect the same result. This work measured the similarity in the molecular details of interaction for the same and for homologous protein pairs between different experiments. All scores analyzed suggested that different experiments often find exceptions in the interfaces of similar PPIs: up to 22% of all comparisons revealed some differences even for sequence-identical pairs of proteins. The corresponding number for pairs of close homologs reached 68%. Conversely, the interfaces differed entirely for 12-29% of all comparisons. All these estimates were calculated after redundancy reduction. The magnitude of interface differences ranged from subtle to the extreme, as illustrated by a few examples. An extreme case was a change of the interacting domains between two observations of the same biological interaction. One reason for different interfaces was the number of copies of an interaction in the same complex: the probability of observing alternative binding modes increases with the number of copies. Even after removing the special cases with alternative hetero-interfaces to the same homomer, a substantial variability remained. Our results strongly support the surprising notion that there are many alternative solutions to make the intricate molecular details of PPIs crucial for function.
Affiliation(s)
- Tobias Hamp
- TUM, Bioinformatik - I12, Informatik, Garching, Germany
- Burkhard Rost
- TUM, Bioinformatik - I12, Informatik, Garching, Germany
- Institute of Advanced Study (IAS), TUM, Garching, Germany
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- * E-mail:
49
|
Ciccarese P, Ocana M, Clark T. Open semantic annotation of scientific publications using DOMEO. J Biomed Semantics 2012; 3 Suppl 1:S1. [PMID: 22541592 PMCID: PMC3337259 DOI: 10.1186/2041-1480-3-s1-s1] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Open
Abstract
Background: Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. Methods: The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies. Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. Results: We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF’s next full release. Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework.
Affiliation(s)
- Paolo Ciccarese
- Harvard Medical School and Massachusetts General Hospital, Boston MA, USA.
50
|
Buschiazzo E, Ritland C, Bohlmann J, Ritland K. Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms. BMC Evol Biol 2012; 12:8. [PMID: 22264329 PMCID: PMC3328258 DOI: 10.1186/1471-2148-12-8] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.8] [Received: 07/28/2011] [Accepted: 01/20/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Comparative genomics can inform us about the processes of mutation and selection across diverse taxa. Among seed plants, gymnosperms have been lacking in genomic comparisons. Recent EST and full-length cDNA collections for two conifers, Sitka spruce (Picea sitchensis) and loblolly pine (Pinus taeda), together with full genome sequences for two angiosperms, Arabidopsis thaliana and poplar (Populus trichocarpa), offer an opportunity to infer the evolutionary processes underlying thousands of orthologous protein-coding genes in gymnosperms compared with an angiosperm orthologue set. RESULTS: Based upon pairwise comparisons of 3,723 spruce and pine orthologues, we found an average synonymous genetic distance (dS) of 0.191, and an average dN/dS ratio of 0.314. Using a fossil-established divergence time of 140 million years between spruce and pine, we extrapolated a nucleotide substitution rate of 0.68 × 10⁻⁹ synonymous substitutions per site per year. When compared to angiosperms, this indicates a dramatically slower rate of nucleotide substitution rates in conifers: on average 15-fold. Coincidentally, we found a three-fold higher dN/dS for the spruce-pine lineage compared to the poplar-Arabidopsis lineage. This joint occurrence of a slower evolutionary rate in conifers with higher dN/dS, and possibly positive selection, showcases the uniqueness of conifer genome evolution. CONCLUSIONS: Our results are in line with documented reduced nucleotide diversity, conservative genome evolution and low rates of diversification in conifers on the one hand and numerous examples of local adaptation in conifers on the other hand. We propose that reduced levels of nucleotide mutation in large and long-lived conifer trees, coupled with large effective population size, were the main factors leading to slow substitution rates but retention of beneficial mutations.
Affiliation(s)
- Emmanuel Buschiazzo
- Department of Forest Sciences, University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada.