1
|
The Molecular Toolset and Techniques Required to Build Cyanobacterial Cell Factories. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2022. [DOI: 10.1007/10_2022_210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
2
|
Genetic, Genomics, and Responses to Stresses in Cyanobacteria: Biotechnological Implications. Genes (Basel) 2021; 12:genes12040500. [PMID: 33805386 PMCID: PMC8066212 DOI: 10.3390/genes12040500] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/26/2021] [Revised: 03/25/2021] [Accepted: 03/25/2021] [Indexed: 02/07/2023]
Abstract
Cyanobacteria are highly diverse, environmentally crucial photosynthetic prokaryotes of great interest for basic and applied science. Work to date has focused mostly on the three non-nitrogen-fixing unicellular species Synechocystis PCC 6803, Synechococcus PCC 7942, and Synechococcus PCC 7002, which were selected for the genetic and physiological features summarized in this review. Extensive "omics" data sets have been generated, and genome-scale models (GSM) have been developed for the rational engineering of these cyanobacteria for biotechnological purposes. We discuss what should be done to improve our understanding of the genotype-phenotype relationships of these models and to generate robust and predictive models of their metabolism. Furthermore, we emphasize that because Synechocystis PCC 6803, Synechococcus PCC 7942, and Synechococcus PCC 7002 represent only a limited part of the wide biodiversity of cyanobacteria, other species distantly related to these three models should also be studied. Finally, we highlight the need to strengthen communication between academic researchers, who know cyanobacteria well and can engineer them for biotechnological purposes but have limited access to large photobioreactors, and industrial partners, who attempt to use natural or engineered cyanobacteria to produce chemicals of interest at reasonable cost but may lack knowledge of cyanobacterial physiology and metabolism.
|
3
|
Yan Z, Shen Z, Li Z, Chao Q, Kong L, Gao ZF, Li QW, Zheng HY, Zhao CF, Lu CM, Wang YW, Wang BC. Genome-wide transcriptome and proteome profiles indicate an active role of alternative splicing during de-etiolation of maize seedlings. PLANTA 2020; 252:60. [PMID: 32964359 DOI: 10.1007/s00425-020-03464-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2020] [Accepted: 09/12/2020] [Indexed: 06/11/2023]
Abstract
Alternative splicing (AS) events alter the domain composition of encoded proteins and allow a single gene to produce several proteins, so that a fixed number of genes can meet the demands of establishing photosynthesis during de-etiolation. The drastic switch from skotomorphogenic to photomorphogenic development is an excellent system to elucidate rapid developmental responses to environmental stimuli in plants. To decipher the effects of different light wavelengths on de-etiolation, we illuminated etiolated maize seedlings with blue, red, blue-red mixed, and white light. We found that blue light alone has the strongest effect on photomorphogenesis and that this effect can be attributed to the higher number and expression levels of photosynthesis and chlorosynthesis proteins. Deep sequencing-based transcriptome analysis revealed gene expression changes under different light treatments and a genome-wide alteration in AS profiles. We discovered 41,188 novel transcript isoforms for annotated genes, which increases the percentage of multi-exon genes with AS to 63% in maize. We provide peptide support for all defined types of AS, especially retained introns. Further in silico prediction revealed that 58.2% of retained introns have changes in domains compared with their most similar annotated protein isoform. This suggests that AS acts as a protein function switch allowing rapid light response through the addition or removal of functional domains. The richness of novel transcripts and protein isoforms also demonstrates the potential and importance of integrating proteomics into genome annotation in maize.
Affiliation(s)
- Zhen Yan
- Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
- University of Chinese Academy of Sciences, 100049, Beijing, China
| | - Zhuo Shen
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, 510640, China
| | - Zhe Li
- Precision Scientific (Beijing) Co., Ltd., Beijing, 100085, China
| | - Qing Chao
- Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
- University of Chinese Academy of Sciences, 100049, Beijing, China
- The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, 100039, China
| | - Lei Kong
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences, Peking University, Beijing, 100871, China
| | - Zhi-Fang Gao
- Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
| | - Qing-Wei Li
- Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
| | - Hai-Yan Zheng
- Center for Advanced Biotechnology and Medicine, Biological Mass Spectrometry Facility, Rutgers University, Piscataway, NJ, 08855, USA
| | - Cai-Feng Zhao
- Center for Advanced Biotechnology and Medicine, Biological Mass Spectrometry Facility, Rutgers University, Piscataway, NJ, 08855, USA
| | - Cong-Ming Lu
- State Key Laboratory of Crop Biology, College of Life Sciences, Shandong Agricultural University, Taian, 271018, Shandong, China
| | - Ying-Wei Wang
- Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China.
| | - Bai-Chen Wang
- Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China.
- University of Chinese Academy of Sciences, 100049, Beijing, China.
- The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, 100039, China.
|
4
|
Willforss J, Leonova S, Tillander J, Andreasson E, Marttila S, Olsson O, Chawade A, Levander F. Interactive proteogenomic exploration of response to Fusarium head blight in oat varieties with different resistance. J Proteomics 2020; 218:103688. [PMID: 32061841 DOI: 10.1016/j.jprot.2020.103688] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/18/2019] [Revised: 02/03/2020] [Accepted: 02/12/2020] [Indexed: 11/17/2022]
Abstract
Fusarium species are cereal pathogens that cause the Fusarium Head Blight (FHB) disease. FHB can reduce yield, cause mycotoxin accumulation in the grain and reduce germination efficiency of the harvested seeds. Understanding the biochemical interactions between the host plants and the pathogen is crucial for controlling the disease and for the development of cultivars with improved tolerance to FHB. Here, we studied morphological and proteomic differences between the susceptible oat variety Belinda and the more resistant variety Argamak using variety-specific transcriptome assemblies as references. Measurements of deoxynivalenol toxin levels confirmed the partial resistance in Argamak and the susceptibility in Belinda. To jointly investigate the proteomics- and sequence data, we developed an RShiny-based interface for interactive exploration of the dataset using univariate and multivariate statistics. When applying this interface to the dataset, quantitative protein differences between Belinda and Argamak were detected, and eighteen peptides were found uniquely in Argamak during infection, among them several lipoxygenases. Such proteins can be developed as markers for Fusarium resistance breeding. In conclusion, this study provides the first proteogenomic insight on molecular Fusarium-oat interactions at both morphological and molecular levels and the data are openly available through an interactive interface for further inspection. SIGNIFICANCE: Fusarium head blight causes widespread damage to crops, and chronic and acute toxicity to human and livestock due to the accumulation of toxins during infection. In the present study, two oat varieties with differing resistance were challenged with Fusarium to understand the disease better, and studied both at morphological and molecular levels, identifying proteins which could play a role in the defense mechanism. 
Furthermore, a proteogenomics approach allows joint profiling of expression and sequence level differences to identify potentially functionally differing mutations. Here such analysis is made openly available through an interactive interface which allows other scientists to draw further findings from the data. This study may both serve as a basis for understanding oat disease response and developing breeding markers for Fusarium resistant oat and future proteogenomic studies using the interactive approach described.
Affiliation(s)
- J Willforss
- Department of Immunotechnology, Lund University, Lund, Sweden
| | - S Leonova
- CropTailor AB, c/o Pure and Applied Biochemistry, Department of Chemistry, Lund University, Lund, Sweden
| | - J Tillander
- Department of Immunotechnology, Lund University, Lund, Sweden
| | - E Andreasson
- Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden
| | - S Marttila
- Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden
| | - O Olsson
- CropTailor AB, c/o Pure and Applied Biochemistry, Department of Chemistry, Lund University, Lund, Sweden
| | - A Chawade
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden.
| | - F Levander
- Department of Immunotechnology, Lund University, Lund, Sweden; National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Lund University, Sweden.
|
5
|
Gordon GC, Pfleger BF. Regulatory Tools for Controlling Gene Expression in Cyanobacteria. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2019; 1080:281-315. [PMID: 30091100 PMCID: PMC6662922 DOI: 10.1007/978-981-13-0854-3_12] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/17/2022]
Abstract
Cyanobacteria are attractive hosts for converting carbon dioxide and sunlight into desirable chemical products. To engineer these organisms and manipulate their metabolic pathways, the biotechnology community has developed genetic tools to control gene expression. Many native cyanobacterial promoters and related sequence elements have been used to regulate genes of interest, and heterologous tools that use non-native small molecules to induce gene expression have been demonstrated. Overall, IPTG-based induction systems seem to be leaky and initially demonstrate small dynamic ranges in cyanobacteria. Consequently, a variety of other induction systems have been optimized to enable tighter control of gene expression. Tools require significant optimization because they function quite differently in cyanobacteria when compared to analogous use in model heterotrophs. We hypothesize that these differences are due to fundamental differences in physiology between organisms. This review is not intended to summarize all known products made in cyanobacteria nor the performance (titer, rate, yield) of individual strains, but instead will focus on the genetic tools and the inherent aspects of cellular physiology that influence gene expression in cyanobacteria.
Affiliation(s)
- Gina C Gordon
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI, USA
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, USA
| | - Brian F Pfleger
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI, USA.
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, USA.
|
6
|
Pettersen VK, Steinsland H, Wiker HG. Comparative Proteomics of Enterotoxigenic Escherichia coli Reveals Differences in Surface Protein Production and Similarities in Metabolism. J Proteome Res 2017; 17:325-336. [DOI: 10.1021/acs.jproteome.7b00593] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/24/2022]
Affiliation(s)
- Veronika Kuchařová Pettersen
- The Gade Research Group for Infection and Immunity, Department of Clinical Science; Centre for International Health, Department of Global Public Health and Primary Care; and Department of Biomedicine, University of Bergen, 5021 Bergen, Norway
| | - Hans Steinsland
- The Gade Research Group for Infection and Immunity, Department of Clinical Science; Centre for International Health, Department of Global Public Health and Primary Care; and Department of Biomedicine, University of Bergen, 5021 Bergen, Norway
| | - Harald G. Wiker
- The Gade Research Group for Infection and Immunity, Department of Clinical Science; Centre for International Health, Department of Global Public Health and Primary Care; and Department of Biomedicine, University of Bergen, 5021 Bergen, Norway
|
7
|
Menschaert G, Fenyö D. Proteogenomics from a bioinformatics angle: A growing field. MASS SPECTROMETRY REVIEWS 2017; 36:584-599. [PMID: 26670565 PMCID: PMC6101030 DOI: 10.1002/mas.21483] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 04/20/2015] [Accepted: 09/01/2015] [Indexed: 05/16/2023]
Abstract
Proteogenomics is a research area that combines proteomics and genomics in a multi-omics setup using both mass spectrometry and high-throughput sequencing technologies. Currently, the main goals of the field are to aid genome annotation or to unravel proteome complexity. Mass spectrometry-based identification of matching or homologous peptides can further refine gene models. The identification of novel proteoforms is also made possible by the detection of novel translation initiation sites (cognate or near-cognate), novel transcript isoforms, sequence variation, or novel (small) open reading frames in intergenic or untranslated genic regions through analysis of high-throughput sequencing data from RNA-seq or ribosome profiling experiments. Other proteogenomics studies that combine proteomics and genomics techniques focus on antibody sequencing or the identification of immunogenic peptides or venom peptides. Over the years, a growing number of bioinformatics tools and databases have become available to help streamline these cross-omics studies. Some of these solutions help only in specific steps of a proteogenomics study, e.g. building custom sequence databases (based on next-generation sequencing output) for mass spectrometry fragmentation spectrum matching. Over the last few years a handful of integrative tools have also become available that can execute complete proteogenomics analyses. Some of these are presented as stand-alone solutions, whereas others are implemented in a web-based framework such as Galaxy. In this review we aim to sketch a comprehensive overview of the bioinformatics solutions that are available for this growing research area. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:584-599, 2017.
Affiliation(s)
- Gerben Menschaert
- Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
- To whom correspondence should be addressed. Tel: +32 9 264 99 22; Fax: +32 9 264 6220
| | - David Fenyö
- Center for Health Informatics and Bioinformatics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York, USA
|
8
|
Vogel AIM, Lale R, Hohmann-Marriott MF. Streamlining recombination-mediated genetic engineering by validating three neutral integration sites in Synechococcus sp. PCC 7002. J Biol Eng 2017; 11:19. [PMID: 28592992 PMCID: PMC5458483 DOI: 10.1186/s13036-017-0061-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 12/23/2016] [Accepted: 05/08/2017] [Indexed: 11/17/2022]
Abstract
Background Synechococcus sp. PCC 7002 (henceforth Synechococcus) is developing into a powerful synthetic biology chassis. To streamline the integration of genes into the Synechococcus chromosome, neutral integration sites must be validated and the parameters of the DNA transformation protocol optimized. BioBrick-compatible integration modules are also desirable to further simplify chromosomal integrations. Results We designed three BioBrick-compatible genetic modules, each targeting a separate neutral integration site (A2842, A0935, and A0159), with homologous regions ranging in length from 100 to 800 nt. The performance of the different modules in achieving DNA integration was tested. Our results demonstrate that 100 nt homologous regions are sufficient for inserting a 1 kb DNA fragment into the Synechococcus chromosome. By adapting a transformation protocol from a related cyanobacterium, we shortened the transformation procedure for Synechococcus significantly. Conclusions The optimized transformation protocol reported in this study provides an efficient way to perform genetic engineering in Synechococcus. We demonstrated that homologous regions of 100 nt are sufficient for inserting a 1 kb DNA fragment into the three tested neutral integration sites. Integration at A2842, A0935, and A0159 results in only a minimal fitness cost for the chassis. This study contributes to developing Synechococcus as a prominent chassis for future synthetic biology applications.
Affiliation(s)
- Anne Ilse Maria Vogel
- Department of Biotechnology, PhotoSynLab, NTNU, Norwegian University of Science and Technology, Trondheim, Norway
| | - Rahmi Lale
- Department of Biotechnology, PhotoSynLab, NTNU, Norwegian University of Science and Technology, Trondheim, Norway
|
9
|
Zai X, Yang Q, Liu K, Li R, Qian M, Zhao T, Li Y, Yin Y, Dong D, Fu L, Li S, Xu J, Chen W. A comprehensive proteogenomic study of the human Brucella vaccine strain 104 M. BMC Genomics 2017; 18:402. [PMID: 28535754 PMCID: PMC5442703 DOI: 10.1186/s12864-017-3800-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/21/2016] [Accepted: 05/16/2017] [Indexed: 03/21/2023]
Abstract
BACKGROUND Brucella spp. are Gram-negative, facultative intracellular pathogens that cause brucellosis in both humans and animals. The B. abortus vaccine strain 104 M is the only vaccine available in China for the prevention of brucellosis in humans. Although the B. abortus 104 M genome has been fully sequenced, the current genome annotations are not yet complete. In addition, the main mechanisms underpinning its residual toxicity and vaccine-induced immune protection have yet to be elucidated. Mapping the proteome of B. abortus 104 M will help to improve genome annotation quality, thereby facilitating a greater understanding of its biology. RESULTS In this study, we utilized a proteogenomic approach that combined subcellular fractionation and peptide fractionation to perform a whole-proteome analysis and genome reannotation of B. abortus 104 M using high-resolution mass spectrometry. In total, 1,729 proteins (56.3% of 3,072), including 218 hypothetical proteins, were identified under the culture conditions employed in this study. The annotations of the B. abortus 104 M genome were also refined following identification and validation by reverse transcription-PCR. In addition, 14 pivotal virulence factors and 17 protective antigens known to be involved in residual toxicity and immune protection were confirmed at the protein level following induction by the 104 M vaccine. Moreover, further insight into the cell biology of multichromosomal bacteria was obtained by elucidating differences in protein expression levels between the small and large chromosomes. CONCLUSIONS The work presented in this report used a proteogenomic approach to perform whole-proteome analysis and genome reannotation in B. abortus 104 M; this work helped to improve genome annotation quality.
Our analysis of virulence factors, protective antigens and other protein effectors provided the basis for further research to elucidate the mechanisms of residual toxicity and immune protection induced by the 104 M vaccine. Finally, the potential link between replication dynamics, gene function, and protein expression levels in this multichromosomal bacterium was detailed.
Affiliation(s)
- Xiaodong Zai
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Qiaoling Yang
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Kun Liu
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Ruihua Li
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Mengying Qian
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Taoran Zhao
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Yaohui Li
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Ying Yin
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Dayong Dong
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Ling Fu
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Shanhu Li
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China
| | - Junjie Xu
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China.
| | - Wei Chen
- Laboratory of Vaccine and Antibody Engineering, Beijing Institute of Biotechnology, Beijing, China.
|
10
|
Ruggles KV, Krug K, Wang X, Clauser KR, Wang J, Payne SH, Fenyö D, Zhang B, Mani DR. Methods, Tools and Current Perspectives in Proteogenomics. Mol Cell Proteomics 2017; 16:959-981. [PMID: 28456751 DOI: 10.1074/mcp.mr117.000024] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Received: 04/13/2017] [Indexed: 12/20/2022]
Abstract
With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analyses of quantitative measurements from genomic and proteomic studies have yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches. In this article, we systematically classify published methods and tools into four major categories: (1) Sequence-centric proteogenomics; (2) Analysis of proteogenomic relationships; (3) Integrative modeling of proteogenomic data; and (4) Data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications.
Affiliation(s)
- Kelly V Ruggles
- Department of Medicine, New York University School of Medicine, New York, New York 10016
| | - Karsten Krug
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
| | - Xiaojing Wang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - Karl R Clauser
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
| | - Jing Wang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - Samuel H Payne
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354
| | - David Fenyö
- Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York 10016
- Institute for Systems Genetics, New York University School of Medicine, New York, New York 10016
| | - Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - D R Mani
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
|
11
|
Fu S, Liu X, Luo M, Xie K, Nice EC, Zhang H, Huang C. Proteogenomic studies on cancer drug resistance: towards biomarker discovery and target identification. Expert Rev Proteomics 2017; 14:351-362. [PMID: 28276747 DOI: 10.1080/14789450.2017.1299006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 02/05/2023]
Abstract
INTRODUCTION Chemoresistance is a major obstacle for current cancer treatment. Proteogenomics is a powerful multi-omics research field that uses customized protein sequence databases generated by genomic and transcriptomic information to identify novel genes (e.g. noncoding, mutation and fusion genes) from mass spectrometry-based proteomic data. By identifying aberrations that are differentially expressed between tumor and normal pairs, this approach can also be applied to validate protein variants in cancer, which may reveal the response to drug treatment. Areas covered: In this review, we will present recent advances in proteogenomic investigations of cancer drug resistance with an emphasis on integrative proteogenomic pipelines and the biomarker discovery which contributes to achieving the goal of using precision/personalized medicine for cancer treatment. Expert commentary: The discovery and comprehensive understanding of potential biomarkers help identify the cohort of patients who may benefit from particular treatments, and will assist real-time clinical decision-making to maximize therapeutic efficacy and minimize adverse effects. With the development of MS-based proteomics and NGS-based sequencing, a growing number of proteogenomic tools are being developed specifically to investigate cancer drug resistance.
Affiliation(s)
- Shuyue Fu
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, P.R. China
| | - Xiang Liu
- Department of Pathology, Sichuan Academy of Medical Sciences, Sichuan Provincial People's Hospital, Chengdu, P.R. China
| | - Maochao Luo
- West China School of Public Health, Sichuan University, Chengdu, P.R. China
| | - Ke Xie
- Department of Oncology, Sichuan Academy of Medical Sciences, Sichuan Provincial People's Hospital, Chengdu, P.R. China
| | - Edouard C Nice
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Australia
| | - Haiyuan Zhang
- School of Medicine, Yangtze University, P.R. China
| | - Canhua Huang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, P.R. China
|
12
|
Ruffing AM, Jensen TJ, Strickland LM. Genetic tools for advancement of Synechococcus sp. PCC 7002 as a cyanobacterial chassis. Microb Cell Fact 2016; 15:190. [PMID: 27832791 PMCID: PMC5105302 DOI: 10.1186/s12934-016-0584-6] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 07/04/2016] [Accepted: 10/28/2016] [Indexed: 11/10/2022]
Abstract
Background Successful implementation of modified cyanobacteria as hosts for industrial applications requires the development of a cyanobacterial chassis. The cyanobacterium Synechococcus sp. PCC 7002 embodies key attributes for an industrial host, including a fast growth rate and high salt, light, and temperature tolerances. This study addresses key limitations in the advancement of Synechococcus sp. PCC 7002 as an industrial chassis. Results Tools for genome integration were developed and characterized, including several putative neutral sites for genome integration. The minimum homology arm length for genome integration in Synechococcus sp. PCC 7002 was determined to be approximately 250 bp. Three fluorescent protein reporters (hGFP, Ypet, and mOrange) were characterized for gene expression, microscopy, and flow cytometry applications in Synechococcus sp. PCC 7002. Of these three proteins, the yellow fluorescent protein (Ypet) had the best optical properties for minimal interference with the native photosynthetic pigments and for detection using standard microscopy and flow cytometry optics. Twenty-five native promoters were characterized as tools for recombinant gene expression in Synechococcus sp. PCC 7002 based on previous RNA-seq results. This characterization included comparisons of protein and mRNA levels as well as expression under both continuous and diurnal light conditions. Promoters A2520 and A2579 were found to have strong expression in Synechococcus sp. PCC 7002 while promoters A1930, A1961, A2531, and A2813 had moderate expression. Promoters A2520 and A2813 showed more than twofold increases in gene expression under light conditions compared to dark, suggesting these promoters may be useful tools for engineering diurnal regulation. Conclusions The genome integration, fluorescent protein, and promoter tools developed in this study will help to advance Synechococcus sp. PCC 7002 as a cyanobacterial chassis. 
The long minimum homology arm length for Synechococcus sp. PCC 7002 genome integration indicates native exonuclease activity or a low efficiency of homologous recombination. Low correlation between transcript and protein levels in Synechococcus sp. PCC 7002 suggests that transcriptomic data are poor selection criteria for promoter tool development. Lastly, the conventional strategy of using promoters from photosynthetic operons as strong promoter tools is debunked, as promoters from hypothetical proteins (A2520 and A2579) were found to have much higher expression levels.
Affiliation(s)
- Anne M Ruffing
- Department of Bioenergy and Defense Technologies, Sandia National Laboratories, P.O. Box 5800, MS 1413, Albuquerque, NM, 87185-1413, USA.
| | - Travis J Jensen
- Department of Bioenergy and Defense Technologies, Sandia National Laboratories, P.O. Box 5800, MS 1413, Albuquerque, NM, 87185-1413, USA
| | - Lucas M Strickland
- Department of Bioenergy and Defense Technologies, Sandia National Laboratories, P.O. Box 5800, MS 1413, Albuquerque, NM, 87185-1413, USA
|
13
|
Zhang J, Yang MK, Zeng H, Ge F. GAPP: A Proteogenomic Software for Genome Annotation and Global Profiling of Post-translational Modifications in Prokaryotes. Mol Cell Proteomics 2016; 15:3529-3539. [PMID: 27630248 DOI: 10.1074/mcp.m116.060046] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 04/07/2016] [Indexed: 11/06/2022] Open
Abstract
Although the number of sequenced prokaryotic genomes is growing rapidly, experimentally verified annotation of prokaryotic genome remains patchy and challenging. To facilitate genome annotation efforts for prokaryotes, we developed an open source software called GAPP for genome annotation and global profiling of post-translational modifications (PTMs) in prokaryotes. With a single command, it provides a standard workflow to validate and refine predicted genetic models and discover diverse PTM events. We demonstrated the utility of GAPP using proteomic data from Helicobacter pylori, one of the major human pathogens that is responsible for many gastric diseases. Our results confirmed 84.9% of the existing predicted H. pylori proteins, identified 20 novel protein coding genes, and corrected four existing gene models with regard to translation initiation sites. In particular, GAPP revealed a large repertoire of PTMs using the same proteomic data and provided a rich resource that can be used to examine the functions of reversible modifications in this human pathogen. This software is a powerful tool for genome annotation and global discovery of PTMs and is applicable to any sequenced prokaryotic organism; we expect that it will become an integral part of ongoing genome annotation efforts for prokaryotes. GAPP is freely available at https://sourceforge.net/projects/gappproteogenomic/.
Affiliation(s)
- Jia Zhang
- From the ‡Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
| | - Ming-Kun Yang
- From the ‡Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
| | - Honghui Zeng
- §Wuhan Branch, Supercomputing Center, Chinese Academy of Sciences, China
| | - Feng Ge
- From the ‡Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China; §Wuhan Branch, Supercomputing Center, Chinese Academy of Sciences, China
|
14
|
Hammarén R, Pal C, Bengtsson-Palme J. FARAO: the flexible all-round annotation organizer. Bioinformatics 2016; 32:3664-3666. [PMID: 27493193 DOI: 10.1093/bioinformatics/btw499] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/20/2016] [Revised: 07/21/2016] [Accepted: 07/22/2016] [Indexed: 11/12/2022] Open
Abstract
With decreasing costs of generating DNA sequence data, genome and metagenome projects have become accessible to a wider scientific community. However, to extract meaningful information and visualize the data remain challenging. We here introduce FARAO, a highly scalable software for organization, visualization and integration of annotation and read coverage data that can also combine output data from several bioinformatics tools. The capabilities of FARAO can greatly aid analyses of genomic and metagenomic datasets. AVAILABILITY AND IMPLEMENTATION FARAO is implemented in Perl and is supported under Unix-like operative systems, including Linux and macOS. The Perl source code is freely available for download under the MIT License from http://microbiology.se/software/farao/ CONTACT: johan.bengtsson-palme@microbiology.se SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rickard Hammarén
- Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Guldhedsgatan 10, SE-413 46, Gothenburg, Sweden; Science for Life Laboratory, Department of Medical Epidemiology and Biostatistics, Karolinska Institute, SE-171 21 Solna, Stockholm, Sweden
| | - Chandan Pal
- Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Guldhedsgatan 10, SE-413 46, Gothenburg, Sweden; Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden
| | - Johan Bengtsson-Palme
- Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Guldhedsgatan 10, SE-413 46, Gothenburg, Sweden; Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden
|
15
|
Mao Y, Yang X, Liu Y, Yan Y, Du Z, Han Y, Song Y, Zhou L, Cui Y, Yang R. Reannotation of Yersinia pestis Strain 91001 Based on Omics Data. Am J Trop Med Hyg 2016; 95:562-70. [PMID: 27382076 DOI: 10.4269/ajtmh.16-0215] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/16/2016] [Accepted: 05/17/2016] [Indexed: 12/16/2022] Open
Abstract
Yersinia pestis is among the most dangerous human pathogens, and systematic research of this pathogen is important in bacterial pathogenomics research. To fully interpret the biological functions, physiological characteristics, and pathogenesis of Y. pestis, a comprehensive annotation of its entire genome is necessary. The emergence of omics-based research has brought new opportunities to better annotate the genome of this pathogen. Here, the complete genome of Y. pestis strain 91001 was reannotated using genomics and proteogenomics data. One hundred and thirty-seven unreliable coding sequences were removed, and 41 homologous genes were relocated with their translational initiation sites, while the functions of seven pseudogenes and 392 hypothetical genes were revised. Moreover, annotations of noncoding RNAs, repeat sequences, and transposable elements have also been incorporated. The reannotated results are freely available at http://tody.bmi.ac.cn.
Affiliation(s)
- Yiqing Mao
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China. Center of Information Technology, Beijing Institute of Health and Medical Information, Beijing, People's Republic of China
| | - Xianwei Yang
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
| | - Yang Liu
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, People's Republic of China
| | - Yanfeng Yan
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
| | - Zongmin Du
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
| | - Yanping Han
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
| | - Yajun Song
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
| | - Lei Zhou
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
| | - Yujun Cui
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China.
| | - Ruifu Yang
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China.
|
16
|
Proteogenomic Tools and Approaches to Explore Protein Coding Landscapes of Eukaryotic Genomes. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016; 926:1-10. [DOI: 10.1007/978-3-319-42316-6_1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
|
17
|
Pettersen VK, Steinsland H, Wiker HG. Improving genome annotation of enterotoxigenic Escherichia coli TW10598 by a label-free quantitative MS/MS approach. Proteomics 2015; 15:3826-34. [DOI: 10.1002/pmic.201500278] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/06/2015] [Revised: 08/18/2015] [Accepted: 09/04/2015] [Indexed: 12/14/2022]
Affiliation(s)
- Veronika Kuchařová Pettersen
- The Gade Research Group for Infection and Immunity; Department of Clinical Science; University of Bergen; Bergen Norway
| | - Hans Steinsland
- Centre for International Health; Department of Global Public Health and Primary Care; University of Bergen; Bergen Norway
- Department of Biomedicine; University of Bergen; Bergen Norway
| | - Harald G. Wiker
- The Gade Research Group for Infection and Immunity; Department of Clinical Science; University of Bergen; Bergen Norway
|
18
|
Kucharova V, Wiker HG. Proteogenomics in microbiology: taking the right turn at the junction of genomics and proteomics. Proteomics 2014; 14:2360-675. [PMID: 25263021 DOI: 10.1002/pmic.201400168] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 04/28/2014] [Revised: 08/18/2014] [Accepted: 09/23/2014] [Indexed: 12/14/2022]
Abstract
High-accuracy and high-throughput proteomic methods have completely changed the way we can identify and characterize proteins. MS-based proteomics can now provide a unique supplement to genomic data and add a new level of information to the interpretation of genomic sequences. Proteomics-driven genome annotation has become especially relevant in microbiology where genomes are sequenced on a daily basis and limitations of an in silico driven annotation process are well recognized. In this review paper, we outline different strategies on how one can design a proteogenomic experiment, for example on genome-sequenced (synonymous proteogenomics) versus unsequenced organisms (ortho-proteogenomics) or with the aid of other "omic" data such as RNA-seq. We touch upon many challenges that are encountered during a typical proteogenomic study, mostly concerning bioinformatics methods and downstream data analysis, but also related to creation and use of sequence databases. A large list of proteogenomic case studies of different microorganisms is provided to illustrate the mapping of MS/MS-derived peptide spectra to genomic DNA sequences. These investigations have led to accurate determination of translational initiation sites, pointed out eventual read-throughs or programmed frameshifts, detected signal peptide processing or other protein maturation events, removed questionable annotation assignments, and provided evidence for predicted hypothetical proteins.
Affiliation(s)
- Veronika Kucharova
- Department of Clinical Science, The Gade Research Group for Infection and Immunity, University of Bergen, Norway
|
19
|
Kelkar DS, Provost E, Chaerkady R, Muthusamy B, Manda SS, Subbannayya T, Selvan LDN, Wang CH, Datta KK, Woo S, Dwivedi SB, Renuse S, Getnet D, Huang TC, Kim MS, Pinto SM, Mitchell CJ, Madugundu AK, Kumar P, Sharma J, Advani J, Dey G, Balakrishnan L, Syed N, Nanjappa V, Subbannayya Y, Goel R, Prasad TSK, Bafna V, Sirdeshmukh R, Gowda H, Wang C, Leach SD, Pandey A. Annotation of the zebrafish genome through an integrated transcriptomic and proteomic analysis. Mol Cell Proteomics 2014; 13:3184-98. [PMID: 25060758 DOI: 10.1074/mcp.m114.038299] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 11/06/2022] Open
Abstract
Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ∼ 69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes.
Affiliation(s)
- Dhanashree S Kelkar
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India
| | - Elayne Provost
- §Department of Surgery, Johns Hopkins University, Baltimore, Maryland 21205
| | - Raghothama Chaerkady
- ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205
| | - Babylakshmi Muthusamy
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‖Centre of Excellence in Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry 605014, India
| | - Srikanth S Manda
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‖Centre of Excellence in Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry 605014, India; **Departments of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - Tejaswini Subbannayya
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India
| | - Lakshmi Dhevi N Selvan
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India
| | - Chieh-Huei Wang
- ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205
| | - Keshava K Datta
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡‡School of Biotechnology, KIIT University, Bhubaneswar, Odisha 751024, India
| | - Sunghee Woo
- §§Department of Computer Science, University of California, San Diego, California 92093
| | - Sutopa B Dwivedi
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India
| | - Santosh Renuse
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India
| | - Derese Getnet
- ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205
| | - Tai-Chung Huang
- ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205
| | - Min-Sik Kim
- ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205; **Departments of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - Sneha M Pinto
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205; ¶¶Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
| | - Christopher J Mitchell
- ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205
| | - Anil K Madugundu
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - Praveen Kumar
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - Jyoti Sharma
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ¶¶Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
| | - Jayshree Advani
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - Gourav Dey
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ¶¶Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
| | - Lavanya Balakrishnan
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‖‖Department of Biotechnology, Kuvempu University, Shimoga 577 451, India
| | - Nazia Syed
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; Department of Biochemistry and Molecular Biology, School of Life Sciences, Pondicherry University, Puducherry 605 014, India
| | - Vishalakshi Nanjappa
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India
| | - Yashwanth Subbannayya
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - Renu Goel
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - T S Keshava Prasad
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ‡Amrita School of Biotechnology, Amrita University, Kollam 690 525, India; ‖Centre of Excellence in Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry 605014, India; ¶¶Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
| | - Vineet Bafna
- §§Department of Computer Science, University of California, San Diego, California 92093
| | - Ravi Sirdeshmukh
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - Harsha Gowda
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India
| | - Charles Wang
- The Center for Genomics and Division of Microbiology & Molecular Genetics, School of Medicine, Loma Linda University, Loma Linda, California 92350;
| | - Steven D Leach
- §Department of Surgery, Johns Hopkins University, Baltimore, Maryland 21205; ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205;
| | - Akhilesh Pandey
- From the *Institute of Bioinformatics, International Technology Park, Bangalore 560 066, India; ¶McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205; **Departments of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Sol Goldman Pancreatic Cancer Research Center, Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
|
20
|
Schellenberg JJ, Verbeke TJ, McQueen P, Krokhin OV, Zhang X, Alvare G, Fristensky B, Thallinger GG, Henrissat B, Wilkins JA, Levin DB, Sparling R. Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532T using RNA-seq transcriptomics and high-throughput proteomics. BMC Genomics 2014; 15:567. [PMID: 24998381 PMCID: PMC4102724 DOI: 10.1186/1471-2164-15-567] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/18/2013] [Accepted: 06/26/2014] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is mitigated by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of these organisms. Clostridium stercorarium DSM8532T is an anaerobic thermophile with demonstrated high ethanol production on cellulose and hemicellulose. Although several lignocellulolytic enzymes in this organism have been well-characterized, details concerning carbohydrate transporters and central metabolism have not been described. Therefore, the goal of this study is to define an improved whole genome sequence (WGS) for this organism using in-depth molecular profiling by RNA-seq transcriptomics and tandem mass spectrometry-based proteomics. RESULTS A paired-end Roche/454 WGS assembly was closed through application of an in silico algorithm designed to resolve repetitive sequence regions, resulting in a circular replicon with one gap and a region of 2 kilobases with 10 ambiguous bases. RNA-seq transcriptomics resulted in nearly complete coverage of the genome, identifying errors in homopolymer length attributable to 454 sequencing. Peptide sequences resulting from high-throughput tandem mass spectrometry of trypsin-digested protein extracts were mapped to 1,755 annotated proteins (68% of all protein-coding regions). Proteogenomic analysis confirmed the quality of annotation and improvement pipelines, identifying a missing gene and an alternative reading frame. Peptide coverage of genes hypothetically involved in substrate hydrolysis, transport and utilization confirmed multiple pathways for glycolysis, pyruvate conversion and recycling of intermediates. No sequences homologous to transaldolase, a central enzyme in the pentose phosphate pathway, were observed by any method, despite demonstrated growth of this organism on xylose and xylan hemicellulose. 
CONCLUSIONS Complementary omics techniques confirm the quality of genome sequence assembly, annotation and error-reporting. Nearly complete genome coverage by RNA-seq likely indicates background DNA in RNA extracts, however these preps resulted in WGS enhancement and transcriptome profiling in a single Illumina run. No detection of transaldolase by any method despite xylose utilization by this organism indicates an alternative pathway for sedoheptulose-7-phosphate degradation. This report combines next-generation omics techniques to elucidate previously undefined features of substrate transport and central metabolism for this organism and its potential for consolidated biofuels production from lignocellulose.
Affiliation(s)
| | - Tobin J Verbeke
- Department of Microbiology, University of Manitoba, Winnipeg, Canada
| | - Peter McQueen
- Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Canada
| | - Oleg V Krokhin
- Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Canada
| | - Xiangli Zhang
- Department of Plant Sciences, University of Manitoba, Winnipeg, Canada
| | - Graham Alvare
- Department of Plant Sciences, University of Manitoba, Winnipeg, Canada
| | - Brian Fristensky
- Department of Plant Sciences, University of Manitoba, Winnipeg, Canada
| | - Gerhard G Thallinger
- Core Facility Bioinformatics, Austrian Centre of Industrial Biotechnology (ACIB), Graz, Austria
- Institute for Genomics and Bioinformatics, Graz University of Technology, Graz, Austria
| | - Bernard Henrissat
- Architecture et Fonction des Macromolécules Biologiques, Université Aix-Marseille, Marseille, France
- UMR 7257, Centre National de Recherche Scientifique, 163 ave. de Luminy, Marseille, 13288 France
| | - John A Wilkins
- Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Canada
| | - David B Levin
- Department of Biosystems Engineering, University of Manitoba, Winnipeg, Canada
| | - Richard Sparling
- Department of Microbiology, University of Manitoba, Winnipeg, Canada
|
21
|
Pang CNI, Tay AP, Aya C, Twine NA, Harkness L, Hart-Smith G, Chia SZ, Chen Z, Deshpande NP, Kaakoush NO, Mitchell HM, Kassem M, Wilkins MR. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing. J Proteome Res 2013; 13:84-98. [PMID: 24152167 DOI: 10.1021/pr400820p] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 12/19/2022]
Abstract
Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.
Affiliation(s)
- Chi Nam Ignatius Pang
- Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia
|
22
|
Armengaud J, Hartmann EM, Bland C. Proteogenomics for environmental microbiology. Proteomics 2013; 13:2731-42. [PMID: 23636904 DOI: 10.1002/pmic.201200576] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 12/19/2012] [Revised: 03/06/2013] [Accepted: 04/09/2013] [Indexed: 11/09/2022]
Abstract
Proteogenomics sensu stricto refers to the use of proteomic data to refine the annotation of genomes from model organisms. Because of the limitations of automatic annotation pipelines, a relatively high number of errors occur during the structural annotation of genes coding for proteins. Whether putative orphan sequences or short genes encoding low-molecular-weight proteins really exist is still frequently a mystery. Whether start codons are well defined is also an open debate. These problems are exacerbated for genomes of microorganisms belonging to poorly documented genera, as related sequences are not always available for homology-guided annotation. The functional annotation of a significant proportion of genes is also another well-known issue when annotating environmental microorganisms. High-throughput shotgun proteomics has recently greatly evolved, allowing the exploration of the proteome from any microorganism at an unprecedented depth. The structural and functional annotation process may be usefully complemented with experimental data. Indeed, proteogenomic mapping has been successfully performed for a wide variety of organisms. Specific approaches devoted to systematically establishing the N-termini of a large set of proteins are being developed. N-terminomics is giving rise to datasets of experimentally proven translational start codons as well as validated peptide signals for secreted proteins. By extension, combining genomic and proteomic data is becoming routine in many research projects. The proteomic analysis of organisms with unfinished genome sequences, the so-called composite proteomics, and the search for microbial biomarkers by bottom-up and top-down combined approaches are some examples of proteogenomic-flavored studies. They illustrate the advent of a new era of environmental microbiology where proteomics and genomics are intimately integrated to answer key biological questions.
Affiliation(s)
- Jean Armengaud
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze, France
|
23
|
Sallet E, Roux B, Sauviac L, Jardinaud MF, Carrère S, Faraut T, de Carvalho-Niebel F, Gouzy J, Gamas P, Capela D, Bruand C, Schiex T. Next-generation annotation of prokaryotic genomes with EuGene-P: application to Sinorhizobium meliloti 2011. DNA Res 2013; 20:339-54. [PMID: 23599422 PMCID: PMC3738161 DOI: 10.1093/dnares/dst014] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Indexed: 01/02/2023] Open
Abstract
The availability of next-generation sequences of transcripts from prokaryotic organisms offers the opportunity to design a new generation of automated genome annotation tools not yet available for prokaryotes. In this work, we designed EuGene-P, the first integrative prokaryotic gene finder tool which combines a variety of high-throughput data, including oriented RNA-Seq data, directly into the prediction process. This enables the automated prediction of coding sequences (CDSs), untranslated regions, transcription start sites (TSSs) and non-coding RNA (ncRNA, sense and antisense) genes. EuGene-P was used to comprehensively and accurately annotate the genome of the nitrogen-fixing bacterium Sinorhizobium meliloti strain 2011, leading to the prediction of 6308 CDSs as well as 1876 ncRNAs. Among them, 1280 appeared as antisense to a CDS, which supports recent findings that antisense transcription activity is widespread in bacteria. Moreover, 4077 TSSs upstream of protein-coding or non-coding genes were precisely mapped providing valuable data for the study of promoter regions. By looking for RpoE2-binding sites upstream of annotated TSSs, we were able to extend the S. meliloti RpoE2 regulon by ∼3-fold. Altogether, these observations demonstrate the power of EuGene-P to produce a reliable and high-resolution automatic annotation of prokaryotic genomes.
Affiliation(s)
- Erika Sallet
- INRA, Laboratoire des Interactions Plantes-Microorganismes-LIPM, UMR 441, Castanet-Tolosan F-31326, France
|
24
|
Abstract
High-throughput identification of proteins with the latest generation of hybrid high-resolution mass spectrometers is opening new perspectives in microbiology. Here, I present an overview of tandem mass spectrometry technology and bioinformatics for shotgun proteomics that make 2D-PAGE approaches obsolete. Non-labelling quantitative approaches have become more popular than labelling techniques on most proteomic platforms because they are easier to carry out while their quantitative outcome is rather robust. Parameters for recording mass spectrometry data, however, need to be chosen carefully and statistics to assess the confidence of the results should not be neglected. Interestingly, next-generation sequencing methodologies make any microbial model quickly amenable to proteomics, leading to the documentation of a wide range of organisms from diverse environments. Some recent discoveries made using microbial proteomics have challenged some biological dogma, such as: (i) initiation of translation does not occur predominantly from ATG codons in some microorganisms, (ii) non-canonical initiation codons are used to regulate the production of specific but important proteins and (iii) a gene may code for multiple polypeptide species that are heterogeneous in sequence. Microbial diversity and microbial physiology can now be revisited by means of exhaustive comparative proteomic surveys where thousands of proteins are detected and quantified. Proteogenomics, which consists of improving genome annotation with the help of proteomic evidence, is paving the way for integrated multi-omic approaches in microbiology. Finally, meta-proteomic tools and approaches are emerging for tackling the high complexity of the microbial world as a whole, opening new perspectives for assessing how microbial communities function.
Affiliation(s)
- Jean Armengaud
- CEA, DSV, IBEB, Lab Biochim System Perturb, F-30207 Bagnols-sur-Cèze, France.
|