1
Patole MS, Sharma J, Pawar H. Comparative Proteogenomic Approaches for Mapping the Global Proteome of the Unsequenced Leishmania Vector Phlebotomus papatasi. Methods Mol Biol 2025; 2859:265-277. [PMID: 39436607 DOI: 10.1007/978-1-0716-4152-1_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2024]
Abstract
The rapid improvements in next-generation sequencing technologies have made it possible to quickly access in-depth genome sequence data. This has resulted in a flurry of genome sequences of various organisms being published and made publicly available in the last two decades. However, not all organisms have genome sequence data available. Various factors play a role, such as the importance of the organism, either medically or economically, and the genome complexity of the organisms. Phlebotomus papatasi is the sandfly vector for the Leishmania parasite, which is the causative agent for leishmaniasis. P. papatasi is a hematophagous vector, and the female flies feed on human blood to complete their reproductive cycle. The P. papatasi genome is currently being sequenced as part of a multicentric consortium, and the genome sequence is not published to date. Hence, efforts to map its global proteome are hindered in P. papatasi. In such cases, comparative proteogenomic approaches can help map the global proteome of an unsequenced organism using homology-based methods.
Affiliation(s)
- Jyoti Sharma
  - Manipal Academy of Higher Education, Manipal, Karnataka, India
  - Institute of Bioinformatics, Bangalore, India
- Harsh Pawar
  - Biomedical and Life Sciences Division, Lancaster University, Lancaster, UK
2
Ye J, Li A, Zheng H, Yang B, Lu Y. Machine Learning Advances in Predicting Peptide/Protein-Protein Interactions Based on Sequence Information for Lead Peptides Discovery. Adv Biol (Weinh) 2023; 7:e2200232. [PMID: 36775876 DOI: 10.1002/adbi.202200232] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/24/2022] [Revised: 12/30/2022] [Indexed: 02/14/2023]
Abstract
Peptides have shown increasing advantages and significant clinical value in drug discovery and development. With the development of high-throughput technologies and artificial intelligence (AI), machine learning (ML) methods for discovering new lead peptides have been expanded and incorporated into rational drug design. Predictions of peptide-protein interactions (PepPIs) and protein-protein interactions (PPIs) are both opportunities and challenges in computational biology, which will help to better understand the mechanisms of disease and provide the impetus for the discovery of lead peptides. This paper comprehensively reviews computational models for PepPI and PPI predictions. It begins with an introduction of various databases of peptide ligands and target proteins. Then it discusses data formats and feature representations for proteins and peptides. Furthermore, classical ML methods and emerging deep learning (DL) methods that can be used to train prediction models of PepPI and PPI are classified into four categories, and their advantages and disadvantages are analyzed. To assess the relative performance of different models, different validation protocols and evaluation indexes are discussed. The goal of this review is to help researchers quickly get started to develop computational frameworks using these integrated resources and eventually promote the discovery of lead peptides.
Affiliation(s)
- Jiahao Ye
  - School of Medicine, Shanghai University, Shanghai, 200444, China
- An Li
  - Department of Critical Care Medicine, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, 200072, China
  - Department of Biochemical Pharmacy, School of Pharmacy, Second Military Medical University, Shanghai, 200433, China
- Hao Zheng
  - School of Medicine, Shanghai University, Shanghai, 200444, China
- Banghua Yang
  - School of Medicine, Shanghai University, Shanghai, 200444, China
- Yiming Lu
  - School of Medicine, Shanghai University, Shanghai, 200444, China
  - Department of Critical Care Medicine, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, 200072, China
  - Department of Biochemical Pharmacy, School of Pharmacy, Second Military Medical University, Shanghai, 200433, China
3
Heil BJ, Greene CS. The Field-Dependent Nature of PageRank Values in Citation Networks. bioRxiv 2023:2023.01.05.522943. [PMID: 36711900 PMCID: PMC9881996 DOI: 10.1101/2023.01.05.522943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
The value of scientific research can be easier to assess at the collective level than at the level of individual contributions. Several journal-level and article-level metrics aim to measure the importance of journals or individual manuscripts. However, many are citation-based and citation practices vary between fields. To account for these differences, scientists have devised normalization schemes to make metrics more comparable across fields. We use PageRank as an example metric and examine the extent to which field-specific citation norms drive estimated importance differences. In doing so, we recapitulate differences in journal and article PageRanks between fields. We also find that manuscripts shared between fields have different PageRanks depending on which field's citation network the metric is calculated in. We implement a degree-preserving graph shuffling algorithm to generate a null distribution of similar networks and find differences more likely attributed to field-specific preferences than citation norms. Our results suggest that while differences exist between fields' metric distributions, applying metrics in a field-aware manner rather than using normalized global metrics avoids losing important information about article preferences. They also imply that assigning a single importance value to a manuscript may not be a useful construct, as the importance of each manuscript varies by the reader's field.
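The abstract does not say how the degree-preserving shuffle was implemented. As an illustration only, one common way to build such a null model is the double edge swap: pick two citations a→b and c→d and rewire them to a→d and c→b, so that every paper keeps its number of outgoing and incoming citations. The Python sketch below assumes a citation network given as a list of (citing, cited) pairs; the function names and the toy network are hypothetical and not taken from the paper.

```python
import random


def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Rewire a directed edge list with double edge swaps.

    Every node keeps its in-degree and out-degree; self-loops and
    duplicate edges are never created.
    """
    rng = random.Random(seed)
    edge_list = list(dict.fromkeys(edges))  # drop duplicates, keep order
    edge_set = set(edge_list)
    if len(edge_list) < 2:
        return edge_list  # nothing to swap
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # guard against networks with few valid swaps
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Proposed rewiring: a->b, c->d  becomes  a->d, c->b.
        if a == d or c == b:
            continue  # would create a self-citation
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate citation
        edge_set.difference_update([(a, b), (c, d)])
        edge_set.update([(a, d), (c, b)])
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degrees(edges):
    """Return (out_degree, in_degree) dictionaries for a directed edge list."""
    out_deg, in_deg = {}, {}
    for src, dst in edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
    return out_deg, in_deg


# Toy citation network: (citing paper, cited paper).
toy_edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3), (2, 4)]
shuffled = degree_preserving_shuffle(toy_edges, n_swaps=10, seed=1)
```

Repeating the shuffle with different seeds and recomputing a metric such as PageRank on each shuffled network would give a null distribution of the kind the abstract describes.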
Affiliation(s)
- Benjamin J. Heil
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania
- Casey S. Greene
  - Department of Pharmacology, University of Colorado School of Medicine; Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine
4
Sengupta A, Naresh G, Mishra A, Parashar D, Narad P. Proteome analysis using machine learning approaches and its applications to diseases. Advances in Protein Chemistry and Structural Biology 2021; 127:161-216. [PMID: 34340767 DOI: 10.1016/bs.apcsb.2021.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
With the tremendous developments in biological and medical technologies, huge amounts of data are generated in the form of genomic data, images in medical databases, data on protein sequences, and so on. Analyzing these data with different tools sheds light on the particulars of a disease and our body's reactions to it, thus aiding our understanding of human health. The most useful of these tools are artificial intelligence and deep learning (DL). The artificial neural networks in DL algorithms help extract viable data from the datasets and, further, recognize patterns in these complex datasets. Therefore, as a part of machine learning, DL helps us face the various challenges that arise during protein prediction, identification, and quantification. Proteomics is the study of proteins, their structures, features, properties and so on. As a form of data science, proteomics has helped us progress in the field of genomics technologies. One of the major techniques used in proteomics studies is mass spectrometry (MS). However, MS is efficient for the analysis of large datasets only with the added help of informatics approaches for data analysis and interpretation; these mainly include machine learning and deep learning algorithms. In this chapter, we will discuss in detail the applications of deep learning and various algorithms of machine learning in proteomics.
Affiliation(s)
- Abhishek Sengupta
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- G Naresh
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- Astha Mishra
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- Diksha Parashar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- Priyanka Narad
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
5
Hu L, Wang X, Huang YA, Hu P, You ZH. A survey on computational models for predicting protein-protein interactions. Brief Bioinform 2021; 22:6159365. [PMID: 33693513 DOI: 10.1093/bib/bbab036] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 11/20/2020] [Revised: 12/31/2020] [Indexed: 12/24/2022]
Abstract
Proteins interact with each other to play critical roles in many biological processes in cells. Although promising, laboratory experiments usually suffer from the disadvantages of being time-consuming and labor-intensive. The results obtained are often not robust and considerably uncertain. Owing to recent advances in high-throughput technologies, a large amount of proteomics data has been collected, and this presents both a significant opportunity and a challenge to develop computational models to predict protein-protein interactions (PPIs) based on these data. In this paper, we present a comprehensive survey of the recent efforts that have been made towards the development of effective computational models for PPI prediction. The survey introduces the algorithms that can be used to learn computational models for predicting PPIs, and it classifies these models into different categories. To understand their relative merits, the paper discusses different validation schemes and metrics to evaluate the prediction performance. Biological databases that are commonly used in different experiments for performance comparison are also described, and their use in a series of extensive experiments to compare different prediction models is discussed. Finally, we present some open issues in PPI prediction for future work. We explain how the performance of PPI prediction can be improved if these issues are effectively tackled.
Affiliation(s)
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, 830011, Urumqi, China
- Xiaojuan Wang
  - School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, China
- Yu-An Huang
  - College of Computer Science and Software Engineering, Shenzhen University, 518060, Shenzhen, China
- Zhu-Hong You
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, 830011, Urumqi, China
6
Metatranscriptomics and Metaproteomics for Microbial Communities Profiling. Unravelling the Soil Microbiome 2020. [DOI: 10.1007/978-3-030-15516-2_5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 02/02/2023]
7
A Novel Stochastic Block Model for Network-Based Prediction of Protein-Protein Interactions. Intelligent Computing Theories and Application 2020. [DOI: 10.1007/978-3-030-60802-6_54] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023]
8
Bespyatykh J, Smolyakov A, Guliaev A, Shitikov E, Arapidi G, Butenko I, Dogonadze M, Manicheva O, Ilina E, Zgoda V, Govorun V. Proteogenomic analysis of Mycobacterium tuberculosis Beijing B0/W148 cluster strains. J Proteomics 2019; 192:18-26. [DOI: 10.1016/j.jprot.2018.07.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/13/2018] [Revised: 06/29/2018] [Accepted: 07/10/2018] [Indexed: 10/28/2022]
9
González-Gomariz J, Guruceaga E, López-Sánchez M, Segura V. Proteogenomics in the context of the Human Proteome Project (HPP). Expert Rev Proteomics 2019; 16:267-275. [PMID: 30654666 DOI: 10.1080/14789450.2019.1571916] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/15/2022]
Abstract
Introduction: The technological and scientific progress achieved in the Human Proteome Project (HPP) has provided the scientific community with a new set of experimental and bioinformatic methods in the challenging field of shotgun and SRM/MRM-based proteomics. The requirements for a protein to be considered experimentally validated are now well established, and the information about the human proteome is available in the neXtProt database, while targeted proteomic assays are stored in SRMAtlas. However, the study of the missing proteins remains an outstanding issue. Areas covered: This review focuses on the implementation of proteogenomic methods designed to improve the detection and validation of the missing proteins. The evolution of the methodological strategies based on the combination of different omic technologies and the use of huge publicly available datasets is shown, taking the Chromosome 16 Consortium as reference. Expert commentary: Proteogenomics and other data analysis strategies implemented within the C-HPP initiative could be used as guidance to complete the catalog of human proteins in the near future. Besides, in the coming years, we will probably witness their use in the B/D-HPP initiative to go a step further on the implications of the proteins in human biology and disease.
Affiliation(s)
- José González-Gomariz
  - Bioinformatics Platform, Center for Applied Medical Research, University of Navarra, Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Pamplona, Spain
- Elizabeth Guruceaga
  - Bioinformatics Platform, Center for Applied Medical Research, University of Navarra, Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Pamplona, Spain
- Macarena López-Sánchez
  - Bioinformatics Platform, Center for Applied Medical Research, University of Navarra, Pamplona, Spain
- Victor Segura
  - Bioinformatics Platform, Center for Applied Medical Research, University of Navarra, Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Pamplona, Spain
10
Computational Resources for Predicting Protein-Protein Interactions. Advances in Protein Chemistry and Structural Biology 2017; 110:251-275. [PMID: 29412998 DOI: 10.1016/bs.apcsb.2017.07.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
Proteins are the essential building blocks and functional components of a cell. They account for the vital functions of an organism. Proteins interact with each other and form protein interaction networks. These protein interactions play a major role in all biological processes and pathways. The previous methods of predicting protein interactions were experimental and focused on a small set of proteins or a particular protein. However, these experimental approaches are low-throughput, as they are time-consuming and require a significant amount of human effort. This led to the development of computational techniques that use high-throughput experimental data for analyzing protein-protein interactions. The main purpose of this review is to provide an overview of the computational advancements and tools for the prediction of protein interactions. The major databases for the deposition of these interactions are also described. The advantages, as well as the specific limitations, of these tools are highlighted, which will shed light on the computational aspects that can help biologists and researchers in their research.
11
Seligmann H. Natural mitochondrial proteolysis confirms transcription systematically exchanging/deleting nucleotides, peptides coded by expanded codons. J Theor Biol 2016; 414:76-90. [PMID: 27899286 DOI: 10.1016/j.jtbi.2016.11.021] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 04/13/2016] [Revised: 11/11/2016] [Accepted: 11/22/2016] [Indexed: 12/19/2022]
Abstract
Protein sequences have higher linguistic complexities than human languages. This indicates undeciphered multilayered, overprinted information/genetic codes. Some superimposed genetic information is revealed by detections of transcripts systematically (a) exchanging nucleotides (nine symmetric, e.g. A<->C, fourteen asymmetric, e.g. A->C->G->A, swinger RNAs) translated according to tri-, tetra- and pentacodons, and (b) deleting mono-, dinucleotides after each trinucleotide (delRNAs). Here analyses of two independent proteomic datasets considering natural proteolysis confirm independently translation of these non-canonical RNAs, also along tetra- and pentacodons, increasing coverage of putative, cryptically encoded proteins. Analyses assuming endoproteinase GluC and elastase digestions (cleavages after residues D, E, and A, L, I, V, respectively) detect additional peptides colocalizing with detected non-canonical RNAs. Analyses detect fewer peptides matching GluC-, elastase- than trypsin-digestions: artificial trypsin-digestion outweighs natural proteolysis. Results suggest occurrences of complete proteins entirely matching non-canonical, superimposed encoding(s). Protein-coding after bijective transformations could explain genetic code symmetries, such as along Rumer's transformation.
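The cleavage rules named in the abstract (GluC after D and E, elastase after A, L, I and V) can be illustrated with a small in silico digestion. The Python sketch below is only a simplified example and not the author's pipeline: it assumes complete digestion with no missed cleavages, it adds the textbook trypsin rule (cleavage after K and R, ignoring the proline exception) for comparison, and the function name and test sequence are hypothetical.

```python
# Residues after which each enzyme is assumed to cleave.
# GluC and elastase rules follow the abstract; the trypsin rule is the
# usual simplified one (after K or R), added here as an assumption.
CLEAVAGE_RESIDUES = {
    "GluC": set("DE"),
    "elastase": set("ALIV"),
    "trypsin": set("KR"),
}


def digest(sequence, enzyme):
    """Split a protein sequence after every cleavage residue of the enzyme."""
    sites = CLEAVAGE_RESIDUES[enzyme]
    peptides, start = [], 0
    for pos, residue in enumerate(sequence):
        if residue in sites:
            peptides.append(sequence[start:pos + 1])
            start = pos + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


example = "MKTDELAGR"  # made-up sequence, for illustration only
gluc_peptides = digest(example, "GluC")          # ['MKTD', 'E', 'LAGR']
elastase_peptides = digest(example, "elastase")  # ['MKTDEL', 'A', 'GR']
trypsin_peptides = digest(example, "trypsin")    # ['MK', 'TDELAGR']
```

Peptide lists produced this way from predicted protein sequences are what a search engine would compare against observed spectra; real searches typically also allow missed cleavages and modifications.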
Affiliation(s)
- Hervé Seligmann
  - Unité de Recherche sur les Maladies Infectieuses et Tropicales Émergentes, Faculté de Médecine, URMITE CNRS-IRD 198 UMER 6236, IHU (Institut Hospitalo-Universitaire), Aix-Marseille University, Marseille, France
12
Pawar H, Chavan S, Mahale K, Khobragade S, Kulkarni A, Patil A, Chaphekar D, Varriar P, Sudeep A, Pai K, Prasad T, Gowda H, Patole MS. A proteomic map of the unsequenced kala-azar vector Phlebotomus papatasi using cell line. Acta Trop 2015; 152:80-89. [PMID: 26307495 DOI: 10.1016/j.actatropica.2015.08.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/09/2015] [Revised: 07/16/2015] [Accepted: 08/18/2015] [Indexed: 11/25/2022]
Abstract
The debilitating disease kala-azar or visceral leishmaniasis is caused by the kinetoplastid protozoan parasite Leishmania donovani. The parasite is transmitted by the hematophagous sand fly vector of the genus Phlebotomus in the Old World and Lutzomyia in the New World. The predominant Phlebotomine species associated with the transmission of kala-azar are Phlebotomus papatasi and Phlebotomus argentipes. Understanding the molecular interaction of the sand fly and Leishmania during the development of the parasite within the sand fly gut is crucial to the understanding of the parasite life cycle. The complete genome sequences of sand flies (Phlebotomus and Lutzomyia) are currently not available, and this hinders identification of proteins in the sand fly vector. In the absence of genomic sequences, the current study uses three-frame-translated transcriptomic data of P. papatasi to analyze mass spectrometry data from a P. papatasi cell line using a proteogenomic approach. Additionally, we have carried out the proteogenomic analysis of P. papatasi by comparative homology-based searches using related sequenced dipteran protein data. This study resulted in the identification of 1313 proteins from P. papatasi based on homology. Our study demonstrates the power of proteogenomic approaches in mapping the proteomes of unsequenced organisms.
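Three-frame translation of transcript sequences, as mentioned in the abstract, can be sketched in a few lines of Python. This is an illustrative example only, not the authors' pipeline: it assumes the standard genetic code, translates only the forward strand at offsets 0, 1 and 2, keeps stop codons as `*`, and maps any codon containing an ambiguous base to `X`. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (third base varies fastest), with '*' marking stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def three_frame_translate(transcript):
    """Translate a transcript in the three forward reading frames.

    Returns a dict mapping frame number (1, 2, 3) to the translated
    sequence; trailing bases that do not fill a codon are ignored.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or cDNA input
    frames = {}
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames[offset + 1] = "".join(CODON_TABLE.get(c, "X") for c in codons)
    return frames


frames = three_frame_translate("ATGGCCTAA")
# frames == {1: 'MA*', 2: 'WP', 3: 'GL'}
```

In a proteogenomic search, the translated frames would usually be split at stop codons and the resulting open stretches written to a FASTA file that serves as the protein database.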
13
Mitchell CJ, Getnet D, Kim MS, Manda SS, Kumar P, Huang TC, Pinto SM, Nirujogi RS, Iwasaki M, Shaw PG, Wu X, Zhong J, Chaerkady R, Marimuthu A, Muthusamy B, Sahasrabuddhe NA, Raju R, Bowman C, Danilova L, Cutler J, Kelkar DS, Drake CG, Prasad TSK, Marchionni L, Murakami PN, Scott AF, Shi L, Thierry-Mieg J, Thierry-Mieg D, Irizarry R, Cope L, Ishihama Y, Wang C, Gowda H, Pandey A. A multi-omic analysis of human naïve CD4+ T cells. BMC Systems Biology 2015; 9:75. [PMID: 26542228 PMCID: PMC4636073 DOI: 10.1186/s12918-015-0225-4] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 01/28/2015] [Accepted: 10/28/2015] [Indexed: 12/21/2022]
Abstract
Background: Cellular function and diversity are orchestrated by complex interactions of fundamental biomolecules including DNA, RNA and proteins. Technological advances in genomics, epigenomics, transcriptomics and proteomics have enabled massively parallel and unbiased measurements. Such high-throughput technologies have been extensively used to carry out broad, unbiased studies, particularly in the context of human diseases. Nevertheless, a unified analysis of the genome, epigenome, transcriptome and proteome of a single human cell type to obtain a coherent view of the complex interplay between various biomolecules has not yet been undertaken. Here, we report the first multi-omic analysis of human primary naïve CD4+ T cells isolated from a single individual. Results: Integrating multi-omics datasets allowed us to investigate genome-wide methylation and its effect on mRNA/protein expression patterns, extent of RNA editing under normal physiological conditions and allele specific expression in naïve CD4+ T cells. In addition, we carried out a multi-omic comparative analysis of naïve with primary resting memory CD4+ T cells to identify molecular changes underlying T cell differentiation. This analysis provided mechanistic insights into how several molecules involved in T cell receptor signaling are regulated at the DNA, RNA and protein levels. Phosphoproteomics revealed downstream signaling events that regulate these two cellular states. Availability of multi-omics data from an identical genetic background also allowed us to employ novel proteogenomics approaches to identify individual-specific variants and putative novel protein coding regions in the human genome. Conclusions: We utilized multiple high-throughput technologies to derive a comprehensive profile of two primary human cell types, naïve CD4+ T cells and memory CD4+ T cells, from a single donor. Through vertical as well as horizontal integration of whole genome sequencing, methylation arrays, RNA-Seq, miRNA-Seq, proteomics, and phosphoproteomics, we derived an integrated and comparative map of these two closely related immune cells and identified potential molecular effectors of immune cell differentiation following antigen encounter.
Affiliation(s)
- Christopher J Mitchell
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Derese Getnet
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Min-Sik Kim
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Srikanth S Manda
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Praveen Kumar
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Tai-Chung Huang
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Sneha M Pinto
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Raja Sekhar Nirujogi
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Mio Iwasaki
  - Department of Molecular & Cellular BioAnalysis, Kyoto University, Kyoto, Japan.
- Patrick G Shaw
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Xinyan Wu
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Jun Zhong
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Raghothama Chaerkady
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Arivusudar Marimuthu
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Rajesh Raju
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Caitlyn Bowman
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Ludmila Danilova
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Jevon Cutler
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Dhanashree S Kelkar
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Charles G Drake
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- T S Keshava Prasad
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Luigi Marchionni
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Peter N Murakami
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA.
- Alan F Scott
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Leming Shi
  - National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, USA.
- Jean Thierry-Mieg
  - National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, USA.
- Danielle Thierry-Mieg
  - National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, USA.
- Rafael Irizarry
  - Department of Biostatistics and Computational Biology, Dana Farber Cancer Institute, Boston, MA, USA.
- Leslie Cope
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Yasushi Ishihama
  - Department of Molecular & Cellular BioAnalysis, Kyoto University, Kyoto, Japan.
- Charles Wang
  - Center for Genomics and Division of Microbiology & Molecular Genetics, Loma Linda University, Loma Linda, CA, USA.
- Harsha Gowda
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
- Akhilesh Pandey
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
  - Institute of Bioinformatics, International Tech Park, Whitefield, Bangalore, India.
  - Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
  - Department of Pathology and Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
14
Torbett BE, Baird A, Eliceiri BP. Understanding the rules of the road: proteomic approaches to interrogate the blood brain barrier. Front Neurosci 2015; 9:70. [PMID: 25788875 PMCID: PMC4349081 DOI: 10.3389/fnins.2015.00070] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/20/2014] [Accepted: 02/17/2015] [Indexed: 11/13/2022]
Abstract
The blood brain barrier (BBB) is often regarded as a passive barrier that protects brain parenchyma from toxic substances and circulating leukocytes while allowing the passage of selected molecules. Recently, a combination of molecular profiling techniques has characterized the constituents of the BBB based on in vitro models using isolated endothelial cells and ex vivo models analyzing isolated blood vessels. Characterization of gene expression profiles that are specific to the endothelium of brain blood vessels, and the identification of proteins, cells and multi-cellular structures that comprise the BBB, have led to an emerging consensus that the BBB is not, in and of itself, a simple barrier of specialized endothelial cells. Instead, regulation of transcytosis, permeability, and drug translocation into the central nervous system is now viewed as a collection of neurovascular units (NVUs) that, together, give the BBB its unique biological properties. We will review recent technology advancing the understanding of the molecular basis of the BBB with a focus on proteomic approaches.
Affiliation(s)
- Bruce E Torbett
  - Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, USA
- Andrew Baird
  - Department of Surgery, University of California, San Diego, San Diego, CA, USA
- Brian P Eliceiri
  - Department of Surgery, University of California, San Diego, San Diego, CA, USA
15
Pawar H, Kulkarni A, Dixit T, Chaphekar D, Patole MS. A bioinformatics approach to reanalyze the genome annotation of kinetoplastid protozoan parasite Leishmania donovani. Genomics 2014; 104:554-61. [DOI: 10.1016/j.ygeno.2014.09.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/13/2014] [Revised: 09/18/2014] [Accepted: 09/19/2014] [Indexed: 10/24/2022]
16
Kucharova V, Wiker HG. Proteogenomics in microbiology: taking the right turn at the junction of genomics and proteomics. Proteomics 2014; 14:2660-2675. [PMID: 25263021 DOI: 10.1002/pmic.201400168] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 04/28/2014] [Revised: 08/18/2014] [Accepted: 09/23/2014] [Indexed: 12/14/2022]
Abstract
High-accuracy and high-throughput proteomic methods have completely changed the way we can identify and characterize proteins. MS-based proteomics can now provide a unique supplement to genomic data and add a new level of information to the interpretation of genomic sequences. Proteomics-driven genome annotation has become especially relevant in microbiology where genomes are sequenced on a daily basis and limitations of an in silico driven annotation process are well recognized. In this review paper, we outline different strategies on how one can design a proteogenomic experiment, for example on genome-sequenced (synonymous proteogenomics) versus unsequenced organisms (ortho-proteogenomics) or with the aid of other "omic" data such as RNA-seq. We touch upon many challenges that are encountered during a typical proteogenomic study, mostly concerning bioinformatics methods and downstream data analysis, but also related to creation and use of sequence databases. A large list of proteogenomic case studies of different microorganisms is provided to illustrate the mapping of MS/MS-derived peptide spectra to genomic DNA sequences. These investigations have led to accurate determination of translational initiation sites, pointed out eventual read-throughs or programmed frameshifts, detected signal peptide processing or other protein maturation events, removed questionable annotation assignments, and provided evidence for predicted hypothetical proteins.
Affiliation(s)
- Veronika Kucharova
- Department of Clinical Science, The Gade Research Group for Infection and Immunity, University of Bergen, Norway
|
17
|
Extracting data from the muck: deriving biological insight from complex microbial communities and non-model organisms with next generation sequencing. Curr Opin Biotechnol 2014; 28:103-10. [DOI: 10.1016/j.copbio.2014.01.007] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2013] [Revised: 01/09/2014] [Accepted: 01/10/2014] [Indexed: 01/09/2023]
|
18
|
Pawar H, Renuse S, Khobragade SN, Chavan S, Sathe G, Kumar P, Mahale KN, Gore K, Kulkarni A, Dixit T, Raju R, Prasad TSK, Harsha HC, Patole MS, Pandey A. Neglected Tropical Diseases and Omics Science: Proteogenomics Analysis of the Promastigote Stage of Leishmania major Parasite. OMICS: A Journal of Integrative Biology 2014; 18:499-512. [DOI: 10.1089/omi.2013.0159] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Affiliation(s)
- Harsh Pawar
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Rajiv Gandhi University of Health Sciences, Bangalore, India
| | - Santosh Renuse
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Department of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam, India
| | | | - Sandip Chavan
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Manipal University, Madhav Nagar, Manipal, India
| | - Gajanan Sathe
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Manipal University, Madhav Nagar, Manipal, India
| | - Praveen Kumar
- Institute of Bioinformatics, International Technology Park, Bangalore, India
| | | | | | | | - Tanwi Dixit
- National Centre for Cell Sciences, Pune, India
| | - Rajesh Raju
- Institute of Bioinformatics, International Technology Park, Bangalore, India
| | | | - H. C. Harsha
- Institute of Bioinformatics, International Technology Park, Bangalore, India
| | | | - Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
|
19
|
Nirujogi RS, Pawar H, Renuse S, Kumar P, Chavan S, Sathe G, Sharma J, Khobragade S, Pande J, Modak B, Prasad TSK, Harsha HC, Patole MS, Pandey A. Moving from unsequenced to sequenced genome: reanalysis of the proteome of Leishmania donovani. J Proteomics 2014; 97:48-61. [PMID: 23665000 PMCID: PMC4710096 DOI: 10.1016/j.jprot.2013.04.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2012] [Revised: 04/02/2013] [Accepted: 04/11/2013] [Indexed: 10/26/2022]
Abstract
The kinetoplastid protozoan parasite, Leishmania donovani, is the causative agent of kala azar or visceral leishmaniasis. Kala azar is a severe form of leishmaniasis that is fatal in the majority of untreated cases. Studies on proteomic analysis of L. donovani thus far have been carried out using homology-based identification based on related Leishmania species (L. infantum, L. major and L. braziliensis) whose genomes have been sequenced. Recently, the genome of L. donovani was fully sequenced and the data became publicly available. We took advantage of the availability of its genomic sequence to carry out a more accurate proteogenomic analysis of L. donovani proteome using our previously generated dataset. This resulted in identification of 17,504 unique peptides upon database-dependent search against the annotated proteins in L. donovani. These peptides were assigned to 3999 unique proteins in L. donovani. 2296 proteins were identified in both the life stages of L. donovani, while 613 and 1090 proteins were identified only from amastigote and promastigote stages, respectively. The proteomic data was also searched against six-frame translated L. donovani genome, which led to 255 genome search-specific peptides (GSSPs) resulting in identification of 20 novel genes and correction of 40 existing gene models in L. donovani. BIOLOGICAL SIGNIFICANCE Leishmania donovani genome sequencing was recently completed, which permitted us to use a proteogenomic approach to map its proteome and to carry out annotation of its genome. This resulted in mapping of 50% (3999 proteins) of L. donovani proteome. Our study identified 20 novel genes previously not predicted from the L. donovani genome in addition to correcting annotations of 40 existing gene models. The identified proteins may help in better understanding of stage-specific protein expression profiles in L. donovani and to identify novel stage-specific drug targets in L. donovani which could be used in the treatment of leishmaniasis. This article is part of a Special Issue entitled: Trends in Microbial Proteomics.
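Note: the six-frame translated genome search mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names (`translate`, `six_frame_translate`, `find_peptide`) and the toy sequence are hypothetical, and the sketch assumes the standard genetic code and an exact-substring match in place of a real MS/MS search engine.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translate(dna):
    """Return the six conceptual translations keyed by frame label."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement strand
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def find_peptide(dna, peptide):
    """List the frames whose translation contains the peptide exactly."""
    return [frame for frame, protein in six_frame_translate(dna).items()
            if peptide in protein]


if __name__ == "__main__":
    toy_dna = "ATGGCCAAATAA"  # hypothetical toy sequence
    print(six_frame_translate(toy_dna))
    print(find_peptide(toy_dna, "MAK"))
```

A peptide that matches only a six-frame translation, and no annotated protein, corresponds to what the abstract calls a genome search-specific peptide (GSSP).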
Affiliation(s)
- Raja Sekhar Nirujogi
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry 605014, India
| | - Harsh Pawar
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Rajiv Gandhi University of Health Sciences, Bangalore 560041, India
| | - Santosh Renuse
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Department of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam 690525, India
| | - Praveen Kumar
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
| | - Sandip Chavan
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Manipal University, Madhav Nagar, Manipal 576104, India
| | - Gajanan Sathe
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Manipal University, Madhav Nagar, Manipal 576104, India
| | - Jyoti Sharma
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Manipal University, Madhav Nagar, Manipal 576104, India
| | | | | | - Bhakti Modak
- National Centre for Cell Sciences, Pune 411007, India
| | - T S Keshava Prasad
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry 605014, India; Manipal University, Madhav Nagar, Manipal 576104, India
| | - H C Harsha
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
| | | | - Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA; Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA; Department of Pathology, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA.
|
20
|
Chung J, Rocha AA, Tonelli RR, Castilho BA, Schenkman S. Eukaryotic initiation factor 5A dephosphorylation is required for translational arrest in stationary phase cells. Biochem J 2013; 451:257-67. [PMID: 23368777 DOI: 10.1042/bj20121553] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
The protein known as eIF5A (eukaryotic initiation factor 5A) has an elusive role in translation. It has a unique and essential hypusine modification at a conserved lysine residue in most eukaryotes. In addition, this protein is modified by phosphorylation with unknown functions. In the present study we show that a phosphorylated state of eIF5A predominates in exponentially growing Trypanosoma cruzi cells, and extensive dephosphorylation occurs in cells in stationary phase. Phosphorylation occurs mainly at Ser(2), as shown in yeast eIF5A. In addition, a novel phosphorylation site was identified at Tyr(21). In exponential cells, T. cruzi eIF5A is partially associated with polysomes, compatible with a proposed function as an elongation factor, and becomes relatively enriched in polysomal fractions in stationary phase. Overexpression of the wild-type eIF5A, or eIF5A with Ser(2) replaced by an aspartate residue, but not by alanine, increases the rate of cell proliferation and protein synthesis. However, the presence of an aspartate residue instead of Ser(2) is toxic for cells reaching the stationary phase, which show a less-pronounced protein synthesis arrest and a decreased amount of eIF5A in dense fractions of sucrose gradients. We conclude that eIF5A phosphorylation and dephosphorylation cycles regulate translation according to the growth conditions.
Affiliation(s)
- Janete Chung
- Departamento de Microbiologia, Imunologia e Parasitologia, Universidade Federal de São Paulo, Rua Pedro de Toledo 669 L6A, São Paulo, S.P. 04039-032, Brazil
|
21
|
Ikemura S, Yamamoto T, Motomura G, Yamaguchi R, Zhao G, Iwasaki K, Iwamoto Y. Preventive effects of the anti-vasospasm agent via the regulation of the Rho-kinase pathway on the development of steroid-induced osteonecrosis in rabbits. Bone 2013; 53:329-35. [PMID: 23313282 DOI: 10.1016/j.bone.2012.12.050] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/24/2012] [Revised: 12/25/2012] [Accepted: 12/28/2012] [Indexed: 12/21/2022]
Abstract
A number of studies have suggested that ischemia is the principal pathomechanism of osteonecrosis; however, the detailed mechanism responsible for ischemia remains unclear. We examined the effects of fasudil, an anti-vasospasm agent, on the development of steroid-induced osteonecrosis in rabbits. One group of rabbits received 15 mg/kg of fasudil intravenously and was then injected once intramuscularly with 20 mg/kg of methylprednisolone (n=33), and one group received methylprednisolone alone as a control (n=28). Eight rabbits from each group were sacrificed 24 h after methylprednisolone injection to analyze the expression of endothelinA-receptor and eNOS. Two weeks after the steroid injection, the femora and humeri were examined histopathologically for the incidence of osteonecrosis. In addition, plasma from each of four osteonecrosis-positive or -negative rabbits was used for the proteomic analysis in the fasudil group. The incidence of osteonecrosis was significantly lower in the fasudil group (32%) than in the control group (75%) (P<0.01). Immunohistochemically, endothelinA-receptor expression levels were decreased in the smooth muscle of the bone marrow in the fasudil group in comparison to those in the control group. The eNOS expression levels in both serum and bone marrow in the MF group were significantly higher than those in the M group (P<0.05). Based on the proteomic analysis, several proteins related to vasospasm, such as fibrinogen, thrombin, and apolipoprotein E, were identified in rabbits with osteonecrosis soon after steroid administration. This study indicates that vasospasm is one of the important factors involved in the pathogenesis of steroid-induced osteonecrosis and that anti-vasospasm agents seem to decrease the incidence of steroid-induced osteonecrosis.
Affiliation(s)
- Satoshi Ikemura
- Investigation performed in the Department of Orthopaedic Surgery, Kyushu University, Fukuoka, Japan
|
22
|
Abstract
Historically, many genome annotation strategies have lacked experimental evidence at the protein level and have instead relied heavily on ab initio gene prediction tools, which consequently resulted in many incorrectly annotated genomic sequences. Proteogenomics aims to address these issues using mass spectrometry (MS)-based proteomics, genomic mapping, and statistical significance measures such as false discovery rates (FDRs) to validate the mapped peptides. Presented here is a tool capable of meeting this goal, the UCSD proteogenomic pipeline, which maps peptide-spectrum matches (PSMs) to the genome using the Inspect MS/MS database search tool and assigns a statistical significance to the match using a target-decoy search approach to assign estimated FDRs. This pipeline also provides the option of using a more reliable approach to proteogenomics by determining the precise false-positive rates (FPRs) and p-values of each PSM by calculating their spectral probabilities and rescoring each PSM accordingly. In addition to the protein prediction challenges in the rapidly growing number of sequenced plant genomes, it is difficult to extract high-quality protein samples from many plant species. For that reason, this chapter contains methods for protein extraction and trypsin digestion that reliably produce samples suitable for proteogenomic analysis.
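Note: the target-decoy FDR estimate mentioned in this abstract can be sketched in a few lines of Python. This is an illustration only and does not reproduce the UCSD pipeline or Inspect. The function name `score_cutoff_for_fdr`, the `(score, is_decoy)` tuple layout, and the decoys/targets estimator are assumptions made for the example; other estimators are also in use, and tied scores are not treated specially here.

```python
def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    The FDR at a cutoff is estimated as (#decoy hits) / (#target hits)
    among all PSMs scoring at or above that cutoff.
    Returns None when no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted list while the estimate is acceptable.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


if __name__ == "__main__":
    # Hypothetical toy PSM list: (score, is_decoy)
    toy_psms = [(9.0, False), (8.0, False), (7.0, False), (6.0, True), (5.0, False)]
    print(score_cutoff_for_fdr(toy_psms, max_fdr=0.2))
```

In practice the decoy hits come from searching the same spectra against reversed or shuffled sequences, so the decoy count above a cutoff approximates the number of false target matches.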
|
23
|
Pawar H, Sahasrabuddhe NA, Renuse S, Keerthikumar S, Sharma J, Kumar GSS, Venugopal A, Sekhar NR, Kelkar DS, Nemade H, Khobragade SN, Muthusamy B, Kandasamy K, Harsha HC, Chaerkady R, Patole MS, Pandey A. A proteogenomic approach to map the proteome of an unsequenced pathogen - Leishmania donovani. Proteomics 2012; 12:832-44. [DOI: 10.1002/pmic.201100505] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Affiliation(s)
- Harsh Pawar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Rajiv Gandhi University of Health Sciences; Bangalore Karnataka India
| | - Nandini A. Sahasrabuddhe
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Manipal University; Madhav Nagar Manipal Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
| | - Santosh Renuse
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam Kerala India
| | | | - Jyoti Sharma
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Manipal University; Madhav Nagar Manipal Karnataka India
| | - Ghantasala. S. Sameer Kumar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Kuvempu University; Shimoga Karnataka India
| | - Abhilash Venugopal
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Kuvempu University; Shimoga Karnataka India
| | - Nirujogi Raja Sekhar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry India
| | - Dhanashree S. Kelkar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam Kerala India
| | - Harshal Nemade
- National Centre for Cell Sciences; Pune Maharashtra India
| | | | - Babylakshmi Muthusamy
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry India
| | - Kumaran Kandasamy
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
| | - H. C. Harsha
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
| | - Raghothama Chaerkady
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
| | | | - Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Oncology; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Pathology; Johns Hopkins University School of Medicine; Baltimore MD USA
|
24
|
Kailasa SK, Wu HF. Functionalized quantum dots with dopamine dithiocarbamate as the matrix for the quantification of efavirenz in human plasma and as affinity probes for rapid identification of microwave tryptic digested proteins in MALDI-TOF-MS. J Proteomics 2011; 75:2924-33. [PMID: 22202183 DOI: 10.1016/j.jprot.2011.12.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2011] [Revised: 12/09/2011] [Accepted: 12/10/2011] [Indexed: 01/01/2023]
Abstract
Functionalized quantum dots with dopamine dithiocarbamate (QDs-DDTC) were utilized for the first time as an efficient material for the quantification of efavirenz in human plasma of HIV infected patients and rapid identification of microwave tryptic digested proteins (cytochrome c, lysozyme and BSA) by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS). The synthesized QDs-DDTC was characterized by using spectroscopic (UV-visible, FT-IR and (1)H NMR) and microscopic (SEM and TEM) techniques. Functionalized QDs-DDTC exhibited a high desorption/ionization efficiency for the rapid quantification of small molecules (efavirenz, tobramycin and aspartame) in the low-mass region. QDs-DDTC readily traps target species and is capable of transferring laser energy for efficient desorption/ionization of analytes with background-free detection. The use of QDs-DDTC as a matrix provided good linearity for the quantification of small molecules (R(2)=~0.9983), with good reproducibility (RSD<10%), in the analysis of efavirenz in the plasma of HIV infected patients by the standard addition method. We also demonstrated the use of functionalized QDs-DDTC as affinity probes for the rapid identification of microwave tryptic digested proteins (cytochrome c, lysozyme and BSA) by MALDI-TOF-MS. The QDs-DDTC-based MALDI-TOF-MS approach provides simplicity, rapidity, accuracy, and precision for the determination of efavirenz in human plasma of HIV infected patients and rapid identification of microwave tryptic digested proteins. This new material presents a marked advance in the development of matrix-free mass spectrometric methods for the rapid and precise quantitative determination of a variety of molecules. This article is part of a Special Issue entitled: Proteomics: The clinical link.
Affiliation(s)
- Suresh Kumar Kailasa
- Department of Chemistry, National Sun Yat-Sen University, Kaohsiung, 80424, Taiwan
|
25
|
Prasad TSK, Harsha HC, Keerthikumar S, Sekhar NR, Selvan LDN, Kumar P, Pinto SM, Muthusamy B, Subbannayya Y, Renuse S, Chaerkady R, Mathur PP, Ravikumar R, Pandey A. Proteogenomic Analysis of Candida glabrata using High Resolution Mass Spectrometry. J Proteome Res 2011; 11:247-60. [DOI: 10.1021/pr200827k] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Affiliation(s)
- T. S. Keshava Prasad
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
- Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
- Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
| | - H. C. Harsha
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
| | | | - Nirujogi Raja Sekhar
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
| | - Lakshmi Dhevi N. Selvan
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
| | - Praveen Kumar
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
| | - Sneha M. Pinto
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
| | - Babylakshmi Muthusamy
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
| | - Yashwanth Subbannayya
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Rajiv Gandhi University of Health Sciences, Jayanagar, Bangalore -560 041, India
| | - Santosh Renuse
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
- Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
| | - Raghothama Chaerkady
- Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
| | - Premendu P. Mathur
- Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
| | - Raju Ravikumar
- Department of Neuromicrobiology, National Institute of Mental Health and Neuro Sciences, Bangalore -560029, India
|
26
|
Venter E, Smith RD, Payne SH. Proteogenomic analysis of bacteria and archaea: a 46 organism case study. PLoS One 2011; 6:e27587. [PMID: 22114679 PMCID: PMC3219674 DOI: 10.1371/journal.pone.0027587] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2011] [Accepted: 10/20/2011] [Indexed: 11/19/2022] Open
Abstract
Experimental evidence is increasingly being used to reassess the quality and accuracy of genome annotation. Proteomics data used for this purpose, called proteogenomics, can alleviate many of the problematic areas of genome annotation, e.g. short protein validation and start site assignment. We performed a proteogenomic analysis of 46 genomes spanning eight bacterial and archaeal phyla across the tree of life. These diverse datasets facilitated the development of a robust approach for proteogenomics that is functional across genomes varying in %GC, gene content, proteomic sampling depth, phylogeny, and genome size. In addition to finding evidence for 682 novel proteins, 1336 new start sites, and numerous dubious genes, we discovered sites of post-translational maturation in the form of proteolytic cleavage of 1175 signal peptides. The number of novel proteins per genome is highly variable (median 7, mean 15, stdev 20). Moreover, comparison of novel genes with the current genes did not reveal any consistent abnormalities. Thus, we conclude that proteogenomics fulfills a yet to be understood deficiency in gene prediction. With the adoption of new sequencing technologies which have higher error rates than Sanger-based methods and the advances in proteomics, proteogenomics may become even more important in the future.
Affiliation(s)
- Eli Venter
- Department of Informatics, J. Craig Venter Institute, Rockville, Maryland, United States of America
| | - Richard D. Smith
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States of America
| | - Samuel H. Payne
- Department of Informatics, J. Craig Venter Institute, Rockville, Maryland, United States of America
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States of America
|
27
|
Kelkar DS, Kumar D, Kumar P, Balakrishnan L, Muthusamy B, Yadav AK, Shrivastava P, Marimuthu A, Anand S, Sundaram H, Kingsbury R, Harsha HC, Nair B, Prasad TSK, Chauhan DS, Katoch K, Katoch VM, Kumar P, Chaerkady R, Ramachandran S, Dash D, Pandey A. Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry. Mol Cell Proteomics 2011; 10:M111.011627. [PMID: 22338125 PMCID: PMC3270104 DOI: 10.1074/mcp.m111.011445] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open
Abstract
Mass spectrometric sequencing of low abundance, integral membrane proteins, particularly the transmembrane domains, presents challenges that span the multiple phases of sample preparation including solubilization, purification, enzymatic digestion, peptide extraction, and chromatographic separation. We describe a method through which we have obtained high peptide coverage for 12 γ-aminobutyric acid type A receptor (GABAA receptor) subunits from 2 picomoles of affinity-purified GABAA receptors from rat brain neocortex. Focusing on the α1 subunit, we identified peptides covering 96% of the protein sequence from fragmentation spectra (MS2) using a database searching algorithm and deduced 80% of the amino acid residues in the protein from de novo sequencing of Orbitrap spectra. The workflow combined microscale membrane protein solubilization, protein delipidation, in-solution multi-enzyme digestion, multiple stationary phases for peptide extraction, and acquisition of high-resolution full scan and fragmentation spectra. For de novo sequencing of peptides containing the transmembrane domains, timed digestions with chymotrypsin were utilized to generate peptides with overlapping sequences that were then recovered by sequential solid phase extraction using a C4 followed by a porous graphitic carbon stationary phase. The specificity of peptide identifications and amino acid residue sequences was increased by high mass accuracy and charge state assignment to parent and fragment ions. Analysis of three separate brain samples demonstrated that 78% of the sequence of the α1 subunit was observed in all three replicates with an additional 13% covered in two of the three replicates, indicating a high degree of sequence coverage reproducibility. Label-free quantitative analysis was applied to the three replicates to determine the relative abundances of 11 γ-aminobutyric acid type A receptor subunits. 
The deep sequence MS data also revealed two N-glycosylation sites on the α1 subunit, confirmed two splice variants of the γ2 subunit (γ2L and γ2S) and resolved a database discrepancy in the sequence of the α5 subunit.
Affiliation(s)
- Dhanashree S Kelkar
- Institute of Bioinformatics, International Technology Park, Bangalore, India
|
28
|
Kelkar DS, Kumar D, Kumar P, Balakrishnan L, Muthusamy B, Yadav AK, Shrivastava P, Marimuthu A, Anand S, Sundaram H, Kingsbury R, Harsha HC, Nair B, Prasad TSK, Chauhan DS, Katoch K, Katoch VM, Kumar P, Chaerkady R, Ramachandran S, Dash D, Pandey A. Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry. Mol Cell Proteomics 2011. [PMID: 21969609 DOI: 10.1074/mcp.m111.011627] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
The genome sequencing of H37Rv strain of Mycobacterium tuberculosis was completed in 1998 followed by the whole genome sequencing of a clinical isolate, CDC1551 in 2002. Since then, the genomic sequences of a number of other strains have become available making it one of the better studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of high GC content and dissimilarity to other model prokaryotes. To this end, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both MS and tandem MS levels. In all, we identified 3176 proteins from Mycobacterium tuberculosis representing ~80% of its total predicted gene count. In addition to protein database search, we carried out a genome database search, which led to identification of ~250 novel peptides. Based on these novel genome search-specific peptides, we discovered 41 novel protein coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N terminus-derived peptides confirmed 727 existing annotations for translational start sites while correcting those for 33 proteins. We report creation of a high confidence set of protein coding regions in Mycobacterium tuberculosis genome obtained by high resolution tandem mass-spectrometry at both precursor and fragment detection steps for the first time. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced for obtaining a more accurate catalogue of protein-coding genes.
Affiliation(s)
- Dhanashree S Kelkar
- Institute of Bioinformatics, International Technology Park, Bangalore, India
|
29
|
Ansong C, Tolić N, Purvine SO, Porwollik S, Jones M, Yoon H, Payne SH, Martin JL, Burnet MC, Monroe ME, Venepally P, Smith RD, Peterson SN, Heffron F, McClelland M, Adkins JN. Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium. BMC Genomics 2011; 12:433. [PMID: 21867535 PMCID: PMC3174948 DOI: 10.1186/1471-2164-12-433] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2011] [Accepted: 08/25/2011] [Indexed: 12/22/2022] Open
Abstract
Background Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. However, determining protein-coding genes for most new genomes is almost completely performed by inference using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. Results We experimentally annotated the bacterial pathogen Salmonella Typhimurium 14028, using "shotgun" proteomics to accurately uncover the translational landscape and post-translational features. The data provide protein-level experimental validation for approximately half of the predicted protein-coding genes in Salmonella and suggest revisions to several genes that appear to have incorrectly assigned translational start sites, including a potential novel alternate start codon. Additionally, we uncovered 12 non-annotated genes missed by gene prediction programs, as well as evidence suggesting a role for one of these novel ORFs in Salmonella pathogenesis. We also characterized post-translational features in the Salmonella genome, including chemical modifications and proteolytic cleavages. We find that bacteria have a much larger and more complex repertoire of chemical modifications than previously thought including several novel modifications. Our in vivo proteolysis data identified more than 130 signal peptide and N-terminal methionine cleavage events critical for protein function. Conclusion This work highlights several ways in which application of proteomics data can improve the quality of genome annotations to facilitate novel biological insights and provides a comprehensive proteome map of Salmonella as a resource for systems analysis.
Affiliation(s)
- Charles Ansong
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
|
30
|
Chaerkady R, Kelkar DS, Muthusamy B, Kandasamy K, Dwivedi SB, Sahasrabuddhe NA, Kim MS, Renuse S, Pinto SM, Sharma R, Pawar H, Sekhar NR, Mohanty AK, Getnet D, Yang Y, Zhong J, Dash AP, MacCallum RM, Delanghe B, Mlambo G, Kumar A, Keshava Prasad TS, Okulate M, Kumar N, Pandey A. A proteogenomic analysis of Anopheles gambiae using high-resolution Fourier transform mass spectrometry. Genome Res 2011; 21:1872-81. [PMID: 21795387 DOI: 10.1101/gr.127951.111] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Indexed: 01/10/2023]
Abstract
Anopheles gambiae is a major mosquito vector responsible for malaria transmission, whose genome sequence was reported in 2002. Genome annotation is a continuing effort, and many of the approximately 13,000 genes listed in VectorBase for Anopheles gambiae are predictions that have still not been validated by any other method. To identify protein-coding genes of An. gambiae based on its genomic sequence, we carried out a deep proteomic analysis using high-resolution Fourier transform mass spectrometry for both precursor and fragment ions. Based on peptide evidence, we were able to support or correct more than 6000 gene annotations including 80 novel gene structures and about 500 translational start sites. An additional validation by RT-PCR and cDNA sequencing was successfully performed for 105 selected genes. Our proteogenomic analysis led to the identification of 2682 genome search-specific peptides. Numerous cases of encoded proteins were documented in regions annotated as intergenic, introns, or untranslated regions. Using a database created to contain potential splice sites, we also identified 35 novel splice junctions. This is a first report to annotate the An. gambiae genome using high-accuracy mass spectrometry data as a complementary technology for genome annotation.
Affiliation(s)
- Raghothama Chaerkady
- McKusick-Nathans Institute of Genetic Medicine and Department of Biological Chemistry, Johns Hopkins University, Baltimore, Maryland 21205, USA
|
31
|
Renuse S, Chaerkady R, Pandey A. Proteogenomics. Proteomics 2011; 11:620-30. [DOI: 10.1002/pmic.201000615] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.6] [Received: 09/29/2010] [Revised: 11/14/2010] [Accepted: 11/16/2010] [Indexed: 12/13/2022]
|
32
|
Payne SH, Huang ST, Pieper R. A proteogenomic update to Yersinia: enhancing genome annotation. BMC Genomics 2010; 11:460. [PMID: 20687929 PMCID: PMC3091656 DOI: 10.1186/1471-2164-11-460] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Received: 01/29/2010] [Accepted: 08/05/2010] [Indexed: 01/18/2023] Open
Abstract
Background: Modern biomedical research depends on a complete and accurate proteome. With the widespread adoption of new sequencing technologies, genome sequences are generated at a near exponential rate, diminishing the time and effort that can be invested in genome annotation. The resulting gene set contains numerous errors in even the most basic form of annotation: the primary structure of the proteins. Results: The application of experimental proteomics data to genome annotation, called proteogenomics, can quickly and efficiently discover misannotations, yielding a more accurate and complete genome annotation. We present a comprehensive proteogenomic analysis of the plague bacterium, Yersinia pestis KIM. We discover non-annotated genes, correct protein boundaries, remove spuriously annotated ORFs, and make major advances towards accurate identification of signal peptides. Finally, we apply our data to 21 other Yersinia genomes, correcting and enhancing their annotations. Conclusions: In total, 141 gene models were altered and have been updated in RefSeq and Genbank, which can be accessed seamlessly through any NCBI tool (e.g. blast) or downloaded directly. Along with the improved gene models we discover new, more accurate means of identifying signal peptides in proteomics data.
Affiliation(s)
- Samuel H Payne
- J Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA.
|
33
|
Jordan JD, Nyquist P. Biomarkers and vasospasm after aneurysmal subarachnoid hemorrhage. Neurosurg Clin N Am 2010; 21:381-91. [PMID: 20380977 DOI: 10.1016/j.nec.2009.10.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/26/2022]
Abstract
Subarachnoid hemorrhage from the rupture of a saccular aneurysm is a devastating neurological disease that has a high morbidity and mortality not only from the initial hemorrhage, but also from the delayed complications, such as cerebral vasospasm. Cerebral vasospasm can lead to delayed ischemic injury 1 to 2 weeks after the initial hemorrhage. Although the pathophysiology of vasospasm has been described for decades, the molecular basis remains poorly understood. With the many advances in the past decade in the development of sensitive molecular biological techniques, imaging, biochemical purification, and protein identification, new insights are beginning to reveal the etiology of vasospasm. These findings will not only help to identify markers of vasospasm and prognostic outcome, but will also yield potential therapeutic targets for the treatment of this disease. This review focuses on the methods available for the identification of biological markers of vasospasm and their limitations, the current understanding as to the utility and prognostic significance of identified biomarkers, the utility of these biomarkers in predicting vasospasm and outcome, and future directions of research in this field.
Affiliation(s)
- J Dedrick Jordan
- Johns Hopkins School of Medicine, 600 North Wolfe Street, Meyer 8-140, Baltimore, MD 21287-7840, USA
|
34
|
Sankaralingam S, Lalu MM, Xu Y, Davidge ST. Effect of Peroxynitrite Scavenging on Endothelial Cells Stimulated by Plasma from Women with Preeclampsia: A Proteomic Approach. Hypertens Pregnancy 2010; 29:419-28. [DOI: 10.3109/10641950903452360] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
|
35
|
Allmer J. Existing bioinformatics tools for the quantitation of post-translational modifications. Amino Acids 2010; 42:129-38. [DOI: 10.1007/s00726-010-0614-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 10/16/2009] [Accepted: 04/27/2010] [Indexed: 12/25/2022]
|
36
|
Agrawal P, Kumar S, Das HR. Mass spectrometric characterization of isoform variants of peanut (Arachis hypogaea) stem lectin (SL-I). J Proteomics 2010; 73:1573-86. [PMID: 20348039 DOI: 10.1016/j.jprot.2010.03.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 10/28/2009] [Revised: 02/11/2010] [Accepted: 03/10/2010] [Indexed: 12/31/2022]
Abstract
Matrix assisted laser desorption/ionization-time-of-flight (MALDI-TOF) mass spectrometric (MS) analysis of purified Arachis hypogaea stem lectin (SL-I) and its tryptic digests suggested it to be an isoformic glucose/mannose binding lectin. Two-dimensional gel electrophoresis of SL-I indicated six isoforms (A1-A6), which were confirmed by Western blotting and MALDI-TOF MS analysis. Comparative analysis of peptide mass spectra of the isoforms matched with A. hypogaea lectins with three different accession numbers (Q43376_ARAHY, Q43377_ARAHY, Q70DJ5_ARAHY). Tandem mass spectrometric (MS/MS) analysis of tryptic peptides revealed these to be isoformic variants with altered amino acid sequences. Among the peptides, the peptide T12 showed major variation. The (199)Val-Ser-Tyr-Asn(202) sequence in peptide T12 of A1 and A2 was replaced by (199)Leu-Ser-His-Glu(202) in A3 and A4 (T12') while in A5 and A6 this sequence was (199)Val-Ser-Tyr-Val(202) (T12''). Peptide T1 showed the presence of (10)Asn in the isoforms A1-A5 while in A6 this amino acid was replaced by (10)Lys (T1'). Overall amino acid sequence as identified by MS/MS showed a high degree of similarity between A1, A2 and among A3, A4, A5. Carbohydrate binding domain and adenine binding site seem to be conserved.
Affiliation(s)
- Praveen Agrawal
- Proteomics and Structural Biology Division, Institute of Genomics and Integrative Biology, Delhi, India
|
37
|
Chandramouli K, Qian PY. Proteomics: challenges, techniques and possibilities to overcome biological sample complexity. HUMAN GENOMICS AND PROTEOMICS : HGP 2009; 2009. [PMID: 20948568 PMCID: PMC2950283 DOI: 10.4061/2009/239204] [Citation(s) in RCA: 230] [Impact Index Per Article: 14.4] [Received: 04/23/2009] [Accepted: 08/28/2009] [Indexed: 01/12/2023]
Abstract
Proteomics is the large-scale study of the structure and function of proteins in complex biological samples. Such an approach has the potential value to understand the complex nature of the organism. Current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of the proteome. Advances in protein fractionation and labeling techniques have improved protein identification to include the least abundant proteins. In addition, proteomics has been complemented by the analysis of posttranslational modifications and techniques for the quantitative comparison of different proteomes. However, the major limitation of proteomic investigations remains the complexity of biological structures and physiological processes, rendering the path of exploration paved with various difficulties and pitfalls. The quantity of data that is acquired with new techniques places new challenges on data processing and analysis. This article provides a brief overview of currently available proteomic techniques and their applications, followed by a detailed description of advantages and technical challenges. Some solutions to circumvent technical difficulties are proposed.
|
38
|
Krishnan HB, Oehrle NW, Natarajan SS. A rapid and simple procedure for the depletion of abundant storage proteins from legume seeds to advance proteome analysis: a case study using Glycine max. Proteomics 2009; 9:3174-88. [PMID: 19526550 DOI: 10.1002/pmic.200800875] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.7] [Received: 11/12/2008] [Accepted: 03/01/2009] [Indexed: 11/06/2022]
Abstract
2-D analysis of plant proteomes containing thousands of proteins has limited dynamic resolution because only abundant proteins can be detected. Proteomic assessment of the non-abundant proteins within seeds is difficult when 60-80% is storage proteins. Resolution can be improved through sample fractionation using separation techniques based upon different physiological or biochemical principles. We have developed a fast and simple fractionation technique using 10 mM Ca²⁺ to precipitate soybean (Glycine max) seed storage globulins, glycinin and beta-conglycinin. This method removes 87 ± 4% of the highly abundant seed proteins from the extract, allowing for 541 previously inconspicuous proteins present in soybean seed to be more detectable (volume increase of ≥50%) using fluorescent detection. Of those 541 enhanced spots, 197 increased more than 2.5-fold when visualized with Coomassie. The majority of those spots were isolated and identified using peptide mass fingerprinting. Fractionation also provided detection of 63 new phosphorylated protein spots and enhanced the visibility of 15 phosphorylated protein spots, using 2-D electrophoretic separation and an in-gel phosphoprotein stain. Application of this methodology toward other legumes, such as peanut, bean, pea, alfalfa and others, also containing high amounts of storage proteins, was examined, and is reported here.
Affiliation(s)
- Hari B Krishnan
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, MO 65211, USA.
|
39
|
Abu-Farha M, Elisma F, Zhou H, Tian R, Zhou H, Asmer MS, Figeys D. Proteomics: From Technology Developments to Biological Applications. Anal Chem 2009; 81:4585-99. [PMID: 19371061 DOI: 10.1021/ac900735j] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Indexed: 12/12/2022]
Affiliation(s)
- Mohamed Abu-Farha
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Fred Elisma
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Houjiang Zhou
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Ruijun Tian
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Hu Zhou
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Mehmet Selim Asmer
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Daniel Figeys
- Ottawa Institute of Systems Biology (OISB), University of Ottawa, Ottawa, Ontario, Canada, and Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
|
40
|
Carpenter PM, Dao AV, Arain ZS, Chang MK, Nguyen HP, Arain S, Wang-Rodriguez J, Kwon SY, Wilczynski SP. Motility induction in breast carcinoma by mammary epithelial laminin 332 (laminin 5). Mol Cancer Res 2009; 7:462-75. [PMID: 19351903 DOI: 10.1158/1541-7786.mcr-08-0148] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Indexed: 11/16/2022]
Abstract
Host interactions with tumor cells contribute to tumor progression by several means. This study was done to determine whether mammary epithelium could interact with breast carcinoma by producing substances capable of inducing motility in the cancer cells. Conditioned medium of immortalized 184A1 mammary epithelium collected in serum-free conditions induced dose-dependent motility in the MCF-7 breast carcinoma cell line by both a semiquantitative scattering assay and a Boyden chamber assay. Purification of the motility factor revealed that it was laminin 332 (formerly laminin 5) by mass spectroscopy. A Western blot of the 184A1 conditioned medium using a polyclonal antibody confirmed the presence of laminin 332 in the conditioned medium. Blockage of the motility with antibodies to the laminin 332 and its receptor components, α3 and β1 integrins, provided further evidence that tumor cell motility was caused by the laminin 332 in the conditioned medium. Invasion of MCF-7, BT-20, and MDA-MB-435 S was induced by purified laminin 332 and 184A1 conditioned medium and blocked by an anti-α3 integrin antibody. Staining of carcinoma in situ from breast cancer specimens revealed that laminin 332 in the myoepithelium adjacent to the preinvasive cells provided a source of laminin 332 that could potentially encourage the earliest steps of stromal invasion. In metaplastic breast carcinomas, the presence of laminin 332-producing cells coexpressing α3 integrin and the greater metastatic potential of tumors with higher laminin 332 levels suggest that laminin 332 expression is associated with aggressive features in these human breast cancers.
Affiliation(s)
- Philip M Carpenter
- Department of Pathology and Laboratory Medicine, University of California, Orange, CA 92868, USA.
|
41
|
Koenig T, Menze BH, Kirchner M, Monigatti F, Parker KC, Patterson T, Steen JJ, Hamprecht FA, Steen H. Robust prediction of the MASCOT score for an improved quality assessment in mass spectrometric proteomics. J Proteome Res 2008; 7:3708-17. [PMID: 18707158 DOI: 10.1021/pr700859x] [Citation(s) in RCA: 136] [Impact Index Per Article: 8.0] [Indexed: 11/28/2022]
Abstract
Protein identification by tandem mass spectrometry is based on the reliable processing of the acquired data. Unfortunately, the generation of a large number of poor quality spectra is commonly observed in LC-MS/MS, and the processing of these mostly noninformative spectra with its associated costs should be avoided. We present a continuous quality score that can be computed very quickly and that can be considered an approximation of the MASCOT score in case of a correct identification. This score can be used to reject low quality spectra prior to database identification, or to draw attention to those spectra that exhibit a (supposedly) high information content, but could not be identified. The proposed quality score can be calibrated automatically on site without the need for a manually generated training set. When this score is turned into a classifier and when features are used that are independent of the instrument, the proposed approach performs equally to previously published classifiers and feature sets and also gives insights into the behavior of the MASCOT score.
Affiliation(s)
- Thomas Koenig
- Interdisciplinary Center for Scientific Computing, University of Heidelberg, 69120 Heidelberg, Germany
|
42
|
Abstract
PROWL is a collection of tools for the identification of protein sequences, using input data derived from mass spectrometry. Experimental data from various types of mass spectrometers can be input directly into PROWL's component software. This unit presents protocols for several of the individual PROWL tools. Specifically, PepFrag allows for the analysis of a single spectrum derived from tandem mass spectrometry. GPM, on the other hand, provides for the analysis of multiple MS/MS spectra. An additional protocol introduces ProFound for analyzing a single spectrum of peptide mass fingerprinting data.
|
43
|
Meinnel T, Giglione C. Tools for analyzing and predicting N-terminal protein modifications. Proteomics 2008; 8:626-49. [DOI: 10.1002/pmic.200700592] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Indexed: 01/03/2023]
|
44
|
Ishino Y, Okada H, Ikeuchi M, Taniguchi H. Mass spectrometry-based prokaryote gene annotation. Proteomics 2008; 7:4053-65. [PMID: 17994627 DOI: 10.1002/pmic.200700080] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022]
Abstract
MS combined with database searching has become the preferred method for identifying proteins present in cell or tissue samples. The technique enables us to execute large-scale proteome analyses of species whose genomes have already been sequenced. Searching mass spectrometric data against protein databases composed of annotated genes has been widely conducted. However, there are some issues with this technique; wrong annotations in protein databases cause deterioration in the accuracy of protein identification, and only proteins that have already been annotated can be identified. We propose a new framework that can detect correct ORFs by integrating an MS/MS proteomic data mapping and a knowledge-based system regarding the translation initiation sites. This technique can provide correction of predicted coding sequences, together with the possibility of identifying novel genes. We have developed a computational system; it should first conduct the probabilistic peptide-matching against all possible translational frames using MS/MS data, then search for discriminative DNA patterns around the detected peptides, and lastly integrate the facts using empirical knowledge stored in knowledge bases to obtain correct ORFs. We used photosynthetic bacteria Synechocystis sp. PCC6803 as a sample prokaryote, resulting in the finding of 14 N-terminus annotation errors and several new candidate genes.
|
45
|
Affiliation(s)
- Yasushi Ishihama
- Institute for Advanced Biosciences, Keio University
- PRESTO, Japan Science and Technology Agency
|
46
|
Pitre S, Alamgir M, Green JR, Dumontier M, Dehne F, Golshani A. Computational methods for predicting protein-protein interactions. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2008; 110:247-67. [PMID: 18202838 DOI: 10.1007/10_2007_089] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 01/07/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in many cellular functions. A number of experimental techniques have been applied to discover PPIs; however, these techniques are expensive in terms of time, money, and expertise. There are also large discrepancies between the PPI data collected by the same or different techniques in the same organism. We therefore turn to computational techniques for the prediction of PPIs. Computational techniques have been applied to the collection, indexing, validation, analysis, and extrapolation of PPI data. This chapter will focus on computational prediction of PPI, reviewing a number of techniques including PIPE, developed in our own laboratory. For comparison, the conventional large-scale approaches to predict PPIs are also briefly discussed. The chapter concludes with a discussion of the limitations of both experimental and computational methods of determining PPIs.
Affiliation(s)
- Sylvain Pitre
- School of Computer Science, Carleton University, 5304 Herzberg Building, 1125 Colonel By Drive, K1S 5B6, Ottawa, Ontario, Canada
|
47
|
Gupta N, Tanner S, Jaitly N, Adkins JN, Lipton M, Edwards R, Romine M, Osterman A, Bafna V, Smith RD, Pevzner PA. Whole proteome analysis of post-translational modifications: applications of mass-spectrometry for proteogenomic annotation. Genome Res 2007; 17:1362-77. [PMID: 17690205 PMCID: PMC1950905 DOI: 10.1101/gr.6427907] [Citation(s) in RCA: 159] [Impact Index Per Article: 8.8] [Received: 02/22/2007] [Accepted: 06/12/2007] [Indexed: 11/24/2022]
Abstract
While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages, and cleavages of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs, and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.
Affiliation(s)
- Nitin Gupta
- Bioinformatics Program, University of California San Diego, La Jolla, California 92093, USA.
|
48
|
Vij S, Tyagi AK. Emerging trends in the functional genomics of the abiotic stress response in crop plants. PLANT BIOTECHNOLOGY JOURNAL 2007; 5:361-80. [PMID: 17430544 DOI: 10.1111/j.1467-7652.2007.00239.x] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.1] [Indexed: 05/08/2023]
Abstract
Plants are exposed to different abiotic stresses, such as water deficit, high temperature, salinity, cold, heavy metals and mechanical wounding, under field conditions. It is estimated that such stress conditions can potentially reduce the yield of crop plants by more than 50%. Investigations of the physiological, biochemical and molecular aspects of stress tolerance have been conducted to unravel the intrinsic mechanisms developed during evolution to mitigate against stress by plants. Before the advent of the genomics era, researchers primarily used a gene-by-gene approach to decipher the function of the genes involved in the abiotic stress response. However, abiotic stress tolerance is a complex trait and, although large numbers of genes have been identified to be involved in the abiotic stress response, there remain large gaps in our understanding of the trait. The availability of the genome sequences of certain important plant species has enabled the use of strategies, such as genome-wide expression profiling, to identify the genes associated with the stress response, followed by the verification of gene function by the analysis of mutants and transgenics. Certain components of both abscisic acid-dependent and -independent cascades involved in the stress response have already been identified. Information originating from the genome-wide analysis of abiotic stress tolerance will help to provide an insight into the stress-responsive network(s), and may allow the modification of this network to reduce the loss caused by stress and to increase agricultural productivity.
Affiliation(s)
- Shubha Vij
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi South Campus, New Delhi, India
|
49
|
Maron PA, Ranjard L, Mougel C, Lemanceau P. Metaproteomics: a new approach for studying functional microbial ecology. MICROBIAL ECOLOGY 2007; 53:486-93. [PMID: 17431707 DOI: 10.1007/s00248-006-9196-8] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.6] [Received: 11/17/2006] [Revised: 11/17/2006] [Accepted: 11/26/2006] [Indexed: 05/14/2023]
Abstract
In the postgenomic era, there is a clear recognition of the limitations of nucleic acid-based methods for getting information on functions expressed by microbial communities in situ. In this context, the large-scale study of proteins expressed by indigenous microbial communities (metaproteome) should provide information to gain insights into the functioning of the microbial component in ecosystems. Characterization of the metaproteome is expected to provide data linking genetic and functional diversity of microbial communities. Studies on the metaproteome together with those on the metagenome and the metatranscriptome will contribute to progress in our knowledge of microbial communities and their contribution in ecosystem functioning. Effectiveness of the metaproteomic approach will be improved as increasing metagenomic information is made available thanks to the environmental sequencing projects currently running. More specifically, analysis of metaproteome in contrasted environmental situations should allow (1) tracking new functional genes and metabolic pathways and (2) identifying proteins preferentially associated with specific stresses. These proteins considered as functional bioindicators should contribute, in the future, to help policy makers in defining strategies for sustainable management of our environment.
Affiliation(s)
- Pierre-Alain Maron
- UMR Microbiologie et Géochimie des Sols, INRA/Université de Bourgogne, CMSE, BP 86510, 17 rue de Sully, 21065, Dijon Cedex, France
|
50
|
Allmer J, Naumann B, Markert C, Zhang M, Hippler M. Mass spectrometric genomic data mining: Novel insights into bioenergetic pathways in Chlamydomonas reinhardtii. Proteomics 2007; 6:6207-20. [PMID: 17078018 DOI: 10.1002/pmic.200600208] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 11/10/2022]
Abstract
A new high-throughput computational strategy was established that improves genomic data mining from MS experiments. The MS/MS data were analyzed by the SEQUEST search algorithm and a combination of de novo amino acid sequencing in conjunction with an error-tolerant database search tool, operating on a 256 processor computer cluster. The error-tolerant search tool, previously established as GenomicPeptideFinder (GPF), enables detection of intron-split and/or alternatively spliced peptides from MS/MS data when deduced from genomic DNA. Isolated thylakoid membranes from the eukaryotic green alga Chlamydomonas reinhardtii were separated by 1-D SDS gel electrophoresis, protein bands were excised from the gel, digested in-gel with trypsin and analyzed by coupling nano-flow LC with MS/MS. The concerted action of SEQUEST and GPF allowed identification of 2622 distinct peptides. In total 448 peptides were identified by GPF analysis alone, including 98 intron-split peptides, resulting in the identification of novel proteins, improved annotation of gene models, and evidence of alternative splicing.
Affiliation(s)
- Jens Allmer
- Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
|