1
Song D, Hu X. Data Fusion Algorithm for Myocardial Proteomics and Its Research in Sports. Comput Math Methods Med 2022; 2022:4049169. [PMID: 35186113 PMCID: PMC8853782 DOI: 10.1155/2022/4049169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Revised: 12/29/2021] [Accepted: 01/10/2022] [Indexed: 11/19/2022]
Abstract
Sport is a comprehensive activity that people consciously engage in to improve physical fitness. Proteomics is a comprehensive technology dedicated to studying all protein profiles expressed by a species, individual organ, tissue, or cell under specific conditions and at specific times; it examines the protein composition of cells, tissues, or organisms and how that composition changes. Related technologies are now widely used in sports and other fields. The purpose of this article is to study myocardial proteomic technology and its application in sports. The main methods used are a literature survey and a controlled experiment. The study first reviews the results achieved and the open problems in this field, and then divides 30 SD rats into 3 groups for controlled experiments. The results showed that, among the three groups of rats, the left ventricular ejection fraction of the sham operation group was the highest, 7.7% and 4.6% higher than that of the operation group and the model group, respectively. The operation group had the highest left ventricular short-axis shortening rate and the longest left ventricular diastolic inner diameter. This suggests that myocardial proteomics can accurately reflect the heart condition of rats. In addition, the length, diastolic velocity, and diastolic time of cardiomyocytes differed among the three groups. The cardiomyocytes of the operation group had the longest time and the longest diastolic time, which were 37.1% and 8.5% higher than those of the sham operation group and the model group, respectively.
Affiliation(s)
- Ditao Song
  - College of Physical Education, Guangxi Science & Technology Normal University, Laibin, 546199 Guangxi, China
- Xiaoyong Hu
  - Institute of Physical Education, Guiyang College, Guiyang, 550005 Guizhou, China
2
Charmpi K, Chokkalingam M, Johnen R, Beyer A. Optimizing network propagation for multi-omics data integration. PLoS Comput Biol 2021; 17:e1009161. [PMID: 34762640 PMCID: PMC8664198 DOI: 10.1371/journal.pcbi.1009161] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/30/2021] [Revised: 12/10/2021] [Accepted: 10/12/2021] [Indexed: 01/11/2023]
Abstract
Network propagation refers to a class of algorithms that integrate information from input data across connected nodes in a given network. These algorithms have wide applications in systems biology, protein function prediction, inferring condition-specifically altered sub-networks, and prioritizing disease genes. Despite the popularity of network propagation, there is a lack of comparative analyses of different algorithms on real data and little guidance on how to select and parameterize the various algorithms. Here, we address this problem by analyzing different combinations of network normalization and propagation methods and by demonstrating schemes for the identification of optimal parameter settings on real proteome and transcriptome data. Our work highlights the risk of a ‘topology bias’ caused by the incorrect use of network normalization approaches. Capitalizing on the fact that network propagation is a regularization approach, we show that minimizing the bias-variance tradeoff can be utilized for selecting optimal parameters. The application to real multi-omics data demonstrated that optimal parameters could also be obtained by either maximizing the agreement between different omics layers (e.g. proteome and transcriptome) or by maximizing the consistency between biological replicates. Furthermore, we exemplified the utility and robustness of network propagation on multi-omics datasets for identifying ageing-associated genes in brain and liver tissues of rats and for elucidating molecular mechanisms underlying prostate cancer progression. Overall, this work compares different network propagation approaches and presents strategies for how to use network propagation algorithms to optimally address a specific research question at hand.

Modern technologies enable the simultaneous measurement of tens of thousands of molecules in biological samples. Algorithms called network propagation or network smoothing are frequently used to integrate such data with already known molecular interaction data, such as protein and gene interaction networks. These methods distribute the information on molecular perturbations within the network and help identify network regions that are enriched for many perturbed (affected) molecules. Despite the popularity of these methods, there is a lack of guidance on how to optimally use them. Here, we highlight possible pitfalls when using incorrect network normalization methods. Further, we present different ways for optimizing the smoothing parameters used during network smoothing: the first approach maximizes the consistency between replicate measurements within a dataset; the second one maximizes the consistency between different types of ‘omics’ measurements, such as proteomics and transcriptomics. Using two multi-omics datasets, one from a cohort of prostate cancer patients, the other one from an ageing study on rat brain and liver tissues, we exemplify the effects of these strategies on real data.
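As a rough illustration of the propagation idea summarized in this abstract, the sketch below implements one common variant, random walk with restart, on a column-normalized adjacency matrix. The abstract does not specify which algorithms or normalizations the paper compares in detail, so the function names, the choice of column normalization, and the smoothing parameter `alpha` are assumptions made for this example only.

```python
def column_normalize(adj):
    """Divide each column by its sum so that every non-empty column sums to 1."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def propagate(adj, scores, alpha=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart (illustrative, not the paper's code).

    Iterates f <- alpha * W f + (1 - alpha) * f0 until the largest change
    per node drops below tol. alpha controls how far the input scores
    spread: alpha = 0 returns the input, larger values smooth more.
    """
    w = column_normalize(adj)
    n = len(scores)
    current = list(scores)
    for _ in range(max_iter):
        updated = [
            alpha * sum(w[i][j] * current[j] for j in range(n))
            + (1 - alpha) * scores[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


# Hypothetical 3-node path network 0 - 1 - 2 with a perturbation on node 0.
path_network = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
smoothed = propagate(path_network, [1.0, 0.0, 0.0], alpha=0.5)
print(smoothed)
```

Because each column of the normalized matrix sums to 1, the total score is conserved while it spreads along edges. Other normalizations weight hubs differently, which is one way the choice of normalization can affect results.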
Affiliation(s)
- Konstantina Charmpi
  - CECAD Cologne Excellence Cluster on Cellular Stress Responses in Aging Associated Diseases, Cologne, Germany
- Manopriya Chokkalingam
  - CECAD Cologne Excellence Cluster on Cellular Stress Responses in Aging Associated Diseases, Cologne, Germany
- Ronja Johnen
  - CECAD Cologne Excellence Cluster on Cellular Stress Responses in Aging Associated Diseases, Cologne, Germany
- Andreas Beyer
  - CECAD Cologne Excellence Cluster on Cellular Stress Responses in Aging Associated Diseases, Cologne, Germany
  - Center for Molecular Medicine Cologne (CMMC), Medical Faculty, University of Cologne, Cologne, Germany
  - Institute for Genetics, Faculty of Mathematics and Natural Sciences, University of Cologne, Cologne, Germany
3
Li QK, Chen J, Hu Y, Höti N, Lih TSM, Thomas SN, Chen L, Roy S, Meeker A, Shah P, Chen L, Bova GS, Zhang B, Zhang H. Proteomic characterization of primary and metastatic prostate cancer reveals reduced proteinase activity in aggressive tumors. Sci Rep 2021; 11:18936. [PMID: 34556748 PMCID: PMC8460832 DOI: 10.1038/s41598-021-98410-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/20/2021] [Accepted: 09/03/2021] [Indexed: 12/29/2022]
Abstract
Prostate cancer (PCa) is a heterogeneous group of tumors with variable clinical courses. In order to improve patient outcomes, it is critical to clinically separate aggressive PCa (AG) from non-aggressive PCa (NAG). Although recent genomic studies have identified a spectrum of molecular abnormalities associated with aggressive PCa, it is still challenging to separate AG from NAG. To better understand the functional consequences of PCa progression and the unique features of the AG subtype, we studied the proteomic signatures of primary AG, NAG and metastatic PCa. 39 PCa and 10 benign prostate controls in a discovery cohort and 57 PCa in a validation cohort were analyzed using a data-independent acquisition (DIA) SWATH-MS platform. Proteins with the highest variances (top 500 proteins) were annotated for the pathway enrichment analysis. Functional analysis of differentially expressed proteins in NAG and AG was performed. Data was further validated using a validation cohort; and was also compared with a TCGA mRNA expression dataset and confirmed by immunohistochemistry (IHC) using PCa tissue microarray (TMA). 4,415 proteins were identified in the tumor and benign control tissues, including 158 up-regulated and 116 down-regulated proteins in AG tumors. A functional analysis of tumor-associated proteins revealed reduced expressions of several proteinases, including dipeptidyl peptidase 4 (DPP4), carboxypeptidase E (CPE) and prostate specific antigen (KLK3) in AG and metastatic PCa. A targeted analysis further identified that the reduced expression of DPP4 was associated with the accumulation of DPP4 substrates and the reduced ratio of DPP4 cleaved peptide to intact substrate peptide. Findings were further validated using an independently-collected tumor cohort, correlated with a TCGA mRNA dataset, and confirmed by immunohistochemical stains of PCa tumor microarray (TMA). Our study is the first large-scale proteomics analysis of PCa tissue using a DIA SWATH-MS platform. It provides not only an interrogative proteomic signature of PCa subtypes, but also indicates the critical roles played by certain proteinases during tumor progression. The spectrum map and protein profile generated in the study can be used to investigate potential biological mechanisms involved in PCa and for the development of a clinical assay to distinguish aggressive from indolent PCa.
Affiliation(s)
- Qing Kay Li
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
  - Department of Oncology, Sidney Kimmel Cancer Center, Johns Hopkins Medical Institutions, Baltimore, MD, USA
- Jing Chen
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Yingwei Hu
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Naseruddin Höti
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Tung-Shing Mamie Lih
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Stefani N Thomas
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Li Chen
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Sujayita Roy
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Alan Meeker
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Punit Shah
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Lijun Chen
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- G Steven Bova
  - Prostate Cancer Research Center, Faculty of Medicine and Health Technology, Tampere University, FI-33014, Tampere, Finland
- Bai Zhang
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
- Hui Zhang
  - Department of Pathology, The Johns Hopkins Medical Institutions, 600 N. Wolfe Street, Baltimore, MD, 21224, USA
  - Department of Oncology, Sidney Kimmel Cancer Center, Johns Hopkins Medical Institutions, Baltimore, MD, USA
  - Department of Urology, Sidney Kimmel Cancer Center, Johns Hopkins Medical Institutions, Baltimore, MD, USA
  - Johns Hopkins University, 400 N. Broadway, Smith Bldg Rm 4011, Baltimore, MD, 21287, USA
4
Gardner ML, Freitas MA. Multiple Imputation Approaches Applied to the Missing Value Problem in Bottom-Up Proteomics. Int J Mol Sci 2021; 22:ijms22179650. [PMID: 34502557 PMCID: PMC8431783 DOI: 10.3390/ijms22179650] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/20/2021] [Revised: 08/28/2021] [Accepted: 08/31/2021] [Indexed: 01/15/2023]
Abstract
Analysis of differential abundance in proteomics data sets requires careful application of missing value imputation. Missing abundance values widely vary when performing comparisons across different sample treatments. For example, one would expect a consistent rate of “missing at random” (MAR) across batches of samples and varying rates of “missing not at random” (MNAR) depending on the inherent difference in sample treatments within the study. The missing value imputation strategy must thus be selected that best accounts for both MAR and MNAR simultaneously. Several important issues must be considered when deciding the appropriate missing value imputation strategy: (1) when it is appropriate to impute data; (2) how to choose a method that reflects the combinatorial manner of MAR and MNAR that occurs in an experiment. This paper provides an evaluation of missing value imputation strategies used in proteomics and presents a case for the use of hybrid left-censored missing value imputation approaches that can handle the MNAR problem common to proteomics data.
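To make the MAR/MNAR distinction in this abstract concrete, here is a minimal sketch of one possible hybrid rule. It is not the authors' method: the 50% missingness threshold, the function name `hybrid_impute`, and the use of a single low `global_floor` value as the left-censored fill are all assumptions chosen for illustration.

```python
def hybrid_impute(values, global_floor, mnar_threshold=0.5):
    """Impute missing abundances for one protein within one treatment group.

    values: list of abundances, with None marking a missing value.
    global_floor: a low abundance (for example near the detection limit)
        used as the left-censored fill value; hypothetical parameter.
    mnar_threshold: if more than this fraction is missing, the group is
        treated as MNAR (signal likely below detection); otherwise as MAR.
    """
    if not values:
        return []
    observed = [v for v in values if v is not None]
    missing_fraction = 1 - len(observed) / len(values)
    if not observed or missing_fraction > mnar_threshold:
        fill = global_floor  # MNAR: assume the protein was too low to detect
    else:
        fill = sum(observed) / len(observed)  # MAR: borrow the group mean
    return [fill if v is None else v for v in values]


# Mostly observed group: the gap is filled with the mean of 10.0 and 12.0.
print(hybrid_impute([10.0, None, 12.0], global_floor=5.0))
# Mostly missing group: gaps are filled with the low floor value.
print(hybrid_impute([None, None, 6.0], global_floor=5.0))
```

Real pipelines typically draw left-censored values from a low-intensity distribution rather than a constant, and often use several imputed datasets. The constant fill here only keeps the decision logic easy to follow.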
Affiliation(s)
- Miranda L. Gardner
  - Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
  - Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
- Michael A. Freitas
  - Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
  - Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
5
Noberini R, Savoia EO, Brandini S, Greco F, Marra F, Bertalot G, Pruneri G, McDonnell LA, Bonaldi T. Spatial epi-proteomics enabled by histone post-translational modification analysis from low-abundance clinical samples. Clin Epigenetics 2021; 13:145. [PMID: 34315505 PMCID: PMC8317427 DOI: 10.1186/s13148-021-01120-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/22/2021] [Accepted: 06/18/2021] [Indexed: 11/17/2022]
Abstract
BACKGROUND Increasing evidence linking epigenetic mechanisms and different diseases, including cancer, has prompted in the last 15 years the investigation of histone post-translational modifications (PTMs) in clinical samples. Methods allowing the isolation of histones from patient samples followed by the accurate and comprehensive quantification of their PTMs by mass spectrometry (MS) have been developed. However, the applicability of these methods is limited by the requirement for substantial amounts of material. RESULTS To address this issue, in this study we streamlined the protein extraction procedure from low-amount clinical samples and tested and implemented different in-gel digestion strategies, obtaining a protocol that allows the MS-based analysis of the most common histone PTMs from laser microdissected tissue areas containing as low as 1000 cells, an amount approximately 500 times lower than what is required by available methods. We then applied this protocol to breast cancer patient laser microdissected tissues in two proof-of-concept experiments, identifying differences in histone marks in heterogeneous regions selected by either morphological evaluation or MALDI MS imaging. CONCLUSIONS These results demonstrate that analyzing histone PTMs from very small tissue areas and detecting differences from adjacent tumor regions is technically feasible. Our method opens the way for spatial epi-proteomics, namely the investigation of epigenetic features in the context of tissue and tumor heterogeneity, which will be instrumental for the identification of novel epigenetic biomarkers and aberrant epigenetic mechanisms.
Affiliation(s)
- Roberta Noberini
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Evelyn Oliva Savoia
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Stefania Brandini
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Francesco Greco
  - Institute of Life Sciences, Sant'Anna School of Advanced Studies, 56127, Pisa, Italy
  - Fondazione Pisana Per La Scienza ONLUS, 56107, San Giuliano Terme, PI, Italy
- Francesca Marra
  - Department of Pathology, Fondazione IRCCS-Istituto Nazionale Tumori, Milan, Italy
- Giovanni Bertalot
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Giancarlo Pruneri
  - Department of Pathology, Fondazione IRCCS-Istituto Nazionale Tumori, Milan, Italy
- Liam A McDonnell
  - Fondazione Pisana Per La Scienza ONLUS, 56107, San Giuliano Terme, PI, Italy
- Tiziana Bonaldi
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
6
Abstract
Since the outset of COVID-19, the pandemic has prompted immediate global efforts to sequence SARS-CoV-2, and over 450 000 complete genomes have been publicly deposited over the course of 12 months. Despite this, comparative nucleotide and amino acid sequence analyses often fall short in answering key questions in vaccine design. For example, the binding affinity between different ACE2 receptors and SARS-CoV-2 spike protein cannot be fully explained by amino acid similarity at ACE2 contact sites because protein structure similarities are not fully reflected by amino acid sequence similarities. To comprehensively compare protein homology, secondary structure (SS) analysis is required. While protein structure is slow and difficult to obtain, SS predictions can be made rapidly, and a well-predicted SS structure may serve as a viable proxy to gain biological insight. Here we review algorithms and information used in predicting protein SS to highlight its potential application in pandemic research. We also show examples of how SS predictions can be used to compare ACE2 proteins and to evaluate the zoonotic origins of viruses. As computational tools are much faster than wet-lab experiments, these applications can be important for research, especially in times when quickly obtained biological insights can help speed up the response to pandemics.
Affiliation(s)
- Alibek Kruglikov
  - Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
- Mohan Rakesh
  - Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
- Yulong Wei
  - Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
- Xuhua Xia
  - Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
  - Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
7
Abstract
In mass spectrometry-based proteomics, relative quantitative approaches enable differential protein abundance analysis. Isobaric labeling strategies, such as tandem mass tags (TMT), provide simultaneous quantification of several samples (e.g., up to 16 using 16plex TMTpro) owing to their multiplexing capability. This technology improves sample throughput and thereby minimizes both measurement time and overall experimental variation. However, TMT-based MS data processing and statistical analysis are probably the crucial parts of this pipeline for obtaining reliable, plausible, and significantly quantified results. Here, we provide a step-by-step guide to the analysis and evaluation of TMT quantitative proteomics data.
Affiliation(s)
- Oliver Pagel
  - Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Bunsen-Kirchhoff-Straße 11, Dortmund, Germany
- Laxmikanth Kollipara
  - Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Bunsen-Kirchhoff-Straße 11, Dortmund, Germany
- Albert Sickmann
  - Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Bunsen-Kirchhoff-Straße 11, Dortmund, Germany
  - Department of Chemistry, College of Physical Sciences, University of Aberdeen, Aberdeen, UK
  - Medizinische Fakultät, Medizinisches Proteom-Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
8
Antoniassi MP, Belardin LB, Camargo M, Intasqui P, Carvalho VM, Cardozo KHM, Bertolla RP. Seminal plasma protein networks and enriched functions in varicocele: Effect of smoking. Andrologia 2020; 52:e13562. [PMID: 32150769 DOI: 10.1111/and.13562] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/04/2019] [Revised: 01/30/2020] [Accepted: 02/22/2020] [Indexed: 12/14/2022]
Abstract
To verify a possible synergistic effect of smoking and varicocele on the seminal plasma proteome and biological functions, a cross-sectional study was performed in 25 smokers and 24 nonsmokers. Samples were used for conventional semen analysis, functional analysis (DNA fragmentation, acrosome integrity and mitochondrial activity) and proteomics by a shotgun approach. Functional enrichment of biological pathways was performed in differentially expressed proteins. Smokers presented lower ejaculate volume (p = .027), percentage of progressively motile spermatozoa (p = .002), total sperm count (p = .039), morphology (p = .001) and higher percentage of immotile spermatozoa (p = .03), round cell (p = .045) and neutrophil count (p = .009). Smokers also presented lower mitochondrial activity and acrosome integrity and higher DNA fragmentation. We identified and quantified 421 proteins in seminal plasma, of which one was exclusive, 21 were overexpressed and 70 were underexpressed in the seminal plasma of smokers. The proteins neprilysin, beta-defensin 106A and histone H4A were capable of predicting the smoker group. Enriched functions were related to immune function and sperm machinery in testis/epididymis. Based on our findings, we can conclude that cigarette smoking leads to the establishment of inflammatory protein pathways in the testis/epididymis in the presence of varicocele that seems to act in synergy with the toxic components of the cigarette.
Affiliation(s)
- Mariana P Antoniassi
  - Division of Urology, Department of Surgery, Human Reproduction Section, Universidade Federal de São Paulo, São Paulo, Brazil
- Larissa B Belardin
  - Division of Urology, Department of Surgery, Human Reproduction Section, Universidade Federal de São Paulo, São Paulo, Brazil
- Mariana Camargo
  - Division of Urology, Department of Surgery, Human Reproduction Section, Universidade Federal de São Paulo, São Paulo, Brazil
- Paula Intasqui
  - Division of Urology, Department of Surgery, Human Reproduction Section, Universidade Federal de São Paulo, São Paulo, Brazil
- Ricardo P Bertolla
  - Division of Urology, Department of Surgery, Human Reproduction Section, Universidade Federal de São Paulo, São Paulo, Brazil
9
Chantzi E, Neidlin M, Macheras GA, Alexopoulos LG, Gustafsson MG. COMBSecretomics: A pragmatic methodological framework for higher-order drug combination analysis using secretomics. PLoS One 2020; 15:e0232989. [PMID: 32407402 PMCID: PMC7224510 DOI: 10.1371/journal.pone.0232989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2020] [Accepted: 04/24/2020] [Indexed: 11/18/2022]
Abstract
Multi drug treatments are increasingly used in the clinic to combat complex and co-occurring diseases. However, most drug combination discovery efforts today are mainly focused on anticancer therapy and rarely examine the potential of using more than two drugs simultaneously. Moreover, there is currently no reported methodology for performing second- and higher-order drug combination analysis of secretomic patterns, meaning protein concentration profiles released by the cells. Here, we introduce COMBSecretomics (https://github.com/EffieChantzi/COMBSecretomics.git), the first pragmatic methodological framework designed to search exhaustively for second- and higher-order mixtures of candidate treatments that can modify, or even reverse malfunctioning secretomic patterns of human cells. This framework comes with two novel model-free combination analysis methods; a tailor-made generalization of the highest single agent principle and a data mining approach based on top-down hierarchical clustering. Quality control procedures to eliminate outliers and non-parametric statistics to quantify uncertainty in the results obtained are also included. COMBSecretomics is based on a standardized reproducible format and could be employed with any experimental platform that provides the required protein release data. Its practical use and functionality are demonstrated by means of a proof-of-principle pharmacological study related to cartilage degradation. COMBSecretomics is the first methodological framework reported to enable secretome-related second- and higher-order drug combination analysis. It could be used in drug discovery and development projects, clinical practice, as well as basic biological understanding of the largely unexplored changes in cell-cell communication that occurs due to disease and/or associated pharmacological treatment conditions.
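The highest single agent (HSA) principle mentioned in this abstract compares a combination's effect with the best effect achieved by any of its components alone. The sketch below shows the classical idea, extended in a simple way to higher-order mixtures by comparing against every measured lower-order subset. This extension, the function name `hsa_excess`, and the example numbers are assumptions for illustration; the abstract does not describe how COMBSecretomics generalizes the principle, and the sketch does not use its code.

```python
from itertools import combinations


def hsa_excess(effects, combo):
    """Excess of a combination over its best measured lower-order subset.

    effects: dict mapping frozenset of drug names -> observed effect,
        where a larger value is assumed to mean a better response.
    combo: iterable of drug names forming the combination of interest.
    Returns effect(combo) minus the largest effect among all measured
    proper, non-empty subsets; a positive value means the combination
    beats everything simpler that was tested.
    """
    combo = frozenset(combo)
    sub_effects = [
        effects[frozenset(subset)]
        for size in range(1, len(combo))
        for subset in combinations(sorted(combo), size)
        if frozenset(subset) in effects
    ]
    if not sub_effects:
        raise ValueError("no measured lower-order subset to compare against")
    return effects[combo] - max(sub_effects)


# Hypothetical effect values for three drugs and two of their mixtures.
example_effects = {
    frozenset({"A"}): 0.2,
    frozenset({"B"}): 0.3,
    frozenset({"C"}): 0.1,
    frozenset({"A", "B"}): 0.5,
    frozenset({"A", "B", "C"}): 0.6,
}
print(hsa_excess(example_effects, {"A", "B"}))
print(hsa_excess(example_effects, {"A", "B", "C"}))
```

For a secretomic readout the "effect" would have to summarize a whole protein release profile in one number (for example a distance to a healthy reference pattern). That reduction is left out here.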
Affiliation(s)
- Efthymia Chantzi
  - Cancer Pharmacology and Computational Medicine, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
  - Signals and Systems, Department of Electrical Engineering, Uppsala University, Uppsala, Sweden
- Michael Neidlin
  - Biomedical Systems Laboratory, Department of Mechanical Engineering, National Technical University of Athens, Athens, Greece
- Leonidas G. Alexopoulos
  - Biomedical Systems Laboratory, Department of Mechanical Engineering, National Technical University of Athens, Athens, Greece
- Mats G. Gustafsson
  - Cancer Pharmacology and Computational Medicine, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
  - Signals and Systems, Department of Electrical Engineering, Uppsala University, Uppsala, Sweden
10
Tini G, Marchetti L, Priami C, Scott-Boyer MP. Multi-omics integration-a comparison of unsupervised clustering methodologies. Brief Bioinform 2020; 20:1269-1279. [PMID: 29272335 DOI: 10.1093/bib/bbx167] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 09/02/2017] [Revised: 11/06/2017] [Indexed: 12/19/2022]
Abstract
With the recent developments in the field of multi-omics integration, interest in factors such as data preprocessing, the choice of integration method and the number of different omics considered has increased. In this work, the impact of these factors is explored when solving the problem of sample classification, by comparing the performances of five unsupervised algorithms: Multiple Canonical Correlation Analysis, Multiple Co-Inertia Analysis, Multiple Factor Analysis, Joint and Individual Variation Explained and Similarity Network Fusion. These methods were applied to three real data sets taken from the literature and several ad hoc simulated scenarios to discuss classification performance in different conditions of noise and signal strength across the data types. The impact of experimental design, feature selection and parameter training has also been evaluated to unravel important conditions that can affect the accuracy of the result.
11
Wang X, Shen S, Rasam SS, Qu J. MS1 ion current-based quantitative proteomics: A promising solution for reliable analysis of large biological cohorts. Mass Spectrom Rev 2019; 38:461-482. [PMID: 30920002 PMCID: PMC6849792 DOI: 10.1002/mas.21595] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 10/19/2018] [Accepted: 02/28/2019] [Indexed: 05/04/2023]
Abstract
The rapidly advancing field of pharmaceutical and clinical research calls for systematic, molecular-level characterization of complex biological systems. To this end, quantitative proteomics represents a powerful tool, but an optimal solution for reliable large-cohort proteomics analysis, as frequently involved in pharmaceutical/clinical investigations, is urgently needed. Large-cohort analysis remains challenging owing to the deteriorating quantitative quality and snowballing missing data and false-positive discovery of altered proteins when sample size increases. MS1 ion current-based methods, which have become an important class of label-free quantification techniques during the past decade, show considerable potential to achieve reproducible protein measurements in large cohorts with high quantitative accuracy/precision. Nonetheless, in order to fully unleash this potential, several critical prerequisites should be met. Here we provide an overview of the rationale of MS1-based strategies and then important considerations for experimental and data processing techniques, with the emphasis on (i) efficient and reproducible sample preparation and LC separation; (ii) sensitive, selective and high-resolution MS detection; (iii) accurate chromatographic alignment; (iv) sensitive and selective generation of quantitative features; and (v) optimal post-feature-generation data quality control. Prominent technical developments in these aspects are discussed. Finally, we review applications of the MS1-based strategy in disease mechanism studies, biomarker discovery, and pharmaceutical investigations.
Affiliation(s)
- Xue Wang
  - Department of Cell Stress Biology, Roswell Park Cancer Institute, Buffalo, New York
- Shichen Shen
  - Department of Pharmaceutical Sciences, University at Buffalo, State University of New York, New York, New York
- Sailee Suryakant Rasam
  - Department of Biochemistry, University at Buffalo, State University of New York, New York, New York
- Jun Qu
  - Department of Cell Stress Biology, Roswell Park Cancer Institute, Buffalo, New York
  - Department of Pharmaceutical Sciences, University at Buffalo, State University of New York, New York, New York
  - Department of Biochemistry, University at Buffalo, State University of New York, New York, New York
12
Martínez-Bartolomé S, Bamberger C, Lavallée-Adam M, McClatchy DB, Yates JR. Proteomics INTegrator (PINT): An Online Tool To Store, Query, and Visualize Large Proteomics Experiment Results. J Proteome Res 2019; 18:2999-3008. [PMID: 31260318 PMCID: PMC8278777 DOI: 10.1021/acs.jproteome.8b00711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The characterization of complex biological systems based on high-throughput protein quantification through mass spectrometry commonly involves differential expression analysis between replicate samples originating from different experimental conditions. Here we present Proteomics INTegrator (PINT), a new user-friendly Web-based platform-independent system to store, visualize, and query proteomics experiment results. PINT provides an extremely flexible query interface that allows advanced Boolean algebra-based data filtering of many different proteomics features such as confidence values, abundance levels or ratios, data set overlaps, sample characteristics, as well as UniProtKB annotations, which are transparently incorporated into the system. In addition, PINT allows developers to incorporate data visualization and analysis tools, such as PSEA-Quant and Reactome pathway analysis, for data set enrichment analysis. PINT serves as a centralized hub for large-scale proteomics data and as a platform for data analysis, facilitating the interpretation of proteomics results and expediting biologically relevant conclusions.
Affiliation(s)
- Salvador Martínez-Bartolomé
  - Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California, 92037, United States
- Casimir Bamberger
  - Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California, 92037, United States
- Mathieu Lavallée-Adam
  - Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, K1H 8M5, Canada
- Daniel B. McClatchy
  - Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California, 92037, United States
- John R. Yates
  - Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California, 92037, United States

13
Devabhaktuni A, Lin S, Zhang L, Swaminathan K, Gonzalez CG, Olsson N, Pearlman SM, Rawson K, Elias JE. TagGraph reveals vast protein modification landscapes from large tandem mass spectrometry datasets. Nat Biotechnol 2019; 37:469-479. [PMID: 30936560 PMCID: PMC6447449 DOI: 10.1038/s41587-019-0067-5] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 06/13/2016] [Accepted: 02/12/2019] [Indexed: 02/06/2023]
Abstract
Although mass spectrometry is well suited to identifying thousands of potential protein post-translational modifications (PTMs), it has historically been biased towards just a few. To measure the entire set of PTMs across diverse proteomes, software must overcome the dual challenges of covering enormous search spaces and distinguishing correct from incorrect spectrum interpretations. Here, we describe TagGraph, a computational tool that overcomes both challenges with an unrestricted string-based search method that is as much as 350-fold faster than existing approaches, and a probabilistic validation model that we optimized for PTM assignments. We applied TagGraph to a published human proteomic dataset of 25 million mass spectra and tripled confident spectrum identifications compared to its original analysis. We identified thousands of modification types on almost 1 million sites in the proteome. We show alternative contexts for highly abundant yet understudied PTMs such as proline hydroxylation, and its unexpected association with cancer mutations. By enabling broad characterization of PTMs, TagGraph informs as to how their functions and regulation intersect.
Affiliation(s)
- Arun Devabhaktuni
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Sarah Lin
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Lichao Zhang
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Kavya Swaminathan
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Carlos G Gonzalez
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Niclas Olsson
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Samuel M Pearlman
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Keith Rawson
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Joshua E Elias
  - Department of Chemical and Systems Biology Stanford School of Medicine, Stanford University, Stanford, CA, USA

14
Stein-O'Brien GL, Arora R, Culhane AC, Favorov AV, Garmire LX, Greene CS, Goff LA, Li Y, Ngom A, Ochs MF, Xu Y, Fertig EJ. Enter the Matrix: Factorization Uncovers Knowledge from Omics. Trends Genet 2018; 34:790-805. [PMID: 30143323 PMCID: PMC6309559 DOI: 10.1016/j.tig.2018.07.003] [Citation(s) in RCA: 100] [Impact Index Per Article: 16.7] [Received: 03/23/2018] [Revised: 06/01/2018] [Accepted: 07/16/2018] [Indexed: 12/20/2022]
Abstract
Omics data contain signals from the molecular, physical, and kinetic inter- and intracellular interactions that control biological systems. Matrix factorization (MF) techniques can reveal low-dimensional structure from high-dimensional data that reflect these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in applications ranging from pathway discovery to timecourse analysis. We review exemplary applications of MF for systems-level analyses. We discuss appropriate applications of these methods, their limitations, and focus on the analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with MF enables discovery from high-throughput data beyond the limits of current biological knowledge - answering questions from high-dimensional data that we have not yet thought to ask.
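To make the idea of low-dimensional structure concrete, here is a minimal sketch of one of the simplest factorizations, a rank-one approximation found by power iteration. It is an illustration only: the abstract does not name a specific algorithm, and the function name `rank_one_factor` and the toy matrix are assumptions made for this example.

```python
def rank_one_factor(X, iters=100):
    """Approximate X (rows = features, columns = samples) as outer(a, p).

    Hypothetical helper: plain power iteration in pure Python.
    Assumes X is non-zero and that the all-ones start vector is not
    orthogonal to the dominant sample pattern.
    """
    n, m = len(X), len(X[0])
    p = [1.0] * m  # initial guess for the sample-level pattern
    for _ in range(iters):
        # Project samples onto the current pattern to get feature weights.
        a = [sum(X[i][j] * p[j] for j in range(m)) for i in range(n)]
        norm = sum(v * v for v in a) ** 0.5
        a = [v / norm for v in a]  # keep the feature weights unit length
        # Re-estimate the sample pattern from the feature weights.
        p = [sum(X[i][j] * a[i] for i in range(n)) for j in range(m)]
    return a, p


# Toy matrix with exactly one underlying pattern (values are illustrative).
u = [1.0, 2.0, 3.0]
v = [2.0, 0.5, 1.0, 4.0]
X = [[ui * vj for vj in v] for ui in u]

a, p = rank_one_factor(X)
reconstruction = [[ai * pj for pj in p] for ai in a]
max_error = max(
    abs(X[i][j] - reconstruction[i][j])
    for i in range(len(u))
    for j in range(len(v))
)
```

Here `a` plays the role of feature weights and `p` the role of a pattern across samples; methods used in practice typically extract several such components and may add constraints such as non-negativity.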
Affiliation(s)
- Genevieve L Stein-O'Brien
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Raman Arora
  - Department of Computer Science, Institute for Data Intensive Engineering and Science, Johns Hopkins University, Baltimore, MD, USA
- Aedin C Culhane
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA
- Alexander V Favorov
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA; Vavilov Institute of General Genetics, Moscow, Russia
- Casey S Greene
  - Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, PA, USA; Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, PA, USA
- Loyal A Goff
  - Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Yifeng Li
  - Digital Technologies Research Centre, National Research Council of Canada, Ottawa, ON, Canada
- Aloune Ngom
  - School of Computer Science, University of Windsor, Windsor, ON, Canada
- Michael F Ochs
  - Department of Mathematics and Statistics, The College of New Jersey, Ewing, NJ, USA
- Yanxun Xu
  - Department of Applied Mathematics and Statistics, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Elana J Fertig
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA

15
Afgan E, Baker D, Batut B, van den Beek M, Bouvier D, Čech M, Chilton J, Clements D, Coraor N, Grüning BA, Guerler A, Hillman-Jackson J, Hiltemann S, Jalili V, Rasche H, Soranzo N, Goecks J, Taylor J, Nekrutenko A, Blankenberg D. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res 2018; 46:W537-W544. [PMID: 29790989 PMCID: PMC6030816 DOI: 10.1093/nar/gky379] [Citation(s) in RCA: 2148] [Impact Index Per Article: 358.0] [Received: 02/01/2018] [Revised: 04/25/2018] [Accepted: 05/02/2018] [Indexed: 02/06/2023] Open
Abstract
Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The number of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.
Affiliation(s)
- Enis Afgan
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Dannon Baker
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Bérénice Batut
  - Department of Computer Science, Albert-Ludwigs-University, Freiburg, Freiburg, Germany
- Dave Bouvier
  - Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA
- Martin Čech
  - Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA
- John Chilton
  - Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA
- Dave Clements
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Nate Coraor
  - Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA
- Björn A Grüning
  - Department of Computer Science, Albert-Ludwigs-University, Freiburg, Freiburg, Germany
  - Center for Biological Systems Analysis (ZBSA), University of Freiburg, Freiburg, Germany
- Aysam Guerler
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Jennifer Hillman-Jackson
  - Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA
- Saskia Hiltemann
  - Department of Pathology, Erasmus Medical Center, Rotterdam, The Netherlands
- Vahid Jalili
  - Department of Biomedical Engineering, Oregon Health and Science University, OR, USA
- Helena Rasche
  - Department of Computer Science, Albert-Ludwigs-University, Freiburg, Freiburg, Germany
- Jeremy Goecks
  - Department of Biomedical Engineering, Oregon Health and Science University, OR, USA
- James Taylor
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Anton Nekrutenko
  - Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA
- Daniel Blankenberg
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA

16
LeDuc RD, Schwämmle V, Shortreed MR, Cesnik AJ, Solntsev SK, Shaw JB, Martin MJ, Vizcaino JA, Alpi E, Danis P, Kelleher NL, Smith LM, Ge Y, Agar JN, Chamot-Rooke J, Loo JA, Pasa-Tolic L, Tsybin YO. ProForma: A Standard Proteoform Notation. J Proteome Res 2018; 17:1321-1325. [PMID: 29397739 PMCID: PMC5837035 DOI: 10.1021/acs.jproteome.7b00851] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 01/24/2023]
Abstract
The Consortium for Top-Down Proteomics (CTDP) proposes a standardized notation, ProForma, for writing the sequence of fully characterized proteoforms. ProForma provides a means to communicate any proteoform by writing the amino acid sequence using standard one-letter notation and specifying modifications or unidentified mass shifts within brackets following certain amino acids. The notation is unambiguous, human-readable, and can easily be parsed and written by bioinformatic tools. This system uses seven rules and supports a wide range of possible use cases, ensuring compatibility and reproducibility of proteoform annotations. Standardizing proteoform sequences will simplify storage, comparison, and reanalysis of proteomic studies, and the Consortium welcomes input and contributions from the research community on the continued design and maintenance of this standard.
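The bracket convention described above can be illustrated with a small parser. This is a simplified sketch of only the one rule stated in the abstract (one-letter residues, each optionally followed by bracketed annotations); it does not implement the full seven-rule standard, and the function name and the example tag strings are assumptions chosen for illustration.

```python
def parse_bracketed_sequence(text):
    """Split a bracket-annotated sequence into residues and annotations.

    Hypothetical helper. Returns (sequence, mods), where mods maps a
    1-based residue position to the list of bracket contents that
    follow that residue.
    """
    residues = []
    mods = {}
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "[":
            close = text.find("]", i)
            # An annotation must be closed and must follow some residue.
            if close == -1 or not residues:
                raise ValueError(f"malformed annotation at index {i}")
            mods.setdefault(len(residues), []).append(text[i + 1:close])
            i = close + 1
        elif ch.isalpha() and ch.isupper():
            residues.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at index {i}")
    return "".join(residues), mods


# Illustrative input: a named modification and an unidentified mass shift.
seq, mods = parse_bracketed_sequence("EM[Oxidation]EVEES[+79.966]PEK")
```

In this example the annotation `Oxidation` attaches to residue 2 (M) and the mass shift `+79.966` to residue 7 (S), while `seq` holds the bare amino acid sequence.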
Affiliation(s)
- Richard D. LeDuc
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60208, United States
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense, Denmark
- Michael R. Shortreed
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Anthony J. Cesnik
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Stefan K. Solntsev
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Jared B. Shaw
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Maria J. Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Juan A. Vizcaino
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Emanuele Alpi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Paul Danis
  - Consortium for Top-Down Proteomics, Cambridge, Massachusetts 02142, United States
- Neil L. Kelleher
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60208, United States
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
  - Genome Center of Wisconsin, University of Wisconsin, Madison, Wisconsin 53706, United States
- Ying Ge
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Jeffrey N. Agar
  - Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
- Julia Chamot-Rooke
  - Mass Spectrometry for Biology Unit, Institut Pasteur, CNRS USR 2000, Paris Cedex 15, France
- Joseph A. Loo
  - Department of Chemistry and Biochemistry and Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Ljiljana Pasa-Tolic
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States

17
Xu JY, Dai C, Shan JJ, Xie T, Xie HH, Wang MM, Yang G. Determination of the effect of Pinellia ternata (Thunb.) Breit. on nervous system development by proteomics. J Ethnopharmacol 2018; 213:221-229. [PMID: 29141195 DOI: 10.1016/j.jep.2017.11.014] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 02/15/2017] [Revised: 10/20/2017] [Accepted: 11/11/2017] [Indexed: 06/07/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE: Banxia (BX) is the dried tuber of Pinellia ternata (Thunb.) Breit., a commonly prescribed Chinese medicinal herb for the treatment of cough, phlegm, and vomiting in pregnant women. However, raw BX has been demonstrated to exert toxic effects on reproduction, and the precise and comprehensive mechanisms remain elusive. AIM OF THE STUDY: We applied an iTRAQ (isobaric tags for relative and absolute quantitation)-based proteomic method to explore the mechanisms of raw BX-induced fetal toxicity in mice. MATERIALS AND METHODS: The mice were separated into two groups, control mice and BX-treated mice. From gestation days 6-8, the control group was treated with normal saline and the BX group was exposed to BX suspension (2.275 g/kg/day). Gastrulae were obtained and analyzed using the quantitative proteomic approach of iTRAQ coupled to liquid chromatography-tandem mass spectrometry (LC-MS/MS). A multi-omics data analysis tool, OmicsBean (http://www.omicsbean.cn), was employed to conduct bioinformatic analysis of differentially abundant proteins (DAPs). Quantitative real-time PCR (qRT-PCR) and western blotting methods were applied to detect the protein expression levels and validate the quality of the proteomics. RESULTS: A total of 1245 proteins were identified with < 1% false discovery rate (FDR) and 583 protein abundance changes were confidently assessed. Moreover, 153 proteins identified in BX-treated samples showed significant differences in abundance. Bioinformatics analysis showed that the functions of 37 DAPs were predominantly related to nervous system development. The expression levels of the selected proteins for quantification by qRT-PCR or western blotting were consistent with the results in iTRAQ-labeled proteomics data. CONCLUSION: The results suggested that oral administration of BX in mice may cause fetal abnormality of the nervous system. The findings may be helpful to elucidate the underlying mechanisms of BX-induced embryotoxicity.
Affiliation(s)
- Jian-Ya Xu
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China; Jiangsu Key Laboratory of Pediatric Respiratory Disease, Institute of Pediatrics, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Chen Dai
  - College of Life Sciences, Nanjing Agricultural University, Nanjing 210095, China
- Jin-Jun Shan
  - Jiangsu Key Laboratory of Pediatric Respiratory Disease, Institute of Pediatrics, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Tong Xie
  - Jiangsu Key Laboratory of Pediatric Respiratory Disease, Institute of Pediatrics, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Hui-Hui Xie
  - Department of Pediatrics, Zhejiang Provincial Hospital of Traditional Chinese Medicine, Hangzhou 310006, China
- Ming-Ming Wang
  - Jiangsu Key Laboratory of Pediatric Respiratory Disease, Institute of Pediatrics, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Guang Yang
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China

18
Agrawal M, Zitnik M, Leskovec J. Large-scale analysis of disease pathways in the human interactome. Pac Symp Biocomput 2018; 23:111-122. [PMID: 29218874 PMCID: PMC5731453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Discovering disease pathways, which can be defined as sets of proteins associated with a given disease, is an important problem that has the potential to provide clinically actionable insights for disease diagnosis, prognosis, and treatment. Computational methods aid the discovery by relying on protein-protein interaction (PPI) networks. They start with a few known disease-associated proteins and aim to find the rest of the pathway by exploring the PPI network around the known disease proteins. However, the success of such methods has been limited, and failure cases have not been well understood. Here we study the PPI network structure of 519 disease pathways. We find that 90% of pathways do not correspond to single well-connected components in the PPI network. Instead, proteins associated with a single disease tend to form many separate connected components/regions in the network. We then evaluate state-of-the-art disease pathway discovery methods and show that their performance is especially poor on diseases with disconnected pathways. Thus, we conclude that network connectivity structure alone may not be sufficient for disease pathway discovery. However, we show that higher-order network structures, such as small subgraphs of the pathway, provide a promising direction for the development of new methods.
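The fragmentation finding can be illustrated by counting connected components among a pathway's proteins. The sketch below assumes the PPI network is given as a list of undirected edges and keeps only interactions whose two endpoints both belong to the pathway; the function name, protein identifiers, and toy network are hypothetical and are not data from the study.

```python
from collections import deque


def pathway_components(edges, pathway):
    """Connected components of the subgraph induced by the pathway proteins."""
    pathway = set(pathway)
    neighbors = {protein: set() for protein in pathway}
    for a, b in edges:
        # Keep only interactions between two pathway proteins.
        if a in pathway and b in pathway:
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen = set()
    components = []
    for start in sorted(pathway):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


# Toy network: X1 and X2 are proteins outside the pathway.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "X1"), ("X1", "P4"), ("P5", "X2")]
pathway = ["P1", "P2", "P3", "P4", "P5"]
components = pathway_components(edges, pathway)
```

In this toy case the five pathway proteins fall into three components, because P4 is linked to the others only through the non-pathway protein X1 and P5 has no pathway neighbor at all.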
Affiliation(s)
- Monica Agrawal
  - Department of Computer Science, Stanford University, Stanford, CA, USA

19
Lyon YA, Riggs D, Fornelli L, Compton PD, Julian RR. The Ups and Downs of Repeated Cleavage and Internal Fragment Production in Top-Down Proteomics. J Am Soc Mass Spectrom 2018; 29:150-157. [PMID: 29038993 PMCID: PMC5786485 DOI: 10.1007/s13361-017-1823-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 04/28/2017] [Revised: 09/08/2017] [Accepted: 09/23/2017] [Indexed: 05/10/2023]
Abstract
Analysis of whole proteins by mass spectrometry, or top-down proteomics, has several advantages over methods relying on proteolysis. For example, proteoforms can be unambiguously identified and examined. However, from a gas-phase ion-chemistry perspective, proteins are enormous molecules that present novel challenges relative to peptide analysis. Herein, the statistics of cleaving the peptide backbone multiple times are examined to evaluate the inherent propensity for generating internal versus terminal ions. The raw statistics reveal an inherent bias favoring production of terminal ions, which holds true regardless of protein size. Importantly, even if the full suite of internal ions is generated by statistical dissociation, terminal ions are predicted to account for at least 50% of the total ion current, regardless of protein size, if there are three backbone dissociations or fewer. Top-down analysis should therefore be a viable approach for examining proteins of significant size. Comparison of the purely statistical analysis with actual top-down data derived from ultraviolet photodissociation (UVPD) and higher-energy collisional dissociation (HCD) reveals that terminal ions account for much of the total ion current in both experiments. Terminal ion production is more favored in UVPD relative to HCD, which is likely due to differences in the mechanisms controlling fragmentation. Importantly, internal ions are not found to dominate from either the theoretical or experimental point of view.
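A simple counting argument is consistent with the 50% threshold quoted above. Cutting a linear chain at k distinct backbone bonds yields k + 1 fragments, of which exactly two contain a terminus, so the terminal share is 2/(k + 1) whatever the chain length. The sketch below checks this by brute-force enumeration; it assumes every fragment contributes equally to the signal, which is a simplification made for this example and not necessarily the model used in the study, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def terminal_fraction(n_residues, k):
    """Share of fragments containing the N- or C-terminus.

    Averages over every way to cut k distinct backbone bonds in a chain
    of n_residues. Requires 1 <= k <= n_residues - 1.
    """
    terminal = total = 0
    for cuts in combinations(range(1, n_residues), k):
        bounds = (0, *cuts, n_residues)
        # Consecutive boundaries delimit one fragment each.
        for start, end in zip(bounds, bounds[1:]):
            total += 1
            if start == 0 or end == n_residues:
                terminal += 1
    return Fraction(terminal, total)


# Terminal share for one to four cleavages in a 10-residue chain.
shares = {k: terminal_fraction(10, k) for k in range(1, 5)}
# The same three-cleavage case for a chain twice as long.
share_long_chain = terminal_fraction(20, 3)
```

Under this equal-weight assumption the terminal share stays at or above one half for up to three cleavages and drops below it from four cleavages onward, independent of chain length.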
Affiliation(s)
- Yana A Lyon
  - Department of Chemistry, University of California, Riverside, 501 Big Springs Road, Riverside, CA, 92521, USA
- Dylan Riggs
  - Department of Chemistry, University of California, Riverside, 501 Big Springs Road, Riverside, CA, 92521, USA
- Luca Fornelli
  - Departments of Chemistry and Molecular Biosciences, and the Proteomics Center of Excellence, Northwestern University, 2145 N. Sheridan Road, Evanston, IL, 60208, USA
- Philip D Compton
  - Departments of Chemistry and Molecular Biosciences, and the Proteomics Center of Excellence, Northwestern University, 2145 N. Sheridan Road, Evanston, IL, 60208, USA
- Ryan R Julian
  - Department of Chemistry, University of California, Riverside, 501 Big Springs Road, Riverside, CA, 92521, USA

20
Srivastava A, Kulkarni C, Mallick P, Huang K, Machiraju R. Building trans-omics evidence: using imaging and 'omics' to characterize cancer profiles. Pac Symp Biocomput 2018; 23:377-387. [PMID: 29218898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Utilization of single modality data to build predictive models in cancer results in a rather narrow view of most patient profiles. Some clinical facets relate strongly to histology image features, e.g. tumor stages, whereas others are associated with genomic and proteomic variations (e.g. cancer subtypes and disease aggression biomarkers). We hypothesize that there are coherent "trans-omics" features that characterize varied clinical cohorts across multiple sources of data, leading to more descriptive and robust disease characterization. In this work, for 105 breast cancer patients from the TCGA (The Cancer Genome Atlas), we consider four clinical attributes (AJCC Stage, Tumor Stage, ER-Status and PAM50 mRNA Subtypes), and build predictive models using three different modalities of data (histopathological images, transcriptomics and proteomics). We then identify critical multi-level features that drive successful classification of patients for the various cohorts. To build predictors for each data type, we employ widely used "best practice" techniques including CNN-based (convolutional neural network) classifiers for histopathological images and regression models for proteogenomic data. While, as expected, histology images outperformed molecular features when predicting cancer stages, and transcriptomics held superior discriminatory power for ER-Status and PAM50 subtypes, there exist a few cases where all data modalities exhibited comparable performance. Further, we also identified sets of key genes and proteins whose expression and abundance correlate across each clinical cohort including (i) tumor severity and progression (incl. GABARAP), (ii) ER-status (incl. ESR1) and (iii) disease subtypes (incl. FOXC1). Thus, we quantitatively assess the efficacy of different data types to predict critical breast cancer patient attributes and improve disease characterization.
Affiliation(s)
- Arunima Srivastava
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Avenue, Columbus, OH 43210, USA

21
Delfani P, Dexlin Mellby L, Nordström M, Holmér A, Ohlsson M, Borrebaeck CAK, Wingren C. Technical Advances of the Recombinant Antibody Microarray Technology Platform for Clinical Immunoproteomics. PLoS One 2016; 11:e0159138. [PMID: 27414037 PMCID: PMC4944972 DOI: 10.1371/journal.pone.0159138] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 03/08/2016] [Accepted: 06/28/2016] [Indexed: 01/30/2023] Open
Abstract
In the quest for deciphering disease-associated biomarkers, high-performing tools for multiplexed protein expression profiling of crude clinical samples will be crucial. Affinity proteomics, mainly represented by antibody-based microarrays, has in recent years been established as a proteomic tool providing unique opportunities for parallelized protein expression profiling. But despite the progress, several main technical features and assay procedures remain to be (fully) resolved. Among these issues, the handling of protein microarray data, i.e. the biostatistics parts, is one of the key features to solve. In this study, we have therefore further optimized, validated, and standardized our in-house designed recombinant antibody microarray technology platform. To this end, we addressed the main remaining technical issues (e.g. antibody quality, array production, sample labelling, and selected assay conditions) and most importantly key biostatistics subjects (e.g. array data pre-processing and biomarker panel condensation). This represents one of the first antibody array studies in which these key biostatistics subjects have been studied in detail. Here, we thus present the next generation of the recombinant antibody microarray technology platform designed for clinical immunoproteomics.
Affiliation(s)
- Payam Delfani
  - Department of Immunotechnology and CREATE Health, Lund University, Medicon Village, Lund, Sweden
- Linda Dexlin Mellby
  - Department of Immunotechnology and CREATE Health, Lund University, Medicon Village, Lund, Sweden
  - Immunovia AB, Lund, Sweden
- Mattias Ohlsson
  - Computational Biology & Biological Physics, Department of Astronomy and Theoretical Physics, Lund University, Lund, Sweden
- Carl A. K. Borrebaeck
  - Department of Immunotechnology and CREATE Health, Lund University, Medicon Village, Lund, Sweden
- Christer Wingren
  - Department of Immunotechnology and CREATE Health, Lund University, Medicon Village, Lund, Sweden

22
Breckels LM, Holden SB, Wojnar D, Mulvey CM, Christoforou A, Groen A, Trotter MWB, Kohlbacher O, Lilley KS, Gatto L. Learning from Heterogeneous Data Sources: An Application in Spatial Proteomics. PLoS Comput Biol 2016; 12:e1004920. [PMID: 27175778 PMCID: PMC4866734 DOI: 10.1371/journal.pcbi.1004920] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 10/07/2015] [Accepted: 04/16/2016] [Indexed: 11/19/2022] Open
Abstract
Sub-cellular localisation of proteins is an essential post-translational regulatory mechanism that can be assayed using high-throughput mass spectrometry (MS). These MS-based spatial proteomics experiments enable us to pinpoint the sub-cellular distribution of thousands of proteins in a specific system under controlled conditions. Recent advances in high-throughput MS methods have yielded a plethora of experimental spatial proteomics data for the cell biology community. Yet, there are many third-party data sources, such as immunofluorescence microscopy or protein annotations and sequences, which represent a rich and vast source of complementary information. We present a unique transfer learning classification framework that utilises a nearest-neighbour or support vector machine system to integrate heterogeneous data sources and considerably improve on the quantity and quality of sub-cellular protein assignment. We demonstrate the utility of our algorithms through evaluation of five experimental datasets, from four different species, in conjunction with four different auxiliary data sources to classify proteins to tens of sub-cellular compartments with high generalisation accuracy. We further apply the method to an experiment on pluripotent mouse embryonic stem cells to classify a set of previously unknown proteins, and validate our findings against a recent high resolution map of the mouse stem cell proteome. The methodology is distributed as part of the open-source Bioconductor pRoloc suite for spatial proteomics data analysis.

Sub-cellular localisation of proteins is critical to their function in all cellular processes; proteins localising to their intended micro-environment, e.g. organelles, vesicles or macro-molecular complexes, will meet the interaction partners and biochemical conditions suitable to pursue their molecular function. Therefore, sound data and methods to reliably and systematically study protein localisation, and hence their mis-localisation and the disruption of protein trafficking, that are relied upon by the cell biology community, are essential. Here we present a method to infer protein localisation relying on the optimal integration of experimental mass spectrometry-based data and auxiliary sources, such as GO annotation, outputs from third-party software, protein-protein interactions or immunocytochemistry data. We found that the application of transfer learning algorithms across these diverse data sources considerably improves on the quantity and reliability of sub-cellular protein assignment, compared to single data classifiers previously applied to infer sub-cellular localisation using experimental data only. We show how our method does not compromise biologically relevant experimental-specific signal after integration with heterogeneous freely available third-party resources. The integration of different data sources is an important challenge in the data intensive world of biology and we anticipate the transfer learning methods presented here will prove useful to many areas of biology, to unify data obtained from different but complementary sources.
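As an illustration only, and not the pRoloc implementation, the idea of weighting nearest-neighbour votes from the primary experimental data against votes from an auxiliary source can be sketched in Python. The function name `combined_vote`, the per-class weights `theta`, and the compartment labels are hypothetical assumptions introduced for this sketch.

```python
from collections import Counter

def combined_vote(primary_labels, auxiliary_labels, theta):
    """Combine neighbour votes from two data sources (illustrative sketch).

    primary_labels / auxiliary_labels: class labels of the k nearest
    neighbours found in each source (hypothetical inputs).
    theta: assumed per-class weight in [0, 1]; 1.0 trusts only the primary
    source for that class, 0.0 trusts only the auxiliary source.
    """
    primary = Counter(primary_labels)
    auxiliary = Counter(auxiliary_labels)
    classes = set(primary) | set(auxiliary)
    scores = {}
    for c in classes:
        w = theta.get(c, 0.5)  # assumed default: equal trust in both sources
        scores[c] = w * primary[c] + (1.0 - w) * auxiliary[c]
    # Sort first so that ties are broken deterministically by label.
    best = max(sorted(scores), key=scores.get)
    return best, scores

# Hypothetical neighbour labels for one unlabelled protein.
label, scores = combined_vote(
    ["mito", "mito", "ER"],   # neighbours in the primary (MS) data
    ["ER", "ER", "ER"],       # neighbours in the auxiliary data
    {"mito": 0.8, "ER": 0.3},
)
print(label)
```

With these made-up weights the auxiliary votes tip the decision towards `ER`; setting every weight to 1.0 reduces the sketch to an ordinary vote over the primary neighbours alone.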
Affiliation(s)
- Lisa M. Breckels
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Sean B. Holden
- Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
- David Wojnar
- Quantitative Biology Center, Universität Tübingen, Tübingen, Germany
- Claire M. Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Andy Christoforou
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Arnoud Groen
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Oliver Kohlbacher
- Quantitative Biology Center, Universität Tübingen, Tübingen, Germany
- Center for Bioinformatics, Universität Tübingen, Tübingen, Germany
- Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Kathryn S. Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Laurent Gatto
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
23
Abstract
This chapter discusses experimental design and the use of statistics to describe characteristics of data (descriptive statistics), as well as inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data, nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance between committing type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (randomized clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
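To make the distinction between the two 2 × 2 table measures concrete, here is a minimal Python sketch using the standard textbook definitions. The cell names `a`, `b`, `c`, `d` and the counts are illustrative assumptions, not data from the chapter.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2 x 2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    Compares the outcome *risk* (proportion) in the two groups, so it
    needs group totals, as in a cohort study or randomized trial.
    """
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds ratio from the same table: the cross-product (a*d) / (b*c).

    Compares outcome *odds*, which can still be estimated when subjects
    are sampled by outcome, as in a case-control study.
    """
    return (a * d) / (b * c)

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed subjects
# develop the outcome.
print(relative_risk(20, 80, 10, 90))
print(odds_ratio(20, 80, 10, 90))
```

For these made-up counts the two measures differ (2.0 versus 2.25), which illustrates why they are not interchangeable even though both start from the same table.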
Affiliation(s)
- Evelyn Schlenker
- Division of Basic Biomedical Sciences, Sanford School of Medicine, The University of South Dakota, 414 E. Clark Street, Vermillion, SD, 57069, USA.
24
Verkhivker GM. Integrating Genetic and Structural Data on Human Protein Kinome in Network-Based Modeling of Kinase Sensitivities and Resistance to Targeted and Personalized Anticancer Drugs. Pac Symp Biocomput 2016; 21:45-56. [PMID: 26776172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
The human protein kinome presents one of the largest protein families that orchestrate functional processes in complex cellular networks and, when perturbed, can cause various cancers. The abundance and diversity of genetic, structural, and biochemical data underlies the complexity of mechanisms by which targeted and personalized drugs can combat mutational profiles in protein kinases. Coupled with the evolution of systems biology approaches, genomic and proteomic technologies are rapidly identifying and characterizing novel resistance mechanisms with the goal of informing the rational design of personalized kinase drugs. Integration of experimental and computational approaches can help to bring these data into a unified conceptual framework and develop robust models for predicting clinical drug resistance. In the current study, we employ a battery of synergistic computational approaches that integrate genetic, evolutionary, biochemical, and structural data to characterize the effect of cancer mutations in protein kinases. We provide a detailed structural classification and analysis of genetic signatures associated with oncogenic mutations. By integrating genetic and structural data, we employ network modeling to dissect mechanisms of kinase drug sensitivities to oncogenic EGFR mutations. Using biophysical simulations and analysis of protein structure networks, we show that conformation-specific drug binding of Lapatinib may elicit resistant mutations in the EGFR kinase that are linked with the ligand-mediated changes in the residue interaction networks and global network properties of key residues that are responsible for structural stability of specific functional states. A strong network dependency on high centrality residues in the conformation-specific Lapatinib-EGFR complex may explain vulnerability of drug binding to a broad spectrum of mutations and the emergence of drug resistance. Our study offers a systems-based perspective on drug design by unravelling complex relationships between robustness of targeted kinase genes and binding specificity of targeted kinase drugs. We discuss how these approaches can exploit advances in chemical biology and network science to develop novel strategies for rationally tailored and robust personalized drug therapies.
Affiliation(s)
- Gennady M Verkhivker
- Department of Computational Biosciences, Schmid College of Science & Technology, Chapman University, One University Drive, Orange CA 92866, USA
- Department of Pharmacology, University of California San Diego, 9500 Gilman Drive, San Diego CA 92093, USA
This work is partly supported by funding from Chapman University.
25
Abstract
After separation through two-dimensional gel electrophoresis (2-DE), several hundreds of individual protein abundances can be quantified in a cell population or sample tissue. However, gel-based proteomics has the reputation of being a slow and cumbersome art. But art is not dead! While 2-DE may no longer be the tool of choice in high-throughput differential proteomics, it is still very effective to identify and quantify protein species caused by genetic variations, alternative splicing, and/or PTMs. This chapter reviews some typical statistical exploratory and confirmatory tools available and suggests case-specific guidelines for (1) the discovery of potentially interesting protein spots, and (2) the further characterization of protein families and their possible PTMs.
Affiliation(s)
- Sebastien C Carpentier
- Department of Biosystems, Faculty of Bioscience Engineering, K.U. Leuven, Willem Decroylaan 42, Leuven, 3001, Belgium.
- SYBIOMA: Facility for Systems Biology Based Mass Spectrometry, Herestraat 49 O&N2, Leuven, 3000, Belgium.
26
Alessio M, Cannistraci CV. Nonlinear Dimensionality Reduction by Minimum Curvilinearity for Unsupervised Discovery of Patterns in Multidimensional Proteomic Data. Methods Mol Biol 2016; 1384:289-298. [PMID: 26611421 DOI: 10.1007/978-1-4939-3255-9_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Dimensionality reduction is widely and successfully employed for the visualization and discrimination of patterns hidden in multidimensional proteomics datasets. Principal component analysis (PCA), which is the preferred approach for linear dimensionality reduction, may present serious limitations, in particular when samples are nonlinearly related, as often occurs in two-dimensional electrophoresis (2-DE) datasets. An aggravating factor is that PCA robustness is impaired when the number of samples is small in comparison to the number of proteomic features, which is the case in high-dimensional proteomic datasets, including 2-DE ones. Here, we describe the use of a nonlinear unsupervised learning machine for dimensionality reduction called minimum curvilinear embedding (MCE), which was successfully applied to datasets from different biological samples. In particular, we provide an example where we directly compare MCE performance with that of PCA in disclosing neuropathic pain patterns hidden in a multidimensional proteomic dataset.
Affiliation(s)
- Massimo Alessio
- Proteome Biochemistry, IRCCS-San Raffaele Scientific Institute, Milan, Italy.
- Carlo Vittorio Cannistraci
- Biomedical Cybernetics Group, Biotechnology Center (BIOTEC), Technische Universität Dresden, Tatzberg 47/49, 01307, Dresden, Germany.
27
Semba RD, Lam M, Sun K, Zhang P, Schaumberg DA, Ferrucci L, Ping P, Van Eyk JE. Priorities and trends in the study of proteins in eye research, 1924-2014. Proteomics Clin Appl 2015; 9:1105-22. [PMID: 26123431 PMCID: PMC4695326 DOI: 10.1002/prca.201500006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/15/2015] [Revised: 03/26/2015] [Accepted: 06/25/2015] [Indexed: 11/12/2022]
Abstract
PURPOSE: To identify the proteins that are relevant to eye research and develop assays for the study of a set of these proteins. EXPERIMENTAL DESIGN: We conducted a bibliometric analysis by merging gene lists for human and mouse from the National Center for Biotechnology Information FTP site and combining them with PubMed references that were retrieved with the search terms "eye" [MeSH Terms] OR "eye" [All Fields] OR "eyes" [All Fields]. RESULTS: For human and mouse eye studies, respectively, the total number of publications was 13,525 and 23,895 and the total number of proteins was 4050 and 4717. For proteins in human and mouse eye studies, respectively, 88.7 and 81.7% had five or fewer citations. The top 50 most intensively studied proteins for human and mouse eye studies were generally in the areas of photoreceptors and phototransduction, inflammation and angiogenesis, neurodevelopment, lens transparency, and cell-cycle and cellular processes. We proposed selected reaction monitoring assays that were developed in silico for the top 50 most intensively studied proteins in human and mouse eye research. CONCLUSIONS AND CLINICAL RELEVANCE: We conclude that scientists engaged in eye research tend to focus on the same proteins. Newer resources and tools in proteomics can expand the investigations to lesser-known proteins of the eye.
Affiliation(s)
- Richard D. Semba
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD
- Maggie Lam
- Cardiac Proteomics and Signaling Laboratory, Department of Physiology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA
- Kai Sun
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD
- Pingbo Zhang
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD
- Debra A. Schaumberg
- Center for Translational Medicine, Moran Eye Center, University of Utah School of Medicine, Salt Lake City, UT
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA
- Luigi Ferrucci
- National Institute on Aging, National Institutes of Health, Baltimore, MD
- Peipei Ping
- Cardiac Proteomics and Signaling Laboratory, Department of Physiology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA
- Jennifer E. Van Eyk
- Advanced Clinical BioSystems Research Institute, The Heart Institute and Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA
28
Elia AEH, Boardman AP, Wang DC, Huttlin EL, Everley RA, Dephoure N, Zhou C, Koren I, Gygi SP, Elledge SJ. Quantitative Proteomic Atlas of Ubiquitination and Acetylation in the DNA Damage Response. Mol Cell 2015; 59:867-81. [PMID: 26051181 DOI: 10.1016/j.molcel.2015.05.006] [Citation(s) in RCA: 243] [Impact Index Per Article: 27.0] [Received: 07/08/2014] [Revised: 03/17/2015] [Accepted: 04/29/2015] [Indexed: 01/06/2023]
Abstract
Execution of the DNA damage response (DDR) relies upon a dynamic array of protein modifications. Using quantitative proteomics, we have globally profiled ubiquitination, acetylation, and phosphorylation in response to UV and ionizing radiation. To improve acetylation site profiling, we developed the strategy FACET-IP. Our datasets of 33,500 ubiquitination and 16,740 acetylation sites provide valuable insight into DDR remodeling of the proteome. We find that K6- and K33-linked polyubiquitination undergo bulk increases in response to DNA damage, raising the possibility that these linkages are largely dedicated to DDR function. We also show that Cullin-RING ligases mediate 10% of DNA damage-induced ubiquitination events and that EXO1 is an SCF-Cyclin F substrate in the response to UV radiation. Our extensive datasets uncover additional regulated sites on known DDR players such as PCNA and identify previously unknown DDR targets such as CENPs, underscoring the broad impact of the DDR on cellular physiology.
Affiliation(s)
- Andrew E H Elia
- Department of Genetics, Harvard Medical School; Division of Genetics, Brigham and Women's Hospital; Howard Hughes Medical Institute, Boston, MA 02115, USA; Department of Radiation Oncology, Massachusetts General Hospital, Boston, MA 02114, USA
- Alexander P Boardman
- Department of Genetics, Harvard Medical School; Division of Genetics, Brigham and Women's Hospital; Howard Hughes Medical Institute, Boston, MA 02115, USA; Department of Radiation Oncology, Massachusetts General Hospital, Boston, MA 02114, USA
- David C Wang
- Department of Genetics, Harvard Medical School; Division of Genetics, Brigham and Women's Hospital; Howard Hughes Medical Institute, Boston, MA 02115, USA; Department of Radiation Oncology, Massachusetts General Hospital, Boston, MA 02114, USA
- Edward L Huttlin
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Robert A Everley
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Noah Dephoure
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Chunshui Zhou
- Department of Genetics, Harvard Medical School; Division of Genetics, Brigham and Women's Hospital; Howard Hughes Medical Institute, Boston, MA 02115, USA
- Itay Koren
- Department of Genetics, Harvard Medical School; Division of Genetics, Brigham and Women's Hospital; Howard Hughes Medical Institute, Boston, MA 02115, USA
- Steven P Gygi
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Stephen J Elledge
- Department of Genetics, Harvard Medical School; Division of Genetics, Brigham and Women's Hospital; Howard Hughes Medical Institute, Boston, MA 02115, USA.
29
Stavrakas V, Melas IN, Sakellaropoulos T, Alexopoulos LG. Network reconstruction based on proteomic data and prior knowledge of protein connectivity using graph theory. PLoS One 2015; 10:e0128411. [PMID: 26020784 PMCID: PMC4447287 DOI: 10.1371/journal.pone.0128411] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/20/2014] [Accepted: 04/27/2015] [Indexed: 12/12/2022]
Abstract
Modeling of signal transduction pathways is instrumental for understanding cell function. Researchers have been modeling signaling pathways in order to accurately represent the signaling events inside the cell's biochemical microenvironment in a way that is meaningful to biologists. In this article, we propose a method to interrogate such pathways in order to produce cell-specific signaling models. We integrate available prior knowledge of protein connectivity, in the form of a Prior Knowledge Network (PKN), with phosphoproteomic data to construct predictive models of the protein connectivity of the interrogated cell type. Several computational methodologies focusing on logic modeling of pathways using optimization formulations or machine learning algorithms have been published on this front over the past few years. Here, we introduce a light and fast approach that uses a breadth-first traversal of the graph to identify the shortest pathways and score proteins in the PKN, fitting the dependencies extracted from the experimental design. The pathways are then combined through a heuristic formulation to produce a final topology, handling inconsistencies between the PKN and the experimental scenarios. Our results show that the algorithm we developed is efficient and accurate for the construction of medium- and large-scale signaling networks. We demonstrate the applicability of the proposed approach by interrogating a manually curated interaction graph model of EGF/TNFA stimulation against made-up experimental data. To avoid the possibility of erroneous predictions, we performed a cross-validation analysis. Finally, we validate that the introduced approach generates predictive topologies comparable to those of the ILP formulation. Overall, an efficient approach based on graph theory is presented herein to interrogate protein–protein interaction networks and to provide meaningful biological insights.
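The breadth-first step mentioned in this abstract can be illustrated with a minimal Python sketch. It shows only a generic shortest-path search on a directed graph, not the authors' scoring or heuristic combination; the function name `shortest_path` and the toy network with its node names are hypothetical.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Return one shortest directed path from source to target, or None.

    graph: dict mapping each node to a list of downstream nodes
    (a stand-in for a prior knowledge network; an assumption of this sketch).
    """
    parents = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical toy network: one stimulus, three intermediates, one readout.
toy_pkn = {
    "stimulus": ["p1", "p2"],
    "p1": ["p3"],
    "p2": ["readout"],
    "p3": ["readout"],
}
print(shortest_path(toy_pkn, "stimulus", "readout"))
```

Because breadth-first search explores nodes in order of distance from the source, the first time the target is dequeued the reconstructed path has the fewest possible edges, here the route through `p2`.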
Affiliation(s)
- Vassilis Stavrakas
- Department of Mechanical Engineering, National Technical University of Athens, Heroon Polytechniou 9, Zografou 15780, Greece
- Ioannis N. Melas
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
- Theodore Sakellaropoulos
- Department of Mechanical Engineering, National Technical University of Athens, Heroon Polytechniou 9, Zografou 15780, Greece
- Leonidas G. Alexopoulos
- Department of Mechanical Engineering, National Technical University of Athens, Heroon Polytechniou 9, Zografou 15780, Greece
30
Abstract
Drug repositioning has a shorter development time, lower cost and less safety risk than the traditional drug development process. The current study aims to repurpose marketed drugs and clinical candidates for new indications in diabetes treatment by mining clinical ‘omics’ data. We analyzed data from genome wide association studies (GWAS), proteomics and metabolomics studies and revealed a total of 992 proteins as potential anti-diabetic targets in humans. Information on the drugs that target these 992 proteins was retrieved from the Therapeutic Target Database (TTD), and 108 of these proteins are drug targets with drug project information. Research and preclinical drug targets were excluded and 35 of the 108 proteins were selected as druggable proteins. Among them, five proteins were known targets for treating diabetes. Based on the pathogenesis knowledge gathered from the OMIM and PubMed databases, 12 protein targets of 58 drugs were found to have a new indication for treating diabetes. CMap (connectivity map) was used to compare the gene expression patterns of cells treated by these 58 drugs with those of cells treated by known anti-diabetic drugs or diabetes-risk-causing compounds. As a result, 9 drugs were found to have the potential to treat diabetes. Among the 9 drugs, 4 drugs (diflunisal, nabumetone, niflumic acid and valdecoxib) targeting COX2 (prostaglandin G/H synthase 2) were repurposed for treating type 1 diabetes, and 2 drugs (phenoxybenzamine and idazoxan) targeting ADRA2A (Alpha-2A adrenergic receptor) had a new indication for treating type 2 diabetes. These findings indicated that drug repositioning based on ‘omics’ data mining is a potentially powerful tool to discover novel anti-diabetic indications from marketed drugs and clinical candidates. Furthermore, the results of our study could be related to other disorders, such as Alzheimer’s disease.
Affiliation(s)
- Ming Zhang
- Department of Medicine, Division of Neurology, Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, 60 Leonard Street, Toronto, Ontario, M5T 2S8, Canada
- Heng Luo
- University of Arkansas at Little Rock/University of Arkansas for Medical Sciences Bioinformatics Graduate Program, 2801 S. University Ave., Little Rock, AR, 72204, United States of America
- Zhengrui Xi
- Department of Medicine, Division of Neurology, Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, 60 Leonard Street, Toronto, Ontario, M5T 2S8, Canada
- Ekaterina Rogaeva
- Department of Medicine, Division of Neurology, Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, 60 Leonard Street, Toronto, Ontario, M5T 2S8, Canada
31
Abstract
We describe a method to predict protein-protein interactions (PPIs) formed between structured domains and short peptide motifs. We take an integrative approach based on consensus patterns of known motifs in databases, structures of domain-motif complexes from the PDB and various sources of non-structural evidence. We combine this set of clues using a Bayesian classifier that reports the likelihood of an interaction and obtain significantly improved prediction performance when compared to individual sources of evidence and to previously reported algorithms. Our Bayesian approach was integrated into PrePPI, a structure-based PPI prediction method that, so far, has been limited to interactions formed between two structured domains. Around 80,000 new domain-motif mediated interactions were predicted, thus enhancing PrePPI’s coverage of the human protein interactome.

Complexes formed between a structured domain on one protein and an unstructured peptide on another are ubiquitous. However, they are often quite difficult to detect experimentally. The development of computational approaches to predict domain-motif interactions is therefore an important goal. We report a method to predict domain-motif interactions using a Bayesian approach to integrate evidence from a variety of sources, including three-dimensional structural and non-structural information. The method was applied to the entire human proteome and showed significant improvement over existing methods. The method was incorporated into PrePPI, a computational pipeline for the prediction of protein-protein interactions that relies heavily on structural information. Approximately 80,000 new interactions were detected. The new PrePPI database provides easy access to about 400,000 human protein-protein interactions and should thus constitute a valuable resource in a variety of biological applications including the characterization of molecular interaction networks and, more generally, in the study of interactions mediated by proteins in families that may not be extensively studied experimentally.
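One common way to combine several evidence sources in a Bayesian classifier is to multiply a prior odds by one likelihood ratio per source. The Python sketch below shows that general technique under an assumed conditional independence of the sources; it is not the PrePPI code, and the function name `posterior_probability`, the prior, and the likelihood ratios are made-up illustrative values.

```python
import math

def posterior_probability(prior, likelihood_ratios):
    """Combine evidence with a naive Bayes update in log-odds space.

    prior: assumed prior probability that two proteins interact (0 < prior < 1).
    likelihood_ratios: one ratio per evidence source,
        P(evidence | interaction) / P(evidence | no interaction),
    treated as conditionally independent (a simplifying assumption).
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    # Convert the posterior log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical example: an even prior and two supporting evidence sources.
print(posterior_probability(0.5, [2.0, 3.0]))
```

Working in log-odds keeps the product of many ratios numerically stable, and a ratio of 1.0 from an uninformative source leaves the estimate unchanged.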
Affiliation(s)
- T. Scott Chen
- Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Donald Petrey
- Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Jose Ignacio Garzon
- Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Barry Honig
- Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
32
Abstract
Tandem mass spectrometry (MS/MS) has become the method of choice for protein identification and has launched a quest for the identification of every translated protein and peptide. However, computational developments have lagged behind the pace of modern data acquisition protocols and have become a major bottleneck in proteomics analysis of complex samples. As it stands today, attempts to identify MS/MS spectra against large databases (e.g., the human microbiome or 6-frame translation of the human genome) face a search space that is 10-100 times larger than the human proteome, where it becomes increasingly challenging to distinguish between true and false peptide matches. As a result, the sensitivity of current state-of-the-art database search methods drops by nearly 38% to such low identification rates that almost 90% of all MS/MS spectra are left unidentified. We address this problem by extending the generating function approach to rigorously compute the joint spectral probability of multiple spectra being matched to peptides with overlapping sequences, thus enabling the confident assignment of higher significance to overlapping peptide-spectrum matches (PSMs). We find that these joint spectral probabilities can be several orders of magnitude more significant than individual PSMs, even in the ideal case when perfect separation between signal and noise peaks could be achieved per individual MS/MS spectrum. After benchmarking this approach on a typical lysate MS/MS dataset, we show that the proposed intersecting spectral probabilities for spectra from overlapping peptides improve peptide identification by 30-62%.
Affiliation(s)
- Adrian Guthals
- Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California
- Christina Boucher
- Department of Computer Science, Colorado State University, Fort Collins, Colorado
- Nuno Bandeira
- Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California–San Diego, La Jolla, California
33
Chen ZF, Zhang H, Wang H, Matsumura K, Wong YH, Ravasi T, Qian PY. Quantitative proteomics study of larval settlement in the Barnacle Balanus amphitrite. PLoS One 2014; 9:e88744. [PMID: 24551147 PMCID: PMC3923807 DOI: 10.1371/journal.pone.0088744] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 11/08/2013] [Accepted: 01/08/2014] [Indexed: 01/06/2023]
Abstract
Barnacles are major sessile components of intertidal areas worldwide, and also one of the most dominant fouling organisms in fouling communities. Larval settlement has a crucial ecological effect not only on the distribution of the barnacle population but also on intertidal community structures. However, the molecular mechanisms involved in the transition process from the larval to the juvenile stage remain largely unclear. In this study, we carried out comparative proteomic profiling of stage II nauplii, stage VI nauplii, cyprids, and juveniles of the barnacle Balanus amphitrite using label-free quantitative proteomics, followed by the measurement of the gene expression levels of candidate proteins. More than 700 proteins were identified at each stage; 80 were significantly up-regulated in cyprids and 95 in juveniles vs other stages. Specifically, proteins involved in energy and metabolism, the nervous system and signal transduction were significantly up-regulated in cyprids, whereas proteins involved in cytoskeletal remodeling, transcription and translation, cell proliferation and differentiation, and biomineralization were up-regulated in juveniles, consistent with changes associated with larval metamorphosis and tissue remodeling in juveniles. These findings provided molecular evidence for the morphological, physiological and biological changes that occur during the transition process from the larval to the juvenile stages in B. amphitrite.
Affiliation(s)
- Zhang-Fan Chen
- KAUST Global Collaborative Research Program, Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Huoming Zhang
- Bioscience Core Laboratory, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Hao Wang
- KAUST Global Collaborative Research Program, Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Kiyotaka Matsumura
- KAUST Global Collaborative Research Program, Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Yue Him Wong
- KAUST Global Collaborative Research Program, Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Timothy Ravasi
- Integrative Systems Biology Lab, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Pei-Yuan Qian
- KAUST Global Collaborative Research Program, Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
|
34
|
Abstract
High-throughput genetic screens in model microbial organisms are a primary means of interrogating biological systems. In numerous cases, such screens have identified the genes that underlie a particular phenotype or a set of gene-gene, gene-environment or protein-protein interactions, which are then used to construct highly informative network maps for biological research. However, the potential test space of genes, proteins, or interactions is typically much larger than current screening systems can address. To push the limits of screening technology, we developed an ultra-high-density, 6144-colony arraying system and analysis toolbox. Using budding yeast as a benchmark, we find that these tools boost genetic screening throughput 4-fold and yield significant cost and time reductions at quality levels equal to or better than current methods. Thus, the new ultra-high-density screening tools enable researchers to significantly increase the size and scope of their genetic screens.
Affiliation(s)
- Gordon J. Bean
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, California, United States of America
- Philipp A. Jaeger
- Departments of Medicine and Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Sondra Bahr
- Banting and Best Department of Medical Research, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Trey Ideker
- Departments of Medicine and Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Institute for Genomic Medicine, University of California San Diego, La Jolla, California, United States of America
|
35
|
Han H. A novel profile biomarker diagnosis for mass spectral proteomics. Pac Symp Biocomput 2014:340-351. [PMID: 24297560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Mass spectrometry based proteomics technologies have allowed for a great progress in identifying disease biomarkers for clinical diagnosis and prognosis. However, they face acute challenges from a data reproducibility standpoint, in that no two independent studies have been found to produce the same proteomic patterns. Such reproducibility issues cause the identified biomarker patterns to lose repeatability and prevent real clinical usage. In this work, we propose a profile biomarker approach to overcome this problem from a machine-learning viewpoint by developing a novel derivative component analysis (DCA). As an implicit feature selection algorithm, derivative component analysis enables the separation of true signals from red herrings by capturing subtle data behaviors and removing system noises from a proteomic profile. We further demonstrate its advantages in disease diagnosis by viewing input data as a profile biomarker. The results from our profile biomarker diagnosis suggest an effective solution to overcoming proteomics data's reproducibility problem, present an alternative method for biomarker discovery in proteomics, and provide a good candidate for clinical proteomic diagnosis.
Affiliation(s)
- Henry Han
- Department of Computer and Information Science, Fordham University, New York, NY 10023, USA.
|
36
|
Kumar D, Yadav AK, Kadimi PK, Nagaraj SH, Grimmond SM, Dash D. Proteogenomic analysis of Bradyrhizobium japonicum USDA110 using GenoSuite, an automated multi-algorithmic pipeline. Mol Cell Proteomics 2013; 12:3388-97. [PMID: 23882027 PMCID: PMC3820949 DOI: 10.1074/mcp.m112.027169] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 01/03/2013] [Revised: 07/19/2013] [Indexed: 11/06/2022]
Abstract
We present GenoSuite, an integrated proteogenomic pipeline to validate, refine and discover protein coding genes using high-throughput mass spectrometry (MS) data from prokaryotes. To demonstrate the effectiveness of GenoSuite, we analyzed proteomics data of Bradyrhizobium japonicum (USDA110), a model organism to study agriculturally important rhizobium-legume symbiosis. Our analysis confirmed 31% of known genes, refined 49 gene models for their translation initiation site (TIS) and discovered 59 novel protein coding genes. Notably, a novel protein which redefined the boundary of a crucial cytochrome P450 system related operon was discovered, known to be highly expressed in the anaerobic symbiotic bacteroids. A focused analysis on N-terminally acetylated peptides indicated a downstream TIS for gene blr0594. Finally, ortho-proteogenomic analysis revealed three novel genes in the recently sequenced B. japonicum USDA6(T) genome. The discovery of a large number of missing genes and the correction of gene models have expanded the proteomic landscape of B. japonicum and present the unparalleled utility of proteogenomic analyses and the versatility of GenoSuite for annotating prokaryotic genomes, including pathogens.
Affiliation(s)
- Dhirendra Kumar
- From the ‡G.N. Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi 110025, India
- Amit Kumar Yadav
- From the ‡G.N. Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi 110025, India
- Puneet Kumar Kadimi
- From the ‡G.N. Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi 110025, India
- Shivashankar H. Nagaraj
- §Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD, 4072, Australia
- Sean M. Grimmond
- §Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD, 4072, Australia
- Debasis Dash
- From the ‡G.N. Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, South Campus, Sukhdev Vihar, Mathura Road, Delhi 110025, India
|
37
|
Assawamakin A, Prueksaaroon S, Kulawonganunchai S, Shaw PJ, Varavithya V, Ruangrajitpakorn T, Tongsima S. Biomarker selection and classification of "-omics" data using a two-step bayes classification framework. Biomed Res Int 2013; 2013:148014. [PMID: 24106694 PMCID: PMC3784073 DOI: 10.1155/2013/148014] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 04/22/2013] [Revised: 07/04/2013] [Accepted: 08/06/2013] [Indexed: 11/18/2022]
Abstract
Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features from which the top-ranked will most likely contain the most informative features for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets including gene expression microarray, single nucleotide polymorphism microarray (SNParray), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed two-step Bayes classification framework was equal to and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time.
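The pruning loop described in this abstract (drop the bottom-ranked feature, re-check classifier accuracy, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `prune_ranked_features`, the `evaluate` callback, and `toy_accuracy` are hypothetical names, the ranking from the first step is taken as given, and the rule of keeping the smallest subset that matches the best accuracy is an assumption about how the "minimum set" is chosen.

```python
from typing import Callable, List, Sequence, Tuple


def prune_ranked_features(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Drop bottom-ranked features one at a time; keep the smallest best subset.

    `ranked` is ordered from most to least informative (output of step 1).
    `evaluate` is a hypothetical callback that trains and scores the step-2
    classifier on a feature subset and returns its accuracy.
    """
    if not ranked:
        raise ValueError("need at least one ranked feature")
    current = list(ranked)
    best_subset, best_score = list(current), evaluate(current)
    while len(current) > 1:
        current = current[:-1]          # remove the current bottom-ranked feature
        score = evaluate(current)
        if score >= best_score:         # ties favour the smaller feature set
            best_subset, best_score = list(current), score
    return best_subset, best_score


def toy_accuracy(features: List[str]) -> float:
    # Stand-in scorer (assumption): pretends only f1 and f2 carry signal.
    return 0.9 if {"f1", "f2"} <= set(features) else 0.6


subset, accuracy = prune_ranked_features(["f1", "f2", "f3", "f4"], toy_accuracy)
print(subset, accuracy)
```

In practice `evaluate` would wrap a cross-validated run of whichever classifier is used in the second step; the toy scorer only makes the control flow visible.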
Affiliation(s)
- Anunchai Assawamakin
- Department of Pharmacology, Faculty of Pharmacy, Mahidol University, 447 Sri-Ayuthaya Road, Rajathevi, Bangkok 10400, Thailand
- Supakit Prueksaaroon
- Department of Electrical and Computer Engineering, Faculty of Engineering, Thammasat University, 99 Phahonyothin Road, Khlong Nueng, Khlong Luang, Pathum Thani 12120, Thailand
- Supasak Kulawonganunchai
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Nueng, Khlong Luang, Pathum Thani 12120, Thailand
- Philip James Shaw
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Nueng, Khlong Luang, Pathum Thani 12120, Thailand
- Vara Varavithya
- Department of Electrical and Computer Engineering, King Mongkut University of Technology North Bangkok, 1518 Piboonsongkarm Road, Bangkok 10800, Thailand
- Taneth Ruangrajitpakorn
- Language and Semantic Technology Laboratory, National Electronic and Computer Technology Center, 112 Thailand Science Park, Phahonyothin Road, Khlong Nueng, Khlong Luang, Pathum Thani 12120, Thailand
- Sissades Tongsima
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Nueng, Khlong Luang, Pathum Thani 12120, Thailand
|
38
|
Galazis N, Pang YL, Galazi M, Haoula Z, Layfield R, Atiomo W. Proteomic biomarkers of endometrial cancer risk in women with polycystic ovary syndrome: a systematic review and biomarker database integration. Gynecol Endocrinol 2013; 29:638-44. [PMID: 23527552 DOI: 10.3109/09513590.2013.777416] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
There is a need for research studies into the molecular mechanisms underpinning the link between polycystic ovary syndrome (PCOS) and endometrial cancer (EC) to facilitate screening and to encourage the development of novel strategies to prevent disease progression. The objective of this review was to identify proteomic biomarkers of EC risk in women with PCOS. All eligible published studies on proteomic biomarkers for EC identified through the literature were evaluated. Proteomic biomarkers for EC were then integrated with an updated previously published database of all proteomic biomarkers identified so far in PCOS women. Nine protein biomarkers were similarly either under- or overexpressed in women with EC and PCOS in various tissues. These include transgelin, pyruvate kinase M1/M2, gelsolin-like capping protein (macrophage capping protein), glutathione S-transferase P, leucine aminopeptidase (cytosol aminopeptidase), peptidyl-prolyl cis-trans isomerase, cyclophilin A, complement component C4A and manganese-superoxide dismutase. If validated, these biomarkers may provide a useful framework on which the knowledge base in this area could be developed and will facilitate future mathematical modelling to enhance screening and prevention of EC in women with PCOS who have been shown to be at increased risk.
Affiliation(s)
- Nicolas Galazis
- Nottingham Medical School, University of Nottingham, Queen's Medical Centre Campus Nottingham University Hospital, Nottingham, UK.
|
39
|
Feng PM, Ding H, Chen W, Lin H. Naïve Bayes classifier with feature selection to identify phage virion proteins. Comput Math Methods Med 2013; 2013:530696. [PMID: 23762187 PMCID: PMC3671239 DOI: 10.1155/2013/530696] [Citation(s) in RCA: 107] [Impact Index Per Article: 9.7] [Received: 03/10/2013] [Revised: 04/16/2013] [Accepted: 04/28/2013] [Indexed: 12/31/2022]
Abstract
Knowledge of the protein composition of phage virions is key to understanding the functions of phage virion proteins. However, the experimental identification of virion proteins is time-consuming and expensive. Thus, it is highly desirable to develop novel computational methods for phage virion protein identification. In this study, a Naïve Bayes based method was proposed to predict phage virion proteins using amino acid composition and dipeptide composition. In order to remove redundant information, a novel feature selection technique was employed to single out optimized features. In the jackknife test, the proposed method achieved an accuracy of 79.15% for classifying phage virion and non-virion proteins, which is superior to that of other state-of-the-art classifiers. These results indicate that the proposed method could serve as an effective and promising high-throughput method in phage proteomics research.
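To make the two feature types named in this abstract concrete, the sketch below computes amino acid composition (20 frequencies) and dipeptide composition (400 frequencies) for one sequence. It is a minimal sketch under stated assumptions: the function name `composition_features`, the residue ordering, and the choice to normalize by sequence length and by the number of adjacent residue pairs are illustrative and not taken from the paper; the feature selection and Naïve Bayes steps are not shown.

```python
from collections import Counter
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def composition_features(sequence: str) -> List[float]:
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    residue_counts = Counter(seq)
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_pairs = max(len(seq) - 1, 1)  # avoid division by zero for length-1 input
    # Non-standard residues are ignored, so frequencies may sum to less than 1.
    aa_part = [residue_counts[a] / len(seq) for a in AMINO_ACIDS]
    dp_part = [pair_counts[d] / n_pairs for d in DIPEPTIDES]
    return aa_part + dp_part


features = composition_features("ACAC")
print(len(features))  # 420 features per protein
```

Each protein then becomes a fixed-length 420-dimensional vector regardless of its sequence length, which is what allows a standard classifier to be trained on it.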
Affiliation(s)
- Peng-Mian Feng
- School of Public Health, Hebei United University, Tangshan 063000, China
- Hui Ding
- Key Laboratory for Neuroinformation of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Chen
- Department of Physics, School of Sciences, Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China
- Hao Lin
- Key Laboratory for Neuroinformation of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
40
|
Bohra R, Klepacki J, Klawitter J, Klawitter J, Thurman J, Christians U. Proteomics and metabolomics in renal transplantation-quo vadis? Transpl Int 2013; 26:225-41. [PMID: 23350848 PMCID: PMC4006577 DOI: 10.1111/tri.12003] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 04/15/2012] [Revised: 05/07/2012] [Accepted: 10/07/2012] [Indexed: 12/13/2022]
Abstract
The improvement of long-term transplant organ and patient survival remains a critical challenge following kidney transplantation. Proteomics and biochemical profiling (metabolomics) may allow for the detection of early changes in cell signal transduction regulation and biochemistry with high sensitivity and specificity. Hence, these analytical strategies hold the promise to detect and monitor disease processes and drug effects before histopathological and pathophysiological changes occur. In addition, they will identify enriched populations and enable individualized drug therapy. However, proteomics and metabolomics have not yet lived up to such high expectations. Renal transplant patients are highly complex, making it difficult to establish cause-effect relationships between surrogate markers and disease processes. Appropriate study design, adequate sample handling, storage and processing, quality and reproducibility of bioanalytical multi-analyte assays, data analysis and interpretation, mechanistic verification, and clinical qualification (=establishment of sensitivity and specificity in adequately powered prospective clinical trials) are important factors for the success of molecular marker discovery and development in renal transplantation. However, a newly developed and appropriately qualified molecular marker can only be successful if it is realistic that it can be implemented in a clinical setting. The development of combinatorial markers with supporting software tools is an attractive goal.
Affiliation(s)
- Rahul Bohra
- iC42 Clinical Research & Development, Department of Anesthesiology, University of Colorado Denver, Aurora, Colorado, USA
- Jacek Klepacki
- iC42 Clinical Research & Development, Department of Anesthesiology, University of Colorado Denver, Aurora, Colorado, USA
- Jelena Klawitter
- iC42 Clinical Research & Development, Department of Anesthesiology, University of Colorado Denver, Aurora, Colorado, USA
- Renal Medicine, University of Colorado Denver, Aurora, USA
- Jost Klawitter
- iC42 Clinical Research & Development, Department of Anesthesiology, University of Colorado Denver, Aurora, Colorado, USA
- Joshua Thurman
- Renal Medicine, University of Colorado Denver, Aurora, USA
- Uwe Christians
- iC42 Clinical Research & Development, Department of Anesthesiology, University of Colorado Denver, Aurora, Colorado, USA
|
41
|
Wang M, Weiss M, Simonovic M, Haertinger G, Schrimpf SP, Hengartner MO, von Mering C. PaxDb, a database of protein abundance averages across all three domains of life. Mol Cell Proteomics 2012; 11:492-500. [PMID: 22535208 PMCID: PMC3412977 DOI: 10.1074/mcp.o111.014704] [Citation(s) in RCA: 344] [Impact Index Per Article: 28.7] [Received: 10/07/2011] [Revised: 03/26/2012] [Indexed: 02/04/2023]
Abstract
Although protein expression is regulated both temporally and spatially, most proteins have an intrinsic, "typical" range of functionally effective abundance levels. These extend from a few molecules per cell for signaling proteins, to millions of molecules for structural proteins. When addressing fundamental questions related to protein evolution, translation and folding, but also in routine laboratory work, a simple rough estimate of the average wild type abundance of each detectable protein in an organism is often desirable. Here, we introduce a meta-resource dedicated to integrating information on absolute protein abundance levels; we place particular emphasis on deep coverage, consistent post-processing and comparability across different organisms. Publicly available experimental data are mapped onto a common namespace and, in the case of tandem mass spectrometry data, re-processed using a standardized spectral counting pipeline. By aggregating and averaging over the various samples, conditions and cell-types, the resulting integrated data set achieves increased coverage and a high dynamic range. We score and rank each contributing, individual data set by assessing its consistency against externally provided protein-network information, and demonstrate that our weighted integration exhibits more consistency than the data sets individually. The current PaxDb-release 2.1 (at http://pax-db.org/) presents whole-organism data as well as tissue-resolved data, and covers 85,000 proteins in 12 model organisms. All values can be seamlessly compared across organisms via pre-computed orthology relationships.
Affiliation(s)
- M. Wang
- From the ‡Institute of Molecular Life Sciences, and
- §Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- M. Weiss
- From the ‡Institute of Molecular Life Sciences, and
- §Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- M. Simonovic
- From the ‡Institute of Molecular Life Sciences, and
- §Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- G. Haertinger
- From the ‡Institute of Molecular Life Sciences, and
- §Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- C. von Mering
- From the ‡Institute of Molecular Life Sciences, and
- §Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
|
42
|
Martin B, Chadwick W, Yi T, Park SS, Lu D, Ni B, Gadkaree S, Farhang K, Becker KG, Maudsley S. VENNTURE--a novel Venn diagram investigational tool for multiple pharmacological dataset analysis. PLoS One 2012; 7:e36911. [PMID: 22606307 PMCID: PMC3351456 DOI: 10.1371/journal.pone.0036911] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.1] [Received: 11/01/2011] [Accepted: 04/10/2012] [Indexed: 12/24/2022]
Abstract
As pharmacological data sets become increasingly large and complex, new visual analysis and filtering programs are needed to aid their appreciation. One of the most commonly used methods for visualizing biological data is the Venn diagram. Currently used Venn analysis software often presents multiple problems to biological scientists, in that only a limited number of simultaneous data sets can be analyzed. An improved appreciation of the connectivity between multiple, highly-complex datasets is crucial for the next generation of data analysis of genomic and proteomic data streams. We describe the development of VENNTURE, a program that facilitates visualization of up to six datasets in a user-friendly manner. This program includes versatile output features, where grouped data points can be easily exported into a spreadsheet. To demonstrate its unique experimental utility we applied VENNTURE to a highly complex parallel paradigm, i.e. comparison of multiple G protein-coupled receptor drug dose phosphoproteomic data, in multiple cellular physiological contexts. VENNTURE was able to reliably and simply dissect six complex data sets into easily identifiable groups for straightforward analysis and data output. Applied to complex pharmacological datasets, VENNTURE's improved features and ease of analysis are much improved over currently available Venn diagram programs. VENNTURE enabled the delineation of highly complex patterns of dose-dependent G protein-coupled receptor activity and its dependence on physiological cellular contexts. This study highlights the potential for such a program in fields such as pharmacology, genomics, and bioinformatics.
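The core operation behind a multi-set Venn analysis, assigning every item to the one region defined by exactly which datasets contain it, can be sketched in a few lines. This is an illustrative sketch only and does not reproduce VENNTURE's code or interface: `venn_regions` and the example dataset names are hypothetical. For n datasets there are at most 2^n − 1 non-empty regions, which gives 63 for six datasets.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Hashable, Set


def venn_regions(datasets: Dict[str, Set[Hashable]]) -> Dict[FrozenSet[str], Set[Hashable]]:
    """Map each exclusive Venn region (a set of dataset names) to its items."""
    membership: Dict[Hashable, Set[str]] = defaultdict(set)
    for name, items in datasets.items():
        for item in items:
            membership[item].add(name)   # record every dataset containing the item
    regions: Dict[FrozenSet[str], Set[Hashable]] = defaultdict(set)
    for item, names in membership.items():
        regions[frozenset(names)].add(item)
    return dict(regions)


# Hypothetical example: protein identifiers detected under three conditions.
regions = venn_regions({
    "dose_low": {"P1", "P2", "P3"},
    "dose_mid": {"P2", "P3", "P4"},
    "dose_high": {"P3", "P5"},
})
for names, items in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(names), sorted(items))
```

Because each item lands in exactly one region, the regions can be written out as separate spreadsheet columns without double counting.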
Affiliation(s)
- Bronwen Martin
- Metabolism Unit, Laboratory of Clinical Investigation, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Wayne Chadwick
- Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Tie Yi
- Metabolism Unit, Laboratory of Clinical Investigation, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Sung-Soo Park
- Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Daoyuan Lu
- Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Bin Ni
- Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Shekhar Gadkaree
- Diabetes Section, Laboratory of Clinical Investigation, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Kathleen Farhang
- Diabetes Section, Laboratory of Clinical Investigation, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Kevin G. Becker
- Gene Expression and Genomics Unit, Research Resources Branch, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Stuart Maudsley
- Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
|
43
|
Abstract
Exponentially Modified Protein Abundance Index (emPAI) is an established method of estimating protein abundances from peptide counts in a single LC-MS/MS experiment. EmPAI is defined as $10^{\mathrm{PAI}} - 1$, where PAI (Protein Abundance Index) denotes the ratio of observed to observable peptides. EmPAI was first proposed by Ishihama et al. [1], who found that PAI is approximately proportional to the logarithm of absolute protein concentration. I define $\mathrm{emPAI65} = 6.5^{\mathrm{PAI}} - 1$ and show that it performs significantly better than emPAI, while it is equally easy to compute. The higher accuracy of emPAI65 is demonstrated by analyzing three data sets, including the one used in the original study [1]. I conclude that emPAI65 ought to be used instead of the original emPAI for protein quantitation.
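The two indices differ only in the base of the exponent, which a short sketch makes explicit. The function names (`pai`, `em_pai`, `em_pai65`) and the example peptide counts are hypothetical; the formulas are the ones stated in the abstract, with PAI taken as observed peptides divided by observable peptides.

```python
def pai(observed: int, observable: int) -> float:
    """Protein Abundance Index: ratio of observed to observable peptides."""
    if observable <= 0:
        raise ValueError("observable peptide count must be positive")
    return observed / observable


def em_pai(observed: int, observable: int, base: float = 10.0) -> float:
    """Exponentially modified PAI: base**PAI - 1 (base 10 in the original index)."""
    return base ** pai(observed, observable) - 1.0


def em_pai65(observed: int, observable: int) -> float:
    """Variant with base 6.5 in place of 10."""
    return em_pai(observed, observable, base=6.5)


# Hypothetical protein with 5 of 10 observable peptides detected.
print(em_pai(5, 10), em_pai65(5, 10))
```

Both indices are zero when no peptides are observed and grow exponentially with PAI; the smaller base gives a flatter curve.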
Affiliation(s)
- Andrzej Kudlicki
- Department of Biochemistry and Molecular Biology, Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas, United States of America.
|
44
|
Stuart GW, Berry MW. A Comprehensive Whole Genome Bacterial Phylogeny Using Correlated Peptide Motifs Defined in a High Dimensional Vector Space. J Bioinform Comput Biol 2012; 1:475-93. [PMID: 15290766 DOI: 10.1142/s0219720003000265] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 01/17/2003] [Revised: 04/28/2003] [Accepted: 04/29/2003] [Indexed: 11/18/2022]
Abstract
As whole genome sequences continue to expand in number and complexity, effective methods for comparing and categorizing both genes and species represented within extremely large datasets are required. Methods introduced to date have generally utilized incomplete and likely insufficient subsets of the available data. We have developed an accurate and efficient method for producing robust gene and species phylogenies using very large whole genome protein datasets. This method relies on multidimensional protein vector definitions supplied by the singular value decomposition (SVD) of a large sparse data matrix in which each protein is uniquely represented as a vector of overlapping tetrapeptide frequencies. Quantitative pairwise estimates of species similarity were obtained by summing the protein vectors to form species vectors, then determining the cosines of the angles between species vectors. Evolutionary trees produced using this method confirmed many accepted prokaryotic relationships. However, several unconventional relationships were also noted. In addition, we demonstrate that many of the SVD-derived right basis vectors represent particular conserved protein families, while many of the corresponding left basis vectors describe conserved motifs within these families as sets of correlated peptides (copeps). This analysis represents the most detailed simultaneous comparison of prokaryotic genes and species available to date.
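The pipeline outlined in this abstract (tetrapeptide count matrix, SVD, summed protein vectors, cosine between species vectors) can be illustrated on toy data. The sketch below is an assumption-laden miniature, not the authors' method: it uses a dense NumPy SVD rather than a sparse solver, builds the peptide vocabulary only from observed tetrapeptides, and the function names, the `rank` parameter and the example sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

import numpy as np


def tetrapeptides(seq: str, k: int = 4) -> List[str]:
    """All overlapping k-mers of a protein sequence (k = 4 for tetrapeptides)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def protein_vectors(proteins: List[str], rank: int) -> np.ndarray:
    """Reduced-dimension protein vectors from the SVD of the peptide-by-protein matrix."""
    vocab = sorted({pep for seq in proteins for pep in tetrapeptides(seq)})
    row = {pep: i for i, pep in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(proteins)))
    for col, seq in enumerate(proteins):
        for pep, n in Counter(tetrapeptides(seq)).items():
            counts[row[pep], col] = n
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    k = min(rank, len(s))
    return (s[:k, None] * vt[:k]).T  # one row per protein, k columns


def species_similarity(species: Dict[str, List[str]], rank: int = 3) -> Dict[Tuple[str, str], float]:
    """Cosine of the angle between summed protein vectors for each species pair."""
    names = list(species)
    proteins = [seq for name in names for seq in species[name]]
    owners = [name for name in names for _ in species[name]]
    vecs = protein_vectors(proteins, rank)
    totals = {name: np.zeros(vecs.shape[1]) for name in names}
    for vec, owner in zip(vecs, owners):
        totals[owner] += vec
    sims = {}
    for a, b in combinations(names, 2):
        denom = np.linalg.norm(totals[a]) * np.linalg.norm(totals[b])
        sims[(a, b)] = float(totals[a] @ totals[b] / denom) if denom else 0.0
    return sims


# Toy input: species "a" and "b" share a protein, "c" shares no tetrapeptide with them.
sims = species_similarity(
    {"a": ["MKTAYIAKQR"], "b": ["MKTAYIAKQR"], "c": ["GGGGSGGGGS"]}, rank=2
)
print(sims)
```

A distance matrix derived from these cosines (for example, one minus the cosine) could then be passed to any standard tree-building routine; that step is outside this sketch.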
Affiliation(s)
- Gary W Stuart
- Department of Life Sciences, Indiana State University, Terre Haute, IN 47809, USA.
|
45
|
Hugo A, Baxter DJ, Cannon WR, Kalyanaraman A, Kulkarni G, Callister SJ. Proteotyping of microbial communities by optimization of tandem mass spectrometry data interpretation. Pac Symp Biocomput 2012:225-234. [PMID: 22174278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
We report the development of a novel high performance computing method for the identification of proteins from unknown (environmental) samples. The method uses computational optimization to provide an effective way to control the false discovery rate for environmental samples and complements de novo peptide sequencing. Furthermore, the method provides information based on the expressed protein in a microbial community, and thus complements DNA-based identification methods. Testing on blind samples demonstrates that the method provides 79-95% overlap with analogous results from searches involving only the correct genomes. We provide scaling and performance evaluations for the software that demonstrate the ability to carry out large-scale optimizations on 1258 genomes containing 4.2M proteins.
Affiliation(s)
- Alys Hugo
- Computational Biology and Bioinformatics Group, Pacific Northwest National Laboratory, Richland, WA 99352, USA
|
46
|
Schäfer M, Lkhagvasuren O, Klein HU, Elling C, Wüstefeld T, Müller-Tidow C, Zender L, Koschmieder S, Dugas M, Ickstadt K. Integrative analyses for omics data: a Bayesian mixture model to assess the concordance of ChIP-chip and ChIP-seq measurements. J Toxicol Environ Health A 2012; 75:461-470. [PMID: 22686305 DOI: 10.1080/15287394.2012.674914] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 06/01/2023]
Abstract
The analysis of different variations in genomics, transcriptomics, epigenomics, and proteomics has increased considerably in recent years. This is especially due to the success of microarray and, more recently, sequencing technology. Apart from understanding mechanisms of disease pathogenesis on a molecular basis, for example in cancer research, the challenge of analyzing such different data types in an integrated way has become increasingly important also for the validation of new sequencing technologies with maximum resolution. For this purpose, a methodological framework for their comparison with microarray techniques in the context of smallest sample sizes, which result from the high costs of experiments, is proposed in this contribution. Based on an adaptation of the externally centered correlation coefficient (Schäfer et al. 2009), it is demonstrated how a Bayesian mixture model can be applied to compare and classify measurements of histone acetylation that stem from chromatin immunoprecipitation combined with either microarray (ChIP-chip) or sequencing techniques (ChIP-seq) for the identification of DNA fragments. Here, the murine hematopoietic cell line 32D, which was transduced with the oncogene BCR-ABL, the hallmark of chronic myeloid leukemia, was characterized. Cells were compared to mock-transduced cells as control. Activation or inhibition of other genes by histone modifications induced by the oncogene is considered critical in such a context for the understanding of the disease.
Affiliation(s)
- Martin Schäfer
- Department of Statistics, TU Dortmund University, Dortmund, Germany.
|
47
|
Oeltze S, Freiler W, Hillert R, Doleisch H, Preim B, Schubert W. Interactive, graph-based visual analysis of high-dimensional, multi-parameter fluorescence microscopy data in toponomics. IEEE Trans Vis Comput Graph 2011; 17:1882-1891. [PMID: 22034305 DOI: 10.1109/tvcg.2011.217] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
In Toponomics, the function protein pattern in cells or tissue (the toponome) is imaged and analyzed for applications in toxicology, new drug development and patient-drug-interaction. The most advanced imaging technique is robot-driven multi-parameter fluorescence microscopy. This technique is capable of co-mapping hundreds of proteins and their distribution and assembly in protein clusters across a cell or tissue sample by running cycles of fluorescence tagging with monoclonal antibodies or other affinity reagents, imaging, and bleaching in situ. The imaging results in complex multi-parameter data composed of one slice or a 3D volume per affinity reagent. Biologists are particularly interested in the localization of co-occurring proteins, the frequency of co-occurrence and the distribution of co-occurring proteins across the cell. We present an interactive visual analysis approach for the evaluation of multi-parameter fluorescence microscopy data in toponomics. Multiple, linked views facilitate the definition of features by brushing multiple dimensions. The feature specification result is linked to all views establishing a focus+context visualization in 3D. In a new attribute view, we integrate techniques from graph visualization. Each node in the graph represents an affinity reagent while each edge represents two co-occurring affinity reagent bindings. The graph visualization is enhanced by glyphs which encode specific properties of the binding. The graph view is equipped with brushing facilities. By brushing in the spatial and attribute domain, the biologist achieves a better understanding of the function protein patterns of a cell. Furthermore, an interactive table view is integrated which summarizes unique fluorescence patterns. We discuss our approach with respect to a cell probe containing lymphocytes and a prostate tissue section.
|
48
|
Abstract
The ongoing genomics and proteomics efforts have helped identify many new genes and proteins in living organisms. However, simply knowing the existence of genes and proteins does not tell us much about the biological processes in which they participate. Many major biological processes are controlled by protein interaction networks. A comprehensive description of protein–protein interactions is therefore necessary to understand the genetic program of life. In this tutorial, we provide an overview of the various current high-throughput methods for discovering protein–protein interactions, covering both the conventional experimental methods and new computational approaches.
Affiliation(s)
- See-Kiong Ng
- Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore.
|
49
|
Berg D, Wolff C, Langer R, Schuster T, Feith M, Slotta-Huspenina J, Malinowsky K, Becker KF. Discovery of new molecular subtypes in oesophageal adenocarcinoma. PLoS One 2011; 6:e23985. [PMID: 21966358 PMCID: PMC3179464 DOI: 10.1371/journal.pone.0023985] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/10/2011] [Accepted: 07/28/2011] [Indexed: 12/22/2022]
Abstract
A large number of patients suffering from oesophageal adenocarcinomas do not respond to conventional chemotherapy; therefore, it is necessary to identify new predictive biomarkers and patient signatures to improve patient outcomes and therapy selections. We analysed 87 formalin-fixed and paraffin-embedded (FFPE) oesophageal adenocarcinoma tissue samples with a reverse phase protein array (RPPA) to examine the expression of 17 cancer-related signalling molecules. Protein expression levels were analysed by unsupervised hierarchical clustering and correlated with clinicopathological parameters and overall patient survival. Proteomic analyses revealed a new, very promising molecular subtype of oesophageal adenocarcinoma patients characterised by low levels of the HSP27 family proteins and high expression of those of the HER family with positive lymph nodes, distant metastases and short overall survival. After confirmation in other independent studies, our results could be the foundation for the development of a Her2-targeted treatment option for this new patient subgroup of oesophageal adenocarcinoma.
Affiliation(s)
- Daniela Berg
- Institute of Pathology, Technische Universität München, Munich, Germany
- Claudia Wolff
- Institute of Pathology, Technische Universität München, Munich, Germany
- Rupert Langer
- Institute of Pathology, Technische Universität München, Munich, Germany
- Tibor Schuster
- Institute of Medical Statistics and Epidemiology, Technische Universität München, Munich, Germany
- Marcus Feith
- Department of Surgery, Technische Universität München, Munich, Germany
|
50
|
|