1
Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12094393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Two-dimensional electrophoresis gels (2DE, 2DEG) are the result of the procedure of separating, based on two molecular properties, a protein mixture on gel. Separated similar proteins concentrate in groups, and these groups appear as dark spots in the captured gel image. Gel images are analyzed to detect distinct spots and determine their peak intensity, background, integrated intensity, and other attributes of interest. One of the approaches to parameterizing the protein spots is spot modeling. Spot parameters of interest are obtained after the spot is approximated by a mathematical model. The development of the modeling algorithm requires a rich, diverse, representative dataset. The primary goal of this research is to develop a method for generating a synthetic protein spot dataset that can be used to develop 2DEG image analysis algorithms. The secondary objective is to evaluate the usefulness of the created dataset by developing a neural-network-based protein spot reconstruction algorithm that provides parameterization and denoising functionalities. In this research, a spot modeling algorithm based on autoencoders is developed using only the created synthetic dataset. The algorithm is evaluated on real and synthetic data. Evaluation results show that the created synthetic dataset is effective for the development of protein spot models. The developed algorithm outperformed all baseline algorithms in all experimental cases.
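As an illustration of what a synthetic protein spot dataset might look like, the following minimal sketch draws one spot patch from a 2D Gaussian model and returns the noisy input, the clean target, and the generating parameters. The Gaussian shape, the parameter ranges, the additive Gaussian noise, and the names `gaussian_spot` and `synthetic_patch` are all assumptions made for this example; the abstract does not specify the generator used in the paper. Spots are drawn as bright peaks on a darker background for simplicity, whereas real gel images show dark spots.

```python
import math
import random

def gaussian_spot(size, cx, cy, sx, sy, peak):
    """Return a size x size patch (list of rows) holding one 2D Gaussian spot."""
    return [
        [peak * math.exp(-((x - cx) ** 2 / (2 * sx ** 2)
                           + (y - cy) ** 2 / (2 * sy ** 2)))
         for x in range(size)]
        for y in range(size)
    ]

def synthetic_patch(size=32, background=0.1, noise_sd=0.02, seed=0):
    """Sample one (noisy, clean, params) triple with random spot parameters."""
    rng = random.Random(seed)
    # Hypothetical parameter ranges, chosen only to keep the spot inside the patch.
    params = {
        "cx": rng.uniform(0.4, 0.6) * size,   # spot centre, x
        "cy": rng.uniform(0.4, 0.6) * size,   # spot centre, y
        "sx": rng.uniform(2.0, 5.0),          # spread along x
        "sy": rng.uniform(2.0, 5.0),          # spread along y
        "peak": rng.uniform(0.3, 0.8),        # peak intensity above background
    }
    spot = gaussian_spot(size, **params)
    clean = [[background + v for v in row] for row in spot]
    # Additive Gaussian noise stands in for scanner and staining noise.
    noisy = [[v + rng.gauss(0.0, noise_sd) for v in row] for row in clean]
    return noisy, clean, params
```

A reconstruction model such as an autoencoder could then be trained on pairs of `noisy` and `clean`, with `params` serving as ground truth for the parameterization task.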
2
Two-Dimensional Gel Electrophoresis Image Analysis. Methods Mol Biol 2021; 2361:3-13. [PMID: 34236652 DOI: 10.1007/978-1-0716-1641-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
Abstract
Gel-based proteomics is still quite widespread due to its high-resolution power; the experimental approach is based on differential analysis, where groups of samples (e.g., control vs diseased) are compared to identify panels of potential biomarkers. However, the reliability of the result of the differential analysis is deeply influenced by 2D-PAGE maps image analysis procedures. The analysis of 2D-PAGE images consists of several steps, such as image preprocessing, spot detection and quantitation, image warping and alignment, spot matching. Several approaches are present in literature, and classical or last-generation commercial software packages exploit different algorithms for each step of the analysis. Here, the most widespread approaches and a comparison of the different strategies are presented.
3
Abstract
2D-DIGE is still a very widespread technique in proteomics for the identification of panels of biomarkers, allowing some important drawbacks of classical two-dimensional gel electrophoresis to be addressed. However, once 2D gels are obtained, they must undergo a fairly elaborate multistep image analysis procedure before the final differential analysis by univariate and multivariate statistical methods. Here, the main steps performed by image analysis software are described and the most recent procedures reported in the literature are briefly presented.
Affiliation(s)
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy.
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
4
Cannistraci CV, Alessio M. Image Pretreatment Tools I: Algorithms for Map Denoising and Background Subtraction Methods. Methods Mol Biol 2016; 1384:79-89. [PMID: 26611410 DOI: 10.1007/978-1-4939-3255-9_5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/23/2023]
Abstract
One of the critical steps in two-dimensional electrophoresis (2-DE) image pre-processing is denoising, which can strongly affect either spot detection or pixel-based methods. The Median Modified Wiener Filter (MMWF), a new nonlinear adaptive spatial filter, has proved to be a good denoising approach for practical use with 2-DE. MMWF is suitable for global denoising and for the simultaneous removal of spikes and Gaussian noise, because its best setting does not depend on the type of noise. The second critical step arises because 2-DE gel images may contain high levels of background, generated by the laboratory experimental procedures, that must be subtracted for accurate measurement of the proteomic optical density signals. Here we discuss an efficient mathematical method for background estimation that can be applied even before 2-DE image spot detection and is based on 3D mathematical morphology (3DMM) theory.
Affiliation(s)
- Carlo Vittorio Cannistraci
- Biomedical Cybernetics Group, Biotechnology Center (BIOTEC), Technische Universität Dresden, Tatzberg 47/49, 01307, Dresden, Germany.
- Massimo Alessio
- Proteome Biochemistry, IRCCS-San Raffaele Scientific Institute, Via Olgettina 58, 20132, Milan, Italy.
5
Abstract
Analysis of two-dimensional gel images is a crucial step for the determination of changes in the protein expression, but at present, it still represents one of the bottlenecks in 2-DE studies. Over the years, different commercial and academic software packages have been developed for the analysis of 2-DE images. Each of these shows different advantageous characteristics in terms of quality of analysis. In this chapter, the characteristics of the different commercial software packages are compared in order to evaluate their main features and performances.
Affiliation(s)
- Daniela Cecconi
- Mass Spectrometry & Proteomics Lab, Department of Biotechnology, University of Verona, Strada le Grazie 15, 37134, Verona, Italy.
6
Shamekhi S, Miran Baygi MH, Azarian B, Gooya A. A novel multi-scale Hessian based spot enhancement filter for two dimensional gel electrophoresis images. Comput Biol Med 2015; 66:154-69. [PMID: 26409228 DOI: 10.1016/j.compbiomed.2015.07.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/10/2015] [Revised: 07/09/2015] [Accepted: 07/13/2015] [Indexed: 11/26/2022]
Abstract
Two dimensional gel electrophoresis (2DGE) is a useful method for studying proteins in a wide variety of applications including identifying post-translation modification (PTM), biomarker discovery, and protein purification. Computerized segmentation and detection of the proteins are the two main processes that are carried out on the scanned image of the gel. Due to the complexities of 2DGE images and the presence of artifacts, the segmentation and detection of protein spots in these images are non-trivial, and involve supervised and time consuming processes. This paper introduces a new spot filter for enhancing, and separating the closely overlapping spots of protein in 2DGE images based on the multi-scale eigenvalue analysis of the image Hessian. Using a Gaussian spot model, we have derived closed form equations to compute the eigen components of the image Hessian of two overlapping spots in a multi-scale fashion. Based on this analysis, we have proposed a novel filter that suppresses the overlapping area and results in a better spot separation. The performance of the proposed filter has been evaluated on the synthetic and real 2DGE images. The comparison with three conventional techniques and a commercial software package reveals the superiority and effectiveness of the proposed filter.
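To make the idea of Hessian eigenvalue analysis on overlapping Gaussian spots concrete, the sketch below builds two equal isotropic Gaussian spots and evaluates the finite-difference Hessian at a spot peak and at the midpoint between the spots. It is not the filter proposed in the paper and covers a single scale only; the function names, spot separation, and sigma are assumptions chosen for illustration, and spots are modelled as bright peaks (for dark spots on a light background the signs flip).

```python
import math

def hessian_eigenvalues(img, x, y):
    """Eigenvalues (smaller, larger) of the finite-difference Hessian at pixel (x, y)."""
    ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[ixx, ixy], [ixy, iyy]].
    mean = (ixx + iyy) / 2.0
    radius = math.hypot((ixx - iyy) / 2.0, ixy)
    return mean - radius, mean + radius

def two_spots(size=41, sep=10, sigma=3.0):
    """Two equal isotropic Gaussian spots on the middle row, `sep` pixels apart."""
    c = size // 2
    centres = [(c - sep / 2, c), (c + sep / 2, c)]
    return [[sum(math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                 for cx, cy in centres)
             for x in range(size)]
            for y in range(size)]

img = two_spots()
peak = hessian_eigenvalues(img, 15, 20)     # on the left spot's maximum
overlap = hessian_eigenvalues(img, 20, 20)  # midway between the two spots
```

At the peak both eigenvalues are negative, while at the midpoint one eigenvalue is positive (curvature along the line joining the spots) and the other negative. A filter that responds only where both eigenvalues are negative would therefore suppress the overlap region, which is the kind of behaviour the abstract describes.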
Affiliation(s)
- Sina Shamekhi
- Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
- Bahareh Azarian
- Protein-Chemistry Lab., Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Ali Gooya
- Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
7
Wheelock AM, Goto S. Effects of post-electrophoretic analysis on variance in gel-based proteomics. Expert Rev Proteomics 2014; 3:129-42. [PMID: 16445357 DOI: 10.1586/14789450.3.1.129] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/08/2022]
Abstract
2D electrophoresis (2DE) is a prominent separation method for complex proteomes. Although recent advances have increased the utility of this method in quantitative proteomics studies, many sources of variance still exist. This review discusses the post-electrophoretic sources of variance in current 2DE analysis. The essential improvements in protein visualization and software algorithms that have made 2DE a leading quantitative proteomics method are briefly reviewed. A number of shortcomings in the post-electrophoretic analysis of 2DE data that require further attention are highlighted. Topics discussed include protein visualization and image acquisition, internal standards and normalization methods, background subtraction algorithms, normality of distribution, and the need for standardized tests for the evaluation of 2DE analysis software packages.
Affiliation(s)
- Asa M Wheelock
- Kyoto University, Bioinformatics Center, Institute for Chemical Research, Uji, Kyoto, 611-0011, Japan.
8
Caccia D, Dugo M, Callari M, Bongarzone I. Bioinformatics tools for secretome analysis. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013; 1834:2442-53. [PMID: 23395702 DOI: 10.1016/j.bbapap.2013.01.039] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Received: 10/29/2012] [Revised: 01/23/2013] [Accepted: 01/29/2013] [Indexed: 12/29/2022]
Abstract
Over recent years, analyses of secretomes (complete sets of secreted proteins) have been reported in various organisms, cell types, and pathologies and such studies are quickly gaining popularity. Fungi secrete enzymes that can break down potential food sources; plant secreted proteins are primarily parts of the cell wall proteome; and human secreted proteins are involved in cellular immunity and communication, and provide useful information for the discovery of novel biomarkers, such as for cancer diagnosis. Continuous development of methodologies supports the wide identification and quantification of secreted proteins in a given cellular state. The role of secreted factors is also investigated in the context of the regulation of major signaling events, and connectivity maps are built to describe the differential expression and dynamic changes of secretomes. Bioinformatics has become the bridge between secretome data and computational tasks for managing, mining, and retrieving information. Predictions can be made based on this information, contributing to the elucidation of a given organism's physiological state and the determination of the specific malfunction in disease states. Here we provide an overview of the available bioinformatics databases and software that are used to analyze the biological meaning of secretome data, including descriptions of the main functions and limitations of these tools. The important challenges of data analysis are mainly related to the integration of biological information from dissimilar sources. Improvements in databases and developments in software will likely substantially contribute to the usefulness and reliability of secretome studies. This article is part of a Special Issue entitled: An Updated Secretome.
Affiliation(s)
- Dario Caccia
- Proteomics Laboratory, Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
9
Huang H, Correa N, Roy A, Adali T. Bootstrap testing of 2D electrophoresis gels across groups. Stat (Int Stat Inst) 2012. [DOI: 10.1002/sta4.12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Hui Huang
- Department of Management Science; University of Miami; Coral Gables FL 33146 USA
- Nicole Correa
- Department of CSEE; University of Maryland; Baltimore County, Baltimore MD 21250 USA
- Anindya Roy
- Department of Mathematics and Statistics; University of Maryland; Baltimore County, Baltimore MD 21250 USA
- Tulay Adali
- Department of CSEE; University of Maryland; Baltimore County, Baltimore MD 21250 USA
10
Frasch JV, Lodwich A, Shafait F, Breuel TM. A Bayes-true data generator for evaluation of supervised and unsupervised learning methods. Pattern Recognit Lett 2011. [DOI: 10.1016/j.patrec.2011.04.010] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 10/18/2022]
11
Tsakanikas P, Manolakos ES. Protein spot detection and quantification in 2-DE gel images using machine-learning methods. Proteomics 2011; 11:2038-50. [DOI: 10.1002/pmic.201000601] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 09/21/2010] [Revised: 02/02/2011] [Accepted: 02/11/2011] [Indexed: 01/16/2023]
12
Millioni R, Puricelli L, Sbrignadello S, Iori E, Murphy E, Tessari P. Operator- and software-related post-experimental variability and source of error in 2-DE analysis. Amino Acids 2011; 42:1583-90. [PMID: 21394601 DOI: 10.1007/s00726-011-0873-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/09/2010] [Accepted: 02/26/2011] [Indexed: 01/09/2023]
Abstract
In the field of proteomics, several approaches have been developed for separating proteins and analyzing their differential relative abundance. One of the oldest, yet still widely used, is 2-DE. Despite the continuous advance of new methods, which are less demanding from a technical standpoint, 2-DE is still compelling and has a lot of potential for improvement. The overall variability which affects 2-DE includes biological, experimental, and post-experimental (software-related) variance. It is important to highlight how much of the total variability of this technique is due to post-experimental variability, which, so far, has been largely neglected. In this short review, we have focused on this topic and explained that post-experimental variability and source of error can be further divided into those which are software-dependent and those which are operator-dependent. We discuss these issues in detail, offering suggestions for reducing errors that may affect the quality of results, summarizing the advantages and drawbacks of each approach.
Affiliation(s)
- Renato Millioni
- Division of Metabolism, Department of Clinical and Experimental Medicine, University of Padua, via Giustiniani 2, 35128, Padua, Italy.
13
Geromanos SJ, Hughes C, Golick D, Ciavarini S, Gorenstein MV, Richardson K, Hoyes JB, Vissers JP, Langridge JI. Simulating and validating proteomics data and search results. Proteomics 2011; 11:1189-211. [DOI: 10.1002/pmic.201000576] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 09/09/2010] [Revised: 11/30/2010] [Accepted: 12/05/2010] [Indexed: 11/08/2022]
14
Cannistraci CV, Montevecchi FM, Alessio M. Median-modified Wiener filter provides efficient denoising, preserving spot edge and morphology in 2-DE image processing. Proteomics 2009; 9:4908-19. [PMID: 19862762 DOI: 10.1002/pmic.200800538] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]
Abstract
Denoising is a fundamental early stage in 2-DE image analysis that strongly influences spot detection and pixel-based methods. A novel nonlinear adaptive spatial filter, the median-modified Wiener filter (MMWF), is here compared with five well-established denoising techniques (Median, Wiener, Gaussian, and Polynomial-Savitzky-Golay filters; wavelet denoising) to suggest, by means of fuzzy sets evaluation, the best denoising approach to use in practice. Although the median filter and wavelet denoising achieved the best performance in spike and Gaussian denoising respectively, they are unsuitable for the simultaneous removal of different types of noise, because their best setting is noise-dependent. Conversely, MMWF, which ranked second in each single denoising category, was evaluated as the best filter for global denoising, because its best setting is invariant to the type of noise. In addition, the median filter eroded the edge of isolated spots and filled the space between close-set spots, whereas MMWF, because of a novel filter effect (the drop-off effect), does not suffer from the erosion problem, preserves the morphology of close-set spots, and avoids spot and spike fuzzification, an aberration encountered with the Wiener filter. In our tests, MMWF was assessed as the best choice when the goal is to minimize spot edge aberrations while removing spike and Gaussian noise.
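The following is a minimal sketch of a median-modified Wiener filter, under the assumption that it follows the usual adaptive Wiener formula with the local mean replaced by the local median. That reading is inferred from the filter's name and is not spelled out in the abstract. The function names, the border handling, and the default noise estimate (the average local variance) are likewise choices made for this example.

```python
import statistics

def window(img, x, y, r):
    """Pixel values in the (2r+1) x (2r+1) neighbourhood, clipped at the border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]

def mmwf(img, r=1, noise_var=None):
    """Wiener-style adaptive filter with the local mean swapped for the local median."""
    h, w = len(img), len(img[0])
    med = [[0.0] * w for _ in range(h)]
    var = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = window(img, x, y, r)
            med[y][x] = statistics.median(vals)
            var[y][x] = statistics.pvariance(vals)
    if noise_var is None:
        # Assumption: estimate the noise power as the average local variance.
        flat = [v for row in var for v in row]
        noise_var = sum(flat) / len(flat)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            v = var[y][x]
            denom = max(v, noise_var)
            # Gain near 1 keeps detail where local variance is high; gain 0 returns the median.
            gain = max(v - noise_var, 0.0) / denom if denom > 0 else 0.0
            out[y][x] = med[y][x] + gain * (img[y][x] - med[y][x])
    return out
```

In this form, pixels next to an isolated spike are left at the local median rather than being pulled toward the spike, which is one way to picture how the blurring of spikes seen with a mean-based Wiener filter could be avoided. Where `noise_var` is at least the local variance, the gain is zero and the output falls back to the local median.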
15
Abstract
One of the most commonly used methods for protein separation is 2-DE. After 2-DE gel scanning, images with a plethora of spot features emerge that are usually contaminated by inherent noise. The objective of the denoising process is to remove noise to the extent that the true spots are recovered correctly and accurately i.e. without introducing distortions leading to the detection of false-spot features. In this paper we propose and justify the use of the contourlet transform as a tool for 2-DE gel images denoising. We compare its effectiveness with state-of-the-art methods such as wavelets-based multiresolution image analysis and spatial filtering. We show that contourlets not only achieve better average S/N performance than wavelets and spatial filters, but also preserve better spot boundaries and faint spots and alter less the intensities of informative spot features, leading to more accurate spot volume estimation and more reliable spot detection, operations that are essential to differential expression proteomics for biomarkers discovery.
16
Abstract
The image analysis part of gel-based proteome research plays an important role in the overall success of the experiment. The main purpose of software-assisted 2DE gel analysis is to detect the protein spots, match them between gels within an experiment, and identify any differences in protein expression between sets of samples. Efficient analysis of protein expression relies on automated image processing techniques. There are several factors to consider in the choice of software product, as well as in the implementation of the analysis itself. Successful quantification of protein expression levels is largely dependent on the algorithms for spot matching, normalization, and background subtraction provided by the 2DE analysis software. In addition to generic protocols for image acquisition and subsequent 2DE image analysis (using Progenesis PG200), this chapter describes methods for quantitative and qualitative evaluation of the quality of the image analysis.
17
Kang Y, Techanukul T, Mantalaris A, Nagy JM. Comparison of Three Commercially Available DIGE Analysis Software Packages: Minimal User Intervention in Gel-Based Proteomics. J Proteome Res 2009; 8:1077-84. [DOI: 10.1021/pr800588f] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 12/11/2022]
Affiliation(s)
- Yunyi Kang
- Department of Chemical Engineering and Chemical Technology, Imperial College London, London, SW7 2AZ, United Kingdom, and Institute of Biomedical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Tanasit Techanukul
- Department of Chemical Engineering and Chemical Technology, Imperial College London, London, SW7 2AZ, United Kingdom, and Institute of Biomedical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Anthanasios Mantalaris
- Department of Chemical Engineering and Chemical Technology, Imperial College London, London, SW7 2AZ, United Kingdom, and Institute of Biomedical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Judit M. Nagy
- Department of Chemical Engineering and Chemical Technology, Imperial College London, London, SW7 2AZ, United Kingdom, and Institute of Biomedical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
18
Statistical Analysis of Image Data Provided by Two-Dimensional Gel Electrophoresis for Discovery Proteomics. ACTA ACUST UNITED AC 2008. [DOI: 10.1007/978-1-60327-148-6_15] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1]
19
Li MD, Wang J. Neuroproteomics and its applications in research on nicotine and other drugs of abuse. Proteomics Clin Appl 2007; 1:1406-27. [PMID: 21136639 DOI: 10.1002/prca.200700321] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 03/30/2007] [Indexed: 12/24/2022]
Abstract
The rapidly growing field of neuroproteomics is able to track changes in protein expression and protein modifications underlying various physiological conditions, including the neural diseases related to drug addiction. Thus, it presents great promise in characterizing protein function, biochemical pathways, and networks to understand the mechanisms underlying drug dependence. In this article, we first provide an overview of proteomics technologies and bioinformatics tools available to analyze proteomics data. Then we summarize the recent applications of proteomics to profile the protein expression pattern in animal or human brain tissues after the administration of nicotine, alcohol, amphetamine, butorphanol, cocaine, and morphine. By comparing the protein expression profiles in response to chronic nicotine exposure with those appearing in response to treatment with other drugs of abuse, we identified three biological processes that appear to be regulated by multiple drugs of abuse: energy metabolism, oxidative stress response, and protein degradation and modification. Such similarity indicates that despite the obvious differences among their chemical properties and the receptors with which they interact, different substances of abuse may cause some similar changes in cellular activities and biological processes in neurons.
Affiliation(s)
- Ming D Li
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, Charlottesville, VA, USA.
20
Chich JF, David O, Villers F, Schaeffer B, Lutomski D, Huet S. Statistics for proteomics: Experimental design and 2-DE differential analysis. J Chromatogr B Analyt Technol Biomed Life Sci 2007; 849:261-72. [PMID: 17081811 DOI: 10.1016/j.jchromb.2006.09.033] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Received: 06/06/2006] [Revised: 08/25/2006] [Accepted: 09/08/2006] [Indexed: 11/24/2022]
Abstract
Proteomics relies on the separation of complex protein mixtures using two-dimensional electrophoresis. This approach is widely used to detect variations in the expression of proteins prepared from two or more samples. Recently, attention was drawn to the reliability of the results published in the literature. Among the critical points identified were experimental design, differential analysis and the problem of missing data, all problems where statistics can be of help. Using examples and terms understandable by biologists, we describe how a collaboration between biologists and statisticians can improve the reliability of results and confidence in conclusions.
Affiliation(s)
- Jean-François Chich
- INRA, Biologie Physico-Chimique des Prions, VIM 78352 Jouy-en-Josas Cedex, France.
21
Dowsey AW, English J, Pennington K, Cotter D, Stuehler K, Marcus K, Meyer HE, Dunn MJ, Yang GZ. Examination of 2-DE in the Human Proteome Organisation Brain Proteome Project pilot studies with the new RAIN gel matching technique. Proteomics 2006; 6:5030-47. [PMID: 16927431 DOI: 10.1002/pmic.200600152] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]
Abstract
The Human Proteome Organisation (HUPO) Brain Proteome Project (BPP) pilot studies have generated over 200 2-D gels from eight participating laboratories. This data includes 67 single-channel and 60 DIGE gels comparing 30 whole frozen C57/BL6 female mouse brains, ten each at embryonic day 16, postnatal day 7 (juvenile) and postnatal day 54-56 (adult); and ten single-channel and three DIGE gels comparing human epilepsy surgery of the temporal front lobe with a corresponding post-mortem specimen. The samples were generated centrally and distributed to the participating laboratories, but otherwise no restrictions were placed on sample preparation, running and staining protocols, nor on the 2-D gel analysis packages used. Spots were characterised by MS and the annotated gel images published on a ProteinScape web server. In order to examine the resultant differential expression and protein identifications, we have reprocessed a large subset of the gels using the newly developed RAIN (Robust Automated Image Normalisation) 2-D gel matching algorithm. Traditional approaches use symbolic representation of spots at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in spot modelling and matching. With RAIN, image intensity distributions, rather than selected features, are used, where smooth geometric deformation and expression bias are modelled using multi-resolution image registration and bias-field correction. The method includes a new approach of volume-invariant warping which ensures the volume of protein expression under transformation is preserved. An image-based statistical expression analysis phase is then proposed, where small insignificant expression changes over one gel pair can be revealed when reinforced by the same consistent changes in others. 
Results of the proposed method as applied to the HUPO BPP data show significant intra-laboratory improvements in matching accuracy over a previous state-of-the-art technique, Multi-resolution Image Registration (MIR), and the commercial Progenesis PG240 package.
Affiliation(s)
- Andrew W Dowsey
- Royal Society / Wolfson Foundation Medical Image Computing Laboratory, Department of Computing, Imperial College London, UK
22
Biron DG, Brun C, Lefevre T, Lebarbenchon C, Loxdale HD, Chevenet F, Brizard JP, Thomas F. The pitfalls of proteomics experiments without the correct use of bioinformatics tools. Proteomics 2006; 6:5577-96. [PMID: 16991202 DOI: 10.1002/pmic.200600223] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 02/04/2023]
Abstract
The elucidation of the entire genomic sequence of various organisms, from viruses to complex metazoans, most recently man, is undoubtedly the greatest triumph of molecular biology since the discovery of the DNA double helix. Over the past two decades, the focus of molecular biology has gradually moved from genomes to proteomes, the intention being to discover the functions of the genes themselves. The postgenomic era stimulated the development of new techniques (e.g. 2-DE and MS) and bioinformatics tools to identify the functions, reactions, interactions and location of the gene products in tissues and/or cells of living organisms. Both 2-DE and MS have been very successfully employed to identify proteins involved in biological phenomena (e.g. immunity, cancer, host-parasite interactions, etc.), although recently, several papers have emphasised the pitfalls of 2-DE experiments, especially in relation to experimental design, poor statistical treatment and the high rate of 'false positive' results with regard to protein identification. In the light of these perceived problems, we review the advantages and misuses of bioinformatics tools - from realisation of 2-DE gels to the identification of candidate protein spots - and suggest some useful avenues to improve the quality of 2-DE experiments. In addition, we present key steps which, in our view, need to be taken into consideration during such analyses. Lastly, we present novel biological entities named 'interactomes', and the bioinformatics tools developed to analyse the large protein-protein interaction networks they form, along with several new perspectives of the field.
Affiliation(s)
- David G Biron
- GEMI, UMR CNRS/IRD 2724, Centre IRD, Montpellier, France.
23
Palagi PM, Hernandez P, Walther D, Appel RD. Proteome informatics I: Bioinformatics tools for processing experimental data. Proteomics 2006; 6:5435-44. [PMID: 16991191 DOI: 10.1002/pmic.200600273] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
Abstract
Bioinformatics tools for proteomics, also called proteome informatics tools, span today a large panel of very diverse applications ranging from simple tools to compare protein amino acid compositions to sophisticated software for large-scale protein structure determination. This review considers the available and ready to use tools that can help end-users to interpret, validate and generate biological information from their experimental data. It concentrates on bioinformatics tools for 2-DE analysis, for LC followed by MS analysis, for protein identification by PMF, by peptide fragment fingerprinting and by de novo sequencing and for data quantitation with MS data. It also discloses initiatives that propose to automate the processes of MS analysis and enhance the quality of the obtained results.
Affiliation(s)
- Patricia M Palagi
- Proteome Informatics Group, Swiss Institute of Bioinformatics, Geneva, Switzerland.
24
Liu S, Davis JM. Dependence on saturation of average minimum resolution in two-dimensional statistical-overlap theory: peak overlap in saturated two-dimensional separations. J Chromatogr A 2006; 1126:244-56. [PMID: 16782109 DOI: 10.1016/j.chroma.2006.05.064] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 01/12/2006] [Revised: 05/13/2006] [Accepted: 05/22/2006] [Indexed: 10/24/2022]
Abstract
A theory is proposed for the dependence on saturation of the average minimum resolution R(*) in point-process statistical-overlap theory for two-dimensional separations. Peak maxima are modelled by clusters of overlapping circles in hexagonal arrangements similar to close-packed layers. Such clusters exist only for specific circle numbers, but equations are derived that facilitate prediction of equivalent cluster properties for any number of circles. A metric is proposed for the average minimum resolution that separates two such clusters into two maxima. From this metric, the average minimum resolution of the two nearest-neighbor single-component peaks (SCPs)--one in each cluster--is calculated. Its value varies with the number of SCPs in both clusters. These resolutions are weighted by the probability that the two clusters contain the postulated numbers of SCPs and summed to give R(*), which decreases with increasing saturation. The dependence of R(*) on saturation is combined with a theory correcting the probability of overlap in a reduced square for boundary effects. The numbers of maxima in simulations of 75, 150, and 300 randomly distributed bi-Gaussians having exponential heights and aspect ratios of 1, 30, and 60 are compared to predictions. Excellent agreement between maxima numbers and theory is found at low and high saturation. Good estimates of the numbers of bi-Gaussians in simulations are calculated by fitting theory to numbers of maxima using least-squares regression. The theory is applied to mimicked GC x GCs of 93 compounds having many correlated retention times, with predictions that agree fairly well with maxima numbers.
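The simulation step mentioned in this abstract (randomly placed peaks with exponentially distributed heights, followed by counting the observed maxima) can be illustrated with a minimal sketch. This is not the authors' code: the grid size, the use of plain elliptical 2D Gaussians, the 8-neighbor definition of a maximum, and the names `simulate_map` and `count_maxima` are all assumptions made for illustration.

```python
import numpy as np


def simulate_map(n_peaks, sigma_x, sigma_y, size=256, seed=0):
    """Sum of randomly placed 2D Gaussians with exponential heights.

    Hypothetical helper: the aspect ratio is sigma_x / sigma_y.
    """
    rng = np.random.default_rng(seed)
    xs = rng.uniform(0, size, n_peaks)        # random peak centers
    ys = rng.uniform(0, size, n_peaks)
    heights = rng.exponential(1.0, n_peaks)   # exponential peak heights
    yy, xx = np.mgrid[0:size, 0:size]
    img = np.zeros((size, size))
    for x0, y0, h in zip(xs, ys, heights):
        img += h * np.exp(-0.5 * (((xx - x0) / sigma_x) ** 2
                                  + ((yy - y0) / sigma_y) ** 2))
    return img


def count_maxima(img, floor=1e-3):
    """Count pixels strictly greater than all 8 neighbors and above `floor`."""
    # Pad with -inf so that border pixels can also be tested.
    p = np.pad(img, 1, constant_values=-np.inf)
    center = p[1:-1, 1:-1]
    is_max = center > floor
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = p[1 + dy:p.shape[0] - 1 + dy,
                         1 + dx:p.shape[1] - 1 + dx]
            is_max &= center > neighbor
    return int(is_max.sum())


if __name__ == "__main__":
    # More peaks in the same area means more overlap, so the number of
    # observed maxima typically falls further below the number of peaks.
    for n in (75, 150, 300):
        print(n, count_maxima(simulate_map(n, sigma_x=4.0, sigma_y=4.0)))
```

Comparing the printed maxima counts with the number of simulated peaks gives a rough feel for how overlap grows with saturation; the theory's own predictions are not reproduced here.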
Affiliation(s)
- Siyuan Liu
- Department of Chemistry and Biochemistry, Southern Illinois University at Carbondale, Carbondale, IL 62901-4409, USA
|
25
|
Pietrogrande MC, Marchetti N, Tosi A, Dondi F, Righetti PG. Decoding two-dimensional polyacrylamide gel electrophoresis complex maps by autocovariance function: A simplified approach useful for proteomics. Electrophoresis 2005; 26:2739-48. [PMID: 15966009 DOI: 10.1002/elps.200410375] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Abstract
This paper describes a mathematical approach applied for decoding the complex signal of two-dimensional polyacrylamide gel electrophoresis maps of protein mixtures. The method is helpful in extracting analytical information since separation of all the proteins present in the sample is still far from being achieved and co-migrating proteins are generally present in the same spot. The simplified method described is based on the study of the 2-D autocovariance function (2D-ACVF) computed on an experimental digitized map. The first part of the 2D-ACVF allows for the estimation of the number of proteins present in the sample (2D-ACVF computed at the origin) and of the separation performance (mean spot size). Moreover, the 2D-ACVF plot is a powerful tool in identifying order in the spot position, and singling it out from the complex separation pattern. This method was validated on synthetic maps obtained by computer simulation to describe 2-D PAGE real maps and reference maps retrieved from the SWISS-2DPAGE database. The results obtained are discussed by focusing on specific information relevant in proteomics: sample complexity, separation performance, and identification of spot trains related to post-translational modifications.
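As a rough illustration of the quantity this abstract relies on, the sketch below computes a 2D autocovariance function of a digitized map with the FFT. It is an assumption-laden example, not the paper's procedure: it uses a circular (periodic) estimate, the function name `autocovariance_2d` is hypothetical, and the paper's formulas for estimating the number of proteins and the mean spot size from the 2D-ACVF are not reproduced.

```python
import numpy as np


def autocovariance_2d(img):
    """Circular 2D autocovariance of a map, computed via the FFT.

    Entry [dy, dx] is the mean over all pixels of
    (f(y, x) - mean) * (f(y + dy, x + dx) - mean), with wrap-around.
    """
    f = np.asarray(img, dtype=float)
    f = f - f.mean()                           # remove the mean level
    power = np.abs(np.fft.fft2(f)) ** 2        # power spectrum
    return np.fft.ifft2(power).real / f.size   # Wiener-Khinchin relation


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gel = rng.random((64, 64))                 # stand-in for a digitized map
    acvf = autocovariance_2d(gel)
    # The value at the origin equals the variance of the map.
    print(np.isclose(acvf[0, 0], gel.var()))
```

The value at the origin is simply the variance of the map, and the decay of the function away from the origin reflects spot width; applying `np.fft.fftshift` to the result centers the origin for plotting.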
|
26
|
Mansour L, Cheikali C, Desaunais P, Coulon JP, Daubin J, Hassine OKB, Vivarès CP, Jeanjean J, Cornillot E. Description of an ultrathin multiwire proportional chamber-based detector and application to the characterization of the Spraguea lophii (Microspora) two-dimensional genome fingerprint. Electrophoresis 2004; 25:3365-77. [PMID: 15490460 DOI: 10.1002/elps.200406089] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
Multiwire proportional chamber is a useful technology to build detectors that supersede the lack of interactivity of autoradiography in molecular biology experiments. Some drawbacks still limited the diffusion of existing instruments in biological laboratories. The major competitors are storage phosphor imaging systems. The simplified description of a radio-chromato-imager prototype (RCI) based on an original ultrathin multiwire proportional chamber is presented. It combines the advantage of the different existing technologies to present competitive properties in terms of efficiency, spatial resolution, robustness, manipulation easiness and production cost. Application of the RCI detector to molecular biology was performed by the analysis of karyotype and restriction display two-dimensional pulsed-field gel electrophoresis (KARD 2-D PFGE) data which are used to describe small eukaryotic genome structures. The comparative analysis with autoradiography was performed with the PDQuest software on Spraguea lophii (Microspora) genome fingerprints. The spot detection procedure applied to the different images leads to a similar conclusion considering the genome structure of S. lophii which appeared to be composed of 15 chromosomes for 13 karyotypic bands (200-880 kbp).
Affiliation(s)
- Lamjed Mansour
- Parasitologie Moléculaire et Cellulaire, Université Blaise Pascal, Aubière, France.
|
27
|
Affiliation(s)
- Helen Kim
- Department of Pharmacology and Toxicology, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA.
|