1
Boskamp T, Lachmund D, Casadonte R, Hauberg-Lotte L, Kobarg JH, Kriegsmann J, Maass P. Using the Chemical Noise Background in MALDI Mass Spectrometry Imaging for Mass Alignment and Calibration. Anal Chem 2019; 92:1301-1308. [PMID: 31793765 DOI: 10.1021/acs.analchem.9b04473] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/28/2022]
Abstract
Matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI MSI) is an established tool for the investigation of formalin-fixed paraffin-embedded (FFPE) tissue samples and shows high potential for applications in clinical research and histopathological diagnosis. The applicability and accuracy of this method, however, depend heavily on the quality of the acquired data, and mass misalignment in axial time-of-flight (TOF) MSI in particular continues to be a serious issue. We present a mass alignment and recalibration method that is specifically designed to operate on MALDI peptide imaging data. The proposed method exploits statistical properties of the characteristic chemical noise background observed in peptide imaging experiments. By comparing these properties to a theoretical peptide mass model, the effective mass shift of each spectrum is estimated and corrected. The method was evaluated on a cohort of 31 FFPE tissue samples, pursuing a statistical validation approach to estimate both the reduction of relative misalignment and the increase in absolute mass accuracy. Our results suggest that a relative mass precision of approximately 5 ppm and an absolute accuracy of approximately 20 ppm are achievable using our method.
Affiliation(s)
- Tobias Boskamp
- Center for Industrial Mathematics, University of Bremen, Bremen 28359, Germany; SCiLS, Bremen 28359, Germany
- Delf Lachmund
- Center for Industrial Mathematics, University of Bremen, Bremen 28359, Germany
- Lena Hauberg-Lotte
- Center for Industrial Mathematics, University of Bremen, Bremen 28359, Germany
- Jörg Kriegsmann
- Proteopath, Trier 54296, Germany; Center for Histology, Cytology, and Molecular Diagnostics, Trier 54292, Germany
- Peter Maass
- Center for Industrial Mathematics, University of Bremen, Bremen 28359, Germany; SCiLS, Bremen 28359, Germany
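To put the ppm figures quoted in the abstract of entry 1 into absolute terms, the following minimal sketch converts between a relative tolerance in ppm and a mass window in Da. The function names `ppm_to_da` and `ppm_error` and the example m/z values are illustrative assumptions, not taken from the paper.

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm into an absolute width in Da at a given m/z."""
    return mz * ppm * 1e-6


def ppm_error(measured: float, reference: float) -> float:
    """Signed relative mass error of a measured m/z against a reference m/z, in ppm."""
    return (measured - reference) / reference * 1e6


if __name__ == "__main__":
    # Hypothetical example: the window widths at m/z 1500 for 5 ppm and 20 ppm.
    for ppm in (5.0, 20.0):
        print(f"{ppm:g} ppm at m/z 1500 is {ppm_to_da(1500.0, ppm):.4f} Da")
```

Because the tolerance is relative, the same ppm value corresponds to a wider absolute window at higher m/z.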
2
Egertson JD, Eng JK, Bereman MS, Hsieh EJ, Merrihew GE, MacCoss MJ. De novo correction of mass measurement error in low resolution tandem MS spectra for shotgun proteomics. Journal of the American Society for Mass Spectrometry 2012; 23:2075-2082. [PMID: 23007965 PMCID: PMC3515694 DOI: 10.1007/s13361-012-0482-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/07/2012] [Revised: 08/17/2012] [Accepted: 08/18/2012] [Indexed: 06/01/2023]
Abstract
We report an algorithm designed for the calibration of low resolution peptide mass spectra. Our algorithm is implemented in a program called FineTune, which corrects systematic mass measurement error in 1 min, with no input required besides the mass spectra themselves. The mass measurement accuracy for a set of spectra collected on an LTQ-Velos improved 20-fold from -0.1776 ± 0.0010 m/z to 0.0078 ± 0.0006 m/z after calibration (avg ± 95 % confidence interval). The precision in mass measurement was improved due to the correction of non-linear variation in mass measurement accuracy across the m/z range.
Affiliation(s)
- Jarrett D Egertson
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
3
An on-target desalting and concentration sample preparation protocol for MALDI-MS and MS/MS analysis. Methods Mol Biol 2012; 909:17-28. [PMID: 22903706 DOI: 10.1007/978-1-61779-959-4_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]
Abstract
2DE coupled with MALDI-MS is one of the most widely used and powerful analytic technologies in proteomics studies. Since its introduction, the MALDI sample preparation method has been developed and optimized towards a combination of simplicity, sample cleaning, and sample concentration. Here we present a protocol for the so-called Sample loading, Matrix loading, and on-target Wash (SMW) method, which fulfills these three criteria by taking advantage of the AnchorChip™ targets. Our method is extremely simple, and no pre-desalting or concentration is needed when dealing with samples prepared from 2DE. The protocol is amenable to automation and would pave the way for high-throughput MALDI-MS or MS/MS-based proteomics studies with guaranteed sensitivity and a high identification rate. The method has been successfully applied to a mouse liver proteome study and has so far been employed in other proteome studies by researchers worldwide.
4
He Z, Yang C, Yang C, Qi RZ, Po-Ming Tam J, Yu W. Optimization-Based Peptide Mass Fingerprinting for Protein Mixture Identification. J Comput Biol 2010; 17:221-35. [DOI: 10.1089/cmb.2009.0160] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Affiliation(s)
- Zengyou He
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Chao Yang
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Can Yang
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Robert Z. Qi
- Department of Biochemistry, The Hong Kong University of Science and Technology, Hong Kong
- Jason Po-Ming Tam
- Department of Biochemistry, The Hong Kong University of Science and Technology, Hong Kong
- Weichuan Yu
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong
5
Abstract
Computational proteomics applications are often imagined as a pipeline, where information is processed in each stage before it flows to the next one. Independent of the type of application, the first stage invariably consists of obtaining the raw mass spectrometric data from the spectrometer and preparing it for use in the later stages by enhancing the signal of interest while suppressing spurious components. Numerous approaches for preprocessing MS data have been described in the literature. In this chapter, we describe both standard techniques originating from classical signal and image processing and novel computational approaches specifically tailored to the analysis of MS data sets. We focus on low-level signal processing tasks such as baseline reduction, denoising, and feature detection.
Affiliation(s)
- Rene Hussong
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
6
Bioinformatics methods for protein identification using peptide mass fingerprinting. Methods Mol Biol 2009. [PMID: 20013361 DOI: 10.1007/978-1-60761-444-9_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Protein identification by mass spectrometry (MS) is an important technique in proteomics. By searching an MS spectrum against a given protein database, the best-matching proteins are sorted using a scoring function, and the top one is often considered the correctly identified protein. Peptide mass fingerprinting (PMF) is one of the major methods for protein identification using MS technology. It is faster and cheaper than the other popular technique, tandem mass spectrometry. Key bioinformatics issues in PMF analysis include designing a scoring function to quantitatively measure the degree of consistency between a PMF spectrum and a protein sequence, and assessing the confidence of identified proteins. In this chapter, we introduce several scoring functions that were developed by others and by us. We also provide a new statistical model to evaluate the confidence of the score and to improve the ranking of proteins in protein identification. Our developments have been implemented in a software package, "ProteinDecision," which is available at http://digbio.missouri.edu/ProteinDecision/ .
7
Song Z, Chen L, Xu D. Confidence assessment for protein identification by using peptide-mass fingerprinting data. Proteomics 2009; 9:3090-9. [PMID: 19526559 DOI: 10.1002/pmic.200701159] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Protein identification using Peptide Mass Fingerprinting (PMF) data remains an important yet only partially solved problem. Current computational methods may lead to false positive identification since the top hit from a database search may not be the target protein. In addition, the identification scores assigned singly by a scoring function (raw scores) are not normalized. Therefore, the ranking based on raw scores may be biased. To address the above issue, we have developed a statistical model to evaluate the confidence of the raw score and to improve the ranking of proteins for identification. The results show that the statistical model better ranks the correct protein than the raw scores. Our study provides a new method to enhance the accuracy of protein identification by using PMF data. We incorporated the method into our software package "Protein-Decision" together with a user-friendly graphical interface. A standalone version of Protein-Decision is freely available at http://digbio.missouri.edu/ProteinDecision/.
Affiliation(s)
- Zhao Song
- Computer Science Department, Christopher S Bond Life Sciences Center, University of Missouri, Columbia, MO 65211-2060, USA
8
Blind search for post-translational modifications and amino acid substitutions using peptide mass fingerprints from two proteases. BMC Res Notes 2008; 1:130. [PMID: 19099572 PMCID: PMC2653020 DOI: 10.1186/1756-0500-1-130] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/17/2008] [Accepted: 12/19/2008] [Indexed: 11/10/2022]
Abstract
Background: Mass spectrometric analysis of peptides is an essential part of protein identification and characterization, the latter meaning the identification of modifications and amino acid substitutions. There are two main approaches for characterization: (i) using a predefined set of possible modifications and substitutions or (ii) performing a blind search. The first option is straightforward, but cannot detect modifications or substitutions outside the predefined set. A blind search does not have this limitation, and therefore has the potential of detecting both known and unknown modifications and substitutions. Combining the peptide mass fingerprints from two proteases results in overlapping sequence coverage of the protein, thereby offering alternative views of the protein and a novel way of indicating post-translational modifications and amino acid substitutions. Results: We have developed an algorithm and a software tool, MassShiftFinder, that performs a blind search using peptide mass fingerprints from two proteases with different cleavage specificities. The algorithm is based on equal mass shifts for overlapping peptides from the two proteases used, and can indicate both post-translational modifications and amino acid substitutions. In most cases it is possible to suggest a restricted area within the overlapping peptides where the mass shift can occur. The program is available at . Conclusion: Without any prior assumptions about their presence, the described algorithm is able to indicate post-translational modifications or amino acid substitutions in MALDI-TOF experiments on identified proteins, and can thereby direct the involved peptides to subsequent TOF-TOF analysis. The algorithm is designed for detailed and low-throughput characterization of single proteins.
9
Lee SW, Choi JP, Kim HJ, Hong JM, Hur CG. ASPMF: A new approach for identifying alternative splicing isoforms using peptide mass fingerprinting. Biochem Biophys Res Commun 2008; 377:253-6. [DOI: 10.1016/j.bbrc.2008.09.115] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/17/2008] [Accepted: 09/24/2008] [Indexed: 02/04/2023]
10
Improved peptide mass fingerprinting matches via optimized sample preparation in MALDI mass spectrometry. Anal Chim Acta 2008; 627:162-8. [DOI: 10.1016/j.aca.2008.05.059] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 03/13/2008] [Revised: 05/14/2008] [Accepted: 05/16/2008] [Indexed: 11/23/2022]
11
Bocker S, Makinen V. Combinatorial approaches for mass spectra recalibration. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008; 5:91-100. [PMID: 18245878 DOI: 10.1109/tcbb.2007.1077] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/25/2023]
Abstract
Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger datasets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient due to the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors. We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem, and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence, and improves existing algorithms by one order of magnitude exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line-sweep.
12
Barsnes H, Eidhammer I, Cruciani V, Mikalsen SO. Protease-dependent fractional mass and peptide properties. European Journal of Mass Spectrometry (Chichester, England) 2008; 14:311-317. [PMID: 19023148 DOI: 10.1255/ejms.934] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/27/2023]
Abstract
Mass spectrometric analyses of peptides mainly rely on cleavage of proteins with proteases that have a defined specificity. The specificities of the proteases imply that the distribution of amino acids in the peptides is not random. The physico-chemical effects of this distribution have been partly analyzed for tryptic peptides, but to a lesser degree for other proteases. Using all human proteins in Swiss-Prot, the relationships between peptide fractional mass, pI and hydrophobicity were investigated. The distribution of the fractional masses and the average regression lines for the fractional masses were similar, but not identical, for the peptides generated by the proteases trypsin, chymotrypsin and gluC, with the steepest regression line for gluC. The fractional mass regression lines for individual proteins showed up to ±100 ppm in mass difference from the average regression line, and the peptides generated showed protease-dependent properties. We show here that the fractional mass and some other properties of the peptides are dependent on the protease used for generating the peptides. With the increasing accuracy of mass spectrometry instruments it is possible to exploit the information embedded in the fractional mass of unknown peaks in peptide mass fingerprint spectra.
Affiliation(s)
- Harald Barsnes
- Department of Informatics, University of Bergen, Norway.
13
Eidhammer I, Barsnes H, Mikalsen SO. MassSorter: peptide mass fingerprinting data analysis. Methods Mol Biol 2008; 484:345-359. [PMID: 18592191 DOI: 10.1007/978-1-59745-398-1_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
MassSorter is a software tool that sorts, systemizes, and analyzes data from peptide mass fingerprinting (PMF) experiments on proteins with known amino acid sequences. Several experiments can be simultaneously analyzed for sequence coverage and posttranslational modifications occurring during sample handling, induced chemical modifications, and unexpected cleavages. Experimental m/z values are compared with m/z values from an in silico digestion, taking modifications into account. Filters can be defined by users for marking autolytic protease peaks and other contaminating peaks. MassSorter functions as a database of all the detected peptides. It includes tools for visualization of the results, such as sequence coverage, accuracy plots, statistics, and 3D models.
14
Tolmachev AV, Monroe ME, Jaitly N, Petyuk VA, Adkins JN, Smith RD. Mass Measurement Accuracy in Analyses of Highly Complex Mixtures Based Upon Multidimensional Recalibration. Anal Chem 2006; 78:8374-85. [PMID: 17165830 DOI: 10.1021/ac0606251] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
Mass spectrometry combined with a range of on-line separation techniques has become a powerful tool for characterization of complex mixtures, including protein digests in proteomics studies. Accurate mass measurements can be compromised due to variations that occur in the course of an on-line separation, e.g., due to excessive space charge in an ion trap, temperature changes, or other sources of instrument "drift". We have developed a multidimensional recalibration approach that utilizes existing information on the likely mixture composition, taking into account variable conditions of mass measurements, and that corrects the mass calibration for sets of individual peaks binned by, for example, the total ion count for the mass spectrum, the individual peak abundance, m/z value, and liquid chromatography separation time. The multidimensional recalibration approach uses a statistical matching of measured masses in such measurements, often exceeding 10^5, to a significant number of putative known species likely to be present in the mixture (i.e., having known accurate masses), to identify a subset of the detected species that serve as effective calibrants. The recalibration procedure involves optimization of the mass accuracy distribution (histogram), to provide a more confident distinction between true and false identifications. We report the mass accuracy improvement obtained for data acquired using a TOF and several FTICR mass spectrometers. We show that the multidimensional recalibration better compensates for systematic mass measurement errors and also significantly reduces the mass error spread: i.e., both the accuracy and precision of mass measurements are improved. The mass measurement improvement is found to be virtually independent of the initial instrument calibration, allowing, for example, less frequent calibration. We show that this recalibration can provide sub-ppm mass measurement accuracy for measurements of a complex fungal proteome tryptic digest and provide improved confidence or numbers of peptide identifications.
Affiliation(s)
- Aleksey V Tolmachev
- Biological Sciences Division, Pacific Northwest National Laboratory, PO Box 999, Richland, Washington 99352, USA
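The binned recalibration idea in the abstract of entry 14 can be illustrated with a deliberately reduced sketch: calibrant peaks with known reference masses are grouped into bins, a median ppm error is estimated per bin, and measured values are corrected by the error of their bin. This is a one-dimensional simplification that bins by m/z only. The function names, the bin width of 500, and the use of the median are assumptions made for illustration and are not the authors' procedure.

```python
from collections import defaultdict
from statistics import median


def ppm_error(measured: float, reference: float) -> float:
    """Signed relative mass error in ppm."""
    return (measured - reference) / reference * 1e6


def fit_bin_corrections(calibrants, bin_width: float = 500.0) -> dict:
    """Estimate a median ppm error per m/z bin.

    calibrants: iterable of (measured_mz, reference_mz) pairs for peaks
    that were matched to species with known accurate masses.
    """
    errors = defaultdict(list)
    for measured, reference in calibrants:
        errors[int(measured // bin_width)].append(ppm_error(measured, reference))
    return {bin_index: median(values) for bin_index, values in errors.items()}


def apply_correction(mz: float, corrections: dict, bin_width: float = 500.0) -> float:
    """Remove the estimated systematic error of the bin that mz falls into.

    Bins without calibrants are left uncorrected (0 ppm).
    """
    ppm = corrections.get(int(mz // bin_width), 0.0)
    return mz / (1.0 + ppm * 1e-6)
```

A multidimensional variant would key the dictionary on a tuple of bin indices (for example intensity, m/z, and separation time) instead of a single index.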
15
Wolski WE, Farrow M, Emde AK, Lehrach H, Lalowski M, Reinert K. Analytical model of peptide mass cluster centres with applications. Proteome Sci 2006; 4:18. [PMID: 16995952 PMCID: PMC1617084 DOI: 10.1186/1477-5956-4-18] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 01/30/2006] [Accepted: 09/23/2006] [Indexed: 11/10/2022]
Abstract
Background: The elemental composition of peptides results in formation of distinct, equidistantly spaced clusters across the mass range. The property of peptide mass clustering is used to calibrate peptide mass lists, to identify and remove non-peptide peaks and for data reduction. Results: We developed an analytical model of the peptide mass cluster centres. Inputs to the model included the amino acid frequencies in the sequence database, the average length of the proteins in the database, the cleavage specificity of the proteolytic enzyme used and the cleavage probability. We examined the accuracy of our model by comparing it with the model based on an in silico sequence database digest. To identify the crucial parameters we analysed how the cluster centre location depends on the inputs. The distance to the nearest cluster was used to calibrate mass spectrometric peptide peak-lists and to identify non-peptide peaks. Conclusion: The model introduced here enables us to predict the location of the peptide mass cluster centres. It explains how the location of the cluster centres depends on the input parameters. Fast and efficient calibration and filtering of non-peptide peaks is achieved by a distance measure suggested by Wool and Smilansky.
Affiliation(s)
- Witold E Wolski
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
- Malcolm Farrow
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
- Anne-Katrin Emde
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
- Hans Lehrach
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
- Maciej Lalowski
- Max Delbrück Center for Molecular Medicine, Robert-Roessle-Str. 10, D-13125 Berlin-Buch, Germany
- Knut Reinert
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
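The use of the distance to the nearest cluster centre, as described in the abstract of entry 15, can be sketched as follows. The sketch assumes equidistant cluster centres at integer multiples of a fixed spacing and estimates a constant calibration offset as the median deviation. The spacing value of 1.000495 Da, the threshold of 0.2 Da, and all function names are illustrative assumptions. The deviation used here is a plain signed difference, not the distance measure by Wool and Smilansky that the abstract refers to, and the analytical model of the paper is not reproduced.

```python
import statistics

# Assumed spacing of peptide mass cluster centres in Da (illustrative value).
SPACING = 1.000495


def cluster_deviation(mass: float, spacing: float = SPACING) -> float:
    """Signed distance in Da from a mass to the nearest centre k * spacing."""
    return mass - spacing * round(mass / spacing)


def estimate_offset(masses, spacing: float = SPACING) -> float:
    """Estimate a constant calibration offset as the median cluster deviation."""
    return statistics.median(cluster_deviation(m, spacing) for m in masses)


def recalibrate(masses, spacing: float = SPACING):
    """Shift all masses so that their median cluster deviation becomes zero."""
    offset = estimate_offset(masses, spacing)
    return [m - offset for m in masses]


def flag_non_peptide(masses, max_dev: float = 0.2, spacing: float = SPACING):
    """Mark masses that lie far from every cluster centre as likely non-peptide."""
    return [abs(cluster_deviation(m, spacing)) > max_dev for m in masses]
```

A constant offset is the simplest possible calibration function; a mass-dependent correction would fit the deviations as a function of mass instead of taking a single median.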
16
Hilario M, Kalousis A, Pellegrini C, Müller M. Processing and classification of protein mass spectra. Mass Spectrometry Reviews 2006; 25:409-49. [PMID: 16463283 DOI: 10.1002/mas.20072] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.8] [Indexed: 05/06/2023]
Abstract
Among the many applications of mass spectrometry, biomarker pattern discovery from protein mass spectra has aroused considerable interest in the past few years. While research efforts have raised hopes of early and less invasive diagnosis, they have also brought to light the many issues to be tackled before mass-spectra-based proteomic patterns become routine clinical tools. Known issues cover the entire pipeline leading from sample collection through mass spectrometry analytics to biomarker pattern extraction, validation, and interpretation. This study focuses on the data-analytical phase, which takes as input mass spectra of biological specimens and discovers patterns of peak masses and intensities that discriminate between different pathological states. We survey current work and investigate computational issues concerning the different stages of the knowledge discovery process: exploratory analysis, quality control, and diverse transforms of mass spectra, followed by further dimensionality reduction, classification, and model evaluation. We conclude after a brief discussion of the critical biomedical task of analyzing discovered discriminatory patterns to identify their component proteins as well as interpret and validate their biological implications.
Affiliation(s)
- Melanie Hilario
- Artificial Intelligence Laboratory, Computer Science Department, University of Geneva, CH-1211 Geneva 4, Switzerland.
17
Abstract
The rapid expansion of methods for measuring biological data ranging from DNA sequence variations to mRNA expression and protein abundance presents the opportunity to utilize multiple types of information jointly in the study of human health and disease. Organisms are complex systems that integrate inputs at myriad levels to arrive at an observable phenotype. Therefore, it is essential that questions concerning the etiology of phenotypes as complex as common human diseases take the systemic nature of biology into account, and integrate the information provided by each data type in a manner analogous to the operation of the body itself. While limited in scope, the initial forays into the joint analysis of multiple data types have yielded interesting results that would not have been reached had only one type of data been considered. These early successes, along with the aforementioned theoretical appeal of data integration, provide impetus for the development of methods for the parallel, high-throughput analysis of multiple data types. The idea that the integrated analysis of multiple data types will improve the identification of biomarkers of clinical endpoints, such as disease susceptibility, is presented as a working hypothesis.
Affiliation(s)
- David M Reif
- Center for Human Genetics Research, Vanderbilt University Medical School, 519 Light Hall, Nashville, TN 37232-0700, USA.
18
Barsnes H, Mikalsen SO, Eidhammer I. MassSorter: a tool for administrating and analyzing data from mass spectrometry experiments on proteins with known amino acid sequences. BMC Bioinformatics 2006; 7:42. [PMID: 16438723 PMCID: PMC1403804 DOI: 10.1186/1471-2105-7-42] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 10/17/2005] [Accepted: 01/26/2006] [Indexed: 11/29/2022]
Abstract
Background: Proteomics is the study of the proteome, and is critical to the understanding of cellular processes. Two central and related tasks of proteomics are protein identification and protein characterization. Many small laboratories are interested in the characterization of a small number of proteins, e.g., how posttranslational modifications change under different conditions. Results: We have developed a software tool called MassSorter for administrating and analyzing data from peptide mass fingerprinting experiments on proteins with known amino acid sequences. It is meant for small-scale mass spectrometry laboratories that are interested in posttranslational modifications of known proteins. Several experiments can be compared simultaneously, and the matched and unmatched peak values are clearly indicated. The hits can be sorted according to m/z values (default) or according to the sequence of the protein. Filters defined by the user can mark autolytic protease peaks and other contaminating peaks (keratins, proteins co-migrating with the protein of interest, etc.). Unmatched peaks can be further analyzed for unexpected modifications by searches against a local version of the UniMod database. They can also be analyzed for unexpected cleavages, a highly useful feature for proteins that undergo maturation by proteolytic cleavage, creating new N- or C-termini. Additional tools exist for visualization of the results, such as sequence coverage, accuracy plots, different types of statistics, 3D models, etc. The program and a tutorial are freely available for academic users at . Conclusion: MassSorter has a number of useful features that can promote the analysis and administration of MS data.
Affiliation(s)
- Harald Barsnes
- Department of informatics, University of Bergen, PB. 7800, N-5020 Bergen, Norway
- Computational Biology Unit, Bergen Center for Computational Science, UNIFOB/UIB, Thormoehlensgt. 55, N-5008 Bergen, Norway
- Svein-Ole Mikalsen
- Institute for Cancer Research, Rikshospitalet-Radiumhospitalet University Hospital, Montebello, N-0310 Oslo, Norway
- Ingvar Eidhammer
- Department of informatics, University of Bergen, PB. 7800, N-5020 Bergen, Norway
19
Malyarenko DI, Cooke WE, Tracy ER, Drake RR, Shin S, Semmes OJ, Sasinowski M, Manos DM. Resampling and deconvolution of linear time-of-flight records for enhanced protein profiling. Rapid Communications in Mass Spectrometry: RCM 2006; 20:1670-8. [PMID: 16637003 PMCID: PMC7432531 DOI: 10.1002/rcm.2496] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/08/2023]
Abstract
We have developed a peak deconvolution strategy that is applicable to the full mass range of a time-of-flight (TOF) spectrum. This strategy involves resampling a spectrum to create a time series that has equal peak widths (in time) across the entire spectrum, and then using the deconvolution filters we have previously described. We use this technique to deconvolve the protein mass spectra for blood serum and cell lysates acquired on three separate TOF instruments. Following deconvolution, we resolve spectral structures consistent with expected events such as multiply charged ions, matrix adducts and post-translational protein modifications. The deconvolution procedure produces a 40% improvement in the resolution and enhanced experimental sensitivity over the full length of the linear TOF record, up to m/z 150 000. This approach is particularly appropriate for automated data analysis and peak detection in dense TOF spectra.
Affiliation(s)
- Dariya I Malyarenko
- Departments of Physics and Applied Science, College of William and Mary, Williamsburg, VA 23187-8795, USA.
20
Malyarenko DI, Cooke WE, Tracy ER, Trosset MW, Semmes OJ, Sasinowski M, Manos DM. Deconvolution filters to enhance resolution of dense time-of-flight survey spectra in the time-lag optimization range. Rapid Communications in Mass Spectrometry: RCM 2006; 20:1661-9. [PMID: 16636999 PMCID: PMC4503320 DOI: 10.1002/rcm.2487] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/08/2023]
Abstract
By applying time-domain filters to time-of-flight (TOF) mass spectrometry signals, we have simultaneously smoothed and narrowed spectra resulting in improved resolution and increased signal-to-noise ratios. This filtering procedure has an advantage over detailed curve fitting of spectra in the case of large dense spectra, when neither the location nor the number of mass peaks is known a priori. This time series method is directly applicable in the time lag optimization range, where point density per peak is constant. We present a systematic methodology to optimize the filters according to any desired figure of merit, illustrating the procedure by optimizing the signal-to-noise per unit bandwidth of matrix-assisted laser desorption/ionization (MALDI) data. We also introduce a nonlinear filter that reduces the spurious structure that often accompanies deconvolution filters. The net result of the application of these filters is that we can identify new structures in dense MALDI-TOF data, clearly showing small adducts to heavy biomolecules.
Affiliation(s)
- Dariya I Malyarenko
- Departments of Physics and Applied Science, College of William and Mary, Williamsburg, VA 23187-8795, USA.
|
21
|
Wolski WE, Lalowski M, Martus P, Herwig R, Giavalisco P, Gobom J, Sickmann A, Lehrach H, Reinert K. Transformation and other factors of the peptide mass spectrometry pairwise peak-list comparison process. BMC Bioinformatics 2005; 6:285. [PMID: 16318636 PMCID: PMC1343595 DOI: 10.1186/1471-2105-6-285] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 08/08/2005] [Accepted: 11/30/2005] [Indexed: 11/22/2022] Open
Abstract
Background: Biological mass spectrometry is used to analyse peptides and proteins. A mass spectrum yields a list of measured mass-to-charge ratios and intensities of ionised peptides, which is called a peak-list. In order to classify the underlying amino acid sequence, the acquired spectra are usually compared with synthetic ones. Development of suitable methods of direct peak-list comparison may be advantageous for many applications. Results: The pairwise peak-list comparison is a multistage process composed of matching of peaks embedded in two peak-lists, normalisation, scaling of peak intensities and dissimilarity measures. In our analysis, we focused on binary and intensity based measures. We have modified the measures to account for the mass spectrometry specific properties of mass measurement accuracy and non-matching peaks. We compared the labelling of peak-list pairs as being the same or different, obtained using different factors of the pairwise peak-list comparison, to the labelling determined by sequence database searches. In order to elucidate how these factors influence the peak-list comparison we adopted an analysis of variance type method with the partial area under the ROC curve as a dependent variable. Conclusion: The analysis of variance provides insight into the relevance of various factors influencing the outcome of the pairwise peak-list comparison. For large MS/MS and PMF data sets the outcome of the ANOVA was consistent, providing a strong indication that the results presented here might be valid for many types of peptide mass measurements.
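The first and last stages named in the abstract above (peak matching within the mass measurement accuracy, then a binary dissimilarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the greedy matching rule, the default 50 ppm tolerance and the Jaccard-style measure are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def match_peaks(a: Sequence[float], b: Sequence[float],
                tol_ppm: float) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching of two ascending m/z lists.

    Two peaks match when they differ by at most tol_ppm (relative to the
    peak in `a`). Hypothetical helper, not taken from the paper.
    """
    pairs = []
    i = j = 0
    while i < len(a) and j < len(b):
        tol = a[i] * tol_ppm * 1e-6  # absolute tolerance at this mass
        if abs(a[i] - b[j]) <= tol:
            pairs.append((i, j))
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] has no partner in b
        else:
            j += 1  # b[j] has no partner in a
    return pairs


def binary_dissimilarity(a: Sequence[float], b: Sequence[float],
                         tol_ppm: float = 50.0) -> float:
    """Jaccard-style measure: 1 - matched / (all distinct peaks).

    Non-matching peaks enlarge the denominator and so increase the
    dissimilarity. The tolerance default is an assumed value.
    """
    if not a and not b:
        return 0.0
    n = len(match_peaks(sorted(a), sorted(b), tol_ppm))
    return 1.0 - n / (len(a) + len(b) - n)


if __name__ == "__main__":
    x = [1000.0, 1500.0, 2000.0]
    y = [1000.02, 1500.2, 2500.0]
    print(match_peaks(x, y, 50.0))
    print(binary_dissimilarity(x, y))
```

Intensity based measures would additionally need the normalisation and scaling stages, which this sketch leaves out.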
Affiliation(s)
- Witold E Wolski
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
- Maciej Lalowski
- Max Delbrück Center for Molecular Medicine, Robert-Roessle-Str. 10, D-13125 Berlin-Buch, Germany
- Peter Martus
- Institute for Medical Informatics, Biometry and Epidemiology; Charite University Medicine Berlin, Hindenburgdamm 30 (HBD 30), 12200 Berlin
- Ralf Herwig
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
- Patrick Giavalisco
- Boyce Thompson Institute for Plant Research, Tower Road, Ithaca 14850, NY, USA
- Johan Gobom
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
- Albert Sickmann
- DFG Research Center for Experimental Biomedicine, University of Würzburg, Versbacherstr. 9, D-97078 Würzburg, Germany
- Hans Lehrach
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
- Knut Reinert
- Institute for Computer Science, Free University Berlin, Takustr. 9, D-14195 Berlin, Germany
|
22
|
Wolski WE, Lalowski M, Jungblut P, Reinert K. Calibration of mass spectrometric peptide mass fingerprint data without specific external or internal calibrants. BMC Bioinformatics 2005; 6:203. [PMID: 16102175 PMCID: PMC1199585 DOI: 10.1186/1471-2105-6-203] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 04/26/2005] [Accepted: 08/15/2005] [Indexed: 12/24/2022] Open
Abstract
Background: Peptide Mass Fingerprinting (PMF) is a widely used mass spectrometry (MS) method for the analysis of proteins and peptides. It relies on the comparison between experimentally determined and theoretical mass spectra. The PMF process requires calibration, usually performed with external or internal calibrants of known molecular masses. Results: We have introduced two novel MS calibration methods. The first method utilises the local similarity of peptide maps generated after separation of complex protein samples by two-dimensional gel electrophoresis. It computes a multiple peak-list alignment of the data set using a modified Minimum Spanning Tree (MST) algorithm. The second method exploits the idea that hundreds of MS samples are measured in parallel on one sample support. It improves the calibration coefficients by applying a two-dimensional Thin Plate Splines (TPS) smoothing algorithm. We studied the novel calibration methods utilising data generated by three different MALDI-TOF-MS instruments. We demonstrate that a PMF data set can be calibrated without resorting to external calibrants or relying on widely occurring internal calibrants. The methods developed here were implemented in R and are part of the BioConductor package mscalib available from http://www.bioconductor.org. Conclusion: The MST calibration algorithm is well suited to calibrate MS spectra of protein samples resulting from two-dimensional gel electrophoretic separation. The TPS based calibration algorithm might be used to correct systematic mass measurement errors observed for large MS sample supports. As compared to other methods, our combined MS spectra calibration strategy increases the peptide/protein identification rate by an additional 5-15%.
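For orientation, the conventional calibration that the abstract contrasts itself with (calibrants of known mass) can be sketched as an affine least-squares fit. This is explicitly not the MST or TPS method of the paper, and it is written in Python rather than taken from the R package mscalib. The function names and the affine model are assumptions made for illustration; in this picture the slope and intercept play the role of "calibration coefficients".

```python
from typing import Sequence, Tuple


def fit_affine(measured: Sequence[float],
               reference: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of reference ~ slope * measured + intercept.

    Hypothetical helper: a plain two-coefficient calibration using
    calibrant peaks whose true masses are known.
    """
    n = len(measured)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two paired calibrant masses")
    mx = sum(measured) / n
    my = sum(reference) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, reference))
    slope = sxy / sxx
    return slope, my - slope * mx


def ppm_error(measured: float, reference: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - reference) / reference * 1e6


if __name__ == "__main__":
    # Synthetic calibrants distorted by an assumed scale and offset error.
    reference = [1000.0, 2000.0, 3000.0]
    measured = [r * 1.0001 + 0.05 for r in reference]
    slope, intercept = fit_affine(measured, reference)
    for m, r in zip(measured, reference):
        corrected = slope * m + intercept
        print(f"{ppm_error(m, r):8.1f} ppm before, "
              f"{ppm_error(corrected, r):8.4f} ppm after")
```

The paper's contribution is to obtain or improve such coefficients without dedicated calibrants, by aligning peak-lists against each other or by smoothing coefficients across positions on the sample support.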
Affiliation(s)
- Witold E Wolski
- Max Planck Institute for Molecular Genetics, Ihnestraße 63–73, D-14195 Berlin, Germany
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
- Maciej Lalowski
- Max Delbrück Center for Molecular Medicine, Robert-Roessle-Str. 10, D-13125 Berlin-Buch, Germany
- Peter Jungblut
- Max Planck Institute for Infection Biology, Schumannstr. 21–22, D-10117 Berlin, Germany
- Knut Reinert
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
|
23
|
Magnin J, Masselot A, Menzel C, Colinge J. OLAV-PMF: a novel scoring scheme for high-throughput peptide mass fingerprinting. J Proteome Res 2004; 3:55-60. [PMID: 14998163 DOI: 10.1021/pr034055m] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Abstract
We propose a new type of probabilistic scoring scheme framework for protein identification from peptide masses. We first introduce the framework itself and explain its requirements. In a second part, we describe a particular implementation and test it on a data set of more than 8000 MALDI-TOF spectra with known contents. Doing so, we also compare its performance to two widely used scoring schemes, thereby demonstrating the potential of the proposed approach.
Affiliation(s)
- Jérôme Magnin
- Geneprot Inc, 2 rue Pré-de-la-Fontaine, CH-1217 Meyrin, Switzerland
|
24
|
|
25
|
Havilio M, Haddad Y, Smilansky Z. Intensity-based statistical scorer for tandem mass spectrometry. Anal Chem 2003; 75:435-44. [PMID: 12585468 DOI: 10.1021/ac0258913] [Citation(s) in RCA: 140] [Impact Index Per Article: 6.7] [Indexed: 11/28/2022]
Abstract
We describe a new statistical scorer for tandem mass spectrometry. The scorer is based on the probability that fragments with given chemical properties create measured intensity levels in the experimental spectrum. The scorer's parameters are computed using a fully automated procedure. Benchmarking the new scorer on a large set of experimental spectra, we show that it performs significantly better than the widely used cross-correlation scoring algorithm of Eng et al. (Eng, J. K.; McCormack, A. L.; Yates, J. R. J. Am. Soc. Mass Spectrom. 1994, 5, 976-989).
Affiliation(s)
- Moshe Havilio
- Compugen Limited, 72 Pinhas Rozen Street, Tel Aviv, 69512 Israel.
|
26
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2003. [PMCID: PMC2448450 DOI: 10.1002/cfg.228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|