Cardoso LS, Suissas CE, Ramirez M, Antunes M, Pinto FR. Comparison of alternative mixture model methods to analyze bacterial CGH experiments with multi-genome arrays.
BMC Res Notes 2014;
7:148. [PMID:
24629208 PMCID:
PMC3995598 DOI:
10.1186/1756-0500-7-148]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2013] [Accepted: 03/10/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND
Microarray-based comparative genomic hybridization (aCGH) is used for rapid comparison of genomes of different bacterial strains. The purpose is to evaluate the distribution of genes from sequenced bacterial strains (control) among unsequenced strains (test). We previously compared the use of single strain versus multiple strain control with arrays covering multiple genomes. The conclusion was that a multiple strain control promoted a better separation of signals between present and absent genes.
FINDINGS
We now extend our previous study by applying the Expectation-Maximization (EM) algorithm to fit a mixture model to the signal distribution in order to classify each gene as present or absent and by comparing different methods for analyzing aCGH data, using combinations of different control strain choices, two different statistical mixture models, with or without normalization, with or without logarithm transformation and with test-over-control or inverse signal ratio calculation. We also assessed the impact of replication on classification accuracy. Higher values of accuracy have been achieved using the ratio of control-over-test intensities, without logarithmic transformation and with a strain mix control. Normalization and the type of mixture model fitted by the EM algorithm did not have a significant impact on classification accuracy. Similarly, using the average of replicate arrays to perform the classification does not significantly improve the results.
CONCLUSIONS
Our work provides a guiding benchmark comparison of alternative methods to analyze aCGH results that can impact on the analysis of currently ongoing comparative genomic projects or in the re-analysis of published studies.
Collapse