Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Spencer M, Sangaralingam A. A phylogenetic mixture model for gene family loss in parasitic bacteria. Mol Biol Evol 2009;26:1901-8. [PMID: 19435739 DOI: 10.1093/molbev/msp102] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]

Cited by Other Article(s)

Mutua TM, Kulohoma BW. Differences in genetic flux in invasive Streptococcus pneumoniae associated with bacteraemia and meningitis. Heliyon 2022;8:e12229. [PMID: 36593853 PMCID: PMC9803773 DOI: 10.1016/j.heliyon.2022.e12229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2021] [Revised: 11/07/2022] [Accepted: 11/30/2022] [Indexed: 12/23/2022]

Fukunaga T, Iwasaki W. Mirage: estimation of ancestral gene-copy numbers by considering different evolutionary patterns among gene families. Bioinformatics Advances 2021;1:vbab014. [PMID: 36700099 PMCID: PMC9710636 DOI: 10.1093/bioadv/vbab014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2021] [Revised: 07/22/2021] [Accepted: 07/28/2021] [Indexed: 01/28/2023]

Croce G, Gueudré T, Ruiz Cuevas MV, Keidel V, Figliuzzi M, Szurmant H, Weigt M. A multi-scale coevolutionary approach to predict interactions between protein domains. PLoS Comput Biol 2019;15:e1006891. [PMID: 31634362 PMCID: PMC6822775 DOI: 10.1371/journal.pcbi.1006891] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 02/19/2019] [Revised: 10/31/2019] [Accepted: 09/27/2019] [Indexed: 11/18/2022]

Estimation of Gene Insertion/Deletion Rates with Missing Data. Genetics 2016;204:513-529. [PMID: 27565162 DOI: 10.1534/genetics.116.191973] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/24/2016] [Accepted: 08/17/2016] [Indexed: 11/18/2022]

Zamani-Dahaj SA, Okasha M, Kosakowski J, Higgs PG. Estimating the Frequency of Horizontal Gene Transfer Using Phylogenetic Models of Gene Gain and Loss. Mol Biol Evol 2016;33:1843-57. [DOI: 10.1093/molbev/msw062] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022]

Kim T, Hao W. DiscML: an R package for estimating evolutionary rates of discrete characters using maximum likelihood. BMC Bioinformatics 2014;15:320. [PMID: 25260628 PMCID: PMC4261585 DOI: 10.1186/1471-2105-15-320] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 07/26/2014] [Accepted: 09/25/2014] [Indexed: 11/17/2022]

Horizontal transfer and gene conversion as an important driving force in shaping the landscape of mitochondrial introns. G3: Genes, Genomes, Genetics 2014;4:605-12. [PMID: 24515269 PMCID: PMC4059233 DOI: 10.1534/g3.113.009910] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Indexed: 11/24/2022]

Cohen O, Ashkenazy H, Levy Karin E, Burstein D, Pupko T. CoPAP: Coevolution of presence-absence patterns. Nucleic Acids Res 2013;41:W232-7. [PMID: 23748951 PMCID: PMC3692100 DOI: 10.1093/nar/gkt471] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]

Meinel T, Krause A. Meta-analysis of general bacterial subclades in whole-genome phylogenies using tree topology profiling. Evol Bioinform Online 2012;8:489-525. [PMID: 22915837 PMCID: PMC3422217 DOI: 10.4137/ebo.s9642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]

Abstract

In the last two decades, a large number of whole-genome phylogenies have been inferred to reconstruct the Tree of Life (ToL). Underlying data models range from gene or functionality content in species to phylogenetic gene family trees and multiple sequence alignments of concatenated protein sequences. Diversity in data models together with the use of different tree reconstruction techniques, disruptive biological effects and the steadily increasing number of genomes have led to a huge diversity in published phylogenies. Comparison of those and, moreover, identification of the impact of inference properties (underlying data model, inference technique) on particular reconstructions is almost impossible. In this work, we introduce tree topology profiling as a method to compare already published whole-genome phylogenies. This method requires visual determination of the particular topology in a drawn whole-genome phylogeny for a set of particular bacterial clans. For each clan, neighborhoods to other bacteria are collected into a catalogue of generalized alternative topologies. Particular topology alternatives found for an ordered list of bacterial clans reveal a topology profile that represents the analyzed phylogeny. To simulate the inhomogeneity of published gene content phylogenies, we generate a set of seven phylogenies using different inference techniques and the SYSTERS-PhyloMatrix data model. After tree topology profiling of a total of 54 selected published and newly inferred phylogenies, we separate artefactual from biologically meaningful phylogenies and associate particular inference results (phylogenies) with inference background (inference techniques as well as data models). Topological relationships of particular bacterial species groups are presented. With this work we introduce tree topology profiling into the scientific field of comparative phylogenomics.

Cohen O, Pupko T. Inference of gain and loss events from phyletic patterns using stochastic mapping and maximum parsimony--a simulation study. Genome Biol Evol 2011;3:1265-75. [PMID: 21971516 PMCID: PMC3215202 DOI: 10.1093/gbe/evr101] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Accepted: 09/27/2011] [Indexed: 12/26/2022]

Abstract

Bacterial evolution is characterized by frequent gain and loss events of gene families. These events can be inferred from phyletic pattern data, a compact representation of gene family repertoire across multiple genomes. The maximum parsimony paradigm is a classical and prevalent approach for the detection of gene family gains and losses mapped on specific branches. We and others have previously developed probabilistic models that aim to account for the gain and loss stochastic dynamics. These models are a critical component of a methodology termed stochastic mapping, in which probabilities and expectations of gain and loss events are estimated for each branch of an underlying phylogenetic tree. In this work, we present a phyletic pattern simulator in which the gain and loss dynamics are assumed to follow a continuous-time Markov chain along the tree. Various models and options are implemented to make the simulation software useful for a large number of studies in which binary (presence/absence) data are analyzed. Using this simulation software, we compared the ability of the maximum parsimony and the stochastic mapping approaches to accurately detect gain and loss events along the tree. Our simulations cover a large array of evolutionary scenarios in terms of the propensities for gene family gains and losses and the variability of these propensities among gene families. Although in all simulation schemes, both methods obtain relatively low levels of false positive rates, stochastic mapping outperforms maximum parsimony in terms of true positive rates. We further studied the factors that influence the performance of both methods. We find, for example, that the accuracy of maximum parsimony inference is substantially reduced when the goal is to map gain and loss events along internal branches of the phylogenetic tree. Furthermore, the accuracy of stochastic mapping is reduced with smaller data sets (limited number of gene families) due to unreliable estimation of branch lengths. Our simulator and simulation results are additionally relevant for the analysis of other types of binary-coded data, such as the existence of homologous restriction sites, gaps, and introns, to name a few. Both the simulation software and the inference methodology are freely available at a user-friendly server: http://gloome.tau.ac.il/.
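
As a rough illustration of the kind of model this abstract describes, the sketch below simulates one presence/absence (phyletic) pattern under a two-state continuous-time Markov chain along a tree. It is not the GLOOME simulator: the toy tree, the gain and loss rates, and the names transition_probs and simulate are all hypothetical, and a single rate pair is assumed for the whole tree with no rate variation among gene families.

```python
import math
import random


def transition_probs(gain, loss, t):
    """2x2 transition matrix P(t) for states 0 (absent) and 1 (present).

    Uses the closed form for a two-state chain with rate `gain` for 0->1
    and rate `loss` for 1->0.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    p01 = gain / total * (1.0 - decay)
    p10 = loss / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def simulate(node, parent_state, gain, loss, rng, out):
    """Evolve a state down one branch, then recurse into the children.

    node is a tuple (name, branch_length, children); leaf states are
    written into the dict `out`, keyed by leaf name.
    """
    name, length, children = node
    probs = transition_probs(gain, loss, length)
    state = 1 if rng.random() < probs[parent_state][1] else 0
    if not children:
        out[name] = state
    for child in children:
        simulate(child, state, gain, loss, rng, out)
    return out


# Hypothetical three-taxon tree; the root has branch length 0.
tree = ("root", 0.0, [
    ("A", 0.3, []),
    ("inner", 0.1, [("B", 0.2, []), ("C", 0.2, [])]),
])

gain_rate, loss_rate = 0.5, 1.5  # assumed values, for illustration only
rng = random.Random(0)

# Draw the root state from the stationary distribution gain / (gain + loss).
root_state = 1 if rng.random() < gain_rate / (gain_rate + loss_rate) else 0
pattern = simulate(tree, root_state, gain_rate, loss_rate, rng, {})
print(pattern)
```

Repeating the last three steps once per gene family would give a matrix of phyletic patterns, which is the kind of input that parsimony or stochastic mapping methods take.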

Cohen O, Gophna U, Pupko T. The Complexity Hypothesis Revisited: Connectivity Rather Than Function Constitutes a Barrier to Horizontal Gene Transfer. Mol Biol Evol 2010;28:1481-9. [DOI: 10.1093/molbev/msq333] [Citation(s) in RCA: 146] [Impact Index Per Article: 9.7] [Indexed: 11/13/2022]

Sangaralingam A, Susko E, Bryant D, Spencer M. On the artefactual parasitic eubacteria clan in conditioned logdet phylogenies: heterotachy and ortholog identification artefacts as explanations. BMC Evol Biol 2010;10:343. [PMID: 21062453 PMCID: PMC2992526 DOI: 10.1186/1471-2148-10-343] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 03/07/2010] [Accepted: 11/09/2010] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Phylogenetic reconstruction methods based on gene content often place all the parasitic and endosymbiotic eubacteria (parasites for short) together in a clan. Many other lines of evidence point to this parasites clan being an artefact. This artefact could be a consequence of the methods used to construct ortholog databases (due to some unknown bias), the methods used to estimate the phylogeny, or both. We test the idea that the parasites clan is an ortholog identification artefact by analyzing three different ortholog databases (COG, TRIBES, and OFAM), which were constructed using different methods, and are thus unlikely to share the same biases. In each case, we estimate a phylogeny using an improved version of the conditioned logdet distance method. If the parasites clan appears in trees from all three databases, it is unlikely to be an ortholog identification artefact. Accelerated loss of a subset of gene families in parasites (a form of heterotachy) may contribute to the difficulty of estimating a phylogeny from gene content data. We test the idea that heterotachy is the underlying reason for the estimation of an artefactual parasites clan by applying two different mixture models (phylogenetic and non-phylogenetic), in combination with conditioned logdet. In these models, there are two categories of gene families, one of which has accelerated loss in parasites. Distances are estimated separately from each category by conditioned logdet. This should reduce the tendency for tree estimation methods to group the parasites together, if heterotachy is the underlying reason for estimation of the parasites clan.

RESULTS

The parasites clan appears in conditioned logdet trees estimated from all three databases. This makes it less likely to be an artefact of database construction. The non-phylogenetic mixture model gives trees without a parasites clan. However, the phylogenetic mixture model still results in a tree with a parasites clan. Thus, it is not entirely clear whether heterotachy is the underlying reason for the estimation of a parasites clan. Simulation studies suggest that the phylogenetic mixture model approach may be unsuccessful because the model of gene family gain and loss it uses does not adequately describe the real data.

CONCLUSIONS

The most successful methods for estimating a reliable phylogenetic tree for parasitic and endosymbiotic eubacteria from gene content data are still ad hoc approaches such as the SHOT distance method. However, the improved conditioned logdet method we developed here may be useful for non-parasites and can be accessed at http://www.liv.ac.uk/~cgrbios/cond_logdet.html.

Cohen O, Ashkenazy H, Belinky F, Huchon D, Pupko T. GLOOME: gain loss mapping engine. Bioinformatics 2010;26:2914-5. [PMID: 20876605 DOI: 10.1093/bioinformatics/btq549] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Indexed: 11/14/2022]

Inferring bacterial genome flux while considering truncated genes. Genetics 2010;186:411-26. [PMID: 20551435 DOI: 10.1534/genetics.110.118448] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/22/2023]

Cohen O, Pupko T. Inference and characterization of horizontally transferred gene families using stochastic mapping. Mol Biol Evol 2009;27:703-13. [PMID: 19808865 PMCID: PMC2822287 DOI: 10.1093/molbev/msp240] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Indexed: 11/23/2022]