1. Wang Y, Zhou L, Wu L, Song C, Ma X, Xu S, Du T, Li X, Li J. Evolutionary ecology of the visual opsin gene sequence and its expression in turbot (Scophthalmus maximus). BMC Ecol Evol 2021; 21:114. [PMID: 34098879 PMCID: PMC8186084 DOI: 10.1186/s12862-021-01837-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/23/2020] [Accepted: 05/24/2021] [Indexed: 11/27/2022]
Abstract
Background: As flatfish, turbot undergo metamorphosis as part of their life cycle. In the larval stage, turbot live at the ocean surface, but after metamorphosis they move to deeper water and adopt a benthic lifestyle. Thus, the light environment differs greatly between life stages. The visual system plays an important role in organic evolution, but reports on the relationship between the visual system and benthic life are rare. In this study, we report a molecular and evolutionary analysis of opsin genes in turbot and the heterochronic shifts in opsin expression during development. Results: Our gene synteny analysis showed that subtype RH2C was not in the same gene cluster as the other four green-sensitive opsin genes (RH2) in turbot; it was translocated from chromosome 6 to chromosome 8. Based on branch-site tests and spectral tuning site analyses, the E122Q and M207L substitutions in RH2C, which were found to be under positive selection, are closely related to the blue shift of optimum light sensitivities. Real-time PCR results indicated that the dominant opsin gene shifted from red-sensitive (LWS) to RH2B1 during turbot development, which may shift spectral sensitivity toward shorter wavelengths. Conclusions: This is the first report that RH2C may be an important subtype of green opsin gene that was retained by turbot and possibly other flatfish species during evolution. Moreover, the E122Q and M207L substitutions in RH2C may contribute to the survival of turbot in the bluish ocean. Heterochronic shifts in opsin expression may also be an important strategy for turbot to adapt to benthic life. Supplementary Information: The online version contains supplementary material available at 10.1186/s12862-021-01837-2.
Affiliation(s)
- Yunong Wang
- College of Fisheries, Ocean University of China, Qingdao, 266003, People's Republic of China; CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China; Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao, 266071, PR China
- Li Zhou
- CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China; Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao, 266071, PR China
- Lele Wu
- CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China; Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao, 266071, PR China
- Changbin Song
- Institute of Semiconductors, Chinese Academy of Sciences, Beijing, 100083, People's Republic of China
- Xiaona Ma
- CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China
- Shihong Xu
- CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China
- Tengfei Du
- CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China; Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao, 266071, PR China
- Xian Li
- College of Fisheries, Ocean University of China, Qingdao, 266003, People's Republic of China; CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China
- Jun Li
- CAS Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, People's Republic of China; Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao, 266071, PR China
2. Posada D, Crandall KA. Felsenstein Phylogenetic Likelihood. J Mol Evol 2021; 89:134-145. [PMID: 33438113 PMCID: PMC7803665 DOI: 10.1007/s00239-020-09982-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/28/2020] [Accepted: 12/07/2020] [Indexed: 01/08/2023]
Abstract
In 1981, the Journal of Molecular Evolution (JME) published an article entitled “Evolutionary trees from DNA sequences: A maximum likelihood approach” by Joseph (Joe) Felsenstein (J Mol Evol 17:368–376, 1981). This groundbreaking work laid the foundation for the emerging field of statistical phylogenetics, providing a tractable way of finding maximum likelihood (ML) estimates of evolutionary trees from DNA sequence data. This paper is the second most cited (more than 9000 citations) in JME after Kimura’s (J Mol Evol 16:111–120, 1980) seminal paper on a model of nucleotide substitution (with nearly 20,000 citations). On the occasion of the 50th anniversary of JME, we elaborate on the significance of Felsenstein’s ML approach to estimating phylogenetic trees.
Affiliation(s)
- David Posada
- CINBIO, Universidade de Vigo, 36310, Vigo, Spain; Department of Biochemistry, Genetics, and Immunology, Universidade de Vigo, 36310, Vigo, Spain; Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Keith A Crandall
- Computational Biology Institute and Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA; Department of Biostatistics & Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
3. Chen YH, Wang H. Exploring Diversity of COVID-19 Based on Substitution Distance. Infect Drug Resist 2020; 13:3887-3894. [PMID: 33149633 PMCID: PMC7605616 DOI: 10.2147/idr.s277620] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/19/2020] [Accepted: 10/09/2020] [Indexed: 11/24/2022]
Abstract
Background: The number of COVID-19 infections worldwide has reached 10 million. COVID-19, caused by SARS-CoV-2, is more contagious than SARS-CoV-1. There is a dispute about the origin of COVID-19. Studies have shown that all SARS-CoV-2 sequences around the world share a common ancestor dating to around the end of 2019. Methods: Virus sequences from COVID-19 samples collected early should be less diverse than those from samples collected later, because there may be more mutations as the virus evolves over time. The diversity of virus nucleotide sequences can be measured by the nucleotide substitution distance. To explore the diversity of SARS-CoV-2, we use different nucleotide substitution models to calculate the distances of SARS-CoV-2 samples from three areas: China, Europe, and the USA. Then, we use these distances to infer the origin of COVID-19. Results: It is known that COVID-19 originated in Wuhan, China, and then spread to Europe and the USA. Under the different substitution models, the distances of SARS-CoV-2 samples from these areas are significantly different; by ANOVA testing, the p-value is less than 2.2e-16. The results under most substitution models show that China has the lowest diversity, followed by Europe and lastly by the USA. This outcome coincides with the order of transmission, in which SARS-CoV-2 started in China, then broke out in Europe and finally in the USA. Conclusion: The magnitude of the nucleotide substitution distance of SARS-CoV-2 is closely related to the transmission time order of SARS-CoV-2. This outcome suggests that the nucleotide substitution distance of SARS-CoV-2 may be used to infer the origin of COVID-19.
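As a rough illustration of the substitution-distance idea described in the Methods above, the following Python sketch computes two standard model-based distances, Jukes-Cantor (JC69) and Kimura two-parameter (K80), for a pair of aligned sequences. It is not the authors' pipeline: the abstract does not say which models or software were used, and the function names here are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def _site_pairs(seq_a, seq_b):
    """Return aligned site pairs, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return pairs


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3), p = fraction of differing sites."""
    pairs = _site_pairs(seq_a, seq_b)
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def k80_distance(seq_a, seq_b):
    """Kimura 2-parameter distance from transition (P) and transversion (Q) fractions."""
    pairs = _site_pairs(seq_a, seq_b)
    n = len(pairs)
    diffs = [(a, b) for a, b in pairs if a != b]
    # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
    transitions = sum({a, b} <= PURINES or {a, b} <= PYRIMIDINES for a, b in diffs)
    P = transitions / n
    Q = (len(diffs) - transitions) / n
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences too divergent for the K80 correction")
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
```

Averaging such pairwise distances within each regional sample set would give one diversity figure per region and model, which is the kind of quantity the abstract compares across China, Europe, and the USA.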
Affiliation(s)
- Yi-Hau Chen
- Institute of Statistical Science, Academia Sinica, Nankang, Taipei, Taiwan
- Hsiuying Wang
- Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan
4. Acharya A, Fonsah JY, Mbanya D, Njamnshi AK, Kanmogne GD. Near-Full-Length Genetic Characterization of a Novel HIV-1 Unique Recombinant with Similarities to A1, CRF01_AE, and CRFO2_AG Viruses in Yaoundé, Cameroon. AIDS Res Hum Retroviruses 2019; 35:762-768. [PMID: 30860392 DOI: 10.1089/aid.2019.0042] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/08/2023]
Abstract
Variations in the HIV genome influence HIV/AIDS epidemiology. We report here a novel HIV-1 unique recombinant form (URF) isolated from an HIV-infected female (NACMR092) in Cameroon, based on analyses of the near-full-length viral genome (partial gag, full-length pol, env, tat, rev, vif, vpr, vpu, and nef genes, and partial 3'-long terminal repeat). Phylogeny, recombination breakpoint, and recombination map analyses showed that NACMR092 was infected with a mosaic URF that had eight breakpoints (two in gag, one in pol, one in vpr, two in env, and two in the nef regions), nine subgenomic regions, and included fragments that had important similarities with HIV-1 subtypes A1, CRF02_AG, and CRF01_AE. This novel mosaic URF underscores complex recombination events occurring between HIV-1 subtypes circulating in Cameroon. Continued monitoring and detection of such recombinants and accurate classification of HIV genotype are important for tracking viral molecular epidemiology and antigenic diversity.
Affiliation(s)
- Arpan Acharya
- Department of Pharmacology and Experimental Neuroscience, College of Medicine, University of Nebraska Medical Center, Omaha, Nebraska
- Julius Y. Fonsah
- Faculty of Medicine and Biomedical Sciences, University of Yaoundé I, Yaoundé, Cameroon
- Department of Neurology, Yaoundé Central Hospital, Yaoundé, Cameroon
- Dora Mbanya
- Faculty of Medicine and Biomedical Sciences, University of Yaoundé I, Yaoundé, Cameroon
- Yaoundé University Teaching Hospital, Department of Haematology, Yaoundé, Cameroon
- Alfred K. Njamnshi
- Faculty of Medicine and Biomedical Sciences, University of Yaoundé I, Yaoundé, Cameroon
- Department of Neurology, Yaoundé Central Hospital, Yaoundé, Cameroon
- Brain Research Africa Initiative (BRAIN), Yaoundé, Cameroon
- Georgette D. Kanmogne
- Department of Pharmacology and Experimental Neuroscience, College of Medicine, University of Nebraska Medical Center, Omaha, Nebraska
5. Dysgonomonas massiliensis sp. nov., a new species isolated from the human gut and its taxonogenomic description. Antonie van Leeuwenhoek 2019; 112:935-945. [DOI: 10.1007/s10482-019-01227-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/01/2018] [Accepted: 01/05/2019] [Indexed: 12/22/2022]
6. Mishra P, Kumar A, Sivaraman G, Shukla AK, Kaliamoorthy R, Slater A, Velusamy S. Character-based DNA barcoding for authentication and conservation of IUCN Red listed threatened species of genus Decalepis (Apocynaceae). Sci Rep 2017; 7:14910. [PMID: 29097709 PMCID: PMC5668324 DOI: 10.1038/s41598-017-14887-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 01/11/2017] [Accepted: 10/09/2017] [Indexed: 11/09/2022]
Abstract
The steno-endemic species of genus Decalepis are highly threatened by destructive wild harvesting. The medicinally important fleshy tuberous roots of Decalepis hamiltonii are traded as a substitute to meet international market demand for Hemidesmus indicus. In addition, the tuberous roots of all three species of Decalepis possess similar exudates and texture, which challenges the ability of conventional techniques alone to perform accurate species authentication. This study was undertaken to generate DNA barcodes that could be utilized in monitoring and curtailing the illegal trade of these endangered species. The DNA barcode reference library was developed on the BOLD database platform for the candidate barcodes rbcL, matK, psbA-trnH, ITS and ITS2. The average intra-specific variations (0-0.27%) were less than the distance to the nearest neighbour (0.4-11.67%) with matK and ITS. Anchoring the coding region rbcL in a multigene tiered approach, the combination rbcL + matK + ITS yielded 100% species resolution, using the least number of loci combinations with either PAUP or BLOG methods to support a character-based approach. A species-specific SNP position (230 bp) in the matK region that is characteristic of D. hamiltonii could be used to design specific assays, enhancing its applicability for direct use in CITES enforcement for distinguishing it from H. indicus.
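The comparison of intra-specific variation with nearest-neighbour distance mentioned above is commonly called a "barcode gap" check. A minimal Python sketch follows, assuming uncorrected p-distances on pre-aligned sequences and using the stricter maximum (rather than average) intra-specific distance; the function names and input layout are hypothetical and are not taken from the study or from BOLD.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance, as a percentage of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def barcode_gap(specimens):
    """specimens: list of (species_label, aligned_sequence) tuples.

    Returns {species: (max_intraspecific, nearest_neighbour, has_gap)}, where
    has_gap is True when every within-species distance is smaller than the
    distance to the closest specimen of any other species.
    """
    max_intra = {}
    nearest = {}
    for (sp1, s1), (sp2, s2) in combinations(specimens, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                nearest[sp] = min(nearest.get(sp, float("inf")), d)
    return {sp: (max_intra.get(sp, 0.0), nn, max_intra.get(sp, 0.0) < nn)
            for sp, nn in nearest.items()}
```

With the ranges reported in the abstract, the largest average intra-specific variation (0.27%) is below the smallest nearest-neighbour distance (0.4%), which is the condition `has_gap` encodes.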
Affiliation(s)
- Priyanka Mishra
- Plant Biology and Systematics, CSIR - Central Institute of Medicinal and Aromatic Plants, Research Center, Allalsandra, GKVK Post, Bengaluru, 560065, Karnataka, India
- Amit Kumar
- Plant Biology and Systematics, CSIR - Central Institute of Medicinal and Aromatic Plants, Research Center, Allalsandra, GKVK Post, Bengaluru, 560065, Karnataka, India
- Gokul Sivaraman
- Plant Biology and Systematics, CSIR - Central Institute of Medicinal and Aromatic Plants, Research Center, Allalsandra, GKVK Post, Bengaluru, 560065, Karnataka, India
- Ashutosh K Shukla
- Biotechnology Division, CSIR - Central Institute of Medicinal and Aromatic Plants, P.O. CIMAP, Lucknow, 226015, Uttar Pradesh, India
- Ravikumar Kaliamoorthy
- School of Conservation, TransDisciplinary University, 74/2, Jarakabande Kaval, Post Attur, Via Yelahanka, Bangalore, 560064, Karnataka, India
- Adrian Slater
- Biomolecular Technology Group, Faculty of Health and Life Sciences, De Montfort University, Leicester, LE1 9BH, UK
- Sundaresan Velusamy
- Plant Biology and Systematics, CSIR - Central Institute of Medicinal and Aromatic Plants, Research Center, Allalsandra, GKVK Post, Bengaluru, 560065, Karnataka, India
7. Francisella-Like Endosymbionts and Rickettsia Species in Local and Imported Hyalomma Ticks. Appl Environ Microbiol 2017; 83:AEM.01302-17. [PMID: 28710265 DOI: 10.1128/aem.01302-17] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 06/11/2017] [Accepted: 07/03/2017] [Indexed: 01/20/2023]
Abstract
Hyalomma ticks (Acari: Ixodidae) are hosts for Francisella-like endosymbionts (FLE) and may serve as vectors of zoonotic disease agents. This study aimed to provide an initial characterization of the interaction between Hyalomma and FLE and to determine the prevalence of pathogenic Rickettsia in these ticks. Hyalomma marginatum, Hyalomma rufipes, Hyalomma dromedarii, Hyalomma aegyptium, and Hyalomma excavatum ticks, identified morphologically and molecularly, were collected from different hosts and locations representing the distribution of the genus Hyalomma in Israel, as well as from migratory birds. A high prevalence of FLE was found in all Hyalomma species (90.6%), as well as efficient maternal transmission of FLE (91.8%), and the localization of FLE in Malpighian tubules, ovaries, and salivary glands in H. marginatum. Furthermore, we demonstrated strong cophylogeny between FLE and their host species. Contrary to FLE, the prevalence of Rickettsia ranged from 2.4% to 81.3% and was significantly different between Hyalomma species, with a higher prevalence in ticks collected from migratory birds. Using ompA gene sequences, most of the Rickettsia spp. were similar to Rickettsia aeschlimannii, while a few were similar to Rickettsia africae of the spotted fever group (SFG). Given their zoonotic importance, 249 ticks were tested for Crimean Congo hemorrhagic fever virus infection, and all were negative. The results imply that Hyalomma and FLE have obligatory symbiotic interactions, indicating a potential SFG Rickettsia zoonosis risk. A further understanding of the possible influence of FLE on Hyalomma development, as well as on its infection with Rickettsia pathogens, may lead to novel ways to control tick-borne zoonoses.
IMPORTANCE: This study shows that Francisella-like endosymbionts were ubiquitous in Hyalomma, were maternally transmitted, and cospeciated with their hosts. These findings imply that the interaction between FLE and Hyalomma is of an obligatory nature. It provides an example of an integrative taxonomy approach to simply differentiate among species infesting the same host and to identify nymphal and larval stages to be used in further studies. In addition, it shows the potential of imported Hyalomma ticks to serve as a vector for spotted fever group rickettsiae. The information gathered in this study can be further implemented in the development of symbiont-based disease control strategies for the benefit of human health.
8. Rogers J, Fishberg A, Youngs N, Wu YC. Reconciliation feasibility in the presence of gene duplication, loss, and coalescence with multiple individuals per species. BMC Bioinformatics 2017; 18:292. [PMID: 28583091 PMCID: PMC5460407 DOI: 10.1186/s12859-017-1701-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/27/2016] [Accepted: 05/22/2017] [Indexed: 11/20/2022]
Abstract
BACKGROUND: In phylogenetics, we often seek to reconcile gene trees with species trees within the framework of an evolutionary model. While the most popular models for eukaryotic species allow for only gene duplication and gene loss or only multispecies coalescence, recent work has combined these phenomena through a reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes the duplication-loss and coalescent history of a gene family. However, the LCT makes the simplifying assumption that only one individual is sampled per species whereas, with advances in gene sequencing, we now have access to multiple samples per species. RESULTS: We demonstrate that with these additional samples, there exist gene tree topologies that are impossible to reconcile with any species tree. In particular, the multiple samples enforce new constraints on the placement of duplications within a valid reconciliation. To model these constraints, we extend the LCT to a new structure, the partially labeled coalescent tree (PLCT), and demonstrate how to use the PLCT to evaluate the feasibility of a gene tree topology. We apply our algorithm to two clades of apes and flies to characterize possible sources of infeasibility. CONCLUSION: Going forward, we believe that this model represents a first step towards understanding reconciliations in duplication-loss-coalescence models with multiple samples per species.
Affiliation(s)
- Jennifer Rogers
- Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Andrew Fishberg
- Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Nora Youngs
- Department of Mathematics, Harvey Mudd College, Claremont, 91711, California, USA
- Current Address: Department of Mathematics and Statistics, Colby College, Waterville, 04901, Maine, USA
- Yi-Chieh Wu
- Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
9. Nallabelli N, Patil PP, Pal VK, Singh N, Jain A, Patil PB, Grover V, Korpole S. Biochemical and genome sequence analyses of Megasphaera sp. strain DISK18 from dental plaque of a healthy individual reveals commensal lifestyle. Sci Rep 2016; 6:33665. [PMID: 27651180 PMCID: PMC5030485 DOI: 10.1038/srep33665] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 05/26/2016] [Accepted: 08/30/2016] [Indexed: 11/08/2022]
Abstract
Much of the work in periodontal microbiology in recent years has focused on identifying and understanding periodontal pathogens. As the majority of oral microbes have not yet been isolated in pure form, it is essential to understand the phenotypic characteristics of microbes to decipher their role in the oral environment. In this study, strain DISK18 was isolated from the gingival sulcus and identified as a Megasphaera species. Although metagenomics studies revealed Megasphaera species as a major group within the oral habitat, they have never been isolated in cultivable form to date. Therefore, we have characterized the DISK18 strain to better understand its role in the periodontal ecosystem. Megasphaera sp. strain DISK18 displayed the ability to adhere and self-aggregate, which are requisite features for inhabiting and persisting in the oral cavity. It also coaggregated with other pioneer oral colonizers like Streptococcus and Lactobacillus species but not with Veillonella. This behaviour points towards its role as an early colonizer in the ecological succession of a multispecies biofilm. The absence of virulence-determining genes, as observed in whole genome sequence analysis, coupled with an inability to degrade collagen, reveals that Megasphaera sp. strain DISK18 is likely not a pathogenic species and emphasizes its commensal lifestyle.
Affiliation(s)
- Namrata Singh
- CSIR-Institute of Microbial Technology, Chandigarh, India
- Ashish Jain
- Dr. Harvansh Singh Judge Institute of Dental Sciences and Hospital, Panjab University, Chandigarh, India
- Vishakha Grover
- Dr. Harvansh Singh Judge Institute of Dental Sciences and Hospital, Panjab University, Chandigarh, India
- Suresh Korpole
- CSIR-Institute of Microbial Technology, Chandigarh, India
10. Rodriguez-Flores JL, Fakhro K, Agosto-Perez F, Ramstetter MD, Arbiza L, Vincent TL, Robay A, Malek JA, Suhre K, Chouchane L, Badii R, Al-Nabet Al-Marri A, Abi Khalil C, Zirie M, Jayyousi A, Salit J, Keinan A, Clark AG, Crystal RG, Mezey JG. Indigenous Arabs are descendants of the earliest split from ancient Eurasian populations. Genome Res 2016; 26:151-62. [PMID: 26728717 PMCID: PMC4728368 DOI: 10.1101/gr.191478.115] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 03/13/2015] [Accepted: 12/15/2015] [Indexed: 12/26/2022]
Abstract
An open question in the history of human migration is the identity of the earliest Eurasian populations that have left contemporary descendants. The Arabian Peninsula was the initial site of the out-of-Africa migrations that occurred between 125,000 and 60,000 yr ago, leading to the hypothesis that the first Eurasian populations were established on the Peninsula and that contemporary indigenous Arabs are direct descendants of these ancient peoples. To assess this hypothesis, we sequenced the entire genomes of 104 unrelated natives of the Arabian Peninsula at high coverage, including 56 of indigenous Arab ancestry. The indigenous Arab genomes defined a cluster distinct from other ancestral groups, and these genomes showed clear hallmarks of an ancient out-of-Africa bottleneck. Similar to other Middle Eastern populations, the indigenous Arabs had higher levels of Neanderthal admixture compared to Africans but had lower levels than Europeans and Asians. These levels of Neanderthal admixture are consistent with an early divergence of Arab ancestors after the out-of-Africa bottleneck but before the major Neanderthal admixture events in Europe and other regions of Eurasia. When compared to worldwide populations sampled in the 1000 Genomes Project, although the indigenous Arabs had a signal of admixture with Europeans, they clustered in a basal, outgroup position to all 1000 Genomes non-Africans when considering pairwise similarity across the entire genome. These results place indigenous Arabs as the most distant relatives of all other contemporary non-Africans and identify these people as direct descendants of the first Eurasian populations established by the out-of-Africa migrations.
Affiliation(s)
- Juan L Rodriguez-Flores
- Department of Genetic Medicine, Weill Cornell Medical College, New York, New York 10065, USA
- Khalid Fakhro
- Sidra Medical and Research Center, Doha, Qatar; Department of Genetic Medicine, Weill Cornell Medical College-Qatar, Doha, Qatar
- Francisco Agosto-Perez
- Department of Genetic Medicine, Weill Cornell Medical College, New York, New York 10065, USA; Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
- Monica D Ramstetter
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
- Leonardo Arbiza
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
- Thomas L Vincent
- Department of Genetic Medicine, Weill Cornell Medical College, New York, New York 10065, USA
- Amal Robay
- Department of Genetic Medicine, Weill Cornell Medical College-Qatar, Doha, Qatar
- Joel A Malek
- Department of Genetic Medicine, Weill Cornell Medical College-Qatar, Doha, Qatar
- Karsten Suhre
- Bioinformatics Core, Weill Cornell Medical College-Qatar, Doha, Qatar
- Lotfi Chouchane
- Department of Genetic Medicine, Weill Cornell Medical College-Qatar, Doha, Qatar
- Ramin Badii
- Laboratory Medicine and Pathology, Hamad Medical Corporation, Doha, Qatar
- Charbel Abi Khalil
- Department of Genetic Medicine, Weill Cornell Medical College-Qatar, Doha, Qatar
- Mahmoud Zirie
- Department of Medicine, Hamad Medical Corporation, Doha, Qatar
- Amin Jayyousi
- Department of Medicine, Hamad Medical Corporation, Doha, Qatar
- Jacqueline Salit
- Department of Genetic Medicine, Weill Cornell Medical College, New York, New York 10065, USA
- Alon Keinan
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
- Andrew G Clark
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
- Ronald G Crystal
- Department of Genetic Medicine, Weill Cornell Medical College, New York, New York 10065, USA
- Jason G Mezey
- Department of Genetic Medicine, Weill Cornell Medical College, New York, New York 10065, USA; Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
11. Pickering J, Richmond PC, Kirkham LAS. Molecular tools for differentiation of non-typeable Haemophilus influenzae from Haemophilus haemolyticus. Front Microbiol 2014; 5:664. [PMID: 25520712 PMCID: PMC4251515 DOI: 10.3389/fmicb.2014.00664] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/13/2014] [Accepted: 11/15/2014] [Indexed: 12/18/2022]
Abstract
Non-typeable Haemophilus influenzae (NTHi) and Haemophilus haemolyticus are closely related bacteria that reside in the upper respiratory tract. NTHi is associated with respiratory tract infections that frequently result in antibiotic prescription whilst H. haemolyticus is rarely associated with disease. NTHi and H. haemolyticus can be indistinguishable by traditional culture methods and molecular differentiation has proven difficult. This current review chronologically summarizes the molecular approaches that have been developed for differentiation of NTHi from H. haemolyticus, highlighting the advantages and disadvantages of each target and/or technique. We also provide suggestions for the development of new tools that would be suitable for clinical and research laboratories.
Affiliation(s)
- Janessa Pickering
- School of Paediatrics and Child Health, The University of Western Australia, Perth, WA, Australia
- Peter C Richmond
- School of Paediatrics and Child Health, The University of Western Australia, Perth, WA, Australia; Centre for Vaccine and Infectious Disease Research, Telethon Kids Institute, The University of Western Australia, Perth, WA, Australia
- Lea-Ann S Kirkham
- School of Paediatrics and Child Health, The University of Western Australia, Perth, WA, Australia; Centre for Vaccine and Infectious Disease Research, Telethon Kids Institute, The University of Western Australia, Perth, WA, Australia
12. Tang CQ, Humphreys AM, Fontaneto D, Barraclough TG, Paradis E. Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data. Methods Ecol Evol 2014; 5:1086-1094. [PMID: 25821577 PMCID: PMC4374709 DOI: 10.1111/2041-210x.12246] [Citation(s) in RCA: 158] [Impact Index Per Article: 15.8] [Received: 05/15/2014] [Accepted: 08/13/2014] [Indexed: 11/27/2022]
Abstract
Coalescent-based species delimitation methods combine population genetic and phylogenetic theory to provide an objective means for delineating evolutionarily significant units of diversity. The generalised mixed Yule coalescent (GMYC) and the Poisson tree process (PTP) are methods that use ultrametric (GMYC or PTP) or non-ultrametric (PTP) gene trees as input, intended for use mostly with single-locus data such as DNA barcodes.
Here, we assess how robust the GMYC and PTP are to different phylogenetic reconstruction and branch smoothing methods. We reconstruct over 400 ultrametric trees using up to 30 different combinations of phylogenetic and smoothing methods and perform over 2000 separate species delimitation analyses across 16 empirical data sets. We then assess how variable diversity estimates are, in terms of richness and identity, with respect to species delimitation, phylogenetic and smoothing methods.
The PTP method generally generates diversity estimates that are more robust to different phylogenetic methods. The GMYC is more sensitive, but provides consistent estimates for BEAST trees. The lower consistency of GMYC estimates is likely a result of differences among gene trees introduced by the smoothing step. Unresolved nodes (real anomalies or methodological artefacts) affect both GMYC and PTP estimates, but have a greater effect on GMYC estimates. Branch smoothing is a difficult step and perhaps an underappreciated source of bias that may be widespread among studies of diversity and diversification.
Nevertheless, careful choice of phylogenetic method does produce equivalent PTP and GMYC diversity estimates. We recommend simultaneous use of the PTP model with any model-based gene tree (e.g. RAxML) and GMYC approaches with BEAST trees for obtaining species hypotheses.
Affiliation(s)
- Cuong Q Tang
- Department of Life Sciences, Imperial College London, Ascot, Berkshire, SL5 7PY, UK
- Aelys M Humphreys
- Department of Life Sciences, Imperial College London, Ascot, Berkshire, SL5 7PY, UK ; Department of Ecology, Environment and Plant Sciences, Stockholm University, 10691, Stockholm, Sweden
- Diego Fontaneto
- National Research Council, Institute of Ecosystem Study, 28922, Verbania Pallanza, Italy
- Emmanuel Paradis
- Department of Life Sciences, Imperial College London, Ascot, Berkshire, SL5 7PY, UK
|
13
|
Jia F, Lo N, Ho SYW. The impact of modelling rate heterogeneity among sites on phylogenetic estimates of intraspecific evolutionary rates and timescales. PLoS One 2014; 9:e95722. [PMID: 24798481 PMCID: PMC4010409 DOI: 10.1371/journal.pone.0095722] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 12/09/2013] [Accepted: 03/28/2014] [Indexed: 12/23/2022] Open
Abstract
Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often modelled using a discrete gamma distribution. A widely used derivative of this is the gamma-invariable mixture model, which assumes that a proportion of sites in the sequence are completely resistant to change, while substitution rates at the remaining sites are gamma-distributed. For data sampled at the intraspecific level, however, biological assumptions involved in the invariable-sites model are commonly violated. We examined the use of these models in analyses of five intraspecific data sets. We show that using 6-10 rate categories for the discrete gamma distribution of rates among sites is sufficient to provide a good approximation of the marginal likelihood. Increasing the number of gamma rate categories did not have a substantial effect on estimates of the substitution rate or coalescence time, unless rates varied strongly among sites in a non-gamma-distributed manner. The assumption of a proportion of invariable sites provided a better approximation of the asymptotic marginal likelihood when the number of gamma categories was small, but had minimal impact on estimates of rates and coalescence times. However, the estimated proportion of invariable sites was highly susceptible to changes in the number of gamma rate categories. The concurrent use of gamma and invariable-site models for intraspecific data is not biologically meaningful and has been challenged on statistical grounds; here we have found that the assumption of a proportion of invariable sites has no obvious impact on Bayesian estimates of rates and timescales from intraspecific data.
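The discrete gamma model mentioned in this abstract can be illustrated with a small sketch. It assumes the common discretization in which a gamma distribution with mean 1 (shape and rate both equal to alpha) is cut into k equal-probability categories and each category is represented by its mean rate; the abstract does not say which discretization the authors used, and the function names below are hypothetical.

```python
import math


def reg_lower_gamma(a, x, tol=1e-12, max_terms=10000):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def gamma_quantile(alpha, p):
    """Quantile of Gamma(shape=alpha, rate=alpha), located by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(alpha, alpha * hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(alpha, alpha * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k):
    """Mean relative rate of each of k equal-probability gamma categories."""
    # Category boundaries: 0, then the 1/k, 2/k, ... quantiles.
    cuts = [0.0] + [gamma_quantile(alpha, i / k) for i in range(1, k)]
    # The partial mean up to a cut equals the CDF of Gamma(alpha + 1, alpha) there.
    partial = [reg_lower_gamma(alpha + 1.0, alpha * c) for c in cuts] + [1.0]
    return [k * (partial[i + 1] - partial[i]) for i in range(k)]


if __name__ == "__main__":
    for k in (4, 6, 10):
        print(k, [round(r, 4) for r in discrete_gamma_rates(0.5, k)])
```

Because the categories have equal probability, the rates always average to 1; raising k only refines how the same distribution is approximated, which is the quantity the study varies.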
Affiliation(s)
- Fangzhi Jia
- School of Biological Sciences, University of Sydney, Sydney, New South Wales, Australia
- Nathan Lo
- School of Biological Sciences, University of Sydney, Sydney, New South Wales, Australia
- Simon Y. W. Ho
- School of Biological Sciences, University of Sydney, Sydney, New South Wales, Australia
|
14
|
Mustafa G, Tahir A, Asgher M, Rahman MU, Jamil A. Comparative sequence analysis of citrate synthase and 18S ribosomal DNA from a wild and mutant strains of Aspergillus niger with various fungi. Bioinformation 2014; 10:1-7. [PMID: 24516318 PMCID: PMC3916811 DOI: 10.6026/97320630010001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/02/2014] [Accepted: 01/16/2014] [Indexed: 12/13/2022] Open
Abstract
A mutation was induced in an Aspergillus niger wild strain using ethidium bromide, resulting in threefold enhanced production of citric acid; 112.42 mg/mL citric acid was produced under optimum conditions with 121.84 mg/mL of sugar utilization. Dendrograms of 18S rDNA and citrate synthase from different fungi, including the sample strains, were made to assess homology among different fungi and to study the correlation of the citrate synthase gene with the evolution of fungi. Subsequent comparative sequence analysis revealed incongruence between the citrate synthase and 18S rDNA phylogenetic trees. Furthermore, the citrate synthase movement suggests that the use of the traditional marker molecule 18S rDNA gives misleading information about the evolution of citrate synthase in different fungi, as it has shown that the citrate synthase gene transferred independently among different fungi having no evolutionary relationships. Random amplified polymorphic DNA (RAPD-PCR) analysis was also employed to study genetic variation between wild and mutant strains of A. niger, and only 71.43% similarity was found between the two genomes. Keeping in view the importance of citric acid as a necessary constituent of various food preparations, synthetic biodegradable detergents and pharmaceuticals, the enhanced production of citric acid by the mutant derivative might provide a significant boost in the commercial-scale viability of this useful product. ABBREVIATIONS CS - Citrate synthase, CA - Citric acid, RAPD - Random amplified polymorphic DNA, TAF - Total amplified fragments, PAF - Polymorphic amplified fragments, CAF - Common amplified fragments.
Affiliation(s)
- Ghulam Mustafa
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, USA
- Aisha Tahir
- Department of Chemistry and Biochemistry, University of Agriculture, Faisalabad-38040, Pakistan
- Muhammad Asgher
- Department of Chemistry and Biochemistry, University of Agriculture, Faisalabad-38040, Pakistan
- Mehboob-ur Rahman
- National Institute for Biotechnology & Genetic Engineering (NIBGE), PO Box 577, Jhang Road, Faisalabad, Pakistan
- Amer Jamil
- Department of Chemistry and Biochemistry, University of Agriculture, Faisalabad-38040, Pakistan
|
15
|
Xing P, Guo L, Tian W, Wu QL. Novel Clostridium populations involved in the anaerobic degradation of Microcystis blooms. ISME J 2010; 5:792-800. [PMID: 21107445 DOI: 10.1038/ismej.2010.176] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 01/08/2023]
Abstract
Understanding the microbial degradation of Microcystis biomass is crucial for determining the ecological consequences of Microcystis blooms in freshwater lakes. The purpose of this study was to identify bacteria involved in the anaerobic degradation of Microcystis blooms. Microcystis scum was anaerobically incubated for 90 days at three temperatures (15 °C, 25 °C and 35 °C). We used terminal restriction fragment length polymorphism (T-RFLP) analysis of bacterial 16S rRNA genes, followed by cloning and sequencing of selected samples, to reveal the community composition of bacteria and their dynamics during decomposition. Clostridium spp. were found to be the most dominant bacteria in the incubations, accounting for 72% of the sequenced clones. Eight new clusters or subclusters (designated CLOS.1-8) were identified in the Clostridium phylogenetic tree. The bacterial populations displayed distinct successions during Microcystis decomposition. Temperature had a strong effect on the dynamics of the bacterial populations. At 15 °C, the initial dominance of a 207-bp T-RF (Betaproteobacteria) was largely substituted by a 227-bp T-RF (Clostridium, new cluster CLOS.2) at 30 days. In contrast, at 25 °C and 35 °C, we observed an alternating succession of the 227-bp T-RF and a 231-bp T-RF (Clostridium, new cluster CLOS.1) that occurred more than four times; no one species dominated the flora for the entire experiment. Our study shows that novel Clostridium clusters and their diverse consortia dominate the bacterial communities during anaerobic degradation of Microcystis, suggesting that these microbes function in the degradation process.
Affiliation(s)
- Peng Xing
- State Key Laboratory of Lake Science and Environment, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, #73 East Beijing Road, Nanjing, Jiangsu, People's Republic of China
|
16
|
Martinsen L, Venanzetti F, Johnsen A, Sbordoni V, Bachmann L. Molecular evolution of the pDo500 satellite DNA family in Dolichopoda cave crickets (Rhaphidophoridae). BMC Evol Biol 2009; 9:301. [PMID: 20038292 PMCID: PMC2808323 DOI: 10.1186/1471-2148-9-301] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 03/03/2009] [Accepted: 12/28/2009] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Non-coding satellite DNA (satDNA) usually has a high turn-over rate frequently leading to species specific patterns. However, some satDNA families evolve more slowly and can be found in several related species. Here, we analyzed the mode of evolution of the pDo500 satDNA family of Dolichopoda cave crickets. In addition, we discuss the potential of slowly evolving satDNAs as phylogenetic markers. RESULTS We sequenced 199 genomic or PCR amplified satDNA repeats of the pDo500 family from 12 Dolichopoda species. For the 38 populations under study, 39 pDo500 consensus sequences were deduced. Phylogenetic analyses using Bayesian, Maximum Parsimony, and Maximum Likelihood approaches yielded largely congruent tree topologies. The vast majority of pDo500 sequences grouped according to species designation. Scatter plots and statistical tests revealed a significant correlation between genetic distances for satDNA and mitochondrial DNA. Sliding window analyses showed species specific patterns of variable and conserved regions. The evolutionary rate of the pDo500 satDNA was estimated to be 1.63-1.78% per lineage per million years. CONCLUSIONS The pDo500 satDNA evolves gradually at a rate that is only slightly faster than previously published rates of insect mitochondrial COI sequences. The pDo500 phylogeny was basically congruent with the previously published mtDNA phylogenies. Accordingly, the slowly evolving pDo500 satDNA family is indeed informative as a phylogenetic marker.
Affiliation(s)
- Lene Martinsen
- National Centre of Biosystematics, Natural History Museum, University of Oslo, 0318 Oslo, Norway
- Arild Johnsen
- National Centre of Biosystematics, Natural History Museum, University of Oslo, 0318 Oslo, Norway
- Valerio Sbordoni
- Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, 00133 Rome, Italy
- Lutz Bachmann
- National Centre of Biosystematics, Natural History Museum, University of Oslo, 0318 Oslo, Norway
|
17
|
Low taxon richness of bacterioplankton in high-altitude lakes of the eastern Tibetan Plateau, with a predominance of Bacteroidetes and Synechococcus spp. Appl Environ Microbiol 2009; 75:7017-25. [PMID: 19767472 DOI: 10.1128/aem.01544-09] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 11/20/2022] Open
Abstract
Plankton samples were collected from six remote freshwater and saline lakes located at altitudes of 3,204 to 4,718 m and 1,000 km apart within an area of ca. 1 million km² on the eastern Tibetan Plateau to comparatively assess how environmental factors influence the diversity of bacterial communities in high-altitude lakes. The composition of the bacterioplankton was investigated by analysis of large clone libraries of 16S rRNA genes. Comparison of bacterioplankton diversities estimated for the six Tibetan lakes with reference data previously published for lakes located at lower altitudes indicated relatively low taxon richness in the Tibetan lakes. The estimated average taxon richness in the four Tibetan freshwater lakes was only one-fifth of the average taxon richness estimated for seven low-altitude reference lakes. This cannot be explained by low coverage of communities in the Tibetan lakes by the established libraries or by differences in habitat size. Furthermore, a comparison of the taxonomic compositions of bacterioplankton across the six Tibetan lakes revealed low overlap between their community compositions. About 70.9% of the operational taxonomic units (99% similarity) were specific to single lakes, and a relatively high percentage (11%) of sequences were <95% similar to publicly deposited sequences of cultured or uncultured bacteria. This beta diversity was explained by differences in salinity between lakes rather than by distance effects. Another characteristic of the investigated lakes was the predominance of Cyanobacteria (Synechococcus) and Bacteroidetes. These features of bacterioplankton diversity may reflect specific adaptation of various lineages to the environmental conditions in these high-altitude lakes.
|
18
|
Pointer MA, Mundy NI. Testing whether macroevolution follows microevolution: are colour differences among swans (Cygnus) attributable to variation at the MC1R locus? BMC Evol Biol 2008; 8:249. [PMID: 18789136 PMCID: PMC2553801 DOI: 10.1186/1471-2148-8-249] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 05/06/2008] [Accepted: 09/12/2008] [Indexed: 11/11/2022] Open
Abstract
Background The MC1R (melanocortin-1 receptor) locus underlies intraspecific variation in melanin-based dark plumage coloration in several unrelated birds with plumage polymorphisms. There is far less evidence for functional variants of MC1R being involved in interspecific variation, in which spurious genotype-phenotype associations arising through population history are a far greater problem than in intraspecific studies. We investigated the relationship between MC1R variation and plumage coloration in swans (Cygnus), which show extreme variation in melanic plumage phenotypes among species (white to black). Results The two species with melanic plumage, C. atratus and C. melanocoryphus (black and black-necked swans respectively), both have amino acid changes at important functional sites in MC1R that are consistent with increased MC1R activity and melanism. Reconstruction of MC1R evolution over a newly generated independent molecular phylogeny of Cygnus and related genera shows that these putative melanizing mutations were independently derived in the two melanic lineages. However, interpretation is complicated by the fact that one of the outgroup genera, Coscoroba, also has a putative melanizing mutation at MC1R that has arisen independently but has nearly pure white plumage. Epistasis at other loci seems the most likely explanation for this discrepancy. Unexpectedly, the phylogeny shows that the genus Cygnus may not be monophyletic, with C. melanocoryphus placed as a sister group to true geese (Anser), but further data will be needed to confirm this. Conclusion Our study highlights the difficulty of extrapolating from intraspecific studies to understand the genetic basis of interspecific adaptive phenotypic evolution, even with a gene whose structure-function relationships are as well understood as MC1R as confounding variation make clear genotype/phenotype associations difficult at the macroevolutionary scale. 
However, the identification of substitutions in the black and black-necked swan that are known to be associated with melanic phenotypes, suggests Cygnus may be another example where there appears to be convergent evolution at MC1R. This study therefore provides a novel example where previously described intraspecific genotype/phenotype associations occur at the macroevolutionary level.
Affiliation(s)
- Marie A Pointer
- Department of Zoology, University of Cambridge, Cambridge, UK.
|
19
|
Rasmussen MD, Kellis M. Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes. Genome Res 2007; 17:1932-42. [PMID: 17989260 PMCID: PMC2099600 DOI: 10.1101/gr.7105007] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Received: 09/01/2007] [Accepted: 10/16/2007] [Indexed: 01/02/2023]
Abstract
Comparative genomics provides a general methodology for discovering functional DNA elements and understanding their evolution. The availability of many related genomes enables more powerful analyses, but requires rigorous phylogenetic methods to resolve orthologous genes and regions. Here, we use 12 recently sequenced Drosophila genomes and nine fungal genomes to address the problem of accurate gene-tree reconstruction across many complete genomes. We show that existing phylogenetic methods that treat each gene tree in isolation show large-scale inaccuracies, largely due to insufficient phylogenetic information in individual genes. However, we find that gene trees exhibit common properties that can be exploited for evolutionary studies and accurate phylogenetic reconstruction. Evolutionary rates can be decoupled into gene-specific and species-specific components, which can be learned across complete genomes. We develop a phylogenetic reconstruction methodology that exploits these properties and achieves significantly higher accuracy, addressing the species-level heterotachy and enabling studies of gene evolution in the context of species evolution.
Affiliation(s)
- Matthew D. Rasmussen
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA
- Manolis Kellis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA
- The Broad Institute, Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02140, USA
|
20
|
Wägele JW, Mayer C. Visualizing differences in phylogenetic information content of alignments and distinction of three classes of long-branch effects. BMC Evol Biol 2007; 7:147. [PMID: 17725833 PMCID: PMC2040160 DOI: 10.1186/1471-2148-7-147] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.4] [Received: 03/14/2007] [Accepted: 08/28/2007] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Published molecular phylogenies are usually based on data whose quality has not been explored prior to tree inference. This leads to errors because trees obtained with conventional methods suppress conflicting evidence, and because support values may be high even if there is no distinct phylogenetic signal. Tools that allow an a priori examination of data quality are rarely applied. RESULTS Using data from published molecular analyses on the phylogeny of crustaceans it is shown that tree topologies and popular support values do not show existing differences in data quality. To visualize variations in signal distinctness, we use network analyses based on split decomposition and split support spectra. Both methods show the same differences in data quality and the same clade-supporting patterns. Both methods are useful to discover long-branch effects. We discern three classes of long branch effects. Class I effects consist of attraction of terminal taxa caused by symplesiomorphies, which results in a false monophyly of paraphyletic groups. Addition of carefully selected taxa can fix this effect. Class II effects are caused by drastic signal erosion. Long branches affected by this phenomenon usually slip down the tree to form false clades that in reality are polyphyletic. To recover the correct phylogeny, more conservative genes must be used. Class III effects consist of attraction due to accumulated chance similarities or convergent character states. This sort of noise can be reduced by selecting less variable portions of the data set, avoiding biases, and adding slower genes. CONCLUSION To increase confidence in molecular phylogenies an exploratory analysis of the signal to noise ratio can be conducted with split decomposition methods. If long-branch effects are detected, it is necessary to discern between three classes of effects to find the best approach for an improvement of the raw data.
Affiliation(s)
- Christoph Mayer
- Lehrstuhl Spezielle Zoologie, Faculty of Biology, University Bochum, 44780 Bochum, Germany
|
21
|
Som A. A new approach for estimating the efficiencies of the nucleotide substitution models. Theory Biosci 2007; 125:133-45. [PMID: 17412292 DOI: 10.1016/j.thbio.2006.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2006] [Revised: 11/16/2006] [Accepted: 11/21/2006] [Indexed: 11/20/2022]
Abstract
In this article, a new approach is presented for estimating the efficiencies of the nucleotide substitution models in a four-taxon case and then this approach is used to estimate the relative efficiencies of six substitution models under a wide variety of conditions. In this approach, efficiencies of the models are estimated by using a simple probability distribution theory. To assess the accuracy of the new approach, efficiencies of the models are also estimated by using the direct estimation method. Simulation results from the direct estimation method confirmed that the new approach is highly accurate. The success of the new approach opens a unique opportunity to develop analytical methods for estimating the relative efficiencies of the substitution models in a straightforward way.
Affiliation(s)
- Anup Som
- Center for Evolutionary Functional Genomics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287-5301, USA.
|
22
|
Gupta RS, Sneath PHA. Application of the character compatibility approach to generalized molecular sequence data: branching order of the proteobacterial subdivisions. J Mol Evol 2006; 64:90-100. [PMID: 17160641 DOI: 10.1007/s00239-006-0082-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 04/05/2006] [Accepted: 08/28/2006] [Indexed: 10/23/2022]
Abstract
The character compatibility approach, which removes all homoplasic characters and involves finding the largest clique of compatible characters in a dataset, in principle, provides a powerful means for obtaining correct topology in difficult to resolve cases. However, the usefulness of this approach to generalized molecular sequence data for phylogeny determination has not been studied in the past. We have used this approach to determine the topology of 23 proteobacterial species (6 each of alpha-, beta- and gamma-, 3 delta-, and 2 epsilon-proteobacteria) using sequence data for 10 conserved proteins (Hsp60, Hsp70, EF-Tu, EF-G, alanyl-tRNA synthetase, RecA, GyrA, GyrB, RpoB and RpoC). All sites in the sequence alignments of these proteins where only two amino acids were found, with each amino acid present in at least two species, were selected. Mutual compatibility determination on these binary state sites was carried out by two means. In one case, all of these sites were combined into a large dataset (Set A; 957 characters) prior to compatibility analysis. In the second case, compatibility analysis was carried out on characters from individual proteins and all compatible sites were combined into a large dataset (Set B; 398 characters) for further studies. Upon compatibility analyses, the largest cliques that were obtained from Sets A and B consisted of 337 and 323 compatible characters, respectively. In these cliques, all proteobacterial subgroups were clearly distinguished and branching orders of most of the species were also resolved. The epsilon-proteobacteria exhibited the earliest branching, whereas the beta- and gamma-subgroups were found to have emerged last. The relative placement of the alpha- and delta-subgroups, however, was not resolved. The topology of these species was also determined based on 16S rRNA sequences and a concatenated dataset of sequences for all 10 proteins by means of neighbor-joining, maximum likelihood, and maximum parsimony methods. 
In the protein trees, all proteobacterial groups were reliably resolved and they branched in the following order: (epsilon(delta(alpha(beta,gamma)))). However, in the rRNA trees, the gamma- and beta-subgroups exhibited polyphyletic branching and many internal nodes were not resolved. These results indicate that the character compatibility analysis using generalized molecular sequence data provides a powerful means for evolutionary studies. Based on molecular sequences, it should be possible to obtain very large datasets of compatible characters that should prove very helpful in clarifying difficult to resolve phylogenetic relationships.
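The compatibility analysis described in this abstract can be sketched in a few lines. The sketch assumes binary (two-state) characters, tests each pair with the standard rule that two binary characters conflict only when all four state combinations occur among the taxa, and then searches for the largest mutually compatible set with a basic Bron-Kerbosch clique enumeration. The function names and the toy data are hypothetical, and the abstract does not say which clique-finding procedure the authors used.

```python
def compatible(c1, c2):
    """Two binary characters conflict only if all four state pairs occur."""
    return len(set(zip(c1, c2))) < 4


def largest_clique(chars):
    """Indices of a largest set of mutually compatible characters."""
    n = len(chars)
    # Compatibility graph: one node per character, edges join compatible pairs.
    adj = {
        i: {j for j in range(n) if j != i and compatible(chars[i], chars[j])}
        for i in range(n)
    }
    best = []

    def expand(current, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            # current is a maximal clique; keep it if it is the largest so far.
            if len(current) > len(best):
                best = sorted(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return best


if __name__ == "__main__":
    # Toy data: each string is one alignment site scored across five taxa.
    sites = ["11000", "11100", "00011", "10010"]
    print(largest_clique(sites))
```

In the toy data the fourth site conflicts with each of the other three, so the largest clique consists of the first three sites. Exhaustive clique enumeration is exponential in the worst case, so data sets with hundreds of characters, as in the study, would likely need a more careful search.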
Affiliation(s)
- Radhey S Gupta
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada L8N 3Z5.
|
23
|
Pollard DA, Moses AM, Iyer VN, Eisen MB. Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments. BMC Bioinformatics 2006; 7:376. [PMID: 16904011 PMCID: PMC1613255 DOI: 10.1186/1471-2105-7-376] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 04/13/2006] [Accepted: 08/14/2006] [Indexed: 01/01/2023] Open
Abstract
Background Molecular evolutionary studies of noncoding sequences rely on multiple alignments. Yet how multiple alignment accuracy varies across sequence types, tree topologies, divergences and tools, and further how this variation impacts specific inferences, remains unclear. Results Here we develop a molecular evolution simulation platform, CisEvolver, with models of background noncoding and transcription factor binding site evolution, and use simulated alignments to systematically examine multiple alignment accuracy and its impact on two key molecular evolutionary inferences: transcription factor binding site conservation and divergence estimation. We find that the accuracy of multiple alignments is determined almost exclusively by the pairwise divergence distance of the two most diverged species and that additional species have a negligible influence on alignment accuracy. Conserved transcription factor binding sites align better than surrounding noncoding DNA yet are often found to be misaligned at relatively short divergence distances, such that studies of binding site gain and loss could easily be confounded by alignment error. Divergence estimates from multiple alignments tend to be overestimated at short divergence distances but reach a tool specific divergence at which they cease to increase, leading to underestimation at long divergences. Our most striking finding was that overall alignment accuracy, binding site alignment accuracy and divergence estimation accuracy vary greatly across branches in a tree and are most accurate for terminal branches connecting sister taxa and least accurate for internal branches connecting sub-alignments. Conclusion Our results suggest that variation in alignment accuracy can lead to errors in molecular evolutionary inferences that could be construed as biological variation. 
These findings have implications for which species to choose for analyses, what kind of errors would be expected for a given set of species and how multiple alignment tools and phylogenetic inference methods might be improved to minimize or control for alignment errors.
Affiliation(s)
- Daniel A Pollard
- Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
- Alan M Moses
- Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
- Venky N Iyer
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Michael B Eisen
- Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Department of Genome Sciences, Genomics Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, CA 94720, USA
- Center for Integrative Genomics, University of California, Berkeley, CA 94720, USA
|
24
|
Barkman TJ, Chenery G, McNeal JR, Lyons-Weiler J, Ellisens WJ, Moore G, Wolfe AD, dePamphilis CW. Independent and combined analyses of sequences from all three genomic compartments converge on the root of flowering plant phylogeny. Proc Natl Acad Sci U S A 2000; 97:13166-71. [PMID: 11069280 PMCID: PMC27196 DOI: 10.1073/pnas.220427497] [Citation(s) in RCA: 160] [Impact Index Per Article: 6.7] [Received: 06/09/2000] [Accepted: 09/06/2000] [Indexed: 11/18/2022] Open
Abstract
Plant phylogenetic estimates are most likely to be reliable when congruent evidence is obtained independently from the mitochondrial, plastid, and nuclear genomes with all methods of analysis. Here, results are presented from separate and combined genomic analyses of new and previously published data, including six and nine genes (8,911 bp and 12,010 bp, respectively) for different subsets of taxa that suggest Amborella + Nymphaeales (water lilies) are the first-branching angiosperm lineage. Before and after tree-independent noise reduction, most individual genomic compartments and methods of analysis estimated the Amborella + Nymphaeales basal topology with high support. Previous phylogenetic estimates placing Amborella alone as the first extant angiosperm branch may have been misled because of a series of specific problems with paralogy, suboptimal outgroups, long-branch taxa, and method dependence. Ancestral character state reconstructions differ between the two topologies and affect inferences about the features of early angiosperms.
Collapse
Affiliation(s)
- T J Barkman
- Department of Biology, Institute of Molecular Evolutionary Genetics, and Life Sciences Consortium, Pennsylvania State University, University Park, PA 16802, USA
|
25
|
Gupta RS. Protein phylogenies and signature sequences: A reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol Mol Biol Rev 1998; 62:1435-91. [PMID: 9841678 PMCID: PMC98952 DOI: 10.1128/mmbr.62.4.1435-1491.1998] [Citation(s) in RCA: 380] [Impact Index Per Article: 14.6] [Indexed: 11/20/2022] Open
Abstract
The presence of shared conserved insertions or deletions (indels) in protein sequences is a special type of signature sequence that shows considerable promise for phylogenetic inference. An alternative model of microbial evolution based on the use of indels of conserved proteins and the morphological features of prokaryotic organisms is proposed. In this model, extant archaebacteria and gram-positive bacteria, which have a simple, single-layered cell wall structure, are termed monoderm prokaryotes. They are believed to be descended from the most primitive organisms. Evidence from indels supports the view that the archaebacteria probably evolved from gram-positive bacteria, and I suggest that this evolution occurred in response to antibiotic selection pressures. Evidence is presented that diderm prokaryotes (i.e., gram-negative bacteria), which have a bilayered cell wall, are derived from monoderm prokaryotes. Signature sequences in different proteins provide a means to define a number of different taxa within prokaryotes (namely, low G+C and high G+C gram-positive, Deinococcus-Thermus, cyanobacteria, chlamydia-cytophaga related, and two different groups of Proteobacteria) and to indicate how they evolved from a common ancestor. Based on phylogenetic information from indels in different protein sequences, it is hypothesized that all eukaryotes, including amitochondriate and aplastidic organisms, received major gene contributions from both an archaebacterium and a gram-negative eubacterium. In this model, the ancestral eukaryotic cell is a chimera that resulted from a unique fusion event between the two separate groups of prokaryotes followed by integration of their genomes.
Affiliation(s)
- R S Gupta
- Department of Biochemistry, McMaster University, Hamilton, Ontario L8N 3Z5, Canada.
|
26
|
Nei M, Kumar S, Takahashi K. The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small. Proc Natl Acad Sci U S A 1998; 95:12390-7. [PMID: 9770497 PMCID: PMC22842 DOI: 10.1073/pnas.95.21.12390] [Citation(s) in RCA: 103] [Impact Index Per Article: 4.0] [Indexed: 11/18/2022] Open
Abstract
In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.
Affiliation(s)
- M Nei
- Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University, University Park, PA 16802-5301, USA.
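The quantity M in the abstract above (the minimum number of mutational changes a topology requires) can be illustrated with a small sketch. It uses Fitch's small-parsimony counting rule on rooted binary trees written as nested tuples; the function names, the tree encoding, and the toy alignment are assumptions made for illustration and are not taken from the paper.

```python
def fitch_site(tree, states):
    """Return (candidate_states, changes) for one alignment column.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees.
    `states` maps each leaf name to its character at this column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: one substitution is required at this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Sum the per-column change counts to get M for a fixed topology."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical four-taxon alignment and two candidate topologies.
alignment = {"a": "AAG", "b": "AAG", "c": "CAT", "d": "CAT"}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
print(parsimony_length(tree_1, alignment))  # 2
print(parsimony_length(tree_2, alignment))  # 4
```

An MP search would pick the topology with the smaller M (tree_1 here). With only three columns the outcome is easily swayed by chance, which is the small-n situation the abstract warns about.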
|
27
|
Abstract
New equations are derived to estimate the number of amino acid substitutions per site between two homologous proteins from the root mean square (RMS) deviation between two spatial structures and from the fraction of identical residues between two sequences. The equations are based on evolutionary models, analyzing predominantly structural changes and not sequence changes. Evolution of spatial structure is treated as a diffusion in an elastic force field. Diffusion accounts for structural changes caused by amino acid substitutions, and elastic force reflects selection, which preserves protein fold. Obtained equations are supported by analysis of protein spatial structures.
Affiliation(s)
- N V Grishin
- Department of Pharmacology, University of Texas Southwestern Medical Center at Dallas 75235-9041, USA
|
28
|
Rannala B, Yang Z. Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J Mol Evol 1996; 43:304-11. [PMID: 8703097 DOI: 10.1007/bf02338839] [Citation(s) in RCA: 859] [Impact Index Per Article: 30.7] [Indexed: 02/01/2023]
Abstract
A new method is presented for inferring evolutionary trees using nucleotide sequence data. The birth-death process is used as a model of speciation and extinction to specify the prior distribution of phylogenies and branching times. Nucleotide substitution is modeled by a continuous-time Markov process. Parameters of the branching model and the substitution model are estimated by maximum likelihood. The posterior probabilities of different phylogenies are calculated and the phylogeny with the highest posterior probability is chosen as the best estimate of the evolutionary relationship among species. We refer to this as the maximum posterior probability (MAP) tree. The posterior probability provides a natural measure of the reliability of the estimated phylogeny. Two example data sets are analyzed to infer the phylogenetic relationship of human, chimpanzee, gorilla, and orangutan. The best trees estimated by the new method are the same as those from the maximum likelihood analysis of separate topologies, but the posterior probabilities are quite different from the bootstrap proportions. The results of the method are found to be insensitive to changes in the rate parameter of the branching process.
Affiliation(s)
- B Rannala
- Department of Integrative Biology, University of California, Berkeley, CA 94720-3140, USA
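The last step described in the abstract above, turning per-topology evidence into posterior probabilities and picking the highest, can be sketched as follows. This is only the Bayes normalization over a small, discrete set of candidate topologies. It assumes the log-likelihood of each topology has already been obtained and does not implement the birth-death prior or the substitution model from the paper. All names and numbers are hypothetical.

```python
from math import exp, log


def tree_posteriors(log_likelihoods, priors=None):
    """Normalize prior x likelihood over a finite set of candidate trees.

    `log_likelihoods` maps a tree label to its log-likelihood.
    `priors` maps the same labels to prior probabilities (uniform if omitted).
    """
    names = list(log_likelihoods)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    log_joint = {name: log_likelihoods[name] + log(priors[name]) for name in names}
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_joint.values())
    total = sum(exp(value - peak) for value in log_joint.values())
    return {name: exp(value - peak) / total for name, value in log_joint.items()}


def map_tree(posteriors):
    """Return the label of the tree with the highest posterior probability."""
    return max(posteriors, key=posteriors.get)


# Hypothetical log-likelihoods for three rooted topologies.
example = {
    "((human,chimp),gorilla)": -1000.0,
    "((human,gorilla),chimp)": -1002.0,
    "((chimp,gorilla),human)": -1002.0,
}
posteriors = tree_posteriors(example)
print(map_tree(posteriors))
```

The posterior values sum to one across the candidate set, so they can be read directly as a measure of support for each topology, in the sense the abstract describes.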
|
29
|
Baldauf SL, Palmer JD, Doolittle WF. The root of the universal tree and the origin of eukaryotes based on elongation factor phylogeny. Proc Natl Acad Sci U S A 1996; 93:7749-54. [PMID: 8755547 PMCID: PMC38819 DOI: 10.1073/pnas.93.15.7749] [Citation(s) in RCA: 162] [Impact Index Per Article: 5.8] [Indexed: 02/02/2023] Open
Abstract
The genes for the protein synthesis elongation factors Tu (EF-Tu) and G (EF-G) are the products of an ancient gene duplication, which appears to predate the divergence of all extant organismal lineages. Thus, it should be possible to root a universal phylogeny based on either protein using the second protein as an outgroup. This approach was originally taken independently with two separate gene duplication pairs, (i) the regulatory and catalytic subunits of the proton ATPases and (ii) the protein synthesis elongation factors EF-Tu and EF-G. Questions about the orthology of the ATPase genes have obscured the former results, and the elongation factor data have been criticized for inadequate taxonomic representation and alignment errors. We have expanded the latter analysis using a broad representation of taxa from all three domains of life. All phylogenetic methods used strongly place the root of the universal tree between two highly distinct groups, the archaeons/eukaryotes and the eubacteria. We also find that a combined data set of EF-Tu and EF-G sequences favors placement of the eukaryotes within the Archaea, as the sister group to the Crenarchaeota. This relationship is supported by bootstrap values of 60-89% with various distance and maximum likelihood methods, while unweighted parsimony gives 58% support for archaeal monophyly.
Affiliation(s)
- S L Baldauf
- Canadian Institute for Advanced Research and Department of Biochemistry, Dalhousie University, Halifax, Canada
|
30
|
Martin MJ, González-Candelas F, Sobrino F, Dopazo J. A method for determining the position and size of optimal sequence regions for phylogenetic analysis. J Mol Evol 1995; 41:1128-38. [PMID: 8587110 DOI: 10.1007/bf00173194] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 01/31/2023]
Abstract
The availability of fast and accurate sequencing procedures along with the use of PCR has led to a proliferation of studies of variability at the molecular level in populations. Nevertheless, it is often impractical to examine long genomic stretches and a large number of individuals at the same time. In order to optimize this kind of study, we suggest a heuristic procedure for detection of the shortest region whose informational content can be considered sufficient for significant phylogenetic reconstruction. The method is based on the comparison of the pairwise genetic distances obtained from a set of sequences of reference to those obtained for different windows of variable size and position by means of a simple index. We also present an approach for testing whether the informative content in the stretches selected in this way is significantly different from the corresponding content shown by the larger genomic regions used as reference. Application of this test to the analysis of the VP1 protein gene of foot-and-mouth-disease type C virus allowed us to define optimal stretches whose informative content is not significantly different from that displayed by the complete VP1 sequence. We showed that the predictions made for type C sequences are valid for type O sequences, indicating that the results of the procedure are consistent.
Affiliation(s)
- M J Martin
- Tecnología para Diagnóstico e Investigación (TDI) S.A., c/Condes de Torreanaz, Madrid, Spain
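The window search described in the abstract above can be sketched roughly as below. The abstract does not specify its "simple index", so this sketch substitutes the Pearson correlation between full-length and within-window pairwise p-distances. That choice, the 0.95 threshold, and every function name are assumptions for illustration, not the authors' procedure.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs, start=0, end=None):
    """p-distances for all sequence pairs, restricted to seqs[start:end]."""
    return [p_distance(a[start:end], b[start:end]) for a, b in combinations(seqs, 2)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def shortest_informative_window(seqs, threshold=0.95):
    """Return (start, size, index) for the shortest window whose distances
    track the full-length reference distances, or None if none qualifies."""
    length = len(seqs[0])
    reference = pairwise_distances(seqs)
    for size in range(1, length + 1):
        best = None
        for start in range(length - size + 1):
            index = pearson(reference, pairwise_distances(seqs, start, start + size))
            if index >= threshold and (best is None or index > best[2]):
                best = (start, size, index)
        if best is not None:
            return best
    return None


# Hypothetical toy alignment.
toy = ["AAAAAA", "AAAATA", "AAAATT", "AACATT"]
print(shortest_informative_window(toy))
```

Scanning sizes from smallest to largest means the first qualifying window is also the shortest one, which matches the stated goal of finding the shortest sufficiently informative region.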
|
31
|
Bocchetta M, Ceccarelli E, Creti R, Sanangelantoni AM, Tiboni O, Cammarano P. Arrangement and nucleotide sequence of the gene (fus) encoding elongation factor G (EF-G) from the hyperthermophilic bacterium Aquifex pyrophilus: phylogenetic depth of hyperthermophilic bacteria inferred from analysis of the EF-G/fus sequences. J Mol Evol 1995; 41:803-12. [PMID: 8587125 DOI: 10.1007/bf00173160] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 01/31/2023]
Abstract
The gene fus (for EF-G) of the hyperthermophilic bacterium Aquifex pyrophilus was cloned and sequenced. Unlike the other bacteria, which display the streptomycin-operon arrangement of EF genes (5'-rps12-rps7-fus-tuf-3'), the Aquifex fus gene (700 codons) is not preceded by the two small ribosomal subunit genes although it is still followed by a tuf gene (for EF-Tu). The opposite strand upstream from the EF-G coding locus revealed an open reading frame (ORF) encoding a polypeptide having 52.5% identity with an E. coli protein (the pdxJ gene product) involved in pyridoxine condensation. The Aquifex EF-G was aligned with available homologs representative of Deinococci, high G+C Gram positives, Proteobacteria, cyanobacteria, and several Archaea. Outgroup-rooted phylogenies were constructed from both the amino acid and the DNA sequences using first and second codon positions in the alignments except sites containing synonymous changes. Both datasets and alternative tree-making methods gave a consistent topology, with Aquifex and Thermotoga maritima (a hyperthermophile) as the first and the second deepest offshoots, respectively. However, the robustness of the inferred phylogenies is not impressive. The branching of Aquifex more deeply than Thermotoga and the branching of Thermotoga more deeply than the other taxa examined are given at bootstrap values between 65 and 70% in the fus-based phylogenies, while the EF-G(2)-based phylogenies do not provide a statistically significant level of support (≤ 50% bootstrap confirmation) for the emergence of Thermotoga between Aquifex and the successive offshoot (Thermus genus). At present, therefore, the placement of Aquifex at the root of the bacterial tree, albeit reproducible, can be asserted only with reservation, while the emergence of Thermotoga between the Aquificales and the Deinococci remains (statistically) indeterminate.
Affiliation(s)
- M Bocchetta
- Istituto Pasteur Fondazione Cenci-Bolognetti, Dipartimento di Biopatologia Umana, Universita di Roma I La Sapienza, Policlinico Umberto I, Italy
|
32
|
Takezaki N, Nei M. Inconsistency of the maximum parsimony method when the rate of nucleotide substitution is constant. J Mol Evol 1994; 39:210-8. [PMID: 7932784 DOI: 10.1007/bf00163810] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023]
Abstract
The inconsistency of the maximum parsimony method is known to occur even when the rate of nucleotide substitution is constant. To understand why this inconsistency occurs, a mathematical study was conducted for the cases of five, six, and seven sequences. The results obtained indicate that this inconsistency occurs because the probability of occurrence of nucleotide configurations generated by one substitution on a short interior branch is often lower than that of configurations generated by more substitutions on other longer branches. The chance of this event occurring (that is, of the maximum parsimony method being inconsistent) apparently increases as the number of sequences increases. The inconsistency may occur even when the extent of sequence divergence is relatively small.
Affiliation(s)
- N Takezaki
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802
|