151
|
Fay JC, Wittkopp PJ. Evaluating the role of natural selection in the evolution of gene regulation. Heredity (Edinb) 2007; 100:191-9. [PMID: 17519966 DOI: 10.1038/sj.hdy.6801000] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.7] [Indexed: 01/13/2023]
Abstract
Surveys of gene expression reveal extensive variability both within and between a wide range of species. Compelling cases have been made for adaptive changes in gene regulation, but the proportion of expression divergence attributable to natural selection remains unclear. Distinguishing adaptive changes driven by positive selection from neutral divergence resulting from mutation and genetic drift is critical for understanding the evolution of gene expression. Here, we review the various methods that have been used to test for signs of selection in genomic expression data. We also discuss properties of regulatory systems relevant to neutral models of gene expression. Despite some potential caveats, published studies provide considerable evidence for adaptive changes in gene expression. Future challenges for studies of regulatory evolution will be to quantify the frequency of adaptive changes, identify the genetic basis of expression divergence and associate changes in gene expression with specific organismal phenotypes.
Affiliation(s)
- J C Fay
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63108, USA.
|
152
|
Petit N, Casillas S, Ruiz A, Barbadilla A. Protein polymorphism is negatively correlated with conservation of intronic sequences and complexity of expression patterns in Drosophila melanogaster. J Mol Evol 2007; 64:511-8. [PMID: 17460807 DOI: 10.1007/s00239-006-0047-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/27/2006] [Accepted: 01/17/2007] [Indexed: 10/23/2022]
Abstract
We report a significant negative correlation between nonsynonymous polymorphism and intron length in Drosophila melanogaster. This correlation is similar to that between protein divergence and intron length previously reported in Drosophila. We show that the relationship can be explained by the content of conserved noncoding sequences (CNS) within introns. In addition, genes with a high regulatory complexity and many genetic interactions also exhibit larger amounts of CNS within their introns and lower values of nonsynonymous polymorphism. The present study provides relevant evidence on the importance of intron content and expression patterns on the levels of coding polymorphism.
Affiliation(s)
- Natalia Petit
- Departament de Genètica i Microbiologia, Facultat de Biociències, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain.
|
153
|
Savas S, Taylor IW, Wrana JL, Ozcelik H. Functional nonsynonymous single nucleotide polymorphisms from the TGF-β protein interaction network. Physiol Genomics 2007; 29:109-17. [PMID: 17190851 DOI: 10.1152/physiolgenomics.00226.2006] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 12/29/2022]
Abstract
Protein complexes mediated by protein-protein interactions are essential for many cellular functions. Transforming growth factor (TGF)-β signaling involves a cascade of protein-protein interactions and malfunctioning of this pathway has been implicated in human diseases. Using an in silico approach, we analyzed the naturally occurring human genetic variations from the proteins involved in the TGF-β signaling (10 TGF-β proteins and 242 other proteins interacting with them) to identify the ones that have potential biological consequences. All proteins were searched in the dbSNP database for the presence of nonsynonymous single nucleotide polymorphisms (nsSNPs). A total of 118 validated nsSNPs from 63 proteins were retrieved and analyzed in terms of 1) evolutionary conservation status, 2) being located in a functional protein domain or motif, and 3) altering putative protein motif or phosphorylation sites. Our results indicated the presence of 31 nsSNPs that occurred at evolutionarily conserved residues, 37 nsSNPs were located in protein domains, motifs, or repeats, and 46 nsSNPs were predicted to either create or abolish putative protein motifs or phosphorylation sites. We undertook this study to analyze the human genetic variations that can affect the protein function and the TGF-β signaling. The nsSNPs reported in here can be characterized by experimental approaches to elucidate their exact biological roles and whether they are related to human disease.
Affiliation(s)
- Sevtap Savas
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Toronto, Ontario, Canada
|
154
|
Weadick CJ, Chang BSW. Long-wavelength sensitive visual pigments of the guppy (Poecilia reticulata): six opsins expressed in a single individual. BMC Evol Biol 2007; 7 Suppl 1:S11. [PMID: 17288569 PMCID: PMC1796605 DOI: 10.1186/1471-2148-7-s1-s11] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]
Abstract
BACKGROUND The diversity of visual systems in fish has long been of interest for evolutionary biologists and neurophysiologists, and has recently begun to attract the attention of molecular evolutionary geneticists. Several recent studies on the copy number and genomic organization of visual pigment proteins, the opsins, have revealed an increased opsin diversity in fish relative to most vertebrates, brought about through recent instances of opsin duplication and divergence. However, for the subfamily of opsin genes that mediate vision at the long-wavelength end of the spectrum, the LWS opsins, it appears that most fishes possess only one or two loci, a value comparable to most other vertebrates. Here, we characterize the LWS opsins from cDNA of an individual guppy, Poecilia reticulata, a fish that is known to exhibit variation in its long-wavelength sensitive visual system, mate preferences and colour patterns. RESULTS We identified six LWS opsins expressed within a single individual. Phylogenetic analysis revealed that these opsins descend from duplication events both pre-dating and following the divergence of the guppy lineage from that of the bluefin killifish, Lucania goodei, the closest species for which comparable data exists. Numerous amino acid substitutions exist among these different LWS opsins, many at sites known to be important for visual pigment function, including spectral sensitivity and G-protein activation. Likelihood analyses using codon-based models of evolution reveal significant changes in selective constraint along two of the guppy LWS opsin lineages. CONCLUSION The guppy displays an unusually high number of LWS opsins compared to other fish, and to vertebrates in general. Observing both substitutions at functionally important sites and the persistence of lineages across species boundaries suggests that these opsins might have functionally different roles, especially with regard to G-protein activation. The reasons why are currently unknown, but may relate to aspects of the guppy's behavioural ecology, in which both male colour patterns and the female mate preferences for these colour patterns experience strong, highly variable selection pressures.
Affiliation(s)
- Cameron J Weadick
- Departments of Ecology & Evolutionary Biology, Cell & Systems Biology, and Centre for the Analysis of Genome Evolution & Function, University of Toronto, 25 Harbord Street, M5S3G5, Ontario, Canada
- Belinda SW Chang
- Departments of Ecology & Evolutionary Biology, Cell & Systems Biology, and Centre for the Analysis of Genome Evolution & Function, University of Toronto, 25 Harbord Street, M5S3G5, Ontario, Canada
|
155
|
Prüfer K, Muetzel B, Do HH, Weiss G, Khaitovich P, Rahm E, Pääbo S, Lachmann M, Enard W. FUNC: a package for detecting significant associations between gene sets and ontological annotations. BMC Bioinformatics 2007; 8:41. [PMID: 17284313 PMCID: PMC1800870 DOI: 10.1186/1471-2105-8-41] [Citation(s) in RCA: 139] [Impact Index Per Article: 7.7] [Received: 09/04/2006] [Accepted: 02/06/2007] [Indexed: 11/17/2022]
Abstract
Background Genome-wide expression, sequence and association studies typically yield large sets of gene candidates, which must then be further analysed and interpreted. Information about these genes is increasingly being captured and organized in ontologies, such as the Gene Ontology. Relationships between the gene sets identified by experimental methods and biological knowledge can be made explicit and used in the interpretation of results. However, it is often difficult to assess the statistical significance of such analyses since many inter-dependent categories are tested simultaneously. Results We developed the program package FUNC that includes and expands on currently available methods to identify significant associations between gene sets and ontological annotations. Implemented are several tests in particular well suited for genome wide sequence comparisons, estimates of the family-wise error rate, the false discovery rate, a sensitive estimator of the global significance of the results and an algorithm to reduce the complexity of the results. Conclusion FUNC is a versatile and useful tool for the analysis of genome-wide data. It is freely available under the GPL license and also accessible via a web service.
Affiliation(s)
- Kay Prüfer
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
- Bjoern Muetzel
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
- Hong-Hai Do
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstr. 16-18, D-04107, Germany
- Gunter Weiss
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
- Philipp Khaitovich
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
- Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai, 200031, China
- Erhard Rahm
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstr. 16-18, D-04107, Germany
- Svante Pääbo
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
- Michael Lachmann
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
- Wolfgang Enard
- Max-Planck-Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
|
156
|
Zhang G, Jung BP, Ho W, Jugloff DGM, Cheung HH, Gurd JW, Wallace MC, Eubanks JH. Isolation and characterization of LCHN: a novel factor induced by transient global ischemia in the adult rat hippocampus. J Neurochem 2006; 101:263-73. [PMID: 17394467 DOI: 10.1111/j.1471-4159.2006.04374.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
Abstract
Using mRNA differential display to identify cerebral ischemia-responsive mRNAs, we isolated and cloned a cDNA derived from a novel gene, that has been designated LCHN. Antisense mRNA in situ hybridization and immunoblotting confirmed LCHN expression to be induced in the rat hippocampus following transient forebrain ischemia. The deduced amino acid sequence of the novel LCHN cDNA contains an open reading frame of 455 amino acids, encoding a protein with a predicted molecular mass of approximately 51 kDa. Although LCHN is highly conserved between rat, mouse, and human, the deduced amino acid sequence of LCHN does not possess significant homology to other known genes. LCHN immunoreactivity is detected within the somatodendritic compartment of neurons, is also present on dendritic growth cones, but is not detected on astrocytes. The induction of LCHN in the hippocampus following ischemic injury may have functional consequences, as the ectopic over-expression of LCHN generated neurons with longer and more branched axons and dendrites. Taken together, these data suggest that LCHN could play a role in neuritogenesis, as well as in neuronal recovery and/or restructuring in the hippocampus following transient cerebral ischemia.
Affiliation(s)
- Guangming Zhang
- Division of Cell and Molecular Biology, Toronto Western Research Institute, Toronto, Ontario, Canada
|
157
|
Kohn MH, Murphy WJ, Ostrander EA, Wayne RK. Genomics and conservation genetics. Trends Ecol Evol 2006; 21:629-37. [PMID: 16908089 DOI: 10.1016/j.tree.2006.08.001] [Citation(s) in RCA: 160] [Impact Index Per Article: 8.4] [Received: 07/06/2005] [Revised: 06/29/2006] [Accepted: 08/01/2006] [Indexed: 10/24/2022]
Abstract
In large part, the relevance of genetics to conservation rests on the premise that neutral marker variation in populations reflects levels of detrimental and adaptive genetic variation. Despite its prominence, this tenet has been difficult to evaluate, until now. As we discuss here, genome sequence information and new technological and bioinformatics platforms now enable comprehensive surveys of neutral variation and more direct inferences of detrimental and adaptive variation in species with sequenced genomes and in 'genome-enabled' endangered taxa. Moreover, conservation schemes could begin to consider specific pathological genetic variants. A new conservation genetic agenda would utilize data from enhanced surveys of genomic variation in endangered species to better manage functional genetic variation.
Affiliation(s)
- Michael H Kohn
- Department of Ecology & Evolutionary Biology, Rice University, MS 170, 6100 Main Street, Houston, TX 77005, USA.
|
158
|
Abstract
It has been suggested that evolutionary changes in gene expression account for most phenotypic differences between species, in particular between humans and apes. What general rules can be described governing expression evolution? We find that a neutral model where negative selection and divergence time are the major factors is a useful null hypothesis for both transcriptome and genome evolution. Two tissues that stand out with regard to gene expression are the testes, where positive selection has exerted a substantial influence in both humans and chimpanzees, and the brain, where gene expression has changed less than in other organs but acceleration might have occurred in human ancestors.
Affiliation(s)
- Philipp Khaitovich
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
|
159
|
Savas S, Schmidt S, Jarjanazi H, Ozcelik H. Functional nsSNPs from carcinogenesis-related genes expressed in breast tissue: potential breast cancer risk alleles and their distribution across human populations. Hum Genomics 2006; 2:287-96. [PMID: 16595073 PMCID: PMC3500178 DOI: 10.1186/1479-7364-2-5-287] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 02/06/2023]
Abstract
Although highly penetrant alleles of BRCA1 and BRCA2 have been shown to predispose to breast cancer, the majority of breast cancer cases are assumed to result from the presence of low-moderate penetrant alleles and environmental carcinogens. Non-synonymous single nucleotide polymorphisms (nsSNPs) are hypothesised to contribute to disease susceptibility and approximately 30 per cent of them are predicted to have a biological significance. In this study, we have applied a bioinformatics-based strategy to identify breast cancer-related nsSNPs from 981 carcinogenesis-related genes expressed in breast tissue. Our results revealed a total of 367 validated nsSNPs, 109 (29.7 per cent) of which are predicted to affect the protein function (functional nsSNPs), suggesting that these nsSNPs are likely to influence the development and homeostasis of breast tissue and hence contribute to breast cancer susceptibility. Sixty-seven of the functional nsSNPs presented as commonly occurring nsSNPs (minor allele frequencies ≥ 5 per cent), representing excellent candidates for breast cancer susceptibility. Additionally, a non-uniform distribution of the common functional nsSNPs among different human populations was observed: 15 nsSNPs were reported to be present in all populations analysed, whereas another set of 15 nsSNPs was specific to particular population(s). We propose that the nsSNPs analysed in this study constitute a unique resource of potential genetic factors for breast cancer susceptibility. Furthermore, the variations in functional nsSNP allele frequencies across major population backgrounds may point to the potential variability of the molecular basis of breast cancer predisposition and treatment response among different human populations.
Affiliation(s)
- Sevtap Savas
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
- Steffen Schmidt
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Hamdi Jarjanazi
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
- Hilmi Ozcelik
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
|
160
|
Crespi BJ, Summers K. Positive selection in the evolution of cancer. Biol Rev Camb Philos Soc 2006; 81:407-24. [PMID: 16762098 DOI: 10.1017/s1464793106007056] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.5] [Received: 06/20/2005] [Revised: 03/27/2006] [Accepted: 03/29/2006] [Indexed: 01/29/2023]
Abstract
We hypothesize that forms of antagonistic coevolution have forged strong links between positive selection at the molecular level and increased cancer risk. By this hypothesis, evolutionary conflict between males and females, mothers and foetuses, hosts and parasites, and other parties with divergent fitness interests has led to rapid evolution of genetic systems involved in control over fertilization and cellular resources. The genes involved in such systems promote cancer risk as a secondary effect of their roles in antagonistic coevolution, which generates evolutionary disequilibrium and maladaptation. Evidence from two sources: (1) studies on specific genes, including SPANX cancer/testis antigen genes, several Y-linked genes, the pem homeobox gene, centromeric histone genes, the breast cancer gene BRCA1, the angiogenesis gene ANG, cadherin genes, cytochrome P450 genes, and viral oncogenes; and (2) large-scale database studies of selection on different functional categories of genes, supports our hypothesis. These results have important implications for understanding the evolutionary underpinnings of cancer and the dynamics of antagonistically-coevolving molecular systems.
Affiliation(s)
- Bernard J Crespi
- Behavioural Ecology Research Group, Department of Biology, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.
|
161
|
Chain FJJ, Evans BJ. Multiple mechanisms promote the retained expression of gene duplicates in the tetraploid frog Xenopus laevis. PLoS Genet 2006; 2:e56. [PMID: 16683033 PMCID: PMC1449897 DOI: 10.1371/journal.pgen.0020056] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.2] [Received: 11/10/2005] [Accepted: 02/28/2006] [Indexed: 01/19/2023]
Abstract
Gene duplication provides a window of opportunity for biological variants to persist under the protection of a co-expressed copy with similar or redundant function. Duplication catalyzes innovation (neofunctionalization), subfunction degeneration (subfunctionalization), and genetic buffering (redundancy), and the genetic survival of each paralog is triggered by mechanisms that add, compromise, or do not alter protein function. We tested the applicability of three types of mechanisms for promoting the retained expression of duplicated genes in 290 expressed paralogs of the tetraploid clawed frog, Xenopus laevis. Tests were based on explicit expectations concerning the ka/ks ratio, and the number and location of nonsynonymous substitutions after duplication. Functional constraints on the majority of paralogs are not significantly different from a singleton ortholog. However, we recover strong support that some of them have an asymmetric rate of nonsynonymous substitution: 6% match predictions of the neofunctionalization hypothesis in that (1) each paralog accumulated nonsynonymous substitutions at a significantly different rate and (2) the one that evolves faster has a higher ka/ks ratio than the other paralog and than a singleton ortholog. Fewer paralogs (3%) exhibit a complementary pattern of substitution at the protein level that is predicted by enhancement or degradation of different functional domains, and the remaining 13% have a higher average ka/ks ratio in both paralogs that is consistent with altered functional constraints, diversifying selection, or activity-reducing mutations after duplication. We estimate that these paralogs have been retained since they originated by genome duplication between 21 and 41 million years ago. Multiple mechanisms operate to promote the retained expression of duplicates in the same genome, in genes in the same functional class, over the same period of time following duplication, and sometimes in the same pair of paralogs. 
None of these paralogs are superfluous; degradation or enhancement of different protein subfunctions and neofunctionalization are plausible hypotheses for the retained expression of some of them. Evolution of most X. laevis paralogs, however, is consistent with retained expression via mechanisms that do not radically alter functional constraints, such as selection to preserve post-duplication stoichiometry or temporal, quantitative, or spatial subfunctionalization. Gene duplication plays a fundamental role in biological innovation but it is not clear how both copies of a duplicated gene manage to circumvent degradation by mutation if neither is unique. This study explores genetic mechanisms that could make each copy of a duplicate gene different, and therefore distinguishable and potentially preserved by natural selection. It is based on DNA sequences of the protein-coding region of 290 expressed duplicated genes in a frog, Xenopus laevis, that underwent complete duplication of its entire genome. Results provide evidence for multiple mechanisms acting within the same genome, within the same functional classes of genes, within the same period of time following duplication, and even on the same set of duplicated genes. Each copy of a duplicate gene may be subject to distinct evolutionary constraints, and this could be associated with degradation or enhancement of function. Functional constraints of most of these duplicates, however, are not substantially different from a single copy gene; their persistence in the first dozens of millions of years after duplication may more frequently be explained by mechanisms acting on their expression rather than their function.
Affiliation(s)
- Frédéric J. J Chain
- Center for Environmental Genomics, Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Ben J Evans
- Center for Environmental Genomics, Department of Biology, McMaster University, Hamilton, Ontario, Canada
- * To whom correspondence should be addressed. E-mail:
|
162
|
Chen SL, Hung CS, Xu J, Reigstad CS, Magrini V, Sabo A, Blasiar D, Bieri T, Meyer RR, Ozersky P, Armstrong JR, Fulton RS, Latreille JP, Spieth J, Hooton TM, Mardis ER, Hultgren SJ, Gordon JI. Identification of genes subject to positive selection in uropathogenic strains of Escherichia coli: a comparative genomics approach. Proc Natl Acad Sci U S A 2006; 103:5977-82. [PMID: 16585510 PMCID: PMC1424661 DOI: 10.1073/pnas.0600938103] [Citation(s) in RCA: 435] [Impact Index Per Article: 22.9] [Indexed: 11/18/2022]
Abstract
Escherichia coli is a model laboratory bacterium, a species that is widely distributed in the environment, as well as a mutualist and pathogen in its human hosts. As such, E. coli represents an attractive organism to study how environment impacts microbial genome structure and function. Uropathogenic E. coli (UPEC) must adapt to life in several microbial communities in the human body, and has a complex life cycle in the bladder when it causes acute or recurrent urinary tract infection (UTI). Several studies designed to identify virulence factors have focused on genes that are uniquely represented in UPEC strains, whereas the role of genes that are common to all E. coli has received much less attention. Here we describe the complete 5,065,741-bp genome sequence of a UPEC strain recovered from a patient with an acute bladder infection and compare it with six other finished E. coli genome sequences. We searched 3,470 ortholog sets for genes that are under positive selection only in UPEC strains. Our maximum likelihood-based analysis yielded 29 genes involved in various aspects of cell surface structure, DNA metabolism, nutrient acquisition, and UTI. These results were validated by resequencing a subset of the 29 genes in a panel of 50 urinary, periurethral, and rectal E. coli isolates from patients with UTI. These studies outline a computational approach that may be broadly applicable for studying strain-specific adaptation and pathogenesis in other bacteria.
Affiliation(s)
- Jian Xu
- *Center for Genome Sciences
- Genome Sequencing Center, and Departments of
- Genetics, and
- Christopher S. Reigstad
- *Center for Genome Sciences
- **Molecular Biology and Pharmacology, Washington University School of Medicine, St. Louis, MO 63110; and
- Aniko Sabo
- Genome Sequencing Center, and Departments of
- John Spieth
- Genome Sequencing Center, and Departments of
- Thomas M. Hooton
- Department of Medicine, University of Washington, Seattle, WA 98195
- Jeffrey I. Gordon
- *Center for Genome Sciences
- **Molecular Biology and Pharmacology, Washington University School of Medicine, St. Louis, MO 63110; and
- To whom correspondence should be addressed. E-mail:
|
163
|
Arnau V, Gallach M, Lucas JI, Marín I. UVPAR: fast detection of functional shifts in duplicate genes. BMC Bioinformatics 2006; 7:174. [PMID: 16569227 PMCID: PMC1570150 DOI: 10.1186/1471-2105-7-174] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 10/11/2005] [Accepted: 03/28/2006] [Indexed: 12/29/2022]
Abstract
BACKGROUND The imprint of natural selection on gene sequences is often difficult to detect. A plethora of methods have been devised to detect genetic changes due to selective processes. However, many of those methods depend heavily on underlying assumptions regarding the mode of change of DNA sequences and often require sophisticated mathematical treatments that made them computationally slow. The development of fast and effective methods to detect modifications in the selective constraints of genes is therefore of great interest. RESULTS We describe UVPAR, a program designed to quickly test for changes in the functional constraints of duplicate genes. Starting with alignments of the proteins encoded by couples of duplicate genes in two different species, UVPAR detects the regions in which modifications of the functional constraints in the paralogs occurred since both species diverged. Sequences can be analyzed with UVPAR in just a few minutes on a standard PC computer. To demonstrate the power of the program, we first show how the results obtained with UVPAR compare to those based on other approaches, using data for vertebrate Hox genes. We then describe a comprehensive study of the RBR family of ubiquitin ligases in which we have performed 529 analyses involving 14 duplicate genes in seven model species. A significant increase in the number of functional shifts was observed for the species Danio rerio and for the gene Ariadne-2. CONCLUSION These results show that UVPAR can be used to generate sensitive analyses to detect changes in the selection constraints acting on paralogs. The high speed of the program allows its application to genome-scale analyses.
Affiliation(s)
- Vicente Arnau
- Departamento de Informática, Universidad de Valencia, Burjassot, Spain
- Miguel Gallach
- Departamento de Genética, Universidad de Valencia, Burjassot, Spain
- J Ignasi Lucas
- Departamento de Genética, Universidad de Valencia, Burjassot, Spain
- Ignacio Marín
- Departamento de Genética, Universidad de Valencia, Burjassot, Spain
|
164
|
Oldham MC, Geschwind DH. Comparative genomics: Grasping human transcriptome evolution: what does it all mean? Heredity (Edinb) 2006; 96:339-40. [PMID: 16552432 DOI: 10.1038/sj.hdy.6800807] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
|
165
|
Lu J, Tang T, Tang H, Huang J, Shi S, Wu CI. The accumulation of deleterious mutations in rice genomes: a hypothesis on the cost of domestication. Trends Genet 2006; 22:126-31. [PMID: 16443304 DOI: 10.1016/j.tig.2006.01.004] [Citation(s) in RCA: 141] [Impact Index Per Article: 7.4] [Received: 05/26/2005] [Revised: 12/07/2005] [Accepted: 01/13/2006] [Indexed: 11/20/2022]
Abstract
The extent of molecular differentiation between domesticated animals or plants and their wild relatives is postulated to be small. The availability of the complete genome sequences of two subspecies of the Asian rice, Oryza sativa (indica and japonica), and their wild relatives has provided an unprecedented opportunity to study divergence following domestication. We observed significantly more amino acid substitutions during rice domestication than can be expected from a comparison among wild species. This excess is disproportionately larger for the more radical kinds of amino acid changes (e.g. Cys<-->Tyr). We estimate that approximately a quarter of the amino acid differences between rice cultivars are deleterious, not accountable by the relaxation of selective constraints. This excess is negatively correlated with the rate of recombination, suggesting that 'hitchhiking' has occurred. We hypothesize that during domestication artificial selection increased the frequency of many deleterious mutations.
Affiliation(s)
- Jian Lu
- Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, IL 60637, USA
166
Savas S, Tuzmen S, Ozcelik H. Human SNPs resulting in premature stop codons and protein truncation. Hum Genomics 2006; 2:274-86. [PMID: 16595072 PMCID: PMC3500177 DOI: 10.1186/1479-7364-2-5-274] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Received: 11/10/2005] [Accepted: 11/10/2005] [Indexed: 11/22/2022]
Abstract
Single nucleotide polymorphisms (SNPs) constitute the most common type of genetic variation in humans. SNPs introducing premature termination codons (PTCs), herein called X-SNPs, can alter the stability and function of transcripts and proteins and thus are considered to be biologically important. Initial studies suggested a strong selection against such variations/mutations. In this study, we undertook a genome-wide systematic screening to identify human X-SNPs using the dbSNP database. Our results demonstrated the presence of 28 X-SNPs from 28 genes with known minor allele frequencies. Eight X-SNPs (28.6 per cent) were predicted to cause transcript degradation by nonsense-mediated mRNA decay. Seventeen X-SNPs (60.7 per cent) resulted in moderate to severe truncation at the C-terminus of the proteins (deletion of >50 per cent of the amino acids). The majority of the X-SNPs (78.6 per cent) represent commonly occurring SNPs, by contrast with the rarely occurring disease-causing PTC mutations. Interestingly, X-SNPs displayed a non-uniform distribution across human populations: eight X-SNPs were reported to be prevalent across three different human populations, whereas six X-SNPs were found exclusively in one or two population(s). In conclusion, we have systematically investigated human SNPs introducing PTCs with respect to their possible biological consequences, distributions across different human populations and evolutionary aspects. We believe that the SNPs reported here are likely to affect gene/protein function, although their biological and evolutionary roles need to be further investigated.
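As a rough illustration of the screening criterion described in this abstract, the Python sketch below checks whether a single-base substitution turns a sense codon into a stop codon and, if so, what fraction of the protein would be lost. The function name, the toy coding sequence, and the SNP positions are hypothetical and are not taken from the study; only the three standard stop codons are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def premature_stop(cds, pos, alt):
    """Return the fraction of amino-acid codons lost if the SNP creates a
    premature stop codon, or None if it does not.

    cds: in-frame coding sequence ending with its natural stop codon (toy input)
    pos: 0-based position of the SNP within cds
    alt: alternative base at that position
    """
    codon_index = pos // 3
    start = codon_index * 3
    ref_codon = cds[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    n_codons = len(cds) // 3 - 1  # amino-acid codons, excluding the natural stop
    creates_stop = alt_codon in STOP_CODONS and ref_codon not in STOP_CODONS
    if creates_stop and codon_index < n_codons:
        return (n_codons - codon_index) / n_codons
    return None


# Toy example: ATG CAG AAA TGG TAA. A C>T change at position 3 turns CAG into TAG.
toy_cds = "ATGCAGAAATGGTAA"
lost = premature_stop(toy_cds, 3, "T")      # fraction of the protein removed
silent = premature_stop(toy_cds, 4, "G")    # CAG>CGG is a missense change, not a stop
```

A returned fraction above 0.5 would correspond to the "moderate to severe truncation" class mentioned in the abstract, under the assumption that truncation is measured as the share of codons downstream of the new stop.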
Affiliation(s)
- Sevtap Savas
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
- Sukru Tuzmen
- Cancer Drug Development Laboratory, Translational Genomics Research Institute, 13208 East Shea Blvd, Suite 110, Scottsdale, AZ 85259, USA
- Hilmi Ozcelik
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
167
Pollinger JP, Bustamante CD, Fledel-Alon A, Schmutz S, Gray MM, Wayne RK. Selective sweep mapping of genes with large phenotypic effects. Genome Res 2006; 15:1809-19. [PMID: 16339379 PMCID: PMC1356119 DOI: 10.1101/gr.4374505] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.6] [Indexed: 11/25/2022]
Abstract
Many domestic dog breeds have originated through fixation of discrete mutations by intense artificial selection. As a result of this process, markers in the proximity of genes influencing breed-defining traits will have reduced variation (a selective sweep) and will show divergence in allele frequency. Consequently, low-resolution genomic scans can potentially be used to identify regions containing genes that have a major influence on breed-defining traits. We model the process of breed formation and show that the probability of two or three adjacent marker loci showing a spurious signal of selection within at least one breed (i.e., Type I error or false-positive rate) is low if highly variable and moderately spaced markers are utilized. We also use simulations with selection to demonstrate that even a moderately spaced set of highly polymorphic markers (e.g., one every 0.8 cM) has high power to detect regions targeted by strong artificial selection in dogs. Further, we show that a gene responsible for black coat color in the Large Munsterlander has a 40-Mb region surrounding the gene that is very low in heterozygosity for microsatellite markers. Similarly, we survey 302 microsatellite markers in the Dachshund and find three linked monomorphic microsatellite markers all within a 10-Mb region on chromosome 3. This region contains the FGFR3 gene, which is responsible for achondroplasia in humans, but not in dogs. Consequently, our results suggest that the causative mutation is a gene or regulatory region closely linked to FGFR3.
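The scanning idea in this abstract, looking for runs of adjacent markers with sharply reduced variation, can be sketched in Python as follows. This is only an illustration under stated assumptions: the allele counts are invented, and the heterozygosity threshold (`max_h`) and minimum run length (`min_run`) are hypothetical parameters, not values from the paper.

```python
def expected_heterozygosity(allele_counts):
    """Expected heterozygosity H = 1 - sum(p_i^2) from allele counts at one marker."""
    total = sum(allele_counts)
    return 1.0 - sum((c / total) ** 2 for c in allele_counts)


def sweep_candidates(markers, max_h=0.1, min_run=3):
    """Return (start, end) index pairs for runs of at least min_run adjacent
    markers whose expected heterozygosity is at most max_h."""
    runs, start = [], None
    for i, counts in enumerate(markers):
        low = expected_heterozygosity(counts) <= max_h
        if low and start is None:
            start = i                      # a low-variation run begins here
        elif not low and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(markers) - start >= min_run:
        runs.append((start, len(markers) - 1))
    return runs


# Hypothetical allele counts for seven ordered microsatellite markers in one breed.
breed_markers = [
    [10, 12, 18],  # polymorphic
    [40],          # monomorphic
    [40],          # monomorphic
    [40],          # monomorphic
    [20, 20],      # polymorphic
    [39, 1],       # nearly fixed, but isolated
    [15, 25],      # polymorphic
]
candidate_runs = sweep_candidates(breed_markers)
```

Requiring several adjacent markers rather than a single one mirrors the abstract's point that spurious signals at two or three neighbouring loci are unlikely when markers are highly variable.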
Affiliation(s)
- John P Pollinger
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095-1606, USA.
168
Rodin SN, Parkhomchuk DV, Rodin AS, Holmquist GP, Riggs AD. Repositioning-dependent fate of duplicate genes. DNA Cell Biol 2006; 24:529-42. [PMID: 16153154 DOI: 10.1089/dna.2005.24.529] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Abstract
Gene duplication is the main source of evolutionary novelties. However, the problem with duplicates is that purifying selection overlooks deleterious mutations in the redundant sequence, which therefore, instead of gaining a new function, often degrades into a functionless pseudogene. This risk of functional loss instead of gain is much higher for small populations of higher organisms with a slow and complex development. We propose that it is the epigenetic tissue/stage-complementary silencing of duplicates that exposes them to purifying selection, thus saving them from pseudogenization and opening the way towards new function(s). Our genome-wide analyses of gene duplicates in several eukaryotic species combined with the phylogenetic comparison of vertebrate alpha- and beta-globin gene clusters strongly support this epigenetic complementation (EC) model. The distinctive condition for a new duplicate to survive by the EC mechanism seems to be its repositioning to an ectopic site, which is accompanied by changes in the rate and direction of mutagenesis. The most distinguished in this respect is the human genome. In this review, we extend and discuss the data on the EC- and repositioning-dependent fate of gene duplicates with special emphasis on the problem of detecting the brief postduplication period of adaptive evolution driven by positive selection. Accordingly, we propose a new CpG-focused measure of selection that is insensitive to translocation-caused biases in mutagenesis.
Affiliation(s)
- Sergei N Rodin
- Theoretical Biology Department, Beckman Research Institute of the City of Hope, Duarte, CA 91010, USA.
169
Abstract
A recent paper in this journal has challenged the idea that complex adaptive features of proteins can be explained by known molecular, genetic, and evolutionary mechanisms. It is shown here that the conclusions of this prior work are an artifact of unwarranted biological assumptions, inappropriate mathematical modeling, and faulty logic. Numerous simple pathways exist by which adaptive multi-residue functions can evolve on time scales of a million years (or much less) in populations of only moderate size. Thus, the classical evolutionary trajectory of descent with modification is adequate to explain the diversification of protein functions.
Affiliation(s)
- Michael Lynch
- Department of Biology, Indiana University, Bloomington, IN 47405, USA.
170
Affiliation(s)
- Hilliary Creely
- Max-Planck Institute for Evolutionary Anthropology, Deutscher Platz, D-04103 Leipzig, Germany
171
Aspholm M, Kalia A, Ruhl S, Schedin S, Arnqvist A, Lindén S, Sjöström R, Gerhard M, Semino-Mora C, Dubois A, Unemo M, Danielsson D, Teneberg S, Lee WK, Berg DE, Borén T. Helicobacter pylori adhesion to carbohydrates. Methods Enzymol 2006; 417:293-339. [PMID: 17132512 PMCID: PMC2576508 DOI: 10.1016/s0076-6879(06)17020-2] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022]
Abstract
Adherence of bacterial pathogens to host tissues contributes to colonization and virulence and typically involves specific interactions between bacterial proteins called adhesins and cognate oligosaccharide (glycan) or protein motifs in the host that are used as receptors. A given pathogen may have multiple adhesins, each specific for a different set of receptors and, potentially, with different roles in infection and disease. This chapter provides strategies for identifying and analyzing host glycan receptors and the bacterial adhesins that exploit them as receptors, with particular reference to adherence of the gastric pathogen Helicobacter pylori.
Affiliation(s)
- Marina Aspholm
- Department of Molecular Biosciences, University of Oslo, Oslo, Norway
172
Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas EJ, Zody MC, Mauceli E, Xie X, Breen M, Wayne RK, Ostrander EA, Ponting CP, Galibert F, Smith DR, DeJong PJ, Kirkness E, Alvarez P, Biagi T, Brockman W, Butler J, Chin CW, Cook A, Cuff J, Daly MJ, DeCaprio D, Gnerre S, Grabherr M, Kellis M, Kleber M, Bardeleben C, Goodstadt L, Heger A, Hitte C, Kim L, Koepfli KP, Parker HG, Pollinger JP, Searle SMJ, Sutter NB, Thomas R, Webber C, Baldwin J, Abebe A, Abouelleil A, Aftuck L, Ait-Zahra M, Aldredge T, Allen N, An P, Anderson S, Antoine C, Arachchi H, Aslam A, Ayotte L, Bachantsang P, Barry A, Bayul T, Benamara M, Berlin A, Bessette D, Blitshteyn B, Bloom T, Blye J, Boguslavskiy L, Bonnet C, Boukhgalter B, Brown A, Cahill P, Calixte N, Camarata J, Cheshatsang Y, Chu J, Citroen M, Collymore A, Cooke P, Dawoe T, Daza R, Decktor K, DeGray S, Dhargay N, Dooley K, Dooley K, Dorje P, Dorjee K, Dorris L, Duffey N, Dupes A, Egbiremolen O, Elong R, Falk J, Farina A, Faro S, Ferguson D, Ferreira P, Fisher S, FitzGerald M, Foley K, Foley C, Franke A, Friedrich D, Gage D, Garber M, Gearin G, Giannoukos G, Goode T, Goyette A, Graham J, Grandbois E, Gyaltsen K, Hafez N, Hagopian D, Hagos B, Hall J, Healy C, Hegarty R, Honan T, Horn A, Houde N, Hughes L, Hunnicutt L, Husby M, Jester B, Jones C, Kamat A, Kanga B, Kells C, Khazanovich D, Kieu AC, Kisner P, Kumar M, Lance K, Landers T, Lara M, Lee W, Leger JP, Lennon N, Leuper L, LeVine S, Liu J, Liu X, Lokyitsang Y, Lokyitsang T, Lui A, Macdonald J, Major J, Marabella R, Maru K, Matthews C, McDonough S, Mehta T, Meldrim J, Melnikov A, Meneus L, Mihalev A, Mihova T, Miller K, Mittelman R, Mlenga V, Mulrain L, Munson G, Navidi A, Naylor J, Nguyen T, Nguyen N, Nguyen C, Nguyen T, Nicol R, Norbu N, Norbu C, Novod N, Nyima T, Olandt P, O'Neill B, O'Neill K, Osman S, Oyono L, Patti C, Perrin D, Phunkhang P, Pierre F, Priest M, Rachupka A, Raghuraman S, Rameau R, Ray V, Raymond C, Rege F, Rise C, Rogers J, Rogov P, Sahalie J, Settipalli S, Sharpe T, Shea T, Sheehan M, Sherpa N, Shi J, Shih D, Sloan J, Smith C, Sparrow T, Stalker J, Stange-Thomann N, Stavropoulos S, Stone C, Stone S, Sykes S, Tchuinga P, Tenzing P, Tesfaye S, Thoulutsang D, Thoulutsang Y, Topham K, Topping I, Tsamla T, Vassiliev H, Venkataraman V, Vo A, Wangchuk T, Wangdi T, Weiand M, Wilkinson J, Wilson A, Yadav S, Yang S, Yang X, Young G, Yu Q, Zainoun J, Zembek L, Zimmer A, Lander ES. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 2005; 438:803-19. [PMID: 16341006 DOI: 10.1038/nature04338] [Citation(s) in RCA: 1736] [Impact Index Per Article: 86.8] [Received: 08/09/2005] [Accepted: 10/11/2005] [Indexed: 12/12/2022]
Abstract
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Affiliation(s)
- Kerstin Lindblad-Toh
- Broad Institute of Harvard and MIT, 320 Charles Street, Cambridge, Massachusetts 02141, USA.
173
Carlini DB, Genut JE. Synonymous SNPs Provide Evidence for Selective Constraint on Human Exonic Splicing Enhancers. J Mol Evol 2005; 62:89-98. [PMID: 16320116 DOI: 10.1007/s00239-005-0055-x] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.0] [Received: 03/09/2005] [Accepted: 08/01/2005] [Indexed: 11/24/2022]
Abstract
The human SNP database was used to detect selection on 238 hexamers previously identified as exonic splicing enhancers (ESEs). We compared the distribution of the 238 putative ESEs in biallelic and triallelic SNPs within five different functional categories of the SNP database: synonymous, nonsynonymous, introns, UTRs, and nongenic SNPs. Since true ESEs do not function outside of exons, SNPs that disrupt ESE motifs were expected to be more common in nonexonic portions of the genome. Our results supported this expectation: ESEs were least prevalent within synonymous SNPs and most common in nongenic SNPs. There were approximately 11% fewer ESEs within synonymous biallelic SNPs than expected under no selective constraint. We also compared the frequency of neutral SNPs, those where neither allele was an ESE, with deleterious SNPs, those where one or more alleles was an ESE, across the five different functional classes of SNPs. In comparison with the other functional classes of SNPs, synonymous SNPs contained an excess of neutral variants (+1.64% and +6.04% for biallelic and triallelic SNPs, respectively) and a dearth of deleterious variants (-13.11% and -52.39% for biallelic and triallelic SNPs, respectively). The observed patterns were consistent with purifying selection on the 238 hexamers to maintain their function as ESEs. However, in contrast to previous work, we did not find evidence for selection to maintain ESE function at nonsynonymous SNPs because selection at the protein level probably obscured any difference at the level of ESE function.
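The neutral/deleterious labelling used in this abstract can be illustrated with a short Python sketch. It assumes that an allele "is an ESE" when any of the six hexamer windows covering the SNP site matches a motif in the set; that reading, the function names, the flanking sequences, and the two-member motif set standing in for the 238 hexamers are all hypothetical.

```python
def overlapping_hexamers(seq, pos):
    """All 6-mers of seq that include the 0-based position pos."""
    first = max(0, pos - 5)
    last = min(pos, len(seq) - 6)
    return {seq[i:i + 6] for i in range(first, last + 1)}


def classify_snp(flank5, alleles, flank3, ese_hexamers):
    """Label a SNP 'deleterious' if any allele places an ESE hexamer over the
    SNP site, and 'neutral' if no allele does."""
    pos = len(flank5)
    for allele in alleles:
        seq = flank5 + allele + flank3
        if overlapping_hexamers(seq, pos) & ese_hexamers:
            return "deleterious"
    return "neutral"


# Hypothetical motif set; the real analysis used 238 previously identified hexamers.
toy_eses = {"GAAGAA", "TGAAGA"}
label_a = classify_snp("CCTGA", ["A", "C"], "GAACC", toy_eses)  # A allele forms GAAGAA
label_b = classify_snp("CCCCC", ["A", "G"], "TTTTT", toy_eses)  # no allele forms a motif
```

Counting these labels separately for synonymous, nonsynonymous, intronic, UTR, and nongenic SNPs would give the kind of per-class comparison the abstract reports.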
Affiliation(s)
- David B Carlini
- Department of Biology, American University, 4400 Massachusetts Avenue, NW, Washington, DC 20016, USA.
174
Doan JW, Schmidt TR, Wildman DE, Goodman M, Weiss ML, Grossman LI. Rapid nonsynonymous evolution of the iron-sulfur protein in anthropoid primates. J Bioenerg Biomembr 2005; 37:35-41. [PMID: 15906147 DOI: 10.1007/s10863-005-4121-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 08/25/2004] [Accepted: 11/19/2004] [Indexed: 11/28/2022]
Abstract
Cytochrome c (CYC) and 9 of the 13 subunits of cytochrome c oxidase (complex IV; COX) were previously shown to have accelerated rates of nonsynonymous substitution in anthropoid primates. Cytochrome b, the mtDNA encoded subunit of ubiquinol-cytochrome c reductase (complex III), also showed an accelerated nonsynonymous substitution rate in anthropoid primates but rate information about the nuclear encoded subunits of complex III has been lacking. We now report that phylogenetic and relative rates analysis of a nuclear encoded catalytically active subunit of complex III, the iron-sulfur protein (ISP), shows an accelerated rate of amino acid replacement similar to cytochrome b. Because both ISP and subunit 9, whose function is not directly related to electron transport, are produced by cleavage into two subunits of the initial translation product of a single gene, it is probable that these two subunits of complex III have essentially identical underlying rates of mutation. Nevertheless, we find that the catalytically active ISP has an accelerated rate of amino acid replacement in anthropoid primates whereas the catalytically inactive subunit 9 does not.
Affiliation(s)
- Jeffrey W Doan
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, 540 E. Canfield Avenue, Detroit, MI 48201, USA
175
Marques-Bonet T, Navarro A. Chromosomal rearrangements are associated with higher rates of molecular evolution in mammals. Gene 2005; 353:147-54. [PMID: 15951139 DOI: 10.1016/j.gene.2005.05.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 02/10/2005] [Revised: 04/25/2005] [Accepted: 05/10/2005] [Indexed: 10/25/2022]
Abstract
Evolutionary rates are not uniformly distributed across the genome. Knowledge about the biological causes of this observation is still incomplete, but its exploration has provided valuable insight into the genomic, historical and demographic variables that influence rates of genetic divergence. Recent studies suggest a possible association between chromosomal rearrangements and regions of greater divergence, but evidence is limited and contradictory. Here, we test the hypothesis of a relationship between chromosomal rearrangements and higher rates of molecular evolution by studying the genomic distribution of divergence between 12,000 human-mouse orthologous genes. Our results clearly show that genes located in genomic regions that have been highly rearranged between the two species present higher rates of synonymous (0.7686 vs. 0.7076) and non-synonymous substitution (0.1014 vs. 0.0871), and that synonymous substitution rates are higher in genes close to the breakpoints of individual rearrangements. The many potential causes of such striking differences are discussed, particularly in the light of speciation models suggesting that chromosomal rearrangements may have contributed to some of the speciation processes along the human and mouse lineages. Still, there are other possible causes and further research is needed to properly explore them.
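To put the four rates quoted in this abstract on a common scale, the short Python calculation below derives the relative increase in each rate and the ratio of non-synonymous to synonymous substitution. It assumes that the first value in each pair refers to the highly rearranged regions and the second to the remaining regions; the variable names are illustrative only.

```python
# Rates quoted in the abstract: (highly rearranged regions, other regions).
rates = {
    "synonymous": (0.7686, 0.7076),
    "nonsynonymous": (0.1014, 0.0871),
}

# Relative increase of each rate in rearranged regions, (a - b) / b.
relative_increase = {
    site_class: (rearranged - other) / other
    for site_class, (rearranged, other) in rates.items()
}

# Ratio of non-synonymous to synonymous substitution in each kind of region.
ka_ks_rearranged = rates["nonsynonymous"][0] / rates["synonymous"][0]
ka_ks_other = rates["nonsynonymous"][1] / rates["synonymous"][1]

for site_class, increase in relative_increase.items():
    print(f"{site_class}: {increase:.1%} higher in rearranged regions")
print(f"Ka/Ks rearranged: {ka_ks_rearranged:.3f}, other: {ka_ks_other:.3f}")
```

Under that assumption the synonymous rate is about 8.6% higher and the non-synonymous rate about 16.4% higher in rearranged regions, so the ratio of the two rates is also slightly higher there (about 0.132 against 0.123).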
Affiliation(s)
- Tomàs Marques-Bonet
- Unitat de Biologia Evolutiva Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Doctor Aiguader 80, 08003 Barcelona, Spain
176
de Parseval N, Diop G, Blaise S, Helle F, Vasilescu A, Matsuda F, Heidmann T. Comprehensive search for intra- and inter-specific sequence polymorphisms among coding envelope genes of retroviral origin found in the human genome: genes and pseudogenes. BMC Genomics 2005; 6:117. [PMID: 16150157 PMCID: PMC1236922 DOI: 10.1186/1471-2164-6-117] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 03/16/2005] [Accepted: 09/09/2005] [Indexed: 12/11/2022]
Abstract
Background The human genome carries a high load of proviral-like sequences, called Human Endogenous Retroviruses (HERVs), which are the genomic traces of ancient infections by active retroviruses. These elements are in most cases defective, but open reading frames can still be found for the retroviral envelope gene, with sixteen such genes identified so far. Several of them are conserved during primate evolution, having possibly been co-opted by their host for a physiological role. Results To characterize further their status, we presently sequenced 12 of these genes from a panel of 91 Caucasian individuals. Genomic analyses reveal strong sequence conservation (only two non synonymous Single Nucleotide Polymorphisms [SNPs]) for the two HERV-W and HERV-FRD envelope genes, i.e. for the two genes specifically expressed in the placenta and possibly involved in syncytiotrophoblast formation. We further show – using an ex vivo fusion assay for each allelic form – that none of these SNPs impairs the fusogenic function. The other envelope proteins disclose variable polymorphisms, with the occurrence of a stop codon and/or frameshift for most – but not all – of them. Moreover, the sequence conservation analysis of the orthologous genes that can be found in primates shows that three env genes have been maintained in a fully coding state throughout evolution including envW and envFRD. Conclusion Altogether, the present study strongly suggests that some but not all envelope encoding sequences are bona fide genes. It also provides new tools to elucidate the possible role of endogenous envelope proteins as susceptibility factors in a number of pathologies where HERVs have been suspected to be involved.
Affiliation(s)
- Nathalie de Parseval
- UMR 8122 CNRS, Institut Gustave Roussy, 39 rue Camille Desmoulins, 94805 Villejuif Cedex, France
- Gora Diop
- Centre National de Génotypage, 2, rue Gaston Crémieux, Évry, France
- Sandra Blaise
- UMR 8122 CNRS, Institut Gustave Roussy, 39 rue Camille Desmoulins, 94805 Villejuif Cedex, France
- François Helle
- UMR 8122 CNRS, Institut Gustave Roussy, 39 rue Camille Desmoulins, 94805 Villejuif Cedex, France
- Fumihiko Matsuda
- Centre National de Génotypage, 2, rue Gaston Crémieux, Évry, France
- Thierry Heidmann
- UMR 8122 CNRS, Institut Gustave Roussy, 39 rue Camille Desmoulins, 94805 Villejuif Cedex, France
177
Khaitovich P, Hellmann I, Enard W, Nowick K, Leinweber M, Franz H, Weiss G, Lachmann M, Pääbo S. Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science 2005; 309:1850-4. [PMID: 16141373 DOI: 10.1126/science.1108296] [Citation(s) in RCA: 420] [Impact Index Per Article: 21.0] [Indexed: 11/02/2022]
Abstract
The determination of the chimpanzee genome sequence provides a means to study both structural and functional aspects of the evolution of the human genome. Here we compare humans and chimpanzees with respect to differences in expression levels and protein-coding sequences for genes active in brain, heart, liver, kidney, and testis. We find that the patterns of differences in gene expression and gene sequences are markedly similar. In particular, there is a gradation of selective constraints among the tissues so that the brain shows the least differences between the species whereas liver shows the most. Furthermore, expression levels as well as amino acid sequences of genes active in more tissues have diverged less between the species than have genes active in fewer tissues. In general, these patterns are consistent with a model of neutral evolution with negative selection. However, for X-chromosomal genes expressed in testis, patterns suggestive of positive selection on sequence changes as well as expression changes are seen. Furthermore, although genes expressed in the brain have changed less than have genes expressed in other tissues, in agreement with previous work we find that genes active in brain have accumulated more changes on the human than on the chimpanzee lineage.
MESH Headings
- Adult
- Aged
- Amino Acid Sequence
- Animals
- Base Sequence
- Child
- Chromosomes, Human, X/genetics
- Chromosomes, Mammalian/genetics
- Evolution, Molecular
- Female
- Gene Expression
- Gene Expression Profiling
- Gene Expression Regulation
- Genome
- Genome, Human
- Heart/physiology
- Humans
- Kidney/physiology
- Liver/physiology
- Male
- Middle Aged
- Models, Genetic
- Oligonucleotide Array Sequence Analysis
- Organ Specificity
- Pan troglodytes/genetics
- Prefrontal Cortex/physiology
- Promoter Regions, Genetic
- Proteins/genetics
- Selection, Genetic
- Sequence Analysis, DNA
- Species Specificity
- Testis/physiology
- Transcription, Genetic
- X Chromosome/genetics
Affiliation(s)
- Philipp Khaitovich
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany
178
Savas S, Ozcelik H. Phosphorylation states of cell cycle and DNA repair proteins can be altered by the nsSNPs. BMC Cancer 2005; 5:107. [PMID: 16111488 PMCID: PMC1208866 DOI: 10.1186/1471-2407-5-107] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 11/24/2004] [Accepted: 08/19/2005] [Indexed: 01/20/2023]
Abstract
Background Phosphorylation is a reversible post-translational modification that affects the intrinsic properties of proteins, such as structure and function. Non-synonymous single nucleotide polymorphisms (nsSNPs) result in the substitution of the encoded amino acids and thus are likely to alter the phosphorylation motifs in the proteins. Methods In this study, we used the web-based NetPhos tool to predict candidate nsSNPs that either introduce or remove putative phosphorylation sites in proteins that act in DNA repair and cell cycle pathways. Results Our results demonstrated that a total of 15 nsSNPs (16.9%) were likely to alter the putative phosphorylation patterns of 14 proteins. Three of these SNPs (CDKN1A-S31R, OGG1-S326C, and XRCC3-T241M) have already been found to be associated with altered cancer risk. We believe that this set of nsSNPs constitutes an excellent resource for further molecular and genetic analyses. Conclusion The novel systematic approach used in this study will accelerate the understanding of how naturally occurring human SNPs may alter protein function through the modification of phosphorylation mechanisms and contribute to disease susceptibility.
Affiliation(s)
- Sevtap Savas
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, M5G 1X5, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, M5G 1X5, ON, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, M5G 1L5, Toronto, ON, Canada
- Hilmi Ozcelik
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, M5G 1X5, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, M5G 1X5, ON, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, M5G 1L5, Toronto, ON, Canada
179
Abstract
Compared to protein-coding sequences, the evolution of noncoding sequences and the selective constraints placed on these sequences is not well characterized. To compare the evolution of coding and noncoding sequences, we have conducted a survey for DNA polymorphism at five randomly chosen loci among a diverse collection of 81 strains of Saccharomyces cerevisiae. Average rates of both polymorphism and divergence are 40% lower at noncoding sites and 90% lower at nonsynonymous sites in comparison to synonymous sites. Although noncoding and coding sequences show substantial variability in ratios of polymorphism to divergence, two of the loci, MLS1 and PDR10, show a higher rate of polymorphism at noncoding compared to synonymous sites. The high rate of polymorphism is not accompanied by a high rate of divergence and is limited to a few small regions. These hypervariable regions include sites with three segregating bases at a single site and adjacent polymorphic sites. We show that this clustering of polymorphic sites is significantly greater than one would expect on the basis of the spacing between polymorphic fourfold degenerate sites. Although hypervariable noncoding sequences could result from selection on regulatory mutations, they could also result from transient mutational hotspots.
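One common way to compare polymorphism-to-divergence ratios between two classes of sites is a 2x2 contingency test; the Python sketch below shows that idea with a small Fisher's exact test written from the hypergeometric formula. It is not necessarily the procedure used in the study, and all counts are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )


# Hypothetical counts (not from the study) at one locus.
noncoding = {"polymorphic": 12, "fixed": 8}
synonymous = {"polymorphic": 6, "fixed": 24}

ratio_noncoding = noncoding["polymorphic"] / noncoding["fixed"]
ratio_synonymous = synonymous["polymorphic"] / synonymous["fixed"]
p_value = fisher_exact_two_sided(
    noncoding["polymorphic"], noncoding["fixed"],
    synonymous["polymorphic"], synonymous["fixed"],
)
```

A noncoding ratio well above the synonymous ratio, together with a small p-value, would flag a locus with excess noncoding polymorphism relative to divergence, which is the pattern the abstract describes for MLS1 and PDR10.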
Affiliation(s)
- Justin C Fay
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA.
|
180
|
Woods CG, Bond J, Enard W. Autosomal recessive primary microcephaly (MCPH): a review of clinical, molecular, and evolutionary findings. Am J Hum Genet 2005; 76:717-28. [PMID: 15806441 PMCID: PMC1199363 DOI: 10.1086/429930] [Citation(s) in RCA: 315] [Impact Index Per Article: 15.8] [Received: 12/03/2004] [Accepted: 02/25/2005] [Indexed: 12/24/2022] Open
Abstract
Autosomal recessive primary microcephaly (MCPH) is a neurodevelopmental disorder. It is characterized by two principal features, microcephaly present at birth and nonprogressive mental retardation. The microcephaly is the consequence of a small but architecturally normal brain, and it is the cerebral cortex that shows the greatest size reduction. There are at least seven MCPH loci, and four of the genes have been identified: MCPH1, encoding Microcephalin; MCPH3, encoding CDK5RAP2; MCPH5, encoding ASPM; and MCPH6, encoding CENPJ. These findings are starting to have an impact on the clinical management of families affected with MCPH. Present data suggest that MCPH is the consequence of deficient neurogenesis within the neurogenic epithelium. Evolutionary interest in MCPH has been sparked by the suggestion that changes in the MCPH genes might also be responsible for the increase in brain size during human evolution. Indeed, evolutionary analyses of Microcephalin and ASPM reveal evidence for positive selection during human and great ape evolution. So an understanding of this rare genetic disorder may offer us significant insights into neurogenic mitosis and the evolution of the most striking differences between us and our closest living relatives: brain size and cognitive ability.
Affiliation(s)
- C Geoffrey Woods
- Department of Medical Genetics, Cambridge Institute for Medical Research, Cambridge, United Kingdom.
|
181
|
Savas S, Ahmad MF, Shariff M, Kim DY, Ozcelik H. Candidate nsSNPs that can affect the functions and interactions of cell cycle proteins. Proteins 2004; 58:697-705. [PMID: 15617026 DOI: 10.1002/prot.20367] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
Nonsynonymous single nucleotide polymorphisms (nsSNPs) alter the encoded amino acid sequence, and are thus likely to affect the function of the proteins, and represent potential disease-modifiers. There is an enormous number of nsSNPs in the human population, and the major challenge lies in distinguishing the functionally significant and potentially disease-related ones from the rest. In this study, we analyzed the genetic variations that can alter the functions and the interactions of a group of cell cycle proteins (n = 60) and the proteins interacting with them (n = 26) using computational tools. As a result, we extracted 249 nsSNPs from 77 cell cycle proteins and their interaction partners from public SNP databases. Only 31 (12.4%) of the nsSNPs were validated. The majority (64.5%) of the validated SNPs were rare (minor allele frequencies < 5%). Evolutionary conservation analysis using the SIFT tool suggested that 16.1% of the validated nsSNPs may disrupt the protein function. In addition, 58% of the validated nsSNPs were located in functional protein domains/motifs, which together with the evolutionary conservation analysis enabled us to infer possible biological consequences of the nsSNPs in our set. Our study strongly suggests the presence of naturally occurring genetic variations in the cell cycle proteins that may affect their interactions and functions with possible roles in complex human diseases, such as cancer.
Affiliation(s)
- Sevtap Savas
- Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Toronto, Ontario, Canada
|
182
|
Abstract
With the completion of the human genome sequence and the advent of technologies to study functional aspects of genomes, molecular comparisons between humans and other primates have gained momentum. The comparison of the human genome to the genomes of species closely related to humans allows the identification of genomic features that set primates apart from other mammals and of features that set certain primates, notably humans, apart from other primates. In this article, we review recent progress in these areas with an emphasis on how comparative approaches may be used to identify functionally relevant features unique to the human genome.
Affiliation(s)
- Wolfgang Enard
- Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany.
|