101
|
Harrison PM, Zheng D, Zhang Z, Carriero N, Gerstein M. Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability. Nucleic Acids Res 2005; 33:2374-83. [PMID: 15860774 PMCID: PMC1087782 DOI: 10.1093/nar/gki531] [Citation(s) in RCA: 154] [Impact Index Per Article: 7.7] [Received: 01/19/2005] [Revised: 03/14/2005] [Accepted: 04/04/2005] [Indexed: 01/31/2023]
Abstract
Pseudogenes, in the case of protein-coding genes, are gene copies that have lost the ability to code for a protein; they are typically identified through annotation of disabled, decayed or incomplete protein-coding sequences. Processed pseudogenes (PPsigs) are made through mRNA retrotransposition. There is overwhelming genomic evidence for thousands of human PPsigs and also dozens of human processed genes that comprise complete retrotransposed copies of other genes. Here, we survey for an intermediate entity, the transcribed processed pseudogene (TPPsig), which is disabled but nonetheless transcribed. TPPsigs may affect expression of paralogous genes, as observed in the case of the mouse makorin1-p1 TPPsig. To elucidate their role, we identified human TPPsigs by mapping expressed sequences onto PPsigs and, reciprocally, extracting TPPsigs from known mRNAs. We consider only those PPsigs that are homologous to either non-mammalian eukaryotic proteins or protein domains of known structure, and require detection of identical coding-sequence disablements in both the expressed and genomic sequences. Oligonucleotide microarray data provide further expression verification. Overall, we find 166-233 TPPsigs (approximately 4-6% of PPsigs). Proteins/transcripts with the highest numbers of homologous TPPsigs generally have many homologous PPsigs and are abundantly expressed. TPPsigs are significantly over-represented near both the 5' and 3' ends of genes; this suggests that TPPsigs can be formed through gene-promoter co-option, or intrusion into untranslated regions. However, roughly half of the TPPsigs are located away from genes in the intergenic DNA and thus may be co-opting cryptic promoters of undesignated origin. Furthermore, TPPsigs are unlike other PPsigs and processed genes in the following ways: (i) they do not show a significant tendency to either deposit on or originate from the X chromosome; (ii) only 5% of human TPPsigs have potential orthologs in mouse.
This latter finding indicates that the vast majority of TPPsigs is lineage specific. This is likely linked to well-documented extensive lineage-specific SINE/LINE activity. The list of TPPsigs is available at http://www.biology.mcgill.ca/faculty/harrison/tppg/bppg.tov or http://pseudogene.org.
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University Stewart Biology Building, 1205 Dr. Penfield Avenue, Montreal, Quebec, Canada H3A 1B1.
|
102
|
Boccia A, Petrillo M, di Bernardo D, Guffanti A, Mignone F, Confalonieri S, Luzi L, Pesole G, Paolella G, Ballabio A, Banfi S. DG-CST (Disease Gene Conserved Sequence Tags), a database of human-mouse conserved elements associated to disease genes. Nucleic Acids Res 2005; 33:D505-10. [PMID: 15608249 PMCID: PMC539965 DOI: 10.1093/nar/gki011] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
The identification and study of evolutionarily conserved genomic sequences that surround disease-related genes is a valuable tool to gain insight into the functional role of these genes and to better elucidate the pathogenetic mechanisms of disease. We created the DG-CST (Disease Gene Conserved Sequence Tags) database for the identification and detailed annotation of human–mouse conserved genomic sequences that are localized within or in the vicinity of human disease-related genes. CSTs are defined as sequences that show at least 70% identity between human and mouse over a length of at least 100 bp. The database contains CST data relative to over 1088 genes responsible for monogenetic human genetic diseases or involved in the susceptibility to multifactorial/polygenic diseases. DG-CST is accessible via the internet at http://dgcst.ceinge.unina.it/ and may be searched using both simple and complex queries. A graphic browser allows direct visualization of the CSTs and related annotations within the context of the relative gene and its transcripts.
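The CST criterion stated in this abstract (at least 70% identity over at least 100 bp) can be illustrated with a minimal Python sketch. It assumes two pre-aligned, equal-length, gap-free sequences and a fixed sliding window; the function names `percent_identity` and `conserved_windows` are hypothetical, and this is not the DG-CST pipeline itself.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def conserved_windows(human: str, mouse: str,
                      min_len: int = 100,
                      min_identity: float = 0.70) -> list:
    """Return start offsets of windows of length min_len whose
    identity is at least min_identity (assumed thresholds from the abstract)."""
    if len(human) != len(mouse):
        raise ValueError("sequences must have the same length")
    hits = []
    # range() is empty when the sequences are shorter than one window
    for start in range(len(human) - min_len + 1):
        window_h = human[start:start + min_len]
        window_m = mouse[start:start + min_len]
        if percent_identity(window_h, window_m) >= min_identity:
            hits.append(start)
    return hits
```

Overlapping hits would normally be merged into maximal conserved regions; that step is left out here to keep the sketch short.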
|
103
|
Abstract
This is the year of the chimpanzee genome. Chimpanzee chromosome 22 has been sequenced and soon will be followed by the whole genome, and thousands of chimpanzee cDNA sequences are available for comparative analysis. Not only does this genomic information allow us to identify human-specific changes in particular genes that are potentially under selection, but also to understand molecular evolutionary dynamics characterizing the two most closely related mammalian genomes sequenced so far. Studies comparing gene expression in chimpanzees and other closely related primates reveal significant species differences in brain, liver and fibroblasts. New empirical data, in combination with models of speciation, are giving insight into how humans and chimpanzees speciated.
Affiliation(s)
- Maryellen Ruvolo
- Department of Anthropology, Harvard University, 11 Divinity Avenue, Cambridge, MA 02138, USA.
|
104
|
Flegr J, Hrusková M, Hodný Z, Novotná M, Hanusová J. Body height, body mass index, waist-hip ratio, fluctuating asymmetry and second to fourth digit ratio in subjects with latent toxoplasmosis. Parasitology 2005; 130:621-8. [PMID: 15977898 DOI: 10.1017/s0031182005007316] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Indexed: 11/07/2022]
Abstract
Between 20% and 60% of the population of most countries are infected with the protozoan Toxoplasma gondii. Subjects with clinically asymptomatic life-long latent toxoplasmosis differ from those who are Toxoplasma free in several behavioural parameters. Case-control studies cannot decide whether these differences already existed before infection or whether they were induced by the presence of Toxoplasma in the brain of infected hosts. Here we searched for such morphological differences between Toxoplasma-infected and Toxoplasma-free subjects that could be induced by the parasite (body weight, body height, body mass index, waist-hip ratio), or could rather correlate with their natural resistance to parasitic infection (fluctuating asymmetry, 2D:4D ratio). We found Toxoplasma-infected men to be taller and Toxoplasma-infected men and women to have lower 2D:4D ratios previously reported to be associated with higher pre-natal testosterone levels. The 2D:4D ratio negatively correlated with the level of specific anti-Toxoplasma antibodies in Toxoplasma-free subjects. These results suggest that some of the observed differences between infected and non-infected subjects may have existed before infection and could be caused by the lower natural resistance to Toxoplasma infection in subjects with higher pre-natal testosterone levels.
Affiliation(s)
- J Flegr
- Department of Parasitology, Faculty of Science, Charles University, Prague, Czech Republic.
|
105
|
Postlethwait J, Ruotti V, Carvan MJ, Tonellato PJ. Automated analysis of conserved syntenies for the zebrafish genome. Methods Cell Biol 2005; 77:255-71. [PMID: 15602916 DOI: 10.1016/s0091-679x(04)77014-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 02/12/2023]
Affiliation(s)
- John Postlethwait
- Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403, USA
|
106
|
Ligon AH, Moore SDP, Parisi MA, Mealiffe ME, Harris DJ, Ferguson HL, Quade BJ, Morton CC. Constitutional rearrangement of the architectural factor HMGA2: a novel human phenotype including overgrowth and lipomas. Am J Hum Genet 2005; 76:340-8. [PMID: 15593017 PMCID: PMC1196379 DOI: 10.1086/427565] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Received: 10/21/2004] [Accepted: 11/17/2004] [Indexed: 11/03/2022]
Abstract
Although somatic mutations in a number of genes have been associated with development of human tumors, such as lipomas, relatively few examples exist of germline mutations in these genes. Here we describe an 8-year-old boy who has a de novo pericentric inversion of chromosome 12, with breakpoints at p11.22 and q14.3, and a phenotype including extreme somatic overgrowth, advanced endochondral bone and dental ages, a cerebellar tumor, and multiple lipomas. His chromosomal inversion was found to truncate HMGA2, a gene that encodes an architectural factor involved in the etiology of many benign mesenchymal tumors and that maps to the 12q14.3 breakpoint. Similar truncations of murine Hmga2 in transgenic mice result in somatic overgrowth and, in particular, increased abundance of fat and lipomas, features strikingly similar to those observed in the child. This represents the first report of a constitutional rearrangement affecting HMGA2 and demonstrates the role of this gene in human growth and development. Systematic genetic analysis and clinical studies of this child may offer unique insights into the role of HMGA2 in adipogenesis, osteogenesis, and general growth control.
Affiliation(s)
- Azra H Ligon
- Department of Pathology, Gynecology and Reproductive Biology, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA.
|
107
|
Dermitzakis ET, Reymond A, Antonarakis SE. Conserved non-genic sequences: an unexpected feature of mammalian genomes. Nat Rev Genet 2005; 6:151-7. [PMID: 15716910 DOI: 10.1038/nrg1527] [Citation(s) in RCA: 192] [Impact Index Per Article: 9.6] [Indexed: 01/27/2023]
Abstract
Mammalian genomes contain highly conserved sequences that are not functionally transcribed. These sequences are single copy and comprise approximately 1-2% of the human genome. Evolutionary analysis strongly supports their functional conservation, although their potentially diverse, functional attributes remain unknown. It is likely that genomic variation in conserved non-genic sequences is associated with phenotypic variability and human disorders. So how might their function and contribution to human disorders be examined?
Affiliation(s)
- Emmanouil T Dermitzakis
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
|
108
|
Evidence for widespread degradation of gene control regions in hominid genomes. PLoS Biol 2005; 3:e42. [PMID: 15678168 PMCID: PMC544929 DOI: 10.1371/journal.pbio.0030042] [Citation(s) in RCA: 153] [Impact Index Per Article: 7.7] [Received: 08/30/2004] [Accepted: 12/01/2004] [Indexed: 01/28/2023]
Abstract
Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human–chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees. A comparison of hominid and rodent lineages reveals that the gene control regions of hominids are not conserved and are accumulating mutations, suggesting widespread degradation of the hominid genome.
|
109
|
Abstract
Genome comparisons are behind the powerful new annotation methods being developed to find all human genes, as well as genes from other genomes. Genomes are now frequently being studied in pairs to provide cross-comparison datasets. This 'Noah's Ark' approach often reveals unsuspected genes and may support the deletion of false-positive predictions. Joining mouse and human as the cross-comparison dataset for the first two mammals are: two Drosophila species, D. melanogaster and D. pseudoobscura; two sea squirts, Ciona intestinalis and Ciona savignyi; four yeast (Saccharomyces) species; two nematodes, Caenorhabditis elegans and Caenorhabditis briggsae; and two pufferfish (Takifugu rubripes and Tetraodon nigroviridis). Even genomes like yeast and C. elegans, which have been known for more than five years, are now being significantly improved. Methods developed for yeast or nematodes will now be applied to mouse and human, and soon to additional mammals such as rat and dog, to identify all the mammalian protein-coding genes. Current large disparities between human Unigene predictions (127,835 genes) and gene-scanning methods (45,000 genes) still need to be resolved. This will be the challenge during the next few years.
Affiliation(s)
- David R Nelson
- Department of Molecular Sciences and The UT Center of Excellence in Genomics and Bioinformatics, University of Tennessee, Memphis, Tennessee 38163, USA
- Daniel W Nebert
- Department of Environmental Health and Center for Environmental Genetics (CEG), University of Cincinnati Medical Center, Cincinnati, Ohio 45267-0056, USA
|
110
|
Abstract
The utility of DNA sequence information for phylogenetics and phylogeography is now well known. Rather than attempt to summarize studies addressing this well-demonstrated utility, this chapter focuses on fundamental approaches and techniques that implement the collection of DNA sequence data for comparative phylogenetic purposes in a genomic context (phylogenomics). Whole genome sequencing approaches have changed the way we think about phylogenetics and have opened the way for new perspectives on "old" phylogenetics concerns. Some of these concerns are which gene regions to use and how much sequence information is needed for robust phylogenetic inference. Whole genome sequences of a few animal model organisms have gone a long way to implement approaches to better understand these important phylogenetic concerns. This chapter also addresses how genomics has made it more important for a clear understanding of orthology of gene regions in comparative biology. Finally, genome-enabled technologies that are affecting comparative biology are also discussed.
Affiliation(s)
- Rob DeSalle
- Department of Invertebrate Zoology, American Museum of Natural History, New York, New York 10024, USA
|
111
|
Zaiou M. The future of genetic and genomic medicine in health risk assessment and disease: a path toward individualized medicine. Pharmacogenomics 2005; 6:7-12. [PMID: 15723600 DOI: 10.1517/14622416.6.1.7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Affiliation(s)
- Mohamed Zaiou
- Université Henri Poincaré Nancy I, INSERM U525, Equipe 4, Faculté de Pharmacie, 30 Rue Lionnois, Nancy, France.
|
112
|
Woolfe A, Goodson M, Goode DK, Snell P, McEwen GK, Vavouri T, Smith SF, North P, Callaway H, Kelly K, Walter K, Abnizova I, Gilks W, Edwards YJK, Cooke JE, Elgar G. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol 2005; 3:e7. [PMID: 15630479 PMCID: PMC526512 DOI: 10.1371/journal.pbio.0030007] [Citation(s) in RCA: 685] [Impact Index Per Article: 34.3] [Received: 07/30/2004] [Accepted: 10/21/2004] [Indexed: 02/06/2023]
Abstract
In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH), in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. 
They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development, including many transcription factors. These highly conserved non-coding sequences are likely to form part of the genomic circuitry that uniquely defines vertebrate development.
Affiliation(s)
- Adam Woolfe
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Martin Goodson
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Debbie K Goode
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Phil Snell
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Gayle K McEwen
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Tanya Vavouri
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Sarah F Smith
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Phil North
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Heather Callaway
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Krys Kelly
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Klaudia Walter
- Medical Research Council Biostatistics Unit, Institute of Public Health, Addenbrookes Hospital, Cambridge, United Kingdom
- Irina Abnizova
- Medical Research Council Biostatistics Unit, Institute of Public Health, Addenbrookes Hospital, Cambridge, United Kingdom
- Walter Gilks
- Medical Research Council Biostatistics Unit, Institute of Public Health, Addenbrookes Hospital, Cambridge, United Kingdom
- Yvonne J. K. Edwards
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Julie E Cooke
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
- Greg Elgar
- Medical Research Council Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge, United Kingdom
|
113
|
Antonarakis SE, Lyle R, Dermitzakis ET, Reymond A, Deutsch S. Chromosome 21 and down syndrome: from genomics to pathophysiology. Nat Rev Genet 2004; 5:725-38. [PMID: 15510164 DOI: 10.1038/nrg1448] [Citation(s) in RCA: 457] [Impact Index Per Article: 21.8] [Indexed: 12/12/2022]
Abstract
The sequence of chromosome 21 was a turning point for the understanding of Down syndrome. Comparative genomics is beginning to identify the functional components of the chromosome and that in turn will set the stage for the functional characterization of the sequences. Animal models combined with genome-wide analytical methods have proved indispensable for unravelling the mysteries of gene dosage imbalance.
Affiliation(s)
- Stylianos E Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, 1 rue Michel-Servet, 1211 Geneva, Switzerland.
|
114
|
ABC: software for interactive browsing of genomic multiple sequence alignment data. BMC Bioinformatics 2004; 5:192. [PMID: 15588288 PMCID: PMC539296 DOI: 10.1186/1471-2105-5-192] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 10/22/2004] [Accepted: 12/08/2004] [Indexed: 01/14/2023]
Abstract
Background: Alignment and comparison of related genome sequences is a powerful method to identify regions likely to contain functional elements. Such analyses are data intensive, requiring the inclusion of genomic multiple sequence alignments, sequence annotations, and scores describing regional attributes of columns in the alignment. Visualization and browsing of results can be difficult, and there are currently limited software options for performing this task. Results: The Application for Browsing Constraints (ABC) is interactive Java software for intuitive and efficient exploration of multiple sequence alignments and data typically associated with alignments. It is used to move quickly from a summary view of the entire alignment via arbitrary levels of resolution to individual alignment columns. It allows for the simultaneous display of quantitative data (e.g., sequence similarity or evolutionary rates) and annotation data (e.g., the locations of genes, repeats, and constrained elements). It can be used to facilitate basic comparative sequence tasks, such as export of data in plain-text formats, visualization of phylogenetic trees, and generation of alignment summary graphics. Conclusions: The ABC is a lightweight, stand-alone, and flexible graphical user interface for browsing genomic multiple sequence alignments of specific loci, up to hundreds of kilobases or a few megabases in length. It is coded in Java for cross-platform use and the program and source code are freely available under the General Public License. Documentation and a sample data set are also available.
|
115
|
Das SK, Chu W, Zhang Z, Hasstedt SJ, Elbein SC. Calsquestrin 1 (CASQ1) gene polymorphisms under chromosome 1q21 linkage peak are associated with type 2 diabetes in Northern European Caucasians. Diabetes 2004; 53:3300-6. [PMID: 15561963 DOI: 10.2337/diabetes.53.12.3300] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
Genome-wide scans in multiple populations have identified chromosome 1q21-q24 as one susceptibility region for type 2 diabetes. To map the susceptibility genes, we first placed a dense single nucleotide polymorphism (SNP) map across the linked region. We identified two SNPs that showed strong associations, and both mapped to within intron 2 of the calsequestrin 1 (CASQ1) gene. We tested the hypothesis that sequence variation in or near CASQ1 contributed to type 2 diabetes susceptibility in Northern European Caucasians by identifying additional SNPs from the public database and by screening the CASQ1 gene for additional variation. In addition to 15 known SNPs in this region, we found 8 new SNPs, 3 of which were in exons. A single rare nonsynonymous SNP in exon 11 (A348V) was not associated with type 2 diabetes. The associated SNPs were localized to the region between -1,404 in the 5' flanking region and 2,949 in intron 2 (P = 0.002 to P = 0.034). No SNP 3' to intron 2, including the adjacent gene PEA15, showed an association. The strongest associations were restricted to individuals of Northern European ancestry ascertained in Utah. A six-marker haplotype was also associated with type 2 diabetes (P = 0.008), but neither transmission disequilibrium test nor family-based association studies were significant for the most strongly associated SNP in intron 2 (SNP CASQ2312). An independent association of SNPs in introns 2 and 4 with type 2 diabetes is reported in Amish families with linkage to chromosome 1q21-q24. Our findings suggest that noncoding SNPs in CASQ1 alter diabetes susceptibility, either by a direct effect on CASQ1 gene expression or perhaps by regulating a nearby gene such as PEA15.
Affiliation(s)
- Swapan Kumar Das
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
|
116
|
Affiliation(s)
- Michael W Nachman
- Department of Ecology and Evolutionary Biology, P. O. Box 210088, Biosciences West Building, University of Arizona, Tucson, AZ 85721, USA.
|
117
|
Abstract
The genome of monotremes, like the animals themselves, is unique and strange. The importance of monotremes to genomics depends on their position as the earliest offshoot of the mammalian lineage. Although there has been controversy in the literature over the phylogenetic position of monotremes, this traditional interpretation is now confirmed by recent sequence comparisons. Characterizing the monotreme genome will therefore be important for studying the evolution and organization of the mammalian genome, and the proposal to sequence the platypus genome has been received enthusiastically by the genomics community. Recent investigations of X-chromosome inactivation, genomic imprinting and sex chromosome evolution provide good examples of the power of the monotreme genome to inform us about mammalian genome organization and evolution.
Affiliation(s)
- Frank Grützner
- Research School of Biological Sciences, Australian National University, GPO Box 475, Canberra, Australian Capital Territory 2601, Australia.
|
118
|
Fournier PE, Zhu Y, Ogata H, Raoult D. Use of highly variable intergenic spacer sequences for multispacer typing of Rickettsia conorii strains. J Clin Microbiol 2004; 42:5757-66. [PMID: 15583310 PMCID: PMC535242 DOI: 10.1128/jcm.42.12.5757-5766.2004] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.3] [Received: 04/13/2004] [Revised: 06/28/2004] [Accepted: 08/09/2004] [Indexed: 11/20/2022]
Abstract
By use of the nearly perfectly colinear genomes of Rickettsia conorii and Rickettsia prowazekii, we compared the usefulness of three types of sequences for typing of R. conorii isolates: (i) 5 variable coding genes comprising the 16S ribosomal DNA, gltA, ompB, and sca4 (gene D) genes, which are present in both genomes, and the ompA gene, which is degraded in R. prowazekii; (ii) 28 genes degraded in R. conorii but intact in R. prowazekii, including 23 split and 5 remnant genes; and (iii) 27 conserved and 25 variable intergenic spacers. The 4 conserved and 23 split genes as well as the 27 conserved intergenic spacers each had identical sequences in 34 human and 5 tick isolates of R. conorii. Analysis of the ompA sequences identified three genotypes of R. conorii. The variable intergenic spacers were significantly more variable than conserved genes, split genes, remnant genes, and conserved spacers (P < 10^-2 in all cases). Four of the variable intergenic spacers (dksA-xerC, mppA-purC, rpmE-tRNA(fMet), and tRNA(Gly)-tRNA(Tyr)) had highly variable sequences; when they were combined for typing, multispacer typing (MST) identified 27 different genotypes in the 39 R. conorii isolates. Two batches from the same R. conorii strain, Malish (Seven), with different culture passage histories were found to exhibit the same MST type. MST was more discriminatory for strain genotyping than multiple gene sequencing (P < 10^-2). Phylogenetic analysis based on MST sequences was concordant with the geographic origins of R. conorii isolates. Our study supports the usefulness of MST for strain genotyping. This tool may be useful for tracing a strain and identifying its source during outbreaks, including those resulting from bioterrorism.
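The idea of combining several spacers into one multispacer type can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the function name `multispacer_types`, the isolate names, and the allele labels are all hypothetical, and each isolate is assumed to already carry one allele label per spacer.

```python
from collections import defaultdict


def multispacer_types(isolates, spacers):
    """Group isolates by their combined allele profile across the given spacers."""
    groups = defaultdict(list)
    for name, alleles in isolates.items():
        # The MST type is the ordered tuple of per-spacer alleles
        profile = tuple(alleles[s] for s in spacers)
        groups[profile].append(name)
    return dict(groups)


# Spacer names are taken from the abstract; allele labels are made up.
SPACERS = ["dksA-xerC", "mppA-purC", "rpmE-tRNA(fMet)", "tRNA(Gly)-tRNA(Tyr)"]
ISOLATES = {
    "isolate_1": dict(zip(SPACERS, ["A", "A", "B", "A"])),
    "isolate_2": dict(zip(SPACERS, ["A", "A", "B", "A"])),
    "isolate_3": dict(zip(SPACERS, ["A", "C", "B", "A"])),
}

types = multispacer_types(ISOLATES, SPACERS)
print(len(types), "distinct multispacer types")
```

Two isolates share a type only if they agree at every spacer, which is why combining variable spacers tends to separate more strains than any single locus.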
Affiliation(s)
- Pierre-Edouard Fournier
- Unité des Rickettsies, IFR 48, CNRS UMR 6020, Faculté de Médecine, Université de la Mediterranée, 27 Blvd. Jean Moulin, 13385 Marseille Cedex 5, France
|
119
|
Abstract
Rattus norvegicus is an important experimental organism and interesting to evolutionary biologists. The recently published draft rat genome sequence provides us with insights into both the rat's evolution and its physiology. We learn more about genome evolution and, in particular, the adaptive significance of gene family expansions and the evolution of rodent genomes, which appears to have decelerated since the divergence of mouse and rat. An important observation is that some regions of genomes, many in noncoding regions, show very high sequence conservation, while others show unexpectedly fast evolution. Both of these may be pointers to functional significance.
|
120
|
Abstract
The genomes from three mammals (human, mouse, and rat), two worms, and several yeasts have been sequenced, and more genomes will be completed in the near future for comparison with those of the major model organisms. Scientists have used various methods to align and compare the sequenced genomes to address critical issues in genome function and evolution. This review covers some of the major new insights about gene content, gene regulation, and the fraction of mammalian genomes that are under purifying selection and presumed functional. We review the evolutionary processes that shape genomes, with particular attention to variation in rates within genomes and along different lineages. Internet resources for accessing and analyzing the treasure trove of sequence alignments and annotations are reviewed, and we discuss critical problems to address in new bioinformatic developments in comparative genomics.
Affiliation(s)
- Webb Miller
- The Center for Comparative Genomics and Bioinformatics, The Huck Institutes of Life Sciences, Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA.
|
121
|
Nebert DW, Vesell ES. Advances in pharmacogenomics and individualized drug therapy: exciting challenges that lie ahead. Eur J Pharmacol 2004; 500:267-80. [PMID: 15464039 DOI: 10.1016/j.ejphar.2004.07.031] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.9] [Accepted: 07/01/2004] [Indexed: 12/16/2022]
Abstract
Between the 1930s and 1990s, several dozen predominantly monogenic, high-penetrance disorders involving pharmacogenetics were described, fueling the crusade that gene-drug interactions are quite simple. Then, in 1990, the Human Genome Project was established; in 1995, the term pharmacogenomics was introduced; finally, the complexities of determining an unequivocal phenotype, as well as an unequivocal genotype, have recently become apparent. Since 1965, more than 1000 reviews on this topic have painted an overly optimistic picture, suggesting that the advent of individualized drug therapy used by the practicing physician is fast approaching. For many reasons listed here, however, we emphasize that these high expectations must be tempered. We now realize that the nucleotide sequence of the genome represents only a starting point from which we must proceed to a more difficult stage: knowledge of the function encoded and how this affects the phenotype. To achieve individualized drug therapy, a high level of accuracy and precision is required of any clinical test proposed in human patients. Finally, we suggest that metabonomics, perhaps in combination with proteomics, might complement genomics in eventually helping us to achieve individualized drug therapy.
Affiliation(s)
- Daniel W Nebert
- Division of Human Genetics, Department of Pediatrics and Molecular Developmental Biology, University of Cincinnati Medical Center, P.O. Box 670056, Cincinnati OH 45267-0056, USA.
|
122
|
Mallon AM, Wilming L, Weekes J, Gilbert JGR, Ashurst J, Peyrefitte S, Matthews L, Cadman M, McKeone R, Sellick CA, Arkell R, Botcherby MRM, Strivens MA, Campbell RD, Gregory S, Denny P, Hancock JM, Rogers J, Brown SDM. Organization and evolution of a gene-rich region of the mouse genome: a 12.7-Mb region deleted in the Del(13)Svea36H mouse. Genome Res 2004; 14:1888-901. [PMID: 15364904 PMCID: PMC524412 DOI: 10.1101/gr.2478604] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
Abstract
Del(13)Svea36H (Del36H) is a deletion of approximately 20% of mouse chromosome 13 showing conserved synteny with human chromosome 6p22.1-6p22.3/6p25. The human region is lost in some deletion syndromes and is the site of several disease loci. Heterozygous Del36H mice show numerous phenotypes and may model aspects of human genetic disease. We describe 12.7 Mb of finished, annotated sequence from Del36H. Del36H has a higher gene density than the draft mouse genome, reflecting high local densities of three gene families (vomeronasal receptors, serpins, and prolactins) which are greatly expanded relative to human. Transposable elements are concentrated near these gene families. We therefore suggest that their neighborhoods are gene factories, regions of frequent recombination in which gene duplication is more frequent. The gene families show different proportions of pseudogenes, likely reflecting different strengths of purifying selection and/or gene conversion. They are also associated with relatively low simple sequence concentrations, which vary across the region with a periodicity of approximately 5 Mb. Del36H contains numerous evolutionarily conserved regions (ECRs). Many lie in noncoding regions, are detectable in species as distant as Ciona intestinalis, and therefore are candidate regulatory sequences. This analysis will facilitate functional genomic analysis of Del36H and provides insights into mouse genome evolution.
Affiliation(s)
- Ann-Marie Mallon
- Medical Research Council Mammalian Genetics Unit, Harwell, Oxfordshire, United Kingdom
|
123
|
Suzuki M, Hayashizaki Y. Mouse-centric comparative transcriptomics of protein coding and non-coding RNAs. Bioessays 2004; 26:833-43. [PMID: 15273986 DOI: 10.1002/bies.20084] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.6] [Indexed: 11/11/2022]
Abstract
The largest transcriptome reported so far comprises 60,770 mouse full-length cDNA clones, and is an effective reference data set for comparative transcriptomics. The number of mouse cDNAs identified greatly exceeds the number of genes predicted from the sequenced human and mouse genomes. This is largely because of extensive alternative splicing and the presence of many non-coding RNAs (ncRNAs), which are difficult to predict from genomic sequences. Notably, ncRNAs are a major component of the transcriptomes of higher organisms, and many sense-antisense pairs have been identified. The ncRNAs function in a range of regulatory mechanisms for gene expression and other biological processes. They might also have contributed to the increased functional diversification of genomes during evolution. In this review, we discuss aspects of the transcriptome of various organisms in relation to the mouse data, in order to shed light on the regulatory mechanisms and physiological significance of these abundant RNAs.
Affiliation(s)
- Masanori Suzuki
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Kanagawa, Japan
|
124
|
Hammond CJ, Andrew T, Mak YT, Spector TD. A susceptibility locus for myopia in the normal population is linked to the PAX6 gene region on chromosome 11: a genomewide scan of dizygotic twins. Am J Hum Genet 2004; 75:294-304. [PMID: 15307048 PMCID: PMC1216063 DOI: 10.1086/423148] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.2] [Received: 03/24/2004] [Accepted: 06/07/2004] [Indexed: 11/03/2022] Open
Abstract
Myopia is a common, complex trait with considerable economic and social impact and, in highly affected individuals, ocular morbidity. We performed a classic twin study of 506 unselected twin pairs and inferred the heritability of refractive error to be 0.89 (95% confidence interval 0.86-0.91). A genomewide scan of 221 dizygotic twin pairs, analyzed by use of optimal Haseman-Elston regression methods implemented by use of generalized linear modeling, showed significant linkage (LOD >3.2) to refractive error at four loci, with a maximum LOD score of 6.1 at 40 cM on chromosome 11p13. Evidence of linkage at this locus, as well as at the other linkage peaks at chromosomes 3q26 (LOD 3.7), 8p23 (LOD 4.1), and 4q12 (LOD 3.3), remained the same or became stronger after model fit was checked and outliers were downweighted. Examination of potential candidate genes showed the PAX6 gene directly below the highest peak at the 11p13 locus. PAX6 is fundamental to identity and growth of the eye, but reported mutations usually result in catastrophic congenital phenotypes such as aniridia. Haplotype tagging of 17 single-nucleotide polymorphisms (SNPs), which covered the PAX6 gene and had common minor allele frequencies, identified 5 SNPs that explained 0.999 of the haplotype diversity. Linkage and association analysis of the tagging SNPs showed strong evidence of linkage for all markers with a minimum χ²₁ of 7.5 (P=.006) but no association. This suggests that PAX6 may play a role in myopia development, possibly because of genetic variation in an upstream promoter or regulator, although no definite association between PAX6 common variants and myopia was demonstrated in this study.
Affiliation(s)
- Christopher J Hammond
- Twin Research and Genetic Epidemiology Unit, St. Thomas' Hospital, London, and West Kent Eye Center, Princess Royal University Hospital, Orpington, United Kingdom.
|
125
|
Abstract
Sudden cardiac death (SCD) remains a public health problem of major magnitude. Contrary to earlier expectations, and despite decreased overall cardiac mortality, SCD rates appear to be rising in concert with escalating global prevalence of coronary disease and heart failure, the two major conditions predisposing to SCD. With the exception of the implantable defibrillator, there are few effective approaches to SCD prevention and even fewer clues concerning patient phenotypes predisposed to life-threatening arrhythmias. Clinical variables such as ejection fraction predict mortality but are not sensitive enough to identify many high SCD risk patients. The predictive power of autonomic dysregulation and markers such as lipid levels, hypertension, diabetes, and smoking is quite low in subclinical heart disease, the population in which the majority of SCDs occur. This review addresses advances in genomic science applicable to the SCD public health problem in both rare and common forms of heart disease. These include novel bioinformatic approaches to both identify candidate genes/pathways and identify previously unknown functional genetic elements, as well as methods to comprehensively screen these elements. We also discuss the possibility of applying high-density genome-wide SNP analyses to examine genetic contributions to arrhythmia susceptibility in community-based, case-control studies of common forms of SCD. The development of novel strategies to identify contributors to susceptibility in common cardiac phenotypes is most likely to lead to new and relevant therapeutic targets for SCD.
Affiliation(s)
- Dan E Arking
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N Broadway, Room 580, Baltimore, Md 21205, USA.
|
126
|
Boffelli D, Nobrega MA, Rubin EM. Comparative genomics at the vertebrate extremes. Nat Rev Genet 2004; 5:456-65. [PMID: 15153998 DOI: 10.1038/nrg1350] [Citation(s) in RCA: 190] [Impact Index Per Article: 9.0] [Indexed: 02/02/2023]
Affiliation(s)
- Dario Boffelli
- DOE Joint Genome Institute, Walnut Creek, California 94598, USA
|
127
|
Sémon M, Duret L. Evidence that functional transcription units cover at least half of the human genome. Trends Genet 2004; 20:229-32. [PMID: 15109775 DOI: 10.1016/j.tig.2004.03.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 10/26/2022]
Abstract
Transcriptome analyses have revealed that a large proportion of the human genome is transcribed. However, many of these transcripts might be functionless. To distinguish functional transcription units (FTUs) from spurious transcripts, we searched for the hallmarks of selective pressure against mutations that impair transcription. We analyzed the distribution of transposable elements, which are counter selected within FTUs. We show that these features are sufficiently informative to predict whether a sequence is transcribed and, if transcribed, in which orientation. Our results indicate that FTUs constitute at least 50% of the genome and that approximately one-third of these transcripts apparently do not encode proteins.
Affiliation(s)
- Marie Sémon
- Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558 Université Claude Bernard Lyon 1, 16 rue Raphaël Dubois, 69622 Villeurbanne Cedex, France
|
128
|
Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D. Ultraconserved elements in the human genome. Science 2004; 304:1321-5. [PMID: 15131266 DOI: 10.1126/science.1098119] [Citation(s) in RCA: 1222] [Impact Index Per Article: 58.2] [Indexed: 01/15/2023]
Abstract
There are 481 segments longer than 200 base pairs (bp) that are absolutely conserved (100% identity with no insertions or deletions) between orthologous regions of the human, rat, and mouse genomes. Nearly all of these segments are also conserved in the chicken and dog genomes, with an average of 95 and 99% identity, respectively. Many are also significantly conserved in fish. These ultraconserved elements of the human genome are most often located either overlapping exons in genes involved in RNA processing or in introns or nearby genes involved in the regulation of transcription and development. Along with more than 5000 sequences of over 100 bp that are absolutely conserved among the three sequenced mammals, these represent a class of genetic elements whose functions and evolutionary origins are yet to be determined, but which are more highly conserved between these species than are proteins and appear to be essential for the ontogeny of mammals and other vertebrates.
Affiliation(s)
- Gill Bejerano
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
|
129
|
Bofkin L, Whelan S. Comparative genomics: Functional needles in a genomic haystack. Heredity (Edinb) 2004; 92:363-4. [PMID: 15107806 DOI: 10.1038/sj.hdy.6800429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
|
130
|
Dermitzakis ET, Kirkness E, Schwarz S, Birney E, Reymond A, Antonarakis SE. Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. Genome Res 2004; 14:852-9. [PMID: 15078857 PMCID: PMC479112 DOI: 10.1101/gr.1934904] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.4] [Indexed: 11/24/2022]
Abstract
The analysis of conservation between the human and mouse genomes resulted in the identification of a large number of conserved nongenic sequences (CNGs). The functional significance of this nongenic conservation remains unknown, however. The availability of the sequence of a third mammalian genome, the dog, allows for a large-scale analysis of evolutionary attributes of CNGs in mammals. We have aligned 1638 previously identified CNGs and 976 conserved exons (CODs) from human chromosome 21 (Hsa21) with their orthologous sequences in mouse and dog. Attributes of selective constraint, such as sequence conservation, clustering, and direction of substitutions were compared between CNGs and CODs, showing a clear distinction between the two classes. We subsequently performed a chromosome-wide analysis of CNGs by correlating selective constraint metrics with their position on the chromosome and relative to their distance from genes. We found that CNGs appear to be randomly arranged in intergenic regions, with no bias to be closer or farther from genes. Moreover, conservation and clustering of substitutions of CNGs appear to be completely independent of their distance from genes. These results suggest that the majority of CNGs are not typical of previously described regulatory elements in terms of their location. We propose models for a global role of CNGs in genome function and regulation, through long-distance cis or trans chromosomal interactions.
Affiliation(s)
- Emmanouil T Dermitzakis
- Division of Medical Genetics, University of Geneva Medical School, CH-1211 Geneva, Switzerland.
|
131
|
Abstract
A number of recent studies have indicated that the location of a given mammalian chromosome within the interphase nucleus is related to its size, whereas other work has implicated a chromosome's gene density as a factor. Recent investigations of the degree to which an ordered arrangement of mitotic chromosomes on the metaphase plate is inherited and perpetuated during successive cell cycles have also yielded somewhat controversial results. The arrangement of chromosomes in the nucleus also has been investigated by the analysis of chromosomal translocations, with some surprising recent findings.
Affiliation(s)
- Thoru Pederson
- Program in Cell Dynamics and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
|
132
|
Gibbs RA, Weinstock GM, Metzker ML, Muzny DM, Sodergren EJ, Scherer S, Scott G, Steffen D, Worley KC, Burch PE, Okwuonu G, Hines S, Lewis L, DeRamo C, Delgado O, Dugan-Rocha S, Miner G, Morgan M, Hawes A, Gill R, Celera, Holt RA, Adams MD, Amanatides PG, Baden-Tillson H, Barnstead M, Chin S, Evans CA, Ferriera S, Fosler C, Glodek A, Gu Z, Jennings D, Kraft CL, Nguyen T, Pfannkoch CM, Sitter C, Sutton GG, Venter JC, Woodage T, Smith D, Lee HM, Gustafson E, Cahill P, Kana A, Doucette-Stamm L, Weinstock K, Fechtel K, Weiss RB, Dunn DM, Green ED, Blakesley RW, Bouffard GG, De Jong PJ, Osoegawa K, Zhu B, Marra M, Schein J, Bosdet I, Fjell C, Jones S, Krzywinski M, Mathewson C, Siddiqui A, Wye N, McPherson J, Zhao S, Fraser CM, Shetty J, Shatsman S, Geer K, Chen Y, Abramzon S, Nierman WC, Havlak PH, Chen R, Durbin KJ, Simons R, Ren Y, Song XZ, Li B, Liu Y, Qin X, Cawley S, Worley KC, Cooney AJ, D'Souza LM, Martin K, Wu JQ, Gonzalez-Garay ML, Jackson AR, Kalafus KJ, McLeod MP, Milosavljevic A, Virk D, Volkov A, Wheeler DA, Zhang Z, Bailey JA, Eichler EE, Tuzun E, Birney E, Mongin E, Ureta-Vidal A, Woodwark C, Zdobnov E, Bork P, Suyama M, Torrents D, Alexandersson M, Trask BJ, Young JM, Huang H, Wang H, Xing H, Daniels S, Gietzen D, Schmidt J, Stevens K, Vitt U, Wingrove J, Camara F, Mar Albà M, Abril JF, Guigo R, Smit A, Dubchak I, Rubin EM, Couronne O, Poliakov A, Hübner N, Ganten D, Goesele C, Hummel O, Kreitler T, Lee YA, Monti J, Schulz H, Zimdahl H, Himmelbauer H, Lehrach H, Jacob HJ, Bromberg S, Gullings-Handley J, Jensen-Seaman MI, Kwitek AE, Lazar J, Pasko D, Tonellato PJ, Twigger S, Ponting CP, Duarte JM, Rice S, Goodstadt L, Beatson SA, Emes RD, Winter EE, Webber C, Brandt P, Nyakatura G, Adetobi M, Chiaromonte F, Elnitski L, Eswara P, Hardison RC, Hou M, Kolbe D, Makova K, Miller W, Nekrutenko A, Riemer C, Schwartz S, Taylor J, Yang S, Zhang Y, Lindpaintner K, Andrews TD, Caccamo M, Clamp M, Clarke L, Curwen V, Durbin R, Eyras E, Searle SM, Cooper GM, Batzoglou S, Brudno M, Sidow A, Stone EA, Venter JC, Payseur BA, Bourque G, López-Otín C, Puente XS, Chakrabarti K, Chatterji S, Dewey C, Pachter L, Bray N, Yap VB, Caspi A, Tesler G, Pevzner PA, Haussler D, Roskin KM, Baertsch R, Clawson H, Furey TS, Hinrichs AS, Karolchik D, Kent WJ, Rosenbloom KR, Trumbower H, Weirauch M, Cooper DN, Stenson PD, Ma B, Brent M, Arumugam M, Shteynberg D, Copley RR, Taylor MS, Riethman H, Mudunuri U, Peterson J, Guyer M, Felsenfeld A, Old S, Mockrin S, Collins F. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 2004; 428:493-521. [PMID: 15057822 DOI: 10.1038/nature02426] [Citation(s) in RCA: 1557] [Impact Index Per Article: 74.1] [Received: 12/31/2003] [Accepted: 02/20/2004] [Indexed: 01/16/2023]
Abstract
The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.
Affiliation(s)
- Richard A Gibbs
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, MS BCM226, One Baylor Plaza, Houston, Texas 77030, USA. http://www.hgsc.bcm.tmc.edu
|
133
|
Shabalina SA, Ogurtsov AY, Rogozin IB, Koonin EV, Lipman DJ. Comparative analysis of orthologous eukaryotic mRNAs: potential hidden functional signals. Nucleic Acids Res 2004; 32:1774-82. [PMID: 15031317 PMCID: PMC390323 DOI: 10.1093/nar/gkh313] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.5] [Indexed: 11/15/2022] Open
Abstract
Sequencing of multiple, nearly complete eukaryotic genomes creates opportunities for detecting previously unnoticed, subtle functional signals in non-coding regions. A genome-wide comparative analysis of orthologous sets of mammalian and yeast mRNAs revealed distinct patterns of evolutionary conservation at the boundaries of the untranslated regions (UTRs) and the coding region (CDS). Elevated sequence conservation was detected in approximately 30 nt regions around the start codon. There seems to be a complementary relationship between sequence conservation in the approximately 30 nt regions of the 5'-UTR immediately upstream of the start codon and that in the synonymous positions of the 5'-terminal 30 nt of the CDS: in mammalian mRNAs, the 5'-UTR shows a greater conservation than the CDS, whereas the opposite trend holds for yeast mRNAs. Unexpectedly, an approximately 30 nt region downstream of the stop codon shows a substantially lower level of sequence conservation than the downstream portions of the 3'-UTRs. However, the sequence in this poorly conserved 30 nt portion of the 3'-UTR is non-random in that it has a higher GC content than the rest of the UTR. It is hypothesized that the elevated sequence conservation in the region immediately upstream of the start codon is related to the requirement for initiation factor binding during pre-initiation ribosomal scanning. In contrast, the poorly conserved region downstream of the stop codon could be involved in the post-termination scanning and dissociation of the ribosomes from the mRNA, which requires only the mRNA-ribosome interaction. Additionally, it was found that the choice of the stop codon in mammals, but not in yeasts, and the context in the immediate vicinity of the stop codons in both mammals and yeasts are subject to strong selection. Thus, genome-wide analysis of orthologous gene sets allows detection of previously unrecognized patterns of sequence conservation, which are likely to reflect hidden functional signals, such as ribosomal filters that could regulate translation by modulating the interaction between the mRNA and ribosomes.
Affiliation(s)
- Svetlana A Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
134
|
Castrillo JI, Oliver SG. Yeast as a Touchstone in Post-genomic Research: Strategies for Integrative Analysis in Functional Genomics. BMB Rep 2004; 37:93-106. [PMID: 14761307 DOI: 10.5483/bmbrep.2004.37.1.093] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Indexed: 01/06/2023] Open
Abstract
The new complexity arising from the genome sequencing projects requires new comprehensive post-genomic strategies: advanced studies in regulatory mechanisms, application of new high-throughput technologies at a genome-wide scale, at the different levels of cellular complexity (genome, transcriptome, proteome and metabolome), efficient analysis of the results, and application of new bioinformatic methods in an integrative or systems biology perspective. This can be accomplished in studies with model organisms under controlled conditions. In this review a perspective of the favourable characteristics of yeast as a touchstone model in post-genomic research is presented. The state of the art, latest advances in the field and bottlenecks, new strategies, new regulatory mechanisms, applications (patents) and high-throughput technologies, most of them being developed and validated in yeast, are presented. The optimal characteristics of yeast as a well-defined system for comprehensive studies under controlled conditions make it a perfect model to be used in integrative, "systems biology" studies to get new insights into the mechanisms of regulation (regulatory networks) responsible for specific phenotypes under particular environmental conditions, to be applied to more complex organisms (e.g. plants, human).
Affiliation(s)
- Juan I Castrillo
- School of Biological Sciences, University of Manchester, 2205 Stopford Building, Oxford Road, Manchester M13 9PT, UK.
|
135
|
|
136
|
Affiliation(s)
- Greg Gibson
- Department of Genetics, North Carolina State University, Raleigh, NC 27695-76214, USA.
|
137
|
Affiliation(s)
- Mark Johnston
- Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA.
|