1
Niu H, Zhu B, Guo P, Zhang W, Xue J, Chen Y, Zhang L, Gao H, Gao X, Xu L, Li J. Estimation of linkage disequilibrium levels and haplotype block structure in Chinese Simmental and Wagyu beef cattle using high-density genotypes. Livest Sci 2016. [DOI: 10.1016/j.livsci.2016.05.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
2
Mokry FB, Buzanskas ME, de Alvarenga Mudadu M, do Amaral Grossi D, Higa RH, Ventura RV, de Lima AO, Sargolzaei M, Conceição Meirelles SL, Schenkel FS, da Silva MVGB, Méo Niciura SC, de Alencar MM, Munari D, de Almeida Regitano LC. Linkage disequilibrium and haplotype block structure in a composite beef cattle breed. BMC Genomics 2014; 15 Suppl 7:S6. [PMID: 25573652 PMCID: PMC4243187 DOI: 10.1186/1471-2164-15-s7-s6] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Indexed: 11/28/2022]
Abstract
Background: The development of linkage disequilibrium (LD) maps and the characterization of haplotype block structure at the population level are useful for guiding genome-wide association (GWA) studies and for understanding the nature of non-linear association between phenotypes and genes. Elucidating haplotype block structure can condense the information from several single nucleotide polymorphisms (SNPs) into that of a single haplotype block, coherently reducing the number of SNPs considered in GWA and genomic selection studies. Results: The maximum average LD, measured by r², ranged from 0.33 to 0.40 at distances < 2.5 kb, and the minimum average r² ranged from 0.05 to 0.07 at distances of 400 to 500 kb, showing that average r² decreases as the distance between SNP pairs increases. The persistence of LD phase was higher at shorter genomic distances and decreased with physical distance, from 0.96 at < 2.5 kb to 0.66 at 400 to 500 kb. A total of 78% of all SNPs were clustered into haplotype blocks, covering 1,57 Mb of the total autosomal genome size. Conclusions: This study presents the first high-density linkage disequilibrium map and haplotype block structure for a composite beef cattle population, and indicates that a high-density SNP panel of over 700 k markers can be used for genomic selection and GWA studies in Canchim beef cattle.
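For readers unfamiliar with the r² statistic reported in this abstract, the following is a minimal sketch of how r² can be computed for one pair of biallelic SNPs from phased haplotypes coded 0/1. The function name `ld_r2` and the input layout are assumptions made for illustration; the sketch is not taken from the cited study, which may have used different software and handled unphased genotypes differently.

```python
from typing import Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic loci given phased haplotypes coded 0/1.

    hap_a[i] and hap_b[i] are the alleles carried by haplotype i at locus A
    and locus B (hypothetical input layout, for illustration only).
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n  # frequency of allele 1 at locus B
    # Frequency of haplotypes carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom
```

Averaging such pairwise values within bins of physical distance (for example < 2.5 kb or 400 to 500 kb) gives a decay curve of the kind the abstract summarizes.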
3
Veroneze R, Lopes PS, Guimarães SEF, Silva FF, Lopes MS, Harlizius B, Knol EF. Linkage disequilibrium and haplotype block structure in six commercial pig lines. J Anim Sci 2013; 91:3493-501. [DOI: 10.2527/jas.2012-6052] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Indexed: 11/13/2022]
Affiliation(s)
- R. Veroneze
- Departamento de Zootecnia, Universidade Federal de Viçosa, Av. P.H. Rolfs, 36570-000, Viçosa, MG, Brazil
- P. S. Lopes
- Departamento de Zootecnia, Universidade Federal de Viçosa, Av. P.H. Rolfs, 36570-000, Viçosa, MG, Brazil
- S. E. F. Guimarães
- Departamento de Zootecnia, Universidade Federal de Viçosa, Av. P.H. Rolfs, 36570-000, Viçosa, MG, Brazil
- F. F. Silva
- Departamento de Estatística, Universidade Federal de Viçosa, Av. P.H. Rolfs, 36570-000, Viçosa, MG, Brazil
- M. S. Lopes
- TOPIGS Research Center IPG, PO Box 43, 6640 AA, Beuningen, The Netherlands
- B. Harlizius
- TOPIGS Research Center IPG, PO Box 43, 6640 AA, Beuningen, The Netherlands
- E. F. Knol
- TOPIGS Research Center IPG, PO Box 43, 6640 AA, Beuningen, The Netherlands
4
Abstract
We consider recombinant inbred lines obtained by crossing two given homozygous parents and then applying multiple generations of self-crossings or full-sib matings. The chromosomal content of any such line forms a mosaic of blocks, each inherited identically by descent from one parent or the other in alternation. Quantifying the statistical properties of such mosaic genomes has remained an open challenge for many years. Here, we solve this problem by taking a continuous chromosome picture and assuming crossovers to be noninterfering. Using a continuous-time random walk framework and Markov chain theory, we determine the statistical properties of these identical-by-descent blocks. We find that successive block lengths are only very slightly correlated. Furthermore, the blocks on the ends of chromosomes are larger on average than the others, a feature understandable from the nonexponential distribution of block lengths.
5
Wang HY, Greenawalt D, Cui X, Tereshchenko IV, Luo M, Yang Q, Azaro MA, Hu G, Chu Y, Li JY, Shen L, Lin Y, Zhang L, Li H. Identification of possible genetic alterations in the breast cancer cell line MCF-7 using high-density SNP genotyping microarray. J Carcinog 2011; 8:6. [PMID: 19439911 PMCID: PMC2687141 DOI: 10.4103/1477-3163.50886] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/24/2023]
Abstract
Context: Cancer cell lines are used extensively in research. Knowledge of genetic alterations in these lines is important for understanding the mechanisms underlying their biology. However, since paired normal tissues are usually unavailable for comparison, precisely determining genetic alterations in cancer cell lines is difficult. To address this issue, a highly efficient and reliable method was developed. Aims: To establish a highly efficient and reliable experimental system for genetic profiling of cell lines. Materials and Methods: A widely used breast cancer cell line, MCF-7, was genetically profiled with 4,396 single nucleotide polymorphisms (SNPs) spanning 11 whole chromosomes and two other small regions using a newly developed high-throughput multiplex genotyping approach. Results: The fraction of homozygous SNPs in MCF-7 (13.3%) was significantly lower than those in the control cell line and in 24 normal human individuals (25.1% and 27.4%, respectively). Homozygous SNPs in MCF-7 were found in clusters. The sizes of these clusters were significantly larger than expected under random allelic combination. Fourteen such regions were found on chromosomes 1p, 1q, 2q, 6q, 13, 15q, 16q, 17q and 18p in MCF-7 and two in the small regions. Conclusions: These results are generally concordant with those obtained using different approaches but define the chromosomal positions more precisely. The approach used provides a reliable way to detect possible genetic alterations in cancer cell lines without paired normal tissues.
Affiliation(s)
- Hui-Yun Wang
- Department of Molecular Genetics, Microbiology and Immunology/The Cancer Institute of New Jersey, Piscataway, New Jersey, 08854, USA
6
Haerian BS, Lim KS, Tan CT, Raymond AA, Mohamed Z. Association of ABCB1 gene polymorphisms and their haplotypes with response to antiepileptic drugs: a systematic review and meta-analysis. Pharmacogenomics 2011; 12:713-25. [PMID: 21391884 DOI: 10.2217/pgs.10.212] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 11/21/2022]
Abstract
AIMS: Several studies demonstrated a link between ABCB1 gene variants and the response to treatment in epilepsy, but the results have been inconclusive. Here, we performed the first haplotype meta-analysis to examine the association of haplotypes of ABCB1 common variants with the response to treatment in epilepsy. MATERIALS & METHODS: We meta-analyzed the studies that evaluated the role of ABCB1 C1236T, G2677T/A and C3435T polymorphisms and their haplotypes in the response to treatment. RESULTS: Meta-analysis of 23 studies (7067 patients) showed no significant association of ABCB1 alleles, genotypes and haplotypes with the response to treatment in the overall population or in each ethnicity subgroup. CONCLUSION: Our data suggest that the haplotypes of these loci may not be involved in the response to treatment.
Affiliation(s)
- Batoul Sadat Haerian
- Pharmacogenomics Laboratory, Department of Pharmacology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia.
7
Pramanik S, Cui X, Wang HY, Chimge NO, Hu G, Shen L, Gao R, Li H. Segmental duplication as one of the driving forces underlying the diversity of the human immunoglobulin heavy chain variable gene region. BMC Genomics 2011; 12:78. [PMID: 21272357 PMCID: PMC3042411 DOI: 10.1186/1471-2164-12-78] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 06/23/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022]
Abstract
Background: Segmental duplication and deletion were implicated for a region containing the human immunoglobulin heavy chain variable (IGHV) gene segments, 1.9III/hv3005 (possible allelic variants of IGHV3-30) and hv3019b9 (a possible allelic variant of IGHV3-33). However, very little is known about the ranges of the duplication and the polymorphic region. This is mainly because of the difficulty associated with distinguishing between allelic and paralogous sequences in the IGHV region containing extensive repetitive sequences. Inability to separate the two parental haploid genomes in the subjects is another serious barrier. To address these issues, unique DNA sequence tags evenly distributed within and flanking the duplicated region implicated by the previous studies were selected. The selected tags in single sperm from six unrelated healthy donors were amplified by multiplex PCR followed by microarray detection. In this way, individual haplotypes of different parental origins in the sperm donors could be analyzed separately and precisely. The identified polymorphic region was further analyzed at the nucleotide sequence level using sequences from the three human genomic sequence assemblies in the database. Results: A large polymorphic region was identified using the selected sequence tags. Four of the 12 haplotypes were shown to contain consecutively undetectable tags spanning a variable range. Detailed analysis of sequences from the genomic sequence assemblies revealed two large duplicate sequence blocks of 24,696 bp and 24,387 bp, respectively, and an incomplete copy of 961 bp in this region. It contains up to 13 IGHV gene segments depending on haplotypes. A polymorphic region was found to be located within the duplicated blocks. The variants of this polymorphism unusually diverged at the nucleotide sequence level and in IGHV gene segment number, composition and organization, indicating a limited selection pressure in general. However, the divergence level within the gene segments is significantly different from that in the intergenic regions, indicating that these regions may have been subject to different selection pressures and that the IGHV gene segments in this region are functionally important. Conclusions: Non-reciprocal genetic rearrangements associated with large duplicate sequence blocks could substantially contribute to the IGHV region diversity. Since the resulting polymorphisms may affect the number, composition and organization of the gene segments in this region, they may have significant impact on the function of the IGHV gene segment repertoire, antibody diversity, and therefore, the immune system. Because one of the gene segments, 3-30 (1.9III), is associated with autoimmune diseases, it could be of diagnostic significance to learn about the variants in the haplotypes by using the multiplex haplotype analysis system used in the present study with DNA sequence tags specific for the variants of all gene segments in this region.
Affiliation(s)
- Sreemanta Pramanik
- Department of Molecular Genetics, Microbiology, and Immunology, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, NJ 08854, USA
8
Khil PP, Camerini-Otero RD. Genetic crossovers are predicted accurately by the computed human recombination map. PLoS Genet 2010; 6:e1000831. [PMID: 20126534 PMCID: PMC2813264 DOI: 10.1371/journal.pgen.1000831] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 07/21/2009] [Accepted: 12/28/2009] [Indexed: 11/26/2022]
Abstract
Hotspots of meiotic recombination can change rapidly over time. This instability and the reported high level of inter-individual variation in meiotic recombination put in question the accuracy of the calculated hotspot map, which is based on the summation of past genetic crossovers. To estimate the accuracy of the computed recombination rate map, we have mapped genetic crossovers to a median resolution of 70 Kb in 10 CEPH pedigrees. We then compared the positions of crossovers with the hotspots computed from HapMap data and performed extensive computer simulations to compare the observed distributions of crossovers with the distributions expected from the calculated recombination rate maps. Here we show that a population-averaged hotspot map computed from linkage disequilibrium data predicts present-day genetic crossovers well. We find that computed hotspot maps accurately estimate both the strength and the position of meiotic hotspots. An in-depth examination of not-predicted crossovers shows that they are preferentially located in regions where hotspots are found in other populations. In summary, we find that by combining several computed population-specific maps we can capture the variation in individual hotspots to generate a hotspot map that can predict almost all present-day genetic crossovers.
In eukaryotes genetic crossovers are responsible for generating genetic diversity and ensuring the proper segregation of chromosomes. Genetic crossovers are tightly clustered in hotspots. Although the existence of hotspots in humans is clearly proven, mechanisms of their formation and the regulation of meiotic recombination in general remain poorly understood. An additional complication in studies of meiotic recombination is the fact that the direct experimental mapping of human hotspots on a genome-wide scale is not feasible with current methods. The best available indirect methods compute the position of hotspots from patterns of historic associations between genetic markers in population samples. In this study we determined the positions of genetic crossovers in ten pedigrees of European origin and then compared the positions of crossovers with the hotspots computed from HapMap data. Importantly, we find that the population-averaged computed map is in close agreement with the observed distribution of genetic crossovers. We also find that cryptic hotspots that are not easily detected in the computed European map can be more effectively identified if other populations are included in the analysis. Our analysis shows that high-resolution recombination profiles are highly similar between distantly related populations and that by including computed hotspots from several populations we can predict nearly all crossovers.
Affiliation(s)
- Pavel P. Khil
- Genetics and Biochemistry Branch, The National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- R. Daniel Camerini-Otero
- Genetics and Biochemistry Branch, The National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
9
Mets DG, Meyer BJ. Condensins regulate meiotic DNA break distribution, thus crossover frequency, by controlling chromosome structure. Cell 2009; 139:73-86. [PMID: 19781752 DOI: 10.1016/j.cell.2009.07.035] [Citation(s) in RCA: 187] [Impact Index Per Article: 12.5] [Received: 11/13/2008] [Revised: 05/17/2009] [Accepted: 07/08/2009] [Indexed: 12/27/2022]
Abstract
Meiotic crossover (CO) recombination facilitates evolution and accurate chromosome segregation. CO distribution is tightly regulated: homolog pairs receive at least one CO, CO spacing is nonrandom, and COs occur preferentially in short genomic intervals called hotspots. We show that CO number and distribution are controlled on a chromosome-wide basis at the level of DNA double-strand break (DSB) formation by a condensin complex composed of subunits from two known condensins: the C. elegans dosage compensation complex and mitotic condensin II. Disruption of any subunit of the CO-controlling condensin dominantly changes DSB distribution, and thereby COs, and extends meiotic chromosome axes. These phenotypes are cosuppressed by disruption of a chromosome axis element. Our data implicate higher-order chromosome structure in the regulation of CO recombination, provide a model for the rapid evolution of CO hotspots, and show that reshuffling of interchangeable molecular parts can create independent machines with similar architectures but distinct biological functions.
Affiliation(s)
- David G Mets
- Howard Hughes Medical Institute, University of California-Berkeley, Berkeley, CA 94720-3204, USA
10
Luo M, Cui X, Fredman D, Brookes AJ, Azaro MA, Greenawalt DM, Hu G, Wang HY, Tereshchenko IV, Lin Y, Shentu Y, Gao R, Shen L, Li H. Genetic structures of copy number variants revealed by genotyping single sperm. PLoS One 2009; 4:e5236. [PMID: 19384415 PMCID: PMC2668179 DOI: 10.1371/journal.pone.0005236] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/19/2008] [Accepted: 03/12/2009] [Indexed: 11/19/2022]
Abstract
Background: Copy number variants (CNVs) occupy a significant portion of the human genome and may have important roles in meiotic recombination, human genome evolution and gene expression. Many genetic diseases may be underlain by CNVs. However, because of the presence of their multiple copies, variability in copy numbers and the diploidy of the human genome, detailed genetic structure of CNVs cannot be readily studied by available techniques. Methodology/Principal Findings: Single sperm samples were used as the primary subjects for the study so that CNV haplotypes in the sperm donors could be studied individually. Forty-eight CNVs characterized in a previous study were analyzed using a microarray-based high-throughput genotyping method after multiplex amplification. Seventeen single nucleotide polymorphisms (SNPs) were also included as controls. Two single-base variants, either allelic or paralogous, could be discriminated for all markers. Microarray data were used to resolve SNP alleles and CNV haplotypes, and to quantitatively assess the numbers and compositions of the paralogous segments in each CNV haplotype. Conclusions/Significance: This is the first study of the genetic structure of CNVs on a large scale. Resulting information may help understand evolution of the human genome, gain insight into many genetic processes, and discriminate between CNVs and SNPs. The highly sensitive high-throughput experimental system with haploid sperm samples as subjects may be used to facilitate detailed large-scale CNV analysis.
Affiliation(s)
- Minjie Luo
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Xiangfeng Cui
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- David Fredman
- Bergen Center for Computational Science, University of Bergen, Bergen, Norway
- Anthony J. Brookes
- Department of Genetics, University of Leicester, Leicester, United Kingdom
- Marco A. Azaro
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Danielle M. Greenawalt
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Guohong Hu
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Hui-Yun Wang
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Irina V. Tereshchenko
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Yong Lin
- Department of Biometry, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Yue Shentu
- Department of Statistics, Rutgers University, Hill Center for the Mathematical Sciences, Piscataway, New Jersey, United States of America
- Richeng Gao
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Li Shen
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Honghua Li
- Department of Molecular Genetics, Microbiology, and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
11
Abstract
Copy number variation (CNV) is a source of genetic diversity in humans. Numerous CNVs are being identified with various genome analysis platforms, including array comparative genomic hybridization (aCGH), single nucleotide polymorphism (SNP) genotyping platforms, and next-generation sequencing. CNV formation occurs by both recombination-based and replication-based mechanisms and de novo locus-specific mutation rates appear much higher for CNVs than for SNPs. By various molecular mechanisms, including gene dosage, gene disruption, gene fusion, position effects, etc., CNVs can cause Mendelian or sporadic traits, or be associated with complex diseases. However, CNV can also represent benign polymorphic variants. CNVs, especially gene duplication and exon shuffling, can be a predominant mechanism driving gene and genome evolution.
Affiliation(s)
- Feng Zhang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
12
Gu W, Zhang F, Lupski JR. Mechanisms for human genomic rearrangements. PATHOGENETICS 2008; 1:4. [PMID: 19014668 PMCID: PMC2583991 DOI: 10.1186/1755-8417-1-4] [Citation(s) in RCA: 432] [Impact Index Per Article: 27.0] [Received: 06/09/2008] [Accepted: 11/03/2008] [Indexed: 02/08/2023]
Abstract
Genomic rearrangements describe gross DNA changes of the size ranging from a couple of hundred base pairs, the size of an average exon, to megabases (Mb). When greater than 3 to 5 Mb, such changes are usually visible microscopically by chromosome studies. Human diseases that result from genomic rearrangements have been called genomic disorders. Three major mechanisms have been proposed for genomic rearrangements in the human genome. Non-allelic homologous recombination (NAHR) is mostly mediated by low-copy repeats (LCRs) with recombination hotspots, gene conversion and apparent minimal efficient processing segments. NAHR accounts for most of the recurrent rearrangements: those that share a common size, show clustering of breakpoints, and recur in multiple individuals. Non-recurrent rearrangements are of different sizes in each patient, but may share a smallest region of overlap whose change in copy number may result in shared clinical features among different patients. LCRs do not mediate, but may stimulate non-recurrent events. Some rare NAHRs can also be mediated by highly homologous repetitive sequences (for example, Alu, LINE); these NAHRs account for some of the non-recurrent rearrangements. Other non-recurrent rearrangements can be explained by non-homologous end-joining (NHEJ) and the Fork Stalling and Template Switching (FoSTeS) models. These mechanisms occur both in germ cells, where the rearrangements can be associated with genomic disorders, and in somatic cells in which such genomic rearrangements can cause disorders such as cancer. NAHR, NHEJ and FoSTeS probably account for the majority of genomic rearrangements in our genome and the frequency distribution of the three at a given locus may partially reflect the genomic architecture in proximity to that locus. We provide a review of the current understanding of these three models.
Affiliation(s)
- Wenli Gu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
13
Abstract
Our understanding of the details of mammalian meiotic recombination has recently advanced significantly. Sperm typing technologies, linkage studies, and computational inferences from population genetic data have together provided information in unprecedented detail about the location and activity of the sites of crossing-over in mice and humans. The results show that the vast majority of meiotic recombination events are localized to narrow DNA regions (hot spots) that constitute only a small fraction of the genome. The data also suggest that the molecular basis of hot spot activity is unlikely to be strictly determined by specific DNA sequence motifs in cis. Further molecular studies are needed to understand how hot spots originate, function and evolve.
Affiliation(s)
- Norman Arnheim
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089-2910, USA.
14
Carvajal-Rodríguez A. Simulation of genomes: a review. Curr Genomics 2008; 9:155-9. [PMID: 19440512 PMCID: PMC2679650 DOI: 10.2174/138920208784340759] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Received: 02/17/2008] [Revised: 03/18/2008] [Accepted: 03/26/2008] [Indexed: 11/22/2022]
Abstract
Population genetics plays an increasing role in human genetic research, linking empirical observations with hypotheses about sequence variation due to historical and evolutionary causes. In addition, data sets are increasing in size, with genome-wide data becoming commonplace in many empirical studies. As more information becomes available, it becomes clear that the simplest hypotheses are not consistent with the data. Simulations will provide the key tool for testing complex hypotheses against real data, by generating simulated data under the hypothetical historical and evolutionary conditions to be compared. Undoubtedly, tools that simulate large sequences while also allowing for natural selection, recombination and complex demographic patterns will be of great interest for better understanding the trace left on DNA by different interacting evolutionary forces. Simulation tools will also be essential for evaluating the sampling properties of any statistic used in genome-wide association studies and for comparing the performance of methods applied at genome-wide scales. Several simulation tools have been developed recently. Here, we review some of the existing simulators that allow efficient simulation of large sequences under complex evolutionary scenarios. In addition, we point out future directions in this field, which is already a key part of current research in evolutionary biology and seems likely to become a primary tool in future research in genomic and post-genomic biology.
15
Carvajal-Rodríguez A. GENOMEPOP: a program to simulate genomes in populations. BMC Bioinformatics 2008; 9:223. [PMID: 18447924 PMCID: PMC2386491 DOI: 10.1186/1471-2105-9-223] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Received: 02/05/2008] [Accepted: 04/30/2008] [Indexed: 11/17/2022]
Abstract
Background: There are several situations in population biology research where simulating DNA sequences is useful. Simulation of biological populations under different evolutionary genetic models can be undertaken using backward or forward strategies. Backward simulations, also called coalescent-based simulations, are computationally efficient because they are based on the history of lineages with surviving offspring in the current population. In contrast, forward simulations are less efficient because the entire population is simulated from past to present. However, the coalescent framework imposes some limitations that forward simulation does not. Hence, there is increasing interest in forward population genetic simulation, and efficient new tools have been developed recently. Software tools that allow efficient simulation of large DNA fragments under complex evolutionary models will be very helpful when trying to better understand the trace left on the DNA by the different interacting evolutionary forces. Here I introduce GenomePop, a forward simulation program that fulfills the above requirements. The use of the program is demonstrated by studying the impact of intracodon recombination on global and site-specific dN/dS estimation. Results: I have developed algorithms and written software to efficiently simulate, forward in time, different Markovian nucleotide or codon models of DNA mutation. Such models can be combined with recombination, at inter- and intra-codon levels, fitness-based selection and complex demographic scenarios. Conclusion: GenomePop has many interesting characteristics for simulating SNPs or DNA sequences under complex evolutionary and demographic models. These features make it unique with respect to other simulation tools: namely, forward simulation under General Time Reversible (GTR) mutation or GTR×MG94 codon models with intra-codon recombination; arbitrary, user-defined migration patterns; diploid or haploid models; constant or variable population sizes; etc. It also allows simulation of fitness-based selection under different distributions of mutational effects. Under the 2-allele model it allows the simulation of recombination hot-spots, the definition of different frequencies in different populations, etc. GenomePop can also manage large DNA fragments. In addition, it has a scaling option to save computation time when simulating large sequences and population sizes under complex demographic and evolutionary situations. These and many other features are detailed in its web page [1].
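To illustrate what "forward in time" means in this abstract, here is a minimal sketch of a neutral Wright-Fisher simulation of a single biallelic locus in a diploid population. It is an assumption-laden toy, not GenomePop's algorithm: the function name `wright_fisher`, its parameters, and the absence of mutation, recombination, selection and migration are all simplifications chosen for illustration.

```python
import random
from typing import List


def wright_fisher(pop_size: int, p0: float, generations: int, seed: int = 0) -> List[float]:
    """Simulate allele-frequency drift forward in time (toy model).

    pop_size    -- number of diploid individuals (2 * pop_size gene copies)
    p0          -- starting frequency of the tracked allele
    generations -- number of generations to simulate
    Returns the allele frequency at generation 0, 1, ..., generations.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    n_copies = 2 * pop_size
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Each gene copy in the next generation samples its parent allele
        # independently, i.e. a binomial draw with success probability p.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        freqs.append(p)
    return freqs
```

Every individual in every generation is processed, which is why forward simulation costs more than coalescent simulation but can accommodate the kinds of complex scenarios the abstract lists.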
|
16
|
Hu G, Yang Q, Cui X, Yue G, Azaro MA, Wang HY, Li H. A highly sensitive and specific system for large-scale gene expression profiling. BMC Genomics 2008; 9:9. [PMID: 18186939 PMCID: PMC2267712 DOI: 10.1186/1471-2164-9-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2007] [Accepted: 01/10/2008] [Indexed: 12/02/2022] Open
Abstract
Background Rapid progress in the field of gene expression-based molecular network integration has generated strong demand for greater sensitivity and data accuracy in experimental systems. To meet this need, a high-throughput gene profiling system of high specificity and sensitivity has been developed. Results By using specially designed primers, the new system amplifies sequences in neighboring exons separated by large introns so that mRNA sequences may be effectively discriminated from other highly related sequences, including their genes, unprocessed transcripts, pseudogenes and pseudogene transcripts. Probes used for microarray detection consist of sequences in the two neighboring exons amplified by the primers. In conjunction with a newly developed high-throughput multiplex amplification system and highly simplified experimental procedures, the system can be used to analyze >1,000 mRNA species in a single assay. It may also be used for gene expression profiling of very few (n = 100) or single cells. Highly reproducible results were obtained from duplicate samples with the same number of cells, and from those with a small number (100) and a large number (10,000) of cells. The specificity of the system was demonstrated by comparing results from a breast cancer cell line, MCF-7, and an ovarian cancer cell line, NCI/ADR-RES, and by using genomic DNA as starting material. Conclusion Our approach may greatly facilitate the analysis of combinatorial expression of known genes in many important applications, especially when the amount of RNA is limited.
Affiliation(s)
- Guohong Hu
- Department of Molecular Genetics, Microbiology and Immunology/The Cancer Institute of New Jersey, University of Medicine and Dentistry of New Jersey Robert Wood Johnson Medical School, Piscataway, New Jersey 08854, USA.
|
17
|
Kauppi L, Jasin M, Keeney S. Meiotic crossover hotspots contained in haplotype block boundaries of the mouse genome. Proc Natl Acad Sci U S A 2007; 104:13396-401. [PMID: 17690247 PMCID: PMC1948908 DOI: 10.1073/pnas.0701965104] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022] Open
Abstract
Fertility requires successful chromosome segregation in meiosis, which in most sexual organisms depends on the formation of appropriately placed crossovers. The nonrandom genome-wide distributions of meiotic recombination events have been examined at the molecular level experimentally in yeast and by inference from linkage disequilibrium patterns in humans. Thus far, no method has existed for pinpointing sites of crossing-over on a genome-wide scale in an experimentally tractable animal whose genome size and complexity model those of humans. Here, we present a genomic approach to identify mouse crossover hotspots, based on targeting haplotype block boundaries. This represents a previously undescribed method potentially applicable to large-scale mouse hotspot identification. Using this method, we have successfully predicted the location of two previously uncharacterized crossover hotspots in male mice. As increasing amounts of single-nucleotide polymorphism data emerge, this approach will be useful for investigating the recombination landscape of the mouse genome.
Affiliation(s)
- Liisa Kauppi
- Molecular Biology and Developmental Biology Programs, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021, USA.
|
18
|
Abstract
Fine-scale estimation of recombination rates remains a challenging problem. Experimental techniques can provide accurate estimates at fine scales but are technically challenging and cannot be applied on a genome-wide scale. An alternative source of information comes from patterns of genetic variation. Several statistical methods have been developed to estimate recombination rates from randomly sampled chromosomes. However, most such methods either make poor assumptions about recombination rate variation, or simply assume that there is no rate variation. Since the discovery of recombination hotspots, it is clear that recombination rates can vary over many orders of magnitude at the fine scale. We present a method for the estimation of recombination rates in the presence of recombination hotspots. We demonstrate that the method is able to detect and accurately quantify recombination rate heterogeneity, and is a substantial improvement over a commonly used method. We then use the method to reanalyze genetic variation data from the HLA and MS32 regions of the human genome and demonstrate that the method is able to provide accurate rate estimates and simultaneously detect hotspots.
Affiliation(s)
- Adam Auton
- Department of Statistics, University of Oxford, Oxford, UK.
|
19
|
Buard J, de Massy B. Playing hide and seek with mammalian meiotic crossover hotspots. Trends Genet 2007; 23:301-9. [PMID: 17434233 DOI: 10.1016/j.tig.2007.03.014] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2006] [Revised: 03/14/2007] [Accepted: 03/29/2007] [Indexed: 11/30/2022]
Abstract
Crossovers (COs) are essential for meiosis and contribute to genome diversity by promoting the reassociation of alleles, and thus improve the efficiency of selection. COs are not randomly distributed but are found at specific regions, or CO hotspots. Recent results have revealed the historical recombination rates and the distribution of hotspots across the human genome. Surprisingly, CO hotspots are highly dynamic, as shown by differences in activity between individuals, populations and closely related species. We propose a role for DNA methylation in preventing the formation of COs, a regulation that might explain, in part, the correlation between recombination rates and GC content in mammals.
Affiliation(s)
- Jérôme Buard
- Institute of Human Genetics, UPR1142-CNRS, 141 rue de la Cardonille, 34396 Montpellier cedex 5, France
|
20
|
Guryev V, Smits BMG, van de Belt J, Verheul M, Hubner N, Cuppen E. Haplotype block structure is conserved across mammals. PLoS Genet 2006; 2:e121. [PMID: 16895449 PMCID: PMC1523234 DOI: 10.1371/journal.pgen.0020121] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2006] [Accepted: 06/16/2006] [Indexed: 11/19/2022] Open
Abstract
Genetic variation in genomes is organized in haplotype blocks, and species-specific block structure is defined by differential contribution of population history effects in combination with mutation and recombination events. Haplotype maps characterize the common patterns of linkage disequilibrium in populations and have important applications in the design and interpretation of genetic experiments. Although evolutionary processes are known to drive the selection of individual polymorphisms, their effect on haplotype block structure dynamics has not been shown. Here, we present a high-resolution haplotype map for a 5-megabase genomic region in the rat and compare it with the orthologous human and mouse segments. Although the size and fine structure of haplotype blocks are species dependent, there is a significant interspecies overlap in structure and a tendency for blocks to encompass complete genes. Extending these findings to the complete human genome using haplotype map phase I data reveals that linkage disequilibrium values are significantly higher for equally spaced positions in genic regions, including promoters, as compared to intergenic regions, indicating that a selective mechanism exists to maintain combinations of alleles within potentially interacting coding and regulatory regions. Although this characteristic may complicate the identification of causal polymorphisms underlying phenotypic traits, conservation of haplotype structure may be employed for the identification and characterization of functionally important genomic regions.
Affiliation(s)
- Norbert Hubner
- Max-Delbruck-Center for Molecular Medicine (MDC), Berlin-Buch, Germany
- Edwin Cuppen
- Hubrecht Laboratory, Utrecht, The Netherlands
- * To whom correspondence should be addressed. E-mail:
|
21
|
Hu G, Wang HY, Greenawalt DM, Azaro MA, Luo M, Tereshchenko IV, Cui X, Yang Q, Gao R, Shen L, Li H. AccuTyping: new algorithms for automated analysis of data from high-throughput genotyping with oligonucleotide microarrays. Nucleic Acids Res 2006; 34:e116. [PMID: 16982644 PMCID: PMC1635267 DOI: 10.1093/nar/gkl601] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
Abstract
Microarray-based analysis of single nucleotide polymorphisms (SNPs) has many applications in large-scale genetic studies. To minimize the influence of experimental variation, microarray data usually need to be processed in different aspects including background subtraction, normalization and low-signal filtering before genotype determination. Although many algorithms are sophisticated for these purposes, biases are still present. In the present paper, new algorithms for SNP microarray data analysis and the software, AccuTyping, developed based on these algorithms are described. The algorithms take advantage of a large number of SNPs included in each assay, and the fact that the top and bottom 20% of SNPs can be safely treated as homozygous after sorting based on their ratios between the signal intensities. These SNPs are then used as controls for color channel normalization and background subtraction. Genotype calls are made based on the logarithms of signal intensity ratios using two cutoff values, which were determined after training the program with a dataset of ∼160 000 genotypes and validated by non-microarray methods. AccuTyping was used to determine >300 000 genotypes of DNA and sperm samples. The accuracy was shown to be >99%. AccuTyping can be downloaded from .
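The calling scheme summarized in this abstract (sort SNPs by intensity ratio, treat the top and bottom 20% as homozygous controls, then call genotypes from log ratios with two cutoffs) can be sketched roughly as follows. This is not AccuTyping's implementation: the function name, the genotype labels, the use of the control groups only to re-center the log ratios, and the cutoff value of 1.0 are illustrative assumptions, and background subtraction is omitted.

```python
import math

def call_genotypes(red, green, tail=0.20, cutoff=1.0):
    """Hypothetical two-cutoff genotype caller on log2(red/green) ratios.

    red, green: positive signal intensities, one pair per SNP.
    tail: fraction of SNPs at each extreme presumed homozygous (20% in the abstract).
    cutoff: assumed symmetric threshold on the centered log ratio.
    """
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    ordered = sorted(log_ratios)
    k = max(1, int(len(ordered) * tail))
    # Presumed homozygous controls: the k lowest and the k highest ratios.
    low_mean = sum(ordered[:k]) / k
    high_mean = sum(ordered[-k:]) / k
    # Assumed normalization: shift so the two control groups sit
    # symmetrically around zero, removing a constant channel bias.
    shift = (low_mean + high_mean) / 2.0
    calls = []
    for lr in log_ratios:
        centered = lr - shift
        if centered > cutoff:
            calls.append("AA")   # red-channel allele homozygote
        elif centered < -cutoff:
            calls.append("BB")   # green-channel allele homozygote
        else:
            calls.append("AB")   # heterozygote
    return calls
```

Because the shift is estimated from the assay's own extreme SNPs, a uniform gain difference between the two color channels does not change the calls in this sketch, which mirrors the abstract's idea of using presumed homozygotes as internal controls.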
Affiliation(s)
- Honghua Li
- To whom correspondence should be addressed. Tel: +1 732 235 7330; Fax: +1 732 235 5223;
|
22
|
Myers S, Spencer CCA, Auton A, Bottolo L, Freeman C, Donnelly P, McVean G. The distribution and causes of meiotic recombination in the human genome. Biochem Soc Trans 2006; 34:526-30. [PMID: 16856851 DOI: 10.1042/bst0340526] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Using the statistical analysis of genetic variation, we have developed a high-resolution genetic map of recombination hotspots and recombination rate variation across the human genome. This map, which has a resolution several orders of magnitude greater than previous studies, identifies over 25,000 recombination hotspots and gives new insights into the distribution and determination of recombination. Wavelet-based analysis demonstrates scale-specific influences of base composition, coding context and DNA repeats on recombination rates, though, in contrast with other species, no association with DNase I hypersensitivity. We have also identified specific DNA motifs that are strongly associated with recombination hotspots and whose activity is influenced by local context. Comparative analysis of recombination rates in humans and chimpanzees demonstrates very high rates of evolution of the fine-scale structure of the recombination landscape. In the light of these observations, we suggest possible resolutions of the hotspot paradox.
Affiliation(s)
- S Myers
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
|
23
|
Tiemann-Boege I, Calabrese P, Cochran DM, Sokol R, Arnheim N. High-resolution recombination patterns in a region of human chromosome 21 measured by sperm typing. PLoS Genet 2006; 2:e70. [PMID: 16680198 PMCID: PMC1456319 DOI: 10.1371/journal.pgen.0020070] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2006] [Accepted: 03/23/2006] [Indexed: 12/05/2022] Open
Abstract
For decades, classical crossover studies and linkage disequilibrium (LD) analysis of genomic regions suggested that human meiotic crossovers may not be randomly distributed along chromosomes but are focused instead in “hot spots.” Recent sperm typing studies provided data at very high resolution and accuracy that defined the physical limits of a number of hot spots. The data were also used to test whether patterns of LD can predict hot spot locations. These sperm typing studies focused on several small regions of the genome already known or suspected of containing a hot spot based on the presence of LD breakdown or previous experimental evidence of hot spot activity. Comparable data on target regions not specifically chosen using these two criteria is lacking but is needed to make an unbiased test of whether LD data alone can accurately predict active hot spots. We used sperm typing to estimate recombination in 17 almost contiguous ~5 kb intervals spanning 103 kb of human Chromosome 21. We found two intervals that contained new hot spots. The comparison of our data with recombination rates predicted by statistical analyses of LD showed that, overall, the two datasets corresponded well, except for one predicted hot spot that showed little crossing over. This study doubles the experimental data on recombination in men at the highest resolution and accuracy and supports the emerging genome-wide picture that recombination is localized in small regions separated by cold areas. Detailed study of one of the new hot spots revealed a sperm donor with a decrease in recombination intensity at the canonical recombination site but an increase in crossover activity nearby. This unique finding suggests that the position and intensity of hot spots may evolve by means of a concerted mechanism that maintains the overall recombination intensity in the region. 
Meiotic crossover events are not randomly distributed across the human genome, but are concentrated in many small regions of a few kb with high recombination rates compared to surrounding regions. How the distribution of recombination events affects the association of different alleles along the chromosome (linkage disequilibrium, or LD) was recently addressed using sperm typing in regions already known or suspected to contain unusually high recombination intensities. In the current paper, the authors used sperm typing to examine recombination in a region not known or suspected of containing recombination hot spots. They first established the crossover distribution pattern within a 103-kb region of human Chromosome 21. Then, they compared their data to predictions of crossover distributions estimated by statistical analyses of polymorphism in the region. They found a good concordance between the two, although it was not perfect. To the authors' knowledge, this work is the first to compare LD-based estimates of recombination to sperm-typing data from regions not previously known or suspected of containing recombination hot spots. In addition, one of the studied hot spots revealed an example of a decrease in recombination intensity with a concurrent increase at a nearby site. This unique observation suggests that the activity of hot spots may evolve in a concerted fashion such that the overall recombination activity of the region is maintained.
Affiliation(s)
- Irene Tiemann-Boege
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, United States of America
- Peter Calabrese
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, United States of America
- David M Cochran
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, United States of America
- Rebecca Sokol
- Obstetrics, Gynecology and Medicine, and Women's Hospital, Health Sciences Campus, University of Southern California, Los Angeles, California, United States of America
- Norman Arnheim
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, United States of America
- * To whom correspondence should be addressed. E-mail:
|