Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Liu EY, Zhang Q, McMillan L, de Villena FPM, Wang W. Efficient genome ancestry inference in complex pedigrees with inbreeding. Bioinformatics 2010;26:i199-207. [PMID: 20529906 PMCID: PMC2881372 DOI: 10.1093/bioinformatics/btq187] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022] Open

Cited by Other Article(s)

Campos-Martin R, Schmickler S, Goel M, Schneeberger K, Tresch A. Reliable genotyping of recombinant genomes using a robust hidden Markov model. Plant Physiology 2023;192:821-836. [PMID: 36946207 PMCID: PMC10231367 DOI: 10.1093/plphys/kiad191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Revised: 01/20/2023] [Accepted: 01/27/2023] [Indexed: 06/01/2023]

Manching H, Wisser RJ. SPEARS: Standard Performance Evaluation of Ancestral haplotype Reconstruction through Simulation. Bioinformatics 2021;37:868-870. [PMID: 32840564 PMCID: PMC8097754 DOI: 10.1093/bioinformatics/btaa749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2020] [Revised: 07/05/2020] [Accepted: 08/18/2020] [Indexed: 11/14/2022] Open

Malosetti M, Zwep LB, Forrest K, van Eeuwijk FA, Dieters M. Lessons from a GWAS study of a wheat pre-breeding program: pyramiding resistance alleles to Fusarium crown rot. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2021;134:897-908. [PMID: 33367942 PMCID: PMC7925461 DOI: 10.1007/s00122-020-03740-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2020] [Accepted: 11/24/2020] [Indexed: 05/18/2023]

Abstract

Much has been published on QTL detection for complex traits using bi-parental and multi-parental crosses (linkage analysis) or diversity panels (GWAS studies). While successful for detection, transferability of results to real applications has proven more difficult. Here, we combined a QTL detection approach using a pre-breeding population which utilized intensive phenotypic selection for the target trait across multiple plant generations, combined with rapid generation turnover (i.e. "speed breeding") to allow cycling of multiple plant generations each year. The reasoning is that QTL mapping information would complement the selection process by identifying the genome regions under selection within the relevant germplasm. Questions to answer were the location of the genomic regions determining response to selection and the origin of the favourable alleles within the pedigree. We used data from a pre-breeding program that aimed at pyramiding different resistance sources to Fusarium crown rot into elite (but susceptible) wheat backgrounds. The population resulted from a complex backcrossing scheme involving multiple resistance donors and multiple elite backgrounds, akin to a MAGIC population (985 genotypes in total, with founders, and two major offspring layers within the pedigree). A significant increase in the resistance level was observed (i.e. a positive response to selection) after the selection process, and 17 regions significantly associated with that response were identified using a GWAS approach. Those regions included known QTL as well as potentially novel regions contributing resistance to Fusarium crown rot. In addition, we were able to trace back the sources of the favourable alleles for each QTL.
We demonstrate that QTL detection using breeding populations under selection for the target trait can identify QTL controlling the target trait and that the frequency of the favourable alleles was increased as a response to selection, thereby validating the QTL detected. This is a valuable opportunistic approach that can provide QTL information that is more easily transferred to breeding applications.

Finke K, Kourakos M, Brown G, Dang HT, Tan SJS, Simons YB, Ramdas S, Schäffer AA, Kember RL, Bućan M, Mathieson S. Ancestral haplotype reconstruction in endogamous populations using identity-by-descent. PLoS Comput Biol 2021;17:e1008638. [PMID: 33635861 PMCID: PMC7946327 DOI: 10.1371/journal.pcbi.1008638] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/01/2020] [Revised: 03/10/2021] [Accepted: 12/15/2020] [Indexed: 12/24/2022] Open

Abstract

In this work we develop a novel algorithm for reconstructing the genomes of ancestral individuals, given genotype or sequence data from contemporary individuals and an extended pedigree of family relationships. A pedigree with complete genomes for every individual enables the study of allele frequency dynamics and haplotype diversity across generations, including deviations from neutrality such as transmission distortion. When studying heritable diseases, ancestral haplotypes can be used to augment genome-wide association studies and track disease inheritance patterns. The building blocks of our reconstruction algorithm are segments of Identity-By-Descent (IBD) shared between two or more genotyped individuals. The method alternates between identifying a source for each IBD segment and assembling IBD segments placed within each ancestral individual. Unlike previous approaches, our method is able to accommodate complex pedigree structures with hundreds of individuals genotyped at millions of SNPs.

We apply our method to an Old Order Amish pedigree from Lancaster, Pennsylvania, whose founders came to North America from Europe during the early 18th century. The pedigree includes 1338 individuals from the past 12 generations, 394 with genotype data. The motivation for reconstruction is to understand the genetic basis of diseases segregating in the family through tracking haplotype transmission over time. Using our algorithm thread, we are able to reconstruct an average of 224 ancestral individuals per chromosome. For these ancestral individuals, on average we reconstruct 79% of their haplotypes. We also identify a region on chromosome 16 that is difficult to reconstruct—we find that this region harbors a short Amish-specific copy number variation and the gene HYDIN. thread was developed for endogamous populations, but can be applied to any extensive pedigree with the recent generations genotyped. We anticipate that this type of practical ancestral reconstruction will become more common and necessary to understand rare and complex heritable diseases in extended families.

When analyzing complex heritable traits, genomic data from many generations of an extended family increases the amount of information available for statistical inference. However, typically only genomic data from the recent generations of a pedigree are available, as ancestral individuals are deceased. In this work we present an algorithm, called thread, for reconstructing the genomes of ancestral individuals, given a complex pedigree and genomic data from the recent generations. Previous approaches have not been able to accommodate large datasets (both in terms of sites and individuals), made simplifying assumptions about pedigree structure, or did not tie reconstructed sequences back to specific individuals. We apply thread to a complex Old Order Amish pedigree of 1338 individuals, 394 with genotype data.

Scott MF, Ladejobi O, Amer S, Bentley AR, Biernaskie J, Boden SA, Clark M, Dell'Acqua M, Dixon LE, Filippi CV, Fradgley N, Gardner KA, Mackay IJ, O'Sullivan D, Percival-Alwyn L, Roorkiwal M, Singh RK, Thudi M, Varshney RK, Venturini L, Whan A, Cockram J, Mott R. Multi-parent populations in crops: a toolbox integrating genomics and genetic mapping with breeding. Heredity (Edinb) 2020;125:396-416. [PMID: 32616877 PMCID: PMC7784848 DOI: 10.1038/s41437-020-0336-6] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 01/27/2020] [Revised: 06/16/2020] [Accepted: 06/16/2020] [Indexed: 11/21/2022] Open

Affiliation(s)

Michael F Scott UCL Genetics Institute, Gower Street, London, WC1E 6BT, UK.
Olufunmilayo Ladejobi UCL Genetics Institute, Gower Street, London, WC1E 6BT, UK.
Samer Amer University of Reading, Reading, RG6 6AH, UK Faculty of Agriculture, Alexandria University, Alexandria, 23714, Egypt
Alison R Bentley The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Jay Biernaskie Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
Scott A Boden School of Agriculture, Food and Wine, University of Adelaide, Glen Osmond, SA, 5064, Australia
Matt Clark Natural History Museum, London, UK
Matteo Dell'Acqua Institute of Life Sciences, Scuola Superiore Sant'Anna, Pisa, Italy
Laura E Dixon Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
Carla V Filippi Instituto de Agrobiotecnología y Biología Molecular (IABIMO), INTA-CONICET, Nicolas Repetto y Los Reseros s/n, 1686, Hurlingham, Buenos Aires, Argentina
Nick Fradgley The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Keith A Gardner The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Ian J Mackay SRUC, West Mains Road, Kings Buildings, Edinburgh, EH9 3JG, UK
Donal O'Sullivan University of Reading, Reading, RG6 6AH, UK
Lawrence Percival-Alwyn The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Manish Roorkiwal Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
Rakesh Kumar Singh International Center for Biosaline Agriculture, Academic City, Dubai, United Arab Emirates
Mahendar Thudi Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
Rajeev Kumar Varshney Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
Luca Venturini Natural History Museum, London, UK
Alex Whan CSIRO, GPO Box 1700, Canberra, ACT, 2601, Australia
James Cockram The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Richard Mott UCL Genetics Institute, Gower Street, London, WC1E 6BT, UK

Determinants of QTL Mapping Power in the Realized Collaborative Cross. G3-Genes Genomes Genetics 2019;9:1707-1727. [PMID: 30914424 PMCID: PMC6505132 DOI: 10.1534/g3.119.400194] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/22/2022]

Recursive Algorithms for Modeling Genomic Ancestral Origins in a Fixed Pedigree. G3-Genes Genomes Genetics 2018;8:3231-3245. [PMID: 30068523 PMCID: PMC6169389 DOI: 10.1534/g3.118.200340] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]

Male Infertility Is Responsible for Nearly Half of the Extinction Observed in the Mouse Collaborative Cross. Genetics 2017;206:557-572. [PMID: 28592496 DOI: 10.1534/genetics.116.199596] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 01/05/2017] [Accepted: 03/09/2017] [Indexed: 11/18/2022] Open

Abstract

The goal of the Collaborative Cross (CC) project was to generate and distribute over 1000 independent mouse recombinant inbred strains derived from eight inbred founders. With inbreeding nearly complete, we estimated the extinction rate among CC lines at a remarkable 95%, which is substantially higher than in the derivation of other mouse recombinant inbred populations. Here, we report genome-wide allele frequencies in 347 extinct CC lines. Contrary to expectations, autosomes had equal allelic contributions from the eight founders, but chromosome X had significantly lower allelic contributions from the two inbred founders with underrepresented subspecific origins (PWK/PhJ and CAST/EiJ). By comparing extinct CC lines to living CC strains, we conclude that a complex genetic architecture is driving extinction, and selection pressures are different on the autosomes and chromosome X. Male infertility played a large role in extinction as 47% of extinct lines had males that were infertile. Males from extinct lines had high variability in reproductive organ size, low sperm counts, low sperm motility, and a high rate of vacuolization of seminiferous tubules. We performed QTL mapping and identified nine genomic regions associated with male fertility and reproductive phenotypes. Many of the allelic effects in the QTL were driven by the two founders with underrepresented subspecific origins, including a QTL on chromosome X for infertility that was driven by the PWK/PhJ haplotype. We also performed the first example of cross validation using complementary CC resources to verify the effect of sperm curvilinear velocity from the PWK/PhJ haplotype on chromosome 2 in an independent population across multiple generations. While selection typically constrains the examination of reproductive traits toward the more fertile alleles, the CC extinct lines provided a unique opportunity to study the genetic architecture of fertility in a widely genetically variable population.
We hypothesize that incompatibilities between alleles with different subspecific origins are a key driver of infertility. These results help clarify the factors that drove strain extinction in the CC, reveal the genetic regions associated with poor fertility in the CC, and serve as a resource to further study mammalian infertility.

Oreper D, Cai Y, Tarantino LM, de Villena FPM, Valdar W. Inbred Strain Variant Database (ISVdb): A Repository for Probabilistically Informed Sequence Differences Among the Collaborative Cross Strains and Their Founders. G3 (Bethesda, Md.) 2017;7:1623-1630. [PMID: 28592645 PMCID: PMC5473744 DOI: 10.1534/g3.117.041491] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 01/05/2017] [Accepted: 03/20/2017] [Indexed: 02/07/2023]

X-Chromosome Control of Genome-Scale Recombination Rates in House Mice. Genetics 2017;205:1649-1656. [PMID: 28159751 PMCID: PMC5378119 DOI: 10.1534/genetics.116.197533] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/03/2016] [Accepted: 01/24/2017] [Indexed: 12/19/2022] Open

Plethysmography Phenotype QTL in Mice Before and After Allergen Sensitization and Challenge. G3-Genes Genomes Genetics 2016;6:2857-65. [PMID: 27449512 PMCID: PMC5015943 DOI: 10.1534/g3.116.032912] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/21/2022]

Probabilistic Multilocus Haplotype Reconstruction in Outcrossing Tetraploids. Genetics 2016;203:119-31. [PMID: 26920758 DOI: 10.1534/genetics.115.185579] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 12/02/2015] [Accepted: 02/22/2016] [Indexed: 01/29/2023] Open

Didion JP, Morgan AP, Yadgary L, Bell TA, McMullan RC, Ortiz de Solorzano L, Britton-Davidian J, Bult CJ, Campbell KJ, Castiglia R, Ching YH, Chunco AJ, Crowley JJ, Chesler EJ, Förster DW, French JE, Gabriel SI, Gatti DM, Garland T, Giagia-Athanasopoulou EB, Giménez MD, Grize SA, Gündüz İ, Holmes A, Hauffe HC, Herman JS, Holt JM, Hua K, Jolley WJ, Lindholm AK, López-Fuster MJ, Mitsainas G, da Luz Mathias M, McMillan L, Ramalhinho MDGM, Rehermann B, Rosshart SP, Searle JB, Shiao MS, Solano E, Svenson KL, Thomas-Laemont P, Threadgill DW, Ventura J, Weinstock GM, Pomp D, Churchill GA, Pardo-Manuel de Villena F. R2d2 Drives Selfish Sweeps in the House Mouse. Mol Biol Evol 2016;33:1381-95. [PMID: 26882987 PMCID: PMC4868115 DOI: 10.1093/molbev/msw036] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Indexed: 01/07/2023] Open

Affiliation(s)

John P Didion Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Andrew P Morgan Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Liran Yadgary Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Timothy A Bell Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Rachel C McMullan Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Lydia Ortiz de Solorzano Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Janice Britton-Davidian Institut des Sciences de l'Evolution, Université De Montpellier, CNRS, IRD, EPHE, Montpellier, France
Carol J Bult The Jackson Laboratory, Bar Harbor, ME
Karl J Campbell Island Conservation, Puerto Ayora, Galápagos Island, Ecuador School of Geography, Planning & Environmental Management, The University of Queensland, St Lucia, QLD, Australia
Riccardo Castiglia Department of Biology and Biotechnologies "Charles Darwin", University of Rome "La Sapienza", Rome, Italy
Yung-Hao Ching Department of Molecular Biology and Human Genetics, Tzu Chi University, Hualien City, Taiwan
Amanda J Chunco Department of Environmental Studies, Elon University
James J Crowley Department of Genetics, The University of North Carolina at Chapel Hill
Elissa J Chesler The Jackson Laboratory, Bar Harbor, ME
Daniel W Förster Department of Evolutionary Genetics, Leibniz-Institute for Zoo and Wildlife Research, Berlin, Germany
John E French National Toxicology Program, National Institute of Environmental Sciences, NIH, Research Triangle Park, NC
Sofia I Gabriel Department of Animal Biology & CESAM - Centre for Environmental and Marine Studies, Faculty of Sciences, University of Lisbon, Lisboa, Portugal
Daniel M Gatti The Jackson Laboratory, Bar Harbor, ME
Theodore Garland Department of Biology, University of California Riverside
Eva B Giagia-Athanasopoulou Section of Animal Biology, Department of Biology, University of Patras, Patras, Greece
Mabel D Giménez Instituto de Biología Subtropical, CONICET - Universidad Nacional de Misiones, Posadas, Misiones, Argentina
Sofia A Grize Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
İslam Gündüz Department of Biology, Faculty of Arts and Sciences, University of Ondokuz Mayis, Samsun, Turkey
Andrew Holmes Laboratory of Behavioral and Genomic Neuroscience, National Institute on Alcohol Abuse and Alcoholism, NIH, Bethesda, MD
Heidi C Hauffe Department of Biodiversity and Molecular Ecology, Research and Innovation Centre, Fondazione Edmund Mach, San Michele All'adige, TN, Italy
Jeremy S Herman Department of Natural Sciences, National Museums Scotland, Edinburgh, United Kingdom
James M Holt Department of Computer Science, The University of North Carolina at Chapel Hill
Kunjie Hua Department of Genetics, The University of North Carolina at Chapel Hill
Wesley J Jolley Island Conservation, Santa Cruz, CA
Anna K Lindholm Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
María J López-Fuster Faculty of Biology, Universitat de Barcelona, Barcelona, Spain
George Mitsainas Section of Animal Biology, Department of Biology, University of Patras, Patras, Greece
Maria da Luz Mathias Department of Animal Biology & CESAM - Centre for Environmental and Marine Studies, Faculty of Sciences, University of Lisbon, Lisboa, Portugal
Leonard McMillan Department of Computer Science, The University of North Carolina at Chapel Hill
Maria da Graça Morgado Ramalhinho Department of Animal Biology & CESAM - Centre for Environmental and Marine Studies, Faculty of Sciences, University of Lisbon, Lisboa, Portugal
Barbara Rehermann Immunology Section, Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, MD
Stephan P Rosshart Immunology Section, Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, MD
Jeremy B Searle Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY
Meng-Shin Shiao Research Center, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
Emanuela Solano Department of Biology and Biotechnologies "Charles Darwin", University of Rome "La Sapienza", Rome, Italy
Karen L Svenson The Jackson Laboratory, Bar Harbor, ME
Patricia Thomas-Laemont Department of Environmental Studies, Elon University
David W Threadgill Department of Veterinary Pathobiology, Texas A&M University, College Station Department of Molecular and Cellular Medicine, Texas A&M University, College Station
Jacint Ventura Departament de Biologia Animal, de Biologia Vegetal y de Ecologia, Facultat de Biociències, Universitat Autònoma de Barcelona, Barcelona, Spain
George M Weinstock Jackson Laboratory for Genomic Medicine, Farmington, CT
Daniel Pomp Department of Genetics, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill
Gary A Churchill The Jackson Laboratory, Bar Harbor, ME
Fernando Pardo-Manuel de Villena Department of Genetics, The University of North Carolina at Chapel Hill Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill Carolina Center for Genome Science, The University of North Carolina at Chapel Hill

Rutledge H, Baran-Gale J, de Villena FPM, Chesler EJ, Churchill GA, Sethupathy P, Kelada SNP. Identification of microRNAs associated with allergic airway disease using a genetically diverse mouse population. BMC Genomics 2015;16:633. [PMID: 26303911 PMCID: PMC4548451 DOI: 10.1186/s12864-015-1732-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 02/14/2015] [Accepted: 06/29/2015] [Indexed: 12/17/2022] Open

Abstract

Background

Allergic airway diseases (AADs) such as asthma are characterized in part by granulocytic airway inflammation. The gene regulatory networks that govern granulocyte recruitment are poorly understood, but evidence is accruing that microRNAs (miRNAs) play an important role. To identify miRNAs that may underlie AADs, we used two complementary approaches that leveraged the genotypic and phenotypic diversity of the Collaborative Cross (CC) mouse population. In the first approach, we sought to identify miRNA expression quantitative trait loci (eQTL) that overlap QTL for AAD-related phenotypes. Specifically, CC founder strains and incipient lines of the CC were sensitized and challenged with house dust mite allergen followed by measurement of granulocyte recruitment to the lung. Total lung RNA was isolated and miRNA was measured using arrays for CC founders and qRT-PCR for incipient CC lines.

Results

Among CC founders, 92 miRNAs were differentially expressed. We measured the expression of 40 of the most highly expressed of these 92 miRNAs in the incipient lines of the CC and identified 18 eQTL corresponding to 14 different miRNAs. Surprisingly, half of these eQTL were distal to the corresponding miRNAs, and even on different chromosomes. One of the largest-effect local miRNA eQTL was for miR-342-3p, for which we identified putative causal variants by bioinformatic analysis of the effects of single nucleotide polymorphisms on RNA structure. None of the miRNA eQTL co-localized with QTL for eosinophil or neutrophil recruitment. In the second approach, we constructed putative miRNA/mRNA regulatory networks and identified three miRNAs (miR-497, miR-351 and miR-31) as candidate master regulators of genes associated with neutrophil recruitment. Analysis of a dataset from human keratinocytes transfected with a miR-31 inhibitor revealed two target genes in common with miR-31 targets correlated with neutrophils, namely Oxsr1 and Nsf.

Conclusions

miRNA expression in the allergically inflamed murine lung is regulated by genetic loci that are smaller in effect size compared to mRNA eQTL and often act in trans. Thus our results indicate that the genetic architecture of miRNA expression is different from mRNA expression. We identified three miRNAs, miR-497, miR-351 and miR-31, that are candidate master regulators of genes associated with neutrophil recruitment. Because miR-31 is expressed in airway epithelia and is predicted to target genes with known links to neutrophilic inflammation, we suggest that miR-31 is a potentially novel regulator of airway inflammation.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1732-9) contains supplementary material, which is available to authorized users.

A general modeling framework for genome ancestral origins in multiparental populations. Genetics 2015;198:87-101. [PMID: 25236451 DOI: 10.1534/genetics.114.163006] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open

Bayesian modeling of haplotype effects in multiparent populations. Genetics 2015;198:139-56. [PMID: 25236455 PMCID: PMC4174926 DOI: 10.1534/genetics.114.166249] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 11/27/2022] Open

Morgan AP, Welsh CE. Informatics resources for the Collaborative Cross and related mouse populations. Mamm Genome 2015;26:521-39. [PMID: 26135136 DOI: 10.1007/s00335-015-9581-z] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 03/16/2015] [Accepted: 06/23/2015] [Indexed: 02/05/2023]

Reconstruction of Genome Ancestry Blocks in Multiparental Populations. Genetics 2015;200:1073-87. [PMID: 26048018 DOI: 10.1534/genetics.115.177873] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 05/04/2015] [Accepted: 05/31/2015] [Indexed: 11/18/2022] Open

Modeling X-linked ancestral origins in multiparental populations. G3-Genes Genomes Genetics 2015;5:777-801. [PMID: 25740936 PMCID: PMC4426366 DOI: 10.1534/g3.114.016154] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]

Didion JP, Morgan AP, Clayshulte AMF, Mcmullan RC, Yadgary L, Petkov PM, Bell TA, Gatti DM, Crowley JJ, Hua K, Aylor DL, Bai L, Calaway M, Chesler EJ, French JE, Geiger TR, Gooch TJ, Garland T, Harrill AH, Hunter K, McMillan L, Holt M, Miller DR, O'Brien DA, Paigen K, Pan W, Rowe LB, Shaw GD, Simecek P, Sullivan PF, Svenson KL, Weinstock GM, Threadgill DW, Pomp D, Churchill GA, Pardo-Manuel de Villena F. A multi-megabase copy number gain causes maternal transmission ratio distortion on mouse chromosome 2. PLoS Genet 2015;11:e1004850. [PMID: 25679959 PMCID: PMC4334553 DOI: 10.1371/journal.pgen.1004850] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 05/02/2014] [Accepted: 10/24/2014] [Indexed: 12/29/2022] Open

Abstract

Significant departures from expected Mendelian inheritance ratios (transmission ratio distortion, TRD) are frequently observed in both experimental crosses and natural populations. TRD on mouse Chromosome (Chr) 2 has been reported in multiple experimental crosses, including the Collaborative Cross (CC). Among the eight CC founder inbred strains, we found that Chr 2 TRD was exclusive to females that were heterozygous for the WSB/EiJ allele within a 9.3 Mb region (Chr 2 76.9 - 86.2 Mb). A copy number gain of a 127 kb-long DNA segment (designated as responder to drive, R2d) emerged as the strongest candidate for the causative allele. We mapped R2d sequences to two loci within the candidate interval. R2d1 is located near the proximal boundary, and contains a single copy of R2d in all strains tested. R2d2 maps to a 900 kb interval, and the number of R2d copies varies from zero in classical strains (including the mouse reference genome) to more than 30 in wild-derived strains. Using real-time PCR assays for the copy number, we identified a mutation (R2d2WSBdel1) that eliminates the majority of the R2d2WSB copies without apparent alterations of the surrounding WSB/EiJ haplotype. In a three-generation pedigree segregating for R2d2WSBdel1, the mutation is transmitted to the progeny and Mendelian segregation is restored in females heterozygous for R2d2WSBdel1, thus providing direct evidence that the copy number gain is causal for maternal TRD. We found that transmission ratios in R2d2WSB heterozygous females vary between Mendelian segregation and complete distortion depending on the genetic background, and that TRD is under genetic control of unlinked distorter loci. Although the R2d2WSB transmission ratio was inversely correlated with average litter size, several independent lines of evidence support the contention that female meiotic drive is the cause of the distortion. We discuss the implications and potential applications of this novel meiotic drive system.

Affiliation(s)

John P. Didion: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Andrew P. Morgan: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Amelia M.-F. Clayshulte: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Rachel C. Mcmullan: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Liran Yadgary: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Petko M. Petkov: The Jackson Laboratory, Bar Harbor, Maine, United States of America
Timothy A. Bell: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Daniel M. Gatti: The Jackson Laboratory, Bar Harbor, Maine, United States of America
James J. Crowley: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Kunjie Hua: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
David L. Aylor: Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
Ling Bai: Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
Mark Calaway: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Elissa J. Chesler: The Jackson Laboratory, Bar Harbor, Maine, United States of America
John E. French: National Toxicology Program, National Institute of Environmental Sciences, NIH, Research Triangle Park, North Carolina, United States of America
Thomas R. Geiger: Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
Terry J. Gooch: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Theodore Garland: Department of Biology, University of California Riverside, Riverside, California, United States of America
Alison H. Harrill: Department of Environmental and Occupational Health, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
Kent Hunter: Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
Leonard McMillan: Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Matt Holt: Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Darla R. Miller: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Deborah A. O'Brien: Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Kenneth Paigen: The Jackson Laboratory, Bar Harbor, Maine, United States of America
Wenqi Pan: Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Lucy B. Rowe: The Jackson Laboratory, Bar Harbor, Maine, United States of America
Ginger D. Shaw: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Petr Simecek: The Jackson Laboratory, Bar Harbor, Maine, United States of America
Patrick F. Sullivan: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Karen L. Svenson: The Jackson Laboratory, Bar Harbor, Maine, United States of America
George M. Weinstock: Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
David W. Threadgill: Department of Veterinary Pathobiology and Department of Molecular and Cellular Medicine, Texas A&M University, College Station, Texas, United States of America
Daniel Pomp: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
Gary A. Churchill: The Jackson Laboratory, Bar Harbor, Maine, United States of America
Fernando Pardo-Manuel de Villena: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America

Genetic regulation of Zfp30, CXCL1, and neutrophilic inflammation in murine lung. Genetics 2014;198:735-45. [PMID: 25114278 DOI: 10.1534/genetics.114.168138] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open

Liu EY, Morgan AP, Chesler EJ, Wang W, Churchill GA, Pardo-Manuel de Villena F. High-resolution sex-specific linkage maps of the mouse reveal polarized distribution of crossovers in male germline. Genetics 2014;197:91-106. [PMID: 24578350 PMCID: PMC4012503 DOI: 10.1534/genetics.114.161653] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2014] [Accepted: 02/20/2014] [Indexed: 12/31/2022] Open

Hitzemann R, Bottomly D, Iancu O, Buck K, Wilmot B, Mooney M, Searles R, Zheng C, Belknap J, Crabbe J, McWeeney S. The genetics of gene expression in complex mouse crosses as a tool to study the molecular underpinnings of behavior traits. Mamm Genome 2013;25:12-22. [PMID: 24374554 PMCID: PMC3916704 DOI: 10.1007/s00335-013-9495-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2013] [Accepted: 11/25/2013] [Indexed: 02/06/2023]

Illingworth CJR, Parts L, Bergström A, Liti G, Mustonen V. Inferring genome-wide recombination landscapes from advanced intercross lines: application to yeast crosses. PLoS One 2013;8:e62266. [PMID: 23658715 PMCID: PMC3642125 DOI: 10.1371/journal.pone.0062266] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2013] [Accepted: 03/19/2013] [Indexed: 01/23/2023] Open

The genome architecture of the Collaborative Cross mouse genetic reference population. Genetics 2012;190:389-401. [PMID: 22345608 PMCID: PMC3276630 DOI: 10.1534/genetics.111.132639] [Citation(s) in RCA: 384] [Impact Index Per Article: 32.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open

Abstract

The Collaborative Cross Consortium reports here on the development of a unique genetic resource population. The Collaborative Cross (CC) is a multiparental recombinant inbred panel derived from eight laboratory mouse inbred strains. Breeding of the CC lines was initiated at multiple international sites using mice from The Jackson Laboratory. Currently, this innovative project is breeding independent CC lines at the University of North Carolina (UNC), at Tel Aviv University (TAU), and at Geniad in Western Australia (GND). These institutions aim to make publicly available the completed CC lines and their genotypes and sequence information. We genotyped, and report here, results from 458 extant lines from UNC, TAU, and GND using a custom genotyping array with 7500 SNPs designed to be maximally informative in the CC and used a novel algorithm to infer inherited haplotypes directly from hybridization intensity patterns. We identified lines with breeding errors and cousin lines generated by splitting incipient lines into two or more cousin lines at early generations of inbreeding. We then characterized the genome architecture of 350 genetically independent CC lines. Results showed that founder haplotypes are inherited at the expected frequency, although we also consistently observed highly significant transmission ratio distortion at specific loci across all three populations. On chromosome 2, there is significant overrepresentation of WSB/EiJ alleles, and on chromosome X, there is a large deficit of CC lines with CAST/EiJ alleles. Linkage disequilibrium decays as expected and we saw no evidence of gametic disequilibrium in the CC population as a whole or in random subsets of the population. Gametic equilibrium in the CC population is in marked contrast to the gametic disequilibrium present in a large panel of classical inbred strains. Finally, we discuss access to the CC population and to the associated raw data describing the genetic structure of individual lines. 
Integration of rich phenotypic and genomic data over time and across a wide variety of fields will be vital to delivering on one of the key attributes of the CC, a common genetic reference platform for identifying causative variants and genetic networks determining traits in mammals.

Genotype probabilities at intermediate generations in the construction of recombinant inbred lines. Genetics 2012;190:403-12. [PMID: 22345609 PMCID: PMC3276635 DOI: 10.1534/genetics.111.132647] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open

Ten years of the Collaborative Cross. Genetics 2012;190:291-4. [PMID: 22345604 DOI: 10.1534/genetics.111.138032] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Expression quantitative trait loci for extreme host response to influenza A in pre-Collaborative Cross mice. G3-GENES GENOMES GENETICS 2012;2:213-21. [PMID: 22384400 PMCID: PMC3284329 DOI: 10.1534/g3.111.001800] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/08/2011] [Accepted: 12/08/2011] [Indexed: 01/05/2023]

Genetic analysis of hematological parameters in incipient lines of the Collaborative Cross. G3-GENES GENOMES GENETICS 2012;2:157-65. [PMID: 22384394 PMCID: PMC3284323 DOI: 10.1534/g3.111.001776] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/11/2011] [Accepted: 12/20/2011] [Indexed: 12/19/2022]

Accelerating the inbreeding of multi-parental recombinant inbred lines generated by sibling matings. G3-GENES GENOMES GENETICS 2012;2:191-8. [PMID: 22384397 PMCID: PMC3284326 DOI: 10.1534/g3.111.001784] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/11/2011] [Accepted: 11/06/2011] [Indexed: 11/26/2022]

Ten years of the Collaborative Cross. G3-GENES GENOMES GENETICS 2012;2:153-6. [PMID: 22384393 PMCID: PMC3284322 DOI: 10.1534/g3.111.001891] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]

Quantitative trait loci association mapping by imputation of strain origins in multifounder crosses. Genetics 2011;190:459-73. [PMID: 22143921 DOI: 10.1534/genetics.111.135095] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open

Aylor DL, Valdar W, Foulds-Mathes W, Buus RJ, Verdugo RA, Baric RS, Ferris MT, Frelinger JA, Heise M, Frieman MB, Gralinski LE, Bell TA, Didion JD, Hua K, Nehrenberg DL, Powell CL, Steigerwalt J, Xie Y, Kelada SNP, Collins FS, Yang IV, Schwartz DA, Branstetter LA, Chesler EJ, Miller DR, Spence J, Liu EY, McMillan L, Sarkar A, Wang J, Wang W, Zhang Q, Broman KW, Korstanje R, Durrant C, Mott R, Iraqi FA, Pomp D, Threadgill D, de Villena FPM, Churchill GA. Genetic analysis of complex traits in the emerging Collaborative Cross. Genome Res 2011;21:1213-22. [PMID: 21406540 DOI: 10.1101/gr.111310.110] [Citation(s) in RCA: 260] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Guzzetta G, Jurman G, Furlanello C. A machine learning pipeline for quantitative phenotype prediction from genotype data. BMC Bioinformatics 2010;11 Suppl 8:S3. [PMID: 21034428 PMCID: PMC2966290 DOI: 10.1186/1471-2105-11-s8-s3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open

Abstract

Background

Quantitative phenotypes emerge everywhere in systems biology and biomedicine, due either to a direct interest in quantitative traits or to high individual variability that makes it hard or impossible to classify samples into distinct categories, as is often the case with complex common diseases. Machine learning approaches to genotype-phenotype mapping may significantly improve Genome-Wide Association Studies (GWAS) results by explicitly focusing on predictivity and optimal feature selection in a multivariate setting. It is, however, essential that stringent and well-documented Data Analysis Protocols (DAP) are used to control sources of variability and ensure reproducibility of results. We present a genome-to-phenotype pipeline of machine learning modules for quantitative phenotype prediction. The pipeline can be applied for the direct use of whole-genome information in functional studies. As a realistic example, the problem of fitting complex phenotypic traits in heterogeneous stock mice from single nucleotide polymorphisms (SNPs) is considered here.

Methods

The core element in the pipeline is the L1L2 regularization method based on the naïve elastic net. The method gives at the same time a regression model and a dimensionality reduction procedure suitable for correlated features. Model and SNP markers are selected through a DAP originally developed in the MAQC-II collaborative initiative of the U.S. FDA for the identification of clinical biomarkers from microarray data. The L1L2 approach is compared with standard Support Vector Regression (SVR) and with Recursive Jump Monte Carlo Markov Chain (MCMC). Algebraic indicators of stability of partial lists are used for model selection; the final panel of markers is obtained by a procedure at the chromosome scale, termed 'saturation', to recover SNPs in Linkage Disequilibrium with those selected.

Results

With respect to both MCMC and SVR, comparable accuracies are obtained by the L1L2 pipeline. Good agreement is also found between SNPs selected by the L1L2 algorithms and candidate loci previously identified by a standard GWAS. The combination of L1L2-based feature selection with a saturation procedure tackles the issue of neglecting highly correlated features that affects many feature selection algorithms.

Conclusions

The L1L2 pipeline has proven effective in terms of marker selection and prediction accuracy. This study indicates that machine learning techniques may support quantitative phenotype prediction, provided that adequate DAPs are employed to control bias in model selection.
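The Methods paragraph above names L1L2 regularization based on the naïve elastic net as the core of the pipeline. The sketch below illustrates the general idea only: it minimizes (1/2)·||y − Xβ||² + λ1·||β||₁ + (λ2/2)·||β||² by cyclic coordinate descent with soft-thresholding. The optimizer, the function names, and the toy data are assumptions made for illustration; the abstract does not say how the authors solve the problem, and this is not their implementation.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def naive_elastic_net(X, y, l1, l2, n_iter=200):
    """Coordinate descent for 0.5*||y - X b||^2 + l1*||b||_1 + 0.5*l2*||b||^2.

    X is a list of rows (samples x features), y a list of responses.
    Returns the coefficient list b. No rescaling of b is applied, which is
    what distinguishes the naive elastic net from the corrected version.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + l2
            if denom == 0.0:
                continue  # all-zero feature with no ridge term: leave at 0
            # Inner product of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, l1) / denom
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


if __name__ == "__main__":
    # Hypothetical toy data: 4 samples, 3 SNPs coded as minor-allele counts.
    X = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
    y = [1.0, 0.5, 1.5, 2.0]
    print(naive_elastic_net(X, y, l1=0.5, l2=1.0))
```

The L1 term sets some coefficients exactly to zero (marker selection), while the L2 term keeps correlated features from being dropped arbitrarily, which matches the abstract's description of a method "suitable for correlated features".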
