1
Jewett EM, McManus KF, Freyman WA, Auton A. Bonsai: An efficient method for inferring large human pedigrees from genotype data. Am J Hum Genet 2021; 108:2052-2070. [PMID: 34739834 PMCID: PMC8595950 DOI: 10.1016/j.ajhg.2021.09.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/06/2021] [Accepted: 09/24/2021] [Indexed: 11/29/2022]
Abstract
Pedigree inference from genotype data is a challenging problem, particularly when pedigrees are sparsely sampled and individuals may be distantly related to their closest genotyped relatives. We present a method that infers small pedigrees of close relatives and then assembles them into larger pedigrees. To assemble large pedigrees, we introduce several formulas and tools including a likelihood for the degree separating two small pedigrees, a generalization of the fast DRUID point estimate of the degree separating two pedigrees, a method for detecting individuals who share background identity-by-descent (IBD) that does not reflect recent common ancestry, and a method for identifying the ancestral branches through which distant relatives are connected. Our method also takes several approaches that help to improve the accuracy and efficiency of pedigree inference. In particular, we incorporate age information directly into the likelihood rather than using ages only for consistency checks and we employ a heuristic branch-and-bound-like approach to more efficiently explore the space of possible pedigrees. Together, these approaches make it possible to construct large pedigrees that are challenging or intractable for current inference methods.
2
Ko A, Nielsen R. Joint Estimation of Pedigrees and Effective Population Size Using Markov Chain Monte Carlo. Genetics 2019; 212:855-868. [PMID: 31123041 PMCID: PMC6614905 DOI: 10.1534/genetics.119.302280] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/06/2018] [Accepted: 05/16/2019] [Indexed: 12/31/2022]
Abstract
Pedigrees provide the genealogical relationships among individuals at a fine resolution and serve an important function in many areas of genetic studies. One such use of pedigree information is in the estimation of the short-term effective population size (Ne), which is of great relevance in fields such as conservation genetics. Despite the usefulness of pedigrees, however, they are often an unknown parameter and must be inferred from genetic data. In this study, we present a Bayesian method to jointly estimate pedigrees and Ne from genetic markers using Markov Chain Monte Carlo. Our method supports analysis of a large number of markers and individuals within a single generation with the use of a composite likelihood, which significantly increases computational efficiency. We show, on simulated data, that our method is able to jointly estimate relationships up to first cousins and Ne with high accuracy. We also apply the method on a real dataset of house sparrows to reconstruct their previously unreported pedigree.
Affiliation(s)
- Amy Ko
  - Department of Integrative Biology, University of California, Berkeley, California 94720
- Rasmus Nielsen
  - Department of Integrative Biology, University of California, Berkeley, California 94720
  - Department of Statistics, University of California, Berkeley, California 94720
  - Museum of Natural History, University of Copenhagen, 1123 Denmark
3
Yuan Y, Shen X, Pan W, Wang Z. Constrained likelihood for reconstructing a directed acyclic Gaussian graph. Biometrika 2018; 106:109-125. [PMID: 30799877 DOI: 10.1093/biomet/asy057] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/25/2015] [Indexed: 11/13/2022]
Abstract
Directed acyclic graphs are widely used to describe directional pairwise relations. Such relations are estimated by reconstructing a directed acyclic graph's structure, which is challenging when the ordering of nodes of the graph is unknown. In such a situation, existing methods such as the neighbourhood and search-and-score methods have high estimation errors or computational complexities, especially when a local or sequential approach is used to enumerate edge directions by testing or optimizing a criterion locally, as a local method may break down even for moderately sized graphs. We propose a novel approach to simultaneously identifying all estimable directed edges and model parameters, using constrained maximum likelihood with nonconvex constraints. We develop a constraint reduction method that constructs a set of active constraints from super-exponentially many constraints. This, coupled with an alternating direction method of multipliers and a difference convex method, permits efficient computation for large-graph learning. We show that the proposed method consistently reconstructs identifiable directions of the true graph and achieves the optimal performance in terms of parameter estimation. Numerically, the method compares favourably with competitors. A protein network is analysed to demonstrate that the proposed method can make a difference in identifying the network's structure.
Affiliation(s)
- Yiping Yuan
  - School of Statistics, University of Minnesota, 313 Ford Hall, 224 Church St S.E., Minneapolis, Minnesota, U.S.A.
- Xiaotong Shen
  - School of Statistics, University of Minnesota, 313 Ford Hall, 224 Church St S.E., Minneapolis, Minnesota, U.S.A.
- Wei Pan
  - Division of Biostatistics, University of Minnesota, 420 Delaware St S.E., Minneapolis, Minnesota, U.S.A.
- Zizhuo Wang
  - Department of Industrial and Systems Engineering, University of Minnesota, 111 Church St S.E., Minneapolis, Minnesota, U.S.A.
4
Mo SK, Ren ZL, Yang YR, Liu YC, Zhang JJ, Wu HJ, Li Z, Bo XC, Wang SQ, Yan JW, Ni M. A 472-SNP panel for pairwise kinship testing of second-degree relatives. Forensic Sci Int Genet 2018; 34:178-185. [PMID: 29510334 DOI: 10.1016/j.fsigen.2018.02.019] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 10/16/2017] [Revised: 02/22/2018] [Accepted: 02/25/2018] [Indexed: 10/17/2022]
Abstract
Kinship testing based on genetic markers, such as forensic short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs), has valuable practical applications. Paternity and first-degree relationships can be accurately identified by current commonly used forensic STRs and reported SNP markers. However, second-degree and more distant relationships remain challenging. Although ∼10⁵-10⁶ SNPs can be used to estimate relatedness of higher degrees, genome-wide genotyping and analysis may be impractical for forensic use. With the rapid growth of human genome data sets, it is worthwhile to explore additional markers, especially SNPs, for kinship analysis. Here, we report an autosomal SNP panel consisting of 342 SNPs selected from >84 million SNPs and 131 SNPs from previous systems. We genotyped these SNPs in 136 Chinese individuals by multiplex amplicon massively parallel sequencing and performed pairwise gender-independent kinship testing. The specificity and sensitivity of these SNPs in distinguishing second-degree relatives from unrelated individuals were 99.9% and 100%, respectively, compared with 53.7% and 99.9% for 19 commonly used forensic STRs. Moreover, the specificity increased to 100% with the combined use of these STRs and SNPs. The 472-SNP panel could also greatly facilitate discrimination among different relationships. We estimated that the power of ∼6.45 SNPs was equivalent to that of one forensic STR in the second-degree relative pedigree scenario. Altogether, we propose a panel of 472 SNP markers for kinship analysis, which could be an important supplement to current forensic STRs for second-degree relative testing.
Affiliation(s)
- Shao-Kang Mo
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China; Department of Reproductive Center, General Hospital of Lanzhou Military Region, Lanzhou 730050, China.
- Zi-Lin Ren
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China.
- Ya-Ran Yang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
- Ya-Cheng Liu
  - Department of Genetics, Beijing Tongda Shoucheng Institute of Forensic Science, Beijing 100192, China.
- Jing-Jing Zhang
  - Department of Biotechnology, Beijing Center for Physical and Chemical Analysis, Beijing 100089, China.
- Hui-Juan Wu
  - Department of Biotechnology, Beijing Center for Physical and Chemical Analysis, Beijing 100089, China.
- Zhen Li
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China.
- Xiao-Chen Bo
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China.
- Sheng-Qi Wang
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China.
- Jiang-Wei Yan
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
- Ming Ni
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China.
5
6
Ko A, Nielsen R. Composite likelihood method for inferring local pedigrees. PLoS Genet 2017; 13:e1006963. [PMID: 28827797 PMCID: PMC5578687 DOI: 10.1371/journal.pgen.1006963] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/10/2017] [Revised: 08/31/2017] [Accepted: 08/07/2017] [Indexed: 12/21/2022]
Abstract
Pedigrees contain information about the genealogical relationships among individuals and are of fundamental importance in many areas of genetic studies. However, pedigrees are often unknown and must be inferred from genetic data. Despite the importance of pedigree inference, existing methods are limited to inferring only close relationships or analyzing a small number of individuals or loci. We present a simulated annealing method for estimating pedigrees in large samples of otherwise seemingly unrelated individuals using genome-wide SNP data. The method supports complex pedigree structures such as polygamous families, multi-generational families, and pedigrees in which many of the member individuals are missing. Computational speed is greatly enhanced by the use of a composite likelihood function which approximates the full likelihood. We validate our method on simulated data and show that it can infer distant relatives more accurately than existing methods. Furthermore, we illustrate the utility of the method on a sample of Greenlandic Inuit.
Pedigrees contain information about the genealogical relationships among individuals. This information can be used in many areas of genetic studies such as disease association studies, conservation efforts, and for inferences about the demographic history and social structure of a population. Despite their importance, pedigrees are often unknown and must be estimated from genetic information. However, pedigree inference remains a difficult problem due to the high cost of likelihood computation and the enormous number of possible pedigrees that must be considered. These difficulties limit existing methods in their ability to infer pedigrees when the sample size or the number of markers is large, or when the sample contains only distant relatives. In this report, we present a method that circumvents these computational challenges in order to infer pedigrees of complex structure for a large number of individuals. Using simulations, we find that the method can infer distant relatives much more accurately than existing methods. Furthermore, we show that even pairwise inferences of relatedness can be improved substantially by consideration of the pedigree structure with other related individuals in the sample.
Affiliation(s)
- Amy Ko
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Rasmus Nielsen
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
  - Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
  - Museum of Natural History, University of Copenhagen, Copenhagen, Denmark
7
Städele V, Vigilant L. Strategies for determining kinship in wild populations using genetic data. Ecol Evol 2016; 6:6107-20. [PMID: 27648229 PMCID: PMC5016635 DOI: 10.1002/ece3.2346] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 03/09/2016] [Revised: 07/01/2016] [Accepted: 07/04/2016] [Indexed: 01/17/2023]
Abstract
Knowledge of kin relationships between members of wild animal populations has broad application in ecology and evolution research by allowing the investigation of dispersal dynamics, mating systems, inbreeding avoidance, kin recognition, and kin selection as well as aiding the management of endangered populations. However, the assessment of kinship among members of wild animal populations is difficult in the absence of detailed multigenerational pedigrees. Here, we first review the distinction between genetic relatedness and kinship derived from pedigrees and how this makes the identification of kin using genetic data inherently challenging. We then describe useful approaches to kinship classification, such as parentage analysis and sibship reconstruction, and explain how the combined use of marker systems with biparental and uniparental inheritance, demographic information, likelihood analyses, relatedness coefficients, and estimation of misclassification rates can yield reliable classifications of kinship in groups with complex kin structures. We outline alternative approaches for cases in which explicit knowledge of dyadic kinship is not necessary, but indirect inferences about kinship on a group- or population-wide scale suffice, such as whether more highly related dyads are in closer spatial proximity. Although analysis of highly variable microsatellite loci is still the dominant approach for studies on wild populations, we describe how the long-awaited use of large-scale single-nucleotide polymorphism and sequencing data derived from noninvasive low-quality samples may eventually lead to highly accurate assessments of varying degrees of kinship in wild populations.
Affiliation(s)
- Veronika Städele
  - Department of Primatology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D‐04103 Leipzig, Germany
- Linda Vigilant
  - Department of Primatology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D‐04103 Leipzig, Germany
8
Wallace SE, Gourna EG, Nikolova V, Sheehan NA. Family tree and ancestry inference: is there a need for a 'generational' consent? BMC Med Ethics 2015; 16:87. [PMID: 26645273 PMCID: PMC4673846 DOI: 10.1186/s12910-015-0080-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/01/2015] [Accepted: 11/30/2015] [Indexed: 11/24/2022]
Abstract
Background: Genealogical research and ancestry testing are popular recreational activities but little is known about the impact of the use of these services on clients’ biological and social families. Ancestry databases are being enriched with self-reported data and data from deoxyribonucleic acid (DNA) analyses, but also are being linked to other direct-to-consumer genetic testing and research databases. As both family history data and DNA can provide information on more than just the individual, we asked whether companies, as a part of the consent process, were informing clients, and through them clients’ relatives, of the potential implications of the use and linkage of their personal data.
Methods: We used content analysis to analyse publicly available consent and informational materials provided to potential clients of ancestry and direct-to-consumer genetic testing companies to determine what consent is required, what risks associated with participation were highlighted, and whether the consent or notification of third parties was suggested or required.
Results: We identified four categories of companies providing: 1) services based only on self-reported data, such as personal or family history; 2) services based only on DNA provided by the client; 3) services using both; and 4) services using both that also have a research component. The amount of information provided on the potential issues varied significantly across the categories of companies. ‘Traditional’ ancestry companies showed the greatest awareness of the implications for family members, while companies only asking for DNA focused solely on the client. While in some cases companies included text recommending clients inform their relatives, showing they recognised the issues, often it was located within lengthy terms and conditions or privacy statements that may not be read by potential clients.
Conclusions: We recommend that companies should make it clearer that clients should inform third parties about their plans to participate, that third parties’ data will be provided to companies, and that that data will be linked to other databases, thus raising issues of privacy and data use. We also suggest investigating whether a ‘generational consent’ should be created that would include more than just the individual in decisions about participating in genetic investigations.
Affiliation(s)
- Susan E Wallace
  - Department of Health Sciences, University of Leicester, Leicester, LE1 7RH, UK.
- Elli G Gourna
  - Department of Health Sciences, University of Leicester, Leicester, LE1 7RH, UK.
- Viktoriya Nikolova
  - Department of Health Sciences, University of Leicester, Leicester, LE1 7RH, UK.
- Nuala A Sheehan
  - Department of Health Sciences, University of Leicester, Leicester, LE1 7RH, UK; Department of Genetics, University of Leicester, Leicester, LE1 7RH, UK.
9
Sun M, Jobling MA, Taliun D, Pramstaller PP, Egeland T, Sheehan NA. On the use of dense SNP marker data for the identification of distant relative pairs. Theor Popul Biol 2015; 107:14-25. [PMID: 26474828 DOI: 10.1016/j.tpb.2015.10.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 05/25/2015] [Revised: 10/02/2015] [Accepted: 10/05/2015] [Indexed: 01/05/2023]
Abstract
There has been recent interest in the exploitation of readily available dense genome scan marker data for the identification of relatives. However, there are conflicting findings on how informative these data are in practical situations and, in particular, sets of thinned markers are often used with no concrete justification for the chosen spacing. We explore the potential usefulness of dense single nucleotide polymorphism (SNP) arrays for this application with a focus on inferring distant relative pairs. We distinguish between relationship estimation, as defined by a pedigree connecting the two individuals of interest, and estimation of general relatedness as would be provided by a kinship coefficient or a coefficient of relatedness. Since our primary interest is in the former case, we adopt a pedigree likelihood approach. We consider the effect of additional SNPs and data on an additional typed relative, together with choice of that relative, on relationship inference. We also consider the effect of linkage disequilibrium. When overall relatedness, rather than the specific relationship, would suffice, we propose an approximate approach that is easy to implement and appears to compete well with a popular moment-based estimator and a recent maximum likelihood approach based on chromosomal sharing. We conclude that denser marker data are more informative for distant relatives. However, linkage disequilibrium cannot be ignored and will be the main limiting factor for applications to real data.
Affiliation(s)
- M Sun
  - Department of Health Sciences, University of Leicester, UK
- M A Jobling
  - Department of Genetics, University of Leicester, UK
- D Taliun
  - Center for Biomedicine, European Academy of Bolzano (EURAC), Bolzano, Italy; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- P P Pramstaller
  - Center for Biomedicine, European Academy of Bolzano (EURAC), Bolzano, Italy
- T Egeland
  - IKBM Norwegian University of Life Sciences, Norway
- N A Sheehan
  - Department of Health Sciences, University of Leicester, UK; Department of Genetics, University of Leicester, UK.
10
Srivastava VK, Spinello D. A Two-Phase Combined Gradient-Tunneling Based Algorithm for Constrained Integer Programming Problems. Journal of Information & Optimization Sciences 2015. [DOI: 10.1080/02522667.2014.932092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
11
Staples J, Qiao D, Cho M, Silverman E, Nickerson D, Below JE. PRIMUS: rapid reconstruction of pedigrees from genome-wide estimates of identity by descent. Am J Hum Genet 2014; 95:553-64. [PMID: 25439724 DOI: 10.1016/j.ajhg.2014.10.005] [Citation(s) in RCA: 115] [Impact Index Per Article: 11.5] [Received: 07/14/2014] [Accepted: 10/02/2014] [Indexed: 11/29/2022]
Abstract
Understanding and correctly utilizing relatedness among samples is essential for genetic analysis; however, managing sample records and pedigrees can often be error prone and incomplete. Data sets ascertained by random sampling often harbor cryptic relatedness that can be leveraged in genetic analyses for maximizing power. We have developed a method that uses genome-wide estimates of pairwise identity by descent to identify families and quickly reconstruct and score all possible pedigrees that fit the genetic data by using up to third-degree relatives, and we have included it in the software package PRIMUS (Pedigree Reconstruction and Identification of the Maximally Unrelated Set). Here, we validate its performance on simulated, clinical, and HapMap pedigrees. Among these samples, we demonstrate that PRIMUS can verify reported pedigree structures and identify cryptic relationships. Finally, we show that PRIMUS reconstructed pedigrees, all of which were previously unknown, for 203 families from a cohort collected in Starr County, TX (1,890 samples).
Affiliation(s)
- Jennifer E Below
  - Epidemiology, Human Genetics, & Environmental Sciences, University of Texas Health Science Center, Houston, TX 77225, USA.
12
Improved maximum likelihood reconstruction of complex multi-generational pedigrees. Theor Popul Biol 2014; 97:11-9. [DOI: 10.1016/j.tpb.2014.07.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 01/26/2014] [Revised: 07/11/2014] [Accepted: 07/16/2014] [Indexed: 11/17/2022]
13
Shem-Tov D, Halperin E. Historical pedigree reconstruction from extant populations using PArtitioning of RElatives (PREPARE). PLoS Comput Biol 2014; 10:e1003610. [PMID: 24945698 PMCID: PMC4063675 DOI: 10.1371/journal.pcbi.1003610] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/30/2013] [Accepted: 03/13/2014] [Indexed: 11/18/2022]
Abstract
Recent technological improvements in the field of genetic data extraction give rise to the possibility of reconstructing the historical pedigrees of entire populations from the genotypes of individuals living today. Current methods are still not practical for real data scenarios, as they have limited accuracy and make unrealistic assumptions of monogamy and synchronized generations. In order to address these issues, we develop a new method for pedigree reconstruction, PREPARE, which is based on formulations of the pedigree reconstruction problem as variants of graph coloring. The new formulation allows us to consider features that were overlooked by previous methods, resulting in a reconstruction of up to 5 generations back in time, with an order of magnitude improvement in false-negative rates over the state of the art, while keeping a lower level of false-positive rates. We demonstrate the accuracy of PREPARE compared to previous approaches using simulation studies over a range of population sizes, including inbred and outbred populations, monogamous and polygamous mating patterns, as well as synchronous and asynchronous mating.
Learning the correct relationships between individuals from genetic data is a basic theoretical problem in the field of genetics, and has many practical consequences. A wide variety of statistical methods for genetic analysis assume the relationships between individuals are known, and can manifest relatedness information to improve inference. The current state-of-the-art methods for relationship inference consider pair-wise genetic similarity, and use it to infer the relationship between each pair of individuals. Reconstructing the pedigrees of an entire population directly has the potential to use more elaborate relationship information, and thus obtains a better prediction of the familial relationships in the population. In contrast to the full set of pair-wise relationships in a population, genetic pedigrees provide a lossless and conflict-free structure for depicting the relationships between individuals. In an effort to make pedigree reconstruction practical, we developed a new method, which is an order of magnitude more accurate than previous methods, and is the first method that has the ability to reconstruct polygamous pedigrees.
Affiliation(s)
- Doron Shem-Tov
  - The Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Eran Halperin
  - The Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
  - International Computer Science Institute, Berkeley, California, United States of America
  - Molecular Microbiology and Biotechnology Department, Tel-Aviv University, Tel-Aviv, Israel
14
Studený M, Haws D. Learning Bayesian network structure: Towards the essential graph by integer linear programming tools. Int J Approx Reason 2014. [DOI: 10.1016/j.ijar.2013.09.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/16/2022]
15
Cope RC, Lanyon JM, Seddon JM, Pollett PK. Development and testing of a genetic marker-based pedigree reconstruction system 'PR-genie' incorporating size-class data. Mol Ecol Resour 2014; 14:857-70. [PMID: 24373173 DOI: 10.1111/1755-0998.12219] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/23/2013] [Revised: 12/02/2013] [Accepted: 12/11/2013] [Indexed: 11/28/2022]
Abstract
For wildlife populations, it is often difficult to determine biological parameters that indicate breeding patterns and population mixing, but knowledge of these parameters is essential for effective management. A pedigree encodes the relationship between individuals and can provide insight into the dynamics of a population over its recent history. Here, we present a method for the reconstruction of pedigrees for wild populations of animals that live long enough to breed multiple times over their lifetime and that have complex or unknown generational structures. Reconstruction was based on microsatellite genotype data along with ancillary biological information: sex and observed body size class as an indicator of relative age of individuals within the population. Using body size-class data to infer relative age has not been considered previously in wildlife genealogy and provides a marked improvement in accuracy of pedigree reconstruction. Body size-class data are particularly useful for wild populations because it is much easier to collect noninvasively than absolute age data. This new pedigree reconstruction system, PR-genie, performs reconstruction using maximum likelihood with optimization driven by the cross-entropy method. We demonstrated pedigree reconstruction performance on simulated populations (comparing reconstructed pedigrees to known true pedigrees) over a wide range of population parameters and under assortative and intergenerational mating schema. Reconstruction accuracy increased with the presence of size-class data and as the amount and quality of genetic data increased. We provide recommendations as to the amount and quality of data necessary to provide insight into detailed familial relationships in a wildlife population using this pedigree reconstruction technique.
Affiliation(s)
- Robert C Cope
  - School of Biological Science, The University of Queensland, St Lucia, Qld, 4072, Australia