Reference Citation Analysis

For: Burger KE, Pfaffelhuber P, Baumdicker F. Neural networks for self-adjusting mutation rate estimation when the recombination rate is unknown. PLoS Comput Biol 2022;18:e1010407. [PMID: 35921376 PMCID: PMC9377634 DOI: 10.1371/journal.pcbi.1010407] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/07/2021] [Revised: 08/15/2022] [Accepted: 07/18/2022] [Indexed: 11/18/2022] Open


Cited by Other Article(s)

Amin MR, Hasan M, DeGiorgio M. Digital Image Processing to Detect Adaptive Evolution. Mol Biol Evol 2024;41:msae242. [PMID: 39565932 PMCID: PMC11631197 DOI: 10.1093/molbev/msae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 10/28/2024] [Accepted: 11/13/2024] [Indexed: 11/22/2024] Open

Riley R, Mathieson I, Mathieson S. Interpreting generative adversarial networks to infer natural selection from genetic data. Genetics 2024;226:iyae024. [PMID: 38386895 PMCID: PMC10990424 DOI: 10.1093/genetics/iyae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/24/2024] Open

Abstract

Understanding natural selection and other forms of non-neutrality is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection and other local evolutionary processes that requires relatively few selection simulations during training. We build upon a generative adversarial network trained to simulate realistic neutral data. This consists of a generator (fitted demographic model), and a discriminator (convolutional neural network) that predicts whether a genomic region is real or fake. As the generator can only generate data under neutral demographic processes, regions of real data that the discriminator recognizes as having a high probability of being "real" do not fit the neutral demographic model and are therefore candidates for targets of selection. To incentivize identification of a specific mode of selection, we fine-tune the discriminator with a small number of custom non-neutral simulations. We show that this approach has high power to detect various forms of selection in simulations, and that it finds regions under positive selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics.

Huang X, Rymbekova A, Dolgova O, Lao O, Kuhlwilm M. Harnessing deep learning for population genetic inference. Nat Rev Genet 2024;25:61-78. [PMID: 37666948 DOI: 10.1038/s41576-023-00636-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 07/11/2023] [Indexed: 09/06/2023]

Mo Z, Siepel A. Domain-adaptive neural networks improve supervised machine learning based on simulated population genetic data. PLoS Genet 2023;19:e1011032. [PMID: 37934781 PMCID: PMC10655966 DOI: 10.1371/journal.pgen.1011032] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/17/2023] [Revised: 11/17/2023] [Accepted: 10/23/2023] [Indexed: 11/09/2023] Open

Nait Saada J, Tsangalidou Z, Stricker M, Palamara PF. Inference of Coalescence Times and Variant Ages Using Convolutional Neural Networks. Mol Biol Evol 2023;40:msad211. [PMID: 37738175 PMCID: PMC10581698 DOI: 10.1093/molbev/msad211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 09/11/2023] [Accepted: 09/18/2023] [Indexed: 09/24/2023] Open

Riley R, Mathieson I, Mathieson S. Interpreting generative adversarial networks to infer natural selection from genetic data. bioRxiv: the preprint server for biology 2023:2023.03.07.531546. [PMID: 36945387 PMCID: PMC10028936 DOI: 10.1101/2023.03.07.531546] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 03/10/2023]

Abstract

Understanding natural selection in humans and other species is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Mismatches between simulated training data and real test data can lead to incorrect inference. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection that requires relatively few selection simulations during training. We use a Generative Adversarial Network (GAN) trained to simulate realistic neutral data. The resulting GAN consists of a generator (fitted demographic model) and a discriminator (convolutional neural network). For a genomic region, the discriminator predicts whether it is "real" or "fake" in the sense that it could have been simulated by the generator. As the "real" training data includes regions that experienced selection and the generator cannot produce such regions, regions with a high probability of being real are likely to have experienced selection. To further incentivize this behavior, we "fine-tune" the discriminator with a small number of selection simulations. We show that this approach has high power to detect selection in simulations, and that it finds regions under selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics.
In summary, our approach is a novel, efficient, and powerful way to use machine learning to detect natural selection.

Sanchez T, Bray EM, Jobic P, Guez J, Letournel AC, Charpiat G, Cury J, Jay F. dnadna: a deep learning framework for population genetics inference. Bioinformatics 2022;39:6851140. [PMID: 36445000 PMCID: PMC9825738 DOI: 10.1093/bioinformatics/btac765] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/17/2021] [Revised: 10/30/2022] [Accepted: 11/28/2022] [Indexed: 11/30/2022] Open