1
Farage G, Zhao C, Choi HY, Garrett TJ, Elam MB, Kechris K, Sen Ś. Matrix Linear Models for Connecting Metabolite Composition to Individual Characteristics. Metabolites 2025; 15:140. [PMID: 39997765 PMCID: PMC11857268 DOI: 10.3390/metabo15020140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Revised: 02/10/2025] [Accepted: 02/13/2025] [Indexed: 02/26/2025]
Abstract
Background/Objectives: High-throughput metabolomics data provide a detailed molecular window into biological processes. We consider the problem of assessing how the association of metabolite levels with individual (sample) characteristics, such as sex or treatment, depends on metabolite characteristics such as pathways. Typically, this is done using a two-step process. In the first step, we assess the association of each metabolite with individual characteristics. In the second step, an enrichment analysis is performed by metabolite characteristics. Methods: We combine the two steps using a bilinear model based on the matrix linear model (MLM) framework previously developed for high-throughput genetic screens. Our method can estimate relationships in metabolites sharing known characteristics, whether categorical (such as type of lipid or pathway) or numerical (such as number of double bonds in triglycerides). Results: We demonstrate the flexibility and interpretability of MLMs by applying them to three metabolomic studies. We show that our approach can separate the contributions of overlapping triglyceride characteristics, such as the number of double bonds and the number of carbon atoms. Conclusion: The matrix linear model offers a flexible, efficient, and interpretable framework for integrating external information and examining complex relationships in metabolomics data. Our method has been implemented in the open-source Julia package, MatrixLM. Data analysis scripts with example data analyses are also available.
Affiliation(s)
- Gregory Farage
  - Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Chenhao Zhao
  - Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Hyo Young Choi
  - Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Timothy J. Garrett
  - Department of Pathology, Immunology and Laboratory Medicine, University of Florida, Gainesville, FL 32610, USA
- Marshall B. Elam
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Katerina Kechris
  - Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Śaunak Sen
  - Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA
2
Singh G, García-Bernalt Diego J, Warang P, Park SC, Chang LA, Noureddine M, Laghlali G, Bykov Y, Prellberg M, Yan V, Singh S, Pache L, Cuadrado-Castano S, Webb B, García-Sastre A, Schotsaert M. Outcome of SARS-CoV-2 reinfection depends on genetic background in female mice. Nat Commun 2024; 15:10178. [PMID: 39580470 PMCID: PMC11585546 DOI: 10.1038/s41467-024-54334-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 11/06/2024] [Indexed: 11/25/2024]
Abstract
Antigenically distinct SARS-CoV-2 variants increase the reinfection risk for vaccinated and previously exposed populations due to antibody neutralization escape. COVID-19 severity depends on many variables, including host immune responses, which differ depending on genetic predisposition. To address this, we perform immune profiling of female mice with different genetic backgrounds (transgenic K18-hACE2 and wild-type 129S1) infected with the severe B.1.351 variant 30 days after exposure to the milder BA.1 or severe H1N1. Prior BA.1 infection protects against B.1.351-induced morbidity in K18-hACE2 but aggravates disease in 129S1. H1N1 protects against B.1.351-induced morbidity only in 129S1. Enhanced severity in B.1.351 re-infected 129S1 is characterized by an increase of IL-10, IL-1β, IL-18 and IFN-γ, while in K18-hACE2 the cytokine profile resembles that of naïve mice undergoing their first viral infection. Enhanced pathology during 129S1 reinfection cannot be attributed to weaker adaptive immune responses to BA.1. Infection with BA.1 causes long-term differential remodeling and transcriptional changes in the bronchoalveolar CD11c+ compartment. K18-hACE2 CD11c+ cells show a strong antiviral defense expression profile whereas 129S1 CD11c+ cells present a more pro-inflammatory response upon restimulation. In conclusion, BA.1 induces cross-reactive adaptive immune responses in K18-hACE2 and 129S1, but reinfection outcome correlates with differential CD11c+ cell responses in the alveolar space.
Affiliation(s)
- Gagandeep Singh
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
- Juan García-Bernalt Diego
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
- Prajakta Warang
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
- Seok-Chan Park
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
- Lauren A Chang
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Moataz Noureddine
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Gabriel Laghlali
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Department of Pharmaceutics, Ghent University, Ghent, Belgium
- Yonina Bykov
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Matthew Prellberg
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Vivian Yan
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Sarabjot Singh
  - RT-PCR COVID-19 Laboratory, Civil Hospital, Moga, Punjab, India
- Lars Pache
  - NCI Designated Cancer Center, Sanford-Burnham Prebys Medical Discovery Institute, 10901 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Sara Cuadrado-Castano
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Lipschultz Precision Immunology Institute (PrIISM), Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Brett Webb
  - Department of Veterinary Sciences, University of Wyoming, Laramie, WY, USA
- Adolfo García-Sastre
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Department of Medicine, Division of Infectious Diseases, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - The Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
- Michael Schotsaert
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai New York, New York, NY, USA
  - Lipschultz Precision Immunology Institute (PrIISM), Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
3
Farage G, Zhao C, Choi HY, Garrett TJ, Kechris K, Elam MB, Sen Ś. Matrix Linear Models for connecting metabolite composition to individual characteristics. bioRxiv 2023:2023.12.19.572450. [PMID: 38187579 PMCID: PMC10769268 DOI: 10.1101/2023.12.19.572450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
High-throughput metabolomics data provide a detailed molecular window into biological processes. We consider the problem of assessing how the association of metabolite levels with individual (sample) characteristics such as sex or treatment may depend on metabolite characteristics such as pathway. Typically this is done in a two-step process: in the first step we assess the association of each metabolite with individual characteristics; in the second step an enrichment analysis is performed by metabolite characteristics among significant associations. We combine the two steps using a bilinear model based on the matrix linear model (MLM) framework we have previously developed for high-throughput genetic screens. Our framework can estimate relationships in metabolites sharing known characteristics, whether categorical (such as type of lipid or pathway) or numerical (such as number of double bonds in triglycerides). We demonstrate how MLM offers flexibility and interpretability by applying our method to three metabolomic studies. We show that our approach can separate the contributions of overlapping triglyceride characteristics, such as the number of double bonds and the number of carbon atoms. The proposed method has been implemented in the open-source Julia package, MatrixLM. Data analysis scripts with example data analyses are also available.
Affiliation(s)
- Gregory Farage
  - Department of Preventive Medicine, Division of Biostatistics, University of Tennessee Health Science Center, Memphis, TN 38163
- Chenhao Zhao
  - Department of Preventive Medicine, Division of Biostatistics, University of Tennessee Health Science Center, Memphis, TN 38163
- Hyo Young Choi
  - Department of Preventive Medicine, Division of Biostatistics, University of Tennessee Health Science Center, Memphis, TN 38163
- Timothy J Garrett
  - Department of Pathology, Immunology and Laboratory Medicine, University of Florida, Gainesville, FL 32610
- Katerina Kechris
  - Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
- Marshall B Elam
  - Department of Pharmacology and of Medicine, University of Tennessee Health Science Center, Memphis, TN 38163
- Śaunak Sen
  - Department of Preventive Medicine, Division of Biostatistics, University of Tennessee Health Science Center, Memphis, TN 38163
4
Singh G, Warang P, García-Bernalt Diego J, Chang L, Bykov Y, Singh S, Pache L, Cuadrado-Castano S, Webb B, Garcia-Sastre A, Schotsaert M. Host immune responses associated with SARS-CoV-2 Omicron infection result in protection or pathology during reinfection depending on mouse genetic background. Research Square 2023:rs.3.rs-3637405. [PMID: 38077015 PMCID: PMC10705603 DOI: 10.21203/rs.3.rs-3637405/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Rapid emergence of antigenically distinct SARS-CoV-2 variants implies a greater risk of reinfection, as viruses can escape neutralizing antibodies induced by vaccination or previous viral exposure. Disease severity during COVID-19 depends on many variables such as age-related comorbidities, host immune status and genetic variation. The host immune response during infection with SARS-CoV-2 may contribute to disease severity, which can range from asymptomatic to severe with fatal outcome. Furthermore, the extent of host immune response activation may rely on underlying genetic predisposition for disease or protection. To address these questions, we performed immune profiling studies in mice with different genetic backgrounds (transgenic K18-hACE2 and wild-type 129S1 mice) subjected to reinfection with the severe disease-causing SARS-CoV-2 B.1.351 variant, 30 days after experimental milder BA.1 infection. BA.1 preinfection conferred protection against B.1.351-induced morbidity in K18-hACE2 mice but aggravated disease in 129S1 mice. We found that the cytokine/chemokine profile in B.1.351 re-infected 129S1 mice is similar to that during severe SARS-CoV-2 infection in humans and is characterized by a much higher level of IL-10, IL-1β, IL-18 and IFN-γ, whereas in B.1.351 re-infected K18-hACE2 mice, the cytokine profile echoes the signature of naïve mice undergoing viral infection for the first time. Interestingly, the enhanced pathology observed in 129S1 mice upon reinfection cannot be attributed to a less efficient induction of adaptive immune responses to the initial BA.1 infection, as both K18-hACE2 and 129S1 mice exhibited similar B and T cell responses at 30 DPI against BA.1, with similar anti-BA.1 or B.1.351 spike-specific ELISA binding titers, levels of germinal center B-cells, and SARS-CoV-2-Spike specific tissue-resident T-cells.
Long-term effects of BA.1 infection are associated with differential transcriptional changes in bronchoalveolar lavage-derived CD11c+ immune cells from K18-hACE2 and 129S1, with K18-hACE2 CD11c+ cells showing a strong antiviral defense gene expression profile whereas 129S1 CD11c+ cells showed a more pro-inflammatory response. In conclusion, initial infection with BA.1 induces cross-reactive adaptive immune responses in both K18-hACE2 and 129S1 mice; however, the different disease outcome of reinfection seems to be driven by differential responses of CD11c+ cells in the alveolar space.
Affiliation(s)
- Sarabjot Singh
  - RT-PCR COVID-19 Laboratory, Civil Hospital, Moga, Punjab, India
- Lars Pache
  - Sanford Burnham Prebys Medical Discovery Institute
- Brett Webb
  - Department of Veterinary Sciences, University of Wyoming
5
Liang JW, Sen Ś. Sparse matrix linear models for structured high-throughput data. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Jane W. Liang
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health
- Śaunak Sen
  - Department of Preventive Medicine, University of Tennessee Health Science Center
6
Roth C, Murray D, Scott A, Fu C, Averette AF, Sun S, Heitman J, Magwene PM. Pleiotropy and epistasis within and between signaling pathways defines the genetic architecture of fungal virulence. PLoS Genet 2021; 17:e1009313. [PMID: 33493169 PMCID: PMC7861560 DOI: 10.1371/journal.pgen.1009313] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/13/2020] [Revised: 02/04/2021] [Accepted: 12/17/2020] [Indexed: 01/11/2023]
Abstract
Cryptococcal disease is estimated to affect nearly a quarter of a million people annually. Environmental isolates of Cryptococcus deneoformans, which make up 15 to 30% of clinical infections in temperate climates such as Europe, vary in their pathogenicity, ranging from benign to hyper-virulent. Key traits that contribute to virulence, such as the production of the pigment melanin, an extracellular polysaccharide capsule, and the ability to grow at human body temperature have been identified, yet little is known about the genetic basis of variation in such traits. Here we investigate the genetic basis of melanization, capsule size, thermal tolerance, oxidative stress resistance, and antifungal drug sensitivity using quantitative trait locus (QTL) mapping in progeny derived from a cross between two divergent C. deneoformans strains. Using a "function-valued" QTL analysis framework that exploits both time-series information and growth differences across multiple environments, we identified QTL for each of these virulence traits and drug susceptibility. For three QTL we identified the underlying genes and nucleotide differences that govern variation in virulence traits. One of these genes, RIC8, which encodes a regulator of cAMP-PKA signaling, contributes to variation in four virulence traits: melanization, capsule size, thermal tolerance, and resistance to oxidative stress. Two major effect QTL for amphotericin B resistance map to the genes SSK1 and SSK2, which encode key components of the HOG pathway, a fungal-specific signal transduction network that orchestrates cellular responses to osmotic and other stresses. We also discovered complex epistatic interactions within and between genes in the HOG and cAMP-PKA pathways that regulate antifungal drug resistance and resistance to oxidative stress. 
Our findings advance the understanding of virulence traits among diverse lineages of Cryptococcus, and highlight the role of genetic variation in key stress-responsive signaling pathways as a major contributor to phenotypic variation.
Affiliation(s)
- Cullen Roth
  - Department of Biology, Duke University, Durham, North Carolina, United States of America
  - University Program in Genetics and Genomics, Duke University, Durham, North Carolina, United States of America
- Debra Murray
  - Department of Biology, Duke University, Durham, North Carolina, United States of America
- Alexandria Scott
  - Department of Biology, Duke University, Durham, North Carolina, United States of America
- Ci Fu
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina, United States of America
- Anna F. Averette
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina, United States of America
- Sheng Sun
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina, United States of America
- Joseph Heitman
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina, United States of America
- Paul M. Magwene
  - Department of Biology, Duke University, Durham, North Carolina, United States of America
7
Arjas A, Hauptmann A, Sillanpää MJ. Estimation of dynamic SNP-heritability with Bayesian Gaussian process models. Bioinformatics 2020; 36:3795-3802. [PMID: 32186692 PMCID: PMC7672693 DOI: 10.1093/bioinformatics/btaa199] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/16/2019] [Revised: 03/10/2020] [Accepted: 03/17/2020] [Indexed: 11/23/2022]
Abstract
Motivation: Improved DNA technology has made it practical to estimate single-nucleotide polymorphism (SNP)-heritability among distantly related individuals with unknown relationships. For growth- and development-related traits, it is meaningful to base SNP-heritability estimation on longitudinal data due to the time-dependency of the process. However, only a few statistical methods have been developed so far for estimating dynamic SNP-heritability and quantifying its full uncertainty. Results: We introduce a completely tuning-free Bayesian Gaussian process (GP)-based approach for estimating dynamic variance components and heritability as their function. For parameter estimation, we use a modern Markov Chain Monte Carlo method which allows full uncertainty quantification. Several datasets are analysed and our results clearly illustrate that the 95% credible intervals of the proposed joint estimation method (which ‘borrows strength’ from adjacent time points) are significantly narrower than those of a two-stage baseline method that first estimates the variance components at each time point independently and then performs smoothing. We compare the method with a random regression model using MTG2 and BLUPF90 software, and quantitative measures indicate superior performance of our method. Results are presented for simulated and real data with up to 1000 time points. Finally, we demonstrate scalability of the proposed method for simulated data with tens of thousands of individuals. Availability and implementation: The C++ implementation dynBGP and simulated data are available in GitHub: https://github.com/aarjas/dynBGP. The programmes can be run in R. Real datasets are available in QTL archive: https://phenome.jax.org/centers/QTLA. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Arttu Arjas
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland
- Andreas Hauptmann
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland
  - Department of Computer Science, University College London, London WC1E 6BT, UK
- Mikko J Sillanpää
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland
  - Infotech Oulu, University of Oulu, Oulu FI-90014, Finland
8
Vanhatalo J, Li Z, Sillanpää MJ. A Gaussian process model and Bayesian variable selection for mapping function-valued quantitative traits with incomplete phenotypic data. Bioinformatics 2020; 35:3684-3692. [PMID: 30850830 PMCID: PMC6761969 DOI: 10.1093/bioinformatics/btz164] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/15/2018] [Revised: 12/05/2018] [Accepted: 03/06/2019] [Indexed: 12/22/2022]
Abstract
Motivation: Recent advances in high-dimensional phenotyping bring time as an extra dimension into the phenotypes. This promotes quantitative trait locus (QTL) studies of function-valued traits such as those related to growth and development. Existing approaches for analyzing functional traits utilize either parametric methods or semi-parametric approaches based on splines and wavelets. However, very few software tools are currently available for practical implementation of functional QTL mapping and variable selection. Results: We propose a Bayesian Gaussian process (GP) approach for functional QTL mapping. We use GPs to model the continuously varying coefficients which describe how the effects of molecular markers on the quantitative trait change over time. We use an efficient gradient-based algorithm to estimate the tuning parameters of GPs. Notably, the GP approach is directly applicable to incomplete datasets, even those with a missing data rate above 50% (among phenotypes). We further develop a stepwise algorithm to search through the model space in terms of genetic variants, and use a minimal increase of Bayesian posterior probability as a stopping rule to focus on only a small set of putative QTL. We also discuss the connection between GP and penalized B-splines and wavelets. On two simulated and three real datasets, our GP approach demonstrates great flexibility for modeling different types of phenotypic trajectories with low computational cost. The proposed model selection approach finds the most likely QTL reliably in tested datasets. Availability and implementation: Software and simulated data are available as a MATLAB package ‘GPQTLmapping’, and they can be downloaded from GitHub (https://github.com/jpvanhat/GPQTLmapping). Real datasets used in case studies are publicly available at QTL Archive. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jarno Vanhatalo
  - Department of Mathematics and Statistics and Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland
- Zitong Li
  - CSIRO Agriculture & Food, GPO Box 1600, Canberra, ACT 2601, Australia
- Mikko J Sillanpää
  - Department of Mathematical Sciences, Biocenter Oulu and Infotech Oulu, University of Oulu, Oulu FI-90014, Finland
9
Liang JW, Nichols RJ, Sen Ś. Matrix Linear Models for High-Throughput Chemical Genetic Screens. Genetics 2019; 212:1063-1073. [PMID: 31243057 PMCID: PMC6707451 DOI: 10.1534/genetics.119.302299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/07/2019] [Accepted: 06/06/2019] [Indexed: 11/18/2022]
Abstract
We develop a flexible and computationally efficient approach for analyzing high-throughput chemical genetic screens. In such screens, a library of genetic mutants is phenotyped in a large number of stresses. Typically, interactions between genes and stresses are detected by grouping the mutants and stresses into categories, and performing modified t-tests for each combination. This approach does not have a natural extension if mutants or stresses have quantitative or overlapping annotations (e.g., if conditions have doses or a mutant falls into more than one category simultaneously). We develop a matrix linear model (MLM) framework that allows us to model relationships between mutants and conditions in a simple, yet flexible, multivariate framework. It encodes both categorical and continuous relationships to enhance detection of associations. We develop a fast estimation algorithm that takes advantage of the structure of MLMs. We evaluate our method's performance in simulations and in an Escherichia coli chemical genetic screen, comparing it with an existing univariate approach based on modified t-tests. We show that MLMs perform slightly better than the univariate approach when mutants and conditions are classified in nonoverlapping categories, and substantially better when conditions can be ordered in dosage categories. Therefore, it is an attractive alternative to current methods, and provides a computationally scalable framework for larger and more complex chemical genetic screens. A Julia language implementation of MLMs and the code used for this paper are available at https://github.com/janewliang/GeneticScreen.jl and https://bitbucket.org/jwliang/mlm_gs_supplement, respectively.
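As a rough illustration of the bilinear structure this abstract refers to, the sketch below assumes the model form Y = X B Zᵀ + E, where rows of Y are samples, columns are measured features, X holds row covariates and Z holds column covariates. It computes the ordinary least-squares estimate B̂ = (XᵀX)⁻¹ Xᵀ Y Z (ZᵀZ)⁻¹ with NumPy. The function name `mlm_fit` and the simulated data are hypothetical; this is neither the authors' Julia implementation nor the fast algorithm mentioned in the abstract.

```python
import numpy as np


def mlm_fit(Y, X, Z):
    """Least-squares estimate of B in the assumed model Y = X @ B @ Z.T + E.

    Y: (n, m) responses; X: (n, p) row covariates; Z: (m, q) column covariates.
    Returns B_hat with shape (p, q).
    """
    # Left step: A = (X'X)^{-1} X'Y, shape (p, m)
    A = np.linalg.solve(X.T @ X, X.T @ Y)
    # Right step: B = A Z (Z'Z)^{-1}, computed via the transposed linear system
    return np.linalg.solve(Z.T @ Z, Z.T @ A.T).T


# Simulated, noise-free data (illustrative only)
rng = np.random.default_rng(0)
n, m, p, q = 50, 30, 3, 4
X = rng.normal(size=(n, p))
Z = rng.normal(size=(m, q))
B_true = rng.normal(size=(p, q))
Y = X @ B_true @ Z.T

B_hat = mlm_fit(Y, X, Z)
```

Each entry of `B_hat` pairs one row covariate with one column covariate, so a categorical or numeric annotation on either side of the data matrix enters simply as a column of X or Z.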
Affiliation(s)
- Jane W Liang
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115
- Robert J Nichols
  - Department of Microbiology and Immunology, University of California, San Francisco, California 94143
- Śaunak Sen
  - Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, Tennessee 38163
10
Wang N, Chu T, Luo J, Wu R, Wang Z. Funmap2: an R package for QTL mapping using longitudinal phenotypes. PeerJ 2019; 7:e7008. [PMID: 31183256 PMCID: PMC6546077 DOI: 10.7717/peerj.7008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/26/2018] [Accepted: 04/23/2019] [Indexed: 01/08/2023]
Abstract
Quantitative trait locus (QTL) mapping has been used as a powerful tool for inferring the complexity of the genetic architecture that underlies phenotypic traits. This approach has shown its unique power to map the developmental genetic architecture of complex traits by implementing longitudinal data analysis. Here, we introduce the R package Funmap2 based on the functional mapping framework, which integrates prior biological knowledge into the statistical model. Specifically, the functional mapping framework is engineered to include longitudinal curves that describe the genetic effects and the covariance matrix of the trait of interest. Funmap2 chooses the type of longitudinal curve and covariance matrix automatically using information criteria. Funmap2 is available for download at https://github.com/wzhy2000/Funmap2.
Affiliation(s)
- Nating Wang
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
- Tinyi Chu
  - Graduate field of Computational Biology, Cornell University, Ithaca, NY, United States of America
- Jiangtao Luo
  - Department of Biostatistics, College of Public Health, University of Nebraska Medical Center, Omaha, NE, United States of America
- Rongling Wu
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
- Zhong Wang
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
  - Baker Institute for Animal Health, College of Veterinary Medicine, Cornell College, Ithaca, NY, United States of America
11
Ning C, Wang D, Zhou L, Wei J, Liu Y, Kang H, Zhang S, Zhou X, Xu S, Liu JF. Efficient multivariate analysis algorithms for longitudinal genome-wide association studies. Bioinformatics 2019; 35:4879-4885. [DOI: 10.1093/bioinformatics/btz304] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/08/2018] [Revised: 04/16/2019] [Accepted: 04/25/2019] [Indexed: 11/14/2022]
Abstract
Motivation
Current dynamic phenotyping systems introduce time as an extra dimension to genome-wide association studies (GWAS), which helps to explore the mechanism of dynamic genetic control for complex longitudinal traits. However, existing methods for longitudinal GWAS either ignore the covariance among observations at different time points or encounter computational efficiency issues.
Results
We herein developed efficient genome-wide multivariate association algorithms for longitudinal data. In contrast to existing univariate linear mixed model analyses, the proposed method has improved statistic power for association detection and computational speed. In addition, the new method can analyze unbalanced longitudinal data with thousands of individuals and more than ten thousand records within a few hours. The corresponding time for balanced longitudinal data is just a few minutes.
Availability and implementation
A software package to implement the efficient algorithm named GMA (https://github.com/chaoning/GMA) is available freely for interested users in relevant fields.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chao Ning
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Dan Wang
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Lei Zhou
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Julong Wei
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Yuanxin Liu
  - School of English, Beijing International Studies University, Beijing, China
- Huimin Kang
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Shengli Zhang
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Shizhong Xu
  - Department of Botany and Plant Science, University of California, Riverside, CA, USA
- Jian-Feng Liu
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
12
Ning C, Wang D, Zheng X, Zhang Q, Zhang S, Mrode R, Liu JF. Eigen decomposition expedites longitudinal genome-wide association studies for milk production traits in Chinese Holstein. Genet Sel Evol 2018; 50:12. [PMID: 29576014 PMCID: PMC5868076 DOI: 10.1186/s12711-018-0383-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 09/05/2017] [Accepted: 03/01/2018] [Indexed: 11/16/2022] [Open Access]
Abstract
Background: Pseudo-phenotypes, such as 305-day yields, estimated breeding values or deregressed proofs, are usually used as response variables for genome-wide association studies (GWAS) of milk production traits in dairy cattle. Computational inefficiency challenges the direct use of test-day records for longitudinal GWAS with large datasets. Results: We propose a rapid longitudinal GWAS method that is based on a random regression model. Our method uses Eigen decomposition of the phenotypic covariance matrix to rotate the data, thereby transforming the complex mixed linear model into weighted least squares analysis. We performed a simulation study that showed that our method can control type I errors well and has higher power than a longitudinal GWAS method that does not include time-varied additive genetic effects. We also applied our method to the analysis of milk production traits in the first three parities of 6711 Chinese Holstein cows. The analysis for each trait was completed within 1 day with known variances. In total, we located 84 significant single nucleotide polymorphisms (SNPs) of which 65 were within previously reported quantitative trait loci (QTL) regions. Conclusions: Our rapid method can control type I errors in the analysis of longitudinal data and can be applied to other longitudinal traits. We detected QTL that were for the most part similar to those reported in a previous study in Chinese Holstein. Moreover, six additional SNPs for fat percentage and 13 SNPs for protein percentage were identified by our method. These additional 19 SNPs could be new candidate quantitative trait nucleotides for milk production traits in Chinese Holstein.
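The rotation step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the phenotypic covariance matrix V is already known, it uses a generic linear model rather than the paper's random regression model, and the function name `rotated_wls` and the toy data are hypothetical. The point is only that rotating by the eigenvectors of V makes the residual covariance diagonal, so generalized least squares becomes weighted least squares.

```python
import numpy as np

def rotated_wls(y, X, V):
    """Fixed-effect estimates via eigen-rotation (illustrative sketch).

    V = U diag(d) U^T is the assumed-known covariance of y. After rotating
    y and X by U^T, the covariance of the rotated data is diag(d), so the
    GLS problem reduces to weighted least squares with weights 1/d.
    """
    d, U = np.linalg.eigh(V)                 # eigenvalues d, orthonormal eigenvectors U
    y_rot = U.T @ y                          # rotated phenotypes
    X_rot = U.T @ X                          # rotated design matrix
    w = 1.0 / d                              # one weight per rotated observation
    XtWX = X_rot.T @ (w[:, None] * X_rot)
    XtWy = X_rot.T @ (w * y_rot)
    return np.linalg.solve(XtWX, XtWy)

# Hypothetical toy data: intercept plus one SNP coded 0/1/2.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.integers(0, 3, size=n)])
A = rng.normal(size=(n, n))
V = A @ A.T / n + np.eye(n)                  # a positive definite covariance matrix
y = X @ np.array([1.0, 0.5]) + np.linalg.cholesky(V) @ rng.normal(size=n)

beta_rot = rotated_wls(y, X, V)

# Reference: direct generalized least squares with the full inverse of V.
V_inv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
```

In a genome scan the eigendecomposition would be computed once and reused for every marker, which is where the saving over repeatedly inverting V comes from; `beta_rot` and `beta_gls` agree up to floating-point error.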
Affiliation(s)
- Chao Ning
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Dan Wang
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Xianrui Zheng
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Qin Zhang
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Shengli Zhang
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Raphael Mrode
  - Animal Biosciences, International Livestock Research Institute, Nairobi, 00100, Kenya
- Jian-Feng Liu
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
13
Baker RL, Leong WF, An N, Brock MT, Rubin MJ, Welch S, Weinig C. Bayesian estimation and use of high-throughput remote sensing indices for quantitative genetic analyses of leaf growth. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2018; 131:283-298. [PMID: 29058049 DOI: 10.1007/s00122-017-3001-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/16/2016] [Accepted: 10/09/2017] [Indexed: 06/07/2023]
Abstract
We develop Bayesian function-valued trait models that mathematically isolate genetic mechanisms underlying leaf growth trajectories by factoring out genotype-specific differences in photosynthesis. Remote sensing data can be used instead of leaf-level physiological measurements. Characterizing the genetic basis of traits that vary during ontogeny and affect plant performance is a major goal in evolutionary biology and agronomy. Describing genetic programs that specifically regulate morphological traits can be complicated by genotypic differences in physiological traits. We describe the growth trajectories of leaves using novel Bayesian function-valued trait (FVT) modeling approaches in Brassica rapa recombinant inbred lines raised in heterogeneous field settings. While frequentist approaches estimate parameter values by treating each experimental replicate discretely, Bayesian models can utilize information in the global dataset, potentially leading to more robust trait estimation. We illustrate this principle by estimating growth asymptotes in the face of missing data and comparing heritabilities of growth trajectory parameters estimated by Bayesian and frequentist approaches. Using pseudo-Bayes factors, we compare the performance of an initial Bayesian logistic growth model and a model that incorporates carbon assimilation (A_max) as a cofactor, thus statistically accounting for genotypic differences in carbon resources. We further evaluate two remotely sensed spectroradiometric indices, photochemical reflectance (pri2) and MERIS Terrestrial Chlorophyll Index (mtci) as covariates in lieu of A_max, because these two indices were genetically correlated with A_max across years and treatments yet allow much higher throughput compared to direct leaf-level gas-exchange measurements. For leaf lengths in uncrowded settings, including A_max improves model fit over the initial model. The mtci and pri2 indices also outperform direct A_max measurements. Of particular importance for evolutionary biologists and plant breeders, hierarchical Bayesian models estimating FVT parameters improve heritabilities compared to frequentist approaches.
Affiliation(s)
- Robert L Baker
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
  - Biology Department, Miami University, Oxford, OH, 45056, USA
- Wen Fung Leong
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Nan An
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Marcus T Brock
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
- Matthew J Rubin
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
- Stephen Welch
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Cynthia Weinig
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
  - Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA
14
Camargo AV, Mackay I, Mott R, Han J, Doonan JH, Askew K, Corke F, Williams K, Bentley AR. Functional Mapping of Quantitative Trait Loci (QTLs) Associated With Plant Performance in a Wheat MAGIC Mapping Population. Frontiers in Plant Science 2018; 9:887. [PMID: 30038630 PMCID: PMC6047115 DOI: 10.3389/fpls.2018.00887] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 02/05/2018] [Accepted: 06/07/2018] [Indexed: 05/18/2023]
Abstract
In crop genetic studies, the mapping of longitudinal data describing the spatio-temporal nature of agronomic traits can elucidate the factors influencing their formation and development. Here, we combine the mapping power and precision of a MAGIC wheat population with robust computational methods to track the spatio-temporal dynamics of traits associated with wheat performance. NIAB MAGIC lines were phenotyped throughout their lifecycle under smart house conditions. Growth models were fitted to the data describing growth trajectories of plant area, height, water use and senescence, and fitted parameters were mapped as quantitative traits. Trait data from single time points were also mapped to determine when and how markers became and ceased to be significant. Assessment of temporal dynamics allowed the identification of marker-trait associations and tracking of trait development against the genetic contribution of key markers. We establish a data-driven approach for understanding complex agronomic traits and accelerate research in plant breeding.
Affiliation(s)
- Anyela V. Camargo
  - The John Bingham Laboratory, National Institute of Agricultural Botany, Cambridge, United Kingdom
  - *Correspondence: Anyela V. Camargo
- Ian Mackay
  - The John Bingham Laboratory, National Institute of Agricultural Botany, Cambridge, United Kingdom
- Richard Mott
  - Division of Bioscience, Genetics Institute, University College London, London, United Kingdom
- Jiwan Han
  - National Plant Phenomics Centre, Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
- John H. Doonan
  - National Plant Phenomics Centre, Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
- Karen Askew
  - National Plant Phenomics Centre, Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
- Fiona Corke
  - National Plant Phenomics Centre, Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
- Kevin Williams
  - National Plant Phenomics Centre, Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
- Alison R. Bentley
  - The John Bingham Laboratory, National Institute of Agricultural Botany, Cambridge, United Kingdom
15
Phan DN, Le Thi HA, Dinh TP. Sparse Covariance Matrix Estimation by DCA-Based Algorithms. Neural Comput 2017; 29:3040-3077. [DOI: 10.1162/neco_a_01012] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/04/2022]
Abstract
This letter proposes a novel approach using ℓ0-norm regularization for the sparse covariance matrix estimation (SCME) problem. The objective function of the SCME problem is composed of a nonconvex part and the ℓ0 term, which is discontinuous and difficult to tackle. Appropriate DC (difference of convex functions) approximations of the ℓ0-norm are used, resulting in approximate SCME problems that are still nonconvex. DC programming and DCA (DC algorithm), powerful tools in the nonconvex programming framework, are investigated. Two DC formulations are proposed and the corresponding DCA schemes developed. Two applications of the SCME problem are considered: classification via sparse quadratic discriminant analysis and portfolio optimization. A careful empirical experiment is performed on simulated and real data sets to study the performance of the proposed algorithms. Numerical results showed their efficiency and their superiority compared with seven state-of-the-art methods.
Affiliation(s)
- Duy Nhat Phan
  - Laboratory of Theoretical and Applied Computer Science EA 3097, University of Lorraine, Ile du Saulcy, 57045 Metz, France
- Hoai An Le Thi
  - Laboratory of Theoretical and Applied Computer Science EA 3097, University of Lorraine, Ile du Saulcy, 57045 Metz, France
- Tao Pham Dinh
  - Laboratory of Mathematics, INSA–Rouen, University of Normandie, 76801 Saint-Etienne-du-Rouvray cedex, France
16
Performance Gains in Genome-Wide Association Studies for Longitudinal Traits via Modeling Time-varied effects. Sci Rep 2017; 7:590. [PMID: 28377602 PMCID: PMC5428860 DOI: 10.1038/s41598-017-00638-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 07/01/2016] [Accepted: 03/08/2017] [Indexed: 11/09/2022] [Open Access]
Abstract
Complex traits with multiple phenotypic values changing over time are called longitudinal traits. In traditional genome-wide association studies (GWAS) for longitudinal traits, a combined/averaged estimated breeding value (EBV) or deregressed proof (DRP), rather than the multiple phenotypic measurements per se for each individual, was frequently treated as the response variable in the statistical model. This can result in power losses or even inflated false positive rates (FPRs) in detection, because the time-dependent relationship among measurements is not exploited. To overcome this limitation, we developed two random regression-based models for functional GWAS on longitudinal traits, which can directly use the original time-dependent records as the response variable and fit the time-varied quantitative trait nucleotide (QTN) effect. Simulation studies showed that our methods could control the FPRs and increase statistical power in detecting QTN in comparison with traditional methods in which EBVs, DRPs or estimated residuals were considered as response variables. In addition, our proposed models achieved reliable power in gene detection when applied to two real datasets, a Chinese Holstein cattle dataset and the Genetic Analysis Workshop 18 data. Our study offers an optimal way to enhance the power of gene detection and to further understand the genetic control of developmental processes for complex longitudinal traits.
17
Li Z, Sillanpää MJ. Dynamic Quantitative Trait Locus Analysis of Plant Phenomic Data. Trends in Plant Science 2015; 20:822-833. [PMID: 26482958 DOI: 10.1016/j.tplants.2015.08.012] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 03/18/2015] [Revised: 08/12/2015] [Accepted: 08/26/2015] [Indexed: 05/27/2023]
Abstract
Advanced platforms have recently become available for automatic and systematic quantification of plant growth and development. These new techniques can efficiently produce multiple measurements of phenotypes over time, and introduce time as an extra dimension to quantitative trait locus (QTL) studies. Functional mapping utilizes a class of statistical models for identifying QTLs associated with the growth characteristics of interest. A major benefit of functional mapping is that it integrates information over multiple timepoints, and therefore could increase the statistical power for QTL detection. We review the current development of computationally efficient functional mapping methods which provide invaluable tools for analyzing large-scale timecourse data that are readily available in our post-genome era.
Affiliation(s)
- Zitong Li
  - Biocenter Oulu, Oulu, Finland; Department of Mathematical Sciences and Department of Biology, University of Oulu, 90014 Oulu, Finland
- Mikko J Sillanpää
  - Biocenter Oulu, Oulu, Finland; Department of Mathematical Sciences and Department of Biology, University of Oulu, 90014 Oulu, Finland
18
Mapping Quantitative Trait Loci Underlying Function-Valued Traits Using Functional Principal Component Analysis and Multi-Trait Mapping. G3: Genes, Genomes, Genetics 2015; 6:79-86. [PMID: 26530421 PMCID: PMC4704727 DOI: 10.1534/g3.115.024133] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 01/03/2023]
Abstract
We previously proposed a simple regression-based method to map quantitative trait loci underlying function-valued phenotypes. In order to better handle the case of noisy phenotype measurements and accommodate the correlation structure among time points, we propose an alternative approach that maintains much of the simplicity and speed of the regression-based method. We overcome noisy measurements by replacing the observed data with a smooth approximation. We then apply functional principal component analysis, replacing the smoothed phenotype data with a small number of principal components. Quantitative trait locus mapping is applied to these dimension-reduced data, either with a multi-trait method or by considering the traits individually and then taking the average or maximum LOD score across traits. We apply these approaches to root gravitropism data on Arabidopsis recombinant inbred lines and further investigate their performance in computer simulations. Our methods have been implemented in the R package, funqtl.
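The pipeline outlined in this abstract (smooth, reduce with functional principal components, then map each component) can be sketched in NumPy as follows. This is an illustration under stated assumptions and does not use the funqtl package: the polynomial smoother stands in for whatever smoother the authors use, the single-marker regression LOD is a simplified stand-in for a full genome scan, and all function names and the simulated data are hypothetical.

```python
import numpy as np

def smooth_curves(Y, t, degree=3):
    """Replace each noisy curve with a least-squares polynomial fit.
    (Assumed stand-in smoother, chosen for brevity.)"""
    B = np.vander(t, degree + 1)                     # timepoints x basis functions
    coef, *_ = np.linalg.lstsq(B, Y.T, rcond=None)   # basis functions x individuals
    return (B @ coef).T                              # individuals x timepoints

def fpca_scores(Y_smooth, n_comp):
    """Principal component scores of the centered, smoothed curves."""
    Yc = Y_smooth - Y_smooth.mean(axis=0)
    U, s, _ = np.linalg.svd(Yc, full_matrices=False)
    return U[:, :n_comp] * s[:n_comp]

def lod_score(trait, geno):
    """Regression LOD at one marker: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(trait)
    rss0 = np.sum((trait - trait.mean()) ** 2)
    X = np.column_stack([np.ones(n), geno])
    beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
    rss1 = np.sum((trait - X @ beta) ** 2)
    return (n / 2) * np.log10(rss0 / rss1)

# Hypothetical data: 100 inbred lines, 20 timepoints, one marker coded 0/1
# whose genotype changes the amplitude of the growth curve.
rng = np.random.default_rng(1)
n, T = 100, 20
t = np.linspace(0.0, 1.0, T)
geno = rng.integers(0, 2, size=n)
Y = np.outer(1.0 + 0.8 * geno, t ** 2) + rng.normal(scale=0.1, size=(n, T))

scores = fpca_scores(smooth_curves(Y, t), n_comp=2)
lods = np.array([lod_score(scores[:, k], geno) for k in range(scores.shape[1])])
lod_mean, lod_max = lods.mean(), lods.max()          # the two summaries named above
```

Treating each component separately and then combining the LOD scores by mean or maximum corresponds to the second option the abstract mentions; the multi-trait alternative is not shown here.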
19
Baker RL, Leong WF, Brock MT, Markelz RJC, Covington MF, Devisetty UK, Edwards CE, Maloof J, Welch S, Weinig C. Modeling development and quantitative trait mapping reveal independent genetic modules for leaf size and shape. The New Phytologist 2015; 208:257-68. [PMID: 26083847 DOI: 10.1111/nph.13509] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/17/2015] [Accepted: 05/11/2015] [Indexed: 05/24/2023]
Abstract
Improved predictions of fitness and yield may be obtained by characterizing the genetic controls and environmental dependencies of organismal ontogeny. Elucidating the shape of growth curves may reveal novel genetic controls that single-time-point (STP) analyses do not because, in theory, infinite numbers of growth curves can result in the same final measurement. We measured leaf lengths and widths in Brassica rapa recombinant inbred lines (RILs) throughout ontogeny. We modeled leaf growth and allometry as function valued traits (FVT), and examined genetic correlations between these traits and aspects of phenology, physiology, circadian rhythms and fitness. We used RNA-seq to construct a SNP linkage map and mapped trait quantitative trait loci (QTL). We found genetic trade-offs between leaf size and growth rate FVT and uncovered differences in genotypic and QTL correlations involving FVT vs STPs. We identified leaf shape (allometry) as a genetic module independent of length and width and identified selection on FVT parameters of development. Leaf shape is associated with venation features that affect desiccation resistance. The genetic independence of leaf shape from other leaf traits may therefore enable crop optimization in leaf shape without negative effects on traits such as size, growth rate, duration or gas exchange.
Affiliation(s)
- Robert L Baker
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
- Wen Fung Leong
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Marcus T Brock
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
- R J Cody Markelz
  - Department of Plant Biology, University of California, Davis, CA, 95616, USA
- Michael F Covington
  - Department of Plant Biology, University of California, Davis, CA, 95616, USA
- Upendra K Devisetty
  - Department of Plant Biology, University of California, Davis, CA, 95616, USA
- Christine E Edwards
  - Center for Conservation and Sustainable Development, Missouri Botanical Garden, St Louis, MO, 63166, USA
- Julin Maloof
  - Department of Plant Biology, University of California, Davis, CA, 95616, USA
- Stephen Welch
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Cynthia Weinig
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
  - Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA
20
Hernandez KM. Understanding the genetic architecture of complex traits using the function-valued approach. The New Phytologist 2015; 208:1-3. [PMID: 26311281 DOI: 10.1111/nph.13607] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/04/2023]
Affiliation(s)
- Kyle M Hernandez
  - Center for Research Informatics, Biological Sciences Division, University of Chicago, Chicago, IL, 60637, USA
21
Kawajiri M, Yoshida K, Fujimoto S, Mokodongan DF, Ravinet M, Kirkpatrick M, Yamahira K, Kitano J. Ontogenetic stage-specific quantitative trait loci contribute to divergence in developmental trajectories of sexually dimorphic fins between medaka populations. Mol Ecol 2014; 23:5258-75. [PMID: 25251151 DOI: 10.1111/mec.12933] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 02/18/2014] [Revised: 09/16/2014] [Accepted: 09/17/2014] [Indexed: 11/29/2022]
Abstract
Sexual dimorphism can evolve when males and females differ in phenotypic optima. Genetic constraints can, however, limit the evolution of sexual dimorphism. One possible constraint is derived from alleles expressed in both sexes. Because males and females share most of their genome, shared alleles with different fitness effects between sexes are faced with intralocus sexual conflict. Another potential constraint is derived from genetic correlations between developmental stages. Sexually dimorphic traits are often favoured at adult stages, but selected against as juvenile, so developmental decoupling of traits between ontogenetic stages may be necessary for the evolution of sexual dimorphism in adults. Resolving intralocus conflicts between sexes and ages is therefore a key to the evolution of age-specific expression of sexual dimorphism. We investigated the genetic architecture of divergence in the ontogeny of sexual dimorphism between two populations of the Japanese medaka (Oryzias latipes) that differ in the magnitude of dimorphism in anal and dorsal fin length. Quantitative trait loci (QTL) mapping revealed that few QTL had consistent effects throughout ontogenetic stages and the majority of QTL change the sizes and directions of effects on fin growth rates during ontogeny. We also found that most QTL were sex-specific, suggesting that intralocus sexual conflict is almost resolved. Our results indicate that sex- and age-specific QTL enable the populations to achieve optimal developmental trajectories of sexually dimorphic traits in response to complex natural and sexual selection.
Affiliation(s)
- Maiko Kawajiri
  - Ecological Genetics Laboratory, National Institute of Genetics, Yata 1111, Mishima, Shizuoka, 411-8540, Japan
22
Abstract
Most statistical methods for quantitative trait loci (QTL) mapping focus on a single phenotype. However, multiple phenotypes are commonly measured, and recent technological advances have greatly simplified the automated acquisition of numerous phenotypes, including function-valued phenotypes, such as growth measured over time. While methods exist for QTL mapping with function-valued phenotypes, they are generally computationally intensive and focus on single-QTL models. We propose two simple, fast methods that maintain high power and precision and are amenable to extensions with multiple-QTL models using a penalized likelihood approach. After identifying multiple QTL by these approaches, we can view the function-valued QTL effects to provide a deeper understanding of the underlying processes. Our methods have been implemented as a package for R, funqtl.
23
Des Marais DL, Hernandez KM, Juenger TE. Genotype-by-Environment Interaction and Plasticity: Exploring Genomic Responses of Plants to the Abiotic Environment. Annual Review of Ecology, Evolution, and Systematics 2013. [DOI: 10.1146/annurev-ecolsys-110512-135806] [Citation(s) in RCA: 256] [Impact Index Per Article: 21.3] [Indexed: 11/09/2022]
Affiliation(s)
- David L. Des Marais
  - Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712
- Kyle M. Hernandez
  - Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712
- Thomas E. Juenger
  - Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas 78712
24
Fan J, Liao Y, Mincheva M. Large Covariance Estimation by Thresholding Principal Orthogonal Complements. J R Stat Soc Series B Stat Methodol 2013; 75. [PMID: 24348088 DOI: 10.1111/rssb.12016] [Citation(s) in RCA: 194] [Impact Index Per Article: 16.2] [Indexed: 11/28/2022]
Abstract
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
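The structure of the estimator described in this abstract (leading principal components plus a thresholded residual covariance) can be sketched as below. This is a simplified illustration, not the published procedure: it applies a single constant hard threshold `tau` to the off-diagonal residual entries, whereas the paper works with adaptive, entry-dependent thresholds, and the function name `poet_sketch` and the simulated factor data are hypothetical.

```python
import numpy as np

def poet_sketch(X, K, tau):
    """Low-rank-plus-thresholded-residual covariance estimate (illustrative).

    X   : observations x variables data matrix
    K   : number of leading principal components kept as the factor part
    tau : constant hard threshold for off-diagonal residual covariances
          (a simplification assumed for this sketch)
    """
    S = np.cov(X, rowvar=False)                      # sample covariance
    vals, vecs = np.linalg.eigh(S)                   # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:K]                 # indices of the K largest
    low_rank = (vecs[:, top] * vals[top]) @ vecs[:, top].T
    R = S - low_rank                                 # principal orthogonal complement
    R_thr = np.where(np.abs(R) >= tau, R, 0.0)       # zero out small off-diagonals
    np.fill_diagonal(R_thr, np.diag(R))              # the diagonal is never thresholded
    return low_rank + R_thr

# Hypothetical data from a two-factor model with independent noise.
rng = np.random.default_rng(2)
T, p = 200, 10
F = rng.normal(size=(T, 2))                          # latent factors
loadings = rng.normal(size=(p, 2))
X = F @ loadings.T + 0.5 * rng.normal(size=(T, p))

sigma_hat = poet_sketch(X, K=2, tau=0.05)
```

The special cases listed in the abstract fall out of the two tuning parameters in this sketch: `tau=0` returns the sample covariance, `K=0` gives a pure thresholding estimator, and a very large `tau` leaves a factor-based estimate with a diagonal residual.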
Affiliation(s)
- Jianqing Fan
  - Department of Operations Research and Financial Engineering, Princeton University; Bendheim Center for Finance, Princeton University
- Yuan Liao
  - Department of Mathematics, University of Maryland
- Martina Mincheva
  - Department of Operations Research and Financial Engineering, Princeton University
25
Abstract
In biology, many quantitative traits are dynamic in nature and can often be described by smooth functions or curves. A joint analysis of all the repeated measurements of a dynamic trait by functional quantitative trait loci (QTL) mapping methods has two benefits: (1) it helps to understand the genetic control of the whole dynamic process of the trait and (2) it improves the statistical power to detect QTL. One crucial issue in functional QTL mapping is how to correctly describe the smoothness of the trajectories of function-valued traits. We develop an efficient Bayesian nonparametric multiple-loci procedure for mapping dynamic traits. The method uses Bayesian P-splines with (nonparametric) B-spline bases to specify the functional form of a QTL trajectory and a random walk prior to automatically determine its degree of smoothness. An efficient deterministic variational Bayes algorithm is used to implement both (1) the search for an optimal subset of QTL among large marker panels and (2) the estimation of the genetic effects of the selected QTL changing over time. Our method can be fast even on some large-scale data sets. The advantages of our method are illustrated on both simulated and real data sets.