1
Fernandes AFA, Dórea JRR, Valente BD, Fitzgerald R, Herring W, Rosa GJM. Comparison of data analytics strategies in computer vision systems to predict pig body composition traits from 3D images. J Anim Sci 2020; 98:skaa250. [PMID: 32770242 PMCID: PMC7447136 DOI: 10.1093/jas/skaa250] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/07/2020] [Accepted: 07/31/2020] [Indexed: 12/17/2022] Open
Abstract
Computer vision systems (CVS) have been shown to be a powerful tool for the measurement of live pig body weight (BW) with no animal stress. With advances in precision farming, it is now possible to evaluate the growth performance of individual pigs more accurately. However, important traits such as muscle and fat deposition can still be evaluated only via ultrasound, computed tomography, or dual-energy x-ray absorptiometry. Therefore, the objectives of this study were: 1) to develop a CVS for prediction of live BW, muscle depth (MD), and back fat (BF) from top view 3D images of finishing pigs and 2) to compare the predictive ability of different approaches, such as traditional multiple linear regression, partial least squares, and machine learning techniques, including elastic networks, artificial neural networks, and deep learning (DL). A dataset containing over 12,000 images from 557 finishing pigs (average BW of 120 ± 12 kg) was split into training and testing sets using a 5-fold cross-validation (CV) technique so that 80% and 20% of the dataset were used for training and testing in each fold. Several image features, such as volume, area, length, widths, heights, polar image descriptors, and polar Fourier transforms, were extracted from the images and used as predictor variables in the different approaches evaluated. In addition, DL image encoders that take raw 3D images as input were also tested. This latter method achieved the best overall performance, with the lowest mean absolute scaled error (MASE) and root mean square error for all traits, and the highest predictive squared correlation (R²). The median predicted MASE achieved by this method was 2.69, 5.02, and 13.56, and R² of 0.86, 0.50, and 0.45, for BW, MD, and BF, respectively. In conclusion, it was demonstrated that it is possible to successfully predict BW, MD, and BF via CVS in a fully automated setting using 3D images collected in farm conditions.
Moreover, DL algorithms simplified and optimized the data analytics workflow, with raw 3D images used as direct inputs, without requiring prior image processing.
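The 5-fold cross-validation described in this abstract can be illustrated with a minimal sketch. The abstract does not say whether folds were formed by image or by animal, so the code below simply splits generic item indices; the function name `k_fold_indices`, the seed, and the use of 557 items are assumptions for illustration, not the authors' procedure.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one item.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical usage: 557 items, matching the number of pigs in the abstract.
splits = list(k_fold_indices(557, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    share = len(test) / (len(train) + len(test))
    print(f"fold {fold_number}: train={len(train)} test={len(test)} test share={share:.2f}")
```

Each item appears in exactly one test fold, so every fold trains on roughly 80% of the items and tests on the remaining 20%. With repeated images per animal, splitting on animal identifiers rather than on images is one way to keep images of the same pig out of both sets.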
Affiliation(s)
- Arthur F A Fernandes
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI
- João R R Dórea
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI
- Guilherme J M Rosa
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI
2
Causal phenotypic networks for egg traits in an F2 chicken population. Mol Genet Genomics 2019; 294:1455-1462. [PMID: 31240383 DOI: 10.1007/s00438-019-01588-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/19/2019] [Accepted: 06/17/2019] [Indexed: 12/24/2022]
Abstract
Traditional single-trait genetic analyses, such as quantitative trait locus (QTL) and genome-wide association studies (GWAS), have been used to understand genotype-phenotype relationships for egg traits in chickens. Even though these techniques can detect potential genes of major effect, they cannot reveal cryptic causal relationships among QTLs and phenotypes. Thus, to better understand the relationships involving multiple genes and phenotypes of interest, other data analysis techniques must be used. Here, we utilized a QTL-directed dependency graph (QDG) mapping approach for a joint analysis of chicken egg traits, so that functional relationships and potential causal effects between them could be investigated. The QDG mapping identified a total of 17 QTLs affecting 24 egg traits that formed three independent networks of phenotypic trait groups (eggshell color, egg production, and size and weight of egg components), clearly distinguishing direct and indirect effects of QTLs towards correlated traits. For example, the network of size and weight of egg components contained 13 QTLs and 18 traits that are densely connected to each other. This indicates complex relationships between genotype and phenotype involving both direct and indirect effects of QTLs on the studied traits. Most of the QTLs were commonly identified by both the traditional (single-trait) mapping and the QDG approach. The network analysis, however, offers additional insight regarding the source and characterization of pleiotropy affecting egg traits. As such, the QDG analysis provides a substantial step forward, revealing cryptic relationships among QTLs and phenotypes, especially regarding direct and indirect QTL effects as well as potential causal relationships between traits, which can be used, for example, to optimize management practices and breeding strategies for the improvement of the traits.
3
Chitakasempornkul K, Menegat MB, Rosa GJM, Lopes FB, Jager A, Gonçalves MAD, Dritz SS, Tokach MD, Goodband RD, Bello NM. Investigating causal biological relationships between reproductive performance traits in high-performing gilts and sows. J Anim Sci 2019; 97:2385-2401. [PMID: 30968112 PMCID: PMC6541814 DOI: 10.1093/jas/skz115] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/27/2018] [Accepted: 04/08/2019] [Indexed: 11/13/2022] Open
Abstract
Efficient management of swine production systems requires understanding of complex reproductive physiological mechanisms. Our objective in this study was to investigate potential causal biological relationships between reproductive performance traits in high-producing gilts and sows. Data originated from a nutrition experiment and consisted of 200 sows and 440 gilts arranged in body weight blocks and randomly assigned to dietary treatments during late gestation at a commercial swine farm. Reproductive performance traits consisted of weight gain during late gestation, total number born and number born alive in a litter, born alive average birth weight, wean-to-estrous interval, and total litter size born in the subsequent farrowing. Structural equation models combined with the inductive causation algorithm, both adapted to a hierarchical Bayesian framework, were employed to search for, estimate, and infer upon causal links between the traits within each parity group. Results indicated potentially distinct reproductive networks for gilts and for sows. Sows showed sparse connectivity between reproductive traits, whereas the network learned for gilts was densely interconnected, suggesting closely linked physiological mechanisms in younger females, with a potential for ripple effects throughout their productive lifecycle in response to early implementation of tailored managerial interventions. Cross-validation analyses indicated substantial network stability both for the general structure and for individual links, though results about directionality of such links were unstable in this study and will need further investigation. An assessment of relative statistical power in sows and gilts indicated that the observed network discrepancies may be partially explained on a biological basis. In summary, our results suggest distinctly heterogeneous mechanistic networks of reproductive physiology for gilts and sows, consistent with physiological differences between the groups. 
These findings have potential practical implications for integrated understanding and differential management of gilts and sows to enhance efficiency of swine production systems.
Affiliation(s)
- Mariana B Menegat
- Department of Diagnostic Medicine/Pathobiology, Kansas State University, Manhattan, KS
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI
- Fernando B Lopes
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI
- Abigail Jager
- Department of Statistics, Kansas State University, Manhattan, KS
- Steve S Dritz
- Department of Diagnostic Medicine/Pathobiology, Kansas State University, Manhattan, KS
- Mike D Tokach
- Department of Animal Sciences and Industry, Kansas State University, Manhattan, KS
- Robert D Goodband
- Department of Animal Sciences and Industry, Kansas State University, Manhattan, KS
- Nora M Bello
- Department of Statistics, Kansas State University, Manhattan, KS
4
Velez-Irizarry D, Casiro S, Daza KR, Bates RO, Raney NE, Steibel JP, Ernst CW. Genetic control of longissimus dorsi muscle gene expression variation and joint analysis with phenotypic quantitative trait loci in pigs. BMC Genomics 2019; 20:3. [PMID: 30606113 PMCID: PMC6319002 DOI: 10.1186/s12864-018-5386-2] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/10/2018] [Accepted: 12/18/2018] [Indexed: 12/21/2022] Open
Abstract
Background: Economically important growth and meat quality traits in pigs are controlled by cascading molecular events occurring during development and continuing throughout the conversion of muscle to meat. However, little is known about the genes and molecular mechanisms involved in this process. Evaluating transcriptomic profiles of skeletal muscle during the initial steps leading to the conversion of muscle to meat can identify key regulators of polygenic phenotypes. In addition, mapping transcript abundance through genome-wide association analysis using high-density marker genotypes allows identification of genomic regions that control gene expression, referred to as expression quantitative trait loci (eQTL). In this study, we perform eQTL analyses to identify potential candidate genes and molecular markers regulating growth and meat quality traits in pigs. Results: Messenger RNA transcripts obtained with RNA-seq of longissimus dorsi muscle from 168 F2 animals from a Duroc x Pietrain pig resource population were used to estimate gene expression variation subject to genetic control by mapping eQTL. A total of 339 eQTL were mapped (FDR ≤ 0.01) with 191 exhibiting local-acting regulation. Joint analysis of eQTL with phenotypic QTL (pQTL) segregating in our population revealed 16 genes significantly associated with 21 pQTL for meat quality, carcass composition and growth traits. Ten of these pQTL were for meat quality phenotypes that co-localized with one eQTL on SSC2 (8.8-Mb region) and 11 eQTL on SSC15 (121-Mb region). Biological processes identified for co-localized eQTL genes include calcium signaling (FERM, MRLN, PKP2 and CHRNA9), energy metabolism (SUCLG2 and PFKFB3) and redox homeostasis (NQO1 and CEP128), and results support an important role for activation of the PI3K-Akt-mTOR signaling pathway during the initial conversion of muscle to meat.
Conclusion: Co-localization of eQTL with pQTL identified molecular markers significantly associated with both economically important phenotypes and gene transcript abundance. This study reveals candidate genes contributing to variation in pig production traits, and provides new knowledge regarding the genetic architecture of meat quality phenotypes.
Affiliation(s)
- Sebastian Casiro
- Department of Animal Science, Michigan State University, East Lansing, MI, 48824, USA
- Kaitlyn R Daza
- Department of Animal Science, Michigan State University, East Lansing, MI, 48824, USA
- Ronald O Bates
- Department of Animal Science, Michigan State University, East Lansing, MI, 48824, USA
- Nancy E Raney
- Department of Animal Science, Michigan State University, East Lansing, MI, 48824, USA
- Juan P Steibel
- Department of Animal Science, Michigan State University, East Lansing, MI, 48824, USA; Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI, 48824, USA
- Catherine W Ernst
- Department of Animal Science, Michigan State University, East Lansing, MI, 48824, USA
5
Bello NM, Ferreira VC, Gianola D, Rosa GJM. Conceptual framework for investigating causal effects from observational data in livestock. J Anim Sci 2018; 96:4045-4062. [PMID: 30107524 DOI: 10.1093/jas/sky277] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/25/2018] [Accepted: 07/03/2018] [Indexed: 01/07/2023] Open
Abstract
Understanding causal mechanisms among variables is critical to efficient management of complex biological systems such as animal agriculture production. The increasing availability of data from commercial livestock operations offers unique opportunities for attaining causal insight, despite the inherently observational nature of these data. Causal claims based on observational data are substantiated by recent theoretical and methodological developments in the rapidly evolving field of causal inference. Thus, the objectives of this review are as follows: 1) to introduce a unifying conceptual framework for investigating causal effects from observational data in livestock, 2) to illustrate its implementation in the context of the animal sciences, and 3) to discuss opportunities and challenges associated with this framework. Foundational to the proposed conceptual framework are graphical objects known as directed acyclic graphs (DAGs). As mathematical constructs and practical tools, DAGs encode putative structural mechanisms underlying causal models together with their probabilistic implications. The process of DAG elicitation and causal identification is central to any causal claims based on observational data. We further discuss necessary causal assumptions and associated limitations to causal inference. Last, we provide practical recommendations to facilitate implementation of causal inference from observational data in the context of the animal sciences.
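As a rough illustration of the directed acyclic graphs mentioned in this abstract, the sketch below stores a graph as a list of (cause, effect) pairs and checks acyclicity with Kahn's topological sort. The variable names (`management`, `health`, `performance`) are hypothetical and are not taken from the review.

```python
from collections import deque


def topological_order(edges):
    """Return a topological order of the nodes, or None if the graph has a cycle."""
    nodes = {node for edge in edges for node in edge}
    children = {node: [] for node in nodes}
    in_degree = {node: 0 for node in nodes}
    for parent, child in edges:
        children[parent].append(child)
        in_degree[child] += 1
    # Start from nodes with no incoming edges (no causes inside the graph).
    queue = deque(sorted(node for node in nodes if in_degree[node] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)
    # If some nodes were never released, they sit on a cycle.
    return order if len(order) == len(nodes) else None


# Hypothetical causal structure: management affects performance both
# directly and through health.
hypothetical_dag = [
    ("management", "health"),
    ("management", "performance"),
    ("health", "performance"),
]
print(topological_order(hypothetical_dag))

# Adding feedback from performance to management creates a cycle,
# so the graph no longer qualifies as a DAG.
print(topological_order(hypothetical_dag + [("performance", "management")]))
```

A valid ordering lists every cause before its effects; a `None` result signals that the proposed structure contains a feedback loop and cannot be treated as a DAG.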
Affiliation(s)
- Nora M Bello
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI; Department of Statistics, Kansas State University, Manhattan, KS; Center for Outcomes Research and Epidemiology, Kansas State University, Manhattan, KS
- Vera C Ferreira
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI
- Daniel Gianola
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI; Department of Dairy Science, University of Wisconsin-Madison, Madison, WI; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI
6
Zeng ISL, Lumley T. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science). Bioinform Biol Insights 2018; 12:1177932218759292. [PMID: 29497285 PMCID: PMC5824897 DOI: 10.1177/1177932218759292] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 10/29/2017] [Accepted: 01/24/2018] [Indexed: 12/14/2022] Open
Abstract
Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than features, and the use of a Bayesian approach when there is prior knowledge to be integrated, are also included in the commentary. For the completeness of the review, a table of currently available software and packages for omics from 23 publications is summarized in the appendix.
Affiliation(s)
- Irene Sui Lan Zeng
- Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
- Thomas Lumley
- Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
7
Braconi D, Bernardini G, Millucci L, Santucci A. Foodomics for human health: current status and perspectives. Expert Rev Proteomics 2017; 15:153-164. [PMID: 29271263 DOI: 10.1080/14789450.2018.1421072] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 02/06/2023]
Abstract
Introduction: In the post-genomic era, the opportunity to combine and integrate cutting-edge analytical platforms and data processing systems allowed the birth of foodomics, 'a discipline that studies the Food and Nutrition domains through the application of advanced omics technologies to improve consumer's well-being, health, and confidence'. Since then, this discipline has rapidly evolved and researchers are now facing the daunting task of meeting consumers' needs in terms of food traceability, sustainability, quality, safety and integrity. Most importantly, today it is imperative to provide solid evidence of the mechanisms through which food can promote human health and well-being. Areas covered: In this review, the complex relationships connecting food, nutrition and human health will be discussed, with emphasis on the implications for the development of functional foods and nutraceuticals, personalized nutrition approaches, and the study of the interplay among gut microbiota, diet and health/diseases. Expert commentary: Evidence has been provided supporting the role of various omic platforms in studying the health-promoting effects of food and customized dietary interventions. However, although associated with major analytical challenges, only the proper integration of multi-omics studies and the implementation of bioinformatics tools and databases will help translate findings from clinical practice into effective personalized treatment strategies.
Affiliation(s)
- Daniela Braconi
- Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Siena, Italy
- Giulia Bernardini
- Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Siena, Italy
- Lia Millucci
- Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Siena, Italy
- Annalisa Santucci
- Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Siena, Italy
8
Cha E, Sanderson M, Renter D, Jager A, Cernicchiaro N, Bello NM. Implementing structural equation models to observational data from feedlot production systems. Prev Vet Med 2017; 147:163-171. [PMID: 29254715 DOI: 10.1016/j.prevetmed.2017.09.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/09/2017] [Revised: 08/09/2017] [Accepted: 09/03/2017] [Indexed: 12/28/2022]
Abstract
The objective of this study was to illustrate the implementation of a mixed-model-based structural equation modeling (SEM) approach to observational data in the context of feedlot production systems. Different from traditional multiple-trait models, SEMs allow assessment of potential causal interrelationships between outcomes and can effectively discriminate between direct and indirect effects. For illustration, we focused on feedlot performance and its relationship to health outcomes related to Bovine Respiratory Disease (BRD), which accounts for approximately 75% of morbidity and 50-80% of deaths in feedlots. Our data consisted of 1430 lots representing 178,983 cattle from 9 feedlot operations located across the US Great Plains. We explored functional links between arrival weight (AW; i = 1), BRD-related treatment costs (Trt$; as a proxy for health; i = 2) and average daily weight gain (ADG; as an indicator of productive performance; i = 3), accounting for the fixed effect of sex and correlation patterns due to the clustering of lots within feedlots. We proposed competing plausible causal models based on expert knowledge. The best fitting model selected for inference supported direct effects of AW on ADG as well as indirect effects of AW on ADG mediated by Trt$. Direct effects from outcome i' to outcome i are quantified by the structural coefficient \(\lambda_{ii'}\), such that every unit increase in kg/head of AW had a direct effect of increasing ADG by approximately (estimate ± standard error) \(\hat{\lambda}_{31} = 0.002 \pm 0.0001\) kg/head/day and also a direct effect of reducing Trt$ by an estimated \(\hat{\lambda}_{21}\) = $0.08 ± 0.006 USD per head. In addition, every $1 USD spent on Trt$ directly decreased ADG by an estimated \(\hat{\lambda}_{32} = 0.004 \pm 0.0006\) kg/head/day. From these estimates, we show how to compute the indirect, Trt$-mediated, effect of AW on ADG, as well as the overall effect of AW on ADG, including both direct and indirect effects.
We further compared estimates of SEM-based effects with those obtained from standard linear regression mixed models and demonstrated the additional advantage of explicitly distinguishing direct and indirect components of an overall regression effect using SEMs. Understanding the direct and indirect mechanisms of interplay between health and performance outcomes may provide valuable insight into production systems.
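The direct, indirect, and overall effects described in this abstract can be worked through with the rounded point estimates it reports. The sketch below assumes the usual path-tracing rule for recursive linear SEMs (the indirect effect is the product of the coefficients along the mediated path) and assigns negative signs to the two coefficients that the abstract describes as reductions; the variable names are illustrative, and the results are not figures quoted from the paper.

```python
# Point estimates from the abstract. Signs are an assumption inferred from
# its wording ("increasing", "reducing", "decreased").
lambda_31 = 0.002    # direct effect of AW on ADG, kg/head/day per kg of AW
lambda_21 = -0.08    # direct effect of AW on Trt$, USD/head per kg of AW
lambda_32 = -0.004   # direct effect of Trt$ on ADG, kg/head/day per USD

# Indirect effect of AW on ADG through Trt$: product of the two path coefficients.
indirect_aw_on_adg = lambda_32 * lambda_21

# Overall effect: direct plus indirect.
overall_aw_on_adg = lambda_31 + indirect_aw_on_adg

print(f"direct:   {lambda_31:.5f} kg/head/day per kg AW")
print(f"indirect: {indirect_aw_on_adg:.5f} kg/head/day per kg AW")
print(f"overall:  {overall_aw_on_adg:.5f} kg/head/day per kg AW")
```

Under these assumptions the Trt$-mediated path adds a small positive contribution to the direct effect, because heavier arrival weight lowers treatment costs and lower treatment costs go with higher daily gain. Standard errors for the indirect and overall effects would need the joint uncertainty of the estimates, which the abstract does not give.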
Affiliation(s)
- Elva Cha
- Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA; Center for Outcomes Research and Epidemiology (CORE), College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA
- Mike Sanderson
- Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA; Center for Outcomes Research and Epidemiology (CORE), College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA
- David Renter
- Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA; Center for Outcomes Research and Epidemiology (CORE), College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA
- Abigail Jager
- Department of Statistics, College of Arts and Sciences, Kansas State University, Manhattan, KS, USA
- Natalia Cernicchiaro
- Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA; Center for Outcomes Research and Epidemiology (CORE), College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA
- Nora M Bello
- Center for Outcomes Research and Epidemiology (CORE), College of Veterinary Medicine, Kansas State University, Manhattan, KS, USA; Department of Statistics, College of Arts and Sciences, Kansas State University, Manhattan, KS, USA
9
Bayesian Networks Illustrate Genomic and Residual Trait Connections in Maize (Zea mays L.). G3-GENES GENOMES GENETICS 2017. [PMID: 28637811 PMCID: PMC5555481 DOI: 10.1534/g3.117.044263] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/12/2022]
Abstract
Relationships among traits were investigated on the genomic and residual levels using novel methodology. This included inference on these relationships via Bayesian networks and an assessment of the networks with structural equation models. The methodology employed three steps. First, a Bayesian multiple-trait Gaussian model was fitted to the data to decompose phenotypic values into their genomic and residual components. Second, genomic and residual network structures among traits were learned from estimates of these two components. Network learning was performed using six different algorithmic settings for comparison, of which two were score-based and four were constraint-based approaches. Third, structural equation model analyses ranked the networks in terms of goodness of fit and predictive ability, and compared them with the standard multiple-trait fully recursive network. The methodology was applied to experimental data representing the European heterotic maize pools Dent and Flint (Zea mays L.). Inferences on genomic and residual trait connections were depicted separately as directed acyclic graphs. These graphs provide information beyond mere pairwise genetic or residual associations between traits, illustrating for example conditional independencies and hinting at potential causal links among traits. Network analysis suggested some genetic correlations as potentially spurious. Genomic and residual networks were compared between Dent and Flint.