1
|
O'Neill S, Grieve R, Singh K, Dutt V, Powell-Jackson T. Persistence and heterogeneity of the effects of educating mothers to improve child immunisation uptake: Experimental evidence from Uttar Pradesh in India. JOURNAL OF HEALTH ECONOMICS 2024; 96:102899. [PMID: 38805881 DOI: 10.1016/j.jhealeco.2024.102899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 05/13/2024] [Accepted: 05/17/2024] [Indexed: 05/30/2024]
Abstract
Childhood vaccinations are among the most cost-effective health interventions. Yet, in India, where immunisation services are widely available free of charge, a substantial proportion of children remain unvaccinated. We revisit households 30 months after a randomised experiment of a health information intervention designed to educate mothers on the benefits of child vaccination in Uttar Pradesh, India. We find that the large short-term effects on the uptake of diphtheria-pertussis-tetanus and measles vaccination were sustained at 30 months, suggesting the intervention did not simply bring forward vaccinations. We apply causal forests and find that the intervention increased vaccination uptake, but that there was substantial variation in the magnitude of the estimated effects. We conclude that characterising those who benefited most and conversely those who benefited least provides policy-makers with insights on how the intervention worked, and how the targeting of households could be improved.
Affiliation(s)
- Stephen O'Neill
- Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, London, United Kingdom.
- Richard Grieve
- Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Kultar Singh
- Sambodhi Research and Communications, Noida, Uttar Pradesh, India
- Varun Dutt
- ConveGenius Insights Pvt. Ltd, Hyderabad, India
- Timothy Powell-Jackson
- Department of Global Health and Development, London School of Hygiene & Tropical Medicine, London, United Kingdom
|
2
|
Wang Q, Tang TM, Youlton N, Weldy CS, Kenney AM, Ronen O, Weston Hughes J, Chin ET, Sutton SC, Agarwal A, Li X, Behr M, Kumbier K, Moravec CS, Wilson Tang WH, Margulies KB, Cappola TP, Butte AJ, Arnaout R, Brown JB, Priest JR, Parikh VN, Yu B, Ashley EA. Epistasis regulates genetic control of cardiac hypertrophy. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.11.06.23297858. [PMID: 37987017 PMCID: PMC10659487 DOI: 10.1101/2023.11.06.23297858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
The combinatorial effect of genetic variants is often assumed to be additive. Although genetic variation can clearly interact non-additively, methods to uncover epistatic relationships remain in their infancy. We develop low-signal signed iterative random forests to elucidate the complex genetic architecture of cardiac hypertrophy. We derive deep learning-based estimates of left ventricular mass from the cardiac MRI scans of 29,661 individuals enrolled in the UK Biobank. We report epistatic genetic variation including variants close to CCDC141, IGF1R, TTN, and TNKS. Several loci where variants were deemed insignificant in univariate genome-wide association analyses are identified. Functional genomic and integrative enrichment analyses reveal a complex gene regulatory network in which genes mapped from these loci share biological processes and myogenic regulatory factors. Through a network analysis of transcriptomic data from 313 explanted human hearts, we found strong gene co-expression correlations between these statistical epistasis contributors in healthy hearts and a significant connectivity decrease in failing hearts. We assess causality of epistatic effects via RNA silencing of gene-gene interactions in human induced pluripotent stem cell-derived cardiomyocytes. Finally, single-cell morphology analysis using a novel high-throughput microfluidic system shows that cardiomyocyte hypertrophy is non-additively modifiable by specific pairwise interactions between CCDC141 and both TTN and IGF1R. Our results expand the scope of genetic regulation of cardiac structure to epistasis.
|
3
|
Wang Z, Chen J, Gong M, Shao Z. Higher-order neurodynamical equation for simplex prediction. Neural Netw 2024; 173:106185. [PMID: 38387202 DOI: 10.1016/j.neunet.2024.106185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 02/01/2024] [Accepted: 02/15/2024] [Indexed: 02/24/2024]
Abstract
It is demonstrated that higher-order patterns beyond pairwise relations can significantly enhance the learning capability of existing graph-based models, and the simplex is one of the primary forms for graphically representing higher-order patterns. Predicting unknown (disappeared) simplices in real-world complex networks can provide us with deeper insights, thereby assisting us in making better decisions. Nevertheless, previous efforts to predict simplices suffer from two issues: (i) they mainly focus on 2- or 3-simplices, and there are few models available for predicting simplices of arbitrary orders, and (ii) they lack the ability to analyze and learn the features of simplices from the perspective of dynamics. In this paper, we present a Higher-order Neurodynamical Equation for Simplex Prediction of arbitrary order (HNESP), which is a framework that combines neural networks and neurodynamics. Specifically, HNESP simulates the dynamical coupling process of nodes in simplicial complexes through different relations (i.e., strong pairwise relation, weak pairwise relation, and simplex) to learn node-level representations, while explaining the learning mechanism of neural networks from neurodynamics. To enrich the higher-order information contained in simplices, we exploit the entropy and normalized multivariate mutual information of different sub-structures of simplices to acquire simplex-level representations. Furthermore, simplex-level representations and a multi-layer perceptron are used to quantify the existence probability of simplices. The effectiveness of HNESP is demonstrated by extensive simulations on seven higher-order benchmarks. Experimental results show that HNESP improves the AUC values of the state-of-the-art baselines by an average of 8.32%. Our implementations will be publicly available at: https://github.com/jianruichen/HNESP.
Affiliation(s)
- Zhihui Wang
- Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China.
- Jianrui Chen
- Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China.
- Maoguo Gong
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xi'an, China; School of Electronic Engineering, Xidian University, Xi'an, China.
- Zhongshi Shao
- Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China.
|
4
|
Lin H, Westbrook A, Fan F, Inzlicht M. An experimental manipulation of the value of effort. Nat Hum Behav 2024; 8:988-1000. [PMID: 38438651 DOI: 10.1038/s41562-024-01842-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2020] [Accepted: 01/31/2024] [Indexed: 03/06/2024]
Abstract
People who take on challenges and persevere longer are more likely to succeed in life. But individuals often avoid exerting effort, and there is limited experimental research investigating whether we can learn to value effort. We developed a paradigm to test the hypothesis that people can learn to value effort and will seek effortful challenges if directly incentivized to do so. We also dissociate the effects of rewarding people for choosing effortful challenges and performing well. The results provide limited evidence that rewarding effort increased people's willingness to choose harder tasks when rewards were no longer offered (near transfer). There was also mixed evidence that rewarding effort increased willingness to choose harder tasks in another unrelated and unrewarded task (far transfer). These heterogeneous results highlight the need for further research to understand when this paradigm may be the most effective for increasing and generalizing the value of effort.
Affiliation(s)
- Hause Lin
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Department of Psychology, Cornell University, Ithaca, NY, USA.
- Andrew Westbrook
- Center for Advanced Human Brain Imaging Research, Rutgers University, Piscataway, NJ, USA
- Frank Fan
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada
- Michael Inzlicht
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada
- Rotman School of Management, University of Toronto, Toronto, Ontario, Canada
|
5
|
Chang-Brahim I, Koppensteiner LJ, Beltrame L, Bodner G, Saranti A, Salzinger J, Fanta-Jende P, Sulzbachner C, Bruckmüller F, Trognitz F, Samad-Zamini M, Zechner E, Holzinger A, Molin EM. Reviewing the essential roles of remote phenotyping, GWAS and explainable AI in practical marker-assisted selection for drought-tolerant winter wheat breeding. FRONTIERS IN PLANT SCIENCE 2024; 15:1319938. [PMID: 38699541 PMCID: PMC11064034 DOI: 10.3389/fpls.2024.1319938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/13/2024] [Indexed: 05/05/2024]
Abstract
Marker-assisted selection (MAS) plays a crucial role in crop breeding improving the speed and precision of conventional breeding programmes by quickly and reliably identifying and selecting plants with desired traits. However, the efficacy of MAS depends on several prerequisites, with precise phenotyping being a key aspect of any plant breeding programme. Recent advancements in high-throughput remote phenotyping, facilitated by unmanned aerial vehicles coupled to machine learning, offer a non-destructive and efficient alternative to traditional, time-consuming, and labour-intensive methods. Furthermore, MAS relies on knowledge of marker-trait associations, commonly obtained through genome-wide association studies (GWAS), to understand complex traits such as drought tolerance, including yield components and phenology. However, GWAS has limitations that artificial intelligence (AI) has been shown to partially overcome. Additionally, AI and its explainable variants, which ensure transparency and interpretability, are increasingly being used as recognised problem-solving tools throughout the breeding process. Given these rapid technological advancements, this review provides an overview of state-of-the-art methods and processes underlying each MAS, from phenotyping, genotyping and association analyses to the integration of explainable AI along the entire workflow. In this context, we specifically address the challenges and importance of breeding winter wheat for greater drought tolerance with stable yields, as regional droughts during critical developmental stages pose a threat to winter wheat production. Finally, we explore the transition from scientific progress to practical implementation and discuss ways to bridge the gap between cutting-edge developments and breeders, expediting MAS-based winter wheat breeding for drought tolerance.
Affiliation(s)
- Ignacio Chang-Brahim
- Unit Bioresources, Center for Health & Bioresources, AIT Austrian Institute of Technology, Tulln, Austria
- Lorenzo Beltrame
- Unit Assistive and Autonomous Systems, Center for Vision, Automation & Control, AIT Austrian Institute of Technology, Vienna, Austria
- Gernot Bodner
- Department of Crop Sciences, Institute of Agronomy, University of Natural Resources and Life Sciences Vienna, Tulln, Austria
- Anna Saranti
- Human-Centered AI Lab, Department of Forest- and Soil Sciences, Institute of Forest Engineering, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
- Jules Salzinger
- Unit Assistive and Autonomous Systems, Center for Vision, Automation & Control, AIT Austrian Institute of Technology, Vienna, Austria
- Phillipp Fanta-Jende
- Unit Assistive and Autonomous Systems, Center for Vision, Automation & Control, AIT Austrian Institute of Technology, Vienna, Austria
- Christoph Sulzbachner
- Unit Assistive and Autonomous Systems, Center for Vision, Automation & Control, AIT Austrian Institute of Technology, Vienna, Austria
- Felix Bruckmüller
- Unit Assistive and Autonomous Systems, Center for Vision, Automation & Control, AIT Austrian Institute of Technology, Vienna, Austria
- Friederike Trognitz
- Unit Bioresources, Center for Health & Bioresources, AIT Austrian Institute of Technology, Tulln, Austria
- Elisabeth Zechner
- Verein zur Förderung einer nachhaltigen und regionalen Pflanzenzüchtung, Zwettl, Austria
- Andreas Holzinger
- Human-Centered AI Lab, Department of Forest- and Soil Sciences, Institute of Forest Engineering, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
- Eva M. Molin
- Unit Bioresources, Center for Health & Bioresources, AIT Austrian Institute of Technology, Tulln, Austria
- Human-Centered AI Lab, Department of Forest- and Soil Sciences, Institute of Forest Engineering, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
|
6
|
Behr M, Kumbier K, Cordova-Palomera A, Aguirre M, Ronen O, Ye C, Ashley E, Butte AJ, Arnaout R, Brown B, Priest J, Yu B. Learning epistatic polygenic phenotypes with Boolean interactions. PLoS One 2024; 19:e0298906. [PMID: 38625909 PMCID: PMC11020961 DOI: 10.1371/journal.pone.0298906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 01/31/2024] [Indexed: 04/18/2024] Open
Abstract
Detecting epistatic drivers of human phenotypes is a considerable challenge. Traditional approaches use regression to sequentially test multiplicative interaction terms involving pairs of genetic variants. For higher-order interactions and genome-wide large-scale data, this strategy is computationally intractable. Moreover, multiplicative terms used in regression modeling may not capture the form of biological interactions. Building on the Predictability, Computability, Stability (PCS) framework, we introduce the epiTree pipeline to extract higher-order interactions from genomic data using tree-based models. The epiTree pipeline first selects a set of variants derived from tissue-specific estimates of gene expression. Next, it uses iterative random forests (iRF) to search training data for candidate Boolean interactions (pairwise and higher-order). We derive significance tests for interactions, based on a stabilized likelihood ratio test, by simulating Boolean tree-structured null (no epistasis) and alternative (epistasis) distributions on hold-out test data. Finally, our pipeline computes PCS epistasis p-values that probabilistically quantify improvement in prediction accuracy via bootstrap sampling on the test set. We validate the epiTree pipeline in two case studies using data from the UK Biobank: predicting red hair and multiple sclerosis (MS). In the case of predicting red hair, epiTree recovers known epistatic interactions surrounding MC1R and novel interactions, representing non-linearities not captured by logistic regression models. In the case of predicting MS, a more complex phenotype than red hair, epiTree rankings prioritize novel interactions surrounding HLA-DRB1, a variant previously associated with MS in several populations. Taken together, these results highlight the potential for epiTree rankings to help reduce the design space for follow-up experiments.
Affiliation(s)
- Merle Behr
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
- Karl Kumbier
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, United States of America
- Matthew Aguirre
- Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America
- Department of Biomedical Data Science, Stanford Medicine, Stanford, CA, United States of America
- Omer Ronen
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Chengzhong Ye
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Euan Ashley
- Division of Cardiovascular Medicine, Stanford Medicine, Stanford, CA, United States of America
- Atul J. Butte
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States of America
- Rima Arnaout
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States of America
- Division of Cardiology, Department of Medicine, University of California, San Francisco, San Francisco, CA, United States of America
- Ben Brown
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Biosciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
- James Priest
- Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America
- Bin Yu
- Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California at Berkeley, Berkeley, CA, United States of America
|
7
|
Midya V, Nagdeo K, Lane JM, Torres-Olascoaga LA, Torres-Calapiz M, Gennings C, Horton MK, Téllez-Rojo MM, Wright RO, Arora M, Eggers S. Prenatal metal exposures and childhood gut microbial signatures are associated with depression score in late childhood. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 916:170361. [PMID: 38278245 PMCID: PMC10922719 DOI: 10.1016/j.scitotenv.2024.170361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 01/19/2024] [Accepted: 01/20/2024] [Indexed: 01/28/2024]
Abstract
BACKGROUND Childhood depression is a major public health issue worldwide. Previous studies have linked both prenatal metal exposures and the gut microbiome to depression in children. However, few, if any, have studied their interacting effect in specific subgroups of children. OBJECTIVES Using an interpretable machine-learning method, this study investigates whether children with specific combinations of prenatal metals and childhood microbial signatures (cliques or groups of metals and microbes) were more likely to have higher depression scores at 9-11 years of age. METHODS We leveraged data from a well-characterized pediatric longitudinal birth cohort in Mexico City and its microbiome substudy (n = 112). Eleven metal exposures were measured in maternal whole blood samples in the second and third trimesters of pregnancy. Gut microbial abundances were measured at 9-11 years of age using shotgun metagenomic sequencing. Depression symptoms were assessed using the Child Depression Index (CDI) t-scores at 9-11 years of age. We used Microbial and Chemical Exposure Analysis (MiCxA), which combines interpretable machine-learning into a regression framework to identify and estimate joint associations of metal-microbial cliques in specific subgroups. Analyses were adjusted for relevant covariates. RESULTS We identified a subgroup of children (11.6% of the sample) characterized by a four-component metal-microbial clique that had a significantly higher depression score (15.4% higher than the rest) in late childhood. This metal-microbial clique consisted of high Zinc in the second trimester, low Cobalt in the third trimester, a high abundance of Bacteroides fragilis, and a high abundance of Faecalibacterium prausnitzii. All combinations of cliques (two-, three-, and four-components) were significantly associated with increased log-transformed t-scored CDI (β = 0.14, 95% CI = [0.05, 0.23], P < 0.01 for the four-component clique).
SIGNIFICANCE This study offers a new approach to chemical-microbial analysis and a novel demonstration that children with specific gut microbiome cliques and metal exposures during pregnancy may have a higher likelihood of elevated depression scores.
Affiliation(s)
- Vishal Midya
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Kiran Nagdeo
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jamil M Lane
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Libni A Torres-Olascoaga
- Center for Nutrition and Health Research, National Institute of Public Health, Cuernavaca, Mexico
- Mariana Torres-Calapiz
- Center for Nutrition and Health Research, National Institute of Public Health, Cuernavaca, Mexico
- Chris Gennings
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Megan K Horton
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Martha M Téllez-Rojo
- Center for Nutrition and Health Research, National Institute of Public Health, Cuernavaca, Mexico
- Robert O Wright
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Manish Arora
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Shoshannah Eggers
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Epidemiology, University of Iowa College of Public Health, Iowa City, IA, USA
|
8
|
Midya V, Nagdeo K, Lane J, Torres-Olascoaga L, Martínez G, Horton M, Gennings C, Téllez-Rojo M, Wright R, Arora M, Eggers S. Akkermansia muciniphila modifies the association between metal exposure during pregnancy and depressive symptoms in late childhood. RESEARCH SQUARE 2024:rs.3.rs-3922286. [PMID: 38410473 PMCID: PMC10896378 DOI: 10.21203/rs.3.rs-3922286/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
Emerging research suggests that exposures to metals during pregnancy and gut microbiome (GM) disruptions are associated with depressive disorders in childhood. Akkermansia muciniphila, a GM bacterium, has been studied for its potential antidepressant effects. However, its role in the influence of prenatal metal exposures on depressive symptoms during childhood is unknown. Leveraging a well-characterized pediatric longitudinal birth cohort and its microbiome substudy (n = 112) and using a state-of-the-art machine-learning model, we investigated whether the presence of A. muciniphila in the GM of 9-11-year-olds modifies the associations between exposure to a specific group of metals (or metal-clique) during pregnancy and concurrent childhood depressive symptoms. Among children with no A. muciniphila, a metal-clique of Zinc-Chromium-Cobalt was strongly associated with increased depression score (P<0.0001), whereas, for children with A. muciniphila, this same metal-clique was weakly associated with decreased depression score (P<0.4). Our analysis provides the first exploratory evidence hypothesizing A. muciniphila as a probiotic intervention attenuating the effect of prenatal metal-exposures-associated depressive disorders in late childhood.
Affiliation(s)
- Gabriela Martínez
- Center for Research on Nutrition and Health, National Institute of Public Health
- Martha Téllez-Rojo
- Center for Research on Nutrition and Health, National Institute of Public Health
|
9
|
Cameron-Harp MV, Hendricks NP, Potter NA. Predicting the spatial variation in cost-efficiency for agricultural greenhouse gas mitigation programs in the U.S. CARBON BALANCE AND MANAGEMENT 2024; 19:6. [PMID: 38337091 PMCID: PMC10858497 DOI: 10.1186/s13021-024-00252-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 01/27/2024] [Indexed: 02/12/2024]
Abstract
BACKGROUND Two major factors that determine the efficiency of programs designed to mitigate greenhouse gases by encouraging voluntary changes in U.S. agricultural land management are the effect of land use changes on producers' profitability and the net sequestration those changes create. In this work, we investigate how the interaction of these factors produces spatial heterogeneity in the cost-efficiency of voluntary programs incentivizing tillage reduction and cover-cropping practices. We map county-level predicted rates of adoption for each practice with the greenhouse gas mitigation or carbon sequestration benefits expected from their use. Then, we use these bivariate maps to describe how the cost-efficiency of agricultural mitigation efforts is likely to vary spatially in the United States. RESULTS Our results suggest the combination of high adoption rates and large reductions in net emissions makes reduced tillage programs most cost-efficient in the Chesapeake Bay watershed or the Upper Mississippi and Lower Missouri sub-basins of the Mississippi River. For programs aiming to reduce net emissions by incentivizing cover-cropping, we expect cost-efficiency to be greatest in the areas near the main stem of the Mississippi River within its Middle and Lower sections. CONCLUSIONS Many voluntary agricultural conservation programs offer the same incentives across the United States. Yet spatial variation in profitability and efficacy of conservation practices suggests that these uniform approaches are not cost-effective. Spatial targeting of voluntary agricultural conservation programs has the potential to increase the cost-efficiency of these programs due to regional heterogeneity in the profitability and greenhouse gas mitigation benefits of agricultural land management practices across the continental United States. We illustrate how predicted rates of adoption and greenhouse gas sequestration might be used to target regions where efforts to incentivize cover-cropping and reductions in tillage are most likely to be cost-effective.
Affiliation(s)
- Micah V Cameron-Harp
- Department of Agricultural Economics, Kansas State University, Manhattan, Kansas, USA.
- Nathan P Hendricks
- Department of Agricultural Economics, Kansas State University, Manhattan, Kansas, USA
|
10
|
Sakai S, Tanaka Y, Tsukamoto Y, Kimura-Ohba S, Hesaka A, Hamase K, Hsieh CL, Kawakami E, Ono H, Yokote K, Yoshino M, Okuzaki D, Matsumura H, Fukushima A, Mita M, Nakane M, Doi M, Isaka Y, Kimura T. d-Alanine Affects the Circadian Clock to Regulate Glucose Metabolism in the Kidney. KIDNEY360 2024; 5:237-251. [PMID: 38098136 PMCID: PMC10914205 DOI: 10.34067/kid.0000000000000345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 12/07/2023] [Indexed: 03/01/2024]
Abstract
Key Points d-Alanine affects the circadian clock to regulate gluconeogenesis in the kidney. d-Alanine itself has a clear intrinsic circadian rhythm, which is regulated by urinary excretion, and acts on the circadian rhythm. d-Alanine is a signal activator for circadian rhythm and gluconeogenesis through circadian transcriptional network. Background The aberrant glucose circadian rhythm is associated with the pathogenesis of diabetes. Similar to glucose metabolism in the kidney and liver, d-alanine, a rare enantiomer of alanine, shows circadian alteration, although the effect of d-alanine on glucose metabolism has not been explored. Here, we show that d-alanine acts on the circadian clock and affects glucose metabolism in the kidney. Methods The blood and urinary levels of d-alanine in mice were measured using two-dimensional high-performance liquid chromatography system. Metabolic effects of d-alanine were analyzed in mice and in primary culture of kidney proximal tubular cells from mice. Behavioral and gene expression analyses of circadian rhythm were performed using mice bred under constant darkness. Results d-Alanine levels in blood exhibited a clear intrinsic circadian rhythm. Since this rhythm was regulated by the kidney through urinary excretion, we examined the effect of d-alanine on the kidney. In the kidney, d-alanine induced the expressions of genes involved in gluconeogenesis and circadian rhythm. Treatment of d-alanine mediated glucose production in mice. Ex vivo glucose production assay demonstrated that the treatment of d-alanine induced glucose production in primary culture of kidney proximal tubular cells, where d-amino acids are known to be reabsorbed, but not in that of liver cells. Gluconeogenetic effect of d-alanine has an intraday variation, and this effect was in part mediated through circadian transcriptional network. Under constant darkness, treatment of d-alanine normalized the circadian cycle of behavior and kidney gene expressions. Conclusions d-Alanine induces gluconeogenesis in the kidney and adjusts the period of the circadian clock. Normalization of circadian cycle by d-alanine may provide the therapeutic options for life style–related diseases and shift workers.
Affiliation(s)
- Shinsuke Sakai
- Department of Nephrology, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
| | - Youichi Tanaka
- Department of Systems Biology, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
| | - Yusuke Tsukamoto
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
| | - Shihoko Kimura-Ohba
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
| | - Atsushi Hesaka
- Department of Nephrology, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
| | - Kenji Hamase
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka, Japan
| | - Chin-Ling Hsieh
- Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka, Japan
| | - Eiryo Kawakami
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Chiba, Japan
- Advanced Data Science (ADSP), RIKEN Information R&D and Strategy Headquarters, Yokohama, Kanagawa, Japan
- Institute for Advanced Academic Research (IAAR), Chiba University, Chiba, Japan
| | - Hiraku Ono
- Department of Endocrinology, Hematology and Gerontology, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Kotaro Yokote
- Department of Endocrinology, Hematology and Gerontology, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Mitsuaki Yoshino
- Laboratory of Rare Disease Information and Resource library, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), Ibaraki, Osaka, Japan
| | - Daisuke Okuzaki
- Genome Information Research Center, Research Institute for Microbial Disease, Osaka University, Suita, Osaka, Japan
| | - Hiroyo Matsumura
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
| | - Atsuko Fukushima
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
| | | | | | - Masao Doi
- Department of Systems Biology, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
| | - Yoshitaka Isaka
- Department of Nephrology, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
| | - Tomonori Kimura
- Department of Nephrology, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
- Reverse Translational Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
- KAGAMI Project, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
11
|
Künzel SH, Pohlmann D, Bonsen LZ, Krappitz M, Zeitz O, Joussen AM, Dubrac A, Künzel SE. Transcriptome Analysis of Choroidal Endothelium Links Androgen Receptor Role to Central Serous Chorioretinopathy. Eur J Ophthalmol 2024:11206721241226735. [PMID: 38263930 DOI: 10.1177/11206721241226735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2024]
Abstract
BACKGROUND Central Serous Chorioretinopathy (CSCR) manifests as fluid accumulation between the neurosensory retina and the retinal pigment epithelium (RPE). Elevated levels of steroid hormones have been implicated in CSCR pathogenesis. This investigation aims to delineate the gene expression patterns of CSCR-associated risk and steroid receptors across human choroidal cell types and RPE cells to discern potential underlying mechanisms. METHODS This study utilized a comprehensive query of transcriptomic data derived from non-pathological human choroid and RPE cells. FINDINGS CSCR-associated genes such as PTPRB, CFH, and others are predominantly expressed in the choroidal endothelium as opposed to the RPE. The androgen receptor, encoded by the AR gene, demonstrates heightened expression in the macular endothelium compared to peripheral regions, unlike other steroid receptor genes. AR-expressing endothelial cells display an augmented responsiveness to Transforming growth factor beta (TGF-β), indicating a propensity towards endothelial to mesenchymal transition (endMT) transcriptional profiling. INTERPRETATION These results highlight the proclivity of CSCR to manifest primarily within the choroidal vasculature rather than the RPE, suggesting its categorization as a vascular eye disorder. This study accentuates the pivotal role of androgenic steroids, in addition to glucocorticoids. The observed linkage to TGF-β-mediated endMT provides a potential mechanistic insight into the disease's etiology.
Affiliation(s)
| | - Dominika Pohlmann
- Department of Ophthalmology, Charité University Hospital Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Lynn Zur Bonsen
- Department of Ophthalmology, Charité University Hospital Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Matteus Krappitz
- Department of Nephrology and Medical Intensive Care, Charité University Hospital Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Oliver Zeitz
- Department of Ophthalmology, Charité University Hospital Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Antonia M Joussen
- Department of Ophthalmology, Charité University Hospital Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Alexandre Dubrac
- Département de Pathologie et Biologie Cellulaire, Université de Montréal, Montréal, Quebec, Canada
| | - Steffen E Künzel
- Department of Ophthalmology, Charité University Hospital Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Département de Pathologie et Biologie Cellulaire, Université de Montréal, Montréal, Quebec, Canada
12
|
Lu Y, Wang X, Du C, Wang Y, Geng Y, Shi L, Park J. Understanding the role of neutral species by means of high-order interaction in the rock-paper-scissors dynamics. Phys Rev E 2024; 109:014313. [PMID: 38366519 DOI: 10.1103/physreve.109.014313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 01/05/2024] [Indexed: 02/18/2024]
Abstract
The existence of neutral species carries profound ecological implications that warrant further investigation. In this paper, we study the impact of neutral species on biodiversity in a spatial tritrophic system of cyclic competition, in which the neutral species are identified as the fourth species that may affect the competition process of the other three species under the rock-paper-scissors (RPS) rule. Extensive simulations showed that neutral species can promote coexistence in a high mobility regime within the system. When coexistence occurs, we found that the state can be maintained by two mechanisms: Species can either (i) adhere to traditional RPS rule or (ii) form patches to resist invasion. Our findings might aid in understanding the impact of neutral species on biodiversity in ecosystems.
Affiliation(s)
- Yikang Lu
- School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, Yunnan 650221, China
- Institute for Biocomputation and Physics of Complex Systems, University of Zaragoza, 50018 Zaragoza, Spain
| | - Xiaoyue Wang
- School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, Yunnan 650221, China
| | - Chunpeng Du
- School of Mathematics, Kunming University, Kunming, 650214, China
| | - Yanan Wang
- Institute for Biocomputation and Physics of Complex Systems, University of Zaragoza, 50018 Zaragoza, Spain
- School of Economics and Management, Beihang University, Beijing 100191, China
| | - Yini Geng
- School of Mathematics and Statistics, Hunan Normal University, Changsha 410081, China
| | - Lei Shi
- School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, Yunnan 650221, China
- Interdisciplinary Research Institute of Data Science, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
| | - Junpyo Park
- Department of Applied Mathematics, College of Applied Sciences, Kyung Hee University, Yongin 17104, Republic of Korea
13
|
Zhao Y, Potenza MN, Tapert SF, Paulus MP. Neural correlates of negative life events and their relationships with alcohol and cannabis use initiation. DIALOGUES IN CLINICAL NEUROSCIENCE 2023; 25:112-121. [PMID: 37916739 PMCID: PMC10623894 DOI: 10.1080/19585969.2023.2252437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Accepted: 08/22/2023] [Indexed: 11/03/2023]
Abstract
OBJECTIVE Negative life events (NLEs), e.g., poor academic performance (controllable) or being the victim of a crime (uncontrollable), can profoundly affect the trajectory of one's life. Yet, their impact on how the brain develops is still not well understood. This investigation examined the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) dataset for the impact of NLEs on the initiation of alcohol and cannabis use, as well as underlying neural mechanisms. METHODS This study evaluated the impact of controllable and uncontrollable NLEs on substance use initiation in 207 youth who initiated alcohol use, 168 who initiated cannabis use, and compared it to 128 youth who remained substance-naïve, using generalised linear regression models. Mediation analyses were conducted to determine neural pathways of NLE impacting substance use trajectories. RESULTS Dose-response relationships between controllable NLEs and substance use initiation were observed. Having one controllable NLE increased the odds of alcohol initiation by 50% (95%CI [1.18, 1.93]) and cannabis initiation by 73% (95%CI [1.36, 2.24]), respectively. Greater cortical thickness in left banks of the superior temporal sulcus mediated effects of controllable NLEs on alcohol and cannabis initiations. Greater left caudate gray-matter volumes mediated effects of controllable NLEs on cannabis initiation. CONCLUSIONS Controllable but not uncontrollable NLEs increased the odds of alcohol and cannabis initiation. Moreover, those individuals with less mature brain structures at the time of the NLEs experienced a greater impact of NLEs on subsequent initiation of alcohol or cannabis use. Targeting youth experiencing controllable NLEs may help mitigate alcohol and cannabis initiation.
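As a reading aid for the odds figures in this abstract, the following minimal sketch converts an odds ratio into a percent change in odds and checks that each point estimate lies near the geometric midpoint of its confidence interval, which is what one would expect for an interval that is symmetric on the log-odds scale. The odds ratios 1.50 and 1.73 are inferred here from the stated 50% and 73% increases; the function names are illustrative and do not come from the paper.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (OR 1.50 -> +50%)."""
    return (odds_ratio - 1.0) * 100.0


def log_scale_midpoint(ci_low, ci_high):
    """Geometric midpoint of a CI, i.e. the midpoint on the log-odds scale."""
    return math.exp((math.log(ci_low) + math.log(ci_high)) / 2.0)


# (odds ratio inferred from the abstract, CI lower bound, CI upper bound)
reported = {
    "alcohol initiation": (1.50, 1.18, 1.93),
    "cannabis initiation": (1.73, 1.36, 2.24),
}

for outcome, (odds_ratio, ci_low, ci_high) in reported.items():
    print(
        outcome,
        "| percent change in odds:", round(percent_change_in_odds(odds_ratio)),
        "| log-scale CI midpoint:", round(log_scale_midpoint(ci_low, ci_high), 2),
        "| CI excludes 1:", ci_low > 1.0,
    )
```

An interval whose lower bound is above 1 indicates increased odds at the stated confidence level, which matches the abstract's reading of both results.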
Affiliation(s)
- Yihong Zhao
- Columbia University School of Nursing, New York, NY, USA
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
| | - Marc N. Potenza
- Department of Psychiatry, Child Study Center, Yale University School of Medicine, New Haven, CT, USA
- Department of Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Connecticut Mental Health Center, New Haven, CT, USA
- Connecticut Council on Problem Gambling, Wethersfield, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
| | - Susan F. Tapert
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
| | - Martin P. Paulus
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Laureate Institute for Brain Research, Tulsa, OK, USA
14
|
Grady SK, Dojcsak L, Harville EW, Wallace ME, Vilda D, Donneyong MM, Hood DB, Valdez RB, Ramesh A, Im W, Matthews-Juarez P, Juarez PD, Langston MA. Seminar: Scalable Preprocessing Tools for Exposomic Data Analysis. ENVIRONMENTAL HEALTH PERSPECTIVES 2023; 131:124201. [PMID: 38109119 PMCID: PMC10727037 DOI: 10.1289/ehp12901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 11/22/2023] [Accepted: 11/28/2023] [Indexed: 12/19/2023]
Abstract
BACKGROUND The exposome serves as a popular framework in which to study exposures from chemical and nonchemical stressors across the life course and the differing roles that these exposures can play in human health. As a result, data relevant to the exposome have been used as a resource in the quest to untangle complicated health trajectories and help connect the dots from exposures to adverse outcome pathways. OBJECTIVES The primary aim of this methods seminar is to clarify and review preprocessing techniques critical for accurate and effective external exposomic data analysis. Scalability is emphasized through an application of highly innovative combinatorial techniques coupled with more traditional statistical strategies. The Public Health Exposome is used as an archetypical model. The novelty and innovation of this seminar's focus stem from its methodical, comprehensive treatment of preprocessing and its demonstration of the positive effects preprocessing can have on downstream analytics. DISCUSSION State-of-the-art technologies are described for data harmonization and to mitigate noise, which can stymie downstream interpretation, and to select key exposomic features, without which analytics may lose focus. A main task is the reduction of multicollinearity, a particularly formidable problem that frequently arises from repeated measurements of similar events taken at various times and from multiple sources. Empirical results highlight the effectiveness of a carefully planned preprocessing workflow as demonstrated in the context of more highly concentrated variable lists, improved correlational distributions, and enhanced downstream analytics for latent relationship discovery. The nascent field of exposome science can be characterized by the need to analyze and interpret a complex confluence of highly inhomogeneous spatial and temporal data, which may present formidable challenges to even the most powerful analytical tools. 
A systematic approach to preprocessing can therefore provide an essential first step in the application of modern computer and data science methods. https://doi.org/10.1289/EHP12901.
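To make the multicollinearity step mentioned in this abstract concrete, here is a minimal sketch of one common generic approach: greedy pruning of features whose absolute Pearson correlation with an already kept feature reaches a threshold. This is an assumption made for illustration only; it is not the combinatorial method the seminar describes, and the names `pearson`, `prune_correlated`, and the 0.9 threshold are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prune_correlated(columns, threshold=0.9):
    """Return indices of columns kept after greedy correlation pruning.

    A column is kept only if its absolute correlation with every column
    kept so far is below the threshold.
    """
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Synthetic data (not from the paper): column 2 is a near-duplicate of column 0.
random.seed(0)
a = [random.gauss(0, 1) for _ in range(200)]
b = [random.gauss(0, 1) for _ in range(200)]
a_repeat = [v + 0.01 * random.gauss(0, 1) for v in a]

kept = prune_correlated([a, b, a_repeat])
print("kept column indices:", kept)
```

Greedy pruning depends on column order, so a real workflow would normally decide which of two redundant measurements to retain on substantive grounds rather than by position.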
Affiliation(s)
- Stephen K. Grady
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, USA
| | - Levente Dojcsak
- Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA
| | - Emily W. Harville
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, Louisiana, USA
| | - Maeve E. Wallace
- Department of Social, Behavioral, and Population Sciences, Tulane University School of Public Health and Tropical Medicine, New Orleans, Louisiana, USA
| | - Dovile Vilda
- Department of Social, Behavioral, and Population Sciences, Tulane University School of Public Health and Tropical Medicine, New Orleans, Louisiana, USA
| | | | - Darryl B. Hood
- Division of Environmental Health Sciences, College of Public Health, Ohio State University, Columbus, Ohio, USA
| | - R. Burciaga Valdez
- Department of Economics, University of New Mexico, Albuquerque, New Mexico, USA
| | - Aramandla Ramesh
- Department of Biochemistry, Cancer Biology, Neuroscience & Pharmacology, Meharry Medical College, Nashville, Tennessee, USA
| | - Wansoo Im
- Department of Family and Community Medicine, Meharry Medical College, Nashville, Tennessee, USA
| | | | - Paul D. Juarez
- Department of Family and Community Medicine, Meharry Medical College, Nashville, Tennessee, USA
- Institute on Health Disparities, Equity, and the Exposome, Meharry Medical College, Nashville, Tennessee, USA
| | - Michael A. Langston
- Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA
15
|
Desai RI, Kangas BD, Luc OT, Solakidou E, Smith EC, Dawes MH, Ma X, Makriyannis A, Chatterjee S, Dayeh MA, Muñoz-Jaramillo A, Desai MI, Limoli CL. Complex 33-beam simulated galactic cosmic radiation exposure impacts cognitive function and prefrontal cortex neurotransmitter networks in male mice. Nat Commun 2023; 14:7779. [PMID: 38012180 PMCID: PMC10682413 DOI: 10.1038/s41467-023-42173-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Accepted: 09/28/2023] [Indexed: 11/29/2023] Open
Abstract
Astronauts will encounter extended exposure to galactic cosmic radiation (GCR) during deep space exploration, which could impair brain function. Here, we report that in male mice, acute or chronic GCR exposure did not modify reward sensitivity but did adversely affect attentional processes and increased reaction times. Potassium (K+)-stimulation in the prefrontal cortex (PFC) elevated dopamine (DA) but abolished temporal DA responsiveness after acute and chronic GCR exposure. Unlike acute GCR, chronic GCR increased levels of all other neurotransmitters, with differences evident between groups after higher K+-stimulation. Correlational and machine learning analysis showed that acute and chronic GCR exposure differentially reorganized the connection strength and causation of DA and other PFC neurotransmitter networks compared to controls which may explain space radiation-induced neurocognitive deficits.
Affiliation(s)
- Rajeev I Desai
- Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA.
- Behavioral Biology Program, McLean Hospital, Belmont, MA, 02478, USA.
- Center for Drug Discovery, Department of Pharmaceutical Sciences, Northeastern University, Boston, MA, 02115, USA.
| | - Brian D Kangas
- Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA
- Behavioral Biology Program, McLean Hospital, Belmont, MA, 02478, USA
| | - Oanh T Luc
- Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA
- Behavioral Biology Program, McLean Hospital, Belmont, MA, 02478, USA
| | - Eleana Solakidou
- Center for Drug Discovery, Department of Pharmaceutical Sciences, Northeastern University, Boston, MA, 02115, USA
- Medical School, University of Crete, Heraklion, Greece
| | - Evan C Smith
- Center for Drug Discovery, Department of Pharmaceutical Sciences, Northeastern University, Boston, MA, 02115, USA
| | - Monica H Dawes
- Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA
- Behavioral Biology Program, McLean Hospital, Belmont, MA, 02478, USA
| | - Xiaoyu Ma
- Center for Drug Discovery, Department of Pharmaceutical Sciences, Northeastern University, Boston, MA, 02115, USA
| | - Alexandros Makriyannis
- Center for Drug Discovery, Department of Pharmaceutical Sciences, Northeastern University, Boston, MA, 02115, USA
| | | | - Maher A Dayeh
- Southwest Research Institute, San Antonio, TX, 78238, USA
- University of San Antonio, San Antonio, TX, 78249, USA
| | | | - Mihir I Desai
- Southwest Research Institute, San Antonio, TX, 78238, USA
- University of San Antonio, San Antonio, TX, 78249, USA
| | - Charles L Limoli
- Department of Radiation Oncology, University of California, Irvine, Orange, CA, 92697, USA
16
|
Midya V, Alcala CS, Rechtman E, Gregory JK, Kannan K, Hertz-Picciotto I, Teitelbaum SL, Gennings C, Rosa MJ, Valvi D. Machine Learning Assisted Discovery of Interactions between Pesticides, Phthalates, Phenols, and Trace Elements in Child Neurodevelopment. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:18139-18150. [PMID: 37595051 PMCID: PMC10666542 DOI: 10.1021/acs.est.3c00848] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/31/2023] [Revised: 08/10/2023] [Accepted: 08/10/2023] [Indexed: 08/20/2023]
Abstract
A growing body of literature suggests that developmental exposure to individual or mixtures of environmental chemicals (ECs) is associated with autism spectrum disorder (ASD). However, investigating the effect of interactions among these ECs can be challenging. We introduced a combination of the classical exposure-mixture Weighted Quantile Sum (WQS) regression and a machine-learning method termed Signed iterative Random Forest (SiRF) to discover synergistic interactions between ECs that are (1) associated with higher odds of ASD diagnosis, (2) mimic toxicological interactions, and (3) are present only in a subset of the sample whose chemical concentrations are higher than certain thresholds. In a case-control Childhood Autism Risks from Genetics and Environment (CHARGE) study, we evaluated multiordered synergistic interactions among 62 ECs measured in the urine samples of 479 children in association with increased odds for ASD diagnosis (yes vs no). WQS-SiRF identified two synergistic two-ordered interactions between (1) trace-element cadmium (Cd) and the organophosphate pesticide metabolite diethyl-phosphate (DEP); and (2) 2,4,6-trichlorophenol (TCP-246) and DEP. Both interactions were suggestively associated with increased odds of ASD diagnosis in the subset of children with urinary concentrations of Cd, DEP, and TCP-246 above the 75th percentile. This study demonstrates a novel method that combines the inferential power of WQS and the predictive accuracy of machine-learning algorithms to discover potentially biologically relevant chemical-chemical interactions associated with ASD.
Affiliation(s)
- Vishal Midya
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Cecilia Sara Alcala
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Elza Rechtman
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Jill K. Gregory
- Instructional Technology Group, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Kurunthachalam Kannan
- Department of Pediatrics and Department of Environmental Medicine, New York University School of Medicine, New York, New York 10016, United States
| | - Irva Hertz-Picciotto
- Department of Public Health Sciences, School of Medicine, University of California at Davis, Davis, California 95616, United States
- UC Davis MIND (Medical Investigations of Neurodevelopmental Disorders) Institute, University of California at Davis, Sacramento, California 95817, United States
| | - Susan L. Teitelbaum
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Chris Gennings
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Maria J. Rosa
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Damaskini Valvi
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
17
|
Nassani R, Bokhari Y, Alrfaei BM. Molecular signature to predict quality of life and survival with glioblastoma using Multiview omics model. PLoS One 2023; 18:e0287448. [PMID: 37972206 PMCID: PMC10653472 DOI: 10.1371/journal.pone.0287448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 06/05/2023] [Indexed: 11/19/2023] Open
Abstract
Glioblastoma multiforme (GBM) patients show a variety of signs and symptoms that affect their quality of life (QOL) and self-dependence. Since most existing studies have examined prognostic factors based only on clinical factors, there is a need to consider the value of integrating multi-omics data including gene expression and proteomics with clinical data in identifying significant biomarkers for GBM prognosis. Our research aimed to isolate significant features that differentiate between short-term (≤ 6 months) and long-term (≥ 2 years) GBM survival, and between high Karnofsky performance scores (KPS ≥ 80) and low (KPS ≤ 60), using the iterative random forest (iRF) algorithm. Using The Cancer Genome Atlas (TCGA) database, we identified 35 molecular features composed of 19 genes and 16 proteins. Our findings propose molecular signatures for predicting GBM prognosis and will improve clinical decisions, GBM management, and drug development.
Affiliation(s)
- Rayan Nassani
- Center for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, United Kingdom
- King Abdullah International Medical Research Center (KAIMRC), King Saud bin Abdulaziz University for Health Sciences (KSAU-HS), Riyadh, Saudi Arabia
| | - Yahya Bokhari
- Department of AI and Bioinformatics, King Abdullah International Medical Research Center (KAIMRC), King Saud Bin Abdulaziz University for Health Sciences (KSAU-HS), Riyadh, Saudi Arabia
- Department of Health Informatics, College of Public Health and Health Informatics, King Saud Bin Abdulaziz University for Health Sciences (KSAU-HS), Riyadh, Saudi Arabia
| | - Bahauddeen M. Alrfaei
- King Abdullah International Medical Research Center (KAIMRC), King Saud bin Abdulaziz University for Health Sciences (KSAU-HS), Riyadh, Saudi Arabia
- College of Medicine, King Saud Bin Abdulaziz University for Health Sciences (KSAU-HS), Riyadh, Saudi Arabia
18
|
Midya V, Lane JM, Gennings C, Torres-Olascoaga LA, Gregory JK, Wright RO, Arora M, Téllez-Rojo MM, Eggers S. Prenatal Lead Exposure Is Associated with Reduced Abundance of Beneficial Gut Microbial Cliques in Late Childhood: An Investigation Using Microbial Co-Occurrence Analysis (MiCA). ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:16800-16810. [PMID: 37878664 PMCID: PMC10634322 DOI: 10.1021/acs.est.3c04346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 09/08/2023] [Accepted: 09/11/2023] [Indexed: 10/27/2023]
Abstract
Many analytical methods used in gut microbiome research focus on either single bacterial taxa or the whole microbiome, ignoring multibacteria relationships (microbial cliques). We present a novel analytical approach to identify microbial cliques within the gut microbiome of children at 9-11 years associated with prenatal lead (Pb) exposure. Data came from a subset of participants (n = 123) in the Programming Research in Obesity, Growth, Environment and Social Stressors cohort. Pb concentrations were measured in maternal whole blood from the second and third trimesters of pregnancy. Stool samples collected at 9-11 years old underwent metagenomic sequencing to assess the gut microbiome. Using a novel analytical approach, Microbial Co-occurrence Analysis (MiCA), we paired a machine learning algorithm with randomization-based inference to first identify microbial cliques that were predictive of prenatal Pb exposure and then estimate the association between prenatal Pb exposure and microbial clique abundance. With second-trimester Pb exposure, we identified a two-taxa microbial clique that included Bifidobacterium adolescentis and Ruminococcus callidus and a three-taxa clique that also included Prevotella clara. Increasing second-trimester Pb exposure was associated with significantly increased odds of having the two-taxa microbial clique below the median relative abundance (odds ratio (OR) = 1.03, 95% confidence interval (CI) [1.01-1.05]). Using a novel combination of machine learning and causal inference, MiCA identified a significant association between second-trimester Pb exposure and the reduced abundance of a probiotic microbial clique within the gut microbiome in late childhood.
Affiliation(s)
- Vishal Midya
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Jamil M. Lane
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Chris Gennings
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Libni A. Torres-Olascoaga
- Center for Research on Nutrition and Health, National Institute of Public Health, Cuernavaca 62100, Mexico
| | - Jill K. Gregory
- Instructional Technology Group, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Robert O. Wright
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Manish Arora
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Martha Maria Téllez-Rojo
- Center for Research on Nutrition and Health, National Institute of Public Health, Cuernavaca 62100, Mexico
| | - Shoshannah Eggers
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Department of Epidemiology, University of Iowa College of Public Health, Iowa City, Iowa 52242, United States
19
|
Noshay J, Walker T, Alexander W, Klingeman D, Romero J, Walker A, Prates E, Eckert C, Irle S, Kainer D, Jacobson D. Quantum biological insights into CRISPR-Cas9 sgRNA efficiency from explainable-AI driven feature engineering. Nucleic Acids Res 2023; 51:10147-10161. [PMID: 37738140 PMCID: PMC10602897 DOI: 10.1093/nar/gkad736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 08/07/2023] [Accepted: 08/29/2023] [Indexed: 09/24/2023] Open
Abstract
CRISPR-Cas9 tools have transformed genetic manipulation capabilities in the laboratory. Empirical rules-of-thumb have been developed for only a narrow range of model organisms, and mechanistic underpinnings for sgRNA efficiency remain poorly understood. This work establishes a novel feature set and new public resource, produced with quantum chemical tensors, for interpreting and predicting sgRNA efficiency. Feature engineering for sgRNA efficiency is performed using an explainable-artificial intelligence model: iterative Random Forest (iRF). By encoding quantitative attributes of position-specific sequences for Escherichia coli sgRNAs, we identify important traits for sgRNA design in bacterial species. Additionally, we show that expanding positional encoding to quantum descriptors of base-pair, dimer, trimer, and tetramer sequences captures intricate interactions in local and neighboring nucleotides of the target DNA. These features highlight variation in CRISPR-Cas9 sgRNA dynamics between E. coli and H. sapiens genomes. These novel encodings of sgRNAs enhance our understanding of the elaborate quantum biological processes involved in CRISPR-Cas9 machinery.
Affiliation(s)
- Jaclyn M Noshay
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Tyler Walker
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee-Knoxville, Knoxville, TN, USA
| | - William G Alexander
- Synthetic Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Dawn M Klingeman
- Synthetic Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Jonathon Romero
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee-Knoxville, Knoxville, TN, USA
| | - Angelica M Walker
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee-Knoxville, Knoxville, TN, USA
| | - Erica Prates
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Carrie Eckert
- Synthetic Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Stephan Irle
- Computational Sciences and Engineering, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - David Kainer
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Daniel A Jacobson
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| |
|
20
|
Wang W, Chen S, Qiao L, Zhang S, Liu Q, Yang K, Pan Y, Liu J, Liu W. Four Markers Useful for the Distinction of Intrauterine Growth Restriction in Sheep. Animals (Basel) 2023; 13:3305. [PMID: 37958061 PMCID: PMC10648371 DOI: 10.3390/ani13213305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2023] [Revised: 10/14/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023] Open
Abstract
Intrauterine growth restriction (IUGR) is a common perinatal complication in animal reproduction, with long-lasting negative effects on neonates and postnatal animals, which seriously negatively affects livestock production. In this study, we aimed to identify potential genes associated with the diagnosis of IUGR through bioinformatics analysis. Based on the 73 differentially expressed related genes obtained by differential analysis and weighted gene co-expression network analysis, we used three machine learning algorithms to identify 4 IUGR-related hub genes (IUGR-HGs), namely, ADAM9, CRYL1, NDP52, and SERPINA7, whose ROC curves showed that they are a good diagnostic target for IUGR. Next, we identified two molecular subtypes of IUGR through consensus clustering analysis and constructed a gene scoring system based on the IUGR-HGs. The results showed that the IUGR score was positively correlated with the risk of IUGR. The AUC value of IUGR scoring accuracy was 0.970. Finally, we constructed a new artificial neural network model based on the four IUGR-HGs to diagnose sheep IUGR, and its accuracy reached 0.956. In conclusion, the IUGR-HGs we identified provide new potential molecular markers and models for the diagnosis of IUGR in sheep; they can better diagnose whether sheep have IUGR. The present findings provide new perspectives on the diagnosis of IUGR.
Affiliation(s)
- Wannian Wang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Sijia Chen
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Liying Qiao
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Siying Zhang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Qiaoxia Liu
- Shanxi Animal Husbandry Technology Extension Service Center, Taiyuan 030001, China;
| | - Kaijie Yang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Yangyang Pan
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Jianhua Liu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
| | - Wenzhong Liu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, Shanxi Agricultural University, Taigu, Jinzhong 030801, China; (W.W.); (S.C.); (L.Q.); (S.Z.); (K.Y.); (Y.P.); (J.L.)
- Key Laboratory of Farm Animal Genetic Resources Exploration and Breeding of Shanxi Province, Jinzhong 030801, China
| |
|
21
|
Allen B. An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence. PLoS One 2023; 18:e0292341. [PMID: 37796874 PMCID: PMC10553328 DOI: 10.1371/journal.pone.0292341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Accepted: 09/18/2023] [Indexed: 10/07/2023] Open
Abstract
BACKGROUND There is considerable geographic heterogeneity in obesity prevalence across counties in the United States. Machine learning algorithms accurately predict geographic variation in obesity prevalence, but the models are often uninterpretable and viewed as a black box. OBJECTIVE The goal of this study is to extract knowledge from machine learning models for county-level variation in obesity prevalence. METHODS This study shows the application of explainable artificial intelligence methods to machine learning models of cross-sectional obesity prevalence data collected from 3,142 counties in the United States. County-level features were drawn from 7 broad categories: health outcomes, health behaviors, clinical care, social and economic factors, physical environment, demographics, and severe housing conditions. Explainable methods applied to random forest prediction models include feature importance, accumulated local effects, global surrogate decision tree, and local interpretable model-agnostic explanations. RESULTS The results show that machine learning models explained 79% of the variance in obesity prevalence, with physical inactivity, diabetes, and smoking prevalence being the most important factors in predicting obesity prevalence. CONCLUSIONS Interpretable machine learning models of health behaviors and outcomes provide substantial insight into obesity prevalence variation across counties in the United States.
Affiliation(s)
- Ben Allen
- Department of Psychology, University of Kansas, Lawrence, Kansas, United States of America
| |
|
22
|
Desai TA, Hedman ÅK, Dimitriou M, Koprulu M, Figiel S, Yin W, Johansson M, Watts EL, Atkins JR, Sokolov AV, Schiöth HB, Gunter MJ, Tsilidis KK, Martin RM, Pietzner M, Langenberg C, Mills IG, Lamb AD, Mälarstig A, Key TJ, Travis RC, Smith-Byrne K. Identifying proteomic risk factors for overall, aggressive and early onset prostate cancer using Mendelian randomization and tumor spatial transcriptomics. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.09.21.23295864. [PMID: 37790472 PMCID: PMC10543057 DOI: 10.1101/2023.09.21.23295864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/05/2023]
Abstract
Background Understanding the role of circulating proteins in prostate cancer risk can reveal key biological pathways and identify novel targets for cancer prevention. Methods We investigated the association of 2,002 genetically predicted circulating protein levels with risk of prostate cancer overall, and of aggressive and early onset disease, using cis-pQTL Mendelian randomization (MR) and colocalization. Findings for proteins with support from both MR, after correction for multiple-testing, and colocalization were replicated using two independent cancer GWAS, one of European and one of African ancestry. Proteins with evidence of prostate-specific tissue expression were additionally investigated using spatial transcriptomic data in prostate tumor tissue to assess their role in tumor aggressiveness. Finally, we mapped risk proteins to drug and ongoing clinical trials targets. Results We identified 20 proteins genetically linked to prostate cancer risk (14 for overall [8 specific], 7 for aggressive [3 specific], and 8 for early onset disease [2 specific]), of which a majority were novel and replicated. Among these were proteins associated with aggressive disease, such as PPA2 [Odds Ratio (OR) per 1 SD increment = 2.13, 95% CI: 1.54-2.93], PYY [OR = 1.87, 95% CI: 1.43-2.44] and PRSS3 [OR = 0.80, 95% CI: 0.73-0.89], and those associated with early onset disease, including EHPB1 [OR = 2.89, 95% CI: 1.99-4.21], POGLUT3 [OR = 0.76, 95% CI: 0.67-0.86] and TPM3 [OR = 0.47, 95% CI: 0.34-0.64]. We confirm an inverse association of MSMB with prostate cancer overall [OR = 0.81, 95% CI: 0.80-0.82], and also find an inverse association with both aggressive [OR = 0.84, 95% CI: 0.82-0.86] and early onset disease [OR = 0.71, 95% CI: 0.68-0.74]. Using spatial transcriptomics data, we identified MSMB as the genome-wide top-most predictive gene to distinguish benign regions from high grade cancer regions that had five-fold lower MSMB expression. 
Additionally, ten proteins that were associated with prostate cancer risk mapped to existing therapeutic interventions. Conclusion Our findings emphasize the importance of proteomics for improving our understanding of prostate cancer etiology and of opportunities for novel therapeutic interventions. Additionally, we demonstrate the added benefit of in-depth functional analyses to triangulate the role of risk proteins in the clinical aggressiveness of prostate tumors. Using these integrated methods, we identify a subset of risk proteins associated with aggressive and early onset disease as priorities for investigation for the future prevention and treatment of prostate cancer.
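The odds ratios in the abstract above are reported with 95% confidence intervals, from which an approximate standard error and z-score can be recovered on the log scale. The following is a minimal sketch, assuming the intervals are symmetric Wald intervals on the log-odds scale (the abstract does not state how they were computed); the function name `log_or_summary` is hypothetical and the PPA2 figures are taken from the abstract.

```python
import math

def log_or_summary(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover log(OR), its approximate standard error, and a z-score.

    Assumes a symmetric Wald-type 95% interval on the log scale,
    which is an assumption and not something the abstract confirms.
    """
    log_or = math.log(or_point)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    return log_or, se, z

# PPA2 and aggressive disease: OR = 2.13, 95% CI 1.54-2.93 (from the abstract).
log_or, se, z = log_or_summary(2.13, 1.54, 2.93)
print(f"log(OR) = {log_or:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

The same helper can be applied to any of the other reported estimates; a narrow interval such as the one for MSMB implies a small standard error and a correspondingly large z-score.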
Affiliation(s)
- Trishna A Desai
- Cancer Epidemiology Unit, Oxford Population Health, University of Oxford, Oxford, United Kingdom
| | - Åsa K Hedman
- External Science and Innovation, Pfizer Worldwide Research, Development and Medical, Stockholm, Sweden
- Department of Medicine, Department of Medicine, Stockholm, Sweden
| | - Marios Dimitriou
- External Science and Innovation, Pfizer Worldwide Research, Development and Medical, Stockholm, Sweden
- Department of Medicine, Department of Medicine, Stockholm, Sweden
| | - Mine Koprulu
- MRC Epidemiology Unit, University of Cambridge, United Kingdom
| | - Sandy Figiel
- University of Oxford, Nuffield Department of Surgical Sciences, Oxford, United Kingdom
| | - Wencheng Yin
- University of Oxford, Nuffield Department of Surgical Sciences, Oxford, United Kingdom
| | - Mattias Johansson
- Genomic Epidemiology Branch, International Agency for Research on Cancer (IARC-WHO), Lyon, France
| | - Eleanor L Watts
- Metabolic Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, United States of America
| | - Joshua R Atkins
- Cancer Epidemiology Unit, Oxford Population Health, University of Oxford, Oxford, United Kingdom
| | - Aleksandr V Sokolov
- Department of Surgical Sciences, Functional Pharmacology and Neuroscience Uppsala University, 75124 Uppsala, Sweden
| | - Helgi B Schiöth
- Department of Surgical Sciences, Functional Pharmacology and Neuroscience Uppsala University, 75124 Uppsala, Sweden
| | - Marc J Gunter
- Genomic Epidemiology Branch, International Agency for Research on Cancer (IARC-WHO), Lyon, France
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, St Mary's Campus, Norfolk Place, London, W2 1PG, United Kingdom
| | - Konstantinos K Tsilidis
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, St Mary's Campus, Norfolk Place, London, W2 1PG, United Kingdom
- Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
| | - Richard M Martin
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
- NIHR Bristol Biomedical Research Centre, Hospitals Bristol and Weston NHS Foundation Trust and the University of Bristol, Bristol, United Kingdom
| | - Maik Pietzner
- MRC Epidemiology Unit, University of Cambridge, United Kingdom
- Computational Medicine, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Precision Healthcare University Research Institute, Queen Mary University of London, London, United Kingdom
| | - Claudia Langenberg
- MRC Epidemiology Unit, University of Cambridge, United Kingdom
- Computational Medicine, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Precision Healthcare University Research Institute, Queen Mary University of London, London, United Kingdom
| | - Ian G Mills
- University of Oxford, Nuffield Department of Surgical Sciences, Oxford, United Kingdom
| | - Alastair D Lamb
- University of Oxford, Nuffield Department of Surgical Sciences, Oxford, United Kingdom
| | - Anders Mälarstig
- External Science and Innovation, Pfizer Worldwide Research, Development and Medical, Stockholm, Sweden
- Department of Medicine, Department of Medicine, Stockholm, Sweden
| | - Tim J Key
- Cancer Epidemiology Unit, Oxford Population Health, University of Oxford, Oxford, United Kingdom
| | - Ruth C Travis
- Cancer Epidemiology Unit, Oxford Population Health, University of Oxford, Oxford, United Kingdom
| | - Karl Smith-Byrne
- Cancer Epidemiology Unit, Oxford Population Health, University of Oxford, Oxford, United Kingdom
| |
|
23
|
Irajizad E, Kenney A, Tang T, Vykoukal J, Wu R, Murage E, Dennison JB, Sans M, Long JP, Loftus M, Chabot JA, Kluger MD, Kastrinos F, Brais L, Babic A, Jajoo K, Lee LS, Clancy TE, Ng K, Bullock A, Genkinger JM, Maitra A, Do KA, Yu B, Wolpin BM, Hanash S, Fahrmann JF. A blood-based metabolomic signature predictive of risk for pancreatic cancer. Cell Rep Med 2023; 4:101194. [PMID: 37729870 PMCID: PMC10518621 DOI: 10.1016/j.xcrm.2023.101194] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Revised: 12/20/2022] [Accepted: 08/21/2023] [Indexed: 09/22/2023]
Abstract
Emerging evidence implicates microbiome involvement in the development of pancreatic cancer (PaCa). Here, we investigate whether increases in circulating microbial-related metabolites associate with PaCa risk by applying metabolomics profiling to 172 sera collected within 5 years prior to PaCa diagnosis and 863 matched non-subject sera from participants in the Prostate, Lung, Colorectal, and Ovarian (PLCO) cohort. We develop a three-marker microbial-related metabolite panel to assess 5-year risk of PaCa. The addition of five non-microbial metabolites further improves 5-year risk prediction of PaCa. The combined metabolite panel complements CA19-9, and individuals with a combined metabolite panel + CA19-9 score in the top 2.5th percentile have absolute 5-year risk estimates of >13%. The risk prediction model based on circulating microbial and non-microbial metabolites provides a potential tool to identify individuals at high risk of PaCa that would benefit from surveillance and/or from potential cancer interception strategies.
Affiliation(s)
- Ehsan Irajizad
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Ana Kenney
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
| | - Tiffany Tang
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
| | - Jody Vykoukal
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Ranran Wu
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Eunice Murage
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Jennifer B Dennison
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Marta Sans
- Division of Gastroenterology, Hepatology and Endoscopy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - James P Long
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Maureen Loftus
- Dana-Farber Brigham and Women's Cancer Center, Division of Gastrointestinal Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - John A Chabot
- Division of Digestive and Liver Diseases, Columbia University Irving Medical Center and the Vagelos College of Physicians and Surgeons, New York, NY, USA
| | - Michael D Kluger
- Division of Digestive and Liver Diseases, Columbia University Irving Medical Center and the Vagelos College of Physicians and Surgeons, New York, NY, USA
| | - Fay Kastrinos
- Division of Digestive and Liver Diseases, Columbia University Irving Medical Center and the Vagelos College of Physicians and Surgeons, New York, NY, USA; Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Lauren Brais
- Dana-Farber Brigham and Women's Cancer Center, Division of Gastrointestinal Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Ana Babic
- Dana-Farber Brigham and Women's Cancer Center, Division of Gastrointestinal Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Kunal Jajoo
- Division of Gastroenterology, Hepatology and Endoscopy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Linda S Lee
- Division of Gastroenterology, Hepatology and Endoscopy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Thomas E Clancy
- Dana-Farber Brigham and Women's Cancer Center, Division of Surgical Oncology, Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Kimmie Ng
- Dana-Farber Brigham and Women's Cancer Center, Division of Gastrointestinal Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Andrea Bullock
- Division of Hematology/Oncology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Jeanine M Genkinger
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA; Department of Epidemiology, Columbia Mailman School of Public Health, New York, NY, USA
| | - Anirban Maitra
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Kim-Anh Do
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Bin Yu
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
| | - Brian M Wolpin
- Dana-Farber Brigham and Women's Cancer Center, Division of Gastrointestinal Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Sam Hanash
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
| | - Johannes F Fahrmann
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
| |
|
24
|
Lyu WN, Lin MC, Shen CY, Chen LH, Lee YH, Chen SK, Lai LC, Chuang EY, Lou PJ, Tsai MH. An Oral Microbial Biomarker for Early Detection of Recurrence of Oral Squamous Cell Carcinoma. ACS Infect Dis 2023; 9:1783-1792. [PMID: 37565768 PMCID: PMC10496842 DOI: 10.1021/acsinfecdis.3c00269] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2023] [Indexed: 08/12/2023]
Abstract
Changes in the oral microbiome are associated with oral squamous cell carcinoma (OSCC). Oral microbe-derived signatures have been utilized as markers of OSCC. However, the structure of the oral microbiome during OSCC recurrence and biomarkers for the prediction of OSCC recurrence remains unknown. To identify OSCC recurrence-associated microbial biomarkers for the prediction of OSCC recurrence, we performed 16S rRNA amplicon sequencing on 54 oral swab samples from OSCC patients. Differences in bacterial compositions were observed in patients with vs without recurrence. We found that Granulicatella, Peptostreptococcus, Campylobacter, Porphyromonas, Oribacterium, Actinomyces, Corynebacterium, Capnocytophaga, and Dialister were enriched in OSCC recurrence. Functional analysis of the oral microbiome showed altered functions associated with OSCC recurrence compared with nonrecurrence. A random forest prediction model was constructed with five microbial signatures including Leptotrichia trevisanii, Capnocytophaga sputigena, Capnocytophaga, Cardiobacterium, and Olsenella to discriminate OSCC recurrence from original OSCC (accuracy = 0.963). Moreover, we validated the prediction model in another independent cohort (46 OSCC patients), achieving an accuracy of 0.761. We compared the accuracy of the prediction of OSCC recurrence between the five microbial signatures and two clinicopathological parameters, including resection margin and lymph node counts. The results predicted by the model with five microbial signatures showed a higher accuracy than those based on the clinical outcomes from the two clinicopathological parameters. This study demonstrated the validity of using recurrence-related microbial biomarkers, a noninvasive and effective method for the prediction of OSCC recurrence. Our findings may contribute to the prognosis and treatment of OSCC recurrence.
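The two reported accuracies can be related back to the cohort sizes given in the abstract above. The following is a minimal sketch, assuming each accuracy was computed over every sample in its cohort (54 swabs for model construction, 46 patients for validation); the abstract does not state the denominators, and the helper `closest_count` is hypothetical.

```python
def closest_count(accuracy, n):
    """Return the whole number of correct calls closest to accuracy * n,
    together with the accuracy that count would imply."""
    k = round(accuracy * n)
    return k, k / n

# Accuracy and cohort size as reported in the abstract; the denominators are assumed.
for label, acc, n in [("discovery", 0.963, 54), ("validation", 0.761, 46)]:
    k, implied = closest_count(acc, n)
    print(f"{label}: {k}/{n} correct gives accuracy {implied:.3f}")
```

Under that assumption, the reported figures correspond to 52 of 54 and 35 of 46 correct classifications, which shows how much each individual sample moves the accuracy in cohorts of this size.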
Affiliation(s)
- Wei-Ni Lyu
- Institute of Biotechnology, National Taiwan University, Taipei 10617, Taiwan
| | - Mei-Chun Lin
- Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
| | - Cheng-Ying Shen
- Institute of Biotechnology, National Taiwan University, Taipei 10617, Taiwan
| | - Li-Han Chen
- Institute of Fisheries Science, National Taiwan University, Taipei 10617, Taiwan
| | - Yung-Hua Lee
- Institute of Biotechnology, National Taiwan University, Taipei 10617, Taiwan
| | - Shin-Kuang Chen
- Center for Biotechnology, National Taiwan University, Taipei 10617, Taiwan
| | - Liang-Chuan Lai
- Graduate Institute of Physiology, College of Medicine, National Taiwan University, Taipei 10051, Taiwan
| | - Eric Y. Chuang
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 10617, Taiwan
| | - Pei-Jen Lou
- Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
| | - Mong-Hsun Tsai
- Institute of Biotechnology, National Taiwan University, Taipei 10617, Taiwan
| |
|
25
|
Sala M, Taylor A, Crochiere RJ, Zhang F, Forman EM. Application of machine learning to discover interactions predictive of dietary lapses. Appl Psychol Health Well Being 2023; 15:1166-1181. [PMID: 36573066 DOI: 10.1111/aphw.12432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2022] [Accepted: 12/05/2022] [Indexed: 12/28/2022]
Abstract
The purpose of this study is to build a machine learning model to predict dietary lapses with comparable accuracy, sensitivity, and specificity to previous literature while recovering predictor interactions. The sample for the current study consisted of merged data from two separate studies of individuals with obesity/overweight (total N = 87). Participants completed six ecological momentary assessment surveys per day where they were asked about 16 risk factors of lapse and if they had lapsed from their dietary prescriptions since the previous survey. Alcohol consumption and self-efficacy were the most prevalent in the top 10 stable interactions. Alcohol consumption decreased the protective effect of self-efficacy, motivation, and planning. Higher planning predicted higher risk for lapse only when consuming alcohol. Low motivation, hunger, cravings, and lack of healthy food availability increased the protective effect of self-efficacy. Higher self-efficacy increased the risk effect of positive mood and having recently eaten a meal on lapse. For individuals with lower levels of self-efficacy, planning increased the risk of lapse. Alcohol intake and self-efficacy interact with several variables to predict dietary lapses, and these interactions should be targeted in just-in-time adaptive interventions that deliver interventions for lapses.
Affiliation(s)
- Margaret Sala
- Ferkauf Graduate School of Psychology, Yeshiva University, New York, New York, USA
| | - Alexei Taylor
- Department of Psychology, Drexel University, Philadelphia, Pennsylvania, USA
| | - Rebecca J Crochiere
- Department of Psychology, Drexel University, Philadelphia, Pennsylvania, USA
- Center for Weight, Eating, and Lifestyle Science (WELL Center), Drexel University, Philadelphia, Pennsylvania, USA
| | - Fengqing Zhang
- Department of Psychology, Drexel University, Philadelphia, Pennsylvania, USA
- Center for Weight, Eating, and Lifestyle Science (WELL Center), Drexel University, Philadelphia, Pennsylvania, USA
| | - Evan M Forman
- Department of Psychology, Drexel University, Philadelphia, Pennsylvania, USA
- Center for Weight, Eating, and Lifestyle Science (WELL Center), Drexel University, Philadelphia, Pennsylvania, USA
| |
|
26
|
Pavicic M, Walker AM, Sullivan KA, Lagergren J, Cliff A, Romero J, Streich J, Garvin MR, Pestian J, McMahon B, Oslin DW, Beckham JC, Kimbrel NA, Jacobson DA. Using iterative random forest to find geospatial environmental and Sociodemographic predictors of suicide attempts. Front Psychiatry 2023; 14:1178633. [PMID: 37599888 PMCID: PMC10433206 DOI: 10.3389/fpsyt.2023.1178633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Accepted: 06/21/2023] [Indexed: 08/22/2023] Open
Abstract
Introduction Despite a recent global decrease in suicide rates, death by suicide has increased in the United States. It is therefore imperative to identify the risk factors associated with suicide attempts to combat this growing epidemic. In this study, we aim to identify potential risk factors of suicide attempt using geospatial features in an artificial intelligence framework. Methods We use iterative Random Forest, an explainable artificial intelligence method, to predict suicide attempts using data from the Million Veteran Program. This cohort incorporated 405,540 patients with 391,409 controls and 14,131 attempts. Our predictive model incorporates multiple climatic features at ZIP-code-level geospatial resolution. We additionally consider demographic features from the American Community Survey as well as the number of firearms and alcohol vendors per 10,000 people to assess the contributions of proximal environment, access to means, and restraint decrease to suicide attempts. In total, 1,784 features were included in the predictive model. Results Our results show that geographic areas with higher concentrations of married males living with spouses are predictive of lower rates of suicide attempts, whereas geographic areas where males are more likely to live alone and to rent housing are predictive of higher rates of suicide attempts. We also identified climatic features that were associated with suicide attempt risk by age group. Additionally, we observed that firearms and alcohol vendors were associated with increased risk for suicide attempts irrespective of the age group examined, but that their effects were small in comparison to the top features. Discussion Taken together, our findings highlight the importance of social determinants and environmental factors in understanding suicide risk among veterans.
Affiliation(s)
- Mirko Pavicic
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
| | - Angelica M. Walker
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, United States
| | - Kyle A. Sullivan
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
| | - John Lagergren
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
| | - Ashley Cliff
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, United States
| | - Jonathon Romero
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, United States
| | - Jared Streich
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
| | - Michael R. Garvin
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
| | - John Pestian
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
- Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, United States
| | - Benjamin McMahon
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM, United States
| | - David W. Oslin
- VISN 4 Mental Illness Research, Education, and Clinical Center, Center of Excellence, Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, United States
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Jean C. Beckham
- Durham Veterans Affairs Health Care System, Durham, NC, United States
- VA Mid-Atlantic Mental Illness, Research, Education, and Clinical Center, Seattle, WA, United States
- Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC, United States
| | - Nathan A. Kimbrel
- Durham Veterans Affairs Health Care System, Durham, NC, United States
- VA Mid-Atlantic Mental Illness, Research, Education, and Clinical Center, Seattle, WA, United States
- Duke University School of Medicine, Duke University, Durham, NC, United States
- VA Health Services Research and Development Center of Innovation to Accelerate Discovery and Practice Transformation, Durham, NC, United States
| | - Daniel A. Jacobson
- Oak Ridge National Laboratory, Computational and Predictive Biology, Oak Ridge, TN, United States
| |
|
27
|
Dougherty K, Zhao Y, Dunlop AL, Corwin E. Association between Sexual Activity during Pregnancy, Pre- and Early-Term Birth, and Vaginal Cytokine Inflammation: A Prospective Study of Black Women. Healthcare (Basel) 2023; 11:1995. [PMID: 37510436 PMCID: PMC10379435 DOI: 10.3390/healthcare11141995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 06/24/2023] [Accepted: 07/06/2023] [Indexed: 07/30/2023] Open
Abstract
This study aimed to investigate the association between sexual activity during pregnancy and adverse birth outcomes among Black women, and to explore whether vaginal cytokine inflammation mediates this association. Data from 397 Black pregnant women through questionnaires on sexual activity and vaginal biosamples during early (8-14 weeks) and late (24-30 weeks) pregnancy, and birth outcomes were analyzed. Using a data-driven approach, the study found that vaginal sex during late pregnancy was associated with spontaneous early-term birth (sETB, 38-39 completed weeks' gestation) (OR = 0.39, 95% CI: [0.21, 0.72], p-value = 0.003) but not with spontaneous preterm birth (sPTB) (OR = 1.08, p-value = 0.86) compared to full-term birth. Overall, despite vaginal sex in late pregnancy showing an overall positive effect on sETB (total effect = -0.1580, p-value = 0.015), we observed a negative effect of vaginal sex on sETB (indirect effect = 0.0313, p-value = 0.026) due to the fact that having vaginal sex could lead to elevated IL6 levels, which in turn increased the odds of sETB. In conclusion, the study found an overall positive association between sexual activity on ETB and a negative partial mediation effect via increased vaginal cytokine inflammation induced by vaginal sexual activity. This inconsistent mediation model suggested that vaginal sexual activity is a complex behavior that could have both positive and negative effects on the birth outcome.
Affiliation(s)
- Kylie Dougherty
- School of Nursing, Columbia University, New York, NY 10032, USA
| | - Yihong Zhao
- School of Nursing, Columbia University, New York, NY 10032, USA
| | - Anne L Dunlop
- School of Medicine, Emory University, Atlanta, GA 30322, USA
| | | |
|
28
|
Helleckes LM, Hemmerich J, Wiechert W, von Lieres E, Grünberger A. Machine learning in bioprocess development: from promise to practice. Trends Biotechnol 2023; 41:817-835. [PMID: 36456404 DOI: 10.1016/j.tibtech.2022.10.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 08/30/2022] [Revised: 10/20/2022] [Accepted: 10/27/2022] [Indexed: 11/30/2022]
Abstract
Fostered by novel analytical techniques, digitalization, and automation, modern bioprocess development provides large amounts of heterogeneous experimental data, containing valuable process information. In this context, data-driven methods like machine learning (ML) approaches have great potential to rationally explore large design spaces while exploiting experimental facilities most efficiently. Herein we demonstrate how ML methods have been applied so far in bioprocess development, especially in strain engineering and selection, bioprocess optimization, scale-up, monitoring, and control of bioprocesses. For each topic, we will highlight successful application cases, current challenges, and point out domains that can potentially benefit from technology transfer and further progress in the field of ML.
Affiliation(s)
- Laura M Helleckes
- Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
| | - Johannes Hemmerich
- Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
| | - Wolfgang Wiechert
- Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
| | - Eric von Lieres
- Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
| | - Alexander Grünberger
- Multiscale Bioengineering, Technical Faculty, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany; Center for Biotechnology (CeBiTec), Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany; Institute of Process Engineering in Life Sciences, Section III: Microsystems in Bioprocess Engineering, Karlsruhe Institute of Technology, Fritz-Haber-Weg 2, 76131, Karlsruhe, Germany.
| |
|
29
|
Midya V, Lane JM, Gennings C, Torres-Olascoaga LA, Wright RO, Arora M, Téllez-Rojo MM, Eggers S. Prenatal Pb exposure is associated with reduced abundance of beneficial gut microbial cliques in late childhood: an investigation using Microbial Co-occurrence Analysis (MiCA). MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.05.18.23290127. [PMID: 37293091 PMCID: PMC10246125 DOI: 10.1101/2023.05.18.23290127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Background Many analytical methods used in gut microbiome research focus on either single bacterial taxa or the whole microbiome, ignoring multi-bacteria relationships (microbial cliques). We present a novel analytical approach to identify multiple bacterial taxa within the gut microbiome of children at 9-11 years associated with prenatal Pb exposure. Methods Data came from a subset of participants (n=123) in the Programming Research in Obesity, Growth, Environment and Social Stressors (PROGRESS) cohort. Pb concentrations were measured in maternal whole blood from the second and third trimesters of pregnancy. Stool samples collected at 9-11 years old underwent metagenomic sequencing to assess the gut microbiome. Using a novel analytical approach, Microbial Co-occurrence Analysis (MiCA), we paired a machine-learning algorithm with randomization-based inference to first identify microbial cliques that were predictive of prenatal Pb exposure and then estimate the association between prenatal Pb exposure and microbial clique abundance. Results With second-trimester Pb exposure, we identified a 2-taxa microbial clique that included Bifidobacterium adolescentis and Ruminococcus callidus, and a 3-taxa clique that added Prevotella clara. Increasing second-trimester Pb exposure was associated with significantly increased odds of having the 2-taxa microbial clique below the 50th percentile relative abundance (OR=1.03,95%CI[1.01-1.05]). In an analysis of Pb concentration at or above vs. below the United States and Mexico guidelines for child Pb exposure, odds of the 2-taxa clique in low abundance were 3.36(95%CI[1.32-8.51]) and 6.11(95%CI[1.87-19.93]), respectively. Trends were similar with the 3-taxa clique but not statistically significant. 
Discussion Using a novel combination of machine-learning and causal-inference, MiCA identified a significant association between second-trimester Pb exposure and reduced abundance of a probiotic microbial clique within the gut microbiome in late childhood. Pb exposure levels at the guidelines for child Pb poisoning in the United States, and Mexico are not sufficient to protect against the potential loss of probiotic benefits.
Affiliation(s)
- V Midya
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - J M Lane
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - C Gennings
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - L A Torres-Olascoaga
- Center for Research on Nutrition and Health, National Institute of Public Health, Cuernavaca, Mexico
| | - R O Wright
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - M Arora
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - M M Téllez-Rojo
- Center for Research on Nutrition and Health, National Institute of Public Health, Cuernavaca, Mexico
| | - S Eggers
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Epidemiology, University of Iowa College of Public Health, Iowa City, Iowa, USA
| |
|
30
|
Aw A, Jin LC, Ioannidis N, Song YS. The Impact of Stability Considerations on Genetic Fine-Mapping. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.11.536456. [PMID: 37090514 PMCID: PMC10120703 DOI: 10.1101/2023.04.11.536456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Fine-mapping methods, which aim to identify genetic variants responsible for complex traits following genetic association studies, typically assume that sufficient adjustments for confounding within the association study cohort have been made, e.g., through regressing out the top principal components (i.e., residualization). Despite its widespread use, however, residualization may not completely remove all sources of confounding. Here, we propose a complementary stability-guided approach that does not rely on residualization, which identifies consistently fine-mapped variants across different genetic backgrounds or environments. We demonstrate the utility of this approach by applying it to fine-map eQTLs in the GEUVADIS data. Using 378 different functional annotations of the human genome, including recent deep learning-based annotations (e.g., Enformer), we compare enrichments of these annotations among variants for which the stability and traditional residualization-based fine-mapping approaches agree against those for which they disagree, and find that the stability approach enhances the power of traditional fine-mapping methods in identifying variants with functional impact. Finally, in cases where the two approaches report distinct variants, our approach identifies variants comparably enriched for functional annotations. Our findings suggest that the stability principle, as a conceptually simple device, complements existing approaches to fine-mapping, reinforcing recent advocacy of evaluating cross-population and cross-environment portability of biological findings. To support visualization and interpretation of our results, we provide a Shiny app, available at: https://alan-aw.shinyapps.io/stability_v0/.
Affiliation(s)
- Alan Aw
- Department of Statistics, University of California, Berkeley
- Center for Computational Biology, University of California, Berkeley
| | | | - Nilah Ioannidis
- Center for Computational Biology, University of California, Berkeley
- Computer Science Division, University of California, Berkeley
| | - Yun S. Song
- Department of Statistics, University of California, Berkeley
- Center for Computational Biology, University of California, Berkeley
- Computer Science Division, University of California, Berkeley
| |
|
31
|
Heinrich L, Kumbier K, Li L, Altschuler SJ, Wu LF. Selection of Optimal Cell Lines for High-Content Phenotypic Screening. ACS Chem Biol 2023; 18:679-685. [PMID: 36920184 PMCID: PMC10127200 DOI: 10.1021/acschembio.2c00878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
High-content microscopy offers a scalable approach to screen against multiple targets in a single pass. Prior work has focused on methods to select "optimal" cellular readouts in microscopy screens. However, methods to select optimal cell line models have garnered much less attention. Here, we provide a roadmap for how to select the cell line or lines that are best suited to identify bioactive compounds and their mechanism of action (MOA). We test our approach on compounds targeting cancer-relevant pathways, ranking cell lines in two tasks: detecting compound activity ("phenoactivity") and grouping compounds with similar MOA by similar phenotype ("phenosimilarity"). Evaluating six cell lines across 3214 well-annotated compounds, we show that optimal cell line selection depends on both the task of interest (e.g., detecting phenoactivity vs inferring phenosimilarity) and distribution of MOAs within the compound library. Given a task of interest and a set of compounds, we provide a systematic framework for choosing optimal cell line(s). Our framework can be used to reduce the number of cell lines required to identify hits within a compound library and help accelerate the pace of early drug discovery.
Affiliation(s)
- Louise Heinrich
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, United States
| | - Karl Kumbier
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, United States
| | - Li Li
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, United States
| | - Steven J Altschuler
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, United States
| | - Lani F Wu
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, United States
| |
|
32
|
Learning high-order interactions for polygenic risk prediction. PLoS One 2023; 18:e0281618. [PMID: 36763605 PMCID: PMC9916647 DOI: 10.1371/journal.pone.0281618] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Accepted: 01/27/2023] [Indexed: 02/11/2023] Open
Abstract
Within the framework of precision medicine, the stratification of individual genetic susceptibility based on inherited DNA variation has paramount relevance. However, one of the most relevant pitfalls of traditional Polygenic Risk Scores (PRS) approaches is their inability to model complex high-order non-linear SNP-SNP interactions and their effect on the phenotype (e.g. epistasis). Indeed, they incur in a computational challenge as the number of possible interactions grows exponentially with the number of SNPs considered, affecting the statistical reliability of the model parameters as well. In this work, we address this issue by proposing a novel PRS approach, called High-order Interactions-aware Polygenic Risk Score (hiPRS), that incorporates high-order interactions in modeling polygenic risk. The latter combines an interaction search routine based on frequent itemsets mining and a novel interaction selection algorithm based on Mutual Information, to construct a simple and interpretable weighted model of user-specified dimensionality that can predict a given binary phenotype. Compared to traditional PRSs methods, hiPRS does not rely on GWAS summary statistics nor any external information. Moreover, hiPRS differs from Machine Learning-based approaches that can include complex interactions in that it provides a readable and interpretable model and it is able to control overfitting, even on small samples. In the present work we demonstrate through a comprehensive simulation study the superior performance of hiPRS w.r.t. state of the art methods, both in terms of scoring performance and interpretability of the resulting model. We also test hiPRS against small sample size, class imbalance and the presence of noise, showcasing its robustness to extreme experimental settings. 
Finally, we apply hiPRS to a case study on real data from DACHS cohort, defining an interaction-aware scoring model to predict mortality of stage II-III Colon-Rectal Cancer patients treated with oxaliplatin.
|
33
|
Sha Z, Chen Y, Hu T. NSPA: characterizing the disease association of multiple genetic interactions at single-subject resolution. BIOINFORMATICS ADVANCES 2023; 3:vbad010. [PMID: 36818729 PMCID: PMC9927570 DOI: 10.1093/bioadv/vbad010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 01/02/2023] [Accepted: 02/02/2023] [Indexed: 02/10/2023]
Abstract
Motivation The interaction between genetic variables is one of the major barriers to characterizing the genetic architecture of complex traits. To consider epistasis, network science approaches are increasingly being used in research to elucidate the genetic architecture of complex diseases. Network science approaches associate genetic variables' disease susceptibility to their topological importance in the network. However, this network only represents genetic interactions and does not describe how these interactions attribute to disease association at the subject-scale. We propose the Network-based Subject Portrait Approach (NSPA) and an accompanying feature transformation method to determine the collective risk impact of multiple genetic interactions for each subject. Results The feature transformation method converts genetic variants of subjects into new values that capture how genetic variables interact with others to attribute to a subject's disease association. We apply this approach to synthetic and genetic datasets and learn that (1) the disease association can be captured using multiple disjoint sets of genetic interactions and (2) the feature transformation method based on NSPA improves predictive performance comparing with using the original genetic variables. Our findings confirm the role of genetic interaction in complex disease and provide a novel approach for gene-disease association studies to identify genetic architecture in the context of epistasis. Availability and implementation The codes of NSPA are now available in: https://github.com/MIB-Lab/Network-based-Subject-Portrait-Approach. Contact ting.hu@queensu.ca. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Zhendong Sha
- School of Computing, Queen’s University, Kingston, Ontario, Canada K7L 2N8
| | - Yuanzhu Chen
- School of Computing, Queen’s University, Kingston, Ontario, Canada K7L 2N8
| | - Ting Hu
- To whom correspondence should be addressed.
| |
|
34
|
Harfouche AL, Nakhle F, Harfouche AH, Sardella OG, Dart E, Jacobson D. A primer on artificial intelligence in plant digital phenomics: embarking on the data to insights journey. TRENDS IN PLANT SCIENCE 2023; 28:154-184. [PMID: 36167648 DOI: 10.1016/j.tplants.2022.08.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/04/2022] [Revised: 08/22/2022] [Accepted: 08/25/2022] [Indexed: 06/16/2023]
Abstract
Artificial intelligence (AI) has emerged as a fundamental component of global agricultural research that is poised to impact on many aspects of plant science. In digital phenomics, AI is capable of learning intricate structure and patterns in large datasets. We provide a perspective and primer on AI applications to phenome research. We propose a novel human-centric explainable AI (X-AI) system architecture consisting of data architecture, technology infrastructure, and AI architecture design. We clarify the difference between post hoc models and 'interpretable by design' models. We include guidance for effectively using an interpretable by design model in phenomic analysis. We also provide directions to sources of tools and resources for making data analytics increasingly accessible. This primer is accompanied by an interactive online tutorial.
Affiliation(s)
- Antoine L Harfouche
- Department for Innovation in Biological, Agro-Food, and Forest Systems, University of Tuscia, Viterbo, VT 01100, Italy.
| | - Farid Nakhle
- Department for Innovation in Biological, Agro-Food, and Forest Systems, University of Tuscia, Viterbo, VT 01100, Italy
| | - Antoine H Harfouche
- Unité de Formation et de Recherche en Sciences Économiques, Gestion, Mathématiques, et Informatique, Université Paris Nanterre, 92001 Nanterre, France
| | - Orlando G Sardella
- Department for Innovation in Biological, Agro-Food, and Forest Systems, University of Tuscia, Viterbo, VT 01100, Italy
| | - Eli Dart
- Energy Sciences Network (ESnet), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Daniel Jacobson
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| |
|
35
|
Heinrich L, Kumbier K, Li L, Altschuler SJ, Wu LF. Selection of optimal cell lines for high-content phenotypic screening. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.11.523662. [PMID: 36711978 PMCID: PMC9882115 DOI: 10.1101/2023.01.11.523662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2023]
Abstract
High-content microscopy offers a scalable approach to screen against multiple targets in a single pass. Prior work has focused on methods to select "optimal" cellular readouts in microscopy screens. However, methods to select optimal cell line models have garnered much less attention. Here, we provide a roadmap for how to select the cell line or lines that are best suited to identify bioactive compounds and their mechanism of action (MOA). We test our approach on compounds targeting cancer-relevant pathways, ranking cell lines in two tasks: detecting compound activity ("phenoactivity") and grouping compounds with similar MOA by similar phenotype ("phenosimilarity"). Evaluating six cell lines across 3214 well-annotated compounds, we show that optimal cell line selection depends on both the task of interest (e.g. detecting phenoactivity vs. inferring phenosimilarity) and distribution of MOAs within the compound library. Given a task of interest and set of compounds, we provide a systematic framework for choosing optimal cell line(s). Our framework can be used to reduce the number of cell lines required to identify hits within a compound library and help accelerate the pace of early drug discovery.
Affiliation(s)
- Louise Heinrich
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158, United States
| | - Karl Kumbier
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158, United States
| | - Li Li
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158, United States
| | - Steven J Altschuler
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158, United States
| | - Lani F Wu
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158, United States
| |
|
36
|
Interpreting tree ensemble machine learning models with endoR. PLoS Comput Biol 2022; 18:e1010714. [PMID: 36516158 PMCID: PMC9797088 DOI: 10.1371/journal.pcbi.1010714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 12/28/2022] [Accepted: 11/07/2022] [Indexed: 12/15/2022] Open
Abstract
Tree ensemble machine learning models are increasingly used in microbiome science as they are compatible with the compositional, high-dimensional, and sparse structure of sequence-based microbiome data. While such models are often good at predicting phenotypes based on microbiome data, they only yield limited insights into how microbial taxa may be associated. We developed endoR, a method to interpret tree ensemble models. First, endoR simplifies the fitted model into a decision ensemble. Then, it extracts information on the importance of individual features and their pairwise interactions, displaying them as an interpretable network. Both the endoR network and importance scores provide insights into how features, and interactions between them, contribute to the predictive performance of the fitted model. Adjustable regularization and bootstrapping help reduce the complexity and ensure that only essential parts of the model are retained. We assessed endoR on both simulated and real metagenomic data. We found endoR to have comparable accuracy to other common approaches while easing and enhancing model interpretation. Using endoR, we also confirmed published results on gut microbiome differences between cirrhotic and healthy individuals. Finally, we utilized endoR to explore associations between human gut methanogens and microbiome components. Indeed, these hydrogen consumers are expected to interact with fermenting bacteria in a complex syntrophic network. Specifically, we analyzed a global metagenome dataset of 2203 individuals and confirmed the previously reported association between Methanobacteriaceae and Christensenellales. Additionally, we observed that Methanobacteriaceae are associated with a network of hydrogen-producing bacteria. Our method accurately captures how tree ensembles use features and interactions between them to predict a response. 
As demonstrated by our applications, the resultant visualizations and summary outputs facilitate model interpretation and enable the generation of novel hypotheses about complex systems.
|
37
|
Towards convergence rate analysis of random forests for classification. ARTIF INTELL 2022. [DOI: 10.1016/j.artint.2022.103788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
38
|
Sievering AW, Wohlmuth P, Geßler N, Gunawardene MA, Herrlinger K, Bein B, Arnold D, Bergmann M, Nowak L, Gloeckner C, Koch I, Bachmann M, Herborn CU, Stang A. Comparison of machine learning methods with logistic regression analysis in creating predictive models for risk of critical in-hospital events in COVID-19 patients on hospital admission. BMC Med Inform Decis Mak 2022; 22:309. [PMID: 36437469 PMCID: PMC9702742 DOI: 10.1186/s12911-022-02057-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 11/17/2022] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Machine learning (ML) algorithms have been trained to early predict critical in-hospital events from COVID-19 using patient data at admission, but little is known on how their performance compares with each other and/or with statistical logistic regression (LR). This prospective multicentre cohort study compares the performance of a LR and five ML models on the contribution of influencing predictors and predictor-to-event relationships on prediction model's performance. METHODS We used 25 baseline variables of 490 COVID-19 patients admitted to 8 hospitals in Germany (March-November 2020) to develop and validate (75/25 random-split) 3 linear (L1 and L2 penalty, elastic net [EN]) and 2 non-linear (support vector machine [SVM] with radial kernel, random forest [RF]) ML approaches for predicting critical events defined by intensive care unit transfer, invasive ventilation and/or death (composite end-point: 181 patients). Models were compared for performance (area-under-the-receiver-operating characteristic-curve [AUC], Brier score) and predictor importance (performance-loss metrics, partial-dependence profiles). RESULTS Models performed close with a small benefit for LR (utilizing restricted cubic splines for non-linearity) and RF (AUC means: 0.763-0.731 [RF-L1]; Brier scores: 0.184-0.197 [LR-L1]). Top ranked predictor variables (consistently highest importance: C-reactive protein) were largely identical across models, except creatinine, which exhibited marginal (L1, L2, EN, SVM) or high/non-linear effects (LR, RF) on events. CONCLUSIONS Although the LR and ML models analysed showed no strong differences in performance and the most influencing predictors for COVID-19-related event prediction, our results indicate a predictive benefit from taking account for non-linear predictor-to-event relationships and effects. Future efforts should focus on leveraging data-driven ML technologies from static towards dynamic modelling solutions that continuously learn and adapt to changes in data environments during the evolving pandemic. TRIAL REGISTRATION NUMBER NCT04659187.
Affiliation(s)
| | - Peter Wohlmuth
- Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Proresearch, Research Institute, Hamburg, Germany
| | - Nele Geßler
- Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Proresearch, Research Institute, Hamburg, Germany; Department of Cardiology and Intensive Care Medicine, Asklepios Hospital St. Georg, Hamburg, Germany
| | - Melanie A Gunawardene
- Department of Cardiology and Intensive Care Medicine, Asklepios Hospital St. Georg, Hamburg, Germany
| | - Klaus Herrlinger
- Department of Internal Medicine, Asklepios Hospital Nord-Heidberg, Hamburg, Germany; Asklepios Tumorzentrum, Hamburg, Germany
| | - Berthold Bein
- Department of Anesthesiology and Intensive Care Medicine, Asklepios Hospital St. Georg, Hamburg, Germany
| | - Dirk Arnold
- Asklepios Tumorzentrum, Hamburg, Germany; Department of Hematology, Oncology, Palliative Care and Rheumatology, Asklepios Hospital Altona, Hamburg, Germany
| | - Martin Bergmann
- Department of Internal Medicine, Cardiology, and Pneumology, Asklepios Hospital Wandsbek, Hamburg, Germany
| | - Lorenz Nowak
- Department of Intensive Care and Ventilation Medicine, Asklepios Hospital München-Gauting, Gauting, Germany
| | - Christian Gloeckner
- Department of Internal Medicine, Asklepios Hospital Oberviechtach, Oberviechtach, Germany
| | - Ina Koch
- Biobank for Pulmonary Diseases, Asklepios Hospital München-Gauting, Gauting, Germany
| | - Martin Bachmann
- Department of Intensive Care and Ventilatory Medicine, Asklepios Hospital Harburg, Hamburg, Germany
| | - Christoph U Herborn
- Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Hospitals GmbH & Co. KGaA, Hamburg, Germany
| | - Axel Stang
- Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Tumorzentrum, Hamburg, Germany; Department of Hematology, Oncology and Palliative Care Medicine, Asklepios Hospital Barmbek, Rübenkamp 220, 22291, Hamburg, Germany.
| |
|
39
|
Kainer D, Templeton AR, Prates ET, Jacobson D, Allan ER, Climer S, Garvin MR. Structural variants identified using non-Mendelian inheritance patterns advance the mechanistic understanding of autism spectrum disorder. HGG ADVANCES 2022; 4:100150. [PMCID: PMC9634371 DOI: 10.1016/j.xhgg.2022.100150] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/09/2022] [Accepted: 10/03/2022] [Indexed: 11/06/2022] Open
Abstract
The heritability of autism spectrum disorder (ASD), based on 680,000 families and five countries, is estimated to be nearly 80%, yet heritability estimates reported from SNP-based studies are consistently lower, and few significant loci have been identified with genome-wide association studies. This gap in genomic information may reside in rare variants, interaction among variants (epistasis), or cryptic structural variation (SV) and may provide mechanisms that underlie ASD. Here we use a method to identify potential SVs based on non-Mendelian inheritance patterns in pedigrees using parent-child genotypes from ASD families and demonstrate that they are enriched in ASD-risk genes. Most are in non-coding genic space and are over-represented in expression quantitative trait loci, suggesting that they affect gene regulation, which we confirm with their overlap of differentially expressed genes in postmortem brain tissue of ASD individuals. We then identify an SV in the GRIK2 gene that alters RNA splicing and a regulatory region of the ACMSD gene in the kynurenine pathway as significantly associated with a non-verbal ASD phenotype, supporting our hypothesis that these currently excluded loci can provide a clearer mechanistic understanding of ASD. Finally, we use an explainable artificial intelligence approach to define subgroups, demonstrating their use in the context of precision medicine.
Affiliation(s)
- David Kainer
- Computational Systems Biology, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Alan R. Templeton
- Department of Biology, Washington University – St Louis, St. Louis, MO, USA
| | - Erica T. Prates
- Computational Systems Biology, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Daniel Jacobson
- Computational Systems Biology, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | | | - Sharlee Climer
- Department of Computer Science, University of Missouri, St. Louis, MO, USA
| | - Michael R. Garvin
- Computational Systems Biology, Oak Ridge National Laboratory, Oak Ridge, TN, USA; Williwaw Biosciences, LLC, Clarkston, MI, USA; Corresponding author
| |
|
40
|
Soil Metabolomics Predict Microbial Taxa as Biomarkers of Moisture Status in Soils from a Tidal Wetland. Microorganisms 2022; 10:microorganisms10081653. [PMID: 36014071 PMCID: PMC9416152 DOI: 10.3390/microorganisms10081653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 08/12/2022] [Accepted: 08/12/2022] [Indexed: 11/16/2022] Open
Abstract
We present observations from a laboratory-controlled study on the impacts of extreme wetting and drying on a wetland soil microbiome. Our approach was to experimentally challenge the soil microbiome to understand impacts on anaerobic carbon cycling processes as the system transitions from dryness to saturation and vice-versa. Specifically, we tested for impacts on stress responses related to shifts from wet to drought conditions. We used a combination of high-resolution data for small organic chemical compounds (metabolites) and biological (community structure based on 16S rRNA gene sequencing) features. Using a robust correlation-independent data approach, we further tested the predictive power of soil metabolites for the presence or absence of taxa. Here, we demonstrate that taking an untargeted, multidimensional data approach to the interpretation of metabolomics has the potential to indicate the causative pathways selecting for the observed bacterial community structure in soils.
|
41
|
Nie F, Wang L, Huang Y, Yang P, Gong P, Feng Q, Yang C. Characteristics of Microbial Distribution in Different Oral Niches of Oral Squamous Cell Carcinoma. Front Cell Infect Microbiol 2022; 12:905653. [PMID: 36046741 PMCID: PMC9421053 DOI: 10.3389/fcimb.2022.905653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 04/28/2022] [Indexed: 11/15/2022] Open
Abstract
Oral squamous cell carcinoma (OSCC), one of the most common malignant tumors of the head and neck, is closely associated with the presence of oral microbes. However, the microbiomes of different oral niches in OSCC patients and their association with OSCC have not been adequately characterized. In this study, 305 samples were collected from 65 OSCC patients, including tumor tissue, adjacent normal tissue (paracancerous tissue), cancer surface tissue, anatomically matched contralateral normal mucosa, saliva, and tongue coat. 16S ribosomal DNA (16S rDNA) sequencing was used to compare the microbial composition, distribution, and co-occurrence network of different oral niches. The association between the microbiome and the clinical features of OSCC was also characterized. The oral microbiome of OSCC patients showed a regular ecological distribution. Tumor and paracancerous tissues were more microbially diverse than other oral niches. Cancer surface, contralateral normal mucosa, saliva, and tongue coat showed similar microbial compositions, especially the contralateral normal mucosa and saliva. Periodontitis-associated bacteria of the genera Fusobacterium, Prevotella, Porphyromonas, Campylobacter, and Aggregatibacter, and anaerobic bacteria were enriched in tumor samples. The microbiome was highly correlated with tumor clinicopathological features, with several genera (Lautropia, Asteroleplasma, Parvimonas, Peptostreptococcus, Pyramidobacter, Roseburia, and Propionibacterium) demonstrating a relatively high diagnostic power for OSCC metastasis, potentially providing an indicator for the development of OSCC.
Affiliation(s)
- Fujiao Nie
- Department of Periodontology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Shandong Key Laboratory of Oral Tissue Regeneration & Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, China
| | - Lihua Wang
- Department of Periodontology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Department of Human Microbiome, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Yingying Huang
- Department of Oral and Maxillofacial Surgery, Qilu Hospital of Shandong University, Jinan, China
- Institute of Stomatology, Shandong University, Jinan, China
| | - Pishan Yang
- Department of Periodontology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Shandong Key Laboratory of Oral Tissue Regeneration & Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, China
| | - Pizhang Gong
- Department of Periodontology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Shandong Key Laboratory of Oral Tissue Regeneration & Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, China
| | - Qiang Feng
- Department of Periodontology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Department of Human Microbiome, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- *Correspondence: Qiang Feng; Chengzhe Yang
| | - Chengzhe Yang
- Department of Oral and Maxillofacial Surgery, Qilu Hospital of Shandong University, Jinan, China
- Institute of Stomatology, Shandong University, Jinan, China
- *Correspondence: Qiang Feng; Chengzhe Yang
| |
|
42
|
Kornblith AE, Singh C, Devlin G, Addo N, Streck CJ, Holmes JF, Kuppermann N, Grupp-Phelan J, Fineman J, Butte AJ, Yu B. Predictability and stability testing to assess clinical decision instrument performance for children after blunt torso trauma. PLOS DIGITAL HEALTH 2022; 1:e0000076. [PMID: 36812570 PMCID: PMC9931266 DOI: 10.1371/journal.pdig.0000076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 06/14/2022] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The Pediatric Emergency Care Applied Research Network (PECARN) has developed a clinical-decision instrument (CDI) to identify children at very low risk of intra-abdominal injury. However, the CDI has not been externally validated. We sought to vet the PECARN CDI with the Predictability Computability Stability (PCS) data science framework, potentially increasing its chance of a successful external validation. MATERIALS & METHODS We performed a secondary analysis of two prospectively collected datasets: PECARN (12,044 children from 20 emergency departments) and an independent external validation dataset from the Pediatric Surgical Research Collaborative (PedSRC; 2,188 children from 14 emergency departments). We used PCS to reanalyze the original PECARN CDI along with new interpretable PCS CDIs developed using the PECARN dataset. External validation was then measured on the PedSRC dataset. RESULTS Three predictor variables (abdominal wall trauma, Glasgow Coma Scale Score <14, and abdominal tenderness) were found to be stable. A CDI using only these three variables would achieve lower sensitivity than the original PECARN CDI with seven variables on internal PECARN validation but achieve the same performance on external PedSRC validation (sensitivity 96.8% and specificity 44%). Using only these variables, we developed a PCS CDI which had a lower sensitivity than the original PECARN CDI on internal PECARN validation but performed the same on external PedSRC validation (sensitivity 96.8% and specificity 44%). CONCLUSION The PCS data science framework vetted the PECARN CDI and its constituent predictor variables prior to external validation. We found that the 3 stable predictor variables represented all of the PECARN CDI's predictive performance on independent external validation. The PCS framework offers a less resource-intensive method than prospective validation to vet CDIs before external validation. 
We also found that the PECARN CDI will generalize well to new populations and should be prospectively externally validated. The PCS framework offers a potential strategy to increase the chance of a successful (costly) prospective validation.
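The sensitivity and specificity figures quoted in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a child is flagged as "not very low risk" when any of the three stable predictors is present; the abstract does not state how the predictors are combined. The `Child` record, the `flags_risk` rule, and the toy data are hypothetical and are not taken from the PECARN or PedSRC datasets.

```python
from dataclasses import dataclass


@dataclass
class Child:
    # The three stable predictors named in the abstract.
    abdominal_wall_trauma: bool
    gcs_below_14: bool
    abdominal_tenderness: bool
    # Reference-standard outcome (hypothetical label for this sketch).
    injury: bool


def flags_risk(c: Child) -> bool:
    # Assumed rule: any predictor present means "not very low risk".
    return c.abdominal_wall_trauma or c.gcs_below_14 or c.abdominal_tenderness


def sensitivity_specificity(children):
    tp = sum(1 for c in children if c.injury and flags_risk(c))
    fn = sum(1 for c in children if c.injury and not flags_risk(c))
    tn = sum(1 for c in children if not c.injury and not flags_risk(c))
    fp = sum(1 for c in children if not c.injury and flags_risk(c))
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    return tp / (tp + fn), tn / (tn + fp)


# Toy cohort, invented for illustration only.
toy = [
    Child(True, False, False, True),
    Child(False, False, True, True),
    Child(False, False, False, True),   # injury missed by the rule
    Child(False, False, False, False),
    Child(False, True, False, False),   # flagged without injury
    Child(False, False, False, False),
    Child(False, False, False, False),
]
sens, spec = sensitivity_specificity(toy)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

A rule of this kind trades specificity for sensitivity: flagging on any single predictor misses few injuries but also flags many uninjured children.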
Affiliation(s)
- Aaron E. Kornblith
- Department of Emergency Medicine, University of California, San Francisco, San Francisco, United States of America
- Department of Pediatrics, University of California, San Francisco, San Francisco, United States of America
| | - Chandan Singh
- Department of Electrical Engineering & Computer Science, University of California, Berkeley, Berkeley, United States of America
| | - Gabriel Devlin
- Department of Pediatrics, University of California, San Francisco, San Francisco, United States of America
| | - Newton Addo
- Department of Emergency Medicine, University of California, San Francisco, San Francisco, United States of America
| | - Christian J. Streck
- Department of Surgery, Medical University of South Carolina, Children’s Hospital, Charleston, United States of America
| | - James F. Holmes
- Department of Emergency Medicine, University of California, Davis, Davis, United States of America
| | - Nathan Kuppermann
- Department of Emergency Medicine, University of California, Davis, Davis, United States of America
- Department of Pediatrics, University of California, Davis, Davis, United States of America
| | - Jacqueline Grupp-Phelan
- Department of Emergency Medicine, University of California, San Francisco, San Francisco, United States of America
- Department of Pediatrics, University of California, San Francisco, San Francisco, United States of America
| | - Jeffrey Fineman
- Department of Pediatrics, University of California, San Francisco, San Francisco, United States of America
| | - Atul J. Butte
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, United States of America
| | - Bin Yu
- Department of Electrical Engineering & Computer Science, University of California, Berkeley, Berkeley, United States of America
- Departments of Statistics, University of California, Berkeley, Berkeley, United States of America
- * E-mail:
| |
|
43
|
Using Explainable Artificial Intelligence to Discover Interactions in an Ecological Model for Obesity. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19159447. [PMID: 35954804 PMCID: PMC9367834 DOI: 10.3390/ijerph19159447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 07/26/2022] [Accepted: 07/30/2022] [Indexed: 02/05/2023]
Abstract
Ecological theories suggest that environmental, social, and individual factors interact to cause obesity. Yet, many analytic techniques, such as multilevel modeling, require manual specification of interacting factors, making them inept in their ability to search for interactions. This paper shows evidence that an explainable artificial intelligence approach, commonly employed in genomics research, can address this problem. The method entails using random intersection trees to decode interactions learned by random forest models. Here, this approach is used to extract interactions between features of a multi-level environment from random forest models of waist-to-height ratios using 11,112 participants from the Adolescent Brain Cognitive Development study. This study shows that methods used to discover interactions between genes can also discover interacting features of the environment that impact obesity. This new approach to modeling ecosystems may help shine a spotlight on combinations of environmental features that are important to obesity, as well as other health outcomes.
|
44
|
Kundu A, Fu R, Grace D, Logie C, Abramovich A, Baskerville B, Yager C, Schwartz R, Mitsakakis N, Planinac L, Chaiton M. Correlates of past year suicidal thoughts among sexual and gender minority young adults: A machine learning analysis. J Psychiatr Res 2022; 152:269-277. [PMID: 35759979 DOI: 10.1016/j.jpsychires.2022.06.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/15/2021] [Revised: 05/27/2022] [Accepted: 06/07/2022] [Indexed: 01/14/2023]
Abstract
Sexual and gender minority populations are at elevated risk of experiencing suicidal thoughts and attempting suicide. The COVID-19 pandemic exacerbated mental health and substance use challenges among this population. We aimed to examine the relative importance and effects of intersectional factors and strong interactions associated with the risk of suicidal thoughts among Canadian lesbian, gay, bisexual, transgender, queer, questioning, intersex and Two Spirit (LGBTQI2S+) young adults. A cross-sectional online survey was conducted among LGBTQI2S+ participants aged 16-29 years living in two Canadian provinces (Ontario, Quebec). Among 1414 participants (mean age 21.90 years), 61% (n = 857) reported suicidal thoughts in the last 12 months. We built a random forest model to predict the risk of having past year suicidal thoughts, which achieved high performance with an area under the receiver operating characteristic curve (AUC) of 0.84. The top 10 correlates identified were: seeking help from health professionals for mental health or substance use issues since the start of the pandemic, current self-rated mental health status, insulted by parents or adults in childhood, ever heard that identifying as LGBTQI2S+ is not normal, age in years, past week feeling depressed, lifetime diagnosis of mental illness, lifetime diagnosis of depressive disorder, past week feeling sad, ever pretended to be straight or cisgender to be accepted. The increase in the risk of suicidal thoughts for those having mental health challenges or facing minority stressors is more pronounced in those living in urban areas or being unemployed than in those living in rural areas or being employed.
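The AUC reported in this abstract can be read as the probability that a randomly chosen participant with the outcome receives a higher predicted risk than a randomly chosen participant without it. The sketch below computes that quantity directly from pairwise comparisons. It is a generic illustration, not the authors' pipeline; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predictions, invented for illustration only.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the sample size, which is fine for a few thousand survey responses; a rank-based formula gives the same value more efficiently on larger data.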
Affiliation(s)
- Anasua Kundu
- Institute of Medical Science, University of Toronto, Toronto, Canada; Centre for Addiction and Mental Health, Toronto, Canada; Ontario Tobacco Research Unit, University of Toronto, Toronto, Canada.
| | - Rui Fu
- Department of Otolaryngology-Head and Neck Surgery, Sunnybrook Research Institute, University of Toronto, Toronto, Canada; Dalla Lana School of Public Health, University of Toronto, Canada
| | - Daniel Grace
- Dalla Lana School of Public Health, University of Toronto, Canada
| | - Carmen Logie
- Factor-Inwentash Faculty of Social Work, University of Toronto, Canada; United Nations University Institute for Water, Environment & Health, Hamilton, Canada
| | - Alex Abramovich
- Centre for Addiction and Mental Health, Toronto, Canada; Dalla Lana School of Public Health, University of Toronto, Canada
| | - Bruce Baskerville
- Canadian Institutes of Health Research, Ottawa, Canada; School of Pharmacy, Faculty of Science, University of Waterloo, Kitchener, Canada
| | | | - Robert Schwartz
- Centre for Addiction and Mental Health, Toronto, Canada; Ontario Tobacco Research Unit, University of Toronto, Toronto, Canada; Dalla Lana School of Public Health, University of Toronto, Canada
| | - Nicholas Mitsakakis
- Dalla Lana School of Public Health, University of Toronto, Canada; Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada
| | - Lynn Planinac
- Ontario Tobacco Research Unit, University of Toronto, Toronto, Canada
| | - Michael Chaiton
- Centre for Addiction and Mental Health, Toronto, Canada; Ontario Tobacco Research Unit, University of Toronto, Toronto, Canada; Dalla Lana School of Public Health, University of Toronto, Canada
| |
|
45
|
Chen S, Gao C, Zhang P. Incorporation of Data-mined Knowledge into Black-box SVM for Interpretability. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3548775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
The lack of interpretability often makes black-box models challenging to be applied in many practical domains. For this reason, the current work, from the black-box model input port, proposes to incorporate data-mined knowledge into the black-box soft-margin SVM model to enhance accuracy and interpretability. The concept and incorporation mechanism of data-mined knowledge are successively developed, based on which a partially interpretable soft-margin SVM (pTsm-SVM) optimization model is designed and then solved through reformulating the optimization problem as standard quadratic programming. An algorithm for mining linear positive (negative) class knowledge from general data sets is also proposed, which generates a linear two-dimensional discriminative rule with specificity (sensitivity) equal to 1 and the highest possible sensitivity (specificity) among all two-dimensional feature spaces. The knowledge-integrated pTsm-SVM works by achieving a good trade-off among the “large margin”, “high specificity”, and “high sensitivity”. Our experimental results on eight UCI datasets demonstrate the superiority of the proposed pTsm-SVM over the standard soft-margin SVM both in terms of accuracy and interpretability.
Affiliation(s)
| | | | - Ping Zhang
- School of Mathematical Sciences, Zhejiang University, China
| |
|
46
|
Hornung R, Boulesteix AL. Interaction forests: Identifying and exploiting interpretable quantitative and qualitative interaction effects. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
47
|
Zhang F, Gou J. Machine learning assessment of risk factors for depression in later adulthood. THE LANCET REGIONAL HEALTH. EUROPE 2022; 18:100399. [PMID: 35586270 PMCID: PMC9109181 DOI: 10.1016/j.lanepe.2022.100399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Affiliation(s)
- Fengqing Zhang
- Department of Psychological and Brain Sciences, Drexel University, 3201 Chestnut Street, Philadelphia PA 19104, USA
| | - Jiangtao Gou
- Department of Mathematics and Statistics, Villanova University, 800 E. Lancaster Ave. Villanova, PA 19085, USA
| |
|
48
|
Wang Z, Niu Y, Vashisth T, Li J, Madden R, Livingston TS, Wang Y. Nontargeted metabolomics-based multiple machine learning modeling boosts early accurate detection for citrus Huanglongbing. HORTICULTURE RESEARCH 2022; 9:uhac145. [PMID: 36061619 PMCID: PMC9433982 DOI: 10.1093/hr/uhac145] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/06/2022] [Accepted: 06/20/2022] [Indexed: 06/15/2023]
Abstract
Early accurate detection of crop disease is extremely important for timely disease management. Huanglongbing (HLB), one of the most destructive citrus diseases, has brought about severe economic losses for the global citrus industry. The direct strategies for HLB identification, such as quantitative real-time polymerase chain reaction (qPCR) and chemical staining, are robust for symptomatic plants but powerless for asymptomatic ones at the early stage of infection. Thus, a practical method for the early detection of HLB is needed. In this study, a novel method combining ultra-high performance liquid chromatography/mass spectrometry (UHPLC/MS)-based nontargeted metabolomics and machine learning (ML) was developed for the early detection of HLB for the first time. Six ML algorithms were selected to build the classifiers. Regularized logistic regression (LR-L2) and gradient-boosted decision tree (GBDT) performed best, with the highest average accuracy of 95.83%, and were able not only to classify healthy and infected plants but also to identify significant features. The proposed method proved to be practical for early detection of HLB: it addresses the low sensitivity of conventional methods and avoids problems such as lighting-condition interference in spectrum/image recognition-based ML methods. Additionally, the discovered biomarkers were verified by metabolic pathway analysis and content change analysis, and the results were consistent with previous reports.
Affiliation(s)
- Zhixin Wang
- Citrus Research & Education Center, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, Florida 33850-2299, U.S.A
| | - Yue Niu
- Department of Mathematics, University of Arizona, Tucson, Arizona 85721-0089, U.S.A
| | - Tripti Vashisth
- Citrus Research & Education Center, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, Florida 33850-2299, U.S.A
| | - Jingwen Li
- Citrus Research & Education Center, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, Florida 33850-2299, U.S.A
| | - Robert Madden
- Citrus Research & Education Center, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, Florida 33850-2299, U.S.A
| | - Taylor Shea Livingston
- Citrus Research & Education Center, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, Florida 33850-2299, U.S.A
| | - Yu Wang
- Corresponding author
| |
|
49
|
Walker AM, Cliff A, Romero J, Shah MB, Jones P, Felipe Machado Gazolla JG, Jacobson DA, Kainer D. Evaluating the Performance of Random Forest and Iterative Random Forest Based Methods when Applied to Gene Expression Data. Comput Struct Biotechnol J 2022; 20:3372-3386. [PMID: 35832622 PMCID: PMC9260260 DOI: 10.1016/j.csbj.2022.06.037] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/04/2022] [Revised: 06/14/2022] [Accepted: 06/14/2022] [Indexed: 11/30/2022] Open
Abstract
Gene-to-gene networks, such as Gene Regulatory Networks (GRN) and Predictive Expression Networks (PEN), capture relationships between genes and are beneficial for use in downstream biological analyses. There exist multiple network inference tools to produce these gene-to-gene networks from matrices of gene expression data. Random Forest-Leave One Out Prediction (RF-LOOP) is a method that has been shown to be efficient at producing these gene-to-gene networks, frequently known as GEne Network Inference with Ensemble of trees (GENIE3). Random Forest can be replaced in this process by iterative Random Forest (iRF), which performs variable selection and boosting. Here we validate that iterative Random Forest-Leave One Out Prediction (iRF-LOOP) produces higher quality networks than GENIE3 (RF-LOOP). We use both synthetic and empirical networks from the Dialogue for Reverse Engineering Assessment and Methods (DREAM) Challenges by Sage Bionetworks, as well as two additional empirical networks created from Arabidopsis thaliana and Populus trichocarpa expression data.
Affiliation(s)
- Angelica M. Walker
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, 821 Volunteer Blvd, Knoxville 37996, TN, USA
| | - Ashley Cliff
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, 821 Volunteer Blvd, Knoxville 37996, TN, USA
| | - Jonathon Romero
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, 821 Volunteer Blvd, Knoxville 37996, TN, USA
| | - Manesh B. Shah
- Computational and Predictive Biology, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge 37830, TN, USA
| | - Piet Jones
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, 821 Volunteer Blvd, Knoxville 37996, TN, USA
| | | | - Daniel A Jacobson
- Computational and Predictive Biology, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge 37830, TN, USA
- Corresponding authors.
| | - David Kainer
- Computational and Predictive Biology, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge 37830, TN, USA
- Corresponding authors.
| |
|
50
|
Provable Boolean interaction recovery from tree ensemble obtained via random forests. Proc Natl Acad Sci U S A 2022; 119:e2118636119. [PMID: 35609192 PMCID: PMC9295780 DOI: 10.1073/pnas.2118636119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open
Abstract
Significance: Random Forests (RFs) are among the most successful machine-learning algorithms in terms of prediction accuracy. In many domain problems, however, the primary goal is not prediction, but to understand the data-generation process, in particular finding important features and feature interactions. There exists strong empirical evidence that RF-based methods, in particular iterative RF (iRF), are very successful in terms of detecting feature interactions. In this work, we propose a biologically motivated, Boolean interaction model. Using this model, we complement the existing empirical evidence with theoretical evidence for the ability of iRF-type methods to select desirable interactions. Our theoretical analysis also yields deeper insights into the general interaction selection mechanism of decision-tree algorithms and the importance of feature subsampling.
|