1
Ha E, Ha SM, Gerelkhuu Z, Kim HY, Yoon TH. AI-based nanotoxicity data extraction and prediction of nanotoxicity. Comput Struct Biotechnol J 2025; 29:138-148. [PMID: 40255458 PMCID: PMC12008667 DOI: 10.1016/j.csbj.2025.03.052] [Received: 12/24/2024] [Revised: 03/28/2025] [Accepted: 03/31/2025] [Indexed: 04/22/2025]
Abstract
With the growing use of nanomaterials (NMs), assessing their toxicity has become increasingly important. Among toxicity assessment methods, computational models for predicting nanotoxicity are emerging as alternatives to traditional in vitro and in vivo assays, which involve high costs and ethical concerns. As a result, the importance of both data quality and data quantity is now widely recognized. However, collecting large, high-quality datasets is both time-consuming and labor-intensive. Artificial intelligence (AI)-based data extraction techniques hold significant potential for extracting and organizing information from unstructured text, but the use of large language models (LLMs) and prompt engineering for nanotoxicity data extraction has not been widely studied. In this study, we developed an AI-based automated data extraction pipeline to facilitate efficient data collection. The automation process was implemented using the Python-based LangChain framework. We used 216 nanotoxicity research articles as training data to refine prompts and evaluate LLM performance. Subsequently, the most suitable LLM with refined prompts was used to extract test data from 605 research articles. Data extraction performance on the training data achieved an F1_D.E. (F1 score for Data Extraction) ranging from 84.6% to 87.6% across different LLMs. Furthermore, using the dataset extracted from the test set, we constructed automated machine learning (AutoML) models that achieved an F1_N.P. (F1 score for Nanotoxicity Prediction) exceeding 86.1% in predicting nanotoxicity. Additionally, we assessed the reliability and applicability of the models by comparing them in terms of ground truth, size, and balance. This study highlights the potential of AI-based data extraction, representing a significant contribution to nanotoxicity research.
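The abstract reports F1 scores for data extraction but does not say how extracted values were matched against the ground truth. As a rough illustration only, the sketch below computes a field-level F1 for a single record using the standard definition (harmonic mean of precision and recall). The function names, the field names, and the exact-match rule are all assumptions made for this example, not the authors' evaluation protocol.

```python
def extraction_counts(extracted: dict, truth: dict) -> tuple[int, int, int]:
    """Count true positives, false positives, and false negatives.

    Assumption: a field is correct only if its value matches the ground
    truth exactly. A wrong value counts as both a false positive (a bad
    value was produced) and a false negative (the true value was missed).
    """
    tp = sum(1 for key, value in extracted.items() if key in truth and truth[key] == value)
    fp = len(extracted) - tp
    fn = len(truth) - tp
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical record: field names and values are invented for illustration.
truth = {"material": "TiO2", "size_nm": 21, "cell_line": "A549", "assay": "MTT"}
extracted = {"material": "TiO2", "size_nm": 25, "cell_line": "A549"}

tp, fp, fn = extraction_counts(extracted, truth)
print(f"TP={tp} FP={fp} FN={fn} F1={f1_score(tp, fp, fn):.3f}")
```

In this toy record two of the three extracted fields are correct, one value is wrong, and one field is missing, so precision and recall differ and the F1 score sits between them.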
Affiliation(s)
- Eunyong Ha
  - Department of Chemistry, Hanyang University, Seoul 04763, Republic of Korea
- Seung Min Ha
  - Department of Chemistry, Hanyang University, Seoul 04763, Republic of Korea
- Zayakhuu Gerelkhuu
  - Research Institute for Convergence of Basic Science, Hanyang University, Seoul 04763, Republic of Korea
  - Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Republic of Korea
- Hyun-Yi Kim
  - NGeneS Inc., Ansan-si 15495, Republic of Korea
- Tae Hyun Yoon
  - Department of Chemistry, Hanyang University, Seoul 04763, Republic of Korea
  - Research Institute for Convergence of Basic Science, Hanyang University, Seoul 04763, Republic of Korea
  - Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Republic of Korea
  - Yoon Idea Lab. Co. Ltd., Seoul 04763, Republic of Korea
2
Varsou DD, Kolokathis PD, Antoniou M, Sidiropoulos NK, Tsoumanis A, Papadiamantis AG, Melagraki G, Lynch I, Afantitis A. In silico assessment of nanoparticle toxicity powered by the Enalos Cloud Platform: Integrating automated machine learning and synthetic data for enhanced nanosafety evaluation. Comput Struct Biotechnol J 2024; 25:47-60. [PMID: 38646468 PMCID: PMC11026727 DOI: 10.1016/j.csbj.2024.03.020] [Received: 02/09/2024] [Revised: 03/22/2024] [Accepted: 03/23/2024] [Indexed: 04/23/2024]
Abstract
The rapid advance of nanotechnology has led to the development and widespread application of nanomaterials, raising concerns regarding their potential adverse effects on human health and the environment. Traditional (experimental) methods for assessing the safety of nanoparticles (NPs) are time-consuming, expensive, and resource-intensive, and raise ethical concerns due to their reliance on animals. To address these challenges, we propose an in silico workflow that incorporates state-of-the-art computational methodologies and serves as an alternative or complementary approach to conventional hazard and risk assessment strategies. In this study we present an automated machine learning (autoML) scheme that employs dose-response toxicity data for silver (Ag), titanium dioxide (TiO2), and copper oxide (CuO) NPs. This model is further enriched with atomistic descriptors to capture the NPs' underlying structural properties. To overcome the issue of limited data availability, synthetic data generation techniques are used. These techniques broaden the dataset and thus improve the representation of different NP classes. A key aspect of this approach is a novel three-step applicability domain method (which includes the development of a local similarity approach) that enhances user confidence in the results by evaluating the reliability of each prediction. We anticipate that this approach will significantly expedite the nanosafety assessment process, enabling regulation to keep pace with innovation, and will provide valuable insights for the design and development of safe and sustainable NPs. The ML model developed in this study is made available to the scientific community as an easy-to-use web service through the Enalos Cloud Platform (www.enaloscloud.novamechanics.com/sabydoma/safenanoscope/), facilitating broader access and collaborative advancements in nanosafety.
Affiliation(s)
- Dimitra-Danai Varsou
  - NovaMechanics MIKE, Piraeus 18545, Greece
  - Entelos Institute, Larnaca 6059, Cyprus
- Andreas Tsoumanis
  - Entelos Institute, Larnaca 6059, Cyprus
  - NovaMechanics Ltd, Nicosia 1070, Cyprus
- Anastasios G. Papadiamantis
  - Entelos Institute, Larnaca 6059, Cyprus
  - NovaMechanics Ltd, Nicosia 1070, Cyprus
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, B15 2TT Birmingham, UK
- Georgia Melagraki
  - Division of Physical Sciences and Applications, Hellenic Military Academy, Vari 16672, Greece
- Iseult Lynch
  - Entelos Institute, Larnaca 6059, Cyprus
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, B15 2TT Birmingham, UK
- Antreas Afantitis
  - NovaMechanics MIKE, Piraeus 18545, Greece
  - Entelos Institute, Larnaca 6059, Cyprus
  - NovaMechanics Ltd, Nicosia 1070, Cyprus
3
Ruebel ML, Gilley SP, Yeruva L, Tang M, Frank DN, Garcés A, Figueroa L, Lan RS, Assress HA, Kemp JF, Westcott JLE, Hambidge KM, Shankar K, Krebs NF. Associations between maternal microbiome, metabolome and incidence of low-birth weight in Guatemalan participants from the Women First Trial. Front Microbiol 2024; 15:1456087. [PMID: 39473842 PMCID: PMC11518777 DOI: 10.3389/fmicb.2024.1456087] [Received: 06/28/2024] [Accepted: 09/13/2024] [Indexed: 04/05/2025]
Abstract
Background: Low birth weight (LBW; <2,500 g) affects approximately 15 to 20 percent of global births annually and is associated with suboptimal child development. Recent studies suggest a link between the maternal gut microbiome and poor obstetric and perinatal outcomes. The goal of this study was to examine how maternal microbial taxa, fecal metabolites, and maternal anthropometry relate to the incidence of LBW in resource-limited settings.
Methods: This was a secondary analysis of the Women First trial conducted in a semi-rural region of Guatemala. Maternal weight was measured at 12 and 34 weeks (wk) of gestation. Infant anthropometry measures were collected within 48 h of delivery. Maternal fecal samples at 12 and 34 weeks were used for microbiome (16S rRNA gene amplicon sequencing) and metabolomics analysis (34 wk). Linear mixed models using the MaAsLin2 package were utilized to assess changes in the microbiome associated with LBW. Predictive models using gradient boosted machines (XGBoost) were developed using the H2O.ai engine.
Results: No differences in β-diversity were observed at either time point between mothers with LBW infants and mothers with normal weight (NW) infants. Simpson diversity at 12 and 34 weeks was lower in mothers with LBW infants. Notable differences in genus-level abundance between LBW and NW mothers (p < 0.05) were observed at 12 weeks, with increased abundances of Barnesiella, Faecalibacterium, Sutterella, and Bacteroides. At 34 weeks, there were lower abundances of Megasphaera, Phascolarctobacterium, and Turicibacter and higher abundances of Bacteroides and Fusobacterium in mothers with LBW infants. Fecal metabolites related to bile acids, tryptophan metabolism, and fatty acids were altered in mothers with LBW infants. Classification models to predict LBW based on maternal anthropometry and predicted microbial functions showed moderate performance.
Conclusion: Collectively, the findings indicate that alterations in the maternal microbiome and metabolome were associated with LBW. Future research should target functional and predictive roles of the maternal gut microbiome in infant birth outcomes including birthweight.
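The results refer to Simpson diversity without stating which variant was computed. As a minimal sketch, the code below uses the Gini-Simpson form, 1 minus the sum of squared relative abundances, which is one common convention; the choice of variant, the function name, and the taxon counts are all assumptions for illustration and are not taken from the study.

```python
def simpson_diversity(counts: list[int]) -> float:
    """Gini-Simpson index: 1 - sum(p_i^2) over taxon relative abundances.

    Ranges from 0 (a single taxon dominates completely) toward 1
    (many taxa with even abundances). Assumed variant; other Simpson
    forms (sum(p_i^2) or its inverse) are also in common use.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical genus-level read counts for two samples.
even_sample = [25, 25, 25, 25]      # four equally abundant genera
skewed_sample = [97, 1, 1, 1]       # one genus dominates

print(f"even:   {simpson_diversity(even_sample):.4f}")
print(f"skewed: {simpson_diversity(skewed_sample):.4f}")
```

Under this convention a lower value means a community dominated by fewer taxa, which is the direction reported for mothers with LBW infants.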
Affiliation(s)
- Meghan L. Ruebel
  - Microbiome and Metabolism Research Unit, USDA-ARS, Southeast Area USDA-ARS, Little Rock, AR, United States
  - Arkansas Children's Nutrition Center, Little Rock, AR, United States
- Stephanie P. Gilley
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States
- Laxmi Yeruva
  - Microbiome and Metabolism Research Unit, USDA-ARS, Southeast Area USDA-ARS, Little Rock, AR, United States
  - Arkansas Children's Nutrition Center, Little Rock, AR, United States
- Minghua Tang
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States
- Daniel N. Frank
  - Department of Medicine, Division of Infectious Disease, University of Colorado School of Medicine, Aurora, CO, United States
- Ana Garcés
  - Maternal Infant Health Center, Instituto de Nutrición de Centro América y Panamá (INCAP), Guatemala City, Guatemala
- Lester Figueroa
  - Maternal Infant Health Center, Instituto de Nutrición de Centro América y Panamá (INCAP), Guatemala City, Guatemala
- Renny S. Lan
  - Arkansas Children's Nutrition Center, Little Rock, AR, United States
  - Department of Pediatrics, Section of Developmental Nutrition, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Hailemariam Abrha Assress
  - Arkansas Children's Nutrition Center, Little Rock, AR, United States
  - Department of Pediatrics, Section of Developmental Nutrition, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Jennifer F. Kemp
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States
- Jamie L. E. Westcott
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States
- K. Michael Hambidge
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States
- Kartik Shankar
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States
- Nancy F. Krebs
  - Department of Pediatrics, Section of Nutrition, University of Colorado School of Medicine, Aurora, CO, United States