1
Wu L, Xu J, Tong W. PERform: assessing model performance with predictivity and explainability readiness formula. J Environ Sci Health C Toxicol Carcinog 2024:1-16. [PMID: 38619534 DOI: 10.1080/26896583.2024.2340391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
In the rapidly evolving field of artificial intelligence (AI), explainability has traditionally been assessed in a post-modeling process and is often subjective. In contrast, many quantitative metrics are routinely used to assess a model's performance. We proposed a unified formula named PERForm, which incorporates explainability as a weight into existing statistical metrics to provide an integrated and quantitative measure of both predictivity and explainability to guide model selection, application, and evaluation. PERForm was designed as a generic formula and can be applied to any data type. We applied PERForm to a range of diverse datasets, including DILIst, Tox21, and three MAQC-II benchmark datasets, using various modeling algorithms to predict a total of 73 distinct endpoints. For example, AdaBoost exhibited superior performance in DILIst prediction (PERForm AUC of 0.129 for AdaBoost versus 0 for linear regression), whereas linear regression outperformed other models in the majority of Tox21 endpoints (PERForm AUC of 0.301 for linear regression versus 0.283 for AdaBoost on average). This research marks a significant step toward comprehensively evaluating the utility of an AI model to advance transparency and interpretability, where the tradeoff between a model's performance and its interpretability can have profound implications.
Affiliation(s)
- Leihong Wu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR, USA
2
Gong B, Lababidi S, Kusko R, Bouri K, Prezek S, Thovarai V, Prasanna A, Maier EJ, Golkaram M, Sun X, Kyriakidis K, Kitajima JP, Ebrahim Sahraeian SM, Guo Y, Johanson E, Jones W, Tong W, Xu J. Towards accurate indel calling for oncopanel sequencing through an international pipeline competition at precisionFDA. Sci Rep 2024; 14:8165. [PMID: 38589653 PMCID: PMC11001604 DOI: 10.1038/s41598-024-58573-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 04/01/2024] [Indexed: 04/10/2024] [Open Access]
Abstract
Accurately calling indels with next-generation sequencing (NGS) data is critical for clinical application. The precisionFDA team collaborated with the U.S. Food and Drug Administration's (FDA's) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, to evaluate the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. The performance of many other pipelines was close to the top performers, which produced a top cluster of performers. The performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have occurred that affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines had good performance. Some of them performed generally well across all three oncopanels, while others were better for a specific oncopanel. The performance of indel calling can further be improved by restricting the calls to high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). Certain VAF cut-offs could be applied according to the applications.
Affiliation(s)
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, Office of Research, National Center for Toxicological Research, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Samir Lababidi
  - Health Informatics Staff, Office of Data, Analytics, and Research, Office of Digital Transformation, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Rebecca Kusko
  - Cellino Biotech, 750 Main Street, Cambridge, MA, 02143, USA
- Khaled Bouri
  - Office of Regulatory Science and Innovation, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Yunfei Guo
  - Roche Sequencing Solutions, Santa Clara, CA, 95050, USA
- Elaine Johanson
  - Health Informatics Staff, Office of Data, Analytics, and Research, Office of Digital Transformation, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Wendell Jones
  - Q squared Solutions Genomics, 2400 Ellis Road, Durham, NC, 27703, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, Office of Research, National Center for Toxicological Research, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, Office of Research, National Center for Toxicological Research, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA.
3
Wu L, Xu J, Thakkar S, Gray M, Qu Y, Li D, Tong W. A framework enabling LLMs into regulatory environment for transparency and trustworthiness and its application to drug labeling document. Regul Toxicol Pharmacol 2024; 149:105613. [PMID: 38570021 DOI: 10.1016/j.yrtph.2024.105613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 03/18/2024] [Accepted: 03/26/2024] [Indexed: 04/05/2024]
Abstract
Regulatory agencies consistently deal with extensive document reviews, ranging from product submissions to both internal and external communications. Large Language Models (LLMs) like ChatGPT can be invaluable tools for these tasks, but they present several challenges, particularly the handling of proprietary information, the combination of customized functions with specific review needs, and the transparency and explainability of the model's output. Hence, a localized and customized solution is imperative. To tackle these challenges, we formulated a framework named AskFDALabel for FDA drug labeling documents, which are a crucial resource in the FDA drug review process. AskFDALabel operates within a secure IT environment and comprises two key modules: a semantic search module (Module S) and a Q&A/text-generation module (Module T). Module S is built on word embeddings to enable comprehensive semantic queries within labeling documents. Module T utilizes a tuned LLM to generate responses based on references from Module S. As a result, our framework enabled small LLMs to perform comparably to ChatGPT as a computationally inexpensive solution for regulatory application. To conclude, through AskFDALabel, we have showcased a pathway that harnesses LLMs to support agency operations within a secure environment, offering tailored functions for the needs of regulatory research.
Affiliation(s)
- Leihong Wu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA.
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA
- Shraddha Thakkar
  - Office of Translational Sciences, Center for Drug Evaluation and Research (CDER), US FDA, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA
- Magnus Gray
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA
- Yanyan Qu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA
- Dongying Li
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA.
4
Connor S, Li T, Qu Y, Roberts RA, Tong W. Generation of a drug-induced renal injury list to facilitate the development of new approach methodologies for nephrotoxicity. Drug Discov Today 2024; 29:103938. [PMID: 38432353 DOI: 10.1016/j.drudis.2024.103938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 02/16/2024] [Accepted: 02/27/2024] [Indexed: 03/05/2024]
Abstract
Drug-induced renal injury (DIRI) causes >1.5 million adverse events annually in the USA alone. Although standard biomarkers exist for DIRI, they lack the sensitivity or specificity to detect nephrotoxicity before the significant loss of renal function. In this study, we describe the creation of DIRIL - a list of drugs associated with DIRI and nephrotoxicity - from two literature datasets with DIRI annotation, confirmed using FDA drug labeling. DIRIL comprises 317 orally administered drugs covering all 14 anatomical, therapeutic and chemical (ATC) classification categories. Of the 317 drugs, 171 were DIRI-positive and 146 were DIRI-negative. DIRIL will be a relevant and invaluable resource for discovery of new approach methods (NAMs) to predict the occurrence and possible severity of DIRI earlier in drug development.
Affiliation(s)
- Skylar Connor
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA
- Ting Li
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA
- Yanyan Qu
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA
- Ruth A Roberts
  - ApconiX, Alderley Park, Alderley Edge SK10 4TG, UK; University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
- Weida Tong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA.
5
Gray M, Samala R, Liu Q, Skiles D, Xu J, Tong W, Wu L. Measurement and Mitigation of Bias in Artificial Intelligence: A Narrative Literature Review for Regulatory Science. Clin Pharmacol Ther 2024; 115:687-697. [PMID: 38018360 DOI: 10.1002/cpt.3117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 11/21/2023] [Indexed: 11/30/2023]
Abstract
Artificial intelligence (AI) is increasingly being used in decision making across various industries, including the public health arena. Bias in any decision-making process can significantly skew outcomes, and AI systems have been shown to exhibit biases at times. The potential for AI systems to perpetuate and even amplify biases is a growing concern. Bias, as used in this paper, refers to the tendency toward a particular characteristic or behavior, and thus, a biased AI system is one that shows biased associations between entities. In this literature review, we examine the current state of research on AI bias, including its sources, as well as the methods for measuring, benchmarking, and mitigating it. We also examine the biases and methods of mitigation specifically relevant to the healthcare field and offer a perspective on bias measurement and mitigation in regulatory science decision making.
Affiliation(s)
- Magnus Gray
  - Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
- Ravi Samala
  - Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, US Food and Drug Administration Center for Devices and Radiological Health, Silver Spring, Maryland, USA
- Qi Liu
  - Office of Clinical Pharmacology, Office of Translational Sciences, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Denny Skiles
  - Office of Management, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
- Joshua Xu
  - Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
- Weida Tong
  - Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
- Leihong Wu
  - Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
6
Tong W, Renaudin M. Context is everything in regulatory application of large language models (LLMs). Drug Discov Today 2024; 29:103916. [PMID: 38364998 DOI: 10.1016/j.drudis.2024.103916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 02/07/2024] [Accepted: 02/08/2024] [Indexed: 02/18/2024]
7
Gong B, Li D, Zhang Y, Kusko R, Lababidi S, Cao Z, Chen M, Chen N, Chen Q, Chen Q, Dai J, Gan Q, Gao Y, Guo M, Hariani G, He Y, Hou W, Jiang H, Kushwaha G, Li JL, Li J, Li Y, Liu LC, Liu R, Liu S, Meriaux E, Mo M, Moore M, Moss TJ, Niu Q, Patel A, Ren L, Saremi NF, Shang E, Shang J, Song P, Sun S, Urban BJ, Wang D, Wang S, Wen Z, Xiong X, Yang J, Yin L, Zhang C, Zhang R, Bhandari A, Cai W, Eterovic AK, Megherbi DB, Shi T, Suo C, Yu Y, Zheng Y, Novoradovskaya N, Sears RL, Shi L, Jones W, Tong W, Xu J. Extend the benchmarking indel set by manual review using the individual cell line sequencing data from the Sequencing Quality Control 2 (SEQC2) project. Sci Rep 2024; 14:7028. [PMID: 38528062 DOI: 10.1038/s41598-024-57439-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 03/18/2024] [Indexed: 03/27/2024] [Open Access]
Abstract
Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.
Affiliation(s)
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Dan Li
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Yifan Zhang
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Rebecca Kusko
  - Cellino Bio, 750 Main Street, Cambridge, MA, 02143, USA
- Samir Lababidi
  - Office of Data Analytics and Research, Office of Digital Transformation, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Zehui Cao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Mingyang Chen
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Ning Chen
  - iGeneTech Bioscience Co., Ltd., 8 Shengmingyuan Rd., Changping, Beijing, China
- Qiaochu Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Qingwang Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Jiacheng Dai
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Qiang Gan
  - Clinical Diagnostics Division, Thermo Fisher Scientific, 46500 Kato Rd., Fremont, CA, 94538, USA
- Yuechen Gao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Mingkun Guo
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Gunjan Hariani
  - Q squared Solutions Genomics, 2400 Ellis Road, Durham, NC, 27703, USA
- Yujie He
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- He Jiang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Garima Kushwaha
  - Guardant Health, Inc., 505 Penobscot Drive, Redwood City, CA, 94063, USA
- Jian-Liang Li
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
- Jianying Li
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
- Yulan Li
  - College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Liang-Chun Liu
  - Clinical Diagnostics Division, Thermo Fisher Scientific, 46500 Kato Rd., Fremont, CA, 94538, USA
- Ruimei Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Shiming Liu
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Edwin Meriaux
  - CMINDS Research Center, University of Massachusetts, Lowell, MA, 01854, USA
- Mengqing Mo
  - Department of Epidemiology, School of Public Health, Fudan University, Shanghai, 200032, China
- Tyler J Moss
  - Eurofins Viracor, LLC, 18000 W 99th St., Lenexa, KS, 66219, USA
- Quanne Niu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Ananddeep Patel
  - Eurofins Viracor Biopharma Services, Inc., 18000 W 99th St., Lenexa, KS, 66219, USA
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Nedda F Saremi
  - Agilent Technologies, Inc., 11011 N Torrey Pines Rd., La Jolla, CA, 92037, USA
- Erfei Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Jun Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Ping Song
  - Cancer Genomics Laboratory, Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX, 77030, USA
- Siqi Sun
  - ResearchDx, Irvine, CA, 92618, USA
- Brent J Urban
  - Eurofins Viracor Biopharma Services, Inc., 18000 W 99th St., Lenexa, KS, 66219, USA
- Danke Wang
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Shangzi Wang
  - State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
- Zhining Wen
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Xiangyi Xiong
  - College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Lihui Yin
  - PathGroup, Nashville, TN, 37217, USA
- Chao Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Ruolan Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Wanshi Cai
  - iGeneTech Bioscience Co., Ltd., 8 Shengmingyuan Rd., Changping, Beijing, China
- Agda Karina Eterovic
  - Eurofins Viracor Biopharma Services, Inc., 18000 W 99th St., Lenexa, KS, 66219, USA
- Dalila B Megherbi
  - CMINDS Research Center, University of Massachusetts, Lowell, MA, 01854, USA
- Tieliu Shi
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Chen Suo
  - Department of Epidemiology, School of Public Health, Fudan University, Shanghai, 200032, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Renee L Sears
  - Velsera, 6 Cityplace Dr Suite 550, Creve Coeur, MO, 63141, USA
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Wendell Jones
  - Q squared Solutions Genomics, 2400 Ellis Road, Durham, NC, 27703, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA.
8
Le H, Chen R, Harris S, Fang H, Lyn-Cook B, Hong H, Ge W, Rogers P, Tong W, Zou W. RxNorm for drug name normalization: a case study of prescription opioids in the FDA adverse events reporting system. Front Bioinform 2024; 3:1328613. [PMID: 38250436 PMCID: PMC10796552 DOI: 10.3389/fbinf.2023.1328613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 12/11/2023] [Indexed: 01/23/2024] [Open Access]
Abstract
Numerous studies have been conducted on the US Food and Drug Administration (FDA) Adverse Events Reporting System (FAERS) database to assess post-marketing reporting rates for drug safety review and risk assessment. However, the drug names in the adverse event (AE) reports from FAERS were heterogeneous due to a lack of uniformity in the information submitted mandatorily by pharmaceutical companies and voluntarily by patients, healthcare professionals, and the public. Studies using FAERS and other spontaneous AE reporting databases without drug name normalization may collect AE reports incompletely because of non-standard drug names, and the accuracy of their results might be affected. In this study, we demonstrated the applicability of RxNorm, developed by the National Library of Medicine, for drug name normalization in FAERS. Using prescription opioids as a case study, we used the RxNorm application program interface (API) to map all FDA-approved prescription opioids described in FAERS AE reports to their equivalent RxNorm Concept Unique Identifiers (RxCUIs) and RxNorm names. The different names of the opioids were then extracted, and their usage frequencies were calculated in a collection of more than 14.9 million AE reports for 13 FDA-approved prescription opioid classes, reported over 17 years. The results showed that a significant number of different names were consistently used for opioids in FAERS reports, with 2,086 different names (out of 7,892) used at least three times and 842 different names used at least ten times for each of the 92 RxNorm names of FDA-approved opioids. Our method of using RxNorm API mapping was confirmed to be efficient and accurate and capable of significantly reducing the heterogeneity of prescription opioid names in the AE reports in FAERS; meanwhile, it is expected to have broad application to different sets of drug names from any database where drug names are diverse and unnormalized. It is expected to be able to automatically standardize and link different representations of the same drugs to build an intact and high-quality database for diverse research, particularly postmarketing data analysis in pharmacovigilance initiatives.
Affiliation(s)
- Huyen Le
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
- Ru Chen
  - Office of Translational Science, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, United States
- Stephen Harris
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
- Hong Fang
  - Office of Scientific Coordination, Jefferson, AR, United States
- Beverly Lyn-Cook
  - Division of Biochemistry Toxicity, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
- Weigong Ge
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
- Paul Rogers
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
- Weida Tong
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
- Wen Zou
  - Division of Bioinformatics and Biostatistics, Jefferson, AR, United States
9
Ren L, Duan X, Dong L, Zhang R, Yang J, Gao Y, Peng R, Hou W, Liu Y, Li J, Yu Y, Zhang N, Shang J, Liang F, Wang D, Chen H, Sun L, Hao L, Scherer A, Nordlund J, Xiao W, Xu J, Tong W, Hu X, Jia P, Ye K, Li J, Jin L, Hong H, Wang J, Fan S, Fang X, Zheng Y, Shi L. Quartet DNA reference materials and datasets for comprehensively evaluating germline variant calling performance. Genome Biol 2023; 24:270. [PMID: 38012772 PMCID: PMC10680274 DOI: 10.1186/s13059-023-03109-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/08/2022] [Accepted: 11/13/2023] [Indexed: 11/29/2023] [Open Access]
Abstract
BACKGROUND: Genomic DNA reference materials are widely recognized as essential for ensuring data quality in omics research. However, relying solely on reference datasets to evaluate the accuracy of variant calling results is incomplete, as they are limited to benchmark regions. Therefore, it is important to develop DNA reference materials that enable the assessment of variant detection performance across the entire genome. RESULTS: We established a DNA reference material suite from four immortalized cell lines derived from a family of parents and monozygotic twins. Comprehensive reference datasets of 4.2 million small variants and 15,000 structural variants were integrated and certified for evaluating the reliability of germline variant calls inside the benchmark regions. Importantly, the genetic built-in-truth of the Quartet family design enables estimation of the precision of variant calls outside the benchmark regions. Using the Quartet reference materials along with study samples, batch effects are objectively monitored and alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Moreover, the matched RNA and protein reference materials and datasets from the Quartet project enable cross-omics validation of variant calls from multiomics data. CONCLUSIONS: The Quartet DNA reference materials and reference datasets provide a unique resource for objectively assessing the quality of germline variant calls throughout the whole-genome regions and improving the reliability of large-scale genomic profiling.
Affiliation(s)
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Xiaoke Duan
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Rui Zhang
  - National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
  - Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Yuechen Gao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Rongxue Peng
  - National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Yaqing Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Jingjing Li
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
  - Nextomics Biosciences Institute, Wuhan, Hubei, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Naixin Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Jun Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Fan Liang
  - Nextomics Biosciences Institute, Wuhan, Hubei, China
- Depeng Wang
  - Nextomics Biosciences Institute, Wuhan, Hubei, China
- Hui Chen
  - OrigiMed Co., Ltd, Shanghai, China
- Lele Sun
  - Sequanta Technologies Co., Ltd, Shanghai, China
- Andreas Scherer
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
  - EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
- Jessica Nordlund
  - EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Department of Medical Sciences, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Wenming Xiao
  - Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Xin Hu
- Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Peng Jia
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
| | - Li Jin
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Jing Wang
- National Institute of Metrology, Beijing, China
| | - Shaohua Fan
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
| | - Xiang Fang
- National Institute of Metrology, Beijing, China
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Shanghai Cancer Center, Fudan University, Shanghai, China
- International Human Phenome Institutes, Shanghai, China
| |
|
10
|
Chen X, Roberts R, Liu Z, Tong W. A generative adversarial network model alternative to animal studies for clinical pathology assessment. Nat Commun 2023; 14:7141. [PMID: 37932302 PMCID: PMC10628291 DOI: 10.1038/s41467-023-42933-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 10/25/2023] [Indexed: 11/08/2023] [Open Access]
Abstract
Animal studies are unavoidable in evaluating chemical and drug safety. Generative Adversarial Networks (GANs) can generate synthetic animal data by learning from legacy animal study results and may thus serve as an alternative approach to assess untested chemicals. AnimalGAN, a GAN method to simulate 38 rat clinical pathology measures, was developed and remained robust even for drugs that differ substantially from those used during training in terms of chemical structure, drug class, and year of FDA approval. AnimalGAN produced hepatotoxicity assessment results comparable to those obtained with real animal data and outperformed 12 conventional quantitative structure-activity relationship approaches. Using AnimalGAN, a virtual experiment of 100,000 rats ranked the hepatotoxicity of three structurally similar drugs in a trend similar to that observed in the human population. AnimalGAN represents a significant step with artificial intelligence towards the global effort in replacement, reduction, and refinement (3Rs) of animal use.
Affiliation(s)
- Xi Chen
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge, SK10 4TG, UK
- University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| | - Zhichao Liu
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, 72079, USA
- Currently working at Integrative Toxicology, Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, CT, 06877, USA
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, 72079, USA
| |
|
11
|
Qu Y, Li T, Liu Z, Li D, Tong W. DICTrank: The largest reference list of 1318 human drugs ranked by risk of drug-induced cardiotoxicity using FDA labeling. Drug Discov Today 2023; 28:103770. [PMID: 37714406 DOI: 10.1016/j.drudis.2023.103770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 08/28/2023] [Accepted: 09/08/2023] [Indexed: 09/17/2023]
Abstract
Drug-induced cardiotoxicity (DICT) is a leading cause of drug trial failure and discontinuation. Current drug annotations for cardiotoxicity largely focus on individual outcomes or mechanisms. Considering the broad spectrum of adverse cardiac events, we developed Drug-Induced Cardiotoxicity Rank (DICTrank) using FDA labeling and comprehensively classified 1318 human drugs into four categories: Most-DICT-Concern (n = 341), Less-DICT-Concern (n = 528), No-DICT-Concern (n = 343), and Ambiguous-DICT-Concern (n = 106). Notably, DICTrank covers diverse therapeutic categories, of which several were enriched with Most-DICT-Concern drugs, such as antineoplastic agents, sex hormones, anti-inflammatory drugs, beta-blockers, and cardiac therapy. DICTrank currently presents the largest drug list of DICT annotation, and it could contribute to the development of new approach methods, including AI models for early identification of DICT risk during drug development and beyond.
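As a quick arithmetic check on the category sizes quoted in the abstract above, the following sketch sums the four counts and derives each category's share of the list. The counts come from the abstract; the variable names are illustrative only.

```python
# Category sizes as reported in the DICTrank abstract.
dictrank_counts = {
    "Most-DICT-Concern": 341,
    "Less-DICT-Concern": 528,
    "No-DICT-Concern": 343,
    "Ambiguous-DICT-Concern": 106,
}

total_drugs = sum(dictrank_counts.values())

# Share of the full list that falls in each category.
category_share = {name: n / total_drugs for name, n in dictrank_counts.items()}

for name, share in category_share.items():
    print(f"{name}: {share:.1%}")
print("total:", total_drugs)
```

The four categories add up to the 1318 drugs stated in the title.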
Affiliation(s)
- Yanyan Qu
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA; University of Arkansas at Little Rock and University of Arkansas for Medical Sciences Joint Bioinformatics Program, Little Rock, AR, USA
| | - Ting Li
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Zhichao Liu
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Dongying Li
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| |
|
12
|
Wu L, Gray M, Dang O, Xu J, Fang H, Tong W. RxBERT: Enhancing drug labeling text mining and analysis with AI language modeling. Exp Biol Med (Maywood) 2023; 248:1937-1943. [PMID: 38166420 PMCID: PMC10798181 DOI: 10.1177/15353702231220669] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/30/2023] [Accepted: 11/02/2023] [Indexed: 01/04/2024] [Open Access]
Abstract
The US drug labeling document contains essential information on drug efficacy and safety, making it a crucial regulatory resource for Food and Drug Administration (FDA) drug reviewers. Because of its extensive volume and the presence of free text, conventional text mining analyses have encountered challenges in processing these data. Recent advances in artificial intelligence (AI) for natural language processing (NLP) have provided an unprecedented opportunity to identify key information from drug labeling, thereby enhancing safety reviews and support for regulatory decisions. We developed RxBERT, a Bidirectional Encoder Representations from Transformers (BERT) model pretrained on FDA human prescription drug labeling documents, for an enhanced application of drug labeling documents in both research and drug review. RxBERT was derived from BioBERT with further training on human prescription drug labeling documents. RxBERT was demonstrated in several tasks using regulatory datasets, including those involved in the National Institutes of Technology Text Analysis Challenge Dataset (NIST TAC dataset), the FDA Adverse Drug Event Evaluation Dataset (ADE Eval dataset), and the classification of texts from submission packages into labeling sections (US Drug Labeling dataset). RxBERT reached an F1-score of 86.5 in both TAC and ADE Eval classification and a prediction accuracy of 87% for the US Drug Labeling dataset. Overall, RxBERT was shown to be competitive with or better than other NLP approaches such as BERT and BioBERT. In summary, we developed RxBERT, a transformer-based model specific for drug labeling that outperformed the original BERT model. RxBERT has the potential to assist research scientists and FDA reviewers in better processing and utilizing drug labeling information toward the advancement of drug effectiveness and safety for public health. This proof-of-concept study also demonstrated a potential pathway to customized large language models (LLMs) tailored to sensitive regulatory documents for internal application.
Affiliation(s)
- Leihong Wu
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR 72079, USA
| | - Magnus Gray
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR 72079, USA
| | - Oanh Dang
- Office of Surveillance and Epidemiology, FDA Center for Drug Evaluation and Research, Silver Spring, MD 20993, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR 72079, USA
| | - Hong Fang
- Office of Scientific Coordination, FDA National Center for Toxicological Research, Jefferson, AR 72079, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR 72079, USA
| |
|
13
|
Wu J, Ouyang J, Qin H, Zhou J, Roberts R, Siam R, Wang L, Tong W, Liu Z, Shi T. PLM-ARG: antibiotic resistance gene identification using a pretrained protein language model. Bioinformatics 2023; 39:btad690. [PMID: 37995287 PMCID: PMC10676515 DOI: 10.1093/bioinformatics/btad690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 10/23/2023] [Accepted: 11/22/2023] [Indexed: 11/25/2023] [Open Access]
Abstract
MOTIVATION Antibiotic resistance presents a formidable global challenge to public health and the environment. While considerable effort has been dedicated to identifying antibiotic resistance genes (ARGs) for assessing the threat of antibiotic resistance, recent extensive investigations using metagenomic and metatranscriptomic approaches have unveiled a noteworthy concern. A significant fraction of proteins defies annotation through conventional sequence similarity-based methods, an issue that extends to ARGs, potentially leading to their under-recognition due to dissimilarities at the sequence level. RESULTS Herein, we proposed an Artificial Intelligence-powered ARG identification framework using a pretrained large protein language model, enabling ARG identification and resistance category classification simultaneously. The proposed PLM-ARG was developed based on the most comprehensive ARG and related resistance category information (>28K ARGs and 29 associated resistance categories), yielding a Matthews correlation coefficient (MCC) of 0.983 ± 0.001 under a 5-fold cross-validation strategy. Furthermore, the PLM-ARG model was verified using an independent validation set and achieved an MCC of 0.838, outperforming other publicly available ARG prediction tools with an improvement range of 51.8%-107.9%. Moreover, the utility of the proposed PLM-ARG model was demonstrated by annotating resistance in the UniProt database and evaluating the impact of ARGs on the Earth's environmental microbiota. AVAILABILITY AND IMPLEMENTATION PLM-ARG is available for academic purposes at https://github.com/Junwu302/PLM-ARG, and a user-friendly webserver (http://www.unimd.org/PLM-ARG) is also provided.
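For readers unfamiliar with the metric reported above, the sketch below computes the Matthews correlation coefficient for the binary case from confusion-matrix counts. It is a minimal illustration rather than the evaluation code used for PLM-ARG; the function name and the example counts are hypothetical, and the zero-denominator convention (returning 0.0) is an assumption.

```python
import math


def mcc_from_counts(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: undefined cases are reported as 0.0.
        return 0.0
    return numerator / denominator


# Hypothetical counts for an ARG vs. non-ARG classifier.
example_mcc = mcc_from_counts(tp=90, tn=85, fp=15, fn=10)
print(round(example_mcc, 3))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than accuracy.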
Affiliation(s)
- Jun Wu
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Jian Ouyang
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Haipeng Qin
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Jiajia Zhou
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge SK10 4TG, United Kingdom
- University of Birmingham, Birmingham B15 2TT, United Kingdom
| | - Rania Siam
- Biology Department, School of Sciences and Engineering, The American University in Cairo, New Cairo 11835, Egypt
| | - Lan Wang
- College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, United States
| | - Zhichao Liu
- Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc, Ridgefield, CT 06877, United States
| | - Tieliu Shi
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- School of Statistics, Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, East China Normal University, Shanghai 200062, China
| |
|
14
|
Le H, Hong H, Ge W, Francis H, Lyn-Cook B, Hwang YT, Rogers P, Tong W, Zou W. A systematic analysis and data mining of opioid-related adverse events submitted to the FAERS database. Exp Biol Med (Maywood) 2023; 248:1944-1951. [PMID: 38158803 PMCID: PMC10798186 DOI: 10.1177/15353702231211860] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/01/2023] [Accepted: 10/16/2023] [Indexed: 01/03/2024] [Open Access]
Abstract
The opioid epidemic has become a serious national crisis in the United States. An in-depth systematic analysis of opioid-related adverse events (AEs) can clarify the risks presented by opioid exposure, as well as the individual risk profiles of specific opioid drugs and the potential relationships among the opioids. In this study, 92 opioids were identified from the list of all Food and Drug Administration (FDA)-approved drugs, annotated by RxNorm, and classified into 13 opioid groups: buprenorphine, codeine, dihydrocodeine, fentanyl, hydrocodone, hydromorphone, meperidine, methadone, morphine, oxycodone, oxymorphone, tapentadol, and tramadol. A total of 14,970,399 AE reports were retrieved from the FDA Adverse Event Reporting System (FAERS) from 2004, Quarter 1 to 2020, Quarter 3. After data processing, the Empirical Bayes Geometric Mean (EBGM) was applied, identifying 3317 pairs of potential risk signals within the 13 opioid groups. Based on these potential safety signals, a comparative analysis was pursued to provide a global overview of opioid-related AEs for all 13 groups of FDA-approved prescription opioids. The top 10 most reported AEs for each opioid class were then presented. Both network analysis and hierarchical clustering analysis were conducted to further explore the relationship between opioids. Results from the network analysis revealed a close association among fentanyl, oxycodone, hydrocodone, and hydromorphone, which shared more than 22 AEs. In addition, much less commonly reported AEs were shared among dihydrocodeine, meperidine, oxymorphone, and tapentadol. The hierarchical clustering analysis, in turn, categorized the 13 opioid classes into two groups by comparing the full profiles of presence/absence of AEs. The results of the network analysis and the hierarchical clustering analysis were consistent, cross-validated each other, and provided a deeper understanding of the associations and relationships between the 13 opioid groups with respect to their adverse effect profiles.
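The shared-AE comparison described above can be sketched as follows, under the assumption that each opioid group is represented by the set of AEs flagged as signals for it. The group labels, AE labels, threshold, and function names are placeholders invented for illustration and do not reproduce the study's data or method. The Jaccard distance is included as one common choice for comparing presence/absence profiles before hierarchical clustering; the abstract does not state which distance was used.

```python
from itertools import combinations


def shared_ae_edges(signals, min_shared):
    """Return {(group_a, group_b): n_shared} for pairs sharing at least min_shared AEs."""
    edges = {}
    for a, b in combinations(sorted(signals), 2):
        n_shared = len(signals[a] & signals[b])
        if n_shared >= min_shared:
            edges[(a, b)] = n_shared
    return edges


def jaccard_distance(x, y):
    """1 - |x ∩ y| / |x ∪ y|; defined as 0.0 when both sets are empty."""
    union = x | y
    if not union:
        return 0.0
    return 1 - len(x & y) / len(union)


# Placeholder signal sets; not real FAERS results.
toy_signals = {
    "group_A": {"ae1", "ae2", "ae3"},
    "group_B": {"ae2", "ae3", "ae4"},
    "group_C": {"ae5"},
}

toy_edges = shared_ae_edges(toy_signals, min_shared=2)
print(toy_edges)
print(jaccard_distance(toy_signals["group_A"], toy_signals["group_B"]))
```

The edge dictionary corresponds to the network view (which groups share many AEs), while the pairwise distances could be passed to any standard hierarchical clustering routine.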
Affiliation(s)
- Huyen Le
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Weigong Ge
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Henry Francis
- Retired, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
| | - Beverly Lyn-Cook
- Division of Biochemical Toxicology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Yi-Ting Hwang
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Department of Statistics, National Taipei University, New Taipei City 23148, Taiwan
| | - Paul Rogers
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Wen Zou
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| |
|
15
|
Wang X, Xu X, Liu Z, Tong W. Bidirectional Encoder Representations from Transformers-like large language models in patient safety and pharmacovigilance: A comprehensive assessment of causal inference implications. Exp Biol Med (Maywood) 2023; 248:1908-1917. [PMID: 38084745 PMCID: PMC10798182 DOI: 10.1177/15353702231215895] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/31/2023] [Accepted: 10/24/2023] [Indexed: 01/06/2024] [Open Access]
Abstract
Causality assessment is vital in patient safety and pharmacovigilance (PSPV) for safety signal detection, adverse reaction management, and regulatory submission. Large language models (LLMs), especially those designed with transformer architecture, are revolutionizing various fields, including PSPV. While attempts to utilize Bidirectional Encoder Representations from Transformers (BERT)-like LLMs for causal inference in PSPV are underway, a detailed evaluation of "fit-for-purpose" BERT-like model selection to enhance causal inference performance within PSPV applications remains absent. This study conducts an in-depth exploration of BERT-like LLMs, including generic pre-trained BERT LLMs, domain-specific pre-trained LLMs, and domain-specific pre-trained LLMs with safety knowledge-specific fine-tuning, for causal inference in PSPV. Our investigation centers on (1) the influence of data complexity and model architecture, (2) the correlation between BERT size and its impact, and (3) the role of domain-specific training and fine-tuning on three publicly accessible PSPV data sets. The findings suggest that (1) BERT-like LLMs deliver consistent predictive power across varied data complexity levels, (2) the predictive performance and causal inference results do not directly correspond to the BERT-like model size, and (3) domain-specific pre-trained LLMs, with or without safety knowledge-specific fine-tuning, surpass generic pre-trained BERT models in causal inference. These findings can guide the future use of LLMs in a broad range of applications.
Affiliation(s)
- Xingqiao Wang
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
| | - Xiaowei Xu
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
| | - Zhichao Liu
- Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, CT 06877, USA
| | - Weida Tong
- FDA/National Center for Toxicological Research, Jefferson, AR 72079, USA
| |
|
16
|
Yang J, Liu Y, Shang J, Chen Q, Chen Q, Ren L, Zhang N, Yu Y, Li Z, Song Y, Yang S, Scherer A, Tong W, Hong H, Xiao W, Shi L, Zheng Y. The Quartet Data Portal: integration of community-wide resources for multiomics quality control. Genome Biol 2023; 24:245. [PMID: 37884999 PMCID: PMC10601216 DOI: 10.1186/s13059-023-03091-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/08/2022] [Accepted: 10/17/2023] [Indexed: 10/28/2023] [Open Access]
Abstract
The Quartet Data Portal facilitates community access to well-characterized reference materials, reference datasets, and related resources established based on a family of four individuals with identical twins from the Quartet Project. Users can request DNA, RNA, protein, and metabolite reference materials, as well as datasets generated across omics, platforms, labs, protocols, and batches. Reproducible analysis tools allow for objective performance assessment of user-submitted data, while interactive visualization tools support rapid exploration of reference datasets. A closed-loop "distribution-collection-evaluation-integration" workflow enables updates and integration of community-contributed multiomics data. Ultimately, this portal helps promote the advancement of reference datasets and multiomics quality control.
Affiliation(s)
- Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zhihui Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yueqiang Song
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Shengpeng Yang
- Intelligent Storage, Alibaba Cloud, Alibaba Group, Hangzhou, Zhejiang, China
| | - Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Wenming Xiao
- Office of Oncological Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- International Human Phenome Institutes (Shanghai), Shanghai, China
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| |
|
17
|
Yu Y, Hou W, Liu Y, Wang H, Dong L, Mai Y, Chen Q, Li Z, Sun S, Yang J, Cao Z, Zhang P, Zi Y, Liu R, Gao J, Zhang N, Li J, Ren L, Jiang H, Shang J, Zhu S, Wang X, Qing T, Bao D, Li B, Li B, Suo C, Pi Y, Wang X, Dai F, Scherer A, Mattila P, Han J, Zhang L, Jiang H, Thierry-Mieg D, Thierry-Mieg J, Xiao W, Hong H, Tong W, Wang J, Li J, Fang X, Jin L, Xu J, Qian F, Zhang R, Shi L, Zheng Y. Author Correction: Quartet RNA reference materials improve the quality of transcriptomic data through ratio-based profiling. Nat Biotechnol 2023:10.1038/s41587-023-02008-y. [PMID: 37783850 DOI: 10.1038/s41587-023-02008-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Haiyan Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | | | - Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zhihui Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Shanyue Sun
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, China
| | - Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Peipei Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yi Zi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ruimei Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jian Gao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingjing Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Nextomics Biosciences Institute, Wuhan, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - He Jiang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Sibo Zhu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Xiaolin Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Tao Qing
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ding Bao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Bingying Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Bin Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Chen Suo
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yan Pi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Xia Wang
- National Institute of Metrology, Beijing, China
| | | | - Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, The Netherlands
| | - Pirkko Mattila
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, The Netherlands
| | | | - Lijun Zhang
- Nanjing Vazyme Biotech Co. Ltd., Nanjing, China
| | | | - Danielle Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Jean Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Wenming Xiao
- Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Jing Wang
- National Institute of Metrology, Beijing, China
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- National Center of Gerontology, Beijing, China
| | - Xiang Fang
- National Institute of Metrology, Beijing, China
| | - Li Jin
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA.
| | - Feng Qian
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China.
| | - Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China.
- National Center of Gerontology, Beijing, China.
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
| |
|
18
|
Furuhama A, Kitazawa A, Yao J, Matos Dos Santos CE, Rathman J, Yang C, Ribeiro JV, Cross K, Myatt G, Raitano G, Benfenati E, Jeliazkova N, Saiakhov R, Chakravarti S, Foster RS, Bossa C, Battistelli CL, Benigni R, Sawada T, Wasada H, Hashimoto T, Wu M, Barzilay R, Daga PR, Clark RD, Mestres J, Montero A, Gregori-Puigjané E, Petkov P, Ivanova H, Mekenyan O, Matthews S, Guan D, Spicer J, Lui R, Uesawa Y, Kurosaki K, Matsuzaka Y, Sasaki S, Cronin MTD, Belfield SJ, Firman JW, Spînu N, Qiu M, Keca JM, Gini G, Li T, Tong W, Hong H, Liu Z, Igarashi Y, Yamada H, Sugiyama KI, Honma M. Evaluation of QSAR models for predicting mutagenicity: outcome of the Second Ames/QSAR international challenge project. SAR QSAR Environ Res 2023; 34:983-1001. [PMID: 38047445 DOI: 10.1080/1062936x.2023.2284902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 11/13/2023] [Indexed: 12/05/2023]
Abstract
Quantitative structure-activity relationship (QSAR) models are powerful in silico tools for predicting the mutagenicity of unstable compounds, impurities and metabolites that are difficult to examine using the Ames test. Ideally, Ames/QSAR models for regulatory use should demonstrate high sensitivity, low false-negative rate and wide coverage of chemical space. To promote superior model development, the Division of Genetics and Mutagenesis, National Institute of Health Sciences, Japan (DGM/NIHS), conducted the Second Ames/QSAR International Challenge Project (2020-2022) as a successor to the First Project (2014-2017), with 21 teams from 11 countries participating. The DGM/NIHS provided a curated training dataset of approximately 12,000 chemicals and a trial dataset of approximately 1,600 chemicals, and each participating team predicted the Ames mutagenicity of each trial chemical using various Ames/QSAR models. The DGM/NIHS then provided the Ames test results for trial chemicals to assist in model improvement. Although overall model performance on the Second Project was not superior to that on the First, models from the eight teams participating in both projects achieved higher sensitivity than models from teams participating in only the Second Project. Thus, these evaluations have facilitated the development of QSAR models.
Affiliation(s)
- A Furuhama
- Division of Genetics and Mutagenesis (DGM), National Institute of Health Sciences (NIHS), Kawasaki, Japan
| | - A Kitazawa
- Division of Genetics and Mutagenesis (DGM), National Institute of Health Sciences (NIHS), Kawasaki, Japan
| | - J Yao
- Key Laboratory of Fluorine and Nitrogen Chemistry and Advanced Materials (Chinese Academy of Sciences), Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences (SIOC, CAS), Shanghai, China
| | - C E Matos Dos Santos
- Department of Computational Toxicology and In Silico Innovations, Altox Ltd, São Paulo-SP, Brazil
| | - J Rathman
- MN-AM, Nuremberg, Germany/Columbus, OH, USA
| | - C Yang
- MN-AM, Nuremberg, Germany/Columbus, OH, USA
| | | | - K Cross
- In Silico Department, Instem, Conshohocken, PA, USA
| | - G Myatt
- In Silico Department, Instem, Conshohocken, PA, USA
| | - G Raitano
- Laboratory of Environmental Toxicology and Chemistry, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS (IRFMN), Milano, Italy
| | - E Benfenati
- Laboratory of Environmental Toxicology and Chemistry, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS (IRFMN), Milano, Italy
| | | | | | | | | | - C Bossa
- Environment and Health Department, Istituto Superiore di Sanità (ISS), Rome, Italy
| | - C Laura Battistelli
- Environment and Health Department, Istituto Superiore di Sanità (ISS), Rome, Italy
| | - R Benigni
- Environment and Health Department, Istituto Superiore di Sanità (ISS), Rome, Italy
- Alpha-PreTox, Rome, Italy
| | - T Sawada
- Faculty of Regional Studies, Gifu University, Gifu, Japan
- xenoBiotic Inc, Gifu, Japan
| | - H Wasada
- Faculty of Regional Studies, Gifu University, Gifu, Japan
| | - T Hashimoto
- Faculty of Regional Studies, Gifu University, Gifu, Japan
| | - M Wu
- Massachusetts Institute of Technology, Cambridge, MA, USA
| | - R Barzilay
- Massachusetts Institute of Technology, Cambridge, MA, USA
| | - P R Daga
- Simulations Plus, Lancaster, CA, USA
| | - R D Clark
- Simulations Plus, Lancaster, CA, USA
| | | | | | | | - P Petkov
- LMC - Bourgas University, Bourgas, Bulgaria
| | - H Ivanova
- LMC - Bourgas University, Bourgas, Bulgaria
| | - O Mekenyan
- LMC - Bourgas University, Bourgas, Bulgaria
| | - S Matthews
- Computational Pharmacology & Toxicology Laboratory, Discipline of Pharmacology, School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - D Guan
- Computational Pharmacology & Toxicology Laboratory, Discipline of Pharmacology, School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - J Spicer
- Computational Pharmacology & Toxicology Laboratory, Discipline of Pharmacology, School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - R Lui
- Computational Pharmacology & Toxicology Laboratory, Discipline of Pharmacology, School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - Y Uesawa
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo, Japan
| | - K Kurosaki
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo, Japan
| | - Y Matsuzaka
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo, Japan
| | - S Sasaki
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo, Japan
| | - M T D Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
| | - S J Belfield
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
| | - J W Firman
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
| | - N Spînu
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
| | - M Qiu
- Evergreen AI, Inc, Toronto, Canada
| | - J M Keca
- Evergreen AI, Inc, Toronto, Canada
| | - G Gini
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, Milano, Italy
| | - T Li
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration (NCTR/FDA), Jefferson, AR, USA
| | - W Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration (NCTR/FDA), Jefferson, AR, USA
| | - H Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration (NCTR/FDA), Jefferson, AR, USA
| | - Z Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration (NCTR/FDA), Jefferson, AR, USA
- Integrative Toxicology, Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc, Ridgefield, CT, USA
| | - Y Igarashi
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), Osaka, Japan
| | - H Yamada
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), Osaka, Japan
| | - K-I Sugiyama
- Division of Genetics and Mutagenesis (DGM), National Institute of Health Sciences (NIHS), Kawasaki, Japan
| | - M Honma
- Division of Genetics and Mutagenesis (DGM), National Institute of Health Sciences (NIHS), Kawasaki, Japan
| |
|
19
|
Li T, Liu Z, Thakkar S, Roberts R, Tong W. DeepAmes: A deep learning-powered Ames test predictive model with potential for regulatory application. Regul Toxicol Pharmacol 2023; 144:105486. [PMID: 37633327 DOI: 10.1016/j.yrtph.2023.105486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 07/14/2023] [Accepted: 08/23/2023] [Indexed: 08/28/2023]
Abstract
The Ames assay is required by regulatory agencies worldwide to assess the potential mutagenic risk of consumer products. Alongside this in vitro assay, in silico approaches have been widely used to predict Ames test results, as outlined in the International Council for Harmonization (ICH) guidelines. Building on this in silico approach, here we describe DeepAmes, a high-performance and robust model developed with a novel deep learning (DL) approach for potential utility in regulatory science. DeepAmes was developed with a large and consistent Ames dataset (>10,000 compounds) and was compared with five other standard machine learning (ML) methods. Using a test set of 1,543 compounds, DeepAmes was the best performer in predicting the outcome of the Ames assay. In addition, DeepAmes yielded the best and most stable performance up to the point at which compounds were >30% outside of the applicability domain (AD). Regarding the potential for regulatory application, a revised version of DeepAmes achieved a much-improved sensitivity of 0.87, up from 0.47. In conclusion, DeepAmes provides a DL-powered model for predicting the results of Ames tests; with its defined AD and clear context of use, DeepAmes has potential for utility in regulatory application.
Affiliation(s)
- Ting Li
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, USA
| | - Zhichao Liu
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, USA
| | - Shraddha Thakkar
- Office of Translational Sciences, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
| | - Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge, SK10 4TG, UK; University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, USA.
| |
|
20
|
Yu Y, Hou W, Liu Y, Wang H, Dong L, Mai Y, Chen Q, Li Z, Sun S, Yang J, Cao Z, Zhang P, Zi Y, Liu R, Gao J, Zhang N, Li J, Ren L, Jiang H, Shang J, Zhu S, Wang X, Qing T, Bao D, Li B, Li B, Suo C, Pi Y, Wang X, Dai F, Scherer A, Mattila P, Han J, Zhang L, Jiang H, Thierry-Mieg D, Thierry-Mieg J, Xiao W, Hong H, Tong W, Wang J, Li J, Fang X, Jin L, Xu J, Qian F, Zhang R, Shi L, Zheng Y. Quartet RNA reference materials improve the quality of transcriptomic data through ratio-based profiling. Nat Biotechnol 2023:10.1038/s41587-023-01867-9. [PMID: 37679545 DOI: 10.1038/s41587-023-01867-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/28/2022] [Accepted: 06/15/2023] [Indexed: 09/09/2023]
Abstract
Certified RNA reference materials are indispensable for assessing the reliability of RNA sequencing to detect intrinsically small biological differences in clinical settings, such as molecular subtyping of diseases. As part of the Quartet Project for quality control and data integration of multi-omics profiling, we established four RNA reference materials derived from immortalized B-lymphoblastoid cell lines from four members of a monozygotic twin family. Additionally, we constructed ratio-based transcriptome-wide reference datasets between two samples, providing cross-platform and cross-laboratory 'ground truth'. Investigation of the intrinsically subtle biological differences among the Quartet samples enables sensitive assessment of cross-batch integration of transcriptomic measurements at the ratio level. The Quartet RNA reference materials, combined with the ratio-based reference datasets, can serve as unique resources for assessing and improving the quality of transcriptomic data in clinical and biological settings.
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Haiyan Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | | | - Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zhihui Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Shanyue Sun
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, China
| | - Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Peipei Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yi Zi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ruimei Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jian Gao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingjing Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Nextomics Biosciences Institute, Wuhan, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - He Jiang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Sibo Zhu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Xiaolin Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Tao Qing
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ding Bao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Bingying Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Bin Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Chen Suo
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yan Pi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Xia Wang
- National Institute of Metrology, Beijing, China
| | | | - Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, The Netherlands
| | - Pirkko Mattila
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, The Netherlands
| | | | - Lijun Zhang
- Nanjing Vazyme Biotech Co. Ltd., Nanjing, China
| | | | - Danielle Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Jean Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Wenming Xiao
- Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Jing Wang
- National Institute of Metrology, Beijing, China
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- National Center of Gerontology, Beijing, China
| | - Xiang Fang
- National Institute of Metrology, Beijing, China
| | - Li Jin
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA.
| | - Feng Qian
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China.
| | - Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China.
- National Center of Gerontology, Beijing, China.
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
| |
|
21
|
Zheng Y, Liu Y, Yang J, Dong L, Zhang R, Tian S, Yu Y, Ren L, Hou W, Zhu F, Mai Y, Han J, Zhang L, Jiang H, Lin L, Lou J, Li R, Lin J, Liu H, Kong Z, Wang D, Dai F, Bao D, Cao Z, Chen Q, Chen Q, Chen X, Gao Y, Jiang H, Li B, Li B, Li J, Liu R, Qing T, Shang E, Shang J, Sun S, Wang H, Wang X, Zhang N, Zhang P, Zhang R, Zhu S, Scherer A, Wang J, Wang J, Huo Y, Liu G, Cao C, Shao L, Xu J, Hong H, Xiao W, Liang X, Lu D, Jin L, Tong W, Ding C, Li J, Fang X, Shi L. Multi-omics data integration using ratio-based quantitative profiling with Quartet reference materials. Nat Biotechnol 2023:10.1038/s41587-023-01934-1. [PMID: 37679543 DOI: 10.1038/s41587-023-01934-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/25/2022] [Accepted: 07/31/2023] [Indexed: 09/09/2023]
Abstract
Characterization and integration of the genome, epigenome, transcriptome, proteome and metabolome of different datasets is difficult owing to a lack of ground truth. Here we develop and characterize suites of publicly available multi-omics reference materials of matched DNA, RNA, protein and metabolites derived from immortalized cell lines from a family quartet of parents and monozygotic twin daughters. These references provide built-in truth defined by relationships among the family members and the information flow from DNA to RNA to protein. We demonstrate how using a ratio-based profiling approach that scales the absolute feature values of a study sample relative to those of a concurrently measured common reference sample produces reproducible and comparable data suitable for integration across batches, labs, platforms and omics types. Our study identifies reference-free 'absolute' feature quantification as the root cause of irreproducibility in multi-omics measurement and data integration and establishes the advantages of ratio-based multi-omics profiling with common reference materials.
Affiliation(s)
- Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China.
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, China
| | | | - Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
| | - Sha Tian
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Feng Zhu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | | | | | | | - Ling Lin
- Zhangjiang Center for Translational Medicine, Shanghai Biotecan Medical Diagnostics Co. Ltd., Shanghai, China
| | - Jingwei Lou
- Zhangjiang Center for Translational Medicine, Shanghai Biotecan Medical Diagnostics Co. Ltd., Shanghai, China
| | - Ruiqiang Li
- Novogene Bioinformatics Institute, Beijing, China
| | - Jingchao Lin
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
| | | | | | - Depeng Wang
- Nextomics Biosciences Institute, Wuhan, China
| | | | - Ding Bao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Xingdong Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yuechen Gao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - He Jiang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Bin Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Bingying Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingjing Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Nextomics Biosciences Institute, Wuhan, China
| | - Ruimei Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Tao Qing
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Erfei Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Shanyue Sun
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Haiyan Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Xiaolin Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Peipei Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Ruolan Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Sibo Zhu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
| | - Jiucun Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jing Wang
- National Institute of Metrology, Beijing, China
| | - Yinbo Huo
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
| | - Gang Liu
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
| | - Chengming Cao
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
| | - Li Shao
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Wenming Xiao
- Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
| | - Xiaozhen Liang
- Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
| | - Daru Lu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Li Jin
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Weida Tong
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
| | - Chen Ding
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China.
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China.
| | - Xiang Fang
- National Institute of Metrology, Beijing, China.
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes (Shanghai), Shanghai, China.
| |
|
22
|
Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023; 24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms have been proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios. RESULTS As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples into their own donors. The ratio-based method, i.e., scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interests. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
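The ratio-based scaling described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `ratio_correct` and the toy data are hypothetical, and the published method may differ in detail (for example, it may work on log-transformed values). The sketch only shows the core idea that dividing each study sample by the reference material profiled in the same batch cancels a multiplicative batch effect.

```python
from statistics import fmean

def ratio_correct(batch_samples, batch_reference):
    """Scale study samples feature-wise by the mean of the reference
    replicates profiled in the same batch (hypothetical helper)."""
    # Per-feature mean of the reference replicates in this batch.
    ref_mean = [fmean(col) for col in zip(*batch_reference)]
    # Express every study sample as a ratio to that reference mean.
    return [[x / r for x, r in zip(sample, ref_mean)] for sample in batch_samples]

# Toy data (assumed): the same two samples are measured in two batches,
# and batch B inflates every feature by a factor of 3.
truth = [[2.0, 8.0], [4.0, 4.0]]
reference = [[1.0, 2.0], [1.0, 2.0]]

batch_a = ratio_correct(truth, reference)
batch_b = ratio_correct(
    [[3 * x for x in sample] for sample in truth],
    [[3 * x for x in rep] for rep in reference],
)
```

Because the reference material experiences the same batch effect as the study samples, the two batches agree after scaling, even though their absolute values differ threefold.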
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | | | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Xiang Fang
- National Institute of Metrology, Beijing, China.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
| |
|
23
|
Gray M, Xu J, Tong W, Wu L. Classifying Free Texts Into Predefined Sections Using AI in Regulatory Documents: A Case Study with Drug Labeling Documents. Chem Res Toxicol 2023; 36:1290-1299. [PMID: 37487037 PMCID: PMC10445280 DOI: 10.1021/acs.chemrestox.3c00028] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/02/2023] [Indexed: 07/26/2023]
Abstract
The US Food and Drug Administration (FDA) regulatory process often involves several reviewers who focus on sets of information related to their respective areas of review. Accordingly, manufacturers that provide submission packages to regulatory agencies are instructed to organize the contents using a structure that enables the information to be easily allocated, retrieved, and reviewed. However, this practice is not always followed correctly; as such, some documents are not well structured, with similar information spread across different sections, hindering the efficient access and review of all of the relevant data as a whole. To improve this common situation, we evaluated an artificial intelligence (AI)-based natural language processing (NLP) methodology, called Bidirectional Encoder Representations from Transformers (BERT), to automatically classify free-text information into standardized sections, supporting a holistic review of drug safety and efficacy. Specifically, FDA labeling documents were used in this study as a proof of concept, where the labeling section structure defined by the Physician Labeling Rule (PLR) was used to classify labels in the development of the model. The model was subsequently evaluated on texts from both well-structured labeling documents (i.e., PLR-based labeling) and less- or differently structured documents (i.e., non-PLR and Summary of Product Characteristics [SmPC] labeling). In the training process, the model yielded 96% and 88% accuracy for binary and multiclass tasks, respectively. The testing accuracies observed for the PLR, non-PLR, and SmPC testing data sets for the binary model were 95%, 88%, and 88%, and for the multiclass model were 82%, 73%, and 68%, respectively. Our study demonstrated that automatically classifying free texts into standardized sections with AI language models could be an advanced regulatory science approach for supporting the review process by effectively processing unformatted documents.
Affiliation(s)
- Magnus Gray
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas 72079, United States
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas 72079, United States
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas 72079, United States
| | - Leihong Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas 72079, United States
| |
|
24
|
Bussola N, Xu J, Wu L, Gorini L, Zhang Y, Furlanello C, Tong W. A Weakly Supervised Deep Learning Framework for Whole Slide Classification to Facilitate Digital Pathology in Animal Study. Chem Res Toxicol 2023; 36:1321-1331. [PMID: 37540590 PMCID: PMC10445282 DOI: 10.1021/acs.chemrestox.3c00058] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/24/2023] [Indexed: 08/06/2023]
Abstract
The pathology of animal studies is crucial for toxicity evaluations and regulatory assessments, but the manual examination of slides by pathologists remains time-consuming and requires extensive training. One inherent challenge in this process is the interobserver variability, which can compromise the consistency and accuracy of a study. Artificial intelligence (AI) has demonstrated its ability to automate similar examinations in clinical applications with enhanced efficiency, consistency, and accuracy. However, training AI models typically relies on costly pixel-level annotation of injured regions, which is often not available for animal pathology. To address this, we developed the PathologAI system, a "weakly" supervised approach for whole slide image (WSI) classification in rat images without explicit lesion annotation at the pixel level. Using rat liver imaging data from the Open TG-GATEs system, PathologAI was applied to predict necrosis of n = 816 WSIs (377 controls). TG-GATEs studied 170 compounds at three dose levels (low, middle, and high) for four time points (3, 7, 14, and 28 days). PathologAI first preprocessed WSIs at the tile level to generate a high-level representation with a Generative Adversarial Network architecture. The prediction of liver necrosis relied on an ensemble model of 5 CNN classifiers trained on 335 WSIs. The ensemble model achieved notable classification accuracy on the holdout test set: 87% among 87 control slides free of findings, 83% among 120 controls with spontaneous necrosis, 67% among 147 treated animals with spontaneous minimal or slight necrosis, and 59% among 127 treated animals with minimal or slight necrosis caused by the treatment. Importantly, PathologAI was able to discriminate WSIs with spontaneous necrosis from those with treatment-related necrosis, and it also discriminated between mild lesion levels (slight vs minimal) and between treatment dose levels. PathologAI could provide an inexpensive and rapid screening tool to assist the digital pathology analysis in preclinical applications and general toxicological studies.
Affiliation(s)
- Nicole Bussola
- Center for Integrative Biology, University of Trento, Trento 38123, Italy
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, United States
| | - Leihong Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, United States
| | | | - Yifan Zhang
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, United States
| | | | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, United States
| |
|
25
|
Li D, Fang Z, Shi Q, Zhang N, Gong B, Tong W, Coskun AF, Xu J. Single-cell RNA-sequencing and subcellular spatial transcriptomics facilitate the translation of liver microphysiological systems for regulatory application. J Pharm Anal 2023; 13:691-693. [PMID: 37577388 PMCID: PMC10422651 DOI: 10.1016/j.jpha.2023.06.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023] Open
Affiliation(s)
- Dan Li
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Zhou Fang
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30332, USA
- Machine Learning Graduate Program, Georgia Institute of Technology, Atlanta, GA, 30332, USA
| | - Qiang Shi
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Nicholas Zhang
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30332, USA
- Interdisciplinary Bioengineering Graduate Program, Georgia Institute of Technology, Atlanta, GA, 30332, USA
| | - Binsheng Gong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Ahmet F Coskun
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30332, USA
- Interdisciplinary Bioengineering Graduate Program, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA, 30332, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| |
|
26
|
Li T, Roberts R, Liu Z, Tong W. TransOrGAN: An Artificial Intelligence Mapping of Rat Transcriptomic Profiles between Organs, Ages, and Sexes. Chem Res Toxicol 2023; 36:916-925. [PMID: 37200521 PMCID: PMC10433534 DOI: 10.1021/acs.chemrestox.3c00037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Indexed: 05/20/2023]
Abstract
Animal studies are required for the evaluation of candidate drugs to ensure patient and volunteer safety. Toxicogenomics is often applied in these studies to gain understanding of the underlying mechanisms of toxicity, which is usually focused on critical organs such as the liver or kidney in young male rats. There is a strong ethical reason to reduce, refine and replace animal use (the 3Rs), where the mapping of data between organs, sexes and ages could reduce the cost and time of drug development. Herein, we proposed a generative adversarial network (GAN)-based framework entitled TransOrGAN that allowed the molecular mapping of gene expression profiles in different rodent organ systems and across sex and age groups. We carried out a proof-of-concept study based on rat RNA-seq data from 288 samples in 9 different organs of both sexes and 4 developmental stages. First, we demonstrated that TransOrGAN could infer transcriptomic profiles between any 2 of the 9 organs studied, yielding an average cosine similarity of 0.984 between synthetic transcriptomic profiles and their corresponding real profiles. Second, we found that TransOrGAN could infer transcriptomic profiles observed in females from males, with an average cosine similarity of 0.984. Third, we found that TransOrGAN could infer transcriptomic profiles in juvenile, adult, and aged animals from adolescent animals with an average cosine similarity of 0.981, 0.983, and 0.989, respectively. Altogether, TransOrGAN is an innovative approach to infer transcriptomic profiles between ages, sexes, and organ systems, offering the opportunity to reduce animal usage and to provide an integrated assessment of toxicity in the whole organism irrespective of sex or age.
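The cosine similarity reported in this abstract compares a synthetic profile with its real counterpart as vectors of expression values. A minimal Python sketch of that metric follows; the function name and the example vectors are illustrative assumptions and are not taken from the TransOrGAN code or data.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical expression vectors for a real and a synthetic profile.
real_profile = [1.0, 2.0, 3.0]
scaled_copy = [2.0, 4.0, 6.0]    # same direction, different magnitude
unrelated = [0.0, 0.0, 0.0, 1.0]  # only used with a matching-length partner below

same_direction = cosine_similarity(real_profile, scaled_copy)
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The metric depends only on the direction of the two vectors, so a value near 1 indicates that the relative expression pattern is preserved regardless of overall scale.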
Affiliation(s)
- Ting Li
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, United States
| | - Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge SK10 4TG, United Kingdom
- University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
| | - Zhichao Liu
- Integrative Toxicology, Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, Connecticut 06877, United States
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, United States
| |
|
27
|
Thakkar S, Slikker W, Yiannas F, Silva P, Blais B, Chng KR, Liu Z, Adholeya A, Pappalardo F, Soares MDLC, Beeler P, Whelan M, Roberts R, Borlak J, Hugas M, Torrecilla-Salinas C, Girard P, Diamond MC, Verloo D, Panda B, Rose MC, Jornet JB, Furuhama A, Fang H, Kwegyir-Afful E, Heintz K, Arvidson K, Burgos JG, Horst A, Tong W. Artificial intelligence and real-world data for drug and food safety - A regulatory science perspective. Regul Toxicol Pharmacol 2023; 140:105388. [PMID: 37061083 DOI: 10.1016/j.yrtph.2023.105388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 03/07/2023] [Accepted: 04/12/2023] [Indexed: 04/17/2023]
Abstract
In 2013, the Global Coalition for Regulatory Science Research (GCRSR) was established with members from over ten countries (www.gcrsr.net). One of the main objectives of GCRSR is to facilitate communication among global regulators on the rise of new technologies with regulatory applications through the annual conference Global Summit on Regulatory Science (GSRS). The 11th annual GSRS conference (GSRS21) focused on "Regulatory Sciences for Food/Drug Safety with Real-World Data (RWD) and Artificial Intelligence (AI)." The conference discussed current advancements in both AI and RWD approaches with a specific emphasis on how they impact regulatory sciences and how regulatory agencies across the globe are pursuing the adaptation and oversight of these technologies. There were presentations from Brazil, Canada, India, Italy, Japan, Germany, Switzerland, Singapore, the United Kingdom, and the United States. These presentations highlighted how various agencies are moving forward with these technologies by improving the agencies' operation and/or preparing regulatory mechanisms to approve the products containing these innovations. To increase the content and discussion, the GSRS21 hosted two debate sessions on the question of "Is Regulatory Science Ready for AI?" and a workshop to showcase the analytical data tools that global regulatory agencies have been using and/or plan to apply to regulatory science. Several key topics were highlighted and discussed during the conference, such as the capabilities of AI and RWD to assist regulatory science policies for drug and food safety and the readiness of AI and data science to provide solutions for regulatory science. Discussions highlighted the need for a constant effort to evaluate emerging technologies for fit-for-purpose regulatory applications. The annual GSRS conferences offer a unique platform to facilitate discussion and collaboration across regulatory agencies, modernizing regulatory approaches, and harmonizing efforts.
Affiliation(s)
- Shraddha Thakkar
- Center for Drug Evaluation and Research (CDER), Food and Drug Administration (FDA), USA
| | - William Slikker
- National Center for Toxicological Research (NCTR), Food and Drug Administration (FDA), USA
| | | | | | | | - Kern Rei Chng
- National Centre for Food Science, Singapore Food Agency (SFA), Singapore
| | - Zhichao Liu
- National Center for Toxicological Research (NCTR), Food and Drug Administration (FDA), USA
| | - Alok Adholeya
- The Energy and Resources Institute (TERI), New Delhi, India
| | | | | | - Patrick Beeler
- Swissmedic, Bern, Switzerland; University of Zurich, Zurich, Switzerland
| | | | | | | | | | | | | | - Matthew C Diamond
- Center for Devices and Radiological Health (CDRH), Food and Drug Administration (FDA), USA
| | | | - Binay Panda
- Jawaharlal Nehru University (JNU), New Delhi, India
| | | | | | | | - Hong Fang
- National Center for Toxicological Research (NCTR), Food and Drug Administration (FDA), USA
| | - Ernest Kwegyir-Afful
- Center for Food Safety and Applied Nutrition (CFSAN), Food and Drug Administration (FDA), USA
| | - Kasey Heintz
- Center for Food Safety and Applied Nutrition (CFSAN), Food and Drug Administration (FDA), USA
| | - Kirk Arvidson
- Center for Food Safety and Applied Nutrition (CFSAN), Food and Drug Administration (FDA), USA
| | | | | | - Weida Tong
- National Center for Toxicological Research (NCTR), Food and Drug Administration (FDA), USA.
| |
|
28
|
Buckley D, Aspinall D, Carroll R, Hayward C, Kotlyar E, Jabbour A, Bart N, Keogh A, MacDonald P, Muthiah K, Tong W. Routine Donor Specific Antibody Monitoring in Heart Transplant Recipients - Is There a Role? J Heart Lung Transplant 2023. [DOI: 10.1016/j.healun.2023.02.1108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023] Open
|
29
|
Tardo D, Carlos L, Burrows F, Carroll R, Tong W, Patel P, Taverniti A, Wiltshire S, Conte S, Parvar S, Emmanuel S, Grealy R, Hayward C, Bart N, Kotlyar E, Jabbour A, Keogh A, Patel J, Jansz P, Macdonald P, Muthiah K. Combined Plasmapheresis and Complement Inhibition in a Highly Allosensitized Cardiac Transplant Recipient. J Heart Lung Transplant 2023. [DOI: 10.1016/j.healun.2023.02.1227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023] Open
|
30
|
Zheng H, Wang Q, Fu T, Wei Z, Ye J, Huang B, Li C, Liu B, Zhang A, Li F, Gao F, Tong W. Robotic versus laparoscopic left colectomy with complete mesocolic excision for left-sided colon cancer: a multicentre study with propensity score matching analysis. Tech Coloproctol 2023:10.1007/s10151-023-02781-7. [PMID: 36964884 DOI: 10.1007/s10151-023-02781-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 02/28/2023] [Indexed: 03/26/2023]
Abstract
BACKGROUND Robotic surgery for right-sided colon and rectal cancer has rapidly increased; however, there is limited evidence in the literature of advantages of robotic left colectomy (RLC) for left-sided colon cancer. The purpose of this study was to compare the outcomes of RLC versus laparoscopic left colectomy (LLC) with complete mesocolic excision (CME) for left-sided colon cancer. METHODS Patients who had RLC or LLC with CME for left-sided colon cancer at 5 hospitals in China between January 2014 and April 2022 were included. A one-to-one propensity score matched analysis was performed to decrease confounding. The primary outcome was postoperative complications occurring within 30 days of surgery. Secondary outcomes were disease-free survival, overall survival and the number of harvested lymph nodes. RESULTS A total of 292 patients (187 males; median age 61.0 [20.0-85.0] years) were eligible for this study, and propensity score matching yielded 102 patients in each group. The clinical-pathological characteristics were well-matched between groups. The two groups did not differ in estimated blood loss, conversion to open rate, time to first flatus, reoperation rate, or postoperative length of hospital stay (p > 0.05). RLC was associated with a longer operation time (192.9 ± 53.2 vs. 168.9 ± 52.8 minutes, p = 0.001). The incidence of postoperative complications did not differ between the RLC and LLC groups (18.6% vs. 17.6%, p = 0.856). The total number of lymph nodes harvested in the RLC group was higher than that in the LLC group (15.7 ± 8.3 vs. 12.1 ± 5.9, p < 0.001). There were no significant differences in 3-year and 5-year overall survival or 3-year and 5-year disease-free survival. CONCLUSIONS Compared to laparoscopic surgery, RLC with CME for left-sided colon cancer was found to be associated with higher numbers of lymph nodes harvested and similar postoperative complications and long-term survival outcomes.
Affiliation(s)
- H Zheng
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China
| | - Q Wang
- Department of Gastrocolorectal Surgery, The First Hospital of Jilin University, Changchun, China
| | - T Fu
- Department of Gastrointestinal Surgery II, Renmin Hospital of Wuhan University, Wuhan, China
| | - Z Wei
- Department of Gastrointestinal Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - J Ye
- Department of Gastrointestinal Surgery, The People's Hospital of Shapingba District, Chongqing, China
| | - B Huang
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China
| | - C Li
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China
| | - B Liu
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China
| | - A Zhang
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China
| | - F Li
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China.
| | - F Gao
- Department of Colorectal Surgery, 940th Hospital of Joint Logistics Support force of PLA, Lanzhou, China.
| | - W Tong
- Gastric and Colorectal Surgery Division, Department of General Surgery, Army Medical Center (Daping Hospital), Army Medical University, No. 10, Changjiang Branch Road, Daping, Yuzhong District, Chongqing, China.
| |
|
31
|
Wang X, Xu X, Tong W, Liu Q, Liu Z. DeepCausality: A general AI-powered causal inference framework for free text: A case study of LiverTox. Front Artif Intell 2022; 5:999289. [PMID: 36561659 PMCID: PMC9763446 DOI: 10.3389/frai.2022.999289] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/20/2022] [Accepted: 11/16/2022] [Indexed: 12/12/2022] Open
Abstract
Causality plays an essential role in multiple scientific disciplines, including the social, behavioral, and biological sciences and portions of statistics and artificial intelligence. Manual causality assessment from a large number of free-text documents is very time-consuming, labor-intensive, and sometimes even impractical. Herein, we proposed a general causal inference framework named DeepCausality to empirically estimate the causal factors for suspected endpoints embedded in free text. DeepCausality incorporates AI-powered language models, named entity recognition, and Judea Pearl's do-calculus into a general framework for causal inference that can serve different domain-specific applications. We exemplified the utility of the framework by employing the LiverTox database to estimate idiosyncratic drug-induced liver injury (DILI)-related causal terms and generate a knowledge-based causal tree for idiosyncratic DILI patient stratification. DeepCausality yielded an accuracy of 0.92 and an F-score of 0.84 for DILI prediction. Notably, 90% of the causal terms enriched by DeepCausality were consistent with the clinical causal terms defined by the American College of Gastroenterology (ACG) clinical guideline for evaluating suspected idiosyncratic DILI (iDILI). Furthermore, we observed a high concordance of 0.91 between the iDILI severity scores generated by DeepCausality and those of domain experts. Altogether, the proposed DeepCausality framework could be a promising solution for causality assessment from free text and is publicly available through https://github.com/XingqiaoWang/https-github.com-XingqiaoWang-DeepCausality-LiverTox.
Affiliation(s)
- Xingqiao Wang
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, United States
| | - Xiaowei Xu
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, United States
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Qi Liu
- Office of Clinical Pharmacology, Office of Translational Sciences, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
| | - Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| |
|
32
|
Li T, Tong W, Roberts R, Liu Z, Thakkar S. Corrigendum: DeepCarc: Deep learning-powered carcinogenicity prediction using model-level representation. Front Artif Intell 2022; 5:1046668. [DOI: 10.3389/frai.2022.1046668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 10/07/2022] [Indexed: 11/29/2022] Open
|
33
|
Wu L, Chen S, Guo L, Shpyleva S, Harris K, Fahmi T, Flanigan T, Tong W, Xu J, Ren Z. Development of benchmark datasets for text mining and sentiment analysis to accelerate regulatory literature review. Regul Toxicol Pharmacol 2022; 137:105287. [PMID: 36372266 DOI: 10.1016/j.yrtph.2022.105287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 10/18/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
In regulatory science, literature review is an essential step that is usually conducted by manually reading hundreds of articles. Although this process is highly time-consuming and labor-intensive, most of its output is not transformed into a machine-readable format. The limited availability of data has largely constrained the development of artificial intelligence (AI) systems that could facilitate literature review in the regulatory process. In the past decade, AI has revolutionized the area of text mining, as many deep learning approaches have been developed to search, annotate, and classify relevant documents. Following these advances in AI algorithms, a lack of high-quality data, rather than the algorithms themselves, has recently become the bottleneck of AI system development. Herein, we constructed two large benchmark datasets, the Chlorine Efficacy dataset (CHE) and the Chlorine Safety dataset (CHS), under a regulatory scenario that sought to assess the antiseptic efficacy and toxicity of chlorine. For each dataset, ∼10,000 scientific articles were initially collected and manually reviewed, and their relevance to the review task was labeled. To ensure high data quality, each paper was labeled by a consensus among multiple experienced reviewers. The overall relevance rate was 27.21% (2,663 of 9,788) for CHE and 7.50% (761 of 10,153) for CHS. Furthermore, the relevant articles were categorized into five subgroups based on the focus of their content. Next, we developed an attention-based classification language model using these two datasets. The proposed classification model yielded an Area Under the Curve (AUC) of 0.857 for CHE and 0.908 for CHS. This performance was significantly better than that of a permutation test (p < 10E-9), demonstrating that the labeling processes were valid. To conclude, our datasets can be used as benchmarks to develop AI systems, which can further facilitate the literature review process in regulatory science.
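The relevance rates quoted in this abstract follow directly from the reported article counts. As a minimal sketch (the dictionary layout and the function name `relevance_rate` are illustrative assumptions, not part of the original study), the percentages can be recomputed like this:

```python
# Labeled counts as reported in the abstract: (relevant, total) per dataset.
# The structure and names here are hypothetical, chosen only for illustration.
counts = {"CHE": (2663, 9788), "CHS": (761, 10153)}


def relevance_rate(relevant: int, total: int) -> float:
    """Return the share of relevant articles as a percentage."""
    return 100.0 * relevant / total


# Round to two decimals, matching the precision used in the abstract.
rates = {name: round(relevance_rate(r, t), 2) for name, (r, t) in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
```

The gap between the two rates indicates that CHS is considerably more class-imbalanced than CHE, which is worth keeping in mind when comparing the two AUC values.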
Affiliation(s)
- Leihong Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA.
| | - Si Chen
- Division of Biochemical Toxicology, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Lei Guo
- Division of Biochemical Toxicology, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Svitlana Shpyleva
- Division of Biochemical Toxicology, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Kelly Harris
- Division of Genetic and Molecular Toxicology, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Tariq Fahmi
- Office of Scientific Coordination, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Timothy Flanigan
- Division of Neurotoxicology, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA
| | - Zhen Ren
- Division of Biochemical Toxicology, National Center for Toxicological Research, U.S. FDA, Jefferson, AR, 72079, USA.
| |
|
34
|
Connor S, Li T, Roberts R, Thakkar S, Liu Z, Tong W. Adaptability of AI for safety evaluation in regulatory science: A case study of drug-induced liver injury. Front Artif Intell 2022; 5:1034631. [DOI: 10.3389/frai.2022.1034631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022] Open
Abstract
Artificial intelligence (AI) has played a crucial role in advancing biomedical sciences but has yet to have the impact it merits in regulatory science. As the field advances, in silico and in vitro approaches have been evaluated as alternatives to animal studies, in a drive to identify and mitigate safety concerns earlier in the drug development process. Although many AI tools are available, their acceptance in regulatory decision-making for drug efficacy and safety evaluation is still a challenge. It is a common perception that an AI model improves with more data, but does reality reflect this perception in drug safety assessments? Importantly, a model aiming at regulatory application needs to take a broad range of model characteristics into consideration. Among them is adaptability, defined as the adaptive behavior of a model as it is retrained on unseen data. This is an important model characteristic which should be considered in regulatory applications. In this study, we set up a comprehensive study to assess adaptability in AI by mimicking the real-world scenario of the annual addition of new drugs to the market, using a model we previously developed known as DeepDILI for predicting drug-induced liver injury (DILI) with a novel Deep Learning method. We found that the target test set plays a major role in assessing the adaptive behavior of our model. Our findings also indicated that adding more drugs to the training set does not significantly affect the predictive performance of our adaptive model. We concluded that the proposed adaptability assessment framework has utility in the evaluation of the performance of a model over time.
|
35
|
Verheijen M, Meier M, Ochoteco J, Gant T, Tong W, Yauk C, Caiment F. P20-03 R-ODAF: an omics data analysis framework for regulatory application. Toxicol Lett 2022. [DOI: 10.1016/j.toxlet.2022.07.669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
36
|
Wu L, Ali S, Ali H, Brock T, Xu J, Tong W. NeuroCORD: A Language Model to Facilitate COVID-19-Associated Neurological Disorder Studies. Int J Environ Res Public Health 2022; 19:9974. [PMID: 36011614 PMCID: PMC9408703 DOI: 10.3390/ijerph19169974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 08/03/2022] [Accepted: 08/05/2022] [Indexed: 06/15/2023]
Abstract
COVID-19 can lead to multiple severe outcomes including neurological and psychological impacts. However, it is challenging to manually scan hundreds of thousands of COVID-19 articles on a regular basis. To update our knowledge, provide sound science to the public, and communicate effectively, it is critical to have an efficient means of following the most current published data. In this study, we developed a language model to search abstracts using the most advanced artificial intelligence (AI) to accurately retrieve articles on COVID-19-associated neurological disorders. We applied this NeuroCORD model to the largest benchmark dataset of COVID-19, CORD-19. We found that the model developed on the training set yielded 94% prediction accuracy on the test set. This result was subsequently verified by two experts in the field. In addition, when applied to 96,000 non-labeled articles that were published after 2020, the NeuroCORD model accurately identified approximately 3% of them to be relevant for the study of COVID-19-associated neurological disorders, while only 0.5% were retrieved using conventional keyword searching. In conclusion, NeuroCORD provides an opportunity to profile neurological disorders resulting from COVID-19 in a rapid and efficient fashion, and its general framework could be used to study other COVID-19-related emerging health issues.
Affiliation(s)
- Leihong Wu
- National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
| | - Syed Ali
- National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
| | - Heather Ali
- Department of Internal Medicine, University of Arkansas for Medical Sciences, 4301 West Markham, Little Rock, AR 72205, USA
| | - Tyrone Brock
- National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
- Department of Mathematics and Computer Science, University of Arkansas at Pine Bluff, 1200 University Drive, Pine Bluff, AR 71601, USA
| | - Joshua Xu
- National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
| |
|
37
|
Bisgin H, Bera T, Wu L, Ding H, Bisgin N, Liu Z, Pava-Ripoll M, Barnes A, Campbell JF, Vyas H, Furlanello C, Tong W, Xu J. Accurate species identification of food-contaminating beetles with quality-improved elytral images and deep learning. Front Artif Intell 2022; 5:952424. [PMID: 36034596 PMCID: PMC9412741 DOI: 10.3389/frai.2022.952424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 07/22/2022] [Indexed: 11/13/2022] Open
Abstract
Food samples are routinely screened for food-contaminating beetles (i.e., pantry beetles) due to their adverse impact on the economy, environment, public health and safety. If found, their remains are subsequently analyzed to identify the species responsible for the contamination; each species poses different levels of risk, requiring different regulatory and management steps. At present, this identification is done through manual microscopic examination since each species of beetle has a unique pattern on its elytra (hardened forewing). Our study sought to automate the pattern recognition process through machine learning. Such automation will enable more efficient identification of pantry beetle species and could potentially be scaled up and implemented across various analysis centers in a consistent manner. In our earlier studies, we demonstrated that automated species identification of pantry beetles is feasible through elytral pattern recognition. Due to poor image quality, however, we failed to achieve prediction accuracies of more than 80%. Subsequently, we modified the traditional imaging technique, allowing us to acquire high-quality elytral images. In this study, we explored whether high-quality elytral images can truly achieve near-perfect prediction accuracies for 27 different species of pantry beetles. To test this hypothesis, we developed a convolutional neural network (CNN) model and compared performance between two different image sets for various pantry beetles. Our study indicates improved image quality indeed leads to better prediction accuracy; however, it was not the only requirement for achieving good accuracy. Also required are many high-quality images, especially for species with a high number of variations in their elytral patterns. The current study provided a direction toward achieving our ultimate goal of automated species identification through elytral pattern recognition.
Affiliation(s)
- Halil Bisgin
- Department of Mathematics and Applied Sciences, University of Michigan-Flint, Flint, MI, United States
| | - Tanmay Bera
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Leihong Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Hongjian Ding
- Food Chemistry Lab 1, Arkansas Regional Laboratory, Office of Regulatory Affairs, US Food and Drug Administration, Jefferson, AR, United States
| | - Neslihan Bisgin
- Department of Mathematics and Applied Sciences, University of Michigan-Flint, Flint, MI, United States
| | - Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Monica Pava-Ripoll
- Office for Food Safety, Center for Food Safety and Applied Nutrition, US Food and Drug Administration, College Park, MD, United States
| | - Amy Barnes
- Food Chemistry Lab 1, Arkansas Regional Laboratory, Office of Regulatory Affairs, US Food and Drug Administration, Jefferson, AR, United States
| | - James F. Campbell
- Stored Product Insect and Engineering Research Unit, US Department of Agriculture, Manhattan, KS, United States
| | - Himansi Vyas
- Food Chemistry Lab 1, Arkansas Regional Laboratory, Office of Regulatory Affairs, US Food and Drug Administration, Jefferson, AR, United States
| | | | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
- *Correspondence: Joshua Xu
| |
|
38
|
Liu Z, Li T, Connor S, Thakkar S, Roberts R, Tong W. Best practice and reproducible science are required to advance artificial intelligence in real-world applications. Brief Bioinform 2022; 23:6618241. [PMID: 35848999 DOI: 10.1093/bib/bbac237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 11/13/2022] Open
Abstract
Drug-induced liver injury (DILI) is one of the most significant concerns in medical practice, yet it still cannot be fully recapitulated with existing in vivo, in vitro and in silico approaches. To address this challenge, Chen et al. [1] developed a deep learning-based DILI prediction model based on chemical structure information alone. The reported model yielded an outstanding prediction performance (i.e. 0.958, 0.976, 0.935, 0.947, 0.926 and 0.913 for AUC, accuracy, recall, precision, F1-score and specificity, respectively, on a test set), far outperforming all publicly available and similar in silico DILI models. This extraordinary model performance runs counter to what we know about the underlying biology of DILI and to the principles and hypotheses behind this type of in silico approach. In this Letter to the Editor, we raise awareness of several issues concerning data curation, model validation and comparison practices, and data and model reproducibility.
Affiliation(s)
- Zhichao Liu
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Ting Li
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Skylar Connor
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Shraddha Thakkar
- Center for Drug Evaluation and Research, US FDA, Silver Spring, MD 20993, USA
| | - Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge, SK10 4TG, UK.,University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| | - Weida Tong
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| |
|
39
|
Zhang Y, Blomquist TM, Kusko R, Stetson D, Zhang Z, Yin L, Sebra R, Gong B, Lococo JS, Mittal VK, Novoradovskaya N, Yeo JY, Dominiak N, Hipp J, Raymond A, Qiu F, Arib H, Smith ML, Brock JE, Farkas DH, Craig DJ, Crawford EL, Li D, Morrison T, Tom N, Xiao W, Yang M, Mason CE, Richmond TA, Jones W, Johann DJ, Shi L, Tong W, Willey JC, Xu J. Deep oncopanel sequencing reveals within block position-dependent quality degradation in FFPE processed samples. Genome Biol 2022; 23:141. [PMID: 35768876 PMCID: PMC9241261 DOI: 10.1186/s13059-022-02709-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 06/15/2022] [Indexed: 11/13/2022] Open
Abstract
Background: Clinical laboratories routinely use formalin-fixed paraffin-embedded (FFPE) tissue or cell block cytology samples in oncology panel sequencing to identify mutations that can predict patient response to targeted therapy. To understand the technical error due to FFPE processing, a robustly characterized diploid cell line was used to create FFPE samples with four different pre-tissue processing formalin fixation times. A total of 96 FFPE sections were then distributed to different laboratories for targeted sequencing analysis by four oncopanels, and variants resulting from technical error were identified. Results: Tissue sections that fail more frequently show low cellularity, lower-than-recommended library preparation DNA input, or target sequencing depth. Importantly, sections from block surfaces are more likely to show FFPE-specific errors, akin to “edge effects” seen in histology, while the inner samples display no quality degradation related to fixation time. Conclusions: To assure reliable results, we recommend avoiding the block surface portion and restricting mutation detection to genomic regions of high confidence. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02709-8.
Affiliation(s)
- Yifan Zhang
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Thomas M Blomquist
- (Formerly) Department of Pathology, College of Medicine and Life Sciences, The University of Toledo, Toledo, OH, 43614, USA.,Lucas County Coroner's Office, 2595 Arlington Ave, Toledo, OH, 43614, USA
| | - Rebecca Kusko
- Immuneering Corporation, 245 Main St, Cambridge, MA, 02142, USA
| | - Daniel Stetson
- Astrazeneca Pharmaceuticals, 35 Gatehouse Dr, Waltham, MA, 02451, USA
| | - Zhihong Zhang
- Research and Development, Burning Rock Biotech, Shanghai, 201114, China
| | - Lihui Yin
- (Formerly) Pathology and Laboratory Medicine Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH, 44195, USA
| | - Robert Sebra
- Icahn Institute and Department of Genetics and Genomic Sciences Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA
| | - Binsheng Gong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | | | - Vinay K Mittal
- Thermo Fisher Scientific, 110 Miller Ave, Ann Arbor, MI, 48104, USA
| | | | - Ji-Youn Yeo
- Department of Pathology, University of Toledo, 3000 Arlington Ave, Toledo, OH, 43614, USA
| | - Nicole Dominiak
- Department of Pathology, University of Toledo, 3000 Arlington Ave, Toledo, OH, 43614, USA
| | - Jennifer Hipp
- Department of Pathology, Strata Oncology, Inc., Ann Arbor, MI, 48103, USA
| | - Amelia Raymond
- Astrazeneca Pharmaceuticals, 35 Gatehouse Dr, Waltham, MA, 02451, USA
| | - Fujun Qiu
- Research and Development, Burning Rock Biotech, Shanghai, 201114, China
| | - Hanane Arib
- Icahn Institute and Department of Genetics and Genomic Sciences Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA
| | - Melissa L Smith
- Icahn Institute and Department of Genetics and Genomic Sciences Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA
| | - Jay E Brock
- Pathology and Laboratory Medicine Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH, 44195, USA
| | - Daniel H Farkas
- Pathology and Laboratory Medicine Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH, 44195, USA
| | - Daniel J Craig
- Department of Medicine, College of Medicine and Life Sciences, The University of Toledo, Toledo, OH, 43614, USA
| | - Erin L Crawford
- Department of Medicine, College of Medicine and Life Sciences, The University of Toledo, Toledo, OH, 43614, USA
| | - Dan Li
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tom Morrison
- Accugenomics, Inc., 1410 Commonwealth Drive, Suite 105, Wilmington, NC, 20403, USA
| | - Nikola Tom
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic.,EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081 HZ, Amsterdam, The Netherlands
| | - Wenzhong Xiao
- Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA.,Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
| | - Mary Yang
- Department of Information Science, University of Arkansas at Little Rock, 2801 S. Univ. Ave, Little Rock, AR, 72204, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
| | - Todd A Richmond
- Market & Application Development Bioinformatics, Roche Sequencing Solutions Inc., 4300 Hacienda Dr, Pleasanton, CA, 94588, USA
| | - Wendell Jones
- Q2 Solutions - EA Genomics, 5927 S Miami Blvd, Morrisville, NC, 27560, USA
| | - Donald J Johann
- Winthrop P Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR, 72205, USA
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China.,Human Phenome Institute, Fudan University, Shanghai, 201203, China.,Fudan-Gospel Joint Research Center for Precision Medicine, Fudan University, Shanghai, 200438, China
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - James C Willey
- Departments of Medicine, Pathology, and Cancer Biology, College of Medicine and Life Sciences, University of Toledo Health Sciences Campus, 3000 Arlington Ave, Toledo, OH, 43614, USA.
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA.
| |
|
40
|
Xiao TT, Ouyang ZW, Liu XC, Cao JJ, Wang ZX, Tong W. Angular dependence of spin-flop transition in triangular lattice antiferromagnet Cu2(OH)3Br. J Phys Condens Matter 2022; 34:275804. [PMID: 35453130 DOI: 10.1088/1361-648x/ac69a0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2022] [Accepted: 04/22/2022] [Indexed: 06/14/2023]
Abstract
We report the angular dependence of the spin-flop transition in the triangular lattice antiferromagnet Cu2(OH)3Br, determined by angle-dependent magnetization and ESR measurements. The results show that the antiferromagnetic easy magnetization axis is the diagonal direction (θ = 45°) of the ac* plane, i.e., the orientation of Cu1 spins based on the magnetic structure (2020 Phys. Rev. Lett. 125 037204), whereas the spin-flop axis is the b axis. A phenomenological model is proposed to describe the angle-dependent spin-flop transitions. Based on this model, Cu1 spins are sensitive to an external magnetic field, while Cu2 spins are robust against the field, showing partial decoupling. The model is expected to be applicable to other uniaxial antiferromagnets with a more general easy axis and complex spin-flop transitions.
Affiliation(s)
- T T Xiao
- Wuhan National High Magnetic Field Center and School of Physics, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
| | - Z W Ouyang
- Wuhan National High Magnetic Field Center and School of Physics, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
| | - X C Liu
- Wuhan National High Magnetic Field Center and School of Physics, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
| | - J J Cao
- Wuhan National High Magnetic Field Center and School of Physics, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
| | - Z X Wang
- Wuhan National High Magnetic Field Center and School of Physics, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
| | - W Tong
- Anhui Province Key Laboratory of Condensed Matter Physics at Extreme Conditions, High Magnetic Field Laboratory, Chinese Academy of Sciences, Hefei 230031, People's Republic of China
| |
|
41
|
Gong B, Deveson IW, Mercer T, Johann DJ, Jones W, Tong W, Xu J. Ultra-deep sequencing data from a liquid biopsy proficiency study demonstrating analytic validity. Sci Data 2022; 9:170. [PMID: 35418127 PMCID: PMC9008010 DOI: 10.1038/s41597-022-01276-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 02/11/2022] [Indexed: 11/09/2022] Open
Abstract
Recently we reported the accuracy and reproducibility of circulating tumor DNA (ctDNA) assays using a unique set of reference materials, an associated analytical framework, and suggested best practices. With the rapid adoption of ctDNA sequencing in precision oncology, it is critical to understand the analytical validity and technical limitations of this cutting-edge and medical-practice-changing technology. The SEQC2 Oncopanel Sequencing Working Group has developed a multi-site, cross-platform study design for evaluating the analytical performance of five industry-leading ctDNA assays. The study used tailor-made reference samples at various levels of input material to assess ctDNA sequencing across 12 participating clinical and research facilities. The generated dataset encompasses multiple key variables, including a broad range of mutation frequencies, sequencing coverage depth, and DNA input quantity. It is the most comprehensive public-facing dataset of its kind and provides valuable insights into ultra-deep ctDNA sequencing technology. Ultimately, the clinical utility of ctDNA assays must be established, and our proficiency study and corresponding dataset are necessary steps toward this goal.
Measurement(s): Somatic Mutation • spike-in quality control role
Technology Type(s): Tumor Panel Sequencing
Factor Type(s): Tumor Panel • DNA Library Input Quantity • Variant Allele Frequency
Sample Characteristic - Organism: Homo sapiens
Affiliation(s)
- Binsheng Gong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Ira W Deveson
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia.,St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Sydney, NSW, Australia
| | - Timothy Mercer
- Australian Institute of Bioengineering and Nanotechnology, University of Queensland, St Lucia, QLD, Australia.,Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Donald J Johann
- Winthrop P Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, 4301 W Markham St., Little Rock, AR, 72205, USA
| | - Wendell Jones
- Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA.
|
42
|
Liu Z, Roberts R, Mercer TR, Xu J, Sedlazeck FJ, Tong W. Towards accurate and reliable resolution of structural variants for clinical diagnosis. Genome Biol 2022; 23:68. [PMID: 35241127 PMCID: PMC8892125 DOI: 10.1186/s13059-022-02636-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 08/27/2021] [Accepted: 02/15/2022] [Indexed: 12/17/2022] Open
Abstract
Structural variants (SVs) are a major source of human genetic diversity and have been associated with different diseases and phenotypes. The detection of SVs is difficult, and a diverse range of detection methods and data analysis protocols has been developed. This difficulty and diversity make the detection of SVs for clinical applications challenging and call for a framework to ensure accuracy and reproducibility. Here, we discuss current developments in the diagnosis of SVs and propose a roadmap for the accurate and reproducible detection of SVs that includes case studies provided from the FDA-led SEquencing Quality Control Phase II (SEQC-II) and other consortium efforts.
Affiliation(s)
- Zhichao Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Ruth Roberts
- ApconiX, BioHub at Alderley Park, Alderley Edge, SK10 4TG, UK
- University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| | - Timothy R Mercer
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Australia
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - Joshua Xu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| | - Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA.
|
43
|
Li D, Chen M, Hong H, Tong W, Ning B. Integrative approaches for studying the role of noncoding RNAs in influencing drug efficacy and toxicity. Expert Opin Drug Metab Toxicol 2022; 18:151-163. [PMID: 35296201 PMCID: PMC9117541 DOI: 10.1080/17425255.2022.2054802] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/03/2021] [Accepted: 03/14/2022] [Indexed: 11/04/2022]
Abstract
INTRODUCTION: Drug efficacy and toxicity are important factors for evaluation in drug development. Drug metabolizing enzymes and transporters (DMETs) play an essential role in drug efficacy and toxicity. Noncoding RNAs (ncRNAs) have been implicated in inter-individual variations in drug efficacy and safety through their regulation of DMETs. An efficient strategy is urgently needed to identify and functionally characterize ncRNAs that mediate drug efficacy and toxicity through regulating DMETs. AREAS COVERED: We outline an integrative strategy to identify ncRNAs that modulate DMETs. We include reliable tools and databases for computational prediction of ncRNA targets, with regard to their advantages and limitations. Various biochemical, molecular, and cellular assays are discussed for in vitro experimental verification of the regulatory function of ncRNAs. In vivo approaches for association of ncRNAs with drug treatment and toxicity are also reviewed. EXPERT OPINION: A streamlined integration of computational prediction and wet-lab validation is important to elucidate mechanisms of ncRNAs in the regulation of DMETs related to drug efficacy and safety. Bioinformatic analyses using open-access tools and databases serve as a powerful booster for ncRNA research in toxicology. Further refinement of computational algorithms and experimental technologies is needed to improve accuracy and efficiency in ncRNA target identification and characterization.
Affiliation(s)
- Dongying Li
- National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration (FDA), Jefferson, AR, USA
| | - Minjun Chen
- National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration (FDA), Jefferson, AR, USA
| | - Huixiao Hong
- National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration (FDA), Jefferson, AR, USA
| | - Weida Tong
- National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration (FDA), Jefferson, AR, USA
| | - Baitang Ning
- National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration (FDA), Jefferson, AR, USA
|
44
|
Wang P, Tong W, Wang Q. Combined transabdominal-transanal surgical approach for iatrogenic rectovaginal fistula: two case reports. Ann R Coll Surg Engl 2022; 104:50-53. [PMID: 35100847 DOI: 10.1308/rcsann.2021.1352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023] Open
Abstract
Rectovaginal fistula (RVF) is a type of anastomotic leakage that may occur after low anterior resection for rectal cancer. The repair of RVF can be challenging because of the scar tissue stenosis and incomplete obstruction. Two patients presented in our department with vaginal faecal discharge almost 7 months after the radical resection of rectal cancer. On vaginal examination, titanium nails related to the rectal surgery were found in the vaginal wall. The patients were diagnosed with RVF. Considering that RVF positions in the patients were high and might adhere to the pelvic tissue, a combined transabdominal-transanal resection and vaginal repair surgery was performed. About 3 months after surgery, both patients underwent colonic closure surgery, with consequent good recovery. A combined transabdominal-transanal approach may provide distinct advantages in surgical repair of difficult cases of RVF.
Affiliation(s)
- P Wang
- First Hospital of Jilin University, China
| | - W Tong
- First Hospital of Jilin University, China
| | - Q Wang
- First Hospital of Jilin University, China
|
45
|
Reis ALM, Deveson IW, Madala BS, Wong T, Barker C, Xu J, Lennon N, Tong W, Mercer TR. Using synthetic chromosome controls to evaluate the sequencing of difficult regions within the human genome. Genome Biol 2022; 23:19. [PMID: 35022065 PMCID: PMC8753822 DOI: 10.1186/s13059-021-02579-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/30/2021] [Accepted: 12/16/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Next-generation sequencing (NGS) can identify mutations in the human genome that cause disease and has been widely adopted in clinical diagnosis. However, the human genome contains many polymorphic, low-complexity, and repetitive regions that are difficult to sequence and analyze. Despite their difficulty, these regions include many clinically important sequences that can inform the treatment of human diseases and improve the diagnostic yield of NGS. RESULTS: To evaluate the accuracy with which these difficult regions are analyzed with NGS, we built an in silico decoy chromosome, along with corresponding synthetic DNA reference controls, that encode difficult and clinically important human genome regions, including repeats, microsatellites, HLA genes, and immune receptors. These controls provide a known ground-truth reference against which to measure the performance of diverse sequencing technologies, reagents, and bioinformatic tools. Using this approach, we provide a comprehensive evaluation of short- and long-read sequencing instruments, library preparation methods, and software tools and identify the errors and systematic bias that confound our resolution of these remaining difficult regions. CONCLUSIONS: This study provides an analytical validation of diagnosis using NGS in difficult regions of the human genome and highlights the challenges that remain to resolve these difficult regions.
Affiliation(s)
- Andre L M Reis
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Ira W Deveson
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - Bindu Swapna Madala
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Ted Wong
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Chris Barker
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Niall Lennon
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tim R Mercer
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Australia
|
46
|
Pan B, Ren L, Onuchic V, Guan M, Kusko R, Bruinsma S, Trigg L, Scherer A, Ning B, Zhang C, Glidewell-Kenney C, Xiao C, Donaldson E, Sedlazeck FJ, Schroth G, Yavas G, Grunenwald H, Chen H, Meinholz H, Meehan J, Wang J, Yang J, Foox J, Shang J, Miclaus K, Dong L, Shi L, Mohiyuddin M, Pirooznia M, Gong P, Golshani R, Wolfinger R, Lababidi S, Sahraeian SME, Sherry S, Han T, Chen T, Shi T, Hou W, Ge W, Zou W, Guo W, Bao W, Xiao W, Fan X, Gondo Y, Yu Y, Zhao Y, Su Z, Liu Z, Tong W, Xiao W, Zook JM, Zheng Y, Hong H. Assessing reproducibility of inherited variants detected with short-read whole genome sequencing. Genome Biol 2022; 23:2. [PMID: 34980216 PMCID: PMC8722114 DOI: 10.1186/s13059-021-02569-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/22/2021] [Accepted: 12/06/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. RESULTS: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. CONCLUSIONS: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.
Affiliation(s)
- Bohu Pan
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Len Trigg
- Real Time Genomics, Hamilton, New Zealand
| | - Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC- European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
| | - Baitang Ning
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Eric Donaldson
- Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, 20993, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Gokhan Yavas
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Joe Meehan
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Jing Wang
- Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
| | - Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Lianhua Dong
- Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Mehdi Pirooznia
- Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Ping Gong
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
| | - Samir Lababidi
- Office of Health Informatics, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, 20993, USA
| | - Steve Sherry
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Tao Han
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tao Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tieliu Shi
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Weigong Ge
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wen Zou
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenjing Guo
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenjun Bao
- SAS Institute Inc., Cary, NC, 27513, USA
| | - Wenzhong Xiao
- Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA, 94305, USA
| | - Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Yoichi Gondo
- Department of Molecular Life Sciences, Tokai University School of Medicine, 143 Shimokasuya, Isehara, 259-1193, Japan
| | - Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Yongmei Zhao
- CCR-SF Bioinformatics Group, Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science, Frederick National Laboratory for Cancer Research, Frederick, MD, 21701, USA
| | - Zhenqiang Su
- Takeda Pharmaceuticals, Cambridge, MA, 02139, USA
| | - Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenming Xiao
- Division of Molecular Genetics and Pathology, Center for Device and Radiological Health, US Food and Drug Administration, Silver Spring, MD, 20993, USA
| | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.
- Human Phenome Institute, Fudan University, Shanghai, 200438, China.
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA.
|
47
|
Anklam E, Bahl MI, Ball R, Beger RD, Cohen J, Fitzpatrick S, Girard P, Halamoda-Kenzaoui B, Hinton D, Hirose A, Hoeveler A, Honma M, Hugas M, Ishida S, Kass GEN, Kojima H, Krefting I, Liachenko S, Liu Y, Masters S, Marx U, McCarthy T, Mercer T, Patri A, Pelaez C, Pirmohamed M, Platz S, Ribeiro AJS, Rodricks JV, Rusyn I, Salek RM, Schoonjans R, Silva P, Svendsen CN, Sumner S, Sung K, Tagle D, Tong L, Tong W, van den Eijnden-van-Raaij J, Vary N, Wang T, Waterton J, Wang M, Wen H, Wishart D, Yuan Y, Slikker Jr. W. Emerging technologies and their impact on regulatory science. Exp Biol Med (Maywood) 2022; 247:1-75. [PMID: 34783606 PMCID: PMC8749227 DOI: 10.1177/15353702211052280] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022] Open
Abstract
There is an evolving and increasing need to use emerging cellular, molecular and in silico technologies and novel approaches for safety assessment of food, drugs, and personal care products. Convergence of these emerging technologies is also enabling rapid advances and approaches that may impact regulatory decisions and approvals. Although the development of emerging technologies may allow rapid advances in regulatory decision making, there is concern that these new technologies have not been thoroughly evaluated to determine if they are ready for regulatory application, singly or in combination. The magnitude of these combined technical advances may outpace the ability to assess fitness for purpose and to allow routine application of these new methods for regulatory purposes. There is a need to develop strategies to evaluate the new technologies to determine which ones are ready for regulatory use. The opportunity to apply these potentially faster, more accurate, and cost-effective approaches remains an important goal to facilitate their incorporation into regulatory use. However, without a clear strategy to evaluate emerging technologies rapidly and appropriately, the value of these efforts may go unrecognized or may take longer to be realized. It is important for the regulatory science field to keep up with the research in these technically advanced areas and to understand the science behind these new approaches. The regulatory field must understand the critical quality attributes of these novel approaches and learn from each other's experience so that workforces can be trained to prepare for emerging global regulatory challenges. Moreover, it is essential that the regulatory community work with the technology developers to harness collective capabilities towards developing a strategy for evaluation of these novel assessment tools.
Affiliation(s)
| | - Reza M Salek
- International Agency for Research on Cancer, France
| | - Li Tong
- Universities of Georgia Tech and Emory, USA
| | - Neil Vary
- Canadian Food Inspection Agency, Canada
| | - Tao Wang
- National Medical Products Administration, China
| | - May Wang
- Universities of Georgia Tech and Emory, USA
| | - Hairuo Wen
- National Institutes for Food and Drug Control, China
|
48
|
Abstract
Liver toxicity is a major adverse drug reaction that accounts for drug failure in clinical trials and withdrawal from the market. Therefore, predicting potential liver toxicity at an early stage in drug discovery is crucial to reduce costs and the potential for drug failure. However, current in vivo animal toxicity testing is very expensive and time consuming. As an alternative approach, various machine learning models have been developed to predict potential liver toxicity in humans. This chapter reviews current advances in the development and application of machine learning models for prediction of potential liver toxicity in humans and discusses possible improvements to liver toxicity prediction.
Affiliation(s)
- Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Sugunadevi Sakkiah
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Zuowei Ji
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Gokhan Yavas
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Wen Zou
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Minjun Chen
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA.
|
49
|
Abstract
Animal studies are a critical component in biomedical research, pharmaceutical product development, and regulatory submissions. There is a worldwide effort in toxicology towards "reducing, refining and replacing" (3Rs) animal use. Here, we proposed a deep generative adversarial network (GAN)-based framework capable of deriving new animal results from existing animal studies without additional experiments. To prove the concept, we employed this Tox-GAN framework to generate both gene activities and expression profiles for multiple doses and treatment durations in toxicogenomics (TGx). Using the pre-existing rat liver TGx data from the Open TG-GATEs, we generated Tox-GAN transcriptomic profiles with high similarity (0.997 ± 0.002 in intensity and 0.740 ± 0.082 in fold change) to the corresponding real gene expression profiles. Consequently, Tox-GAN showed an outstanding performance in two critical TGx applications, gaining a molecular understanding of underlying toxicological mechanisms and gene expression-based biomarker development. For the former, over 87% agreement in Gene Ontology was found between Tox-GAN results and real gene expression data. For the latter, the concordance of biomarkers between real and generated data was high in both predictive performance and biomarker genes. We also demonstrated that the Tox-GAN models constructed with TG-GATEs data were capable of generating transcriptomic profiles reported in DrugMatrix. Finally, we demonstrated potential utility for Tox-GAN in aiding chemical-based read-across. To the best of our knowledge, the proposed Tox-GAN model is novel in its ability to generate in vivo transcriptomic profiles at different treatment conditions from chemical structures. Overall, Tox-GAN holds great promise for generating high-quality toxicogenomic profiles without animal experimentation.
Affiliation(s)
- Xi Chen
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, USA
| | - Ruth Roberts
- ApconiX Ltd, Alderley Edge SK10 4TG, UK
- Department of Biosciences, University of Birmingham, Birmingham B15 2TT, UK
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, USA
| | - Zhichao Liu
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, USA
|
50
|
Wu Y, Liu Z, Wu L, Chen M, Tong W. BERT-Based Natural Language Processing of Drug Labeling Documents: A Case Study for Classifying Drug-Induced Liver Injury Risk. Front Artif Intell 2021; 4:729834. [PMID: 34939028 PMCID: PMC8685544 DOI: 10.3389/frai.2021.729834] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/23/2021] [Accepted: 11/17/2021] [Indexed: 11/16/2022] Open
Abstract
Background & Aims: The United States Food and Drug Administration (FDA) regulates a broad range of consumer products, which account for about 25% of the United States market. FDA regulatory activities often involve producing and reading a large number of documents, which is time-consuming and labor-intensive. To support regulatory science at FDA, we evaluated artificial intelligence (AI)-based natural language processing (NLP) of regulatory documents for text classification and compared deep learning-based models with a conventional keywords-based model. Methods: FDA drug labeling documents were used as a representative regulatory data source to classify drug-induced liver injury (DILI) risk by employing the state-of-the-art language model BERT. The resulting NLP-DILI classification model was statistically validated with both internal and external validation procedures and applied to the labeling data from the European Medicines Agency (EMA) for cross-agency application. Results: The NLP-DILI model developed using FDA labeling documents and evaluated by cross-validation in this study showed remarkable performance in DILI classification, with a recall of 1 and a precision of 0.78. When cross-agency data were used to validate the model, the performance remained comparable, demonstrating that the model was portable across agencies. Results also suggested that the model was able to capture the semantic meanings of sentences in drug labeling. Conclusion: Deep learning-based NLP models performed well in DILI classification of drug labeling documents and learned the meanings of complex text in drug labeling. This proof-of-concept work demonstrated that using AI technologies to assist regulatory activities is a promising approach to modernize and advance regulatory science.
Affiliation(s)
- Yue Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, United States
| | - Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, United States
| | - Leihong Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, United States
| | - Minjun Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, United States
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, United States
|