1
Kajtazi A, Kajtazi M, Santos Barbetta MF, Bandini E, Eghbali H, Lynen F. Prediction of Retention Indices in LC-HRMS for Enhanced Structural Identification of Organic Micropollutants in Water: Selectivity-Based Filtration. Anal Chem 2025; 97:65-74. [PMID: 39752599 DOI: 10.1021/acs.analchem.4c01784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Abstract
Addressing the global challenge of ensuring access to safe drinking water, especially in developing countries, demands cost-effective, eco-friendly, and readily available technologies. The persistence, toxicity, and bioaccumulation potential of organic pollutants arising from various human activities pose substantial hurdles. While high-performance liquid chromatography coupled with high-resolution mass spectrometry (HPLC-HRMS) is a widely utilized technique for identifying pollutants in water, the multitude of structures for a single elemental composition complicates structural identification. While current HRMS and MS/MS databases can often provide hits for known molecules, these are often erroneous or misleading when authentic standards are unavailable. In this research, a machine-learning algorithm is developed to support the structural elucidation of small organic pollutants in water, with a focus on carbon-, oxygen-, and hydrogen-based molecules weighing less than 500 Da. The approach relies on a comparison of the experimental and predicted retention of the possible structures of unknowns for which an elemental composition was obtained by HRMS. A promising novelty is the improved removal of erroneous structures via the combination of the retention information obtained from two reversed-phase-based stationary phases exhibiting different selectivities (octadecylsilica, C18, and pentafluorophenylsilica, F5). The study translates retention times into retention indices for instrument independence and transferability across diverse HPLC-HRMS systems. The predictive algorithm, utilizing retention data and molecular descriptors, accurately predicts retention indices and proves its utility by eliminating incorrect structural formulas through a 2-stationary phase intersection-based filtration.
Using a data set of 100 training compounds and 16 external test set compounds, two Multiple Linear Regression (MLR) models, MLR-C18 and MLR-F5, were developed, employing the 16 most influential descriptors out of 5666 screened. MLR-C18 achieves precise RI predictions (R² = 0.97, RMSE = 36, MAE = 26), while MLR-F5, though slightly less accurate, still performs well (R² = 0.96, RMSE = 44, MAE = 34). The intersection-based filtration (within ±1.5σ) eliminated more than 70% of impossible structures for a given elemental composition. The model was further applied to a drinking water sample to demonstrate its potential. This tool holds significant promise for supporting water quality management and sustainable practices, contributing to faster structural identification of unknown organic micropollutants in water.
Affiliation(s)
- Ardiana Kajtazi
  - Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium
- Marin Kajtazi
  - Faculty of Mechanical Engineering and Naval Architecture, University of Zagreb, Ul. Ivana Lučića 5, 10000 Zagreb, Croatia
- Maike Felipe Santos Barbetta
  - Department of Chemistry, Faculty of Philosophy, Science and Letters at Ribeirão Preto, University of São Paulo, 14040-901 Ribeirão Preto, SP, Brazil
- Elena Bandini
  - Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium
- Hamed Eghbali
  - Packaging and Specialty Plastics R&D, Dow Benelux B.V., Terneuzen 4530 AA, The Netherlands
- Frédéric Lynen
  - Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium
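The two-stationary-phase intersection filtration described in the abstract of entry 1 can be illustrated with a minimal Python sketch. It rests on assumptions: the abstract does not define σ, so the reported RMSE of each model (36 for C18, 44 for F5) is used here as a stand-in, and the candidate names and retention index values are invented for illustration and are not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical candidate structure with predicted retention indices."""
    name: str
    ri_c18: float  # predicted RI on the C18 phase
    ri_f5: float   # predicted RI on the F5 phase


def within(predicted: float, experimental: float, sigma: float, k: float = 1.5) -> bool:
    """True if the prediction lies inside experimental +/- k*sigma."""
    return abs(predicted - experimental) <= k * sigma


def intersection_filter(cands, exp_c18, exp_f5, sigma_c18, sigma_f5, k=1.5):
    """Keep only candidates that fall inside the tolerance window on BOTH phases."""
    return [
        c for c in cands
        if within(c.ri_c18, exp_c18, sigma_c18, k)
        and within(c.ri_f5, exp_f5, sigma_f5, k)
    ]


# Illustrative values only (assumption, not taken from the paper).
candidates = [
    Candidate("isomer_A", 512.0, 530.0),  # fits both phases
    Candidate("isomer_B", 580.0, 525.0),  # fails on C18 only
    Candidate("isomer_C", 505.0, 640.0),  # fails on F5 only
    Candidate("isomer_D", 700.0, 710.0),  # fails on both
]

kept = intersection_filter(
    candidates, exp_c18=510.0, exp_f5=535.0, sigma_c18=36.0, sigma_f5=44.0
)
print([c.name for c in kept])  # ['isomer_A']
```

In this toy case a C18-only filter would retain both isomer_A and isomer_C; the second phase is what removes isomer_C, which is the benefit the abstract attributes to combining two selectivities.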
2
Zhou JY, Chen YQ, Hu G, Zhao H, Wan JB. An integrated strategy for in-depth profiling of N-acylethanolamines in biological samples by UHPLC-HRMS. Anal Chim Acta 2024; 1329:343262. [PMID: 39396319 DOI: 10.1016/j.aca.2024.343262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Revised: 09/15/2024] [Accepted: 09/18/2024] [Indexed: 10/15/2024]
Abstract
BACKGROUND N-acylethanolamines (NAEs) are a class of naturally occurring bioactive lipids that play crucial roles in various physiological processes, particularly exhibiting neuroprotective and anti-inflammatory properties. However, the comprehensive profiling of endogenous NAEs in complex biological matrices is challenging due to their low abundance, structural similarity and the limited availability of commercial standards. Here, we propose an integrated strategy for comprehensive profiling of NAEs that combines chemical derivatization and a three-dimensional (3D) prediction model based on quantitative structure-retention time relationship (QSRR) using liquid chromatography coupled with high-resolution tandem mass spectrometry (LC-HRMS). RESULTS After acetyl chloride (ACC) derivatization, the detection sensitivity of NAEs was significantly improved. We developed a QSRR prediction model to construct an in-house database for 141 NAEs, encompassing information on RT, MS1 (m/z), and MS/MS spectra. Propargylamine-labeled fatty acids were synthesized as RT calibrants across various analytical conditions to enhance the robustness of the RT prediction model. NAEs in biological samples were then in-depth profiled using parallel reaction monitoring (PRM) acquisition. This integrated strategy identified and annotated a total of 50 NAEs across serum, hippocampus and cortex tissues from a 5xFAD mouse model of Alzheimer's disease (AD). Notably, the levels of polyunsaturated NAEs, particularly NAE 20:5 and NAE 22:6, were significantly decreased in 5xFAD mice compared to WT mice, as confirmed by accurate quantitation using ACC-d0/d3 derivatization. SIGNIFICANCE Our integrated strategy exhibits great potential for the in-depth profiling of NAEs in complex biological samples, facilitating the elucidation of NAE functions in diverse physiological and pathological processes.
Affiliation(s)
- Jun-Yi Zhou
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macao, China
- Yan-Qing Chen
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macao, China
- Guang Hu
  - School of Pharmacy and Bioengineering, Chongqing University of Technology, Chongqing, 400054, China.
- Haiyu Zhao
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China.
- Jian-Bo Wan
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macao, China.
3
Liu Y, Yoshizawa AC, Ling Y, Okuda S. Insights into predicting small molecule retention times in liquid chromatography using deep learning. J Cheminform 2024; 16:113. [PMID: 39375739 PMCID: PMC11460055 DOI: 10.1186/s13321-024-00905-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 09/13/2024] [Indexed: 10/09/2024] Open
Abstract
In untargeted metabolomics, structures of small molecules are annotated using liquid chromatography-mass spectrometry by leveraging information from the molecular retention time (RT) in the chromatogram and m/z (formerly called "mass-to-charge ratio") in the mass spectrum. However, correct identification of metabolites is challenging due to the vast array of small molecules. Therefore, various in silico tools for mass spectrometry peak alignment and compound prediction have been developed; however, the list of candidate compounds remains extensive. Accurate RT prediction is important to exclude false candidates and facilitate metabolite annotation. Recent advancements in artificial intelligence (AI) have led to significant breakthroughs in the use of deep learning models in various fields. Release of a large RT dataset has mitigated the bottlenecks limiting the application of deep learning models, thereby improving their application in RT prediction tasks. This review lists the databases that can be used to expand training datasets and addresses the issue of molecular representation inconsistencies in datasets. It also discusses the application of AI technology for RT prediction, particularly in the 5 years following the release of the METLIN small molecule RT dataset. This review provides a comprehensive overview of the AI applications used for RT prediction, highlighting the progress and remaining challenges. SCIENTIFIC CONTRIBUTION: This article focuses on the advancements in small molecule retention time prediction in computational metabolomics over the past five years, with a particular emphasis on the application of AI technologies in this field. It reviews the publicly available datasets for small molecule retention time, the molecular representation methods, and the AI algorithms applied in recent studies. Furthermore, it discusses the effectiveness of these models in assisting with the annotation of small molecule structures and the challenges that must be addressed to achieve practical applications.
Affiliation(s)
- Yuting Liu
  - Medical AI Center, Niigata University School of Medicine, Niigata City, Niigata, 951-8514, Japan
- Akiyasu C Yoshizawa
  - Medical AI Center, Niigata University School of Medicine, Niigata City, Niigata, 951-8514, Japan
- Yiwei Ling
  - Medical AI Center, Niigata University School of Medicine, Niigata City, Niigata, 951-8514, Japan
- Shujiro Okuda
  - Medical AI Center, Niigata University School of Medicine, Niigata City, Niigata, 951-8514, Japan.
4
Rutan SC, Kempen T, Dahlseid T, Kruger Z, Pirok B, Shackman JG, Zhou Y, Wang Q, Stoll DR. Improved hydrophobic subtraction model of reversed-phase liquid chromatography selectivity based on a large dataset with a focus on isomer selectivity. J Chromatogr A 2024; 1731:465127. [PMID: 39053256 DOI: 10.1016/j.chroma.2024.465127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 06/13/2024] [Accepted: 06/28/2024] [Indexed: 07/27/2024]
Abstract
Reversed-phase (RP) liquid chromatography is an important tool for the characterization of materials and products in the pharmaceutical industry. Method development is still challenging in this application space, particularly when dealing with closely-related compounds. Models of chromatographic selectivity are useful for predicting which columns out of the hundreds that are available are likely to have very similar, or different, selectivity for the application at hand. The hydrophobic subtraction model (HSM1) has been widely employed for this purpose; the column database for this model currently stands at 750 columns. In previous work we explored a refinement of the original HSM1 (HSM2) and found that increasing the size of the dataset used to train the model dramatically reduced the number of gross errors in predictions of selectivity made using the model. In this paper we describe further work in this direction (HSM3), this time based on a much larger solute set (1014 solute/stationary phase combinations) containing selectivities for compounds covering a broader range of physicochemical properties compared to HSM1. The molecular weight range was doubled, and the range of the logarithm of the octanol/water partition coefficients was increased slightly. The number of active pharmaceutical ingredients and related synthetic intermediates and impurities was increased from four to 28, and ten pairs of closely related structures (e.g., geometric and cis-/trans- isomers) were included. The HSM3 model is based on retention measurements for 75 compounds using 13 RP stationary phases and a mobile phase of 40/60 acetonitrile/25 mM ammonium formate buffer at pH 3.2. This data-driven model produced predictions of ln α (chromatographic selectivity using ethylbenzene as the reference compound) with average absolute errors of approximately 0.033, which corresponds to errors in α of about 3 %. 
In some cases, the prediction of the trans-/cis- selectivities for positional and geometric isomers was relatively accurate, and the driving forces for the observed selectivity could be inferred by examination of the relative magnitudes of the terms in the HSM3 model. For some geometric isomer pairs the interactions mainly responsible for the observed selectivities could not be rationalized due to large uncertainties for particular terms in the model. This suggests that more work is needed in the future to explore other HSM-type models and continue expanding the training dataset in order to continue improving the predictive accuracy of these models. Additionally, we release with this paper a much larger data set (43,329 total retention measurements) at multiple mobile phase compositions, to enable other researchers to pursue their own lines of inquiry related to RP selectivity.
Affiliation(s)
- Sarah C Rutan
  - Department of Chemistry, Virginia Commonwealth University, Box 842006, Richmond, VA 23284-2006, USA
- Trevor Kempen
  - Department of Chemistry, Gustavus Adolphus College, 800 W. College Ave., St. Peter, MN 56082, USA
- Tina Dahlseid
  - Department of Chemistry, Gustavus Adolphus College, 800 W. College Ave., St. Peter, MN 56082, USA
- Zachary Kruger
  - Department of Chemistry, Gustavus Adolphus College, 800 W. College Ave., St. Peter, MN 56082, USA
- Bob Pirok
  - Department of Chemistry, Gustavus Adolphus College, 800 W. College Ave., St. Peter, MN 56082, USA
- Jonathan G Shackman
  - Chemical Process Development, Bristol Myers Squibb, 1 Squibb Dr., New Brunswick, NJ 08903, USA
- Yiyang Zhou
  - Chemical Process Development, Bristol Myers Squibb, 1 Squibb Dr., New Brunswick, NJ 08903, USA
- Qinggang Wang
  - Chemical Process Development, Bristol Myers Squibb, 1 Squibb Dr., New Brunswick, NJ 08903, USA
- Dwight R Stoll
  - Department of Chemistry, Gustavus Adolphus College, 800 W. College Ave., St. Peter, MN 56082, USA.
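The abstract of entry 4 states that an average absolute error of approximately 0.033 in ln α corresponds to an error of about 3 % in α. A short worked check of that conversion, using only the figure quoted in the abstract (the subscripts "pred" and "true" are labels introduced here for illustration):

```latex
\[
\frac{\alpha_{\text{pred}}}{\alpha_{\text{true}}}
  = e^{\,\ln\alpha_{\text{pred}} - \ln\alpha_{\text{true}}}
  = e^{\pm 0.033},
\qquad
e^{0.033} - 1 \approx 0.0336,
\qquad
1 - e^{-0.033} \approx 0.0325 .
\]
```

Both directions give a relative deviation of roughly 3 %, consistent with the first-order approximation that a small error in ln α equals the relative error in α.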
5
Bosten E, Pardon M, Chen K, Koppen V, Van Herck G, Hellings M, Cabooter D. Assisted Active Learning for Model-Based Method Development in Liquid Chromatography. Anal Chem 2024; 96:13699-13709. [PMID: 38979746 DOI: 10.1021/acs.analchem.4c02700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
In recent decades, there has been a growing interest in fully automated methods for tackling complex optimization problems across various fields. Active learning (AL) and its variant, assisted active learning (AAL), incorporating guidance or assistance from external sources into the learning process, play key roles in this automation by enabling the autonomous selection of optimal experimental conditions to efficiently explore the problem space. These approaches are particularly valuable in situations wherein experimentation is costly or time-consuming. This study explores the application of AAL in model-based method development (MD) for liquid chromatography (LC) by using Bayesian statistics to incorporate historical data and analyte information for the generation of initial retention models. The process involves updating the model parameters based on new experiments, coupled with an active data selection method to choose the most informative experiment to run in a subsequent step. This iterative process balances model exploitation and experimental exploration until a satisfactory separation is achieved. The effectiveness of this approach is demonstrated via two practical examples, resulting in optimized separations in a limited number of experiments by optimizing the gradient slope. It is shown that the ability of AAL to leverage past knowledge and compound information to improve accuracy and reduce experimental runs offers a flexible alternative approach to fixed design methods.
Affiliation(s)
- Emery Bosten
  - Department for Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, University of Leuven (KU Leuven), Herestraat 49, 3000 Leuven, Belgium
  - Therapeutics Development & Supply, Janssen Pharmaceutica, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Marie Pardon
  - Department for Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, University of Leuven (KU Leuven), Herestraat 49, 3000 Leuven, Belgium
- Kai Chen
  - Therapeutics Development & Supply, Janssen Pharmaceutica, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Valerie Koppen
  - Therapeutics Development & Supply, Janssen Pharmaceutica, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Gerd Van Herck
  - Therapeutics Development & Supply, Janssen Pharmaceutica, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Mario Hellings
  - Therapeutics Development & Supply, Janssen Pharmaceutica, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Deirdre Cabooter
  - Department for Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, University of Leuven (KU Leuven), Herestraat 49, 3000 Leuven, Belgium
6
Beck AG, Fine J, Aggarwal P, Regalado EL, Levorse D, De Jesus Silva J, Sherer EC. Machine learning models and performance dependency on 2D chemical descriptor space for retention time prediction of pharmaceuticals. J Chromatogr A 2024; 1730:465109. [PMID: 38968662 DOI: 10.1016/j.chroma.2024.465109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/07/2024]
Abstract
The predictive modeling of liquid chromatography methods can be an invaluable asset, potentially saving countless hours of labor while also reducing solvent consumption and waste. Tasks such as physicochemical screening and preliminary method screening systems where large amounts of chromatography data are collected from fast and routine operations are particularly well suited for both leveraging large datasets and benefiting from predictive models. Therefore, the generation of predictive models for retention time is an active area of development. However, for these predictive models to gain acceptance, researchers first must have confidence in model performance and the computational cost of building them should be minimal. In this study, a simple and cost-effective workflow for the development of machine learning models to predict retention time using only Molecular Operating Environment 2D descriptors as input for support vector regression is developed. Furthermore, we investigated the relative performance of models based on molecular descriptor space by utilizing uniform manifold approximation and projection and clustering with Gaussian mixture models to identify chemically distinct clusters. Results outlined herein demonstrate that local models trained on clusters in chemical space perform equivalently when compared to models trained on all data. Through 10-fold cross-validation on a comprehensive set containing 67,950 of our company's proprietary analytes, these models achieved coefficients of determination of 0.84 and 3 % error in terms of retention time. This promising statistical significance is found to translate from cross-validation to prospective prediction on an external test set of pharmaceutically relevant analytes. The observed equivalency of global and local modeling of large datasets is retained with METLIN's SMRT dataset, thereby confirming the wider applicability of the developed machine learning workflows for global models.
Affiliation(s)
- Armen G Beck
  - Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ 07065, USA
- Jonathan Fine
  - Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ 07065, USA
- Pankaj Aggarwal
  - Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ 07065, USA.
- Erik L Regalado
  - Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ 07065, USA
- Dorothy Levorse
  - Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ 07065, USA
- Edward C Sherer
  - Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ 07065, USA
7
Xue J, Wang B, Ji H, Li W. RT-Transformer: retention time prediction for metabolite annotation to assist in metabolite identification. Bioinformatics 2024; 40:btae084. [PMID: 38402516 PMCID: PMC10914443 DOI: 10.1093/bioinformatics/btae084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 01/14/2024] [Accepted: 02/22/2024] [Indexed: 02/26/2024] Open
Abstract
MOTIVATION: Liquid chromatography retention time prediction can assist in metabolite identification, which is a critical task and challenge in nontargeted metabolomics. However, different chromatographic conditions may result in different retention times for the same metabolite. Current retention time prediction methods lack sufficient scalability to transfer from one specific chromatographic method to another. RESULTS: Therefore, we present RT-Transformer, a novel deep neural network model coupled with a graph attention network and a 1D-Transformer, which can predict retention times under any chromatographic method. First, we obtain a pre-trained model by training RT-Transformer on the large small molecule retention time dataset containing 80 038 molecules, and then transfer the resulting model to different chromatographic methods based on transfer learning. When tested on the small molecule retention time dataset, as other authors did, the average absolute error reached 27.30 after removing non-retained molecules and 33.41 when no samples were removed. The pre-trained RT-Transformer was further transferred to 5 datasets corresponding to different chromatographic conditions and fine-tuned. According to the experimental results, RT-Transformer achieves competitive performance compared to state-of-the-art methods. In addition, RT-Transformer was applied to 41 external molecular retention time datasets. Extensive evaluations indicate that RT-Transformer has excellent scalability in predicting retention times for liquid chromatography and improves the accuracy of metabolite identification. AVAILABILITY AND IMPLEMENTATION: The source code for the model is available at https://github.com/01dadada/RT-Transformer. The web server is available at https://huggingface.co/spaces/Xue-Jun/RT-Transformer.
Affiliation(s)
- Jun Xue
  - School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650500, China
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Bingyi Wang
  - Yunnan Police College, Kunming, Yunnan 650223, China
  - Key Laboratory of Smart Drugs Control (Yunnan Police College), Ministry of Education, Kunming, Yunnan 650223, China
- Hongchao Ji
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- WeiHua Li
  - School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650500, China
8
Passarin PBS, Lourenço FR. Enhancing analytical development in the pharmaceutical industry: A DoE-QSRR model for virtual Method Operable Design Region assessment. J Pharm Biomed Anal 2024; 239:115907. [PMID: 38103415 DOI: 10.1016/j.jpba.2023.115907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 11/17/2023] [Accepted: 12/04/2023] [Indexed: 12/19/2023]
Abstract
Recently, the pharmaceutical industry has increasingly adopted the Analytical Quality by Design (AQbD) approach for analytical development. To facilitate implementation of the AQbD approach in the development of chromatographic methods for determining cephalosporin antibiotics, an in silico tool capable of performing virtual DoEs was developed, making it possible to obtain virtual Method Operable Design Regions (MODRs). To this end, the drugs cephalexin, cefazolin, cefotaxime and ceftriaxone were analyzed using four experimental designs, deriving a DoE-QSRR model and employing the Monte Carlo method. The DoE-QSRR model and virtual DoEs were validated using data not used in the model's construction, obtaining coefficients of determination of 84.72 % for the DoE-QSRR model and over 77 % for the virtual DoEs. Virtual MODRs were constructed using data from the virtual DoEs. The virtual MODRs were validated by comparing them with experimental MODRs under various scenarios, with overlap areas exceeding 84 %. Therefore, the in silico tool was considered suitable for indicating analyte trends under different analytical conditions, being capable of performing virtual DoEs for cephalosporin drugs with sufficient accuracy to guide analytical development and to yield a MODR capable of providing results of adequate quality.
Affiliation(s)
- Paula Beatriz Silva Passarin
  - Faculty of Pharmaceutical Sciences, Department of Pharmacy, University of São Paulo, Avenida Professor Lineu Prestes 508, Butantan, São Paulo, SP, Brazil
- Felipe Rebello Lourenço
  - Faculty of Pharmaceutical Sciences, Department of Pharmacy, University of São Paulo, Avenida Professor Lineu Prestes 508, Butantan, São Paulo, SP, Brazil.
9
Kang Q, Fang P, Zhang S, Qiu H, Lan Z. Deep graph convolutional network for small-molecule retention time prediction. J Chromatogr A 2023; 1711:464439. [PMID: 37865024 DOI: 10.1016/j.chroma.2023.464439] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/10/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 10/23/2023]
Abstract
The retention time (RT) is a crucial source of data for liquid chromatography-mass spectrometry (LCMS). A model that can accurately predict the RT for each molecule would make it possible to filter out candidates with similar spectra but differing RTs in LCMS-based molecule identification. Recent research shows that graph neural networks (GNNs) outperform traditional machine learning algorithms in RT prediction. However, all of these models use relatively shallow GNNs. This study investigates for the first time how depth affects GNNs' performance on RT prediction. The results demonstrate that a notable improvement can be achieved by pushing the depth of GNNs to 16 layers through the adoption of residual connections. Additionally, we find that the graph convolutional network (GCN) model benefits from edge information. The developed deep graph convolutional network, DeepGCN-RT, significantly outperforms the previous state-of-the-art method and achieves the lowest mean absolute percentage error (MAPE) of 3.3% and the lowest mean absolute error (MAE) of 26.55 s on the SMRT test set. We also fine-tune DeepGCN-RT on seven datasets with various chromatographic conditions. The mean MAE across the seven datasets decreases by 30% compared to the previous state-of-the-art method. On the RIKEN-PlaSMA dataset, we also test the effectiveness of DeepGCN-RT in assisting molecular structure identification. By reducing the number of potential structures by 30%, DeepGCN-RT is able to improve top-1 accuracy by about 11%.
Affiliation(s)
- Qiyue Kang
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.
- Pengfei Fang
  - School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Shuai Zhang
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Huachuan Qiu
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Zhenzhong Lan
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.
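The two error metrics quoted in the abstract of entry 9, mean absolute error (MAE) and mean absolute percentage error (MAPE), can be computed as in the following minimal Python sketch. It uses the standard textbook definitions; the authors' own evaluation code is not shown in the abstract, and the retention times below are invented for illustration.

```python
def mae(observed, predicted):
    """Mean absolute error, in the same unit as the inputs (e.g. seconds)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error, relative to the observed values."""
    return 100.0 * sum(
        abs(o - p) / abs(o) for o, p in zip(observed, predicted)
    ) / len(observed)


# Illustrative retention times in seconds (assumption, not data from the paper).
observed_rt = [600.0, 800.0, 1000.0]
predicted_rt = [630.0, 780.0, 1010.0]

print(mae(observed_rt, predicted_rt))             # 20.0
print(round(mape(observed_rt, predicted_rt), 2))  # 2.83
```

MAPE weights each error by the observed retention time, so the same absolute error counts more for early-eluting compounds than for late-eluting ones; reporting both metrics, as the abstract does, covers both views.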
10
Kajtazi A, Russo G, Wicht K, Eghbali H, Lynen F. Facilitating structural elucidation of small environmental solutes in RPLC-HRMS by retention index prediction. Chemosphere 2023; 337:139361. [PMID: 37392796 DOI: 10.1016/j.chemosphere.2023.139361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 06/06/2023] [Accepted: 06/26/2023] [Indexed: 07/03/2023]
Abstract
Implementing effective environmental management strategies requires a comprehensive understanding of the chemical composition of environmental pollutants, particularly in complex mixtures. Utilizing innovative analytical techniques, such as high-resolution mass spectrometry and predictive retention index models, can provide valuable insights into the molecular structures of environmental contaminants. Liquid Chromatography-High-Resolution Mass Spectrometry is a powerful tool for the identification of isomeric structures in complex samples. However, there are some limitations that can prevent accurate isomeric structure identification, particularly in cases where the isomers have similar mass and fragmentation patterns. Liquid chromatographic retention, determined by the size, shape, and polarity of the analyte and its interactions with the stationary phase, contains valuable 3D structural information that is vastly underutilized. Therefore, a predictive retention index model is developed which is transferrable to LC-HRMS systems and can assist in the structural elucidation of unknowns. The approach is currently restricted to carbon, hydrogen, and oxygen-based molecules <500 g mol-1. The methodology facilitates the acceptance of accurate structural formulas and the exclusion of erroneous hypothetical structural representations by leveraging retention time estimations, thereby providing a permissible tolerance range for a given elemental composition and experimental retention time. This approach serves as a proof of concept for the development of a Quantitative Structure-Retention Relationship model using a generic gradient LC approach. The use of a widely used reversed-phase (U)HPLC column and a relatively large set of training (101) and test compounds (14) demonstrates the feasibility and potential applicability of this approach for predicting the retention behaviour of compounds in complex mixtures. 
By providing a standard operating procedure, this approach can be easily replicated and applied to various analytical challenges, further supporting its potential for broader implementation.
Affiliation(s)
- Ardiana Kajtazi
- Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium
| | - Giacomo Russo
- School of Applied Sciences, Sighthill Campus, Edinburgh Napier University, 9 Sighthill Ct, EH11 4BN, Edinburgh, United Kingdom
| | - Kristina Wicht
- Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium
| | - Hamed Eghbali
- Packaging and Specialty Plastics R&D, Dow Benelux B.V., Terneuzen, 4530 AA, the Netherlands
| | - Frédéric Lynen
- Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium.
|
11
|
Ibrahim AE, El Gohary NA, Aboushady D, Samir L, Karim SEA, Herz M, Salman BI, Al-Harrasi A, Hanafi R, El Deeb S. Recent advances in chiral selectors immobilization and chiral mobile phase additives in liquid chromatographic enantio-separations: A review. J Chromatogr A 2023; 1706:464214. [PMID: 37506464 DOI: 10.1016/j.chroma.2023.464214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 07/10/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023]
Abstract
For decades now, the separation of drug enantiomers has been attracting the interest and attention of researchers. In 1991, the first guidelines for the development of chiral drugs were released by the US-FDA. Since then, the development of chromatographic enantioseparation tools has been fast and varied, aiming at creating a suitable environment where the physically and chemically identical enantiomers can be separated. Among those tools are the immobilization of chiral selectors (CS) on different stationary phases and chiral mobile phase additives (CMPA), both of which have been developed and studied extensively. This review article highlights the major advances in the immobilization of CS together with their different recognition mechanisms, as well as CMPA as a cheaper and successful alternative to chiral stationary phases. Moreover, the role of molecular modeling as a preliminary step in the choice of CS, used to evaluate possible interactions with different ligands, is pointed out. Illustrations of reported methods and updates for immobilized CS and CMPA are included.
Affiliation(s)
- Adel Ehab Ibrahim
- Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Port-Said University, Port-Said 42511, Egypt; Natural and Medical Sciences Research Center, University of Nizwa, P.O. Box 33, Birkat Al Mauz, Nizwa 616, Sultanate of Oman
| | - Nesrine Abdelrehim El Gohary
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Dina Aboushady
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Liza Samir
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Shereen Ekram Abdel Karim
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Magy Herz
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Baher I Salman
- Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, Assiut Branch, Assiut, 71524, Egypt
| | - Ahmed Al-Harrasi
- Natural and Medical Sciences Research Center, University of Nizwa, P.O. Box 33, Birkat Al Mauz, Nizwa 616, Sultanate of Oman
| | - Rasha Hanafi
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Sami El Deeb
- Institute of Medicinal and Pharmaceutical Chemistry, Technische Universität Braunschweig, Braunschweig 38092, Germany; Institute of Pharmacy, Freie Universität Berlin, Königin-Luise-Str. 2+4, 14195 Berlin, Germany.
|
12
|
Singh YR, Shah DB, Maheshwari DG, Shah JS, Shah S. Advances in AI-Driven Retention Prediction for Different Chromatographic Techniques: Unraveling the Complexity. Crit Rev Anal Chem 2023; 54:3559-3569. [PMID: 37672314 DOI: 10.1080/10408347.2023.2254379] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 09/07/2023]
Abstract
Retention prediction through artificial intelligence (AI)-based techniques has grown rapidly owing to their ability to process complex sets of data and to ease the crucial task of identifying and separating compounds in the most widely employed chromatographic techniques. Numerous approaches were reported for retention prediction in different chromatographic techniques, and consistent results demonstrated that the accuracy and effectiveness of deep learning models outclassed the linear machine learning models, mainly in liquid and gas chromatography, as ML algorithms use fewer complex data to train and predict information. Support vector machine-based neural networks were found to be the most utilized for the prediction of retention factors of different compounds in thin-layer chromatography. Cheminformatics, chemometrics, and hybrid approaches were also employed for the modeling and were more reliable in retention prediction than conventional models. Quantitative Structure-Retention Relationship (QSRR) modeling was also a potential method for predicting retention in different chromatographic techniques and determining the separation method for analytes. These techniques demonstrated the benefits of combining QSRR with AI-driven techniques to obtain more precise retention predictions. This review covers recent explorations of different AI-driven approaches employed for retention prediction in different chromatographic techniques and, given the lack of summarized literature, also aims to provide a comprehensive overview that will be useful to scientists exploring the field of AI in analytical chemistry.
Affiliation(s)
- Yash Raj Singh
- Department of Pharmaceutical Quality Assurance, L. J. Institute of Pharmacy, L J University, Ahmedabad, Gujarat, India
| | - Darshil B Shah
- Department of Pharmaceutical Quality Assurance, L. J. Institute of Pharmacy, L J University, Ahmedabad, Gujarat, India
| | - Dilip G Maheshwari
- Department of Pharmaceutical Quality Assurance, L. J. Institute of Pharmacy, L J University, Ahmedabad, Gujarat, India
| | - Jignesh S Shah
- Department of Pharmaceutical Regulatory Affairs, L. J. Institute of Pharmacy, L J University, Ahmedabad, Gujarat, India
| | - Shreeraj Shah
- Department of Pharmaceutical Technology, L. J. Institute of Pharmacy, L J University, Ahmedabad, Gujarat, India
|
13
|
Wardecki D, Dołowy M, Bober-Majnusz K. Evaluation of the Usefulness of Topological Indices for Predicting Selected Physicochemical Properties of Bioactive Substances with Anti-Androgenic and Hypouricemic Activity. Molecules 2023; 28:5822. [PMID: 37570792 PMCID: PMC10420683 DOI: 10.3390/molecules28155822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 07/27/2023] [Accepted: 07/31/2023] [Indexed: 08/13/2023] Open
Abstract
Due to the observed increase in the importance of computational methods in determining selected physicochemical parameters of biologically active compounds that are key to understanding their ADME/T profile, such as lipophilicity, there is a great need to work on accurate and precise in silico models based on some structural descriptors, such as topological indices for predicting lipophilicity of certain anti-androgenic and hypouricemic agents and their derivatives, for which the experimental lipophilicity parameter is not accurately described in the available literature, e.g., febuxostat, oxypurinol, ailanthone, abiraterone and teriflunomide. Therefore, the following topological indices were accurately calculated in this paper: Gutman (M, Mν), Randić (0χ, 1χ, 0χν, 1χν), Wiener (W), Rouvray-Crafford (R) and Pyka (A, 0B, 1B) for the selected anti-androgenic drugs (abiraterone, bicalutamide, flutamide, nilutamide, leflunomide, teriflunomide, ailanthone) and some hypouricemic compounds (allopurinol, oxypurinol, febuxostat). Linear regression analysis was used to create simple linear correlations between the newly calculated topological indices and some physicochemical parameters, including lipophilicity descriptors of the tested compounds (previously obtained by TLC and theoretical methods). Our studies confirmed the usefulness of the obtained linear regression equations based on topological indices to predict ADME/T important parameters, such as lipophilicity descriptors of tested compounds with anti-androgenic and hypouricemic effects. The proposed calculation method based on topological indices is fast, easy to use and avoids valuable and lengthy laboratory experiments required in the case of experimental ADME/T studies.
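Two of the topological indices named in this abstract have standard textbook definitions that can be computed directly from a hydrogen-suppressed molecular graph: the Wiener index W is the sum of shortest-path distances over all atom pairs, and the first-order Randić connectivity index ¹χ is the sum over bonds of 1/√(dᵢdⱼ), where d is the vertex degree. The sketch below assumes a plain adjacency-list representation and uses n-butane and isobutane as illustrative inputs; it is not the calculation procedure used in the paper.

```python
from collections import deque
from math import sqrt


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs (BFS per vertex)."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end


def randic_index(adj):
    """First-order Randic index: sum over edges of 1 / sqrt(deg(u) * deg(v))."""
    return sum(
        1.0 / sqrt(len(adj[u]) * len(adj[v]))
        for u in adj
        for v in adj[u]
        if u < v  # visit each undirected edge once
    )


# Hydrogen-suppressed carbon skeletons as adjacency lists (illustrative inputs).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

for name, graph in (("n-butane", n_butane), ("isobutane", isobutane)):
    print(name, wiener_index(graph), round(randic_index(graph), 4))
```

Both indices are lower for the branched isomer, which is the kind of structural sensitivity that makes them usable as regressors for lipophilicity.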
Affiliation(s)
- Dawid Wardecki
- Faculty of Pharmaceutical Sciences in Sosnowiec, Doctoral School, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland
| | - Małgorzata Dołowy
- Department of Analytical Chemistry, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland;
| | - Katarzyna Bober-Majnusz
- Department of Analytical Chemistry, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland;
|
14
|
Singh YR, Shah DB, Kulkarni M, Patel SR, Maheshwari DG, Shah JS, Shah S. Current trends in chromatographic prediction using artificial intelligence and machine learning. Analytical Methods: Advancing Methods and Applications 2023; 15:2785-2797. [PMID: 37264667 DOI: 10.1039/d3ay00362k] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/03/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) gained tremendous growth and are rapidly becoming popular in various fields of prediction due to their potential abilities, accuracy, and speed. Machine learning algorithms employ historical data to analyze or predict information using patterns or trends. AI and ML were most employed in chromatographic predictions and particularly attractive options for liquid chromatography method development, as they can help achieve desired results faster, more accurately, and more efficiently. This review aims at exploring various AI and ML models employed in the determination of chromatographic characteristics. This review also aims to provide deep insight into reported artificial neural network (ANN) associated techniques which maintained better accuracy and significant possibilities for chromatographic characteristics prediction in liquid chromatography over classical linear models and also emphasizes the integration of a fuzzy system with an ANN, as this integrated study provides more efficient and accurate methods in chromatographic prediction than other linear models. This study also focuses on the retention prediction of a target molecule employing QSRR methodology combined with an ANN, highlighting a more effective technique than the QSRR alone. This approach showed the benefits of combining AI or ML algorithms with the QSRR to obtain more accurate retention predictions, emphasizing the potential of artificial intelligence and machine learning for overcoming adversities in analytical chemistry.
Affiliation(s)
- Yash Raj Singh
- Department of Pharmaceutical Quality Assurance, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
| | - Darshil B Shah
- Department of Pharmaceutical Quality Assurance, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
| | - Mangesh Kulkarni
- Department of Pharmaceutical Technology, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
| | - Shreyanshu R Patel
- Department of Pharmaceutical Technology, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
| | - Dilip G Maheshwari
- Department of Pharmaceutical Quality Assurance, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
| | - Jignesh S Shah
- Department of Pharmaceutical Regulatory Affairs, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
| | - Shreeraj Shah
- Department of Pharmaceutical Technology, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
|
15
|
Chen X, Huang N, Wang W, Wang Q, Hu HY. Enrichment and analysis methods for trace dissolved organic carbon in reverse osmosis effluent: A review. The Science of the Total Environment 2023; 866:161393. [PMID: 36621505 DOI: 10.1016/j.scitotenv.2023.161393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 12/31/2022] [Accepted: 01/01/2023] [Indexed: 06/17/2023]
Abstract
Reverse osmosis (RO) is an essential unit for producing high-quality ultrapure water. The increasingly severe water shortage and water quality deterioration result in reclaimed water as an alternative source for ultrapure water production. However, when using reclaimed water as water sources, the dissolved organic carbon (DOC) in RO permeate exhibits higher concentration and more sophisticated components than when using clean water sources, thus affecting the effluent quality of ultrapure water and the effectiveness of subsequent treatment processes. To optimize the treatment processes, it is crucial to analyze the components of DOC. This review summarizes the enrichment and analysis methods of trace organic matter, and provides recommendations for the analysis and characterization of DOC in RO permeate. The study summarizes the operating conditions and enrichment properties of different enrichment methods, including solid-phase extraction, liquid-liquid extraction, purge-and-trap, lyophilization and rotary evaporation for low-concentration organic compounds, compares the applicability and limitations of different enrichment methods, and proposes the principles for the selection of enrichment methods. In this review, we discuss the application of mass spectrometry (including Fourier transform ion cyclotron resonance mass spectrometry) in the analysis of DOC components, and focus on data processing as the key procedure in analysis of DOC in RO permeate. Despite the advantages of mass spectrometry, an applicable workflow and open-source database are required to improve the reliability of the analysis. The treatability properties of DOC are suggested to be determined by analyzing the component characteristics or in combination with common removal techniques. This study provides theoretical support for a comprehensive analysis of DOC in RO permeates to improve the removal effect.
Affiliation(s)
- Xiaowen Chen
- Environmental Simulation and Pollution Control State Key Joint Laboratory, State Environmental Protection Key Laboratory of Microorganism Application and Risk Control (SMARC), School of Environment, Tsinghua University, Beijing 100084, PR China
| | - Nan Huang
- Department of Environmental Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China.
| | - Wenlong Wang
- Key Laboratory of Microorganism Application and Risk Control of Shenzhen, Guangdong Provincial Engineering Research Center for Urban Water Recycling and Environmental Safety, Institute of Environment and Ecology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, PR China
| | - Qi Wang
- Environmental Simulation and Pollution Control State Key Joint Laboratory, State Environmental Protection Key Laboratory of Microorganism Application and Risk Control (SMARC), School of Environment, Tsinghua University, Beijing 100084, PR China
| | - Hong-Ying Hu
- Environmental Simulation and Pollution Control State Key Joint Laboratory, State Environmental Protection Key Laboratory of Microorganism Application and Risk Control (SMARC), School of Environment, Tsinghua University, Beijing 100084, PR China; Beijing Laboratory for Environmental Frontier Technologies, Beijing 100084, PR China
|
16
|
Boelrijk J, van Herwerden D, Ensing B, Forré P, Samanipour S. Predicting RP-LC retention indices of structurally unknown chemicals from mass spectrometry data. J Cheminform 2023; 15:28. [PMID: 36829215 PMCID: PMC9960388 DOI: 10.1186/s13321-023-00699-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/29/2022] [Accepted: 02/13/2023] [Indexed: 02/26/2023] Open
Abstract
Non-target analysis combined with liquid chromatography high resolution mass spectrometry is considered one of the most comprehensive strategies for the detection and identification of known and unknown chemicals in complex samples. However, many compounds remain unidentified due to data complexity and the limited number of structures in chemical databases. In this work, we have developed and validated a novel machine learning algorithm to predict the retention index (r_i) values for structurally (un)known chemicals based on their measured fragmentation pattern. The developed model, for the first time, enabled the prediction of r_i values without the need for the exact structure of the chemicals, with an R² of 0.91 and 0.77 and root mean squared error (RMSE) of 47 and 67 r_i units for the NORMAN ([Formula: see text]) and amide ([Formula: see text]) test sets, respectively. This fragment-based model showed comparable accuracy in r_i prediction compared to conventional descriptor-based models that rely on known chemical structure, which obtained an R² of 0.85 with an RMSE of 67.
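The two figures of merit quoted in this abstract, the coefficient of determination and the root mean squared error, follow the usual definitions, which the following Python sketch spells out. The observed and predicted retention indices are made-up illustrative numbers, not data from the cited work.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical retention indices for four compounds (illustration only).
observed_ri = [100.0, 200.0, 300.0, 400.0]
predicted_ri = [110.0, 190.0, 310.0, 390.0]

print(rmse(observed_ri, predicted_ri))       # in the same units as the retention index
print(r_squared(observed_ri, predicted_ri))  # dimensionless, 1.0 would be a perfect fit
```

RMSE is reported in retention index units, which is why it can be compared directly with the width of a candidate-filtering window, whereas R² only describes the fraction of variance explained.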
Affiliation(s)
- Jim Boelrijk
- AI4Science Lab, University of Amsterdam, Amsterdam, The Netherlands; Institute for Informatics, University of Amsterdam, Amsterdam, The Netherlands
| | - Denice van Herwerden
- Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, The Netherlands
| | - Bernd Ensing
- AI4Science Lab, University of Amsterdam, Amsterdam, The Netherlands; Computational Chemistry Group, Van't Hoff Institute for Molecular Sciences (HIMS), Amsterdam, The Netherlands
| | - Patrick Forré
- AI4Science Lab, University of Amsterdam, Amsterdam, The Netherlands; Institute for Informatics, University of Amsterdam, Amsterdam, The Netherlands
| | - Saer Samanipour
- Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, The Netherlands; UvA Data Science Center, University of Amsterdam, Amsterdam, The Netherlands; Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Australia.
|
17
|
Svrkota B, Krmar J, Protić A, Otašević B. The secret of reversed-phase/weak cation exchange retention mechanisms in mixed-mode liquid chromatography applied for small drug molecule analysis. J Chromatogr A 2023; 1690:463776. [PMID: 36640679 DOI: 10.1016/j.chroma.2023.463776] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/02/2022] [Revised: 01/02/2023] [Accepted: 01/03/2023] [Indexed: 01/07/2023]
Abstract
Resolving complex sample mixtures by liquid chromatography in a single run is challenging. So-called mixed-mode liquid chromatography (MMLC), which combines several retention mechanisms within a single column, can provide resource-efficient separation of solutes of diverse nature. The Acclaim Mixed-Mode WCX-1 column, encompassing hydrophobic and weak cation exchange interactions, was employed for the analysis of small drug molecules. The stationary phase's interaction abilities were assessed by analysing molecules of different ionisation potentials. Mixed Quantitative Structure-Retention Relationship (QSRR) models were developed to reveal the significant experimental parameters (EPs) and molecular features governing molecular retention. According to the plan of a Face-Centred Central Composite Design, variations in the EPs (column temperature, acetonitrile content, pH and buffer concentration of the aqueous mobile phase) were included in QSRR modelling. QSRRs were developed upon the whole data set (global model) and upon discrete parts related to similarly ionized analytes (local models), applying gradient boosted trees as a regression tool. Root mean squared errors of prediction for the global model and the local QSRR models for cations, anions and neutrals were 0.131, 0.105, 0.102 and 0.042, respectively, with coefficients of determination of 0.947, 0.872, 0.954 and 0.996, indicating satisfactory performance of all models, with slightly better accuracy of the local ones. The research showed that the influences of the EPs were dependent on the molecule's ionisation potential. The molecular descriptors highlighted by the models indicated that electrostatic and hydrophobic interactions and hydrogen bonds participate in the retention process. The significance of the molecule's conformation was evaluated along with the topological relationship between the interaction centres, explicitly determined for each molecular species through local models. All models showed good molecular retention predictability, thus showing potential for facilitating method development.
Affiliation(s)
- Bojana Svrkota
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia
| | - Jovana Krmar
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia
| | - Ana Protić
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia
| | - Biljana Otašević
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia.
|
18
|
Chen X, Yang Z, Xu Y, Liu Z, Liu Y, Dai Y, Chen S. Progress and prediction of multicomponent quantification in complex systems with practical LC-UV methods. J Pharm Anal 2023; 13:142-155. [PMID: 36908853 PMCID: PMC9999300 DOI: 10.1016/j.jpha.2022.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 11/24/2022] [Accepted: 11/28/2022] [Indexed: 12/12/2022] Open
Abstract
Complex systems exist widely, including medicines from natural products, functional foods, and biological samples. The biological activity of complex systems is often the result of the synergistic effect of multiple components. In the quality evaluation of complex samples, multicomponent quantitative analysis (MCQA) is usually needed. To overcome the difficulty in obtaining standard products, scholars have proposed achieving MCQA through the "single standard to determine multiple components (SSDMC)" approach. This method has been used in the determination of multicomponent content in natural source drugs and the analysis of impurities in chemical drugs and has been included in the Chinese Pharmacopoeia. Depending on a convenient (ultra) high-performance liquid chromatography method, how can the repeatability and robustness of the MCQA method be improved? How can the chromatography conditions be optimized to improve the number of quantitative components? How can computer software technology be introduced to improve the efficiency of multicomponent analysis (MCA)? These are the key problems that remain to be solved in practical MCQA. First, this review article summarizes the calculation methods of relative correction factors in the SSDMC approach in the past five years, as well as the method robustness and accuracy evaluation. Second, it also summarizes methods to improve peak capacity and quantitative accuracy in MCA, including column selection and two-dimensional chromatographic analysis technology. Finally, computer software technologies for predicting chromatographic conditions and analytical parameters are introduced, which provides an idea for intelligent method development in MCA. This paper aims to provide methodological ideas for the improvement of complex system analysis, especially MCQA.
Affiliation(s)
- Xi Chen
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
| | - Zhao Yang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
| | - Yang Xu
- Key Lab of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
| | - Zhe Liu
- Key Lab of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
| | - Yanfang Liu
- Key Lab of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
| | - Yuntao Dai
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Corresponding author.
| | - Shilin Chen
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Corresponding author. Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China.
|
19
|
Xu Z, Chughtai H, Tian L, Liu L, Roy JF, Bayen S. Development of quantitative structure-retention relationship models to improve the identification of leachables in food packaging using non-targeted analysis. Talanta 2023; 253:123861. [PMID: 36095943 DOI: 10.1016/j.talanta.2022.123861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Revised: 08/15/2022] [Accepted: 08/17/2022] [Indexed: 12/13/2022]
Abstract
Quantitative structure-retention relationship (QSRR) models can be used to predict the chromatographic retention time of chemicals and facilitate the identification of unknown compounds, notably with non-targeted analysis. In this study, QSRR models were developed from the data obtained for 178 pure chemical standards and four types of analytical columns (C18, phenylhexyl, pentafluorophenyl, cyano) in liquid chromatography quadrupole time-of-flight mass spectrometry (LC-Q-TOF-MS). First, different data partitioning ratios and feature selection methods [random forest (RF) and support vector machine (SVM)] were tested to build models to predict chromatographic retention times based on 2D molecular descriptors. The internal and external performances of the non-linear (RF) and corresponding linear predictive models were systematically compared, and RF models resulted in better predictive capacities [p < 0.05, with an average PVE (proportion of variance explained) value of 0.89 ± 0.02] than linear models (0.79 ± 0.03). For each column, the resulting model was applied to identify leachables from actual plastic packaging samples. An in-depth investigation of the top 20 most intense molecular features revealed that all false-positives could be identified as outliers in the QSRR models (outside of the 95% prediction bands). Furthermore, analyzing a sample on multiple chromatographic columns and applying the associated QSRR models increased the capacity to filter false positives. Such an approach will contribute to a more effective identification of unknown or unexpected leachables in plastics (e.g. non-intended added substances), therefore refining our understanding of the chemical risks associated with food contact materials.
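The multi-column filtering described in this abstract can be sketched as a set intersection: a candidate survives only if its predicted retention time falls inside the prediction band on every column. The Python sketch below is an illustration under stated assumptions, not the authors' code; the candidate names, retention times, and the fixed band half-widths (standing in for 95% prediction bands) are hypothetical.

```python
def within_band(observed_rt, predicted_rt, half_width):
    """True if the prediction lies inside the band around the observed retention time."""
    return abs(observed_rt - predicted_rt) <= half_width


def surviving_candidates(observed, predictions, half_widths):
    """Intersect the per-column sets of candidates that pass the band check.

    observed:    {column: observed retention time in min}
    predictions: {candidate: {column: predicted retention time in min}}
    half_widths: {column: half-width of the prediction band in min}
    """
    survivors = None
    for column, rt in observed.items():
        passed = {
            name
            for name, predicted in predictions.items()
            if within_band(rt, predicted[column], half_widths[column])
        }
        survivors = passed if survivors is None else survivors & passed
    return survivors or set()


# Hypothetical feature measured on two columns, with three candidate structures.
observed = {"C18": 8.2, "PFP": 6.9}
predictions = {
    "cand_1": {"C18": 8.0, "PFP": 7.1},
    "cand_2": {"C18": 8.5, "PFP": 9.4},
    "cand_3": {"C18": 11.0, "PFP": 7.0},
}
half_widths = {"C18": 1.0, "PFP": 1.0}

print(surviving_candidates(observed, predictions, half_widths))
```

With either column alone, two candidates would remain; the intersection leaves one, which is the reason a second stationary phase with different selectivity increases the capacity to filter false positives.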
Affiliation(s)
- Ziyun Xu
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
| | - Hamza Chughtai
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
| | - Lei Tian
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
| | - Lan Liu
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
| | - Stéphane Bayen
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada.
|
20
|
Madaki Z, Abacioglu N, Usman AG, Taner N, Sehirli AO, Abba SI. Novel Hybridized Computational Paradigms Integrated with Five Stand-Alone Algorithms for Clinical Prediction of HCV Status among Patients: A Data-Driven Technique. Life (Basel) 2022; 13:79. [PMID: 36676028 PMCID: PMC9866913 DOI: 10.3390/life13010079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 12/19/2022] [Accepted: 12/21/2022] [Indexed: 12/29/2022] Open
Abstract
The emergence of health informatics opens new opportunities and doors for different disease diagnoses. The current work proposed the implementation of five different stand-alone techniques coupled with four different novel hybridized paradigms for the clinical prediction of hepatitis C status among patients, using both sociodemographic and clinical input variables. Both the visualized and quantitative performances of the stand-alone algorithms present the capability of the Gaussian process regression (GPR), Generalized neural network (GRNN), and Interactive linear regression (ILR) over the Support Vector Regression (SVR) and Adaptive neuro-fuzzy inference system (ANFIS) models. Hence, due to the lower performance of the stand-alone algorithms at a certain point, four different novel hybrid data intelligent algorithms were proposed, including: interactive linear regression-Gaussian process regression (ILR-GPR), interactive linear regression-generalized neural network (ILR-GRNN), interactive linear regression-Support Vector Regression (ILR-SVR), and interactive linear regression-adaptive neuro-fuzzy inference system (ILR-ANFIS), to boost the prediction accuracy of the stand-alone techniques in the clinical prediction of hepatitis C among patients. Based on the quantitative prediction skills presented by the novel hybridized paradigms, the proposed techniques were able to enhance the performance efficiency of the single paradigms up to 44% and 45% in the calibration and validation phases, respectively.
Affiliation(s)
- Zachariah Madaki
- Department of Pharmacology, Faculty of Pharmacy, Near East University, North Cyprus, Mersin-10, 99138 Nicosia, Türkiye
- Nurettin Abacioglu
- Department of Pharmacology, Faculty of Pharmacy, Near East University, North Cyprus, Mersin-10, 99138 Nicosia, Türkiye
- A. G. Usman
- Operational Research Centre in Healthcare, Near East University, North Cyprus, Mersin-10, 99138 Nicosia, Türkiye
- Department of Analytical Chemistry, Faculty of Pharmacy, Near East University, North Cyprus, Mersin-10, 99138 Nicosia, Türkiye
- Neda Taner
- Department of Clinical Pharmacy, Faculty of Pharmacy, Istanbul Medipol University, 34810 Istanbul, Türkiye
- Ahmet O. Sehirli
- Department of Pharmacology, Faculty of Dentistry, Nicosia, Near East University, North Cyprus, Mersin-10, 99138 Nicosia, Türkiye
- S. I. Abba
- Interdisciplinary Research Centre for Membrane and Water Security, Faculty of Petroleum and Minerals, King Fahd University, Dhahran 31261, Saudi Arabia
|
21
|
Kensert A, Desmet G, Cabooter D. Graph Neural Networks for Improved Retention Time Predictions. LCGC EUROPE 2022. [DOI: 10.56530/lcgc.eu.qt5667e1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
In this extended special feature to celebrate the 35th anniversary edition of LCGC Europe, leading figures from the separation science community explore contemporary trends in separation science and identify possible future developments.
|
22
|
Rocco K, Margoum C, Richard L, Coquery M. Enhanced database creation with in silico workflows for suspect screening of unknown tebuconazole transformation products in environmental samples by UHPLC-HRMS. JOURNAL OF HAZARDOUS MATERIALS 2022; 440:129706. [PMID: 35961075 DOI: 10.1016/j.jhazmat.2022.129706] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/04/2022] [Revised: 07/12/2022] [Accepted: 07/30/2022] [Indexed: 06/15/2023]
Abstract
The search and identification of organic contaminants in agricultural watersheds has become a crucial effort to better characterize watershed contamination by pesticides. The past decade has brought a more holistic view of watershed contamination via the deployment of powerful analytical strategies such as non-target and suspect screening analysis that can search more contaminants and their transformation products. However, suspect screening analysis remains broadly confined to known molecules, primarily due to the lack of analytical standards and suspect databases for unknowns such as pesticide transformation products. Here we developed a novel workflow by cross-comparing the results of various in silico prediction tools against literature data to create an enhanced database for suspect screening of pesticide transformation products. This workflow was applied on tebuconazole, used here as a model pesticide, and resulted in a suspect screening database counting 291 transformation products. The chromatographic retention times and tandem mass spectra were predicted for each of these compounds using 6 models based on multilinear regression and more complex machine-learning algorithms. This comprehensive approach to the investigation and identification of tebuconazole transformation products was retrospectively applied on environmental samples and found 6 transformation products identified for the first time in river water samples.
Affiliation(s)
- Kevin Rocco
- INRAE, UR RiverLy, 69625 Villeurbanne, France.
|
23
|
Retention Time Prediction with Message-Passing Neural Networks. SEPARATIONS 2022. [DOI: 10.3390/separations9100291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023] Open
Abstract
Retention time prediction, facilitated by advances in machine learning, has become a useful tool in untargeted LC-MS applications. State-of-the-art approaches include graph neural networks and 1D-convolutional neural networks that are trained on the METLIN small molecule retention time dataset (SMRT). These approaches demonstrate accurate predictions comparable with the experimental error for the training set. The weak point of retention time prediction approaches is the transfer of predictions to various systems. The accuracy of this step depends both on the method of mapping and on the accuracy of the general model trained on SMRT. Therefore, improvements to both parts of prediction workflows may lead to improved compound annotations. Here, we evaluate capabilities of message-passing neural networks (MPNN) that have demonstrated outstanding performance on many chemical tasks to accurately predict retention times. The model was initially trained on SMRT, providing mean and median absolute cross-validation errors of 32 and 16 s, respectively. The pretrained MPNN was further fine-tuned on five publicly available small reversed-phase retention sets in a transfer learning mode and demonstrated up to 30% improvement of prediction accuracy for these sets compared with the state-of-the-art methods. We demonstrated that filtering isomeric candidates by predicted retention with the thresholds obtained from ROC curves eliminates up to 50% of false identities.
|
24
|
CORAL: Quantitative Structure Retention Relationship (QSRR) of flavors and fragrances compounds studied on the stationary phase methyl silicone OV-101 column in gas chromatography using correlation intensity index and consensus modelling. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2022.133437] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/02/2023]
|
25
|
Ribar D, Rijavec T, Kralj Cigić I. An exploration into the use of Hansen solubility parameters for modelling reversed-phase chromatographic separations. J Anal Sci Technol 2022. [DOI: 10.1186/s40543-022-00322-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
The suitability of Hansen solubility parameters as descriptors for modelling analyte retention during reversed-phase chromatographic experiments was investigated. A novel theoretical model using Hansen solubility parameters as the basis for a complete mathematical derivation of the model was developed. The theoretical model also includes the cavitation volumes of the analytes, which were calculated using ab initio density functional theory methods. A set of three homologous phthalates was used for experimental data collection and subsequent model construction. The training error and the generalization error of the model were additionally evaluated using a range of chemically diverse analytes. Statistical evaluation of the results revealed that the model is suitable for analyte retention prediction but is limited to the analytes used in the model construction. Therefore, the resulting theoretical model cannot be easily generalized. A retention anomaly attributed to the column temperature and mobile phase composition was experimentally observed and mathematically investigated.
|
26
|
De Gauquier P, Vanommeslaeghe K, Heyden YV, Mangelings D. Modelling approaches for chiral chromatography on polysaccharide-based and macrocyclic antibiotic chiral selectors: A review. Anal Chim Acta 2022; 1198:338861. [DOI: 10.1016/j.aca.2021.338861] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/22/2021] [Revised: 07/12/2021] [Accepted: 07/19/2021] [Indexed: 12/25/2022]
|
27
|
Paritala J, Peraman R, Kondreddy VK, Subrahmanyam CVS, Ravichandiran V. Quantitative structure retention relationship (QSRR) approach for assessment of chromatographic behavior of antiviral drugs in the development of liquid chromatographic method. J LIQ CHROMATOGR R T 2022. [DOI: 10.1080/10826076.2022.2025827] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Affiliation(s)
- Jagadeesh Paritala
- Department of Pharmaceutical Analysis, Raghavendra Institute of Pharmaceutical Education and Research (RIPER)-Autonomous, Anantapur, India
- Ramalingam Peraman
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), Hajipur, Bihar, India
- Vinod Kumar Kondreddy
- Department of Pharmaceutical Analysis, Raghavendra Institute of Pharmaceutical Education and Research (RIPER)-Autonomous, Anantapur, India
- V Ravichandiran
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), Hajipur, Bihar, India
- National Institute of Pharmaceutical Education & Research (NIPER), Kolkata, India
|
28
|
Liapikos T, Zisi C, Kodra D, Kademoglou K, Diamantidou D, Begou O, Pappa-Louisi A, Theodoridis G. Quantitative Structure Retention Relationship (QSRR) Modelling for Analytes’ Retention Prediction in LC-HRMS by Applying Different Machine Learning Algorithms and Evaluating Their Performance. J Chromatogr B Analyt Technol Biomed Life Sci 2022; 1191:123132. [DOI: 10.1016/j.jchromb.2022.123132] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/22/2021] [Revised: 01/12/2022] [Accepted: 01/16/2022] [Indexed: 12/26/2022]
|
29
|
Biancolillo A, D'Archivio AA. Transfer of gas chromatographic retention data among poly(siloxane) columns by quantitative structure-retention relationships based on molecular descriptors of both solutes and stationary phases. J Chromatogr A 2021; 1663:462758. [PMID: 34954535 DOI: 10.1016/j.chroma.2021.462758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 12/13/2021] [Accepted: 12/15/2021] [Indexed: 10/19/2022]
Abstract
In the present study, computational molecular descriptors of 90 saturated esters and seven poly(siloxane) stationary phases with different polarity (SE-30, OV-7, DC-710, OV-25, XE-60, OV-225 and Silar-5CP) were combined into quantitative structure-retention relationship (QSRR) models aimed at predicting the Kováts retention indices (RIs) of the solutes. The molecular descriptors (174) of the stationary phases included in the models were computed using Dragon software from poly(siloxane) oligomers made of 20 siloxane units reflecting the nominal composition of the stationary phase, whereas 439 molecular descriptors were adopted to represent the esters. Different QSRR models were generated by means of Partial Least Squares (PLS) regression to assess the accuracy of this approach in predicting the RIs of unexplored solutes both in known and external stationary phases. After calibration of each PLS model, the descriptors were selected/discarded according to their relevance, evaluated by Covariance Selection (CovSel), and the PLS models were re-built, which resulted in a noticeable improvement of their predictive ability. Firstly, all the available data were equally divided into a training and a test set; the model built on the calibration set was used to predict the RIs of the validation observations. Successively, seven diverse PLS models were created following a "leave-one-column-out" fashion procedure, each one finalized to the estimation of the RIs of the 90 esters associated with a single stationary phase, whereas the calibration model was calculated on the remaining data. All the estimated models provided successful results on the external stationary phase, and predictive performance further increased after variable selection based on CovSel analysis. 
The final models provided a Root Mean Square Error in Cross Validation (RMSECV) in the range 12-20, a Root Mean Square Error in Prediction (RMSEP) in the range 11-26, and Mean Absolute Percentage Errors in Prediction (MAPEPs) in the range 0.7-1.5, revealing accurate cross-column prediction. Finally, to test the robustness of the proposed approach, the 90 solutes were equally partitioned into a calibration and a test set and two further QSRR strategies were applied. The first PLS model was calibrated on all the seven stationary phases and the RIs of the 45 external solutes in the same seven columns were simultaneously predicted. The last QSRR approach followed a "leave-one-column-out" scheme: the RIs of 45 test solutes on an external stationary phase were predicted by a PLS model calibrated with the data of the 45 remaining solutes on the six remaining stationary phases. After selection of the significant molecular descriptors, PLS regression provided RMSECV values in the range 6-19, RMSEPs in the range 10-14, and MAPEPs in the range 0.9-2.4, revealing the suitability of the approach to deduce the RI of unknown solutes in uncharted stationary phases.
Affiliation(s)
- Alessandra Biancolillo
- Dipartimento di Scienze Fisiche e Chimiche, Università degli Studi dell'Aquila, Via Vetoio, 67010 Coppito, L'Aquila, Italy
- Angelo Antonio D'Archivio
- Dipartimento di Scienze Fisiche e Chimiche, Università degli Studi dell'Aquila, Via Vetoio, 67010 Coppito, L'Aquila, Italy.
|
30
|
Sagandykova G, Buszewski B. Perspectives and recent advances in quantitative structure-retention relationships for high performance liquid chromatography. How far are we? Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2021.116294] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022]
|
31
|
Gritti F. Perspective on the Future Approaches to Predict Retention in Liquid Chromatography. Anal Chem 2021; 93:5653-5664. [PMID: 33797872 DOI: 10.1021/acs.analchem.0c05078] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/15/2022]
Abstract
The demand for rapid column screening, computer-assisted method development and method transfer, and unambiguous compound identification by LC/MS analyses has pushed analysts to adopt experimental protocols and software for the accurate prediction of the retention time in liquid chromatography (LC). This Perspective discusses the classical approaches used to predict retention times in LC over the last three decades and proposes future requirements to increase their accuracy. First, inverse methods for retention prediction are essentially applied during screening and gradient method optimization: a minimum number of experiments or design of experiments (DoE) is run to train and calibrate a model (either purely statistical or based on the principles and fundamentals of liquid chromatography) by a mere fitting process. They do not require the accurate knowledge of the true column hold-up volume V0, system dwell volume Vdwell (in gradient elution), and the retention behavior (k versus the content of strong solvent φ, temperature T, pH, and ionic strength I) of the analytes. Their relative accuracy is often excellent below a few percent. Statistical methods are expected to be the most attractive to handle very complex retention behavior such as in mixed-mode chromatography (MMC). Fundamentally correct retention models accounting for the simultaneous impact of φ, I, pH, and T in MMC are needed for method development based on chromatography principles. Second, direct methods for retention prediction are ideally suited for accurate method transfer from one column/system configuration to another: these quality by design (QbD) methods are based on the fundamentals and principles of solid-liquid adsorption and gradient chromatography. 
No model calibration is necessary; however, they require universal conventions for the accurate determination of true retention factors (for 1 < k < 30) as a function of the experimental variables (φ, T, pH, and I) and of the true column/system parameters (V0, Vdwell, dispersion volume, σ, and relaxation volume, τ, of the programmed gradient profile at the column inlet and gradient distortion at the column outlet). Finally, when the molecular structure of the analytes is either known or assumed, retention prediction has essentially been made on the basis of statistical approaches such as the linear solvation energy relationships (LSERs) and the quantitative structure retention relationships (QSRRs): their ability to accurately predict the retention remains limited within 10-30%. They have been combined with molecular similarity approaches (where the retention model is calibrated with compounds having structures similar to that of the targeted analytes) and artificial intelligence algorithms to further improve their accuracy below 10%. In this Perspective, it is proposed to adopt a more rigorous and fundamental approach by considering the very details of the solid-liquid adsorption process: Monte Carlo (MC) or molecular dynamics (MD) simulations are promising tools to explain and interpret retention data that are too complex to be described by either empirical or statistical retention models.
Affiliation(s)
- Fabrice Gritti
- Waters Corporation, 34 Maple Street, Milford, Massachusetts 01757, United States
|
32
|
Nielson FF, Colby SM, Thomas DG, Renslow RS, Metz TO. Exploring the Impacts of Conformer Selection Methods on Ion Mobility Collision Cross Section Predictions. Anal Chem 2021; 93:3830-3838. [PMID: 33606495 DOI: 10.1021/acs.analchem.0c04341] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
The prediction of structure-dependent molecular properties, such as collision cross sections as measured using ion mobility spectrometry, is crucially dependent on the selection of the correct population of molecular conformers. Here, we report an in-depth evaluation of multiple conformation selection techniques, including simple averaging, Boltzmann weighting, lowest energy selection, low energy threshold reductions, and similarity reduction. Generating 50 000 conformers each for 18 molecules, we used the In Silico Chemical Library Engine (ISiCLE) to calculate the collision cross sections for the entire data set. First, we employed Monte Carlo simulations to understand the variability between conformer structures as generated using simulated annealing. Then we applied Monte Carlo simulations to the aforementioned conformer selection techniques, as applied to the simulated molecular property: the ion mobility collision cross section. Based on our analyses, we found Boltzmann weighting to be a good trade-off between precision and theoretical accuracy. Combining multiple techniques revealed that energy thresholds and root-mean-squared deviation-based similarity reductions can save considerable computational expense while maintaining property prediction accuracy. Molecular dynamics conformer generation tools like AMBER can continue to generate new lowest energy conformers even after tens of thousands of generations, decreasing precision between runs. This reduced precision can be ameliorated and theoretical accuracy increased by running density functional theory geometry optimization on carefully selected conformers.
Affiliation(s)
- Felicity F Nielson
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington United States
- Sean M Colby
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington United States
- Dennis G Thomas
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington United States
- Ryan S Renslow
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington United States
- Thomas O Metz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington United States
|
33
|
He M, Zhou Y. How to identify “Material basis–Quality markers” more accurately in Chinese herbal medicines from modern chromatography-mass spectrometry data-sets: Opportunities and challenges of chemometric tools. CHINESE HERBAL MEDICINES 2021; 13:2-16. [PMID: 36117762 PMCID: PMC9476807 DOI: 10.1016/j.chmed.2020.05.006] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 01/21/2020] [Revised: 03/26/2020] [Accepted: 05/25/2020] [Indexed: 12/20/2022] Open
|
34
|
Taraji M, Haddad PR. Method Optimisation in Hydrophilic-Interaction Liquid Chromatography by Design of Experiments Combined with Quantitative Structure–Retention Relationships. Aust J Chem 2021. [DOI: 10.1071/ch21102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/18/2022]
Abstract
Accurate prediction of the separation conditions for a set of target analytes with no retention data available is fundamental for routine analytical assays but remains a very challenging task. In this paper, a quality by design (QbD) optimisation workflow capable of discovering the optimal chromatographic conditions for separation of new compounds in hydrophilic-interaction liquid chromatography (HILIC) is introduced. This workflow features the application of quantitative structure−retention relationship (QSRR) methodology in conjunction with design of experiments (DoE) principles and was used to carry out a two-level full factorial DoE optimisation for a mixture of pharmaceutical analytes on zwitterionic, amide, amine, and bare silica HILIC stationary phases, with mobile phases containing varying acetonitrile content, mobile phase pH, and salt concentration. A dual-filtering approach that considers both retention time (tR) and structural similarity was used to identify the optimal set of analytes to train the QSRR in order to maximise prediction accuracy. Highly predictive retention models (average R2 of 0.98) were obtained and statistical analysis of the prediction performance of the QSRR models demonstrated their ability to predict the retention times of new compounds based solely on their molecular structures, with root-mean-square errors of prediction in the range 7.6–11.0 %. Further, the obtained retention data for pharmaceutical test compounds were used to compute their separation selectivity, which was used as input into a DoE optimiser in order to select the optimal separation conditions. Experimental separations performed under the chosen optimal working conditions showed good agreement with the theoretical predictions. To the best of our knowledge, this is the first study of a QbD optimisation workflow assisted with dual-filtering-based retention modelling to facilitate the method development process in HILIC.
|
35
|
Kadlecová Z, Kalíková K, Ansorge M, Gilar M, Tesařová E. The effect of particle and ligand types on retention and peak shape in liquid chromatography. Microchem J 2020. [DOI: 10.1016/j.microc.2020.105466] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/30/2022]
|
36
|
Prediction of Retention Time of Morphine and Its Derivatives Without Using Computer-Encoded Complex Descriptors. Chromatographia 2020. [DOI: 10.1007/s10337-020-03975-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
37
|
Haddad PR, Taraji M, Szücs R. Prediction of Analyte Retention Time in Liquid Chromatography. Anal Chem 2020; 93:228-256. [DOI: 10.1021/acs.analchem.0c04190] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 12/13/2022]
Affiliation(s)
- Paul R. Haddad
- Australian Centre for Research on Separation Science, School of Natural Sciences, University of Tasmania, Private Bag 75, Hobart, Tasmania, Australia 7001
- Maryam Taraji
- Australian Centre for Research on Separation Science, School of Natural Sciences, University of Tasmania, Private Bag 75, Hobart, Tasmania, Australia 7001
- The Australian Wine Research Institute, P.O. Box 197, Adelaide, South Australia 5064, Australia
- Metabolomics Australia, P.O. Box 197, Adelaide, South Australia 5064, Australia
- Roman Szücs
- Pfizer R&D UK Limited, Ramsgate Road, Sandwich CT13 9NJ, U.K.
- Department of Analytical Chemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Mlynská Dolina CH2, Ilkovičova 6, SK-84215 Bratislava, Slovakia
|
38
|
Zhu QF, An N, Feng YQ. In-Depth Annotation Strategy of Saturated Hydroxy Fatty Acids Based on Their Chromatographic Retention Behaviors and MS Fragmentation Patterns. Anal Chem 2020; 92:14528-14535. [PMID: 33052648 DOI: 10.1021/acs.analchem.0c02719] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/11/2022]
Abstract
Hydroxy fatty acids are a class of bioactive compounds in a variety of organisms. The identification of hydroxy fatty acids in biological samples has still been a challenge because of their low abundance, high structural similarity, and limited availability of authentic hydroxy fatty acid standards. Here, we present a strategy for the annotation of saturated monohydroxyl fatty acids (OH-FAs) based on the integration of chromatographic retention rules and MS2 fragmentation patterns. Thirty-nine authentic OH-FA standards were used to investigate their retention behavior on a reversed-phase stationary phase (C18) of liquid chromatography, and we found that their retention simultaneously follows two kinds of "carbon number rules". Using the "carbon number rules", the retention index (RI) of all OH-FAs that contain carbon numbers from 8 to 18 (C8-18) can be predicted. Additionally, by studying the MS2 fragmentation of OH-FAs under collision-induced dissociation, we found that the intensity ratio (IR) of the characteristic fragment ions ([M + H]+-63 and [M + H]+-45) is closely related to the position of the hydroxyl group on the OH-FA structure, which is helpful to further identify and confirm the OH-FA isomers. As a result, 97 of 107 potential OH-FAs detected in honey, human serum, and rice seedling by chemical isotope labeling-assisted liquid chromatography-mass spectrometry were annotated upon the RI matching and IR confirming. Furthermore, in order to simplify the annotation process of OH-FAs, we constructed an OH-FA library to facilitate the annotation of OH-FAs. Overall, this study provides a new and promising tool for the in-depth annotation of OH-FA isomers.
Affiliation(s)
- Quan-Fei Zhu
- Department of Chemistry, Wuhan University, Wuhan 430072, PR China
- Na An
- Department of Chemistry, Wuhan University, Wuhan 430072, PR China
- Yu-Qi Feng
- Department of Chemistry, Wuhan University, Wuhan 430072, PR China
- Frontier Science Center for Immunology and Metabolism, Wuhan University, Wuhan 430072, PR China
|
39
|
Kianpour M, Mohammadinasab E, Isfahani TM. Comparison between genetic algorithm‐multiple linear regression and back‐propagation‐artificial neural network methods for predicting the LD50 of organo (phosphate and thiophosphate) compounds. J CHIN CHEM SOC-TAIP 2020. [DOI: 10.1002/jccs.201900514] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/13/2023]
Affiliation(s)
- Mina Kianpour
- Department of Chemistry, Arak Branch, Islamic Azad University, Arak, Iran
|
40
|
Biopartitioning micellar chromatography under different conditions: Insight into the retention mechanism and the potential to model biological processes. J Chromatogr A 2020; 1621:461027. [PMID: 32276854 DOI: 10.1016/j.chroma.2020.461027] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/15/2019] [Revised: 01/14/2020] [Accepted: 03/09/2020] [Indexed: 12/13/2022]
Abstract
In the present study, 88 structurally diverse drugs were investigated by biopartitioning micellar chromatography (BMC) using Brij-35 as surfactant under different chromatographic conditions. It was found that temperature and the presence of NaCl have only a minor effect on BMC retention. Correlation of BMC retention factors with octanol-water partitioning required the inclusion of fractions of ionized species as additional parameters, showing that there is a weaker effect of ionization in the BMC environment. Compared to Immobilized Artificial Membrane (IAM) Chromatography, BMC retention factors cover a relatively narrow span, two-fold smaller than retention factors on IAM stationary phases, as a result of the presence of micelles facilitating elution of lipophilic compounds and the absence of secondary attractive electrostatic interactions in the BMC environment. Similarities/dissimilarities between BMC, octanol-water partitioning and IAM Chromatography were investigated by Linear Free Energy Relationships (LSER). BMC retention factors were used to construct relationships with cell permeability, % Human Oral Absorption (%HOA) and Plasma Protein Binding (%PPB). Linear BMC models were obtained with Caco-2 cell lines and Parallel Artificial Membrane Permeability Assay (PAMPA). For %HOA, a hyperbolic model was established upon incorporation of topological polar surface area (tPSA) as an additional parameter. A sigmoidal model was constructed for %PPB and a linear one for the corresponding thermodynamic binding constant logK. In both cases, inclusion of the fraction of anionic species with a positive sign was required, reflecting the preference of human albumin for acidic drugs.
|
41
|
Watanabe N, Murata M, Ogawa T, Vavricka CJ, Kondo A, Ogino C, Araki M. Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions. J Chem Inf Model 2020; 60:1833-1843. [PMID: 32053362 DOI: 10.1021/acs.jcim.9b00877] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Abstract
Unannotated gene sequences in databases are increasing due to sequencing advances. Therefore, computational methods to predict functions of unannotated genes are needed. Moreover, novel enzyme discovery for metabolic engineering applications further encourages annotation of sequences. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products, and EC numbers. While accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an Average AUC score of over 0.94. Various metrics indicate that the current strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
Affiliation(s)
- Naoki Watanabe: Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan
- Masahiro Murata: Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan
- Teppei Ogawa: Mitsui Knowledge Industry Co., Ltd. (MKI), 2-3-33 Nakanoshima, Kita-ku, Osaka 530-0005, Japan
- Christopher J Vavricka: Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
- Akihiko Kondo: Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
- Chiaki Ogino: Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan
- Michihiro Araki: Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan; Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
|
42
|
Karadžić Banjac MŽ, Kovačević SZ, Tepić Horecki AN, Šumić ZM, Vakula AS, Podunavac‐Kuzmanović SO, Jevrić LR. Toward consistent discrimination of common bean (Phaseolus vulgaris L.) based on grain coat color, phytochemical composition, and antioxidant activity. J Food Process Preserv 2019. [DOI: 10.1111/jfpp.14246] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Affiliation(s)
- Zdravko M. Šumić: Faculty of Technology Novi Sad, University of Novi Sad, Novi Sad, Serbia
- Anita S. Vakula: Faculty of Technology Novi Sad, University of Novi Sad, Novi Sad, Serbia
- Lidija R. Jevrić: Faculty of Technology Novi Sad, University of Novi Sad, Novi Sad, Serbia
|
43
|
Ramezani AM, Yousefinejad S, Shahsavar A, Mohajeri A, Absalan G. Quantitative structure-retention relationship for chromatographic behaviour of anthraquinone derivatives through considering organic modifier features in micellar liquid chromatography. J Chromatogr A 2019; 1599:46-54. [DOI: 10.1016/j.chroma.2019.03.063] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/15/2018] [Revised: 03/27/2019] [Accepted: 03/28/2019] [Indexed: 01/06/2023]
|