1
|
Protein design-scapes generated by microfluidic DNA assembly elucidate domain coupling in the bacterial histidine kinase CpxA. Proc Natl Acad Sci U S A 2021; 118:2017719118. [PMID: 33723045 PMCID: PMC8000134 DOI: 10.1073/pnas.2017719118] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022] Open
Abstract
The randomization and screening of combinatorial DNA libraries is a powerful technique for understanding sequence-function relationships and optimizing biosynthetic pathways. Although it can be difficult to predict a priori which sequence combinations encode functional units, it is often possible to omit undesired combinations that inflate library size and screening effort. However, defined library generation is difficult when a complex scan through sequence space is needed. To overcome this challenge, we designed a hybrid valve- and droplet-based microfluidic system that deterministically assembles DNA parts in picoliter droplets, reducing reagent consumption and bias. Using this system, we built a combinatorial library encoding an engineered histidine kinase (HK) based on bacterial CpxA. Our library encodes designed transmembrane (TM) domains that modulate the activity of the cytoplasmic domain of CpxA and variants of the structurally distant "S helix" located near the catalytic domain. We find that the S helix sets a basal activity further modulated by the TM domain. Surprisingly, we also find that a given TM motif can elicit opposing effects on the catalytic activity of different S-helix variants. We conclude that the intervening HAMP domain passively transmits signals and shapes the signaling response depending on subtle changes in neighboring domains. This flexibility engenders a richness in functional outputs as HKs vary in response to changing evolutionary pressures.
|
2
|
Chen L, Luo A, Zhang Y, Liu F, Jiang Y, Xu Q, Chen X, Hu Q, Chen SF, Chen KJ, Kuo HC. Optimization of the single-phased white phosphor of Li2SrSiO4: Eu2+, Ce3+ for light-emitting diodes by using the combinatorial approach assisted with the Taguchi method. ACS COMBINATORIAL SCIENCE 2012; 14:636-44. [PMID: 23095104 DOI: 10.1021/co300058x] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 11/29/2022]
Abstract
The performance of the phosphor Li2SrSiO4:Eu2+,Ce3+ for light-emitting diode applications, in terms of luminescence efficiency (LE), color rendering index (CRI), and color temperature (Tc), was optimized with a combinatorial approach. The combinatorial libraries were synthesized with a solution-based method, and the scale-up samples were synthesized with the conventional solid-reaction method. The crystal structure was investigated by X-ray diffraction. The emission spectrum of each sample in the combinatorial libraries was measured in situ with a fiber optic spectrometer. Fluorescence spectrometers were used to record the excitation and emission spectra of bulk samples. White light was generated by tailoring the Eu2+ and Ce3+ concentrations in the single-phased Li2SrSiO4 host under near-ultraviolet excitation, but it exhibited low luminescence efficiency and a poor color rendering index. The effects of each level of the Eu2+ and Ce3+ concentrations on LE, CRI, and Tc were evaluated with the Taguchi method. The optimum levels of the interaction pairs between Eu2+ and Ce3+ concentrations for LE, CRI, and Tc were [2, 1] (0.006 M, 0.003 M), [1, 2] (0.003 M, 0.006 M), and [3, 1] (0.009 M, 0.003 M), respectively. The thermal stability of luminescence, the external quantum efficiency (QE), luminance, chromaticity coordinates, correlated color temperature, color purity including the composition ratio of RGB in white light, and color rendering index of the white light emission of the phosphor were evaluated comprehensively from a bulk sample.
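The Taguchi evaluation mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the larger-the-better signal-to-noise ratio, S/N = -10 log10(mean(1/y^2)), and averages it per factor level. The function names and the four readings are hypothetical and are not taken from the paper, which also analyses interaction pairs that this sketch omits.

```python
import math
from collections import defaultdict


def sn_larger_is_better(values):
    """Taguchi larger-the-better signal-to-noise ratio in dB."""
    return -10.0 * math.log10(sum(1.0 / (v * v) for v in values) / len(values))


def main_effects(runs):
    """Average S/N per (factor, level) over all runs.

    `runs` is a list of (levels, replicates) pairs, where `levels` maps a
    factor name to its level index and `replicates` holds measured responses.
    """
    buckets = defaultdict(list)
    for levels, replicates in runs:
        sn = sn_larger_is_better(replicates)
        for factor, level in levels.items():
            buckets[(factor, level)].append(sn)
    return {key: sum(v) / len(v) for key, v in buckets.items()}


# Hypothetical luminescence-efficiency readings, NOT data from the paper.
runs = [
    ({"Eu": 1, "Ce": 1}, [10.0, 10.0]),
    ({"Eu": 1, "Ce": 2}, [20.0, 20.0]),
    ({"Eu": 2, "Ce": 1}, [40.0, 40.0]),
    ({"Eu": 2, "Ce": 2}, [30.0, 30.0]),
]
effects = main_effects(runs)

# The level with the highest mean S/N is taken as the preferred one.
best_eu = max((1, 2), key=lambda lvl: effects[("Eu", lvl)])
print("preferred Eu level:", best_eu)
```

A smaller-the-better or nominal-the-best response would need a different S/N formula; which one applies to each of LE, CRI, and Tc is not stated in the abstract.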
Affiliation(s)
- Lei Chen
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Semiconductor and Optoelectronic Technology Engineering Research Center of Anhui Province, Wuhu 241000, China
- Anqi Luo
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Yao Zhang
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Fayong Liu
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Yang Jiang
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Qingsheng Xu
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Xinhui Chen
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Qingzhuo Hu
- School of Materials Science and Engineering, Hefei University of Technology, Hefei 230009, China
- Shi-Fu Chen
- Department of Chemistry, Huaibei Normal University, Huaibei 235000, China
- Kuo-Ju Chen
- Department of Photonic & Institute of Electro-Optical Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan
- Hao-Chung Kuo
- Department of Photonic & Institute of Electro-Optical Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan
|
3
|
Xu G, Hughes-Oliver JM, Brooks JD, Yeatts JL, Baynes RE. Selection of appropriate training and validation set chemicals for modelling dermal permeability by U-optimal design. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2012; 24:135-156. [PMID: 23157374 DOI: 10.1080/1062936x.2012.742458] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Quantitative structure-activity relationship (QSAR) models are being used increasingly in skin permeation studies. The main idea of QSAR modelling is to quantify the relationship between biological activities and chemical properties, and thus to predict the activity of chemical solutes. As a key step, the selection of a representative and structurally diverse training set is critical to the prediction power of a QSAR model. Early QSAR models selected training sets in a subjective way and solutes in the training set were relatively homogenous. More recently, statistical methods such as D-optimal design or space-filling design have been applied but such methods are not always ideal. This paper describes a comprehensive procedure to select training sets from a large candidate set of 4534 solutes. A newly proposed 'Baynes' rule', which is a modification of Lipinski's 'rule of five', was used to screen out solutes that were not qualified for the study. U-optimality was used as the selection criterion. A principal component analysis showed that the selected training set was representative of the chemical space. Gas chromatograph amenability was verified. A model built using the training set was shown to have greater predictive power than a model built using a previous dataset [1].
Affiliation(s)
- G Xu
- Department of Statistics, North Carolina State University, Raleigh, NC, USA
|
4
|
Chen L, Chu CI, Chen KJ, Chen PY, Hu SF, Liu RS. An intelligent approach to the discovery of luminescent materials using a combinatorial approach combined with Taguchi methodology. LUMINESCENCE 2011; 26:229-38. [DOI: 10.1002/bio.1318] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 03/04/2011] [Revised: 04/05/2011] [Accepted: 04/19/2011] [Indexed: 11/11/2022]
Affiliation(s)
- Cheng-I Chu
- Department of Chemistry; National Taiwan University; Taipei
- Kuo-Ju Chen
- Institute of Electro-optical Science and Technology; National Taiwan Normal University; Taipei; Taiwan
- Po-Yuan Chen
- Institute of Electro-optical Science and Technology; National Taiwan Normal University; Taipei; Taiwan
- Shu-Fen Hu
- Department of Physics; National Taiwan Normal University; Taipei; Taiwan
- Ru-Shi Liu
- Department of Chemistry; National Taiwan University; Taipei
|
5
|
Chen H, Engkvist O, Blomberg N. Combinatorial library design from reagent pharmacophore fingerprints. Methods Mol Biol 2011; 685:135-152. [PMID: 20981522 DOI: 10.1007/978-1-60761-931-4_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/30/2023]
Abstract
Combinatorial and parallel chemical synthesis technologies are powerful tools in early drug discovery projects. Over the past couple of years an increased emphasis on targeted lead generation libraries and focussed screening libraries in the pharmaceutical industry has driven a surge in computational methods to explore molecular frameworks to establish new chemical equity. In this chapter we describe a complementary technique in the library design process, termed ProSAR, to effectively cover the accessible pharmacophore space around a given scaffold. With this method reagents are selected such that each R-group on the scaffold has an optimal coverage of pharmacophoric features. This is achieved by optimising the Shannon entropy, i.e. the information content, of the topological pharmacophore distribution for the reagents. As this method enumerates compounds with a systematic variation of user-defined pharmacophores to the attachment point on the scaffold, the enumerated compounds may serve as a good starting point for deriving a structure-activity relationship (SAR).
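The entropy-based reagent selection described in this abstract can be sketched in Python. This is only an illustration of maximising the Shannon entropy of a pooled feature distribution with a greedy loop; the reagent names, the feature labels, and the greedy strategy are assumptions, and the published ProSAR procedure may differ.

```python
import math
from collections import Counter


def shannon_entropy(features):
    """Shannon entropy (in bits) of a list of categorical feature labels."""
    counts = Counter(features)
    total = len(features)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def greedy_select(reagents, k):
    """Greedily pick k reagents that maximise the entropy of pooled features.

    `reagents` maps a reagent name to the list of pharmacophore labels it
    contributes (a hypothetical encoding chosen for this sketch).
    """
    chosen, pooled = [], []
    for _ in range(k):
        best = max(
            (r for r in reagents if r not in chosen),
            key=lambda r: shannon_entropy(pooled + reagents[r]),
        )
        chosen.append(best)
        pooled = pooled + reagents[best]
    return chosen, shannon_entropy(pooled)


# Hypothetical reagents for a single R-group position.
reagents = {
    "r1": ["donor"],
    "r2": ["donor"],
    "r3": ["acceptor"],
    "r4": ["aromatic"],
}
chosen, entropy = greedy_select(reagents, 3)
print(chosen, round(entropy, 3))
```

The redundant second donor reagent is skipped because adding it would lower the entropy of the pooled distribution relative to adding a new feature type.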
Affiliation(s)
- Hongming Chen
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Mölndal, Sweden.
|
6
|
Abstract
Fragment screens for new ligands have had wide success, notwithstanding their constraint to libraries of 1,000-10,000 molecules. Larger libraries would be addressable were molecular docking reliable for fragment screens, but this has not been widely accepted. To investigate docking's ability to prioritize fragments, a library of >137,000 such molecules was docked against the structure of beta-lactamase. Forty-eight fragments highly ranked by docking were acquired and tested; 23 had Ki values ranging from 0.7 to 9.2 mM. X-ray crystal structures of the enzyme-bound complexes were determined for 8 of the fragments. For 4, the correspondence between the predicted and experimental structures was high (RMSD between 1.2 and 1.4 Å), whereas for another 2, the fidelity was lower but retained most key interactions (RMSD 2.4-2.6 Å). Two of the 8 fragments adopted very different poses in the active site owing to enzyme conformational changes. The 48% hit rate of the fragment docking compares very favorably with "lead-like" docking and high-throughput screening against the same enzyme. To understand this, we investigated the occurrence of the fragment scaffolds among larger, lead-like molecules. Approximately 1% of commercially available fragments contain these inhibitors whereas only 10^-7% of lead-like molecules do. This suggests that many more chemotypes and combinations of chemotypes are present among fragments than are available among lead-like molecules, contributing to the higher hit rates. The ability of docking to prioritize these fragments suggests that the technique can be used to exploit the better chemotype coverage that exists at the fragment level.
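Two quantities in this abstract are simple to reproduce: the hit rate (23 active fragments out of 48 tested) and the RMSD between a predicted and an experimental pose. The Python sketch below assumes both poses are already in the same coordinate frame with atoms in the same order; the coordinates are made up for illustration.

```python
import math


def hit_rate(hits, tested):
    """Percentage of tested compounds that were active."""
    return 100.0 * hits / tested


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    No superposition is performed: both poses are assumed to share one frame.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))


print(round(hit_rate(23, 48)))  # the abstract reports this as 48%

# Hypothetical two-atom poses displaced by 1 unit along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(predicted, experimental))
```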
|
7
|
Chen H, Börjesson U, Engkvist O, Kogej T, Svensson MA, Blomberg N, Weigelt D, Burrows JN, Lange T. ProSAR: A New Methodology for Combinatorial Library Design. J Chem Inf Model 2009; 49:603-14. [DOI: 10.1021/ci800231d] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Affiliation(s)
- Hongming Chen
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Ulf Börjesson
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Ola Engkvist
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Thierry Kogej
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Mats A. Svensson
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Niklas Blomberg
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Dirk Weigelt
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Jeremy N. Burrows
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
- Tim Lange
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
|
8
|
Rabal O, Pascual R, Borrell JI, Teixidó J. Cell-Integral-Diversity Criterion: A Proposal for Minimizing Cluster Artifact in Cell-Based Selections. J Chem Inf Model 2007; 47:1886-96. [PMID: 17824683 DOI: 10.1021/ci600433c] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Cell-based methods and the diversity integral criterion (a distance-based technique) are commonly used approaches for assessing the diversity of collections of compounds in terms of space coverage. The main deficiency of cell-based methods is the arbitrariness of cell boundaries, which leads to edge effects or cluster artifacts, i.e., situations in which similar molecules separated by a cell boundary yield a higher diversity score than molecules that fall within the same cell but are less similar to each other. We describe a straightforward diversity metric, based on quantifying the distance to the center of the bins resulting from partitioning the descriptor space, which aims at bypassing these artifacts. The mentioned criteria are compared for the diversity assessment of a set of selections carried out on three combinatorial libraries of different cardinalities. For each method, the influence of its parameters (reference partition and number of points) on its efficacy is examined. Furthermore, the proposed diversity metric is also applied to designing diverse libraries for three test cases. We show that full arrays selected by minimizing the sum of distances to the center of the cells are formed by compounds spaced further apart than selections obtained by maximizing the degree of cell occupancy.
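The edge effect described in this abstract can be shown with a small Python sketch on a one-dimensional descriptor. It is one plausible reading of "distance to the center of the cells", not the authors' exact criterion; the cell width and the two selections are hypothetical.

```python
CELL_WIDTH = 0.25  # hypothetical partition: four equal cells on [0, 1)


def cell_index(x):
    """Index of the cell that contains descriptor value x."""
    return int(x / CELL_WIDTH)


def occupancy(selection):
    """Number of distinct cells occupied by the selection."""
    return len({cell_index(x) for x in selection})


def center_distance_sum(selection):
    """Sum of distances from each compound to the center of its own cell."""
    return sum(abs(x - (cell_index(x) + 0.5) * CELL_WIDTH) for x in selection)


# Two near-identical compounds that straddle the boundary at 0.25 ...
straddling = [0.24, 0.26]
# ... versus two well-separated compounds sitting on cell centers.
centered = [0.125, 0.375]

print(occupancy(straddling), occupancy(centered))
print(center_distance_sum(straddling), center_distance_sum(centered))
```

Both selections occupy two cells, so an occupancy score cannot tell them apart, whereas the center-distance sum favors the well-separated pair.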
Affiliation(s)
- Obdulia Rabal
- Grup d'Enginyeria Molecular, Institut Químic de Sarrià, Universitat Ramon Llull, Via Augusta 390, E-08017 Barcelona, Spain
|
9
|
Rosania GR, Crippen G, Woolf P, States D, Shedden K. A Cheminformatic Toolkit for Mining Biomedical Knowledge. Pharm Res 2007; 24:1791-802. [PMID: 17385012 DOI: 10.1007/s11095-007-9285-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 01/11/2007] [Accepted: 02/27/2007] [Indexed: 01/31/2023]
Abstract
PURPOSE: Cheminformatics can be broadly defined to encompass any activity related to the application of information technology to the study of properties, effects and uses of chemical agents. One of the most important current challenges in cheminformatics is to allow researchers to search databases of biomedical knowledge, using chemical structures as input. MATERIALS AND METHODS: An important step towards this goal was the establishment of PubChem, an open, centralized database of small molecules accessible through the World Wide Web. While PubChem is primarily intended to serve as a repository for high throughput screening data from federally-funded screening centers and academic research laboratories, the major impact of PubChem could also reside in its ability to serve as a chemical gateway to biomedical databases such as PubMed. CONCLUSION: This article will review cheminformatic tools that can be applied to facilitate annotation of PubChem through links to the scientific literature; to integrate PubChem with transcriptomic, proteomic, and metabolomic datasets; to incorporate results of numerical simulations of physiological systems into PubChem annotation; and ultimately, to translate data of chemical genomics screening efforts into information that will benefit biomedical researchers and physician scientists across all therapeutic areas.
Affiliation(s)
- Gus R Rosania
- Department of Pharmaceutical Sciences, University of Michigan College of Pharmacy, 428 Church Street, Ann Arbor, MI 48109, USA.
|
10
|
Papp A, Gulyas-Forró A, Gulyas Z, Dorman G, Urge L, Darvas F. Explicit Diversity Index (EDI): a novel measure for assessing the diversity of compound databases. J Chem Inf Model 2006; 46:1898-904. [PMID: 16995719 DOI: 10.1021/ci060074f] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
A novel diversity assessment method, the Explicit Diversity Index (EDI), is introduced for druglike molecules. EDI combines structural and synthesis-related dissimilarity values and expresses them as a single number. As an easily interpretable measure, it facilitates the decision making in the design of combinatorial libraries, and it might assist in the comparison of compound sets provided by different manufacturers. Because of its rapid calculation algorithm, EDI enables the diversity assessment of in-house or commercial compound collections.
Affiliation(s)
- Akos Papp
- AMRI Hungary, Zahony u. 7, H-1031 Budapest, Hungary, ComGrid Ltd., Zahony u. 7, H-1031 Budapest, Hungary
|
11
|
Truchon JF, Bayly CI. GLARE: a new approach for filtering large reagent lists in combinatorial library design using product properties. J Chem Inf Model 2006; 46:1536-48. [PMID: 16859286 DOI: 10.1021/ci0504871] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
We present a novel computer algorithm, called GLARE (Global Library Assessment of REagents), that addresses the issue of optimal reagent selection in combinatorial library design. This program reduces or eliminates the time a medicinal chemist spends examining reagents which a priori cannot be part of a "good" library. Our approach takes the large reagent sets returned by standard chemical database queries and produces often considerably reduced reagent sets that are well-behaved with respect to a specific template. The pruning enforces "goodness" constraints such as the Lipinski rule of five on the product properties such that any reagent selection from the resulting sets produces only "good" products. The algorithm we implemented has three important features: (i) As opposed to genetic algorithms or other stochastic algorithms, GLARE uses a deterministic greedy procedure that smoothly filters out nonviable reagents. (ii) The pruning method can be biased to produce reagent sets with a balanced size, conserving proportionally more reagents in smaller sets. (iii) For very large combinatorial libraries, a partitioning scheme allows libraries as large as 10^12 to be evaluated in 0.25 s on an IBM AMD Opteron processor. This algorithm is validated on a diverse set of 12 libraries. The results that we obtained show an excellent compliance to the product property requirements and very fast timings.
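The idea of pruning reagents until every product satisfies the rule of five can be sketched in Python. This is not the GLARE algorithm itself, whose details the abstract does not give: it is a simple deterministic greedy loop that assumes product properties are additive over the template and two reagents, and all reagent names and property values are hypothetical.

```python
from itertools import product

# Lipinski rule-of-five upper limits on the product.
LIMITS = {"mw": 500.0, "logp": 5.0, "hbd": 5, "hba": 10}


def is_good(*fragments):
    """True if the summed fragment properties satisfy every limit."""
    return all(sum(f[k] for f in fragments) <= lim for k, lim in LIMITS.items())


def prune(template, set_a, set_b):
    """Drop the worst reagent repeatedly until all products are 'good'."""
    a, b = dict(set_a), dict(set_b)
    while True:
        bad = {}  # (set label, reagent name) -> fraction of bad products
        for (na, fa), (nb, fb) in product(a.items(), b.items()):
            if not is_good(template, fa, fb):
                bad[("a", na)] = bad.get(("a", na), 0.0) + 1.0 / len(b)
                bad[("b", nb)] = bad.get(("b", nb), 0.0) + 1.0 / len(a)
        if not bad:
            return a, b
        (which, name), _ = max(bad.items(), key=lambda kv: kv[1])
        (a if which == "a" else b).pop(name)


# Hypothetical template and reagent contributions.
template = {"mw": 200.0, "logp": 1.0, "hbd": 1, "hba": 2}
set_a = {
    "a1": {"mw": 100.0, "logp": 0.5, "hbd": 0, "hba": 1},
    "a2": {"mw": 250.0, "logp": 0.5, "hbd": 0, "hba": 1},  # too heavy
}
set_b = {
    "b1": {"mw": 100.0, "logp": 0.5, "hbd": 0, "hba": 1},
    "b2": {"mw": 150.0, "logp": 0.5, "hbd": 0, "hba": 1},
}
kept_a, kept_b = prune(template, set_a, set_b)
print(sorted(kept_a), sorted(kept_b))
```

After pruning, any pairing of the surviving reagents yields a product within the limits, which mirrors the guarantee the abstract describes.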
Affiliation(s)
- Jean-François Truchon
- Merck Frosst Canada & Co., 16711 Trans Canada Hwy., Kirkland, Québec, Canada H9H 3L1.
|
12
|
Ng C, Xiao Y, Putnam W, Lum B, Tropsha A. Quantitative structure-pharmacokinetic parameters relationships (QSPKR) analysis of antimicrobial agents in humans using simulated annealing k-nearest-neighbor and partial least-square analysis methods. J Pharm Sci 2005; 93:2535-44. [PMID: 15349962 DOI: 10.1002/jps.20117] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 11/07/2022]
Abstract
We have developed quantitative structure-pharmacokinetic parameters relationship (QSPKR) models using k-nearest-neighbor (k-NN) and partial least-square (PLS) methods to predict the volume of distribution at steady state (Vss) and clearance (CL) of 44 antimicrobial agents in humans. The performance of QSPKR was determined by the values of the internal leave-one-out, cross-validated coefficient of determination q² for the training set and external predictive r² for the test set. The best simulated annealing (SA)-kNN model was highly predictive for Vss and provided q² and r² values of 0.93 and 0.80, respectively. For all compounds, the model produced average fold error values for Vss of 1.00 and for 93% of the compounds provided predictions that were within a twofold error of actual values. The best SA-kNN model for prediction of CL yielded q² and r² values of 0.77 and 0.94, respectively, and had an average fold error of 1.05. Use of PLS methods resulted in inferior QSPKR models. The SA-kNN QSPKR approach has utility in drug discovery and development in the identification of compounds that possess appropriate pharmacokinetic characteristics in humans, and will assist in the selection of a suitable starting dose for Phase I, first-time-in-man studies.
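The performance measures named in this abstract can be computed as in the Python sketch below. It assumes q² = 1 - PRESS/SS over leave-one-out predictions and takes the average fold error as 10 raised to the mean of log10(predicted/observed); the abstract does not state which fold-error definition the authors used, and the numbers are hypothetical.

```python
import math


def q2(observed, loo_predicted):
    """Cross-validated q2 = 1 - PRESS / SS from leave-one-out predictions."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, loo_predicted))
    ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss


def average_fold_error(observed, predicted):
    """Geometric-mean ratio of predicted to observed (assumed definition)."""
    logs = [math.log10(p / o) for o, p in zip(observed, predicted)]
    return 10.0 ** (sum(logs) / len(logs))


def fraction_within_fold(observed, predicted, fold=2.0):
    """Fraction of predictions within `fold`-fold of the observed value."""
    hits = sum(1 for o, p in zip(observed, predicted) if max(p / o, o / p) <= fold)
    return hits / len(observed)


# Hypothetical Vss values (L/kg), NOT data from the paper.
observed = [1.0, 2.0, 4.0, 8.0]
predicted = [2.0, 1.0, 4.0, 8.0]

print(round(q2(observed, predicted), 4))
print(average_fold_error(observed, predicted))
print(fraction_within_fold(observed, predicted))
```

Under this definition an average fold error of 1.00 indicates no systematic over- or under-prediction rather than a perfect fit, since a twofold overestimate and a twofold underestimate cancel out.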
Affiliation(s)
- Chee Ng
- Department of Pharmacokinetic and Pharmacodynamic Sciences, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080-4990, USA.
|
13
|
Jónsdóttir SO, Jørgensen FS, Brunak S. Prediction methods and databases within chemoinformatics: emphasis on drugs and drug candidates. Bioinformatics 2005; 21:2145-60. [PMID: 15713739 DOI: 10.1093/bioinformatics/bti314] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: To gather information about available databases and chemoinformatics methods for prediction of properties relevant to the drug discovery and optimization process. RESULTS: We present an overview of the most important databases with 2-dimensional and 3-dimensional structural information about drugs and drug candidates, and of databases with relevant properties. Access to experimental data and numerical methods for selecting and utilizing these data is crucial for developing accurate predictive in silico models. Many interesting predictive methods for classifying the suitability of chemical compounds as potential drugs, as well as for predicting their physico-chemical and ADMET properties have been proposed in recent years. These methods are discussed, and some possible future directions in this rapidly developing field are described.
Affiliation(s)
- Svava Osk Jónsdóttir
- Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.
|
14
|
Jamois EA, Lin CT, Waldman M. Design of focused and restrained subsets from extremely large virtual libraries. J Mol Graph Model 2003; 22:141-9. [PMID: 12932785 DOI: 10.1016/s1093-3263(03)00154-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
With the current and ever-growing offering of reagents along with the vast palette of organic reactions, virtual libraries accessible to combinatorial chemists can reach sizes of billions of compounds or more. Extracting practical size subsets for experimentation has remained an essential step in the design of combinatorial libraries. A typical approach to computational library design involves enumeration of structures and properties for the entire virtual library, which may be impractical for such large libraries. This study describes a new approach termed on-the-fly optimization (OTFO), where descriptors are computed as needed within the subset optimization cycle and without intermediate enumeration of structures. Results reported herein highlight the advantages of coupling an ultra-fast descriptor calculation engine to subset optimization capabilities. We also show that enumeration of properties for the entire virtual library may not only be impractical but also wasteful. Successful design of focused and restrained subsets can be achieved while sampling only a small fraction of the virtual library. We also investigate the stability of the method and compare results obtained from simulated annealing (SA) and genetic algorithms (GA).
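The idea of computing descriptors only when the optimizer asks for them can be sketched in Python with simulated annealing over a virtual library that is never enumerated. Everything here is an assumption made for illustration: the toy descriptor function, the target value, the cooling schedule, and the parameter names do not come from the paper.

```python
import math
import random
from functools import lru_cache

N_A, N_B = 100, 100   # hypothetical reagent counts: 10,000 virtual products
TARGET = 350.0        # hypothetical target property for a focused subset


@lru_cache(maxsize=None)
def descriptor(i, j):
    """Toy product descriptor, computed on demand from reagent indices."""
    return 150.0 + 2.0 * i + 1.5 * j


def cost(subset):
    """Mean absolute deviation from the target (lower is better)."""
    return sum(abs(descriptor(i, j) - TARGET) for i, j in subset) / len(subset)


def anneal(k=10, steps=200, t0=50.0, seed=0):
    """Simulated annealing over k-member subsets; returns the best one seen."""
    rng = random.Random(seed)
    current = []
    while len(current) < k:
        cand = (rng.randrange(N_A), rng.randrange(N_B))
        if cand not in current:
            current.append(cand)
    cur_cost = start_cost = cost(current)
    best, best_cost = list(current), cur_cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling
        cand = (rng.randrange(N_A), rng.randrange(N_B))
        if cand in current:
            continue
        trial = list(current)
        trial[rng.randrange(k)] = cand  # swap one member for the candidate
        delta = cost(trial) - cur_cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = trial, cur_cost + delta
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
    return best, best_cost, start_cost


best, best_cost, start_cost = anneal()
evaluated = descriptor.cache_info().currsize
print(f"cost {start_cost:.1f} -> {best_cost:.1f}, "
      f"descriptors computed: {evaluated} of {N_A * N_B}")
```

The cache size shows how many products ever had a descriptor computed, which stays far below the size of the full virtual library.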
Affiliation(s)
- Eric A Jamois
- Accelrys Inc., 9685 Scranton Road, San Diego, CA 92121, USA.
|
15
|
Abstract
The design of combinatorial libraries involves the consideration of all synthesizable compounds (the virtual library), followed by the selection of a suitably sized subset for actual synthesis and experimentation. Several approaches to this task can be envisaged, involving either reagent-based or product-based considerations. Reagent-based design considers the properties of the building blocks rather than those of the final products. Although popular with chemists, this approach overlooks the extent of chemical transformations involved in generating products. In effect, several important properties cannot be derived from building blocks alone and require access to product structures. Several studies have demonstrated the superiority of product-based designs in yielding diverse and representative subsets. Although more computationally intensive, the latter approach provides a basis for more sophisticated designs where reagent-based and product-based considerations can be combined for a best-of-breed approach.
Affiliation(s)
- Eric A Jamois
- Accelrys Inc., 9685 Scranton Road, San Diego, CA 92121, USA.
|
16
|
Pascual R, Mateu M, Gasteiger J, Borrell JI, Teixidó J. Design and analysis of a combinatorial library of HEPT analogues: comparison of selection methodologies and inspection of the actually covered chemical space. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES 2003; 43:199-207. [PMID: 12546554 DOI: 10.1021/ci0255681] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
A large virtual library of 125 396 HEPT analogues, built by combining all fragments present in the published 180-compound HEPT family, has been studied in terms of diversity criteria, and the goodness of the 11 available standard diversity selection methods has been analyzed. All the algorithms under study, except Cell-based Density, rank above a random selection of compounds, with the Optimum and Standard Deviation based Binning and Cell-based Fraction algorithms being the best choices. Furthermore, an analysis of the actually tested compounds has been performed to compare the traditional drug discovery methodology with a rational selection of combinatorial libraries approach.
Affiliation(s)
- Rosalia Pascual
- Grup d'Enginyeria Molecular, Institut Químic de Sarrià (IQS), Universitat Ramon Llull, Via Augusta 390, E-08017-Barcelona, Spain
|