1
Duggan BM, Cullum R, Fenical W, Amador LA, Rodríguez AD, La Clair JJ. Searching for Small Molecules with an Atomic Sort. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201911862] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Affiliation(s)
- Brendan M. Duggan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Reiko Cullum
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA 92093-0204, USA
- William Fenical
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA 92093-0204, USA
- Luis A. Amador
- Molecular Sciences Research Center, University of Puerto Rico, 1390 Ponce de León Avenue, San Juan 00926, Puerto Rico
- Abimael D. Rodríguez
- Molecular Sciences Research Center, University of Puerto Rico, 1390 Ponce de León Avenue, San Juan 00926, Puerto Rico
- James J. La Clair
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
2
Duggan BM, Cullum R, Fenical W, Amador LA, Rodríguez AD, La Clair JJ. Searching for Small Molecules with an Atomic Sort. Angew Chem Int Ed Engl 2020; 59:1144-1148. [PMID: 31696595 PMCID: PMC6942196 DOI: 10.1002/anie.201911862] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/17/2019] [Revised: 10/24/2019] [Indexed: 12/14/2022]
Abstract
The discovery of biologically active small molecules requires sifting through large amounts of data to identify unique or unusual arrangements of atoms. Here, we develop, test and evaluate an atom-based sort to identify novel features of secondary metabolites and demonstrate its use to evaluate novelty in marine microbial and sponge extracts. This study outlines an important ongoing advance towards the translation of autonomous systems to identify, and ultimately elucidate, atomic novelty within a complex mixture of small molecules.
Affiliation(s)
- Brendan M Duggan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Reiko Cullum
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, 92093-0204, USA
- William Fenical
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, 92093-0204, USA
- Luis A Amador
- Molecular Sciences Research Center, University of Puerto Rico, 1390 Ponce de León Avenue, San Juan, 00926, Puerto Rico
- Abimael D Rodríguez
- Molecular Sciences Research Center, University of Puerto Rico, 1390 Ponce de León Avenue, San Juan, 00926, Puerto Rico
- James J La Clair
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
3
Xu C, Bouvier G, Bardiaux B, Nilges M, Malliavin T, Lisser A. Ordering Protein Contact Matrices. Comput Struct Biotechnol J 2018; 16:140-156. [PMID: 29632657 PMCID: PMC5889711 DOI: 10.1016/j.csbj.2018.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2017] [Revised: 02/28/2018] [Accepted: 03/01/2018] [Indexed: 11/29/2022] Open
Abstract
Numerous biophysical approaches provide information about the spatial proximity of residues in proteins. However, correct assignment of the protein fold from this proximity information is not straightforward if the spatially close protein residues are not assigned to residues in the primary sequence. Here, we propose an algorithm to assign such residue numbers by ordering the rows and columns of the raw protein contact matrix directly obtained from proximity information between unassigned amino acids. The ordering problem is formulated as the search for a trail within a graph connecting protein residues through the nonzero contact values. The algorithm proceeds in two steps: (i) finding the longest trail of the graph using an original dynamic programming algorithm, (ii) clustering the individual ordered matrices using a self-organizing map (SOM) approach. The combination of the dynamic programming and self-organizing map approaches is an innovative aspect of the present work. The algorithm was validated on a set of about 900 proteins, representative of the sizes and proportions of secondary structures observed in the Protein Data Bank. The algorithm proved efficient for noise levels up to 40%, obtaining average gaps of at most about 20% between ordered and initial matrices. The proposed approach paves the way toward a method of fold prediction from noisy proximity information, as TM scores larger than 0.5 have been obtained for ten randomly chosen proteins at a noise level of 10%. The method has also been validated on two experimental cases, on which it performed satisfactorily.
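As a rough illustration of what "ordering" a contact matrix means (this is not the authors' trail-search algorithm), the sketch below applies a candidate residue ordering to the rows and columns of a small symmetric matrix. The helper names `reorder` and `inverse_permutation`, the four-residue chain, and the scrambling permutation are all hypothetical, chosen only for this example.

```python
def reorder(matrix, order):
    """Return the matrix with rows and columns permuted by `order`."""
    return [[matrix[i][j] for j in order] for i in order]


def inverse_permutation(perm):
    """Return q such that perm[q[k]] == k for every k."""
    inverse = [0] * len(perm)
    for position, value in enumerate(perm):
        inverse[value] = position
    return inverse


# Hypothetical contact matrix of a 4-residue chain: residue i touches i+1.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]

# Scramble the residue labels, as if the sequence assignment were unknown.
scramble = [2, 0, 3, 1]
raw = reorder(chain, scramble)

# Applying the inverse ordering restores the banded, sequence-ordered matrix.
restored = reorder(raw, inverse_permutation(scramble))
```

In this toy case the correct ordering is known in advance; the abstract's algorithm addresses the harder problem of finding such an ordering from the raw matrix alone.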
Affiliation(s)
- Chuan Xu
- Laboratoire de Recherche en Informatique, Université Paris-Sud and CNRS UMR8623, France
- Guillaume Bouvier
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
- Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Benjamin Bardiaux
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
- Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Michael Nilges
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
- Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Thérèse Malliavin
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
- Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Abdel Lisser
- Laboratoire de Recherche en Informatique, Université Paris-Sud and CNRS UMR8623, France
4
Crippen GM, Rousaki A, Revington M, Zhang Y, Zuiderweg ERP. SAGA: rapid automatic mainchain NMR assignment for large proteins. J Biomol NMR 2010; 46:281-298. [PMID: 20232231 DOI: 10.1007/s10858-010-9403-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 12/07/2009] [Accepted: 02/23/2010] [Indexed: 05/26/2023]
Abstract
Here we describe a new algorithm for automatically determining the mainchain sequential assignment of NMR spectra for proteins. Using only the customary triple resonance experiments, assignments can be quickly found for not only small proteins having rather complete data, but also for large proteins, even when only half the residues can be assigned. The result of the calculation is not the single best assignment according to some criterion, but rather a large number of satisfactory assignments that are summarized in such a way as to help the user identify portions of the sequence that are assigned with confidence, vs. other portions where the assignment has some correlated alternatives. Thus very imperfect initial data can be used to suggest future experiments.
Affiliation(s)
- Gordon M Crippen
- College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, USA.
5
6
Mielke SP, Krishnan V. Characterization of protein secondary structure from NMR chemical shifts. Prog Nucl Magn Reson Spectrosc 2009; 54:141-165. [PMID: 20160946 PMCID: PMC2766081 DOI: 10.1016/j.pnmrs.2008.06.002] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.1] [Indexed: 05/10/2023]
Affiliation(s)
- Steven P. Mielke
- UC Davis Genome Center, University of California, Davis, California
- V.V. Krishnan
- Department of Applied Science and Center for Comparative Medicine, University of California, Davis, California
- Department of Chemistry, California State University, Fresno, California
7
Verdegem D, Dijkstra K, Hanoulle X, Lippens G. Graphical interpretation of Boolean operators for protein NMR assignments. J Biomol NMR 2008; 42:11-21. [PMID: 18762868 DOI: 10.1007/s10858-008-9262-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 03/31/2008] [Revised: 06/06/2008] [Accepted: 06/09/2008] [Indexed: 05/26/2023]
Abstract
We have developed a graphics based algorithm for semi-automated protein NMR assignments. Using the basic sequential triple resonance assignment strategy, the method is inspired by the Boolean operators as it applies "AND"-, "OR"- and "NOT"-like operations on planes pulled out of the classical three-dimensional spectra to obtain its functionality. The method's strength lies in the continuous graphical presentation of the spectra, allowing both a semi-automatic peaklist construction and sequential assignment. We demonstrate here its general use for the case of a folded protein with a well-dispersed spectrum, but equally for a natively unfolded protein where spectral resolution is minimal.
Affiliation(s)
- Dries Verdegem
- Unité de Glycobiologie Structurale et Fonctionelle, UMR 8576 CNRS, IFR 147, Université des Sciences et Technologies de Lille, 59655, Villeneuve d'Ascq, France
8
Lemak A, Steren CA, Arrowsmith CH, Llinás M. Sequence specific resonance assignment via Multicanonical Monte Carlo search using an ABACUS approach. J Biomol NMR 2008; 41:29-41. [PMID: 18458824 DOI: 10.1007/s10858-008-9238-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 05/10/2007] [Accepted: 04/08/2008] [Indexed: 05/26/2023]
Abstract
ABACUS [Grishaev et al. (2005) Proteins 61:36-43] is a novel protocol for automated protein structure determination via NMR. ABACUS starts from molecular fragments defined by unassigned J-coupled spin-systems and involves a Monte Carlo stochastic search in assignment space, probabilistic sequence selection, and assembly of fragments into structures that are used to guide the stochastic search. Here, we report further development of the two main algorithms that increase the flexibility and robustness of the method. Performance of the BACUS [Grishaev and Llinás (2004) J Biomol NMR 28:1-101] algorithm was significantly improved through use of sequential connectivities available from through-bond correlated 3D-NMR experiments, and a new set of likelihood probabilities derived from a database of 56 ultra high resolution X-ray structures. A Multicanonical Monte Carlo procedure, Fragment Monte Carlo (FMC), was developed for sequence-specific assignment of spin-systems. It relies on an enhanced assignment sampling and provides the uncertainty of assignments in a quantitative manner. The efficiency of the protocol was validated on data from four proteins of between 68-116 residues, yielding 100% accuracy in sequence specific assignment of backbone and side chain resonances.
Affiliation(s)
- Alexander Lemak
- The Ontario Cancer Institute and Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada M5G 2M9.
9
10
Poulding S, Charlton AJ, Donarski J, Wilson JC. Removal of t(1) noise from metabolomic 2D (1)H-(13)C HSQC NMR spectra by Correlated Trace Denoising. J Magn Reson 2007; 189:190-199. [PMID: 17920317 DOI: 10.1016/j.jmr.2007.09.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/23/2007] [Revised: 09/06/2007] [Accepted: 09/10/2007] [Indexed: 05/25/2023]
Abstract
The presence of t(1) noise artefacts in 2D phase-cycled Heteronuclear Single Quantum Coherence (HSQC) spectra constrains the use of this experiment despite its superior sensitivity. This paper proposes a new processing algorithm, working in the frequency-domain, for reducing t(1) noise. The algorithm has been developed for use in contexts, such as metabolomic studies, where existing denoising techniques cannot always be applied. Two test cases are presented that show the algorithm to be effective in improving the SNR of peaks embedded within t(1) noise by a factor of more than 2, while retaining the intensity and shape of genuine peaks.
11
Lula I, Denadai AL, Resende JM, de Sousa FB, de Lima GF, Pilo-Veloso D, Heine T, Duarte HA, Santos RAS, Sinisterra RD. Study of angiotensin-(1-7) vasoactive peptide and its beta-cyclodextrin inclusion complexes: complete sequence-specific NMR assignments and structural studies. Peptides 2007; 28:2199-210. [PMID: 17904691 DOI: 10.1016/j.peptides.2007.08.011] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.6] [Received: 05/05/2007] [Revised: 08/06/2007] [Accepted: 08/06/2007] [Indexed: 11/19/2022]
Abstract
We report the complete sequence-specific hydrogen NMR assignments of vasoactive peptide angiotensin-(1-7) (Ang-(1-7)). Assignments of the majority of the resonances were accomplished by COSY, TOCSY, and ROESY peak coordinates at 400 MHz and 600 MHz. Long-side-chain amino acid spin system identification was facilitated by long-range coherence transfer experiments (TOCSY). Problems with overlapped resonance signals were solved by analysis of heteronuclear 2D experiments (HSQC and HMBC). Nuclear Overhauser effects (NOE) results were used to probe peptide conformation. We show that the inclusion of the angiotensin-(1-7) tyrosine residue is favored in inclusion complexes with beta-cyclodextrin. QM/MM simulations at the DFTB/UFF level confirm the experimental NMR findings and provide detailed structural information on these compounds in aqueous solution.
Affiliation(s)
- Ivana Lula
- Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Av. Antonio Carlos 6627, 31270-901 Belo Horizonte, Minas Gerais, Brazil
12
Baran MC, Moseley HNB, Aramini JM, Bayro MJ, Monleon D, Locke JY, Montelione GT. SPINS: a laboratory information management system for organizing and archiving intermediate and final results from NMR protein structure determinations. Proteins 2006; 62:843-51. [PMID: 16395675 DOI: 10.1002/prot.20840] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
Recent technological advances and experimental techniques have contributed to an increasing number and size of NMR datasets. In order to scale up productivity, laboratory information management systems for handling these extensive data need to be designed and implemented. The SPINS (Standardized ProteIn Nmr Storage) Laboratory Information Management System (LIMS) addresses these needs by providing an interface for archival of complete protein NMR structure determinations, together with functionality for depositing these data to the public BioMagResBank (BMRB). The software tracks intermediate files during each step of an NMR structure-determination process, including: data collection, data processing, resonance assignments, resonance assignment validation, structure calculation, and structure validation. The underlying SPINS data dictionary allows for the integration of various third party NMR data processing and analysis software, enabling users to launch programs they are accustomed to using for each step of the structure determination process directly out of the SPINS user interface.
Affiliation(s)
- Michael C Baran
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Northeast Structural Genomics Consortium, Piscataway, New Jersey 08854, USA
13
Huang YJ, Moseley HNB, Baran MC, Arrowsmith C, Powers R, Tejero R, Szyperski T, Montelione GT. An integrated platform for automated analysis of protein NMR structures. Methods Enzymol 2005; 394:111-41. [PMID: 15808219 DOI: 10.1016/s0076-6879(05)94005-6] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.2] [Indexed: 12/03/2022]
Abstract
Recent developments provide automated analysis of NMR assignments and three-dimensional (3D) structures of proteins. These approaches are generally applicable to proteins ranging from about 50 to 150 amino acids. In this chapter, we summarize progress by the Northeast Structural Genomics Consortium in standardizing the NMR data collection process for protein structure determination and in building an integrated platform for automated protein NMR structure analysis. Our integrated platform includes the following principal steps: (1) standardized NMR data collection, (2) standardized data processing (including spectral referencing and Fourier transformation), (3) automated peak picking and peak list editing, (4) automated analysis of resonance assignments, (5) automated analysis of NOESY data together with 3D structure determination, and (6) methods for protein structure validation. In particular, the software AutoStructure for automated NOESY data analysis is described in this chapter, together with a discussion of practical considerations for its use in high-throughput structure production efforts. The critical area of data quality assessment has evolved significantly over the past few years and involves evaluation of both intermediate and final peak lists, resonance assignments, and structural information derived from the NMR data. Methods for quality control of each of the major automated analysis steps in our platform are also discussed. Despite significant remaining challenges, when good quality data are available, automated analysis of protein NMR assignments and structures with this platform is both fast and reliable.
Affiliation(s)
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey 08854, USA
14
Baran MC, Huang YJ, Moseley HNB, Montelione GT. Automated analysis of protein NMR assignments and structures. Chem Rev 2004; 104:3541-56. [PMID: 15303826 DOI: 10.1021/cr030408p] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Indexed: 11/28/2022]
Affiliation(s)
- Michael C Baran
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, and Northeast Structural Genomics Consortium, Rutgers University, 679 Hoes Lane, Piscataway, NJ 08854, USA
15
Xu Y, Jablonsky MJ, Jackson PL, Braun W, Krishna NR. Automatic assignment of NOESY cross peaks and determination of the protein structure of a new world scorpion neurotoxin using NOAH/DIAMOD. J Magn Reson 2001; 148:35-46. [PMID: 11133274 DOI: 10.1006/jmre.2000.2220] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 05/23/2023]
Abstract
The 3D NMR structures of the scorpion neurotoxin, CsE-v5, were determined from the same NOESY spectra with NOAH/DIAMOD, an automated assignment and 3D structure calculation software package, and with a conventional manual assignment combined with a distance geometry/simulated annealing (X-PLOR) refinement method. The NOESY assignments and the 3D structures obtained from the two independent methods were compared in detail. The NOAH/DIAMOD program suite uses feedback filtering and self-correcting distance geometry methods to automatically assign NOESY spectra and to calculate the 3D structure of a protein. NOESY cross peaks were automatically picked using a standard software package and combined with 74 manually assigned NOESY peaks to start the NOAH/DIAMOD calculations. After 63 NOAH/DIAMOD cycles, using REDAC procedures in the last 8 cycles, and final FANTOM constrained energy minimization, a bundle of 20 structures with the smallest target functions has an RMSD of 0.81 Å for backbone atoms and 1.11 Å for all heavy atoms to the mean structure. Despite some missing chemical shifts of side chain protons, 776 (including 74 manually assigned) of 1130 NOE peaks were unambiguously assigned, 150 peaks have more than one possible assignment compatible with the bundle structures, and only 30 peaks could not be assigned within the given chemical shift tolerance ranges in either the D1 or the D2 dimension. The remaining 174, mainly weak, NOE peaks were not compatible with the final 20 best bundle structures at the last NOAH/DIAMOD cycle. The automatically determined structures agree well with the structures determined independently using the conventional method and the same NMR spectra, with the mean RMSD in well-defined regions of 0.84 Å for backbone atoms and 1.48 Å for all heavy atoms from residues 2-5, 18-26, 32-36, and 39-45. This study demonstrates the potential of the NOAH/DIAMOD program suite to automatically assign NMR data for proteins and determine their structure.
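The "RMSD to the mean structure" figure quoted in this abstract can be illustrated with a minimal sketch. It assumes the bundle has already been superposed onto a common frame (real pipelines perform an optimal superposition first, which is omitted here), and the function names and the two-model toy bundle are hypothetical rather than taken from NOAH/DIAMOD.

```python
import math


def mean_structure(bundle):
    """Average each atom's coordinates over all models in the bundle."""
    n_models = len(bundle)
    n_atoms = len(bundle[0])
    return [
        tuple(sum(model[atom][axis] for model in bundle) / n_models
              for axis in range(3))
        for atom in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized atom lists."""
    squared = sum(
        (a[axis] - b[axis]) ** 2
        for a, b in zip(coords_a, coords_b)
        for axis in range(3)
    )
    return math.sqrt(squared / len(coords_a))


def mean_rmsd_to_mean(bundle):
    """Average RMSD of every model to the bundle's mean structure.

    Assumption: models are pre-superposed; no fitting is done here.
    """
    mean = mean_structure(bundle)
    return sum(rmsd(model, mean) for model in bundle) / len(bundle)


# Toy bundle: two models of a single atom, 2 Å apart along x.
toy_bundle = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
spread = mean_rmsd_to_mean(toy_bundle)
```

Restricting the atom lists to backbone atoms or to all heavy atoms would give the two kinds of values reported in the abstract.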
Affiliation(s)
- Y Xu
- Department of Human Biological Chemistry and Genetics, Sealy Center for Structural Biology, Galveston, Texas, 77555-1157, USA
16
Schubert M, Oschkinat H, Schmieder P. MUSIC, selective pulses, and tuned delays: amino acid type-selective (1)H-(15)N correlations, II. J Magn Reson 2001; 148:61-72. [PMID: 11133277 DOI: 10.1006/jmre.2000.2222] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.3] [Indexed: 05/16/2023]
Abstract
Amino acid type-selective experiments help to remove ambiguities in either manual or automated assignment procedures. Here we present modified triple-resonance experiments that yield amino acid type-selective (1)H-(15)N correlations. They are based on the MUSIC coherence transfer scheme which replaces the initial INEPT transfer and is selective for XH(2) or XH(3) (where X is either (15)N or (13)C). Signals of the desired amino acid types are thus selected based on the topology of the side chain. MUSIC is combined with selective pulses and carefully tuned delays to create experiments for Ser (S-HSQC); Val, Ile, and Ala (VIA-HSQC); Leu and Ala (LA-HSQC); Asp, Asn, and Gly (DNG-HSQC), as well as Glu, Gln, and Gly (EQG-HSQC). The new experiments are recorded as two-dimensional spectra and their performance is demonstrated by their application to two protein domains of 83 and 115 residues.
Affiliation(s)
- M Schubert
- Forschungsinstitut für Molekulare Pharmakologie, Robert-Roessle-Str. 10, Berlin, D-13125, Germany
17
Frimurer TM, Bywater R, Naerum L, Lauritsen LN, Brunak S. Improving the odds in discriminating "drug-like" from "non drug-like" compounds. J Chem Inf Comput Sci 2000; 40:1315-24. [PMID: 11128089 DOI: 10.1021/ci0003810] [Citation(s) in RCA: 101] [Impact Index Per Article: 4.2] [Indexed: 11/29/2022]
Abstract
We have used a feed-forward neural network technique to classify chemical compounds into potentially "drug-like" and "non drug-like" candidates. The neural network was trained to distinguish between a set of "drug-like" and "non drug-like" chemical compounds taken from the MACCS-II Drug Data Report (MDDR) and the Available Chemicals Directory (ACD). The 2D atom types (of the full atomic representation) were assigned and applied as descriptors to encode numerically each compound. There are four main conclusions: First the method performs well, correctly assigning 88% of the compounds in both MDDR and ACD. Improved discrimination was achieved by a more critical selection of training sets. Second, the method gives much better prediction performance than the widely used "Rule of Five", which accepts as many as 74% of the ACD compounds but only 66% of those in MDDR, resulting in a correlation coefficient which is effectively zero, compared to a value of 0.63 for the neural network prediction. Third, based on a standard Tanimoto similarity search the selection of drug-like compounds in the evaluation set is not biased toward compounds similar to those in the training set. Fourth, the trained neural network was applied to evaluate the drug-likeness of 136 GABA uptake inhibitors with impressive results. The implications of applying a neural network to characterize chemical compounds are discussed.
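The Tanimoto similarity mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming each compound is encoded as a set of "on" fingerprint bit indices; the function name, the toy fingerprints, and the convention that two empty fingerprints score 1.0 are assumptions made for this example, not details from the paper.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    bits_a, bits_b = set(fingerprint_a), set(fingerprint_b)
    union = bits_a | bits_b
    if not union:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(bits_a & bits_b) / len(union)


# Hypothetical fingerprints given as indices of set bits.
query = {1, 2, 3}
candidate = {2, 3, 4}
score = tanimoto(query, candidate)  # 2 shared bits out of 4 distinct bits
```

A similarity search of the kind the abstract describes would rank training-set compounds by this score against each evaluation compound.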
18
Schubert M, Smalla M, Schmieder P, Oschkinat H. MUSIC in triple-resonance experiments: amino acid type-selective (1)H-(15)N correlations. J Magn Reson 1999; 141:34-43. [PMID: 10527741 DOI: 10.1006/jmre.1999.1881] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.5] [Indexed: 05/16/2023]
Abstract
Amino acid type-selective triple-resonance experiments can be of great help for the assignment of protein spectra, since they help to remove ambiguities in either manual or automated assignment procedures. Here, modified triple-resonance experiments that yield amino acid type-selective (1)H-(15)N correlations are presented. They are based on novel coherence transfer schemes, the MUSIC pulse sequence elements, that replace the initial INEPT transfer and are selective for XH(2) or XH(3) (X can be (15)N or (13)C). The desired amino acid type is thereby selected based on the topology of the side chain. Experiments for Gly (G-HSQC); Ala (A-HSQC); Thr, Val, Ile, and Ala (TAVI-HSQC); Thr and Ala (TA-HSQC), as well as Asn and Gln (N-HSQC and QN-HSQC), are described. The new experiments are recorded as two-dimensional experiments and therefore need only small amounts of spectrometer time. The performance of the experiments is demonstrated with the application to two protein domains. Copyright 1999 Academic Press.
Affiliation(s)
- M Schubert
- Forschungsinstitut für Molekulare Pharmakologie, Alfred-Kowalke-Strasse 4, Berlin, D-10315, Germany
19
Starovasnik MA, Christinger HW, Wiesmann C, Champe MA, de Vos AM, Skelton NJ. Solution structure of the VEGF-binding domain of Flt-1: comparison of its free and bound states. J Mol Biol 1999; 293:531-44. [PMID: 10543948 DOI: 10.1006/jmbi.1999.3134] [Citation(s) in RCA: 47] [Impact Index Per Article: 1.9] [Indexed: 12/31/2022]
Abstract
The extracellular portion of the VEGF and PlGF receptor, Flt-1 (or VEGFR-1), consists of seven immunoglobulin-like domains. The second domain from the N terminus (Flt-1D2) is necessary and sufficient for high affinity VEGF binding. The 1.7 Å resolution crystal structure of Flt-1D2 bound to VEGF revealed that this domain is a member of the I-set of the immunoglobulin superfamily, but has several unusual features including a region near the N terminus that bulges away from the domain rather than pairing with the neighboring beta-strand. Some of the residues in this region make contact with VEGF, raising the possibility that this bulge could be a consequence of VEGF binding and might not be present in the absence of ligand. Here we report the three-dimensional structure of Flt-1D2 in its uncomplexed form determined by NMR spectroscopy. A semi-automated method for NOE assignment that takes advantage of the previously solved crystal structure was used to facilitate rapid analysis of the 3D NOESY spectra. The solution structure is very similar to the previously reported VEGF-bound crystal structure; the N-terminal bulge is present, albeit in a different conformation. We also report the 2.7 Å crystal structure of Flt-1D2 in complex with VEGF solved in a different crystal form that reveals yet another conformation for the N-terminal bulge region. (1)H-(15)N heteronuclear NOEs indicate this region is flexible in solution; the crystal structures show that this region is able to adopt more than one conformation even when bound to VEGF. Thus, VEGF-binding is not accompanied by significant structural change in Flt-1D2, and the unusual structural features of Flt-1D2 are an intrinsic property of this domain.
Affiliation(s)
- M A Starovasnik
- Department of Protein Engineering, Genentech, Inc., One DNA Way, South San Francisco, CA, 94080, USA.
20
Xu Y, Wu J, Gorenstein D, Braun W. Automated 2D NOESY assignment and structure calculation of Crambin(S22/I25) with the self-correcting distance geometry based NOAH/DIAMOD programs. J Magn Reson 1999; 136:76-85. [PMID: 9887292 DOI: 10.1006/jmre.1998.1616] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 05/22/2023]
Abstract
The NOAH/DIAMOD program suite was used to automatically assign an experimental 2D NOESY spectrum of the 46 residue protein crambin(S22/I25), using feedback filtering and self-correcting distance geometry (SECODG). Automatically picked NOESY cross peaks were combined with 157 manually assigned peaks to start NOAH/DIAMOD calculations. At each cycle, DIAMOD was used to calculate an ensemble of 40 structures from these NOE distance constraints and random starting structures. The 10 structures with smallest target function values were analyzed by the structure-based filter, NOAH, and a new set of possible assignments was automatically generated based on chemical shifts and distance constraints violations. After 60 iterations and final energy minimization, the 10 structures with smallest target functions converged to 1.48 Å for backbone atoms. Despite several missing chemical shifts, 426 of 613 NOE peaks were unambiguously assigned; 59 peaks were ambiguously assigned. The remaining 128 peaks picked automatically by FELIX are probably primarily noise peaks, with a few real peaks that were not assigned by NOAH due to the incomplete proton chemical shifts list.
Affiliation(s)
- Y Xu
- Sealy Center for Structural Biology and Department of Human Biological Chemistry and Genetics, University of Texas Medical Branch, Galveston, Texas, 77555-1157, USA
|
21
|
Schmieder P, Leidert M, Kelly M, Oschkinat H. Multiplicity-Selective Coherence Transfer Steps for the Design of Amino Acid-Selective Experiments-A Triple-Resonance Experiment Selective for Asn and Gln. J Magn Reson 1998; 131:199-202. [PMID: 9571093 DOI: 10.1006/jmre.1997.1348] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 05/22/2023]
Abstract
A multiplicity-selective coherence transfer step is discussed that can replace the normal INEPT transfer in triple-resonance experiments. Depending on the pulse sequence in which the step is implemented, amino acid-selective experiments are created. Two experiments selective for Asn and Gln are proposed. Copyright 1998 Academic Press.
Affiliation(s)
- P Schmieder
- Forschungsinstitut für Molekulare Pharmakologie, Alfred-Kowalke-Strasse 4, Berlin, D-10315, Germany
|
22
|
Zimmerman DE, Kulikowski CA, Huang Y, Feng W, Tashiro M, Shimotakahara S, Chien C, Powers R, Montelione GT. Automated analysis of protein NMR assignments using methods from artificial intelligence. J Mol Biol 1997; 269:592-610. [PMID: 9217263 DOI: 10.1006/jmbi.1997.1052] [Citation(s) in RCA: 248] [Impact Index Per Article: 9.2] [Indexed: 02/04/2023]
Abstract
An expert system for determining resonance assignments from NMR spectra of proteins is described. Given the amino acid sequence, a two-dimensional 15N-1H heteronuclear correlation spectrum, and seven to eight three-dimensional triple-resonance NMR spectra for seven proteins, AUTOASSIGN obtained an average of 98% of sequence-specific spin-system assignments with an error rate of less than 0.5%. Execution times on a Sparc 10 workstation varied from 16 seconds for smaller proteins with simple spectra to one to nine minutes for medium-sized proteins exhibiting numerous extra spin systems attributed to conformational isomerization. AUTOASSIGN combines symbolic constraint satisfaction methods with a domain-specific knowledge base to exploit the logical structure of the sequential assignment problem, the specific features of the various NMR experiments, and the expected chemical shift frequencies of different amino acids. The current implementation specializes in the analysis of data derived from the most sensitive of the currently available triple-resonance experiments. Potential extensions of the system for analysis of additional types of protein NMR data are also discussed.
Affiliation(s)
- D E Zimmerman
- Center for Advanced Biotechnology and Medicine and Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ 08854-5638, USA
|
23
|
Buchler NE, Zuiderweg ER, Wang H, Goldstein RA. Protein heteronuclear NMR assignments using mean-field simulated annealing. J Magn Reson 1997; 125:34-42. [PMID: 9245358 DOI: 10.1006/jmre.1997.1106] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.9] [Indexed: 05/22/2023]
Abstract
A computational method for the assignment of the NMR spectra of larger (21 kDa) proteins using a set of six of the most sensitive heteronuclear multidimensional nuclear magnetic resonance experiments is described. Connectivity data obtained from HNCα, HN(CO)Cα, HN(Cα)Hα, and Hα(CαCO)NH and spin-system identification data obtained from CP-(H)CCH-TOCSY and CP-(H)C(CαCO)NH-TOCSY were used to perform sequence-specific assignments using a mean-field formalism and simulated annealing. This mean-field method reports the resonance assignments in a probabilistic fashion, displaying the certainty of assignments in an unambiguous and quantitative manner. This technique was applied to the NMR data of the 172-residue peptide-binding domain of the E. coli heat-shock protein, DnaK. The method is demonstrated to be robust to significant amounts of missing, spurious, noisy, extraneous, and erroneous data.
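To make the idea of probabilistic assignments from mean-field annealing concrete, here is a minimal generic sketch. It is not the authors' formalism: the cost matrix, the clash penalty, and the cooling schedule are all assumed for illustration. Each row of the result gives the probability that one spin system belongs to each residue, and the distribution sharpens as the temperature is lowered.

```python
import math

def mean_field_anneal(cost, penalty=1.0, t_start=2.0, t_end=0.05, cooling=0.8):
    """Return p[i][j], the probability that spin system i maps to residue j.

    `cost[i][j]` is a hypothetical mismatch score; `penalty` discourages two
    spin systems from sharing one residue. All values are illustrative.
    """
    n, m = len(cost), len(cost[0])
    p = [[1.0 / m] * m for _ in range(n)]  # start maximally uncertain
    t = t_start
    while t > t_end:
        # Expected occupancy of each residue under the current probabilities.
        col = [sum(p[i][j] for i in range(n)) for j in range(m)]
        new_p = []
        for i in range(n):
            # Mean field: own cost plus expected clash with other spin systems.
            h = [cost[i][j] + penalty * (col[j] - p[i][j]) for j in range(m)]
            h_min = min(h)  # shift for numerical stability
            w = [math.exp(-(x - h_min) / t) for x in h]
            z = sum(w)
            new_p.append([x / z for x in w])
        p = new_p
        t *= cooling  # geometric cooling
    return p

cost = [[0.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
probs = mean_field_anneal(cost)
best = [max(range(len(row)), key=row.__getitem__) for row in probs]
print(best)
```

Because the output is a probability per candidate residue rather than a single hard choice, the certainty of each assignment can be read directly from the matrix, which is the property the abstract emphasises.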
Affiliation(s)
- N E Buchler
- Biophysics Research Division, University of Michigan, Ann Arbor 48109-1055, USA
|
24
|
Bartels C, Güntert P, Billeter M, Wüthrich K. GARANT-a general algorithm for resonance assignment of multidimensional nuclear magnetic resonance spectra. J Comput Chem 1997. [DOI: 10.1002/(sici)1096-987x(19970115)18:1<139::aid-jcc13>3.0.co;2-h] [Citation(s) in RCA: 110] [Impact Index Per Article: 4.1] [Indexed: 11/05/2022]
|
25
|
Bartels C, Güntert P, Billeter M, Wüthrich K. GARANT-a general algorithm for resonance assignment of multidimensional nuclear magnetic resonance spectra. J Comput Chem 1997. [DOI: 10.1002/(sici)1096-987x(19970115)18:1%3c139::aid-jcc13%3e3.0.co;2-h] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/07/2022]
|
26
|
Rios CB, Feng W, Tashiro M, Shang Z, Montelione GT. Phase labeling of C-H and C-C spin-system topologies: application in constant-time PFG-CBCA(CO)NH experiments for discriminating amino acid spin-system types. J Biomol NMR 1996; 8:345-350. [PMID: 8953221 DOI: 10.1007/bf00410332] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.5] [Indexed: 05/22/2023]
Abstract
Triple-resonance experiments facilitate the determination of sequence-specific resonance assignments of medium-sized 13C,15N-enriched proteins. Some triple-resonance experiments can also be used to obtain information about amino acid spin-system topologies by proper delay tuning. The constant-time PFG-CBCA(CO)NH experiment allows discrimination between five different groups of amino acids by tuning (phase labeling) independently the delays for proton-carbon refocusing and carbon-carbon constant-time frequency labeling. The proton-carbon refocusing delay allows discrimination of spin-system topologies based on the number of protons attached to Cα and Cβ atoms (i.e. C-H phase labeling). In addition, tuning of the carbon-carbon constant-time frequency-labeling delay discriminates topologies based on the number of carbons directly coupled to Cα and Cβ atoms (i.e. C-C phase labeling). Classifying the spin systems into these five groups facilitates identification of amino acid types, making both manual and automated analysis of assignments easier. The use of this pair of optimally tuned PFG-CBCA(CO)NH experiments for distinguishing five spin-system topologies is demonstrated for the 124-residue bovine pancreatic ribonuclease A protein.
Affiliation(s)
- C B Rios
- Center for Advanced Biotechnology, Rutgers University, Piscataway, NJ 08854-5638, USA
|
27
|
Clark DE, Westhead DR. Evolutionary algorithms in computer-aided molecular design. J Comput Aided Mol Des 1996; 10:337-58. [PMID: 8877705 DOI: 10.1007/bf00124503] [Citation(s) in RCA: 75] [Impact Index Per Article: 2.7] [Indexed: 02/02/2023]
Abstract
In recent years, search and optimisation algorithms inspired by evolutionary processes have been applied with marked success to a wide variety of problems in diverse fields of study. In this review, we survey the growing application of these 'evolutionary algorithms' in one such area: computer-aided molecular design. In the course of the review, we seek to summarise the work to date and to indicate where evolutionary algorithms have met with success and where they have not fared so well. In addition to this, we also attempt to discern some future trends in both the basic research concerning these algorithms and their application to the elucidation, design and modelling of chemical and biochemical structures.
Affiliation(s)
- D E Clark
- Proteus Molecular Design Ltd., Macclesfield, U.K
|