1. Liu ZH, Teixeira JMC, Zhang O, Tsangaris TE, Li J, Gradinaru CC, Head-Gordon T, Forman-Kay JD. Local Disordered Region Sampling (LDRS) for ensemble modeling of proteins with experimentally undetermined or low confidence prediction segments. Bioinformatics 2023; 39:btad739. [PMID: 38060268 PMCID: PMC10733734 DOI: 10.1093/bioinformatics/btad739] [Received: 07/25/2023] [Revised: 10/30/2023] [Accepted: 12/06/2023] [Indexed: 12/08/2023]
Abstract
SUMMARY: The Local Disordered Region Sampling (LDRS, pronounced "loaders") tool is a new module developed for IDPConformerGenerator, a previously validated approach to modeling intrinsically disordered proteins (IDPs). The IDPConformerGenerator LDRS module provides a method for generating all-atom conformations of intrinsically disordered protein regions at the N- and C-termini of, and in loops or linkers between, folded regions of an existing protein structure. These disordered elements often lead to missing coordinates in experimental structures or low confidence in predicted structures. Requiring only a pre-existing PDB- or mmCIF-formatted structural template of the protein (with missing coordinates or with predicted confidence scores) and its full-length primary sequence, LDRS automatically generates physically meaningful conformational ensembles of the missing flexible regions to complete the full-length protein. The capabilities of the LDRS tool of IDPConformerGenerator include modeling phosphorylation sites using enhanced Monte Carlo-Side Chain Entropy, transmembrane proteins within an all-atom bilayer, and multi-chain complexes. The modeling capacity of LDRS capitalizes on the modularity of the IDPConformerGenerator platform, its ability to be used both as a library and via the command line, and its computational speed.

AVAILABILITY AND IMPLEMENTATION: The LDRS module is part of the IDPConformerGenerator modeling suite, which can be downloaded from GitHub at https://github.com/julie-forman-kay-lab/IDPConformerGenerator. IDPConformerGenerator is written in Python 3 and works on Linux, Microsoft Windows, and macOS versions that support DSSP. Users can script with LDRS's Python API the same way they can use any other part of IDPConformerGenerator's API, by importing functions from the "idpconfgen.ldrs_helper" library. Alternatively, LDRS can be used as a command-line interface application within IDPConformerGenerator. Full documentation is available within the command-line interface as well as on IDPConformerGenerator's official documentation pages (https://idpconformergenerator.readthedocs.io/en/latest/).
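To make the required inputs concrete, the following is a minimal sketch of how the unresolved regions of a template could be located from the full-length sequence length and the residue numbers that have coordinates. It is an illustration only, not LDRS's implementation and not part of the idpconfgen API; the function name `find_missing_segments` and its labels are hypothetical.

```python
from typing import List, Tuple


def find_missing_segments(
    seq_length: int, observed: List[int]
) -> List[Tuple[str, int, int]]:
    """Return unresolved residue ranges as (label, start, end), 1-based inclusive.

    Hypothetical helper for illustration; it does not call idpconfgen.
    """
    present = sorted(set(observed))
    if any(r < 1 or r > seq_length for r in present):
        raise ValueError("observed residue numbers must lie within the sequence")
    if not present:
        return [("whole-chain", 1, seq_length)]

    segments = []
    # Residues before the first resolved residue form an N-terminal tail.
    if present[0] > 1:
        segments.append(("N-terminal", 1, present[0] - 1))
    # Any jump in numbering between resolved residues is a loop or linker.
    for prev, nxt in zip(present, present[1:]):
        if nxt - prev > 1:
            segments.append(("linker", prev + 1, nxt - 1))
    # Residues after the last resolved residue form a C-terminal tail.
    if present[-1] < seq_length:
        segments.append(("C-terminal", present[-1] + 1, seq_length))
    return segments


if __name__ == "__main__":
    # A 20-residue chain with coordinates for residues 4-7 and 11-15 only.
    for label, start, end in find_missing_segments(20, [4, 5, 6, 7, 11, 12, 13, 14, 15]):
        print(f"{label}: {start}-{end}")
```

The three labels correspond to the three cases named in the abstract (N-terminal tails, loops or linkers between folded regions, and C-terminal tails); how LDRS itself detects and builds these regions is described in its documentation.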
Affiliation(s)
- Zi Hao Liu
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- João M C Teixeira
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Oufan Zhang
  - Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, CA 94720, United States
  - Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720-1460, United States
- Thomas E Tsangaris
  - Department of Physics, University of Toronto, Toronto, ON M5S 1A7, Canada
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, ON L5L 1C6, Canada
- Jie Li
  - Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, CA 94720, United States
  - Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720-1460, United States
- Claudiu C Gradinaru
  - Department of Physics, University of Toronto, Toronto, ON M5S 1A7, Canada
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, ON L5L 1C6, Canada
- Teresa Head-Gordon
  - Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, CA 94720, United States
  - Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720-1460, United States
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, CA 94720-1462, United States
  - Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94720-1762, United States
- Julie D Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
2. Liu ZH, Teixeira JM, Zhang O, Tsangaris TE, Li J, Gradinaru CC, Head-Gordon T, Forman-Kay JD. Local Disordered Region Sampling (LDRS) for Ensemble Modeling of Proteins with Experimentally Undetermined or Low Confidence Prediction Segments. bioRxiv 2023:2023.07.25.550520. [PMID: 37546943 PMCID: PMC10402175 DOI: 10.1101/2023.07.25.550520] [Indexed: 08/08/2023]
Abstract
The Local Disordered Region Sampling (LDRS, pronounced "loaders") tool, developed for the IDPConformerGenerator platform (Teixeira et al. 2022), provides a method for generating all-atom conformations of intrinsically disordered regions (IDRs) at the N- and C-termini of, and in loops or linkers between, folded regions of an existing protein structure. These disordered elements often lead to missing coordinates in experimental structures or low confidence in predicted structures. Requiring only a pre-existing PDB structure of the protein (with missing coordinates or with predicted confidence scores) and its full-length primary sequence, LDRS automatically generates physically meaningful conformational ensembles of the missing flexible regions to complete the full-length protein. The capabilities of the LDRS tool of IDPConformerGenerator include modeling phosphorylation sites using enhanced Monte Carlo Side Chain Entropy (MC-SCE) (Bhowmick and Head-Gordon 2015), transmembrane proteins within an all-atom bilayer, and multi-chain complexes. The modeling capacity of LDRS capitalizes on the modularity of the IDPConformerGenerator platform, its ability to be used both as a library and via the command line, and its computational speed.
Affiliation(s)
- Zi Hao Liu
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- João M.C. Teixeira
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Oufan Zhang
  - Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States of America
  - Department of Chemistry, University of California, Berkeley, California 94720-1460, United States of America
- Thomas E. Tsangaris
  - Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
- Jie Li
  - Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States of America
  - Department of Chemistry, University of California, Berkeley, California 94720-1460, United States of America
- Claudiu C. Gradinaru
  - Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
- Teresa Head-Gordon
  - Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States of America
  - Department of Chemistry, University of California, Berkeley, California 94720-1460, United States of America
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California 94720-1462, United States of America
  - Department of Bioengineering, University of California, Berkeley, California 94720-1762, United States of America
- Julie D. Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
3. O'Donnell T, Cazals F. Enhanced conformational exploration of protein loops using a global parameterization of the backbone geometry. J Comput Chem 2023; 44:1094-1104. [PMID: 36733189 DOI: 10.1002/jcc.27067] [Received: 12/17/2022] [Accepted: 12/22/2022] [Indexed: 02/04/2023]
Abstract
Flexible loops are paramount to protein functions, with action modes ranging from localized dynamics contributing to the free energy of the system, to large-amplitude conformational changes accounting for the repositioning of whole secondary structure elements or protein domains. However, generating diverse and low-energy loops remains a difficult problem. This work introduces a novel paradigm to sample loop conformations, in the spirit of the hit-and-run (HAR) Markov chain Monte Carlo technique. The algorithm uses a decomposition of the loop into tripeptides, and a novel characterization of necessary conditions for Tripeptide Loop Closure to admit solutions. Denoting by m the number of tripeptides, the algorithm works in an angular space of dimension 12m. In this space, the hyper-surfaces associated with the aforementioned necessary conditions are used to run a HAR-like sampling technique. On classical loop cases of up to 15 amino acids, our parameter-free method compares favorably to previous work, generating more diverse conformational ensembles. We also report experiments on a loop 30 amino acids long, a size not processed in any previous work.
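For readers unfamiliar with hit-and-run, the sketch below shows the generic technique on a deliberately simple domain, the unit ball in n dimensions: pick a random direction, find the chord of the domain along that direction, and jump to a uniformly chosen point on the chord. This is an assumption-laden toy, not the authors' method; their sampler works in a 12m-dimensional angular space bounded by Tripeptide Loop Closure conditions, which is not reproduced here. The function name `hit_and_run_ball` is hypothetical.

```python
import math
import random


def hit_and_run_ball(dim, n_steps, seed=0):
    """Generic hit-and-run sampling inside the unit ball (toy domain)."""
    rng = random.Random(seed)
    x = [0.0] * dim  # start at the center, which is inside the ball
    samples = []
    for _ in range(n_steps):
        # Uniform random direction: normalize a vector of Gaussian draws.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        # Chord endpoints solve |x + t d|^2 = 1, i.e. t^2 + 2bt + c = 0.
        b = sum(xi * di for xi, di in zip(x, d))
        c = sum(xi * xi for xi in x) - 1.0
        root = math.sqrt(max(b * b - c, 0.0))
        # Move to a uniformly chosen point on the chord.
        t = rng.uniform(-b - root, -b + root)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples


if __name__ == "__main__":
    points = hit_and_run_ball(dim=3, n_steps=1000)
    print(len(points), "samples; last point:", points[-1])
```

In the ball the chord has a closed form; in a domain bounded by more complicated hyper-surfaces, the same step requires intersecting the sampled line with those surfaces, which is where the necessary conditions described in the abstract come in.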
Affiliation(s)
- Timothée O'Donnell
  - Algorithms-Biology-Structure, Centre Inria at Université Côte d'Azur, Sophia Antipolis, France
- Frédéric Cazals
  - Algorithms-Biology-Structure, Centre Inria at Université Côte d'Azur, Sophia Antipolis, France