1
Hopf TA, Green AG, Schubert B, Mersmann S, Schärfe CPI, Ingraham JB, Toth-Petroczy A, Brock K, Riesselman AJ, Palmedo P, Kang C, Sheridan R, Draizen EJ, Dallago C, Sander C, Marks DS. The EVcouplings Python framework for coevolutionary sequence analysis. Bioinformatics 2019; 35:1582-1584. [PMID: 30304492 PMCID: PMC6499242 DOI: 10.1093/bioinformatics/bty862] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 05/22/2018] [Revised: 09/06/2018] [Accepted: 10/08/2018] [Indexed: 01/03/2023] Open
Abstract
SUMMARY: Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. The combination of an easy-to-use, flexible command line interface and an underlying modular Python package makes the full power of coevolutionary analyses available to entry-level and advanced users. AVAILABILITY AND IMPLEMENTATION: https://github.com/debbiemarkslab/evcouplings.
Affiliation(s)
- Thomas A Hopf
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Anna G Green
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Benjamin Schubert
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - cBio Center, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sophia Mersmann
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Charlotta P I Schärfe
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Center for Bioinformatics, University of Tübingen, Tübingen, Germany
  - Applied Bioinformatics, Department of Computer Science, Tübingen, Germany
- John B Ingraham
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Kelly Brock
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Adam J Riesselman
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Perry Palmedo
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
- Chan Kang
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Robert Sheridan
  - Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Eli J Draizen
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
- Christian Dallago
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - Department of Informatics, Technische Universität München, Garching, Germany
- Chris Sander
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - cBio Center, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
2
Sürün B, Schärfe CPI, Divine MR, Heinrich J, Toussaint NC, Zimmermann L, Beha J, Kohlbacher O. ClinVAP: a reporting strategy from variants to therapeutic options. Bioinformatics 2020; 36:2316-2317. [PMID: 31830259 PMCID: PMC7141851 DOI: 10.1093/bioinformatics/btz924] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/18/2019] [Revised: 10/24/2019] [Accepted: 12/09/2019] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Next-generation sequencing has become routine in oncology and opens up new avenues of therapy, particularly in the personalized oncology setting. An increasing number of cases also implies a need for more robust, automated and reproducible processing of long lists of variants for cancer diagnosis and therapy. While solutions for the large-scale analysis of somatic variants have been implemented, existing solutions often have issues with reproducibility, scalability and interoperability. RESULTS: Clinical Variant Annotation Pipeline (ClinVAP) is an automated pipeline that annotates, filters and prioritizes somatic single nucleotide variants provided in variant call format. It augments the variant information with documented or predicted clinical effects. These annotated variants are prioritized based on driver gene status and druggability. ClinVAP is available as a fully containerized, self-contained pipeline, maximizing reproducibility and scalability and allowing the analysis of larger-scale data. The resulting JSON-based report is suited for automated downstream processing, but ClinVAP can also automatically render the information into a user-defined template to yield a human-readable report. AVAILABILITY AND IMPLEMENTATION: ClinVAP is available at https://github.com/PersonalizedOncology/ClinVAP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bilge Sürün
  - Department of Computer Science, Applied Bioinformatics, Tübingen 72076, Germany
- Mathew R Divine
  - Department of Computer Science, Applied Bioinformatics, Tübingen 72076, Germany
- Julian Heinrich
  - Department of Computer Science, Applied Bioinformatics, Tübingen 72076, Germany
- Nora C Toussaint
  - NEXUS Personalized Health Technologies, ETH Zurich, Zurich 8093, Switzerland
- Lukas Zimmermann
  - Translational Bioinformatics, University Hospital Tübingen, Tübingen 72076, Germany
- Janina Beha
  - Center for Personalized Medicine, University Hospital Tübingen, Tübingen 72076, Germany
- Oliver Kohlbacher
  - Department of Computer Science, Applied Bioinformatics, Tübingen 72076, Germany
  - Center for Personalized Medicine, University Hospital Tübingen, Tübingen 72076, Germany
3
Nicoludis JM, Lau SY, Schärfe CPI, Marks DS, Weihofen WA, Gaudet R. Structure and Sequence Analyses of Clustered Protocadherins Reveal Antiparallel Interactions that Mediate Homophilic Specificity. Structure 2015; 23:2087-98. [PMID: 26481813 PMCID: PMC4635037 DOI: 10.1016/j.str.2015.09.005] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 04/29/2015] [Revised: 09/14/2015] [Accepted: 09/15/2015] [Indexed: 01/07/2023]
Abstract
Clustered protocadherin (Pcdh) proteins mediate dendritic self-avoidance in neurons via specific homophilic interactions in their extracellular cadherin (EC) domains. We determined crystal structures of EC1-EC3, containing the homophilic specificity-determining region, of two mouse clustered Pcdh isoforms (PcdhγA1 and PcdhγC3) to investigate the nature of the homophilic interaction. Within the crystal lattices, we observe antiparallel interfaces consistent with a role in trans cell-cell contact. Antiparallel dimerization is supported by evolutionary correlations. Two interfaces, located primarily on EC2-EC3, involve distinctive clustered Pcdh structure and sequence motifs, lack predicted glycosylation sites, and contain residues highly conserved in orthologs but not paralogs, pointing toward their biological significance as homophilic interaction interfaces. These two interfaces are similar yet distinct, reflecting a possible difference in interaction architecture between clustered Pcdh subfamilies. These structures initiate a molecular understanding of clustered Pcdh assemblies that are required to produce functional neuronal networks.
Affiliation(s)
- John M. Nicoludis
  - Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, 02138, USA
- Sze-Yi Lau
  - Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA, 02138, USA
- Charlotta P. I. Schärfe
  - Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
  - Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Debora S. Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Wilhelm A. Weihofen
  - Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA, 02138, USA
  - Correspondence: R.G., W.A.W.
- Rachelle Gaudet
  - Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA, 02138, USA
  - Correspondence: R.G., W.A.W.
4
Hopf TA, Schärfe CPI, Rodrigues JPGLM, Green AG, Kohlbacher O, Sander C, Bonvin AMJJ, Marks DS. Sequence co-evolution gives 3D contacts and structures of protein complexes. eLife 2014; 3. [PMID: 25255213 PMCID: PMC4360534 DOI: 10.7554/elife.03430] [Citation(s) in RCA: 332] [Impact Index Per Article: 33.2] [Received: 05/21/2014] [Accepted: 09/23/2014] [Indexed: 12/24/2022] Open
Abstract
Protein-protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein-protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein-protein interaction networks and used for interaction predictions at residue resolution.
Affiliation(s)
- Thomas A Hopf
  - Department of Systems Biology, Harvard University, Boston, United States
- João P G L M Rodrigues
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Anna G Green
  - Department of Systems Biology, Harvard University, Boston, United States
- Oliver Kohlbacher
  - Applied Bioinformatics, Quantitative Biology Center, University of Tübingen, Tübingen, Germany
- Chris Sander
  - Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Alexandre M J J Bonvin
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Debora S Marks
  - Department of Systems Biology, Harvard University, Boston, United States