1
Ponomarenko EA, Krasnov GS, Kiseleva OI, Kryukova PA, Arzumanian VA, Dolgalev GV, Ilgisonis EV, Lisitsa AV, Poverennaya EV. Workability of mRNA Sequencing for Predicting Protein Abundance. Genes (Basel) 2023; 14:2065. [PMID: 38003008 PMCID: PMC10671741 DOI: 10.3390/genes14112065] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/07/2023] [Revised: 11/03/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023] Open
Abstract
Transcriptomics methods (RNA-Seq, PCR) are today more routine and reproducible than proteomics methods, i.e., both mass spectrometry and immunochemical analysis. For this reason, most scientific studies are limited to assessing mRNA levels. At the same time, protein content (and its post-translational status) largely determines the cell's state and behavior. Such a forced extrapolation of conclusions from the transcriptome to the proteome often seems unjustified. The ratios of "transcript-protein" pairs can vary by several orders of magnitude for different genes. As a rule, the correlation coefficient between transcriptome and proteome levels for different tissues does not exceed 0.3-0.5. Several characteristics determine the ratio between mRNA and protein content, among them the rate of ribosome movement along the mRNA, the number of free ribosomes in the cell, the availability of tRNA, and the secondary structure and localization of the transcript. The technical features of the experimental methods also significantly influence the outcome of comparing the transcript and protein levels of the corresponding gene. Given these biological features and the performance of experimental and bioinformatic approaches, one may develop various models to predict proteomic profiles from transcriptomic data. This review is devoted to the ability of RNA sequencing methods to predict protein abundance.
Affiliation(s)
- George S. Krasnov
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
2
van den Berg PR, Bérenger-Currias NMLP, Budnik B, Slavov N, Semrau S. Integration of a multi-omics stem cell differentiation dataset using a dynamical model. PLoS Genet 2023; 19:e1010744. [PMID: 37167320 DOI: 10.1371/journal.pgen.1010744] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/24/2022] [Revised: 05/23/2023] [Accepted: 04/14/2023] [Indexed: 05/13/2023] Open
Abstract
Stem cell differentiation is a highly dynamic process involving pervasive changes in gene expression. The large majority of existing studies have characterized differentiation at the level of individual molecular profiles, such as the transcriptome or the proteome. To obtain a more comprehensive view, we measured protein, mRNA and microRNA abundance during retinoic acid-driven differentiation of mouse embryonic stem cells. We found that mRNA and protein abundance are typically only weakly correlated across time. To understand this finding, we developed a hierarchical dynamical model that allowed us to integrate all data sets. This model was able to explain mRNA-protein discordance for most genes and identified instances of potential microRNA-mediated regulation. Overexpression or depletion of microRNAs identified by the model, followed by RNA sequencing and protein quantification, was used to follow up on the predictions of the model. Overall, our study shows how multi-omics integration by a dynamical model could be used to nominate candidate regulators.
Affiliation(s)
- Bogdan Budnik
  - Mass Spectrometry and Proteomics Resource Laboratory, Harvard University, Cambridge, Massachusetts, United States of America
- Nikolai Slavov
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, United States of America
- Stefan Semrau
  - Leiden Institute of Physics, Leiden University, Leiden, Zuid-Holland, The Netherlands
3
Boahen CK, Oelen R, Le K, Netea MG, Franke L, van der Wijst MG, Kumar V. Integration of Candida albicans-induced single-cell gene expression data and secretory protein concentrations reveal genetic regulators of inflammation. Front Immunol 2023; 14:1069379. [PMID: 36865558 PMCID: PMC9972217 DOI: 10.3389/fimmu.2023.1069379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Accepted: 01/23/2023] [Indexed: 02/16/2023] Open
Abstract
Both gene expression and protein concentrations are regulated by genetic variants. Exploring the regulation of both eQTLs and pQTLs simultaneously in a context- and cell-type-dependent manner may help to unravel the mechanistic basis for genetic regulation of pQTLs. Here, we performed a meta-analysis of Candida albicans-induced pQTLs from two population-based cohorts and intersected the results with Candida-induced cell-type-specific expression association data (eQTLs). This revealed systematic differences between the pQTLs and eQTLs: only 35% of the pQTLs significantly correlated with mRNA expression at the single-cell level, indicating the limitation of using eQTLs as a proxy for pQTLs. By taking advantage of the tightly co-regulated pattern of the proteins, we also identified SNPs affecting the protein network upon Candida stimulation. Colocalization of pQTL and eQTL signals implicated several genomic loci, including MMP-1 and AMZ1. Analysis of Candida-induced single-cell gene expression data implicated specific cell types that exhibit significant expression QTLs upon stimulation. By highlighting the role of trans-regulatory networks in determining the abundance of secretory proteins, our study serves as a framework to gain insights into the mechanisms of genetic regulation of protein levels in a context-dependent manner.
Affiliation(s)
- Collins K. Boahen
  - Department of Internal Medicine and Radboud Institute of Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, Netherlands
  - Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Medical Center, Nijmegen, Netherlands
- Roy Oelen
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Kieu Le
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Mihai G. Netea
  - Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Medical Center, Nijmegen, Netherlands
  - Department for Immunology and Metabolism, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- Lude Franke
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Monique G.P. van der Wijst
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Vinod Kumar
  - Department of Internal Medicine and Radboud Institute of Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, Netherlands
  - Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Medical Center, Nijmegen, Netherlands
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
  - Nitte University Centre for Science Education and Research (NUCSER), Nitte (Deemed to be University), Mangalore, India
4
Fernández-Torras A, Duran-Frigola M, Bertoni M, Locatelli M, Aloy P. Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque. Nat Commun 2022; 13:5304. [PMID: 36085310 PMCID: PMC9463154 DOI: 10.1038/s41467-022-33026-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 05/06/2022] [Accepted: 08/30/2022] [Indexed: 12/25/2022] Open
Abstract
Biomedical data is accumulating at a fast pace and integrating it into a unified framework is a major challenge, so that multiple views of a given biological event can be considered simultaneously. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., 'drug treats disease', 'gene interacts with gene'). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.
Affiliation(s)
- Adrià Fernández-Torras
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Miquel Duran-Frigola
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Ersilia Open Source Initiative, Cambridge, UK
- Martino Bertoni
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Martina Locatelli
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Patrick Aloy
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
5
Burnum-Johnson KE, Conrads TP, Drake RR, Herr AE, Iyengar R, Kelly RT, Lundberg E, MacCoss MJ, Naba A, Nolan GP, Pevzner PA, Rodland KD, Sechi S, Slavov N, Spraggins JM, Van Eyk JE, Vidal M, Vogel C, Walt DR, Kelleher NL. New Views of Old Proteins: Clarifying the Enigmatic Proteome. Mol Cell Proteomics 2022; 21:100254. [PMID: 35654359 PMCID: PMC9256833 DOI: 10.1016/j.mcpro.2022.100254] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/21/2022] [Revised: 05/09/2022] [Accepted: 05/27/2022] [Indexed: 11/23/2022] Open
Abstract
All human diseases involve proteins, yet our current tools to characterize and quantify them are limited. To better elucidate proteins across space, time, and molecular composition, we provide a projection of more than 10 years for technologies to meet the challenges that protein biology presents. With a broad perspective, we discuss grand opportunities to transition the science of proteomics into a more propulsive enterprise. Extrapolating recent trends, we describe a next generation of approaches to define, quantify, and visualize the multiple dimensions of the proteome, thereby transforming our understanding of and interactions with human disease in the coming decade.
Affiliation(s)
- Kristin E Burnum-Johnson
  - The Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington, USA
- Thomas P Conrads
  - Inova Women's Service Line, Inova Health System, Falls Church, Virginia, USA
- Richard R Drake
  - Cell and Molecular Pharmacology and Experimental Therapeutics, Medical University of South Carolina, Charleston, South Carolina, USA
- Amy E Herr
  - Department of Bioengineering, University of California, Berkeley, California, USA
- Ravi Iyengar
  - Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ryan T Kelly
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, Utah, USA
- Emma Lundberg
  - Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden
- Michael J MacCoss
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Alexandra Naba
  - Department of Physiology and Biophysics, University of Illinois at Chicago, Chicago, Illinois, USA
- Garry P Nolan
  - Department of Pathology, Stanford University, Stanford, California, USA
- Pavel A Pevzner
  - Department of Computer Science and Engineering, University of California at San Diego, San Diego, California, USA
- Karin D Rodland
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, USA
- Salvatore Sechi
  - National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
- Nikolai Slavov
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
- Jeffrey M Spraggins
  - Department of Cell and Developmental Biology, Mass Spectrometry Research Center, Vanderbilt University, Nashville, Tennessee, USA
- Jennifer E Van Eyk
  - Advanced Clinical Biosystems Institute in the Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Marc Vidal
  - Department of Genetics, Harvard University, Cambridge, Massachusetts, USA
- Christine Vogel
  - New York University Center for Genomics and Systems Biology, New York University, New York, New York, USA
- David R Walt
  - Department of Pathology, Harvard Medical School, Brigham and Women's Hospital, Wyss Institute at Harvard University, Boston, Massachusetts, USA
- Neil L Kelleher
  - Department of Chemistry, Northwestern University, Evanston, Illinois, USA
6
Identifying causal genes for depression via integration of the proteome and transcriptome from brain and blood. Mol Psychiatry 2022; 27:2849-2857. [PMID: 35296807 DOI: 10.1038/s41380-022-01507-9] [Citation(s) in RCA: 68] [Impact Index Per Article: 22.7] [Received: 07/11/2021] [Revised: 02/17/2022] [Accepted: 02/22/2022] [Indexed: 12/15/2022]
Abstract
Genome-wide association studies (GWASs) have identified numerous risk genes for depression. Nevertheless, genes crucial for understanding the molecular mechanisms of depression and effective antidepressant drug targets are largely unknown. To address this, we aimed to highlight potentially causal genes by systematically integrating brain and blood protein and expression quantitative trait loci (QTL) data with a depression GWAS dataset via a statistical framework including Mendelian randomization (MR), Bayesian colocalization, and Steiger filtering analysis. In summary, we identified three candidate genes (TMEM106B, RAB27B, and GMPPB) based on brain data and two genes (TMEM106B and NEGR1) based on blood data with consistent robust evidence at both the protein and transcriptional levels. Furthermore, the protein-protein interaction (PPI) network provided new insights into the interaction between brain and blood in depression. Collectively, four genes (TMEM106B, RAB27B, GMPPB, and NEGR1) affect depression by influencing protein and gene expression levels, which could guide future investigations of candidate genes in animal studies and help prioritize antidepressant drug targets.
7
Sun P, Wu Y, Yin C, Jiang H, Xu Y, Sun H. Molecular Subtyping of Cancer Based on Distinguishing Co-Expression Modules and Machine Learning. Front Genet 2022; 13:866005. [PMID: 35586568 PMCID: PMC9108363 DOI: 10.3389/fgene.2022.866005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/30/2022] [Accepted: 03/07/2022] [Indexed: 02/05/2023] Open
Abstract
Molecular subtyping of cancer is recognized as a critical and challenging step towards individualized therapy. Most existing computational methods solve this problem via multi-classification of gene-expressions of cancer samples. Although these methods, especially deep learning, perform well in data classification, they usually require large amounts of data for model training and have limitations in interpretability. Besides, as cancer is a complex systemic disease, the phenotypic difference between cancer samples can hardly be fully understood by only analyzing single molecules, and differential expression-based molecular subtyping methods are reportedly not conserved. To address the above issues, we present here a new framework for molecular subtyping of cancer through identifying a robust specific co-expression module for each subtype of cancer, generating network features for each sample by perturbing correlation levels of specific edges, and then training a deep neural network for multi-class classification. When applied to breast cancer (BRCA) and stomach adenocarcinoma (STAD) molecular subtyping, it has superior classification performance over existing methods. In addition to improving classification performance, we consider the specific co-expressed modules selected for subtyping to be biologically meaningful, which potentially offers new insight for diagnostic biomarker design, mechanistic studies of cancer, and individualized treatment plan selection.
Affiliation(s)
- Peishuo Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Ying Wu
  - Phase I Clinical Trials Center, The First Affiliated Hospital, China Medical University, Shenyang, China
- Chaoyi Yin
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Hongyang Jiang
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Ying Xu
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Huiyan Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
*Correspondence: Huiyan Sun; Ying Xu
8
Suomi T, Elo LL. Statistical and machine learning methods to study human CD4+ T cell proteome profiles. Immunol Lett 2022; 245:8-17. [DOI: 10.1016/j.imlet.2022.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 03/11/2022] [Accepted: 03/15/2022] [Indexed: 11/05/2022]
9
Clark NM, Elmore JM, Walley JW. To the proteome and beyond: advances in single-cell omics profiling for plant systems. Plant Physiology 2022; 188:726-737. [PMID: 35235661 PMCID: PMC8825333 DOI: 10.1093/plphys/kiab429] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 06/02/2021] [Accepted: 08/16/2021] [Indexed: 05/19/2023]
Abstract
Recent advances in single-cell proteomics for animal systems could be adapted for plants to increase our understanding of plant development, response to stimuli, and cell-to-cell signaling.
Affiliation(s)
- Natalie M Clark
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011, USA
- James Mitch Elmore
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011, USA
- Justin W Walley
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011, USA
10
Babu M, Singh N, Datta A. In Vitro Oxygen Glucose Deprivation Model of Ischemic Stroke: A Proteomics-Driven Systems Biological Perspective. Mol Neurobiol 2022; 59:2363-2377. [PMID: 35080759 DOI: 10.1007/s12035-022-02745-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 10/04/2021] [Accepted: 01/11/2022] [Indexed: 01/17/2023]
Abstract
Oxygen glucose deprivation (OGD) of brain cells is the most common in vitro model of ischemic stroke and is used extensively for basic and preclinical stroke research. Protein mass spectrometry is one of the most promising and rapidly evolving technologies in biomedical research. A systems-level understanding of cell-type-specific responses to oxygen and glucose deprivation without systemic influence is a prerequisite to delineate the response of the neurovascular unit following ischemic stroke. In this systematic review, we summarize the proteomics studies done on different OGD models. These studies have followed an expression or interaction proteomics approach. They have been primarily used to understand the cellular pathophysiology of ischemia-reperfusion injury or to assess the efficacy of interventions as potential treatment options. We compile the limitations of the OGD model and of downstream proteomics experiments. We further show that, despite these limitations, several proteins shortlisted as altered in in vitro OGD-proteomics studies showed comparable regulation in ischemic stroke patients. This showcases the translational potential of this approach for therapeutic target and biomarker discovery. We next discuss the approaches that can be adopted for cell-type-specific validation of OGD-proteomics results in the future. Finally, we briefly present the research questions that can be addressed by OGD-proteomics studies using emerging techniques of protein mass spectrometry. We have also created a web resource compiling information from OGD-proteomics studies to facilitate data sharing for community usage. This review intends to encourage the preclinical stroke community to adopt a hypothesis-free proteomics approach to understand cell-type-specific responses following ischemic stroke.
Affiliation(s)
- Manju Babu
  - Laboratory of Translational Neuroscience, Division of Neuroscience, Yenepoya Research Center, Yenepoya (Deemed to be University), University Road, Deralakatte, Mangalore, 575018, Karnataka, India
- Nikhil Singh
  - Laboratory of Translational Neuroscience, Division of Neuroscience, Yenepoya Research Center, Yenepoya (Deemed to be University), University Road, Deralakatte, Mangalore, 575018, Karnataka, India
- Arnab Datta
  - Laboratory of Translational Neuroscience, Division of Neuroscience, Yenepoya Research Center, Yenepoya (Deemed to be University), University Road, Deralakatte, Mangalore, 575018, Karnataka, India
11
Jackson CA, Vogel C. New horizons in the stormy sea of multimodal single-cell data integration. Mol Cell 2022; 82:248-259. [PMID: 35063095 PMCID: PMC8830781 DOI: 10.1016/j.molcel.2021.12.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/11/2021] [Revised: 12/08/2021] [Accepted: 12/13/2021] [Indexed: 01/22/2023]
Abstract
While measurements of RNA expression have dominated the world of single-cell analyses, new single-cell techniques increasingly allow collection of different data modalities, measuring different molecules, structural connections, and intermolecular interactions. Integrating the resulting multimodal single-cell datasets is a new bioinformatics challenge. Equally important, it is a new experimental design challenge for the bench scientist, who is not only choosing from a myriad of techniques for each data modality but also faces new challenges in experimental design. The ultimate goal is to design, execute, and analyze multimodal single-cell experiments that are more than just descriptive but enable the learning of new causal and mechanistic biology. This objective requires strict consideration of the goals behind the analysis, which might range from mapping the heterogeneity of a cellular population to assembling system-wide causal networks that can further our understanding of cellular functions and eventually lead to models of tissues and organs. We review steps and challenges toward this goal. Single-cell transcriptomics is now a mature technology, and methods to measure proteins, lipids, small-molecule metabolites, and other molecular phenotypes at the single-cell level are rapidly developing. Integrating these single-cell readouts so that each cell has measurements of multiple types of data, e.g., transcriptomes, proteomes, and metabolomes, is expected to allow identification of highly specific cellular subpopulations and to provide the basis for inferring causal biological mechanisms.
Affiliation(s)
- Christopher A Jackson
  - New York University, Department of Biology, Center for Genomics and Systems Biology, New York, NY, USA
- Christine Vogel
  - New York University, Department of Biology, Center for Genomics and Systems Biology, New York, NY, USA
12
Nosti AJ, Barrio LC, Calderón-Celis F, Soldado A, Encinar JR. Absolute quantification of proteins using element mass spectrometry and generic standards. J Proteomics 2022; 256:104499. [DOI: 10.1016/j.jprot.2022.104499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Accepted: 01/21/2022] [Indexed: 10/19/2022]
13
Ferreira M, Ventorim R, Almeida E, Silveira S, Silveira W. Protein Abundance Prediction Through Machine Learning Methods. J Mol Biol 2021; 433:167267. [PMID: 34563548 DOI: 10.1016/j.jmb.2021.167267] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/16/2021] [Revised: 09/09/2021] [Accepted: 09/17/2021] [Indexed: 10/20/2022]
Abstract
Proteins are responsible for most physiological processes, and their abundance provides crucial information for systems biology research. However, absolute protein quantification, as determined by mass spectrometry, still has limitations in capturing the protein pool. Protein abundance is impacted by translation kinetics, which rely on features of codons. In this study, we evaluated the effect of codon usage bias of genes on protein abundance. Notably, we observed differences regarding codon usage patterns between genes coding for highly abundant proteins and genes coding for less abundant proteins. Analysis of synonymous codon usage and evolutionary selection showed a clear split between the two groups. Our machine learning models predicted protein abundances from codon usage metrics with remarkable accuracy, achieving strong correlation with experimental data. Upon integration of the predicted protein abundance in enzyme-constrained genome-scale metabolic models, the simulated phenotypes closely matched experimental data, which demonstrates that our predictive models are valuable tools for systems metabolic engineering approaches.
Affiliation(s)
- Mauricio Ferreira
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. https://twitter.com/@mauriciomyces
- Rafaela Ventorim
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil
- Eduardo Almeida
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. https://twitter.com/@elm_almeida
- Sabrina Silveira
  - Department of Computer Science, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. https://twitter.com/@sabrina_as
- Wendel Silveira
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil
14
Multi-Omics Approach to Elucidate Cerebrospinal Fluid Changes in Dogs with Intervertebral Disc Herniation. Int J Mol Sci 2021; 22:11678. [PMID: 34769107 PMCID: PMC8583948 DOI: 10.3390/ijms222111678] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/22/2021] [Revised: 10/19/2021] [Accepted: 10/25/2021] [Indexed: 12/16/2022] Open
Abstract
Herniation of the intervertebral disc (IVDH) is the most common cause of neurological and intervertebral disc degeneration-related diseases. Since the disc starts to degenerate before it can be observed by currently available diagnostic methods, there is an urgent need for novel diagnostic approaches. To identify molecular networks and pathways which may play important roles in intervertebral disc herniation, as well as to reveal potential features which could be useful for monitoring disease progression and prognosis, we performed multi-omics profiling, including high-resolution liquid chromatography-mass spectrometry (LC-MS)-based metabolomics and tandem mass tag (TMT)-based proteomics. Cerebrospinal fluid samples from nine dogs with IVDH and six healthy controls were used for the analyses, and an additional five IVDH samples were used for proteomic data validation. Furthermore, multi-omics data were integrated to decipher the complex interaction between individual omics layers, leading to an improved prediction model. Together with metabolic pathways related to amino acid and lipid metabolism and coagulation cascades, our integromics prediction model identified the key features in IVDH, namely the proteins follistatin-like 1 (FSTL1), secretogranin V (SCG5), nucleobindin 1 (NUCB1), and calcitonin receptor-stimulating peptide 2 precursor (CRSP2), and the metabolites N-acetyl-D-glucosamine and adenine, involved in neuropathic pain, myelination, and neurotransmission and inflammatory response, respectively. Their clinical application is to be further investigated. The utilization of a novel integrative interdisciplinary approach may provide new opportunities to apply innovative diagnostic and monitoring methods as well as improve treatment strategies and personalized care for patients with degenerative spinal disorders.
15
Fernández-Torras A, Comajuncosa-Creus A, Duran-Frigola M, Aloy P. Connecting chemistry and biology through molecular descriptors. Curr Opin Chem Biol 2021; 66:102090. [PMID: 34626922 DOI: 10.1016/j.cbpa.2021.09.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/11/2021] [Revised: 08/23/2021] [Accepted: 09/03/2021] [Indexed: 01/14/2023]
Abstract
Through the representation of small molecule structures as numerical descriptors and the exploitation of the similarity principle, chemoinformatics has made paramount contributions to drug discovery, from unveiling mechanisms of action and repurposing approved drugs to de novo crafting of molecules with desired properties and tailored targets. Yet, the inherent complexity of biological systems has fostered the implementation of large-scale experimental screenings seeking a deeper understanding of the targeted proteins, the disrupted biological processes and the systemic responses of cells to chemical perturbations. After this wealth of data, a new generation of data-driven descriptors has arisen providing a rich portrait of small molecule characteristics that goes beyond chemical properties. Here, we give an overview of biologically relevant descriptors, covering chemical compounds, proteins and other biological entities, such as diseases and cell lines, while aligning them to the major contributions in the field from disciplines, such as natural language processing or computer vision. We now envision a new scenario for chemical and biological entities where they both are translated into a common numerical format. In this computational framework, complex connections between entities can be unveiled by means of simple arithmetic operations, such as distance measures, additions, and subtractions.
Affiliation(s)
- Adrià Fernández-Torras
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Arnau Comajuncosa-Creus
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Miquel Duran-Frigola
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Ersilia Open Source Initiative, Cambridge, United Kingdom
- Patrick Aloy
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Institució Catalana de Recerca I Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
16
Elmore JM, Griffin BD, Walley JW. Advances in functional proteomics to study plant-pathogen interactions. Current Opinion in Plant Biology 2021; 63:102061. [PMID: 34102449 DOI: 10.1016/j.pbi.2021.102061] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/04/2021] [Revised: 04/22/2021] [Accepted: 04/25/2021] [Indexed: 05/20/2023]
Abstract
Pathogen infection triggers complex signaling networks in plant cells that ultimately result in either susceptibility or resistance. We have made substantial progress in dissecting many of these signaling events, and it is becoming clear that changes in proteome composition and protein activity are major drivers of plant-microbe interactions. Here, we highlight different approaches to analyze the functional proteomes of hosts and pathogens and discuss how they have been used to further our understanding of plant disease. Global proteome profiling can quantify the dynamics of proteins, posttranslational modifications, and biological pathways that contribute to immune-related outcomes. In addition, emerging techniques such as enzyme activity-based profiling, proximity labeling, and kinase-substrate profiling are being used to dissect biochemical events that operate during infection. Finally, we discuss how these functional approaches can be integrated with other profiling data to gain a mechanistic, systems-level view of plant and pathogen signaling.
Affiliation(s)
- James M Elmore
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, 50014, USA
- Brianna D Griffin
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, 50014, USA
- Justin W Walley
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, 50014, USA
17
Levy ED, Vogel C. "Structuromics": another step toward a holistic view of the cell. Cell 2021; 184:301-303. [PMID: 33482097 DOI: 10.1016/j.cell.2020.12.030] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Large-scale mapping of protein structures and their different states is crucial for gaining a mechanistic understanding of proteome function and regulation. In this issue of Cell, Cappelletti et al. achieve such a feat and identify hundreds of protein structural changes in response to outside stressors, providing a rich "structuromics" resource characterizing cellular adaptation.
Affiliation(s)
- Emmanuel D Levy
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
18
Lux V, Non AL, Pexman PM, Stadler W, Weber LAE, Krüger M. A Developmental Framework for Embodiment Research: The Next Step Toward Integrating Concepts and Methods. Front Syst Neurosci 2021; 15:672740. [PMID: 34393730 PMCID: PMC8360894 DOI: 10.3389/fnsys.2021.672740] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/26/2021] [Accepted: 06/28/2021] [Indexed: 12/17/2022] Open
Abstract
Embodiment research is at a turning point. There is an increasing amount of data and studies investigating embodiment phenomena and their role in mental processing and functions from across a wide range of disciplines and theoretical schools within the life sciences. However, the integration of behavioral data with data from different biological levels is challenging for the involved research fields such as movement psychology, social and developmental neuroscience, computational psychosomatics, social and behavioral epigenetics, human-centered robotics, and many more. This highlights the need for an interdisciplinary framework of embodiment research. In addition, there is a growing need for a cross-disciplinary consensus on level-specific criteria of embodiment. We propose that a developmental perspective on embodiment is able to provide a framework for overcoming such pressing issues, providing analytical tools to link timescales and levels of embodiment specific to the function under study, uncovering the underlying developmental processes, clarifying level-specific embodiment criteria, and providing a matrix and platform to bridge disciplinary boundaries among the involved research fields.
Affiliation(s)
- Vanessa Lux
- Department of Genetic Psychology, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Amy L Non
- Department of Anthropology, University of California, San Diego, La Jolla, CA, United States
- Penny M Pexman
- Department of Psychology, University of Calgary, Calgary, AB, Canada
- Waltraud Stadler
- Chair of Human Movement Science, Department of Sports and Health Sciences, Technical University of Munich, Munich, Germany
- Lilian A E Weber
- Department of Psychiatry, Oxford Centre for Human Brain Activity, Warneford Hospital, Oxford, United Kingdom; Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland
- Melanie Krüger
- Institute of Sports Science, Faculty of Humanities, Leibniz University Hannover, Hannover, Germany
19
Abstract
PURPOSE OF REVIEW: Erythropoiesis is a hierarchical process by which hematopoietic stem cells give rise to red blood cells through gradual cell fate restriction and maturation. Deciphering this process requires the establishment of dynamic gene regulatory networks (GRNs) that predict the response of hematopoietic cells to signals from the environment. Although GRNs have historically been derived from transcriptomic data, recent proteomic studies have revealed a major role for posttranscriptional mechanisms in regulating gene expression during erythropoiesis. These new findings highlight the need to integrate proteomic data into GRNs for a refined understanding of erythropoiesis.
RECENT FINDINGS: Here, we review recent proteomic studies that have furthered our understanding of erythropoiesis with a focus on quantitative mass spectrometry approaches to measure the abundance of transcription factors and cofactors during differentiation. Furthermore, we highlight challenges that remain in integrating transcriptomic, proteomic, and other omics data into a predictive model of erythropoiesis, and discuss the future prospect of single-cell proteomics.
SUMMARY: Recent proteomic studies have considerably expanded our knowledge of erythropoiesis beyond the traditional transcriptomic-centric perspective. These findings have opened up new avenues of research to increase our understanding of erythroid differentiation, while at the same time presenting new challenges in integrating multiple layers of information into a comprehensive gene regulatory model.
Affiliation(s)
- Marjorie Brand
- Sprott Center for Stem Cell Research, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada
- Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H8L6, Canada
20
multiSLIDE is a web server for exploring connected elements of biological pathways in multi-omics data. Nat Commun 2021; 12:2279. [PMID: 33863886 PMCID: PMC8052434 DOI: 10.1038/s41467-021-22650-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/06/2020] [Accepted: 03/24/2021] [Indexed: 12/12/2022] Open
Abstract
Quantitative multi-omics data are difficult to interpret and visualize due to large volume of data, complexity among data features, and heterogeneity of information represented by different omics platforms. Here, we present multiSLIDE, a web-based interactive tool for the simultaneous visualization of interconnected molecular features in heatmaps of multi-omics data sets. multiSLIDE visualizes biologically connected molecular features by keyword search of pathways or genes, offering convenient functionalities to query, rearrange, filter, and cluster data on a web browser in real time. Various querying mechanisms make it adaptable to diverse omics types, and visualizations are customizable. We demonstrate the versatility of multiSLIDE through three examples, showcasing its applicability to a wide range of multi-omics data sets, by allowing users to visualize established links between molecules from different omics data, as well as incorporate custom inter-molecular relationship information into the visualization. Online and stand-alone versions of multiSLIDE are available at https://github.com/soumitag/multiSLIDE. The integration and interpretation of different omics data types is an ongoing challenge for biologists. Here, the authors present a web-based, interactive tool called multiSLIDE for the visualization of protein, phosphoprotein, and RNA data presented as interlinked heatmaps.
21
Luo JH, Wang M, Jia GF, He Y. Transcriptome-wide analysis of epitranscriptome and translational efficiency associated with heterosis in maize. Journal of Experimental Botany 2021; 72:2933-2946. [PMID: 33606877 PMCID: PMC8023220 DOI: 10.1093/jxb/erab074] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 11/06/2020] [Accepted: 02/12/2021] [Indexed: 05/14/2023]
Abstract
Heterosis has been extensively utilized to increase productivity in crops, yet the underlying molecular mechanisms remain largely elusive. Here, we generated transcriptome-wide profiles of mRNA abundance, m6A methylation, and translational efficiency from the maize F1 hybrid B73×Mo17 and its two parental lines to ascertain the contribution of each regulatory layer to heterosis at the seedling stage. We documented that although the global abundance and distribution of m6A remained unchanged, a greater number of genes had gained an m6A modification in the hybrid. Superior variations were observed at the m6A modification and translational efficiency levels when compared with mRNA abundance between the hybrid and parents. In the hybrid, the vast majority of genes with m6A modification exhibited a non-additive expression pattern, the percentage of which was much higher than that at levels of mRNA abundance and translational efficiency. Non-additive genes involved in different biological processes were hierarchically coordinated by discrete combinations of three regulatory layers. These findings suggest that transcriptional and post-transcriptional regulation of gene expression make distinct contributions to heterosis in hybrid maize. Overall, this integrated multi-omics analysis provides a valuable portfolio for interpreting transcriptional and post-transcriptional regulation of gene expression in hybrid maize, and paves the way for exploring molecular mechanisms underlying hybrid vigor.
Affiliation(s)
- Jin-Hong Luo
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
- Min Wang
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
- Gui-Fang Jia
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yan He
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
22
Vlachavas EI, Bohn J, Ückert F, Nürnberg S. A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research. Int J Mol Sci 2021; 22:2822. [PMID: 33802234 PMCID: PMC8000236 DOI: 10.3390/ijms22062822] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/31/2021] [Revised: 03/05/2021] [Accepted: 03/05/2021] [Indexed: 02/06/2023] Open
Abstract
Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings.
Affiliation(s)
- Efstathios Iason Vlachavas
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Jonas Bohn
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Frank Ückert
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Applied Medical Informatics, University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
- Sylvia Nürnberg
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Applied Medical Informatics, University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
23
Ho JJD, Man JHS, Schatz JH, Marsden PA. Translational remodeling by RNA-binding proteins and noncoding RNAs. Wiley Interdisciplinary Reviews-RNA 2021; 12:e1647. [PMID: 33694288 DOI: 10.1002/wrna.1647] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 12/21/2020] [Revised: 02/09/2021] [Accepted: 02/10/2021] [Indexed: 12/14/2022]
Abstract
Responsible for generating the proteome that controls phenotype, translation is the ultimate convergence point for myriad upstream signals that influence gene expression. System-wide adaptive translational reprogramming has recently emerged as a pillar of cellular adaptation. As classic regulators of mRNA stability and translation efficiency, foundational studies established the concept of collaboration and competition between RNA-binding proteins (RBPs) and noncoding RNAs (ncRNAs) on individual mRNAs. Fresh conceptual innovations now highlight stress-activated, evolutionarily conserved RBP networks and ncRNAs that increase the translation efficiency of populations of transcripts encoding proteins that participate in a common cellular process. The discovery of post-transcriptional functions for long noncoding RNAs (lncRNAs) was particularly intriguing given their cell-type-specificity and historical definition as nuclear-functioning epigenetic regulators. The convergence of RBPs, lncRNAs, and microRNAs on functionally related mRNAs to enable adaptive protein synthesis is a newer biological paradigm that highlights their role as "translatome (protein output) remodelers" and reinvigorates the paradigm of "RNA operons." Together, these concepts modernize our understanding of cellular stress adaptation and strategies for therapeutic development. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications; Translation > Translation Regulation; Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs.
Affiliation(s)
- J J David Ho
- Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, Florida, USA; Division of Hematology, Department of Medicine, Miller School of Medicine, University of Miami, Miami, Florida, USA
- Jeffrey H S Man
- Keenan Research Centre, Li Ka Shing Knowledge Institute, St. Michael's Hospital, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada; Department of Respirology, University Health Network, Latner Thoracic Research Laboratories, University of Toronto, Toronto, Ontario, Canada
- Jonathan H Schatz
- Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, Florida, USA; Division of Hematology, Department of Medicine, Miller School of Medicine, University of Miami, Miami, Florida, USA
- Philip A Marsden
- Keenan Research Centre, Li Ka Shing Knowledge Institute, St. Michael's Hospital, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
24
Yang L, Li X, Shu T, Wang P, Li X. PseKNC and Adaboost-Based Method for DNA-Binding Proteins Recognition. Int J Pattern Recogn 2021. [DOI: 10.1142/s0218001421500221] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
DNA-binding proteins are an essential part of the DNA. They are also an integral component of the life processes of various organisms, for instance, DNA recombination and replication. Recognition of such proteins helps medical researchers pinpoint the cause of disease. Traditional techniques of identifying DNA-binding proteins are expensive and time-consuming. Machine learning methods can identify these proteins quickly and efficiently. However, the accuracies of the existing related methods were not high enough. In this paper, we propose a framework to identify DNA-binding proteins. The proposed framework first combines three algorithms, PseKNC (ps), MomoKGap (mo), and MomoDiKGap (md), to extract features. Further, we apply Adaboost weight ranking to select optimal feature subsets from the above three types of features. Based on the selected features, three algorithms (k-nearest neighbor (knn), Support Vector Machine (SVM), and Random Forest (RF)) are applied to classify the proteins. Finally, three predictors for identifying DNA-binding proteins are established, including [Formula: see text], [Formula: see text], [Formula: see text]. We utilize benchmark and independent datasets to train and evaluate the proposed framework. Three tests are performed: the jackknife test, 10-fold cross-validation, and an independent test. Among them, the accuracy of ps+md is the highest. We named the model with the best result psmdDBPs and applied it to identify DNA-binding proteins.
Affiliation(s)
- Lina Yang
- School of Computer, Electronics and Information, Guangxi University, Nanning, P. R. China
- Xiangyu Li
- School of Computer, Electronics and Information, Guangxi University, Nanning, P. R. China
- Ting Shu
- Guangdong-Hongkong-Macao Greater Bay Area, Weather Research Center for Monitoring Warning and Forecasting (Shenzhen Institute of Meteorological Innovation), Shenzhen, P. R. China
- Patrick Wang
- Computer and Information Science, Northeastern University, Boston, USA
- Xichun Li
- Guangxi Normal University for Nationalities, Chongzuo, P. R. China
25
Dugourd A, Kuppe C, Sciacovelli M, Gjerga E, Gabor A, Emdal KB, Vieira V, Bekker‐Jensen DB, Kranz J, Bindels E, Costa AS, Sousa A, Beltrao P, Rocha M, Olsen JV, Frezza C, Kramann R, Saez‐Rodriguez J. Causal integration of multi-omics data with prior knowledge to generate mechanistic hypotheses. Mol Syst Biol 2021; 17:e9730. [PMID: 33502086 PMCID: PMC7838823 DOI: 10.15252/msb.20209730] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 05/19/2020] [Revised: 12/18/2020] [Accepted: 12/21/2020] [Indexed: 01/07/2023] Open
Abstract
Multi-omics datasets can provide molecular insights beyond the sum of individual omics. Various tools have been recently developed to integrate such datasets, but there are limited strategies to systematically extract mechanistic hypotheses from them. Here, we present COSMOS (Causal Oriented Search of Multi-Omics Space), a method that integrates phosphoproteomics, transcriptomics, and metabolomics datasets. COSMOS combines extensive prior knowledge of signaling, metabolic, and gene regulatory networks with computational methods to estimate activities of transcription factors and kinases as well as network-level causal reasoning. COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets. We applied COSMOS to a dataset comprising transcriptomics, phosphoproteomics, and metabolomics data from healthy and cancerous tissue from eleven clear cell renal cell carcinoma (ccRCC) patients. COSMOS was able to capture relevant crosstalks within and between multiple omics layers, such as known ccRCC drug targets. We expect that our freely available method will be broadly useful to extract mechanistic insights from multi-omics studies.
Affiliation(s)
- Aurelien Dugourd
- Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC‐COMBINE), RWTH Aachen University, Aachen, Germany
- Faculty of Medicine, Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
- Division of Nephrology and Clinical Immunology, Faculty of Medicine, RWTH Aachen University, Aachen, Germany
- Christoph Kuppe
- Faculty of Medicine, Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
- Division of Nephrology and Clinical Immunology, Faculty of Medicine, RWTH Aachen University, Aachen, Germany
- Department of Internal Medicine, Nephrology and Transplantation, Erasmus Medical Center, Rotterdam, The Netherlands
- Marco Sciacovelli
- MRC Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
- Enio Gjerga
- Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC‐COMBINE), RWTH Aachen University, Aachen, Germany
- Attila Gabor
- Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Kristina B. Emdal
- Faculty of Health and Medical Sciences, Proteomics Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Vitor Vieira
- Centre of Biological Engineering, University of Minho ‐ Campus de Gualtar, Braga, Portugal
- Dorte B. Bekker‐Jensen
- Faculty of Health and Medical Sciences, Proteomics Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Jennifer Kranz
- Faculty of Medicine, Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
- Department of Urology and Pediatric Urology, St. Antonius Hospital Eschweiler, Academic Teaching Hospital of RWTH Aachen, Eschweiler, Germany
- Department of Urology and Kidney Transplantation, Martin Luther University, Halle (Saale), Germany
- Ana S.H. Costa
- MRC Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
- Present address: Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Abel Sousa
- Institute for Research and Innovation in Health (i3s), Porto, Portugal
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Hinxton, UK
- Pedro Beltrao
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Hinxton, UK
- Miguel Rocha
- Centre of Biological Engineering, University of Minho ‐ Campus de Gualtar, Braga, Portugal
- Jesper V. Olsen
- Faculty of Health and Medical Sciences, Proteomics Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Christian Frezza
- MRC Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
- Rafael Kramann
- Faculty of Medicine, Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
- Division of Nephrology and Clinical Immunology, Faculty of Medicine, RWTH Aachen University, Aachen, Germany
- Department of Internal Medicine, Nephrology and Transplantation, Erasmus Medical Center, Rotterdam, The Netherlands
- Julio Saez‐Rodriguez
- Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC‐COMBINE), RWTH Aachen University, Aachen, Germany
- Molecular Medicine Partnership Unit, European Molecular Biology Laboratory, Heidelberg University, Heidelberg, Germany
26
Ross AB, Langer JD, Jovanovic M. Proteome Turnover in the Spotlight: Approaches, Applications, and Perspectives. Mol Cell Proteomics 2020; 20:100016. [PMID: 33556866 PMCID: PMC7950106 DOI: 10.1074/mcp.r120.002190] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 06/23/2020] [Revised: 11/25/2020] [Accepted: 11/30/2020] [Indexed: 01/17/2023] Open
Abstract
In all cells, proteins are continuously synthesized and degraded to maintain protein homeostasis and modify gene expression levels in response to stimuli. Collectively, the processes of protein synthesis and degradation are referred to as protein turnover. At a steady state, protein turnover is constant to maintain protein homeostasis, but in dynamic responses, proteins change their rates of synthesis and degradation to adjust their proteomes to internal or external stimuli. Thus, probing the kinetics and dynamics of protein turnover lends insight into how cells regulate essential processes such as growth, differentiation, and stress response. Here, we outline historical and current approaches to measuring the kinetics of protein turnover on a proteome-wide scale in both steady-state and dynamic systems, with an emphasis on metabolic tracing using stable isotope-labeled amino acids. We highlight important considerations for designing proteome turnover experiments, key biological findings regarding the conserved principles of proteome turnover regulation, and future perspectives for both technological and biological investigation.
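The labeling logic behind turnover measurements of this kind can be sketched in a few lines of Python. This is a minimal illustration that assumes steady state and simple first-order kinetics with a single rate constant, ignoring cell division and label recycling; that is a common textbook simplification, not the model of any specific study covered by the review. The function names and the example numbers are hypothetical.

```python
import math

def labeled_fraction(k_deg, t):
    """Fraction of a protein pool carrying the heavy label after time t,
    assuming steady state and first-order turnover with rate constant k_deg."""
    return 1.0 - math.exp(-k_deg * t)

def half_life(k_deg):
    """Half-life implied by a first-order rate constant."""
    return math.log(2) / k_deg

def estimate_k_deg(fraction, t):
    """Invert the labeling curve from one time point (requires 0 <= fraction < 1)."""
    return -math.log(1.0 - fraction) / t

# Hypothetical measurement: half of the protein pool is labeled after 24 h.
k = estimate_k_deg(0.5, 24.0)
t_half = half_life(k)
```

Under these assumptions a protein that is 50% labeled after 24 h has a half-life of 24 h. In practice, rate constants are usually fitted across several time points rather than read off a single one.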
Affiliation(s)
- Alison Barbara Ross
- Department of Biological Sciences, Columbia University, New York, New York, USA
- Julian David Langer
- Proteomics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany; Proteomics, Max Planck Institute for Brain Research, Frankfurt am Main, Germany
- Marko Jovanovic
- Department of Biological Sciences, Columbia University, New York, New York, USA
27
Buccitelli C, Selbach M. mRNAs, proteins and the emerging principles of gene expression control. Nat Rev Genet 2020; 21:630-644. [PMID: 32709985 DOI: 10.1038/s41576-020-0258-4] [Citation(s) in RCA: 645] [Impact Index Per Article: 129.0] [Accepted: 06/15/2020] [Indexed: 12/15/2022]
Abstract
Gene expression involves transcription, translation and the turnover of mRNAs and proteins. The degree to which protein abundances scale with mRNA levels and the implications in cases where this dependency breaks down remain an intensely debated topic. Here we review recent mRNA-protein correlation studies in the light of the quantitative parameters of the gene expression pathway, contextual confounders and buffering mechanisms. Although protein and mRNA levels typically show reasonable correlation, we describe how transcriptomics and proteomics provide useful non-redundant readouts. Integrating both types of data can reveal exciting biology and is an essential step in refining our understanding of the principles of gene expression control.
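To make the notion of an mRNA-protein correlation concrete, here is a minimal Python sketch that computes a Pearson coefficient on log-transformed abundances across genes. The abundance values are invented toy numbers, and the log10 transform and the helper name are illustrative assumptions rather than the procedure of the studies reviewed.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical abundances for four genes (arbitrary units).
mrna = [10.0, 100.0, 1000.0, 10000.0]
protein = [1e3, 1e5, 1e4, 1e6]

# Abundances span orders of magnitude, so correlate on a log scale.
log_mrna = [math.log10(v) for v in mrna]
log_protein = [math.log10(v) for v in protein]
r = pearson(log_mrna, log_protein)
```

For these toy numbers the coefficient is about 0.8. A single coefficient across genes hides gene-specific differences in translation and degradation rates, which is one reason the two data types are treated as non-redundant.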
Affiliation(s)
- Matthias Selbach
- Proteome Dynamics, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Charité - Universitätsmedizin Berlin, Berlin, Germany
28
Gillespie MA, Palii CG, Sanchez-Taltavull D, Shannon P, Longabaugh WJR, Downes DJ, Sivaraman K, Espinoza HM, Hughes JR, Price ND, Perkins TJ, Ranish JA, Brand M. Absolute Quantification of Transcription Factors Reveals Principles of Gene Regulation in Erythropoiesis. Mol Cell 2020; 78:960-974.e11. [PMID: 32330456 PMCID: PMC7344268 DOI: 10.1016/j.molcel.2020.03.031] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 11/01/2019] [Revised: 02/20/2020] [Accepted: 03/25/2020] [Indexed: 12/11/2022]
Abstract
Dynamic cellular processes such as differentiation are driven by changes in the abundances of transcription factors (TFs). However, despite years of studies, our knowledge about the protein copy number of TFs in the nucleus is limited. Here, by determining the absolute abundances of 103 TFs and co-factors during the course of human erythropoiesis, we provide a dynamic and quantitative scale for TFs in the nucleus. Furthermore, we establish the first gene regulatory network of cell fate commitment that integrates temporal protein stoichiometry data with mRNA measurements. The model revealed quantitative imbalances in TFs' cross-antagonistic relationships that underlie lineage determination. Finally, we made the surprising discovery that, in the nucleus, co-repressors are dramatically more abundant than co-activators at the protein level, but not at the RNA level, with profound implications for understanding transcriptional regulation. These analyses provide a unique quantitative framework to understand transcriptional regulation of cell differentiation in a dynamic context.
Affiliation(s)
- Carmen G Palii
- Sprott Center for Stem Cell Research, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H8L6, Canada
- Daniel Sanchez-Taltavull
- Sprott Center for Stem Cell Research, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H8L6, Canada; Visceral Surgery and Medicine, Inselspital, Bern University Hospital, Department for BioMedical Research, University of Bern, Murtenstrasse 35, 3008 Bern, Switzerland
- Paul Shannon
- Institute for Systems Biology, Seattle, WA 98109, USA
- Damien J Downes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford OX3 9DS, UK
- Karthi Sivaraman
- Sprott Center for Stem Cell Research, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada
- Jim R Hughes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford OX3 9DS, UK
- Theodore J Perkins
- Sprott Center for Stem Cell Research, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H8L6, Canada
- Jeffrey A Ranish
- Institute for Systems Biology, Seattle, WA 98109, USA; Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Marjorie Brand
- Sprott Center for Stem Cell Research, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H8L6, Canada
|
29
|
Zhang B, Kuster B. Proteomics Is Not an Island: Multi-omics Integration Is the Key to Understanding Biological Systems. Mol Cell Proteomics 2019; 18:S1-S4. [PMID: 31399542 PMCID: PMC6692779 DOI: 10.1074/mcp.e119.001693] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 07/22/2019] [Indexed: 12/18/2022] Open
Affiliation(s)
- Bing Zhang
- ‡Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas
- §Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Bernhard Kuster
- ¶Chair of Proteomics and Bioanalytics, Technische Universitat Munchen, Freising, Germany
- ‖Bavarian Biomolecular Mass Spectrometry Center, Technische Universitat Munchen, Freising, Germany
|