1
Ren S, Li Y, Zhou Z. RiboParser/RiboShiny: an integrated platform for comprehensive analysis and visualization of Ribo-seq data. J Genet Genomics 2025:S1673-8527(25)00119-5. [PMID: 40268050 DOI: 10.1016/j.jgg.2025.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 04/16/2025] [Accepted: 04/16/2025] [Indexed: 04/25/2025]
Abstract
Translation is a crucial step in gene expression. Over the past decade, the development and application of Ribosome profiling (Ribo-seq) have significantly advanced our understanding of translational regulation in vivo. However, the analysis and visualization of Ribo-seq data remain challenging. Despite the availability of various analytical pipelines, improvements in comprehensiveness, accuracy, and user-friendliness are still necessary. In this study, we develop RiboParser/RiboShiny, a robust framework for analyzing and visualizing Ribo-seq data. Building on published methods, we optimize ribosome structure-based and start/stop-based models to improve the accuracy and stability of P-site detection, even in species with a high proportion of leaderless transcripts. Leveraging these improvements, RiboParser offers comprehensive analyses, including quality control, gene-level analysis, codon-level analysis, and the analysis of Ribo-seq variants. Meanwhile, RiboShiny provides a user-friendly and adaptable platform for data visualization, facilitating deeper insights into the translational landscape. Furthermore, the integration of standardized genome annotation renders our platform universally applicable to various organisms with sequenced genomes. This framework has the potential to significantly improve the precision and efficiency of Ribo-seq data interpretation, thereby deepening our understanding of translational regulation.
Affiliation(s)
- Shuchao Ren
- National Key Laboratory of Agricultural Microbiology, College of Life Science, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Yinan Li
- National Key Laboratory of Agricultural Microbiology, College of Life Science, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Zhipeng Zhou
- National Key Laboratory of Agricultural Microbiology, College of Life Science, Huazhong Agricultural University, Wuhan, Hubei 430070, China.
2
Camperi J, Chatla K, Freund E, Galan C, Lippold S, Guilbaud A. Current Analytical Strategies for mRNA-Based Therapeutics. Molecules 2025; 30:1629. [PMID: 40286229 PMCID: PMC11990077 DOI: 10.3390/molecules30071629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2025] [Revised: 04/02/2025] [Accepted: 04/03/2025] [Indexed: 04/29/2025]
Abstract
Recent advancements in mRNA technology, utilized in vaccines, immunotherapies, protein replacement therapies, and genome editing, have emerged as promising and increasingly viable treatments. The rapid, potent, and transient properties of mRNA-encoded proteins make them attractive tools for the effective treatment of a variety of conditions, ranging from infectious diseases to cancer and single-gene disorders. The capability for rapid and large-scale production of mRNA therapeutics fueled the global response to the COVID-19 pandemic. For effective clinical implementation, it is crucial to deeply characterize and control important mRNA attributes such as purity/integrity, identity, structural quality features, and functionality. This implies the use of powerful and advanced analytical techniques for quality control and characterization of mRNA. Improvements in analytical techniques such as electrophoresis, chromatography, mass spectrometry, sequencing, and functionality assessments have significantly enhanced the quality and detail of information available for product and process characterization, as well as for routine stability and release testing. Here, we review the latest advancements in analytical techniques for the characterization of mRNA-based therapeutics, typically employed by the biopharmaceutical industry for eventual market release.
Affiliation(s)
- Julien Camperi
- Cell Therapy Engineering and Development, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
- Kamalakar Chatla
- Cell Therapy Engineering and Development, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
- Emily Freund
- Department of Molecular Biology, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
- Carolina Galan
- Department of Molecular Biology, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
- Steffen Lippold
- Protein Analytical Chemistry, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
- Axel Guilbaud
- Protein Analytical Chemistry, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
3
Wu X, Xu M, Yang JR, Lu J. Genome-wide impact of codon usage bias on translation optimization in Drosophila melanogaster. Nat Commun 2024; 15:8329. [PMID: 39333102 PMCID: PMC11437122 DOI: 10.1038/s41467-024-52660-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 09/17/2024] [Indexed: 09/29/2024]
Abstract
Accuracy and efficiency are fundamental to mRNA translation. Codon usage bias is widespread across species. Despite the long-standing association between optimized codon usage and improved translation, our understanding of its evolutionary basis and functional effects remains limited. Drosophila is widely used to study codon usage bias, but genome-scale experimental data are scarce. Using high-resolution mass spectrometry data from Drosophila melanogaster, we show that optimal codons have lower translation errors than nonoptimal codons after accounting for these biases. Genomic-scale analysis of ribosome profiling data shows that optimal codons are translated more rapidly than nonoptimal codons. Although we find no long-term selection favoring synonymous mutations in D. melanogaster after diverging from D. simulans, we identify signatures of positive selection driving codon optimization in the D. melanogaster population. These findings expand our understanding of the functional consequences of codon optimization and serve as a foundation for future investigations.
Affiliation(s)
- Xinkai Wu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Mengze Xu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Jian-Rong Yang
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China.
- Key Laboratory of Tropical Disease Control, Ministry of Education, Sun Yat-sen University, Guangzhou, China.
- Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China.
- Jian Lu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China.
4
Horvath A, Janapala Y, Woodward K, Mahmud S, Cleynen A, Gardiner E, Hannan R, Eyras E, Preiss T, Shirokikh N. Comprehensive translational profiling and STE AI uncover rapid control of protein biosynthesis during cell stress. Nucleic Acids Res 2024; 52:7925-7946. [PMID: 38721779 PMCID: PMC11260467 DOI: 10.1093/nar/gkae365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 03/21/2024] [Accepted: 04/25/2024] [Indexed: 07/23/2024]
Abstract
Translational control is important in all life, but it remains a challenge to accurately quantify. When ribosomes translate messenger (m)RNA into proteins, they attach to the mRNA in series, forming poly(ribo)somes, and can co-localize. Here, we computationally model new types of co-localized ribosomal complexes on mRNA and identify them using enhanced translation complex profile sequencing (eTCP-seq) based on rapid in vivo crosslinking. We detect long disome footprints outside regions of non-random elongation stalls and show these are linked to translation initiation and protein biosynthesis rates. We subject footprints of disomes and other translation complexes to artificial intelligence (AI) analysis and construct a new, accurate and self-normalized measure of translation, termed stochastic translation efficiency (STE). We then apply STE to investigate rapid changes to mRNA translation in yeast undergoing glucose depletion. Importantly, we show that, well beyond tagging elongation stalls, footprints of co-localized ribosomes provide rich insight into translational mechanisms, polysome dynamics and topology. STE AI ranks cellular mRNAs by absolute translation rates under given conditions, can assist in identifying its control elements and will facilitate the development of next-generation synthetic biology designs and mRNA-based therapeutics.
Affiliation(s)
- Attila Horvath
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Yoshika Janapala
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Katrina Woodward
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Shafi Mahmud
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Alice Cleynen
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Institut Montpelliérain Alexander Grothendieck, Université de Montpellier, CNRS, Montpellier, France
- Elizabeth E Gardiner
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The National Platelet Research and Referral Centre, The Australian National University, Canberra, ACT 2601, Australia
- Ross D Hannan
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Department of Biochemistry and Molecular Biology, University of Melbourne, Parkville 3010, Australia
- Peter MacCallum Cancer Centre, Melbourne 3000, Australia
- Department of Biochemistry and Molecular Biology, Monash University, Clayton 3800, Australia
- School of Biomedical Sciences, University of Queensland, St Lucia 4067, Australia
- Eduardo Eyras
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Centre for Computational Biomedical Sciences, The Australian National University, Canberra, ACT 2601, Australia
- EMBL Australia Partner Laboratory Network at the Australian National University, Canberra, ACT 2601, Australia
- Thomas Preiss
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
- Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia
- Nikolay E Shirokikh
- Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, and The Shine-Dalgarno Centre for RNA Innovation, The Australian National University, Canberra, ACT 2601, Australia
5
Hoskins I, Rao S, Tante C, Cenik C. Integrated multiplexed assays of variant effect reveal determinants of catechol-O-methyltransferase gene expression. Mol Syst Biol 2024; 20:481-505. [PMID: 38355921 PMCID: PMC11066095 DOI: 10.1038/s44320-024-00018-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 02/16/2024]
Abstract
Multiplexed assays of variant effect are powerful methods to profile the consequences of rare variants on gene expression and organismal fitness. Yet, few studies have integrated several multiplexed assays to map variant effects on gene expression in coding sequences. Here, we pioneered a multiplexed assay based on polysome profiling to measure variant effects on translation at scale, uncovering single-nucleotide variants that increase or decrease ribosome load. By combining high-throughput ribosome load data with multiplexed mRNA and protein abundance readouts, we mapped the cis-regulatory landscape of thousands of catechol-O-methyltransferase (COMT) variants from RNA to protein and found numerous coding variants that alter COMT expression. Finally, we trained machine learning models to map signatures of variant effects on COMT gene expression and uncovered both directional and divergent impacts across expression layers. Our analyses reveal expression phenotypes for thousands of variants in COMT and highlight variant effects on both single and multiple layers of expression. Our findings prompt future studies that integrate several multiplexed assays for the readout of gene expression.
Affiliation(s)
- Ian Hoskins
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Shilpa Rao
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Charisma Tante
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Can Cenik
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA.
6
Shao B, Yan J, Zhang J, Liu L, Chen Y, Buskirk AR. Riboformer: a deep learning framework for predicting context-dependent translation dynamics. Nat Commun 2024; 15:2011. [PMID: 38443396 PMCID: PMC10915169 DOI: 10.1038/s41467-024-46241-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 02/18/2024] [Indexed: 03/07/2024]
Abstract
Translation elongation is essential for maintaining cellular proteostasis, and alterations in the translational landscape are associated with a range of diseases. Ribosome profiling allows detailed measurements of translation at the genome scale. However, it remains unclear how to disentangle biological variations from technical artifacts in these data and identify sequence determinants of translation dysregulation. Here we present Riboformer, a deep learning-based framework for modeling context-dependent changes in translation dynamics. Riboformer leverages the transformer architecture to accurately predict ribosome densities at codon resolution. When trained on an unbiased dataset, Riboformer corrects experimental artifacts in previously unseen datasets, which reveals subtle differences in synonymous codon translation and uncovers a bottleneck in translation elongation. Further, we show that Riboformer can be combined with in silico mutagenesis to identify sequence motifs that contribute to ribosome stalling across various biological contexts, including aging and viral infection. Our tool offers a context-aware and interpretable approach for standardizing ribosome profiling datasets and elucidating the regulatory basis of translation kinetics.
Affiliation(s)
- Bin Shao
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Jiawei Yan
- Department of Chemistry, Stanford University, Stanford, CA, USA
- Jing Zhang
- Biological Design Center, Boston University, Boston, MA, USA
- Lili Liu
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Ye Chen
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Allen R Buskirk
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
7
Popper B, Bürkle M, Ciccopiedi G, Marchioretto M, Forné I, Imhof A, Straub T, Viero G, Götz M, Schieweck R. Ribosome inactivation regulates translation elongation in neurons. J Biol Chem 2024; 300:105648. [PMID: 38219816 PMCID: PMC10869266 DOI: 10.1016/j.jbc.2024.105648] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/07/2023] [Revised: 12/10/2023] [Accepted: 01/02/2024] [Indexed: 01/16/2024]
Abstract
Cellular plasticity is crucial for adapting to ever-changing stimuli. As a result, cells consistently reshape their translatome, and, consequently, their proteome. The control of translational activity has been thoroughly examined at the stage of translation initiation. However, the regulation of ribosome speed in cells is widely unknown. In this study, we utilized a timed ribosome runoff approach, along with proteomics and transmission electron microscopy, to investigate global translation kinetics in cells. We found that ribosome speeds vary among various cell types, such as astrocytes, induced pluripotent human stem cells, human neural stem cells, and human and rat neurons. Of all cell types studied, mature cortical neurons exhibit the highest rate of translation. This finding is particularly remarkable because mature cortical neurons express the eukaryotic elongation factor 2 (eEF2) at lower levels than other cell types. Neurons solve this conundrum by inactivating a fraction of their ribosomes. As a result, the increase in eEF2 levels leads to a reduction of inactive ribosomes and an enhancement of active ones. Processes that alter the demand for active ribosomes, like neuronal excitation, cause increased inactivation of redundant ribosomes in an eEF2-dependent manner. Our data suggest a novel regulatory mechanism in which neurons dynamically inactivate ribosomes to facilitate translational remodeling. These findings have important implications for developmental brain disorders characterized by, among other things, aberrant translation.
Affiliation(s)
- Bastian Popper
- Core Facility Animal Models, Biomedical Center (BMC), LMU Munich, Munich, Germany
- Martina Bürkle
- Department of Physiological Genomics, Biomedical Center (BMC), LMU Munich, Munich, Germany
- Giuliana Ciccopiedi
- Department for Cell Biology & Anatomy, Biomedical Center (BMC), LMU Munich, Munich, Germany; Graduate School of Systemic Neurosciences, LMU Munich, Munich, Germany
- Marta Marchioretto
- Institute of Biophysics, National Research Council (CNR) Unit at Trento, Povo, Italy
- Ignasi Forné
- Protein Analysis Unit, Department for Molecular Biology, Biomedical Center (BMC), LMU Munich, Munich, Germany
- Axel Imhof
- Protein Analysis Unit, Department for Molecular Biology, Biomedical Center (BMC), LMU Munich, Munich, Germany
- Tobias Straub
- Bioinformatics Core Facility, Department of Molecular Biology, Biomedical Center (BMC), LMU Munich, Munich, Germany
- Gabriella Viero
- Institute of Biophysics, National Research Council (CNR) Unit at Trento, Povo, Italy
- Magdalena Götz
- Department of Physiological Genomics, Biomedical Center (BMC), LMU Munich, Munich, Germany; Institute of Stem Cell Research, Helmholtz Center Munich, German Research Center for Environmental Health, Munich, Germany; SYNERGY, Excellence Cluster of Systems Neurology, Biomedical Center (BMC), LMU Munich, Munich, Germany
- Rico Schieweck
- Department of Physiological Genomics, Biomedical Center (BMC), LMU Munich, Munich, Germany; Department for Cell Biology & Anatomy, Biomedical Center (BMC), LMU Munich, Munich, Germany; Institute of Biophysics, National Research Council (CNR) Unit at Trento, Povo, Italy.
8
He S, Gao B, Sabnis R, Sun Q. Nucleic Transformer: Classifying DNA Sequences with Self-Attention and Convolutions. ACS Synth Biol 2023; 12:3205-3214. [PMID: 37916871 PMCID: PMC10863451 DOI: 10.1021/acssynbio.3c00154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 11/03/2023]
Abstract
Much work has been done to apply machine learning and deep learning to genomics tasks, but these applications usually require extensive domain knowledge, and the resulting models provide very limited interpretability. Here, we present the Nucleic Transformer, a conceptually simple but effective and interpretable model architecture that excels in the classification of DNA sequences. The Nucleic Transformer employs self-attention and convolutions on nucleic acid sequences, leveraging two prominent deep learning strategies commonly used in computer vision and natural language analysis. We demonstrate that the Nucleic Transformer can be trained without much domain knowledge to achieve high performance in Escherichia coli promoter classification, viral genome identification, enhancer classification, and chromatin profile predictions.
Affiliation(s)
- Shujun He
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
- Baizhen Gao
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
- Rushant Sabnis
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
- Qing Sun
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
9
Bian B, Kumagai T, Saito Y. VeloPro: A pipeline integrating Ribo-seq and AlphaFold deciphers association patterns between translation velocity and protein structure features. iMeta 2023; 2:e148. [PMID: 38868219 PMCID: PMC10989810 DOI: 10.1002/imt2.148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 10/22/2023] [Indexed: 06/14/2024]
Abstract
VeloPro integrates Ribo-seq data and AlphaFold2-predicted 3D protein structure information for characterization of the association patterns between translation velocity and many protein structure features in prokaryotic and eukaryotic organisms across different taxonomical clades such as bacteria, fungi, protozoa, nematode, plants, insect, and mammals. We illustrated that association patterns between translation velocity and protein structure features differ across organisms, partially reflecting their taxonomical relationship.
Affiliation(s)
- Bian Bian
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Koto‐ku, Japan
- Yutaka Saito
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Koto‐ku, Japan
- AIST‐Waseda University Computational Bio Big‐Data Open Innovation Laboratory (CBBD‐OIL), Shinjuku‐ku, Japan
- Department of Data Science, School of Frontier Engineering, Kitasato University, Sagamihara, Japan
10
Shao B, Yan J, Zhang J, Buskirk AR. Riboformer: A Deep Learning Framework for Predicting Context-Dependent Translation Dynamics. bioRxiv 2023:2023.04.24.538053. [PMID: 37163112 PMCID: PMC10168224 DOI: 10.1101/2023.04.24.538053] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/11/2023]
Abstract
Translation elongation is essential for maintaining cellular proteostasis, and alterations in the translational landscape are associated with a range of diseases. Ribosome profiling allows detailed measurement of translation at genome scale. However, it remains unclear how to disentangle biological variations from technical artifacts and identify sequence determinants of translation dysregulation. Here we present Riboformer, a deep learning-based framework for modeling context-dependent changes in translation dynamics. Riboformer leverages the transformer architecture to accurately predict ribosome densities at codon resolution. It corrects experimental artifacts in previously unseen datasets, reveals subtle differences in synonymous codon translation and uncovers a bottleneck in protein synthesis. Further, we show that Riboformer can be combined with in silico mutagenesis analysis to identify sequence motifs that contribute to ribosome stalling across various biological contexts, including aging and viral infection. Our tool offers a context-aware and interpretable approach for standardizing ribosome profiling datasets and elucidating the regulatory basis of translation kinetics.
Affiliation(s)
- Bin Shao
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Present address: Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Jiawei Yan
- Department of Chemistry, Stanford University, Stanford, CA, USA
- Jing Zhang
- Biological Design Center, Boston University, Boston, MA, USA
- Allen R. Buskirk
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, USA
11
Wan Y, Jiang Z. TransCrispr: Transformer Based Hybrid Model for Predicting CRISPR/Cas9 Single Guide RNA Cleavage Efficiency. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1518-1528. [PMID: 36006888 DOI: 10.1109/tcbb.2022.3201631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
CRISPR/Cas9 is a widely used genome editing tool for site-directed modification of deoxyribonucleic acid (DNA) nucleotide sequences. However, how to accurately predict and evaluate the on- and off-target effects of single guide RNA (sgRNA) is one of the key problems for the CRISPR/Cas9 system. Using computational methods to obtain high cell-specific sensitivity and specificity is a prerequisite for the optimal design of sgRNAs. Inspired by the work of predecessors, we found that sgRNA on-target knockout efficacy was not only related to the original sequence but also affected by important biological features. Hence, we introduce a novel approach called TransCrispr, which integrates Transformer and convolutional neural network (CNN) architecture to predict sgRNA knockout efficacy. Firstly, we encode the sequence data and send the transformed sgRNA sequence, positional information, and biological features into the network as input. Then, the convolutional neural network will automatically learn an appropriate feature representation for the sgRNA sequence and combine it with the positional information for self-attention learning of the Transformer. Finally, a regression score is generated by predicting biological features. Experiments on seven public datasets illustrate that TransCrispr outperforms state-of-the-art methods in terms of prediction accuracy and generalization ability.
12
Mok A, Tunney R, Benegas G, Wallace EWJ, Lareau LF. choros: correction of sequence-based biases for accurate quantification of ribosome profiling data. bioRxiv 2023:2023.02.21.529452. [PMID: 36865295 PMCID: PMC9980091 DOI: 10.1101/2023.02.21.529452] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 02/24/2023]
Abstract
Ribosome profiling quantifies translation genome-wide by sequencing ribosome-protected fragments, or footprints. Its single-codon resolution allows identification of translation regulation, such as ribosome stalls or pauses, on individual genes. However, enzyme preferences during library preparation lead to pervasive sequence artifacts that obscure translation dynamics. Widespread over- and under-representation of ribosome footprints can dominate local footprint densities and skew estimates of elongation rates by up to five fold. To address these biases and uncover true patterns of translation, we present choros, a computational method that models ribosome footprint distributions to provide bias-corrected footprint counts. choros uses negative binomial regression to accurately estimate two sets of parameters: (i) biological contributions from codon-specific translation elongation rates; and (ii) technical contributions from nuclease digestion and ligation efficiencies. We use these parameter estimates to generate bias correction factors that eliminate sequence artifacts. Applying choros to multiple ribosome profiling datasets, we are able to accurately quantify and attenuate ligation biases to provide more faithful measurements of ribosome distribution. We show that a pattern interpreted as pervasive ribosome pausing near the beginning of coding regions is likely to arise from technical biases. Incorporating choros into standard analysis pipelines will improve biological discovery from measurements of translation.
Affiliation(s)
- Amanda Mok
- Center for Computational Biology, University of California, Berkeley
- Robert Tunney
- Center for Computational Biology, University of California, Berkeley
- Gonzalo Benegas
- Center for Computational Biology, University of California, Berkeley
- Liana F. Lareau
- Center for Computational Biology, University of California, Berkeley
- Department of Bioengineering, University of California, Berkeley
13
Li F, Fang J, Yu Y, Hao S, Zou Q, Zeng Q, Yang X. Reanalysis of ribosome profiling datasets reveals a function of rocaglamide A in perturbing the dynamics of translation elongation via eIF4A. Nat Commun 2023; 14:553. [PMID: 36725859 PMCID: PMC9891901 DOI: 10.1038/s41467-023-36290-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/20/2022] [Accepted: 01/25/2023] [Indexed: 02/03/2023]
Abstract
The quickly accumulating ribosome profiling data is an insightful resource for studying the critical details of translation regulation under various biological contexts. Rocaglamide A (RocA), an antitumor heterotricyclic natural compound, has been shown to inhibit translation initiation of a large group of mRNA species by clamping eIF4A onto poly-purine motifs in the 5' UTRs. However, reanalysis of previous ribosome profiling datasets reveals an unexpected shift of the ribosome occupancy pattern, upon RocA treatment in various types of cells, during early translation elongation for a specific group of mRNA transcripts without poly-purine motifs over-represented in their 5' UTRs. Such perturbation of translation elongation dynamics can be attributed to the blockage of translating ribosomes due to the binding of eIF4A to the poly-purine sequence in coding regions. In summary, our study presents the complete dual modes of RocA in blocking translation initiation and elongation, which underlie the potent antitumor effect of RocA.
Affiliation(s)
- Fajin Li
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
- Jianhuo Fang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Yifan Yu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Sijia Hao
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
- Qin Zou
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Qinglin Zeng
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Xuerui Yang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
- Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
|
14
|
Grigorashvili EI, Chervontseva ZS, Gelfand MS. Predicting RNA secondary structure by a neural network: what features may be learned? PeerJ 2022; 10:e14335. [PMID: 36530406 PMCID: PMC9756865 DOI: 10.7717/peerj.14335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Accepted: 10/12/2022] [Indexed: 12/14/2022] Open
Abstract
Deep learning is a class of machine learning techniques capable of creating internal representation of data without explicit preprogramming. Hence, in addition to practical applications, it is of interest to analyze what features of biological data may be learned by such models. Here, we describe PredPair, a deep learning neural network trained to predict base pairs in RNA structure from sequence alone, without any incorporated prior knowledge, such as the stacking energies or possible spatial structures. PredPair learned the Watson-Crick and wobble base-pairing rules and created an internal representation of the stacking energies and helices. Application to independent experimental (DMS-Seq) data on nucleotide accessibility in mRNA showed that the nucleotides predicted as paired indeed tend to be involved in the RNA structure. The performance of the constructed model was comparable with the state-of-the-art method based on the thermodynamic approach, but with a higher false positives rate. On the other hand, it successfully predicted pseudoknots. t-SNE clusters of embeddings of RNA sequences created by PredPair tend to contain embeddings from particular Rfam families, supporting the predictions of PredPair being in line with biological classification.
Affiliation(s)
- Mikhail S. Gelfand
- Center of Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Institute of Information Transmission Problems, Moscow, Russia
|
15
|
Kim DJ, Kim J, Lee DH, Lee J, Woo HM. DeepTESR: A Deep Learning Framework to Predict the Degree of Translational Elongation Short Ramp for Gene Expression Control. ACS Synth Biol 2022; 11:1719-1726. [PMID: 35502843 DOI: 10.1021/acssynbio.2c00202] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Abstract
Controlling translational elongation is essential for efficient protein synthesis. Ribosome profiling has revealed that the speed of ribosome movement is correlated with translational efficiency in the translational elongation ramp. In this work, we present a new deep learning model, called DeepTESR, to predict the degree of translational elongation short ramp (TESR) from mRNA sequence. The proposed deep learning model exhibited superior performance in predicting the TESR scores for 226,981 TESR sequences, resulting in the mean absolute error (MAE) of 0.285 and a coefficient of determination R² of 0.627, superior to the conventional machine learning models (e.g., MAE of 0.335 and R² of 0.571 for LightGBM). We experimentally validated that heterologous fluorescence expression of proteins with randomly selected TESR was moderately correlated with the predictions. Furthermore, a genome-wide analysis of TESR prediction in the 4305 coding sequences of Escherichia coli showed conserved TESRs over the clusters of orthologous groups. In this sense, DeepTESR can be used to predict the degree of TESR for gene expression control and to decipher the mechanism of translational control with ribosome profiling. DeepTESR is available at https://github.com/fmblab/DeepTESR.
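The two regression metrics quoted in this abstract can be made concrete with a short sketch. The functions below use the standard definitions of mean absolute error and the coefficient of determination; the score lists are invented toy values, not DeepTESR outputs, and the function names are illustrative.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

A lower MAE and a higher coefficient of determination both indicate predictions closer to the observed scores, which is the sense in which the abstract compares the deep learning model with LightGBM.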
|
16
|
Fujita T, Yokoyama T, Shirouzu M, Taguchi H, Ito T, Iwasaki S. The landscape of translational stall sites in bacteria revealed by monosome and disome profiling. RNA (NEW YORK, N.Y.) 2022; 28:290-302. [PMID: 34906996 PMCID: PMC8848927 DOI: 10.1261/rna.078188.120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/17/2020] [Accepted: 11/24/2021] [Indexed: 05/29/2023]
Abstract
Ribosome pauses are associated with various cotranslational events and determine the fate of mRNAs and proteins. Thus, the identification of precise pause sites across the transcriptome is desirable; however, the landscape of ribosome pauses in bacteria remains ambiguous. Here, we harness monosome and disome (or collided ribosome) profiling strategies to survey ribosome pause sites in Escherichia coli. Compared to eukaryotes, ribosome collisions in bacteria showed remarkable differences: a low frequency of disomes at stop codons, collisions occurring immediately after 70S assembly on start codons, and shorter queues of ribosomes trailing upstream. The pause sites corresponded with the biochemical validation by integrated nascent chain profiling (iNP) to detect polypeptidyl-tRNA, an elongation intermediate. Moreover, the subset of those sites showed puromycin resistance, presenting slow peptidyl transfer. Among the identified sites, the ribosome pause at Asn586 of ycbZ was validated by biochemical reporter assay, tRNA sequencing (tRNA-seq), and cryo-electron microscopy (cryo-EM) experiments. Our results provide a useful resource for ribosome stalling sites in bacteria.
Affiliation(s)
- Tomoya Fujita
- RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- School of Life Science and Technology, Tokyo Institute of Technology, Midori-ku, Yokohama 226-8503, Japan
- Takeshi Yokoyama
- Laboratory for Protein Functional and Structural Biology, RIKEN Center for Biosystems Dynamics Research, Tsurumi-ku, Yokohama 230-0045, Japan
- Graduate School of Life Sciences, Tohoku University, Aoba-ku, Sendai 980-8577, Japan
- Mikako Shirouzu
- Laboratory for Protein Functional and Structural Biology, RIKEN Center for Biosystems Dynamics Research, Tsurumi-ku, Yokohama 230-0045, Japan
- Hideki Taguchi
- School of Life Science and Technology, Tokyo Institute of Technology, Midori-ku, Yokohama 226-8503, Japan
- Cell Biology Center, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Midori-ku, Yokohama 226-8503, Japan
- Takuhiro Ito
- Laboratory for Translation Structural Biology, RIKEN Center for Biosystems Dynamics Research, Tsurumi-ku, Yokohama 230-0045, Japan
- Shintaro Iwasaki
- RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
|
17
|
Gobet C, Naef F. Ribo-DT: An automated pipeline for inferring codon dwell times from ribosome profiling data. Methods 2021; 203:10-16. [PMID: 34673173 DOI: 10.1016/j.ymeth.2021.10.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 10/08/2021] [Accepted: 10/11/2021] [Indexed: 11/16/2022] Open
Abstract
Protein synthesis is an energy consuming process characterised as a pivotal and highly regulated step in gene expression. The net protein output is dictated by a combination of translation initiation, elongation and termination rates that have remained difficult to measure. Recently, the development of ribosome profiling has enabled the inference of translation parameters through modelling, as this method informs on the ribosome position along the mRNA. Here, we present an automated, reproducible and portable computational pipeline to infer relative single-codon and codon-pair dwell times as well as gene flux from raw ribosome profiling sequencing data. As a case study, we applied our workflow to a publicly available yeast ribosome profiling dataset consisting of 57 independent gene knockouts related to RNA and tRNA modifications. We uncovered the effects of those modifications on translation elongation and codon selection during decoding. In particular, knocking out mod5 and trm7 increases codon-specific dwell times which indicates their potential tRNA targets, and highlights effects of nucleotide modifications on ribosome decoding rate.
Affiliation(s)
- Cédric Gobet
- Institute of Bioengineering (IBI), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Félix Naef
- Institute of Bioengineering (IBI), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
|
18
|
Mundodi V, Choudhary S, Smith AD, Kadosh D. Global translational landscape of the Candida albicans morphological transition. G3-GENES GENOMES GENETICS 2021; 11:6046988. [PMID: 33585865 PMCID: PMC7849906 DOI: 10.1093/g3journal/jkaa043] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/31/2020] [Accepted: 12/01/2020] [Indexed: 12/14/2022]
Abstract
Candida albicans, a major human fungal pathogen associated with high mortality and/or morbidity rates in a wide variety of immunocompromised individuals, undergoes a reversible morphological transition from yeast to filamentous cells that is required for virulence. While previous studies have identified and characterized global transcriptional mechanisms important for driving this transition, as well as other virulence properties, in C. albicans and other pathogens, considerably little is known about the role of genome-wide translational mechanisms. Using ribosome profiling, we report the first global translational profile associated with C. albicans morphogenesis. Strikingly, many genes involved in pathogenesis, filamentation, and the response to stress show reduced translational efficiency (TE). Several of these genes are known to be strongly induced at the transcriptional level, suggesting that a translational fine-tuning mechanism is in place. We also identify potential upstream open reading frames (uORFs), associated with genes involved in pathogenesis, and novel ORFs, several of which show altered TE during filamentation. Using a novel bioinformatics method for global analysis of ribosome pausing that will be applicable to a wide variety of genetic systems, we demonstrate an enrichment of ribosome pausing sites in C. albicans genes associated with protein synthesis and cell wall functions. Altogether, our results suggest that the C. albicans morphological transition, and most likely additional virulence processes in fungal pathogens, is associated with widespread global alterations in TE that do not simply reflect changes in transcript levels. These alterations affect the expression of many genes associated with processes essential for virulence and pathogenesis.
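Translational efficiency is not defined in this abstract; the sketch below assumes the common convention of dividing normalized ribosome footprint abundance by normalized mRNA abundance for the same gene, then comparing conditions on a log2 scale. The abundance values and function names are hypothetical and are not taken from the study.

```python
import math


def translational_efficiency(ribo_abundance, mrna_abundance):
    """TE under the assumed convention: footprint abundance / mRNA abundance."""
    return ribo_abundance / mrna_abundance


def log2_te_change(ribo_a, mrna_a, ribo_b, mrna_b):
    """log2 ratio of TE in condition B relative to condition A."""
    te_a = translational_efficiency(ribo_a, mrna_a)
    te_b = translational_efficiency(ribo_b, mrna_b)
    return math.log2(te_b / te_a)


# Hypothetical gene: mRNA rises 8-fold during filamentation,
# footprints rise only 2-fold, so TE falls 4-fold.
change = log2_te_change(ribo_a=20.0, mrna_a=10.0, ribo_b=40.0, mrna_b=80.0)
```

A negative value for a gene whose mRNA level rises corresponds to the pattern the abstract describes: strong transcriptional induction accompanied by reduced TE.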
Affiliation(s)
- Vasanthakrishna Mundodi
- Department of Microbiology, Immunology and Molecular Genetics, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
- Saket Choudhary
- Department of Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA
- Andrew D Smith
- Department of Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA
- David Kadosh
- Department of Microbiology, Immunology and Molecular Genetics, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
|
19
|
Zrimec J, Buric F, Kokina M, Garcia V, Zelezniak A. Learning the Regulatory Code of Gene Expression. Front Mol Biosci 2021; 8:673363. [PMID: 34179082 PMCID: PMC8223075 DOI: 10.3389/fmolb.2021.673363] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/27/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open
Abstract
Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.
Affiliation(s)
- Jan Zrimec
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Filip Buric
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Mariia Kokina
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- Victor Garcia
- School of Life Sciences and Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
- Aleksej Zelezniak
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Science for Life Laboratory, Stockholm, Sweden
|
20
|
Han P, Shichino Y, Schneider-Poetsch T, Mito M, Hashimoto S, Udagawa T, Kohno K, Yoshida M, Mishima Y, Inada T, Iwasaki S. Genome-wide Survey of Ribosome Collision. Cell Rep 2021; 31:107610. [PMID: 32375038 DOI: 10.1016/j.celrep.2020.107610] [Citation(s) in RCA: 112] [Impact Index Per Article: 28.0] [Received: 07/21/2019] [Revised: 03/18/2020] [Accepted: 04/13/2020] [Indexed: 12/31/2022] Open
Abstract
Ribosome movement is not always smooth and is rather often impeded. For ribosome pauses, fundamental issues remain to be addressed, including where ribosomes pause on mRNAs, what kind of RNA/amino acid sequence causes this pause, and the physiological significance of this attenuation of protein synthesis. Here, we survey the positions of ribosome collisions caused by ribosome pauses in humans and zebrafish using modified ribosome profiling. Collided ribosomes, i.e., disomes, emerge at various sites: Pro-Pro/Gly/Asp motifs; Arg-X-Lys motifs; stop codons; and 3' untranslated regions. The electrostatic interaction between the charged nascent chain and the ribosome exit tunnel determines the eIF5A-mediated disome rescue at the Pro-Pro sites. In particular, XBP1u, a precursor of endoplasmic reticulum (ER)-stress-responsive transcription factor, shows striking queues of collided ribosomes and thus acts as a degradation substrate by ribosome-associated quality control. Our results provide insight into the causes and consequences of ribosome pause by dissecting collided ribosomes.
Affiliation(s)
- Peixun Han
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
- RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Yuichi Shichino
- RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Tilman Schneider-Poetsch
- Chemical Genomics Research Group, RIKEN Center for Sustainable Resource Science, Wako, Saitama 351-0198, Japan
- Mari Mito
- RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Satoshi Hashimoto
- Graduate School of Pharmaceutical Sciences, Tohoku University, Sendai, Miyagi 980-8578, Japan
- Tsuyoshi Udagawa
- Graduate School of Pharmaceutical Sciences, Tohoku University, Sendai, Miyagi 980-8578, Japan
- Kenji Kohno
- Institute for Research Initiatives, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
- Minoru Yoshida
- Chemical Genomics Research Group, RIKEN Center for Sustainable Resource Science, Wako, Saitama 351-0198, Japan
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Yuichiro Mishima
- Faculty of Life Sciences, Kyoto Sangyo University, Kita-ku, Kyoto 603-8555, Japan
- Toshifumi Inada
- Graduate School of Pharmaceutical Sciences, Tohoku University, Sendai, Miyagi 980-8578, Japan
- Shintaro Iwasaki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
- RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
|
21
|
Tian T, Li S, Lang P, Zhao D, Zeng J. Full-length ribosome density prediction by a multi-input and multi-output model. PLoS Comput Biol 2021; 17:e1008842. [PMID: 33770074 PMCID: PMC8026034 DOI: 10.1371/journal.pcbi.1008842] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/18/2020] [Revised: 04/07/2021] [Accepted: 03/01/2021] [Indexed: 11/29/2022] Open
Abstract
Translation elongation is regulated by a series of complicated mechanisms in both prokaryotes and eukaryotes. Although recent advance in ribosome profiling techniques has enabled one to capture the genome-wide ribosome footprints along transcripts at codon resolution, the regulatory codes of elongation dynamics are still not fully understood. Most of the existing computational approaches for modeling translation elongation from ribosome profiling data mainly focus on local contextual patterns, while ignoring the continuity of the elongation process and relations between ribosome densities of remote codons. Modeling the translation elongation process in full-length coding sequence (CDS) level has not been studied to the best of our knowledge. In this paper, we developed a deep learning based approach with a multi-input and multi-output framework, named RiboMIMO, for modeling the ribosome density distributions of full-length mRNA CDS regions. Through considering the underlying correlations in translation efficiency among neighboring and remote codons and extracting hidden features from the input full-length coding sequence, RiboMIMO can greatly outperform the state-of-the-art baseline approaches and accurately predict the ribosome density distributions along the whole mRNA CDS regions. In addition, RiboMIMO explores the contributions of individual input codons to the predictions of output ribosome densities, which thus can help reveal important biological factors influencing the translation elongation process. The analyses, based on our interpretable metric named codon impact score, not only identified several patterns consistent with the previously-published literatures, but also for the first time (to the best of our knowledge) revealed that the codons located at a long distance from the ribosomal A site may also have an association on the translation elongation rate. This finding of long-range impact on translation elongation velocity may shed new light on the regulatory mechanisms of protein synthesis. Overall, these results indicated that RiboMIMO can provide a useful tool for studying the regulation of translation elongation in the range of full-length CDS.
Translation elongation is a process in which amino acids are linked into proteins by ribosomes in cells. Translation elongation rates along the mRNAs are not constant, and are regulated by a series of mechanisms, such as codon rarity and mRNA stability. In this study, we modeled the translation elongation process at a full-length coding sequence level and developed a deep learning based approach to predict the translation elongation rates from mRNA sequences, through extracting the regulatory codes of elongation rates from the contextual sequences. The analyses, based on our interpretable metric named codon impact score, for the first time (to the best of our knowledge), revealed that in addition to the neighboring codons of the ribosomal A sites, the remote codons may also have an important impact on the translation elongation rates. This new finding may stimulate additional experiments and shed light on the regulatory mechanisms of protein synthesis.
Affiliation(s)
- Tingzhong Tian
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Shuya Li
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Peng Lang
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China
|
22
|
Zhang G, Zeng T, Dai Z, Dai X. Prediction of CRISPR/Cas9 single guide RNA cleavage efficiency and specificity by attention-based convolutional neural networks. Comput Struct Biotechnol J 2021; 19:1445-1457. [PMID: 33841753 PMCID: PMC8010402 DOI: 10.1016/j.csbj.2021.03.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/09/2020] [Revised: 02/26/2021] [Accepted: 03/01/2021] [Indexed: 12/26/2022] Open
Abstract
CRISPR/Cas9 is a preferred genome editing tool and has been widely adapted to ranges of disciplines, from molecular biology to gene therapy. A key prerequisite for the success of CRISPR/Cas9 is its capacity to distinguish between single guide RNAs (sgRNAs) on target and homologous off-target sites. Thus, optimized design of sgRNAs by maximizing their on-target activity and minimizing their potential off-target mutations are crucial concerns for this system. Several deep learning models have been developed for comprehensive understanding of sgRNA cleavage efficacy and specificity. Although the proposed methods yield the performance results by automatically learning a suitable representation from the input data, there is still room for the improvement of accuracy and interpretability. Here, we propose novel interpretable attention-based convolutional neural networks, namely CRISPR-ONT and CRISPR-OFFT, for the prediction of CRISPR/Cas9 sgRNA on- and off-target activities, respectively. Experimental tests on public datasets demonstrate that our models significantly yield satisfactory results in terms of accuracy and interpretability. Our findings contribute to the understanding of how RNA-guide Cas9 nucleases scan the mammalian genome. Data and source codes are available at https://github.com/Peppags/CRISPRont-CRISPRofft.
Affiliation(s)
- Guishan Zhang
- Key Laboratory of Digital Signal and Image Processing of Guangdong Provincial, College of Engineering, Shantou University, Shantou 515063, China
- School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou 510006, China
- Tian Zeng
- School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou 510006, China
- Zhiming Dai
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-sen University, Guangzhou 510006, China
- Xianhua Dai
- School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou 510006, China
- Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai 519000, China
|
23
|
Eshraghi M, Karunadharma PP, Blin J, Shahani N, Ricci EP, Michel A, Urban NT, Galli N, Sharma M, Ramírez-Jarquín UN, Florescu K, Hernandez J, Subramaniam S. Mutant Huntingtin stalls ribosomes and represses protein synthesis in a cellular model of Huntington disease. Nat Commun 2021; 12:1461. [PMID: 33674575 PMCID: PMC7935949 DOI: 10.1038/s41467-021-21637-y] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 02/25/2020] [Accepted: 01/29/2021] [Indexed: 02/08/2023] Open
Abstract
The polyglutamine expansion of huntingtin (mHTT) causes Huntington disease (HD) and neurodegeneration, but the mechanisms remain unclear. Here, we found that mHtt promotes ribosome stalling and suppresses protein synthesis in mouse HD striatal neuronal cells. Depletion of mHtt enhances protein synthesis and increases the speed of ribosomal translocation, while mHtt directly inhibits protein synthesis in vitro. Fmrp, a known regulator of ribosome stalling, is upregulated in HD, but its depletion has no discernible effect on protein synthesis or ribosome stalling in HD cells. We found interactions of ribosomal proteins and translating ribosomes with mHtt. High-resolution global ribosome footprint profiling (Ribo-Seq) and mRNA-Seq indicates a widespread shift in ribosome occupancy toward the 5' and 3' end and unique single-codon pauses on selected mRNA targets in HD cells, compared to controls. Thus, mHtt impedes ribosomal translocation during translation elongation, a mechanistic defect that can be exploited for HD therapeutics.
Affiliation(s)
- Mehdi Eshraghi
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Pabalu P. Karunadharma
- The Scripps Research Institute, Genomic Core, Jupiter, FL, USA
- Juliana Blin
- Laboratory of Biology and Cellular Modelling at Ecole Normale Supérieure of Lyon, RNA Metabolism in Immunity and Infection Lab, LBMC, Lyon, France
- Neelam Shahani
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Emiliano P. Ricci
- Laboratory of Biology and Cellular Modelling at Ecole Normale Supérieure of Lyon, RNA Metabolism in Immunity and Infection Lab, LBMC, Lyon, France
- Nicole Galli
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Manish Sharma
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Uri Nimrod Ramírez-Jarquín
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Katie Florescu
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Jennifer Hernandez
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
- Srinivasa Subramaniam
- The Scripps Research Institute, Department of Neuroscience, Jupiter, FL, USA
|
24
|
A machine learning-based framework for modeling transcription elongation. Proc Natl Acad Sci U S A 2021; 118:2007450118. [PMID: 33526657 DOI: 10.1073/pnas.2007450118] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/15/2022] Open
Abstract
RNA polymerase II (Pol II) generally pauses at certain positions along gene bodies, thereby interrupting the transcription elongation process, which is often coupled with various important biological functions, such as precursor mRNA splicing and gene expression regulation. Characterizing the transcriptional elongation dynamics can thus help us understand many essential biological processes in eukaryotic cells. However, experimentally measuring Pol II elongation rates is generally time and resource consuming. We developed PEPMAN (polymerase II elongation pausing modeling through attention-based deep neural network), a deep learning-based model that accurately predicts Pol II pausing sites based on the native elongating transcript sequencing (NET-seq) data. Through fully taking advantage of the attention mechanism, PEPMAN is able to decipher important sequence features underlying Pol II pausing. More importantly, we demonstrated that the analyses of the PEPMAN-predicted results around various types of alternative splicing sites can provide useful clues into understanding the cotranscriptional splicing events. In addition, associating the PEPMAN prediction results with different epigenetic features can help reveal important factors related to the transcription elongation process. All these results demonstrated that PEPMAN can provide a useful and effective tool for modeling transcription elongation and understanding the related biological factors from available high-throughput sequencing data.
|
25
|
Hu H, Liu X, Xiao A, Li Y, Zhang C, Jiang T, Zhao D, Song S, Zeng J. Riboexp: an interpretable reinforcement learning framework for ribosome density modeling. Brief Bioinform 2021; 22:6105941. [PMID: 33479731 DOI: 10.1093/bib/bbaa412] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/17/2020] [Revised: 12/11/2020] [Indexed: 11/13/2022] Open
Abstract
Translation elongation is a crucial phase during protein biosynthesis. In this study, we develop a novel deep reinforcement learning-based framework, named Riboexp, to model the determinants of the uneven distribution of ribosomes on mRNA transcripts during translation elongation. In particular, our model employs a policy network to perform a context-dependent feature selection in the setting of ribosome density prediction. Our extensive tests demonstrated that Riboexp can significantly outperform the state-of-the-art methods in predicting ribosome density by up to 5.9% in terms of per-gene Pearson correlation coefficient on the datasets from three species. In addition, Riboexp can indicate more informative sequence features for the prediction task than other commonly used attribution methods in deep learning. In-depth analyses also revealed the meaningful biological insights generated by the Riboexp framework. Moreover, the application of Riboexp in codon optimization resulted in an increase of protein production by around 31% over the previous state-of-the-art method that models ribosome density. These results have established Riboexp as a powerful and useful computational tool in the studies of translation dynamics and protein synthesis. Availability: The data and code of this study are available on GitHub: https://github.com/Liuxg16/Riboexp. Contact: zengjy321@tsinghua.edu.cn; songsen@tsinghua.edu.cn.
Affiliation(s)
- Hailin Hu
- School of Medicine, Tsinghua University, Beijing, 100084, China
| | - Xianggen Liu
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, 100084, China; Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, 100084, China
| | - An Xiao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
| | - YangYang Li
- Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences, and School of Medicine, Tsinghua University, Beijing, 100084, China
| | | | - Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA; Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China; Institute of Integrative Genome Biology, University of California, Riverside, CA 92521, USA
| | - Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
| | - Sen Song
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, 100084, China; Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, 100084, China
| | - Jianyang Zeng
- School of Medicine, Tsinghua University, Beijing, 100084, China
|
26
|
Espah Borujeni A, Zhang J, Doosthosseini H, Nielsen AAK, Voigt CA. Genetic circuit characterization by inferring RNA polymerase movement and ribosome usage. Nat Commun 2020; 11:5001. [PMID: 33020480 PMCID: PMC7536230 DOI: 10.1038/s41467-020-18630-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 04/22/2020] [Accepted: 09/02/2020] [Indexed: 02/06/2023] Open
Abstract
To perform their computational function, genetic circuits change states through a symphony of genetic parts that turn regulator expression on and off. Debugging is frustrated by an inability to characterize parts in the context of the circuit and identify the origins of failures. Here, we take snapshots of a large genetic circuit in different states: RNA-seq is used to visualize circuit function as a changing pattern of RNA polymerase (RNAP) flux along the DNA. Together with ribosome profiling, all 54 genetic parts (promoters, ribozymes, RBSs, terminators) are parameterized and used to inform a mathematical model that can predict circuit performance, dynamics, and robustness. The circuit behaves as designed; however, it is riddled with genetic errors, including cryptic sense/antisense promoters and translation, attenuation, incorrect start codons, and a failed gate. While not impacting the expected Boolean logic, they reduce the prediction accuracy and could lead to failures when the parts are used in other designs. Finally, the cellular power (RNAP and ribosome usage) required to maintain a circuit state is calculated. This work demonstrates the use of a small number of measurements to fully parameterize a regulatory circuit and quantify its impact on host.
Affiliation(s)
- Amin Espah Borujeni
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Jing Zhang
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Hamid Doosthosseini
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Alec A K Nielsen
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Christopher A Voigt
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
|
27
|
Kwon MS, Lee BT, Lee SY, Kim HU. Modeling regulatory networks using machine learning for systems metabolic engineering. Curr Opin Biotechnol 2020; 65:163-170. [DOI: 10.1016/j.copbio.2020.02.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/20/2020] [Revised: 02/23/2020] [Accepted: 02/26/2020] [Indexed: 12/18/2022]
|
28
|
Computational discovery and modeling of novel gene expression rules encoded in the mRNA. Biochem Soc Trans 2020; 48:1519-1528. [PMID: 32662820 DOI: 10.1042/bst20191048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/18/2020] [Revised: 06/15/2020] [Accepted: 06/17/2020] [Indexed: 11/17/2022]
Abstract
The transcript is populated with numerous overlapping codes that regulate all steps of gene expression. Deciphering these codes is very challenging due to the large number of variables involved, the non-modular nature of the codes, biases and limitations in current experimental approaches, our limited knowledge in gene expression regulation across the tree of life, and other factors. In recent years, it has been shown that computational modeling and algorithms can significantly accelerate the discovery of novel gene expression codes. Here, we briefly summarize the latest developments and different approaches in the field.
|
29
|
Li F, Xing X, Xiao Z, Xu G, Yang X. RiboMiner: a toolset for mining multi-dimensional features of the translatome with ribosome profiling data. BMC Bioinformatics 2020; 21:340. [PMID: 32738892 PMCID: PMC7430821 DOI: 10.1186/s12859-020-03670-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/22/2020] [Accepted: 07/20/2020] [Indexed: 02/08/2023] Open
Abstract
Background: Ribosome profiling has been widely used for studies of translation under a large variety of cellular and physiological contexts. Many of these studies have greatly benefitted from a series of data-mining tools designed for dissection of the translatome from different aspects. However, as studies of translation advance quickly, the current toolbox still falls short, and more specialized tools are urgently needed for deeper and more efficient mining of the important and new features of the translation landscape. Results: Here, we present RiboMiner, a bioinformatics toolset for mining multi-dimensional features of the translatome with ribosome profiling data. RiboMiner performs extensive quality assessment of the data and integrates a spectrum of tools for various metagene analyses of the ribosome footprints and for detailed analyses of multiple features related to translation regulation. Visualizations of all the results are available. Many of these analyses have not been provided by previous methods. RiboMiner is highly flexible, as the pipeline can be easily adapted and customized for different scopes and targets of study. Conclusions: Applications of RiboMiner to two published datasets not only reproduced the main results reported before, but also generated novel insights into translation regulation processes. Therefore, being complementary to the current tools, RiboMiner could be a valuable resource for dissecting translation landscapes and translation regulation by mining ribosome profiling data more comprehensively and at higher resolution. RiboMiner is freely available at https://github.com/xryanglab/RiboMiner and https://pypi.org/project/RiboMiner.
Affiliation(s)
- Fajin Li
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China; Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China; Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
| | - Xudong Xing
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China; Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China; Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
| | - Zhengtao Xiao
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China; Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
| | - Gang Xu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China
| | - Xuerui Yang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China; Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
|
30
|
Arpat AB, Liechti A, De Matos M, Dreos R, Janich P, Gatfield D. Transcriptome-wide sites of collided ribosomes reveal principles of translational pausing. Genome Res 2020; 30:985-999. [PMID: 32703885 PMCID: PMC7397865 DOI: 10.1101/gr.257741.119] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 10/03/2019] [Accepted: 06/29/2020] [Indexed: 01/28/2023]
Abstract
Translation initiation is the major regulatory step defining the rate of protein production from an mRNA. Meanwhile, the impact of nonuniform ribosomal elongation rates is largely unknown. Using a modified ribosome profiling protocol based on footprints from two closely packed ribosomes (disomes), we have mapped ribosomal collisions transcriptome-wide in mouse liver. We uncover that the stacking of an elongating onto a paused ribosome occurs frequently and scales with translation rate, trapping ∼10% of translating ribosomes in the disome state. A distinct class of pause sites is indicative of deterministic pausing signals. Pause site association with specific amino acids, peptide motifs, and nascent polypeptide structure is suggestive of programmed pausing as a widespread mechanism associated with protein folding. Evolutionary conservation at disome sites indicates functional relevance of translational pausing. Collectively, our disome profiling approach allows unique insights into gene regulation occurring at the step of translation elongation.
Affiliation(s)
- Alaaddin Bulak Arpat
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Angélica Liechti
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
| | - Mara De Matos
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
| | - René Dreos
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Peggy Janich
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
| | - David Gatfield
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
31
|
Martinez-Seidel F, Beine-Golovchuk O, Hsieh YC, Kopka J. Systematic Review of Plant Ribosome Heterogeneity and Specialization. Front Plant Sci 2020; 11:948. [PMID: 32670337 PMCID: PMC7332886 DOI: 10.3389/fpls.2020.00948] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 06/13/2019] [Accepted: 06/10/2020] [Indexed: 05/25/2023]
Abstract
Plants dedicate a high amount of energy and resources to the production of ribosomes. Historically, these multi-protein ribosome complexes have been considered static protein synthesis machines that are not subject to extensive regulation but only read mRNA and produce polypeptides accordingly. New and increasing evidence across various model organisms demonstrated the heterogeneous nature of ribosomes. This heterogeneity can constitute specialized ribosomes that regulate mRNA translation and control protein synthesis. A prominent example of ribosome heterogeneity is seen in the model plant, Arabidopsis thaliana, which, due to genome duplications, has multiple paralogs of each ribosomal protein (RP) gene. We support the notion of plant evolution directing high RP paralog divergence toward functional heterogeneity, underpinned in part by a vast resource of ribosome mutants that suggest specialization extends beyond the pleiotropic effects of single structural RPs or RP paralogs. Thus, Arabidopsis is a highly suitable model to study this phenomenon. Arabidopsis enables reverse genetics approaches that could provide evidence of ribosome specialization. In this review, we critically assess evidence of plant ribosome specialization and highlight steps along ribosome biogenesis in which heterogeneity may arise, filling the knowledge gaps in plant science by providing advanced insights from the human or yeast fields. We propose a data analysis pipeline that infers the heterogeneity of ribosome complexes and deviations from canonical structural compositions linked to stress events. This analysis pipeline can be extrapolated and enhanced by combination with other high-throughput methodologies, such as proteomics. Technologies, such as kinetic mass spectrometry and ribosome profiling, will be necessary to resolve the temporal and spatial aspects of translational regulation while the functional features of ribosomal subpopulations will become clear with the combination of reverse genetics and systems biology approaches.
Affiliation(s)
- Federico Martinez-Seidel
- Willmitzer Department, Max Planck-Institute of Molecular Plant Physiology, Potsdam, Germany
- School of BioSciences, University of Melbourne, Parkville, VIC, Australia
| | | | - Yin-Chen Hsieh
- Bioinformatics Subdivision, Wageningen University, Wageningen, Netherlands
| | - Joachim Kopka
- Willmitzer Department, Max Planck-Institute of Molecular Plant Physiology, Potsdam, Germany
|
32
|
Hu H, Xiao A, Zhang S, Li Y, Shi X, Jiang T, Zhang L, Zhang L, Zeng J. DeepHINT: understanding HIV-1 integration via deep learning with attention. Bioinformatics 2020; 35:1660-1667. [PMID: 30295703 DOI: 10.1093/bioinformatics/bty842] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 04/04/2018] [Revised: 09/07/2018] [Accepted: 10/04/2018] [Indexed: 01/20/2023] Open
Abstract
Motivation: Human immunodeficiency virus type 1 (HIV-1) genome integration is closely related to clinical latency and viral rebound. In addition to human DNA sequences that directly interact with the integration machinery, the selection of HIV integration sites has also been shown to depend on the heterogeneous genomic context around a large region, which greatly hinders the prediction and mechanistic studies of HIV integration. Results: We have developed an attention-based deep learning framework, named DeepHINT, to simultaneously provide accurate prediction of HIV integration sites and mechanistic explanations of the detected sites. Extensive tests on a high-density HIV integration site dataset showed that DeepHINT can outperform conventional modeling strategies by automatically learning the genomic context of HIV integration from primary DNA sequence alone or together with epigenetic information. Systematic analyses on diverse known factors of HIV integration further validated the biological relevance of the prediction results. More importantly, in-depth analyses of the attention values output by DeepHINT revealed intriguing mechanistic implications in the selection of HIV integration sites, including potential roles of several DNA-binding proteins. These results established DeepHINT as an effective and explainable deep learning framework for the prediction and mechanistic study of HIV integration. Availability and implementation: DeepHINT is available as open-source software and can be downloaded from https://github.com/nonnerdling/DeepHINT. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hailin Hu
- School of Medicine, Tsinghua University, Beijing, China
| | - An Xiao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Sai Zhang
- Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Yangyang Li
- Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China
| | - Xuanling Shi
- Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China
| | - Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA, USA; Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China; Institute of Integrative Genome Biology, University of California, Riverside, CA, USA
| | - Linqi Zhang
- Comprehensive AIDS Research Center, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China
| | - Lei Zhang
- School of Medicine, Tsinghua University, Beijing, China
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
|
33
|
Kiniry SJ, Michel AM, Baranov PV. Computational methods for ribosome profiling data analysis. Wiley Interdiscip Rev RNA 2020; 11:e1577. [PMID: 31760685 DOI: 10.1002/wrna.1577] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/30/2019] [Revised: 10/12/2019] [Accepted: 10/16/2019] [Indexed: 12/15/2022]
Abstract
Since the introduction of the ribosome profiling technique in 2009, its popularity has greatly increased. It is widely used for the comprehensive assessment of gene expression and for studying the mechanisms of regulation at the translational level. As the number of ribosome profiling datasets being produced continues to grow, so too does the need for reliable software that can provide answers to the biological questions it can address. This review describes the computational methods and tools that have been developed to analyze ribosome profiling data at the different stages of the process. It starts with initial routine processing of raw data and follows with more specific tasks such as the identification of translated open reading frames, differential gene expression analysis, or evaluation of local or global codon decoding rates. The review pinpoints challenges associated with each step and explains the ways in which they are currently addressed. In addition, it provides a comprehensive, albeit incomplete, list of publicly available software applicable to each step, which may be a beneficial starting point to those unexposed to ribosome profiling analysis. The outline of current challenges in ribosome profiling data analysis may inspire computational biologists to search for novel, potentially superior, solutions that will improve and expand the bioinformatician's toolbox for ribosome profiling data analysis. This article is categorized under: Translation > Ribosome Structure/Function; RNA Evolution and Genomics > Computational Analyses of RNA; Translation > Translation Mechanisms; Translation > Translation Regulation.
Affiliation(s)
- Stephen J Kiniry
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Audrey M Michel
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, RAS, Moscow, Russia
|
34
|
Gobet C, Weger BD, Marquis J, Martin E, Neelagandan N, Gachon F, Naef F. Robust landscapes of ribosome dwell times and aminoacyl-tRNAs in response to nutrient stress in liver. Proc Natl Acad Sci U S A 2020; 117:9630-9641. [PMID: 32295881 PMCID: PMC7196831 DOI: 10.1073/pnas.1918145117] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/15/2022] Open
Abstract
Translation depends on messenger RNA (mRNA)-specific initiation, elongation, and termination rates. While translation elongation is well studied in bacteria and yeast, less is known in higher eukaryotes. Here we combined ribosome and transfer RNA (tRNA) profiling to investigate the relations between translation elongation rates, (aminoacyl-) tRNA levels, and codon usage in mammals. We modeled codon-specific ribosome dwell times from ribosome profiling, considering codon pair interactions between ribosome sites. In mouse liver, the model revealed site- and codon-specific dwell times that differed from those in yeast, as well as pairs of adjacent codons in the P and A site that markedly slow down or speed up elongation. While translation efficiencies vary across diurnal time and feeding regimen, codon dwell times were highly stable and conserved in human. Measured tRNA levels correlated with codon usage and several tRNAs showed reduced aminoacylation, which was conserved in fasted mice. Finally, we uncovered that the longest codon dwell times could be explained by aminoacylation levels or high codon usage relative to tRNA abundance.
Affiliation(s)
- Cédric Gobet
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
- Nestlé Research, CH-1015 Lausanne, Switzerland
| | - Benjamin Dieter Weger
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
- Nestlé Research, CH-1015 Lausanne, Switzerland
| | | | - Eva Martin
- Nestlé Research, CH-1015 Lausanne, Switzerland
| | - Nagammal Neelagandan
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
| | | | - Felix Naef
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
|
35
|
Recent advances in ribosome profiling for deciphering translational regulation. Methods 2020; 176:46-54. [DOI: 10.1016/j.ymeth.2019.05.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/04/2018] [Revised: 05/02/2019] [Accepted: 05/15/2019] [Indexed: 12/16/2022] Open
|
36
|
Peeters MKR, Menschaert G. The hunt for sORFs: A multidisciplinary strategy. Exp Cell Res 2020; 391:111923. [PMID: 32135166 DOI: 10.1016/j.yexcr.2020.111923] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/08/2019] [Revised: 02/21/2020] [Accepted: 02/23/2020] [Indexed: 11/28/2022]
Abstract
Growing evidence illustrates the shortcomings in the current understanding of the full complexity of the proteome. Previously overlooked small open reading frames (sORFs) and their encoded microproteins have filled important gaps, exerting their function as biologically relevant regulators. The characterization of the full small proteome has potential applications in many fields. Continuous development of techniques and tools has led to improved sORF discovery, with candidates originating from bioinformatics analyses, sequencing routines, or proteomics approaches. In this mini review, we discuss the ongoing trends in the three fields and suggest some strategies for further characterization of high-potential candidates.
Affiliation(s)
- Marlies K R Peeters
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 900, Gent, Belgium
| | - Gerben Menschaert
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 900, Gent, Belgium.
|
37
|
XPRESSyourself: Enhancing, standardizing, and automating ribosome profiling computational analyses yields improved insight into data. PLoS Comput Biol 2020; 16:e1007625. [PMID: 32004313 PMCID: PMC7015430 DOI: 10.1371/journal.pcbi.1007625] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/11/2019] [Revised: 02/12/2020] [Accepted: 12/20/2019] [Indexed: 11/19/2022] Open
Abstract
Ribosome profiling, an application of nucleic acid sequencing for monitoring ribosome activity, has revolutionized our understanding of protein translation dynamics. This technique has been available for a decade, yet the current state and standardization of publicly available computational tools for these data is bleak. We introduce XPRESSyourself, an analytical toolkit that eliminates barriers and bottlenecks associated with this specialized data type by filling gaps in the computational toolset for both experts and non-experts of ribosome profiling. XPRESSyourself automates and standardizes analysis procedures, decreasing time-to-discovery and increasing reproducibility. This toolkit acts as a reference implementation of current best practices in ribosome profiling analysis. We demonstrate this toolkit’s performance on publicly available ribosome profiling data by rapidly identifying hypothetical mechanisms related to neurodegenerative phenotypes and neuroprotective mechanisms of the small-molecule ISRIB during acute cellular stress. XPRESSyourself brings robust, rapid analysis of ribosome-profiling data to a broad and ever-expanding audience and will lead to more reproducible and accessible measurements of translation regulation. XPRESSyourself software is perpetually open-source under the GPL-3.0 license and is hosted at https://github.com/XPRESSyourself, where users can access additional documentation and report software issues.
|
38
|
Cui H, Hu H, Zeng J, Chen T. DeepShape: estimating isoform-level ribosome abundance and distribution with Ribo-seq data. BMC Bioinformatics 2019; 20:678. [PMID: 31861979 PMCID: PMC6923924 DOI: 10.1186/s12859-019-3244-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/02/2022] Open
Abstract
Background: Ribosome profiling brings insight into the process of translation. A basic step in profile construction at the transcript level is to map Ribo-seq data to transcripts, and then assign a huge number of multiple-mapped reads to similar isoforms. Existing methods either discard the multiple-mapped reads, allocate them randomly, or assign them proportionally according to transcript abundance estimated from RNA-seq data. Results: Here we present DeepShape, an RNA-seq-free computational method to estimate ribosome abundance of isoforms, and simultaneously compute their ribosome profiles using a deep learning model. Our simulation results demonstrate that DeepShape can provide more accurate estimations of both ribosome abundance and profiles when compared to state-of-the-art methods. We applied DeepShape to a set of Ribo-seq data from PC3 human prostate cancer cells with and without PP242 treatment. In the four cell invasion/metastasis genes that are translationally regulated by PP242 treatment, different isoforms show very different characteristics of translational efficiency and regulation patterns. Transcript-level ribosome distributions were analyzed by the “Codon Residence Index (CRI)” proposed in this study to investigate the relative speed at which a ribosome moves on a codon compared to its synonymous codons. We observe consistent CRI patterns in PC3 cells. We found that the translation of several codons could be regulated by PP242 treatment. Conclusion: In summary, we demonstrate that DeepShape can serve as a powerful tool for Ribo-seq data analysis.
Affiliation(s)
- Hongfei Cui
- Institute for Artificial Intelligence and Department of Computer Science and Technology, Tsinghua University, Beijing, China; DonLinks School of Economics and Management, University of Science and Technology Beijing, Beijing, China
| | - Hailin Hu
- School of Medicine, Tsinghua University, Beijing, China
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.
| | - Ting Chen
- Institute for Artificial Intelligence and Department of Computer Science and Technology, Tsinghua University, Beijing, China.
|
39
|
Alexaki A, Hettiarachchi GK, Athey JC, Katneni UK, Simhadri V, Hamasaki-Katagiri N, Nanavaty P, Lin B, Takeda K, Freedberg D, Monroe D, McGill JR, Peters R, Kames JM, Holcomb DD, Hunt RC, Sauna ZE, Gelinas A, Janjic N, DiCuccio M, Bar H, Komar AA, Kimchi-Sarfaty C. Effects of codon optimization on coagulation factor IX translation and structure: Implications for protein and gene therapies. Sci Rep 2019; 9:15449. [PMID: 31664102 PMCID: PMC6820528 DOI: 10.1038/s41598-019-51984-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 08/07/2019] [Accepted: 10/09/2019] [Indexed: 11/16/2022] Open
Abstract
Synonymous codons occur with different frequencies in different organisms, a phenomenon termed codon usage bias. Codon optimization, a common term for a variety of approaches used widely by the biopharmaceutical industry, involves synonymous substitutions to increase protein expression. It had long been presumed that synonymous variants, which, by definition, do not alter the primary amino acid sequence, have no effect on protein structure and function. However, a critical mass of reports suggests that synonymous codon variations may impact protein conformation. To investigate the impact of synonymous codon usage on protein expression and function, we designed an optimized coagulation factor IX (FIX) variant and used multiple methods to compare its properties to the wild-type FIX upon expression in HEK293T cells. We found that the two variants differ in their conformation, even when controlling for the difference in expression levels. Using ribosome profiling, we identified robust changes in the translational kinetics of the two variants and were able to identify a region in the gene that may have a role in altering the conformation of the protein. Our data have direct implications for codon optimization strategies, for production of recombinant proteins and gene therapies.
Affiliation(s)
- Aikaterini Alexaki
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Gaya K Hettiarachchi
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- John C Athey
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Upendra K Katneni
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Vijaya Simhadri
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Nobuko Hamasaki-Katagiri
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Puja Nanavaty
- Center for Gene Regulation in Health and Disease, Cleveland State University, Cleveland, OH, USA
- Brian Lin
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Kazuyo Takeda
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Darón Freedberg
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Dougald Monroe
- University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Joseph R McGill
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Jacob M Kames
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- David D Holcomb
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Ryan C Hunt
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Zuben E Sauna
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Michael DiCuccio
- National Center of Biotechnology Information, National Institutes of Health, Bethesda, MD, USA
- Haim Bar
- Department of Statistics, University of Connecticut, Storrs, CT, USA
- Anton A Komar
- Center for Gene Regulation in Health and Disease, Cleveland State University, Cleveland, OH, USA
- Chava Kimchi-Sarfaty
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
40
Hu Y, Wang Z, Hu H, Wan F, Chen L, Xiong Y, Wang X, Zhao D, Huang W, Zeng J. ACME: pan-specific peptide–MHC class I binding prediction through attention-based deep neural networks. Bioinformatics 2019; 35:4946-4954. [PMID: 31120490 DOI: 10.1093/bioinformatics/btz427] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 02/07/2019] [Revised: 04/12/2019] [Accepted: 05/19/2019] [Indexed: 12/30/2022]
Abstract
Motivation
Prediction of peptide binding to the major histocompatibility complex (MHC) plays a vital role in the development of therapeutic vaccines for the treatment of cancer. Algorithms with improved correlations between predicted and actual binding affinities are needed to increase precision and reduce the number of false positive predictions.
Results
We present ACME (Attention-based Convolutional neural networks for MHC Epitope binding prediction), a new pan-specific algorithm to accurately predict the binding affinities between peptides and MHC class I molecules, even for those new alleles that are not seen in the training data. Extensive tests have demonstrated that ACME can significantly outperform other state-of-the-art prediction methods with an increase of the Pearson correlation coefficient between predicted and measured binding affinities by up to 23 percentage points. In addition, its ability to identify strong-binding peptides has been experimentally validated. Moreover, by integrating the convolutional neural network with attention mechanism, ACME is able to extract interpretable patterns that can provide useful and detailed insights into the binding preferences between peptides and their MHC partners. All these results have demonstrated that ACME can provide a powerful and practically useful tool for the studies of peptide–MHC class I interactions.
Availability and implementation
ACME is available as an open source software at https://github.com/HYsxe/ACME.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yan Hu
- School of Life Sciences, Tsinghua University, Beijing, China
- Ziqiang Wang
- Department of Urology, Shenzhen Second People’s Hospital, The First Affiliated Hospital of Shenzhen University, International Cancer Center, Shenzhen University School of Medicine, Shenzhen, China
- Hailin Hu
- School of Medicine, Tsinghua University, Beijing, China
- Fangping Wan
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Lin Chen
- Turing AI Institute of Nanjing, Nanjing, China
- Yuanpeng Xiong
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Xiaoxia Wang
- Department of Urology, Shenzhen Second People’s Hospital, The First Affiliated Hospital of Shenzhen University, International Cancer Center, Shenzhen University School of Medicine, Shenzhen, China
- Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Weiren Huang
- Department of Urology, Shenzhen Second People’s Hospital, The First Affiliated Hospital of Shenzhen University, International Cancer Center, Shenzhen University School of Medicine, Shenzhen, China
- Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China
41
Yu H, Meng W, Mao Y, Zhang Y, Sun Q, Tao S. Deciphering the rules of mRNA structure differentiation in Saccharomyces cerevisiae in vivo and in vitro with deep neural networks. RNA Biol 2019; 16:1044-1054. [PMID: 31119975 PMCID: PMC6602416 DOI: 10.1080/15476286.2019.1612692] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
Abstract
The structure of mRNA in vivo is unwound to some extent in response to multiple factors involved in the translation process, resulting in significant differences from the structure of the same mRNA in vitro. In this study, we have proposed a novel application of deep neural networks, named DeepDRU, to predict the degree of mRNA structure unwinding in vivo by fitting five quantifiable features that may affect mRNA folding: ribosome density (RD), minimum folding free energy (MFE), GC content, translation initiation ribosome density (INI) and mRNA structure position (POS). mRNA structures with adjustment of the simulated structural features were designed and then fed into the trained DeepDRU model. We found unique effect regions of these five features on mRNA structure in vivo. Strikingly, INI is the most critical factor affecting the structure of mRNA in vivo, and structural sequence features, including MFE and GC content, have relatively smaller effects. DeepDRU provides a new paradigm for predicting the unwinding capability of mRNA structure in vivo. This improved knowledge about the mechanisms of factors influencing the structural capability of mRNA to unwind will facilitate the design and functional analysis of mRNA structure in vivo.
Affiliation(s)
- Haopeng Yu
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China; Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Wenjing Meng
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China; Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Yuanhui Mao
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China; Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Yi Zhang
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China; Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Qing Sun
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China; Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Shiheng Tao
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China; Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
42
Ingolia NT, Hussmann JA, Weissman JS. Ribosome Profiling: Global Views of Translation. Cold Spring Harb Perspect Biol 2019; 11:cshperspect.a032698. [PMID: 30037969 DOI: 10.1101/cshperspect.a032698] [Citation(s) in RCA: 187] [Impact Index Per Article: 31.2] [Indexed: 12/15/2022]
Abstract
The translation of messenger RNA (mRNA) into protein and the folding of the resulting protein into an active form are prerequisites for virtually every cellular process and represent the single largest investment of energy by cells. Ribosome profiling-based approaches have revolutionized our ability to monitor every step of protein synthesis in vivo, allowing one to measure the rate of protein synthesis across the proteome, annotate the protein coding capacity of genomes, monitor localized protein synthesis, and explore cotranslational folding and targeting. The rich and quantitative nature of ribosome profiling data provides an unprecedented opportunity to explore and model complex cellular processes. New analytical techniques and improved experimental protocols will provide a deeper understanding of the factors controlling translation speed and its impact on protein function and cell physiology as well as the role of ribosomal RNA and mRNA modifications in regulating translation.
Affiliation(s)
- Nicholas T Ingolia
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720
- Jeffrey A Hussmann
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California 94158; Howard Hughes Medical Institute, San Francisco, California 94158
- Jonathan S Weissman
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California 94158; Howard Hughes Medical Institute, San Francisco, California 94158
43
Accurate design of translational output by a neural network model of ribosome distribution. Nat Struct Mol Biol 2018; 25:577-582. [PMID: 29967537 PMCID: PMC6457438 DOI: 10.1038/s41594-018-0080-2] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 11/21/2017] [Accepted: 05/11/2018] [Indexed: 11/08/2022]
Abstract
Synonymous codon choice can have dramatic effects on ribosome speed and protein expression. Ribosome profiling experiments have underscored that ribosomes do not move uniformly along mRNAs. Here, we have modeled this variation in translation elongation by using a feed-forward neural network to predict the ribosome density at each codon as a function of its sequence neighborhood. Our approach revealed sequence features affecting translation elongation and characterized large technical biases in ribosome profiling. We applied our model to design synonymous variants of a fluorescent protein spanning the range of translation speeds predicted with our model. Levels of the fluorescent protein in budding yeast closely tracked the predicted translation speeds across their full range. We therefore demonstrate that our model captures information determining translation dynamics in vivo; that this information can be harnessed to design coding sequences; and that control of translation elongation alone is sufficient to produce large quantitative differences in protein output.
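To make "ribosome density at each codon as a function of its sequence neighborhood" concrete, the input to such a feed-forward network could be built by one-hot encoding the codons in a window around the position of interest. The sketch below is only an illustration under stated assumptions: the function name `encode_neighborhood`, the window size `flank`, and the choice to zero-pad positions beyond the coding sequence are hypothetical and are not taken from the cited paper.

```python
from itertools import product

# All 64 codons in a fixed order (AAA, AAC, ..., TTT) and their indices.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_INDEX = {codon: i for i, codon in enumerate(CODONS)}


def encode_neighborhood(cds, codon_pos, flank=2):
    """One-hot encode the codons from codon_pos - flank to codon_pos + flank.

    cds: in-frame coding sequence, uppercase DNA letters (A, C, G, T) only.
    codon_pos: zero-based index of the codon whose density would be predicted.
    Positions that fall outside the sequence are left as all-zero blocks.
    Returns a flat list of length 64 * (2 * flank + 1).
    """
    n_codons = len(cds) // 3
    features = []
    for offset in range(-flank, flank + 1):
        one_hot = [0] * 64
        pos = codon_pos + offset
        if 0 <= pos < n_codons:
            codon = cds[3 * pos:3 * pos + 3]
            one_hot[CODON_INDEX[codon]] = 1
        features.extend(one_hot)
    return features


# Encode the middle codon of ATG-GCT-TAA with one codon of context on each side.
vector = encode_neighborhood("ATGGCTTAA", codon_pos=1, flank=1)
print(len(vector), sum(vector))  # 192 3
```

A vector like this would then be passed to the first dense layer of the network; any additional features a real model uses are outside the scope of this sketch.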
44
Argüello RJ, Reverendo M, Mendes A, Camosseto V, Torres AG, Ribas de Pouplana L, van de Pavert SA, Gatti E, Pierre P. SunRiSE - measuring translation elongation at single-cell resolution by means of flow cytometry. J Cell Sci 2018; 131:jcs.214346. [PMID: 29700204 DOI: 10.1242/jcs.214346] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 12/13/2017] [Accepted: 04/04/2018] [Indexed: 12/30/2022]
Abstract
The rate at which ribosomes translate mRNAs regulates protein expression by controlling co-translational protein folding and mRNA stability. Many factors regulate translation elongation, including tRNA levels, codon usage and phosphorylation of eukaryotic elongation factor 2 (eEF2). Current methods to measure translation elongation lack single-cell resolution, require expression of multiple transgenes and have never been successfully applied ex vivo. Here, we show, by using a combination of puromycilation detection and flow cytometry (a method we call 'SunRiSE'), that translation elongation can be measured accurately in primary cells in pure or heterogenous populations isolated from blood or tissues. This method allows for the simultaneous monitoring of multiple parameters, such as mTOR or S6K1/2 signaling activity, the cell cycle stage and phosphorylation of translation factors in single cells, without elaborated, costly and lengthy purification procedures. We took advantage of SunRiSE to demonstrate that, in mouse embryonic fibroblasts, eEF2 phosphorylation by eEF2 kinase (eEF2K) mostly affects translation engagement, but has a surprisingly small effect on elongation, except after proteotoxic stress induction. This article has an associated First Person interview with the first author of the paper.
Affiliation(s)
- Rafael J Argüello
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France
- Marisa Reverendo
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France
- Andreia Mendes
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France
- Voahirana Camosseto
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France
- Adrian G Torres
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Parc Científic de Barcelona, C/Baldiri Reixac 10, 08028 Barcelona, Catalonia, Spain
- Lluis Ribas de Pouplana
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Parc Científic de Barcelona, C/Baldiri Reixac 10, 08028 Barcelona, Catalonia, Spain; Catalan Institute for Research and Advanced Studies (ICREA), P/Lluis Companys 23, 08010 Barcelona, Catalonia, Spain
- Serge A van de Pavert
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France
- Evelina Gatti
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France; Institute for Research in Biomedicine (iBiMED) and Ilidio Pinho Foundation, Department of Medical Sciences, University of Aveiro, 3810-193 Aveiro, Portugal
- Philippe Pierre
- Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Inserm, CNRS, 13288, Marseille Cedex 9, France; Institute for Research in Biomedicine (iBiMED) and Ilidio Pinho Foundation, Department of Medical Sciences, University of Aveiro, 3810-193 Aveiro, Portugal