1. Akuwudike P, López-Riego M, Marczyk M, Kocibalova Z, Brückner F, Polańska J, Wojcik A, Lundholm L. Short- and long-term effects of radiation exposure at low dose and low dose rate in normal human VH10 fibroblasts. Front Public Health 2023; 11:1297942. [PMID: 38162630 PMCID: PMC10755029 DOI: 10.3389/fpubh.2023.1297942] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/20/2023] [Accepted: 11/20/2023] [Indexed: 01/03/2024]
Abstract
Introduction Experimental studies complement epidemiological data on the biological effects of low doses and dose rates of ionizing radiation and help in determining the dose and dose rate effectiveness factor. Methods Human VH10 skin fibroblasts exposed to 25, 50, and 100 mGy of 137Cs gamma radiation at 1.6, 8, 12 mGy/h, and at a high dose rate of 23.4 Gy/h, were analyzed for radiation-induced short- and long-term effects. Two sample cohorts, i.e., discovery (n = 30) and validation (n = 12), were subjected to RNA sequencing. The pool of the results from those six experiments with shared conditions (1.6 mGy/h; 24 h), together with an earlier time point (0 h), constituted a third cohort (n = 12). Results The 100 mGy-exposed cells at all abovementioned dose rates, harvested at 0/24 h and 21 days after exposure, showed no strong gene expression changes. DMXL2, involved in the regulation of the NOTCH signaling pathway, presented a consistent upregulation among both the discovery and validation cohorts, and was validated by qPCR. Gene set enrichment analysis revealed that the NOTCH pathway was upregulated in the pooled cohort (p = 0.76, normalized enrichment score (NES) = 0.86). Apart from upregulated apical junction and downregulated DNA repair, few pathways were consistently changed across exposed cohorts. Concurringly, cell viability assays, performed 1, 3, and 6 days post irradiation, and colony forming assay, seeded just after exposure, did not reveal any statistically significant early effects on cell growth or survival patterns. Tendencies of increased viability (day 6) and reduced colony size (day 21) were observed at 12 mGy/h and 23.4 Gy/min. Furthermore, no long-term changes were observed in cell growth curves generated up to 70 days after exposure. Discussion In conclusion, low doses of gamma radiation given at low dose rates had no strong cytotoxic effects on radioresistant VH10 cells.
Affiliation(s)
- Pamela Akuwudike
- Centre for Radiation Protection Research, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Milagrosa López-Riego
- Centre for Radiation Protection Research, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Michal Marczyk
- Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland
- Yale Cancer Center, Yale School of Medicine, New Haven, CT, United States
- Zuzana Kocibalova
- Centre for Radiation Protection Research, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Fabian Brückner
- Centre for Radiation Protection Research, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Joanna Polańska
- Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland
- Andrzej Wojcik
- Centre for Radiation Protection Research, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Institute of Biology, Jan Kochanowski University, Kielce, Poland
- Lovisa Lundholm
- Centre for Radiation Protection Research, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
2. Zhou J, Wang X, Wei Z, Meng J, Huang D. 4acCPred: Weakly supervised prediction of N4-acetyldeoxycytosine DNA modification from sequences. Molecular Therapy - Nucleic Acids 2022; 30:337-345. [DOI: 10.1016/j.omtn.2022.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 10/12/2022] [Indexed: 11/06/2022]
3. Multi-attention multiple instance learning. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07259-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
4. Huang D, Song B, Wei J, Su J, Coenen F, Meng J. Weakly supervised learning of RNA modifications from low-resolution epitranscriptome data. Bioinformatics 2021; 37:i222-i230. [PMID: 34252943 PMCID: PMC8336446 DOI: 10.1093/bioinformatics/btab278] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 11/25/2022]
Abstract
Motivation Increasing evidence suggests that post-transcriptional ribonucleic acid (RNA) modifications regulate essential biomolecular functions and are related to the pathogenesis of various diseases. Precise identification of RNA modification sites is essential for understanding the regulatory mechanisms of RNAs. To date, many computational approaches for predicting RNA modifications have been developed, most of which were based on strong supervision enabled by base-resolution epitranscriptome data. However, high-resolution data may not be available. Results We propose WeakRM, the first weakly supervised learning framework for predicting RNA modifications from low-resolution epitranscriptome datasets, such as those generated from acRIP-seq and hMeRIP-seq. Evaluations on three independent datasets (corresponding to three different RNA modification types and their respective sequencing technologies) demonstrated the effectiveness of our approach in predicting RNA modifications from low-resolution data. WeakRM outperformed state-of-the-art multi-instance learning methods for genomic sequences, such as WSCNN, which was originally designed for transcription factor binding site prediction. Additionally, our approach captured motifs that are consistent with existing knowledge, and visualization of the predicted modification-containing regions revealed the potential for detecting RNA modifications with improved resolution. Availability and implementation The source code for the WeakRM algorithm, along with the datasets used, is freely accessible at https://github.com/daiyun02211/WeakRM. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Daiyun Huang
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Department of Computer Science, University of Liverpool, Liverpool L69 7ZB, UK
- Bowen Song
- Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
- Jingjue Wei
- Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Jionglong Su
- School of AI and Advanced Computing, XJTLU Entrepreneur College (Taicang), Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Frans Coenen
- Department of Computer Science, University of Liverpool, Liverpool L69 7ZB, UK
- Jia Meng
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK; AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
5. Li HD, Yang C, Zhang Z, Yang M, Wu FX, Omenn GS, Wang J. IsoResolve: predicting splice isoform functions by integrating gene and isoform-level features with domain adaptation. Bioinformatics 2021; 37:522-530. [PMID: 32966552 PMCID: PMC8088322 DOI: 10.1093/bioinformatics/btaa829] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/04/2020] [Revised: 08/12/2020] [Accepted: 09/09/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION High resolution annotation of gene functions is a central goal in functional genomics. A single gene may produce multiple isoforms with different functions through alternative splicing. Conventional approaches, however, consider a gene as a single entity without differentiating these functionally different isoforms. Towards understanding gene functions at higher resolution, recent efforts have focused on predicting the functions of isoforms. However, the performance of existing methods is far from satisfactory mainly because of the lack of isoform-level functional annotation. RESULTS We present IsoResolve, a novel approach for isoform function prediction, which leverages the information from gene function prediction models with domain adaptation (DA). IsoResolve treats gene-level and isoform-level features as source and target domains, respectively. It uses DA to project the two domains into a latent variable space in such a way that the latent variables from the two domains have similar distribution, which enables the gene domain information to be leveraged for isoform function prediction. We systematically evaluated the performance of IsoResolve in predicting functions. Compared with five state-of-the-art methods, IsoResolve achieved significantly better performance. IsoResolve was further validated by case studies of genes with isoform-level functional annotation. AVAILABILITY AND IMPLEMENTATION IsoResolve is freely available at https://github.com/genemine/IsoResolve. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hong-Dong Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
- Changhuo Yang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, Hunan 410083, China
- Mengyun Yang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
- Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N5A9, Canada
- Gilbert S Omenn
- Institute for Systems Biology, Seattle, WA 98101, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
6. Pozo F, Martinez-Gomez L, Walsh TA, Rodriguez JM, Di Domenico T, Abascal F, Vazquez J, Tress ML. Assessing the functional relevance of splice isoforms. NAR Genom Bioinform 2021; 3:lqab044. [PMID: 34046593 PMCID: PMC8140736 DOI: 10.1093/nargab/lqab044] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/27/2020] [Revised: 04/22/2021] [Accepted: 05/17/2021] [Indexed: 12/20/2022]
Abstract
Alternative splicing of messenger RNA can generate an array of mature transcripts, but it is not clear how many go on to produce functionally relevant protein isoforms. There is only limited evidence for alternative proteins in proteomics analyses and data from population genetic variation studies indicate that most alternative exons are evolving neutrally. Determining which transcripts produce biologically important isoforms is key to understanding isoform function and to interpreting the real impact of somatic mutations and germline variations. Here we have developed a method, TRIFID, to classify the functional importance of splice isoforms. TRIFID was trained on isoforms detected in large-scale proteomics analyses and distinguishes these biologically important splice isoforms with high confidence. Isoforms predicted as functionally important by the algorithm had measurable cross species conservation and significantly fewer broken functional domains. Additionally, exons that code for these functionally important protein isoforms are under purifying selection, while exons from low scoring transcripts largely appear to be evolving neutrally. TRIFID has been developed for the human genome, but it could in principle be applied to other well-annotated species. We believe that this method will generate valuable insights into the cellular importance of alternative splicing.
Affiliation(s)
- Fernando Pozo
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Laura Martinez-Gomez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Thomas A Walsh
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- José Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
- Tomas Di Domenico
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Federico Abascal
- Somatic Evolution Group, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
- Jesús Vazquez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
- Michael L Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
7. Jiao L, Yang Y, Yu W, Zhao Y, Long H, Gao J, Ding K, Ma C, Li J, Zhao S, Wang H, Li H, Yang M, Xu J, Wang J, Yang J, Kuang D, Luo F, Qian X, Xu L, Yin B, Liu W, Liu H, Lu S, Peng X. The olfactory route is a potential way for SARS-CoV-2 to invade the central nervous system of rhesus monkeys. Signal Transduct Target Ther 2021; 6:169. [PMID: 33895780 PMCID: PMC8065334 DOI: 10.1038/s41392-021-00591-7] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 02/22/2021] [Revised: 03/22/2021] [Accepted: 03/23/2021] [Indexed: 01/08/2023]
Abstract
Neurological manifestations are frequently reported in COVID-19 patients. The neuromechanism of SARS-CoV-2 remains to be elucidated. In this study, we explored the mechanisms of SARS-CoV-2 neurotropism via our established non-human primate (NHP) model of COVID-19. In rhesus monkeys, SARS-CoV-2 invades the central nervous system (CNS) primarily via the olfactory bulb. Thereafter, the virus rapidly spreads to functional areas of the CNS, such as the hippocampus, thalamus, and medulla oblongata. SARS-CoV-2 infection induces inflammation, possibly by targeting neurons, microglia, and astrocytes in the CNS. Consistently, SARS-CoV-2 infects neuro-derived SK-N-SH, glial-derived U251, and brain microvascular endothelial cells in vitro. To our knowledge, this is the first experimental evidence of SARS-CoV-2 neuroinvasion in the NHP model, which provides important insights into the CNS-related pathogenesis of SARS-CoV-2.
Affiliation(s)
- Li Jiao
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Yun Yang
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Wenhai Yu
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Yuan Zhao
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Haiting Long
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Jiahong Gao
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Kaiyun Ding
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Chunxia Ma
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Jingmei Li
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Siwen Zhao
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Haixuan Wang
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Haiyan Li
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Mengli Yang
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Jingwen Xu
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Junbin Wang
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Jing Yang
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Dexuan Kuang
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Fangyu Luo
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Xingli Qian
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Longjiang Xu
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Bin Yin
- State Key Laboratory of Medical Molecular Biology, Department of Molecular Biology and Biochemistry, Institute of Basic Medical Sciences, Medical Primate Research Center, Neuroscience Center, Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
- Wei Liu
- Department of Anatomy, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
- Hongqi Liu
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China
- Shuaiyao Lu
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China; State Key Laboratory of Medical Molecular Biology, Department of Molecular Biology and Biochemistry, Institute of Basic Medical Sciences, Medical Primate Research Center, Neuroscience Center, Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
- Xiaozhong Peng
- National Kunming High-Level Biosafety Primate Research Center, Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, Yunnan, China; State Key Laboratory of Medical Molecular Biology, Department of Molecular Biology and Biochemistry, Institute of Basic Medical Sciences, Medical Primate Research Center, Neuroscience Center, Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
8. Nieboer MM, de Ridder J. svMIL: predicting the pathogenic effect of TAD boundary-disrupting somatic structural variants through multiple instance learning. Bioinformatics 2020; 36:i692-i699. [DOI: 10.1093/bioinformatics/btaa802] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 09/08/2020] [Indexed: 12/21/2022]
Abstract
Motivation
Despite the fact that structural variants (SVs) play an important role in cancer, methods to predict their effect, especially for SVs in non-coding regions, are lacking, leaving them often overlooked in the clinic. Non-coding SVs may disrupt the boundaries of Topologically Associated Domains (TADs), thereby affecting interactions between genes and regulatory elements such as enhancers. However, it is not known when such alterations are pathogenic. Although machine learning techniques are a promising solution to answer this question, representing the large number of interactions that an SV can disrupt in a single feature matrix is not trivial.
Results
We introduce svMIL: a method to predict pathogenic TAD boundary-disrupting SV effects based on multiple instance learning, which circumvents the need for a traditional feature matrix by grouping SVs into bags that can contain any number of disruptions. We demonstrate that svMIL can predict SV pathogenicity, measured through same-sample gene expression aberration, for various cancer types. In addition, our approach reveals that somatic pathogenic SVs alter different regulatory interactions than somatic non-pathogenic SVs and germline SVs.
Availability and implementation
All code for svMIL is publicly available on GitHub: https://github.com/UMCUGenetics/svMIL.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marleen M. Nieboer
- Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht 3584 CG, The Netherlands
- Jeroen de Ridder
- Center for Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht 3584 CG, The Netherlands
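The bag idea behind multiple instance learning, which the svMIL abstract above relies on, can be illustrated with a minimal sketch. It assumes the standard MIL convention that a bag is positive when at least one of its instances scores above a threshold (max pooling). The linear scoring rule, the weights, and the names `instance_score`, `bag_score`, and `predict_bags` are hypothetical and are not taken from svMIL; the point is only that bags of different sizes need no fixed-width feature matrix.

```python
from typing import Dict, List, Sequence

Instance = Sequence[float]  # one feature vector, e.g. one disrupted interaction


def instance_score(weights: Sequence[float], instance: Instance) -> float:
    """Linear score of a single instance (hypothetical scoring rule)."""
    return sum(w * x for w, x in zip(weights, instance))


def bag_score(weights: Sequence[float], bag: List[Instance]) -> float:
    """Score a bag by its highest-scoring instance (max pooling)."""
    if not bag:
        raise ValueError("a bag needs at least one instance")
    return max(instance_score(weights, inst) for inst in bag)


def predict_bags(
    weights: Sequence[float],
    bags: Dict[str, List[Instance]],
    threshold: float = 0.0,
) -> Dict[str, bool]:
    """Label each bag positive if its pooled score exceeds the threshold."""
    return {name: bag_score(weights, bag) > threshold for name, bag in bags.items()}


# Toy data: bags may hold any number of instances.
weights = [1.0, -1.0]
bags = {
    "sv_a": [[0.2, 0.9], [1.5, 0.1]],
    "sv_b": [[0.1, 0.4], [0.0, 0.3], [0.2, 0.5]],
}
print(predict_bags(weights, bags))
```

Max pooling is only one possible aggregation; mean pooling or attention-weighted pooling are common alternatives when several instances jointly determine the bag label.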
9. Helmy M, Smith D, Selvarajoo K. Systems biology approaches integrated with artificial intelligence for optimized metabolic engineering. Metab Eng Commun 2020; 11:e00149. [PMID: 33072513 PMCID: PMC7546651 DOI: 10.1016/j.mec.2020.e00149] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 09/01/2020] [Revised: 10/01/2020] [Accepted: 10/07/2020] [Indexed: 12/05/2022]
Abstract
Metabolic engineering aims to maximize the production of bio-economically important substances (compounds, enzymes, or other proteins) through the optimization of the genetics, cellular processes and growth conditions of microorganisms. This requires detailed understanding of the underlying metabolic pathways involved in the production of the targeted substances, and of how the cellular processes or growth conditions are regulated by the engineering. To achieve this goal, a large system of experimental techniques, compound libraries, computational methods and data resources, including multi-omics data, is used. The recent advent of multi-omics systems biology approaches has significantly impacted the field by opening new avenues to perform dynamic and large-scale analyses that deepen our knowledge of these manipulations. However, with the enormous volume of transcriptomics, proteomics and metabolomics data available, it is a daunting task to integrate the data for a more holistic understanding. Novel data mining and analytics approaches, including Artificial Intelligence (AI), can provide breakthroughs that traditional low-throughput experiment-alone methods cannot easily achieve. Here, we review the latest attempts to combine systems biology and AI in metabolic engineering research, and highlight how this alliance can help overcome the current challenges facing industrial biotechnology, especially for food-related substances and compounds produced using microorganisms.
Affiliation(s)
- Mohamed Helmy
- Singapore Institute of Food and Biotechnology Innovation (SIFBI), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Derek Smith
- Singapore Institute of Food and Biotechnology Innovation (SIFBI), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Kumar Selvarajoo
- Singapore Institute of Food and Biotechnology Innovation (SIFBI), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore (NUS), Singapore, Singapore
10. Yu G, Wang K, Domeniconi C, Guo M, Wang J. Isoform function prediction based on bi-random walks on a heterogeneous network. Bioinformatics 2020; 36:303-310. [PMID: 31250882 DOI: 10.1093/bioinformatics/btz535] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/12/2018] [Revised: 06/21/2019] [Accepted: 06/26/2019] [Indexed: 01/29/2023]
Abstract
MOTIVATION Alternative splicing contributes to the functional diversity of protein species and the proteoforms translated from alternatively spliced isoforms of a gene actually execute the biological functions. Computationally predicting the functions of genes has been studied for decades. However, how to distinguish the functional annotations of isoforms, whose annotations are essential for understanding developmental abnormalities and cancers, is rarely explored. The main bottleneck is that functional annotations of isoforms are generally unavailable and functional genomic databases universally store the functional annotations at the gene level. RESULTS We propose IsoFun to accomplish Isoform Function prediction based on bi-random walks on a heterogeneous network. IsoFun first constructs an isoform functional association network based on the expression profiles of isoforms derived from multiple RNA-seq datasets. Next, IsoFun uses the available Gene Ontology annotations of genes, gene-gene interactions and the relations between genes and isoforms to construct a heterogeneous network. After this, IsoFun performs a tailored bi-random walk on the heterogeneous network to predict the association between GO terms and isoforms, thus accomplishing the prediction of GO annotations of isoforms. Experimental results show that IsoFun significantly outperforms the state-of-the-art algorithms and improves the area under the receiver-operating curve (AUROC) and the area under the precision-recall curve (AUPRC) by 17% and 44% at the gene level, respectively. We further validated the performance of IsoFun on the genes ADAM15 and BCL2L1. IsoFun accurately differentiates the functions of respective isoforms of these two genes. AVAILABILITY AND IMPLEMENTATION The code of IsoFun is available at http://mlda.swu.edu.cn/codes.php?name=IsoFun. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing, China
- Keyao Wang
- College of Computer and Information Science, Southwest University, Chongqing, China
- Carlotta Domeniconi
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China; Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, China
- Jun Wang
- College of Computer and Information Science, Southwest University, Chongqing, China
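The bi-random walk mentioned in the IsoFun abstract above can be sketched in generic form. This is not IsoFun's tailored variant, whose details the abstract does not give; it assumes the common update R <- alpha * (P R G) + (1 - alpha) * R0, where P is a row-normalized isoform network, G a row-normalized GO-term network, and R0 the seed isoform-term associations. The function names, the row normalization, and the values of `alpha` and `steps` are illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def row_normalize(m: Matrix) -> Matrix:
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def bi_random_walk(
    iso_net: Matrix, term_net: Matrix, seed: Matrix,
    alpha: float = 0.5, steps: int = 50,
) -> Matrix:
    """Propagate seed associations over both networks with restart to the seed."""
    left = row_normalize(iso_net)
    right = row_normalize(term_net)
    scores = [row[:] for row in seed]
    for _ in range(steps):
        walked = matmul(matmul(left, scores), right)
        scores = [
            [alpha * walked[i][j] + (1 - alpha) * seed[i][j] for j in range(len(seed[0]))]
            for i in range(len(seed))
        ]
    return scores


# Toy example: isoform 2 has no seed annotation but is linked to isoform 0.
iso_net = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
term_net = [[1.0, 0.0], [0.0, 1.0]]
seed = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
result = bi_random_walk(iso_net, term_net, seed)
print(result[2])
```

In this toy run the unannotated isoform picks up a positive score for the term carried by its network neighbour and none for the other term, which is the behaviour a propagation scheme of this kind is meant to produce.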
11. Mishra SK, Muthye V, Kandoi G. Computational Methods for Predicting Functions at the mRNA Isoform Level. Int J Mol Sci 2020; 21:ijms21165686. [PMID: 32784445 PMCID: PMC7460821 DOI: 10.3390/ijms21165686] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/13/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 11/16/2022]
Abstract
Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may sometimes have very similar sequence, but they can have significantly diverse effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, such as response to environmental stress, regulation of gene expression, human heritable, and plant diseases. The mRNA isoforms of the same gene can have dramatically different functions. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. The recent years have however seen the development of several computational methods aimed at predicting mRNA isoform level biological functions. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting the biological function at the individual mRNA isoform level.
12. Shaw D, Chen H, Jiang T. DeepIsoFun: a deep domain adaptation approach to predict isoform functions. Bioinformatics 2020; 35:2535-2544. [PMID: 30535380 DOI: 10.1093/bioinformatics/bty1017] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/29/2018] [Revised: 11/07/2018] [Accepted: 12/08/2018] [Indexed: 11/14/2022]
Abstract
MOTIVATION Isoforms are mRNAs produced from the same gene locus by alternative splicing and may have different functions. Although gene functions have been studied extensively, little is known about the specific functions of isoforms. Recently, some computational approaches based on multiple instance learning have been proposed to predict isoform functions from annotated gene functions and expression data, but their performance is far from being desirable primarily due to the lack of labeled training data. To improve the performance on this problem, we propose a novel deep learning method, DeepIsoFun, that combines multiple instance learning with domain adaptation. The latter technique helps to transfer the knowledge of gene functions to the prediction of isoform functions and provides additional labeled training data. Our model is trained on a deep neural network architecture so that it can adapt to different expression distributions associated with different gene ontology terms. RESULTS We evaluated the performance of DeepIsoFun on three expression datasets of human and mouse collected from SRA studies at different times. On each dataset, DeepIsoFun performed significantly better than the existing methods. In terms of area under the receiver operating characteristics curve, our method acquired at least 26% improvement and in terms of area under the precision-recall curve, it acquired at least 10% improvement over the state-of-the-art methods. In addition, we also study the divergence of the functions predicted by our method for isoforms from the same gene and the overall correlation between expression similarity and the similarity of predicted functions. AVAILABILITY AND IMPLEMENTATION https://github.com/dls03/DeepIsoFun/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dipan Shaw
- Department of Computer Science and Engineering, University of California, Riverside, CA, USA
| | - Hao Chen
- Department of Computer Science and Engineering, University of California, Riverside, CA, USA
| | - Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA, USA.,Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China
|
13
|
ISOGO: Functional annotation of protein-coding splice variants. Sci Rep 2020; 10:1069. [PMID: 31974522 PMCID: PMC6978412 DOI: 10.1038/s41598-020-57974-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/08/2019] [Accepted: 01/07/2020] [Indexed: 12/25/2022] Open
Abstract
The advent of RNA-seq technologies has switched the paradigm of genetic analysis from a genome-based to a transcriptome-based perspective. Alternative splicing generates functional diversity in genes, but the precise functions of many individual isoforms are yet to be elucidated. Gene Ontology was developed to annotate gene products according to their biological processes, molecular functions and cellular components. Although a single gene may have several gene products, most annotations are not isoform-specific and do not distinguish the functions of the different proteins originating from a single gene. Several approaches have tried to automatically annotate ontologies at the isoform level, but this has proven to be a daunting task. We have developed ISOGO (ISOform + GO function imputation), a novel algorithm to predict the function of coding isoforms based on their protein domains and their correlation of expression across 11,373 cancer patients. Combining these two sources of information outperforms previous approaches: it provides an area under the precision-recall curve (AUPRC) five times larger than previous attempts, and the median AUROC of assigned functions to genes is 0.82. We tested ISOGO predictions on some genes with isoform-specific functions (BRCA1, MADD, VAMP7 and ITSN1) and they were coherent with the literature. In addition, we examined whether the main isoform of each gene (as predicted by APPRIS) was the most likely to have the annotated gene functions, and this was the case for 99.4% of the genes. We also evaluated the predictions for isoform-specific functions provided by the CAFA3 challenge, and the results were also convincing. To make these results available to the scientific community, we have deployed a web application to consult ISOGO predictions (https://biotecnun.unav.es/app/isogo). Initial data, website link, isoform-specific GO function predictions and R code are available at https://gitlab.com/icassol/isogo.
|
14
|
|
15
|
Abstract
Alternative splicing produces multiple mRNA isoforms of genes, which have important and diverse roles in processes such as regulation of gene expression, human heritable diseases, and response to environmental stresses. However, little has been done to assign functions at the mRNA isoform level. Functional networks, where the interactions are quantified by their probability of being involved in the same biological process, are typically generated at the gene level. We use a diverse array of tissue-specific RNA-seq datasets and sequence information to train random forest models that predict the functional networks. Since there is no mRNA isoform-level gold standard, we use single-isoform genes co-annotated to Gene Ontology biological process annotations, Kyoto Encyclopedia of Genes and Genomes pathways, BioCyc pathways and protein-protein interactions as functionally related (positive pairs). To generate the non-functional pairs (negative pairs), we use the Gene Ontology annotations tagged with the "NOT" qualifier. We describe 17 Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION) following a leave-one-tissue-out strategy, in addition to an organism-level reference functional network for mouse. We validate our predictions by comparing their performance with previous methods, randomized positive and negative class labels, updated Gene Ontology annotations, and literature evidence. We demonstrate the ability of our networks to reveal tissue-specific functional differences among the isoforms of the same genes. All scripts and data from TENSION are available at: https://doi.org/10.25380/iastate.c.4275191.
Affiliation(s)
- Gaurav Kandoi
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, USA
- Department of Electrical and Computer Engineering, Iowa State University, Ames, IA, USA
| | - Julie A Dickerson
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, USA.
- Department of Electrical and Computer Engineering, Iowa State University, Ames, IA, USA.
|
16
|
Rajderkar S, Mann JM, Panaretos C, Yumoto K, Li HD, Mishina Y, Ralston B, Kaartinen V. Trim33 is required for appropriate development of pre-cardiogenic mesoderm. Dev Biol 2019; 450:101-114. [PMID: 30940539 DOI: 10.1016/j.ydbio.2019.03.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/26/2018] [Revised: 03/27/2019] [Accepted: 03/27/2019] [Indexed: 11/25/2022]
Abstract
Congenital cardiac malformations are among the most common birth defects in humans. Here we show that Trim33, a member of the Tif1 subfamily of tripartite domain containing transcriptional cofactors, is required for appropriate differentiation of the pre-cardiogenic mesoderm during a narrow time window in late gastrulation. While mesoderm-specific Trim33 mutants did not display noticeable phenotypes, epiblast-specific Trim33 mutant embryos developed ventricular septal defects, showed sparse trabeculation and abnormally thin compact myocardium, and died as a result of cardiac failure during late gestation. Differentiating embryoid bodies deficient in Trim33 showed an enrichment of gene sets associated with cardiac differentiation and contractility, while the total number of cardiac precursor cells was reduced. Concordantly, cardiac progenitor cell proliferation was reduced in Trim33-deficient embryos. ChIP-Seq performed using antibodies against Trim33 in differentiating embryoid bodies revealed more than 4000 peaks, which were significantly enriched close to genes implicated in stem cell maintenance and mesoderm development. Nearly half of the Trim33 peaks overlapped with binding sites of the Ctcf insulator protein. Our results suggest that Trim33 is required for appropriate differentiation of precardiogenic mesoderm during late gastrulation and that it will likely mediate some of its functions via multi-protein complexes, many of which include the chromatin architectural and insulator protein Ctcf.
Affiliation(s)
- Sudha Rajderkar
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Jeffrey M Mann
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Christopher Panaretos
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Kenji Yumoto
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Hong-Dong Li
- Center for Bioinformatics, School of Information Science and Engineering, Central South University, Changsha, Hunan, 410083, PR China
| | - Yuji Mishina
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Benjamin Ralston
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Vesa Kaartinen
- Department of Biologic and Materials Sciences, University of Michigan, Ann Arbor, MI, 48109, USA.
|
17
|
Ma J, Wang J, Ghoraie LS, Men X, Haibe-Kains B, Dai P. Network-based approach to identify principal isoforms among four cancer types. Mol Omics 2019; 15:117-129. [PMID: 30720033 DOI: 10.1039/c8mo00234g] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/14/2022]
Abstract
Protein isoforms are structurally similar proteins produced by alternative splicing of a single gene or genes from the same family. Isoforms of a protein can perform the same, similar, or even opposite biological functions. A previous study identified principal isoforms of proteins based on the extent of interactions per isoform in a functional relationship network, focusing on data from normal tissues. Additionally, the expression levels of specific isoforms of various genes associated with tumorigenesis and prognosis are frequently altered in tumors compared with those in normal tissues. In this study, we aimed to identify higher degree isoforms (HDIs) of multi-isoform genes (MIGs) in cancer by applying a meta-analytical framework to calculate co-expression between each pair of isoforms in two large datasets of RNA-seq profiles from breast cancer, lung cancer, leukemia, and colon cancer cell lines. Then, we compared HDIs with isoforms identified by proteomic data and prognostic and predictive evidence in various cancers. In addition, we separately analyzed the associations between HDIs and non-HDIs (nHDIs) of the same genes according to transcript expression and drug responses in various cancer type cell lines. Collectively, these results indicated the complex properties of HDIs per gene identified by cancer type-based isoform-isoform co-expression networks and showed the potential of HDIs as novel therapeutic targets for cancer treatment.
Affiliation(s)
- Jun Ma
- National Engineering Research Center for Miniaturized Detection Systems, Northwest University, Xi'an, P. R. China. and Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Jenny Wang
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Laleh Soltan Ghoraie
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Xin Men
- Microbiology Institute of Shaanxi, China and National Engineering Research Center for Miniaturized Detection Systems, Northwest University, Xi'an, P. R. China.
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Penggao Dai
- National Engineering Research Center for Miniaturized Detection Systems, Northwest University, Xi'an, P. R. China.
|
18
|
Nery TGM, Silva EM, Tavares R, Passetti F. The Challenge to Search for New Nervous System Disease Biomarker Candidates: the Opportunity to Use the Proteogenomics Approach. J Mol Neurosci 2018; 67:150-164. [PMID: 30554402 DOI: 10.1007/s12031-018-1220-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/01/2018] [Accepted: 11/18/2018] [Indexed: 12/14/2022]
Abstract
Alzheimer's disease, Parkinson's disease, prion diseases, schizophrenia, and multiple sclerosis are the most common nervous system diseases, affecting millions of people worldwide. The current scientific literature associates these pathological conditions with abnormal expression levels of certain proteins, which in turn has improved knowledge of normal and affected brains. However, there is no available cure or preventive therapy for any of these disorders. Proteogenomics is a recent approach defined as the data integration of both nucleotide high-throughput sequencing and protein mass spectrometry technologies. In recent years, proteogenomics studies in distinct diseases have emerged as a strategy for the identification of uncharacterized proteoforms, which are all the different protein forms derived from a single gene. For many of these diseases, at least one protein used as a biomarker presents more than one proteoform, which encourages the analysis of publicly available data focusing on proteoforms. Given this context, we describe the most important biomarkers for each neurodegenerative disease and how genomics, transcriptomics, and proteomics separately contributed to unveiling them. Finally, we present a selection of proteogenomics studies in which the combination of nucleotide and proteome high-throughput data, from cell lines or brain tissue samples, is used to uncover proteoforms not previously described. We believe that this new approach may improve our knowledge about nervous system diseases and brain function and offers an opportunity to identify new biomarker candidates.
Affiliation(s)
- Thais Guimarães Martins Nery
- Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (Fiocruz), Manguinhos, Rio de Janeiro, Brazil
- Laboratory of Gene Expression Regulation, Carlos Chagas Institute, Fundação Oswaldo Cruz (Fiocruz), Curitiba, Brazil
| | - Esdras Matheus Silva
- Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (Fiocruz), Manguinhos, Rio de Janeiro, Brazil
- Laboratory of Gene Expression Regulation, Carlos Chagas Institute, Fundação Oswaldo Cruz (Fiocruz), Curitiba, Brazil
| | - Raphael Tavares
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Minas Gerais, Brazil
| | - Fabio Passetti
- Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (Fiocruz), Manguinhos, Rio de Janeiro, Brazil.
- Laboratory of Gene Expression Regulation, Carlos Chagas Institute, Fundação Oswaldo Cruz (Fiocruz), Curitiba, Brazil.
|
19
|
Paik YK, Overall CM, Deutsch EW, Hancock WS, Omenn GS. Progress in the Chromosome-Centric Human Proteome Project as Highlighted in the Annual Special Issue IV. J Proteome Res 2018; 15:3945-3950. [PMID: 27809547 DOI: 10.1021/acs.jproteome.6b00803] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/13/2022]
Affiliation(s)
- Young-Ki Paik
- Yonsei Proteome Research Center and Department of Biochemistry, Yonsei University
| | - Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, Faculty of Dentistry, University of British Columbia
| | | | | | - Gilbert S Omenn
- Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics and School of Public Health, University of Michigan
|
20
|
A High-Resolution Genome-Wide CRISPR/Cas9 Viability Screen Reveals Structural Features and Contextual Diversity of the Human Cell-Essential Proteome. Mol Cell Biol 2017; 38:MCB.00302-17. [PMID: 29038160 DOI: 10.1128/mcb.00302-17] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 06/02/2017] [Accepted: 09/11/2017] [Indexed: 11/20/2022] Open
Abstract
To interrogate genes essential for cell growth, proliferation and survival in human cells, we carried out a genome-wide clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 screen in a B-cell lymphoma line using a custom extended-knockout (EKO) library of 278,754 single-guide RNAs (sgRNAs) that targeted 19,084 RefSeq genes, 20,852 alternatively spliced exons, and 3,872 hypothetical genes. A new statistical analysis tool called robust analytics and normalization for knockout screens (RANKS) identified 2,280 essential genes, 234 of which were unique. Individual essential genes were validated experimentally and linked to ribosome biogenesis and stress responses. Essential genes exhibited a bimodal distribution across 10 different cell lines, consistent with a continuous variation in essentiality as a function of cell type. Genes essential in more lines had more severe fitness defects and encoded the evolutionarily conserved structural cores of protein complexes, whereas genes essential in fewer lines formed context-specific modules and encoded subunits at the periphery of essential complexes. The essentiality of individual protein residues across the proteome correlated with evolutionary conservation, structural burial, modular domains, and protein interaction interfaces. Many alternatively spliced exons in essential genes were dispensable and were enriched for disordered regions. Fitness defects were observed for 44 newly evolved hypothetical reading frames. These results illuminate the contextual nature and evolution of essential gene functions in human cells.
|
21
|
The impact of the RBM4-initiated splicing cascade on modulating the carcinogenic signature of colorectal cancer cells. Sci Rep 2017; 7:44204. [PMID: 28276498 PMCID: PMC5343574 DOI: 10.1038/srep44204] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/11/2016] [Accepted: 02/06/2017] [Indexed: 12/31/2022] Open
Abstract
A growing body of studies has demonstrated that dysregulated splicing profiles constitute pivotal mechanisms for carcinogenesis. In this study, we identified discriminative splicing profiles of colorectal cancer (CRC) cells compared to adjacent normal tissues using deep RNA-sequencing (RNA-seq). The RNA-seq results and cohort studies indicated a relatively high ratio of exon 4-excluded neuro-oncological ventral antigen 1 (Nova1−4) and intron 2-retained SRSF6 (SRSF6+intron 2) transcripts in CRC tissues and cell lines. Nova1 variants exhibited differential effects on eliminating SRSF6 expression in CRC cells by inducing SRSF6+intron 2 transcripts, which were considered to be the putative target of the alternative splicing-coupled nonsense-mediated decay mechanism. Moreover, the splicing profile of vascular endothelial growth factor (VEGF)165/VEGF165b transcripts was relevant to SRSF6 expression, which manipulates the progression of CRC cells. These results highlight the novel and hierarchical role of an alternative splicing cascade that is involved in the development of CRC.
|
22
|
Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks. Methods Mol Biol 2017; 1558:415-436. [PMID: 28150250 DOI: 10.1007/978-1-4939-6783-4_20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. Hopefully, predictions and insights from protein bioinformatics will stimulate many experimental validation studies.
|
23
|
Omenn GS, Lane L, Lundberg EK, Beavis RC, Overall CM, Deutsch EW. Metrics for the Human Proteome Project 2016: Progress on Identifying and Characterizing the Human Proteome, Including Post-Translational Modifications. J Proteome Res 2016; 15:3951-3960. [PMID: 27487407 PMCID: PMC5129622 DOI: 10.1021/acs.jproteome.6b00511] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Indexed: 02/03/2023]
Abstract
The HUPO Human Proteome Project (HPP) has two overall goals: (1) stepwise completion of the protein parts list, the draft human proteome, including confidently identifying and characterizing at least one protein product from each protein-coding gene, with increasing emphasis on sequence variants, post-translational modifications (PTMs), and splice isoforms of those proteins; and (2) making proteomics an integrated counterpart to genomics throughout the biomedical and life sciences community. PeptideAtlas and GPMDB reanalyze all major human mass spectrometry data sets available through ProteomeXchange with standardized protocols and stringent quality filters; neXtProt curates and integrates mass spectrometry and other findings to present the most up-to-date authoritative compendium of the human proteome. The HPP Guidelines for Mass Spectrometry Data Interpretation version 2.1 were applied to manuscripts submitted for this 2016 C-HPP-led special issue [www.thehpp.org/guidelines]. The Human Proteome presented as neXtProt version 2016-02 has 16,518 confident protein identifications (Protein Existence [PE] Level 1), up from 13,664 at 2012-12, 15,646 at 2013-09, and 16,491 at 2014-10. There are 485 proteins that would have been PE1 under the Guidelines v1.0 from 2012 but now have insufficient evidence due to the agreed-upon more stringent Guidelines v2.0 to reduce false positives. neXtProt and PeptideAtlas now both require two non-nested, uniquely mapping (proteotypic) peptides of at least 9 aa in length. There are 2,949 missing proteins (PE2+3+4) as the baseline for submissions for this fourth annual C-HPP special issue of Journal of Proteome Research. PeptideAtlas has 14,629 canonical (plus 1,187 uncertain and 1,755 redundant) entries. GPMDB has 16,190 EC4 entries, and the Human Protein Atlas has 10,475 entries with supportive evidence. neXtProt, PeptideAtlas, and GPMDB are rich resources of information about post-translational modifications (PTMs), single amino acid variants (SAAVs), and splice isoforms. Meanwhile, the Biology- and Disease-driven (B/D)-HPP has created comprehensive SRM resources, generated popular protein lists to guide targeted proteomics assays for specific diseases, and launched an Early Career Researchers initiative.
Affiliation(s)
- Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
| | - Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Human Protein Science, University of Geneva, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
| | - Emma K. Lundberg
- SciLifeLab Stockholm and School of Biotechnology, KTH, Karolinska Institutet Science Park, Tomtebodavägen 23, SE-171 65 Solna, Sweden
| | - Ronald C. Beavis
- Biochemistry & Medical Genetics, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
| | - Christopher M. Overall
- Biochemistry and Molecular Biology, and Oral Biological and Medical Sciences University of British Columbia, 2350 Health Sciences Mall, Room 4.401, Vancouver, BC V6T 1Z3, Canada
| | - Eric W. Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
|