1
Soremekun OS, Omolabi KF, Soliman MES. Identification and classification of differentially expressed genes reveal potential molecular signature associated with SARS-CoV-2 infection in lung adenocarcinomal cells. Inform Med Unlocked 2020; 20:100384. [PMID: 32835074 PMCID: PMC7308782 DOI: 10.1016/j.imu.2020.100384] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/26/2020] [Revised: 06/20/2020] [Accepted: 06/21/2020] [Indexed: 01/04/2023]
Abstract
Genomic techniques such as next-generation sequencing and microarrays have facilitated the identification and classification of molecular signatures inherent in cells upon viral infection, for possible therapeutic targets. Therefore, in this study, we performed a differential gene expression analysis, pathway enrichment analysis, and gene ontology on RNAseq data obtained from SARS-CoV-2 infected A549 cells. Differential expression analysis revealed that 753 genes were up-regulated while 746 were down-regulated. SNORA81, OAS2, SYCP2, LOC100506985, and SNORD35B are the top 5 up-regulated genes upon SARS-CoV-2 infection. Expectedly, these genes have been implicated in the immune response to viral assaults. In the ontology of protein classification, a high percentage of the genes are classified as gene-specific transcriptional regulators, metabolite interconversion enzymes, and protein-modifying enzymes. Twenty pathways with a P-value lower than 0.05 were enriched in the up-regulated genes while 18 pathways were enriched in the down-regulated DEGs. The toll-like receptor signalling pathway is one of the major pathways enriched. This pathway plays an important role in the innate immune system by identifying the pathogen-associated molecular signature emanating from various microorganisms. Taken together, our results present a novel understanding of genes and corresponding pathways upon SARS-CoV-2 infection, and could facilitate the identification of novel therapeutic targets and biomarkers in the treatment of COVID-19.
Affiliation(s)
- Opeyemi S Soremekun
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban, 4001, South Africa
- Kehinde F Omolabi
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban, 4001, South Africa
- Mahmoud E S Soliman
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban, 4001, South Africa
2
Crommelin DJA, Sindelar RD, Meibohm B. Genomics, Other “Omic” Technologies, Personalized Medicine, and Additional Biotechnology-Related Techniques. Pharmaceutical Biotechnology 2013. [PMCID: PMC7122419 DOI: 10.1007/978-1-4614-6486-0_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/02/2023]
Abstract
The products resulting from biotechnologies continue to grow at an exponential rate, and the expectations are that an even greater percentage of drug development will be in the area of biologics. In 2011, worldwide there were over 800 new biotech drugs and treatments in development including 23 antisense, 64 cell therapy, 50 gene therapy, 300 monoclonal antibodies, 78 recombinant proteins, and 298 vaccines (PhRMA 2012). Pharmaceutical biotechnology techniques are at the core of most methodologies used today for drug discovery and development of both biologics and small molecules. While recombinant DNA technology and hybridoma techniques were the major methods utilized in pharmaceutical biotechnology through most of its historical timeline, our ever-widening understanding of human cellular function and disease processes and a wealth of additional and innovative biotechnologies have been, and will continue to be, developed in order to harvest the information found in the human genome. These technological advances will provide a better understanding of the relationship between genetics and biological function, unravel the underlying causes of disease, explore the association of genomic variation and drug response, enhance pharmaceutical research, and fuel the discovery and development of new and novel biopharmaceuticals. These revolutionary technologies and additional biotechnology-related techniques are improving the very competitive and costly process of drug development of new medicinal agents, diagnostics, and medical devices. Some of the technologies and techniques described in this chapter are both well established and commonly used applications of biotechnology producing potential therapeutic products now in development including clinical trials. New techniques are emerging at a rapid and unprecedented pace and their full impact on the future of molecular medicine has yet to be imagined.
Affiliation(s)
- Daan J. A. Crommelin
- Department of Pharmaceutical Sciences, Utrecht University, Utrecht, The Netherlands
- Robert D. Sindelar
- Department of Pharmaceutical Sciences and Department of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
- Bernd Meibohm
- Department of Pharmaceutical Sciences, University of Tennessee Health Science Center, College of Pharmacy, Memphis, Tennessee, USA
3
Eriksson M, Nilsson I, Kogej T, Southan C, Johansson M, Tyrchan C, Muresan S, Blomberg N, Bjäreland M. SARConnect: A Tool to Interrogate the Connectivity Between Proteins, Chemical Structures and Activity Data. Mol Inform 2012; 31:555-568. [PMID: 23308082 PMCID: PMC3535785 DOI: 10.1002/minf.201200030] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 03/29/2012] [Accepted: 04/14/2012] [Indexed: 11/21/2022]
Abstract
The access and use of large-scale structure-activity relationships (SAR) is increasing as the range of targets and availability of bioactive compound-to-protein mappings expands. However, effective exploitation requires merging and normalisation of activity data, mappings to target classifications as well as visual display of chemical structure relationships. This work describes the development of the application "SARConnect" to address these issues. We discuss options for delivery and analysis of large-scale SAR data together with a set of use-cases to illustrate the design choices and utility. The main activity sources of ChEMBL [1], GOSTAR [2] and AstraZeneca's internal system IBIS had already been integrated in Chemistry Connect [3]. For target relationships we selected human UniProtKB/Swiss-Prot [4] as our primary source of a heuristic target classification. Similarly, to explore chemical relationships we combined several methods for framework and scaffold analysis into a unified, hierarchical classification where ease of navigation was the primary goal. An application was built on TIBCO Spotfire to retrieve data for visual display. Consequently, users can explore relationships between target, activity and structure across internal, external and commercial sources that encompass approximately 3 million compounds, 2000 human proteins and 10 million activity values. Examples showing the utility of the application are given.
Affiliation(s)
- Mats Eriksson
- Discovery Sciences, Computational Sciences, AstraZeneca R&D Mölndal, S-431 83 Mölndal, Sweden
- Thierry Kogej
- Discovery Sciences, Computational Sciences, AstraZeneca R&D Mölndal, S-431 83 Mölndal, Sweden
- Sorel Muresan
- Discovery Sciences, Computational Sciences, AstraZeneca R&D Mölndal, S-431 83 Mölndal, Sweden
4
Abstract
A huge number of genes within the human genome code for proteins that mediate and/or control nutritional processes. Although a large body of information on the number of genes, on chromosomal localisation, gene structure and function has been gathered, we are far from understanding the orchestrated way of how they make metabolism. Nevertheless, based on the genetic information emerging on a daily basis, we are offered fantastic new tools that allow us new insights into the molecular basis of human metabolism under normal as well as pathophysiological conditions. Recent technological advancements have made it possible to analyse simultaneously large sets of mRNA and/or proteins expressed in a biological sample or to define genetic heterogeneity that may be important for the individual response of an organism to changes in its nutritional environment. Applications of the new techniques of genome and proteome analysis are central for the development of nutritional sciences in the next decade and its integration into the rapidly developing era of functional genomics.
5
Abstract
The introduction of molecular tools in food research offers the possibility to the food industry to benefit from the experience gained in the field by pharmaceutical companies. In this work we are showing how in silico virtual screening techniques based on molecular similarity were applied for identifying novel umami-tasting compounds. The results obtained suggest that 5'-ribonucleotides and monosodium glutamate might elicit the fifth basic taste via the same molecular mechanism. New algorithms were developed and used in this work, such as the dimension reduction of data sets by singular value decomposition and the introduction of the correlation dimension as a natural dimension of a chemical space. It is shown that the representations of molecular data sets in chemical spaces possess self-similar properties, characteristic of fractal objects.
Affiliation(s)
- Martin G Grigorov
- Nestlé Research Center, Vers-chez-les-Blanc, P.O. Box 44, CH-1000 Lausanne 26, Switzerland.
6
Abstract
Cells precisely monitor the concentration and functionality of each protein for optimal performance. Protein quality control involves molecular chaperones, folding catalysts, and proteases that are often heat shock proteins. One quality control factor is HtrA, one of a new class of oligomeric serine proteases. The defining feature of the HtrA family is the combination of a catalytic domain with at least one C-terminal PDZ domain. Here, we discuss the properties and roles of this ATP-independent protease chaperone system in protein metabolism and cell fate.
Affiliation(s)
- Tim Clausen
- Max-Planck-Institut für Biochemie, Abteilung Strukturforschung, Am Klopferspitz 18A, 82152 Martinsried, Germany
7
Abstract
Of the approximately 400 known human proteases, approximately 14% are under investigation as drug targets. Although the total is certain to rise during the finishing phase of the human genome project, the initial annotation of the approximately 30,000 human proteome set includes approximately 500 proteases. Bioinformatic analysis can now be performed on complete human protease families and will soon include comparisons with mice and fish. New sequences will require evaluation of their function in normal physiology and human disease. By revealing details such as splice variants and population polymorphisms, genomic sequence information will have a central role in the validation of protease drug targets.
Affiliation(s)
- C Southan
- Head of Computational Biology, Gemini Genomics (UK), 162 Science Park, Milton Road, CB4 0GH, Cambridge, UK; tel.: +44 (0) 1223-435342; fax: +44 (0) 1223-435301
8
Abstract
Over 400 human proteases documented in secondary databases can already be delineated in genomic sequence. A Genome Ontology annotation of 30585 sequences in the provisional human proteome set recognises 498 proteases, i.e. 1.6%. Homology searches against finished sequence and comparisons between mouse and zebrafish are likely to increase this total. However, the data already indicate that the mechanistic class, sequence family and domain distribution of the genomic complement of proteases is unlikely to shift significantly from that already observed in the transcript data. Genomically derived novel sequences will require bioinformatic analysis and biochemical verification. The increasing availability of annotated genomic data will enable studies of splice variants, transcriptional control, polymorphisms, pseudogenes, inactive homologues and evolution. Comparative work on complete human protease families should produce a more integrated picture of their biochemistry and physiology. Genomic data will also lead to the identification of new protease involvement in disease processes and their evaluation as drug targets.
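The 1.6% share quoted in this abstract follows directly from its two counts. A minimal sketch of that arithmetic in Python; the function name `protease_fraction_percent` is illustrative and does not come from the paper:

```python
def protease_fraction_percent(proteases: int, proteome_size: int) -> float:
    """Return the share of proteases in a proteome set, as a percentage."""
    if proteome_size <= 0:
        raise ValueError("proteome_size must be positive")
    return 100.0 * proteases / proteome_size


# Counts quoted in the abstract: 498 proteases among 30585 sequences.
share = protease_fraction_percent(498, 30585)
print(f"{share:.1f}%")
```

Rounded to one decimal place, the result matches the 1.6% stated in the abstract.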
Affiliation(s)
- C Southan
- Department of Computational Biology, Gemini Genomics (UK) Ltd, 162 Science Park, Milton Road, CB4 0GH, Cambridge, UK.
9
Abstract
The year 2000 stands as a landmark in modern biology: the first draft of the human genome sequence has been completed. For the pharmaceutical industry, this achievement provides tremendous opportunities because the genomic sequence exposes all human drug targets for therapeutic intervention. The challenge for the pharmaceutical companies is to exploit this definitive resource for the identification of potential molecular targets, rapid characterization of their function and validation of their involvement in disease pathology. Bioinformatics approaches provide increasingly crucial tools to systematically support this exploratory target drug discovery activity.
Affiliation(s)
- P Sanseau
- Target Bioinformatics, Glaxo SmithKline, Gunnels Wood Road, SG1 2NY, Stevenage, UK
10
Abstract
The revealing of the entire complement of protease and protease inhibitor sequences by the Human Genome Project will be of great importance to both academic and pharmaceutical research. Although the finishing phase is not yet complete, a selection of secondary annotation sources and comparisons with completed model organism genomes already allow useful estimates to be made. Conservative extrapolation suggests a total of approximately 1.8% for human proteases. This is close to the figures for yeast (1.7%) and worm (1.8%) but lower than the fly (3.4%) which has a large trypsin-like protease content. Using estimates for the human proteome of between 40,000 and 60,000 genes would extrapolate to 700-1,100 proteases, compared with approximately 360 currently represented as GenBank mRNAs. Preliminary comparisons between domain annotations for predicted human gene products and completed proteins suggest the genomic protease family and mechanistic class distributions will broadly reflect those in the current transcript data. The protease:inhibitor ratio at the mRNA level is currently approximately 9:1, but genome annotation data indicate that inhibitory domains are more widespread than this ratio would indicate.
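The extrapolation in this abstract can be reproduced from the figures it quotes, assuming the roughly 1.8% protease share is applied unchanged to both gene-count estimates. A minimal sketch in Python; the function name `extrapolate_proteases` and its default argument are illustrative, not taken from the paper:

```python
def extrapolate_proteases(gene_count: int, protease_fraction: float = 0.018) -> int:
    """Estimate the number of proteases from a gene count and a protease share.

    The default share of 0.018 is the approximately 1.8% quoted in the abstract.
    """
    return round(gene_count * protease_fraction)


# Lower and upper estimates of the human gene count quoted in the abstract.
low = extrapolate_proteases(40_000)
high = extrapolate_proteases(60_000)
print(low, high)
```

The two estimates come to 720 and 1080, which round to the 700-1,100 range given in the abstract, that is, roughly two to three times the approximately 360 proteases then represented as GenBank mRNAs.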
Affiliation(s)
- C Southan
- Department of Bioinformatics, Target Discovery, SmithKline Beecham Pharmaceuticals, Essex, UK