1
|
Advances in machine intelligence-driven virtual screening approaches for big-data. Med Res Rev 2024; 44:939-974. [PMID: 38129992 DOI: 10.1002/med.21995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 07/15/2023] [Accepted: 10/29/2023] [Indexed: 12/23/2023]
Abstract
Virtual screening (VS) is an integral and ever-evolving part of the drug discovery framework. VS is traditionally classified into ligand-based (LB) and structure-based (SB) approaches. Machine intelligence, or artificial intelligence, is widely applied in drug discovery to reduce time and resource consumption. In combination with machine intelligence algorithms, VS has become a rapidly progressing technology that learns within robust decision orders for data curation and hit molecule screening from large VS libraries in minutes or hours. The exponential growth of chemical and biological data in the public domain, now regarded as "big-data", demands modern and advanced machine intelligence-driven VS approaches to screen hit molecules from ultra-large VS libraries. VS has evolved from individual approaches (LB or SB) to integrated LB and SB techniques that explore various aspects of ligands and target proteins to improve the rate of appropriate hit molecule prediction. Current trends demand advanced and intelligent solutions for handling enormous data in the drug discovery domain and for screening and optimizing hits or leads with few or no false positives. This review follows the big-data drift and the tremendous growth in computational architecture. It categorizes and highlights individual VS techniques, details the literature on machine learning implementations and modern machine intelligence approaches, and discusses limitations and future prospects.
|
2
|
Mapping the global interactome of the ARF family reveals spatial organization in cellular signaling pathways. J Cell Sci 2024; 137:jcs262140. [PMID: 38606629 DOI: 10.1242/jcs.262140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Accepted: 04/08/2024] [Indexed: 04/13/2024] Open
Abstract
The ADP-ribosylation factors (ARFs) and ARF-like (ARL) GTPases serve as essential molecular switches governing a wide array of cellular processes. In this study, we used proximity-dependent biotin identification (BioID) to comprehensively map the interactome of 28 out of 29 ARF and ARL proteins in two cellular models. Through this approach, we identified ∼3000 high-confidence proximal interactors, enabling us to assign subcellular localizations to the family members. Notably, we uncovered previously undefined localizations for ARL4D and ARL10. Clustering analyses further exposed the distinctiveness of the interactors identified with these two GTPases. We also reveal that the expression of the understudied member ARL14 is confined to the stomach and intestines. We identified phospholipase D1 (PLD1) and the ESCPE-1 complex, more precisely, SNX1, as proximity interactors. Functional assays demonstrated that ARL14 can activate PLD1 in cellulo and is involved in cargo trafficking via the ESCPE-1 complex. Overall, the BioID data generated in this study provide a valuable resource for dissecting the complexities of ARF and ARL spatial organization and signaling.
|
3
|
Mapping the global interactome of the ARF family reveals spatial organization in cellular signaling pathways. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.03.01.530598. [PMID: 36909472 PMCID: PMC10002736 DOI: 10.1101/2023.03.01.530598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
The ADP-ribosylation factors (ARFs) and ARF-like (ARLs) GTPases serve as essential molecular switches governing a wide array of cellular processes. In this study, we utilized proximity-dependent biotin identification (BioID) to comprehensively map the interactome of 28 out of 29 ARF and ARL proteins in two cellular models. Through this approach, we identified ~3000 high-confidence proximal interactors, enabling us to assign subcellular localizations to the family members. Notably, we uncovered previously undefined localizations for ARL4D and ARL10. Clustering analyses further exposed the distinctiveness of the interactors identified with these two GTPases. We also reveal that the expression of the understudied member ARL14 is confined to the stomach and intestines. We identified phospholipase D1 (PLD1) and the ESCPE-1 complex, more precisely SNX1, as proximity interactors. Functional assays demonstrated that ARL14 can activate PLD1 in cellulo and is involved in cargo trafficking via the ESCPE-1 complex. Overall, the BioID data generated in this study provide a valuable resource for dissecting the complexities of ARF and ARL spatial organization and signaling.
|
4
|
CDK12 Loss Promotes Prostate Cancer Development While Exposing Vulnerabilities to Paralog-Based Synthetic Lethality. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.20.585990. [PMID: 38562774 PMCID: PMC10983964 DOI: 10.1101/2024.03.20.585990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Biallelic loss of cyclin-dependent kinase 12 (CDK12) defines a unique molecular subtype of metastatic castration-resistant prostate cancer (mCRPC). It remains unclear, however, whether CDK12 loss per se is sufficient to drive prostate cancer development (either alone or in the context of other genetic alterations) and whether CDK12-mutant tumors exhibit sensitivity to specific pharmacotherapies. Here, we demonstrate that tissue-specific Cdk12 ablation is sufficient to induce preneoplastic lesions and robust T cell infiltration in the mouse prostate. Allograft-based CRISPR screening demonstrated that Cdk12 loss is positively associated with Trp53 inactivation but negatively associated with Pten inactivation, akin to what is observed in human mCRPC. Consistent with this, ablation of Cdk12 in prostate organoids with concurrent Trp53 loss promotes their proliferation and ability to form tumors in mice, while Cdk12 knockout in the Pten-null prostate cancer mouse model abrogates tumor growth. Bigenic Cdk12 and Trp53 loss allografts represent a new syngeneic model for the study of androgen receptor (AR)-positive, luminal prostate cancer. Notably, Cdk12/Trp53 loss prostate tumors are sensitive to immune checkpoint blockade. Cdk12-null organoids (either with or without Trp53 co-ablation) and patient-derived xenografts from tumors with CDK12 inactivation are highly sensitive to inhibition or degradation of its paralog kinase, CDK13. Together, these data identify CDK12 as a bona fide tumor suppressor gene with impact on tumor progression and lend support to paralog-based synthetic lethality as a promising strategy for treating CDK12-mutant mCRPC.
|
5
|
Drug-drug interactions prediction based on deep learning and knowledge graph: A review. iScience 2024; 27:109148. [PMID: 38405609 PMCID: PMC10884936 DOI: 10.1016/j.isci.2024.109148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024] Open
Abstract
Drug-drug interactions (DDIs) can produce unpredictable pharmacological effects and lead to adverse events that have the potential to cause irreversible damage to the organism. Traditional methods to detect DDIs through biological or pharmacological analysis are time-consuming and expensive; therefore, there is an urgent need to develop computational methods to effectively predict drug-drug interactions. Currently, deep learning and knowledge graph techniques, which can effectively extract features of entities, have been widely used to develop DDI prediction methods. In this review, we aim to systematically survey DDI prediction studies that apply deep learning and knowledge graphs. We first summarize the available biomedical data and public databases related to drugs. Then, we discuss the existing DDI prediction methods that use deep learning and knowledge graph techniques and group them into three main classes: deep learning-based methods, knowledge graph-based methods, and methods that combine deep learning with knowledge graphs. We comprehensively analyze the commonly used drug-related data and various DDI prediction methods, and compare these prediction methods on benchmark datasets. Finally, we briefly discuss the challenges related to DDI prediction, including asymmetric DDI prediction and high-order DDI prediction.
|
6
|
Identifying potential ligand-receptor interactions based on gradient boosted neural network and interpretable boosting machine for intercellular communication analysis. Comput Biol Med 2024; 171:108110. [PMID: 38367445 DOI: 10.1016/j.compbiomed.2024.108110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/24/2024] [Accepted: 02/04/2024] [Indexed: 02/19/2024]
Abstract
Cell-cell communication is essential to many key biological processes. Intercellular communication is generally mediated by ligand-receptor interactions (LRIs). Thus, building a comprehensive and high-quality LRI resource can significantly improve intercellular communication analysis. Meanwhile, owing to the lack of a "gold standard" dataset, evaluating LRI-mediated intercellular communication results remains a challenge. Here, we introduce CellGiQ, a high-confidence LRI prediction framework for intercellular communication analysis. High-confidence LRIs are first inferred by LRI feature extraction with BioTriangle, LRI selection using LightGBM, and LRI classification based on an ensemble of a gradient boosted neural network and an interpretable boosting machine. Subsequently, known and identified high-confidence LRIs are filtered by combining single-cell RNA sequencing (scRNA-seq) data and further applied to intercellular communication inference through a quartile scoring strategy. To validate the predictions, CellGiQ used several evaluation strategies: in terms of AUC and AUPR, it surpassed six competing LRI prediction models on four LRI datasets; through Venn diagrams and molecular docking, its predicted LRIs were validated by five other popular intercellular communication inference methods; based on the overlapping LRIs, it achieved a high Jaccard index with six other state-of-the-art intercellular communication prediction tools within human HNSCC tissues; and by comparison with classical models and literature retrieval, its inferred HNSCC-related intercellular communication results were further validated. The novelty of this study lies in identifying high-confidence LRIs based on machine learning and in designing several LRI validation approaches, providing a reference for computational LRI prediction. CellGiQ provides an open-source and useful tool to decompose LRI-mediated intercellular communication at single-cell resolution. CellGiQ is freely available at https://github.com/plhhnu/CellGiQ.
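The Jaccard index mentioned in this abstract measures the overlap between two sets of predicted LRIs as the size of their intersection divided by the size of their union. The sketch below illustrates that general formula only. It is not CellGiQ code; the function name and the example ligand-receptor pairs are hypothetical placeholders, not results from the paper.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of ligand-receptor pairs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty sets have overlap 0.0.
        return 0.0
    return len(a & b) / len(union)


# Illustrative (hypothetical) LRI predictions from two tools.
tool_a = {("CXCL12", "CXCR4"), ("TGFB1", "TGFBR2"), ("VEGFA", "KDR")}
tool_b = {("CXCL12", "CXCR4"), ("VEGFA", "KDR"), ("EGF", "EGFR")}

overlap = jaccard_index(tool_a, tool_b)  # 2 shared pairs out of 4 distinct pairs
print(overlap)
```

A value near 1 means two tools report nearly the same LRIs, and a value near 0 means they barely overlap.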
|
7
|
Whole genome sequencing of a family with autosomal dominant features within the oculoauriculovertebral spectrum. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.02.07.24301824. [PMID: 38370836 PMCID: PMC10871465 DOI: 10.1101/2024.02.07.24301824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Background Oculoauriculovertebral Spectrum (OAVS) encompasses a wide variety of anomalies of derivatives of the first and second pharyngeal arches, including macrostomia, hemifacial microsomia, micrognathia, preauricular tags, and ocular and vertebral anomalies. We present the genetic findings of a large three-generation family with multiple members affected with macrostomia, preauricular tags and uni- or bilateral ptosis following an autosomal dominant segregation pattern. Methods We generated whole genome sequencing data for the proband, affected parent and unaffected paternal grandparent, followed by Sanger sequencing on 23 family members for the top 10 candidate genes: KCND2, PDGFRA, CASP9, NCOA3, WNT10A, SIX1, MTF1, KDR/VEGFR2, LRRK1, and TRIM2. We performed parent- and sibling-based transmission disequilibrium tests and burden analysis to explore segregation and burden of candidate gene mutations. Bioinformatic analyses investigated the biological connection between genes and the abnormal phenotypes. Results Overall, rare missense mutations in SIX1, KDR/VEGFR2, and PDGFRA showed the best evidence of segregation with the OAVS phenotypes in this family. When considering affection with any of the 3 OAVS phenotypes as an outcome, parent-TDTs and sib-TDTs (unadjusted p-values) found that SIX1 (p=0.025, p=0.052), followed by PDGFRA (p=0.180, p=0.069) and KDR/VEGFR2 (p=0.180, p=0.069), have the strongest associations in this family. Burden analysis via a penalized linear mixed model identified SIX1 (RC=0.87) and PDGFRA (RC=0.98) as having the strongest association with OAVS severity. Using phenotype-specific outcomes, sib-TDTs identified associations between (1) SIX1 and uni- or bilateral ptosis (p=0.049) and ear tags (p=0.01), and (2) PDGFRA and KDR/VEGFR2 and ear tags (both p<0.01). Conclusion Our study reports the genomic findings of a large family with multiple individuals affected with OAVS phenotypes with autosomal dominant inheritance. Our findings narrow down to three potential candidate genes: SIX1, PDGFRA, and KDR/VEGFR2. Among these, SIX1 has been previously associated with OAVS ear malformations, and it is co-expressed with EYA1 during ear development. Attempts to strengthen the genotype-phenotype correlation underlying the OAVS phenotypes are essential to discover the etiological factors leading to this complex and burdensome condition, as well as for family counseling and prevention efforts.
|
8
|
FusionPDB: a knowledgebase of human fusion proteins. Nucleic Acids Res 2024; 52:D1289-D1304. [PMID: 37870473 PMCID: PMC10767906 DOI: 10.1093/nar/gkad920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/19/2023] [Accepted: 10/09/2023] [Indexed: 10/24/2023] Open
Abstract
Tumorigenic functions due to the formation of fusion genes have been targeted for cancer therapeutics (i.e. kinase inhibitors). However, many fusion proteins involved in various cellular processes have not been studied for targeted therapeutics. This is because the lack of complete fusion protein sequences and their whole 3D structures has made it challenging to develop new therapeutic strategies. To fill these critical gaps, we developed a computational pipeline and a resource of human fusion proteins named FusionPDB, available at https://compbio.uth.edu/FusionPDB. FusionPDB is organized into four levels: 43K fusion protein sequences (14.7K in-frame fusion genes, Level 1), over 2300 + 1267 fusion protein 3D structures (from 2300 recurrent and 266 manually curated in-frame fusion genes, Level 2), pLDDT score analysis for the 1267 fusion proteins from 266 manually curated fusion genes (Level 3), and virtual screening outcomes for 68 selected fusion proteins from 266 manually curated fusion genes (Level 4). FusionPDB is the only resource providing whole 3D structures of fusion proteins and comprehensive knowledge of human fusion proteins. It will be regularly updated until it covers all human fusion proteins in the future.
|
9
|
Immune cell-specific and common molecular signatures in rheumatoid arthritis through molecular network approaches. Biosystems 2023; 234:105063. [PMID: 37852410 DOI: 10.1016/j.biosystems.2023.105063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Revised: 09/20/2023] [Accepted: 10/13/2023] [Indexed: 10/20/2023]
Abstract
Rheumatoid arthritis (RA) is an autoimmune disorder, and a common symptom of RA is chronic synovial inflammation. The pathogenesis of RA is not fully understood. Therefore, we aimed to identify underlying common and distinct molecular signatures and pathways among ten types of tissue and cells obtained from patients with RA. In this study, transcriptomic data from synovial tissues, macrophages, blood, T cells, CD4+ T cells, CD8+ T cells, natural killer T (NKT) cells, natural killer (NK) cells, neutrophils, and monocytes were analyzed from an integrative and comparative network biology perspective. Each dataset yielded a list of differentially expressed genes as well as a reconstruction of the tissue-specific protein-protein interaction (PPI) network. Molecular signatures were identified by a statistical test using the hypergeometric probability density function, employing the interactions of transcriptional regulators and PPI. Reporter metabolites of each dataset were determined by using genome-scale metabolic networks. Hub proteins, novel molecular signatures, and metabolites found in two or more tissue types were defined as common, and immune cell-specific molecular signatures were also identified. Importantly, miR-155-5p was found as a common miRNA in all tissues. Moreover, NCOA3, PRKDC and miR-3160 might be novel molecular signatures for RA. Our results establish a novel approach for identifying immune cell-specific molecular signatures of RA and provide insights into the role of common tissue-specific genes, miRNAs, TFs, receptors, and reporter metabolites. Experimental research should be used to validate the corresponding genes, miRNAs, and metabolites.
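A hypergeometric test of the kind mentioned in this abstract asks how surprising it is that a regulator's targets overlap a gene list as much as observed. The sketch below shows the standard upper-tail calculation using only the Python standard library. It is an illustration under assumed inputs, not the authors' pipeline: the function names and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k marked items in n draws without replacement
    from a population of N items, K of which are marked."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_value(k, N, K, n):
    """Upper-tail probability P(X >= k), the usual enrichment p-value."""
    upper = min(K, n)
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))


# Hypothetical numbers: 20000 genes in total, 150 targets of one regulator,
# 300 differentially expressed genes, 12 of which are targets.
p = enrichment_p_value(k=12, N=20000, K=150, n=300)
print(p)
```

A small p-value suggests the regulator's targets are over-represented in the gene list. In practice, p-values from many regulators would also need multiple-testing correction.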
|
10
|
The lncRNA DLX6-AS1/miR-16-5p axis regulates autophagy and apoptosis in non-small cell lung cancer: A Boolean model of cell death. Noncoding RNA Res 2023; 8:605-614. [PMID: 37767112 PMCID: PMC10520667 DOI: 10.1016/j.ncrna.2023.08.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/21/2023] [Revised: 07/25/2023] [Accepted: 08/06/2023] [Indexed: 09/29/2023] Open
Abstract
Long non-coding RNA (lncRNA) distal-less homeobox 6 antisense RNA 1 (DLX6-AS1) is elevated in a variety of cancers, including non-small cell lung cancer (NSCLC) and cervical cancer. In contrast, microRNA-16-5p (miR-16), which is known to regulate autophagy and apoptosis, has been found to be downregulated in similar cancers. Recent research has shown that in tumors with similar characteristics, DLX6-AS1 acts as a sponge for miR-16. However, the cell death-related molecular mechanism of the DLX6-AS1/miR-16 axis has yet to be investigated. Therefore, we propose a dynamic Boolean model to investigate gene regulation in cell death processes via the DLX6-AS1/miR-16 axis. We found excellent concordance when we compared our model with many experimental investigations, including gain-of-function genes in NSCLC and cervical cancer. We also predict a unique positive circuit involving BMI1/ATM/miR-16. Our results suggest that this circuit is essential for regulating autophagy and apoptosis under stress signals. Thus, our Boolean network describes the cell-death process associated with NSCLC and cervical cancer. Our results further suggest that DLX6-AS1 targeting may boost miR-16 activity and thereby restrict tumor growth in these cancers by triggering autophagy and apoptosis.
|
11
|
Exploring Small Molecules Targeting Protein-Protein Interactions (PPIs): Advancements and Future Prospects. Pharmaceuticals (Basel) 2023; 16:1644. [PMID: 38139771 PMCID: PMC10747528 DOI: 10.3390/ph16121644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 11/21/2023] [Indexed: 12/24/2023] Open
Abstract
This Special Issue of Pharmaceuticals is dedicated to the clinically relevant, intricate realm of "Small Molecules Targeting Protein-Protein Interactions (PPIs): Current Strategies for the Development of New Drugs" [...].
|
12
|
Drug-drug interaction prediction: databases, web servers and computational models. Brief Bioinform 2023; 25:bbad445. [PMID: 38113076 PMCID: PMC10782925 DOI: 10.1093/bib/bbad445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 10/26/2023] [Accepted: 11/14/2023] [Indexed: 12/21/2023] Open
Abstract
In clinical treatment, two or more drugs (i.e. a drug combination) are used simultaneously or successively, primarily to enhance therapeutic efficacy or reduce drug side effects. However, an inappropriate drug combination may not only fail to improve efficacy but even lead to adverse reactions. Therefore, following the basic principle of improving efficacy and/or reducing adverse reactions, we should study drug-drug interactions (DDIs) comprehensively and thoroughly so that drug combinations can be used rationally. In this review, we first introduce the basic concept and classification of DDIs. We then briefly describe some important publicly available databases and web servers covering experimentally verified or predicted DDIs. As an effective auxiliary tool, computational models for predicting DDIs can not only save the cost of biological experiments but also provide relevant guidance for combination therapy to some extent. We therefore summarize three types of prediction models proposed in recent years (traditional machine learning-based models, deep learning-based models and score function-based models) and discuss their advantages and limitations. Finally, we point out the problems that need to be solved in future research on DDI prediction and provide corresponding suggestions.
|
13
|
Integration of circulating microRNAs and transcriptome signatures identifies early-pregnancy biomarkers of preeclampsia. Clin Transl Med 2023; 13:e1446. [PMID: 37905457 PMCID: PMC10616748 DOI: 10.1002/ctm2.1446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 09/21/2023] [Accepted: 10/01/2023] [Indexed: 11/02/2023] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) have been implicated in the pathobiology of preeclampsia, a common hypertensive disorder of pregnancy. In a nested matched case-control cohort within the Vitamin D Antenatal Asthma Reduction Trial (VDAART), we previously identified peripheral blood mRNA signatures related to preeclampsia and vitamin D status (≤30 ng/mL) during gestation from 10 to 18 weeks, using differential expression analysis. METHODS Using quantitative PCR arrays, we conducted profiling of circulating miRNAs at 10-18 weeks of gestation in the same VDAART cohort to identify differentially expressed (DE) miRNAs associated with preeclampsia and vitamin D status. For the validation of the expression of circulating miRNA signatures in the placenta, the HTR-8/SVneo trophoblast cell line was used. Targets of circulating miRNA signatures in the preeclampsia mRNA signatures were identified by consensus ranking of miRNA-target prediction scores from four sources. The connected component of target signatures was identified by mapping to the protein-protein interaction (PPI) network and hub targets were determined. As experimental validation, we examined the gene and protein expression of IGF1R, one of the key hub genes, as a target of the DE miRNA, miR-182-5p, in response to a miR-182-5p mimic in HTR-8/SVneo cells. RESULTS Pregnant women with preeclampsia had 16 circulating DE miRNAs relative to normal pregnancy controls that were also DE under vitamin D insufficiency (9/16 = 56% upregulated, FDR < .05). Thirteen miRNAs (13/16 = 81.3%) were detected in HTR-8/SVneo cells. Overall, 16 DE miRNAs had 122 targets, of which 87 were unique. Network analysis demonstrated that the 32 targets of DE miRNA signatures created a connected subnetwork in the preeclampsia module with CXCL8, CXCL10, CD274, MMP9 and IGF1R having the highest connectivity and centrality degree. 
In an in vitro validation experiment, the introduction of an hsa-miR-182-5p mimic resulted in significant reduction of its target IGF1R gene and protein expression within HTR-8/SVneo cells. CONCLUSIONS The integration of the circulating DE miRNA and mRNA signatures associated with preeclampsia provided additional insights into the subclinical molecular signature of preeclampsia. Our systems and network biology approach revealed several biological pathways, including IGF-1, that may play a role in the early pathophysiology of preeclampsia. These pathways and signatures also denote potential biomarkers for the early stages of preeclampsia and suggest possible preventive measures.
|
14
|
A framework for considering prior information in network-based approaches to omics data analysis. Proteomics 2023; 23:e2200402. [PMID: 37986684 DOI: 10.1002/pmic.202200402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 09/20/2023] [Accepted: 09/21/2023] [Indexed: 11/22/2023]
Abstract
For decades, molecular biologists have been uncovering the mechanics of biological systems. Efforts to bring their findings together have led to the development of multiple databases and information systems that capture and present pathway information in a computable network format. Concurrently, the advent of modern omics technologies has empowered researchers to systematically profile cellular processes across different modalities. Numerous algorithms, methodologies, and tools have been developed to use prior knowledge networks (PKNs) in the analysis of omics datasets. Interestingly, it has been repeatedly demonstrated that the source of prior knowledge can greatly impact the results of a given analysis. For these methods to be successful it is paramount that their selection of PKNs is amenable to the data type and the computational task they aim to accomplish. Here we present a five-level framework that broadly describes network models in terms of their scope, level of detail, and ability to inform causal predictions. To contextualize this framework, we review a handful of network-based omics analysis methods at each level, while also describing the computational tasks they aim to accomplish.
|
15
|
Identification of essential genes associated with SARS-CoV-2 infection as potential drug target candidates with machine learning algorithms. Sci Rep 2023; 13:15141. [PMID: 37704748 PMCID: PMC10499814 DOI: 10.1038/s41598-023-42127-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/11/2022] [Accepted: 09/05/2023] [Indexed: 09/15/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) requires the fast discovery of effective treatments to fight this worldwide concern. Several genes associated with the SARS-CoV-2, which are essential for its functionality, pathogenesis, and survival, have been identified. These genes, which play crucial roles in SARS-CoV-2 infection, are considered potential therapeutic targets. Developing drugs against these essential genes to inhibit their regular functions could be a good approach for COVID-19 treatment. Artificial intelligence and machine learning methods provide powerful infrastructures for interpreting and understanding the available data and can assist in finding fast explanations and cures. We propose a method to highlight the essential genes that play crucial roles in SARS-CoV-2 pathogenesis. For this purpose, we define eleven informative topological and biological features for the biological and PPI networks constructed on gene sets that correspond to COVID-19. Then, we use three different unsupervised learning algorithms with different approaches to rank the important genes with respect to our defined informative features. Finally, we present a set of 18 important genes related to COVID-19. Materials and implementations are available at: https://github.com/MahnazHabibi/Gene_analysis .
|
16
|
Deciphering ligand-receptor-mediated intercellular communication based on ensemble deep learning and the joint scoring strategy from single-cell transcriptomic data. Comput Biol Med 2023; 163:107137. [PMID: 37364528 DOI: 10.1016/j.compbiomed.2023.107137] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/26/2023] [Revised: 05/18/2023] [Accepted: 06/04/2023] [Indexed: 06/28/2023]
Abstract
BACKGROUND Cell-cell communication in a tumor microenvironment is vital to tumorigenesis, tumor progression and therapy. Intercellular communication inference helps understand molecular mechanisms of tumor growth, progression and metastasis. METHODS Focusing on ligand-receptor co-expressions, in this study, we developed an ensemble deep learning framework, CellComNet, to decipher ligand-receptor-mediated cell-cell communication from single-cell transcriptomic data. First, credible LRIs are captured by integrating data arrangement, feature extraction, dimension reduction, and LRI classification based on an ensemble of a heterogeneous Newton boosting machine and a deep neural network. Next, known and identified LRIs are screened based on single-cell RNA sequencing (scRNA-seq) data in certain tissues. Finally, cell-cell communication is inferred by incorporating scRNA-seq data, the screened LRIs, and a joint scoring strategy that combines expression thresholding and the expression product of ligands and receptors. RESULTS The proposed CellComNet framework was compared with four competing protein-protein interaction prediction models (PIPR, XGBoost, DNNXGB, and OR-RCNN) and obtained the best AUCs and AUPRs on four LRI datasets, demonstrating the best LRI classification ability. CellComNet was further applied to analyze intercellular communication in human melanoma and head and neck squamous cell carcinoma (HNSCC) tissues. The results demonstrate that cancer-associated fibroblasts communicate strongly with melanoma cells and that endothelial cells communicate strongly with HNSCC cells. CONCLUSIONS The proposed CellComNet framework efficiently identified credible LRIs and significantly improved cell-cell communication inference performance. We anticipate that CellComNet can contribute to anticancer drug design and tumor-targeted therapy.
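The abstract names two ingredients of its joint scoring strategy: expression thresholding and the expression product of ligands and receptors. The sketch below shows one plausible reading of each for a single sender-receiver pair of cell types. It is not the CellComNet implementation: the function names, the threshold, the expression values and the decision to report the two scores separately are all assumptions, because the abstract does not state how they are combined.

```python
def thresholding_score(ligand_expr, receptor_expr, pairs, threshold):
    """Count LRIs whose ligand (in the sender) and receptor (in the receiver)
    both have mean expression above the threshold."""
    return sum(
        1
        for ligand, receptor in pairs
        if ligand_expr.get(ligand, 0.0) > threshold
        and receptor_expr.get(receptor, 0.0) > threshold
    )


def product_score(ligand_expr, receptor_expr, pairs):
    """Sum of ligand expression times receptor expression over all LRIs."""
    return sum(
        ligand_expr.get(ligand, 0.0) * receptor_expr.get(receptor, 0.0)
        for ligand, receptor in pairs
    )


# Hypothetical mean expression per gene in a sender and a receiver cell type.
sender = {"CXCL12": 2.0, "VEGFA": 0.5}
receiver = {"CXCR4": 3.0, "KDR": 4.0}
lri_pairs = [("CXCL12", "CXCR4"), ("VEGFA", "KDR")]

n_active = thresholding_score(sender, receiver, lri_pairs, threshold=1.0)
strength = product_score(sender, receiver, lri_pairs)
print(n_active, strength)
```

Thresholding counts how many interactions are plausibly active, while the product rewards pairs in which both partners are highly expressed, so the two scores can rank cell-type pairs differently.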
|
17
|
A spatially defined human Notch receptor interaction network reveals Notch intracellular storage and Ataxin-2-mediated fast recycling. Cell Rep 2023; 42:112819. [PMID: 37454291 DOI: 10.1016/j.celrep.2023.112819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 05/18/2023] [Accepted: 06/29/2023] [Indexed: 07/18/2023] Open
Abstract
The Notch signaling pathway controls cell growth, differentiation, and fate decisions. Dysregulation of Notch signaling has been linked to various human diseases. Notch receptor resides in multiple cellular compartments, and its translocation plays a central role in pathway activation. However, the spatial regulation of Notch receptor functions remains largely elusive. Using TurboID-based proximity labeling followed by affinity purification and mass spectrometry, we establish a spatially defined human Notch receptor interaction network. Notch receptors interact with different proteins in distinct subcellular compartments to perform specific cellular functions. This spatially defined interaction network also reveals that a large fraction of NOTCH is stored at the endoplasmic reticulum (ER)-Golgi intermediate compartment and recruits Ataxin-2-dependent recycling machinery for rapid recycling, Notch signaling activation, and leukemogenesis. Our work provides insights into dynamic Notch receptor complexes with exquisite spatial resolution, which will help in elucidating the detailed regulation of Notch receptors and highlight potential therapeutic targets for Notch-related pathogenesis.
|
18
|
BERTwalk for integrating gene networks to predict gene- to pathway-level properties. BIOINFORMATICS ADVANCES 2023; 3:vbad086. [PMID: 37448813 PMCID: PMC10336298 DOI: 10.1093/bioadv/vbad086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 06/14/2023] [Accepted: 07/02/2023] [Indexed: 07/15/2023]
Abstract
Motivation: Graph representation learning is a fundamental problem in the field of data science with applications to integrative analysis of biological networks. Previous work in this domain was mostly limited to shallow representation techniques. A recent deep representation technique, BIONIC, has achieved state-of-the-art results in a variety of tasks but used arbitrarily defined components. Results: Here, we present BERTwalk, an unsupervised learning scheme that combines the BERT masked language model with a network propagation regularization for graph representation learning. The transformation from networks to texts allows our method to naturally integrate different networks and provide features that inform not only nodes or edges but also pathway-level properties. We show that our BERTwalk model outperforms BIONIC, as well as four other recent methods, on two comprehensive benchmarks in yeast and human. We further show that our model can be utilized to infer functional pathways and their effects. Availability and implementation: Code and data are available at https://github.com/raminass/BERTwalk. Contact: roded@tauex.tau.ac.il.
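One generic way to turn a network into text is to serialize random walks as "sentences" of node names, then hide one token per sentence as a masked-language-model training target. The sketch below shows only that general idea. The abstract does not specify BERTwalk's actual walk or masking procedure, so the function names, walk parameters, and toy gene network here are assumptions.

```python
import random


def random_walk_sentences(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Serialize uniform random walks as space-separated node-name sentences."""
    rng = random.Random(seed)
    sentences = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adjacency[walk[-1]])
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            sentences.append(" ".join(walk))
    return sentences


def mask_one_token(sentence, rng, mask="[MASK]"):
    """Replace one randomly chosen token with a mask; return (masked, target)."""
    tokens = sentence.split()
    i = rng.randrange(len(tokens))
    target = tokens[i]
    tokens[i] = mask
    return " ".join(tokens), target


# Hypothetical toy gene network (undirected adjacency sets).
network = {"G1": {"G2"}, "G2": {"G1", "G3"}, "G3": {"G2"}}
corpus = random_walk_sentences(network, walks_per_node=2, walk_length=4, seed=1)
masked, target = mask_one_token(corpus[0], random.Random(1))
```

Walks from several networks over the same gene set could be pooled into one corpus, which is one plausible reading of how a text representation makes network integration straightforward.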
|
19
|
A multi-scale map of protein assemblies in the DNA damage response. Cell Syst 2023; 14:447-463.e8. [PMID: 37220749 PMCID: PMC10330685 DOI: 10.1016/j.cels.2023.04.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/15/2021] [Revised: 01/30/2023] [Accepted: 04/25/2023] [Indexed: 05/25/2023]
Abstract
The DNA damage response (DDR) ensures error-free DNA replication and transcription and is disrupted in numerous diseases. An ongoing challenge is to determine the proteins orchestrating DDR and their organization into complexes, including constitutive interactions and those responding to genomic insult. Here, we use multi-conditional network analysis to systematically map DDR assemblies at multiple scales. Affinity purifications of 21 DDR proteins, with/without genotoxin exposure, are combined with multi-omics data to reveal a hierarchical organization of 605 proteins into 109 assemblies. The map captures canonical repair mechanisms and proposes new DDR-associated proteins extending to stress, transport, and chromatin functions. We find that protein assemblies closely align with genetic dependencies in processing specific genotoxins and that proteins in multiple assemblies typically act in multiple genotoxin responses. Follow-up by DDR functional readouts newly implicates 12 assembly members in double-strand-break repair. The DNA damage response assemblies map is available for interactive visualization and query (ccmi.org/ddram/).
|
20
|
Spatially resolved expression landscape and gene-regulatory network of human gastric corpus epithelium. Protein Cell 2023; 14:433-447. [PMID: 37402315 PMCID: PMC10319429 DOI: 10.1093/procel/pwac059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2021] [Accepted: 10/30/2022] [Indexed: 07/20/2023] Open
Abstract
Molecular knowledge of human gastric corpus epithelium remains incomplete. Here, by integrated analyses using single-cell RNA sequencing (scRNA-seq), spatial transcriptomics, and single-cell assay for transposase accessible chromatin sequencing (scATAC-seq) techniques, we uncovered the spatially resolved expression landscape and gene-regulatory network of human gastric corpus epithelium. Specifically, we identified a stem/progenitor cell population in the isthmus of human gastric corpus, where EGF and WNT signaling pathways were activated. Meanwhile, LGR4, but not LGR5, was responsible for the activation of WNT signaling pathway. Importantly, FABP5 and NME1 were identified and validated as crucial for both normal gastric stem/progenitor cells and gastric cancer cells. Finally, we explored the epigenetic regulation of critical genes for gastric corpus epithelium at chromatin state level, and identified several important cell-type-specific transcription factors. In summary, our work provides novel insights to systematically understand the cellular diversity and homeostasis of human gastric corpus epithelium in vivo.
|
21
|
Machine Learning Advances in Predicting Peptide/Protein-Protein Interactions Based on Sequence Information for Lead Peptides Discovery. Adv Biol (Weinh) 2023; 7:e2200232. [PMID: 36775876 DOI: 10.1002/adbi.202200232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 12/30/2022] [Indexed: 02/14/2023]
Abstract
Peptides have shown increasing advantages and significant clinical value in drug discovery and development. With the development of high-throughput technologies and artificial intelligence (AI), machine learning (ML) methods for discovering new lead peptides have been expanded and incorporated into rational drug design. Predictions of peptide-protein interactions (PepPIs) and protein-protein interactions (PPIs) are both opportunities and challenges in computational biology, which will help to better understand the mechanisms of disease and provide the impetus for the discovery of lead peptides. This paper comprehensively reviews computational models for PepPI and PPI predictions. It begins with an introduction of various databases of peptide ligands and target proteins. Then it discusses data formats and feature representations for proteins and peptides. Furthermore, classical ML methods and emerging deep learning (DL) methods that can be used to train prediction models of PepPI and PPI are classified into four categories, and their advantages and disadvantages are analyzed. To assess the relative performance of different models, different validation protocols and evaluation indexes are discussed. The goal of this review is to help researchers quickly get started to develop computational frameworks using these integrated resources and eventually promote the discovery of lead peptides.
|
22
|
Deep learning-assisted prediction of protein-protein interactions in Arabidopsis thaliana. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2023; 114:984-994. [PMID: 36919205 DOI: 10.1111/tpj.16188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 02/20/2023] [Accepted: 03/09/2023] [Indexed: 05/27/2023]
Abstract
Currently, the experimentally identified interactome of Arabidopsis (Arabidopsis thaliana) is still far from complete, suggesting that computational prediction methods can complement experimental techniques. Motivated by the prosperity and success of deep learning algorithms and natural language processing techniques, we introduce an integrative deep learning framework, DeepAraPPI, allowing us to predict protein-protein interactions (PPIs) of Arabidopsis utilizing sequence, domain and Gene Ontology (GO) information. Our current DeepAraPPI comprises: (i) a word2vec encoding-based Siamese recurrent convolutional neural network (RCNN) model; (ii) a Domain2vec encoding-based multiple-layer perceptron (MLP) model; and (iii) a GO2vec encoding-based MLP model. Finally, DeepAraPPI combines the prediction results of the three individual predictors through a logistic regression model. Compiling high-quality positive and negative training and test samples by applying strict filtering strategies, DeepAraPPI shows superior performance compared with existing state-of-the-art Arabidopsis PPI prediction methods. DeepAraPPI also provides better cross-species predictive ability in rice (Oryza sativa) than traditional machine learning methods, although the overall performance in cross-species prediction remains to be improved. DeepAraPPI is freely accessible at http://zzdlab.com/deeparappi/. In the meantime, we have also made the source code and data sets of DeepAraPPI available at https://github.com/zjy1125/DeepAraPPI.
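The final combination step, in which a logistic regression merges the outputs of three individual predictors, can be sketched as follows. The weights and bias are hypothetical placeholders, not DeepAraPPI's fitted parameters. In practice they would be learned from held-out predictions of the sequence, domain, and GO models.

```python
import math


def combine_predictions(seq_score, domain_score, go_score,
                        weights=(2.0, 1.5, 1.5), bias=-2.5):
    """Combine three predictor scores into one interaction probability.

    weights and bias are illustrative assumptions, not fitted values.
    """
    z = (bias
         + weights[0] * seq_score
         + weights[1] * domain_score
         + weights[2] * go_score)
    # Logistic (sigmoid) link maps the weighted sum to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Three sub-models agree on an interaction vs. all three disagree.
p_strong = combine_predictions(0.9, 0.8, 0.95)
p_weak = combine_predictions(0.1, 0.05, 0.1)
```

A learned combiner of this kind lets the ensemble weight each information source by how reliable it proved on training data, rather than averaging the three scores equally.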
|
23
|
Ultra-Rare Genetic Variation in Relapsing Polychondritis: A Whole-Exome Sequencing Study. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.04.10.23288250. [PMID: 37292664 PMCID: PMC10246166 DOI: 10.1101/2023.04.10.23288250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Objective: Relapsing polychondritis (RP) is a systemic inflammatory disease of unknown etiology. The study objective was to examine the contribution of rare genetic variations in RP. Methods: We performed a case-control exome-wide rare variant association analysis including 66 unrelated European American RP cases and 2923 healthy controls. Gene-level collapsing analysis was performed using Firth's logistic regression. Pathway analysis was performed on an exploratory basis with three different methods: Gene Set Enrichment Analysis (GSEA), sequence kernel association test (SKAT) and higher criticism test. Plasma DCBLD2 levels were measured in patients with RP and healthy controls using enzyme-linked immunosorbent assay (ELISA). Results: In the collapsing analysis, RP was associated with a higher burden of ultra-rare damaging variants in the DCBLD2 gene (7.6% vs 0.1%, unadjusted odds ratio = 79.8, p = 2.93 × 10⁻⁷). Patients with RP and ultra-rare damaging variants in DCBLD2 had a higher prevalence of cardiovascular manifestations. Plasma DCBLD2 protein levels were significantly higher in RP than in healthy controls (5.9 vs 2.3, p < 0.001). Pathway analysis showed statistically significant enrichment of genes in the tumor necrosis factor (TNF) signaling pathway driven by rare damaging variants in RELB, RELA and REL using the higher criticism test weighted by degree and eigenvector centrality. Conclusions: This study identified specific rare variants in DCBLD2 as putative genetic risk factors for RP. Genetic variation within the TNF pathway is also potentially associated with development of RP. These findings should be validated in additional patients with RP and supported by future functional experiments.
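The unadjusted odds ratio in the Results can be checked with a short calculation. The abstract reports only percentages, so the carrier counts used here (5 of 66 cases, 3 of 2923 controls) are back-calculated assumptions that happen to reproduce the reported 7.6%, 0.1%, and 79.8. They are not figures stated in the abstract.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Unadjusted odds ratio from a 2x2 carrier-by-status table."""
    a = case_carriers                     # cases carrying a qualifying variant
    b = case_total - case_carriers        # cases without one
    c = control_carriers                  # controls carrying a qualifying variant
    d = control_total - control_carriers  # controls without one
    return (a * d) / (b * c)


# Assumed counts (not stated in the abstract), chosen to match the
# reported carrier percentages.
case_pct = 100 * 5 / 66
control_pct = 100 * 3 / 2923
or_value = odds_ratio(5, 66, 3, 2923)
```

Under these assumed counts the odds ratio is (5 × 2920) / (61 × 3) ≈ 79.8, which matches the reported value.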
|
24
|
L1CAM deployed perivascular tumor niche promotes vessel wall invasion of tumor thrombus and metastasis of renal cell carcinoma. Cell Death Discov 2023; 9:112. [PMID: 37015905 PMCID: PMC10073121 DOI: 10.1038/s41420-023-01410-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 03/21/2023] [Accepted: 03/23/2023] [Indexed: 04/06/2023] Open
Abstract
The survival of tumor cells in the bloodstream and vasculature adhesion at metastatic sites are crucial for tumor metastasis. Perivascular invasion aids tumor cell self-renewal, survival, and formation of metastases by facilitating readily available oxygen, nutrients, and endothelial-derived paracrine factors. Renal cell carcinoma (RCC) is among the most prevalent tumors of the urinary system, and the formation of venous tumor thrombus (VTT) is a characteristic feature of RCC. We observed high expression of L1CAM in the VTT with vessel wall invasion. L1CAM promotes the adhesion, migration, and invasion ability of RCC and enhances metastasis by interacting with ITGA5, which elicits activation of signaling downstream of integrin α5β1. L1CAM promotes ADAM17 transcription to facilitate transmembrane ectodomain cleavage and release of soluble L1CAM. In response to soluble L1CAM, vascular endothelial cells release several cytokines and chemokines. Endothelial-derived CXCL5 and its receptor CXCR2 promote the migration and intravasation of RCC toward endothelial cells, suggesting that crosstalk between endothelial cells and tumor cells has a direct guiding role in driving the metastatic spread of RCC. L1CAM plays a crucial role in the invasive ability of RCC, and regulation of L1CAM expression may contribute therapeutically to preventing RCC progression.
|
25
|
Quadra-Stable Dynamics of p53 and PTEN in the DNA Damage Response. Cells 2023; 12:cells12071085. [PMID: 37048159 PMCID: PMC10093226 DOI: 10.3390/cells12071085] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/23/2023] [Revised: 03/23/2023] [Accepted: 03/29/2023] [Indexed: 04/14/2023] Open
Abstract
Cell fate determination is a complex process that is frequently described as cells traveling on rugged pathways, beginning with the DNA damage response (DDR). Tumor protein p53 (p53) and phosphatase and tensin homolog (PTEN) are two critical players in this process. Although both of these proteins are known to be key cell fate regulators, the exact mechanism by which they collaborate in the DDR remains unknown. Thus, we propose a dynamic Boolean network. Our model incorporates experimental data obtained from NSCLC cells and is the first of its kind. Our network's wild-type system shows that DDR activates the G2/M checkpoint, and this triggers a cascade of events, involving p53 and PTEN, that ultimately leads to four potential phenotypes: cell cycle arrest, senescence, autophagy, and apoptosis (quadra-stable dynamics). The network predictions correspond with gain- and loss-of-function investigations in two additional cell lines (HeLa and MCF-7). Our findings imply that p53 and PTEN act as molecular switches that activate or deactivate specific pathways to govern cell fate decisions. Thus, our network facilitates the direct investigation of quadruplicate cell fate decisions in DDR. Therefore, we concluded that concurrently controlling PTEN and p53 dynamics may be a viable strategy for enhancing clinical outcomes.
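The mechanics of a Boolean network, in which each node is on or off and is updated by a logical rule, can be shown with a toy model. The four nodes and their update rules below are invented for illustration and are far simpler than the published p53-PTEN model. The sketch only demonstrates how stable states (fixed points) are found by enumerating every state under synchronous updating.

```python
from itertools import product

# Hypothetical update rules: each maps the current state to a node's next value.
RULES = {
    "DDR": lambda s: s["DDR"],                      # damage signal held constant
    "p53": lambda s: s["DDR"],                      # p53 follows the damage signal
    "PTEN": lambda s: s["p53"],                     # PTEN is induced by p53
    "Apoptosis": lambda s: s["p53"] and s["PTEN"],  # needs both regulators
}


def step(state):
    """Apply all rules at once (synchronous update)."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}


def fixed_points():
    """Return every state that maps to itself under one update step."""
    nodes = list(RULES)
    stable = []
    for values in product([False, True], repeat=len(nodes)):
        state = dict(zip(nodes, values))
        if step(state) == state:
            stable.append(state)
    return stable


stable_states = fixed_points()
```

This toy network has two stable states (everything off without damage, everything on with damage). A network with four coexisting stable phenotypes, as described in the abstract, would need additional nodes and mutually inhibitory rules.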
|
26
|
Analysis of affinity purification-related proteomic data for studying protein-protein interaction networks in cells. Brief Bioinform 2023; 24:bbad010. [PMID: 36682002 PMCID: PMC10025443 DOI: 10.1093/bib/bbad010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/22/2022] [Accepted: 01/02/2023] [Indexed: 01/23/2023] Open
Abstract
During intracellular signal transduction, protein-protein interactions (PPIs) facilitate protein complex assembly to regulate protein localization and function, which are critical for numerous cellular events. Over the years, multiple techniques have been developed to characterize PPIs to elucidate roles and regulatory mechanisms of proteins. Among them, mass spectrometry (MS)-based interactome analysis has been increasing in popularity because it offers an unbiased and informative view of PPI networks. However, with MS instrumentation advancing and yielding more data than ever, analyzing large amounts of PPI-associated proteomic data to reveal bona fide interacting proteins has become challenging. Here, we review the methods and bioinformatic resources that are commonly used in analyzing large interactome-related proteomic data and propose a simple guideline for identifying novel interacting proteins for biological research.
|
27
|
Multiplexed kinase interactome profiling quantifies cellular network activity and plasticity. Mol Cell 2023; 83:803-818.e8. [PMID: 36736316 PMCID: PMC10072906 DOI: 10.1016/j.molcel.2023.01.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/19/2022] [Revised: 12/07/2022] [Accepted: 01/11/2023] [Indexed: 02/05/2023]
Abstract
Dynamic changes in protein-protein interaction (PPI) networks underlie all physiological cellular functions and drive devastating human diseases. Profiling PPI networks can, therefore, provide critical insight into disease mechanisms and identify new drug targets. Kinases are regulatory nodes in many PPI networks; yet, facile methods to systematically study kinase interactome dynamics are lacking. We describe kinobead competition and correlation analysis (kiCCA), a quantitative mass spectrometry-based chemoproteomic method for rapid and highly multiplexed profiling of endogenous kinase interactomes. Using kiCCA, we identified 1,154 PPIs of 238 kinases across 18 diverse cancer lines, quantifying context-dependent kinase interactome changes linked to cancer type, plasticity, and signaling states, thereby assembling an extensive knowledgebase for cell signaling research. We discovered drug target candidates, including an endocytic adapter-associated kinase (AAK1) complex that promotes cancer cell epithelial-mesenchymal plasticity and drug resistance. Our data demonstrate the importance of kinase interactome dynamics for cellular signaling in health and disease.
|
28
|
Using human genetics to improve safety assessment of therapeutics. Nat Rev Drug Discov 2023; 22:145-162. [PMID: 36261593 DOI: 10.1038/s41573-022-00561-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Accepted: 09/02/2022] [Indexed: 02/07/2023]
Abstract
Human genetics research has discovered thousands of proteins associated with complex and rare diseases. Genome-wide association studies (GWAS) and studies of Mendelian disease have resulted in an increased understanding of the role of gene function and regulation in human conditions. Although the application of human genetics has been explored primarily as a method to identify potential drug targets and support their relevance to disease in humans, there is increasing interest in using genetic data to identify potential safety liabilities of modulating a given target. Human genetic variants can be used as a model to anticipate the effect of lifelong modulation of therapeutic targets and identify the potential risk for on-target adverse events. This approach is particularly useful for non-clinical safety evaluation of novel therapeutics that lack pharmacologically relevant animal models and can contribute to the intrinsic safety profile of a drug target. This Review illustrates applications of human genetics to safety studies during drug discovery and development, including assessing the potential for on- and off-target associated adverse events, carcinogenicity risk assessment, and guiding translational safety study designs and monitoring strategies. A summary of available human genetic resources and recommended best practices is provided. The challenges and future perspectives of translating human genetic information to identify risks for potential drug effects in preclinical and clinical development are discussed.
|
29
|
Targeted activation in localized protein environments via deep red photoredox catalysis. Nat Chem 2023; 15:101-109. [PMID: 36216892 PMCID: PMC9840673 DOI: 10.1038/s41557-022-01057-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 10/01/2021] [Accepted: 09/02/2022] [Indexed: 01/17/2023]
Abstract
State-of-the-art photoactivation strategies in chemical biology provide spatiotemporal control and visualization of biological processes. However, using high-energy light (λ < 500 nm) for substrate or photocatalyst sensitization can lead to background activation of photoactive small-molecule probes and reduce its efficacy in complex biological environments. Here we describe the development of targeted aryl azide activation via deep red-light (λ = 660 nm) photoredox catalysis and its use in photocatalysed proximity labelling. We demonstrate that aryl azides are converted to triplet nitrenes via a redox-centric mechanism and show that its spatially localized formation requires both red light and a photocatalyst-targeting modality. This technology was applied in different colon cancer cell systems for targeted protein environment labelling of epithelial cell adhesion molecule (EpCAM). We identified a small subset of proteins with previously known and unknown association to EpCAM, including CDH3, a clinically relevant protein that shares high tumour-selective expression with EpCAM.
|
30
|
NMTF-DTI: A Nonnegative Matrix Tri-factorization Approach With Multiple Kernel Fusion for Drug-Target Interaction Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:586-594. [PMID: 34914594 DOI: 10.1109/tcbb.2021.3135978] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Prediction of drug-target interactions (DTIs) plays a significant role in drug development and drug discovery. Although this task requires a large investment in terms of time and cost, especially when it is performed experimentally, the results are not necessarily significant. Computational DTI prediction is a shortcut to reduce the risks of experimental methods. In this study, we propose an effective approach of nonnegative matrix tri-factorization, referred to as NMTF-DTI, to predict the interaction scores between drugs and targets. NMTF-DTI utilizes multiple kernels (similarity measures) for drugs and targets and Laplacian regularization to boost the prediction performance. The performance of NMTF-DTI is evaluated via cross-validation and is compared with existing DTI prediction methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision and recall curve (AUPR). We evaluate our method on four gold standard datasets, comparing to other state-of-the-art methods. Cross-validation and a separate, manually created dataset are used to set parameters. The results show that NMTF-DTI outperforms other competing methods. Moreover, the results of a case study also confirm the superiority of NMTF-DTI.
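The AUC used for evaluation above has a simple interpretation: it is the probability that a randomly chosen true interaction is scored higher than a randomly chosen non-interaction. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are assumptions for illustration, and this is not the evaluation code of NMTF-DTI.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy drug-target pairs: 1 = known interaction, 0 = no known interaction.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
auc = roc_auc(labels, scores)
```

Here three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75. The quadratic loop is fine for small examples, whereas a rank-based formula would be preferable for large datasets.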
|
31
|
Prediction of Kinase-Substrate Associations Using The Functional Landscape of Kinases and Phosphorylation Sites. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2023; 28:73-84. [PMID: 36540966 PMCID: PMC9782723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
Protein phosphorylation is a key post-translational modification that plays a central role in many cellular processes. With recent advances in biotechnology, thousands of phosphorylated sites can be identified and quantified in a given sample, enabling proteome-wide screening of cellular signaling. However, for most (> 90%) of the phosphorylation sites that are identified in these experiments, the kinase(s) that target these sites are unknown. To broadly utilize available structural, functional, evolutionary, and contextual information in predicting kinase-substrate associations (KSAs), we develop a network-based machine learning framework. Our framework integrates a multitude of data sources to characterize the landscape of functional relationships and associations among phosphosites and kinases. To construct a phosphosite-phosphosite association network, we use sequence similarity, shared biological pathways, co-evolution, co-occurrence, and co-phosphorylation of phosphosites across different biological states. To construct a kinase-kinase association network, we integrate protein-protein interactions, shared biological pathways, and membership in common kinase families. We use node embeddings computed from these heterogeneous networks to train machine learning models for predicting kinase-substrate associations. Our systematic computational experiments using the PhosphositePLUS database show that the resulting algorithm, NetKSA, outperforms two state-of-the-art algorithms, KinomeXplorer and LinkPhinder, in overall KSA prediction. By stratifying the ranking of kinases, NetKSA also enables annotation of phosphosites that are targeted by relatively less-studied kinases. Availability: The code and data are available at compbio.case.edu/NetKSA/.
|
32
|
A census of actin-associated proteins in humans. Front Cell Dev Biol 2023; 11:1168050. [PMID: 37187613 PMCID: PMC10175787 DOI: 10.3389/fcell.2023.1168050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 03/31/2023] [Indexed: 05/17/2023] Open
Abstract
Actin filaments help in maintaining the cell structure and coordinating cellular movements and cargo transport within the cell. Actin participates in the interaction with several proteins and also with itself to form the helical filamentous actin (F-actin). Actin-binding proteins (ABPs) and actin-associated proteins (AAPs) coordinate the actin filament assembly and processing, regulate the flux between globular G-actin and F-actin in the cell, and help maintain the cellular structure and integrity. We have used protein-protein interaction data available through multiple sources (STRING, BioGRID, mentha, and a few others), functional annotation, and classical actin-binding domains to identify actin-binding and actin-associated proteins in the human proteome. Here, we report 2482 AAPs and present an analysis of their structural and sequential domains, functions, evolutionary conservation, cellular localization, abundance, and tissue-specific expression patterns. This analysis provides a base for the characterization of proteins involved in actin dynamics and turnover in the cell.
|
33
|
Recent development of machine learning models for the prediction of drug-drug interactions. KOREAN J CHEM ENG 2023; 40:276-285. [PMID: 36748027 PMCID: PMC9894510 DOI: 10.1007/s11814-023-1377-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/11/2022] [Revised: 12/09/2022] [Accepted: 12/16/2022] [Indexed: 02/05/2023]
Abstract
Polypharmacy, the co-administration of multiple drugs, has become an area of concern as the elderly population grows and unexpected infections, such as the COVID-19 pandemic, keep emerging. However, it is very costly and time-consuming to experimentally examine the pharmacological effects of polypharmacy. To address this challenge, machine learning models that predict drug-drug interactions (DDIs) have been actively developed in recent years. In particular, the growing volume of drug datasets and advances in machine learning have facilitated model development. In this regard, this review discusses the DDI-predicting machine learning models that have been developed since 2018. Our discussion focuses on dataset sources used to develop the models, featurization approaches for molecular structures and biological information, and types of DDI prediction outcomes from the models. Finally, we make suggestions for research opportunities in this field.
|
34
|
Study on the Mechanism of Radix Astragali against Renal Aging Based on Network Pharmacology. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE 2022; 2022:6987677. [PMID: 36561604 PMCID: PMC9767736 DOI: 10.1155/2022/6987677] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/28/2021] [Revised: 11/17/2022] [Accepted: 11/26/2022] [Indexed: 12/15/2022]
Abstract
Radix Astragali is widely used in traditional Chinese medicine for its antiaging effects. The purpose of this study is to explore the main active ingredients and targets of Radix Astragali against renal aging by network pharmacology and to verify the mechanism of the main active ingredients in vitro. The TCMSP, ETCM, and TCMID databases were used to screen active ingredients of Radix Astragali. Targets of active ingredients were predicted using BATMAN-TCM and cross-validated using kidney aging-related genes obtained from the GeneCards and NCBI databases. Pathway enrichment and protein-protein interaction (PPI) analyses were performed on core targets. Additionally, a pharmacological network was constructed based on the active ingredients, targets, and pathways. HK-2 cells were treated with D-galactose (D-Gal) to generate a cell model of senescence. CCK-8 and β-galactosidase assays were used to detect the effect of Radix Astragali active components on cell proliferation and aging. ELISA was used to detect the expression of senescence-associated secreted proteins (TGF-β and IL-6) in the cell culture supernatant. Western blot was used to detect the expression of key proteins in the SIRT1/p53 pathway. Five active ingredients (Astragaloside I, II, III, IV and choline) were identified from Radix Astragali, and together they target a total of 128 genes. Enrichment analysis showed these genes were implicated in 153 KEGG pathways, including the p53, FoxO, and AMPK pathways. The PPI network contained 117 proteins and 572 interactions. TP53 and SIRT1 were two hub genes in the PPI network and interacted with each other. The pharmacological network showed that the five main active ingredients share some target genes, including TP53 and SIRT1. These targeted genes were involved in the p53, FoxO, and AMPK pathways. Proliferation of HK-2 cells was increased by Astragaloside IV treatment compared with that of the D-Gal treatment group. However, the proliferation of the SA-β-gal positive cells was inhibited. The expression of TGF-β and IL-6 in the D-Gal group was higher than that in the normal group, and treatment with Astragaloside IV significantly reduced the expression of TGF-β and IL-6. The expression of SIRT1 in the Astragaloside IV group was higher than that in the D-Gal group, whereas the expression of p53 and p21 was lower in the Astragaloside IV group than in the D-Gal group. This study suggested that Astragaloside IV is an important active ingredient of Radix Astragali in the treatment of kidney aging via the SIRT1-p53 pathway.
|
35
|
Abstract
The connectivity of a gene, defined as the number of interactions a gene's product has with other genes' products, is a key characteristic of a gene. In prokaryotes, the complexity hypothesis predicts that genes which undergo more frequent horizontal transfer will be less connected than genes which are only very rarely transferred. We tested the role of horizontal gene transfer, and other potentially important factors, by examining the connectivity of chromosomal and plasmid genes, across 134 diverse prokaryotic species. We found that (i) genes on plasmids were less connected than genes on chromosomes; (ii) connectivity of plasmid genes was not correlated with plasmid mobility; and (iii) the sociality of genes (cooperative or private) was not correlated with gene connectivity.
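The connectivity definition in this abstract corresponds to node degree in an interaction network. A minimal sketch of that count follows, assuming a hypothetical undirected edge list; the gene names and the `connectivity` helper are illustrative and not taken from the study.

```python
from collections import defaultdict


def connectivity(edges):
    """Return {gene: number of distinct interaction partners}."""
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    return {gene: len(p) for gene, p in partners.items()}


# Hypothetical toy network; the duplicated geneA-geneB edge is counted once.
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneB", "geneC"),
    ("geneD", "geneA"),
    ("geneA", "geneB"),
]
degrees = connectivity(edges)
```

Using sets of partners rather than raw edge counts keeps duplicated records from inflating a gene's connectivity.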
|
36
|
Processes in DNA damage response from a whole-cell multi-omics perspective. iScience 2022; 25:105341. [PMID: 36339253 PMCID: PMC9633746 DOI: 10.1016/j.isci.2022.105341] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/16/2021] [Revised: 08/10/2022] [Accepted: 10/10/2022] [Indexed: 11/09/2022] Open
Abstract
Technological advances have made it feasible to collect multi-condition multi-omic time courses of cellular response to perturbation, but the complexity of these datasets impedes discovery due to challenges in data management, analysis, visualization, and interpretation. Here, we report a whole-cell mechanistic analysis of HL-60 cellular response to bendamustine. We integrate both enrichment and network analysis to show the progression of DNA damage and programmed cell death over time in molecular, pathway, and process-level detail using an interactive analysis framework for multi-omics data. Our framework, Mechanism of Action Generator Involving Network analysis (MAGINE), automates network construction and enrichment analysis across multiple samples and platforms, which can be integrated into our annotated gene-set network to combine the strengths of networks and ontology-driven analysis. Taken together, our work demonstrates how multi-omics integration can be used to explore signaling processes at various resolutions and demonstrates multi-pathway involvement beyond the canonical bendamustine mechanism.
|
37
|
Alteration in tyrosine phosphorylation of cardiac proteome and EGFR pathway contribute to hypertrophic cardiomyopathy. Commun Biol 2022; 5:1251. [PMID: 36380187 PMCID: PMC9666710 DOI: 10.1038/s42003-022-04021-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2020] [Accepted: 09/22/2022] [Indexed: 11/16/2022] Open
Abstract
Alterations of serine/threonine phosphorylation of the cardiac proteome are a hallmark of heart failure. However, the contribution of tyrosine phosphorylation (pTyr) to the pathogenesis of cardiac hypertrophy remains unclear. We use global mapping to discover and quantify site-specific pTyr in two cardiac hypertrophic mouse models, i.e., cardiac overexpression of ErbB2 (TgErbB2) and α myosin heavy chain R403Q (R403Q-αMyHC Tg), compared to control hearts. We find significant phosphoproteomic alterations in TgErbB2 mice in right ventricular cardiomyopathy, hypertrophic cardiomyopathy (HCM), and dilated cardiomyopathy (DCM) pathways. On the other hand, data from R403Q-αMyHC Tg mice indicate that the EGFR1 pathway is central for cardiac hypertrophy, along with activation of the angiopoietin, ErbB, growth hormone, and chemokine signaling pathways. Surprisingly, most myofilament proteins show downregulation of pTyr rather than upregulation. Kinase-substrate enrichment analysis (KSEA) shows a marked downregulation of MAPK pathway activity downstream of k-Ras in TgErbB2 mice and activation of EGFR, focal adhesion, PDGFR, and actin cytoskeleton pathways. In vivo ErbB2 inhibition by AG-825 decreases cardiomyocyte disarray. Serine/threonine and tyrosine phosphoproteomes confirm the above-described pathways and the effectiveness of AG-825 treatment. Thus, altered pTyr may play a regulatory role in cardiac hypertrophic models.
|
38
|
A new machine learning method for cancer mutation analysis. PLoS Comput Biol 2022; 18:e1010332. [PMID: 36251702 PMCID: PMC9612828 DOI: 10.1371/journal.pcbi.1010332] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/28/2022] [Revised: 10/27/2022] [Accepted: 10/05/2022] [Indexed: 11/23/2022] Open
Abstract
Identifying cancer-causing mutations is complicated. The recurrence of a mutation in patients remains one of the most reliable features of mutation driver status. However, some mutations are more likely to happen than others for various reasons. Various sequencing analyses have revealed that cancer driver genes operate across complex pathways and networks, with mutations often arising in a mutually exclusive pattern. Genes with low-frequency mutations are understudied as cancer-related genes, especially in the context of networks. Here we propose a machine learning method to study the functionality of mutually exclusive genes in the networks derived from mutation associations, gene-gene interactions, and graph clustering. These networks have indicated critical biological components in the essential pathways, especially those mutated at low frequency. Studying the network and not just the impact of a single gene significantly increases the statistical power of clinical analysis. The proposed method identified important driver genes with different frequencies. We studied the function and the associated pathways in which the candidate driver genes participate. By introducing lower-frequency genes, we recognized less studied cancer-related pathways. We also proposed a novel clustering method to specify driver modules. We evaluated each driver module with different criteria, including the terms of biological processes and the number of simultaneous mutations in each cancer. Materials and implementations are available at: https://github.com/MahnazHabibi/MutationAnalysis.
|
39
|
Integrated Analysis and Validation of Autophagy-Related Genes and Immune Infiltration in Acute Myocardial Infarction. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:3851551. [PMID: 36238493 PMCID: PMC9553342 DOI: 10.1155/2022/3851551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Revised: 08/16/2022] [Accepted: 09/07/2022] [Indexed: 11/24/2022]
Abstract
Background Acute myocardial infarction (AMI) is one of the most critical conditions of coronary heart disease with many uncertainties regarding reduction of ischemia/reperfusion injury, medical treatment strategies, and other aspects. The inflammatory immune response has a bidirectional regulatory role in AMI and plays an essential role in myocardial remodeling after AMI. The purpose of our research was to explore possible mechanisms of AMI and to analyze the relationship with the immune microenvironment. Methods We first analyzed the expression profile of GSE61144 and HADb to identify differentially expressed autophagy-related genes (DEARGs). Then, we performed GO and functional enrichment analysis and constructed a PPI network with Metascape. A lncRNA-miRNA-mRNA ceRNA network was built, and hub genes were extracted with Cytoscape. After that, we used the CIBERSORT algorithm to estimate the proportion of immunocytes, followed by correlation analysis to find relationships between hub DEARGs and immunocyte subsets. Finally, we verified those hub genes in another dataset and in cellular qPCR experiments. Results Compared with controls, we identified 44 DEARGs and then filtered the genes of MCODE by constructing a PPI network for further analysis. A total of 45 lncRNAs, 24 miRNAs, 19 mRNAs, 162 lncRNA-miRNA pairs, and 37 mRNA-miRNA pairs were used to construct a ceRNA network, and 4 hub DEARGs (BCL2, MAPK1, RAF1, and PRKAR1A) were extracted. We then estimated 5 classes of immunocytes that differed between AMI and controls. According to the results of correlation analysis, these 4 hub DEARGs may have modulatory effects on immune infiltrating cells, notably CD8+ T cells and neutrophils. Finally, the same results were verified in GSE60993 and qPCR experiments. Conclusion Our findings suggest that those hub DEARGs (BCL2, MAPK1, RAF1, and PRKAR1A) and immunocytes probably play roles in the progression of AMI, providing potential diagnostic markers and new perspectives for treatment of AMI.
|
40
|
Comprehensive analysis of pathways in Coronavirus 2019 (COVID-19) using an unsupervised machine learning method. Appl Soft Comput 2022; 128:109510. [PMID: 35992221 PMCID: PMC9384336 DOI: 10.1016/j.asoc.2022.109510] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/12/2021] [Revised: 01/07/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022]
Abstract
The World Health Organization (WHO) introduced “Coronavirus disease 19” or “COVID-19” as a novel coronavirus in March 2020. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) requires the fast discovery of effective treatments to fight this worldwide crisis. Artificial intelligence and bioinformatics analysis pipelines can assist with finding biomarkers, explanations, and cures. Artificial intelligence and machine learning methods provide powerful infrastructures for interpreting and understanding the available data. On the other hand, pathway enrichment analysis, as a dominant tool, could help researchers discover potential key targets present in biological pathways of host cells that are targeted by SARS-CoV-2. In this work, we propose a two-stage machine learning approach for pathway analysis. During the first stage, four informative gene sets that can represent important COVID-19 related pathways are selected. These “representative genes” are associated with the COVID-19 pathology. Then, two distinctive networks are constructed for COVID-19 related signaling and disease pathways. In the second stage, the pathways of each network are ranked by an unsupervised scoring method based on our defined informative features. Finally, we present a comprehensive analysis of the top important pathways in both networks. Materials and implementations are available at: https://github.com/MahnazHabibi/Pathway.
|
41
|
SID-4/NCK-1 is important for dsRNA import in Caenorhabditis elegans. G3 (BETHESDA, MD.) 2022; 12:6722623. [PMID: 36165710 PMCID: PMC9635667 DOI: 10.1093/g3journal/jkac252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2019] [Accepted: 08/25/2022] [Indexed: 12/24/2022]
Abstract
RNA interference is sequence-specific gene silencing triggered by double-stranded RNA. In systemic RNA interference, double-stranded RNA expressed in or introduced into 1 cell is transported to other cells and initiates RNA interference there. Systemic RNA interference is very efficient in Caenorhabditis elegans, and genetic screens for systemic RNA interference-defective mutants have identified RNA transporters (SID-1, SID-2, and SID-5) and a signaling protein (SID-3). Here, we report that SID-4 is nck-1, a C. elegans NCK-like adaptor protein. sid-4 null mutations cause a weak, dose-sensitive, systemic RNA interference defect and can be effectively rescued by SID-4 expression in target tissues only, implying a role in double-stranded RNA import. SID-4 and SID-3 (ACK-1 kinase) homologs interact in mammals and insects, suggesting that they may function in a common signaling pathway; however, sid-3; sid-4 double mutants showed additive resistance to RNA interference, suggesting that these proteins likely interact with other signaling pathways as well. A bioinformatic screen coupled to RNA interference sensitivity tests identified 23 additional signaling components with weak RNA interference-defective phenotypes. These observations suggest that environmental conditions may modulate systemic RNA interference efficacy, and indeed, sid-3 and sid-4 are required for growth temperature effects on systemic RNA interference silencing efficiency.
|
42
|
Protocol for establishing a protein-protein interaction network using tandem affinity purification followed by mass spectrometry in mammalian cells. STAR Protoc 2022; 3:101569. [PMID: 35874475 PMCID: PMC9304681 DOI: 10.1016/j.xpro.2022.101569] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022] Open
Abstract
Identification of protein interactors is fundamental to understanding their functions. Here, we describe a modified protocol for tandem affinity purification coupled with mass spectrometry (TAP/MS), which includes two-step purification. We detail the S-, 2×FLAG-, and Streptavidin-Binding Peptide (SBP)- tandem tags (SFB-tag) system for protein purification. This protocol can be used to identify protein interactors and establish a high-confidence protein-protein interaction network based on computational models. This is particularly useful for identifying bona fide interacting proteins for subsequent functional studies. For complete details on the use and execution of this protocol, please refer to Bian et al. (2021).
|
43
|
Network pharmacology study of Yishen capsules in the treatment of diabetic nephropathy. PLoS One 2022; 17:e0273498. [PMID: 36094934 PMCID: PMC9467320 DOI: 10.1371/journal.pone.0273498] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/27/2021] [Accepted: 08/03/2022] [Indexed: 11/19/2022] Open
Abstract
Objective
In this study, we used network pharmacology to explore the possible therapeutic mechanism underlying the treatment of diabetic nephropathy with Yishen capsules.
Methods
The active chemical constituents of Yishen capsules were acquired using the Traditional Chinese Medicine Systems Pharmacology platform and the Encyclopedia of Traditional Chinese Medicine. Component target proteins were then searched and screened in the BATMAN database. Target proteins were cross-validated using the Comparative Toxicogenomics Database, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses of the target proteins were performed. Then, protein–protein interaction (PPI) analysis was performed using the STRING database. Finally, a pharmacological network was constructed to show the component-target-pathway relationships. Molecular docking was used to analyse the interaction between drug components and target proteins.
Results
In total, 285 active chemical components were found, including 85 intersection targets against DN. In the pharmacological network, 5 key herbs (A. membranaceus, A. sinensis, E. ferox, A. orientale, and R. rosea) and their corresponding 12 key components (beta-sitosterol, beta-carotene, stigmasterol, alisol B, mairin, quercetin, caffeic acid, 1-monolinolein, kaempferol, jaranol, formononetin, and calycosin) were screened. Furthermore, the 12 key components were related to 24 target protein nodes (e.g., AGT, AKT1, AKT2, BCL2, NFKB1, and SIRT1) and enriched in 24 pathway nodes (such as the NF-kappa B, AGE-RAGE, toll-like receptor, and relaxin signaling pathways). Molecular docking revealed that hydrogen bonds were formed between drug components and target proteins.
Conclusion
In conclusion, the active constituents of Yishen capsules modulate targets or signaling pathways in DN pathogenesis.
|
44
|
Tissue Specificity Based Isoform Function Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3048-3059. [PMID: 34185647 DOI: 10.1109/tcbb.2021.3093167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Alternative splicing enables a gene to be spliced into different isoforms and hence protein variants. Identifying the individual functions of these isoforms helps decipher the functional diversity of proteins. Although much effort has been made toward automatic gene function prediction, little has been directed toward computational isoform function prediction, mainly due to the unavailable (or scanty) functional annotations of isoforms. Existing efforts directly combine multiple RNA-seq datasets without accounting for the important tissue specificity of alternative splicing. To bridge this gap, we introduce a novel approach called TS-Isofun to predict the functions of isoforms by integrating multiple functional association networks with respect to tissue specificity. TS-Isofun first constructs tissue-specific isoform functional association networks from multiple RNA-seq datasets, tissue by tissue. Next, TS-Isofun assigns weights to these networks and models the tissue specificity by selectively integrating them with adaptive weights. It then introduces a joint matrix factorization-based data fusion model to leverage the integrated network, gene-level data and functional annotations of genes to infer the functions of isoforms. To achieve coherent weight assignment and isoform function prediction, TS-Isofun jointly optimizes the weights of individual networks and the isoform function prediction in a unified objective function. Experimental results show that TS-Isofun significantly outperforms state-of-the-art methods and that accounting for tissue specificity contributes to more accurate isoform function prediction.
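The weighted network integration step can be illustrated with a small sketch. It assumes fixed, user-supplied weights and a plain weighted average of adjacency matrices; it is not the adaptive, jointly optimized weighting the abstract describes, and the function name, tissue names, and values are hypothetical.

```python
def integrate_networks(networks, weights):
    """Combine same-sized adjacency matrices into one weighted average.

    networks: list of square matrices (lists of lists of floats).
    weights:  one non-negative weight per network; normalized to sum to 1.
    """
    if len(networks) != len(weights):
        raise ValueError("need exactly one weight per network")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(networks[0])
    combined = [[0.0] * n for _ in range(n)]
    for net, w in zip(networks, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += (w / total) * net[i][j]
    return combined


# Hypothetical two-isoform association networks for two tissues.
heart = [[0.0, 0.8], [0.8, 0.0]]
liver = [[0.0, 0.2], [0.2, 0.0]]
combined = integrate_networks([heart, liver], [3.0, 1.0])
```

With weights 3:1, the heart network contributes three quarters of each combined edge weight; in a learned scheme these weights would be fitted rather than fixed.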
|
45
|
Integration of probabilistic functional networks without an external Gold Standard. BMC Bioinformatics 2022; 23:302. [PMID: 35879662 PMCID: PMC9316706 DOI: 10.1186/s12859-022-04834-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Accepted: 07/11/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Probabilistic functional integrated networks (PFINs) are designed to aid our understanding of cellular biology and can be used to generate testable hypotheses about protein function. PFINs are generally created by scoring the quality of interaction datasets against a Gold Standard dataset, usually chosen from a separate high-quality data source, prior to their integration. Use of an external Gold Standard has several drawbacks, including data redundancy, data loss and the need for identifier mapping, which can complicate the network build and affect PFIN performance. Additionally, there typically are no Gold Standard data for non-model organisms. RESULTS We describe the development of an integration technique, ssNet, that scores and integrates both high-throughput and low-throughput data from a single source database in a consistent manner without the need for an external Gold Standard dataset. Using data from Saccharomyces cerevisiae we show that ssNet is easier and faster, overcoming the challenges of data redundancy, Gold Standard bias and ID mapping. In addition, ssNet results in less loss of data and produces a more complete network. CONCLUSIONS The ssNet method allows PFINs to be built successfully from a single database, while producing comparable network performance to networks scored using an external Gold Standard source and with reduced data loss.
|
46
|
Recent advances in proteomics and metabolomics in plants. MOLECULAR HORTICULTURE 2022; 2:17. [PMID: 37789425 PMCID: PMC10514990 DOI: 10.1186/s43897-022-00038-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/24/2022] [Accepted: 06/20/2022] [Indexed: 10/05/2023]
Abstract
Over the past decade, systems biology and plant-omics have increasingly become the mainstream in plant biology research. New developments in mass spectrometry and bioinformatics tools, and methodological schema to integrate multi-omics data have leveraged recent advances in proteomics and metabolomics. This progress is driving a rapid evolution in the field of plant research, greatly facilitating our understanding of the mechanistic aspects of plant metabolism and the interactions of plants with their external environment. Here, we review the recent progress in MS-based proteomics and metabolomics tools and workflows with a special focus on their applications to plant biology research using several case studies related to mechanistic understanding of stress response, gene/protein function characterization, metabolic and signaling pathways exploration, and natural product discovery. We also present a projection concerning future perspectives in MS-based proteomics and metabolomics development including their applications to and challenges for systems biology. This review is intended to provide readers with an overview of how advanced MS technology and the integrated application of proteomics and metabolomics can be used to advance plant systems biology research.
|
47
|
Abstract
Computational drug repositioning aims to identify potential applications of existing drugs for the treatment of diseases for which they were not designed. This approach can considerably accelerate the traditional drug discovery process by decreasing the required time and costs of drug development. Tensor decomposition enables us to integrate multiple drug- and disease-related data to boost the performance of prediction. In this study, a nonnegative tensor decomposition for drug repositioning, NTD-DR, is proposed. In order to capture the hidden information in drug-target, drug-disease, and target-disease networks, NTD-DR uses these pairwise associations to construct a three-dimensional tensor representing drug-target-disease triplet associations and integrates them with similarity information of drugs, targets, and diseases to make a prediction. We compare NTD-DR with recent state-of-the-art methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision-recall curve (AUPR) and find that our method outperforms competing methods. Moreover, case studies with five diseases also confirm the reliability of predictions made by NTD-DR. Our proposed method identifies more known associations among the top 50 predictions than other methods. In addition, novel associations identified by NTD-DR are validated by literature analyses.
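The two evaluation metrics named in this abstract can be sketched in plain Python. The sketch assumes binary labels and untied scores, and it uses average precision as a common step-wise estimate of AUPR; the function names and toy data are illustrative and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # A tied pair counts as half a correct ranking.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


# Hypothetical predictions: 1 marks a known drug-disease association.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
aupr = average_precision(labels, scores)
```

AUPR is usually the more informative of the two when known associations are rare, because it is not inflated by the large number of easily ranked negatives.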
Collapse
|
48
|
Discovery and identification of genes involved in DNA damage repair in yeast. Gene 2022; 831:146549. [PMID: 35569766 DOI: 10.1016/j.gene.2022.146549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2021] [Revised: 02/16/2022] [Accepted: 05/06/2022] [Indexed: 11/04/2022]
Abstract
DNA repair defects are common in tumour cells and can lead to misrepair of double-strand breaks (DSBs), posing a significant challenge to cellular integrity. The overall mechanisms of DSB have been known for decades. However, the list of the genes that affect the efficiency of DSB repair continues to grow. Additional factors that play a role in DSB repair pathways have yet to be identified. In this study, we present a computational approach to identify novel gene functions that are involved in DNA damage repair in Saccharomyces cerevisiae. Among the primary candidates, GAL7, YMR130W, and YHI9 were selected for further analysis since they had not previously been identified as being active in DNA repair pathways. Originally, GAL7 was linked to galactose metabolism. YHI9 and YMR130W encode proteins of unknown functions. Laboratory testing of deletion strains gal7Δ, ymr130wΔ, and yhi9Δ implicated all 3 genes in Homologous Recombination (HR) and/or Non-Homologous End Joining (NHEJ) repair pathways, and enhanced sensitivity to DNA damage-inducing drugs suggested involvement in the broader DNA damage repair machinery. A subsequent genetic interaction analysis revealed interconnections of these three genes, most strikingly through SIR2, SIR3 and SIR4 that are involved in chromatin regulation and DNA damage repair network.
|
49
|
TritiKBdb: A Functional Annotation Resource for Deciphering the Complete Interaction Networks in Wheat-Karnal Bunt Pathosystem. Int J Mol Sci 2022; 23:ijms23137455. [PMID: 35806459 PMCID: PMC9267065 DOI: 10.3390/ijms23137455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 06/30/2022] [Accepted: 06/30/2022] [Indexed: 02/01/2023] Open
Abstract
The study of molecular interactions, especially the inter-species protein-protein interactions, is crucial for understanding the disease infection mechanism in plants. These interactions play an important role in disease infection and host immune responses against pathogen attack. Among various critical fungal diseases, the incidences of Karnal bunt (Tilletia indica) around the world have hindered the export of the crops such as wheat from infected regions, thus causing substantial economic losses. Due to sparse information on T. indica, limited insight is available with regard to gaining in-depth knowledge of the interaction mechanisms between the host and pathogen proteins during the disease infection process. Here, we report the development of a comprehensive database and webserver, TritiKBdb, that implements various tools to study the protein-protein interactions in the Triticum species-Tilletia indica pathosystem. The novel ‘interactomics’ tool allows the user to visualize/compare the networks of the predicted interactions in an enriched manner. TritiKBdb is a user-friendly database that provides functional annotations such as subcellular localization, available domains, KEGG pathways, and GO terms of the host and pathogen proteins. Additionally, the information about the host and pathogen proteins that serve as transcription factors and effectors, respectively, is also made available. We believe that TritiKBdb will serve as a beneficial resource for the research community, and aid the community in better understanding the infection mechanisms of Karnal bunt and its interactions with wheat. The database is freely available for public use at http://bioinfo.usu.edu/tritikbdb/.
|
50
|
Acute Myeloid Leukemia: New Multiomics Molecular Signatures and Implications for Systems Medicine Diagnostics and Therapeutics Innovation. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2022; 26:392-403. [PMID: 35763314 DOI: 10.1089/omi.2022.0051] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/15/2023]
Abstract
Acute myeloid leukemia (AML) is a common, complex, and multifactorial malignancy of the hematopoietic system. AML diagnosis and treatment outcomes display marked heterogeneity and patient-to-patient variations. To date, AML-related biomarker discovery research has employed single omics inquiries. Multiomics analyses that reconcile and integrate the data streams from multiple levels of the cellular hierarchy, from genes to proteins to metabolites, offer much promise for innovation in AML diagnostics and therapeutics. We report, in this study, a systems medicine and multiomics approach to integrate the AML transcriptome data and reporter biomolecules at the RNA, protein, and metabolite levels using genome-scale biological networks. We utilized two independent transcriptome datasets (GSE5122, GSE8970) in the Gene Expression Omnibus database. We identified new multiomics molecular signatures of relevance to AML: miRNAs (e.g., miR-484 and miR-519d-3p), receptors (ACVR1 and PTPRG), transcription factors (PRDM14 and GATA3), and metabolites (in particular, amino acid derivatives). The differential expression profiles of all reporter biomolecules were cross-validated in independent RNA-Seq and miRNA-Seq datasets. Notably, we found that PTPRG holds important prognostic potential as evaluated by Kaplan-Meier survival analyses. The multiomics relationships unraveled in this analysis point toward the genomic pathogenesis of AML. These multiomics molecular leads warrant further research and development as potential diagnostic and therapeutic targets.
|