1
Martins Fernandes Pereira K, de Carvalho AC, Ventura Fernandes BH, Dos Santos Grecco S, Rodrigues E, da Silva Fernandes MJ, de Carvalho LRS, Nakamura MU, Guo S, Hernández RB. Systems toxicology studies reveal important insights about chronic exposure of zebrafish to Kalanchoe pinnata (Lam.) Pers leaf - KPL: Implications for medicinal use. Journal of Ethnopharmacology 2025; 338:119044. [PMID: 39532221 DOI: 10.1016/j.jep.2024.119044] [Received: 07/18/2024] [Revised: 11/04/2024] [Accepted: 11/05/2024] [Indexed: 11/16/2024]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE The prevalence of depression and anxiety is high during pregnancy. Several traditional medicine systems use the plant Kalanchoe pinnata (Lam.) Pers. (KP) to treat emotional disorders and inflammation and to prevent preterm delivery, but the effects on exposed offspring and the mechanisms behind these effects remain unknown. AIM OF THE STUDY In this work, integrated systems toxicology (INSYSTA) was used to investigate traditional toxicological outcomes and behavioral performance in zebrafish larvae after chronic exposure (from 2 to 96 hpf) to K. pinnata leaf extracts (KPL). MATERIALS AND METHODS We investigated light/dark preference, thigmotaxis and locomotor activity parameters, followed by gene expression and systems biology approaches to discover the mechanisms behind the toxicological endpoints and phenomics. RESULTS Embryos exposed to 700 mg/L KPL showed retarded development, including delayed hatching. Larvae exposed to 500 mg/L KPL showed decreased dark avoidance and increased locomotor activity, while those exposed to 700 mg/L showed the opposite effects. INSYSTA revealed sixteen genes down-regulated after chronic KPL treatment; they are involved in folding, sorting, and degradation of proteins as well as DNA replication and repair mechanisms. This may result in deregulation of organismal functions, including those of the immune and endocrine systems. These physiological changes appear to make embryos more sensitive to infections and disorders that resemble 47 human diseases. CONCLUSION These findings suggest that the medicinal use of plant extracts requires strict toxicological, pharmacological, and medical supervision. At the same time, they suggest a polypharmacological pathway for KPL extract that goes beyond preventing premature delivery and controlling anxiety.
Affiliation(s)
- Kássia Martins Fernandes Pereira
- Department of Neurology and Neurosurgery, Escola Paulista de Medicina, Universidade Federal de São Paulo, 04021-001, São Paulo, SP, Brazil.
- Bianca H Ventura Fernandes
- Technical Directorate of Support for Teaching, Research and Innovation at the Faculty of Medicine of the University of São Paulo, São Paulo, SP, Brazil.
- Simone Dos Santos Grecco
- Department of Chemistry, Universidade Federal de São Paulo, 09972-270, Diadema, SP, Brazil; Triplet Biotechnology Solutions, São Paulo, Brazil.
- Eliana Rodrigues
- Center for Ethnobotanical and Ethnopharmacological Studies, Department of Environmental Sciences, Universidade Federal de São Paulo, São Paulo, SP, Brazil.
- Maria José da Silva Fernandes
- Department of Neurology and Neurosurgery, Escola Paulista de Medicina, Universidade Federal de São Paulo, 04021-001, São Paulo, SP, Brazil.
- Luciani Renata Silveira de Carvalho
- Technical Directorate of Support for Teaching, Research and Innovation at the Faculty of Medicine of the University of São Paulo, São Paulo, SP, Brazil; Discipline of Endocrinology, Laboratory of Hormones and Molecular Genetics-LIM42, Hospital das Clínicas of the University of São Paulo, São Paulo, SP, Brazil.
- Mary Uchiyama Nakamura
- Department of Obstetrics, Universidade Federal de São Paulo, São Paulo, SP, 04021-001, Brazil.
- Su Guo
- Department of Bioengineering and Therapeutic Sciences, Programs in Biological Sciences and Human Genetics, University of California, San Francisco, CA, 94158-2811, USA.
- Raúl Bonne Hernández
- Laboratory of Bioinorganic and Environmental Toxicology - LABITA, Department of Exact and Earth Sciences, Universidade Federal de São Paulo, 09972-270, Diadema, SP, Brazil.
2
Gao Y, Lai J, Feng C, Li L, Zu Q, Li J, Du D. Transcriptional Analysis of Tissues in Tartary Buckwheat Seedlings Under IAA Stimulation. Genes (Basel) 2024; 16:30. [PMID: 39858577 PMCID: PMC11764492 DOI: 10.3390/genes16010030] [Received: 10/30/2024] [Revised: 12/04/2024] [Accepted: 12/13/2024] [Indexed: 01/27/2025] Open
Abstract
Background: Fagopyrum tataricum, commonly referred to as tartary buckwheat, is a cultivated medicinal and edible crop renowned for its economic and nutritional significance. Following the publication of the buckwheat genome, research on its functional genomics across various growth environments has gradually begun. Auxin plays a crucial role in many life processes. Analyzing the expression changes in tartary buckwheat after IAA treatment is of great significance for understanding its growth and environmental adaptability. Methods: This study investigated the changes in auxin response during the buckwheat seedling stage through high-throughput transcriptome sequencing and the identification and annotation of differentially expressed genes (DEGs) across three treatment stages. Results: After IAA treatment, 3355 DEGs were identified in leaves and 3974 DEGs in roots. These DEGs are significantly enriched in plant hormone signaling, MAPK signaling pathways, phenylpropanoid biosynthesis, and flavonoid biosynthesis pathways. This result suggests a notable correlation between these tissues in buckwheat and their response to IAA, albeit with significant differences in response patterns. Additionally, the identification of tissue-specific expression genes in leaves and other tissues revealed distinct tissue variations. Conclusions: Following IAA treatment, an increase in tissue-specific expression genes was observed, indicating that IAA significantly regulates the growth of buckwheat tissues. This study also validated certain genes, particularly those in plant hormone signaling pathways, providing a foundational dataset for further analysis of buckwheat growth and tissue development.
Affiliation(s)
- Yingying Gao
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan 430023, China
- Jialing Lai
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan 430023, China
- Chenglu Feng
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan 430023, China
- Luyang Li
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan 430023, China
- Qihang Zu
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan 430023, China
- Juan Li
- College of Nursing and Health Management & College of Life Science and Chemistry, Wuhan Donghu University, Wuhan 430212, China
- Innovation Institute for Biomedical Material, Wuhan Donghu University, Wuhan 430212, China
- Dengxiang Du
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan 430023, China
3
Asghar MA, Tang S, Wong LP, Yang P, Zhao Q. "Infectious uveitis: a comprehensive systematic review of emerging trends and molecular pathogenesis using network analysis". J Ophthalmic Inflamm Infect 2024; 14:60. [PMID: 39565496 PMCID: PMC11579267 DOI: 10.1186/s12348-024-00444-8] [Received: 06/20/2024] [Accepted: 11/14/2024] [Indexed: 11/21/2024] Open
Abstract
BACKGROUND Infectious uveitis is a significant cause of visual impairment worldwide, caused by diverse pathogens such as viruses, bacteria, fungi, and parasites. Understanding its prevalence, etiology, pathogenesis, molecular mechanism, and clinical manifestations is essential for effective diagnosis and management. METHODS A systematic literature search was conducted using PubMed, Google Scholar, Web of Science, Scopus, and Embase, focusing on studies published in the last fifteen years from 2009 to 2023. Keywords included "uveitis," "infectious uveitis," "viral uveitis," and others. Rigorous inclusion and exclusion criteria were applied, and data were synthesized thematically. Gene symbols related to infectious uveitis were analyzed using protein-protein interaction (PPI) networks and pathway analyses to uncover molecular mechanisms associated with infectious uveitis. RESULTS The search from different databases yielded 97 eligible studies. The review identified a significant rise in publications on infectious uveitis, particularly viral uveitis, over the past fifteen years. Infectious uveitis prevalence varies geographically, with high rates in developing regions due to systemic infections and limited diagnostic resources. Etiologies include viruses (39%), bacteria (17%), and other pathogens, substantially impacting adults aged 20-50 years. Pathogenesis involves complex interactions between infectious agents and the ocular immune response, with key roles for cytokines and chemokines. The PPI network highlighted IFNG, IL6, TNF, and CD4 as central nodes. Enriched pathways included cytokine-cytokine receptor interaction and JAK-STAT signaling. Clinical manifestations range from anterior to posterior uveitis, with systemic symptoms often accompanying ocular signs. Diagnostic strategies encompass clinical evaluation, laboratory tests, and imaging, while management involves targeted antimicrobial therapy and anti-inflammatory agents. 
CONCLUSION This review underscores the complexity of infectious uveitis, driven by diverse pathogens and influenced by various geographical and systemic factors. Molecular insights from PPI networks and pathway analyses provide a deeper understanding of its pathogenesis. Effective management requires comprehensive diagnostic approaches and targeted therapeutic strategies.
Affiliation(s)
- Shixin Tang
- College of Public Health, Chongqing Medical University, Chongqing, PR China
- Li Ping Wong
- Department of Social and Preventive Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, 50603, Malaysia
- Peizeng Yang
- Chongqing Key Lab of Ophthalmology, Chongqing Branch of National Clinical Research Center for Ocular Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing Eye Institute, Chongqing, China
- Qinjian Zhao
- College of Pharmacy, Chongqing Medical University, Chongqing, PR China.
4
Xiao-Yan G, Qiong-Yu Z, Si-Yuan T. Elucidating the material basis and potential mechanisms of Daqinglong Decoction acting on influenza by UPLC-Q-TOF/MS and network pharmacology. J Biomol Struct Dyn 2024; 42:9587-9601. [PMID: 37962031 DOI: 10.1080/07391102.2023.2275173] [Received: 05/02/2023] [Accepted: 08/20/2023] [Indexed: 11/15/2023]
Abstract
Daqinglong Decoction (DQLD), a traditional Chinese medicine (TCM) prescription first recorded in Shang han lun (the treatise on febrile diseases), has been used for hundreds of years in the clinical treatment of influenza. However, the chemical composition and therapeutic mechanism of this prescription are unclear. UPLC-Q-TOF/MS was employed to analyze the chemical compounds in both methanol and boiling water extracts of DQLD. The compounds were then screened, characterized, and filtered using the TCMSP, TCMIP, TCM-ID and SymMap databases, with a focus on their oral bioavailability and drug-likeness values. The resulting data were analyzed and optimized using the R language platform, Autodock and Gromacs software to identify biological processes and pathways. A total of 121 compounds were identified, of which 5 showed good binding ability to influenza virus targets (IL1B, IL10, CASP3, STAT3, TNF, and others). The active ingredient-target-influenza virus pathway was constructed using a network drug target analysis model prediction of DQLD, which was mainly enriched in Human cytomegalovirus infection, PI3K-Akt, HIF-1, and other signaling pathways through IL1B, IL10 and other targets. These pathways are highly correlated with the body's inflammatory response, improved immunity, and anti-influenza virus effects. In summary, this study demonstrated that DQLD's active ingredients can effectively bind to influenza virus targets and exert anti-influenza virus effects by reducing inflammation and improving immunity through the Human cytomegalovirus infection, PI3K-Akt and HIF-1 signaling pathways. These findings offer important insights into the potential mechanisms of action of DQLD and its potential use as a TCM against influenza and other viral infections. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Gong Xiao-Yan
- School of Nursing, Yongzhou Vocational Technical College, Yongzhou, China
- Zhang Qiong-Yu
- School of Fundamental Sciences, Yongzhou Vocational Technical College, Yongzhou, China
- Tang Si-Yuan
- School of Nursing, Central South University, Changsha, China
5
Huang F, Gao Q, Zhou X, Guo W, Feng K, Zhu L, Huang T, Cai YD. Prediction of Solubility of Proteins in Escherichia coli Based on Functional and Structural Features Using Machine Learning Methods. Protein J 2024; 43:983-996. [PMID: 39243320 DOI: 10.1007/s10930-024-10230-z] [Accepted: 08/21/2024] [Indexed: 09/09/2024]
Abstract
Protein solubility is a critical parameter that determines the stability, activity, and functionality of proteins, with broad and far-reaching implications in biotechnology and biochemistry. Accurate prediction and control of protein solubility are essential for successful protein expression and purification in research and industrial settings. This study gathered information on soluble and insoluble proteins. The proteins were mapped to STRING and characterized by functional and structural features. All functional/structural features were integrated to create a 5768-dimensional binary vector to encode each protein. Seven feature-ranking algorithms were employed to analyze the functional/structural features, yielding seven feature lists. Each list was subjected to incremental feature selection with four classification algorithms in turn, to build effective classification models and identify the functional/structural features most important for classification. Some essential functional/structural features used to differentiate between soluble and insoluble proteins were identified, including GO:0009987 (intercellular communication) and GO:0022613 (ribonucleoprotein complex biogenesis). The best classification model, which used a support vector machine as the classification algorithm and 295 optimized functional/structural features, achieved an F1 score of 0.825 and can be a powerful tool to differentiate soluble proteins from insoluble proteins.
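The incremental feature selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `f1_score` and `incremental_feature_selection`, the `step` parameter, and the `evaluate` callback (standing in for training and scoring a classifier such as an SVM on a feature subset) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def incremental_feature_selection(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
    step: int = 1,
) -> Tuple[List[str], float]:
    """Score the top-k features of a ranked list for growing k; keep the best subset.

    `evaluate` is an assumed callback that trains a classifier on the given
    features and returns a score such as a cross-validated F1.
    """
    best_score = float("-inf")
    best_subset: List[str] = []
    for k in range(step, len(ranked) + 1, step):
        subset = list(ranked[:k])
        score = evaluate(subset)
        # Strict comparison keeps the smallest subset among equal scores.
        if score > best_score:
            best_score, best_subset = score, subset
    return best_subset, best_score
```

In a setting like the one described, this loop would presumably be run once per feature list and per classifier, and the best-scoring combination retained.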
Affiliation(s)
- Feiming Huang
- School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China
- Qian Gao
- Department of Pharmacy, Shanghai Children's Medical Center, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
- XianChao Zhou
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Institutes for Biological Sciences (SIBS), Shanghai Jiao Tong University School of Medicine (SJTUSM), Chinese Academy of Sciences (CAS), Shanghai, 200030, China
- KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, 510507, China
- Lin Zhu
- School of Information Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Tao Huang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Bio-Med Big Data Center, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.
- Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China.
6
Borah K, Das HS, Seth S, Mallick K, Rahaman Z, Mallik S. A review on advancements in feature selection and feature extraction for high-dimensional NGS data analysis. Funct Integr Genomics 2024; 24:139. [PMID: 39158621 DOI: 10.1007/s10142-024-01415-x] [Received: 06/01/2024] [Revised: 07/30/2024] [Accepted: 08/01/2024] [Indexed: 08/20/2024]
Abstract
Recent advancements in biomedical technologies and the proliferation of high-dimensional Next Generation Sequencing (NGS) datasets have led to significant growth in the bulk and density of data. High-dimensional NGS data, characterized by a large number of genomics, transcriptomics, proteomics, and metagenomics features relative to the number of biological samples, presents significant challenges for data analysis, including increased computational burden, potential overfitting, and difficulty in interpreting results. Feature selection and feature extraction are two pivotal techniques employed to address these challenges by reducing the dimensionality of the data, thereby enhancing model performance, interpretability, and computational efficiency. Feature selection and feature extraction can be categorized into statistical and machine learning methods. The present study conducts a comprehensive and comparative review of various statistical, machine learning, and deep learning-based feature selection and extraction techniques specifically tailored for the interpretation of human NGS and microarray data. A thorough literature search was performed to gather information on these techniques, focusing on array-based and NGS data analysis. Various techniques, including deep learning architectures, machine learning algorithms, and statistical methods, that have been explored for microarray, bulk RNA-Seq, and single-cell RNA-Seq (scRNA-Seq) datasets are surveyed here. The study provides an overview of these techniques, highlighting their applications, advantages, and limitations in the context of high-dimensional NGS data.
This review gives readers better insight into applying feature selection and feature extraction techniques to enhance the performance of predictive models, uncover underlying biological patterns, and gain deeper insights into massive and complex NGS and microarray data.
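As one small illustration of the statistical (filter-type) end of feature selection mentioned in this abstract, the sketch below keeps the columns of a samples-by-features matrix with the highest variance. It is an assumed example, not a method taken from the review: the helper names `top_variance_features` and `select_columns` are hypothetical, and variance ranking is only one of many possible filter criteria.

```python
from statistics import pvariance
from typing import List, Sequence


def top_variance_features(rows: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k columns with the largest population variance."""
    n_features = len(rows[0])
    variances = [pvariance([row[j] for row in rows]) for j in range(n_features)]
    # Stable sort: columns with equal variance keep their original order.
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return ranked[:k]


def select_columns(rows: Sequence[Sequence[float]], keep: Sequence[int]) -> List[List[float]]:
    """Project each sample onto the selected feature indices."""
    return [[row[j] for j in keep] for row in rows]
```

Feature selection of this kind keeps original, interpretable columns (for example, individual genes), whereas feature extraction replaces them with derived combinations.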
Affiliation(s)
- Kasmika Borah
- Department of Computer Science and Information Technology, Cotton University, Panbazar, Guwahati, 781001, Assam, India
- Himanish Shekhar Das
- Department of Computer Science and Information Technology, Cotton University, Panbazar, Guwahati, 781001, Assam, India.
- Soumita Seth
- Department of Computer Science and Engineering, Future Institute of Engineering and Management, Narendrapur, Kolkata, 700150, West Bengal, India
- Koushik Mallick
- Department of Computer Science and Engineering, RCC Institute of Information Technology, Canal S Rd, Beleghata, Kolkata, 700015, West Bengal, India
- Saurav Mallik
- Department of Environmental Health, Harvard T H Chan School of Public Health, Boston, MA, 02115, USA.
- Department of Pharmacology & Toxicology, University of Arizona, Tucson, AZ, 85721, USA.
7
Zhang YH, Huang F, Li J, Shen W, Chen L, Feng K, Huang T, Cai YD. Identification of Protein-Protein Interaction Associated Functions Based on Gene Ontology. Protein J 2024; 43:477-486. [PMID: 38436837 DOI: 10.1007/s10930-024-10180-6] [Accepted: 01/07/2024] [Indexed: 03/05/2024]
Abstract
Protein-protein interactions (PPIs) involve physical or functional contact between two or more proteins. Generally, proteins that can interact with each other have special relationships. Some previous studies have reported that gene ontology (GO) terms are related to the determination of PPIs, suggesting special patterns in the GO terms of proteins in PPIs. In this study, we explored the special GO term patterns of human PPIs, trying to uncover the underlying functional mechanism of PPIs. Experimentally validated human PPIs were retrieved from the STRING database and termed positive samples. Additionally, we randomly paired proteins occurring in positive samples, yielding a large number of negative samples. For each GO term pair, a simple calculation counted the number of positive samples in which the two proteins were annotated by the two GO terms of the pair, respectively. The corresponding number for negative samples was also counted and further adjusted because of the large gap between the numbers of positive and negative samples. The difference between these two numbers, and its ratio relative to the number for positive samples, were then calculated. This ratio provided a precise evaluation of the occurrence of GO term pairs in positive versus negative samples, indicating the latent GO term patterns for PPIs. Our analysis unveiled several nuclear biological processes, including gene transcription, cell proliferation, and nutrient metabolism, as key biological functions. Interactions between major proliferative or metabolic GO terms consistently correspond with significantly reported PPIs in recent literature.
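The counting scheme outlined in this abstract might look roughly like the sketch below. It is an interpretation under stated assumptions, not the authors' code: the abstract does not specify how the negative count is adjusted, so scaling it by the ratio of positive to negative sample counts is an assumption, and the names `count_go_pairs` and `relative_ratio` are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

GoPair = Tuple[str, str]


def count_go_pairs(pairs: Iterable[Tuple[str, str]],
                   annotations: Dict[str, Set[str]]) -> Counter:
    """For each unordered GO term pair, count the protein pairs that exhibit it."""
    counts: Counter = Counter()
    for p, q in pairs:
        seen: Set[GoPair] = set()
        for a, b in product(annotations.get(p, ()), annotations.get(q, ())):
            seen.add(tuple(sorted((a, b))))
        counts.update(seen)  # each protein pair contributes at most once per GO pair
    return counts


def relative_ratio(positive: List[Tuple[str, str]],
                   negative: List[Tuple[str, str]],
                   annotations: Dict[str, Set[str]]) -> Dict[GoPair, float]:
    """(positive count - adjusted negative count) / positive count, per GO term pair."""
    pos = count_go_pairs(positive, annotations)
    neg = count_go_pairs(negative, annotations)
    scale = len(positive) / len(negative)  # assumed adjustment for sample-size imbalance
    return {gp: (n - neg.get(gp, 0) * scale) / n for gp, n in pos.items()}
```

Under this reading, a ratio near 1 marks a GO term pair that occurs among interacting proteins far more often than among randomly paired ones.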
Affiliation(s)
- Yu-Hang Zhang
- School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- FeiMing Huang
- School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China
- JiaBo Li
- School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, People's Republic of China
- WenFeng Shen
- School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, 201209, People's Republic of China
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, People's Republic of China
- KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, 510507, People's Republic of China
- Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, People's Republic of China.
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, People's Republic of China.
- Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China.
8
Sousa RT, Silva S, Pesquita C. Explaining protein-protein interactions with knowledge graph-based semantic similarity. Comput Biol Med 2024; 170:108076. [PMID: 38308873 DOI: 10.1016/j.compbiomed.2024.108076] [Received: 09/25/2023] [Revised: 12/11/2023] [Accepted: 01/27/2024] [Indexed: 02/05/2024]
Abstract
The application of artificial intelligence and machine learning methods to biomedical problems, such as protein-protein interaction prediction, has gained significant traction in recent decades. However, explainability is a key aspect of using machine learning as a tool for scientific discovery. Explainable artificial intelligence approaches help clarify algorithmic mechanisms and identify potential bias in the data. Given the complexity of the biomedical domain, explanations should be grounded in domain knowledge, which can be achieved by using ontologies and knowledge graphs. These knowledge graphs express knowledge about a domain by capturing different perspectives of the representation of real-world entities. However, the most popular way to explore knowledge graphs with machine learning is through embeddings, which are not explainable. As an alternative, knowledge graph-based semantic similarity offers the advantage of being explainable. Additionally, similarity can be computed to capture different semantic aspects within the knowledge graph, increasing the explainability of predictive approaches. We propose a novel method to generate explainable vector representations, KGsim2vec, that uses aspect-oriented semantic similarity features to represent pairs of entities in a knowledge graph. Our approach employs a set of machine learning models, including decision trees, genetic programming, random forest and eXtreme gradient boosting, to predict relations between entities. The experiments reveal that considering multiple semantic aspects when representing the similarity between two entities improves explainability and predictive performance. KGsim2vec performs better than black-box methods based on knowledge graph embeddings or graph neural networks. Moreover, KGsim2vec produces global models that can capture biological phenomena and elucidate data biases.
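The idea of representing an entity pair by one similarity value per semantic aspect can be illustrated with the minimal sketch below. It does not reproduce KGsim2vec: Jaccard overlap of annotation sets is swapped in as a simple stand-in for a knowledge graph-based semantic similarity measure, and the aspect names, the `ASPECTS` tuple, and the `pair_features` helper are assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List, Set

# Assumed aspects: the three Gene Ontology sub-ontologies.
ASPECTS = ("biological_process", "molecular_function", "cellular_component")


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Set overlap |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pair_features(e1: Dict[str, Set[str]], e2: Dict[str, Set[str]]) -> List[float]:
    """One similarity value per aspect, in the fixed order given by ASPECTS."""
    return [jaccard(e1.get(asp, ()), e2.get(asp, ())) for asp in ASPECTS]
```

Because each position in the vector corresponds to a named aspect, a model such as a decision tree trained on these vectors can be read in terms of which aspect drove a prediction.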
Affiliation(s)
- Rita T Sousa
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal.
- Sara Silva
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
- Catia Pesquita
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
9
Yang J, Wu X, You J. Unveiling the potential of HSPA4: a comprehensive pan-cancer analysis of HSPA4 in diagnosis, prognosis, and immunotherapy. Aging (Albany NY) 2024; 16:2517-2541. [PMID: 38305786 PMCID: PMC10911360 DOI: 10.18632/aging.205496] [Received: 10/02/2023] [Accepted: 01/03/2024] [Indexed: 02/03/2024]
Abstract
With the global rise in cancer incidence and mortality rates, research on the topic has become increasingly urgent. Among the significant players in this field are heat shock proteins (HSPs), particularly HSPA4 from the HSP70 subfamily, which has recently garnered considerable interest for its role in cancer progression. However, despite numerous studies on HSPA4 in specific cancer types, a comprehensive analysis across all cancer types is lacking. This study employs various bioinformatics techniques to delve into the role of HSPA4 in pan-cancer. Our objective is to assess its potential in clinical diagnosis, prognosis, and as a future molecular target for therapy. The research findings reveal significant differences in HSPA4 expression across different cancer types, suggesting its diagnostic value and close association with cancer staging and patient survival rates. Furthermore, genetic variations and methylation status of HSPA4 play critical roles in tumorigenesis. Lastly, the interaction of HSPA4 with immune cells is linked to the tumor microenvironment (TME) and immunotherapy. In summary, HSPA4 emerges as a promising cancer biomarker and a vital member of the HSPs family, holding potential applications in diagnosis, prognosis, and immunotherapy.
Affiliation(s)
- Junhao Yang
- Department of Surgical Oncology and General Surgery, The First Hospital of China Medical University, Shenyang 110001, China
- Xiaoxiao Wu
- Department of Rheumatology, Second Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310000, China
- Jianhong You
- Department of Ultrasound, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361000, China
10
Liu Y, Chen Y, Yue X, Liu Y, Ning J, Li L, Wu J, Luo X, Zhang S. Proteomics and Metabolomics Analysis Reveal the Regulation Mechanism of Linoleate Isomerase Activity and Function in Propionibacterium acnes. ACS Omega 2024; 9:1643-1655. [PMID: 38222669 PMCID: PMC10785318 DOI: 10.1021/acsomega.3c08243] [Received: 10/20/2023] [Revised: 11/26/2023] [Accepted: 11/29/2023] [Indexed: 01/16/2024]
Abstract
Conjugated linoleic acid (CLA) holds significant application prospects due to its anticancer, anti-atherosclerosis, lipid-lowering, weight-loss, and growth-promoting functions. The key to its efficient production lies in optimizing the biocatalytic performance of linoleic acid isomerase (LAI). Here, we constructed a Propionibacterium acnes mutant library and screened positive mutants with high linoleic acid isomerase activity. Proteomics and metabolomics were used to explore the mechanism regulating linoleic acid isomerase activity. High-throughput proteomics revealed 104 differentially expressed proteins unique to the positive mutant strains, of which 57 were upregulated and 47 were downregulated. These differentially expressed proteins were primarily involved in galactose metabolism, the phosphotransferase system, starch metabolism, and sucrose metabolism. Differential metabolic pathways were mainly enriched in amino acid biosynthesis, including glutamate metabolism, the aminoacyl-tRNA biosynthesis pathway, and the ABC transporter pathway. The upregulated metabolites include DL-valine and acetyl-CoA, while the downregulated metabolites include glutamic acid and phosphoenolpyruvate. Overall, the activity of linoleic acid isomerase in the mutant strain was increased by the regulation of key proteins involved in galactose metabolism, sucrose metabolism, and the phosphotransferase system. This study provides a theoretical basis for the development of high-yield CLA food.
Affiliation(s)
- Ying Liu
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Yeping Chen
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Xiqing Yue
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Yingying Liu
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Jianting Ning
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Libo Li
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Junrui Wu
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Xue Luo
- College of Food Science, Shenyang Agricultural University, Shenyang 110000, China
- Shuang Zhang
- College of Food Science, Northeast Agricultural University, Harbin 150000, China
|
11
|
Li J, Zhao Y, Liang R, Mao Y, Zuo H, Hopkins DL, Yang X, Luo X, Zhu L, Zhang Y. Effects of different protein phosphorylation levels on the tenderness of different ultimate pH beef. Food Res Int 2023; 174:113512. [PMID: 37986506 DOI: 10.1016/j.foodres.2023.113512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/26/2023] [Accepted: 09/24/2023] [Indexed: 11/22/2023]
Abstract
This study investigated the relationship between tenderness and protein phosphorylation levels in beef Longissimus lumborum with normal ultimate pH (pHu, 5.4-5.8, NpHu), intermediate pHu (5.8-6.2, IpHu) and high pHu (≥6.2, HpHu). During 21 d of ageing, the HpHu group had the lowest Warner-Bratzler shear force (WBSF) values, while the IpHu group showed the highest values, which remained high even after 21 days of ageing. In the late stage of the 24 h post-mortem period, the faster degradation of troponin T and earlier activation of caspase 9 in the HpHu group were the key reasons for its lower WBSF compared with the NpHu and IpHu groups. The activity of caspase 3 cannot explain the tenderness differences between the IpHu and HpHu groups, since it did not differ between them. At 24 h post-mortem, 17 common differential phosphorylated peptides were detected among pHu groups, of which nine were associated with pHu and WBSF. The higher phosphorylation level of glycogen synthase may have caused the delay of meat tenderization in the IpHu group.
Affiliation(s)
- Jiqiang Li
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Yan Zhao
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Rongrong Liang
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Yanwei Mao
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Huixin Zuo
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - David L Hopkins
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China; Canberra ACT, 2903, Australia.
| | - Xiaoyin Yang
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Xin Luo
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Lixian Zhu
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| | - Yimin Zhang
- Lab of Beef Processing and Quality Control, College of Food Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, PR China; National R&D Center for Beef Processing Technology, Tai'an, Shandong 271018, PR China.
| |
|
12
|
Wu JM, Qiu WR, Liu Z, Xu ZC, Zhang SH. Integrative approach for classifying male tumors based on DNA methylation 450K data. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:19133-19151. [PMID: 38052593 DOI: 10.3934/mbe.2023845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Malignancies such as bladder urothelial carcinoma, colon adenocarcinoma, liver hepatocellular carcinoma, lung adenocarcinoma and prostate adenocarcinoma significantly impact men's well-being. Accurate cancer classification is vital in determining treatment strategies and improving patient prognosis. This study introduced an innovative method that utilizes gene selection from high-dimensional datasets to enhance the performance of the male tumor classification algorithm. The method assesses the reliability of DNA methylation data to distinguish the five most prevalent types of male cancers from normal tissues by employing DNA methylation 450K data obtained from The Cancer Genome Atlas (TCGA) database. First, the chi-square test is used for dimensionality reduction and second, L1 penalized logistic regression is used for feature selection. Furthermore, the stacking ensemble learning technique was employed to integrate seven common multiclassification models. Experimental results demonstrated that the ensemble learning model utilizing multiple classification models outperformed any base classification model. The proposed ensemble model achieved an astonishing overall accuracy (ACC) of 99.2% in independent testing data. Moreover, it may present novel ideas and pathways for the early detection and treatment of future diseases.
Affiliation(s)
- Ji-Ming Wu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
| | - Wang-Ren Qiu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
| | - Zi Liu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
| | - Zhao-Chun Xu
- Computer Department, Jing-De-Zhen Ceramic University, Jingdezhen 333403, China
| | - Shou-Hua Zhang
- Department of General Surgery, Jiangxi Provincial Children's Hospital, Nanchang 330006, China
| |
|
13
|
Chen C, Luo L, Zheng C, Ding P, Liu H, Luo H. Self-prediction of relations in GO facilitates its quality auditing. J Biomed Inform 2023; 144:104441. [PMID: 37437682 DOI: 10.1016/j.jbi.2023.104441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 06/29/2023] [Accepted: 07/05/2023] [Indexed: 07/14/2023]
Abstract
As applications of the gene ontology (GO) increase rapidly in the biomedical field, auditing its quality is becoming more and more important. Existing auditing methods are mostly based on rules, observed patterns or hypotheses. In this study, we propose a machine-learning-based framework for GO to audit itself: we first predict the IS-A relations among concepts in GO, then use differences between predicted results and existing relations to uncover potential errors. Specifically, we transform the taxonomy of the GO 2020 January release into a dataset with concept pairs as items and the relations between them as labels (pairs with no direct IS-A relation are labeled as ndrs). To fully obtain the representation of each pair, we integrate the embeddings for the concept name, the concept definition, and the concept node in a substring-based topological graph. We divide the dataset into 10 parts and rotate over all the parts, choosing one part as the testing set and the remaining parts as the training set each time. After 10 rotations, the prediction model predicted 4,640 existing IS-A pairs as ndrs. In the GO 2022 March release, 340 of these predictions were validated, demonstrating significance with a p-value of 1.60e-46 when compared to the results of randomly selected pairs. On the other hand, the model predicted 2,840 out of 17,079 selected ndrs in GO to be IS-A relations. After deleting those that caused redundancies and cycles, 924 predicted IS-A relations remained. Among 200 randomly selected pairs, 30 were validated as missing IS-A relations by domain experts. In conclusion, this study investigates a novel way of auditing biomedical ontologies by predicting the relations in them, which was shown to be useful for discovering potential errors.
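The 10-part rotation described in this abstract is standard k-fold cross-validation. A minimal sketch of the splitting step is given below, in pure Python. The function name `ten_fold_rotation`, the fixed shuffle seed, and the placeholder concept pairs are assumptions for illustration and are not taken from the paper.

```python
import random


def ten_fold_rotation(items, k=10, seed=0):
    """Yield (train, test) splits, using each of the k parts once as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Deal items round-robin so that part sizes differ by at most one.
    parts = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = parts[i]
        train = [x for j, part in enumerate(parts) if j != i for x in part]
        yield train, test


# Hypothetical concept pairs standing in for the GO-derived dataset.
pairs = [("concept_%d" % n, "concept_%d" % (n + 1)) for n in range(25)]
splits = list(ten_fold_rotation(pairs))
```

Every pair lands in exactly one test set, so collecting the predictions over all rotations gives one held-out prediction per pair.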
Affiliation(s)
- Cheng Chen
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, China
| | - Lingyun Luo
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, China.
| | - Chunlei Zheng
- VA Boston Cooperative Studies Program, MAVERIC, VA Boston Healthcare System, Boston, MA, USA
| | - Pingjian Ding
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, China.
| | - Huan Liu
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, China
| | - Hanyu Luo
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, China
| |
|
14
|
Pati SK, Gupta MK, Banerjee A, Mallik S, Zhao Z. PPIGCF: A Protein-Protein Interaction-Based Gene Correlation Filter for Optimal Gene Selection. Genes (Basel) 2023; 14:genes14051063. [PMID: 37239423 DOI: 10.3390/genes14051063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 04/26/2023] [Accepted: 05/04/2023] [Indexed: 05/28/2023] Open
Abstract
Biological data at the omics level are highly complex, requiring powerful computational approaches to identify significant intrinsic characteristics and to search further for informative markers involved in the studied phenotype. In this paper, we propose a novel dimension reduction technique, protein-protein interaction-based gene correlation filtration (PPIGCF), which builds on gene ontology (GO) and protein-protein interaction (PPI) structures to analyze microarray gene expression data. PPIGCF first extracts the gene symbols with their expression from the experimental dataset, and then classifies them based on GO biological process (BP) and cellular component (CC) annotations. Every classification group inherits all the information on its CCs, corresponding to the BPs, to establish a PPI network. Then, the gene correlation filter (regarding gene rank and the proposed correlation coefficient) is computed on every network and removes a few weakly correlated genes connected with their corresponding networks. PPIGCF finds the information content (IC) of the other genes related to the PPI network and takes only the genes with the highest IC values. The satisfactory results of PPIGCF are used to prioritize significant genes. We performed a comparison with current methods to demonstrate our technique's efficiency. From the experiment, it can be concluded that PPIGCF needs fewer genes to reach reasonable accuracy (~99%) for cancer classification. This paper reduces the computational complexity and improves the time complexity of biomarker discovery from datasets.
Affiliation(s)
- Soumen Kumar Pati
- Department of Bioinformatics, Maulana Abul Kalam Azad University of Technology, Haringhata 741249, West Bengal, India
| | - Manan Kumar Gupta
- Department of Bioinformatics, Maulana Abul Kalam Azad University of Technology, Haringhata 741249, West Bengal, India
| | - Ayan Banerjee
- Department of Computer Science and Engineering, Jalpaiguri Govt. Engineering College, Jalpaiguri 735102, West Bengal, India
| | - Saurav Mallik
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Department of Environmental Health, Harvard T H Chan School of Public Health, Boston, MA 02115, USA
- Department of Pharmacology & Toxicology, University of Arizona, Tucson, AZ 85721, USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| |
|
15
|
Li H, Ma Q, Ren J, Guo W, Feng K, Li Z, Huang T, Cai YD. Immune responses of different COVID-19 vaccination strategies by analyzing single-cell RNA sequencing data from multiple tissues using machine learning methods. Front Genet 2023; 14:1157305. [PMID: 37007947 PMCID: PMC10065150 DOI: 10.3389/fgene.2023.1157305] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/02/2023] [Accepted: 03/07/2023] [Indexed: 03/19/2023] Open
Abstract
Multiple types of COVID-19 vaccines have been shown to be highly effective in preventing SARS-CoV-2 infection and in reducing post-infection symptoms. Almost all of these vaccines induce systemic immune responses, but differences in immune responses induced by different vaccination regimens are evident. This study aimed to reveal the differences in immune gene expression levels of different target cells under different vaccine strategies after SARS-CoV-2 infection in hamsters. A machine-learning-based process was designed to analyze single-cell transcriptomic data of different cell types from the blood, lung, and nasal mucosa of hamsters infected with SARS-CoV-2, including B and T cells from the blood and nasal cavity, macrophages from the lung and nasal cavity, and alveolar epithelial and lung endothelial cells. The cohort was divided into five groups: non-vaccinated (control), 2*adenovirus (two doses of adenovirus vaccine), 2*attenuated (two doses of attenuated virus vaccine), 2*mRNA (two doses of mRNA vaccine), and mRNA/attenuated (primed by mRNA vaccine, boosted by attenuated vaccine). All genes were ranked using five signature ranking methods (LASSO, LightGBM, Monte Carlo feature selection, mRMR, and permutation feature importance). Some key genes that contributed to the analysis of immune changes, such as RPS23, DDX5, PFN1 in immune cells, and IRF9 and MX1 in tissue cells, were screened. Afterward, the five ranked feature lists were fed into the incremental feature selection framework, which contained two classification algorithms (decision tree [DT] and random forest [RF]), to construct optimal classifiers and generate quantitative rules. Results showed that RF classifiers provided relatively higher performance than DT classifiers, whereas the DT classifiers provided quantitative rules that indicated special gene expression levels under different vaccine strategies. These findings may help us to develop better protective vaccination programs and new vaccines.
Affiliation(s)
- Hao Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Qinglan Ma
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Jingxin Ren
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Institutes for Biological Sciences (SIBS), Shanghai Jiao Tong University School of Medicine (SJTUSM), Chinese Academy of Sciences (CAS), Shanghai, China
| | - Kaiyan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - Zhandong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
| |
|
16
|
Li J, Huang F, Ma Q, Guo W, Feng K, Huang T, Cai YD. Identification of genes related to immune enhancement caused by heterologous ChAdOx1-BNT162b2 vaccines in lymphocytes at single-cell resolution with machine learning methods. Front Immunol 2023; 14:1131051. [PMID: 36936955 PMCID: PMC10017451 DOI: 10.3389/fimmu.2023.1131051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/30/2022] [Accepted: 02/13/2023] [Indexed: 03/06/2023] Open
Abstract
The widely used ChAdOx1 nCoV-19 (ChAd) vector and BNT162b2 (BNT) mRNA vaccines have been shown to induce robust immune responses. Recent studies demonstrated that the immune responses of people who received one dose of ChAdOx1 and one dose of BNT were better than those of people who received vaccines with two homologous ChAdOx1 or two BNT doses. However, how heterologous vaccines function has not been extensively investigated. In this study, single-cell RNA sequencing data from three classes of samples: volunteers vaccinated with heterologous ChAdOx1-BNT and volunteers vaccinated with homologous ChAd-ChAd and BNT-BNT vaccinations after 7 days were divided into three types of immune cells (3654 B, 8212 CD4+ T, and 5608 CD8+ T cells). To identify differences in gene expression in various cell types induced by vaccines administered through different vaccination strategies, multiple advanced feature selection methods (max-relevance and min-redundancy, Monte Carlo feature selection, least absolute shrinkage and selection operator, light gradient boosting machine, and permutation feature importance) and classification algorithms (decision tree and random forest) were integrated into a computational framework. Feature selection methods were in charge of analyzing the importance of gene features, yielding multiple gene lists. These lists were fed into incremental feature selection, incorporating decision tree and random forest, to extract essential genes, classification rules and build efficient classifiers. Highly ranked genes include PLCG2, whose differential expression is important to the B cell immune pathway and is positively correlated with immune cells, such as CD8+ T cells, and B2M, which is associated with thymic T cell differentiation. 
This study makes an important contribution to the mechanistic explanation of the stronger immune response induced by a heterologous ChAdOx1-BNT vaccination schedule compared with two doses of either BNT or ChAdOx1, offering a theoretical foundation for vaccine modification.
Affiliation(s)
- Jing Li
- School of Computer Science, Baicheng Normal University, Baicheng, Jilin, China
| | - FeiMing Huang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - QingLan Ma
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) and Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai, China
| | - KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - Tao Huang
- CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Science, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
| |
|
17
|
Badshah Y, Shabbir M, Khan K, Akhtar H. Expression Profiles of Hepatic Immune Response Genes in HEV Infection. Pathogens 2023; 12:pathogens12030392. [PMID: 36986315 PMCID: PMC10057882 DOI: 10.3390/pathogens12030392] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/20/2022] [Revised: 02/09/2023] [Accepted: 02/17/2023] [Indexed: 03/05/2023] Open
Abstract
Hepatitis E is a liver inflammation caused by infection with the hepatitis E virus (HEV). Every year, there are an estimated 20 million HEV infections worldwide, leading to an estimated 3.3 million symptomatic cases of hepatitis E. HEV viral load has been studied in relation to disease progression; however, hepatic host gene expression in response to HEV infection remains unknown. Methods: We identified the expression profiles of hepatic immune response genes in HEV infections. Fresh blood samples were collected from all the study subjects (130 patients and 124 controls) in 3 mL EDTA vacutainers. HEV viral load was determined by real-time PCR. Total RNA was isolated from the blood using the TRIZOL method. The expression of the CCL2, CCL5, CXCL10, CXCL16, TNF, IFNGR1, and SAMSN1 genes was studied in the blood of 130 HEV patients and 124 controls using real-time PCR. Results: Gene expression profiles indicate high levels of CCL2, CCL5, CXCL10, CXCL16, TNF, IFNGR1, and SAMSN1 genes that might lead to the recruitment of leukocytes and infected cell apoptosis. Conclusion: Our study demonstrated distinct differences in the expression profiles of host immune response-related genes in HEV infections and provided valuable insight into the potential impact of these genes on disease progression.
Affiliation(s)
- Yasmin Badshah
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Correspondence: (Y.B.); (H.A.); Tel.: +92-321-5272489 (Y.B. & H.A.)
| | - Maria Shabbir
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad 44000, Pakistan
| | - Khushbukhat Khan
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad 44000, Pakistan
| | - Hashaam Akhtar
- Global Health Security Agenda (GHSA), National Institutes of Health (NIH), Islamabad 44000, Pakistan
- Correspondence: (Y.B.); (H.A.); Tel.: +92-321-5272489 (Y.B. & H.A.)
| |
|
18
|
Zheng X, Zhang C, Zheng D, Guo Q, Maierhaba M, Xue L, Zeng X, Wu Y, Gao W. An original cuproptosis-related genes signature effectively influences the prognosis and immune status of head and neck squamous cell carcinoma. Front Genet 2023; 13:1084206. [PMID: 36685880 PMCID: PMC9845781 DOI: 10.3389/fgene.2022.1084206] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/30/2022] [Accepted: 12/15/2022] [Indexed: 01/06/2023] Open
Abstract
Background: Recently, a non-apoptotic cell death pathway that is dependent on the presence of copper ions was proposed and named cuproptosis. Cuproptosis has been found to have a strong association with the clinical progression and prognosis of several cancers. Head and neck squamous cell carcinoma (HNSC) is among the most common malignant tumors, with a 5-year relative survival rate ranging between 40% and 50%. The underlying mechanisms and clinical significance of cuproptosis-related genes (CRGs) in HNSC progression have not been clarified. Methods: In this study, expression patterns, biological functions, immunohistochemistry (IHC), gene variants and immune status were analyzed to investigate the effects of CRGs on HNSC progression. Moreover, a 12-CRGs signature and nomogram were also constructed for prognosis prediction of HNSC. Results: The results revealed that some CRGs were dysregulated and had somatic mutations and CNV in HNSC tissues. Among them, ISCA2 was found to be upregulated in HNSC and was strongly correlated with the overall survival (OS) of HNSC patients (HR = 1.13 [1.01-1.26], p-value = 0.0331). Functionally, CRGs were mainly associated with the TCA cycle, cell cycle, iron-sulfur cluster assembly, p53 signaling pathway, chemical carcinogenesis, and carbon metabolism in cancer. A 12-CRGs signature for predicting the OS was constructed, which included CAT, MTFR1L, OXA1L, POLE, NTHL1, DNA2, ATP7B, ISCA2, GLRX5, NDUFA1, and NDUFB2. This signature showed good prediction performance on the OS (HR = 5.3 [3.4-8.2], p-value = 3.4e-13) and disease-specific survival (HR = 6.4 [3.6-11], p-value = 2.4e-10). Furthermore, the 12-CRGs signature significantly suppressed the activation of CD4+ T cells and antigen processing and presentation. Finally, a nomogram based on the 12-CRGs signature and clinical features was constructed, which showed a significantly adverse effect on the OS (HR = 1.061 [1.042-1.081], p-value = 1.6e-10) of HNSC patients. 
Conclusion: This study reveals the association of CRGs with the progression of HNSC based on multi-omics analysis. The study of CRGs is expected to improve clinical diagnosis, immunotherapeutic responsiveness and prognosis prediction of HNSC.
Affiliation(s)
- Xiwang Zheng
- Shanxi Key Laboratory of Otorhinolaryngology Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
- Shanxi Province Clinical Medical Research Center for Precision Medicine of Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
| | - Chunming Zhang
- Shanxi Key Laboratory of Otorhinolaryngology Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
- Shanxi Province Clinical Medical Research Center for Precision Medicine of Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
- Department of Otolaryngology Head and Neck Surgery, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
| | - Defei Zheng
- Department of Hematology/Oncology, Children’s Hospital of Soochow University, Suzhou, Jiangsu, China
| | - Qingbo Guo
- Shanxi Key Laboratory of Otorhinolaryngology Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
- Shanxi Province Clinical Medical Research Center for Precision Medicine of Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
| | - Mijiti Maierhaba
- Shanxi Key Laboratory of Otorhinolaryngology Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
- Shanxi Province Clinical Medical Research Center for Precision Medicine of Head and Neck Cancer, First Hospital of Shanxi Medical University, Taiyuan, Shanxi, China
| | - Lingbin Xue
- Department of Otolaryngology Head and Neck Surgery, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
- Shenzhen Institute of Otolaryngology and Key Laboratory of Otolaryngology, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
| | - Xianhai Zeng
- Department of Otolaryngology Head and Neck Surgery, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
- Shenzhen Institute of Otolaryngology and Key Laboratory of Otolaryngology, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
| | - Yongyan Wu
- Department of Otolaryngology Head and Neck Surgery, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
- Shenzhen Institute of Otolaryngology and Key Laboratory of Otolaryngology, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
| | - Wei Gao
- Department of Otolaryngology Head and Neck Surgery, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
- Shenzhen Institute of Otolaryngology and Key Laboratory of Otolaryngology, Longgang Otolaryngology Hospital, Shenzhen, Guangdong, China
| |
|
19
|
Xu Y, Huang F, Guo W, Feng K, Zhu L, Zeng Z, Huang T, Cai YD. Characterization of chromatin accessibility patterns in different mouse cell types using machine learning methods at single-cell resolution. Front Genet 2023; 14:1145647. [PMID: 36936430 PMCID: PMC10014730 DOI: 10.3389/fgene.2023.1145647] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/16/2023] [Accepted: 02/20/2023] [Indexed: 03/06/2023] Open
Abstract
Chromatin accessibility is a generic property of the eukaryotic genome, which refers to the degree of physical compaction of chromatin. Recent studies have shown that chromatin accessibility is cell type dependent, indicating chromatin heterogeneity across cell lines and tissues. The identification of markers used to distinguish cell types at the chromosome level is important to understand cell function and classify cell types. In the present study, we investigated transcriptionally active chromosome segments identified by sci-ATAC-seq at single-cell resolution, including 69,015 cells belonging to 77 different cell types. Each cell was represented by existence status on 20,783 genes that were obtained from 436,206 active chromosome segments. The gene features were analyzed in depth by Boruta, resulting in 3897 genes, which were ranked in a list by Monte Carlo feature selection. This list was further analyzed by the incremental feature selection (IFS) method, yielding essential genes, classification rules and an efficient random forest (RF) classifier. To improve the performance of the optimal RF classifier, its features were further processed by an autoencoder, light gradient boosting machine and the IFS method. The final RF classifier, with an MCC of 0.838, was constructed. Some marker genes, such as H2-Dmb2, which is specifically expressed in antigen-presenting cells (e.g., dendritic cells or macrophages), and Tenm2, which is specifically expressed in T cells, were identified in this study. Our analysis revealed numerous potential epigenetic modification patterns that are unique to particular cell types, thereby advancing knowledge of the critical functions of chromatin accessibility in cell processes.
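The abstract above reports classifier quality as a Matthews correlation coefficient (MCC) over many cell types. As an illustration only, the sketch below computes the usual multiclass generalization of MCC from true and predicted labels in pure Python. The function name `multiclass_mcc` and the toy labels are assumptions, and the abstract does not state which exact MCC variant the authors used.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass MCC computed from per-class true and predicted counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                      # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                           # times each class truly occurs
    p_counts = Counter(y_pred)                           # times each class is predicted
    labels = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    spread_true = s * s - sum(v * v for v in t_counts.values())
    spread_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = sqrt(spread_true * spread_pred)
    # By convention, return 0.0 when one side has only a single class.
    return numerator / denominator if denominator else 0.0


# Toy labels (hypothetical, not data from the study).
perfect = multiclass_mcc(["a", "a", "b", "b"], ["a", "a", "b", "b"])
partial = multiclass_mcc([1, 1, 0, 0], [1, 0, 0, 0])
```

With two classes this reduces to the familiar binary MCC, which is why it is often preferred to plain accuracy when class sizes are unbalanced.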
Affiliation(s)
- Yaochen Xu
- Department of Mathematics, School of Sciences, Shanghai University, Shanghai, China
| | - FeiMing Huang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) and Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai, China
| | - KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - Lin Zhu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
| | - Zhenbing Zeng
- Department of Mathematics, School of Sciences, Shanghai University, Shanghai, China
- *Correspondence: Zhenbing Zeng; Tao Huang; Yu-Dong Cai
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- *Correspondence: Zhenbing Zeng; Tao Huang; Yu-Dong Cai
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- *Correspondence: Zhenbing Zeng; Tao Huang; Yu-Dong Cai
|
20
|
Ren J, Zhou X, Guo W, Feng K, Huang T, Cai YD. Identification of Methylation Signatures and Rules for Sarcoma Subtypes by Machine Learning Methods. BIOMED RESEARCH INTERNATIONAL 2022; 2022:5297235. [PMID: 36619306 PMCID: PMC9812612 DOI: 10.1155/2022/5297235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 11/28/2022] [Accepted: 12/08/2022] [Indexed: 12/31/2022]
Abstract
Sarcoma, the second most common type of solid tumor in children and adolescents, has a wide variety of subtypes that are often not properly diagnosed at an early stage, leading to late metastases and serious loss of life and property for patients and their families. It exhibits a high degree of heterogeneity at the cellular, molecular, and epigenetic levels, and DNA methylation has been proposed to play a role in the diagnosis of sarcoma subtypes. This study therefore aimed to find potential biomarkers at the DNA methylation level that distinguish different sarcoma subtypes. A machine learning process was designed to analyse sarcoma samples, each of which was represented by a large number of methylation sites. Irrelevant sites were removed using the Boruta method, and the remaining sites related to the target variables were kept for further analyses. Afterward, three feature ranking methods (LASSO, LightGBM, and MCFS) were adopted to rank these features, and six classification models were constructed by combining incremental feature selection with two classification algorithms (decision tree and random forest). The RF model outperformed the DT model under all three ranking conditions. Genes annotated from highly correlated methylation site features, such as PRKAR1B, INPP5A, and GLI3, have been reported in the literature to be associated with sarcoma. Moreover, the quantitative rules obtained by the decision tree algorithm helped us to understand the essential differences between various sarcoma types and classify sarcoma subtypes, providing a new means of clinical identification and determining new therapeutic targets.
Affiliation(s)
- Jingxin Ren
- School of Life Sciences, Shanghai University, Shanghai 200444, China
| | - XianChao Zhou
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai 200030, China
| | - KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou 510507, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, China
|
21
|
Afsar T, Razak S, Trembley JH, Khan K, Shabbir M, Almajwal A, Alruwaili NW, Ijaz MU. Prevention of Testicular Damage by Indole Derivative MMINA via Upregulated StAR and CatSper Channels with Coincident Suppression of Oxidative Stress and Inflammation: In Silico and In Vivo Validation. Antioxidants (Basel) 2022; 11:2063. [PMID: 36290786 PMCID: PMC9598787 DOI: 10.3390/antiox11102063] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/13/2022] [Revised: 10/01/2022] [Accepted: 10/13/2022] [Indexed: 11/28/2022]
Abstract
Cis-diamminedichloroplatinum (II) (CDDP) is a widely used antineoplastic agent with numerous associated side effects. We investigated the mechanisms by which the indole derivative N'-(4-dimethylaminobenzylidene)-2-1-(4-(methylsulfinyl) benzylidene)-5-fluoro-2-methyl-1H-inden-3-yl) acetohydrazide (MMINA) protects against CDDP-induced testicular damage. Five groups of rats (n = 7) were treated with saline, DMSO, CDDP, CDDP + MMINA, or MMINA. Reproductive hormones, antioxidant enzyme activity, histopathology, daily sperm production, and oxidative stress markers were examined. Western blot analysis was performed to assess the expression of steroidogenic acute regulatory protein (StAR) and inflammatory biomarkers in testis, and the expression of calcium-dependent cation channel of sperm (CatSper) in epididymis was also examined. The structural and dynamic molecular docking behavior of MMINA was analyzed using bioinformatics tools. The construction of molecular interactions was performed through KEGG, DAVID, and STRING databases. MMINA treatment reversed CDDP-induced nitric oxide (NO) and malondialdehyde (MDA) augmentation, while boosting the activity of glutathione peroxidase (GPx) and superoxide dismutase (SOD) in the epididymis and testicular tissues. CDDP treatment significantly lowered sperm count, sperm motility, and epididymis sperm count. Furthermore, CDDP reduced epithelial height and tubular diameter and increased luminal diameter with impaired spermatogenesis. MMINA rescued testicular damage caused by CDDP. It rescued CDDP-induced reproductive dysfunctions by upregulating the expression of the CatSper protein, which plays an essential role in sperm motility, and it increased testosterone secretion and StAR protein expression. MMINA downregulated the expression of NF-κB, STAT-3, COX-2, and TNF-α. Hydrogen bonding and hydrophobic interactions were predicted between MMINA and 3β-HSD, CatSper, NF-κB, and TNF-α. Molecular interactome outcomes depicted the formation of one hydrogen bond and one hydrophobic interaction between 3β-HSD and MMINA that contributed to their strong binding. CatSper also made one hydrophobic interaction and one hydrogen bond with MMINA but with a lower binding affinity of -7.7 relative to 3β-HSD, whereas MMINA made one hydrogen bond with NF-κB residue Lys37 and TNF-α residue His91 and two hydrogen bonds with Lys244 and Thr456 of STAT3. Our experimental and in silico results revealed that MMINA boosted the antioxidant defense mechanism, restored the levels of fertility hormones, and suppressed histomorphological alterations.
Affiliation(s)
- Tayyaba Afsar
- Department of Community Health Sciences, College of Applied Medical Sciences, King Saud University, Riyadh 12372, Saudi Arabia
| | - Suhail Razak
- Department of Community Health Sciences, College of Applied Medical Sciences, King Saud University, Riyadh 12372, Saudi Arabia
| | - Janeen H. Trembley
- Minneapolis VA Health Care System Research Service, Minneapolis, MN 55455, USA
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN 55455, USA
| | - Khushbukhat Khan
- Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad 44000, Pakistan
| | - Maria Shabbir
- Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad 44000, Pakistan
| | - Ali Almajwal
- Department of Community Health Sciences, College of Applied Medical Sciences, King Saud University, Riyadh 12372, Saudi Arabia
| | - Nawaf W. Alruwaili
- Department of Community Health Sciences, College of Applied Medical Sciences, King Saud University, Riyadh 12372, Saudi Arabia
| | - Muhammad Umar Ijaz
- Department of Zoology, Wildlife, and Fisheries, University of Agriculture, Faisalabad 38040, Pakistan
|
22
|
Jian F, Huang F, Zhang YH, Huang T, Cai YD. Identifying anal and cervical tumorigenesis-associated methylation signaling with machine learning methods. Front Oncol 2022; 12:998032. [PMID: 36249027 PMCID: PMC9557006 DOI: 10.3389/fonc.2022.998032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 09/14/2022] [Indexed: 11/13/2022]
Abstract
Cervical and anal carcinoma are neoplastic diseases with various intraepithelial neoplasia stages. The underlying mechanisms for cancer initiation and progression have not been fully revealed. DNA methylation has been shown to be aberrantly regulated during tumorigenesis in anal and cervical carcinoma, revealing the important roles of DNA methylation signaling as a biomarker to distinguish cancer stages in clinics. In this research, several machine learning methods were used to analyze the methylation profiles on anal and cervical carcinoma samples, which were divided into three classes representing various stages of tumor progression. Advanced feature selection methods, including Boruta, LASSO, LightGBM, and MCFS, were used to select methylation features that are highly correlated with cancer progression. Some methylation probes including cg01550828 and its corresponding gene RNF168 have been reported to be associated with human papilloma virus-related anal cancer. As for biomarkers for cervical carcinoma, cg27012396 and its functional gene HDAC4 were confirmed to regulate the glycolysis and survival of hypoxic tumor cells in cervical carcinoma. Furthermore, we developed effective classifiers for identifying various tumor stages and derived classification rules that reflect the quantitative impact of methylation on tumorigenesis. The current study identified methylation signals associated with the development of cervical and anal carcinoma at qualitative and quantitative levels using advanced machine learning methods.
Affiliation(s)
- Fangfang Jian
- Department of Obstetrics & Gynecology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - FeiMing Huang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- *Correspondence: Tao Huang; Yu-Dong Cai
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- *Correspondence: Tao Huang; Yu-Dong Cai
|
23
|
Liu Z, Meng M, Ding S, Zhou X, Feng K, Huang T, Cai YD. Identification of methylation signatures and rules for predicting the severity of SARS-CoV-2 infection with machine learning methods. Front Microbiol 2022; 13:1007295. [PMID: 36212830 PMCID: PMC9537378 DOI: 10.3389/fmicb.2022.1007295] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/30/2022] [Accepted: 09/01/2022] [Indexed: 11/17/2022]
Abstract
Patients infected with SARS-CoV-2 at various severities have different clinical manifestations and treatments. Patients with mild or moderate disease usually recover with conventional medical treatment, but patients with severe disease require prompt professional treatment. Thus, stratifying infected patients for targeted treatment is meaningful. A computational workflow was designed in this study to identify key blood methylation features and rules that can distinguish the severity of SARS-CoV-2 infection. First, the methylation features in the expression profile were analyzed by a Monte Carlo feature selection method, which generated a ranked feature list. Next, this ranked feature list was fed into the incremental feature selection method to determine the optimal features for different classification algorithms and thereby build optimal classifiers. The selected key features were analyzed by functional enrichment to detect their biofunctional information. Furthermore, a set of rules was derived with a white-box algorithm, the decision tree, to uncover different methylation patterns across the severity levels of SARS-CoV-2 infection. Some genes (PARP9, MX1, IRF7) corresponding to essential methylation sites, as well as some rules, were validated by published academic literature. Overall, this study contributes to revealing potential expression features and provides a reference for patient stratification. Physicians can prioritize and allocate health and medical resources for COVID-19 patients based on their predicted severe clinical outcomes.
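The incremental feature selection (IFS) step described in this abstract can be sketched in a few lines: grow the feature set along the ranked list, score a classifier on each prefix, and keep the best-scoring prefix. The version below is only an illustration under stated assumptions. It uses a nearest-centroid classifier and a single hold-out split as stand-ins, whereas the study reports other classification algorithms, and all function and parameter names are hypothetical.

```python
from math import dist


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Fit one centroid per class and return hold-out accuracy (stand-in classifier)."""
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
    hits = 0
    for row, label in zip(X_test, y_test):
        # Predict the class whose centroid is closest in Euclidean distance.
        predicted = min(centroids, key=lambda lab: dist(row, centroids[lab]))
        hits += predicted == label
    return hits / len(y_test)


def incremental_feature_selection(ranked, X_train, y_train, X_test, y_test, step=1):
    """Evaluate top-k prefixes of a ranked feature list and keep the best one.

    `ranked` holds column indices ordered from most to least important,
    for example the output of a feature-ranking method.
    """
    best_k, best_score, curve = 0, -1.0, []
    for k in range(step, len(ranked) + 1, step):
        cols = ranked[:k]
        sub_train = [[row[c] for c in cols] for row in X_train]
        sub_test = [[row[c] for c in cols] for row in X_test]
        score = nearest_centroid_accuracy(sub_train, y_train, sub_test, y_test)
        curve.append((k, score))
        if score > best_score:  # ties keep the smaller feature set
            best_k, best_score = k, score
    return ranked[:best_k], best_score, curve
```

The returned `curve` corresponds to the IFS curve that such studies typically plot; in practice the hold-out score would be replaced by cross-validation and the stand-in classifier by the model under study.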
Affiliation(s)
- Zhiyang Liu
- School of Life Sciences, Changchun Sci-Tech University, Changchun, China
| | - Mei Meng
- State Key Laboratory of Oncogenes and Related Genes, Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - ShiJian Ding
- School of Life Sciences, Shanghai University, Shanghai, China
| | - XiaoChao Zhou
- State Key Laboratory of Oncogenes and Related Genes, Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- *Correspondence: Tao Huang
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- *Correspondence: Yu-Dong Cai
|
24
|
Yang L, Zhang YH, Huang F, Li Z, Huang T, Cai YD. Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway. Front Genet 2022; 13:1011659. [PMID: 36171880 PMCID: PMC9511048 DOI: 10.3389/fgene.2022.1011659] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/04/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022]
Abstract
Protein–protein interactions (PPIs) are extremely important for gaining mechanistic insights into the functional organization of the proteome. The resolution of PPI functions can help in the identification of novel diagnostic and therapeutic targets with medical utility, thus facilitating the development of new medications. However, the traditional methods for resolving PPI functions are mainly experimental methods, such as co-immunoprecipitation, pull-down assays, cross-linking, label transfer, and far-Western blot analysis, that are not only expensive but also time-consuming. In this study, we constructed an integrated feature selection scheme for the large-scale selection of the relevant functions of PPIs by using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations of PPI participants. First, we encoded the proteins in each PPI with their gene ontologies and KEGG pathways. Then, the encoded protein features were refined as features of both positive and negative PPIs. Subsequently, Boruta was used for the initial filtering of features to obtain 5684 features. Three feature ranking algorithms, namely, least absolute shrinkage and selection operator, light gradient boosting machine, and max-relevance and min-redundancy, were applied to evaluate feature importance. Finally, the top-ranked features derived from multiple datasets were comprehensively evaluated, and the intersection of results mined by three feature ranking algorithms was taken to identify the features with high correlation with PPIs. Some functional terms were identified in our study, including cytokine–cytokine receptor interaction (hsa04060), intrinsic component of membrane (GO:0031224), and protein-binding biological process (GO:0005515). Our newly proposed integrated computational approach offers a novel perspective of the large-scale mining of biological functions linked to PPI.
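The final step in this abstract, taking the intersection of the top-ranked features from three ranking algorithms, is simple to illustrate. The sketch below assumes each algorithm returns an ordered list of feature names; the function name, the cut-off `k`, and the placeholder feature names are hypothetical and do not come from the study.

```python
def top_k_intersection(rankings, k):
    """Return features that appear in the top k of every ranking.

    `rankings` maps a method name to a list of features ordered from most
    to least important. The result keeps the order of the first ranking.
    """
    if not rankings:
        return []
    lists = list(rankings.values())
    shared = set(lists[0][:k])
    for ranked in lists[1:]:
        shared &= set(ranked[:k])
    return [feature for feature in lists[0][:k] if feature in shared]


if __name__ == "__main__":
    # Placeholder feature names, for illustration only.
    example = {
        "lasso": ["f1", "f2", "f3", "f4"],
        "lightgbm": ["f2", "f1", "f4", "f3"],
        "mrmr": ["f3", "f2", "f1", "f5"],
    }
    print(top_k_intersection(example, k=3))
```

Requiring agreement across all rankers is a conservative choice: it trades recall for features that are robust to the ranking method.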
Affiliation(s)
- Lili Yang
- Measurement Biotechnique Research Center, School of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - FeiMing Huang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - ZhanDong Li
- Measurement Biotechnique Research Center, School of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- *Correspondence: Tao Huang; Yu-Dong Cai
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- *Correspondence: Tao Huang; Yu-Dong Cai
|
25
|
Zhao C, Jiang Z, Tian L, Tang L, Zhou A, Dong T. Bioinformatics-Based Approach for Exploring the Immune Cell Infiltration Patterns in Alzheimer's Disease and Determining the Intervention Mechanism of Liuwei Dihuang Pill. Dose Response 2022; 20:15593258221115563. [PMID: 35898725 PMCID: PMC9310246 DOI: 10.1177/15593258221115563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Accepted: 07/07/2022] [Indexed: 11/15/2022]
Abstract
Traditional Chinese medicine (TCM) compounds have recently garnered attention for the regulation of immune cell infiltration and the prevention and treatment of Alzheimer's disease (AD). The Liuwei Dihuang Pill (LDP) has potential in this regard; however, its specific molecular mechanism currently remains unclear. Therefore, we adopted a bioinformatics approach to investigate the infiltration patterns of different types of immune cells in AD and explored the molecular mechanism of LDP intervention, with the aim of providing a new basis for improving the clinical immunotherapy of AD patients. We found that M1 macrophages showed significantly different degrees of infiltration between the hippocampal tissue samples of AD patients and healthy individuals. Four immune intersection targets of LDP in the treatment of AD were identified; they were enriched in 206 biological functions and 30 signaling pathways. Quercetin had the best docking effect with the core immune target PRKCB. Our findings suggest that infiltrated immune cells may influence the course of AD and that LDP can regulate immune cell infiltration through multi-component, multi-target, and multi-pathway approaches, providing a new research direction regarding AD immunotherapy.
Affiliation(s)
- Chenling Zhao
- The First Clinical Medical College, Anhui University of Chinese Medicine, Hefei, China
| | - Zhangsheng Jiang
- The First Clinical Medical College, Anhui University of Chinese Medicine, Hefei, China
| | - Liwei Tian
- The First Clinical Medical College, Anhui University of Chinese Medicine, Hefei, China
| | - Lulu Tang
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
| | - An Zhou
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
| | - Ting Dong
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
|
26
|
Li H, Zhang S, Chen L, Pan X, Li Z, Huang T, Cai YD. Identifying Functions of Proteins in Mice With Functional Embedding Features. Front Genet 2022; 13:909040. [PMID: 35651937 PMCID: PMC9149260 DOI: 10.3389/fgene.2022.909040] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/31/2022] [Accepted: 04/28/2022] [Indexed: 12/02/2022]
Abstract
In current biology, exploring the biological functions of proteins is important. Given the large number of proteins in some organisms, exploring their functions one by one through traditional experiments is impractical. Therefore, developing quick and reliable methods for identifying protein functions is necessary. The considerable accumulation of protein knowledge and recent developments in computer science provide an alternative way to complete this task, namely designing computational methods. Several efforts have been made in this field. Most previous methods have adopted protein sequence features or directly used the linkage from a protein–protein interaction (PPI) network. In this study, we proposed novel multi-label classifiers that adopt new embedding features to represent proteins. These features were derived from functional domains and a PPI network via word embedding and network embedding, respectively. The minimum redundancy maximum relevance method was used to assess the features, generating a feature list. Incremental feature selection, incorporating RAndom k-labELsets to construct multi-label classifiers, used this list to construct two optimum classifiers, corresponding to two key measurements: accuracy and exact match. These two classifiers performed well and were superior to classifiers that used features extracted by traditional methods.
Affiliation(s)
- Hao Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - ShiQi Zhang
- Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
| | - ZhanDong Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
27
|
Li Z, Pan X, Cai YD. Identification of Type 2 Diabetes Biomarkers From Mixed Single-Cell Sequencing Data With Feature Selection Methods. Front Bioeng Biotechnol 2022; 10:890901. [PMID: 35721855 PMCID: PMC9201257 DOI: 10.3389/fbioe.2022.890901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 04/04/2022] [Indexed: 11/18/2022]
Abstract
Diabetes is the most common disease and a major threat to human health. Type 2 diabetes (T2D) makes up about 90% of all cases. With the development of high-throughput sequencing technologies, more and more of the fundamental pathogenesis of T2D at the genetic and transcriptomic levels has been revealed. Recent single-cell sequencing can further reveal the cellular heterogeneity of complex diseases in an unprecedented way. To examine the molecular basis of T2D across multiple cell types, we investigated the expression profiling of more than 1,600 single cells (949 cells from T2D patients and 651 cells from normal controls) and identified differential expression profiles and characteristics at the transcriptomics level that can distinguish these two groups of cells at the single-cell level. The expression profile was analyzed by several machine learning algorithms, including Monte Carlo feature selection, support vector machine, and repeated incremental pruning to produce error reduction (RIPPER). On one hand, some T2D-associated genes (MTND4P24, MTND2P28, and LOC100128906) were discovered. On the other hand, we revealed novel potential pathogenic mechanisms in the form of rules. They are induced by newly recognized genes and neglected by traditional bulk sequencing techniques. In particular, the newly identified T2D genes were shown to follow specific quantitative rules with potential for diabetes prediction, and these rules further indicated several potential functional crosstalks involved in T2D.
Affiliation(s)
- Zhandong Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Xiaoyong Pan
- Key Laboratory of System Control and Information Processing, Institute of Image Processing and Pattern Recognition, Ministry of Education of China, Shanghai Jiao Tong University, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- *Correspondence: Yu-Dong Cai
|
28
|
Li Z, Mei Z, Ding S, Chen L, Li H, Feng K, Huang T, Cai YD. Identifying Methylation Signatures and Rules for COVID-19 With Machine Learning Methods. Front Mol Biosci 2022; 9:908080. [PMID: 35620480 PMCID: PMC9127386 DOI: 10.3389/fmolb.2022.908080] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/30/2022] [Accepted: 04/27/2022] [Indexed: 11/13/2022]
Abstract
The occurrence of coronavirus disease 2019 (COVID-19) has become a serious challenge to global public health. Definitive and effective treatments for COVID-19 are still lacking, and targeted antiviral drugs are not available. In addition, viruses can regulate host innate immunity and antiviral processes through the epigenome to promote viral self-replication and disease progression. In this study, we first analyzed the methylation dataset of COVID-19 using the Monte Carlo feature selection method to obtain a feature list. This feature list was subjected to the incremental feature selection method combined with a decision tree algorithm to extract key biomarkers, build effective classification models and classification rules that can remarkably distinguish patients with or without COVID-19. EPSTI1, NACAP1, SHROOM3, C19ORF35, and MX1 as the essential features play important roles in the infection and immune response to novel coronavirus. The six significant rules extracted from the optimal classifier quantitatively explained the expression pattern of COVID-19. Therefore, these findings validated that our method can distinguish COVID-19 at the methylation level and provide guidance for the diagnosis and treatment of COVID-19.
Affiliation(s)
- Zhandong Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Zi Mei
- Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
| | - Shijian Ding
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - Hao Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Kaiyan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- *Correspondence: Tao Huang; Yu-Dong Cai
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- *Correspondence: Tao Huang; Yu-Dong Cai
|
29
|
Li Z, Guo W, Zeng T, Yin J, Feng K, Huang T, Cai YD. Detecting Brain Structure-Specific Methylation Signatures and Rules for Alzheimer's Disease. Front Neurosci 2022; 16:895181. [PMID: 35585924 PMCID: PMC9108872 DOI: 10.3389/fnins.2022.895181] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2022] [Accepted: 04/11/2022] [Indexed: 01/01/2023]
Abstract
Alzheimer's disease (AD) is a progressive disease that leads to irreversible behavioral changes, erratic emotions, and loss of motor skills. These conditions make caring for people with AD difficult or almost impossible. Multiple internal and external pathological factors may affect or even trigger the initiation and progression of AD. DNA methylation is one of the most influential regulatory mechanisms during AD pathogenesis, and pathological methylation alterations may differ among the various brain structures of people with AD. Although multiple loci associated with AD initiation and progression have been identified, the spatial distribution patterns of AD-associated DNA methylation in the brain have not been clarified. We applied multiple machine learning algorithms to systematic methylation profiles from different structural brain regions. First, the profile of each brain region was analyzed by the Boruta feature filtering method. Important methylation features were extracted and further analyzed by the max-relevance and min-redundancy method, resulting in a feature list. Then, the incremental feature selection method, incorporating several classification algorithms, used this list to identify candidate AD-associated methylation loci with structural specificity and to establish a group of quantitative rules revealing the effects of DNA methylation in various brain regions (i.e., four brain structures) on AD pathogenesis. Furthermore, some efficient classifiers based on essential methylation sites were proposed to identify AD samples. Results revealed that methylation alterations in different brain structures contribute differently to AD pathogenesis. This study further illustrates the complex pathological mechanisms of AD.
Affiliation(s)
- ZhanDong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Tao Zeng
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
| | - Jie Yin
- Cancer Institute, Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Department of Human Genetics, Institute of Genetics, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China
| | - KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
30
|
Li ZD, Yu X, Mei Z, Zeng T, Chen L, Xu XL, Li H, Huang T, Cai YD. Identifying luminal and basal mammary cell specific genes and their expression patterns during pregnancy. PLoS One 2022; 17:e0267211. [PMID: 35486595 PMCID: PMC9053804 DOI: 10.1371/journal.pone.0267211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 04/05/2022] [Indexed: 11/25/2022]
Abstract
The mammary gland is present in all mammals and usually functions in producing milk to feed the young. Mammogenesis refers to the growth and development of the mammary gland, which begins at puberty and ends after lactation. Pregnancy is regulated by various cytokines, which further contribute to mammary gland development. Epithelial cells, including basal and luminal cells, are one of the major components of the mammary gland. The development of basal and luminal cells has been observed to differ significantly at different stages. However, the mechanisms underlying the differences between basal and luminal cells have not been fully studied. To explore the mechanisms underlying the differentiation of mammary progenitors or their offspring into luminal and myoepithelial cells, single-cell sequencing data on mammary epithelial cells of virgin and pregnant mice were investigated in depth in this work. We evaluated features by using Monte Carlo feature selection and plotted the incremental feature selection curve with support vector machine or RIPPER to find the optimal gene features and rules that can divide epithelial cells into four clusters with different cell subtypes (basal and luminal) and different phases (pregnancy and virginity). As representatives, the feature genes Cldn7, Gjb6, Sparc, Cldn3, Cited1, Krt17, Spp1, Cldn4, Gjb2 and Cldn19 might play an important role in classifying the mammary epithelial cells. Notably, the seven most important rules, based on the combination of cell-specific and tissue-specific expression of feature genes, effectively classify the mammary epithelial cells in a quantitative and interpretable manner.
Affiliation(s)
- Zhan Dong Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Xiangtian Yu
- Shanghai Jiao Tong University Affiliated Sixth People’s Hospital, Shanghai, China
| | - Zi Mei
- Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
| | - Tao Zeng
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - Xian Ling Xu
- Guangdong AIB Polytechnic College, Guangzhou, China
| | - Hao Li
- College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- * E-mail: (TH); (YDC)
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
- * E-mail: (TH); (YDC)
|
31
|
Chen L, Mei Z, Guo W, Ding S, Huang T, Cai YD. Recognition of Immune Cell Markers of COVID-19 Severity with Machine Learning Methods. BIOMED RESEARCH INTERNATIONAL 2022; 2022:6089242. [PMID: 35528178 PMCID: PMC9073549 DOI: 10.1155/2022/6089242] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/06/2022] [Accepted: 04/11/2022] [Indexed: 01/08/2023]
Abstract
COVID-19 is hypothesized to be linked to the host's excessive inflammatory immunological response to SARS-CoV-2 infection, which is regarded to be a major factor in disease severity and mortality. Numerous immune cells play a key role in immune response regulation, and gene expression analysis in these cells could be a useful method for studying disease states, assessing immunological responses, and detecting biomarkers. Here, we developed a machine learning procedure to find biomarkers that discriminate disease severity in individual immune cells (B cell, CD4+ cell, CD8+ cell, monocyte, and NK cell) using single-cell gene expression profiles of COVID-19. The gene features of each profile were first filtered and ranked using the Boruta feature selection method and mRMR, and the resulting ranked feature lists were then fed into the incremental feature selection method to determine the optimal number of features with decision tree and random forest algorithms. Meanwhile, we extracted the classification rules in each cell type from the optimal decision tree classifiers. The best gene sets discovered in this study were analyzed by GO and KEGG pathway enrichment, and some important biomarkers like TLR2, ITK, CX3CR1, IL1B, and PRDM1 were validated by recent literature. The findings reveal that the optimal gene sets for each cell type can accurately classify COVID-19 disease severity and provide insight into the molecular mechanisms involved in disease progression.
Affiliation(s)
- Lei Chen
- School of Life Sciences, Shanghai University, Shanghai 200444, China
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Zi Mei
- Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai 200031, China
| | - ShiJian Ding
- School of Life Sciences, Shanghai University, Shanghai 200444, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, China
|
32
|
Li Z, Guo W, Ding S, Feng K, Lu L, Huang T, Cai Y. Detecting Blood Methylation Signatures in Response to Childhood Cancer Radiotherapy via Machine Learning Methods. BIOLOGY 2022; 11:biology11040607. [PMID: 35453806 PMCID: PMC9030135 DOI: 10.3390/biology11040607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 04/09/2022] [Accepted: 04/14/2022] [Indexed: 11/16/2022]
Abstract
Radiotherapy is a helpful treatment for cancer, but it can also potentially cause changes in many molecules, resulting in adverse effects. Among these changes, the occurrence of abnormal DNA methylation patterns has alarmed scientists. To explore the influence of region-specific radiotherapy on blood DNA methylation, we designed a computational workflow by using machine learning methods that can identify crucial methylation alterations related to treatment exposure. Irrelevant methylation features from the DNA methylation profiles of 2052 childhood cancer survivors were excluded via the Boruta method, and the remaining features were ranked using the minimum redundancy maximum relevance method to generate feature lists. These feature lists were then fed into the incremental feature selection method, which uses a combination of deep forest, k-nearest neighbor, random forest, and decision tree to find the most important methylation signatures and build the best classifiers and classification rules. Several methylation signatures and rules have been discovered and confirmed, allowing for a better understanding of methylation patterns in response to different treatment exposures.
Affiliation(s)
- Zhandong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun 130052, China;
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai 200025, China;
| | - Shijian Ding
- School of Life Sciences, Shanghai University, Shanghai 200444, China;
| | - Kaiyan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou 510507, China;
| | - Lin Lu
- Department of Radiology, Columbia University Medical Center, New York, NY 10032, USA
- Correspondence: (L.L.); (T.H.); or (Y.C.); Tel.: +86-21-54923269 (T.H.); +86-21-66136132 (Y.C.)
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Correspondence: (L.L.); (T.H.); or (Y.C.); Tel.: +86-21-54923269 (T.H.); +86-21-66136132 (Y.C.)
| | - Yudong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, China;
- Correspondence: (L.L.); (T.H.); or (Y.C.); Tel.: +86-21-54923269 (T.H.); +86-21-66136132 (Y.C.)
|
33
|
Zhou X, Ding S, Wang D, Chen L, Feng K, Huang T, Li Z, Cai Y. Identification of Cell Markers and Their Expression Patterns in Skin Based on Single-Cell RNA-Sequencing Profiles. Life (Basel) 2022; 12:life12040550. [PMID: 35455041 PMCID: PMC9025372 DOI: 10.3390/life12040550] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/10/2022] [Revised: 03/27/2022] [Accepted: 04/04/2022] [Indexed: 12/19/2022]
Abstract
Atopic dermatitis and psoriasis are members of a family of inflammatory skin disorders. Cellular immune responses in skin tissues contribute to the development of these diseases. However, their underlying immune mechanisms remain to be fully elucidated. We developed a computational pipeline for analyzing the single-cell RNA-sequencing profiles of the Human Cell Atlas skin dataset to investigate the pathological mechanisms of skin diseases. First, we applied the maximum relevance criterion and the Boruta feature selection method to exclude irrelevant gene features from the single-cell gene expression profiles of inflammatory skin disease samples and healthy controls. The retained gene features were ranked by using the Monte Carlo feature selection method on the basis of their importance, and a feature list was compiled. This list was then introduced into the incremental feature selection method that combined the decision tree and random forest algorithms to extract important cell markers and thus build excellent classifiers and decision rules. These cell markers and their expression patterns have been analyzed and validated in recent studies and are potential therapeutic and diagnostic targets for skin diseases because their expression affects the pathogenesis of inflammatory skin diseases.
Affiliation(s)
- Xianchao Zhou
- School of Life Sciences, Shanghai University, Shanghai 200444, China; (X.Z.); (S.D.)
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
| | - Shijian Ding
- School of Life Sciences, Shanghai University, Shanghai 200444, China; (X.Z.); (S.D.)
| | - Deling Wang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Department of Medical Imaging, Sun Yat-sen University Cancer Center, Guangzhou 510060, China;
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China;
| | - Kaiyan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou 510507, China;
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Correspondence: (T.H.); (Z.L.); (Y.C.); Tel.: +86-21-54923269 (T.H.); +86-21-66136132 (Y.C.)
| | - Zhandong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun 130052, China
- Correspondence: (T.H.); (Z.L.); (Y.C.); Tel.: +86-21-54923269 (T.H.); +86-21-66136132 (Y.C.)
| | - Yudong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, China; (X.Z.); (S.D.)
- Correspondence: (T.H.); (Z.L.); (Y.C.); Tel.: +86-21-54923269 (T.H.); +86-21-66136132 (Y.C.)
|
34
|
Similarity-Based Method with Multiple-Feature Sampling for Predicting Drug Side Effects. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:9547317. [PMID: 35401786 PMCID: PMC8993545 DOI: 10.1155/2022/9547317] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/20/2021] [Revised: 09/18/2021] [Accepted: 03/15/2022] [Indexed: 12/23/2022]
Abstract
Drugs can treat different diseases but also bring side effects. Undetected and unaccepted side effects for approved drugs can greatly harm the human body and bring huge risks for pharmaceutical companies. Traditional experimental methods used to determine the side effects have several drawbacks, such as low efficiency and high cost. One alternative to achieve this purpose is to design computational methods. Previous studies modeled a binary classification problem by pairing drugs and side effects; however, their classifiers can only extract one feature from each type of drug association. The present work proposed a novel multiple-feature sampling scheme that can extract several features from one type of drug association. Thirteen classification algorithms were employed to construct classifiers with features yielded by such scheme. Their performance was greatly improved compared with that of the classifiers that use the features yielded by the original scheme. Best performance was observed for the classifier based on random forest with MCC of 0.8661, AUROC of 0.969, and AUPR of 0.977. Finally, one key parameter in the multiple-feature sampling scheme was analyzed.
|
35
|
Li Z, Wang D, Liao H, Zhang S, Guo W, Chen L, Lu L, Huang T, Cai YD. Exploring the Genomic Patterns in Human and Mouse Cerebellums Via Single-Cell Sequencing and Machine Learning Method. Front Genet 2022; 13:857851. [PMID: 35309141 PMCID: PMC8930846 DOI: 10.3389/fgene.2022.857851] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/19/2022] [Accepted: 02/09/2022] [Indexed: 12/29/2022]
Abstract
In mammals, the cerebellum plays an important role in movement control. Cellular research reveals that the cerebellum involves a variety of sub-cell types, including Golgi, granule, interneuron, and unipolar brush cells. The functional characteristics of cerebellar cells exhibit considerable differences among diverse mammalian species, reflecting a potential development and evolution of the nervous system. In this study, we aimed to recognize the transcriptional differences between human and mouse cerebellum in four cerebellar sub-cell types by using single-cell sequencing data and machine learning methods. Single-cell sequencing data for a total of 321,387 cells were used, covering four cell types, i.e., Golgi (5,048, 1.57%), granule (250,307, 77.88%), interneuron (60,526, 18.83%), and unipolar brush (5,506, 1.72%) cells. Our results showed that by using gene expression profiles as features, the optimal classification model could achieve very high, even perfect, performance for Golgi, granule, interneuron, and unipolar brush cells, suggesting a remarkable difference between the genomic profiles of human and mouse. Furthermore, a group of related genes and rules contributing to the classification was identified, which might provide helpful information for deepening the understanding of cerebellar cell heterogeneity and evolution.
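The cell-type counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts given above; the variable names are illustrative. The four counts sum to the stated total, and the recomputed shares match the abstract except that the unipolar brush share comes out near 1.71% rather than 1.72%, which may simply reflect rounding chosen so that the published shares sum to 100%.

```python
# Cell counts as quoted in the abstract.
counts = {
    "Golgi": 5048,
    "granule": 250307,
    "interneuron": 60526,
    "unipolar brush": 5506,
}

total = sum(counts.values())

# Percentage share of each cell type, rounded to two decimals.
shares = {cell: round(100 * n / total, 2) for cell, n in counts.items()}

for cell, share in shares.items():
    print(f"{cell}: {counts[cell]} cells ({share}%)")
```

The strong imbalance (granule cells make up more than three quarters of the data) is worth keeping in mind when reading per-class performance figures.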
Affiliation(s)
- ZhanDong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Deling Wang
- Department of Radiology, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
| | - HuiPing Liao
- Eye Institute of Shandong University of Traditional Chinese Medicine, Jinan, China
| | - ShiQi Zhang
- Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - Lin Lu
- Department of Radiology, Columbia University Medical Center, New York, NY, United States
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
36
|
Meng C, Ju Y, Shi H. TMPpred: A support vector machine-based thermophilic protein identifier. Anal Biochem 2022; 645:114625. [PMID: 35218736 DOI: 10.1016/j.ab.2022.114625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/26/2021] [Revised: 02/18/2022] [Accepted: 02/21/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION Thermostability allows proteins to overcome temperature constraints and perform more functions. Using machine learning, we explored the mechanism of and reasons for protein thermostability characteristics. RESULTS Unlike other methods that only pursue model performance, we aim to find important features so as to provide a useful reference for in vitro experiments. We transformed this problem into a binary classification problem, that is, the distinction between thermophilic proteins and nonthermophilic proteins. Using support vector machine-based model construction and analysis, we inferred that Gly, Ala, Ser and Thr may be the most important components at the residue level that determine the thermal stability of proteins. It is also noteworthy that our proposed model obtains an Sn of 0.892, an Sp of 0.857, an ACC of 0.87566 and an AUC of 0.874. To facilitate use by other researchers, we wrapped our model and deployed it as a web server, which is accessible at http://112.124.26.17:7000/TMPpred/index.html.
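The Sn, Sp and ACC figures reported here are standard confusion-matrix ratios for a binary classifier. A minimal sketch of how they are computed follows; the counts are hypothetical and are not taken from the paper, with "positive" standing for thermophilic proteins.

```python
def sensitivity(tp, fn):
    """Sn: fraction of actual positives that were predicted positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Sp: fraction of actual negatives that were predicted negative."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """ACC: fraction of all samples that were predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion-matrix counts, for illustration only.
tp, fn = 90, 10   # thermophilic proteins: correctly / incorrectly classified
tn, fp = 80, 20   # nonthermophilic proteins: correctly / incorrectly classified

sn = sensitivity(tp, fn)
sp = specificity(tn, fp)
acc = accuracy(tp, tn, fp, fn)
print(f"Sn={sn:.3f} Sp={sp:.3f} ACC={acc:.3f}")
```

With balanced classes, ACC is the mean of Sn and Sp, which is a quick sanity check when reading reported metrics.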
Affiliation(s)
- Chaolu Meng
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China; Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Hohhot, China
| | - Ying Ju
- School of Informatics, Xiamen University, Xiamen, China.
| | - Hua Shi
- School of Opto-electronic and Communication Engineering, Xiamen University of Technology, Xiamen, China.
|
37
|
Predicting Heart Cell Types by Using Transcriptome Profiles and a Machine Learning Method. Life (Basel) 2022; 12:life12020228. [PMID: 35207515 PMCID: PMC8877019 DOI: 10.3390/life12020228] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 01/14/2022] [Revised: 01/29/2022] [Accepted: 01/29/2022] [Indexed: 11/17/2022]
Abstract
The heart is an essential organ in the human body. It contains various types of cells, such as cardiomyocytes, mesothelial cells, endothelial cells, and fibroblasts. The interactions between these cells determine the vital functions of the heart. Therefore, identifying the different cell types and revealing the expression rules in these cell types are crucial. In this study, multiple machine learning methods were used to analyze heart single-cell profiles covering 11 different heart cell types. The single-cell profiles were first analyzed via the light gradient boosting machine method to evaluate the importance of gene features in the profiling dataset, and a ranked feature list was produced. This feature list was then fed into the incremental feature selection method to identify the best features and build the optimal classifiers. The results suggested that the best decision tree (DT) and random forest classification models achieved the highest weighted F1 scores of 0.957 and 0.981, respectively. Recent research and enrichment analysis indicate that the selected features, such as NPPA, LAMA2, and DLC1, and the classification rules extracted from the optimal DT classifier play a crucial role in cardiac structure and function. In particular, some lncRNAs (LINC02019, NEAT1) were found to be quite important for the recognition of different cardiac cell types. In summary, these findings provide a solid academic foundation for the development of molecular diagnostics and biomarker discovery for cardiac diseases.
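The weighted F1 score used to compare the classifiers averages the per-class F1 scores, weighting each class by its number of true samples. The sketch below is a plain-Python illustration of that definition; the labels and predictions are hypothetical and are not data from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical labels for six cells across three cell types.
y_true = ["cardiomyocyte"] * 3 + ["fibroblast"] * 2 + ["endothelial"]
y_pred = ["cardiomyocyte", "cardiomyocyte", "fibroblast",
          "fibroblast", "fibroblast", "endothelial"]

score = weighted_f1(y_true, y_pred)
print(f"weighted F1 = {score:.4f}")
```

Because each class is weighted by its support, abundant cell types dominate the score, so a high weighted F1 does not by itself show that rare cell types are classified well.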
|
38
|
Predicting RNA 5-Methylcytosine Sites by Using Essential Sequence Features and Distributions. BIOMED RESEARCH INTERNATIONAL 2022; 2022:4035462. [PMID: 35071593 PMCID: PMC8776474 DOI: 10.1155/2022/4035462] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 09/15/2021] [Revised: 12/07/2021] [Accepted: 12/22/2021] [Indexed: 12/15/2022]
Abstract
Methylation is one of the most common and considerable modifications in biological systems mediated by multiple enzymes. Recent studies have shown that methylation has been widely identified in different RNA molecules. RNA methylation modifications are of various kinds, such as 5-methylcytosine (m5C). However, the functions of individual methylation sites remain to be elucidated. Testing of all methylation sites relies heavily on high-throughput sequencing technology, which is expensive and labor-intensive. Thus, computational prediction approaches could serve as a substitute. In this study, multiple machine learning models were used to predict possible RNA m5C sites on the basis of mRNA sequences in human and mouse. Each site was represented by several features derived from k-mers of an RNA subsequence containing such site as center. The powerful max-relevance and min-redundancy (mRMR) feature selection method was employed to analyse these features. The resulting feature list was fed into the incremental feature selection method, incorporating four classification algorithms, to build efficient models. Furthermore, the sites related to features used in the models were also investigated.
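To make the idea of k-mer features on a site-centered subsequence concrete, here is a minimal sketch. The abstract does not give the exact feature definitions, so the k-mer frequency encoding, the window size, the function names and the example sequence below are all assumptions for illustration.

```python
from collections import Counter
from itertools import product


def centered_window(sequence, site, flank):
    """Cut the subsequence with `flank` bases on each side of `site` (0-based)."""
    return sequence[max(site - flank, 0): site + flank + 1]


def kmer_features(subsequence, k):
    """Return the frequency of every possible RNA k-mer in the subsequence."""
    n_windows = max(len(subsequence) - k + 1, 1)
    counts = Counter(subsequence[i:i + k] for i in range(len(subsequence) - k + 1))
    return {
        "".join(kmer): counts["".join(kmer)] / n_windows
        for kmer in product("ACGU", repeat=k)
    }


# Hypothetical mRNA fragment; index 7 is the candidate cytosine.
sequence = "AUGGCUACGUCCAUGA"
window = centered_window(sequence, site=7, flank=3)
features = kmer_features(window, k=2)
print(window, len(features))
```

Each site then becomes a fixed-length numeric vector (4^k entries for a given k), which is the kind of input a feature selection method such as mRMR can rank.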
|
39
|
Ershov PV, Mezentsev YV, Ivanov AS. Interfacial Peptides as Affinity Modulating Agents of Protein-Protein Interactions. Biomolecules 2022; 12:106. [PMID: 35053254 PMCID: PMC8773757 DOI: 10.3390/biom12010106] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/27/2021] [Revised: 01/06/2022] [Accepted: 01/06/2022] [Indexed: 12/25/2022]
Abstract
The identification of disease-related protein-protein interactions (PPIs) creates objective conditions for their pharmacological modulation. The contact areas (interfaces) of the vast majority of PPIs have some features, such as geometrical and biochemical complementarities, "hot spots", and an extremely low mutation rate, that give us key knowledge for influencing these PPIs. Exogenous regulation of PPIs is aimed at inhibiting the assembly and/or destabilizing protein complexes. Often, the design of such modulators is associated with specific problems in targeted delivery, cell penetration and proteolytic stability, as well as selective binding to cellular targets. Recent progress in interfacial peptide design has addressed these difficulties and has provided good efficiency in preclinical models (in vitro and in vivo). The most promising peptide-containing therapeutic formulations are under investigation in clinical trials. In this review, we update the current state of the art in the field of interfacial peptides as potent modulators of a number of disease-related PPIs. Over the past years, scientific interest has focused on the following clinically significant heterodimeric PPIs: MDM2/p53, PD-1/PD-L1, HIF/HIF, NRF2/KEAP1, RbAp48/MTA1, HSP90/CDC37, BIRC5/CRM1, BIRC5/XIAP, YAP/TAZ-TEAD, TWEAK/FN14, Bcl-2/Bax, YY1/AKT, CD40/CD40L and MINT2/APP.
Affiliation(s)
- Pavel V. Ershov
- Institute of Biomedical Chemistry, 119121 Moscow, Russia; (Y.V.M.); (A.S.I.)
|
40
|
Khan K, Zafar S, Hafeez A, Badshah Y, Shahid K, Mahmood Ashraf N, Shabbir M. PRKCE non-coding variants influence on transcription as well as translation of its gene. RNA Biol 2022; 19:1115-1129. [PMID: 36299231 PMCID: PMC9621080 DOI: 10.1080/15476286.2022.2139110] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/15/2022] [Revised: 10/10/2022] [Accepted: 10/17/2022] [Indexed: 10/31/2022]
Abstract
Untranslated regions of the gene play a crucial role in gene expression regulation at mRNA and protein levels. Mutations at UTRs impact expression by altering transcription factor binding, transcriptional/translational efficacy, miRNA-mediated gene regulation, mRNA secondary structure, ribosomal translocation, and stability. PKCε, a serine/threonine kinase, is aberrantly expressed in numerous diseases such as cardiovascular disorders, neurological disorders, and cancers; its probable cause is unknown. Therefore, in the current study, the influence of PRKCE 5'-and 3'UTR variants was explored for their potential impact on its transcription and translation through several bioinformatics approaches. UTR variants data was obtained through different databases and initially evaluated for their regulatory function. Variants with regulatory function were then studied for their effect on PRKCE binding with transcription factors (TF) and miRNAs, as well as their impact on mRNA secondary structure. Study outcomes indicated the regulatory function of 73 5'UTR and 17 3'UTR variants out of 376. 5'UTR variants introduced AP1 binding sites and promoted the PRKCE transcription. Four 3'UTR variants introduced a circular secondary structure, increasing PRKCE translational efficacy. A region in 5'UTR position 45,651,564 to 45,651,644 was found where variants readily influenced the miRNA-PRKCE mRNA binding. The study further highlighted a PKCε-regulated feedback loop mechanism that induces the activity of TFs, promoting its gene transcription. The study provides foundations for experimentation to understand these variants' role in diseases. These variants can also serve as the genetic markers for different diseases' diagnoses after validation at the cell and population levels.
Affiliation(s)
- Khushbukhat Khan
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Sameen Zafar
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Amna Hafeez
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Yasmin Badshah
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Kanza Shahid
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Naeem Mahmood Ashraf
- School of Biochemistry & Biotechnology, University of the Punjab, Lahore, Pakistan
| | - Maria Shabbir
- Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
|
41
|
Hu M, Wang J. Identification of Hub Genes and Immune Cell Infiltration Characteristics in Alzheimer's Disease. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:7036194. [PMID: 34966527 PMCID: PMC8712155 DOI: 10.1155/2021/7036194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/21/2021] [Revised: 11/18/2021] [Accepted: 11/26/2021] [Indexed: 12/17/2022]
Abstract
The purpose of this study was to identify hub genes closely correlated with Alzheimer's disease (AD) and their association with immune cell infiltration. In this work, 119 overlapping differentially expressed genes (DEGs) were obtained from GSE5281 and GSE122063 datasets through differential expression analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on the 119 DEGs, revealing some important biological functions and key pathways. AD immune cell infiltration analysis revealed a significant difference in the proportion of immune cells between the AD group and the control group. Finally, correlation analysis between target hub genes and immune cells indicated that GFAP had a positive or negative correlation with some specific immune cells. Our results provided useful clues, which will help to explain the molecular mechanism of AD and search for precise prognostic markers and potential therapeutic targets.
Affiliation(s)
- Ming Hu
- Department of Graduate School, Hebei Medical University, Shijiazhuang 050000, Hebei, China
| | - Jianhua Wang
- Department of Graduate School, Hebei Medical University, Shijiazhuang 050000, Hebei, China
- Department of Neurology, Hebei General Hospital, Shijiazhuang 050051, Hebei, China
|
42
|
Ding S, Li H, Zhang YH, Zhou X, Feng K, Li Z, Chen L, Huang T, Cai YD. Identification of Pan-Cancer Biomarkers Based on the Gene Expression Profiles of Cancer Cell Lines. Front Cell Dev Biol 2021; 9:781285. [PMID: 34917619 PMCID: PMC8669964 DOI: 10.3389/fcell.2021.781285] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/22/2021] [Accepted: 11/16/2021] [Indexed: 12/12/2022] Open
Abstract
There are many types of cancer. Although they share some hallmarks, such as proliferation and metastasis, they differ in many respects and grow in different organs or tissues. Does each cancer have a unique gene expression pattern that distinguishes it from other cancer types? Since the Cancer Genome Atlas (TCGA) project, there have been more and more pan-cancer studies. Researchers want to obtain robust gene expression signatures from pan-cancer patients, but heterogeneity causes large variance among cancer patients, and the sample size needed for robust results is too large to recruit. In this study, we tried another approach to obtaining robust pan-cancer biomarkers: using cell line data to reduce the variance. We applied several advanced computational methods to analyze the Cancer Cell Line Encyclopedia (CCLE) gene expression profiles, which included 988 cell lines from 20 cancer types. Two feature selection methods, Boruta and max-relevance and min-redundancy, were applied to the cell line gene expression data one after the other, generating a feature list. This list was fed into the incremental feature selection method, which incorporated one classification algorithm, to extract biomarkers and construct optimal classifiers and decision rules. The optimal classifiers performed well and can be useful tools for identifying cell lines from different cancer types, whereas the biomarkers (e.g. NCKAP1, TNFRSF12A, LAMB2, FKBP9, PFN2, TOM1L1) and rules identified in this work may provide a meaningful and precise reference for differentiating multiple types of cancer and contribute to the personalized treatment of tumors.
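The incremental feature selection step described in this abstract can be illustrated with a minimal sketch. It assumes a feature list that has already been ranked (in the study, by Boruta and max-relevance and min-redundancy) and a scoring callback that stands in for training and validating a classifier. The function name, the `evaluate` callback, and the toy gene labels are hypothetical and are not taken from the paper.

```python
def incremental_feature_selection(ranked_features, evaluate, step=1):
    """Score the top-k features for growing k and keep the best-scoring subset.

    ranked_features: features ordered from most to least important.
    evaluate: callable taking a list of features and returning a score
              (assumed here to stand in for cross-validated classifier performance).
    """
    best_subset, best_score = [], float("-inf")
    history = []
    for k in range(step, len(ranked_features) + 1, step):
        subset = ranked_features[:k]
        score = evaluate(subset)
        history.append((k, score))
        if score > best_score:  # keep the smallest subset reaching the best score
            best_subset, best_score = list(subset), score
    return best_subset, best_score, history


# Toy usage: two informative features, with a small penalty per feature used.
informative = {"gene_a", "gene_b"}  # hypothetical labels, not real biomarkers


def toy_evaluate(subset):
    return sum(1 for f in subset if f in informative) - 0.1 * len(subset)


ranked = ["gene_a", "gene_b", "gene_c", "gene_d"]
best_subset, best_score, history = incremental_feature_selection(ranked, toy_evaluate)
```

In a real analysis the callback would train the chosen classifier on each candidate subset and return a cross-validated metric; the loop itself would stay the same.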
Affiliation(s)
- ShiJian Ding
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Hao Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
| | - XianChao Zhou
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - KaiYan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
| | - ZhanDong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - Tao Huang
- CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
43
|
Chen L, Li Z, Zeng T, Zhang YH, Zhang S, Huang T, Cai YD. Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features. Front Genet 2021; 12:783128. [PMID: 34804131 PMCID: PMC8603309 DOI: 10.3389/fgene.2021.783128] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/25/2021] [Accepted: 10/22/2021] [Indexed: 12/12/2022] Open
Abstract
Given the limitations of current technologies, the subcellular localizations of proteins are difficult to identify. Predicting the subcellular localization and the intercellular distribution patterns of proteins in accordance with their specific biological roles, including validated functions, relationships with other proteins, and even their specific sequence characteristics, is necessary. The computational prediction of protein subcellular localizations can be performed on the basis of sequence and functional characteristics. In this study, the protein-protein interaction network, functional annotation of proteins, and a group of direct proteins with known subcellular localization were used to construct models. To build efficient models, several powerful machine learning algorithms, including two feature selection methods and four classification algorithms, were employed. Some key proteins and functional terms were discovered, which may contribute substantially to determining protein subcellular locations. Furthermore, some quantitative rules were established to identify the potential subcellular localizations of proteins. As the first prediction model that uses direct protein annotation information (i.e., functional features) and a STRING-based protein-protein interaction network (i.e., network features), our computational model can help promote the development of predictive technologies for subcellular localization and provide a new approach for exploring protein subcellular localization patterns and their potential biological importance.
Affiliation(s)
- Lei Chen
- School of Life Sciences, Shanghai University, Shanghai, China
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - ZhanDong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Tao Zeng
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - ShiQi Zhang
- Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
44
|
Xu C, Zheng H, Liu T, Zhang Y, Feng Y. Bioinformatics analysis identifies CSF1R as an essential gene mediating Neuropathic pain - Experimental research. Int J Surg 2021; 95:106140. [PMID: 34628075 DOI: 10.1016/j.ijsu.2021.106140] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/30/2021] [Revised: 09/14/2021] [Accepted: 10/04/2021] [Indexed: 10/20/2022]
Abstract
BACKGROUND Neuropathic pain (NP) severely affects quality of life; however, there is no effective long-term treatment. The spinal dorsal horn (SDH) is an essential target for studying NP mechanisms and clinical treatments. MATERIALS AND METHODS We searched the Gene Expression Omnibus (GEO) for datasets of SDH microarray changes in mouse NP models. Bioinformatics analysis was conducted to identify differentially expressed genes (DEGs), DEG enrichment pathways, and critical hub genes in the datasets. Finally, we explored the expression, function, and relevant mechanisms of the most critical hub gene in the mouse NP model. RESULTS Two SDH microarray datasets for the mouse NP model, GSE75072 and GSE111216, were retrieved from GEO. We found 43 overlapping DEGs in the datasets, primarily in the inflammatory and immune pathways. The most essential hub gene was the colony-stimulating factor 1 receptor (CSF1R). Seven days after creating the mouse NP model (spared nerve injury, SNI) or the sham model, the expression of CSF1R and microglia increased significantly in the SDH of the SNI group. PLX3397, an inhibitor of CSF1R, reduced SDH CSF1R and microglia expression after SNI and significantly alleviated hyperalgesia in the SNI mice. CONCLUSION SDH CSF1R participates in regulating NP, which is related to changes in the activity of microglia in the SDH.
Affiliation(s)
- Chao Xu
- Department of Anesthesiology, Peking University People's Hospital, Beijing, China
- Neuroscience Research Institute and Department of Neurobiology, School of Basic Medical Sciences, Peking University; Key Laboratory for Neuroscience, Ministry of Education and National Health Commission, Peking University, Beijing, China
- Key Laboratory of Anesthesia and Analgesia, Xuzhou Medical University, Xuzhou, China
- Department of Anesthesiology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
|
45
|
iMPT-FDNPL: Identification of Membrane Protein Types with Functional Domains and a Natural Language Processing Approach. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:7681497. [PMID: 34671418 PMCID: PMC8523280 DOI: 10.1155/2021/7681497] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 08/22/2021] [Revised: 09/15/2021] [Accepted: 09/27/2021] [Indexed: 12/20/2022]
Abstract
Membrane proteins are an important class of proteins that play essential roles in several cellular processes. Based on their intramolecular arrangements and positions in a cell, membrane proteins can be divided into several types. The type of a membrane protein is reported to be highly related to its functions. Determining membrane protein types has been a hot topic in recent years, and many computational methods have been proposed so far. Some of them used functional domain information to encode proteins. However, this procedure was still crude. In this study, we designed a novel feature extraction scheme to obtain informative features of proteins from their functional domain information. The scheme treated domains as words and proteins, each represented by its domains, as sentences. The natural language processing approach, word2vector, was applied to obtain the features of domains, which were further refined into protein features. Based on these features, RAndom k-labELsets with random forest as the base classifier was employed to build the multilabel classifier, namely, iMPT-FDNPL. The tenfold cross-validation results indicated the good performance of this classifier. Furthermore, the classifier was superior to other classifiers based on features derived from functional domains via a one-hot scheme or derived from other properties of proteins, suggesting the effectiveness of the protein features generated by the proposed scheme.
|
46
|
Identification of Novel Choroidal Neovascularization-Related Genes Using Laplacian Heat Diffusion Algorithm. BIOMED RESEARCH INTERNATIONAL 2021; 2021:2295412. [PMID: 34532497 PMCID: PMC8440095 DOI: 10.1155/2021/2295412] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/09/2021] [Accepted: 08/20/2021] [Indexed: 11/20/2022]
Abstract
Choroidal neovascularization (CNV) is a type of eye disease that can cause vision loss. In recent years, many studies have attempted to investigate the major pathological processes and molecular pathogenic mechanisms of CNV. Because many diseases are related to genes, the genes associated with CNV need to be identified. In this study, we proposed a network-based approach for identifying novel CNV-associated genes. To execute this method, we first employed a protein-protein interaction network reported in STRING. Then, we applied a network diffusion algorithm, Laplacian heat diffusion, to this network, selecting validated CNV-related genes as the seed nodes. As a result, some novel genes that had unknown but strong relationships with validated genes were identified. Furthermore, we used a screening procedure to extract the most essential genes. Eleven latent CNV-related genes were finally obtained. Extensive analyses were performed to confirm that these genes are novel CNV-related genes.
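The idea of seeding heat on validated genes and letting it spread over an interaction network can be sketched as follows. This is an illustrative approximation only: it integrates dh/dt = -L h (with L = D - A, the unnormalized graph Laplacian) by small explicit Euler steps on an unweighted toy graph. The abstract does not state the diffusion time, edge weighting, or Laplacian normalization used in the study, so the function name, parameters, and node labels here are assumptions.

```python
def heat_diffusion(edges, seeds, t=1.0, steps=1000):
    """Approximate h(t) = exp(-t * L) h(0) on an undirected, unweighted graph.

    edges: iterable of (u, v) node pairs.
    seeds: nodes that start with heat (split equally); must appear in edges.
    t, steps: total diffusion time and number of Euler steps (illustrative values).
    """
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    heat = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    dt = t / steps
    for _ in range(steps):
        # (L h)[n] = degree(n) * h[n] - sum of h over the neighbors of n
        heat = {
            n: heat[n]
            - dt * (len(neighbors[n]) * heat[n] - sum(heat[m] for m in neighbors[n]))
            for n in nodes
        }
    return heat


# Toy path graph with hypothetical node labels; "A" plays the role of a seed gene.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
toy_heat = heat_diffusion(toy_edges, seeds={"A"})
ranking = sorted(toy_heat, key=toy_heat.get, reverse=True)
```

Non-seed nodes that end up with the most heat are the candidates; on a real network one would typically use a matrix exponential routine and weighted edges rather than this step-by-step loop.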
|
47
|
Chen L, Zhou X, Zeng T, Pan X, Zhang YH, Huang T, Fang Z, Cai YD. Recognizing Pattern and Rule of Mutation Signatures Corresponding to Cancer Types. Front Cell Dev Biol 2021; 9:712931. [PMID: 34513841 PMCID: PMC8427289 DOI: 10.3389/fcell.2021.712931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2021] [Accepted: 07/02/2021] [Indexed: 11/20/2022] Open
Abstract
Cancer has been generally defined as a cluster of systematic malignant pathogenesis involving abnormal cell growth. Genetic mutations derived from environmental factors and inherited genetics trigger the initiation and progression of cancers. Although several well-known factors affect cancer, the mutation features and rules that affect cancers remain relatively unknown because related studies are limited. In this study, a computational investigation of the mutation profiles of cancer samples from 27 types was performed. These profiles were first analyzed by the Monte Carlo Feature Selection (MCFS) method, yielding a feature list. Then, the incremental feature selection (IFS) method used this list to extract essential mutation features related to the 27 cancer types, identify 207 mutation rules, and construct efficient classifiers. The top 37 mutation features corresponding to different cancer types were discussed. All the qualitatively analyzed gene mutation features contribute to distinguishing different types of cancer, and most of the mutation rules are supported by recent literature. Therefore, our computational investigation could identify potential biomarkers and prediction rules for cancers at the mutation signature level.
Affiliation(s)
- Lei Chen
- School of Life Sciences, Shanghai University, Shanghai, China
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - Xianchao Zhou
- School of Life Sciences and Technology, ShanghaiTech University, Shanghai, China
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Tao Zeng
- CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Xiaoyong Pan
- Key Laboratory of System Control and Information Processing, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Ministry of Education of China, Shanghai, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
| | - Tao Huang
- CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
| | - Zhaoyuan Fang
- Zhejiang University-University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
48
|
Huang GH, Zhang YH, Chen L, Li Y, Huang T, Cai YD. Identifying Lung Cancer Cell Markers with Machine Learning Methods and Single-Cell RNA-Seq Data. Life (Basel) 2021; 11:life11090940. [PMID: 34575089 PMCID: PMC8467493 DOI: 10.3390/life11090940] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/11/2021] [Revised: 09/03/2021] [Accepted: 09/06/2021] [Indexed: 11/21/2022] Open
Abstract
Non-small cell lung cancer is a major lethal subtype of epithelial lung cancer, with high morbidity and mortality. The single-cell sequencing technique plays a key role in exploring the pathogenesis of non-small cell lung cancer. We proposed a computational method for distinguishing cell subtypes from the different pathological regions of non-small cell lung cancer on the basis of transcriptomic profiles, including a group of qualitative classification criteria (biomarkers) and various rules. The random forest classifier reached a Matthews correlation coefficient (MCC) of 0.922 by using 720 features, and the decision tree reached an MCC of 0.786 by using 1880 features. The obtained biomarkers and rules were analyzed at the end of this study.
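For readers unfamiliar with the metric reported above, the following sketch computes the Matthews correlation coefficient in its standard multiclass form from true and predicted labels. It is only an illustration of the metric: the function name and the toy labels are assumptions, and the abstract does not say which implementation the authors used.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    Uses MCC = (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2)),
    where s is the number of samples, c the number of correct predictions,
    t_k the number of samples truly in class k, and p_k the number predicted as k.
    """
    s = len(y_true)
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    sum_pt = sum(p_counts[k] * t_counts[k] for k in labels)
    sum_pp = sum(v * v for v in p_counts.values())
    sum_tt = sum(v * v for v in t_counts.values())

    denom = sqrt((s * s - sum_pp) * (s * s - sum_tt))
    # By convention, return 0.0 when the coefficient is undefined
    # (for example, when every sample is predicted as the same class).
    return (c * s - sum_pt) / denom if denom else 0.0


# Toy labels (hypothetical cell-type classes), not data from the study.
perfect = multiclass_mcc([0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 2, 2])
partial = multiclass_mcc([1, 1, 0, 0], [1, 0, 0, 0])
```

For two classes this expression reduces to the familiar (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).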
Affiliation(s)
- Guo-Hua Huang
- School of Life Sciences, Shanghai University, Shanghai 200444, China
- Department of Mechanical and Energy Engineering, Shaoyang University, Shaoyang 422000, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - You Li
- Department of Mechanical and Energy Engineering, Shaoyang University, Shaoyang 422000, China
| | - Tao Huang
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
- Correspondence: Tel.: +86-21-54923269
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, China
- Correspondence: Tel.: +86-21-66136132
|
49
|
Li Y, Yu X, Wang Y, Zheng X, Chu Q. Kaempferol-3-O-rutinoside, a flavone derived from Tetrastigma hemsleyanum, suppresses lung adenocarcinoma via the calcium signaling pathway. Food Funct 2021; 12:8351-8365. [PMID: 34338262 DOI: 10.1039/d1fo00581b] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 12/26/2022]
Abstract
Lung cancer has long threatened human health worldwide, yet clinical therapies remain unsatisfactory. In this study, the activity of Tetrastigma hemsleyanum tuber flavonoids (THTF) against the lung adenocarcinoma A549 cell line was evaluated in vivo, and isobaric tags for relative and absolute quantification (iTRAQ)-based proteomic analysis was conducted to detect protein alterations in THTF-treated solid tumors. The differentially expressed proteins were related to the cytoskeleton and mostly accumulated in the calcium signaling pathway. The in vitro study showed that 80 μg mL⁻¹ THTF significantly suppressed cellular viability to approximately 75% of the control. Further results suggested that kaempferol-3-O-rutinoside (K3R), the major component of THTF, effectively triggered cytoskeleton collapse, mitochondrial dysfunction, and consequent calcium overload to achieve apoptosis, which was consistent with the proteomic results. This study uncovers a new mechanism for the anti-tumor activity of THTF and suggests THTF and K3R as promising anti-cancer agents, providing new ideas and possible strategies for future lung cancer prevention and therapy.
Affiliation(s)
- Yonglu Li
- Department of Food Science and Nutrition; Zhejiang Key Laboratory for Agro-food Processing; Fuli Institute of Food Science; National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang University, Hangzhou 310058, People's Republic of China.
| | - Xin Yu
- Department of Food Science and Nutrition; Zhejiang Key Laboratory for Agro-food Processing; Fuli Institute of Food Science; National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang University, Hangzhou 310058, People's Republic of China.
| | - Yaxuan Wang
- Department of Food Science and Nutrition; Zhejiang Key Laboratory for Agro-food Processing; Fuli Institute of Food Science; National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang University, Hangzhou 310058, People's Republic of China.
| | - Xiaodong Zheng
- Department of Food Science and Nutrition; Zhejiang Key Laboratory for Agro-food Processing; Fuli Institute of Food Science; National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang University, Hangzhou 310058, People's Republic of China.
| | - Qiang Chu
- Department of Food Science and Nutrition; Zhejiang Key Laboratory for Agro-food Processing; Fuli Institute of Food Science; National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang University, Hangzhou 310058, People's Republic of China.
- State Key Laboratory of Silicon Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou 310027, People's Republic of China
|
50
|
Tong C, Hu H, Chen G, Li Z, Li A, Zhang J. Chlorine disinfectants promote microbial resistance in Pseudomonas sp. ENVIRONMENTAL RESEARCH 2021; 199:111296. [PMID: 34010624 DOI: 10.1016/j.envres.2021.111296] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 02/21/2021] [Revised: 04/25/2021] [Accepted: 05/04/2021] [Indexed: 06/12/2023]
Abstract
The substantial use of disinfectants has increased antibiotic resistance, thereby mediating serious ecological safety issues worldwide. Accumulating studies have reported the role of chlorine disinfectants in promoting disinfectant resistance. The present study sought to investigate the role of chlorine disinfectants in developing multiple resistance in Pseudomonas sp. isolated from the river through antioxidant enzyme measurement, global transcriptional analyses, Gene Ontology (GO), and the Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. The results demonstrated that 100 mg/L sodium hypochlorite could increase disinfectant resistance and antibiotic resistance. The SOS response (a conserved response to DNA damage) triggered by oxidative stress makes bacteria resistant to chlorine. An increase in antibiotic resistance could be attributed to a decreased membrane permeability, increased expression of MuxABC-OpmB efflux pump, beta-lactamase, and antioxidant enzymes. Additionally, KEGG enrichment analysis suggested that the differentially expressed genes were highly enriched in the metabolic pathways. In summary, the study results revealed the impact of chlorine disinfectants in promoting microbial disinfectant resistance and antibiotic resistance. This study will provide insight into disinfectant resistance mechanisms.
Affiliation(s)
- Chaoyu Tong
- College of Environmental Science and Engineering, Ocean University of China, Qingdao, 266100, China.
| | - Hong Hu
- College of Environmental Science and Engineering, Ocean University of China, Qingdao, 266100, China.
| | - Gang Chen
- College of Marine Life Sciences, Ocean University of China, Qingdao, 266003, China.
| | - Zhengyan Li
- College of Environmental Science and Engineering, Ocean University of China, Qingdao, 266100, China.
| | - Aifeng Li
- College of Environmental Science and Engineering, Ocean University of China, Qingdao, 266100, China.
| | - Jianye Zhang
- College of Marine Life Sciences, Ocean University of China, Qingdao, 266003, China.
|