1
Zheng H, Xu L, Xie H, Xie J, Ma Y, Hu Y, Wu L, Chen J, Wang M, Yi Y, Huang Y, Wang D. RIscoper 2.0: A deep learning tool to extract RNA biomedical relation sentences from literature. Comput Struct Biotechnol J 2024; 23:1469-1476. [PMID: 38623560 PMCID: PMC11016866 DOI: 10.1016/j.csbj.2024.03.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/15/2024] [Accepted: 03/21/2024] [Indexed: 04/17/2024] Open
Abstract
RNA plays an extensive role in a multi-dimensional regulatory system, and its biomedical relationships are scattered across numerous biological studies. However, text mining work dedicated to extracting RNA biomedical relations remains limited. In this study, we established a comprehensive and reliable corpus of RNA biomedical relations comprising over 30,000 sentences manually curated from more than 15,000 biomedical publications. We also updated RIscoper to version 2.0, a BERT-based deep learning tool that extracts RNA biomedical relation sentences from the literature. Drawing on approximately 100,000 annotated named entities, we integrated the text classification and named entity recognition tasks in this tool. RIscoper 2.0 outperformed the original tool in both tasks and can discover new RNA biomedical relations. In addition, we provide a user-friendly online search tool that enables rapid scanning of RNA biomedical relationships using local and online resources. Both the online tools and the data resources of RIscoper 2.0 are available at http://www.rnainter.org/riscoper.
Affiliation(s)
- Hailong Zheng
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Linfu Xu
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Hailong Xie
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Jiajing Xie
- National Institute for Data Science in Health and Medicine, Xiamen University, 361102 Xiamen, China
- Yapeng Ma
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Yongfei Hu
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Le Wu
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Jia Chen
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Meiyi Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Ying Yi
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Yan Huang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Dong Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Guangdong Province Key Laboratory of Molecular Tumor Pathology, 510515, Guangzhou, China
2
Raj S, Raj S, Namdeo V, Srivastava A. Decoding the gene-disease associations in type 2 diabetes: A curated dataset for text mining-based classification. Data Brief 2024; 54:110418. [PMID: 38708311 PMCID: PMC11068543 DOI: 10.1016/j.dib.2024.110418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 04/03/2024] [Accepted: 04/09/2024] [Indexed: 05/07/2024] Open
Abstract
Type 2 diabetes (T2D) has a substantial impact on mortality. According to 2023 statistics, more than half a billion people are affected by T2D, making it one of the top 10 contributors to deaths worldwide. Multiple factors contribute to the onset of T2D, including obesity, poor diet and lifestyle, and mutations in specific genes; among these, genetics is pivotal. Because genes strongly influence the initiation and progression of the various phases of T2D, we focus on the association between T2D and genes. In this article, we present curated standard disease-gene association data consisting of evidence (reference) sentences that carry disease-gene association information. Each sentence is assigned to one of 4 classes, Yes, No, Ambiguous, and X, corresponding to positive, negative, ambiguous, and not-related disease-gene associations, respectively. We downloaded T2D-related abstracts from PubMed using EDirect and pre-processed them to extract the reference sentences, which then underwent two-fold manual validation to compile the disease-gene association data. The resulting data serve as reference data for training text mining-based classifiers of biological literature. Such classifiers can then be used to predict the classes of published literature, not only for T2D but also for a wide range of other diseases and their complications. The positively linked genes compiled from these predictions can then be used for in-depth system-level analysis of T2D.
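The four-class labelling described in this abstract can be illustrated with a minimal Python sketch. Only the label names (Yes, No, Ambiguous, X) and their meanings come from the abstract; the record layout, the gene symbols, the example sentences, and the function name `positively_linked_genes` are hypothetical and are not taken from the published dataset.

```python
# Class labels and their meanings, as stated in the abstract.
LABEL_MEANING = {
    "Yes": "Positive",
    "No": "Negative",
    "Ambiguous": "Ambiguous",
    "X": "Not related",
}


def positively_linked_genes(records):
    """Return the sorted gene symbols that have at least one 'Yes' sentence.

    Each record is assumed (hypothetically) to be a (gene, sentence, label) tuple.
    """
    genes = set()
    for gene, _sentence, label in records:
        if label not in LABEL_MEANING:
            raise ValueError(f"unknown label: {label!r}")
        if label == "Yes":
            genes.add(gene)
    return sorted(genes)


# Hypothetical example records; not real entries from the dataset.
records = [
    ("GENE_A", "GENE_A variants were associated with T2D risk.", "Yes"),
    ("GENE_B", "No association was found between GENE_B and T2D.", "No"),
    ("GENE_C", "GENE_C may or may not contribute to T2D.", "Ambiguous"),
    ("GENE_A", "GENE_A was measured in liver tissue.", "X"),
]

print(positively_linked_genes(records))
```

In this sketch a gene counts as positively linked if any of its sentences is labelled Yes; how conflicting labels for the same gene should be resolved is a design choice the abstract does not specify.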
Affiliation(s)
- Sushrutha Raj
- Amity Institute of Integrative Sciences and Health, Amity University Haryana, Amity Education Valley, Gurgaon 122413, India
- Sushmitha Raj
- Sri Innovation and Research Foundation, Ghaziabad 201009, India
- Vindhya Namdeo
- Sri Innovation and Research Foundation, Ghaziabad 201009, India
- Alok Srivastava
- Sri Innovation and Research Foundation, Ghaziabad 201009, India
- L V Prasad Eye Institute, Hyderabad 500034, Telangana, India
3
Sternberg PW, Van Auken K, Wang Q, Wright A, Yook K, Zarowiecki M, Arnaboldi V, Becerra A, Brown S, Cain S, Chan J, Chen WJ, Cho J, Davis P, Diamantakis S, Dyer S, Grigoriadis D, Grove CA, Harris T, Howe K, Kishore R, Lee R, Longden I, Luypaert M, Müller HM, Nuin P, Quinton-Tulloch M, Raciti D, Schedl T, Schindelman G, Stein L. WormBase 2024: status and transitioning to Alliance infrastructure. Genetics 2024; 227:iyae050. [PMID: 38573366 PMCID: PMC11075546 DOI: 10.1093/genetics/iyae050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/05/2024] Open
Abstract
WormBase has been the major repository and knowledgebase of information about the genome and genetics of Caenorhabditis elegans and other nematodes of experimental interest for over 2 decades. We have 3 goals: to keep current with the fast-paced C. elegans research, to provide better integration with other resources, and to be sustainable. Here, we discuss the current state of WormBase as well as progress and plans for moving core WormBase infrastructure to the Alliance of Genome Resources (the Alliance). As an Alliance member, WormBase will continue to interact with the C. elegans community, develop new features as needed, and curate key information from the literature and large-scale projects.
Affiliation(s)
- Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Adam Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Magdalena Zarowiecki
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Stephanie Brown
- School of Infection and Immunity, University of Glasgow, Glasgow G12 8TA, UK
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Todd Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Kevin Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Raymond Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Ian Longden
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Hans-Michael Müller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Tim Schedl
- Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
4
Li H, Zhao J, Xing Y, Chen J, Wen Z, Ma R, Han F, Huang B, Wang H, Li C, Chen Y, Ning X. Identification of Age-Related Characteristic Genes Involved in Severe COVID-19 Infection Among Elderly Patients Using Machine Learning and Immune Cell Infiltration Analysis. Biochem Genet 2024:10.1007/s10528-024-10802-9. [PMID: 38656671 DOI: 10.1007/s10528-024-10802-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Accepted: 04/05/2024] [Indexed: 04/26/2024]
Abstract
Elderly patients infected with severe acute respiratory syndrome coronavirus 2 are at higher risk of severe clinical manifestations, extended hospitalization, and increased mortality. These patients are also more likely to experience persistent symptoms and a worsening of underlying diseases as part of long COVID-19 syndrome. However, the molecular mechanisms underlying severe COVID-19 in elderly patients remain unclear. Our study aims to investigate the function of the interaction between disease-characteristic genes and immune cell infiltration in patients with severe COVID-19 infection. COVID-19 datasets (GSE164805 and GSE180594) and an aging dataset (GSE69832) were obtained from the Gene Expression Omnibus database. The combined differentially expressed genes (DEGs) were subjected to Gene Ontology (GO) functional enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Disease Ontology functional enrichment analysis, Gene Set Enrichment Analysis, machine learning, and immune cell infiltration analysis. GO and KEGG enrichment analyses revealed that the eight DEGs (IL23A, PTGER4, PLCB1, IL1B, CXCR1, C1QB, MX2, ALOX12) were mainly involved in inflammatory mediator regulation of TRP channels, coronavirus disease-COVID-19, and cytokine activity signaling pathways. Three algorithms (LASSO, SVM-RFE, KNN) and correlation analysis showed that the five DEGs up-regulated the immune cells M0/M1 macrophages, memory B cells, gamma delta T cells, resting dendritic cells, and resting mast cells. Our study identified five hallmark genes that can serve as disease-characteristic genes and target the immune cells infiltrated in elderly patients with severe COVID-19, which may contribute to the study of pathogenesis and to the evaluation of diagnosis and prognosis in aging patients with severe COVID-19.
Affiliation(s)
- Huan Li
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Department of Nephrology, The Second People's Hospital of Shaan xi Province, Xi'an, China
- Jin Zhao
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Department of Nephrology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Yan Xing
- Department of Nephrology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Jia Chen
- Xi'an Medical University, Xi'an, China
- Rui Ma
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Fengxia Han
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Boyong Huang
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Hao Wang
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Cui Li
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Yang Chen
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China
- Xiaoxuan Ning
- Department of Geriatrics, Xijing Hospital, Fourth Military Medical University, No. 127 Chang le West Road, Xi'an, 710032, Shaanxi, China.
5
Soliman Y, Yakandawala U, Leong C, Garlock ES, Brinkman FSL, Winsor GL, Kozyrskyj AL, Mandhane PJ, Turvey SE, Moraes TJ, Subbarao P, Nickel NC, Thiessen K, Azad MB, Kelly LE. The use of prescription medications and non-prescription medications during lactation in a prospective Canadian cohort study. Int Breastfeed J 2024; 19:23. [PMID: 38589955 PMCID: PMC11000278 DOI: 10.1186/s13006-024-00628-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 03/17/2024] [Indexed: 04/10/2024] Open
Abstract
BACKGROUND A lack of safety data on postpartum medication use presents a potential barrier to breastfeeding and may result in infant exposure to medications in breastmilk. The type and extent of medication use by lactating women requires investigation. METHODS Data were collected from the CHILD Cohort Study, which enrolled pregnant women across Canada between 2008 and 2012. Participants completed questionnaires regarding medications and non-prescription medications used and breastfeeding status at 3, 6 and 12 months postpartum. Medications, along with self-reported reasons for medication use, were categorized by ontologies [hierarchical controlled vocabulary] as part of a large-scale curation effort to enable more robust investigations of reasons for medication use. RESULTS A total of 3542 mother-infant dyads were recruited to the CHILD study. Breastfeeding rates were 87.4%, 75.3% and 45.5% at 3, 6 and 12 months respectively. About 40% of women who were breastfeeding at 3 months used at least one prescription medication during the first three months postpartum; this proportion was 29.5% at 6 months and 32.8% at 12 months. The most commonly used prescription medication among breastfeeding women was domperidone at 3 months (9.0%, n = 229/2540) and 6 months (5.6%, n = 109/1948), and norethisterone at 12 months (4.1%, n = 48/1180). The vast majority of domperidone use by breastfeeding women (97.3%) was for lactation purposes, which is off-label (signifying unapproved use of an approved medication). Non-prescription medications were more often used among breastfeeding than non-breastfeeding women (67.6% versus 48.9% at 3 months, p < 0.0001). The most commonly used non-prescription medications were multivitamins and Vitamin D at 3, 6 and 12 months postpartum. CONCLUSIONS In Canada, medication use is common postpartum; 40% of breastfeeding women use prescription medications in the first 3 months postpartum. A diverse range of medications was used, with many women taking more than one prescription or non-prescription medicine. The most commonly used prescription medication among breastfeeding women was domperidone for off-label lactation support, signalling a need for more data on the efficacy of domperidone for this indication. These data should inform research priorities and communication strategies developed to optimize care during lactation.
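As a quick arithmetic check, the reported medication-use percentages can be recomputed from the counts given in the abstract. The sketch below is only illustrative: the helper name `percent` is hypothetical, and rounding to one decimal place is an assumption about how the published figures were derived.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts taken from the abstract (users / breastfeeding women at each time point).
domperidone_3_months = percent(229, 2540)
domperidone_6_months = percent(109, 1948)
norethisterone_12_months = percent(48, 1180)

print(domperidone_3_months, domperidone_6_months, norethisterone_12_months)
```

Under that rounding assumption the recomputed values agree with the 9.0%, 5.6% and 4.1% reported in the abstract.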
Affiliation(s)
- Youstina Soliman
- Department of Pediatrics and Child Health, University of Manitoba, Winnipeg, MB, Canada
- Uma Yakandawala
- George and Fay Yee Centre for Healthcare Innovation, Winnipeg, MB, Canada
- College of Pharmacy, Rady Faculty of Health Science, University of Manitoba, Winnipeg, MB, Canada
- Christine Leong
- College of Pharmacy, Rady Faculty of Health Science, University of Manitoba, Winnipeg, MB, Canada
- Emma S Garlock
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Fiona S L Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Geoffrey L Winsor
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Anita L Kozyrskyj
- Department of Pediatrics, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
- Piushkumar J Mandhane
- Department of Pediatrics, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
- Stuart E Turvey
- Department of Pediatrics, University of British Columbia, Vancouver, Canada
- Theo J Moraes
- Department of Pediatrics, Hospital for Sick Children, University of Toronto, Toronto, Canada
- Padmaja Subbarao
- Department of Pediatrics, Hospital for Sick Children, University of Toronto, Toronto, Canada
- Nathan C Nickel
- Department of Community Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Manitoba Interdisciplinary Lactation Centre (MILC), Winnipeg, MB, Canada
- Kellie Thiessen
- College of Nursing, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Meghan B Azad
- Department of Pediatrics and Child Health, University of Manitoba, Winnipeg, MB, Canada
- Manitoba Interdisciplinary Lactation Centre (MILC), Children's Hospital Research Institute of Manitoba, Winnipeg, MB, Canada
- Lauren E Kelly
- Department of Pediatrics and Child Health, University of Manitoba, Winnipeg, MB, Canada.
- George and Fay Yee Centre for Healthcare Innovation, Winnipeg, MB, Canada.
- Manitoba Interdisciplinary Lactation Centre (MILC), Children's Hospital Research Institute of Manitoba, Winnipeg, MB, Canada.
- , 417-753 McDermot Ave, R3E 0T6, Winnipeg, MB, Canada.
6
Xu T, Gao W, Zhu L, Chen W, Niu C, Yin W, Ma L, Zhu X, Ling Y, Gao S, Liu L, Jiao N, Chen W, Zhang G, Zhu R, Wu D. NAFLDkb: A Knowledge Base and Platform for Drug Development against Nonalcoholic Fatty Liver Disease. J Chem Inf Model 2024; 64:2817-2828. [PMID: 37167092 DOI: 10.1021/acs.jcim.3c00395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]
Abstract
Nonalcoholic fatty liver disease (NAFLD) is the most common chronic liver disease with a broad spectrum of histologic manifestations. The rapidly growing prevalence and the complex pathologic mechanisms of NAFLD pose great challenges for treatment development. Despite tremendous efforts devoted to drug development, there are no FDA-approved medicines yet. Here, we present NAFLDkb, a specialized knowledge base and platform for computer-aided drug design against NAFLD. With multiperspective information curated from diverse source materials and public databases, NAFLDkb presents the associations of drug-related entities as individual knowledge graphs. Practical drug discovery tools that facilitate the utilization and expansion of NAFLDkb have also been implemented in the web interface, including chemical structure search, drug-likeness screening, knowledge-based repositioning, and research article annotation. Moreover, case studies of a knowledge graph repositioning model and a generative neural network model are presented herein, where three repositioning drug candidates and 137 novel lead-like compounds were newly established as NAFLD pharmacotherapy options reusing data records and machine learning tools in NAFLDkb, suggesting its clinical reliability and great potential in identifying novel drug-disease associations of NAFLD and generating new insights to accelerate NAFLD drug development. NAFLDkb is freely accessible at https://www.biosino.org/nafldkb and will be updated periodically with the latest findings.
Affiliation(s)
- Tingjun Xu
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 LingLing Road, Shanghai 200032, P. R. China
- Wenxing Gao
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Lixin Zhu
- Guangdong Institute of Gastroenterology; Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases; Biomedical Innovation Center, Sun Yat-sen University, Guangzhou 510655, P. R. China
- Department of General Surgery, The Sixth Affiliated Hospital of Sun Yat-sen University, Guangzhou 510655, P. R. China
- Wanning Chen
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Chaoqun Niu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, P. R. China
- Wenjing Yin
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Liangxiao Ma
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, P. R. China
- Xinyue Zhu
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Yunchao Ling
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, P. R. China
- Sheng Gao
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Lei Liu
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Na Jiao
- National Clinical Research Center for Child Health, the Children's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, Zhejiang, P. R. China
- Weiming Chen
- Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 LingLing Road, Shanghai 200032, P. R. China
- Guoqing Zhang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, P. R. China
- Ruixin Zhu
- Putuo People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200060, P. R. China
- Dingfeng Wu
- National Clinical Research Center for Child Health, the Children's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, Zhejiang, P. R. China
7
Santangelo BE, Apgar M, Colorado ASB, Martin CG, Sterrett J, Wall E, Joachimiak MP, Hunter LE, Lozupone CA. Integrating biological knowledge for mechanistic inference in the host-associated microbiome. Front Microbiol 2024; 15:1351678. [PMID: 38638909 PMCID: PMC11024261 DOI: 10.3389/fmicb.2024.1351678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/26/2024] [Indexed: 04/20/2024] Open
Abstract
Advances in high-throughput technologies have enhanced our ability to describe microbial communities as they relate to human health and disease. Alongside the growth in sequencing data has come an influx of resources that synthesize knowledge surrounding microbial traits, functions, and metabolic potential with knowledge of how they may impact host pathways to influence disease phenotypes. These knowledge bases can enable the development of mechanistic explanations that may underlie correlations detected between microbial communities and disease. In this review, we survey existing resources and methodologies for the computational integration of broad classes of microbial and host knowledge. We evaluate these knowledge bases in their access methods, content, and source characteristics. We discuss challenges of the creation and utilization of knowledge bases including inconsistency of nomenclature assignment of taxa and metabolites across sources, whether the biological entities represented are rooted in ontologies or taxonomies, and how the structure and accessibility limit the diversity of applications and user types. We make this information available in a code and data repository at: https://github.com/lozuponelab/knowledge-source-mappings. Addressing these challenges will allow for the development of more effective tools for drawing from abundant knowledge to find new insights into microbial mechanisms in disease by fostering a systematic and unbiased exploration of existing information.
Affiliation(s)
- Brook E. Santangelo
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
- Madison Apgar
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
- Casey G. Martin
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
- John Sterrett
- Department of Integrative Physiology, University of Colorado, Boulder, CO, United States
- Elena Wall
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
- Marcin P. Joachimiak
- Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology Division, Biosystems Data Science Department, Berkeley, CA, United States
- Lawrence E. Hunter
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
- Catherine A. Lozupone
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
8
Caballero-Oteyza A, Crisponi L, Peng XP, Yauy K, Volpi S, Giardino S, Freeman AF, Grimbacher B, Proietti M. GenIA, the Genetic Immunology Advisor database for inborn errors of immunity. J Allergy Clin Immunol 2024; 153:831-843. [PMID: 38040041 DOI: 10.1016/j.jaci.2023.11.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 10/23/2023] [Accepted: 11/15/2023] [Indexed: 12/03/2023]
Abstract
BACKGROUND To date, no publicly accessible platform has captured and synthesized all of the layered dimensions of genotypic, phenotypic, and mechanistic information published in the field of inborn errors of immunity (IEIs). Such a platform would represent the extensive and complex landscape of IEIs and could increase the rate of diagnosis in patients with a suspected IEI, which remains unacceptably low. OBJECTIVE Our aim was to create an expertly curated, patient-centered, multidimensional IEI database that enables aggregation and sophisticated data interrogation and promotes involvement from diverse stakeholders across the community. METHODS The database structure was designed following a subject-centered model and written in Structured Query Language (SQL). The web application is written in Hypertext Preprocessor (PHP), Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript. All data stored in the Genetic Immunology Advisor (GenIA) are extracted by manually reviewing published research articles. RESULTS We completed data collection and curation for 24 pilot genes. Using these data, we have exemplified how GenIA can provide quick access to structured, longitudinal, more thorough, comprehensive, and up-to-date IEI knowledge than do currently existing databases, such as ClinGen, Human Phenotype Ontology (HPO), ClinVar, or Online Mendelian Inheritance in Man (OMIM), with which GenIA intends to dovetail. CONCLUSIONS GenIA strives to accurately capture the extensive genetic, mechanistic, and phenotypic heterogeneity found across IEIs, as well as genetic paradigms and diagnostic pitfalls associated with individual genes and conditions. The IEI community's involvement will help promote GenIA as an enduring resource that supports and improves knowledge sharing, research, diagnosis, and care for patients with genetic immune disease.
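To make the idea of a subject-centered SQL model more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not GenIA's actual schema: the table names, column names, the gene symbol GENE_X, and all inserted rows are hypothetical, and the interpretation of "subject-centered" (every variant and phenotype record hangs off a single subject row) is an assumption.

```python
import sqlite3

# In-memory database; every table and column name below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subject (
        subject_id     INTEGER PRIMARY KEY,
        source_article TEXT NOT NULL
    );
    CREATE TABLE variant (
        variant_id INTEGER PRIMARY KEY,
        subject_id INTEGER NOT NULL REFERENCES subject(subject_id),
        gene       TEXT NOT NULL
    );
    CREATE TABLE phenotype (
        phenotype_id       INTEGER PRIMARY KEY,
        subject_id         INTEGER NOT NULL REFERENCES subject(subject_id),
        term               TEXT NOT NULL,
        age_at_onset_years REAL
    );
    """
)

# Hypothetical example data for one curated subject.
conn.execute("INSERT INTO subject VALUES (1, 'hypothetical article A')")
conn.execute("INSERT INTO variant VALUES (1, 1, 'GENE_X')")
conn.executemany(
    "INSERT INTO phenotype VALUES (?, ?, ?, ?)",
    [(1, 1, "recurrent infections", 2.0), (2, 1, "autoimmunity", 9.5)],
)


def phenotypes_for_gene(connection, gene):
    """List phenotype terms of subjects carrying a variant in `gene`, ordered by onset age."""
    rows = connection.execute(
        """
        SELECT p.term
        FROM phenotype AS p
        JOIN variant AS v ON v.subject_id = p.subject_id
        WHERE v.gene = ?
        ORDER BY p.age_at_onset_years
        """,
        (gene,),
    ).fetchall()
    return [row[0] for row in rows]


print(phenotypes_for_gene(conn, "GENE_X"))
```

Keying both variants and phenotypes to the subject is one way such a model could support longitudinal, per-patient queries; the real database may organize these entities differently.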
Affiliation(s)
- Andrés Caballero-Oteyza
- Clinic for Immunology and Rheumatology, Hanover Medical School, Hanover, Germany; RESiST-Cluster of Excellence 2155, Hanover Medical School, Hanover, Germany; Institute for Immunodeficiency, Center for Chronic Immunodeficiency, University Hospital Freiburg, Freiburg, Germany.
- Laura Crisponi
- Institute for Genetic and Biomedical Research, The National Research Council, Monserrato, Cagliari, Italy
- Xiao P Peng
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Md
- Kevin Yauy
- University of Montpellier, LIRMM, CNRS, Reference Center for Congenital Anomalies, Clinical Genetic Unit, Montpellier University Hospital Center, Montpellier, France
- Stefano Volpi
- Center for Autoinflammatory Diseases and Immunodeficiencies, Pediatric Rheumatology Clinic, IRCCS Istituto Giannina Gaslini, Genova, and DINOGMI, Università degli Studi di Genova, Genova, Italy
- Stefano Giardino
- Hematopoietic Stem Cell Transplantation Unit, IRCCS Istituto Giannina Gaslini, Genova, Italy
- Alexandra F Freeman
- Laboratory of Clinical Immunology and Microbiology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Md
- Bodo Grimbacher
- Institute for Immunodeficiency, Center for Chronic Immunodeficiency, University Hospital Freiburg, Freiburg, Germany; Clinic of Rheumatology and Clinical Immunology, Center for Chronic Immunodeficiency, Medical Center, Faculty of Medicine, Albert-Ludwigs University of Freiburg, Freiburg, Germany; RESiST-Cluster of Excellence 2155, Hanover Medical School, Satellite Center Freiburg, Freiburg, Germany; German Center for Infection Research, Satellite Center Freiburg, Freiburg, Germany; Centre for Integrative Biological Signalling Studies, Albert-Ludwigs University of Freiburg, Freiburg, Germany
- Michele Proietti
- Clinic for Immunology and Rheumatology, Hanover Medical School, Hanover, Germany; RESiST-Cluster of Excellence 2155, Hanover Medical School, Hanover, Germany; Institute for Immunodeficiency, Center for Chronic Immunodeficiency, University Hospital Freiburg, Freiburg, Germany.
9
Verma G, Rebholz-Schuhmann D, Madden MG. Enabling personalised disease diagnosis by combining a patient's time-specific gene expression profile with a biomedical knowledge base. BMC Bioinformatics 2024; 25:62. [PMID: 38326757 PMCID: PMC10848462 DOI: 10.1186/s12859-024-05674-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 01/25/2024] [Indexed: 02/09/2024] Open
Abstract
BACKGROUND: Recent developments in the domain of biomedical knowledge bases (KBs) open up new ways to exploit the biomedical knowledge they contain. Significant work has been done on biomedical KB creation and KB completion, specifically for KBs holding gene-disease associations and other related entities. However, the use of such biomedical KBs in combination with patients' temporal clinical data remains largely unexplored, although it has the potential to greatly benefit medical diagnostic decision support systems. RESULTS: We propose two new algorithms, LOADDx and SCADDx, to combine a patient's gene expression data with gene-disease association and other related information available in the form of a KB, to assist personalized disease diagnosis. We have tested both of the algorithms on two KBs and on four real-world gene expression datasets of respiratory viral infection caused by Influenza-like viruses of 19 subtypes. We also compare the performance of the proposed algorithms with that of five existing state-of-the-art machine learning algorithms (k-NN, Random Forest, XGBoost, Linear SVM, and SVM with RBF Kernel) using two validation approaches: LOOCV and a single internal validation set. Both SCADDx and LOADDx outperform the existing algorithms when evaluated with both validation approaches. SCADDx is able to detect infections with up to 100% accuracy in the cases of Datasets 2 and 3. Overall, SCADDx and LOADDx are able to detect an infection within 72 h of infection with 91.38% and 92.66% average accuracy, respectively, across all four datasets, whereas XGBoost, which performed best among the existing machine learning algorithms, can detect the infection with only 86.43% accuracy on average.
CONCLUSIONS: We demonstrate how our novel idea of using the most and least differentially expressed genes in combination with a KB can enable identification of the diseases that a patient is most likely to have at a particular time, from a KB with thousands of diseases. Moreover, the proposed algorithms can provide a short ranked list of the most likely diseases for each patient along with their most affected genes, and other entities linked with them in the KB, which can support health care professionals in their decision-making.
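The ranking idea in the conclusions can be illustrated with a minimal sketch. It is not LOADDx or SCADDx, whose details the abstract does not give, and it uses only the most changed genes: it selects the genes with the largest absolute log2 fold change between a baseline and a later sample, then ranks diseases by the fraction of their KB-associated genes that fall in that set. The function names, the overlap score, and the toy gene and disease identifiers are all assumptions made for illustration.

```python
from math import log2


def top_changed_genes(baseline, current, k):
    """Return the k genes with the largest absolute log2 fold change.

    `baseline` and `current` map gene -> positive expression value.
    Ties are broken alphabetically so the result is deterministic.
    """
    changes = {
        gene: abs(log2(current[gene] / baseline[gene]))
        for gene in baseline
        if gene in current and baseline[gene] > 0 and current[gene] > 0
    }
    return sorted(changes, key=lambda g: (-changes[g], g))[:k]


def rank_diseases(kb, selected_genes):
    """Rank diseases by the fraction of their KB genes among the selected genes.

    `kb` maps disease -> set of associated genes (a stand-in for the
    gene-disease associations of a real knowledge base).
    """
    selected = set(selected_genes)
    scores = {
        disease: len(selected & genes) / len(genes)
        for disease, genes in kb.items()
        if genes
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data, invented for this example.
baseline = {"G1": 10.0, "G2": 10.0, "G3": 10.0, "G4": 10.0}
current = {"G1": 80.0, "G2": 10.0, "G3": 2.5, "G4": 11.0}
kb = {
    "disease_a": {"G1", "G3"},
    "disease_b": {"G2", "G4"},
    "disease_c": {"G1", "G2"},
}

selected = top_changed_genes(baseline, current, k=2)
ranking = rank_diseases(kb, selected)
```

In this toy case `G1` and `G3` change the most, so `disease_a`, whose KB genes are exactly those two, ranks first.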
Affiliation(s)
- Ghanshyam Verma
- Insight Centre for Data Analytics, School of Computer Science, University of Galway, Galway, Ireland.
- School of Computer Science, University of Galway, Galway, Ireland.
- Michael G Madden
- Insight Centre for Data Analytics, School of Computer Science, University of Galway, Galway, Ireland
- School of Computer Science, University of Galway, Galway, Ireland
10
Colvin A, Youssef S, Noh H, Wright J, Jumonville G, LaRow Brown K, Tatonetti NP, Milner JD, Weng C, Bordone LA, Petukhova L. Inborn Errors of Immunity Contribute to the Burden of Skin Disease and Create Opportunities for Improving the Practice of Dermatology. J Invest Dermatol 2024; 144:307-315.e1. [PMID: 37716649 DOI: 10.1016/j.jid.2023.08.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 07/31/2023] [Accepted: 08/01/2023] [Indexed: 09/18/2023]
Abstract
Opportunities to improve the clinical management of skin disease are being created by advances in genomic medicine. Large-scale sequencing increasingly challenges notions about single-gene disorders. It is now apparent that monogenic etiologies make appreciable contributions to the population burden of disease and that they are underrecognized in clinical practice. A genetic diagnosis informs on molecular pathology and may direct targeted treatments and tailored prevention strategies for patients and family members. It also generates knowledge about disease pathogenesis and management that is relevant to patients without rare pathogenic variants. Inborn errors of immunity are a large class of monogenic etiologies that have been well-studied and contribute to the population burden of inflammatory diseases. To further delineate the contributions of inborn errors of immunity to the pathogenesis of skin disease, we performed a set of analyses that identified 316 inborn errors of immunity associated with skin pathologies, including common skin diseases. These data suggest that clinical sequencing is underutilized in dermatology. We next use these data to derive a network that illuminates the molecular relationships of these disorders and suggests an underlying etiological organization to immune-mediated skin disease. Our results motivate the further development of a molecularly derived and data-driven reorganization of clinical diagnoses of skin disease.
Affiliation(s)
- Annelise Colvin
- Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Soundos Youssef
- Department of Pediatrics and Adolescent Medicine, American University of Beirut Medical Center, Beirut, Lebanon
- Heeju Noh
- Department of Systems Biology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Julia Wright
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA
- Ghislaine Jumonville
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA
- Kathleen LaRow Brown
- Department of Biomedical Informatics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Nicholas P Tatonetti
- Department of Biomedical Informatics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA; Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, California, USA; Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Joshua D Milner
- Department of Pediatrics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Chunhua Weng
- Department of Biomedical Informatics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Lindsey A Bordone
- Department of Dermatology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Lynn Petukhova
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA; Department of Dermatology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA.
11
Afonin AM, Piironen AK, de Sousa Maciel I, Ivanova M, Alatalo A, Whipp AM, Pulkkinen L, Rose RJ, van Kamp I, Kaprio J, Kanninen KM. Proteomic insights into mental health status: plasma markers in young adults. Transl Psychiatry 2024; 14:55. [PMID: 38267423 PMCID: PMC10808121 DOI: 10.1038/s41398-024-02751-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 01/05/2024] [Accepted: 01/08/2024] [Indexed: 01/26/2024] Open
Abstract
Global emphasis on enhancing prevention and treatment strategies necessitates an increased understanding of the biological mechanisms of psychopathology. Plasma proteomics is a powerful tool that has been applied in the context of specific mental disorders for biomarker identification. The p-factor, also known as the "general psychopathology factor", is a concept in psychopathology suggesting that there is a common underlying factor that contributes to the development of various forms of mental disorders. It has been proposed that the p-factor can be used to understand the overall mental health status of an individual. Here, we aimed to discover plasma proteins associated with the p-factor in 775 young adults in the FinnTwin12 cohort. Using liquid chromatography-tandem mass spectrometry, 13 proteins with a significant connection with the p-factor were identified, 8 of which were linked to epidermal growth factor receptor (EGFR) signaling. This exploratory study provides new insight into biological alterations associated with mental health status in young adults.
Affiliation(s)
- Alexey M Afonin
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Aino-Kaisa Piironen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Izaque de Sousa Maciel
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Mariia Ivanova
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Arto Alatalo
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Alyce M Whipp
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Lea Pulkkinen
- Department of Psychology, University of Jyvaskyla, Jyvaskyla, Finland
- Richard J Rose
- Department of Psychological & Brain Sciences, Indiana University, Bloomington, IN, USA
- Irene van Kamp
- Centre for Sustainability, Environment and Health, National Institute for Public Health and the Environment, Bilthoven, the Netherlands
- Jaakko Kaprio
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Department of Public Health, University of Helsinki, Helsinki, Finland
- Katja M Kanninen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland.
12
Alqaissi E, Alotaibi F, Sher Ramzan M, Algarni A. Novel graph-based machine-learning technique for viral infectious diseases: application to influenza and hepatitis diseases. Ann Med 2024; 55:2304108. [PMID: 38242107 PMCID: PMC10802812 DOI: 10.1080/07853890.2024.2304108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 12/18/2023] [Indexed: 01/21/2024] Open
Abstract
BACKGROUND: Most infectious diseases are caused by viruses, fungi, bacteria and parasites. Their ability to easily infect humans and trigger large-scale epidemics makes them a public health concern. Methods for early detection of these diseases have been developed; however, they are hindered by the absence of a unified, interoperable and reusable model. This study seeks to create a holistic and real-time model for swift, preliminary detection of infectious diseases using symptoms and additional clinical data. MATERIALS AND METHODS: In this study, we present a medical knowledge graph (MKG) that leverages multiple data sources to analyse connections between different nodes. Medical ontologies were used to enhance the MKG. We applied various graph algorithms to extract key features. The performance of multiple machine-learning (ML) techniques for influenza and hepatitis detection was assessed, and multi-layer perceptron (MLP) and random forest (RF) models were selected for their superior outcomes. The hyperparameters of both graph-based ML models were automatically fine-tuned. RESULTS: Both the graph-based MLP and RF models showed the lowest loss and error rates, along with the highest specificity, accuracy, recall, precision and F1 scores. Their Matthews correlation coefficients were also the best among the models compared. When compared with existing ML techniques and findings from the literature, these graph-based ML models showed superior detection accuracy. CONCLUSIONS: The graph-based MLP and RF models effectively diagnosed influenza and hepatitis, respectively. This underlines the potential of graph data science in enhancing ML model performance and uncovering concealed relationships in the MKG.
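Of the metrics listed in the results, the Matthews correlation coefficient (MCC) is the least self-explanatory. For a binary classifier it is defined from the confusion-matrix counts as MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), ranging from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction). The sketch below computes it directly; the function name `mcc` and the choice to return 0.0 when the denominator is zero (a common convention) are assumptions, and the counts are made up rather than taken from the study.

```python
from math import sqrt


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, which makes the
    denominator vanish (a common convention, assumed here).
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


perfect = mcc(tp=5, tn=5, fp=0, fn=0)      # every prediction correct
inverted = mcc(tp=0, tn=0, fp=5, fn=5)     # every prediction wrong
chance = mcc(tp=1, tn=1, fp=1, fn=1)       # no association
degenerate = mcc(tp=10, tn=0, fp=0, fn=0)  # only one class present
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.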
Affiliation(s)
- Eman Alqaissi
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Computer Science and Information Systems, The Applied College, King Khalid University, Abha, Saudi Arabia
- Fahd Alotaibi
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Muhammad Sher Ramzan
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
13
Baron JA, Johnson CSB, Schor MA, Olley D, Nickel L, Felix V, Munro J, Bello S, Bearer C, Lichenstein R, Bisordi K, Koka R, Greene C, Schriml L. The DO-KB Knowledgebase: a 20-year journey developing the disease open science ecosystem. Nucleic Acids Res 2024; 52:D1305-D1314. [PMID: 37953304 PMCID: PMC10767934 DOI: 10.1093/nar/gkad1051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/14/2023] Open
Abstract
In 2003, the Human Disease Ontology (DO, https://disease-ontology.org/) was established at Northwestern University. In the intervening 20 years, the DO has expanded to become a highly utilized disease knowledge resource. Serving as the nomenclature and classification standard for human diseases, the DO provides a stable, etiology-based structure integrating mechanistic drivers of human disease. Over the past two decades the DO has grown from a collection of clinical vocabularies into an expertly curated semantic resource of over 11300 common and rare diseases, linking disease concepts through more than 37000 vocabulary cross mappings (v2023-08-08). Here, we introduce the recently launched DO Knowledgebase (DO-KB), which expands the DO's representation of the diseaseome and enhances the findability, accessibility, interoperability and reusability (FAIR) of disease data through a new SPARQL service and a new Faceted Search Interface. The DO-KB is an integrated data system, built upon the DO's semantic disease knowledge backbone, with resources that expose and connect the DO's semantic knowledge with disease-related data across Open Linked Data resources. This update includes descriptions of efforts to assess the DO's global impact and improvements to data quality and content, with emphasis on changes in the last two years.
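To give a sense of what a query against a SPARQL service over the DO could look like, the sketch below builds a query that looks up the vocabulary cross mappings of a disease by its label. It assumes that disease names are exposed as `rdfs:label` and cross mappings as `oboInOwl:hasDbXref`, as in the DO's OWL release; the helper name `build_xref_query` is hypothetical, and no endpoint URL is shown because the abstract does not give one.

```python
def build_xref_query(disease_label: str) -> str:
    """Return a SPARQL query listing cross-references for a disease label.

    Assumes labels are stored as rdfs:label and cross mappings as
    oboInOwl:hasDbXref; both are assumptions about the service's data model.
    """
    # Escape backslashes and double quotes so the label is a valid SPARQL string.
    escaped = disease_label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>

SELECT ?disease ?xref
WHERE {{
  ?disease rdfs:label ?label ;
           oboInOwl:hasDbXref ?xref .
  FILTER (LCASE(STR(?label)) = "{escaped.lower()}")
}}
"""


query = build_xref_query("Influenza")
```

The resulting string can be sent to any SPARQL 1.1 endpoint; matching on the lower-cased label keeps the lookup case-insensitive.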
Affiliation(s)
- J Allen Baron
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Michael A Schor
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Dustin Olley
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Lance Nickel
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Victor Felix
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- James B Munro
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Animal and Plant Health Inspection Service, Plant Protection and Quarantine, USDA, USA
- Susan M Bello
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME, USA
- Rima Koka
- University of Maryland School of Medicine, Baltimore, MD, USA
- Carol Greene
- University of Maryland School of Medicine, Baltimore, MD, USA
- Lynn M Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
14
Sun ZY, Yang CL, Huang LJ, Mo ZC, Zhang KN, Fan WH, Wang KY, Wu F, Wang JG, Meng FL, Zhao Z, Jiang T. circRNADisease v2.0: an updated resource for high-quality experimentally supported circRNA-disease associations. Nucleic Acids Res 2024; 52:D1193-D1200. [PMID: 37897359 PMCID: PMC10767896 DOI: 10.1093/nar/gkad949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 10/09/2023] [Accepted: 10/12/2023] [Indexed: 10/30/2023] Open
Abstract
circRNADisease v2.0 is an enhanced and reliable database that offers experimentally verified relationships between circular RNAs (circRNAs) and various diseases. It is accessible at http://cgga.org.cn/circRNADisease/ or http://cgga.org.cn:9091/circRNADisease/. The database currently includes 6998 circRNA-disease entries across multiple species, a 19.77-fold increase compared to the previous version. This expansion comprises a substantial rise in the number of circRNAs (from 330 to 4246), types of diseases (from 48 to 330) and covered species (from human only to 12 species). Furthermore, a new section has been introduced in the database, which collects information on circRNA-associated factors (genes, proteins and microRNAs), molecular mechanisms (molecular pathways), biological functions (proliferation, migration, invasion, etc.), tumor and/or cell line and/or patient-derived xenograft (PDX) details, and prognostic evidence in diseases. In addition, we identified 7 159 865 relationships between mutations and circRNAs among 30 TCGA cancer types. Due to notable enhancements and extensive data expansions, the circRNADisease v2.0 database has become an invaluable asset for both clinical practice and fundamental research. It enables researchers to develop a more comprehensive understanding of how circRNAs impact complex diseases.
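The growth figures in this abstract can be put on a common scale with a few lines of arithmetic. The sketch below computes the fold increase for circRNAs and disease types from the stated counts and back-calculates the entry count that a 19.77-fold increase to 6998 entries implies for the previous version; that last figure is an inference from the abstract's numbers, not a value the abstract states.

```python
def fold_increase(old: float, new: float) -> float:
    """Ratio of the new count to the old count."""
    return new / old


# Counts quoted in the abstract.
circrna_fold = fold_increase(330, 4246)   # circRNAs: 330 -> 4246
disease_fold = fold_increase(48, 330)     # disease types: 48 -> 330

# Back-calculation (an inference, not stated in the abstract):
# 6998 entries at a 19.77-fold increase implies roughly this many before.
implied_previous_entries = round(6998 / 19.77)

print(f"circRNAs: {circrna_fold:.2f}-fold, diseases: {disease_fold:.2f}-fold")
print(f"implied previous entries: about {implied_previous_entries}")
```

On these figures the entry count grew faster than either the number of circRNAs or the number of disease types.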
Affiliation(s)
- Zhi-Yan Sun
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Chang-Lin Yang
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Li-Jie Huang
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Zong-Chao Mo
- SIAT-HKUST Joint Laboratory of Cell Evolution and Digital Health, HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Shenzhen 518045, China
- Division of Life Science, Department of Chemical and Biological Engineering, and State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Ke-Nan Zhang
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Wen-Hua Fan
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Kuan-Yu Wang
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Fan Wu
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Ji-Guang Wang
- SIAT-HKUST Joint Laboratory of Cell Evolution and Digital Health, HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Shenzhen 518045, China
- Division of Life Science, Department of Chemical and Biological Engineering, and State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Fan-Lin Meng
- Marketing and Management Department, CapitalBio Technology, Beijing 101111, China
- National Engineering Research Center for Beijing Biochip Technology, Beijing 102206, China
- Zheng Zhao
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- SIAT-HKUST Joint Laboratory of Cell Evolution and Digital Health, HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Shenzhen 518045, China
- Division of Life Science, Department of Chemical and Biological Engineering, and State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Tao Jiang
- Beijing Neurosurgical Institute, Capital Medical University, Beijing 100070, China
- Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
15
Liu Z, Chen R, Yang L, Jiang J, Ma S, Chen L, He M, Mao Y, Guo C, Kong X, Zhang X, Qi Y, Liu F, He F, Li D. CDS-DB, an omnibus for patient-derived gene expression signatures induced by cancer treatment. Nucleic Acids Res 2024; 52:D1163-D1179. [PMID: 37889038 PMCID: PMC10767794 DOI: 10.1093/nar/gkad888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/25/2023] [Accepted: 10/05/2023] [Indexed: 10/28/2023] Open
Abstract
Patient-derived gene expression signatures induced by cancer treatment, obtained from paired pre- and post-treatment clinical transcriptomes, can help reveal drug mechanisms of action (MOAs) in cancer patients and understand the molecular response mechanism of tumor sensitivity or resistance. Their integration and reuse may bring new insights. Paired pre- and post-treatment clinical transcriptomic data are rapidly accumulating. However, a lack of systematic collection makes data access, integration, and reuse challenging. We therefore present the Cancer Drug-induced gene expression Signature DataBase (CDS-DB). CDS-DB has collected 78 patient-derived, paired pre- and post-treatment transcriptomic source datasets with uniformly reprocessed expression profiles and manually curated metadata such as drug administration dosage, sampling time and location, and intrinsic drug response status. From these source datasets, 2012 patient-level gene perturbation signatures were obtained, covering 85 therapeutic regimens, 39 cancer subtypes and 3628 patient samples. Besides data browsing, download and search, CDS-DB also supports single signature analysis (including differential gene expression, functional enrichment, tumor microenvironment and correlation analyses), signature comparative analysis and signature connectivity analysis. This provides insights into drug MOA and its heterogeneity in patients, drug resistance mechanisms, drug repositioning and drug (combination) discovery, etc. CDS-DB is available at http://cdsdb.ncpsb.org.cn/.
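A patient-level perturbation signature of the kind described here can be sketched as the per-gene log2 fold change between one patient's post-treatment and pre-treatment expression profiles. This is a generic illustration, not CDS-DB's reprocessing pipeline: the function name, the pseudocount of 1, and the toy expression values are assumptions.

```python
from math import log2


def perturbation_signature(pre, post, pseudocount=1.0):
    """Per-gene log2 fold change of post- versus pre-treatment expression.

    `pre` and `post` map gene -> non-negative expression for one patient.
    Only genes measured in both samples are kept; the pseudocount avoids
    division by zero for unexpressed genes.
    """
    return {
        gene: log2((post[gene] + pseudocount) / (pre[gene] + pseudocount))
        for gene in pre
        if gene in post
    }


# Toy paired profiles for a single hypothetical patient.
pre_treatment = {"GENE_A": 3.0, "GENE_B": 7.0, "GENE_C": 1.0}
post_treatment = {"GENE_A": 15.0, "GENE_B": 1.0, "GENE_D": 4.0}

signature = perturbation_signature(pre_treatment, post_treatment)
```

Positive values mark genes induced after treatment and negative values mark repressed genes; because each patient serves as their own control, differences between patients in baseline expression cancel out.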
Affiliation(s)
- Zhongyang Liu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
- College of Life Sciences, Hebei University, Baoding 071002, China
- Ruzhen Chen
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Lele Yang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
- Jianzhou Jiang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- College of Life Sciences, Hebei University, Baoding 071002, China
- Shurui Ma
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- School of Basic Medicine, Anhui Medical University, Hefei 230032, China
- Lanhui Chen
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Mengqi He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Yichao Mao
- College of Life Sciences, Hebei University, Baoding 071002, China
- Congcong Guo
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Xiangya Kong
- Beijing Cloudna Technology Company, Limited, Beijing 100029, China
- Xinlei Zhang
- Beijing Cloudna Technology Company, Limited, Beijing 100029, China
- Yaning Qi
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
- Fengsong Liu
- College of Life Sciences, Hebei University, Baoding 071002, China
- Fuchu He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Dong Li
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
16
Cui C, Zhong B, Fan R, Cui Q. HMDD v4.0: a database for experimentally supported human microRNA-disease associations. Nucleic Acids Res 2024; 52:D1327-D1332. [PMID: 37650649 PMCID: PMC10767894 DOI: 10.1093/nar/gkad717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 08/07/2023] [Accepted: 08/19/2023] [Indexed: 09/01/2023] Open
Abstract
MicroRNAs (miRNAs) are a class of important small non-coding RNAs with critical molecular functions in almost all biological processes, and thus, they play important roles in disease diagnosis and therapy. The Human MicroRNA Disease Database (HMDD) is an important and comprehensive resource for biomedical researchers in miRNA-related medicine. Here, we introduce HMDD v4.0, which curates 53530 miRNA-disease association entries from the literature. In comparison to HMDD v3.0 released five years ago, HMDD v4.0 contains 1.5 times more entries. In addition, some new categories have been curated, including exosomal miRNAs implicated in diseases, virus-encoded miRNAs involved in human diseases, and entries containing miRNA-circRNA interactions. We also curated sex-biased miRNAs in diseases. Furthermore, in a case study, disease similarity analysis successfully revealed that sex-biased miRNAs related to developmental anomalies are associated with a number of human diseases with sex bias. HMDD can be freely visited at http://www.cuilab.cn/hmdd.
Affiliation(s)
- Chunmei Cui
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Bitao Zhong
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Rui Fan
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Qinghua Cui
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- School of Sports Medicine, Wuhan Institute of Physical Education, No. 461 Luoyu Rd. Wuchang District, Wuhan 430079, Hubei Province, China
17
Lin X, Lu Y, Zhang C, Cui Q, Tang YD, Ji X, Cui C. LncRNADisease v3.0: an updated database of long non-coding RNA-associated diseases. Nucleic Acids Res 2024; 52:D1365-D1369. [PMID: 37819033 PMCID: PMC10767967 DOI: 10.1093/nar/gkad828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 09/04/2023] [Accepted: 09/21/2023] [Indexed: 10/13/2023] Open
Abstract
Systematic integration of lncRNA-disease associations is of great importance for further understanding their underlying molecular mechanisms and exploring lncRNA-based biomarkers and therapeutics. The database of long non-coding RNA-associated diseases (LncRNADisease) is designed for this purpose. Here, an updated version (LncRNADisease v3.0) has curated comprehensive lncRNA (including circRNA) and disease associations from the burgeoning literature. LncRNADisease v3.0 exhibits an over 2-fold increase in experimentally supported associations, with a total of 25440 entries, compared to the last version. In addition, each lncRNA-disease pair is assigned a confidence score based on experimental evidence. Moreover, all associations between lncRNAs/circRNAs and diseases are classified into general associations and causal associations, representing whether lncRNAs or circRNAs can directly lead to the development or progression of corresponding diseases, with 15721 and 9719 entries, respectively. In a case study, we used the data of LncRNADisease v3.0 to calculate the phenotypic similarity between human and mouse lncRNAs. This database will continue to serve as a valuable resource for potential clinical applications related to lncRNAs and circRNAs. LncRNADisease v3.0 is freely available at http://www.rnanut.net/lncrnadisease.
Affiliation(s)
- Xiao Lin
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Yingyu Lu
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Chenhao Zhang
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Qinghua Cui
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- School of Sports Medicine, Wuhan Institute of Physical Education, No. 461 Luoyu Rd. Wuchang District, Wuhan 430079, Hubei Province, China
| | - Yi-Da Tang
- Department of Cardiology and Institute of Vascular Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University Third Hospital, 49 Huayuanbei Road, Beijing 100191, China
| | - Xiangwen Ji
- Department of Cardiology and Institute of Vascular Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University Third Hospital, 49 Huayuanbei Road, Beijing 100191, China
| | - Chunmei Cui
- Department of Biomedical Informatics, Center for Noncoding RNA Medicine, State Key Laboratory of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
|
18
|
Zhai Z, Lin Z, Meng X, Zheng X, Du Y, Li Z, Zhang X, Liu C, Zhou L, Zhang X, Tian Z, Ma Q, Li J, Li Q, Pan J. DiSignAtlas: an atlas of human and mouse disease signatures based on bulk and single-cell transcriptomics. Nucleic Acids Res 2024; 52:D1236-D1245. [PMID: 37930831 PMCID: PMC10767933 DOI: 10.1093/nar/gkad961] [Received: 08/03/2023] [Revised: 10/09/2023] [Accepted: 10/13/2023] [Indexed: 11/08/2023]
Abstract
Molecular signatures are usually sets of biomolecules that can serve as diagnostic, prognostic, predictive, or therapeutic markers for a specific disease. Omics data derived from various high-throughput molecular biology technologies offer global, unbiased and appropriately comparable data, which can be used to identify such molecular signatures. To address the need for comprehensive disease signatures, DiSignAtlas (http://www.inbirg.com/disignatlas/) was developed to provide transcriptomics-based signatures for a wide range of diseases. A total of 181 434 transcriptome profiles were manually curated from studies involving 1836 nonredundant disease types in humans and mice. Then, 10 306 comparison datasets comprising both disease and control samples, including 328 single-cell RNA sequencing datasets, were established. Furthermore, a total of 3 775 317 differentially expressed genes in humans and 1 723 674 in mice were identified as disease signatures by analysing transcriptome profiles using commonly used pipelines. In addition to providing multiple methods for the retrieval of disease signatures, DiSignAtlas provides downstream functional enrichment analysis, cell type analysis and signature correlation analysis between diseases or species when available. Moreover, multiple analytical and comparison tools for disease signatures are available. DiSignAtlas is expected to become a valuable resource for both bioscientists and bioinformaticians engaged in translational research.
Affiliation(s)
- Zhaoyu Zhai
- Precision Medicine Center, The Second Affiliated Hospital of Chongqing Medical University, Chongqing 400010, China
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Zhewei Lin
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Xuehang Meng
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Xiao Zheng
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Yujia Du
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Zhi Li
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Xuelu Zhang
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Chang Liu
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Lu Zhou
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Xu Zhang
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Zhihao Tian
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Qinfeng Ma
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Jinhao Li
- Hepatobiliary Surgery, The Second Affiliated Hospital of Chongqing Medical University, Chongqing 400010, China
| | - Qiang Li
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
| | - Jianbo Pan
- Precision Medicine Center, The Second Affiliated Hospital of Chongqing Medical University, Chongqing 400010, China
- Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
|
19
|
Xiong Z, Chen P, Yuan M, Yao L, Wang Z, Liu P, Jiang Y. Integrated Bioinformatics and Validation Reveal IFI27 and Its Related Molecules as Potential Identifying Genes in Liver Cirrhosis. Biomolecules 2023; 14:13. [PMID: 38275754 PMCID: PMC10813755 DOI: 10.3390/biom14010013] [Received: 10/29/2023] [Revised: 11/27/2023] [Accepted: 12/08/2023] [Indexed: 01/27/2024]
Abstract
Liver cirrhosis remains a significant global public health concern, with liver transplantation standing as the most effective treatment currently available. Therefore, investigating the pathogenesis of liver cirrhosis and developing novel therapies is imperative. Mitochondrial dysfunction stands out as a pivotal factor in its development. This study aimed to elucidate the relationship between mitochondrial dysfunction and liver cirrhosis using bioinformatic methods to unveil its pathogenesis. Initially, we identified 460 co-expressed differential genes (co-DEGs) from the GSE14323 and GSE25097 datasets, alongside their combined datasets. Functional analysis revealed that these co-DEGs were associated with inflammatory cytokines and cirrhosis-related signaling pathways. Utilizing weighted gene co-expression network analysis (WGCNA), we screened module genes, intersecting them with co-DEGs and oxidative stress-related mitochondrial genes. Two algorithms (least absolute shrinkage and selection operator (LASSO) regression and SVM-RFE) were then employed to further analyze the intersecting genes. Finally, COX7A1 and IFI27 emerged as identifying genes for liver cirrhosis, validated through a receiver operating characteristic (ROC) curve analysis and related experiments. Additionally, immune infiltration analysis highlighted a strong correlation between macrophages and cirrhosis, with the identifying genes (COX7A1 and IFI27) being significantly associated with macrophages. In conclusion, our findings underscore the critical role of oxidative stress-related mitochondrial genes (COX7A1 and IFI27) in liver cirrhosis development, highlighting their association with macrophage infiltration. This study provides novel insights into understanding the pathogenesis of liver cirrhosis.
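The gene-narrowing steps in this abstract amount to a chain of set intersections, which can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, `GENE_A` to `GENE_D` are placeholder identifiers rather than genes from the study, the two algorithm outputs are invented for the example, and taking the overlap of the two feature-selection results is an assumed consensus rule that the abstract does not spell out.

```python
def candidate_genes(module_genes, co_degs, mito_genes):
    """Genes present in all three input lists (hypothetical helper)."""
    return set(module_genes) & set(co_degs) & set(mito_genes)


def consensus(*selections):
    """Genes retained by every feature-selection run (assumed rule)."""
    sets = [set(s) for s in selections]
    return set.intersection(*sets) if sets else set()


# Placeholder inputs; only COX7A1 and IFI27 are names taken from the abstract.
module = {"COX7A1", "IFI27", "GENE_A", "GENE_B"}
co_degs = {"COX7A1", "IFI27", "GENE_A", "GENE_C"}
mito = {"COX7A1", "IFI27", "GENE_A", "GENE_D"}

candidates = candidate_genes(module, co_degs, mito)

# Invented outputs standing in for the two feature-selection algorithms.
picked_by_first = {"COX7A1", "IFI27"}
picked_by_second = {"COX7A1", "IFI27", "GENE_A"}

final_genes = consensus(picked_by_first & candidates, picked_by_second & candidates)
print(sorted(final_genes))
```

In practice the two selections would come from fitted models rather than hard-coded sets; the sketch only shows how the candidate list shrinks at each step.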
Affiliation(s)
- Yingan Jiang
- Department of Infectious Diseases, Renmin Hospital of Wuhan University, Wuhan 430060, China; (Z.X.); (P.C.); (M.Y.); (L.Y.); (Z.W.); (P.L.)
|
20
|
Zulian V, Fiscon G, Paci P, Garbuglia AR. Hepatitis B Virus and microRNAs: A Bioinformatics Approach. Int J Mol Sci 2023; 24:17224. [PMID: 38139051 PMCID: PMC10743825 DOI: 10.3390/ijms242417224] [Received: 10/12/2023] [Revised: 11/20/2023] [Accepted: 12/05/2023] [Indexed: 12/24/2023]
Abstract
In recent decades, microRNAs (miRNAs) have emerged as key regulators of gene expression, and the identification of viral miRNAs (v-miRNAs) within some viruses, including hepatitis B virus (HBV), has attracted significant attention. HBV infections often progress to chronic states (CHB) and may induce fibrosis/cirrhosis and hepatocellular carcinoma (HCC). The presence of HBV can dysregulate host miRNA expression, influencing several biological pathways, such as apoptosis, innate and immune response, viral replication, and pathogenesis. Consequently, miRNAs are considered promising biomarkers for diagnosis, prognosis, and treatment response. The dynamics of miRNAs during HBV infection are multifaceted, influenced by host variability and miRNA interactions. Given the ability of miRNAs to target multiple messenger RNAs (mRNAs), understanding the viral-host (human) interplay is complex but essential to develop novel clinical applications. Therefore, bioinformatics can help to analyze, identify, and interpret a vast amount of miRNA data. This review explores the bioinformatics tools available for viral and host miRNA research. Moreover, we provide a brief overview focusing on the role of miRNAs during HBV infection. In this way, this review aims to aid the selection of the most appropriate bioinformatics tools based on requirements and research goals.
Affiliation(s)
- Verdiana Zulian
- Virology Laboratory, National Institute for Infectious Diseases “Lazzaro Spallanzani” IRCCS, 00149 Rome, Italy
| | - Giulia Fiscon
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, 00185 Rome, Italy; (G.F.); (P.P.)
- Institute for Systems Analysis and Computer Science “Antonio Ruberti”, National Research Council, 00185 Rome, Italy
| | - Paola Paci
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, 00185 Rome, Italy; (G.F.); (P.P.)
- Institute for Systems Analysis and Computer Science “Antonio Ruberti”, National Research Council, 00185 Rome, Italy
| | - Anna Rosa Garbuglia
- Virology Laboratory, National Institute for Infectious Diseases “Lazzaro Spallanzani” IRCCS, 00149 Rome, Italy
|
21
|
Marino GB, Ahmed N, Xie Z, Jagodnik KM, Han J, Clarke DJB, Lachmann A, Keller MP, Attie AD, Ma’ayan A. D2H2: diabetes data and hypothesis hub. Bioinform Adv 2023; 3:vbad178. [PMID: 38107655 PMCID: PMC10723036 DOI: 10.1093/bioadv/vbad178] [Received: 08/16/2023] [Revised: 11/25/2023] [Accepted: 12/02/2023] [Indexed: 12/19/2023]
Abstract
Motivation: There is a rapid growth in the production of omics datasets collected by the diabetes research community. However, such published data are underutilized for knowledge discovery. To make bioinformatics tools and published omics datasets from the diabetes field more accessible to biomedical researchers, we developed the Diabetes Data and Hypothesis Hub (D2H2). Results: D2H2 contains hundreds of high-quality curated transcriptomics datasets relevant to diabetes, accessible via a user-friendly web-based portal. The collected and processed datasets are curated from the Gene Expression Omnibus (GEO). Each curated study has a dedicated page that provides data visualization, differential gene expression analysis, and single-gene queries. To enable the investigation of these curated datasets and to provide easy access to bioinformatics tools that serve gene and gene set-related knowledge, we developed the D2H2 chatbot. Utilizing GPT, we prompt users to enter free text about their data analysis needs. Parsing the user prompt, together with specifying information about all D2H2 available tools and workflows, we answer user queries by invoking the most relevant tools via the tools' API. D2H2 also has a hypotheses generation module where gene sets are randomly selected from the bulk RNA-seq precomputed signatures. We then find highly overlapping gene sets extracted from publications listed in PubMed Central with abstract dissimilarity. With the help of GPT, we speculate about a possible explanation of the high overlap between the gene sets. Overall, D2H2 is a platform that provides a suite of bioinformatics tools and curated transcriptomics datasets for hypothesis generation. Availability and implementation: D2H2 is available at https://d2h2.maayanlab.cloud/ and the source code is available from GitHub at https://github.com/MaayanLab/D2H2-site under the CC BY-NC 4.0 license.
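The hypothesis-generation step described in this abstract depends on ranking published gene sets by how strongly they overlap a query set. A minimal sketch of such a ranking is given below. The abstract does not say which overlap statistic D2H2 uses, so the Jaccard index here is an assumption, and the function names and the tiny gene-set library are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two gene collections: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_overlap(query, library):
    """Return (name, score) pairs sorted from highest to lowest overlap."""
    scored = [(name, jaccard(query, genes)) for name, genes in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical query signature and gene-set library for illustration only.
query = {"INS", "GCG", "PDX1"}
library = {
    "set_a": {"INS", "GCG", "SST"},
    "set_b": {"ALB"},
}

ranked = rank_by_overlap(query, library)
print(ranked)
```

A production system would more likely use an enrichment test that accounts for gene-set size and background; the Jaccard index is used here only because it is easy to verify by hand.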
Affiliation(s)
- Giacomo B Marino
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Nasheath Ahmed
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Zhuorui Xie
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Kathleen M Jagodnik
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Jason Han
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Daniel J B Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Alexander Lachmann
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
| | - Mark P Keller
- Department of Biochemistry, University of Wisconsin, Madison, WI 53706, United States
| | - Alan D Attie
- Department of Biochemistry, University of Wisconsin, Madison, WI 53706, United States
| | - Avi Ma’ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
|
22
|
Abstract
Comparing genomic and biological characteristics across multiple species is essential to using model systems to investigate the molecular and cellular mechanisms underlying human biology and disease and to translate mechanistic insights from studies in model organisms for clinical applications. Building a scalable knowledge commons platform that supports cross-species comparison of rich, expertly curated knowledge regarding gene function, phenotype, and disease associations available for model organisms and humans is the primary mission of the Alliance of Genome Resources (the Alliance). The Alliance is a consortium of seven model organism knowledgebases (mouse, rat, yeast, nematode, zebrafish, frog, fruit fly) and the Gene Ontology resource. The Alliance uses a common set of gene ortholog assertions as the basis for comparing biological annotations across the organisms represented in the Alliance. The major types of knowledge associated with genes that are represented in the Alliance database currently include gene function, phenotypic alleles and variants, human disease associations, pathways, gene expression, and both protein-protein and genetic interactions. The Alliance has enhanced the ability of researchers to easily compare biological annotations for common data types across model organisms and human through the implementation of shared programmatic access mechanisms, data-specific web pages with a unified "look and feel", and interactive user interfaces specifically designed to support comparative biology. The modular infrastructure developed by the Alliance allows the resource to serve as an extensible "knowledge commons" capable of expanding to accommodate additional model organisms.
|
23
|
He N, Li D, Xu F, Jin J, Li L, Tian L, Chen B, Li X, Ning S, Wang L, Wang J. LncPCD: a manually curated database of experimentally supported associations between lncRNA-mediated programmed cell death and diseases. Database (Oxford) 2023; 2023:baad087. [PMID: 38011720 PMCID: PMC10681436 DOI: 10.1093/database/baad087] [Received: 07/24/2023] [Revised: 10/02/2023] [Accepted: 11/15/2023] [Indexed: 11/29/2023]
Abstract
Programmed cell death (PCD) refers to controlled cell death that is conducted to keep the internal environment stable. Long noncoding RNAs (lncRNAs) participate in the progression of PCD in a variety of diseases. However, no specialized online repository is available to collect and store the associations between lncRNA-mediated PCD and diseases. Here, we developed LncPCD, a comprehensive database that provides information on experimentally supported associations of lncRNA-mediated PCD with diseases. The current version of LncPCD documents 6666 associations between five common types of PCD (apoptosis, autophagy, ferroptosis, necroptosis and pyroptosis) and 1222 lncRNAs in 331 diseases. We also manually curated a wealth of information: (1) 7 important lncRNA regulatory mechanisms, (2) 310 PCD-associated cell types in three species, (3) detailed information on lncRNA subcellular locations and (4) clinical applications for lncRNA-mediated PCD in diseases. Additionally, 10 single-cell sequencing datasets were integrated into LncPCD to characterize the dynamics of lncRNAs in diseases. Overall, LncPCD is an extremely useful resource for understanding the functions and mechanisms of lncRNA-mediated PCD in diseases. Database URL: http://spare4.hospital.studio:9000/lncPCD/Home.jsp.
Affiliation(s)
| | - Danyang Li
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Fanfan Xu
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Jingnan Jin
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Lifang Li
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Liting Tian
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Biying Chen
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Xiaoju Li
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Shangwei Ning
- College of Bioinformatics Science and Technology, Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Lihua Wang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
| | - Jianjian Wang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang 150081, China
|
24
|
Xu Q, Liu Y, Sun D, Huang X, Li F, Zhai J, Li Y, Zhou Q, Qian N, Niu B. OncoCTMiner: streamlining precision oncology trial matching via molecular profile analysis. Database (Oxford) 2023; 2023:baad077. [PMID: 37935585 PMCID: PMC10630409 DOI: 10.1093/database/baad077] [Received: 08/03/2023] [Revised: 09/08/2023] [Accepted: 10/21/2023] [Indexed: 11/09/2023]
Abstract
By establishing omics sequencing of patient tumors as a crucial element in cancer treatment, the extensive implementation of precision oncology necessitates effective and prompt execution of clinical studies for approving molecular-targeted therapies. However, the substantial volume of patient sequencing data, combined with strict clinical trial criteria, increasingly complicates the process of matching patients to precision oncology studies. To streamline enrollment in these studies, we developed OncoCTMiner, an automated pre-screening platform for molecular cancer clinical trials. Through manual tagging of eligibility criteria for 2227 oncology trials, we identified key bio-concepts such as cancer types, genes, alterations, drugs, biomarkers and therapies. Utilizing this manually annotated corpus along with open-source biomedical natural language processing tools, we trained multiple named entity recognition models specifically designed for precision oncology trials. These models analyzed 460 952 clinical trials, revealing 8.15 million precision medicine concepts, 9.32 million entity-criteria-trial triplets and a comprehensive precision oncology eligibility criteria database. Most significantly, we developed a patient-trial matching system based on cancer patients' clinical and genetic profiles, which can seamlessly integrate with the omics data analysis platform. This system expedites the pre-screening process for potentially suitable precision oncology trials, offering patients swifter access to promising treatment options. Database URL https://oncoctminer.chosenmedinfo.com.
Affiliation(s)
- Quan Xu
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- Research and Development Center, ChosenMed Technology (Zhejiang) Co. Ltd., Room 101, Building 8, Jincheng International Science and Technology City, No. 26 Zhenxing East Road, Linping District, Hangzhou, 311103, China
| | - Yueyue Liu
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
| | - Dawei Sun
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- Research and Development Center, ChosenMed Technology (Zhejiang) Co. Ltd., Room 101, Building 8, Jincheng International Science and Technology City, No. 26 Zhenxing East Road, Linping District, Hangzhou, 311103, China
| | - Xiaoqian Huang
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
| | - Feihong Li
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
| | - JinCheng Zhai
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
| | - Yang Li
- Beijing International Center for Mathematical Research, Peking University, No. 5 Yiheyuan Road Haidian District, Beijing 100871, China
- Chongqing Research Institute of Big Data, Peking University, Chongqing 401333, China
| | - Qiming Zhou
- Department of Bioinformatics, Beijing ChosenMed Clinical Laboratory Co. Ltd., Jinghai Industrial Park, 156 Jinghai 4th Road, Economic and Technological Development Area, Beijing 100176, China
- Research and Development Center, ChosenMed Technology (Zhejiang) Co. Ltd., Room 101, Building 8, Jincheng International Science and Technology City, No. 26 Zhenxing East Road, Linping District, Hangzhou, 311103, China
| | - Niansong Qian
- Department of Oncology, Senior Department of Respiratory and Critical Care Medicine, The Eighth Medical Center of Chinese PLA General Hospital, No.17 A Heishanhu Road, Haidian District, Beijing 100853, China
| | - Beifang Niu
- Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100190, China
|
25
|
Bi X, Liang W, Zhao Q, Wang J. SSLpheno: a self-supervised learning approach for gene-phenotype association prediction using protein-protein interactions and gene ontology data. Bioinformatics 2023; 39:btad662. [PMID: 37941450 PMCID: PMC10666204 DOI: 10.1093/bioinformatics/btad662] [Received: 05/15/2023] [Revised: 10/17/2023] [Accepted: 11/03/2023] [Indexed: 11/10/2023]
Abstract
MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene-phenotype associations still suffer from imbalanced category distribution and a lack of labeled data in small categories. RESULTS: To address the problem of labeled-data scarcity, we propose a self-supervised learning strategy for gene-phenotype association prediction, called SSLpheno. Our approach utilizes an attributed network that integrates protein-protein interactions and gene ontology data. We apply a Laplacian-based filter to ensure feature smoothness and use self-supervised training to optimize node feature representation. Specifically, we calculate the cosine similarity of feature vectors and select positive and negative sample nodes for reconstruction training labels. We employ a deep neural network for multi-label classification of phenotypes in the downstream task. Our experimental results demonstrate that SSLpheno outperforms state-of-the-art methods, especially in categories with fewer annotations. Moreover, our case studies illustrate the potential of SSLpheno as an effective prescreening tool for gene-phenotype association identification. AVAILABILITY AND IMPLEMENTATION: https://github.com/bixuehua/SSLpheno.
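The pseudo-label construction mentioned in this abstract (cosine similarity between node feature vectors, then picking positive and negative node pairs) can be illustrated with a short sketch. This is not the SSLpheno implementation: the fixed thresholds, the function names, and the toy feature vectors are all assumptions made for the example, and the published method may select pairs differently (for instance by rank rather than by threshold).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_pairs(features, pos_threshold=0.9, neg_threshold=0.1):
    """Split node pairs into positives (very similar) and negatives (very dissimilar).

    The thresholds are illustrative assumptions, not values from the paper.
    Pairs falling between the two thresholds are left unlabeled.
    """
    positives, negatives = [], []
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            score = cosine(features[i], features[j])
            if score >= pos_threshold:
                positives.append((i, j))
            elif score <= neg_threshold:
                negatives.append((i, j))
    return positives, negatives


# Toy, already-smoothed node features (hypothetical values).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pos, neg = select_pairs(features)
print(pos, neg)
```

The all-pairs loop is quadratic in the number of nodes, which is fine for a toy example but would need a vectorized or sampled variant on a real protein network.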
Affiliation(s)
- Xuehua Bi
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Medical Engineering and Technology College, Xinjiang Medical University, Urumqi 830017, China
| | - Weiyang Liang
- College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
| | - Qichang Zhao
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
26
|
Foos G, Blazeska N, Nielsen M, Carter H, Kosaloglu-Yalcin Z, Peters B, Sette A. A meta-analysis of epitopes in prostate-specific antigens identifies opportunities and knowledge gaps. Hum Immunol 2023; 84:578-589. [PMID: 37679223 PMCID: PMC11017785 DOI: 10.1016/j.humimm.2023.08.145] [Received: 03/29/2023] [Revised: 08/16/2023] [Accepted: 08/21/2023] [Indexed: 09/09/2023]
Abstract
BACKGROUND The Cancer Epitope Database and Analysis Resource (CEDAR) is a newly developed repository of cancer epitope data from peer-reviewed publications, which includes epitope-specific T cell, antibody, and MHC ligand assays. Here we focus on prostate cancer as our first cancer category to demonstrate the capabilities of CEDAR, and to shed light on the advances of epitope-related prostate cancer research. RESULTS The meta-analysis focused on a subset of data describing epitopes from 8 prostate-specific (PS) antigens. A total of 460 epitopes were associated with these proteins: 187 T cell, 109 B cell, and 271 MHC ligand epitopes. The number of epitopes was not correlated with the length of the protein; however, we found a significant positive correlation between the number of references per specific PS antigen and the number of reported epitopes. Forty-four different class I and 27 class II restrictions were found, with the most epitopes described for HLA-A*02:01 and HLA-DRB1*01:01. Cytokine assays were mostly limited to IFNγ assays, and a very limited number of tetramer assays were performed. Monoclonal and polyclonal B cell responses were balanced, with the highest number of epitopes studied in ELISA/Western blot assays. Additionally, epitopes were generically described as associated with prostate cancer, with little granularity specifying disease state. We found that in vivo and tumor recognition assays were sparse, and the number of epitopes with annotated B/T cell receptor information was limited. Potential immunodominant regions were identified by the use of the ImmunomeBrowser tool. CONCLUSION CEDAR provides a comprehensive repository of epitopes related to prostate-specific antigens. This inventory of epitope data with its wealth of searchable T cell, B cell and MHC ligand information provides a useful tool for the scientific community. At the same time, we identify significant knowledge gaps that could be addressed by experimental analysis.
Affiliation(s)
- Gabriele Foos
- Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA
| | - Nina Blazeska
- Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA
| | - Morten Nielsen
- Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, CP 1650 San Martín, Argentina; Department of Health Technology, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | - Hannah Carter
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Zeynep Kosaloglu-Yalcin
- Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA
| | - Bjoern Peters
- Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA; Department of Medicine, Division of Infectious Diseases and Global Public Health, University of California, San Diego (UCSD), La Jolla, CA 92037, USA
| | - Alessandro Sette
- Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA; Department of Medicine, Division of Infectious Diseases and Global Public Health, University of California, San Diego (UCSD), La Jolla, CA 92037, USA.
| |
|
27
|
Lim CM, González Díaz A, Fuxreiter M, Pun FW, Zhavoronkov A, Vendruscolo M. Multiomic prediction of therapeutic targets for human diseases associated with protein phase separation. Proc Natl Acad Sci U S A 2023; 120:e2300215120. [PMID: 37774095 PMCID: PMC10556643 DOI: 10.1073/pnas.2300215120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 08/02/2023] [Indexed: 10/01/2023] Open
Abstract
The phenomenon of protein phase separation (PPS) underlies a wide range of cellular functions. Correspondingly, the dysregulation of the PPS process has been associated with numerous human diseases. To enable therapeutic interventions based on the regulation of this association, possible targets should be identified. For this purpose, we present an approach that combines the multiomic PandaOmics platform with the FuzDrop method to identify PPS-prone disease-associated proteins. Using this approach, we prioritize candidates with high PandaOmics and FuzDrop scores using a profiling method that accounts for a wide range of parameters relevant for disease mechanism and pharmacological intervention. We validate the differential phase separation behaviors of three predicted Alzheimer's disease targets (MARCKS, CAMKK2, and p62) in two cell models of this disease. Overall, the approach that we present generates a list of possible therapeutic targets for human diseases associated with the dysregulation of the PPS process.
Affiliation(s)
- Christine M. Lim
- Yusuf Hamied Department of Chemistry, Centre for Misfolding Diseases, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| | - Alicia González Díaz
- Yusuf Hamied Department of Chemistry, Centre for Misfolding Diseases, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| | - Monika Fuxreiter
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| | - Frank W. Pun
- Insilico Medicine, Hong Kong Science and Technology Park, Hong Kong, China
| | - Alex Zhavoronkov
- Insilico Medicine, Hong Kong Science and Technology Park, Hong Kong, China
| | - Michele Vendruscolo
- Yusuf Hamied Department of Chemistry, Centre for Misfolding Diseases, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| |
|
28
|
Stefancsik R, Balhoff JP, Balk MA, Ball RL, Bello SM, Caron AR, Chesler EJ, de Souza V, Gehrke S, Haendel M, Harris LW, Harris NL, Ibrahim A, Koehler S, Matentzoglu N, McMurry JA, Mungall CJ, Munoz-Torres MC, Putman T, Robinson P, Smedley D, Sollis E, Thessen AE, Vasilevsky N, Walton DO, Osumi-Sutherland D. The Ontology of Biological Attributes (OBA)-computational traits for the life sciences. Mamm Genome 2023; 34:364-378. [PMID: 37076585 PMCID: PMC10382347 DOI: 10.1007/s00335-023-09992-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 04/06/2023] [Indexed: 04/21/2023]
Abstract
Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focussed measurable trait data. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos.
Affiliation(s)
- Ray Stefancsik
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK.
| | - James P Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC, 27517, USA
| | - Meghan A Balk
- Natural History Museum, University of Oslo, Oslo, Norway
| | - Robyn L Ball
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
| | | | - Anita R Caron
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | | | - Vinicius de Souza
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Sarah Gehrke
- Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
| | - Melissa Haendel
- Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
| | - Laura W Harris
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Nomi L Harris
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Arwa Ibrahim
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | | | | | - Julie A McMurry
- Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
| | - Christopher J Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | | | - Tim Putman
- Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
| | | | - Damian Smedley
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
| | - Elliot Sollis
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Anne E Thessen
- Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
| | - Nicole Vasilevsky
- Data Collaboration Center, Critical Path Institute, Tucson, AZ, 85718, USA
| | | | | |
|
29
|
Lemas DJ, Du X, Dado-Senn B, Xu K, Dobrowolski A, Magalhães M, Aristizabal-Henao JJ, Young BE, Francois M, Thompson LA, Parker LA, Neu J, Laporta J, Misra BB, Wane I, Samaan S, Garrett TJ. Untargeted Metabolomic Analysis of Lactation-Stage-Matched Human and Bovine Milk Samples at 2 Weeks Postnatal. Nutrients 2023; 15:3768. [PMID: 37686800 PMCID: PMC10490210 DOI: 10.3390/nu15173768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 08/20/2023] [Accepted: 08/21/2023] [Indexed: 09/10/2023] Open
Abstract
Epidemiological data demonstrate that bovine whole milk is often substituted for human milk during the first 12 months of life and may be associated with adverse infant outcomes. The objective of this study is to interrogate the human and bovine milk metabolome at 2 weeks of life to identify unique metabolites that may impact infant health outcomes. Human milk (n = 10) was collected at 2 weeks postpartum from normal-weight mothers (pre-pregnant BMI < 25 kg/m²) who vaginally delivered term infants and were exclusively breastfeeding their infants for at least 2 months. Similarly, bovine milk (n = 10) was collected 2 weeks postpartum from normal-weight primiparous Holstein dairy cows. Untargeted data were acquired on all milk samples using high-resolution liquid chromatography-high-resolution tandem mass spectrometry (HR LC-MS/MS). MS data pre-processing from feature calling to metabolite annotation was performed using MS-DIAL and MS-FLO. Our results revealed that more than 80% of the milk metabolome is shared between human and bovine milk samples during early lactation. Unbiased analysis of identified metabolites revealed that nearly 80% of milk metabolites may contribute to microbial metabolism and microbe-host interactions. Collectively, these results highlight untargeted metabolomics as a potential strategy to identify unique and shared metabolites in bovine and human milk that may relate to and impact infant health outcomes.
Affiliation(s)
- Dominick J. Lemas
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
- Department of Obstetrics and Gynecology, College of Medicine, University of Florida, Gainesville, FL 32608, USA
- Center for Perinatal Outcomes Research, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Xinsong Du
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Bethany Dado-Senn
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Ke Xu
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Amanda Dobrowolski
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Marina Magalhães
- Department of Behavioral Nursing Science, College of Nursing, University of Florida, Gainesville, FL 32603, USA
| | - Juan J. Aristizabal-Henao
- Department of Physiological Science, Center for Environmental and Human Toxicology, College of Veterinary Science, University of Florida, Gainesville, FL 32608, USA
| | - Bridget E. Young
- Division of Breastfeeding and Lactation Medicine, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Magda Francois
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Lindsay A. Thompson
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Leslie A. Parker
- Center for Perinatal Outcomes Research, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Josef Neu
- Department of Pediatrics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Jimena Laporta
- Department of Obstetrics and Gynecology, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | | | - Ismael Wane
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Samih Samaan
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| | - Timothy J. Garrett
- Department of Pathology, Immunology and Laboratory Medicine, College of Medicine, University of Florida, Gainesville, FL 32608, USA
| |
|
30
|
Ley M, Heinzel A, Fillinger L, Kratochwill K, Perco P. OntoloViz: a GUI for interactive visualization of ranked disease or drug lists using the MeSH and ATC ontologies. Bioinform Adv 2023; 3:vbad113. [PMID: 38496343 PMCID: PMC10941809 DOI: 10.1093/bioadv/vbad113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 07/25/2023] [Accepted: 08/22/2023] [Indexed: 03/19/2024]
Abstract
Motivation Structured vocabularies for drugs and diseases represent, besides their primary use for annotating scientific literature or scientific information in general, a valuable resource for visualizing aggregated information. The Medical Subject Headings (MeSH) and Anatomical Therapeutic Chemical (ATC) ontologies are widely used structured vocabularies for diseases and drugs, respectively. Their hierarchical tree-like structure can be used as a basis for creating intuitive visual displays for specific diseases and drugs within their higher-order classifications. Such displays are a helpful means to contextualize diseases and drugs in various settings such as drug repositioning. However, there are few tools that can harness the potential of these structured ontologies to create informative visual representations without extensive programming and data processing skills. Results We have developed OntoloViz, a Graphical User Interface (GUI) for visualizing annotated lists of drugs or diseases in the context of their MeSH or ATC ontologies in an intuitively interpretable sunburst layout. The minimum input is a list of disease or drug names. Users also have the option to specify numerical parameters for the input lists to enhance the visualization, e.g. to visualize term frequencies. The GUI allows values to be propagated upwards in the respective ontology tree structure, thus facilitating exploration of gene and drug lists. We present two use cases for OntoloViz, namely (i) a graphical representation of clinically tested drugs for coronavirus disease (COVID-19) based on ATC Classification and (ii) a graphical representation of literature annotation of human diseases on the MeSH ontology. Availability and implementation The OntoloViz package can be retrieved from PyPI. The source code along with test data, template, and documentation are available at GitHub (https://github.com/Delta4AI/OntoloViz).
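The upward propagation of values in an ontology tree mentioned in the abstract can be pictured with a short sketch. It is an illustration only and does not use the OntoloViz API: the dotted tree identifiers are made up, the function name is hypothetical, and summing is an assumed aggregation rule (the abstract does not say how values are combined).

```python
from collections import defaultdict


def propagate_up(node_values):
    """Add each node's value to the node itself and to all of its ancestors.

    Nodes are identified by dotted paths, so the ancestors of "X01.100.200"
    are "X01.100" and "X01". Summation is an assumed aggregation rule.
    """
    totals = defaultdict(float)
    for tree_id, value in node_values.items():
        parts = tree_id.split(".")
        for depth in range(1, len(parts) + 1):
            totals[".".join(parts[:depth])] += value
    return dict(totals)


# Hypothetical term frequencies attached to made-up tree identifiers.
counts = {"X01.100.200": 3, "X01.100.300": 2, "X01.500": 5}
totals = propagate_up(counts)
```

With these inputs the intermediate node "X01.100" accumulates 5 and the root "X01" accumulates 10, which is the kind of aggregated value an inner ring of a sunburst layout would display.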
Affiliation(s)
- Matthias Ley
- Computational Biology Department, Delta4 GmbH, Vienna 1080, Austria
- Division of Pediatric Nephrology and Gastroenterology, Department of Pediatrics and Adolescent Medicine, Comprehensive Center for Pediatrics, Medical University Vienna, Vienna 1090, Austria
| | - Andreas Heinzel
- Computational Biology Department, Delta4 GmbH, Vienna 1080, Austria
- Department of Internal Medicine III, Medical University Vienna, Vienna 1090, Austria
| | - Lucas Fillinger
- Computational Biology Department, Delta4 GmbH, Vienna 1080, Austria
| | - Klaus Kratochwill
- Computational Biology Department, Delta4 GmbH, Vienna 1080, Austria
- Division of Pediatric Nephrology and Gastroenterology, Department of Pediatrics and Adolescent Medicine, Comprehensive Center for Pediatrics, Medical University Vienna, Vienna 1090, Austria
| | - Paul Perco
- Computational Biology Department, Delta4 GmbH, Vienna 1080, Austria
- Department of Internal Medicine IV, Medical University Innsbruck, Innsbruck 6020, Austria
| |
|
31
|
Kaldunski ML, Smith JR, Brodie KC, De Pons JL, Demos WM, Gibson AC, Hayman GT, Lamers L, Laulederkind SJF, Thorat K, Thota J, Tutaj MA, Tutaj M, Vedi M, Wang SJ, Zacher S, Dwinell MR, Kwitek AE. Rare disease research resources at the Rat Genome Database. Genetics 2023; 224:iyad078. [PMID: 37119810 PMCID: PMC10411567 DOI: 10.1093/genetics/iyad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 04/05/2023] [Accepted: 04/19/2023] [Indexed: 05/01/2023] Open
Abstract
Rare diseases individually affect relatively few people, but as a group they impact considerable numbers of people. The Rat Genome Database (https://rgd.mcw.edu) is a knowledgebase that offers resources for rare disease research. This includes disease definitions, genes, quantitative trait loci (QTLs), genetic variants, annotations to published literature, links to external resources, and more. One important resource is identifying relevant cell lines and rat strains that serve as models for disease research. Diseases, genes, and strains have report pages with consolidated data, and links to analysis tools. Utilizing these globally accessible resources for rare disease research, potentiating discovery of mechanisms and new treatments, can point researchers toward solutions to alleviate the suffering of those afflicted with these diseases.
Affiliation(s)
- Mary L Kaldunski
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jennifer R Smith
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Kent C Brodie
- Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jeffrey L De Pons
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Wendy M Demos
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Adam C Gibson
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - G Thomas Hayman
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Logan Lamers
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Stanley J F Laulederkind
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Ketaki Thorat
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jyothi Thota
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Marek A Tutaj
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Monika Tutaj
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Mahima Vedi
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Shur-Jen Wang
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Stacy Zacher
- Finance and Administration, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Melinda R Dwinell
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Anne E Kwitek
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Joint Department of Biomedical Engineering, Marquette University & Medical College of Wisconsin, Milwaukee, WI 53226, USA
| |
|
32
|
Li X, Yuan H, Wu X, Wang C, Wu M, Shi H, Lv Y. MultiDS-MDA: Integrating multiple data sources into heterogeneous network for predicting novel metabolite-drug associations. Comput Biol Med 2023; 162:107067. [PMID: 37276756 DOI: 10.1016/j.compbiomed.2023.107067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/07/2023]
Abstract
Metabolic processes in the human body play an important role in maintaining normal life activities, and the abnormal concentration of metabolites is closely related to the occurrence and development of diseases. The use of drugs is considered to have a major impact on metabolism, and drug metabolites can contribute to efficacy, drug toxicity and drug-drug interactions. However, our understanding of metabolite-drug associations is far from complete, and individual data sources tend to be incomplete and noisy. Therefore, the integration of various types of data sources for inferring reliable metabolite-drug associations is urgently needed. In this study, we proposed a computational framework, MultiDS-MDA, for identifying metabolite-drug associations by integrating multiple data sources, including chemical structure information of metabolites and drugs, the relationships of metabolite-gene, metabolite-disease, drug-gene and drug-disease, the data of gene ontology (GO) and disease ontology (DO), and known metabolite-drug connections. The performance of MultiDS-MDA was evaluated by 5-fold cross-validation, which achieved an area under the ROC curve (AUROC) of 0.911 and an area under the precision-recall curve (AUPRC) of 0.907. Additionally, MultiDS-MDA showed outstanding performance compared with similar approaches. Case studies for three metabolites (cholesterol, thromboxane B2 and coenzyme Q10) and three drugs (simvastatin, pravastatin and morphine) also demonstrated the reliability and efficiency of MultiDS-MDA, and it is anticipated that MultiDS-MDA will serve as a powerful tool for future exploration of metabolite-drug interactions and contribute to drug development and drug combination.
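For readers unfamiliar with the AUROC metric reported above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. It is an illustration only; the function name, labels, and scores are invented and are not data or code from MultiDS-MDA.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values.
    scores: predicted scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: two known associations and two non-associations.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
result = auroc(labels, scores)
```

Here three of the four positive-negative pairs are ordered correctly, giving 0.75. In a 5-fold cross-validation this quantity would typically be computed on each held-out fold and then averaged, although the abstract does not specify the averaging procedure.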
Affiliation(s)
- Xiuhong Li
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Hao Yuan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Xiaoliang Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Chengyi Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Meitao Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Hongbo Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, China.
| | - Yingli Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, China.
| |
|
33
|
Avila R, Rubinetti V, Zhou X, Hu D, Qian Z, Cano MA, Rodolpho E, Tsueng G, Greene C, Wu C. MyGeneset.info: an interactive and programmatic platform for community-curated and user-created collections of genes. Nucleic Acids Res 2023; 51:W350-W356. [PMID: 37070209 PMCID: PMC10481249 DOI: 10.1093/nar/gkad289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 03/28/2023] [Accepted: 04/13/2023] [Indexed: 04/19/2023] Open
Abstract
Gene definitions and identifiers can be painful to manage, more so when trying to include gene function annotations, as these can be highly context-dependent. Creating groups of genes or gene sets can help provide such context, but it compounds the issue as each gene within the gene set can map to multiple identifiers and have annotations derived from multiple sources. We developed MyGeneset.info to provide an API for integrated annotations for gene sets suitable for use in analytical pipelines or web servers. Leveraging our previous work with MyGene.info (a server that provides gene-centric annotations and identifiers), MyGeneset.info addresses the challenge of managing gene sets from multiple resources. With our API, users readily have read-only access to gene sets imported from commonly used resources such as WikiPathways, CTD, Reactome, SMPDB, MSigDB, GO, and DO. In addition to supporting the access and reuse of approximately 180k gene sets from humans, common model organisms (mice, yeast, etc.), and less-common ones (e.g. black cottonwood tree), MyGeneset.info supports user-created gene sets, providing an important means for making gene sets more FAIR. User-created gene sets can serve as a way to store and manage collections for analysis or easy dissemination through a consistent API.
Affiliation(s)
- Ricardo Avila
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Vincent Rubinetti
- Department of Biochemistry and Molecular Genetics, Center for Health AI, University of Colorado School of Medicine, Aurora, CO, USA
| | - Xinghua Zhou
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Dongbo Hu
- Department of Biochemistry and Molecular Genetics, Center for Health AI, University of Colorado School of Medicine, Aurora, CO, USA
| | - Zhongchao Qian
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Marco Alvarado Cano
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Everaldo Rodolpho
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Ginger Tsueng
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Casey Greene
- Department of Biochemistry and Molecular Genetics, Center for Health AI, University of Colorado School of Medicine, Aurora, CO, USA
| | - Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| |
|
34
|
Lu J, Shen J, Xiong B, Ma W, Staab S, Yang C. HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting. Int ACM SIGIR Conf Res Dev Inf Retr 2023; 2023:2052-2056. [PMID: 38352127 PMCID: PMC10863609 DOI: 10.1145/3539618.3591997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2024]
Abstract
Medical decision-making processes can be enhanced by comprehensive biomedical knowledge bases, which require fusing knowledge graphs constructed from different sources via a uniform index system. The index system often organizes biomedical terms in a hierarchy to provide the aligned entities with fine-grained granularity. To address the challenge of scarce supervision in the biomedical knowledge fusion (BKF) task, researchers have proposed various unsupervised methods. However, these methods heavily rely on ad-hoc lexical and structural matching algorithms, which fail to capture the rich semantics conveyed by biomedical entities and terms. Recently, neural embedding models have proved effective in semantic-rich tasks, but they rely on sufficient labeled data to be adequately trained. To bridge the gap between the scarce-labeled BKF and neural embedding models, we propose HiPrompt, a supervision-efficient knowledge fusion framework that elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts. Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt.
Affiliation(s)
| | | | - Bo Xiong
- University of Stuttgart, Germany
| | | | - Steffen Staab
- University of Stuttgart, Germany, University of Southampton, UK
| | | |
|
35
|
Zhao D, Qin D, Yin L, Yang Q. Integrated Bioinformatics Analysis and Experimental Verification of Immune Cell Infiltration and the Related Core Genes in Ulcerative Colitis. Pharmgenomics Pers Med 2023; 16:629-643. [PMID: 37383675 PMCID: PMC10296601 DOI: 10.2147/pgpm.s406644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 05/26/2023] [Indexed: 06/30/2023] Open
Abstract
Background: Ulcerative colitis (UC) is a recurrent autoimmune disease. At present, the pathogenesis of UC is not completely clear. Hence, the etiology and underlying molecular mechanism need to be further investigated. Methods: Three microarray datasets were obtained from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) in two of the datasets were analyzed using the R software, and the core genes of UC were screened using machine learning. The sensitivity and specificity of the core genes were evaluated with the receiver operating characteristic curve in the third microarray dataset. Subsequently, the CIBERSORT tool was used to analyze the relationship between UC, its core genes, and immune cell infiltration. The relationship between UC and the core genes, and between the core genes and immune cell infiltration, was then verified in vivo. Results: A total of 36 DEGs were identified. AQP8, HMGCS2, and VNN1 were determined to be the core genes of UC. These genes had high sensitivity and specificity in receiver operating characteristic curve analysis. According to the analysis of immune cell infiltration, neutrophils, monocytes, and macrophages were positively correlated with UC. AQP8, HMGCS2, and VNN1 were also correlated with immune cell infiltration to varying degrees. In vivo experiments verified that the levels of neutrophils, monocytes, and macrophages increased in the UC colon. Furthermore, the expression of AQP8 and HMGCS2 decreased, whereas that of VNN1 increased. Azathioprine treatment improved all the indicators to different degrees. Conclusion: AQP8, HMGCS2, and VNN1 are the core genes of UC and exhibit different degrees of correlation with immune cells. These genes are expected to become new therapeutic targets for UC. Moreover, the occurrence and development of UC are influenced by immune cell infiltration.
Affiliation(s)
- Danya Zhao
- The First School of Clinical Medicine of Zhejiang Chinese Medical University, Hangzhou, People’s Republic of China
| | - Danping Qin
- Department of Gastroenterology, The First Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, People’s Republic of China
| | - Liming Yin
- Institute of Hematology, The First Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, People’s Republic of China
| | - Qiang Yang
- Department of Gastroenterology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, People’s Republic of China
|
36
|
Li YH, Sun CC, Chen PM, Chen HH. SGK1 Target Genes Involved in Heart and Blood Vessel Functions in PC12 Cells. Cells 2023; 12:1641. [PMID: 37371111 DOI: 10.3390/cells12121641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 06/07/2023] [Accepted: 06/13/2023] [Indexed: 06/29/2023] Open
Abstract
Serum and glucocorticoid-regulated kinase 1 (SGK1) is expressed in neuronal cells and involved in the pathogenesis of hypertension and metabolic syndrome, regulation of neuronal function, and depression in the brain. This study aims to identify the cellular mechanisms and signaling pathways of SGK1 in neuronal cells. In this study, the SGK1 inhibitor GSK650394 is used to suppress SGK1 expression in PC12 cells using an in vitro neuroscience research platform. Comparative transcriptomic analysis was performed to investigate the effects of SGK1 inhibition in nervous cells using mRNA sequencing (RNA-seq), differentially expressed genes (DEGs), and gene enrichment analysis. In total, 12,627 genes were identified, including 675 and 2152 DEGs at 48 and 72 h after treatment with GSK650394 in PC12 cells, respectively. Gene enrichment analysis data indicated that SGK1 inhibition-induced DEGs were enriched in 94 and 173 genes associated with vascular development and functional regulation and were validated using real-time PCR, Western blotting, and GEPIA2. Therefore, this study uses RNA-seq, DEG analysis, and GEPIA2 correlation analysis to identify positive candidate genes and signaling pathways regulated by SGK1 in rat nervous cells, which will enable further exploration of the underlying molecular signaling mechanisms of SGK1 and provide new insights into neuromodulation in cardiovascular diseases.
Affiliation(s)
- Yu-He Li
- Department of Laboratory Medicine, Zuoying Branch of Kaohsiung Armed Forces General Hospital, Kaohsiung 813, Taiwan
| | - Chia-Cheng Sun
- Physical Examination Center, Show Chwan Memorial Hospital, Changhua 500, Taiwan
| | - Po-Ming Chen
- Research Assistant Center, Show Chwan Memorial Hospital, Changhua 500, Taiwan
| | - Hsin-Hung Chen
- Department of Medical Education and Research, Kaohsiung Veterans General Hospital, Kaohsiung 813, Taiwan
|
37
|
Liu Y, Li G. Empowering biologists to decode omics data: the Genekitr R package and web server. BMC Bioinformatics 2023; 24:214. [PMID: 37221491 DOI: 10.1186/s12859-023-05342-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/16/2023] [Accepted: 05/16/2023] [Indexed: 05/25/2023] Open
Abstract
BACKGROUND: A variety of high-throughput analyses, such as transcriptome, proteome, and metabolome analysis, have been developed, producing unprecedented amounts of omics data. These studies generate large gene lists whose biological significance needs to be understood in depth. However, manually interpreting these lists is difficult, especially for scientists without a bioinformatics background. RESULTS: We developed an R package and a corresponding web server, Genekitr, to assist biologists in exploring large gene sets. Genekitr comprises four modules: gene information retrieval, ID (identifier) conversion, enrichment analysis, and publication-ready plotting. Currently, the information retrieval module can retrieve information on up to 23 attributes for genes of 317 organisms. The ID conversion module assists in ID-mapping of genes, probes, proteins, and aliases. The enrichment analysis module organizes 315 gene set libraries in different biological contexts and supports over-representation analysis and gene set enrichment analysis. The plotting module produces customizable, high-quality illustrations that can be used directly in presentations or publications. CONCLUSIONS: This web server tool will make bioinformatics more accessible to scientists who might not have programming expertise, allowing them to perform bioinformatics tasks without coding.
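The over-representation analysis mentioned in this abstract is commonly carried out as a one-sided hypergeometric test. The sketch below illustrates that general idea only; it is not Genekitr's implementation, and the function name `ora_p_value` and all counts are hypothetical values chosen for illustration.

```python
from math import comb


def ora_p_value(universe: int, set_size: int, list_size: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe  -- number of background genes (hypothetical value in the example)
    set_size  -- genes in the universe annotated to the gene set
    list_size -- genes in the user's input list
    overlap   -- input-list genes that are also in the gene set
    """
    upper = min(set_size, list_size)
    favourable = sum(
        comb(set_size, i) * comb(universe - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(universe, list_size)


# Hypothetical example: 20 background genes, a 5-gene set, a 5-gene input list.
p_all_five = ora_p_value(20, 5, 5, 5)    # every input gene falls in the set
p_at_least_3 = ora_p_value(20, 5, 5, 3)  # three or more input genes fall in the set
print(p_all_five, p_at_least_3)
```

In practice the resulting p-values are usually corrected for multiple testing across all gene sets tested; that step is omitted here.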
Affiliation(s)
- Yunze Liu
- Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China
- Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China
- Department of Biomedical Science, Faculty of Health Sciences, University of Macau, Macau SAR, China
| | - Gang Li
- Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China.
- Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China.
- Department of Biomedical Science, Faculty of Health Sciences, University of Macau, Macau SAR, China.
|
38
|
Zhuo J, Wang K, Shi Z, Yuan C. Immunogenic cell death-led discovery of COVID-19 biomarkers and inflammatory infiltrates. Front Microbiol 2023; 14:1191004. [PMID: 37228369 PMCID: PMC10203236 DOI: 10.3389/fmicb.2023.1191004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 04/18/2023] [Indexed: 05/27/2023] Open
Abstract
Immunogenic cell death (ICD) serves a critical role in regulating cell death adequate to activate an adaptive immune response, and it is associated with various inflammation-related diseases. However, the specific role of ICD-related genes in COVID-19 remains unclear. We acquired COVID-19-related information from the GEO database and a total of 14 ICD-related differentially expressed genes (DEGs) were identified. These ICD-related DEGs were closely associated with inflammation and immune activity. Afterward, CASP1, CD4, and EIF2AK3 among the 14 DEGs were selected as feature genes based on LASSO, Random Forest, and SVM-RFE algorithms, which had reliable diagnostic abilities. Moreover, functional enrichment analysis indicated that these feature genes may have a potential role in COVID-19 by being involved in the regulation of immune response and metabolism. Further CIBERSORT analysis demonstrated that the variations in the immune microenvironment of COVID-19 patients may be correlated with CASP1, CD4, and EIF2AK3. Additionally, 33 drugs targeting 3 feature genes had been identified, and the ceRNA network demonstrated a complicated regulative association based on these feature genes. Our work identified that CASP1, CD4, and EIF2AK3 were diagnostic genes of COVID-19 and correlated with immune activity. This study presents a reliable diagnostic signature and offers an overview to investigate the mechanism of COVID-19.
Affiliation(s)
- Jianzhen Zhuo
- Guangdong Medical University, Dongguan, Guangdong, China
- Clinical Laboratory, Boai Hospital of Zhongshan Affiliated to Southern Medical University, Zhongshan, China
| | - Ke Wang
- Clinical Laboratory, Boai Hospital of Zhongshan Affiliated to Southern Medical University, Zhongshan, China
| | - Zijun Shi
- Reproductive Medical Center, Boai Hospital of Zhongshan Affiliated to Southern Medical University, Zhongshan, China
| | - Chunlei Yuan
- Guangdong Medical University, Dongguan, Guangdong, China
- Clinical Laboratory, Boai Hospital of Zhongshan Affiliated to Southern Medical University, Zhongshan, China
|
39
|
Alqaissi E, Alotaibi F, Ramzan MS. Graph data science and machine learning for the detection of COVID-19 infection from symptoms. PeerJ Comput Sci 2023; 9:e1333. [PMID: 37346701 PMCID: PMC10280642 DOI: 10.7717/peerj-cs.1333] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/04/2023] [Accepted: 03/16/2023] [Indexed: 06/23/2023]
Abstract
Background: COVID-19 is an infectious disease caused by SARS-CoV-2. Its symptoms range from mild to moderate respiratory illness, and it sometimes requires urgent medication. Therefore, it is crucial to detect COVID-19 at an early stage through specific clinical tests, testing kits, and medical devices. However, these tests are not always available during a pandemic. Therefore, this study developed an automatic, intelligent, rapid, and real-time diagnostic model for the early detection of COVID-19 based on its symptoms. Methods: A COVID-19 knowledge graph (KG), constructed from literature and heterogeneous data, is imported to understand the different COVID-19 relations. We added the human disease ontology to the COVID-19 KG and applied a node-embedding graph algorithm called fast random projection to extract an extra feature from the COVID-19 dataset. Subsequently, experiments were conducted using two machine learning (ML) pipelines to predict COVID-19 infection from its symptoms. Additionally, automatic tuning of the model hyperparameters was adopted. Results: We compared two graph-based ML models, logistic regression (LR) and random forest (RF). The proposed graph-based RF model achieved a small error rate of 0.0064 and the best scores on all performance metrics, including specificity = 98.71%, accuracy = 99.36%, precision = 99.65%, recall = 99.53%, and F1-score = 99.59%. Furthermore, the Matthews correlation coefficient achieved by the RF model was higher than that of the LR model. Comparative analysis with other ML algorithms and with studies from the literature showed that the proposed RF model exhibited the best detection accuracy. Conclusion: The graph-based RF model registered high performance in classifying the symptoms of COVID-19 infection, indicating that graph data science, in conjunction with ML techniques, helps improve performance and accelerate innovation.
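The metrics reported in this abstract are linked by standard definitions: the F1-score is the harmonic mean of precision and recall, and the error rate is one minus accuracy. The short check below recomputes both from the reported precision, recall, and accuracy. It is an illustrative consistency check, not code from the paper; the function name `f1_score` and the `reported` dictionary are introduced here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, converted from percentages to fractions.
reported = {
    "accuracy": 0.9936,
    "precision": 0.9965,
    "recall": 0.9953,
    "f1": 0.9959,
    "error_rate": 0.0064,
}

f1 = f1_score(reported["precision"], reported["recall"])
error_rate = 1 - reported["accuracy"]

print(f"recomputed F1 = {f1:.4f}, reported = {reported['f1']:.4f}")
print(f"recomputed error rate = {error_rate:.4f}, reported = {reported['error_rate']:.4f}")
```

Both recomputed values agree with the reported figures to four decimal places.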
Affiliation(s)
- Eman Alqaissi
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Information Systems, King Khalid University, Abha, Saudi Arabia
| | - Fahd Alotaibi
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Muhammad Sher Ramzan
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
|
40
|
Sousa DF, Couto FM. K-RET: knowledgeable biomedical relation extraction system. Bioinformatics 2023; 39:7108769. [PMID: 37018156 PMCID: PMC10112952 DOI: 10.1093/bioinformatics/btad174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 02/25/2023] [Accepted: 03/29/2023] [Indexed: 04/20/2023]
Abstract
MOTIVATION: Relation extraction (RE) is a crucial process to deal with the amount of text published daily, e.g. to find missing associations in a database. RE is a text mining task for which the state-of-the-art approaches use bidirectional encoders, namely, BERT. However, state-of-the-art performance may be limited by the lack of efficient external knowledge injection approaches, with a larger impact in the biomedical area given the widespread usage and high quality of biomedical ontologies. This knowledge can propel these systems forward by aiding them in predicting more explainable biomedical associations. With this in mind, we developed K-RET, a novel, knowledgeable biomedical RE system that, for the first time, injects knowledge by handling different types of associations, multiple sources and where to apply it, and multi-token entities. RESULTS: We tested K-RET on three independent and open-access corpora (DDI, BC5CDR, and PGR) using four biomedical ontologies handling different entities. K-RET improved state-of-the-art results by 2.68% on average, with the DDI Corpus yielding the most significant boost in performance, from 79.30% to 87.19% in F-measure, representing a P-value of 2.91×10⁻¹². AVAILABILITY AND IMPLEMENTATION: https://github.com/lasigeBioTM/K-RET.
Affiliation(s)
- Diana F Sousa
- Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Lisboa 1749-016, Portugal
| | - Francisco M Couto
- Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Lisboa 1749-016, Portugal
|
41
|
Abstract
FlyBase (www.flybase.org) is the primary online database of genetic, genomic, and functional information about Drosophila melanogaster. The long and rich history of Drosophila research, combined with recent surges in genomic-scale and high-throughput technologies, means that FlyBase now houses a huge quantity of data. Researchers need to be able to query these data rapidly and intuitively, and the QuickSearch tool has been designed to meet these needs. This tool is conveniently located on the FlyBase homepage and is organized into a series of simple tabbed interfaces that cover the major data and annotation classes within the database. This article describes the functionality of all aspects of the QuickSearch tool. With this knowledge, FlyBase users will be equipped to take full advantage of all QuickSearch features and thereby gain improved access to data relevant to their research. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Using the "Search FlyBase" tab of QuickSearch
Basic Protocol 2: Using the "Data Class" tab of QuickSearch
Basic Protocol 3: Using the "References" tab of QuickSearch
Basic Protocol 4: Using the "Gene Groups" tab of QuickSearch
Basic Protocol 5: Using the "Pathways" tab of QuickSearch
Basic Protocol 6: Using the "GO" tab of QuickSearch
Basic Protocol 7: Using the "Protein Domains" tab of QuickSearch
Basic Protocol 8: Using the "Expression" tab of QuickSearch
Basic Protocol 9: Using the "GAL4 etc" tab of QuickSearch
Basic Protocol 10: Using the "Phenotype" tab of QuickSearch
Basic Protocol 11: Using the "Human Disease" tab of QuickSearch
Basic Protocol 12: Using the "Homologs" tab of QuickSearch
Support Protocol 1: Managing FlyBase hit lists.
Affiliation(s)
- Steven J. Marygold
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge, United Kingdom
|
42
|
Günay Ç, Aykol D, Özsoy Ö, Sönmezler E, Hanci YS, Kara B, Akkoyunlu Sünnetçi D, Cine N, Deniz A, Özer T, Ölçülü CB, Yilmaz Ö, Kanmaz S, Yilmaz S, Tekgül H, Yildiz N, Acar Arslan E, Cansu A, Olgaç Dündar N, Kusgoz F, Didinmez E, Gençpinar P, Aksu Uzunhan T, Ertürk B, Gezdirici A, Ayaz A, Ölmez A, Ayanoğlu M, Tosun A, Topçu Y, Kiliç B, Aydin K, Çağlar E, Ersoy Kosvali Ö, Okuyaz Ç, Besen Ş, Tekin Orgun L, Erol İ, Yüksel D, Sezer A, Atasoy E, Toprak Ü, Güngör S, Ozgor B, Karadağ M, Dilber C, Şahinoğlu B, Uyur Yalçin E, Eldes Hacifazlioglu N, Yaramiş A, Edem P, Gezici Tekin H, Yilmaz Ü, Ünalp A, Turay S, Biçer D, Gül Mert G, Dokurel Çetin İ, Kirik S, Öztürk G, Karal Y, Sanri A, Aksoy A, Polat M, Özgün N, Soydemir D, Sarikaya Uzan G, Ülker Üstebay D, Gök A, Yeşilmen MC, Yiş U, Karakülah G, Bursali A, Oktay Y, Hiz Kurul S. Shared Biological Pathways and Processes in Patients with Intellectual Disability: A Multicenter Study. Neuropediatrics 2023. [PMID: 36787800 DOI: 10.1055/a-2034-8528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
BACKGROUND: Although the underlying genetic causes of intellectual disability (ID) continue to be rapidly identified, the biological pathways and processes that could be targets for a potential molecular therapy are not yet known. This study aimed to identify ID-related shared pathways and processes utilizing enrichment analyses. METHOD: In this multicenter study, causative genes of patients with ID were used as input for Disease Ontology (DO), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes enrichment analysis. RESULTS: Genetic test results of 720 patients from 27 centers were obtained. Patients with chromosomal deletion/duplication, non-ID genes, novel genes, and results with changes in more than one gene were excluded. A total of 558 patients with 341 different causative genes were included in the study. Pathway-based enrichment analysis of the ID-related genes via ClusterProfiler revealed 18 shared pathways, with lysine degradation and nicotine addiction being the most common. The most common of the 25 overrepresented DO terms was ID. The most frequently overrepresented GO biological process, cellular component, and molecular function terms were regulation of membrane potential, ion channel complex, and voltage-gated ion channel activity/voltage-gated channel activity, respectively. CONCLUSION: Lysine degradation, nicotine addiction, and thyroid hormone signaling pathways are well-suited to be research areas for the discovery of new targeted therapies in ID patients.
Affiliation(s)
- Çağatay Günay
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Duygu Aykol
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Özlem Özsoy
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Ece Sönmezler
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
| | - Yaren Sena Hanci
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Bülent Kara
- Department of Pediatric Neurology, Kocaeli University School of Medicine, Kocaeli, Turkey
| | | | - Naci Cine
- Department of Medical Genetics, Kocaeli University School of Medicine, Kocaeli, Turkey
| | - Adnan Deniz
- Department of Pediatric Neurology, Kocaeli University School of Medicine, Kocaeli, Turkey
| | - Tolgahan Özer
- Department of Medical Genetics, Kocaeli University School of Medicine, Kocaeli, Turkey
| | - Cemile Büşra Ölçülü
- Department of Child Neurology, Ege University Faculty of Medicine, Izmir, Turkey
| | - Özlem Yilmaz
- Department of Child Neurology, Ege University Faculty of Medicine, Izmir, Turkey
| | - Seda Kanmaz
- Department of Child Neurology, Ege University Faculty of Medicine, Izmir, Turkey
| | - Sanem Yilmaz
- Department of Child Neurology, Ege University Faculty of Medicine, Izmir, Turkey
| | - Hasan Tekgül
- Department of Child Neurology, Ege University Faculty of Medicine, Izmir, Turkey
| | - Nihal Yildiz
- Department of Pediatric Neurology, Karadeniz Technical University, Faculty of Medicine, Farabi Hospital, Trabzon, Turkey
| | - Elif Acar Arslan
- Department of Pediatric Neurology, Karadeniz Technical University, Faculty of Medicine, Farabi Hospital, Trabzon, Turkey
| | - Ali Cansu
- Department of Pediatric Neurology, Karadeniz Technical University, Faculty of Medicine, Farabi Hospital, Trabzon, Turkey
| | - Nihal Olgaç Dündar
- Department of Pediatric Neurology, İzmir Katip Çelebi University, Izmir, Turkey
| | - Fatma Kusgoz
- Department of Pediatric Neurology, Tepecik Research and Training Hospital, Izmir, Turkey
| | - Elif Didinmez
- Department of Pediatric Neurology, Tepecik Research and Training Hospital, Izmir, Turkey
| | - Pınar Gençpinar
- Department of Pediatric Neurology, İzmir Katip Çelebi University, Izmir, Turkey
| | - Tuğçe Aksu Uzunhan
- Department of Pediatric Neurology, Prof Dr Cemil Tascioglu City Hospital, Istanbul, Turkey
| | - Biray Ertürk
- Department of Pediatric Neurology, Prof Dr Cemil Tascioglu City Hospital, Istanbul, Turkey
| | - Alper Gezdirici
- Department of Medical Genetics, Başakşehir Çam and Sakura City Hospital, Istanbul, Turkey
| | - Akif Ayaz
- Department of Medical Genetics, Istanbul Medipol University School of Medicine, Istanbul, Turkey
| | - Akgün Ölmez
- Denizli Pediatric Neurology Clinic, Denizli, Turkey
| | - Müge Ayanoğlu
- Department of Child Neurology, Adnan Menderes University School of Medicine, Aydın, Turkey
| | - Ayşe Tosun
- Department of Child Neurology, Adnan Menderes University School of Medicine, Aydın, Turkey
| | - Yasemin Topçu
- Department of Pediatric Neurology, Istanbul Medipol University Faculty of Medicine, Istanbul, Turkey
| | - Betül Kiliç
- Department of Pediatric Neurology, Istanbul Medipol University Faculty of Medicine, Istanbul, Turkey
| | - Kürşad Aydin
- Department of Pediatric Neurology, Istanbul Medipol University Faculty of Medicine, Istanbul, Turkey
| | - Ezgi Çağlar
- Departments of Pediatric Neurology, Mersin University Faculty of Medicine, Mersin, Turkey
| | - Özlem Ersoy Kosvali
- Departments of Pediatric Neurology, Mersin University Faculty of Medicine, Mersin, Turkey
| | - Çetin Okuyaz
- Departments of Pediatric Neurology, Mersin University Faculty of Medicine, Mersin, Turkey
| | - Şeyda Besen
- Division of Pediatric Neurology, Başkent University Adana Medical and Research Center Faculty of Medicine, Adana, Turkey
| | - Leman Tekin Orgun
- Division of Pediatric Neurology, Başkent University Adana Medical and Research Center Faculty of Medicine, Adana, Turkey
| | - İlknur Erol
- Division of Pediatric Neurology, Başkent University Adana Medical and Research Center Faculty of Medicine, Adana, Turkey
| | - Deniz Yüksel
- Department of Pediatric Neurology, University of Health Sciences Faculty of Medicine, Dr Sami Ulus Maternity Child Health and Diseases Training and Research Hospital, Ankara, Turkey
| | - Abdullah Sezer
- Department of Genetics, University of Health Sciences Faculty of Medicine, Dr Sami Ulus Maternity Child Health and Diseases Training and Research Hospital, Ankara, Turkey
| | - Ergin Atasoy
- Department of Pediatric Neurology, University of Health Sciences Faculty of Medicine, Dr Sami Ulus Maternity Child Health and Diseases Training and Research Hospital, Ankara, Turkey
| | - Ülkühan Toprak
- Department of Pediatric Neurology, University of Health Sciences Faculty of Medicine, Dr Sami Ulus Maternity Child Health and Diseases Training and Research Hospital, Ankara, Turkey
| | - Serdal Güngör
- Department of Paediatric Neurology, Inonu University Faculty of Medicine, Turgut Ozal Research Center, Malatya, Turkey
| | - Bilge Ozgor
- Department of Paediatric Neurology, Inonu University Faculty of Medicine, Turgut Ozal Research Center, Malatya, Turkey
| | - Meral Karadağ
- Department of Paediatric Neurology, Inonu University Faculty of Medicine, Turgut Ozal Research Center, Malatya, Turkey
| | - Cengiz Dilber
- Department of Pediatric Neurology, Kahramanmaras Sutcu Imam University Faculty of Medicine, Kahramanmaraş, Turkey
| | - Bahtiyar Şahinoğlu
- Department of Genetics, Dr Ersin Arslan Training and Research Hospital, Gaziantep, Turkey
| | - Emek Uyur Yalçin
- Departments of Pediatrics and Pediatric Neurology, University of Health Sciences, Zeynep Kamil Maternity and Children's Diseases Hospital, Istanbul, Turkey
| | - Nilüfer Eldes Hacifazlioglu
- Departments of Pediatrics and Pediatric Neurology, University of Health Sciences, Zeynep Kamil Maternity and Children's Diseases Hospital, Istanbul, Turkey
| | - Ahmet Yaramiş
- Diyarbakır Pediatric Neurology Clinic, Diyarbakır, Turkey
| | - Pınar Edem
- Department of Pediatric Neurology, Bakırcay University, Cigli District Training Hospital, Izmir, Turkey
| | - Hande Gezici Tekin
- Department of Pediatric Neurology, Bakırcay University, Cigli District Training Hospital, Izmir, Turkey
| | - Ünsal Yilmaz
- Department of Pediatric Neurology, Dr. Behcet Uz Children's Hospital, Izmir, Turkey
| | - Aycan Ünalp
- Department of Pediatric Neurology, Dr. Behcet Uz Children's Hospital, Izmir, Turkey
| | - Sevim Turay
- Department of Pediatric Neurology, Duzce University Faculty of Medicine, Düzce, Turkey
| | - Didem Biçer
- Department of Pediatric Neurology, Çukurova University Faculty of Medicine, Adana, Turkey
| | - Gülen Gül Mert
- Department of Pediatric Neurology, Çukurova University Faculty of Medicine, Adana, Turkey
| | - İpek Dokurel Çetin
- Department of Pediatric Neurology, Balıkesir Atatürk Training and Research Hospital, Balıkesir, Turkey
| | - Serkan Kirik
- Fırat University School of Medicine, Pediatric Neurology, Elazığ, Turkey
| | - Gülten Öztürk
- Department of Pediatric Neurology, Marmara University School of Medicine, Istanbul, Turkey
| | - Yasemin Karal
- Department of Pediatric Neurology, Trakya University, Faculty of Medicine, Edirne, Turkey
| | - Aslıhan Sanri
- Department of Pediatric Genetics, University of Health Sciences, Samsun Training and Research Hospital, Samsun, Turkey
| | - Ayşe Aksoy
- Department of Pediatric Neurology, Ondokuz Mayıs University, Samsun, Turkey
| | - Muzaffer Polat
- Department of Pediatric Neurology, Celal Bayar University School of Medicine, Manisa, Turkey
| | - Nezir Özgün
- Department of Pediatric Neurology, Mardin Artuklu University, Faculty of Health Sciences, Mardin, Turkey
| | - Didem Soydemir
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Gamze Sarikaya Uzan
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Döndü Ülker Üstebay
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Ayşen Gök
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Mehmet Can Yeşilmen
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Uluç Yiş
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
| | - Gökhan Karakülah
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
| | - Ahmet Bursali
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
| | - Yavuz Oktay
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
| | - Semra Hiz Kurul
- Department of Pediatric Neurology, Dokuz Eylul University Faculty of Medicine, Izmir, Turkey
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
|
43
|
Kartheeswaran KP, Rayan AXA, Varrieth GT. Enhanced disease-disease association with information enriched disease representation. Math Biosci Eng 2023; 20:8892-8932. [PMID: 37161227 DOI: 10.3934/mbe.2023391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
OBJECTIVE: Quantification of disease-disease association (DDA) enables the understanding of disease relationships for discovering disease progression and finding comorbidity. For effective DDA strength calculation, the main challenge is to integrate the various biomedical aspects of DDA in order to obtain an information-rich disease representation. MATERIALS AND METHODS: An enhanced and integrated DDA framework is developed that integrates enriched literature-based and concept-based DDA representations. The literature component of the proposed framework uses PubMed abstracts and consists of an improved neural network model that classifies DDAs for an enhanced literature-based DDA representation. Similarly, an ontology-based joint multi-source association embedding model is proposed in the ontology component using Disease Ontology (DO), UMLS, insurance claims, clinical notes, etc. RESULTS AND DISCUSSION: The resulting information-rich disease representation is evaluated on different aspects of DDA datasets such as Gene, Variant, Gene Ontology (GO), and a human-rated benchmark dataset. The DDA scores calculated using the proposed method achieved a high correlation, mainly on the gene-based dataset. The quantified scores also showed a better correlation of 0.821 when evaluated on 213 human-rated disease pairs. In addition, the generated disease representation is shown to have a substantial effect on the correlation of DDA scores for different categories of disease pairs. CONCLUSION: The enhanced context and semantic DDA framework provides an enriched disease representation, resulting in highly correlated results with different DDA datasets. We have also presented the biological interpretation of disease pairs. The developed framework can also be used for deriving the strength of other biomedical associations.
|
44
|
Shi Y, Liu W, Yang Y, Ci Y, Shi L. Exploration of the Shared Molecular Mechanisms between COVID-19 and Neurodegenerative Diseases through Bioinformatic Analysis. Int J Mol Sci 2023; 24. [PMID: 36902271 DOI: 10.3390/ijms24054839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/15/2023] [Accepted: 02/23/2023] [Indexed: 03/06/2023] Open
Abstract
The COVID-19 pandemic has caused millions of deaths and remains a major public health burden worldwide. Previous studies found that a large number of COVID-19 patients and survivors developed neurological symptoms and might be at high risk of neurodegenerative diseases, such as Alzheimer's disease (AD) and Parkinson's disease (PD). We aimed to explore the shared pathways between COVID-19, AD, and PD by using bioinformatic analysis to reveal potential mechanisms that may explain the neurological symptoms and brain degeneration that occur in COVID-19 patients, and to inform early intervention. In this study, gene expression datasets of the frontal cortex were employed to detect common differentially expressed genes (DEGs) of COVID-19, AD, and PD. A total of 52 common DEGs were then examined using functional annotation, protein-protein interaction (PPI) construction, candidate drug identification, and regulatory network analysis. We found that the involvement of the synaptic vesicle cycle and down-regulation of synapses were shared by these three diseases, suggesting that synaptic dysfunction might contribute to the onset and progression of neurodegenerative diseases caused by COVID-19. Five hub genes and one key module were obtained from the PPI network. Moreover, 5 drugs and 42 transcription factors (TFs) were also identified from the datasets. In conclusion, the results of our study provide new insights and directions for follow-up studies of the relationship between COVID-19 and neurodegenerative diseases. The hub genes and potential drugs we identified may provide promising treatment strategies to prevent COVID-19 patients from developing these disorders.
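The step of detecting DEGs common to the three diseases amounts to a set intersection over per-disease DEG lists. The sketch below illustrates that idea only; the gene symbols are hypothetical placeholders, and the helper name `common_degs` is an assumption rather than anything taken from the study.

```python
def common_degs(deg_sets):
    """Return the sorted genes that appear in every disease's DEG set."""
    sets = list(deg_sets.values())
    if not sets:
        return []
    return sorted(set.intersection(*sets))

# Hypothetical gene symbols for illustration only (not data from the study).
degs_by_disease = {
    "COVID-19": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "AD": {"GENE_B", "GENE_C", "GENE_D", "GENE_E"},
    "PD": {"GENE_C", "GENE_D", "GENE_F"},
}

shared = common_degs(degs_by_disease)
print(shared)  # ['GENE_C', 'GENE_D']
```

In practice each per-disease set would come from a differential expression analysis with its own significance thresholds, which the abstract does not specify.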
|
45
|
Baron JA, Schriml LM. Assessing resource use: a case study with the Human Disease Ontology. Database (Oxford) 2023; 2023:7059632. [PMID: 36856688 PMCID: PMC9972798 DOI: 10.1093/database/baad007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 01/11/2023] [Accepted: 02/09/2023] [Indexed: 03/02/2023]
Abstract
As a genomic resource provider, getting a handle on how your resource is utilized can be extremely challenging. At the same time, being able to document the plethora of use cases is vital to demonstrate sustainability. Herein, we describe a flexible workflow, built on readily available software, that the Human Disease Ontology (DO) project has utilized to transition to semi-automated methods to identify uses of the ontology in the published literature. The novel R package DO.utils (https://github.com/DiseaseOntology/DO.utils) has been devised with a small set of key functions to support our usage workflow in combination with Google Sheets. Use of this workflow has resulted in a 3-fold increase in the number of identified publications that use the DO and has provided novel usage insights that offer new research directions and reveal a clearer picture of the DO's use and scientific impact. The DO's resource use assessment workflow and the supporting software are designed to be useful to other resources, including databases, software tools, method providers and other web resources, to achieve similar results. Database URL: https://github.com/DiseaseOntology/DO.utils.
Affiliation(s)
- J. Allen Baron
- *Corresponding author: Tel: +1-410-706-0716; Fax: +1-410-706-6756;
| | - Lynn M Schriml
- Correspondence may also be addressed to Lynn M. Schriml. Tel: +1-410-706-6776; Fax: +1-410-706-6756;
| |
|
46
|
Schriml LM, Lichenstein R, Bisordi K, Bearer C, Baron JA, Greene C. Modeling the enigma of complex disease etiology. J Transl Med 2023; 21:148. [PMID: 36829165 PMCID: PMC9957692 DOI: 10.1186/s12967-023-03987-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/12/2022] [Accepted: 02/14/2023] [Indexed: 02/26/2023] Open
Abstract
BACKGROUND Complex diseases often present as a diagnostic riddle, further complicated by the combination of multiple phenotypes and diseases as features of other diseases. With the aim of enhancing the determination of key etiological factors, we developed and tested a complex disease model that encompasses diverse factors that in combination result in complex diseases. This model was developed to address the challenges of classifying complex diseases given the evolving understanding of disease and of the interactions and contributions of genetic, environmental, and social factors. METHODS Here we present a new approach for modeling complex diseases that integrates the multiple contributing genetic, epigenetic, environmental, host and social pathogenic effects causing disease. The model was developed to provide a guide for capturing diverse mechanisms of complex diseases. The model was tested by assessing disease drivers for asthma, diabetes and fetal alcohol syndrome. RESULTS We provide a detailed rationale for a model representing the classification of complex disease using three test conditions of asthma, diabetes and fetal alcohol syndrome. Model assessment resulted in the reassessment of the three complex disease classifications and identified driving factors, thus improving the model. The model is robust and flexible enough to capture new information as the understanding of complex disease improves. CONCLUSIONS The Human Disease Ontology's Complex Disease model offers a mechanism for defining more accurate disease classification as a tool for more precise clinical diagnosis. This broader representation of complex disease, therefore, has implications for clinicians and researchers who are tasked with creating evidence-based and consensus-based recommendations and for public health tracking of complex disease. The new model facilitates the comparison of etiological factors between complex, common and rare diseases and is available at the Human Disease Ontology website.
Affiliation(s)
- Lynn M. Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD USA (GRID: grid.411024.2; ISNI: 0000 0001 2175 4264)
| | - Richard Lichenstein
- University of Maryland School of Medicine, Baltimore, MD USA (GRID: grid.411024.2; ISNI: 0000 0001 2175 4264)
| | - Katharine Bisordi
- University of Maryland School of Medicine, Baltimore, MD USA (GRID: grid.411024.2; ISNI: 0000 0001 2175 4264)
| | - Cynthia Bearer
- Case Western Reserve University, Cleveland, OH USA (GRID: grid.67105.35; ISNI: 0000 0001 2164 3847)
| | - J. Allen Baron
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD USA (GRID: grid.411024.2; ISNI: 0000 0001 2175 4264)
| | - Carol Greene
- University of Maryland School of Medicine, Baltimore, MD USA (GRID: grid.411024.2; ISNI: 0000 0001 2175 4264)
| |
|
47
|
Stefancsik R, Balhoff JP, Balk MA, Ball R, Bello SM, Caron AR, Chessler E, de Souza V, Gehrke S, Haendel M, Harris LW, Harris NL, Ibrahim A, Koehler S, Matentzoglu N, McMurry JA, Mungall CJ, Munoz-Torres MC, Putman T, Robinson P, Smedley D, Sollis E, Thessen AE, Vasilevsky N, Walton DO, Osumi-Sutherland D. The Ontology of Biological Attributes (OBA) - Computational Traits for the Life Sciences. bioRxiv 2023:2023.01.26.525742. [PMID: 36747660 PMCID: PMC9900877 DOI: 10.1101/2023.01.26.525742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2023]
Abstract
Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focused measurable trait data. Moreover, variations in gene expression in response to environmental disturbances even without any genetic alterations can also be associated with particular biological attributes. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos.
Affiliation(s)
- Ray Stefancsik
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - James P. Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC 27517, USA
| | - Meghan A. Balk
- National Ecological Observatory Network, Battelle, Boulder, CO 80301, USA
| | - Robyn Ball
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
| | | | - Anita R. Caron
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | | | - Vinicius de Souza
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Sarah Gehrke
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Melissa Haendel
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Laura W. Harris
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Nomi L. Harris
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Arwa Ibrahim
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | | | | | - Julie A. McMurry
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Christopher J. Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | | | - Tim Putman
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | | | - Damian Smedley
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Elliot Sollis
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Anne E Thessen
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Nicole Vasilevsky
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | | | | |
|
48
|
Ding Q, Xing J, Bai F, Shao W, Hou K, Zhang S, Hu Y, Zhang B, Zhao H, Xu Q. C1QC, VSIG4, and CFD as Potential Peripheral Blood Biomarkers in Atrial Fibrillation-Related Cardioembolic Stroke. Oxid Med Cell Longev 2023; 2023:5199810. [PMID: 36644582 DOI: 10.1155/2023/5199810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 11/28/2022] [Accepted: 12/09/2022] [Indexed: 01/07/2023]
Abstract
Atrial fibrillation (AF) is a major risk factor for ischemic stroke. We aimed to identify novel potential biomarkers with diagnostic value in patients with atrial fibrillation-related cardioembolic stroke (AF-CE). Publicly available gene expression profiles related to AF, cardioembolic stroke (CE), and large artery atherosclerosis (LAA) were downloaded from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were identified and then functionally annotated. Support vector machine recursive feature elimination (SVM-RFE) and least absolute shrinkage and selection operator (LASSO) regression analysis were conducted to identify potential diagnostic AF-CE biomarkers. Furthermore, the results were validated using external data sets, and discriminability was measured by the area under the ROC curve (AUC). To verify the predictive results, blood samples from 13 healthy controls, 20 patients with CE, and 20 patients with LAA stroke were acquired for RT-qPCR, and the correlation between biomarkers and clinical features was further explored. Lastly, a nomogram and a companion website were developed to predict the CE-risk rate. Three feature genes (C1QC, VSIG4, and CFD) were selected and validated in the training and the external datasets. The RT-qPCR evaluation showed that the levels of blood biomarkers (C1QC, VSIG4, and CFD) in patients with AF-CE can be used to differentiate patients with AF-CE from normal controls (P < 0.05) and can effectively discriminate AF-CE from LAA stroke (P < 0.05). Immune cell infiltration analysis revealed that the three feature genes were correlated with immune cells such as neutrophils. The clinical impact curve, calibration curves, ROC, and decision curve analyses (DCAs) of the nomogram indicated that the nomogram had good performance.
Our findings showed that C1QC, VSIG4, and CFD can potentially serve as diagnostic blood biomarkers of AF-CE; the novel nomogram and the companion website can help clinicians to identify high-risk individuals, thus helping to guide treatment decisions for stroke patients.
|
49
|
Li M, Liu J, Zhu J, Wang H, Sun C, Gao NL, Zhao XM, Chen WH. Performance of Gut Microbiome as an Independent Diagnostic Tool for 20 Diseases: Cross-Cohort Validation of Machine-Learning Classifiers. Gut Microbes 2023; 15:2205386. [PMID: 37140125 PMCID: PMC10161951 DOI: 10.1080/19490976.2023.2205386] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 05/05/2023] Open
Abstract
Cross-cohort validation is essential for gut-microbiome-based disease stratification but has been performed for only a limited number of diseases. Here, we systematically evaluated the cross-cohort performance of gut microbiome-based machine-learning classifiers for 20 diseases. Using single-cohort classifiers, we obtained high predictive accuracies in intra-cohort validation (~0.77 AUC), but low accuracies in cross-cohort validation, except for intestinal diseases (~0.73 AUC). We then built combined-cohort classifiers trained on samples combined from multiple cohorts to improve the validation of non-intestinal diseases, and estimated the sample size required to achieve validation accuracies of >0.7. In addition, we observed higher validation performance for classifiers using metagenomic data than for those using 16S amplicon data in intestinal diseases. We further quantified the cross-cohort marker consistency using a Marker Similarity Index and observed similar trends. Together, our results supported the gut microbiome as an independent diagnostic tool for intestinal diseases and revealed strategies to improve cross-cohort performance based on identified determinants of consistent cross-cohort gut microbiome alterations.
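The accuracies above are reported as AUC values. As a minimal sketch of what that metric measures, the function below computes the ROC AUC as the probability that a randomly chosen case is scored higher than a randomly chosen control, counting ties as one half. The function name `roc_auc` and the sample labels and scores are hypothetical; the abstract does not describe the authors' actual tooling.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0 (control) or 1 (case)
    scores: classifier scores, higher meaning more likely a case
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical scores from one classifier applied to two cohorts
# (illustrative assumption, not study data).
intra_cohort = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
cross_cohort = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1])
print(intra_cohort, cross_cohort)  # 1.0 0.75
```

The pairwise form is quadratic in sample count and is meant for illustration; rank-based implementations give the same value more efficiently on large cohorts.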
Affiliation(s)
- Min Li
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Jinxin Liu
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
| | - Jiaying Zhu
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Huarui Wang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Chuqing Sun
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Na L Gao
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Xing-Ming Zhao
- Department of Neurology, Zhongshan Hospital, Fudan University, Shanghai, China
- State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- International Human Phenome Institutes (Shanghai), Shanghai, China
| | - Wei-Hua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- College of Life Science, Henan Normal University, Xinxiang, China
- Institution of Medical Artificial Intelligence, Binzhou Medical University, Yantai, China
| |
|
50
|
Shkrigunov T, Kisrieva Y, Samenkova N, Larina O, Zgoda V, Rusanov A, Romashin D, Luzgina N, Karuzina I, Lisitsa A, Petushkova N. Comparative proteoinformatics revealed the essentials of SDS impact on HaCaT keratinocytes. Sci Rep 2022; 12:21437. [PMID: 36509991 PMCID: PMC9744838 DOI: 10.1038/s41598-022-25934-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/11/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
There is no direct evidence that SDS is a carcinogen; to investigate this question, we used HaCaT keratinocytes as a model of human epidermal cells. To reveal the candidate proteins and/or pathways characterizing the SDS impact on HaCaT, we proposed a comparative proteoinformatics pipeline. For protein extraction, the performance of two sample preparation protocols was assessed: 0.2% SDS-based solubilization combined with the 1DE-gel concentration (Protocol 1) and osmotic shock (Protocol 2). As a result, in SDS-exposed HaCaT cells, Protocol 1 revealed 54 differentially expressed proteins (DEPs) involved in the disease of cellular proliferation (DOID:14566), whereas Protocol 2 found 45 DEPs with the same disease ID. The 'skin cancer' term was the single significant COSMIC term for Protocol 1 DEPs, including those involved in the double-strand break repair via break-induced replication pathway (BIR, GO:0000727). Considerable upregulation of BIR-associated proteins MCM3, MCM6, and MCM7 was detected. The eightfold increase in MCM6 level was verified by reverse transcription qPCR. Thus, Protocol 1 demonstrated high effectiveness in terms of the total number and sensitivity of MS identifications in HaCaT cell line proteomic analysis. The utility of Protocol 1 was confirmed by the revealed upregulation of cancer-associated MCM6 in HaCaT keratinocytes induced by a non-toxic concentration of SDS. Data are available via ProteomeXchange with identifier PXD035202.
Affiliation(s)
- Timur Shkrigunov
- Center of Scientific and Practical Education, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Yulia Kisrieva
- Laboratory of Microsomal Oxidation, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Natalia Samenkova
- Laboratory of Microsomal Oxidation, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Olesya Larina
- Laboratory of Microsomal Oxidation, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Victor Zgoda
- Laboratory of Systems Biology, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Alexander Rusanov
- Laboratory of Precision BioSystems, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Daniil Romashin
- Laboratory of Precision BioSystems, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Natalia Luzgina
- Laboratory of Precision BioSystems, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Irina Karuzina
- Laboratory of Microsomal Oxidation, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Andrey Lisitsa
- Center of Scientific and Practical Education, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| | - Natalia Petushkova
- Laboratory of Microsomal Oxidation, Institute of Biomedical Chemistry, Moscow, Russia 119121 (GRID: grid.418846.7; ISNI: 0000 0000 8607 342X)
| |
|