1
|
Wang Y, Lin Y, Wu S, Sun J, Meng Y, Jin E, Kong D, Duan G, Bei S, Fan Z, Wu G, Hao L, Song S, Tang B, Zhao W. BioKA: a curated and integrated biomarker knowledgebase for animals. Nucleic Acids Res 2024; 52:D1121-D1130. [PMID: 37843156 PMCID: PMC10767812 DOI: 10.1093/nar/gkad873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/19/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023]
Abstract
Biomarkers play an important role in various areas such as personalized medicine, drug development, clinical care, and molecular breeding. However, existing biomarker resources predominantly focus on human diseases, leaving a significant gap in the understanding of non-human animal diseases and in breeding research. To address this limitation, we present BioKA (Biomarker Knowledgebase for Animals, https://ngdc.cncb.ac.cn/bioka), a curated and integrated knowledgebase encompassing multiple animal species, diseases/traits, and annotated resources. Currently, BioKA houses 16 296 biomarkers associated with 951 mapped diseases/traits across 31 species from 4747 references, including 11 925 gene/protein biomarkers, 1784 miRNA biomarkers, 1043 mutation biomarkers, 773 metabolic biomarkers, 357 circRNA biomarkers and 127 lncRNA biomarkers. Furthermore, BioKA integrates various annotations such as GO terms, protein structures, protein-protein interaction networks and miRNA targets, and constructs an interactive knowledge network of biomarkers, including circRNA-miRNA-mRNA, lncRNA-miRNA and protein-protein associations, to support efficient data exploration. Moreover, BioKA provides detailed information on 308 breeds/strains of 13 species and homologous annotations for 8784 biomarkers across 16 species, and offers three online application tools. The comprehensive knowledge provided by BioKA not only advances human disease research but also contributes to a deeper understanding of animal diseases and supports livestock breeding.
Affiliation(s)
- Yibo Wang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yihao Lin
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Sicheng Wu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jiani Sun
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yuyan Meng
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Enhui Jin
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Demian Kong
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Guangya Duan
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Shaoqi Bei
- Qilu University of Technology (Shandong Academy of Sciences), Shandong 250353, China
| | - Zhuojing Fan
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Gangao Wu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Lili Hao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Shuhui Song
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Bixia Tang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Wenming Zhao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
2
|
Wysocka M, Wysocki O, Zufferey M, Landers D, Freitas A. A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data. BMC Bioinformatics 2023; 24:198. [PMID: 37189058 DOI: 10.1186/s12859-023-05262-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/17/2022] [Accepted: 03/30/2023] [Indexed: 05/17/2023]
Abstract
BACKGROUND There is increasing interest in the use of Deep Learning (DL) based methods as a supporting analytical framework in oncology. However, most direct applications of DL deliver models with limited transparency and explainability, which constrains their deployment in biomedical settings. METHODS This systematic review discusses DL models used to support inference in cancer biology, with a particular emphasis on multi-omics analysis. It focuses on how existing models address the need for better dialogue with prior knowledge, biological plausibility and interpretability, which are fundamental properties in the biomedical domain. For this, we retrieved and analyzed 42 studies focusing on emerging architectural and methodological advances, the encoding of biological domain knowledge and the integration of explainability methods. RESULTS We discuss the recent evolutionary arc of DL models in the direction of integrating prior biological relational and network knowledge (e.g. pathways or protein-protein interaction networks) to support better generalisation and interpretability. This represents a fundamental functional shift towards models that can integrate mechanistic and statistical inference aspects. We introduce the concept of bio-centric interpretability and, following its taxonomy, discuss representational methodologies for integrating prior domain knowledge into such models. CONCLUSIONS The paper provides a critical outlook on contemporary methods for explainability and interpretability used in DL for cancer. The analysis points towards a convergence between encoding prior knowledge and improved interpretability. We introduce bio-centric interpretability, which is an important step towards formalising the biological interpretability of DL models and developing methods that are less problem- or application-specific.
Affiliation(s)
- Magdalena Wysocka
- Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
| | - Oskar Wysocki
- Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH-1920, Martigny, Switzerland
| | - Marie Zufferey
- Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH-1920, Martigny, Switzerland
| | - Dónal Landers
- DeLondra Oncology Ltd, 38 Carlton Avenue, Wilmslow, SK9 4EP, UK
| | - André Freitas
- Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH-1920, Martigny, Switzerland
| |
|
3
|
Bouzid A, Al Ani M, de la Fuente D, Al Shareef ZM, Quadri A, Hamoudi R, Al-Rawi N. Identification of p53-target genes in human papillomavirus-associated head and neck cancer by integrative bioinformatics analysis. Front Oncol 2023; 13:1128753. [PMID: 37081989 PMCID: PMC10110890 DOI: 10.3389/fonc.2023.1128753] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/21/2022] [Accepted: 03/17/2023] [Indexed: 04/07/2023]
Abstract
INTRODUCTION Head and neck cancer (HNC) is a highly prevalent and heterogeneous malignancy. Although extensive efforts have been made to advance its treatment, the prognosis remains poor, with increased mortality. Human papillomaviruses (HPV) have been associated with high risk in HNC. TP53, a tumor suppressor, is the most frequently altered gene in HNC; therefore, investigating its target genes to identify novel biomarkers or therapeutic targets in HPV-related HNC progression is highly recommended. METHODS Transcriptomic profiles from three independent Gene Expression Omnibus (GEO) datasets, including 44 HPV+ and 70 HPV- HNC patients, were subjected to integrative statistical and bioinformatics analyses. For the top-selected marker, further in-silico validation in the TCGA and GTEx databases and experimental validation in 65 subjects (51 HPV- and 14 HPV+) with histologically confirmed head and neck squamous cell carcinoma (HNSCC) were performed. RESULTS A total of 498 differentially expressed genes (DEGs) were identified, including 291 up-regulated and 207 down-regulated genes in HPV+ compared to HPV- HNSCC patients. Functional annotations and gene set enrichment analysis (GSEA) showed that the up-regulated genes were significantly involved in p53-related pathways. The integrative analysis between the hub genes identified in the complex protein-protein network and the most frequent genes resulting from GSEA highlighted five biomarkers: EZH2, MDM2, PCNA, STAT5A and TYMS. Importantly, the MDM2 gene showed the highest gene expression difference between HPV+ and HPV- HNSCC (average log2FC = 1.89). Further in-silico validation in a large HNSCC cohort from the TCGA and GTEx databases confirmed the over-expression of MDM2 in HPV+ compared to HPV- HNSCC patients (p = 2.39E-05). IHC scoring showed that MDM2 protein expression was significantly higher in HPV+ compared to HPV- HNSCC patients (p = 0.031). DISCUSSION Our findings provide evidence that over-expression of the proto-oncogene MDM2 may affect the occurrence and proliferation of HPV-associated HNSCC by disturbing the p53-target genes and, consequently, the p53-related pathways.
Affiliation(s)
- Amal Bouzid
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- *Correspondence: Amal Bouzid; Rifat Hamoudi; Natheer Al-Rawi
| | - Muwaffaq Al Ani
- Ear Nose and Throat (ENT) Department, Tawam Hospital, Al-Ain, United Arab Emirates
| | - David de la Fuente
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
| | - Zainab Mohamed Al Shareef
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Asif Quadri
- Department of Anatomic Pathology, National Reference lab, Abu Dhabi, United Arab Emirates
| | - Rifat Hamoudi
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
- Division of Surgery and Interventional Science, University College London, London, United Kingdom
- *Correspondence: Amal Bouzid; Rifat Hamoudi; Natheer Al-Rawi
| | - Natheer Al-Rawi
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Department of Oral and Craniofacial Health Sciences, College of Dental Medicine, University of Sharjah, Sharjah, United Arab Emirates
- *Correspondence: Amal Bouzid; Rifat Hamoudi; Natheer Al-Rawi
| |
|
4
|
Sapra D, Kaur H, Dhall A, Raghava GPS. ProCanBio: A Database of Manually Curated Biomarkers for Prostate Cancer. J Comput Biol 2021; 28:1248-1257. [PMID: 34898255 DOI: 10.1089/cmb.2021.0348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Prostate cancer (PCa) is the second most lethal malignancy in men worldwide. Numerous research groups have investigated the omics profiles of patients and examined biomarkers for the diagnosis and prognosis of PCa. However, information on these biomarkers is scattered across numerous resources in complex textual format, which hinders both understanding of the tumorigenesis of this malignancy and the identification of robust signatures. To create a comprehensive resource, we collected the relevant literature on PCa biomarkers from PubMed and extracted detailed information about each biomarker from a total of 412 unique research articles. Each database entry records the PubMed ID, biomarker name, biomarker type, biomolecule, source, subjects, validation status, and performance measures such as sensitivity, specificity, and hazard ratio (HR). In this study, we present ProCanBio, a manually curated database that maintains detailed data on 2053 entries of potential PCa biomarkers obtained from 412 publications in a user-friendly tabular format. Among them are 766 protein-based, 507 RNA-based, 157 genomic mutation-based, 260 miRNA-based, and 122 metabolite-based biomarkers. To explore the information in the resource, a web-based interactive platform was developed with searching and browsing facilities. To the best of the authors' knowledge, no other resource consolidates the information contained in the published literature on PCa biomarkers. ProCanBio is freely available and is compatible with most web browsers and devices. We anticipate that this resource will be highly useful for the research community working on prostate malignancy.
Affiliation(s)
- Dikscha Sapra
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
| | - Harpreet Kaur
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
| | - Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
| | - Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
| |
|
5
|
Cui F, Cheng L, Zou Q. Briefings in functional genomics special section editorial: analysis of integrated multiple omics data. Brief Funct Genomics 2021; 20:196-197. [PMID: 34279568 DOI: 10.1093/bfgp/elab033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Affiliation(s)
- Feifei Cui
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Liang Cheng
- NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, Heilongjiang, 150028, China; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| |
|