1. Sapra D, Kaur H, Dhall A, Raghava GPS. ProCanBio: A Database of Manually Curated Biomarkers for Prostate Cancer. J Comput Biol 2021; 28:1248-1257. [PMID: 34898255 DOI: 10.1089/cmb.2021.0348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Prostate cancer (PCa) is the second most lethal malignancy in men worldwide. In the past, numerous research groups have investigated the omics profiles of patients and scrutinized biomarkers for the diagnosis and prognosis of PCa. However, information on these biomarkers is scattered across numerous resources in complex textual format, which hinders both the understanding of tumorigenesis in this malignancy and the identification of robust signatures. To create a comprehensive resource, we collected all the relevant literature on PCa biomarkers from PubMed and extracted detailed information about each biomarker from a total of 412 unique research articles. Each database entry includes the PubMed ID, biomarker name, biomarker type, biomolecule, source, subjects, validation status, and performance measures such as sensitivity, specificity, and hazard ratio (HR). In this study, we present ProCanBio, a manually curated database that maintains detailed data on 2053 entries of potential PCa biomarkers obtained from 412 publications in a user-friendly tabular format. Among them are 766 protein-based, 507 RNA-based, 157 genomic mutation-based, 260 miRNA-based, and 122 metabolite-based biomarkers. To explore the information in the resource, a web-based interactive platform was developed with searching and browsing facilities. To the best of the authors' knowledge, no other resource consolidates the information contained in all the published literature. In addition, ProCanBio is freely available and is compatible with most web browsers and devices. We anticipate that this resource will be highly useful for the research community working on prostate malignancy.
Affiliation(s)
- Dikscha Sapra: Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Harpreet Kaur: Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Anjali Dhall: Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
- Gajendra P S Raghava: Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, New Delhi, India
2. Tang J, Wang Y, Luo Y, Fu J, Zhang Y, Li Y, Xiao Z, Lou Y, Qiu Y, Zhu F. Computational advances of tumor marker selection and sample classification in cancer proteomics. Comput Struct Biotechnol J 2020; 18:2012-2025. [PMID: 32802273 PMCID: PMC7403885 DOI: 10.1016/j.csbj.2020.07.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/07/2020] [Revised: 07/06/2020] [Accepted: 07/08/2020] [Indexed: 12/11/2022]
Abstract
Cancer proteomics has become a powerful technique for characterizing the protein markers driving malignant transformation, tracing proteome variation triggered by therapeutics, and discovering novel targets and drugs for the treatment of oncologic diseases. To facilitate cancer diagnosis/prognosis and accelerate drug target discovery, a variety of methods for tumor marker identification and sample classification have been developed and successfully applied to cancer proteomic studies. This review article describes the most recent advances in these approaches together with their current applications in cancer-related studies. Firstly, a number of popular feature selection methods are overviewed, with an objective evaluation of their advantages and disadvantages. Secondly, these methods are grouped into three major classes based on their underlying algorithms. Finally, a variety of sample separation algorithms are discussed. This review provides a comprehensive overview of the advances in tumor marker identification and patient/sample/tissue separation, which could serve as guidance for researchers in cancer proteomics.
Key Words
- ANN, Artificial Neural Network
- ANOVA, Analysis of Variance
- CFS, Correlation-based Feature Selection
- Cancer proteomics
- Computational methods
- DAPC, Discriminant Analysis of Principal Components
- DT, Decision Trees
- EDA, Estimation of Distribution Algorithm
- FC, Fold Change
- GA, Genetic Algorithms
- GR, Gain Ratio
- HC, Hill Climbing
- HCA, Hierarchical Cluster Analysis
- IG, Information Gain
- LDA, Linear Discriminant Analysis
- LIMMA, Linear Models for Microarray Data
- MBF, Markov Blanket Filter
- MWW, Mann–Whitney–Wilcoxon test
- OPLS-DA, Orthogonal Partial Least Squares Discriminant Analysis
- PCA, Principal Component Analysis
- PLS-DA, Partial Least Squares Discriminant Analysis
- RF, Random Forest
- RF-RFE, Random Forest with Recursive Feature Elimination
- SA, Simulated Annealing
- SAM, Significance Analysis of Microarrays
- SBE, Sequential Backward Elimination
- SFS, Sequential Forward Selection
- SOM, Self-organizing Map
- SU, Symmetrical Uncertainty
- SVM, Support Vector Machine
- SVM-RFE, Support Vector Machine with Recursive Feature Elimination
- Sample classification
- Tumor marker selection
- sPLSDA, Sparse Partial Least Squares Discriminant Analysis
- t-SNE, t-distributed Stochastic Neighbor Embedding
- χ², Chi-square
Affiliation(s)
- Jing Tang: Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China; College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yunxia Wang: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yongchao Luo: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jianbo Fu: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yang Zhang: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; School of Pharmaceutical Sciences and Innovative Drug Research Centre, Chongqing University, Chongqing 401331, China
- Yi Li: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziyu Xiao: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yan Lou: Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou 310000, China
- Yunqing Qiu: Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou 310000, China
- Feng Zhu: Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China; College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
3. Lin Y, Qian F, Shen L, Chen F, Chen J, Shen B. Computer-aided biomarker discovery for precision medicine: data resources, models and applications. Brief Bioinform 2020; 20:952-975. [PMID: 29194464 DOI: 10.1093/bib/bbx158] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 09/02/2017] [Revised: 10/17/2017] [Indexed: 12/21/2022]
Abstract
Biomarkers are a class of measurable and evaluable indicators with the potential to predict disease initiation and progression. In contrast to disease-associated factors, biomarkers hold the promise of capturing the changeable signatures of biological states. With methodological advances, computer-aided biomarker discovery has become a burgeoning paradigm in biomedical science. In recent years, 'big data' have accumulated for the systematic investigation of complex biological phenomena and have promoted the flourishing of computational methods for systems-level biomarker screening. Compared with routine wet-lab experiments, bioinformatics approaches decode disease pathogenesis more efficiently under a holistic framework, which helps identify biomarkers ranging from single molecules to molecular networks for disease diagnosis, prognosis and therapy. In this review, the concept and characteristics of typical biomarker types, e.g. single molecular biomarkers, module/network biomarkers, cross-level biomarkers, etc., are explained under the guidance of systems biology. Then, publicly available data resources together with some well-constructed biomarker databases and knowledge bases are introduced. Biomarker identification models using mathematical, network and machine learning theories are discussed in turn. A novel bioinformatics model for microRNA biomarker discovery, based on network substructural and functional evidence, is particularly highlighted. This article aims to give deep insights into the advantages and challenges of current computational approaches for biomarker detection, and to point toward future directions in precision medicine and nation-wide healthcare.
Affiliation(s)
- Yuxin Lin: Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
- Fuliang Qian: Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
- Li Shen: Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
- Feifei Chen: Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
- Jiajia Chen: School of Chemistry, Biology and Material Engineering, Suzhou University of Science and Technology, China
- Bairong Shen: Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
4. Silajdžić E, Björkqvist M. A Critical Evaluation of Wet Biomarkers for Huntington's Disease: Current Status and Ways Forward. J Huntingtons Dis 2019; 7:109-135. [PMID: 29614689 PMCID: PMC6004896 DOI: 10.3233/jhd-170273] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 01/04/2023]
Abstract
There is an unmet clinical need for objective biomarkers to monitor disease progression and treatment response in Huntington's disease (HD). The aim of this review is, therefore, to provide practical advice for biomarker discovery and to summarise studies on biofluid markers for HD. A PubMed search was performed to review literature with regard to candidate saliva, urine, blood and cerebrospinal fluid biomarkers for HD. Information has been organised into tables to allow a pragmatic approach to the discussion of the evidence and generation of practical recommendations for future studies. Many of the markers published converge on metabolic and inflammatory pathways, although changes in other analytes representing antioxidant and growth factor pathways have also been found. The most promising markers reflect neuronal and glial degeneration, particularly neurofilament light chain. International collaboration to standardise assays and study protocols, as well as to recruit sufficiently large cohorts, will facilitate future biomarker discovery and development.
Affiliation(s)
- Edina Silajdžić: Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Maria Björkqvist: Department of Experimental Medical Science, Brain Disease Biomarker Unit, Wallenberg Neuroscience Center, Lund University, Lund, Sweden
5. Leclercq M, Vittrant B, Martin-Magniette ML, Scott Boyer MP, Perin O, Bergeron A, Fradet Y, Droit A. Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data. Front Genet 2019; 10:452. [PMID: 31156708 PMCID: PMC6532608 DOI: 10.3389/fgene.2019.00452] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 01/23/2019] [Accepted: 04/30/2019] [Indexed: 12/11/2022]
Abstract
The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated with machine learning models that efficiently predict a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML.
Affiliation(s)
- Mickael Leclercq: Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC, Canada; Département de Médecine Moléculaire, Université Laval, Québec City, QC, Canada
- Benjamin Vittrant: Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC, Canada; Département de Médecine Moléculaire, Université Laval, Québec City, QC, Canada
- Marie Laure Martin-Magniette: Institute of Plant Sciences Paris Saclay IPS2, CNRS, INRA, Université Paris-Sud, Université Evry, Université Paris-Saclay, Paris Diderot, Sorbonne Paris-Cité, Orsay, France; UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France
- Marie Pier Scott Boyer: Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC, Canada; Département de Médecine Moléculaire, Université Laval, Québec City, QC, Canada
- Olivier Perin: Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France
- Alain Bergeron: Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC, Canada; Département de Chirurgie, Oncology Axis, Université Laval, Québec City, QC, Canada
- Yves Fradet: Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC, Canada; Département de Chirurgie, Oncology Axis, Université Laval, Québec City, QC, Canada
- Arnaud Droit: Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC, Canada; Département de Médecine Moléculaire, Université Laval, Québec City, QC, Canada