Liu G, Wang S, Liu J, Zhang J, Pan X, Fan X, Shao T, Sun Y. Using machine learning methods to study the tumour microenvironment and its biomarkers in osteosarcoma metastasis. Heliyon 2024; 10:e29322. [PMID: 38623240 PMCID: PMC11016722 DOI: 10.1016/j.heliyon.2024.e29322]
[Received: 08/24/2023] [Revised: 04/04/2024] [Accepted: 04/04/2024] [Indexed: 04/17/2024]
Abstract
Background
The long-term prognosis for patients with osteosarcoma (OS) metastasis remains unfavourable, highlighting the urgent need for research that explores potential biomarkers using innovative methodologies.
Methods
This study explored potential biomarkers for OS metastasis by analysing data from the Cancer Genome Atlas Program (TCGA) and Gene Expression Omnibus (GEO) databases. The synthetic minority oversampling technique (SMOTE) was employed to tackle class imbalances, while genes were selected using four feature selection algorithms (Monte Carlo feature selection [MCFS], Boruta, minimum-redundancy maximum-relevance [mRMR], and light gradient-boosting machine [LightGBM]) based on the gene expression matrix. Four machine learning (ML) algorithms (support vector machine [SVM], extreme gradient boosting [XGBoost], random forest [RF], and k-nearest neighbours [kNN]) were utilized to determine the optimal number of genes for building the model. Interpretable machine learning (IML) was applied to construct prediction networks, revealing potential relationships among the selected genes. Additionally, enrichment analysis, survival analysis, and immune infiltration analysis were performed on the featured genes.
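To illustrate the class-balancing step named above, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy expression values are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 3 minority-class samples, 2 gene expression values each.
metastatic = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0]]
new_samples = smote(metastatic, n_synthetic=4, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, the new samples stay within the range of the observed minority data. In a real analysis, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation data.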
Results
In DS1, DS2, and DS3, the IML algorithm identified 53, 45, and 46 features, respectively. Using the merged gene set, we obtained a total of 79 interpretable prediction rules for OS metastasis. We subsequently conducted an in-depth investigation of 39 crucial molecules associated with predicting OS metastasis, elucidating their roles within the tumour microenvironment. Importantly, we found that certain genes act as both predictors and differentially expressed genes. Finally, our study revealed statistically significant differences in survival between the high and low expression groups of TRIP4, S100A9, SELL and SLC11A1, and these genes showed some correlation with 22 types of immune cells.
Conclusions
The biomarkers discovered in this study hold significant implications for personalized therapies, potentially enhancing the clinical prognosis of patients with OS.