1
Yang ZM, Qin XY, Lu YY, Yao LK, Liu AQ, Yu QT, Jiang W, Liang J, Li Y, Zhou SZ, Qiu Y. Pathogen spectrum and clinical characteristics of lung cancer patients: A 10-year retrospective study. Int J Cancer 2025; 156:1470-1479. [PMID: 39680677 DOI: 10.1002/ijc.35272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 11/06/2024] [Accepted: 11/13/2024] [Indexed: 12/18/2024]
Abstract
Infection is the most common non-cancer cause of death in patients with lung cancer (LC). However, original research reports with large sample sizes on the epidemiology, pathogen spectrum, immune status changes, and prognosis of these patients are lacking. A retrospective study of LC patients with infection was performed at Guangxi Medical University Cancer Hospital from 2014 to 2023. In total, 699 LC patients with disease complicated by infection were included in the study. The incidence of infection increased from 4.61% in 2014 to 9.77% in 2023 among patients with LC. A total of 109 types of pathogens were detected. The most prevalent pathogenic organisms in each category were bacteria (Klebsiella pneumoniae and Escherichia coli), fungi (Candida spp. and Aspergillus spp.), viruses (COVID-19 and Epstein-Barr virus), and special pathogens (Mycobacterium tuberculosis and Mycoplasma pneumoniae). Upon diagnosis of infection, the total T lymphocyte, helper T cell, Th/Ts ratio, and B lymphocyte counts decreased, while the natural killer cell and suppressor T-cell counts increased. Infection is a crucial risk factor affecting the prognosis and mortality of patients with LC. The susceptibility of patients with LC to infection may be related to immunodeficiency resulting from antitumor treatment and disease progression.
Affiliation(s)
- Zhen-Ming Yang
  - Medical Oncology of Respiratory Medicine, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Xiu-Yu Qin
  - Gastroenterology and Respiratory Internal Medicine Department, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Yan-Yan Lu
  - Gastroenterology and Respiratory Internal Medicine Department, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Lun-Kai Yao
  - Gastroenterology and Respiratory Internal Medicine Department, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Ai-Qun Liu
  - Gastroenterology and Respiratory Internal Medicine Department, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Qi-Tao Yu
  - Medical Oncology of Respiratory Medicine, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Wei Jiang
  - Medical Oncology of Respiratory Medicine, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Jie Liang
  - Gastroenterology and Respiratory Internal Medicine Department, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Yu Li
  - Department of Respiratory and Critical Care Medicine, Hezhou People's Hospital, Hezhou, Guangxi, China
- Shao-Zhang Zhou
  - Medical Oncology of Respiratory Medicine, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
- Ye Qiu
  - Gastroenterology and Respiratory Internal Medicine Department, Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China
2
Feng CH, Deng F, Disis ML, Gao N, Zhang L. Towards machine learning fairness in classifying multicategory causes of deaths in colorectal or lung cancer patients. bioRxiv 2025:2025.02.14.638368. [PMID: 40027644 PMCID: PMC11870570 DOI: 10.1101/2025.02.14.638368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Classification of patient multicategory survival outcomes is important for personalized cancer treatments. Machine Learning (ML) algorithms have increasingly been used to inform healthcare decisions, but these models are vulnerable to biases in data collection and algorithm creation. ML models have previously been shown to exhibit racial bias, but their fairness towards patients from different age and sex groups has yet to be studied. Therefore, we compared the multimetric performances of five ML models (random forests, multinomial logistic regression, linear support vector classifier, linear discriminant analysis, and multilayer perceptron) when classifying colorectal cancer patients (n = 515) of various age, sex, and racial groups using the TCGA data. All five models exhibited biases for these sociodemographic groups. We then repeated the same process on lung adenocarcinoma (n = 589) to validate our findings. Surprisingly, most models tended to perform more poorly overall for the largest sociodemographic groups. Methods to optimize model performance, including testing the model on merged age, sex, or racial groups, and creating a model trained on and used for an individual or merged sociodemographic group, show potential to reduce disparities in model performance for different groups. Notably, these methods may be used to improve ML fairness while avoiding penalizing the model for exhibiting bias and thus sacrificing overall performance.
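Note: the kind of per-group comparison this abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it uses balanced accuracy as a single stand-in for their multimetric evaluation, and the labels, predictions, and group assignments are made-up toy values rather than TCGA data.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

def per_group_scores(y_true, y_pred, groups):
    """Balanced accuracy computed separately for each group label."""
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    return {g: balanced_accuracy(ts, ps) for g, (ts, ps) in by_group.items()}

# Hypothetical toy data: four outcome categories coded 0-3, two sex groups.
y_true = [0, 1, 2, 3, 0, 1, 2, 3]
y_pred = [0, 1, 2, 0, 0, 1, 2, 3]
sex = ["F", "F", "F", "F", "M", "M", "M", "M"]

scores = per_group_scores(y_true, y_pred, sex)
# The spread between the best and worst group is one simple disparity measure.
gap = max(scores.values()) - min(scores.values())
print(scores, gap)
```

A nonzero gap on held-out data is only a starting point; which metric and which grouping matter is a study design decision.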
3
Nunez JJ, Leung B, Ho C, Ng RT, Bates AT. Predicting which patients with cancer will see a psychiatrist or counsellor from their initial oncology consultation document using natural language processing. Communications Medicine 2024; 4:69. [PMID: 38589545 PMCID: PMC11001970 DOI: 10.1038/s43856-024-00495-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 03/28/2024] [Indexed: 04/10/2024]
Abstract
BACKGROUND Patients with cancer often have unmet psychosocial needs. Early detection of who requires referral to a counsellor or psychiatrist may improve their care. This work used natural language processing to predict which patients will see a counsellor or psychiatrist from a patient's initial oncology consultation document. We believe this is the first use of artificial intelligence to predict psychiatric outcomes from non-psychiatric medical documents. METHODS This retrospective prognostic study used data from 47,625 patients at BC Cancer. We analyzed initial oncology consultation documents using traditional and neural language models to predict whether patients would see a counsellor or psychiatrist in the 12 months following their initial oncology consultation. RESULTS Here, we show our best models achieved a balanced accuracy (receiver-operating-characteristic area-under-curve) of 73.1% (0.824) for predicting seeing a psychiatrist, and 71.0% (0.784) for seeing a counsellor. Different words and phrases are important for predicting each outcome. CONCLUSION These results suggest natural language processing can be used to predict psychosocial needs of patients with cancer from their initial oncology consultation document. Future research could extend this work to predict the psychosocial needs of medical patients in other settings.
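Note: the two metrics reported in this abstract, balanced accuracy and receiver-operating-characteristic area under the curve, can be computed for a binary outcome as in the sketch below. The labels and scores are invented toy values, and the 0.5 decision threshold is an assumption made only for illustration.

```python
def binary_balanced_accuracy(y_true, y_pred):
    """Average of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced toy data: 1 = patient later saw a psychiatrist.
y_true = [1, 0, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3, 0.05, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

ba = binary_balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(ba, 3), round(auc, 3))
```

Balanced accuracy depends on the chosen threshold while AUC does not, which is why the two numbers are usually reported together for imbalanced outcomes.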
Affiliation(s)
- John-Jose Nunez
  - BC Cancer, Vancouver, BC, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
- Raymond T Ng
  - Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
- Alan T Bates
  - BC Cancer, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
4
Deng F, Zhao L, Yu N, Lin Y, Zhang L. Union With Recursive Feature Elimination: A Feature Selection Framework to Improve the Classification Performance of Multicategory Causes of Death in Colorectal Cancer. Lab Invest 2024; 104:100320. [PMID: 38158124 DOI: 10.1016/j.labinv.2023.100320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Revised: 12/05/2023] [Accepted: 12/20/2023] [Indexed: 01/03/2024]
Abstract
Despite the use of machine learning tools, it is challenging to properly model cause-specific deaths in colorectal cancer (CRC) patients and choose appropriate treatments. Here, we propose an interesting feature selection framework, namely union with recursive feature elimination (U-RFE), to select the union feature sets that are crucial in CRC progression-specific mortality using The Cancer Genome Atlas (TCGA) dataset. Based on the union feature sets, we compared the performance of 5 classification algorithms, including logistic regression (LR), support vector machines (SVM), random forest (RF), eXtreme gradient boosting (XGBoost), and Stacking, to identify the best model for classifying 4-category deaths. In the first stage of U-RFE, LR, SVM, and RF were used as base estimators to obtain subsets containing the same number of features but not exactly the same specific features. Union analysis of the subsets was then performed to determine the final union feature set, effectively combining the advantages of different algorithms. We found that the U-RFE framework could improve various models' performance. Stacking outperformed LR, SVM, RF, and XGBoost in most scenarios. When the target feature number of the RFE was set to 50 and the union feature set contained 298 deterministic features, the Stacking model achieved F1_weighted, Recall_weighted, Precision_weighted, Accuracy, and Matthews correlation coefficient of 0.851, 0.864, 0.854, 0.864, and 0.717, respectively. The performance of the minority categories was also significantly improved. Therefore, this recursive feature elimination-based approach of feature selection improves performances of classifying CRC deaths using clinical and omics data or those using other data with high feature redundancy and imbalance.
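Note: the two-stage idea in this abstract (run recursive feature elimination with several base estimators, then take the union of the selected subsets) can be sketched as below. This is a simplified illustration under stated assumptions: the three `weights_*` tables are hypothetical stand-ins for importances that fitted LR, SVM, and RF models would supply, and a real implementation would refit each estimator after every elimination step.

```python
def rfe(features, importance_fn, n_keep):
    """Drop the least important feature, re-score, repeat until n_keep remain."""
    kept = list(features)
    while len(kept) > n_keep:
        scores = importance_fn(kept)  # re-scored on the surviving features
        kept.remove(min(kept, key=scores.__getitem__))
    return set(kept)

def union_rfe(features, importance_fns, n_keep):
    """Union of the subsets selected by each base estimator."""
    selected = [rfe(features, fn, n_keep) for fn in importance_fns]
    return set().union(*selected)

# Hypothetical fixed importances standing in for three fitted estimators.
weights_a = {"g1": 0.9, "g2": 0.1, "g3": 0.5, "g4": 0.3, "g5": 0.05}
weights_b = {"g1": 0.2, "g2": 0.8, "g3": 0.6, "g4": 0.1, "g5": 0.05}
weights_c = {"g1": 0.7, "g2": 0.2, "g3": 0.1, "g4": 0.6, "g5": 0.05}
importance_fns = [
    (lambda kept, w=w: {f: w[f] for f in kept})
    for w in (weights_a, weights_b, weights_c)
]

union_set = union_rfe(["g1", "g2", "g3", "g4", "g5"], importance_fns, n_keep=2)
print(sorted(union_set))
```

Each estimator keeps the same number of features but not the same ones, so the union can be larger than the per-estimator target while still excluding features that no estimator ranks highly.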
Affiliation(s)
- Fei Deng
  - School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Lin Zhao
  - School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Ning Yu
  - School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Yuxiang Lin
  - School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Lanjing Zhang
  - Department of Biological Sciences, Rutgers University, Newark, New Jersey
  - Department of Pathology, Princeton Medical Center, Plainsboro, New Jersey
  - Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey
  - Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, New Jersey
5
Jaworsky M, Tao X, Pan L, Pokhrel SR, Yong J, Zhang J. Interrelated feature selection from health surveys using domain knowledge graph. Health Inf Sci Syst 2023; 11:54. [PMID: 37981989 PMCID: PMC10654272 DOI: 10.1007/s13755-023-00254-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Accepted: 10/17/2023] [Indexed: 11/21/2023]
Abstract
Finding patterns among risk factors and chronic illness can suggest similar causes, provide guidance to improve healthy lifestyles, and give clues for possible treatments for outliers. Prior studies have typically isolated data challenges from single-disease datasets. However, the predictive power of multiple diseases is more helpful in establishing a healthy lifestyle than investigating one disease. Most studies typically focus on single-disease datasets; however, to ensure that health advice is generalized and contemporary, the features that predict the likelihood of many diseases can improve health advice effectiveness when considering the patient's point of view. We construct and present a novel knowledge-based qualitative method to remove redundant features from a dataset and redefine the outliers. The results of our trials upon five annual chronic disease health surveys demonstrate that our Knowledge Graph-based feature selection, when applied to many machine learning and deep learning multi-label classifiers, can improve classification performance. Our methodology is compatible with future directions, such as graph neural networks. It provides clinicians with an efficient process to select the most relevant health survey questions and responses regarding single or many human organ systems.
Affiliation(s)
- Markian Jaworsky
  - School of Mathematics, Physics, and Computing, University of Southern Queensland, Toowoomba, QLD, Australia
- Xiaohui Tao
  - School of Mathematics, Physics, and Computing, University of Southern Queensland, Toowoomba, QLD, Australia
- Lei Pan
  - School of Information Technology, Deakin University, Waurn Ponds, VIC 3216, Australia
- Shiva Raj Pokhrel
  - School of Information Technology, Deakin University, Waurn Ponds, VIC 3216, Australia
- Jianming Yong
  - School of Mathematics, Physics, and Computing, University of Southern Queensland, Toowoomba, QLD, Australia
- Ji Zhang
  - School of Mathematics, Physics, and Computing, University of Southern Queensland, Toowoomba, QLD, Australia
6
Heidari A, Javaheri D, Toumaj S, Navimipour NJ, Rezaei M, Unal M. A new lung cancer detection method based on the chest CT images using Federated Learning and blockchain systems. Artif Intell Med 2023; 141:102572. [PMID: 37295902 DOI: 10.1016/j.artmed.2023.102572] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 02/17/2022] [Revised: 03/16/2023] [Accepted: 04/27/2023] [Indexed: 06/12/2023]
Abstract
With an estimated five million fatal cases each year, lung cancer is one of the leading causes of death worldwide. Lung diseases can be diagnosed with a Computed Tomography (CT) scan. The limited availability and reliability of human readers are the fundamental issues in diagnosing lung cancer patients. The main goal of this study is to detect malignant lung nodules in a CT scan of the lungs and categorize lung cancer according to severity. In this work, cutting-edge Deep Learning (DL) algorithms were used to detect the location of cancerous nodules. A further real-life issue is sharing data among hospitals around the world while respecting each organization's privacy concerns. The main problems in training a global DL model are therefore creating a collaborative model and maintaining privacy. This study presents an approach that takes a modest amount of data from multiple hospitals and uses blockchain-based Federated Learning (FL) to train a global DL model. The data were authenticated using blockchain technology, and FL trained the model internationally while maintaining each organization's anonymity. First, we present a data normalization approach that addresses the variability of data obtained from various institutions using various CT scanners. Next, using a CapsNets method, we classify lung cancer patients in local mode. Finally, we devise a way to train a global model cooperatively using blockchain technology and FL while maintaining anonymity. We also gathered data from real-life lung cancer patients for testing purposes. The suggested method was trained and tested on the Cancer Imaging Archive (CIA) dataset, Kaggle Data Science Bowl (KDSB), LUNA 16, and the local dataset. We performed extensive experiments with Python and its well-known libraries, such as Scikit-Learn and TensorFlow, to evaluate the suggested method. The findings showed that the method effectively detects lung cancer patients, delivering 99.69% accuracy with the smallest possible categorization error.
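Note: the abstract does not state which aggregation rule the global model uses. A common choice in federated learning is federated averaging, in which each site trains locally and only model parameters, weighted by local sample count, are combined. The sketch below illustrates that rule under this assumption; the parameter vectors and sample counts are hypothetical, and the blockchain authentication step is not modelled.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical parameters from three hospitals after one local training round.
hospital_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
hospital_sizes = [100, 100, 200]  # number of local CT scans (made up)

global_weights = federated_average(hospital_weights, hospital_sizes)
print(global_weights)
```

Only the parameter vectors leave each site, which is how this family of methods avoids sharing raw patient images.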
Affiliation(s)
- Arash Heidari
  - Department of Computer Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
- Danial Javaheri
  - Department of Computer Engineering, Chosun University, Gwangju 61452, Republic of Korea
- Shiva Toumaj
  - Urmia University of Medical Sciences, Urmia, Iran
- Nima Jafari Navimipour
  - Department of Computer Engineering, Kadir Has University, Istanbul, Turkiye
  - Future Technology Research Center, National Yunlin University of Science and Technology, Douliou, Yunlin 64002, Taiwan
- Mahsa Rezaei
  - Tabriz University of Medical Sciences, Faculty of Surgery, Tabriz, Iran
- Mehmet Unal
  - Department of Computer Engineering, Nisantasi University, Istanbul, Turkiye
7
Nunez JJ, Leung B, Ho C, Bates AT, Ng RT. Predicting the Survival of Patients With Cancer From Their Initial Oncology Consultation Document Using Natural Language Processing. JAMA Netw Open 2023; 6:e230813. [PMID: 36848085 PMCID: PMC9972192 DOI: 10.1001/jamanetworkopen.2023.0813] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 03/01/2023]
Abstract
IMPORTANCE Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. OBJECTIVE To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. DESIGN, SETTING, AND PARTICIPANTS This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. EXPOSURES Initial oncologist consultation documents were analyzed using traditional and neural language models. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. RESULTS Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. 
CONCLUSIONS AND RELEVANCE These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
Affiliation(s)
- John-Jose Nunez
  - BC Cancer, Vancouver, British Columbia, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Cheryl Ho
  - BC Cancer, Vancouver, British Columbia, Canada
- Alan T. Bates
  - BC Cancer, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Raymond T. Ng
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
8
Deepa P, Gunavathi C. A systematic review on machine learning and deep learning techniques in cancer survival prediction. Progress in Biophysics and Molecular Biology 2022; 174:62-71. [PMID: 35933043 DOI: 10.1016/j.pbiomolbio.2022.07.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/07/2022] [Revised: 07/13/2022] [Accepted: 07/19/2022] [Indexed: 06/15/2023]
Abstract
Cancer is a disease characterised by the abnormal and uncontrolled growth of body cells. It usually develops asymptomatically and spreads to other parts of the body. The major problem in treating cancer is that its progress is not monitored once it is diagnosed. Prognosis can be estimated through survival analysis, the branch of statistics that deals with predicting the time until an event occurs. In cancer prognosis, the event may be the death of the patient, with survival time measured from the onset of the disease, or the recurrence of the disease after treatment. This study aims to survey the machine learning and deep learning models used to provide a prognosis for cancer patients.
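Note: as a concrete instance of the time-to-event framing described in this abstract, the sketch below implements the Kaplan-Meier product-limit estimator, a standard non-parametric survival estimate. The abstract does not name this estimator, so it is offered only as an illustrative baseline; the follow-up times and censoring flags are made up.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time per patient.
    observed:  True if the event (e.g. death) occurred at that time,
               False if the patient was censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve

# Hypothetical follow-up in months; two patients are censored.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored patients contribute to the risk set up to their last follow-up but never count as events, which is what distinguishes this estimate from a naive fraction of survivors.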
Affiliation(s)
- Deepa P
  - School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
- Gunavathi C
  - School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
9
Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Khanna NN, Ruzsa Z, Sharma AM, Saxena S, Faa G, Laird JR, Johri AM, Kalra MK, Paraskevas KI, Saba L. A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review. Diagnostics (Basel) 2022; 12:722. [PMID: 35328275 PMCID: PMC8947682 DOI: 10.3390/diagnostics12030722] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/21/2022] [Revised: 03/10/2022] [Accepted: 03/13/2022] [Indexed: 12/16/2022]
Abstract
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. Due to their popularity and recent development, the study analyzed the above three paradigms using machine learning (ML) frameworks. We comprehensively review these three methods using attributes such as architecture, applications, pros and cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: The most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction has shown promising results. Ground truth (GT) selection for AI-based training, along with scientific and clinical validation, is very important for CVD stratification to avoid RoB. The most popular classification paradigm was multiclass, followed by ensemble and multi-label. The use of deep learning techniques in CVD risk stratification is at a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. Choice of GT is most vital in AI-based models to prevent RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
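Note: the three paradigms this review compares differ mainly in the shape of the prediction target and in how predictions are combined. The toy sketch below shows a multiclass target (exactly one risk class), a multi-label target (any subset of conditions), and a majority-vote ensemble. The class and condition names are hypothetical and are not taken from the review.

```python
from collections import Counter

RISK_CLASSES = ["low", "moderate", "high"]       # multiclass: exactly one applies
CONDITIONS = ["stroke", "coronary", "diabetes"]  # multi-label: any subset applies

def encode_multilabel(present, labels=CONDITIONS):
    """Binary indicator vector with one slot per possible label."""
    return [1 if label in present else 0 for label in labels]

def majority_vote(predictions):
    """Ensemble by taking the most common prediction across base models."""
    return Counter(predictions).most_common(1)[0][0]

multiclass_target = "high"
multilabel_target = encode_multilabel({"stroke", "diabetes"})
ensemble_prediction = majority_vote(["high", "moderate", "high"])

print(multiclass_target, multilabel_target, ensemble_prediction)
```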
Affiliation(s)
- Jasjit S. Suri
  - Stroke Diagnostic and Monitoring Division, AtheroPoint™, Roseville, CA 95661, USA
- Mrinalini Bhagawati
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Sudip Paul
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Athanasios D. Protogerou
  - Research Unit Clinic, Laboratory of Pathophysiology, Department of Cardiovascular Prevention, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Petros P. Sfikakis
  - Rheumatology Unit, National Kapodistrian University of Athens, 11527 Athens, Greece
- George D. Kitas
  - Arthritis Research UK Centre for Epidemiology, Manchester University, Manchester 46962, UK
- Narendra N. Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi 110020, India
- Zoltan Ruzsa
  - Department of Internal Medicines, Invasive Cardiology Division, University of Szeged, 6720 Szeged, Hungary
- Aditya M. Sharma
  - Division of Cardiovascular Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Sanjay Saxena
  - Department of CSE, International Institute of Information Technology, Bhubaneswar 751003, India
- Gavino Faa
  - Department of Pathology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
- John R. Laird
  - Cardiology Department, St. Helena Hospital, St. Helena, CA 94574, USA
- Amer M. Johri
  - Department of Medicine, Division of Cardiology, Queen’s University, Kingston, ON K7L 3N6, Canada
- Manudeep K. Kalra
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
- Kosmas I. Paraskevas
  - Department of Vascular Surgery, Central Clinic of Athens, N. Iraklio, 14122 Athens, Greece
- Luca Saba
  - Department of Radiology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
10
Predictions of cervical cancer identification by photonic method combined with machine learning. Sci Rep 2022; 12:3762. [PMID: 35260666 PMCID: PMC8904553 DOI: 10.1038/s41598-022-07723-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/29/2021] [Accepted: 02/15/2022] [Indexed: 12/26/2022]
Abstract
Cervical cancer is one of the most common cancers, and its early diagnosis is of the greatest importance. Unfortunately, many diagnoses are based on the subjective opinions of doctors; to date, there is no general measurement method with a calibrated standard. The problem can be addressed with a measurement system that fuses an optoelectronic sensor with a machine learning algorithm to provide reliable assistance for doctors at the early diagnosis stage of cervical cancer. We present preliminary research on cervical cancer assessment using an optical sensor and a prediction algorithm. Since every material is characterized by a refractive index, measuring its value and detecting changes gives information about the state of the tissue. The optical measurements provided datasets for training and validating the analysis software. We present data preprocessing, machine learning results from four algorithms (Random Forest, eXtreme Gradient Boosting, Naïve Bayes, Convolutional Neural Networks), and an assessment of their performance in classifying tissue as healthy or sick. Our solution allows rapid sample measurement and automatic classification of the results, making it a potential support tool for doctors.
11
Feng CH, Disis ML, Cheng C, Zhang L. Multimetric feature selection for analyzing multicategory outcomes of colorectal cancer: random forest and multinomial logistic regression models. Lab Invest 2022; 102:236-244. [PMID: 34537824 DOI: 10.1038/s41374-021-00662-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/05/2021] [Revised: 08/10/2021] [Accepted: 08/12/2021] [Indexed: 11/09/2022]
Abstract
Colorectal cancer (CRC) is one of the most common cancers worldwide, and a leading cause of cancer deaths. Better classifying multicategory outcomes of CRC with clinical and omic data may help adjust treatment regimens based on individual's risk. Here, we selected the features that were useful for classifying four-category survival outcome of CRC using the clinical and transcriptomic data, or clinical, transcriptomic, microsatellite instability and selected oncogenic-driver data (all data) of TCGA. We also optimized multimetric feature selection to develop the best multinomial logistic regression (MLR) and random forest (RF) models that had the highest accuracy, precision, recall and F1 score, respectively. We identified 2073 differentially expressed genes of the TCGA RNASeq dataset. MLR overall outperformed RF in the multimetric feature selection. In both RF and MLR models, precision, recall and F1 score increased as the feature number increased and peaked at the feature number of 600-1000, while the models' accuracy remained stable. The best model was the MLR one with 825 features based on sum of squared coefficients using all data, and attained the best accuracy of 0.855, F1 of 0.738 and precision of 0.832, which were higher than those using clinical and transcriptomic data. The top-ranked features in the MLR model of the best performance using clinical and transcriptomic data were different from those using all data. However, pathologic staging, HBS1L, TSPYL4, and TP53TG3B were the overlapping top-20 ranked features in the best models using clinical and transcriptomic, or all data. Thus, we developed a multimetric feature-selection based MLR model that outperformed RF models in classifying four-category outcome of CRC patients. Interestingly, adding microsatellite instability and oncogenic-driver data to clinical and transcriptomic data improved models' performances. 
Precision and recall of tuned algorithms may change significantly as the feature number changes, but accuracy appears not sensitive to these changes.
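Note: the abstract mentions ranking features for the multinomial logistic regression model "based on sum of squared coefficients". One plausible reading, assumed here, is that each feature's importance is the sum over outcome classes of its squared coefficient. The sketch below shows that computation on an invented coefficient matrix; the feature names and values are hypothetical, not the paper's.

```python
def rank_by_squared_coefficients(coef, feature_names):
    """Rank features by the sum over classes of their squared coefficients.

    coef[k][j] is the coefficient of feature j for outcome class k.
    """
    importance = {
        name: sum(row[j] ** 2 for row in coef)
        for j, name in enumerate(feature_names)
    }
    return sorted(importance, key=importance.get, reverse=True)

# Hypothetical coefficients: 4 outcome classes x 3 features.
coef = [
    [0.5, -0.1, 0.0],
    [-1.0, 0.2, 0.1],
    [0.3, 0.1, -0.4],
    [0.2, -0.2, 0.3],
]
names = ["stage", "geneA", "geneB"]

ranked = rank_by_squared_coefficients(coef, names)
top_k = ranked[:2]  # keep the k highest-ranked features
print(ranked, top_k)
```

Squaring makes large positive and large negative coefficients count equally, so a feature that matters for any one class ranks highly. Coefficients are only comparable this way if the features were put on a common scale first.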
Affiliation(s)
- Mary L Disis
  - UW Medicine Cancer Vaccine Institute, University of Washington, Seattle, WA, USA
- Chao Cheng
  - Department of Medicine, Section of Epidemiology and Population Sciences, Baylor College of Medicine, Houston, TX, USA
  - Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
- Lanjing Zhang
  - Department of Biological Sciences, Rutgers University, Newark, NJ, USA
  - Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA
  - Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
  - Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
12
Yang R, Liu Y, Wang Y, Wang X, Ci H, Song C, Wu S. Low PRRX1 expression and high ZEB1 expression are significantly correlated with epithelial-mesenchymal transition and tumor angiogenesis in non-small cell lung cancer. Medicine (Baltimore) 2021; 100:e24472. [PMID: 33530259 PMCID: PMC7850718 DOI: 10.1097/md.0000000000024472] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/14/2020] [Accepted: 01/04/2021] [Indexed: 01/05/2023]
Abstract
BACKGROUND Paired related homeobox 1 (PRRX1) and zinc finger E-box binding homeobox 1 (ZEB1) have been observed to play a vital role in the epithelial-mesenchymal transition (EMT) process in different types of cancer. The microvessel density (MVD) is the most common indicator used to quantify angiogenesis. This study aimed to investigate expression of PRRX1 and ZEB1 in non-small cell lung cancer (NSCLC) and to explore associations between these factors and tumor prognosis, EMT markers and angiogenesis. METHODS Data for a total of 111 surgically resected NSCLC cases from January 2013 to December 2014 were collected. We used an immunohistochemical method to detect expression levels of PRRX1, ZEB1, and E-cadherin, and to assess MVD (marked by CD34 staining). SPSS 26.0 was employed to evaluate the connection between these factors and clinical and histopathological features, overall survival (OS) and tumor angiogenesis. RESULTS PRRX1 expression was obviously lower in tumor samples than in control samples. Low expression of PRRX1, which was more common in the high-MVD group than in the low-MVD group (P = .009), correlated positively with E-cadherin expression (P < .001). Additionally, we showed that ZEB1 was expressed at higher levels in tumor samples than in normal samples. High expression of ZEB1 was associated negatively with E-cadherin expression (P < .001) and positively associated with high MVD (P = .001). Based on Kaplan-Meier and multivariate survival analyses, we found that PRRX1, ZEB1, E-cadherin and the MVD had predictive value for OS in NSCLC patients. CONCLUSIONS These findings suggest that PRRX1 and ZEB1 may serve as novel prognostic biomarkers and potential therapeutic targets.
Affiliation(s)
- Ruixue Yang
  - Department of Pathology, the First Affiliated Hospital of Bengbu Medical College
  - Department of Pathology
- Yuanqun Liu
  - Department of Pathology, the First Affiliated Hospital of Bengbu Medical College
  - Department of Pathology
- Yufei Wang
  - Department of Pathology, the First Affiliated Hospital of Bengbu Medical College
  - Department of Pathology
- Xiaolin Wang
  - Department of Pathology, the First Affiliated Hospital of Bengbu Medical College
  - Department of Pathology
- Hongfei Ci
  - Department of Pathology, the First Affiliated Hospital of Bengbu Medical College
  - Department of Pathology
- Chao Song
  - Department of Thoracic Surgery, Bengbu Medical College, Bengbu, Anhui Province, China
- Shiwu Wu
  - Department of Pathology, the First Affiliated Hospital of Bengbu Medical College
  - Department of Pathology