1
Guo L, Xie Y, He J, Li X, Zhou W, Chen Q. Breast cancer prediction model based on clinical and biochemical characteristics: clinical data from patients with benign and malignant breast tumors from a single center in South China. J Cancer Res Clin Oncol 2023; 149:13257-13269. [PMID: 37480526 DOI: 10.1007/s00432-023-05181-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 07/11/2023] [Indexed: 07/24/2023]
Abstract
OBJECTIVE Breast cancer is the most prevalent cancer and the second leading cause of death from malignancy among women worldwide. In addition to tumor factors, the host characteristics of tumors have received increasing attention from the medical community. This study aimed to develop a breast cancer prediction model for the Chinese population using clinical and biochemical characteristics. METHODS This is a retrospective study. From 2012 to 2021, we selected 19,751 patients with breast diseases from the Guangdong Hospital of Traditional Chinese Medicine, comprising 5660 patients with breast cancer and 14,091 patients with benign breast diseases. Using a total of 34 clinical and biochemical characteristics, 75% of patients were randomly assigned to the training group and 25% to the test group. Significant clinical signs were investigated, and a logistic regression model with recursive feature elimination (RFE) was used to develop a prediction model for distinguishing benign from malignant breast diseases. The prediction model's accuracy, precision, sensitivity, specificity, and area under the ROC curve (AUC) were calculated. RESULTS Clinical statistics demonstrated that the prediction model, comprising 19 clinical characteristics, had statistical separability in both the training group and the test group, as well as good sensitivity and prediction. CONCLUSIONS This model based on biochemical parameters demonstrates a significant predictive effect for breast cancer and may be useful as a reference for invasive tissue biopsy in patients undergoing BI-RADS 3 and 4A breast imaging.
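The evaluation measures named in this abstract (accuracy, precision, sensitivity, specificity, AUC) can be illustrated with a minimal sketch. It is not the authors' code: the labels, scores, and the 0.5 decision threshold below are made-up toy values, the function names are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) definition, which is one common way to obtain it.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels; here 1 = malignant, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Threshold-dependent metrics; assumes both classes occur among labels and predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the study): true labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Note that accuracy, precision, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds.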
Affiliation(s)
- Li Guo
  - Department of Breast, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, No. 111 of Dade Road, Yuexiu District, Guangzhou, 510120, China
- Yanyan Xie
  - School of Medical Information Engineering, Guangzhou University of Chinese Medicine, No. 232 Wide Ring East Road, Panyu District, Guangzhou, 510006, China
- Junhao He
  - School of Medical Information Engineering, Guangzhou University of Chinese Medicine, No. 232 Wide Ring East Road, Panyu District, Guangzhou, 510006, China
- Xian Li
  - School of Medical Information Engineering, Guangzhou University of Chinese Medicine, No. 232 Wide Ring East Road, Panyu District, Guangzhou, 510006, China
- Wu Zhou
  - School of Medical Information Engineering, Guangzhou University of Chinese Medicine, No. 232 Wide Ring East Road, Panyu District, Guangzhou, 510006, China
- Qianjun Chen
  - Department of Breast, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, No. 111 of Dade Road, Yuexiu District, Guangzhou, 510120, China
2
Rabiei R, Ayyoubzadeh SM, Sohrabei S, Esmaeili M, Atashi A. Prediction of Breast Cancer using Machine Learning Approaches. J Biomed Phys Eng 2022; 12:297-308. [PMID: 35698545 PMCID: PMC9175124 DOI: 10.31661/jbpe.v0i0.2109-1403] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/25/2021] [Accepted: 03/05/2022] [Indexed: 05/27/2023]
Abstract
BACKGROUND Breast cancer is considered one of the most common cancers in women, caused by various clinical, lifestyle, social, and economic factors. Machine learning has the potential to predict breast cancer based on features hidden in data. OBJECTIVE This study aimed to predict breast cancer using different machine-learning approaches applied to demographic, laboratory, and mammographic data. MATERIAL AND METHODS In this analytical study, a database of 5,178 independent records, 25% of which belonged to breast cancer patients, with 24 attributes in each record, was obtained from Motamed Cancer Institute (ACECR), Tehran, Iran. Random forest (RF), neural network (MLP), gradient boosting trees (GBT), and genetic algorithms (GA) were used in this study. Models were initially trained with demographic and laboratory features (20 features). The models were then trained with all demographic, laboratory, and mammographic features (24 features) to measure the effectiveness of mammography features in predicting breast cancer. RESULTS RF presented higher performance compared to other techniques (accuracy 80%, sensitivity 95%, specificity 80%, and the area under the curve (AUC) 0.56). Gradient boosting (AUC=0.59) showed a stronger performance compared to the neural network. CONCLUSION Combining multiple risk factors in modeling for breast cancer prediction could help the early diagnosis of the disease with necessary care plans. Collection, storage, and management of different data and intelligent systems based on multiple factors for predicting breast cancer are effective in disease management.
Affiliation(s)
- Reza Rabiei
  - PhD, Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Seyed Mohammad Ayyoubzadeh
  - PhD, Department of Health Information Technology and Management, School of Allied Medical Sciences, Tehran University of Medical Science, Tehran, Iran
- Solmaz Sohrabei
  - MSc, Department Deputy of Development, Management and Resources, Office of Statistic and Information Technology Management, Zanjan University of Medical Sciences, Zanjan, Iran
- Marzieh Esmaeili
  - PhD, Department of Health Information Technology and Management, School of Allied Medical Sciences, Tehran University of Medical Science, Tehran, Iran
- Alireza Atashi
  - PhD, Department of E-Health, Virtual School, Tehran University of Medical Sciences, Medical Informatics Research Group, Clinical Research Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
3
Esmaeili M, Ayyoubzadeh SM, Ahmadinejad N, Ghazisaeedi M, Nahvijou A, Maghooli K. A decision support system for mammography reports interpretation. Health Inf Sci Syst 2020; 8:17. [PMID: 32257128 PMCID: PMC7113352 DOI: 10.1007/s13755-020-00109-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/01/2019] [Accepted: 03/30/2020] [Indexed: 02/06/2023]
Abstract
PURPOSE Mammography plays a key role in the diagnosis of breast cancer; however, decision-making based on mammography reports is still challenging. This paper aims to address the challenges regarding decision-making based on mammography reports and to propose a Clinical Decision Support System (CDSS) that uses data mining methods to help clinicians interpret mammography reports. METHODS For this purpose, 2441 mammography reports were collected from Imam Khomeini Hospital from March 21, 2018, to March 20, 2019. In the first step, these mammography reports are analyzed and program code is developed to transform the reports into a dataset. Then, the weight of every feature of the dataset is calculated. Random Forest, Naïve Bayes, K-nearest neighbor (K-NN), and Deep Learning classifiers are applied to the dataset to build a model capable of predicting the need for referral to biopsy. Afterward, the models are evaluated using cross-validation by measuring the Area Under Curve (AUC), accuracy, sensitivity, and specificity indices. RESULTS The mammography type (diagnostic or screening) and the mass and calcification features mentioned in the reports are the most important features for decision-making. Results reveal that the K-NN model is the most accurate and specific classifier, with accuracy and specificity values of 84.06% and 84.72%, respectively. The Random Forest classifier has the best sensitivity and AUC, with values of 87.74% and 0.905, respectively. CONCLUSIONS Accordingly, data mining approaches prove to be a helpful tool for making the final decision as to whether patients should be referred to biopsy based on mammography reports. The developed CDSS may also be helpful, especially for less experienced radiologists.
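The cross-validated evaluation of a K-NN classifier described in this abstract can be sketched as below. This is an illustration under stated assumptions, not the study's pipeline: the abstract does not give the number of folds, the value of k, or the distance measure, so 4 folds, k = 3, and squared Euclidean distance are assumed; the two-feature toy points and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_fold_indices(n_samples: int, n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists; sample i goes to test fold i % n_folds (no shuffling)."""
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        splits.append((train, test))
    return splits


def knn_predict(train_x: Sequence[Point], train_y: Sequence[int], query: Point, k: int = 3) -> int:
    """Majority vote among the k training points nearest to query (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y) for x, y in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validated_accuracy(xs: Sequence[Point], ys: Sequence[int], n_folds: int = 4, k: int = 3) -> float:
    """Fraction of samples classified correctly when each is predicted from the other folds."""
    correct = 0
    for train, test in k_fold_indices(len(xs), n_folds):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test)
    return correct / len(xs)


# Toy, well-separated data (assumed): label 1 stands for "refer to biopsy", 0 for "no referral".
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (5.0, 5.1), (5.2, 4.9), (4.8, 5.3), (5.1, 5.0)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = cross_validated_accuracy(xs, ys)
```

Because every sample is scored by a model that never saw it during fitting, the resulting accuracy estimates performance on unseen reports rather than on the training data.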
Affiliation(s)
- Marzieh Esmaeili
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, 3rd Floor, No #17, Farredanesh Alley, Ghods St, Enghelab Ave, Tehran, Iran
  - Scientific Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Seyed Mohammad Ayyoubzadeh
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, 3rd Floor, No #17, Farredanesh Alley, Ghods St, Enghelab Ave, Tehran, Iran
  - Scientific Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Nasrin Ahmadinejad
  - Medical Imaging Cancer, Imam Khomeini Hospital, Cancer Research Institute, Tehran, Iran
  - Advanced Diagnostic and Interventional Radiology Research Cancer (ADIR), Tehran University of Medical Sciences, Tehran, Iran
- Marjan Ghazisaeedi
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, 3rd Floor, No #17, Farredanesh Alley, Ghods St, Enghelab Ave, Tehran, Iran
- Azin Nahvijou
  - Cancer Research Center, Cancer Institute of Iran, Tehran University of Medical Sciences, Tehran, Iran
- Keivan Maghooli
  - Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
4
Abstract
Medical images have become increasingly important in clinical practice and medical research, and the need to manage images at the hospital level has become urgent in China. To unify patient identification in examinations from different medical specialties, make authenticated access to medical images more convenient, and make medical images suitable for further artificial intelligence investigations, we implemented an enterprise imaging strategy at Xuanwu Hospital, adopting an image integration platform as the main tool. Workflow re-engineering and business system transformation were also performed to ensure the quality and content of the imaging data. By implementing the enterprise imaging strategy, more than 54 million medical images and approximately 1 million medical reports were integrated, and uniform patient identification, images, and report integration were made available to the medical staff and were accessible via a mobile application. However, to integrate all medical images of different specialties at a hospital and ensure that the images and reports are qualified for data mining, some further policy and management measures are still needed.
Affiliation(s)
- Shanshan Li
  - Information Center, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, People's Republic of China
- Yao Liu
  - Information Center, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, People's Republic of China
- Yifang Yuan
  - Computer Department, North China University of Technology, No. 5 Jinyuanzhuang Road, Shijingshan District, Beijing, 100144, People's Republic of China
- Jia Li
  - Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, People's Republic of China
- Lan Wei
  - Information Center, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, People's Republic of China
- Yuelong Wang
  - DJ HealthUnion Corp, No. 5B Bld A, 1068 West Tianshan Road, Shanghai, 200051, People's Republic of China
- Xiaolu Fei
  - Information Center, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, People's Republic of China
5
Almeida E, Ferreira P, Vinhoza T, Dutra I, Li J, Wu Y, Burnside E. ExpertBayes: Automatically refining manually built Bayesian networks. Proc Int Conf Mach Learn Appl 2014; 2014:362-366. [PMID: 27066596 PMCID: PMC4826063 DOI: 10.1109/icmla.2014.64] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Bayesian network structures are usually built using only the data, starting from an empty network or from a naïve Bayes structure. In some domains, such as medicine, prior knowledge of the structure is often already available. This structure can be refined, automatically or manually, in search of better-performing models. In this work, we take Bayesian networks built by specialists and show that minor perturbations to the original network can yield better classifiers at a very small computational cost, while maintaining most of the intended meaning of the original model.
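What a "minor perturbation" of a network structure might look like can be sketched as follows. This is not the ExpertBayes implementation: the abstract does not say which perturbations are applied, so single-edge removal, reversal, and addition are assumed here, and the node names, edge set, and function names are hypothetical. Scoring each candidate as a classifier is omitted.

```python
from typing import FrozenSet, Iterable, List, Sequence, Tuple

Edge = Tuple[str, str]  # (parent, child)


def is_acyclic(nodes: Sequence[str], edges: Iterable[Edge]) -> bool:
    """Kahn's algorithm: the graph is a DAG iff every node can be removed in topological order."""
    edge_list = list(edges)
    indegree = {n: 0 for n in nodes}
    for _, child in edge_list:
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for parent, child in edge_list:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)


def single_edge_perturbations(nodes: Sequence[str], edges: Iterable[Edge]) -> List[FrozenSet[Edge]]:
    """All valid DAGs that differ from the given structure by exactly one edge operation."""
    current = frozenset(edges)
    candidates = set()
    for parent, child in current:
        without = current - {(parent, child)}
        candidates.add(without)                      # remove the edge
        candidates.add(without | {(child, parent)})  # reverse the edge
    for a in nodes:
        for b in nodes:
            if a != b and (a, b) not in current and (b, a) not in current:
                candidates.add(current | {(a, b)})   # add a new edge
    # Discard candidates that would introduce a directed cycle.
    return [c for c in candidates if is_acyclic(nodes, c)]


# Hypothetical expert-built structure: Age -> Disease -> Mass.
nodes = ["Age", "Disease", "Mass"]
expert_edges = [("Age", "Disease"), ("Disease", "Mass")]
neighbours = single_edge_perturbations(nodes, expert_edges)
```

Restricting the search to one-edge changes keeps each candidate close to the expert's structure, which is one way to preserve most of the original model's intended meaning while still allowing a data-driven improvement.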
Affiliation(s)
- Inês Dutra
  - Department of Computer Science, University of Porto
- Yirong Wu
  - University of Wisconsin, Madison, USA