1
Ledger A, Ceusters J, Valentin L, Testa A, Van Holsbeke C, Franchi D, Bourne T, Froyman W, Timmerman D, Van Calster B. Multiclass risk models for ovarian malignancy: an illustration of prediction uncertainty due to the choice of algorithm. BMC Med Res Methodol 2023; 23:276. [PMID: 38001421 PMCID: PMC10668424 DOI: 10.1186/s12874-023-02103-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 11/14/2023] [Indexed: 11/26/2023]
Abstract
BACKGROUND Assessing malignancy risk is important for choosing appropriate management of ovarian tumors. We compared six algorithms to estimate the probabilities that an ovarian tumor is benign, borderline malignant, stage I primary invasive, stage II-IV primary invasive, or secondary metastatic. METHODS This retrospective cohort study used 5909 patients recruited from 1999 to 2012 for model development, and 3199 patients recruited from 2012 to 2015 for model validation. Patients were recruited at oncology referral or general centers and underwent an ultrasound examination and surgery ≤ 120 days later. We developed models using standard multinomial logistic regression (MLR), Ridge MLR, random forest (RF), XGBoost, neural networks (NN), and support vector machines (SVM). We used nine clinical and ultrasound predictors but developed models with or without CA125. RESULTS Most tumors were benign (3980 in development and 1688 in validation data); secondary metastatic tumors were least common (246 and 172). The c-statistic (AUROC) to discriminate benign from any type of malignant tumor ranged from 0.89 to 0.92 for models with CA125, and from 0.89 to 0.91 for models without. The multiclass c-statistic ranged from 0.41 (SVM) to 0.55 (XGBoost) for models with CA125, and from 0.42 (SVM) to 0.51 (standard MLR) for models without. Multiclass calibration was best for RF and XGBoost. Estimated probabilities for a benign tumor in the same patient often differed by more than 0.2 (20 percentage points) depending on the model. Net Benefit for diagnosing malignancy was similar across algorithms at the commonly used 10% risk threshold, but was slightly higher for RF at higher thresholds. Comparing models, between 3% (XGBoost vs. NN, with CA125) and 30% (NN vs. SVM, without CA125) of patients fell on opposite sides of the 10% threshold. CONCLUSION Although several models had similarly good performance, individual probability estimates varied substantially.
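The two threshold-based quantities in this abstract, Net Benefit and the share of patients that two models place on opposite sides of a risk threshold, can be illustrated with a minimal sketch. It assumes the standard decision-curve formula for Net Benefit and treats a risk equal to the threshold as "above"; the function names and the five-patient data are hypothetical and are not taken from the study.

```python
def net_benefit(risks, outcomes, threshold):
    """Net Benefit of acting on patients whose estimated risk is >= threshold.

    Standard decision-curve formula (assumed here, not quoted from the study):
    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def discordance(risks_a, risks_b, threshold):
    """Fraction of patients that two models place on opposite sides of threshold."""
    flips = sum((a >= threshold) != (b >= threshold)
                for a, b in zip(risks_a, risks_b))
    return flips / len(risks_a)


# Hypothetical malignancy risks from two models for the same five patients.
model_a = [0.02, 0.08, 0.12, 0.40, 0.90]
model_b = [0.03, 0.15, 0.07, 0.35, 0.85]
malignant = [0, 0, 0, 1, 1]  # 1 = malignant on histology (made-up labels)

nb_a = net_benefit(model_a, malignant, 0.10)
flip_rate = discordance(model_a, model_b, 0.10)
print(f"Net Benefit (model A, 10% threshold): {nb_a:.3f}")
print(f"Patients on opposite sides of 10%: {flip_rate:.0%}")
```

In this toy data both models rank patients similarly, yet two of the five patients would be managed differently at the 10% threshold, which is the kind of disagreement the abstract reports.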
Affiliation(s)
- Ashleigh Ledger
  - Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium
- Jolien Ceusters
  - Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium
  - Department of Oncology, Leuven Cancer Institute, Laboratory of Tumor Immunology and Immunotherapy, KU Leuven, Leuven, Belgium
- Lil Valentin
  - Department of Obstetrics and Gynecology, Skåne University Hospital, Malmö, Sweden
  - Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden
- Antonia Testa
  - Department of Woman, Child and Public Health, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Dipartimento Universitario Scienze della Vita e Sanità Pubblica, Università Cattolica del Sacro Cuore, Rome, Italy
- Dorella Franchi
  - Preventive Gynecology Unit, Division of Gynecology, European Institute of Oncology IRCCS, Milan, Italy
- Tom Bourne
  - Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium
  - Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
  - Queen Charlotte's and Chelsea Hospital, Imperial College, London, UK
- Wouter Froyman
  - Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium
  - Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
- Dirk Timmerman
  - Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium
  - Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium
  - Department of Biomedical Data Sciences, Leiden University Medical Centre (LUMC), Leiden, Netherlands
  - Leuven Unit for Health Technology Assessment Research (LUHTAR), KU Leuven, Leuven, Belgium
2
Moszynski R, Szubert S, Szpurek D, Michalak S, Krygowska J, Sajdak S. Usefulness of the HE4 biomarker as a second-line test in the assessment of suspicious ovarian tumors. Arch Gynecol Obstet 2013; 288:1377-83. [PMID: 23722285 PMCID: PMC3825535 DOI: 10.1007/s00404-013-2901-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 09/29/2012] [Accepted: 05/14/2013] [Indexed: 01/27/2023]
Abstract
Purpose The aim of our study was to evaluate the usefulness of HE4 as a test in the assessment of ovarian tumors that are suspicious and difficult to classify correctly via subjective ultrasound examination. Methods In this retrospective cohort study, 253 women diagnosed with adnexal masses were examined preoperatively. Suspicious tumors (n = 145) were divided into three groups: “probably benign” (n = 70), “uncertain” (n = 34), and “probably malignant” (n = 41). “Uncertain” tumors were also assessed as “benign” (n = 11) or “malignant” (n = 23). A logistic regression model was fitted to assess whether the serum marker improves the prediction of a malignant finding, and net reclassification improvement (NRI) was calculated to measure diagnostic improvement. Results Within the analyzed group, 85 (58.6%) benign and 60 (41.4%) malignant tumors were confirmed histopathologically. The comparison of HE4 with subjective ultrasound assessment showed a lower NRI in the entire analyzed group as well as in the groups of tumors classified as “probably benign” or “probably malignant” (NRI = −0.16; P = 0.0139 and NRI = −0.133; P = 0.0489, respectively). The logistic regression analysis confirmed that the biomarkers do not improve diagnostic accuracy. The difference between the areas under the ROC curve for HE4 (0.891) and CA125 (0.902) was not statistically significant (P = 0.760). Conclusions After subjective ultrasound assessment, the addition of a second-line test (HE4 or CA125 serum level) does not improve diagnostic performance. However, HE4 evaluation satisfies the clinical expectations of diagnostic tools for ovarian tumors and, thus, may be useful to less experienced sonographers.
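The abstract does not state which NRI formulation was used. As a rough illustration only, the sketch below assumes the commonly used two-category definition, in which NRI is the net proportion of malignant tumors reclassified upward plus the net proportion of benign tumors reclassified downward. The function name and the eight-tumor data are hypothetical.

```python
def categorical_nri(old_calls, new_calls, outcomes):
    """Two-category net reclassification improvement (assumed definition).

    old_calls / new_calls: 1 = called malignant, 0 = called benign.
    outcomes: 1 = malignant on histology, 0 = benign.
    NRI = (up - down among events) / n_events
        + (down - up among non-events) / n_nonevents
    """
    n_events = sum(outcomes)
    n_nonevents = len(outcomes) - n_events
    up_e = down_e = up_n = down_n = 0
    for old, new, y in zip(old_calls, new_calls, outcomes):
        if new > old:        # reclassified toward "malignant"
            if y == 1:
                up_e += 1
            else:
                up_n += 1
        elif new < old:      # reclassified toward "benign"
            if y == 1:
                down_e += 1
            else:
                down_n += 1
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Hypothetical calls for eight tumors (not data from the study).
histology = [1, 1, 1, 1, 0, 0, 0, 0]
ultrasound = [1, 1, 1, 0, 0, 0, 0, 1]   # reference test
biomarker = [1, 1, 0, 0, 0, 0, 1, 1]    # comparison test

nri = categorical_nri(ultrasound, biomarker, histology)
print(f"NRI = {nri:+.2f}")
```

A negative value, as in this toy example and in the abstract, means the comparison test moves more tumors in the wrong direction than in the right one relative to the reference test.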
Affiliation(s)
- Rafal Moszynski
  - Division of Gynecological Surgery, Poznan University of Medical Sciences, 33. Polna St., 60-535, Poznan, Poland
3
Dodge JE, Covens AL, Lacchetti C, Elit LM, Le T, Devries-Aboud M, Fung-Kee-Fung M. Preoperative identification of a suspicious adnexal mass: A systematic review and meta-analysis. Gynecol Oncol 2012; 126:157-66. [DOI: 10.1016/j.ygyno.2012.03.048] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 01/26/2012] [Revised: 03/28/2012] [Accepted: 03/31/2012] [Indexed: 12/14/2022]
4
Mathematical Models to Discriminate Between Benign and Malignant Adnexal Masses: Potential Diagnostic Improvement Using Ovarian HistoScanning. Int J Gynecol Cancer 2011; 21:35-43. [DOI: 10.1097/igc.0b013e3182000528] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]
Abstract
Purpose: Accurate preoperative clinical assessment of adnexal masses can optimize outcomes by ensuring appropriate and timely surgery. This article addresses whether a new technology, ovarian HistoScanning, has an additional diagnostic value in mathematical models developed for the differential diagnosis of adnexal masses. Patients and Methods: Transvaginal sonography-based morphological variables were obtained through blinded analysis of archived images in 199 women enrolled in a prospective study to assess the performance of ovarian HistoScanning. Logistic regression (LR) and neural network (NN) models including these variables and clinical and patient data along with the HistoScanning score (HSS) (range, 0-125; based on mathematical algorithms) were developed in a learning set (60% patients). The remaining 40% patients (evaluation set) were used to assess model performance. Results: Of all morphological and clinical variables tested, serum CA-125, presence of a solid component, and HSS were most significant and used to develop the LR model. The NN model included all variables. The novel variable, HSS, offered significant improvement in the LR and NN models' performance. The LR and NN models in an independent evaluation set were found to have area under the receiver operating characteristic curve = 0.97 (95% confidence interval [CI], 94-99) and 0.93 (95% CI, 88-98), sensitivities = 83% (95% CI, 71%-91%) and 80% (95% CI, 67%-89%), and specificities = 98% (95% CI, 89%-99%) and 86% (95% CI, 72%-95%), respectively. In addition, these models showed an improved performance when compared with 3 other existing models (all P < 0.05). Conclusions: This initial report shows a clear benefit of including ovarian HistoScanning into mathematical models used for discriminating benign from malignant ovarian masses. These models may be specifically helpful to the less experienced examiner. Future research should assess performance of these models in prospective clinical trials in different populations.
5
6
7
Geomini P, Kruitwagen R, Bremer GL, Cnossen J, Mol BWJ. The accuracy of risk scores in predicting ovarian malignancy: a systematic review. Obstet Gynecol 2009; 113:384-94. [PMID: 19155910 DOI: 10.1097/aog.0b013e318195ad17] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.6] [Indexed: 12/28/2022]
Abstract
OBJECTIVE To perform a systematic review of the literature on the accuracy of prediction models in the preoperative assessment of adnexal masses. DATA SOURCES Studies were identified through the MEDLINE and EMBASE databases from inception to March 2008. The MEDLINE search was performed using the keywords ["ovarian neoplasms"[MeSH] NOT "therapeutics"[MeSH] AND "model"] and ["ovarian neoplasms"[MeSH] NOT "therapeutics"[MeSH] AND "prediction"]. The EMBASE search was performed using the keywords [ovary tumor AND prediction], [ovary tumor AND Mathematical model], and [ovary tumor AND statistical model]. METHODS OF STUDY SELECTION The search detected 1,161 publications; from the cross-references, another 116 studies were identified. Language restrictions were not applied. Eligible studies contained data on the accuracy of models predicting the risk of malignancy in ovarian masses. Models were required to combine at least two parameters. TABULATION, INTEGRATION, AND RESULTS Two independent reviewers selected studies and extracted study characteristics, study quality, and test accuracy. There were 109 accuracy studies that met the selection criteria. Accuracy data were used to form two-by-two contingency tables of the results of the risk score compared with definitive histology. We used bivariate meta-analysis to estimate pooled sensitivities and specificities and to fit summary receiver operating characteristic curves. Studies included in our analysis reported on 83 different prediction models. The model developed by Sassone was the most evaluated prediction model. All models had acceptable sensitivity and specificity. However, the Risk of Malignancy Index I and the Risk of Malignancy Index II, which use the product of the serum CA 125 level, an ultrasound scan result, and the menopausal state, were the best predictors. When 200 was used as the cutoff level, the pooled estimate for sensitivity was 78% for a specificity of 87%. CONCLUSION Based on our review, the Risk of Malignancy Index should be the prediction model of choice in the preoperative assessment of the adnexal mass.
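The product form of the Risk of Malignancy Index described in this abstract can be sketched as follows. The abstract does not say how the ultrasound result and menopausal state are coded, so the sketch takes both as already-coded numeric scores; the example values (ultrasound score of 1 or 3, menopausal score of 1 or 3) follow a commonly cited RMI I coding and are an assumption here. Treating an index equal to the cutoff as high risk is also an assumption, and the function names are hypothetical.

```python
def risk_of_malignancy_index(ca125, ultrasound_score, menopausal_score):
    """RMI as the product of serum CA 125 (U/mL) and two coded scores.

    How the ultrasound and menopausal scores are assigned differs between
    RMI variants and is not given in the abstract; callers supply them.
    """
    return ultrasound_score * menopausal_score * ca125


def is_high_risk(rmi, cutoff=200):
    """Flag an index at or above the cutoff (>= is an assumption; the
    abstract only says 200 was used as the cutoff level)."""
    return rmi >= cutoff


# Hypothetical patients, using an assumed RMI I style coding.
rmi_post = risk_of_malignancy_index(ca125=50, ultrasound_score=3, menopausal_score=3)
rmi_pre = risk_of_malignancy_index(ca125=30, ultrasound_score=1, menopausal_score=1)

print(rmi_post, is_high_risk(rmi_post))
print(rmi_pre, is_high_risk(rmi_pre))
```

Because the index is a plain product, a moderately raised CA 125 can cross the cutoff only when the ultrasound and menopausal scores are also elevated.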
Affiliation(s)
- Peggy Geomini
  - Department of Obstetrics and Gynecology, Máxima Medical Centre, Veldhoven, The Netherlands
8
Yörük P, Dündar O, Yildizhan B, Tütüncü L, Pekin T. Comparison of the risk of malignancy index and self-constructed logistic regression models in preoperative evaluation of adnexal masses. J Ultrasound Med 2008; 27:1469-1477. [PMID: 18809957 DOI: 10.7863/jum.2008.27.10.1469] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/26/2023]
Abstract
OBJECTIVE The aim of this study was to evaluate women with adnexal masses in the preoperative period by creating 2 logistic regression models, 1 including sonographic morphologic characteristics and the other including both morphologic and color Doppler characteristics, and to compare the diagnostic accuracy of these 2 models with the risk of malignancy index (RMI). METHODS This prospective study included 38 malignant, 7 borderline, and 244 benign ovarian masses. The menopausal status, presence of septa, presence of papillary projections, location of the tumor, presence of ascites, presence of metastases, cancer antigen 125 level, tumor volume, septa thickness, and percentage of the solid component were included in the initial analysis. A second regression analysis was performed with the addition of Doppler parameters (location of blood flow and lowest resistive index) to the data set. Diagnostic performance of the 2 regression models and the RMI was described and compared by generating receiver operating characteristic curves for each model. RESULTS The area under the curve values for the morphologic model (model 1), Doppler model (model 2), and RMI were 0.907, 0.971, and 0.889, respectively. The areas under the curve of model 1 and the RMI did not differ significantly (P = .23), whereas model 2 had a significantly higher area under the curve than both model 1 (P = .037) and the RMI (P = .018). CONCLUSIONS The addition of Doppler parameters to the regression model significantly increases the predictive performance. Nevertheless, in low-resource settings, the RMI remains the method of choice for distinguishing adnexal masses and for referral to gynecologic oncology clinics.
Affiliation(s)
- Pynar Yörük
  - Department of Obstetrics and Gynecology, Marmara University, Istanbul, Turkey
9
Brun JL, Cortez A, Rouzier R, Callard P, Bazot M, Uzan S, Daraï E. Factors influencing the use and accuracy of frozen section diagnosis of epithelial ovarian tumors. Am J Obstet Gynecol 2008; 199:244.e1-7. [PMID: 18486086 DOI: 10.1016/j.ajog.2008.04.002] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Received: 09/27/2007] [Revised: 12/27/2007] [Accepted: 04/02/2008] [Indexed: 10/22/2022]
Abstract
OBJECTIVE The objective of the study was to identify factors influencing the use and accuracy of frozen section diagnosis (FSD) of ovarian tumors. STUDY DESIGN Surgery was performed in 414 patients with epithelial ovarian tumors between 2001 and 2006. Factors were identified by univariate and multivariate analysis. RESULTS FSD was requested in 274 patients: 152 benign, 55 borderline, and 67 malignant tumors. Age 50 years or older, tumor size 10 cm or greater, and preoperative evidence of malignancy were associated with FSD request. The sensitivity and specificity of FSD were 97% and 81% for benign tumors, 62% and 96% for borderline tumors, and 88% and 99% for malignant tumors. The histologic type (mucinous), tumor size (less than 10 cm), the borderline component (less than 10%), and the pathologist's experience predicted misdiagnosis of borderline tumors. Spread outside the ovary was the only significant predictor of accurate FSD of malignant tumors. CONCLUSION FSD is less accurate for borderline than for benign and malignant ovarian tumors. The pathologist's experience is a major determinant of diagnostic accuracy.
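Per-class sensitivity and specificity of the kind reported in this abstract are usually obtained by collapsing a three-class cross-tabulation into one-vs-rest two-by-two tables. The sketch below assumes that approach, since the abstract does not describe the calculation. The counts are hypothetical and do not reproduce the study data.

```python
def one_vs_rest(confusion, labels):
    """Per-class (sensitivity, specificity) from a square confusion matrix.

    confusion[i][j] = number of tumors with final histology labels[i]
    that the test under evaluation called labels[j].
    """
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                   # class k, called otherwise
        fp = sum(row[k] for row in confusion) - tp    # other class, called k
        tn = total - tp - fn - fp
        results[label] = (tp / (tp + fn), tn / (tn + fp))
    return results


labels = ["benign", "borderline", "malignant"]
# Hypothetical counts: rows = final histology, columns = frozen section call.
confusion = [
    [90, 8, 2],
    [15, 30, 5],
    [1, 4, 45],
]

stats = one_vs_rest(confusion, labels)
for label, (sens, spec) in stats.items():
    print(f"{label}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this toy table the borderline class has the lowest sensitivity because many borderline tumors are called benign, a pattern in line with the abstract's conclusion.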
10
Timmerman D, Van Holsbeke C, Van den Bosch T, Van Calster B, Van Huffel S, Vergote I. To the Editor. Int J Gynecol Cancer 2007; 17:543; author reply 544. [PMID: 17309566 DOI: 10.1111/j.1525-1438.2007.00810.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
11
Moszyński R, Szpurek D, Sajdak S, Smoleń A. Letter. Int J Gynecol Cancer 2007. [DOI: 10.1111/j.1525-1438.2007.00877.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]