Liang Z. Developing probabilistic ensemble machine learning models for home-based sleep apnea screening using overnight SpO2 data at varying data granularity. Sleep Breath 2024:10.1007/s11325-024-03141-x. [PMID: 39190088 DOI: 10.1007/s11325-024-03141-x]
[Received: 05/05/2024] [Revised: 08/08/2024] [Accepted: 08/14/2024] [Indexed: 08/28/2024]
Abstract
PURPOSE
This study aims to develop sleep apnea screening models with overnight SpO2 data, and to investigate the impact of the SpO2 data granularity on model performance.
METHODS
A total of 7,718 SpO2 recordings from the SHHS and MESA datasets were used. Probabilistic ensemble machine learning was employed to predict sleep apnea status at three apnea-hypopnea index (AHI) cutoff points: ≥ 5, ≥ 15, and ≥ 30 events/hour. To investigate the impact of data granularity, SpO2 data were aggregated at intervals of 30, 60, and 300 s.
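The aggregation step can be illustrated with a minimal sketch. The abstract does not say which statistic was used to aggregate the signal, so the version below assumes a 1 Hz input, non-overlapping windows, and a simple mean; the function name `aggregate_spo2` and the synthetic recording are hypothetical and are not taken from the study.

```python
from statistics import mean


def aggregate_spo2(samples, window_s, sample_rate_hz=1):
    """Average SpO2 samples over consecutive non-overlapping windows.

    Assumptions (not from the study): mean aggregation, a fixed sampling
    rate, and a trailing partial window that is simply dropped.
    """
    if window_s <= 0 or sample_rate_hz <= 0:
        raise ValueError("window_s and sample_rate_hz must be positive")
    step = int(window_s * sample_rate_hz)  # samples per window
    n_full = len(samples) // step          # number of complete windows
    return [mean(samples[i * step:(i + 1) * step]) for i in range(n_full)]


# Synthetic ten-minute recording at 1 Hz with one 30 s desaturation to 88%.
recording = [97.0] * 600
recording[100:130] = [88.0] * 30

for window in (30, 60, 300):
    aggregated = aggregate_spo2(recording, window)
    print(window, len(aggregated), round(min(aggregated), 2))
```

In this synthetic example the lowest aggregated value moves closer to the baseline as the window grows, because a short desaturation is averaged together with more normal samples. That is one plausible reason coarser data could carry less information about apnea events.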
RESULTS
Our models demonstrated good to excellent performance on internal testing, with average area under the curve (AUC) values of 0.91, 0.93, and 0.96 for cutoffs ≥ 5, ≥ 15, and ≥ 30 at data granularity of 1 s, respectively. Both sensitivity (0.76, 0.84, 0.89) and specificity (0.87, 0.86, 0.90) ranged from good to excellent across the three cutoffs. Positive predictive values (PPV) ranged from excellent to fair (0.97, 0.83, 0.66), and negative predictive values (NPV) ranged from low to excellent (0.43, 0.87, 0.98). Model performance on external testing dropped slightly compared to internal testing, but AUC remained good to excellent, above 0.80 across all data granularities and all three cutoffs. Data granularity of 300 s led to a reduction in performance metrics across all cutoffs.
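The opposite trends in PPV and NPV across cutoffs are consistent with how predictive values depend on prevalence. The sketch below applies Bayes' rule to the sensitivity and specificity reported for the ≥ 5 cutoff. The prevalence values are illustrative assumptions, since the abstract does not report the proportion of positive cases at each cutoff, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are taken from the abstract (cutoff >= 5);
# the prevalence values are assumed for illustration only.
for prevalence in (0.8, 0.5, 0.2):
    ppv, npv = predictive_values(0.76, 0.87, prevalence)
    print(f"prevalence={prevalence:.1f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, PPV falls and NPV rises as prevalence decreases. Fewer recordings meet a stricter AHI cutoff, so this matches the direction of the reported values, although the sketch does not attempt to reproduce them.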
CONCLUSION
Our models demonstrated superior performance across all three AHI cutoff thresholds compared to existing large sleep apnea screening models, even when considering varying SpO2 data granularity. However, coarser data granularity (longer aggregation intervals) was associated with decreased screening performance, indicating a need for further research in this area.