Pan-Cancer Detection and Typing by Mining Patterns in Large Genome-Wide Cell-Free DNA Sequencing Datasets.
Clin Chem 2022; 68:1164-1176. [PMID: 35769009 DOI: 10.1093/clinchem/hvac095]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/12/2021] [Accepted: 04/25/2022] [Indexed: 11/13/2022]
Abstract
BACKGROUND
Cell-free DNA (cfDNA) analysis holds great promise for non-invasive cancer screening, diagnosis, and monitoring. We hypothesized that mining the patterns of cfDNA shallow whole-genome sequencing datasets from patients with cancer could improve cancer detection.
METHODS
By applying unsupervised clustering and supervised machine learning to large cfDNA shallow whole-genome sequencing datasets from healthy individuals (n = 367) and patients with different hematological (n = 238) and solid malignancies (n = 320), we identified cfDNA signatures that enabled cancer detection and typing.
RESULTS
Unsupervised clustering revealed cancer type-specific sub-grouping. Classification using a supervised machine learning model yielded accuracies of 96% and 65% in discriminating hematological and solid malignancies from healthy controls, respectively. The accuracy of disease type prediction was 85% and 70% for the hematological and solid cancers, respectively. The potential utility of the approach in managing a specific cancer was demonstrated by distinguishing benign adnexal masses from invasive and from borderline adnexal masses, with areas under the curve of 0.87 and 0.74, respectively.
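The results above are reported as two different metrics, accuracy and area under the curve (AUC). As an illustration only, the following minimal sketch shows how each is computed from a classifier's output. The labels and scores are made-up toy values, the function names `accuracy` and `roc_auc` are hypothetical, and nothing here reproduces the study's data or pipeline.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption, not from the study): 1 = cancer, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

# Accuracy needs a hard decision, here an arbitrary 0.5 cut-off on the score.
predictions = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(labels, predictions))
print("AUC:", roc_auc(labels, scores))
```

Accuracy depends on the chosen score cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is one reason the two kinds of figures in the results are not directly comparable.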
CONCLUSIONS
This approach provides a generic analytical strategy for non-invasive pan-cancer detection and cancer type prediction.