1
Vallevik VB, Babic A, Marshall SE, Elvatun S, Brøgger HMB, Alagaratnam S, Edwin B, Veeraragavan NR, Befring AK, Nygård JF. Can I trust my fake data - A comprehensive quality assessment framework for synthetic tabular data in healthcare. Int J Med Inform 2024; 185:105413. [PMID: 38493547] [DOI: 10.1016/j.ijmedinf.2024.105413] [Received: 12/15/2023] [Revised: 02/17/2024] [Accepted: 03/11/2024]
Abstract
BACKGROUND Ensuring the safe adoption of AI tools in healthcare hinges on access to sufficient data for training, testing and validation. Synthetic data has been suggested in response to privacy concerns and regulatory requirements and can be created by training a generator on real data to produce a dataset with similar statistical properties. Competing metrics with differing taxonomies for quality evaluation have been proposed, resulting in a complex landscape. Optimising quality entails balancing considerations that make the data fit for use, yet relevant dimensions are left out of existing frameworks. METHOD We performed a comprehensive literature review on the use of quality evaluation metrics for synthetic data, within the scope of synthetic tabular healthcare data produced by deep generative methods. Based on this review and the collective experience of the team, we developed a conceptual framework for quality assurance. Its applicability was benchmarked against a practical case from the Dutch National Cancer Registry. CONCLUSION We present a conceptual framework for quality assurance of synthetic data for AI applications in healthcare that aligns diverging taxonomies, expands the common quality dimensions to include Fairness and Carbon footprint, and proposes the stages necessary to support real-life applications. Building trust in synthetic data by increasing transparency and reducing safety risk will accelerate the development and uptake of trustworthy AI tools for the benefit of patients. DISCUSSION Despite the growing emphasis on algorithmic fairness and carbon footprint, these metrics were scarce in the reviewed literature. The overwhelming focus was on statistical similarity using distance metrics, while sequential logic detection was rare. A consensus-backed framework that includes all relevant quality dimensions can provide assurance for safe and responsible real-life applications of synthetic data. As the choice of appropriate metrics is highly context dependent, further research is needed on validation studies to guide metric choices and support the development of technical standards.
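The statistical-similarity dimension the review found dominant is typically scored with a distance metric between the real and synthetic distributions. As a minimal sketch (not from the paper; the data, binning and metric choice here are illustrative assumptions), a total variation distance between two numeric columns:

```python
import random
from collections import Counter

def total_variation_distance(real, synthetic, bins=10, lo=0.0, hi=1.0):
    """Distance between the empirical histograms of two numeric columns:
    0 means identical histograms, 1 means disjoint support."""
    width = (hi - lo) / bins
    def hist(xs):
        counts = Counter(min(bins - 1, int((x - lo) / width)) for x in xs)
        return [counts.get(i, 0) / len(xs) for i in range(bins)]
    p, q = hist(real), hist(synthetic)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

random.seed(0)
real = [random.random() for _ in range(5000)]
good = [random.random() for _ in range(5000)]      # generator matching the real distribution
bad = [random.random() ** 3 for _ in range(5000)]  # skewed, poorly fitted generator
assert total_variation_distance(real, good) < total_variation_distance(real, bad)
```

A low distance only certifies marginal similarity; the framework's other dimensions (privacy, fairness, sequential logic) need separate metrics.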
Affiliation(s)
- Vibeke Binz Vallevik
- University of Oslo, Boks 1072 Blindern, NO-0316 Oslo, Norway; DNV AS, Veritasveien 1, 1322 Høvik, Norway.
- Severin Elvatun
- Cancer Registry of Norway, Ullernchausseen 64, 0379 Oslo, Norway
- Helga M B Brøgger
- DNV AS, Veritasveien 1, 1322 Høvik, Norway; Oslo University Hospital, Sognsvannsveien 20, 0372 Oslo, Norway
- Bjørn Edwin
- University of Oslo, Boks 1072 Blindern, NO-0316 Oslo, Norway; The Intervention Centre and Department of HPB Surgery, Oslo University Hospital and Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Jan F Nygård
- Cancer Registry of Norway, Ullernchausseen 64, 0379 Oslo, Norway; UiT - The Arctic University of Norway, Tromsø, Norway
2
He S, Chong P, Yoon BJ, Chung PH, Chen D, Marzouk S, Black KC, Sharp W, Safari P, Goldstein JN, Raja AS, Lee J. Entropy removal of medical diagnostics. Sci Rep 2024; 14:1181. [PMID: 38216607] [PMCID: PMC10786933] [DOI: 10.1038/s41598-024-51268-4] [Received: 04/21/2023] [Accepted: 01/03/2024]
Abstract
Shannon entropy is a core concept in machine learning and information theory, particularly in decision tree modeling. To date, no study has systematically and quantitatively applied Shannon entropy to the uncertainty of clinical situations using diagnostic variables (true/false positives and negatives). Decision tree representations of medical decision-making tools can be generated from diagnostic variables reported in the literature, and the entropy each tool removes can then be calculated. This concept of clinical entropy removal has significant potential to drive healthcare innovation, such as quantifying the impact of clinical guidelines and the value of care, with applications to Emergency Medicine scenarios where diagnostic accuracy within a limited time window is paramount. This analysis was performed for 623 diagnostic tools and provided unique insights into their utility. For studies that provided detailed data on medical decision-making algorithms, bootstrapped datasets were generated from the source data to perform a comprehensive machine learning analysis of these algorithms and their constituent steps, yielding a novel and thorough evaluation of medical diagnostic algorithms.
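The central quantity, entropy removed by a diagnostic test, can be sketched as the mutual information between disease state and test result, computed from prevalence, sensitivity and specificity. The numbers below are hypothetical, not taken from the 623 tools the study analyzed:

```python
from math import log2

def binary_entropy(p):
    """Shannon entropy (bits) of a binary event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def entropy_removed(prevalence, sensitivity, specificity):
    """Expected reduction in diagnostic uncertainty from one test:
    H(disease) - H(disease | test result), i.e. mutual information."""
    p_pos = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    ppv = prevalence * sensitivity / p_pos                       # P(disease | +)
    p_dis_neg = prevalence * (1 - sensitivity) / (1 - p_pos)     # P(disease | -)
    posterior = (p_pos * binary_entropy(ppv)
                 + (1 - p_pos) * binary_entropy(p_dis_neg))
    return binary_entropy(prevalence) - posterior

# A hypothetical test with 90% sensitivity/specificity at 10% prevalence
bits = entropy_removed(0.10, 0.90, 0.90)
```

A perfect test removes all of the pre-test entropy, while a test no better than chance removes none, which is what makes the quantity usable as a comparable score across tools.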
Affiliation(s)
- Shuhan He
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
- Paul Chong
- Campbell University School of Osteopathic Medicine, Lillington, NC, USA
- Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
- Brookhaven National Laboratory, Computational Science Initiative, Upton, NY, USA
- Pei-Hung Chung
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
- David Chen
- Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Sammer Marzouk
- Harvard University Department of Chemistry and Chemical Biology, Cambridge, MA, USA
- Wilson Sharp
- Campbell University School of Osteopathic Medicine, Lillington, NC, USA
- Pedram Safari
- Massachusetts General Hospital Institute of Health Professions, Boston, MA, USA
- Joshua N Goldstein
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Ali S Raja
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Jarone Lee
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
3
Chen YC, Chen SC, Liu YS. Reducing abnormal expenses in national health insurance based on a control chart and decision tree-driven define, measure, analyze, improve and control process. Health Informatics J 2023; 29:14604582231203757. [PMID: 37730249] [DOI: 10.1177/14604582231203757]
Abstract
This study examined the cost of medical insurance for "sepsis" treatment in Taiwan. We applied statistical tests, cost control charts, and C5.0 decision trees within the define, measure, analyze, improve and control (DMAIC) process to mine data on Diagnosis-Related Groups (DRGs) and clinics that reported expense anomalies and disposal costs. Analyzing 353 valid samples (application fees) from four DRGs, 70 clinics, and 15 input variables, abnormalities in application fees for adults (age ≥18 years) with comorbidities or complications were significant (95% confidence interval) in one DRG and nine clinics. Four input variables (ward charge, treatment fee, laboratory fee, and pharmaceutical service charge) had a significant impact. Improvements or controls should be prioritized for three clinics (Nos. 49, 44, and 14) and two input variables (treatment and laboratory fees). This model can be replicated to identify excess medical expenditures and improve the efficiency of medical resource use.
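The control-chart step of such a DMAIC pipeline amounts to flagging clinics whose mean application fee falls outside 95% control limits, the significance level the abstract cites. The clinic ids and fee values below are invented for illustration and do not come from the study:

```python
from statistics import mean, stdev

def control_limits(values, z=1.96):
    """95% control limits (z = 1.96) around the overall mean fee."""
    m, s = mean(values), stdev(values)
    return m - z * s, m + z * s

def flag_anomalies(clinic_costs, z=1.96):
    """Return ids of clinics whose mean application fee falls outside
    control limits computed over all clinics' fees pooled together."""
    all_fees = [fee for fees in clinic_costs.values() for fee in fees]
    lo, hi = control_limits(all_fees, z)
    return sorted(cid for cid, fees in clinic_costs.items()
                  if not (lo <= mean(fees) <= hi))

# Hypothetical fees: nine ordinary clinics and one clear outlier
data = {f"clinic{i}": [100, 101, 99, 100] for i in range(9)}
data["clinicX"] = [500, 505, 495, 500]
```

In the study's workflow, flagged clinics would then feed the C5.0 decision tree to identify which input variables (e.g. treatment or laboratory fees) drive the anomaly.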
Affiliation(s)
- Yen-Chang Chen
- School of Sport and Health, SanMing University, SanMing, China
- Shui-Chuan Chen
- Department of Industrial Engineering and Management, National Chin-Yi University of Technology, Taichung, Taiwan
- Ying-Sing Liu
- College of Humanities and Social Sciences, Chaoyang University of Technology, Taichung, Taiwan
4
Nikolentzos G, Vazirgiannis M, Xypolopoulos C, Lingman M, Brandt EG. Synthetic electronic health records generated with variational graph autoencoders. NPJ Digit Med 2023; 6:83. [PMID: 37120594] [PMCID: PMC10148837] [DOI: 10.1038/s41746-023-00822-x] [Received: 10/12/2022] [Accepted: 04/05/2023]
Abstract
Data-driven medical care delivery must always respect patient privacy, a requirement that is not easily met. This issue has impeded improvements to healthcare software and has delayed the long-predicted prevalence of artificial intelligence in healthcare. Until now, it has been very difficult to share data between healthcare organizations, resulting in poor statistical models built on unrepresentative patient cohorts. Synthetic data, i.e., artificial but realistic electronic health records, could overcome the data drought troubling the healthcare sector. Deep neural network architectures, in particular, have shown a remarkable ability to learn from complex data sets and to generate large amounts of unseen data points with the same statistical properties as the training data. Here, we present a generative neural network model that creates synthetic health records with realistic timelines. These clinical trajectories are generated on a per-patient basis and are represented as linear-sequence graphs of clinical events over time. We use a variational graph autoencoder (VGAE) to generate synthetic samples from real-world electronic health records. Our approach generates health records not seen in the training data. We show that these artificial patient trajectories are realistic, preserve patient privacy, and can therefore support the safe sharing of data across organizations.
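The linear-sequence graph representation described here can be sketched as a chain over time-ordered events, the kind of structure a graph autoencoder would consume. The event names and dictionary layout are hypothetical, not the paper's actual data format:

```python
def trajectory_to_graph(events):
    """Linear-sequence graph of one patient's trajectory: one node per
    clinical event in temporal order, edges only between consecutive events."""
    n = len(events)
    adjacency = [[1 if abs(i - j) == 1 else 0 for j in range(n)]
                 for i in range(n)]
    return {"nodes": list(events), "adjacency": adjacency}

graph = trajectory_to_graph(["diagnosis", "surgery", "chemotherapy", "follow-up"])
```

A VGAE then learns a latent distribution over such graphs and samples new chains from it, which is how unseen but statistically similar trajectories arise.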
Affiliation(s)
- Giannis Nikolentzos
- LIX, École Polytechnique, Institut Polytechnique de Paris, Palaiseau, France.
- Michalis Vazirgiannis
- LIX, École Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Department of Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Markus Lingman
- Department of Molecular and Clinical Medicine/Cardiology, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Center for Applied Intelligent Systems Research, Halmstad University, Halmstad, Sweden
5
Alharbi A, Ahmad M, Alosaimi W, Alyami H, Sarkar AK, Agrawal A, Kumar R, Khan RA. Securing healthcare information system through fuzzy based decision-making methodology. Health Informatics J 2022; 28:14604582221135420. [DOI: 10.1177/14604582221135420]
Abstract
The purpose of a Healthcare Information System (HIS) is to replace the conventional method of data gathering and organization in hospitals with systematic data collection, maintenance and dissemination. There has recently been an unprecedented rise in malware and cyber-attacks on HIS, and cyber-attacks have become a major crisis for the healthcare industry. To address this, the present paper studies the security factors integral to the healthcare information system and analyzes their performance. To this end, the study employed a framework integrating the Fuzzy Analytic Hierarchy Process (F-AHP) with the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to evaluate the performance of each factor. The factors that play a vital role in healthcare data security breaches were then prioritized according to their security weights. Furthermore, the validity of the results obtained by this methodology was established through sensitivity analysis and by comparing the results with those of other methods on the same data set. Based on these results, access control and software security were identified as the most significant security factors.
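The TOPSIS half of such a framework ranks factors by relative closeness to an ideal solution. A minimal sketch follows; the factor names, scores and criterion weights are hypothetical, and the fuzzy-AHP step that would normally supply the weights is omitted:

```python
from math import sqrt

def topsis(matrix, weights):
    """Rank alternatives by relative closeness to the ideal solution.
    matrix[i][j] is the score of alternative i on benefit criterion j."""
    norms = [sqrt(sum(v * v for v in col)) for col in zip(*matrix)]
    weighted = [[w * v / n for v, w, n in zip(row, weights, norms)]
                for row in matrix]
    best = [max(col) for col in zip(*weighted)]    # positive ideal solution
    worst = [min(col) for col in zip(*weighted)]   # negative ideal solution
    def dist(row, ref):
        return sqrt(sum((a - b) ** 2 for a, b in zip(row, ref)))
    return [dist(r, worst) / (dist(r, worst) + dist(r, best)) for r in weighted]

factors = ["access control", "software security", "network security"]
scores = topsis([[9, 8], [8, 7], [4, 5]],  # hypothetical factor scores
                [0.6, 0.4])                # hypothetical criterion weights
```

Closeness scores lie in [0, 1]; higher means nearer the ideal, so sorting factors by score yields the priority ranking the paper describes.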
Affiliation(s)
- Abdullah Alharbi
- Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Masood Ahmad
- Department of Information Technology, Babasaheb Bhimrao Ambedkar University, Lucknow, India
- Wael Alosaimi
- Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Hashem Alyami
- Department of Computer Science, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Amal Krishna Sarkar
- Department of Biostatistics and Health Informatics, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow, India
- Alka Agrawal
- Department of Information Technology, Babasaheb Bhimrao Ambedkar University, Lucknow, India
- Rajeev Kumar
- Centre for Innovation and Technology, Administrative Staff College of India, Hyderabad, India
- Raees Ahamd Khan
- Department of Information Technology, Babasaheb Bhimrao Ambedkar University, Lucknow, India
6
Review of Artificial Intelligence and Machine Learning Technologies: Classification, Restrictions, Opportunities and Challenges. Mathematics 2022. [DOI: 10.3390/math10152552]
Abstract
Artificial intelligence (AI) is an evolving set of technologies used for solving a wide range of applied issues. The core of AI is machine learning (ML), a complex of algorithms and methods that address the problems of classification, clustering, and forecasting. The practical application of AI&ML holds promising prospects, and research in this area is therefore intensive. However, industrial applications of AI, and its more intensive use in society, are not yet widespread. The challenges of widespread AI application need to be considered from both the AI (internal problems) and the societal (external problems) perspective. This consideration identifies the priority steps for more intensive practical application of AI technologies and for their introduction into industry and society. The article identifies and discusses the challenges of employing AI technologies in the economy and society of resource-based countries. A systematization of AI&ML technologies is carried out based on publications in these areas, allowing the organizational, personnel, social and technological limitations to be specified. The paper outlines directions of study in AI and ML that would help overcome some of these limitations and expand the scope of AI&ML applications.
7
A Methodology for Controlling Bias and Fairness in Synthetic Data Generation. Applied Sciences (Basel) 2022. [DOI: 10.3390/app12094619]
Abstract
The development of algorithms based on machine learning techniques that support (or even replace) human judgment must take into account concepts such as data bias and fairness. Though the scientific literature proposes numerous techniques to detect and evaluate these problems, less attention has been dedicated to methods for generating intentionally biased datasets, which data scientists could use to develop and validate unbiased and fair decision-making algorithms. To this end, this paper presents a novel method to generate a synthetic dataset in which bias can be modeled using a probabilistic network that exploits structural equation modeling. The proposed methodology was validated on a simple dataset, to highlight the impact of the tuning parameters on bias and fairness, and on a more realistic example based on a loan approval status dataset. In particular, this methodology requires a limited number of parameters compared to other techniques for generating datasets with a controlled amount of bias and fairness.
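The core idea, a generator with an explicit knob controlling how strongly a sensitive attribute leaks into the outcome, can be sketched as follows. This toy uses a single Gaussian "merit" cause instead of the paper's full structural-equation network, and every name here is hypothetical:

```python
import random

def generate_biased_dataset(n, bias, seed=0):
    """Sample (group, merit, label) rows. `bias` is the single tuning knob:
    0 gives a fair generator; larger values let the sensitive group attribute
    shift the outcome alongside the legitimate cause."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.random() < 0.5       # sensitive attribute
        merit = rng.gauss(0.0, 1.0)      # legitimate cause of the outcome
        label = merit + (bias if group else -bias) > 0.0
        rows.append((group, merit, label))
    return rows

def positive_rate(rows, group):
    """Fraction of favorable outcomes within one group."""
    labels = [lbl for g, _, lbl in rows if g == group]
    return sum(labels) / len(labels)

biased = generate_biased_dataset(20000, bias=0.8)
gap = positive_rate(biased, True) - positive_rate(biased, False)  # demographic parity gap
```

Because the bias parameter is explicit, a fairness metric such as the demographic parity gap can be dialed up or down to stress-test a decision-making algorithm, which is the validation use the paper targets.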