Pabón J, Gómez D, Cerón JD, Salazar-Cabrera R, López DM, Blobel B. A Comprehensive Dataset for Activity of Daily Living (ADL) Research Compiled by Unifying and Processing Multiple Data Sources.
J Pers Med 2025;
15:210. [PMID:
40423081 DOI:
10.3390/jpm15050210]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2025] [Revised: 05/12/2025] [Accepted: 05/15/2025] [Indexed: 05/28/2025] Open
Abstract
Background: Activities of Daily Living (ADLs) are essential tasks performed at home and used in healthcare to monitor sedentary behavior, track rehabilitation therapy, and monitor chronic obstructive pulmonary disease. The Barthel Index, used by healthcare professionals, has limitations due to its subjectivity. Human activity recognition (HAR) is a more accurate method using Information and Communication Technologies (ICTs) to assess ADLs more accurately. This work aims to create a singular, adaptable, and heterogeneous ADL dataset that integrates information from various sources, ensuring a rich representation of different individuals and environments. Methods: A literature review was conducted in Scopus, the University of California Irvine (UCI) Machine Learning Repository, Google Dataset Search, and the University of Cauca Repository to obtain datasets related to ADLs. Inclusion criteria were defined, and a list of dataset characteristics was made to integrate multiple datasets. Twenty-nine datasets were identified, including data from various accelerometers, gyroscopes, inclinometers, and heart rate monitors. These datasets were classified and analyzed from the review. Tasks such as dataset selection, categorization, analysis, cleaning, normalization, and data integration were performed. Results: The resulting unified dataset contained 238,990 samples, 56 activities, and 52 columns. The integrated dataset features a wealth of information from diverse individuals and environments, improving its adaptability for various applications. Conclusions: In particular, it can be used in various data science projects related to ADL and HAR, and due to the integration of diverse data sources, it is potentially useful in addressing bias in and improving the generalizability of machine learning models.
Collapse