1
Wang N, Wang Z, Liu H, Wang Y, Li J, Hong X. Test of the relationship between adolescents' 24-h activity behavior and anxiety symptoms using compositional data analysis. BMC Public Health 2025; 25:1819. [PMID: 40382562 DOI: 10.1186/s12889-025-22864-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2024] [Accepted: 04/21/2025] [Indexed: 05/20/2025]
Abstract
BACKGROUND: Physical activity plays a crucial role in promoting health, notably in mitigating anxiety symptoms. However, limited research has explored how different intensities of physical activity uniquely influence anxiety. This study investigated the dose-response relationship between Chinese adolescents' 24-h activity behavior and anxiety symptoms using compositional data analysis (CoDA). METHODS: The temporal distribution of 24-h activity behaviors of 176 adolescents was objectively measured by accelerometers, and anxiety symptoms were assessed by the Self-Rating Anxiety Scale (SAS). Data were analyzed using CoDA and the isotemporal substitution model to statistically modify the intensity and duration of exercise in predicting anxiety. RESULTS: Moderate-to-vigorous physical activity (MVPA), but not light physical activity (LPA), was negatively associated with adolescent anxiety symptoms; sedentary behavior (SB), SP, and anxiety symptoms were positively inter-correlated. Isotemporal substitution analyses indicated that replacing 15 min of other activities with MVPA, or substituting SB with LPA, reduced anxiety symptom levels; conversely, the opposite substitutions increased them. Dose-effect analysis showed that the reallocation between LPA and SB had an equivalent but opposite impact on anxiety symptom levels. Meanwhile, when other activities were replaced with MVPA, anxiety levels decreased slowly; when MVPA was replaced by other activities, anxiety levels increased rapidly. CONCLUSION: MVPA is a key factor in alleviating anxiety symptoms, but it is essential to consider adolescents' 24-h activity behaviors holistically. The primary goal should be to maintain existing levels of MVPA while reasonably promoting the replacement of SB with MVPA, thereby enhancing adolescents' physical and mental health.
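The compositional isotemporal substitution described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the baseline day, the behavior ordering, and the regression coefficients `beta` are hypothetical values chosen only to show the mechanics (isometric log-ratio pivot coordinates, then a 15-min reallocation from SB to MVPA with the 1440-min total held fixed).

```python
import numpy as np

def pivot_ilr(parts):
    """Isometric log-ratio (pivot) coordinates of a positive composition."""
    x = np.asarray(parts, dtype=float)
    logx = np.log(x / x.sum())          # closure to proportions, then logs
    d = x.size
    z = np.empty(d - 1)
    for i in range(d - 1):
        r = d - i - 1                   # number of parts after part i
        # part i relative to the geometric mean of the remaining parts
        z[i] = np.sqrt(r / (r + 1)) * (logx[i] - logx[i + 1:].mean())
    return z

def reallocate(day, src, dst, minutes):
    """Move `minutes` from behavior `src` to `dst`, keeping the daily total fixed."""
    if day[src] <= minutes:
        raise ValueError("cannot remove that much time from " + src)
    new_day = dict(day)
    new_day[src] -= minutes
    new_day[dst] += minutes
    return new_day

# Hypothetical mean day in minutes (sums to 1440). MVPA is listed first so
# that the first ilr coordinate contrasts MVPA with all other behaviors.
baseline = {"mvpa": 45.0, "lpa": 255.0, "sb": 600.0, "sleep": 540.0}
swapped = reallocate(baseline, "sb", "mvpa", 15.0)

z_before = pivot_ilr(list(baseline.values()))
z_after = pivot_ilr(list(swapped.values()))

# Hypothetical regression coefficients for the three ilr coordinates.
beta = np.array([-2.0, 0.5, 1.0])
predicted_change = float(beta @ (z_after - z_before))
print(round(predicted_change, 3))
```

Because the log-ratio coordinates are nonlinear in minutes, adding 15 min to a small part such as MVPA and removing 15 min from it generally give predicted changes of different size, which is one way an asymmetric pattern like the one reported above can arise.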
Affiliation(s)
- Ning Wang
  - School of Sports Medicine, Wuhan Sports University, Wuhan, 430079, China
- Ziyi Wang
  - School of Sports Medicine, Wuhan Sports University, Wuhan, 430079, China
- Hui Liu
  - School of Sports Medicine, Wuhan Sports University, Wuhan, 430079, China
  - Hubei Key Laboratory of Exercise Training and Monitoring, Wuhan, China
- Yifeng Wang
  - School of Physical Education, Wuhan Sports University, Wuhan, China
- Jinkun Li
  - School of Physical Education and Sports, Central China Normal University, Wuhan, China
- Xiaobin Hong
  - School of Sports Medicine, Wuhan Sports University, Wuhan, 430079, China
  - Hubei Key Laboratory of Exercise Training and Monitoring, Wuhan, China
2
Wang Y, Sun J, Zhang Y, Wang J, Lu S. Association of reallocating time between physical activity and sedentary behavior on the risk of depression: a systematic review and meta-analysis. Front Psychol 2025; 16:1505061. [PMID: 40370399 PMCID: PMC12075196 DOI: 10.3389/fpsyg.2025.1505061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2024] [Accepted: 04/11/2025] [Indexed: 05/16/2025]
Abstract
Background and aims: Sedentary behavior (SB) is a prevalent lifestyle factor and a risk factor for various health conditions, including depression (encompassing both clinically diagnosed depressive disorders and depressive symptoms). This study aimed to summarize the estimated impact of reallocating time spent in SB to light-intensity physical activity (LPA) or moderate-to-vigorous physical activity (MVPA) on the risk of depression from observational studies, as well as the impact of reallocating time spent in MVPA and LPA to SB. Methods: Four databases [PubMed, Scopus, SPORTDiscus, and PsycINFO (via EBSCOhost platform)] were searched for relevant studies published up to August 2024. Meta-analyses were performed on the estimated regression coefficients (b) and 95% confidence intervals (CIs) for depression symptom scores. All statistical analyses were performed using STATA 16.0. Results: Twenty-seven studies involving 702,755 participants met the inclusion criteria. Reallocating SB to LPA and MVPA was significantly associated with reductions in depression risk (b = -0.04, 95% CI = -0.06 to -0.03, p < 0.001, and b = -0.11, 95% CI = -0.19 to -0.03, p = 0.004, respectively). Subgroup analyses indicated that reallocating 30 and 60 min of SB to LPA or MVPA was significantly associated with reduced depression risk, with significant differences in PA intensity and age, but not for the 10- and 15-min groups. Conversely, reallocating LPA and MVPA to SB was significantly associated with increased depression risk (b = 0.11, 95% CI = 0.01 to 0.21, p = 0.039, and b = 0.17, 95% CI = 0.08 to 0.25, p < 0.001, respectively). Subgroup analyses indicated that reallocating 30 min of LPA or MVPA to SB was significantly associated with increased depression risk, with no difference in PA intensity. Conclusions: Reallocating SB to PA was beneficial, whereas reallocating PA to SB was detrimental to the risk of depression. The results highlight the importance of considering PA intensity and duration in the development of behavioral guidelines aimed at reducing the risk of depression. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=546666, identifier: CRD42024546666.
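Pooling regression coefficients reported with 95% CIs, as described in the Methods, can be sketched as follows. This is an illustration under stated assumptions, not the review's procedure: the abstract does not say whether a fixed- or random-effects model was used, so the sketch uses simple fixed-effect inverse-variance weighting, and the function name `pool_fixed_effect` and the input values are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def pool_fixed_effect(estimates):
    """Inverse-variance pooled coefficient from (b, ci_low, ci_high) tuples."""
    weights = []
    weighted_b = []
    for b, ci_low, ci_high in estimates:
        se = (ci_high - ci_low) / (2 * Z95)   # recover the SE from the CI width
        w = 1.0 / se ** 2                      # precision weight
        weights.append(w)
        weighted_b.append(w * b)
    pooled = sum(weighted_b) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled - Z95 * pooled_se, pooled + Z95 * pooled_se

# Hypothetical study-level coefficients for reallocating SB to MVPA.
studies = [(-0.10, -0.20, 0.00), (-0.15, -0.30, 0.00), (-0.05, -0.12, 0.02)]
b, low, high = pool_fixed_effect(studies)
print(f"pooled b = {b:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A random-effects model would add a between-study variance term to each weight. The fixed-effect form is shown only because it is the shortest correct instance of inverse-variance pooling.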
Affiliation(s)
- Yue Wang
  - Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, China
- Jun Sun
  - School of Physical Education, Central China Normal University, Wuhan, China
- Yuheng Zhang
  - School of Sports, Wuhan University of Science and Technology, Wuhan, China
- Jiali Wang
  - School of Sports, Wuhan University of Science and Technology, Wuhan, China
- Songtao Lu
  - School of Physical Education, Central China Normal University, Wuhan, China
  - School of Sports, Wuhan University of Science and Technology, Wuhan, China
3
Kuzik N, Duncan MJ, Beshara N, MacDonald M, Silva DAS, Tremblay MS. A systematic review and meta-analysis of the first decade of compositional data analyses of 24-hour movement behaviours, health, and well-being in school-aged children. Journal of Activity, Sedentary and Sleep Behaviors 2025; 4:4. [PMID: 40217545 PMCID: PMC11948812 DOI: 10.1186/s44167-025-00076-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2024] [Accepted: 03/15/2025] [Indexed: 04/14/2025]
Abstract
INTRODUCTION: Movement behaviours (e.g., sleep, sedentary behaviour, light physical activity [LPA], moderate to vigorous physical activity [MVPA]) are associated with numerous health and well-being outcomes. Compositional data analysis (CoDA) accounts for the interdependent nature of movement behaviours. This systematic review and meta-analysis provides a timely synthesis of the first decade of CoDA research examining the association between movement behaviours, health, and well-being in school-aged children. METHODS: Databases were systematically searched for peer-reviewed studies examining CoDA associations between movement behaviours and health or well-being in school-aged children (5.0-17.9 years). All health and well-being outcomes were eligible for inclusion, as were all methods of reporting CoDA results. Where possible, meta-analyses were conducted. RESULTS: Twenty-six studies were included in the review. Sample sizes ranged from 88 to 5,828 (median = 387) participants and the mean ages ranged from 8 to 16 years. Regression parameters (k = 16 studies) were the most common method of reporting results, followed by substitution effects (k = 12 studies), optimal compositions (k = 3 studies), and movement behaviour clusters (k = 1 study). Weighted compositional means of movement behaviours were calculated (e.g., 49.8 min/day of MVPA). For regression analyses, results were generally null, though some favourable trends were observed for MVPA and unfavourable trends for LPA and sedentary behaviour within individual health and well-being outcome categories. Meta-analyses of substitutions supported the benefits of MVPA, with the risks of reducing MVPA for other movement behaviours being double the magnitude of the benefits of adding MVPA. DISCUSSION: The most consistent conclusions within this review align with previous reviews that support the benefits of MVPA. Further, some evidence supported 24-hour movement behaviour guideline recommendations of increasing sleep and decreasing sedentary behaviour. This review also quantified not only the need to promote MVPA, but perhaps more importantly the urgency needed to preserve the limited MVPA children currently accumulate. Findings reinforce the "more/less is better" messages for movement behaviours, but do not allow us to recommend more specific balances of movement behaviours. As CoDA of movement behaviours progresses and accumulates further research, the methods and discussion points within the current review can aid future meta-analyses aimed at advancing the precision health guidance needed for optimizing children's health and well-being.
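A weighted compositional mean of the kind mentioned in the Results can be sketched as a weighted geometric mean of each behaviour followed by closure to the 1440-min day. The abstract does not state the weighting scheme, so weighting by sample size is an assumption here, and the study-level compositions below are hypothetical.

```python
import numpy as np

def weighted_compositional_mean(compositions, weights, total=1440.0):
    """Weighted geometric mean per part, rescaled (closed) to sum to `total`."""
    x = np.asarray(compositions, dtype=float)   # rows = studies, cols = parts
    w = np.asarray(weights, dtype=float)        # one weight per study
    log_mean = np.average(np.log(x), axis=0, weights=w)
    center = np.exp(log_mean)
    return total * center / center.sum()

# Hypothetical study means in min/day: sleep, sedentary, LPA, MVPA.
study_means = [
    [560.0, 520.0, 310.0, 50.0],
    [540.0, 560.0, 295.0, 45.0],
    [575.0, 500.0, 305.0, 60.0],
]
sample_sizes = [387, 1200, 88]  # assumed weights

mean_day = weighted_compositional_mean(study_means, sample_sizes)
print(np.round(mean_day, 1))
```

The geometric mean with closure, rather than the arithmetic mean of each column, is the usual center for compositional data because it respects the relative (log-ratio) scale.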
Affiliation(s)
- Nicholas Kuzik
  - Healthy Active Living and Obesity Research Group, Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada
- Markus J Duncan
  - Healthy Active Living and Obesity Research Group, Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada
  - ParticipACTION, 4 New Street, Toronto, ON, M5R 1P6, Canada
- Natalie Beshara
  - Healthy Active Living and Obesity Research Group, Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada
  - National University of Ireland, University Road, Galway, H91 TK33, Ireland
- Matthew MacDonald
  - Healthy Active Living and Obesity Research Group, Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada
- Mark S Tremblay
  - Healthy Active Living and Obesity Research Group, Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada
  - Department of Pediatrics, University of Ottawa, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada
4
Brown DMY, Burkart S, Groves CI, Balbim GM, Pfledderer CD, Porter CD, Laurent CS, Johnson EK, Kracht CL. A systematic review of research reporting practices in observational studies examining associations between 24-h movement behaviors and indicators of health using compositional data analysis. Journal of Activity, Sedentary and Sleep Behaviors 2024; 3:23. [PMID: 39371105 PMCID: PMC11446952 DOI: 10.1186/s44167-024-00062-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Accepted: 09/02/2024] [Indexed: 10/08/2024]
Abstract
Background: Compositional data analysis (CoDA) techniques are well suited for examining associations between 24-h movement behaviors (i.e., sleep, sedentary behavior, physical activity) and indicators of health, given that they recognize these behaviors are co-dependent, representing relative parts that make up a whole day. Accordingly, CoDA techniques have seen increased adoption in the past decade; however, heterogeneity in research reporting practices may hinder efforts to synthesize and quantify these relationships via meta-analysis. This systematic review described reporting practices in studies that used CoDA techniques to investigate associations between 24-h movement behaviors and indicators of health. Methods: A systematic search of eight databases was conducted, in addition to supplementary searches (e.g., forward/backward citations, expert consultation). Observational studies that used CoDA techniques involving log-ratio transformation of behavioral data to examine associations between time-based estimates of 24-h movement behaviors and indicators of health were included. Reporting practices were extracted and classified into seven areas: (1) methodological justification, (2) behavioral measurement and data handling strategies, (3) composition construction, (4) analytic plan, (5) composition-specific descriptive statistics, (6) model results, and (7) auxiliary information. Study quality and risk of bias were assessed by the National Institutes of Health Quality Assessment Tool for Observational Cohort and Cross-sectional Studies. Results: A total of 102 studies met our inclusion criteria. Reporting practices varied considerably across areas, with most studies achieving high standards in methodological justification but reporting inconsistently across all other domains. Some items were reported in all studies (e.g., how many parts the daily composition was partitioned into), whereas others were seldom reported (e.g., definition of a day: midnight-to-midnight versus wake-to-wake). Study quality and risk of bias were fair in most studies (85%). Conclusions: Current studies generally demonstrate inconsistent reporting practices. Consistent, clear and detailed reporting practices are evidently needed moving forward as the field of time-use epidemiology aims to accurately capture and analyze movement behavior data in relation to health outcomes, facilitate comparisons across studies, and inform public health interventions and policy decisions. Achieving consensus regarding reporting recommendations is a key next step. Supplementary Information: The online version contains supplementary material available at 10.1186/s44167-024-00062-8.
Affiliation(s)
- Sarah Burkart
  - University of South Carolina, Arnold School of Public Health, 921 Assembly St, Columbia, SC 29208 USA
- Claire I. Groves
  - The University of Texas at San Antonio, 1 UTSA Circle, San Antonio, TX 78249 USA
- Christopher D. Pfledderer
  - The University of Texas Health Science Center Houston, School of Public Health in Austin, Austin, TX 78701 USA
- Carah D. Porter
  - Kansas State University, 1105 Sunset Ave, Manhattan, KS 66502 USA
- Emily K. Johnson
  - The University of Texas at San Antonio, 1 UTSA Circle, San Antonio, TX 78249 USA
- Chelsea L. Kracht
  - University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS 66160 USA
5
Ma Y, Gao Y, Yang H, Zhang Y, Ku Y. Enhancing mental well-being of undergraduates: establishing cut-off values and analyzing substitutive effects of physical activity on depression regulation. Front Psychol 2024; 15:1432454. [PMID: 39319070 PMCID: PMC11420123 DOI: 10.3389/fpsyg.2024.1432454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 08/23/2024] [Indexed: 09/26/2024]
Abstract
Objective: This study aimed to analyze the effects of physical activity (PA), sleep quality, and sedentary behavior on subthreshold depression (StD) among undergraduates. Methods: This study included 834 undergraduates and assessed the impact of PA time, sleep quality, and sedentary behavior on depression. Receiver operating characteristic (ROC) analysis was performed to determine cut-off values for StD risk, while isochronous substitution analysis was performed to evaluate the effects of different activities on depression regulation. Results: Gender, age, and academic grade had no significant influence on depression levels among undergraduates (p > 0.05). However, students engaging in sedentary behavior for more than 12.1 h per day or with a Pittsburgh Sleep Quality Index score above 3.5 were at an increased risk of subclinical depression. Additionally, the isochronous substitution of light-intensity physical activity for other activities (sleep, sedentary behavior, moderate and vigorous intensity physical activity) showed statistically significant effects (p < 0.05) in both 5-min and 10-min substitution models, demonstrating a positive effect on alleviating depression. Conclusion: The findings indicate that specific lifestyle factors, particularly high levels of sedentary behavior and poor sleep quality, are crucial determinants of subclinical depression among undergraduates, independent of demographic variables such as gender, age, and academic grade. Notably, light-intensity PA plays a key role in StD regulation, as substituting it with more intense physical activities or improving sleep quality substantially reduces depression scores. Furthermore, the benefits of such substitution became more pronounced as the duration of the activity increased.
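Deriving a cut-off value from ROC analysis can be sketched as below. The abstract does not name the criterion used to pick the cut-off, so this sketch assumes Youden's J (sensitivity + specificity - 1), which is one common choice. The function name `youden_cutoff` and the toy scores are hypothetical.

```python
import numpy as np

def youden_cutoff(scores, is_case):
    """Return (cut-off, J) maximizing Youden's J; scores above the cut-off are flagged.

    Assumes both cases and non-cases are present and at least two distinct scores.
    """
    scores = np.asarray(scores, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    levels = np.unique(scores)
    # Candidate cut-offs are midpoints between adjacent observed score levels.
    candidates = (levels[:-1] + levels[1:]) / 2
    best_j, best_cut = -np.inf, None
    for cut in candidates:
        flagged = scores > cut
        sensitivity = flagged[is_case].mean()        # true positive rate
        specificity = (~flagged[~is_case]).mean()    # true negative rate
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j

# Toy, perfectly separated sleep-quality scores (higher = worse sleep).
toy_scores = [1, 2, 3, 4, 5, 6]
toy_cases = [0, 0, 0, 1, 1, 1]
cut, j = youden_cutoff(toy_scores, toy_cases)
print(cut, j)
```

Using midpoints between observed integer scores is why a cut-off on an integer-valued scale can come out as a half value.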
Affiliation(s)
- Yue Ma
  - Faculty of Medicine, Macau University of Science and Technology, Taipa, Macau SAR, China
  - Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, Guangdong, China
- Yulin Gao
  - School of Nursing, Southern Medical University, Guangzhou, Guangdong, China
- Hui Yang
  - School of Nursing, Southern Medical University, Guangzhou, Guangdong, China
- Yu Zhang
  - School of Nursing, Southern Medical University, Guangzhou, Guangdong, China
- Yixuan Ku
  - Guangdong Provincial Key Laboratory of Brain Function and Disease, Center for Brain and Mental Well-being, Department of Psychology, Sun Yat-sen University, Guangzhou, China
6
Rong F, Li X, Jia L, Liu J, Li S, Zhang Z, Wang R, Wang D, Wan Y. Substitutions of physical activity and sedentary behavior with negative emotions and sex difference among college students. Psychology of Sport and Exercise 2024; 72:102605. [PMID: 38346583 DOI: 10.1016/j.psychsport.2024.102605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 02/02/2024] [Accepted: 02/06/2024] [Indexed: 02/25/2024]
Abstract
BACKGROUND: A growing number of studies have found that physical activity (PA) benefits mental health and that sedentary behavior (SB) increases the risk of psychological symptoms, but it remains unclear whether substituting the duration of different activities affects these associations, and whether sex differences exist among college students. METHODS: A longitudinal survey was conducted in three colleges in Anhui province, China. A total of 8049 participants validly answered the questionnaire, which included demographic variables, PA, SB and negative emotions (depressive, anxiety, and stress symptoms). RESULTS: Substituting 30 min per day of SB with an equivalent duration of walking was associated with lower depressive symptom scores. Stress symptoms were reduced when SB was substituted by moderate intensity PA (MPA) and walking. In females, substituting 30 min per day of MPA for walking or SB was associated with an amelioration of depressive and stress symptoms; in males, replacing 30 min of SB with walking was associated with lower depressive and stress symptom scores. CONCLUSIONS: Replacing SB with walking and MPA ameliorates depressive and stress symptoms in young adults. The results suggest that reallocating time from SB or walking to MPA in females, and from SB to walking in males, may markedly reduce depressive and stress symptoms in the college population.
Affiliation(s)
- Fan Rong
  - Department of Maternal, Child and Adolescent Health, School of Public Health, Anhui Medical University, Anhui, China; Anhui Provincial Key Laboratory of Population Health and Aristogenics, Anhui, China
- Xin Li
  - School of Clinical Medical, Anqing Medical College, Anhui, China
- Liyuan Jia
  - Department of Maternal, Child and Adolescent Health, School of Public Health, Anhui Medical University, Anhui, China; Anhui Provincial Key Laboratory of Population Health and Aristogenics, Anhui, China
- Jing Liu
  - School of Clinical Medical, Huainan Union University, Huainan, China
- Shuqin Li
  - Department of Maternal, Child and Adolescent Health, School of Public Health, Anhui Medical University, Anhui, China; Anhui Provincial Key Laboratory of Population Health and Aristogenics, Anhui, China
- Zhixian Zhang
  - Department of Maternal, Child and Adolescent Health, School of Public Health, Anhui Medical University, Anhui, China; Anhui Provincial Key Laboratory of Population Health and Aristogenics, Anhui, China
- Rui Wang
  - Teaching Affairs Office, Anqing Medical College, Anhui, China
- Danni Wang
  - Department of Health Promotion and Behavioral Sciences, School of Public Health, Anhui Medical University, Anhui, China
- Yuhui Wan
  - Department of Maternal, Child and Adolescent Health, School of Public Health, Anhui Medical University, Anhui, China; Anhui Provincial Key Laboratory of Population Health and Aristogenics, Anhui, China
7
Groves CI, Huong C, Porter CD, Summerville B, Swafford I, Witham B, Hayward M, Kwan MYW, Brown DMY. Associations between 24-h movement behaviors and indicators of mental health and well-being across the lifespan: a systematic review. Journal of Activity, Sedentary and Sleep Behaviors 2024; 3:9. [PMID: 40217439 PMCID: PMC11960375 DOI: 10.1186/s44167-024-00048-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/19/2023] [Accepted: 02/28/2024] [Indexed: 04/15/2025]
Abstract
Researchers have adopted a variety of analytical techniques to examine the collective influence of 24-h movement behaviors (i.e., physical activity, sedentary behaviors, sleep) on mental health, but efforts to synthesize this growing body of literature have been limited to studies of children and youth. This systematic review investigated how combinations of 24-h movement behaviors relate to indicators of mental ill-being and well-being across the lifespan. A systematic search of MEDLINE, PsycINFO, Embase, and SPORTDiscus was conducted. Studies were included if they reported all three movement behaviors and an indicator of mental ill-being or well-being, and were published in English after January 2009. Samples of both clinical and non-clinical populations were included. A total of 73 studies (n = 58 cross-sectional; n = 15 longitudinal) met our inclusion criteria, of which 47 investigated children/youth and 26 investigated adults. Seven analytical approaches were used: guideline adherence (total and specific combinations), movement compositions, isotemporal substitution, profile/cluster analyses, the Goldilocks method and rest-activity rhythmicity. More associations were reported for indicators of mental ill-being (n = 127 for children/youth; n = 53 for adults) than well-being (n = 54 for children/youth; n = 26 for adults). Across the lifespan, benefits were most consistently observed for indicators of mental well-being and ill-being when all three components of the 24-h movement guidelines were met. Movement compositions were more often associated with indicators of mental health for children and youth than adults. Beneficial associations were consistently observed for indicators of mental health when sedentary behavior was replaced with sleep or physical activity. Other analytic approaches indicated that engaging in healthier and more consistent patterns of movement behaviors (emphasizing adequate sleep, maximizing physical activity, minimizing sedentary behaviors) was associated with better mental health. Favorable associations were reported less often in longitudinal studies. Collectively, these findings provide further support for adopting an integrative whole day approach to promote mental well-being and prevent and manage mental ill-being over the status quo of focusing on these behaviors in isolation. This literature, however, is still emerging, for adults in particular, and more longitudinal work is required to make stronger inferences.
Affiliation(s)
- Claire I Groves
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Christopher Huong
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Carah D Porter
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Bryce Summerville
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Isabella Swafford
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Braden Witham
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Matt Hayward
  - Dolph Briscoe Jr Library, University of Texas Health San Antonio, San Antonio, TX, 78229, USA
- Matthew Y W Kwan
  - Department of Child and Youth Studies, Brock University, St. Catharines, ON, L2S 3A1, Canada
- Denver M Y Brown
  - Department of Psychology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
8
Nguyen T, Mengersen K, Sous D, Liquet B. SMOTE-CD: SMOTE for compositional data. PLoS One 2023; 18:e0287705. [PMID: 37384667 PMCID: PMC10309641 DOI: 10.1371/journal.pone.0287705] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/05/2023] [Accepted: 06/12/2023] [Indexed: 07/01/2023]
Abstract
Compositional data are a special kind of data, represented as proportions carrying relative information. Although this type of data is widespread, no solution exists to deal with cases where the classes are not well balanced. After describing compositional data imbalance, this paper proposes an adaptation of the original Synthetic Minority Oversampling TEchnique (SMOTE) to deal with compositional data imbalance. The new approach, called SMOTE for Compositional Data (SMOTE-CD), generates synthetic examples by computing a linear combination of selected existing data points, using compositional data operations. The performance of SMOTE-CD is tested with three different regressors (Gradient Boosting tree, Neural Networks, Dirichlet regressor) applied to two real datasets and to synthetically generated data, and the performance is evaluated using accuracy, cross-entropy, F1-score, R2 score and RMSE. The results show improvements across all metrics, but the impact of oversampling on performance varies depending on the model and the data. In some cases, oversampling may lead to a decrease in performance for the majority class. However, for the real data, the best performance across all models is achieved when oversampling is used. Notably, the F1-score is consistently increased with oversampling. Unlike the original technique, the performance is not improved when combining oversampling of the minority classes and undersampling of the majority class. The Python package smote-cd implements the method and is available online.
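The idea of a "linear combination using compositional data operations" can be illustrated with Aitchison-geometry perturbation and powering. This is a minimal sketch of that idea only: it does not use or reproduce the smote-cd package API, the function names are hypothetical, and the published algorithm's neighbour selection and other details are not shown.

```python
import numpy as np

def closure(x):
    """Rescale a positive vector so its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def aitchison_combination(x1, x2, lam):
    """Compositional analogue of lam * x1 + (1 - lam) * x2.

    Powering replaces scalar multiplication and perturbation replaces addition,
    which reduces to closure(x1**lam * x2**(1 - lam)).
    """
    x1, x2 = closure(x1), closure(x2)
    return closure(x1 ** lam * x2 ** (1.0 - lam))

# Two hypothetical minority-class compositions and a random mixing weight.
rng = np.random.default_rng(0)
a = [0.70, 0.20, 0.10]
b = [0.50, 0.30, 0.20]
lam = rng.uniform(0.0, 1.0)
synthetic = aitchison_combination(a, b, lam)
print(synthetic, synthetic.sum())
```

Unlike ordinary SMOTE interpolation applied to transformed coordinates and mapped back carelessly, this combination always yields strictly positive parts that sum to 1, so the synthetic point stays inside the simplex.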
Affiliation(s)
- Teo Nguyen
  - Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l’Adour, E2S UPPA, CNRS, Anglet, France
  - School of Mathematics and Physical Sciences, Macquarie University, Sydney, NSW, Australia
- Kerrie Mengersen
  - Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l’Adour, E2S UPPA, CNRS, Anglet, France
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia
- Damien Sous
  - Laboratoire des Sciences Pour l’ingénieur Appliquées à la Mécanique et au Génie Électrique, Université de Pau et des Pays de l’Adour, E2S UPPA, Anglet, France
  - Mediterranean Institute of Oceanography, Université de Toulon, Aix Marseille Université, CNRS, IRD, La Garde, France
- Benoit Liquet
  - Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l’Adour, E2S UPPA, CNRS, Anglet, France
  - School of Mathematics and Physical Sciences, Macquarie University, Sydney, NSW, Australia