1
Babič J, Kunavar T, Oztop E, Kawato M. Success-efficient/failure-safe strategy for hierarchical reinforcement motor learning. PLoS Comput Biol 2025; 21:e1013089. [PMID: 40344154 PMCID: PMC12121909 DOI: 10.1371/journal.pcbi.1013089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 05/29/2025] [Accepted: 04/23/2025] [Indexed: 05/11/2025] Open
Abstract
Our study explores how ecological aspects of motor learning enhance survival by improving movement efficiency and mitigating injury risks during task failures. Traditional motor control theories mainly address isolated body movements and often overlook these ecological factors. We introduce a novel computational motor control approach, incorporating ecological fitness and a strategy that alternates between success-driven movement efficiency and failure-driven safety, akin to win-stay/lose-shift tactics. In our experiments, participants performed squat-to-stand movements under novel force perturbations. They adapted effectively through various adaptive motor control mechanisms to avoid falls, reducing failure rates rapidly. The results indicate a high-level ecological controller in human motor learning that switches objectives between safety and movement efficiency, depending on failure or success. This approach is supported by policy learning, internal model adaptation, and adaptive feedback control. Our findings offer a comprehensive perspective on human motor control, integrating risk management in a hierarchical reinforcement learning framework for real-world environments.
Affiliation(s)
- Jan Babič
- Laboratory for Neuromechanics and Biorobotics, Department of Automatics, Biocybernetics and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia
- Tjasa Kunavar
- Laboratory for Neuromechanics and Biorobotics, Department of Automatics, Biocybernetics and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia
- Jožef Stefan International Postgraduate School, Ljubljana, Slovenia
- Erhan Oztop
- Ozyegin University, Istanbul, Turkiye
- Osaka University, Osaka, Japan
- Mitsuo Kawato
- Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
2
van der Kooij K, Smeets JBJ, van Mastrigt NM, van Wijk BCM. The sign of exploration during reward-based motor learning is not independent from trial to trial. Exp Brain Res 2025; 243:117. [PMID: 40232309 PMCID: PMC12000264 DOI: 10.1007/s00221-025-07074-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Accepted: 03/30/2025] [Indexed: 04/16/2025]
Abstract
Humans can learn various motor tasks based on binary reward feedback on whether a movement attempt was successful or not. Such 'reward-based motor learning' relies on exploiting successful motor commands and exploring different motor commands following failure. Most computational models of reward-based motor learning have formalized exploration as a random process, in which on each trial a random draw is taken from a normal distribution centred on zero. Whether human motor exploration is indeed random from trial to trial has not yet been tested. Here we tested in a force production task whether human motor exploration is random. To this end, we compared the proportion of trial-to-trial force changes in the behavioural data that have the same sign to the proportion expected in random exploration. One group of participants practiced with an adaptive reward criterion, which keeps rewarded performance close to current performance, and the other group practiced with a fixed reward criterion in which current performance can be far from rewarded performance. In both groups, we found a proportion of same-sign changes larger than predicted. In the Adaptive group, both the learning and the proportion of same-sign changes were consistent with model simulations for low values of random exploration, whereas in the Fixed group both the learning and the proportion of same-sign changes were inconsistent with model simulations based on random exploration. This suggests that some form of non-random motor exploration contributes to reward-based motor learning.
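The null comparison described in this abstract can be illustrated with a rough sketch. It simulates purely random exploration (an independent, zero-centred normal draw on every trial) and computes the proportion of consecutive trial-to-trial changes that share a sign. The function names, the noise level `sigma`, and the exact definition of a same-sign pair are assumptions made for illustration; this is not the authors' model, which also includes reward-dependent exploitation.

```python
import random


def simulate_random_exploration(n_trials, sigma, seed=0):
    """Hypothetical null model: each trial's force deviation is an
    independent draw from a normal distribution centred on zero."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_trials)]


def same_sign_proportion(values):
    """Proportion of consecutive trial-to-trial changes with the same sign.

    Pairs that contain a zero change are skipped (an assumption; the
    abstract does not say how ties were handled).
    """
    changes = [b - a for a, b in zip(values, values[1:])]
    pairs = [(c1, c2) for c1, c2 in zip(changes, changes[1:])
             if c1 != 0 and c2 != 0]
    if not pairs:
        raise ValueError("need at least two non-zero consecutive changes")
    same = sum(1 for c1, c2 in pairs if (c1 > 0) == (c2 > 0))
    return same / len(pairs)


if __name__ == "__main__":
    forces = simulate_random_exploration(n_trials=20000, sigma=1.0, seed=1)
    print(round(same_sign_proportion(forces), 3))
```

For independent draws from a continuous distribution, two consecutive changes share a sign only when three successive values are monotonically ordered, which happens for 2 of the 6 equally likely orderings. The expected proportion under this simplified null is therefore 1/3, and a larger observed proportion would point towards serially dependent exploration.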
Affiliation(s)
- Katinka van der Kooij
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, van der Boechorststraat 9, 1081BT, Amsterdam, The Netherlands.
- Jeroen B J Smeets
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, van der Boechorststraat 9, 1081BT, Amsterdam, The Netherlands
- Nina M van Mastrigt
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, van der Boechorststraat 9, 1081BT, Amsterdam, The Netherlands
- Department of Psychology, Justus-Liebig-Universität Gießen, Gießen, Germany
- Bernadette C M van Wijk
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, van der Boechorststraat 9, 1081BT, Amsterdam, The Netherlands
- Department of Neurology, Amsterdam Neuroscience, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
3
Konrad JD, Lohse KR, Marrus N, Lang CE. Trial-to-trial motor behavior during a reinforcement learning task in children ages 6 to 12. Hum Mov Sci 2025; 99:103317. [PMID: 39667095 DOI: 10.1016/j.humov.2024.103317] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/08/2024] [Revised: 11/14/2024] [Accepted: 12/04/2024] [Indexed: 12/14/2024]
Abstract
INTRODUCTION: During practice, learners use available feedback from one trial to develop and implement motor commands for the next trial. Unsuccessful trials (i.e., "misses") should be followed by different motor behavior (e.g., goal-directed changes and/or exploration of movement parameters), while successful trials (i.e., "hits") should maintain the same behavior (e.g., minimize variance and recapitulate the same motor plan to the best of one's ability). Measuring the trial-to-trial changes in motor behavior can provide insights into how the motor system uses feedback and regulates movement variability while trying to improve performance. There have been no reports on the trial-to-trial motor behavior of typically developing children despite the profound motor development that occurs in this period and its relevance to long-term functional outcomes. METHODS: We recruited 72 typically developing children from ages 6 to 12 to perform a reinforcement learning beanbag toss to a target. Their target errors were used to examine their motor exploration and autocorrelation. RESULTS: Comparing variability at different trial-to-trial intervals showed that children exhibit motor exploration above and beyond the effect of sampling bias. Mean autocorrelations of different lags were near zero, suggesting that successive trials were largely unrelated. CONCLUSION: We found evidence that children utilize motor exploration in the target space of a target throwing task. After failed trials they exhibited increased variability to search for more optimal motor solutions. After successes, they minimized variability to create the same successful performance.
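The autocorrelation measure mentioned in this abstract can be sketched as follows. This is a minimal illustration of a standard sample autocorrelation at a given lag, applied to a series of target errors. The function name and the example values are hypothetical, and the estimator actually used in the study may differ.

```python
def autocorrelation(errors, lag):
    """Sample autocorrelation of a trial series at a given lag."""
    n = len(errors)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(errors)")
    mean = sum(errors) / n
    denom = sum((e - mean) ** 2 for e in errors)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Sum of products of mean-centred values that are `lag` trials apart.
    numer = sum((errors[i] - mean) * (errors[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom


if __name__ == "__main__":
    # Made-up target errors in cm, for illustration only (not study data).
    toss_errors = [12.0, -5.0, 3.0, -8.0, 6.0, -2.0, 9.0, -4.0]
    for lag in (1, 2, 3):
        print(lag, round(autocorrelation(toss_errors, lag), 3))
```

Values near zero at every lag would be consistent with successive trials being largely unrelated, whereas clearly positive values at short lags would indicate that errors carry over from one trial to the next.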
Affiliation(s)
- Jeffrey D Konrad
- Program in Physical Therapy, Washington University School of Medicine, CB 8502, 4444 Forest Park Ave., Suite 1101, St. Louis, MO 63108-2212, United States
- Keith R Lohse
- Program in Physical Therapy, Washington University School of Medicine, CB 8502, 4444 Forest Park Ave., Suite 1101, St. Louis, MO 63108-2212, United States; Department of Neurology, Washington University School of Medicine, MSC 8111-29-9000, 660 S. Euclid Ave., St. Louis, MO 63110, United States
- Natasha Marrus
- Department of Psychiatry, Washington University School of Medicine, CB 8509, 660 South Euclid Ave., St. Louis, MO 63110, United States
- Catherine E Lang
- Program in Physical Therapy, Washington University School of Medicine, CB 8502, 4444 Forest Park Ave., Suite 1101, St. Louis, MO 63108-2212, United States; Program in Occupational Therapy, Washington University School of Medicine, MSC 8505-66-1, 4444 Forest Park Avenue, St. Louis, MO 63108-2212, United States; Department of Neurology, Washington University School of Medicine, MSC 8111-29-9000, 660 S. Euclid Ave., St. Louis, MO 63110, United States.
4
van Duijnhoven E, van der Kooij K, Vlot E, Brehm MA, Waterval NFJ. Adaptation of functional gait parameters to a newly provided stiffness-optimized ankle-foot orthosis. Clin Biomech (Bristol, Avon) 2025; 122:106428. [PMID: 39732035 DOI: 10.1016/j.clinbiomech.2024.106428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 12/20/2024] [Accepted: 12/23/2024] [Indexed: 12/30/2024]
Abstract
BACKGROUND: Clinical decisions regarding ankle-foot-orthosis stiffness in people with calf muscle weakness are based on immediate evaluations, not taking gait adaptation into account. This study examined adaptation of step length, walking speed and energy cost of walking in the 3-months post-provision and whether individuals with higher gait variability adapt more compared to individuals with lower gait variability. METHODS: We conducted a post-hoc analysis in eighteen stiffness-optimized ankle-foot-orthosis users with bilateral calf muscle weakness. Gait biomechanics, step length, walking speed and walking energy cost directly after provision (T1) and 3-months post-provision of the ankle-foot-orthosis (T2) were compared using paired sampled t-tests. Based on gait variability scores at T1, a high and low gait variability group was determined, and change scores in the functional gait parameters were compared using non-parametric independent sampled t-tests. A significance level of p < 0.1 was used. FINDINGS: No significant differences in step length, walking speed and energy cost of walking between T1 and T2 were found (p > 0.20). Step length increased more in people with high gait variability scores at T1 compared to those with low gait variability scores (High: +3.1 [-3.2 - +6.9], Low: +0.2 [-6.8 - +3.7] cm, p = 0.085), while no differences between groups were found for walking speed and energy cost of walking (p > 0.129). INTERPRETATION: After provision of stiffness-optimized ankle-foot-orthoses in people with bilateral calf muscle weakness, no functional gait adaptations were found. However, people demonstrating high gait variability increased step length more compared to those demonstrating lower variability, which might be an indication that variability plays a role in adaptation.
Affiliation(s)
- Elza van Duijnhoven
- Amsterdam UMC location University of Amsterdam, Rehabilitation Medicine, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Movement Sciences, Rehabilitation and Development, Amsterdam, the Netherlands
- Katinka van der Kooij
- Vrije Universiteit Amsterdam, Faculty of Behavioural and Movement Sciences, Department of Human Movement Sciences, the Netherlands
- Esther Vlot
- Vrije Universiteit Amsterdam, Faculty of Behavioural and Movement Sciences, Department of Human Movement Sciences, the Netherlands
- Merel-Anne Brehm
- Amsterdam UMC location University of Amsterdam, Rehabilitation Medicine, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Movement Sciences, Rehabilitation and Development, Amsterdam, the Netherlands
- Niels F J Waterval
- Amsterdam UMC location University of Amsterdam, Rehabilitation Medicine, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Movement Sciences, Rehabilitation and Development, Amsterdam, the Netherlands.
5
López-Fernández M, Sabido R, Caballero C, Moreno FJ. Relationship between initial motor variability and learning and adaptive ability. A systematic review. Neuroscience 2025; 565:301-311. [PMID: 39547333 DOI: 10.1016/j.neuroscience.2024.10.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 10/05/2024] [Accepted: 10/29/2024] [Indexed: 11/17/2024]
Abstract
Motor variability is an intrinsic feature of human beings that has been associated with the ability for learning and adaptation to specific tasks. The purpose of this review is to examine whether there is a direct relationship between individuals' initial variability and their ability for learning and adaptation in motor tasks. Eighteen articles examined the relationship between initial motor variability and the ability for learning or adaptation. Twelve found a direct relationship. In reward-based tasks, greater initial variability was associated with greater learning and adaptation improvement when assessed using linear measures of dispersion; however, this association was not observed with temporal structure variability. In error-based tasks, by contrast, associations were reported with both a greater amount of variability and a more complex temporal structure. Nevertheless, bias in initial performance related to the amount of variability was found, so the temporal structure of initial variability seems to be a better indicator of improvement in this type of task. Further research is needed to better understand the potential relationship between initial motor variability and the ability for learning or adaptation in motor tasks.
Affiliation(s)
- Miguel López-Fernández
- Sport Research Centre, Department of Sport Science, Miguel Hernández University, Elche, Spain.
- Rafael Sabido
- Sport Research Centre, Department of Sport Science, Miguel Hernández University, Elche, Spain
- Carla Caballero
- Sport Research Centre, Department of Sport Science, Miguel Hernández University, Elche, Spain; Neurosciences Research Group, Alicante Institute for Health and Biomedical Research (ISABIAL), Alicante, Spain
- Francisco J Moreno
- Sport Research Centre, Department of Sport Science, Miguel Hernández University, Elche, Spain
6
Roth AM, Buggeln JH, Hoh JE, Wood JM, Sullivan SR, Ngo TT, Calalo JA, Lokesh R, Morton SM, Grill S, Jeka JJ, Carter MJ, Cashaback JGA. Roles and interplay of reinforcement-based and error-based processes during reaching and gait in neurotypical adults and individuals with Parkinson's disease. PLoS Comput Biol 2024; 20:e1012474. [PMID: 39401183 PMCID: PMC11472932 DOI: 10.1371/journal.pcbi.1012474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 09/11/2024] [Indexed: 10/17/2024] Open
Abstract
From a game of darts to neurorehabilitation, the ability to explore and fine tune our movements is critical for success. Past work has shown that exploratory motor behaviour in response to reinforcement (reward) feedback is closely linked with the basal ganglia, while movement corrections in response to error feedback are commonly attributed to the cerebellum. While our past work has shown these processes are dissociable during adaptation, it is unknown how they uniquely impact exploratory behaviour. Moreover, converging neuroanatomical evidence shows direct and indirect connections between the basal ganglia and cerebellum, suggesting that there is an interaction between reinforcement-based and error-based neural processes. Here we examine the unique roles and interaction between reinforcement-based and error-based processes on sensorimotor exploration in a neurotypical population. We also recruited individuals with Parkinson's disease to gain mechanistic insight into the role of the basal ganglia and associated reinforcement pathways in sensorimotor exploration. Across three reaching experiments, participants were given either reinforcement feedback, error feedback, or simultaneously both reinforcement & error feedback during a sensorimotor task that encouraged exploration. Our reaching results, a re-analysis of a previous gait experiment, and our model suggest that in isolation, reinforcement-based and error-based processes respectively boost and suppress exploration. When acting in concert, we found that reinforcement-based and error-based processes interact by mutually opposing one another. Finally, we found that those with Parkinson's disease had decreased exploration when receiving reinforcement feedback, supporting the notion that compromised reinforcement-based processes reduce the ability to explore new motor actions. Understanding the unique and interacting roles of reinforcement-based and error-based processes may help to inform neurorehabilitation paradigms where it is important to discover new and successful motor actions.
Affiliation(s)
- Adam M. Roth
- Department of Mechanical Engineering, University of Delaware, Newark, Delaware, United States of America
- John H. Buggeln
- Department of Biomedical Engineering, University of Delaware, Newark, Delaware, United States of America
- Joanna E. Hoh
- Kinesiology and Applied Physiology, University of Delaware, Newark, Delaware, United States of America
- Biomechanics and Movement Science Program, University of Delaware, Newark, Delaware, United States of America
- Jonathan M. Wood
- Biomechanics and Movement Science Program, University of Delaware, Newark, Delaware, United States of America
- Department of Physical Therapy, University of Delaware, Newark, Delaware, United States of America
- Interdisciplinary Neuroscience Graduate Program, University of Delaware, Newark, Delaware, United States of America
- Seth R. Sullivan
- Department of Biomedical Engineering, University of Delaware, Newark, Delaware, United States of America
- Truc T. Ngo
- Department of Biomedical Engineering, University of Delaware, Newark, Delaware, United States of America
- Jan A. Calalo
- Department of Mechanical Engineering, University of Delaware, Newark, Delaware, United States of America
- Rakshith Lokesh
- Department of Biomedical Engineering, University of Delaware, Newark, Delaware, United States of America
- Susanne M. Morton
- Biomechanics and Movement Science Program, University of Delaware, Newark, Delaware, United States of America
- Department of Physical Therapy, University of Delaware, Newark, Delaware, United States of America
- Interdisciplinary Neuroscience Graduate Program, University of Delaware, Newark, Delaware, United States of America
- Stephen Grill
- Kinesiology and Applied Physiology, University of Delaware, Newark, Delaware, United States of America
- Johns Hopkins Regional Physicians, Johns Hopkins University, Baltimore, Maryland, United States of America
- John J. Jeka
- Kinesiology and Applied Physiology, University of Delaware, Newark, Delaware, United States of America
- Biomechanics and Movement Science Program, University of Delaware, Newark, Delaware, United States of America
- Interdisciplinary Neuroscience Graduate Program, University of Delaware, Newark, Delaware, United States of America
- Michael J. Carter
- Department of Kinesiology, McMaster University, Hamilton, Ontario, Canada
- Joshua G. A. Cashaback
- Department of Mechanical Engineering, University of Delaware, Newark, Delaware, United States of America
- Department of Biomedical Engineering, University of Delaware, Newark, Delaware, United States of America
- Kinesiology and Applied Physiology, University of Delaware, Newark, Delaware, United States of America
- Biomechanics and Movement Science Program, University of Delaware, Newark, Delaware, United States of America
- Interdisciplinary Neuroscience Graduate Program, University of Delaware, Newark, Delaware, United States of America
7
Chen Y, Abram SJ, Ivry RB, Tsay JS. Motor adaptation is reduced by symbolic compared to sensory feedback. bioRxiv 2024:2024.06.28.601293. [PMID: 39005305 PMCID: PMC11244888 DOI: 10.1101/2024.06.28.601293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Motor adaptation - the process of reducing motor errors through feedback and practice - is an essential feature of human competence, allowing us to move accurately in dynamic and novel environments. Adaptation typically results from sensory feedback, with most learning driven by visual and proprioceptive feedback that arises with the movement. In humans, motor adaptation can also be driven by symbolic feedback. In the present study, we examine how implicit and explicit components of motor adaptation are modulated by symbolic feedback. We conducted three reaching experiments involving over 400 human participants to compare sensory and symbolic feedback using a task in which both types of learning processes could be operative (Experiment 1) or tasks in which learning was expected to be limited to only an explicit process (Experiments 2 and 3). Adaptation with symbolic feedback was dominated by explicit strategy use, with minimal evidence of implicit recalibration. Even when matched in terms of information content, adaptation to rotational and mirror reversal perturbations was slower in response to symbolic feedback compared to sensory feedback. Our results suggest that the abstract and indirect nature of symbolic feedback disrupts strategic reasoning and/or refinement, deepening our understanding of how feedback type influences the mechanisms of sensorimotor learning.
Affiliation(s)
- Yifei Chen
- Department of Psychology, University of California, Berkeley
- Helen Wills Neuroscience Institute, University of California, Berkeley
- Sabrina J. Abram
- Department of Psychology, University of California, Berkeley
- Helen Wills Neuroscience Institute, University of California, Berkeley
- Richard B. Ivry
- Department of Psychology, University of California, Berkeley
- Helen Wills Neuroscience Institute, University of California, Berkeley
8
Banca P, Herrojo Ruiz M, Gonzalez-Zalba MF, Biria M, Marzuki AA, Piercy T, Sule A, Fineberg NA, Robbins TW. Action sequence learning, habits, and automaticity in obsessive-compulsive disorder. eLife 2024; 12:RP87346. [PMID: 38722306 PMCID: PMC11081634 DOI: 10.7554/elife.87346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2024] Open
Abstract
This study investigates the goal/habit imbalance theory of compulsion in obsessive-compulsive disorder (OCD), which postulates enhanced habit formation, increased automaticity, and impaired goal/habit arbitration. It directly tests these hypotheses using newly developed behavioral tasks. First, OCD patients and healthy participants were trained daily for a month using a smartphone app to perform chunked action sequences. Despite similar procedural learning and attainment of habitual performance (measured by an objective automaticity criterion) by both groups, OCD patients self-reported higher subjective habitual tendencies via a recently developed questionnaire. Subsequently, in a re-evaluation task assessing choices between established automatic and novel goal-directed actions, both groups were sensitive to re-evaluation based on monetary feedback. However, OCD patients, especially those with higher compulsive symptoms and habitual tendencies, showed a clear preference for trained/habitual sequences when choices were based on physical effort, possibly due to their higher attributed intrinsic value. These patients also used the habit-training app more extensively and reported symptom relief post-study. The tendency to attribute higher intrinsic value to familiar actions may be a potential mechanism leading to compulsions and an important addition to the goal/habit imbalance hypothesis in OCD. We also highlight the potential of smartphone app training as a habit reversal therapeutic tool.
Affiliation(s)
- Paula Banca
- Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
- Maria Herrojo Ruiz
- Department of Psychology, Goldsmiths University of London, London, United Kingdom
- Marjan Biria
- Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
- Aleya A Marzuki
- Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
- Thomas Piercy
- Department of Psychiatry, School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom
- Akeem Sule
- Department of Psychiatry, School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom
- Naomi A Fineberg
- Hertfordshire Partnership University NHS Foundation Trust, Welwyn Garden City, United Kingdom
- University of Hertfordshire, Hatfield, United Kingdom
- Trevor W Robbins
- Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
9
Roth AM, Lokesh R, Tang J, Buggeln JH, Smith C, Calalo JA, Sullivan SR, Ngo T, Germain LS, Carter MJ, Cashaback JGA. Punishment Leads to Greater Sensorimotor Learning But Less Movement Variability Compared to Reward. Neuroscience 2024; 540:12-26. [PMID: 38220127 PMCID: PMC10922623 DOI: 10.1016/j.neuroscience.2024.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 01/05/2024] [Accepted: 01/09/2024] [Indexed: 01/16/2024]
Abstract
When a musician practices a new song, hitting a correct note sounds pleasant while striking an incorrect note sounds unpleasant. Such reward and punishment feedback has been shown to differentially influence the ability to learn a new motor skill. Recent work has suggested that punishment leads to greater movement variability, which causes greater exploration and faster learning. To further test this idea, we recruited 102 participants over two experiments. Unlike previous work, in Experiment 1 we found that punishment did not lead to faster learning compared to reward (n = 68), but did lead to a greater extent of learning. Surprisingly, we also found evidence to suggest that punishment led to less movement variability, which was related to the extent of learning. We then designed a second experiment that did not involve adaptation, allowing us to further isolate the influence of punishment feedback on movement variability. In Experiment 2, we again found that punishment led to significantly less movement variability compared to reward (n = 34). Collectively, our results suggest that punishment feedback leads to less movement variability. Future work should investigate whether punishment feedback leads to a greater knowledge of movement variability and/or increases the sensitivity of updating motor actions.
Affiliation(s)
- Adam M Roth
- Department of Mechanical Engineering, University of Delaware, United States
- Rakshith Lokesh
- Department of Biomedical Engineering, University of Delaware, United States
- Jiaqiao Tang
- Department of Kinesiology, McMaster University, Canada
- John H Buggeln
- Department of Biomedical Engineering, University of Delaware, United States
- Carly Smith
- Department of Biomedical Engineering, University of Delaware, United States
- Jan A Calalo
- Department of Mechanical Engineering, University of Delaware, United States
- Seth R Sullivan
- Department of Biomedical Engineering, University of Delaware, United States
- Truc Ngo
- Department of Biomedical Engineering, University of Delaware, United States
- Joshua G A Cashaback
- Department of Mechanical Engineering, University of Delaware, United States; Department of Biomedical Engineering, University of Delaware, United States; Kinesiology and Applied Physiology, University of Delaware, United States; Interdisciplinary Neuroscience Graduate Program, University of Delaware, United States; Biomechanics and Movement Science Program, University of Delaware, United States; Department of Kinesiology, McMaster University, Canada.
10
Wood JM, Kim HE, Morton SM. Reinforcement Learning during Locomotion. eNeuro 2024; 11:ENEURO.0383-23.2024. [PMID: 38438263 PMCID: PMC10946027 DOI: 10.1523/eneuro.0383-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 02/20/2024] [Accepted: 02/23/2024] [Indexed: 03/06/2024] Open
Abstract
When learning a new motor skill, people often must use trial and error to discover which movement is best. In the reinforcement learning framework, this concept is known as exploration and has been linked to increased movement variability in motor tasks. For locomotor tasks, however, increased variability decreases upright stability. As such, exploration during gait may jeopardize balance and safety, making reinforcement learning less effective. Therefore, we set out to determine if humans could acquire and retain a novel locomotor pattern using reinforcement learning alone. Young healthy male and female participants walked on a treadmill and were provided with binary reward feedback (indicated by a green checkmark on the screen) that was tied to a fixed monetary bonus, to learn a novel stepping pattern. We also recruited a comparison group who walked with the same novel stepping pattern but did so by correcting for target error, induced by providing real-time veridical visual feedback of steps and a target. In two experiments, we compared learning, motor variability, and two forms of motor memories between the groups. We found that individuals in the binary reward group did, in fact, acquire the new walking pattern by exploring (increasing motor variability). Additionally, while reinforcement learning did not increase implicit motor memories, it resulted in more accurate explicit motor memories compared with the target error group. Overall, these results demonstrate that humans can acquire new walking patterns with reinforcement learning and retain much of the learning over 24 h.
Affiliation(s)
- Jonathan M Wood
- Department of Physical Therapy, University of Delaware, Newark, Delaware 19713
- Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713
- Hyosub E Kim
- Department of Physical Therapy, University of Delaware, Newark, Delaware 19713
- Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713
- Department of Psychological and Brain Sciences, University of Delaware, Newark, Delaware 19716
- School of Kinesiology, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
| | - Susanne M Morton
- Department of Physical Therapy, University of Delaware, Newark, Delaware 19713
- Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713
|
11
|
Roth AM, Calalo JA, Lokesh R, Sullivan SR, Grill S, Jeka JJ, van der Kooij K, Carter MJ, Cashaback JGA. Reinforcement-based processes actively regulate motor exploration along redundant solution manifolds. Proc Biol Sci 2023; 290:20231475. [PMID: 37848061 PMCID: PMC10581769 DOI: 10.1098/rspb.2023.1475] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/29/2023] [Accepted: 09/06/2023] [Indexed: 10/19/2023] Open
Abstract
From a baby's babbling to a songbird practising a new tune, exploration is critical to motor learning. A hallmark of exploration is the emergence of random walk behaviour along solution manifolds, where successive motor actions are not independent but rather become serially dependent. Such exploratory random walk behaviour is ubiquitous across species' neural firing, gait patterns and reaching behaviour. Past work has suggested that exploratory random walk behaviour arises from an accumulation of movement variability and a lack of error-based corrections. Here, we test a fundamentally different idea: that reinforcement-based processes regulate random walk behaviour to promote continual motor exploration to maximize success. Across three human reaching experiments, we manipulated the size of both the visually displayed target and an unseen reward zone, as well as the probability of reinforcement feedback. Our empirical and modelling results parsimoniously support the notion that exploratory random walk behaviour emerges by utilizing knowledge of movement variability to update intended reach aim towards recently reinforced motor actions. This mechanism leads to active and continuous exploration of the solution manifold, which prominent theories currently regard as arising passively. The ability to continually explore muscle, joint and task redundant solution manifolds is beneficial while acting in uncertain environments, during motor development or when recovering from a neurological disorder to discover and learn new motor actions.
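The update rule described in this abstract (shifting the intended reach aim towards recently reinforced actions) can be illustrated with a minimal one-dimensional sketch. This is not the authors' model: the function name `simulate_reaches`, the parameters `learning_rate`, `noise_sd` and `reward_half_width`, and the reward zone centred on zero are all assumptions made for illustration.

```python
import random

def simulate_reaches(n_trials, reward_half_width, learning_rate, noise_sd, seed=0):
    """Simulate reaches whose aim moves toward rewarded actions (illustrative only).

    All parameter names are hypothetical; the reward zone is assumed to be
    centred on 0 along a single movement dimension.
    """
    rng = random.Random(seed)
    aim = 0.0
    reaches, aims = [], []
    for _ in range(n_trials):
        # Executed reach = intended aim + movement variability.
        reach = aim + rng.gauss(0.0, noise_sd)
        rewarded = abs(reach) <= reward_half_width
        if rewarded:
            # Shift the aim part of the way toward the reinforced reach.
            aim += learning_rate * (reach - aim)
        reaches.append(reach)
        aims.append(aim)
    return reaches, aims

if __name__ == "__main__":
    reaches, aims = simulate_reaches(200, reward_half_width=2.0,
                                     learning_rate=0.5, noise_sd=0.3)
    print(f"final aim: {aims[-1]:.3f}")
```

With `learning_rate` set to 0 the aim never moves, so successive reaches are independent. With a positive `learning_rate`, each rewarded reach carries over into the next aim, which makes successive reaches serially dependent, the random walk signature the abstract refers to.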
Affiliation(s)
- Adam M. Roth
- Department of Mechanical Engineering, University of Delaware, Newark, DE 19716, USA
| | - Jan A. Calalo
- Department of Mechanical Engineering, University of Delaware, Newark, DE 19716, USA
| | - Rakshith Lokesh
- Department of Biomedical Engineering, University of Delaware, Newark, DE 19716, USA
| | - Seth R. Sullivan
- Department of Biomedical Engineering, University of Delaware, Newark, DE 19716, USA
| | - Stephen Grill
- Kinesiology and Applied Physiology, University of Delaware, Newark, DE 19716, USA
| | - John J. Jeka
- Kinesiology and Applied Physiology, University of Delaware, Newark, DE 19716, USA
- Interdisciplinary Neuroscience Graduate Program, University of Delaware, Newark, DE 19716, USA
- Biomechanics and Movement Science Program, University of Delaware, Newark, DE 19716, USA
| | - Katinka van der Kooij
- Faculty of Behavioural and Movement Science, Vrije University Amsterdam, Amsterdam, 1081HV, The Netherlands
| | - Michael J. Carter
- Department of Kinesiology, McMaster University, Room 203, Ivor Wynne Centre, Hamilton, L8S 4L8, Ontario, Canada
| | - Joshua G. A. Cashaback
- Department of Mechanical Engineering, University of Delaware, Newark, DE 19716, USA
- Department of Biomedical Engineering, University of Delaware, Newark, DE 19716, USA
- Kinesiology and Applied Physiology, University of Delaware, Newark, DE 19716, USA
- Interdisciplinary Neuroscience Graduate Program, University of Delaware, Newark, DE 19716, USA
- Biomechanics and Movement Science Program, University of Delaware, Newark, DE 19716, USA
|
12
|
Failure induces task-irrelevant exploration during a stencil task. Exp Brain Res 2023; 241:677-686. [PMID: 36658441 PMCID: PMC9852808 DOI: 10.1007/s00221-023-06548-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/20/2022] [Accepted: 01/04/2023] [Indexed: 01/21/2023]
Abstract
During reward-based motor tasks, performance failure leads to an increase in movement variability along task-relevant dimensions. These increases in movement variability are indicative of exploratory behaviour in search of a better, more successful motor action. It is unclear whether failure also induces exploration along task-irrelevant dimensions that do not influence performance. In this study, we ask whether participants would explore the task-irrelevant dimension while they performed a stencil task. With a stylus, participants applied downward, normal force that influenced whether they received reward (task-relevant) as they simultaneously made erasing-like movement patterns along the tablet that did not influence performance (task-irrelevant). In this task, the movement pattern was analyzed as the distribution of movement directions within a movement. The results showed significant exploration of task-relevant force and task-irrelevant movement patterns. We conclude that failure can induce additional movement variability along a task-irrelevant dimension.
|
13
|
Shaping Exploration: How Does the Constraint-Induced Movement Therapy Helps Patients Finding a New Movement Solution. J Funct Morphol Kinesiol 2022; 8:jfmk8010004. [PMID: 36648896 PMCID: PMC9844369 DOI: 10.3390/jfmk8010004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 12/15/2022] [Accepted: 12/15/2022] [Indexed: 12/24/2022] Open
Abstract
Despite the relative success of constraint-induced movement therapy in the recovery of injury-/trauma-related populations, the mechanisms by which it promotes its results are still unknown. From a dynamical systems approach, we investigated whether the induced exploratory patterns within and between trials during an exercise in Shaping (the therapy's practice) could shed light on this process. We analyzed data from four chronic spinal-cord injury patients during a task of placing and removing their feet from a step. We assessed the within and between trial dynamics through recurrent quantification analyses and task-space analyses, respectively. From our results, individuals found movement patterns directed to modulate foot height (to accomplish the task). Additionally, when the task was manipulated (increasing step height), individuals increased coupling and coupling variability in the ankle, hip, and knee over trials. This pattern of findings is in consonance with the idea of Shaping inducing exploration of different movements. Such exploration might be an important factor affording the positive changes observed in the literature.
|
14
|
Wiegel P, Elizabeth Spedden M, Ramsenthaler C, Malling Beck M, Lundbye-Jensen J. Trial-to-trial Variability and Cortical Processing Depend on Recent Outcomes During Human Reinforcement Motor Learning. Neuroscience 2022; 501:85-102. [PMID: 35970424 DOI: 10.1016/j.neuroscience.2022.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Revised: 07/22/2022] [Accepted: 08/09/2022] [Indexed: 10/15/2022]
Abstract
The history of our actions and their outcomes represent important information, informing choices and efficiently guiding future behavior. While unsuccessful (S-) outcomes are expected to lead to more explorative motor states and increased behavioral variability, successful (S+) outcomes are expected to reinforce the use of the previous action. Here, we show that humans attribute different values to previous actions during reinforcement motor learning when they experience S- compared to S+ outcomes. Behavioral variability after an S- outcome is influenced more by the previous outcome than after S+ outcomes. Using electroencephalography, we show that theta band oscillations of the prefrontal cortex are most prominent during changes in two consecutive outcomes, potentially reflecting the need for enhanced cognitive control. Our results suggest that S+ experiences 'overwrite' previous motor states to a greater extent than S- experiences and that modulations in neural oscillations in the prefrontal cortex play a potential role in encoding changes in movement variability state during reinforcement motor learning.
Affiliation(s)
- Patrick Wiegel
- Movement & Neuroscience, Department of Nutrition, Exercise & Sports, University of Copenhagen, Denmark.
| | - Meaghan Elizabeth Spedden
- Movement & Neuroscience, Department of Nutrition, Exercise & Sports, University of Copenhagen, Denmark
| | - Christina Ramsenthaler
- Clinic for Palliative Care, University Medical Center Freiburg, Freiburg, Germany; Wolfson Palliative Care Research Centre, Hull & York Medical School, University of Hull, Hull, UK; Cicely Saunders Institute, Department of Palliative Care, Policy & Rehabilitation, King's College London, London, UK
| | - Mikkel Malling Beck
- Movement & Neuroscience, Department of Nutrition, Exercise & Sports, University of Copenhagen, Denmark
| | - Jesper Lundbye-Jensen
- Movement & Neuroscience, Department of Nutrition, Exercise & Sports, University of Copenhagen, Denmark
|
15
|
Friedman J, Amiaz A, Korman M. The online and offline effects of changing movement timing variability during training on a finger-opposition task. Sci Rep 2022; 12:13319. [PMID: 35922460 PMCID: PMC9349301 DOI: 10.1038/s41598-022-16335-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Accepted: 07/08/2022] [Indexed: 11/09/2022] Open
Abstract
In motor learning tasks, there is mixed evidence for whether increased task-relevant variability in early learning stages leads to improved outcomes. One problem is that there may be a connection between skill level and motor variability, such that participants who initially have more variability may also perform worse on the task, so will have more room to improve. To avoid this confound, we experimentally manipulated the amount of movement timing variability (MTV) during training to test whether it improves performance. Based on previous studies showing that most of the improvement in finger-opposition tasks comes from optimizing the relative onset time of the finger movements, we used auditory cues (beeps) to guide the onset times of sequential movements during a training session, and then assessed motor performance after the intervention. Participants were assigned to three groups that either: (a) followed a prescribed random rhythm for their finger touches (Variable MTV), (b) followed a fixed rhythm (Fixed control MTV), or (c) produced the entire sequence following a single beep (Unsupervised control MTV). While the intervention was successful in increasing MTV during training for the Variable group, it did not lead to improved outcomes post-training compared to either control group, and the use of fixed timing led to significantly worse performance compared to the Unsupervised control group. These results suggest that manipulating MTV through auditory cues does not produce greater learning than unconstrained training in motor sequence tasks.
Affiliation(s)
- Jason Friedman
- Department of Physical Therapy, School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
| | - Assaf Amiaz
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
| | - Maria Korman
- Department of Occupational Therapy, Faculty of Health Sciences, Ariel University, Ariel, Israel
|
16
|
Lafe CW, Newell KM. Instructions on Task Constraints Mediate Perceptual-Motor Search and How Movement Variability Relates to Performance Outcome. J Mot Behav 2022; 54:669-685. [PMID: 35440288 DOI: 10.1080/00222895.2022.2063787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Movement variability has been postulated to afford perception of the perceptual-motor workspace and to be directly linked to improved performance. Here, we investigated how instructions mediate the search process and the relation between performance outcome and movement variability. We used a novel bimanual force tracking task where zero error was achieved via proportional output between the hands. Participants were either instructed or not as to how to coordinate their force output to achieve this goal, but the goal to minimize error was explained to all participants. The provision of instructions restricted the overall area of the task space that was searched. Moreover, the time-dependent properties of the search were influenced: instructions increased the likelihood that participants would produce a higher force level over practice. Multiple regression revealed that variability was positively correlated with performance outcome, but the strength of this relation was dependent on instructions and individual search strategies. The findings are consistent with the view that information through instructions shapes individual emergent perceptual-motor search strategies that in turn mediate how movement variability relates to performance outcome.
Affiliation(s)
- Charley W Lafe
- Department of Kinesiology, University of Georgia, Athens, GA, USA
| | - Karl M Newell
- Department of Kinesiology, University of Georgia, Athens, GA, USA
|
17
|
Brummelman E, Grapsas S, van der Kooij K. Parental praise and children's exploration: a virtual reality experiment. Sci Rep 2022; 12:4967. [PMID: 35322062 PMCID: PMC8943146 DOI: 10.1038/s41598-022-08226-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2021] [Accepted: 02/28/2022] [Indexed: 11/09/2022] Open
Abstract
When children practice a new skill and fail, it is critical for them to explore new strategies to succeed. How can parents encourage children's exploration? Bridging insights from developmental psychology and the neuroscience of motor control, we examined the effects of parental praise on children's motor exploration. We theorize that modest praise can spark exploration. Unlike inflated praise, modest praise acknowledges children's performance, without setting a high standard for future performance. This may be reassuring to children with lower levels of self-esteem, who often doubt their ability. We conducted a novel virtual-reality experiment. Children (N = 202, ages 8-12) reported self-esteem and performed a virtual-reality 3D trajectory-matching task, with success/failure feedback after each trial. Children received modest praise ("You did well!"), inflated praise ("You did incredibly well!"), or no praise from their parent. We measured motor exploration as children's tendency to vary their movements following failure. Relative to no praise, modest praise (unlike inflated praise) encouraged exploration in children with lower levels of self-esteem. By contrast, modest praise discouraged exploration in children with higher levels of self-esteem. Effects were small yet robust. This experiment demonstrates that modest praise can spark exploration in children with lower levels of self-esteem.
Affiliation(s)
- Eddie Brummelman
- Research Institute of Child Development and Education, University of Amsterdam, P.O. Box 15780, 1001 NG, Amsterdam, The Netherlands.
|
18
|
Sidarta A, Komar J, Ostry DJ. Clustering analysis of movement kinematics in reinforcement learning. J Neurophysiol 2021; 127:341-353. [PMID: 34936514 PMCID: PMC8816628 DOI: 10.1152/jn.00229.2021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/22/2022] Open
Abstract
Reinforcement learning has been used as an experimental model of motor skill acquisition, where at times movements are successful and thus reinforced. One fundamental problem is to understand how humans select exploration over exploitation during learning. The decision could be influenced by factors such as task demands and reward availability. In this study, we applied a clustering algorithm to examine how a change in the accuracy requirements of a task affected the choice of exploration over exploitation. Participants made reaching movements to an unseen target using a planar robot arm and received reward after each successful movement. For one group of participants, the width of the hidden target decreased after every other training block. For a second group, it remained constant. The clustering algorithm was applied to the kinematic data to characterize motor learning on a trial-to-trial basis as a sequence of movements, each belonging to one of the identified clusters. By the end of learning, movement trajectories across all participants converged primarily to a single cluster with the greatest number of successful trials. Within this analysis framework, we defined exploration and exploitation as types of behavior in which two successive trajectories belong to different or similar clusters, respectively. The frequency of each mode of behavior was evaluated over the course of learning. It was found that by reducing the target width, participants used a greater variety of different clusters and displayed more exploration than exploitation. Excessive exploration relative to exploitation was found to be detrimental to subsequent motor learning.
NEW & NOTEWORTHY The choice of exploration versus exploitation is a fundamental problem in learning new motor skills through reinforcement. In this study, we employed a data-driven approach to characterize movements on a trial-by-trial basis with an unsupervised clustering algorithm. Using this technique, we found that changes in task demands and, in particular, in the required accuracy of movements, influenced the ratio of exploration to exploitation. This analysis framework provides an attractive tool to investigate mechanisms of explorative and exploitative behavior while studying motor learning.
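The definition used in this abstract (exploration when two successive trajectories fall in different clusters, exploitation otherwise) can be sketched as follows once each trial already has a cluster label. The clustering step itself is not shown, "similar" is assumed here to mean "the same cluster", and the function names are hypothetical rather than taken from the paper.

```python
def classify_transitions(cluster_labels):
    """Label each pair of successive trials as 'explore' or 'exploit'.

    Assumption: a transition is exploitation only when both trials carry
    the same cluster label; any change of cluster counts as exploration.
    """
    return [
        "exploit" if previous == current else "explore"
        for previous, current in zip(cluster_labels, cluster_labels[1:])
    ]

def exploration_fraction(cluster_labels):
    """Fraction of trial-to-trial transitions classified as exploration."""
    transitions = classify_transitions(cluster_labels)
    if not transitions:
        return 0.0
    return transitions.count("explore") / len(transitions)

if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2]  # hypothetical cluster label per trial
    print(classify_transitions(labels))
    print(exploration_fraction(labels))
```

Computing `exploration_fraction` per training block would give one way to track how the balance between the two modes changes over learning.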
Affiliation(s)
- Ananda Sidarta
- Rehabilitation Research Institute of Singapore, Nanyang Technological University, Singapore
| | - John Komar
- National Institute of Education, Nanyang Technological University, Singapore
| | - David J Ostry
- Department of Psychology, McGill University, Montreal, Quebec, Canada; Haskins Laboratories, New Haven, CT, United States
|
19
|
Herrojo Ruiz M, Maudrich T, Kalloch B, Sammler D, Kenville R, Villringer A, Sehm B, Nikulin VV. Modulation of neural activity in frontopolar cortex drives reward-based motor learning. Sci Rep 2021; 11:20303. [PMID: 34645848 PMCID: PMC8514446 DOI: 10.1038/s41598-021-98571-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2021] [Accepted: 08/26/2021] [Indexed: 12/03/2022] Open
Abstract
The frontopolar cortex (FPC) contributes to tracking the reward of alternative choices during decision making, as well as their reliability. Whether this FPC function extends to reward gradients associated with continuous movements during motor learning remains unknown. We used anodal transcranial direct current stimulation (tDCS) over the right FPC to investigate its role in reward-based motor learning. Nineteen healthy human participants practiced novel sequences of finger movements on a digital piano with corresponding auditory feedback. Their aim was to use trialwise reward feedback to discover a hidden performance goal along a continuous dimension: timing. We additionally modulated the contralateral motor cortex (left M1) activity, and included a control sham stimulation. Right FPC-tDCS led to faster learning compared to lM1-tDCS and sham through regulation of motor variability. Bayesian computational modelling revealed that in all stimulation protocols, an increase in the trialwise expectation of reward was followed by greater exploitation, as shown previously. Yet, this association was weaker in lM1-tDCS suggesting a less efficient learning strategy. The effects of frontopolar stimulation were dissociated from those induced by lM1-tDCS and sham, as motor exploration was more sensitive to inferred changes in the reward tendency (volatility). The findings suggest that rFPC-tDCS increases the sensitivity of motor exploration to updates in reward volatility, accelerating reward-based motor learning.
Affiliation(s)
- M Herrojo Ruiz
- Psychology Department, Goldsmiths University of London, London, UK; Center for Cognition and Decision Making, National Research University Higher School of Economics, Moscow, Russian Federation; Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - T Maudrich
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - B Kalloch
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - D Sammler
- Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
| | - R Kenville
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - A Villringer
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - B Sehm
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neurology, University Hospital Halle (Saale), Halle, Germany
| | - V V Nikulin
- Center for Cognition and Decision Making, National Research University Higher School of Economics, Moscow, Russian Federation; Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
|
20
|
Impaired Refinement of Kinematic Variability in Huntington Disease Mice on an Automated Home Cage Forelimb Motor Task. J Neurosci 2021; 41:8589-8602. [PMID: 34429377 DOI: 10.1523/jneurosci.0165-21.2021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/22/2021] [Revised: 06/21/2021] [Accepted: 08/18/2021] [Indexed: 11/21/2022] Open
Abstract
The effective development of novel therapies in mouse models of neurologic disorders relies on behavioral assessments that provide accurate read-outs of neuronal dysfunction and/or degeneration. We designed an automated behavioral testing system (PiPaw), which integrates an operant lever-pulling task directly into the mouse home cage. This task is accessible to group-housed mice 24 h per day, enabling high-throughput longitudinal analysis of forelimb motor learning. Moreover, this design eliminates the need for exposure to novel environments and minimizes experimenter interaction, significantly reducing two of the largest stressors associated with animal behavior. Male mice improved their performance of this task over 1 week of testing by reducing intertrial variability of reward-related kinematic parameters (pull amplitude or peak velocity). In addition, mice displayed short-term improvements in reward rate, and a concomitant decrease in movement variability, over the course of brief bouts of task engagement. We used this system to assess motor learning in mouse models of the inherited neurodegenerative disorder, Huntington disease (HD). Despite having no baseline differences in task performance, male Q175-FDN HD mice were unable to modulate the variability of their movements to increase reward on either short or long timescales. Task training was associated with a decrease in the amplitude of spontaneous excitatory activity recorded from striatal medium spiny neurons in the hemisphere contralateral to the trained forelimb in WT mice; however, no such changes were observed in Q175-FDN mice. This behavioral screening platform should prove useful for preclinical drug trials toward improved treatments in HD and other neurologic disorders.
SIGNIFICANCE STATEMENT In order to develop effective therapies for neurologic disorders, such as Huntington disease (HD), it is important to be able to accurately and reliably assess the behavior of mouse models of these conditions. Moreover, these behavioral assessments should provide an accurate readout of underlying neuronal dysfunction and/or degeneration. In this paper, we used an automated behavioral testing system to assess motor learning in mice within their home cage. Using this system, we were able to study motor abnormalities in HD mice with an unprecedented level of detail, and identified a specific behavioral deficit associated with an underlying impairment in striatal neuronal plasticity. These results validate the usefulness of this system for assessing behavior in mouse models of HD and other neurologic disorders.
|
21
|
van Mastrigt NM, van der Kooij K, Smeets JBJ. Pitfalls in quantifying exploration in reward-based motor learning and how to avoid them. Biol Cybern 2021; 115:365-382. [PMID: 34341885 PMCID: PMC8382626 DOI: 10.1007/s00422-021-00884-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/22/2021] [Accepted: 06/23/2021] [Indexed: 06/13/2023]
Abstract
When learning a movement based on binary success information, one is more variable following failure than following success. Theoretically, the additional variability post-failure might reflect exploration of possibilities to obtain success. When average behavior is changing (as in learning), variability can be estimated from differences between subsequent movements. Can one estimate exploration reliably from such trial-to-trial changes when studying reward-based motor learning? To answer this question, we tried to reconstruct the exploration underlying learning as described by four existing reward-based motor learning models. We simulated learning for various learner and task characteristics. If we simply determined the additional change post-failure, estimates of exploration were sensitive to learner and task characteristics. We identified two pitfalls in quantifying exploration based on trial-to-trial changes. Firstly, performance-dependent feedback can cause correlated samples of motor noise and exploration on successful trials, which biases exploration estimates. Secondly, the trial relative to which trial-to-trial change is calculated may also contain exploration, which causes underestimation. As a solution, we developed the additional trial-to-trial change (ATTC) method. By moving the reference trial one trial back and subtracting trial-to-trial changes following specific sequences of trial outcomes, exploration can be estimated reliably for the three models that explore based on the outcome of only the previous trial. Since ATTC estimates are based on a selection of trial sequences, this method requires many trials. In conclusion, if exploration is a binary function of previous trial outcome, the ATTC method allows for a model-free quantification of exploration.
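For orientation, the simple estimate that this abstract identifies as problematic (the additional trial-to-trial change following failure) can be sketched as below. This is the naive measure, not the ATTC method, whose exact trial sequences are not given in the abstract. The function name, the use of squared changes and the input format are assumptions made for illustration.

```python
from statistics import mean

def naive_exploration_estimate(positions, rewarded):
    """Mean squared trial-to-trial change after failure minus after success.

    positions: movement outcome on each trial (hypothetical 1D measure).
    rewarded:  booleans, True when the corresponding trial was successful.
    Note: this is the naive estimate, which the abstract reports to be
    biased by learner and task characteristics; it is not the ATTC method.
    """
    after_failure, after_success = [], []
    for t in range(len(positions) - 1):
        squared_change = (positions[t + 1] - positions[t]) ** 2
        # Sort each change by the outcome of the trial that preceded it.
        if rewarded[t]:
            after_success.append(squared_change)
        else:
            after_failure.append(squared_change)
    if not after_failure or not after_success:
        raise ValueError("need at least one change after failure and one after success")
    return mean(after_failure) - mean(after_success)

if __name__ == "__main__":
    print(naive_exploration_estimate([0, 1, 1, 3], [False, True, False, True]))
```

Because the reference trial in each difference may itself contain exploration, this estimate is subject to the second pitfall the abstract describes.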
Affiliation(s)
- Nina M van Mastrigt
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.
| | - Katinka van der Kooij
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Jeroen B J Smeets
- Department of Human Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
|
22
|
Lafe CW, Newell KM. Task and Informational Constraints on Search Strategies: Testing the Idea of Convergence to Tolerant Regions. J Mot Behav 2021; 55:603-618. [PMID: 34130615 DOI: 10.1080/00222895.2021.1913088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/01/2020] [Revised: 03/09/2021] [Accepted: 03/31/2021] [Indexed: 10/21/2022]
Abstract
The experiment reported was designed to investigate the interaction of information and force variability on the evolving search strategy, specifically testing the hypothesis of convergence to tolerant regions. Participants were required to produce proportional bimanual isometric force output over three days of practice, with no prespecified force target and where performance was more tolerant to force variability at higher forces. The duration of intermittent visual feedback was manipulated to test the effects of information and force variability on the search process and the resulting sensitivity to tolerant regions of the task space. The findings showed that just under half of the participants exploited more tolerant regions and that this was predicted by the initial force conditions. Different characterizations of the individual search patterns were also predicted by inherent force-dependent variability and initial force conditions. Visual intermittency feedback did not affect the time-dependent properties of the search process but did influence the within-trial variability. Our findings suggest that the attraction to tolerant regions needs to be considered within the interactions of the different categories of constraints on the search process.
Affiliation(s)
- Charley W Lafe
- Department of Kinesiology, University of Georgia, Athens, GA, USA
| | - Karl M Newell
- Department of Kinesiology, University of Georgia, Athens, GA, USA
|
23
|
Palidis DJ, McGregor HR, Vo A, MacDonald PA, Gribble PL. Null effects of levodopa on reward- and error-based motor adaptation, savings, and anterograde interference. J Neurophysiol 2021; 126:47-67. [PMID: 34038228 DOI: 10.1152/jn.00696.2020] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022] Open
Abstract
Dopamine signaling is thought to mediate reward-based learning. We tested for a role of dopamine in motor adaptation by administering the dopamine precursor levodopa to healthy participants in two experiments involving reaching movements. Levodopa has been shown to impair reward-based learning in cognitive tasks. Thus, we hypothesized that levodopa would selectively impair aspects of motor adaptation that depend on the reinforcement of rewarding actions. In the first experiment, participants performed two separate tasks in which adaptation was driven either by visual error-based feedback of the hand position or binary reward feedback. We used EEG to measure event-related potentials evoked by task feedback. We hypothesized that levodopa would specifically diminish adaptation and the neural responses to feedback in the reward learning task. However, levodopa did not affect motor adaptation in either task nor did it diminish event-related potentials elicited by reward outcomes. In the second experiment, participants learned to compensate for mechanical force field perturbations applied to the hand during reaching. Previous exposure to a particular force field can result in savings during subsequent adaptation to the same force field or interference during adaptation to an opposite force field. We hypothesized that levodopa would diminish savings and anterograde interference, as previous work suggests that these phenomena result from a reinforcement learning process. However, we found no reliable effects of levodopa. These results suggest that reward-based motor adaptation, savings, and interference may not depend on the same dopaminergic mechanisms that have been shown to be disrupted by levodopa during various cognitive tasks.
NEW & NOTEWORTHY Motor adaptation relies on multiple processes including reinforcement of successful actions. Cognitive reinforcement learning is impaired by levodopa-induced disruption of dopamine function. We administered levodopa to healthy adults who participated in multiple motor adaptation tasks. We found no effects of levodopa on any component of motor adaptation. This suggests that motor adaptation may not depend on the same dopaminergic mechanisms as cognitive forms of reinforcement learning that have been shown to be impaired by levodopa.
Affiliation(s)
- Dimitrios J Palidis
- Brain and Mind Institute, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada; Graduate Program in Neuroscience, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Heather R McGregor
- Department of Applied Physiology and Kinesiology, University of Florida, Gainesville, Florida
| | - Andrew Vo
- Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montreal, Quebec, Canada
| | - Penny A MacDonald
- Brain and Mind Institute, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada; Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Department of Clinical Neurological Sciences, University of Western Ontario, London, Ontario, Canada
| | - Paul L Gribble
- Brain and Mind Institute, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada; Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Haskins Laboratories, New Haven, Connecticut
| |
|
24
|
McIlroy B, Passfield L, Holmberg HC, Sperlich B. Virtual Training of Endurance Cycling - A Summary of Strengths, Weaknesses, Opportunities and Threats. Front Sports Act Living 2021; 3:631101. [PMID: 33748754 PMCID: PMC7969501 DOI: 10.3389/fspor.2021.631101] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/19/2020] [Accepted: 01/28/2021] [Indexed: 11/13/2022]
Abstract
Virtual online training has emerged as one of the top 20 worldwide fitness trends for 2021 and continues to develop rapidly. Although this allows the cycling community to engage in virtual training and competition, critical evaluation of virtual training platforms is limited. Here, we discuss the strengths, weaknesses, opportunities and threats associated with virtual training technology and cycling in an attempt to enhance awareness of such aspects. Strengths include immersive worlds, innovative drafting mechanics, and versatility. Weaknesses include questionable data accuracy and inadequate strength and reliability of power-speed algorithms. Opportunities exist for expanding strategic partnerships with major cycling races, brands, and sponsors and improving user experience with the addition of video capture and "e-coaching." Threats are present in the form of cheating during competition, and a lack of uptake and acceptance by a broader community.
Affiliation(s)
- Benjamin McIlroy
- Department of Sport Science, Integrative and Experimental Exercise Science, University of Würzburg, Würzburg, Germany; Department of Sport and Public Services, Brooklands College, Weybridge, United Kingdom
| | - Louis Passfield
- Faculty of Kinesiology, University of Calgary, Calgary, AB, Canada; School of Sport and Exercise Sciences, University of Kent, Canterbury, United Kingdom
| | - Hans-Christer Holmberg
- Department of Physiology and Pharmacology, Biomedicum C5, Karolinska Institute, Stockholm, Sweden; Department of Engineering Sciences and Mathematics, Luleå University of Technology, Luleå, Sweden
| | - Billy Sperlich
- Department of Sport Science, Integrative and Experimental Exercise Science, University of Würzburg, Würzburg, Germany
| |
|
25
|
Abstract
The survival of an organism depends on the ability to make adaptive decisions to achieve the needs of the organism: where to get food, who to mate with, and how to evade predators. Decision-making is a term used to describe a collection of behavioral and/or computational functions that guide the selection of an option amongst a set of alternatives. Some of these functions may include calculating the costs and benefits of a particular action, evaluating differences in value of each of the alternative outcomes and the likelihood of receiving a particular outcome, using past experiences to generate predictions or expectations about action-outcome associations, and/or integration of past experiences to make novel inferences that can be used in new environments. There is considerable interest in understanding the neurobiological mechanisms that mediate these decision-making functions and recent advances in behavioral approaches, neuroscience techniques, and neuroimaging measures have begun to develop mechanistic links between biology, reward, and decision making. This multidisciplinary work holds great promise for elucidating the biological mechanisms mediating decision-making deficits in normal and abnormal states. The multidisciplinary studies included in this Collection provide new insights into the neuroscience of decision making and reward.
|