1
Okan A, Hallquist MN. Negative affect-driven impulsivity as hierarchical model-based overgeneralization. Trends Cogn Sci 2025; 29:407-420. [PMID: 39919952 PMCID: PMC12058388 DOI: 10.1016/j.tics.2025.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 01/09/2025] [Accepted: 01/10/2025] [Indexed: 02/09/2025]
Abstract
'If your mouth is burned by milk, you blow before you eat yogurt' ('Sütten ağzı yanan yoğurdu üfleyerek yer'). This Turkish proverb advises caution based on past experiences when similar situations are encountered. However, although we may infer similarities across experiences, each situation is a complex combination of many features, and generalizing across situations based on perceived similarities may not achieve desired outcomes when obtaining them depends on more subtle or overlooked features. Here, we examine how models of generalization can uncover the model-based (MB) processes underlying reactive and rigid behaviors traditionally considered model-free (MF). Our novel conceptualization suggests that emotionally driven impulsive behaviors stem from a propensity to overgeneralize based on surface-level similarities, hindering the incorporation of other informative, discriminant cues.
Affiliation(s)
- Aysenur Okan
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, NC, USA.
- Michael N Hallquist
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, NC, USA
2
Papale AE, Brown VM, Ianni AM, Hallquist MN, Luna B, Dombrovski AY. Prefrontal default-mode network interactions with posterior hippocampus during exploration. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.03.12.642890. [PMID: 40161797 PMCID: PMC11952374 DOI: 10.1101/2025.03.12.642890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Hippocampal maps and ventral prefrontal cortex (vPFC) value and goal representations support foraging in continuous spaces. How might hippocampal-vPFC interactions control the balance between behavioral exploration and exploitation? Using fMRI and reinforcement learning modeling, we investigated vPFC and hippocampal responses as humans explored and exploited a continuous one-dimensional space, with out-of-session and out-of-sample replication. The spatial distribution of rewards, or value landscape, modulated activity in the hippocampus and default network vPFC subregions, but not in ventrolateral prefrontal control subregions or medial orbitofrontal limbic subregions. While prefrontal default network and hippocampus displayed higher activity in less complex, easy-to-exploit value landscapes, vPFC-hippocampal connectivity increased in uncertain landscapes requiring exploration. Further, synchronization between prefrontal default network and posterior hippocampus scaled with behavioral exploration. Considered alongside electrophysiological studies, our findings suggest that locations to be explored are identified through coordinated activity binding prefrontal default network value representations to posterior hippocampal maps.
Affiliation(s)
- Andrew E. Papale
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
- Angela M. Ianni
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
- Michael N. Hallquist
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Beatriz Luna
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
3
Tsypes A, Hallquist MN, Ianni A, Kaurin A, Wright AGC, Dombrovski AY. Exploration-Exploitation and Suicidal Behavior in Borderline Personality Disorder and Depression. JAMA Psychiatry 2024; 81:1010-1019. [PMID: 38985462 PMCID: PMC11238070 DOI: 10.1001/jamapsychiatry.2024.1796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 04/25/2024] [Indexed: 07/11/2024]
Abstract
Importance: Clinical theory and behavioral studies suggest that people experiencing suicidal crisis are often unable to find constructive solutions or incorporate useful information into their decisions, resulting in premature convergence on suicide and neglect of better alternatives. However, prior studies of suicidal behavior have not formally examined how individuals resolve the tradeoffs between exploiting familiar options and exploring potentially superior alternatives. Objective: To investigate exploration and exploitation in suicidal behavior from the formal perspective of reinforcement learning. Design, Setting, and Participants: Two case-control behavioral studies of exploration-exploitation of a large 1-dimensional continuous space and a 21-day prospective ambulatory study of suicidal ideation were conducted between April 2016 and March 2022. Participants were recruited from inpatient psychiatric units, outpatient clinics, and the community in Pittsburgh, Pennsylvania, and underwent laboratory and ambulatory assessments. Adults diagnosed with borderline personality disorder (BPD) and midlife and late-life major depressive disorder (MDD) were included, with each sample including demographically equated groups with a history of high-lethality suicide attempts, low-lethality suicide attempts, individuals with BPD or MDD but no suicide attempts, and control individuals without psychiatric disorders. The MDD sample also included a subgroup with serious suicidal ideation. Main Outcomes and Measures: Behavioral (model-free and model-derived) indices of exploration and exploitation, suicide attempt lethality (Beck Lethality Scale), and prospectively assessed suicidal ideation. Results: The BPD group included 171 adults (mean [SD] age, 30.55 [9.13] years; 135 [79%] female). The MDD group included 143 adults (mean [SD] age, 62.03 [6.82] years; 81 [57%] female). Across the BPD (χ²₃ = 50.68; P < .001) and MDD (χ²₄ = 36.34; P < .001) samples, individuals with high-lethality suicide attempts discovered fewer options than other groups as they were unable to shift away from unrewarded options. In contrast, those with low-lethality attempts were prone to excessive behavioral shifts after rewarded and unrewarded actions. No differences were seen in strategic early exploration or in exploitation. Among 84 participants with BPD in the ambulatory study, 56 reported suicidal ideation. Underexploration also predicted incident suicidal ideation (χ²₁ = 30.16; P < .001), validating the case-control results prospectively. The findings were robust to confounds, including medication exposure, affective state, and behavioral heterogeneity. Conclusions and Relevance: The findings suggest that narrow exploration and inability to abandon inferior options are associated with serious suicidal behavior and chronic suicidal thoughts. By contrast, individuals in this study who engaged in low-lethality suicidal behavior displayed a low threshold for taking potentially disadvantageous actions.
Affiliation(s)
- Aliona Tsypes
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Michael N. Hallquist
- Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill
- Angela Ianni
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Aleksandra Kaurin
- Department of Psychology, University of Wuppertal, Wuppertal, Germany
- Aidan G. C. Wright
- Department of Psychology, University of Michigan, Ann Arbor
- Eisenberg Family Depression Center, University of Michigan, Ann Arbor
- Alexandre Y. Dombrovski
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
4
Hallquist MN, Hwang K, Luna B, Dombrovski AY. Reward-based option competition in human dorsal stream and transition from stochastic exploration to exploitation in continuous space. SCIENCE ADVANCES 2024; 10:eadj2219. [PMID: 38394198 PMCID: PMC10889364 DOI: 10.1126/sciadv.adj2219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 01/23/2024] [Indexed: 02/25/2024]
Abstract
Primates exploring and exploiting a continuous sensorimotor space rely on dynamic maps in the dorsal stream. Two complementary perspectives exist on how these maps encode rewards. Reinforcement learning models integrate rewards incrementally over time, efficiently resolving the exploration/exploitation dilemma. Working memory buffer models explain rapid plasticity of parietal maps but lack a plausible exploration/exploitation policy. The reinforcement learning model presented here unifies both accounts, enabling rapid, information-compressing map updates and efficient transition from exploration to exploitation. As predicted by our model, activity in human frontoparietal dorsal stream regions, but not in MT+, tracks the number of competing options, as preferred options are selectively maintained on the map, while spatiotemporally distant alternatives are compressed out. When valuable new options are uncovered, posterior β1/α oscillations desynchronize within 0.4 to 0.7 s, consistent with option encoding by competing β1-stabilized subpopulations. Together, outcomes matching locally cached reward representations rapidly update parietal maps, biasing choices toward often-sampled, rewarded options.
Affiliation(s)
- Kai Hwang
- Department of Psychological and Brain Sciences, Iowa Neuroscience Institute, University of Iowa, Iowa City, IA, USA
- Beatriz Luna
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
5
Brown VM, Hallquist MN, Frank MJ, Dombrovski AY. Humans adaptively resolve the explore-exploit dilemma under cognitive constraints: Evidence from a multi-armed bandit task. Cognition 2022; 229:105233. [PMID: 35917612 PMCID: PMC9530017 DOI: 10.1016/j.cognition.2022.105233] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/21/2022] [Revised: 06/02/2022] [Accepted: 07/22/2022] [Indexed: 11/27/2022]
Abstract
When navigating uncertain worlds, humans must balance exploring new options versus exploiting known rewards. Longer horizons and spatially structured option values encourage humans to explore, but the impact of real-world cognitive constraints such as environment size and memory demands on explore-exploit decisions is unclear. In the present study, humans chose between options varying in uncertainty during a multi-armed bandit task with varying environment size and memory demands. Regression and cognitive computational models of choice behavior showed that with a lower cognitive load, humans are more exploratory than a simulated value-maximizing learner, but under cognitive constraints, they adaptively scale down exploration to maintain exploitation. Thus, while humans are curious, cognitive constraints force people to decrease their strategic exploration in a resource-rational-like manner to focus on harvesting known rewards.
Affiliation(s)
- Vanessa M Brown
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA.
- Michael N Hallquist
- Department of Psychology, Pennsylvania State University, State College, PA, USA; Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Michael J Frank
- Department of Cognitive, Linguistic, and Psychological Sciences and Carney Institute for Brain Science, Brown University, Providence, RI, USA
6
Wang D, Chen S, Hu Y, Liu L, Wang H. Behavior Decision of Mobile Robot With a Neurophysiologically Motivated Reinforcement Learning Model. IEEE Trans Cogn Dev Syst 2022. [DOI: 10.1109/tcds.2020.3035778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
7
Dombrovski AY, Hallquist MN. Search for solutions, learning, simulation, and choice processes in suicidal behavior. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2022; 13:e1561. [PMID: 34008338 PMCID: PMC9285563 DOI: 10.1002/wcs.1561] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/12/2020] [Revised: 03/06/2021] [Accepted: 04/07/2021] [Indexed: 12/25/2022]
Abstract
Suicide may be viewed as an unfortunate outcome of failures in decision processes. Such failures occur when the demands of a crisis exceed a person's capacity to (i) search for options, (ii) learn and simulate possible futures, and (iii) make advantageous value-based choices. Can individual-level decision deficits and biases drive the progression of the suicidal crisis? Our overview of the evidence on this question is informed by clinical theory and grounded in reinforcement learning and behavioral economics. Cohort and case-control studies provide strong evidence that limited cognitive capacity and particularly impaired cognitive control are associated with suicidal behavior, imposing cognitive constraints on decision-making. We conceptualize suicidal ideation as an element of impoverished consideration sets resulting from a search for solutions under cognitive constraints and mood-congruent Pavlovian influences, a view supported by mostly indirect evidence. More compelling is the evidence of impaired learning in people with a history of suicidal behavior. We speculate that an inability to simulate alternative futures using one's model of the world may undermine alternative solutions in a suicidal crisis. The hypothesis supported by the strongest evidence is that the selection of suicide over alternatives is facilitated by a choice process undermined by randomness. Case-control studies using gambling tasks, armed bandits, and delay discounting support this claim. Future experimental studies will need to uncover real-time dynamics of choice processes in suicidal people. In summary, the decision process framework sheds light on neurocognitive mechanisms that facilitate the progression of the suicidal crisis. This article is categorized under: Economics > Individual Decision-Making; Psychology > Emotion and Motivation; Psychology > Learning; Neuroscience > Behavior.
Affiliation(s)
- Michael N. Hallquist
- Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill, North Carolina, USA
8
Spreng RN, Turner GR. From exploration to exploitation: a shifting mental mode in late life development. Trends Cogn Sci 2021; 25:1058-1071. [PMID: 34593321 PMCID: PMC8844884 DOI: 10.1016/j.tics.2021.09.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 05/16/2021] [Revised: 08/30/2021] [Accepted: 09/01/2021] [Indexed: 12/31/2022]
Abstract
Changes in cognition, affect, and brain function combine to promote a shift in the nature of mentation in older adulthood, favoring exploitation of prior knowledge over exploratory search as the starting point for thought and action. Age-related exploitation biases result from the accumulation of prior knowledge, reduced cognitive control, and a shift toward affective goals. These are accompanied by changes in cortical networks, as well as attention and reward circuits. By incorporating these factors into a unified account, the exploration-to-exploitation shift offers an integrative model of cognitive, affective, and brain aging. Here, we review evidence for this model, identify determinants and consequences, and survey the challenges and opportunities posed by an exploitation-biased mental mode in later life.
Affiliation(s)
- R Nathan Spreng
- Laboratory of Brain and Cognition, Montreal Neurological Institute, Department of Neurology and Neurosurgery, McGill University, Montreal, QC H3A 2B4, Canada; McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada; Departments of Psychiatry and Psychology, McGill University, Montreal, QC H3A 0G4, Canada.
- Gary R Turner
- Department of Psychology, York University, Toronto, ON M3J 1P3, Canada
9
Dombrovski AY, Luna B, Hallquist MN. Differential reinforcement encoding along the hippocampal long axis helps resolve the explore-exploit dilemma. Nat Commun 2020; 11:5407. [PMID: 33106508 PMCID: PMC7589536 DOI: 10.1038/s41467-020-18864-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/24/2020] [Accepted: 08/20/2020] [Indexed: 12/15/2022]
Abstract
When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Here we report that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation on a reinforcement learning task with a spatially structured reward function. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.
Affiliation(s)
- Beatriz Luna
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Michael N Hallquist
- Department of Psychology, Penn State University, University Park, PA, 16801, USA.
- Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill, NC, 27599-3270, USA.